# My First Triangle in DirectX 11

Apr 2010

Now that I have my new graphics card, I've started learning Direct3D 11. I did a lot of coding in DirectX 9 before. I also looked at the DirectX 10 API but didn't gain much practical experience with it. I'm very excited about how the new API looks and the possibilities it creates. The library interface looks better organized, more object-oriented and clearer. It makes extensive use of descriptors - the same concept I liked so much in PhysX.

But at the same time I must admit it's more difficult to get started with than DirectX 9 was. You have to create more objects to set up a basic framework that can render anything. The so-called Fixed Function Pipeline doesn't exist anymore, so you HAVE to write shaders to render anything. Better organization of all data forces you to pass shader constants in buffers instead of one by one, fill descriptors, create and use state objects (like ID3D11DepthStencilState) instead of changing render states one by one, create views for resources (like ID3D11ShaderResourceView for ID3D11Texture2D) instead of using them directly, and compile shaders from HLSL source to bytecode and then create the shader object with a separate call.

There are also big changes in math support. They didn't provide a new D3DX Math with DX11. You can still use the old one, but now it's recommended to use the new, portable (to Xbox 360) and highly optimized XNA Math library. It's pretty but can be difficult for beginners. For example, there is now one universal type - XMVECTOR - that can represent a vector, color, plane, quaternion and more. It must always be aligned to 16 bytes (because it uses SSE). I suppose concepts like vector loading, storing or swizzling are not easy to understand, as they can be new to many DirectX programmers.

Where to learn DirectX 11 from? It looks like there are not many valuable sources online yet. The directx11tutorials.com website looks interesting, but it's just a blog with a few pieces of code, and the author tries to wrap everything into his own classes from the start, which makes no sense to me. The most valuable source of knowledge is the original documentation installed with the DX SDK. It's far from extensive, because the chapter about DirectX 11 describes only the new features and not everything about using DirectX, as the documentation for version 9 did, but for somebody who already knows some graphics programming it should be OK.

What I want to show today is my first "Hello World" triangle made in Direct3D 11 and the code that renders it. You can download the whole source with a project for Visual C++ 2008 from here: Dx11Test.zip. See also the code online: Dx11Test.cpp.

To code in DX11, you just need the DirectX SDK - the same as for DirectX 9 and 10, because it contains the SDK for all of them. Programs that use DirectX 11 can run on Windows 7 as well as Vista, but not on XP. Surprisingly, a newest-generation graphics card is not needed - you can use the DX11 API to code against different "feature levels" from D3D_FEATURE_LEVEL_9_1 to D3D_FEATURE_LEVEL_11_0, so you can write your code so that it runs even on GPUs that support only Shader Model 2!

# The Concept of Immutability and Descriptor

Nov 2009

An object of some class represents a piece of data, a chunk of memory or another resource, along with methods to operate on it. It should also automatically free these resources in its destructor. But what should modifying this data look like? There are two possible approaches. As an example, let's consider a fictional class that encapsulates a Direct3D 9 Vertex Declaration (I'll show my real one in some future blog entry). A Vertex Declaration is an array of D3DVERTEXELEMENT9 structures, which can be used to create an IDirect3DVertexDeclaration9 object. The first solution is to define the class interface in such a way that the data inside can be modified at any time.

class MyMutableVertexDecl
{
public:
  // Creates an empty declaration.
  MyMutableVertexDecl();
  // Frees all allocated resources.
  ~MyMutableVertexDecl();
  // Copies data from another object
  void CopyFrom(const MyMutableVertexDecl &src);
  // Deletes all internal data so object becomes empty again.
  void Clear();
  bool IsEmpty() const;
  // I/O (Serialization)
  void SaveToStream(IStream &s) const;
  void LoadFromStream(IStream &s);
  // Reading of underlying data
  size_t GetElemCount() const;
  const D3DVERTEXELEMENT9 & GetElem(size_t index) const;
  // Modification of underlying array
  void SetElem(const D3DVERTEXELEMENT9 &elem, size_t index);
  void AddElem(const D3DVERTEXELEMENT9 &elem);
  void InsertElem(const D3DVERTEXELEMENT9 &elem, size_t index);
  void RemoveElem(size_t index);
  // Access to the Direct3D object created from the array
  IDirect3DVertexDeclaration9 * GetD3dDecl() const;

private:
  std::vector<D3DVERTEXELEMENT9> m_Elems;
  IDirect3DVertexDeclaration9 *m_D3dDecl;
};

This approach seems very nice, as you can create your object any time you wish and fill it with data later, as well as change this data whenever you need to. But at the same time, a question emerges: when to (re)create the "destination" IDirect3DVertexDeclaration9 from the "source" D3DVERTEXELEMENT9 array? Each time the array is modified? Or maybe each time the IDirect3DVertexDeclaration9 is retrieved? The optimal solution for the interface above would be lazy evaluation, that is, to recreate the IDirect3DVertexDeclaration9 whenever it is retrieved for the first time since the D3DVERTEXELEMENT9 array was last modified. But...
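The lazy evaluation idea can be sketched roughly as follows. This is only a minimal illustration, not the real class: `VertexElem` and `CompiledDecl` are hypothetical stand-ins for D3DVERTEXELEMENT9 and IDirect3DVertexDeclaration9 so that the sketch compiles without the DirectX SDK, and `GetRebuildCount` exists only to make the behavior observable.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-ins, so the sketch compiles without the DirectX SDK.
struct VertexElem { int offset; int type; };             // role of D3DVERTEXELEMENT9
struct CompiledDecl { std::vector<VertexElem> elems; };  // role of IDirect3DVertexDeclaration9

class LazyVertexDecl
{
public:
  // Every modification drops the cached object instead of rebuilding it.
  void AddElem(const VertexElem &elem) { m_Elems.push_back(elem); m_Compiled.reset(); }
  void Clear() { m_Elems.clear(); m_Compiled.reset(); }
  std::size_t GetElemCount() const { return m_Elems.size(); }

  // Rebuilds the cached object only when it is missing, that is, on the
  // first retrieval after the array has been modified.
  const CompiledDecl * GetCompiled() const
  {
    if (!m_Compiled)
    {
      m_Compiled.reset(new CompiledDecl{m_Elems});
      ++m_RebuildCount;
    }
    return m_Compiled.get();
  }

  // Only for illustration: how many times the cached object was rebuilt.
  int GetRebuildCount() const { return m_RebuildCount; }

private:
  std::vector<VertexElem> m_Elems;
  // mutable, because the cache is filled inside a const getter.
  mutable std::unique_ptr<CompiledDecl> m_Compiled;
  mutable int m_RebuildCount = 0;
};
```

Calling `GetCompiled` twice in a row rebuilds the object once; adding an element in between forces one more rebuild on the next retrieval.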

# Texture Shader for Generating Textures

Oct 2009

There was a discussion recently on our forum (in the topic [HLSL] Przekazywanie struktury ?!?) about shader performance and the execution time of texture fetches versus arithmetic operations. When we need a sophisticated function that involves many costly computations in a shader, it is sometimes better to prepare a special texture to be sampled as a lookup table for the values of this function. But how to generate such a texture?

Of course you can write a simple console program, create a Direct3D device of type D3DDEVTYPE_NULLREF, create a texture in the D3DPOOL_SCRATCH pool, fill its pixels and finally save it to a file. But there is another solution called Texture Shaders. It's not a new shader type introduced in DirectX 10/11/... It has actually existed for quite a long time and is available in D3DX for generating textures with a shader. Such a shader is always executed on the CPU. If you are interested, look at the functions D3DXCreateTextureShader, D3DXFillTextureTX and the ID3DXTextureShader interface.
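To make the "fill its pixels" step of the first approach more concrete, here is a minimal sketch in plain C++ that only computes the texel values of a one-dimensional lookup table. It deliberately leaves out all Direct3D calls (creating the device and texture, saving the file). The names `BuildLookupTable` and `ExamplePowerCurve` are made up for this illustration, and the power curve is just an arbitrary example of a function you might want to tabulate.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: evaluates func at `size` evenly spaced points in [0, 1]
// and stores the results as 8-bit texel values, like one row of a lookup texture.
template <typename Func>
std::vector<std::uint8_t> BuildLookupTable(std::size_t size, Func func)
{
  std::vector<std::uint8_t> texels(size);
  for (std::size_t i = 0; i < size; ++i)
  {
    // Map the texel index to the function argument in [0, 1].
    const float x = (size > 1) ? float(i) / float(size - 1) : 0.0f;
    // Clamp the result to [0, 1] and quantize it to 8 bits with rounding.
    float y = func(x);
    if (y < 0.0f) y = 0.0f;
    if (y > 1.0f) y = 1.0f;
    texels[i] = static_cast<std::uint8_t>(y * 255.0f + 0.5f);
  }
  return texels;
}

// Arbitrary example of a function to tabulate.
inline float ExamplePowerCurve(float x) { return std::pow(x, 2.2f); }
```

The resulting array could then be copied into a locked texture row. In a shader, sampling that texture with x as the coordinate replaces evaluating the function.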

More good news is that you can use Texture Shaders without coding any program in C++. All you need is the AMD RenderMonkey shader IDE. To generate a texture procedurally:

Here is my example:

An interesting curiosity: there is an intrinsic function in HLSL available only in Texture Shaders - noise, which generates Perlin noise.

# Coding Texture Preview

Sep 2009

Today is the 256th day of the year, so it's Programmers' Day. Thus I wish all the best, especially good code and no bugs, to all of you who code for fun and/or profit (hopefully "and" :)

I've recently started a new home project called "BlueWay". There's nothing special about it - just another time I start everything from scratch :) But this time I have deferred shading, cascaded shadow mapping, new scene management (based on a k-d tree), a new component-based object system and an asynchronous resource manager (resources are loaded in the background on a separate thread). I'm not sure whether I can call D3DXCompileShaderFromFile from a separate thread without D3DCREATE_MULTITHREADED, but it seems to work OK :)

What I want to show today is code for previewing textures. When coding some complex effects, it's often handy to look at intermediate results in render-target textures. Of course we have the famous PIX and NVIDIA PerfHUD, but I believe that such custom code for real-time in-game texture preview is useful anyway. I wanted to make it general and flexible, and here is the way I do it.

# New DirectX SDK and Total Commander

Sep 2009

There are two new software releases that are important to me. I know about them from Twitter (yet another good reason to join microblogging activity :)

First one is DirectX SDK August 2009. Updates relate mostly to DirectX 11, but hey, there was no new SDK version for a long time!

(I've just discovered there are changes in the error handling system that are not backward compatible. To compile my code I had to change #include <dxerr9.h> to <dxerr.h>, link with file dxerr.lib instead of dxerr9.lib and rename function calls DXGetErrorString9 and DXGetErrorDescription9 to DXGetErrorString and DXGetErrorDescription.)

The second new release is Total Commander 7.5 FINAL. I can't see any revolutionary changes (apart from the new, Vista-like behavior of the current directory bar), but that's good news, as Total Commander is already great.

# Cascaded Shadow Mapping

Jul 2009

When I was writing The Final Quest engine for my master's thesis, I didn't manage to implement any technique to ensure good quality shadows in outdoor scenes. I've read about all these kinds of perspective reparametrization like PSM (Perspective Shadow Maps), LiSPSM (Light-Space Perspective Shadow Maps), TSM (Trapezoidal Shadow Maps) or XPSM (Extended Perspective Shadow Maps), and all the math behind them seemed very scary to me. Today I know that the complexity of these techniques also causes artifacts in particular cases, and that commercial games more often use a simpler technique called CSM (Cascaded Shadow Maps).

Yesterday I implemented CSM and I'm very happy with the results. Of course there are always some artifacts, aliasing problems and z-acne on some distant objects under particular surface angles, performance degradation (rendering additional objects to 3 x 1024 x 1024 textures must take some time) etc., but still it's the first time I have not-so-bad outdoor shadows in my code. On the screenshot below you can see the transitions between cascades marked with red arrows.

Cascaded Shadow Mapping

# Thoughts on Display Settings

Jun 2009

I've decided to make a demo for this year's Riverwash demoscene party. For that purpose I've recently prepared my framework and coded loading of display settings from command line parameters, a configuration file or my brand new dialog box:

Coding it reminded me of my thoughts about display settings from the user's versus the programmer's perspective. The user is usually able to change the screen resolution and sometimes also change the refresh rate, turn vertical synchronization on/off, toggle between fullscreen and windowed mode, and choose antialiasing, texture filtering quality and some general quality/performance parameters. On the other hand, the programmer passes a bunch of parameters to Direct3D as a D3DPRESENT_PARAMETERS structure and other arguments to the functions CreateDevice and Reset. The question is: how to map between these parameters?

Here are my current beliefs on this subject:

[+] Adapter: I just pass D3DADAPTER_DEFAULT constant. Sure it would be better to give a choice of an adapter (one can enumerate adapters using IDirect3D9 methods), but it's useful only on multi-monitor systems and not many games expose such setting.

[+] DeviceType: I always pass D3DDEVTYPE_HAL. Using software reference rasterizer D3DDEVTYPE_REF makes no sense, as it gives SPF instead of FPS :)

[+] BehaviorFlags: Now I always pass D3DCREATE_HARDWARE_VERTEXPROCESSING. Using software or mixed vertex processing made sense only on old hardware, especially on old Intel laptop graphics chips, which had Pixel Shader 2.0 but no Vertex Shader at all.

[+] BackBufferWidth, BackBufferHeight: I give a choice of display modes available on default adapter with format hardcoded as constant (D3DFMT_X8R8G8B8). One can enumerate available display modes using IDirect3D9 methods. In windowed mode it could also be reasonable to be able to set any given resolution (Width and Height as text fields), as well as manually resize application's window.

[+] BackBufferFormat: I just use D3DFMT_A8R8G8B8. The choice between X8R8G8B8 and A8R8G8B8 is just a matter of having an additional fourth channel available (alpha), which is obviously not visible, but can be written, used in alpha blending and thus utilized to do some special effects (like masking the intensity of some effect in screen space).

[+] BackBufferCount: I just give 0 here, which resolves to default value of 1 back buffer. I'm aware that using value 2 can change the way rendering is performed a bit.

[+] SwapEffect: I always use the constant D3DSWAPEFFECT_DISCARD, as it is the fastest one. It says that the entire content of the back buffer can be discarded after the frame is presented and the program has to render a new frame from scratch (which is what we always do in game development).

[+] AutoDepthStencilFormat: Direct3D defines many of them, but in practice all D3D9-generation hardware supports only three: D3DFMT_D16, D24X8 and D24S8. So the choice depends only on whether we need the higher, 24-bit precision or a stencil buffer.

[+] Flags: For the best performance possible I always give D3DPRESENTFLAG_DISCARD_DEPTHSTENCIL and never give D3DPRESENTFLAG_LOCKABLE_BACKBUFFER.

[+] FullScreen_RefreshRateInHz: Nowadays, when many (most?) people have LCD displays, the standard refresh rate is 60 Hz and maybe 75 Hz. In the CRT era, setting a good refresh rate was crucial for playing comfort, and thus it was very annoying when a game didn't expose a refresh rate setting but used the default 60 Hz instead. I believe refresh rates should be enumerated and given as a choice next to the resolution.

[+] PresentationInterval: This is the value the "VSync" setting is converted to. It looks like D3DPRESENT_INTERVAL_DEFAULT behaves as VSync turned on (FPS <= RefreshRate) and D3DPRESENT_INTERVAL_IMMEDIATE means VSync off (FPS as high as possible), but once during my experiments I observed that the flag D3DPRESENT_INTERVAL_ONE behaves slightly differently than D3DPRESENT_INTERVAL_DEFAULT (I can't remember the details now).

I know I would sound more "professional" if I considered all other constant values available and their possible uses, but I don't care :) My point here was to simplify the problem to be able to map these technical parameters to the display settings exposed to the user. Multisampling is separate subject so I don't cover it here.
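The mapping described in the list above could be sketched like this. It is only an illustration under stated assumptions: `PresentParams`, `UserDisplaySettings`, `MakePresentParams` and the enums are hypothetical stand-ins that mirror the Direct3D 9 names, so that the sketch compiles without the SDK. Real code would fill a D3DPRESENT_PARAMETERS structure with the corresponding D3D constants instead.

```cpp
// Hypothetical stand-ins mirroring the Direct3D 9 names used in the list,
// so that the sketch compiles without the DirectX SDK.
enum class Format { X8R8G8B8, A8R8G8B8, D16, D24X8, D24S8 };
enum class SwapEffect { Discard };
enum class PresentInterval { Default, Immediate };

struct PresentParams
{
  unsigned BackBufferWidth;
  unsigned BackBufferHeight;
  Format BackBufferFormat;
  unsigned BackBufferCount;
  SwapEffect Swap;
  bool Windowed;
  Format AutoDepthStencilFormat;
  unsigned FullScreen_RefreshRateInHz;
  PresentInterval PresentationInterval;
};

// What the user actually gets to choose (hypothetical structure).
struct UserDisplaySettings
{
  unsigned Width;
  unsigned Height;
  unsigned RefreshRate;
  bool FullScreen;
  bool VSync;
  bool NeedStencil;
};

PresentParams MakePresentParams(const UserDisplaySettings &s)
{
  PresentParams pp = {};
  pp.BackBufferWidth = s.Width;
  pp.BackBufferHeight = s.Height;
  pp.BackBufferFormat = Format::A8R8G8B8;  // fixed, alpha channel available
  pp.BackBufferCount = 0;                  // 0 resolves to one back buffer
  pp.Swap = SwapEffect::Discard;           // always the fastest one
  pp.Windowed = !s.FullScreen;
  // Stencil needed -> D24S8, otherwise plain 24-bit depth.
  pp.AutoDepthStencilFormat = s.NeedStencil ? Format::D24S8 : Format::D24X8;
  // The refresh rate applies only to fullscreen mode; it stays 0 when windowed.
  pp.FullScreen_RefreshRateInHz = s.FullScreen ? s.RefreshRate : 0;
  // The "VSync" checkbox maps directly to the presentation interval.
  pp.PresentationInterval = s.VSync ? PresentInterval::Default : PresentInterval::Immediate;
  return pp;
}
```

Keeping the user-facing structure separate from the Direct3D one means the dialog box, the configuration file and the command line can all fill the same small set of fields.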

# Stripping Comments from a Compiled Shader

Feb 2009

A Direct3D shader compiled to binary code (e.g. with the D3DXCompileShaderFromFile function) contains not only assembly instructions, but also additional information - comments. These comments are not useless. They store, for example, the constant table, which can be read with the D3DXGetShaderConstantTable function. However, if it is not needed, the comments can be cut out of the code, and the shader then becomes much smaller.

How to do it? Although you have to dig directly in memory, it is not difficult. The specific algorithm is described by Jesus de Santos Garcia on his blog, in the post Stripping comments from Shader bytecodes. Here I will briefly summarize it in my own words.

The binary format of compiled D3D shaders is documented in MSDN, on the page Direct3D Shader Codes. It consists of a sequence of 32-bit tokens. The last of them has the value 0x0000FFFF and marks the end of the code. A comment, in turn, begins with a token 0x####FFFE, where "####" (the upper two bytes) is the number of 32-bit values that follow and make up the content of the comment. Such comments can simply be cut out of the shader code and that's it :)
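The token scan described above can be sketched as follows. This is a minimal illustration, not the code from the referenced blog post: `StripShaderComments` is a made-up name, and the shader is assumed to be already available as a vector of 32-bit tokens. It is also deliberately naive: it treats every token whose lower two bytes are 0xFFFE as a comment header, while a more careful version would parse instruction lengths so that operand data is never mistaken for a comment or end token.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns a copy of the shader token stream with all comment blocks removed.
// Naive scan: assumes no operand token happens to look like a comment header.
std::vector<std::uint32_t> StripShaderComments(const std::vector<std::uint32_t> &code)
{
  const std::uint32_t END_TOKEN = 0x0000FFFFu;   // marks the end of the code
  const std::uint32_t COMMENT_OPCODE = 0xFFFEu;  // lower two bytes of a comment header

  std::vector<std::uint32_t> result;
  result.reserve(code.size());

  std::size_t i = 0;
  while (i < code.size())
  {
    const std::uint32_t token = code[i];
    if ((token & 0xFFFFu) == COMMENT_OPCODE)
    {
      // The upper two bytes hold the number of 32-bit values that follow.
      // Skip the header together with the whole comment content.
      i += 1 + (token >> 16);
      continue;
    }
    result.push_back(token);
    if (token == END_TOKEN)
      break;
    ++i;
  }
  return result;
}
```

With the buffer returned by the compiler, you would copy its contents into such a vector, call the function and pass the result on when creating the shader. Remember that the constant table is gone after that.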

