Twitter

Pinboard Bookmarks

LinkedIn

Blog Tags

Blog

20:11
Sun
20
Aug 2017

Stencil Test Explained Using Code

I must admit I never used stencil buffer in my personal code. I know it's there available in GPUs and all graphics APIs for years and it's useful for many things, but somehow I never had a need to use it. Recently I became aware that I don't fully understand it. There are many descriptions of stencil test on the Internet, but none of them definitely answered my questions in the way I would like them to be answered. I thought that a piece of pseudocode would explain it better than words, so here is my explanation of the stencil test.

Let's choose Direct3D 11 as our graphics API. Other APIs have similar sets of parameters. D3D11 offers following configuration parameters for stencil test:

Structure D3D11_DEPTH_STENCIL_DESC:

BOOL StencilEnable
UINT8 StencilReadMask
UINT8 StencilWriteMask
D3D11_DEPTH_STENCILOP_DESC FrontFace
D3D11_DEPTH_STENCILOP_DESC BackFace

Structure D3D11_DEPTH_STENCILOP_DESC:

D3D11_STENCIL_OP StencilFailOp
D3D11_STENCIL_OP StencilDepthFailOp
D3D11_STENCIL_OP StencilPassOp
D3D11_COMPARISON_FUNC StencilFunc

Parameter passed to ID3D11DeviceContext::OMSetDepthStencilState method:

UINT StencilRef

How do they work? If you render pixel (x, y) and you have current value of stencil buffer available as:

UINT8 Stencil[x, y]

Then pseudocode for stencil test and write could look like below. First, one of two sets of parameters is selected depending on whether current primitive is front-facing or back-facing:

if(StencilEnable)
{
    if(primitive has no front and back face, e.g. points, lines)
      StencilOpDesc = FrontFace
    else
    {
       if(primitive is front facing)
          StencilOpDesc = FrontFace
       else
          StencilOpDesc = BackFace
   }

Then, a test is performed:

    StencilTestPassed =
       (StencilRef & StencilReadMask) StencilOpDesc.StencilFunc
       (Stencil[x, y] & StencilReadMask)

StencilOpDesc.StencilFunc is a comparison operator that can be one of possible enum values: D3D11_COMPARISON_NEVER, LESS, EQUAL, LESS_EQUAL, GREATER, NOT_EQUAL, GREATER_EQUAL, ALWAYS. I think this is quite self-explanatory.

Notice how StencilRef is on the left side of comparison operator and current stencil buffer value is on the right.

Both StencilRef and current stencil buffer value are ANDed with StencilReadMask before comparison.

Next, based on the result of this test, as well as result of depth-test aka Z-test (which is out of scope of this article), an operation is selected:

    if(StencilTestPassed)
    {
       if(DepthTestPassed)
          Op = StencilOpDesc.SencilPassOp
       else
          Op = StencilOpDesc.StencilDepthFailOp
    }
    else
        Op = StencilOpDesc.StencilFailOp

Op is another enum that controls a new value to be written to stencil buffer. It can be one of:

    switch(Op)
    {
    case D3D11_STENCIL_OP_KEEP: NewValue = Stencil[x, y]
    case D3D11_STENCIL_OP_ZERO: NewValue = 0
    case D3D11_STENCIL_OP_REPLACE: NewValue = StencilRef
    case D3D11_STENCIL_OP_INCR_SAT: NewValue = min(Stencil[x, y] + 1, 0xFF)
    case D3D11_STENCIL_OP_DECR_SAT: NewValue = max(Stencil[x, y] - 1, 0)
    case D3D11_STENCIL_OP_INVERT: NewValue = ~Stencil[x, y]
    case D3D11_STENCIL_OP_INCR: NewValue = Stencil[x, y] + 1 // with 8-bit wrap-around
    case D3D11_STENCIL_OP_DECR: NewValue = Stencil[x, y] - 1 // with 8-bit wrap-around
    }

Finally, the new value is written to the stencil buffer. Notice how only those bits are changed that are included in StencilWriteMask. Others remain unchanged.

    Stencil[x, y] =
       (Stencil[x, y] & ~StencilWriteMask) |
       (NewValue & StencilWriteMask)
}

Now as we have all this explained in a very strict way using code, let me answer doubts I had before understanding this, in form of a FAQ.

Q: Are there no separate flags to enable stencil test and stencil write?

A: No. There is only one flag StencilEnable to enable all this functionality.

Q: So how to use only one and not the other?

A: You can find specific set of settings to do that.

To perform only stencil test and not write, set StencilEnable to true, StencilFunc to the comparison function you need and set all *Op to KEEP or alternatively set StencilWriteMask to 0 to disable any modifications to stencil buffer.

To perform only stencil write and not stencil test, set StencilEnable to true, all *Op and StencilWriteMask to values you need and set StencilFunc to ALWAYS to make the stencil test always passing.

Q: Is the StencilRef value also masked by StencilReadMask?

A: Yes. As you can see in the code, it is also ANDed with StencilReadMask, just as the previous value from stencil buffer. You don't need to provide it "pre-masked". (Comparison to "premultipled alpha" comes to my mind...)

In other words, we could say that only bits indicated by StencilReadMask from both sides participate in comparison.

Q: What are stencil value bits replaced to in REPLACE Op mode?

A: They are replaced with StencilRef value - the same that was used in comparison.

Q: Why is it the same StencilRef value as used for comparison, not separate one?

A: I don't know. There is separate StencilReadMask and StencilWriteMask. They could have provided separate "StencilReadRef" and "StencilWriteRef" - but for some reason the didn't :P

Q: What value is incremented/decremented when Op in INCR*, DECR*?

A: It's the original value read from stencil buffer, not masked or shifted in relation to StencilReadMask or StencilWriteMask. Which means it doesn't make much sense to use these ops if your StencilWriteMask looks like e.g. 0xF0 - masks out least significant bits.

Q: Is depth buffer updated when stencil test fails?

A: No. Failing stencil test means that the pixel is discarded, so Z-buffer is not updated and color is not written or blended to render targets.

On the other hand, failing Z-test can cause stencil buffer to be updated when you use StencilDepthFailOp other than KEEP.

If I misunderstood something and some of the information in this article is wrong, please let me know by e-mail or comment below.

Comments (0) | Tags: directx graphics | Author: Adam Sawicki | Share

21:48
Thu
17
Aug 2017

Visual C++: IntelliSense Versus Macros

When you code in C++ using Visual Studio, you may meet following problem: Your code uses preprocessor directives that depend on some macro that is defined elsewhere, e.g. in one of CPP files including the header file you write, and so IntelliSense gets lost and stops working, or even completely grays out that part of your code as inactive. For example:

// Some code...

#ifdef EXTERNALLY_DEFINED_MACRO

// Some code where IntelliSense stops working...

#endif

I just found a solution to that. It turns out there is a special macro predefined when code is processed by Visual Studio IntelliSense. It's called just __INTELLISENSE__. By using it, you can change parts of your code as seen by IntelliSense parser, e.g. define some macros, without influencing logic seen by the compiler. For example:

#ifdef __INTELLISENSE__
#define EXTERNALLY_DEFINED_MACRO
#endif

// Some code...

#ifdef EXTERNALLY_DEFINED_MACRO

// Some more code where IntelliSense is working again...

#endif

Comments (0) | Tags: c++ visual studio | Author: Adam Sawicki | Share

18:27
Mon
07
Aug 2017

Understanding Vulkan objects

An important part of learning the Vulkan® API – just like any other API – is to understand what types of objects are defined in it, what they represent and how they relate to each other. To help with this, we’ve created a diagram that shows all of the Vulkan objects and some of their relationships, especially the order in which you create one from another.

Read more: Understanding Vulkan objects @ GPUOpen

Comments (0) | Tags: graphics vulkan | Author: Adam Sawicki | Share

22:38
Wed
02
Aug 2017

Few organizational advice for game jams

I have participated in Slavic Game Jam 2017. I would like to share few thoughts that came to my mind during the event and especially during presentations.

Participation in a game jam is like any gamedev project, just on a small scale. All the rules of a successful gamedev project apply. All the rules of doing a software project apply. You need a good idea for a game, so any method of coming up with ideas (like brainstorming) may help. You need the code, so good programming practices apply as well, so you can implement features fast and not drown in spaghetti code or hard to fix bugs in the middle of the project. Experience in game design and level design is useful. Skill in making good game graphics and sound is essential as well. Some project management is needed too. Even the wisdom about work-life balance apply, because having too little sleep makes you less productive the other day (coffee or energy drinks can help a little bit though :)

There are many books about these topics. What I would like to focus on here is something different - some basic organizational things that can have decisive influence on your performance during the jam. Even if you are a great game developer, you won't deliver a good game (or win, if there is a competition) if you fail on some of these basic topics. They are related to both development process, as well as presentation on a big screen.

1. Come prepared. I don't mean making a game in advance and only adjusting it to the theme during the jam. I mean setting up some basic software environment. If you already have your team, or at least some friends who you plan to team up with, meet together before the jam, decide what technologies and tools you are going to use and set them up. This will save you a lot of time during the event.

  • First, launch the goddamn Windows Update so it won't try to apply 1 GB of pending mandatory updates when the time is critical.
  • Then download and install your development environment, like Visual Studio, Unity, or Unreal Engine. These tend to be quite big, so it may take some time.
  • Setup some way of collaboration over network, like SVN, Git, or Perforce.
  • Also setup some way of communication, so you can easily exchange messages, links, and files. It may be Skype, Facebook, Slack, Discord, OneDrive, Dropbox, Google Drive etc. Login to these services in advance to be sure you won't need to recover forgotten passwords during the jam.
  • You would also need some content creation tools installed (Photoshop/GIMP/Krita, 3ds Max/Maya/Blender, Audacity). This refers to programmers as well.

2. Take as much hardware and cables with you as you can. You never know what you or other team members may need.

  • Take extension cord - there may be not enough power sockets on the party place.
  • Take Ethernet cables - there may be a chance to connect to wired network, which will certainly work better than overloaded WiFi.
  • Take headphones with a microphone - you will need to test sound in your game, possibly also record sound effects, or just separate yourself from loud neighbours next table who play some annoying music all the time :)
  • Take a router if you have one - setting up a WiFi + Ethernet LAN for your team only will make it more reliable than connecting every computer independently to the main network.
  • Take a Bluetooth loudspeaker if you have one. (If you don't have, I recommend ordering Kingone K5 from AliExpress - it's a cheap and good one :) It could help you present your game if there is no better option available. Any external speaker would be louder and have better quality than the one built-in to your laptop.
  • Take a monitor if you can. Developing with additional, bigger screen connected is much better than using only the display of you laptop.
  • Take VGA/HDMI/DisplayPort cables and converters. You never know when it may help you or someone else to connect to a monitor or to the projector.

3. Finish early. It doesn't mean you need to stop polishing your game long before the deadline. It means you should strive to have a playable game many hours before the deadline, test it as early and as often as possible, and make first build that you could potentially submit at least one hour before the time is up. Maybe you will crunch and apply critical fixes and improvements to your game in the last moment, but your shouldn't count on that. Maybe the organizers will extend deadline by additional hour, but you shouldn't rely on that either. Even something as silly as compressing your game build to a ZIP file on an old laptop can take unexpectedly long time and make you miss the deadline. If you need to upload the game somewhere on the Internet, keep in mind that everyone is going to do this at the same time, so the transfer may be very slow.

4. Focus on making your game looking good during the few-minutes presentation of you playing it. That's how the game will be seen and judged. Making it fun to play for others or fun to play for many hours is a secondary goal. Of course I don't mean cheating like preparing a prerecorded video. I just mean that you don't need to have 20 levels. It's OK to have enough gameplay for just few minutes, like only a single level. It's even better when the game is fast paced and can be finished during the presentation. You may also cheat just a little, like make a keyboard shortcut for invincibility, advancing to next level or showing final credits screen.

5. Make your game easy to remember and recognize. Sophisticated or generic name and content will make people forget about it. Even if there is a list and an order of presenting games, there is often some chaos happening during presentations. Some games have technical difficulties, some teams just give up, and so viewers may be confused about which game is which. If you design your whole game around a single, simple theme (like "a butterfly") and include it everywhere: in game title, logo/menu screen, and in the graphics visible during gameplay, then everyone will be able to easily identify it and so to vote for it. You want them to later say "I liked that game about the butterfly."

6. Give your game build folder/archive some meaningful name. It should contain the title of your game, possibly the name of your team and preferably some ordinal version number. I've seen game builds called "Build.zip". That's a very bad idea. I know that for you this is a build of THE game, but for others it's just one of the games and so they need to be able to easily identify which one is it. (BTW Same rule applies to the file with your resume that you send to potential employers - don't call it "CV.pdf" :) On the other hand, version number is for you. Believe me, there will be more than one version. Calling any of these "final" is not a good idea, because you will end up with "final final", "really final" etc. :) So it's better to call your game build something like "TeamName - GameTitle v01.zip".

7. Prepare your game for difficult technical conditions during presentation. I've written separate blog post about shapes and colors that you should use: 3 Rules to Make You Image Looking Good on a Projector. Here I would like to add that you should test your game on various resolutions. Projectors tend to have small resolutions. You can also meet problems with sound (too quiet or not working at all), so make sure your game is attractive even without it.

8. Use some margin when displaying things on the screen. It is also known as "safe area". In other words, don't put critical information (like GUI elements) near the edges of the screen. It may happen that the projector is not setup correctly and your image will be cropped, making these things invisible. Same applies to time domain as well as to spatial domain: Don't show important content during first three seconds of your game. Leave some "time margin". Projector may need some time to switch to new source and resolution, so viewers may not be able to see the beginning of your game.

9. Control sound volume of your game. If you learned a little bit about giving speeches, you probably know already that you should speak loudly, slowly and clearly. When you present a game, there is another level of difficulty, because the music and sound effects from your game are played at the same time as you speak. Be aware of how loud they are so that viewers can hear them, but also can hear you speaking.

10. Remove all the distractions that your operating system may experience during the presentation. Receiving notification about incoming Skype call in the middle of your presentation would look funny, but it definitely won't increase your chances to win. Same applies to Windows deciding to install new updates in the worst possible moment on antivirus slowing down your system because it just started to scan your entire hard drive. So for the presentation:

  • Exit any messaging apps and all other apps that you don't need.
  • Turn off all network connectivity: disable WiFi, enable plane mode.
  • Disable antivirus protection.
  • Switch power settings to maximum performance.

11. Finally, prepare for your talk. Decide who is going to talk and who is going to play the game. Consider how long the presentation should be. Determine what do you want to show, what to tell and in what order. Don't do it spontaneuisly, but rather think about the presentation in advance and discuss it with your team.

Comments (0) | Tags: competitions events | Author: Adam Sawicki | Share

21:49
Sun
30
Jul 2017

Thoughts after Slavic Game Jam 2017

Slavic Game Jam 2017 ended today. I have not only given a talk as a representative of the sponsor company, but I was also allowed to participate in the jam itself, so I teamed up with my old friends, some new friends that I met there and we made a game :) The theme this year was "Unknown". Our idea was to create a game about a drone flying and exploring a cave. You can see it here: This Drone of Mine.

Screenshot:

There were 2 developers in our team, 3 graphical artists and one sound/music artist. We decided to use Unreal Engine 4, despite we had no previous experience in making games with this engine whatsoever, so we needed to learn everything during the jam. We didn't do any C++ - we implemented all game logic visually using Blueprints. We also set up Perforce for collaboration, so some of us needed to learn that as well (I am fortunate to already know this tool pretty well).

We didn't win or even make it to the second round, but it's OK for me - I'm quite happy with the final result. We more or less managed to implement our original idea, as well as show almost all the graphics, sound effects, music and voice-overs, so the artists' work is not wasted. It was lots of fun and we learned a lot during the process.

You can browse all games created during the jam here: Slavic Game Jam 2017 - itch.io.

Comments (0) | Tags: unreal productions competitions events | Author: Adam Sawicki | Share

12:40
Wed
26
Jul 2017

Slavic Game Jam 2017 and my talk

There are many game jams all around the world. Global Game Jam is probably the biggest and most popular one, but it is a global event that happens at different sites. This weekend Slavic Game Jam takes place - the biggest game jam in Eastern Europe, happening in just one site in Warsaw, Poland.

I will be there not only as a participant, but I will also give a talk, because AMD is a sponsor of the event. My talk will be on Friday at 2 PM. Its title is "Rendering in Your Game - Debugging and Profiling". I will provide some basic information and show some tools useful for analyzing performance of a game, including live demo. This information may be useful no matter if you develop your own engine or use existing one like Unity or Unreal. If you have a ticket for the event (tickets are already sold out), I invite you to come on Friday earlier than for the official start of the jam.

Comments (0) | Tags: teaching events competitions graphics | Author: Adam Sawicki | Share

13:03
Tue
11
Jul 2017

Vulkan Memory Allocator - a library on GPUOpen & GitHub

Vulkan is hard. One of the difficulties is the responsibility of a developer to manually manage GPU memory. Various GPU vendors expose various set of memory heaps and types and you need to choose right ones. It is also recommended to allocate larger memory blocks and assign parts of them to individual buffers and images. Now there is a library that simplifies these tasks - a one that I developed as part of my job duties. It has just been announced on GPUOpen blog and published on GitHub:

It is a single-header C++ library with a simple C Vulkan-style interface documented using Doxygen-style comments. It is available on MIT license.

Comments (0) | Tags: graphics vulkan | Author: Adam Sawicki | Share

21:15
Mon
15
May 2017

Vulkan Bits and Pieces: Synchronization in a Frame

One of the things that you must implement when using Vulkan is basic structure of a rendering frame, which consists of filling a command buffer, submitting it to a command queue, acquiring image from swap chain and finally presenting it (order not preserved here). Things happen in parallel, as GPU works asynchronously to the CPU, so you must explicitly synchronize all these things. In this post I’d like to present basic structure of the rendering frame, with all necessary synchronization.

There are actually two objects that we must take care of simultaneously. First is command buffer. One time it is being filled on the CPU by using vkBeginCommandBuffer, vkEndCommandBuffer and everything in between (like starting render passes and posting all the vkCmd* commands). Another time (after being submitted to a queue using vkQueueSubmit) it waits for execution or it is being executed on the GPU. These things cannot happen at the same time, so we need synchronization. Because it’s a GPU-to-CPU synchronization, we use fence for that. It’s better to have multiple command buffers and all these synchronization objects, stored in array and used in a round-robin fashion (with help of index variable) because this way one can be filled while the other one is consumed at the same time.

  • When submitting command buffer to the queue near the end of a frame using vkQueueSubmit, we specify a fence that will be signaled when the command buffer finishes executing.
  • At the beginning of next frame, we wait for fence of the command buffer we want to use this time (better to take different one) to become signaled, using vkWaitForFences.
  • Only then we can call vkBeginCommandBuffer, all rendering commands and finally vkEndCommandBuffer.
  • We also need to manually reset the fence of our command buffer using vkResetFences anywhere between vkWaitForFences and vkQueueSubmit.

Second object is an image acquired from the swapchain. One time it is being rendered to during command buffer execution. Another time it is presented on the screen. Again, we need to synchronize it so these states happen in a sequence. Because it’s a GPU-to-GPU synchronization, we use semaphores for that. There are multiple images as well because swapchain consists of several of them (not necessarily used in round-robin fashion, we need to obtain index of a new image using vkAcquireNextImageKHR function, so there is separate imageIndex).

  • When acquiring fresh image from swapchain (or actually just its index) using vkAcquireNextImageKHR, it’s better to do it as late as possible in a frame (not at the beginning of any rendering, but right before we are going to do final postprocessing pass and render to that image), because this function may block.
  • Despite that, after the function returns, swapchain image at returned index is still not ready. We specify a semaphore as argument to this function that will become signaled when the image is really eligible for use. This is the first semaphore, called g_ImageAvailableSemaphores below.
  • We specify the same semaphore from g_ImageAvailableSemaphores among pWaitSemaphores when calling vkQueueSubmit, so this function waits for the semaphore to become signaled before submitting command buffer that actually needs this image.
  • As another argument to vkQueueSubmit (or rather VkSubmitInfo structure) called pSignalSemaphores we specify second semaphore, which I named g_RenderFinishedSemaphores. This one will become signaled after submitted command buffer finishes execution.
  • That second semaphore from g_RenderFinishedSemaphores is in turn passed as pWaitSemaphores member of VkPresentInfoKHR structure when calling vkQueuePresentKHR, so that the image is presented only when rendering to it finishes.

Having following objects already initialized:

VkDevice g_Device;
VkQueue g_GraphicsQueue;
VkSwapchainKHR g_Swapchain;
VkImageView g_SwapchainImageViews[MAX_SWAPCHAIN_IMAGES];

const uint32_t COUNT = 2;
uint32_t g_NextIndex = 0;
VkCommandBuffer g_CmdBuf[COUNT];
VkFence g_CmdBufExecutedFences[COUNT]; // Create with VK_FENCE_CREATE_SIGNALED_BIT.
VkSemaphore g_ImageAvailableSemaphores[COUNT];
VkSemaphore g_RenderFinishedSemaphores[COUNT];

Structure of a frame may look like this:

uint32_t index = (g_NextIndex++) % COUNT;
VkCommandBuffer cmdBuf = g_CmdBuf[index];

vkWaitForFences(g_Device, 1, &g_CmdBufExecutedFences[index], VK_TRUE, UINT64_MAX);

VkCommandBufferBeginInfo cmdBufBeginInfo = { VK_STRUCTURE_TYPE_COMMAND_BUFFER_BEGIN_INFO };
cmdBufBeginInfo.flags = VK_COMMAND_BUFFER_USAGE_ONE_TIME_SUBMIT_BIT;
vkBeginCommandBuffer(cmdBuf, &cmdBufBeginInfo);

// Post some rendering commands to cmdBuf.

uint32_t imageIndex;
vkAcquireNextImageKHR(g_Device, g_Swapchain, UINT64_MAX, g_ImageAvailableSemaphores[index], VK_NULL_HANDLE, &imageIndex);

// Post those rendering commands that render to final backbuffer:
// g_SwapchainImageViews[imageIndex]

vkEndCommandBuffer(cmdBuf);

vkResetFences(g_Device, 1, &g_CmdBufExecutedFences[index]);

VkPipelineStageFlags submitWaitStage = VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT;
VkSubmitInfo submitInfo = { VK_STRUCTURE_TYPE_SUBMIT_INFO };
submitInfo.waitSemaphoreCount = 1;
submitInfo.pWaitSemaphores = &g_ImageAvailableSemaphores[index];
submitInfo.pWaitDstStageMask = &submitWaitStage;
submitInfo.commandBufferCount = 1;
submitInfo.pCommandBuffers = &cmdBuf;
submitInfo.signalSemaphoreCount = 1;
submitInfo.pSignalSemaphores = &g_RenderFinishedSemaphores[index];
vkQueueSubmit(g_GraphicsQueue, 1, &submitInfo, g_CmdBufExecutedFences[index]);

VkPresentInfoKHR presentInfo = { VK_STRUCTURE_TYPE_PRESENT_INFO_KHR };
presentInfo.waitSemaphoreCount = 1;
presentInfo.pWaitSemaphores = &g_RenderFinishedSemaphores[index];
presentInfo.swapChainCount = 1;
presentInfo.pSwapchains = &g_Swapchain;
presentInfo.pImageIndices = &imageIndex;
vkQueuePresentKHR(g_GraphicsQueue, &presentInfo);

At least that’s what I currently believe is the correct and efficient way. If you think I’m wrong, please leave a comment or write me an e-mail.

That’s just one aspect of using Vulkan. There are still more things to do before you can even render your first triangle, like transitioning swapchain image layout between VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL and VK_IMAGE_LAYOUT_PRESENT_SRC_KHR.

Comments (0) | Tags: vulkan graphics | Author: Adam Sawicki | Share

Older entries >

STAT NO AD [Stat] [Admin] [STAT NO AD] [pub] [Mirror] Copyright © 2004-2017 Adam Sawicki
Copyright © 2004-2017 Adam Sawicki