Entries for tag "graphics", ordered from most recent. Entry count: 57.
# Debugging Vulkan driver crash - equivalent of NVIDIA Aftermath
New generation, explcit graphics APIs (Vulkan and DirectX 12) are more efficient, involve less CPU overhead. Part of it is that they don't check most errors. In old APIs (Direct3D 9, OpenGL) every function call was validated internally, returned success of failure code, while driver crash indicated a bug in driver code. New APIs, on the other hand, rely on developer doing the right thing. Of course some functions still return error code (especially ones that allocate memory or create some resource), but those that record commands into a command buffer just return
void. If you do something illegal, you can expect undefined behavior. You can use Validation Layers / Debug Layer to do some checks, but otherwise everything may work fine on some GPUs, you may get incorrect result, or you may experience driver crash or timeout (called "TDR"). Good thing is that (contrary to old Windows XP), crash inside graphics driver doesn't cause "blue screen of death" or machine restart. System just restarts graphics hardware and driver, while your program receives
VK_ERROR_DEVICE_LOST code from one of functions like
vkQueueSubmit. Unfortunately, you then don't know which specific draw call or other command caused the crash.
NVIDIA proposed solution for that: they created NVIDIA Aftermath library. It lets you (among other things) record commands that write custom "marker" data to a buffer that survives driver crash, so you can later read it and see which command was successfully executed last. Unfortunately, this library works only with NVIDIA graphics cards and only in D3D11 and D3D12.
I was looking for similar solution for Vulkan. When I saw that Vulkan can "import" external memory, I thought that maybe I could use function
vkCmdFillBuffer to write immediate value to such buffer and this way implement the same logic. I then started experimenting with extensions: VK_KHR_get_physical_device_properties_2, VK_KHR_external_memory_capabilities, VK_KHR_external_memory, VK_KHR_external_memory_win32, VK_KHR_dedicated_allocation. I was basically trying to somehow allocate a piece of system memory and import it to Vulkan to write to it as Vulkan buffer. I tried many things:
HeapAlloc and other ways, with various flags, but nothing worked for me. I also couldn't find any description or sample code of how these extensions could be used in Windows to import some system memory as Vulkan buffer.
Everything changed when I learned that creating normal device memory and buffer inside Vulkan is enough! It survives driver crash, so its content can be read later via mapped pointer. No extensions required. I don't think this is guaranteed by specification, but it seems to work on both AMD and NVIDIA cards. So my current solution to write makers that survive driver crash in Vulkan is:
VkDeviceMemoryfrom memory type that has
HOST_VISIBLE + HOST_COHERENTflags. (This is system RAM. Spec guarantees that you can always find such type.)
vkMapMemoryto get raw CPU pointer to its data.
VK_BUFFER_USAGE_TRANSFER_DST_BITand bind it to that memory using
vkCmdFillBufferto write immediate data with your custom "markers" to the buffer.
VK_ERROR_DEVICE_LOST), read data under the pointer to see what marker values were successfully written last and deduce which one of your commands might cause the crash.
There is also a new extension available on latest AMD drivers: VK_AMD_buffer_marker. It adds just one function:
vkCmdWriteBufferMarkerAMD. It works similar to beforementioned
vkCmdFillBuffer, but it adds two good things that let you write your markers with much better granularity:
vkCmdFillBuffermust be called outside render pass.
I created a simple library that implements all this logic under easy interface, which I called "Vulkan AfterCrash". All you need to use it is just this single file: VulkanAfterCrash.h.
Update 4 April 2018: In GDC 2018 talk "Aftermath: Advances in GPU Crash Debugging (Presented by NVIDIA)", Alex Dunn announced that a Vulkan extension from NVIDIA will also be available, called VK_NV_device_diagnostic_checkpoints, but I can see it's not publicly accessible yet.
Update 1 August 2018: Documentation for extension VK_NV_device_diagnostic_checkpoints has been published in Vulkan version 1.1.82.
Update 12 September 2018: I've created similar, portable library for Direct3D 12 - see blog post "Debugging D3D12 driver crash".
# Vulkan Memory Allocator 2.0.0
At Game Developers Conference (GDC) last week I released final version 2.0.0 of Vulkan Memory Allocator library. It is now well documented and thanks to contributions from open source community it compiles and works on Windows, Linux, Android, and MacOS. Together with it I released VMA Dump Vis - a Python script that visualizes Vulkan memory on a picture. From now on I will continue incremental development on "development" branch and occasionally merge to "master". Feel free to contact me if you have any feedback, suggestions or if you find a bug.
# Driver source code is not what you may think
AMD just released Open Source Driver for Vulkan, with source code available on GitHub under MIT license. That’s a good opportunity to explain how drivers source look like. When I was developing graphics driver (I no longer do - I now have quite different position), people kept asking me "How is it like to code in C?" or "How is it like to code in kernel space?", unaware that none of it was true in my case.
Many developers think that coding a driver is some hardcore, low-level stuff. Maybe it is true for some small drivers in embedded systems world. What they may not know is that a modern PC graphics driver (and I bet other kinds of drivers as well) is a very complex beast, with only a small portion of it working in kernel mode and only a small portion (if any) written in plain old C or assembly. Majority of the code is just normal C++ with classes, virtual functions and everything. It is compiled into normal user-mode DLL libraries that get loaded into address space of a game. Sure the code may be a little bit different than standard desktop apps. It may be optimized for performance, as well as memory usage. It may not use exceptions, Boost or STL. It may have to handle out-of-memory errors gracefully. But it’s still a modern, object-oriented code that uses (relatively) new language features like
# Rendering Optimization - My Talk at Warsaw University of Technology
If you happen to be in Warsaw tomorrow (2017-12-13), I'd like to invite you to my talk at Warsaw University of Technology. On the weekly meeting of Polygon group, this time the first talk will be about about artificial intelligence in action games (by Kacpi), followed by mine about rendering optimization. It will be technical, but I think it should be quite easy to understand. I won't show a single line of code. I will just give some tips for getting good performance when rendering 3D graphics on modern GPUs. I will also show some tools that can help with performance profiling. It will be all in Polish. The event starts at 7 p.m. Entrance is free. See also Facebook event. Traditionally after the talks we all go for a beer :)
# VK_KHR_dedicated_allocation unofficial manual
I wrote a short article that explains how to use Vulkan extenion "VK_KHR_dedicated_allocation". It may be interesting to you if you are a programmer and you code graphics using Vulkan.
Go to article: VK_KHR_dedicated_allocation unofficial manual
# Vulkan Memory Allocator 2.0.0-alpha.3
I just published new version of Vulkan Memory Allocator 2.0.0-alpha.3. I'm quite happy with the quality of this code. Documentation is also updated, so if nothing else, please just go see User guide. I still marked it as "alpha" because I would like to ask for feedback and I may still change everything.
I would like to discuss proposed terminology. Naming things in code is a hard problem in general, and especially as English is not my native language, so please fill free to contact me and propose more elegant names to what I called: allocator, allocation, pool, block, stats, free range, used/unused bytes, own memory, persistently mapped memory, pointer to mapped data, lost allocation (becoming lost, making other lost), defragmentation, and used internally: suballocation, block vector.
# How to Use Vulkan SDK with AppVeyor and Travis CI?
AppVeyor and Travis CI are popular, free web services that provide continuous integration style build and testing environments for open source projects - for Windows and Linux, respectively. They are relatively easy to setup if your project is self-contained, but can be a little bit tricky if you need to install some additional dependencies. Today I successfully finished installing Vulkan SDK in these environments, to test Vulkan Memory Allocator project. Here is how to do it:
In AppVeyor I have everything configured on their website, so you can’t really see configuration file. To download Vulkan SDK, I used
curl --silent --show-error --output VulkanSDK.exe https://vulkan.lunarg.com/sdk/download/188.8.131.52/windows/VulkanSDK-184.108.40.206-Installer.exe
First command downloads and second one installs Vulkan SDK in silent mode. The SDK is installed to C:\VulkanSDK\220.127.116.11 directory.
Next I added an environmental variable in Environment > Environmental variables section:
In my Visual Studio project settings, I added include directory "$(VULKAN_SDK)/Include" and library directory "$(VULKAN_SDK)/Lib". Then I could successfully
#include <vulkan/vulkan.h> and link with vulkan-1.lib.
For Travis CI you can see my current configuration file as: .travis.yml. What I did here is I added few commands to
install section. First there are some
apt-get commands that install some additional libraries, which I took from Getting Started with the Vulkan SDK page:
sudo apt-get -qq update
sudo apt-get install -y libassimp-dev libglm-dev graphviz libxcb-dri3-0 libxcb-present0 libpciaccess0 cmake libpng-dev libxcb-dri3-dev libx11-dev libx11-xcb-dev libmirclient-dev libwayland-dev libxrandr-dev
Then I download and install Vulkan SDK. URL to real file is in the same format as for Windows. I used
wget command for downloading. The file is a self-extracting archive that unpacks SDK content to following subdirectory of current directory: VulkanSDK/18.104.22.168/x86_64.
wget -O vulkansdk-linux-x86_64-22.214.171.124.run https://vulkan.lunarg.com/sdk/download/126.96.36.199/linux/vulkansdk-linux-x86_64-188.8.131.52.run
chmod ugo+x vulkansdk-linux-x86_64-184.108.40.206.run
Finally I issue a command that sets environmental variable pointing to the SDK, to have it available in the code just like on Windows:
Then I needed to configure my project to search for include files in "$(VULKAN_SDK)/include" and library files in "$(VULKAN_SDK)/lib" (directory names are lowercase this time!). Finally I could
#include <vulkan/vulkan.h> and link with libvulkan.so.
# Stencil Test Explained Using Code
I must admit I never used stencil buffer in my personal code. I know it's there available in GPUs and all graphics APIs for years and it's useful for many things, but somehow I never had a need to use it. Recently I became aware that I don't fully understand it. There are many descriptions of stencil test on the Internet, but none of them definitely answered my questions in the way I would like them to be answered. I thought that a piece of pseudocode would explain it better than words, so here is my explanation of the stencil test.
Let's choose Direct3D 11 as our graphics API. Other APIs have similar sets of parameters. D3D11 offers following configuration parameters for stencil test:
Parameter passed to
How do they work? If you render pixel (x, y) and you have current value of stencil buffer available as:
UINT8 Stencil[x, y]
Then pseudocode for stencil test and write could look like below. First, one of two sets of parameters is selected depending on whether current primitive is front-facing or back-facing:
if(primitive has no front and back face, e.g. points, lines)
StencilOpDesc = FrontFace
if(primitive is front facing)
StencilOpDesc = FrontFace
StencilOpDesc = BackFace
Then, a test is performed:
(StencilRef & StencilReadMask) StencilOpDesc.StencilFunc
(Stencil[x, y] & StencilReadMask)
StencilOpDesc.StencilFunc is a comparison operator that can be one of possible enum values:
ALWAYS. I think this is quite self-explanatory.
StencilRef is on the left side of comparison operator and current stencil buffer value is on the right.
StencilRef and current stencil buffer value are ANDed with
StencilReadMask before comparison.
Next, based on the result of this test, as well as result of depth-test aka Z-test (which is out of scope of this article), an operation is selected:
Op = StencilOpDesc.SencilPassOp
Op = StencilOpDesc.StencilDepthFailOp
Op = StencilOpDesc.StencilFailOp
Op is another enum that controls a new value to be written to stencil buffer. It can be one of:
case D3D11_STENCIL_OP_KEEP: NewValue = Stencil[x, y]
case D3D11_STENCIL_OP_ZERO: NewValue = 0
case D3D11_STENCIL_OP_REPLACE: NewValue = StencilRef
case D3D11_STENCIL_OP_INCR_SAT: NewValue = min(Stencil[x, y] + 1, 0xFF)
case D3D11_STENCIL_OP_DECR_SAT: NewValue = max(Stencil[x, y] - 1, 0)
case D3D11_STENCIL_OP_INVERT: NewValue = ~Stencil[x, y]
case D3D11_STENCIL_OP_INCR: NewValue = Stencil[x, y] + 1 // with 8-bit wrap-around
case D3D11_STENCIL_OP_DECR: NewValue = Stencil[x, y] - 1 // with 8-bit wrap-around
Finally, the new value is written to the stencil buffer. Notice how only those bits are changed that are included in
StencilWriteMask. Others remain unchanged.
Stencil[x, y] =
(Stencil[x, y] & ~StencilWriteMask) |
(NewValue & StencilWriteMask)
Now as we have all this explained in a very strict way using code, let me answer doubts I had before understanding this, in form of a FAQ.
Q: Are there no separate flags to enable stencil test and stencil write?
A: No. There is only one flag
StencilEnable to enable all this functionality.
Q: So how to use only one and not the other?
A: You can find specific set of settings to do that.
To perform only stencil test and not write, set
StencilEnable to true,
StencilFunc to the comparison function you need and set all
KEEP or alternatively set
StencilWriteMask to 0 to disable any modifications to stencil buffer.
To perform only stencil write and not stencil test, set
StencilEnable to true, all
StencilWriteMask to values you need and set
ALWAYS to make the stencil test always passing.
Q: Is the StencilRef value also masked by StencilReadMask?
A: Yes. As you can see in the code, it is also ANDed with
StencilReadMask, just as the previous value from stencil buffer. You don't need to provide it "pre-masked". (Comparison to "premultipled alpha" comes to my mind...)
In other words, we could say that only bits indicated by
StencilReadMask from both sides participate in comparison.
Q: What are stencil value bits replaced to in REPLACE Op mode?
A: They are replaced with
StencilRef value - the same that was used in comparison.
Q: Why is it the same StencilRef value as used for comparison, not separate one?
A: I don't know. There is separate
StencilWriteMask. They could have provided separate "StencilReadRef" and "StencilWriteRef" - but for some reason the didn't :P
Q: What value is incremented/decremented when Op in INCR*, DECR*?
A: It's the original value read from stencil buffer, not masked or shifted in relation to
StencilWriteMask. Which means it doesn't make much sense to use these ops if your
StencilWriteMask looks like e.g. 0xF0 - masks out least significant bits.
Q: Is depth buffer updated when stencil test fails?
A: No. Failing stencil test means that the pixel is discarded, so Z-buffer is not updated and color is not written or blended to render targets.
On the other hand, failing Z-test can cause stencil buffer to be updated when you use
StencilDepthFailOp other than
If I misunderstood something and some of the information in this article is wrong, please let me know by e-mail or comment below.