All blog entries, ordered from most recent. Entry count: 1153.
# D3d12info - Printing D3D12 GPU Information to Console
My next little hobby project is D3d12info. It is a Windows console program that prints all the information it can get about the current GPU installed in the system, as seen through Direct3D 12 API. It also fetches additional information through AMD GPU Services (on AMD cards), NVAPI (on NVIDIA cards), Vulkan, and WinAPI, mostly to identify the current version of the graphics driver and Windows system. I will try to keep it updated to the latest Agility SDK, to query it for support for the latest hardware features of the graphics card.
The tool can be compared to DirectX Caps Viewer you can find in your Windows SDK installation under path "c:\Program Files (x86)\Windows Kits\10\bin\*\x64\dxcapsviewer.exe" in terms of the information extracted from DX12. However, instead of GUI, it provides a command-line interface, which makes it similar to the "vulkaninfo" tool. Information is printed in a human-readable text format by default, but JSON format can be selected by providing
-j parameter, making it suitable for automated processing. Additional command-line parameters are supported, including a choice of the GPU if there are many installed in the system. Launch it with parameter
-h to see the command-line syntax.
In the future, I would like to extend it with a web back-end that would gather a database of various GPUs and driver versions, like Vulkan Hardware Database does for Vulkan, and to make it browsable online. As far as I know, there is no such database for D3D12 at the moment. Best we have right now are the tables about Direct3D Feature Levels on Wikipedia. But that will require a lot of learning from me, as I am not a good web developer, so I will think about it after my vacation :)
# SimplySaveAs - a Small Tool for Perforce Users
In my old article "Tips for Using Perforce" I promised to dedicate a separate article to what I described there in point 10, so here it is. First, let's talk about the problem. When using a version control system, e.g. Git or Perforce, you surely sometimes inspect the history of a file, to see who changed it, when, and what exactly has been changed throughout its previous versions. GUI clients of such systems offer convenient views to compare text files, but sometimes you may just need to save an old version of the file on your disk - not to update it in your main working copy, but to export it to a separate folder.
In some applications, this is easy. For example, Git Extensions, my favorite GUI client for Git, offers File History window that shows revision history of a selected file. In this window, we can right-click on a specific revision from the list and click "Save as" to export that specific version of the file to a new location on disk.
Unfortunately, in Perforce there is no such command. There is History tab that shows the list of revisions of a selected file. It also offers context menu under right mouse button to do something with a selected revision, but among the commands to diff etc. there is no "Save As", only "Open With". This one allows us to choose some application and open the file with it, which might be useful in case of text files or some other documents (e.g. DOCX, PDF) that we just want to preview using their dedicated app. But what if it is a binary file, having some non-standard extension, that we just want to export to disk?
Here is where the little tool I developed might be useful. SimplySaveAs is a Windows program that you can use to "Open With" a file in Perforce. All it does is show a "Save As" window that lets you choose a place and name where the file should be saved on your disk. This way, the external tool provides the command missing in Perforce visual client (P4V).
The program doesn't need any installation. The repository linked above also contains full source code in C++, but all you need to download is just the file "SimplySaveAs.exe". You can put it in any location on your disk. I like to have a separate directory "C:\PortablePrograms\", where I put all the portable applications that don't need installation, like this one.
First time you want to use it, you need to click on Open With > Choose Application... in Perforce and select "SimplySaveAs.exe" from your disk.
On every next use, Perforce will remember this program and show it available in the context menu, so you can just click Open With > SimplySaveAs.
How does it work? As you may know, opening a file with a program actually needs to save the file on a disk somewhere, likely in a temporary folder, and then launching the program with a path to this file passed as a command-line parameter. This is also what Perforce does when we use "Open With" command. So all my program does is ask the user for a target path and then copy the file from the source, temporary location read from the parameter to the target location selected by the user.
# An Idea for Visualization of Frame Times
In real-time graphics applications like games, we usually measure performance as the average number of frames per second (FPS). Showing this average is a good estimate of how well the application performs, how heavy is the per-frame workload, how fast is the system where it executes, and, most importantly, whether the performance suffices for showing a smooth, good looking animation, as opposed to a "slideshow". But this is not a complete story. If some frames take an exceptionally long time, then even if others are very short, an unpleasant hitching may be visible to the player, while average FPS still looks fine. Therefore it is worth to visualize duration of individual frames on a graph, to see if they are stable.
One idea for such a graph is to draw a line connecting data points (frames), where X axis is the frame index and Y axis is the frame duration (dt), like on these pictures: "GPU Reviews: Why Frame Time Analysis is important", page 3. If such graph is shown in real time, there is one problem with it: it doesn't move at a constant pace, as the horizontal axis is expressed in frames, not seconds, so an exceptionally long frame will have the same width as super short frame. As the result, the graph will move faster the higher is the framerate.
A better idea might be to move data points horizontally with time, so that a very long frame will generate a spike on the graph with previous point many pixels away on the horizontal axis. This is what AMD OCAT tool seems to be doing. However, it results in a long, oblique line on the graph.
Overlay shown by OCAT tool
Some time ago I came up with another kind of graph. It shows every frame as a rectangle, with all its parameters: width, height, and color, dependent on the frame duration:
frameWidth = dt / (1/120). Rectangle left and right edges also need to be aligned to
ceil(), respectively, so that every frame is visible as at least 1-pixel wide.
frameHeightFactor = (log2(dt) - log2(1/120)) / (log2(1/15) - log2(1/120)). This factor then needs to be clamped to 0..1 and stretched to some range of minimum..maximum heights, depending on the intended looks of the graph, e.g. 2..64 pixels. This way, frames of a game running at 120 FPS will have minimum height, 60 FPS will be at 33%, 30 FPS - 66%, and for 15 FPS or less they will have maximum height.
I think that with this kind of graph, both average framerate and outstanding extra-long frames are clearly visible at a first glance. You can see full example source code doing all this, implemented in C++ here: Game.cpp - RegEngine - sawickiap - GitHub. It uses GLM for math functions and Dear ImGui for 2D rendering.
For example, a game with V-sync on, running at steady 60 FPS, has the graph looking like this:
While a heavier GPU workload making the game running at around 38 FPS looks like this. The graph also shows an extra-long frame that froze the entire game because of loading something from the disk, and another hitch caused by pressing PrintScreen key.
# A Metric for Memory Fragmentation
In this article, I would like to discuss the problem of memory fragmentation and propose a formula for calculating a metric telling how badly the memory is fragmented.
The problem can be stated like this:
So it is a standard memory allocation situation. Now, I will explain what do I mean by fragmentation. Fragmentation, for this article, is an unwanted situation where free memory is spread across many small regions in between allocations, as opposed to a single large one. We want to measure it and preferably avoid it because:
VirtualAlloc) and sub-allocating them for the user's allocation requests, high fragmentation my require allocating another block to satisfy the request, making the program using more system memory than really needed.
A solution to this problem is to perform defragmentation - an operation that moves the allocations to arrange them next to each other. This may require user involvement, as pointers to the allocations will change then. It may also be a time-consuming operation to calculate better places for the allocations and then to copy all their data. It is thus desirable to measure the fragmentation to decide when to perform the defragmentation operation.
# Vulkan Memory Allocator 3.0.0 and D3D12 Memory Allocator 2.0.0
Yesterday we released new major version of Vulkan Memory Allocator 3.0.0 and D3D12 Memory Allocator 2.0.0, so if you are coding with Vulkan or Direct3D 12, I recommend to take a look at these libraries. Because coding them is part of my job, I won't describe them in detail here, but just refer to my article published on GPUOpen.com: "Announcing Vulkan Memory Allocator 3.0.0 and Direct3D 12 Memory Allocator 2.0.0". Direct links:
Vulkan Memory Allocator
D3D12 Memory Allocator
# Untangling Direct3D 12 Memory Heap Types and Pools
Those of you who follow my blog can say that I am boring, but I can't help it - somehow GPU memory allocation became my thing, rather than shaders and effects, like most graphics programmers do. Some time ago I've written an article "Vulkan Memory Types on PC and How to Use Them" explaining what memory heaps and types are available on various types of PC GPUs, as visible through Vulkan API. This article is a Direct3D 12 equivalent, in a way.
With expressing memory types as they exist in hardware, D3D12 differs greatly from Vulkan. Vulkan defines a 2-level hierarchy of memory "heaps" and "types". A heap represents a physical piece of memory of a certain size, while a type is a "view" of a specific heap with certain properties, like cached versus uncached. This gives a great flexibility in how different GPUs can express their memory, which makes it hard for the developer to ensure he selects the optimal one on any kind of GPU. Direct3D 12 offers a fixed set of memory types. When creating a buffer or a texture, it usually means selecting one of the 3 standard "heap types":
D3D12_HEAP_TYPE_DEFAULTis intended for resources that are directly and frequently accessed by the GPU, so all render-target, depth-stencil textures, other textures, vertex and index buffers used for rendering etc. go there. It typically ends up in the memory of the graphics card.
D3D12_HEAP_TYPE_UPLOADrepresents a memory that we can
Mapand fill in from the CPU and then copy or directly read on the GPU. It must be created and stay in
D3D12_RESOURCE_STATE_GENERIC_READ. It ends up in system RAM and is uncached but write-combined, which means it should only be written, never read.
D3D12_HEAP_TYPE_READBACKrepresents the least frequently used type of memory that the GPU can write or copy to, while the CPU can
Mapand read. It must be created in
D3D12_RESOURCE_STATE_COPY_DEST. It ends up in system RAM and is cached, which means random access from our CPU code is fine for it.
So far, so good... D3D12 seems to simplify things compared to Vulkan. You can stop here and still develop a decent graphics program, but if you make a game with an open world and want to stream your content in runtime, so you need to check what memory budget is available to your app, or you want to take advantage of integrated graphics where memory is unified, you will find out that things are not that simple in this API. There are 4 different ways that D3D12 calls various memory types and they are not so obvious when we compare systems with discrete versus integrated graphics. The goal of this article is to explain and untangle all this complexity.
# Direct3D 12: Long Way to Access Data
One reason the new graphics APIs – Direct3D 12 and Vulkan – are so complicated is that there are so many levels of indirection in accessing data. Having some data prepared in a normal C++ CPU variable, we have to go a long way before we can access it in a shader. Some time ago I’ve written an article: “Vulkan: Long way to access data”. Because I have a privilege of working with both APIs in my everyday job, I thought I could also write a D3D12 equivalent, so here it is. If you are a graphics or game programmer who already know the basics of D3D12, but may feel some confusion about all the concepts defined there, this article is for you.
Here is a diagram, followed by an explanation of all its elements:
# First Look at New D3D12 Enhanced Barriers
This will be pretty advanced or at least intermediate article. It assumes you know Direct3D 12 API. Some references to Vulkan may also appear. I am writing it because I just found out that yesterday Microsoft announced an upcoming big change in D3D12: Enhanced Barriers. It will be an addition to the API that provides a new way to do barriers. Considering my professional interests, this looks very important to me and also quite revolutionary. This article summarizes my first look and my thoughts about this new addition to the API or, speaking in terms of modern internet, my "unboxing" or "reaction" ;)
Bill Kristiansen, the author of the article linked above, written that currently only the software-simulated WARP device supports the new enhanced barriers. Support in real GPU drivers will come at later time. The new barriers can replace the old way of doing them, but both will still be available and can also be mixed in one application. Which means this is not as big revolution to turn our DirectX development upside down - we can switch to them gradually. For now we can just prepare ourselves for the future by studying the interface (which I do in this article) and testing some code using WARP device.
UPDATE 2021-12-10: I just learned that Microsoft actually did publish a documentation of the new API: Enhanced Barriers @ DirectX-Specs, so I recommend to go see it before reading this article.