Entries for tag "tools", ordered from most recent. Entry count: 75.
# ShaderCrashingAssert - a New Small Library
Last Thursday (August 17th) AMD released a new tool for post-mortem analysis of GPU crashes: Radeon GPU Detective. I participated in this project, but because this is my personal blog and because it is weekend now, I am wearing my hobby developer hat and I want to present a small library that I developed yesterday:
ShaderCrashingAssert provides an assert-like macro for HLSL shaders that triggers a GPU memory page fault. Together with RGD, it can help with shader debugging.
# D3d12info - Printing D3D12 GPU Information to Console
My next little hobby project is D3d12info. It is a Windows console program that prints all the information it can get about the current GPU installed in the system, as seen through Direct3D 12 API. It also fetches additional information through AMD GPU Services (on AMD cards), NVAPI (on NVIDIA cards), Vulkan, and WinAPI, mostly to identify the current version of the graphics driver and Windows system. I will try to keep it updated to the latest Agility SDK, to query it for support for the latest hardware features of the graphics card.
The tool can be compared to DirectX Caps Viewer you can find in your Windows SDK installation under path "c:\Program Files (x86)\Windows Kits\10\bin\*\x64\dxcapsviewer.exe" in terms of the information extracted from DX12. However, instead of GUI, it provides a command-line interface, which makes it similar to the "vulkaninfo" tool. Information is printed in a human-readable text format by default, but JSON format can be selected by providing
-j parameter, making it suitable for automated processing. Additional command-line parameters are supported, including a choice of the GPU if there are many installed in the system. Launch it with parameter
-h to see the command-line syntax.
In the future, I would like to extend it with a web back-end that would gather a database of various GPUs and driver versions, like Vulkan Hardware Database does for Vulkan, and to make it browsable online. As far as I know, there is no such database for D3D12 at the moment. Best we have right now are the tables about Direct3D Feature Levels on Wikipedia. But that will require a lot of learning from me, as I am not a good web developer, so I will think about it after my vacation :)
# SimplySaveAs - a Small Tool for Perforce Users
In my old article "Tips for Using Perforce" I promised to dedicate a separate article to what I described there in point 10, so here it is. First, let's talk about the problem. When using a version control system, e.g. Git or Perforce, you surely sometimes inspect the history of a file, to see who changed it, when, and what exactly has been changed throughout its previous versions. GUI clients of such systems offer convenient views to compare text files, but sometimes you may just need to save an old version of the file on your disk - not to update it in your main working copy, but to export it to a separate folder.
In some applications, this is easy. For example, Git Extensions, my favorite GUI client for Git, offers File History window that shows revision history of a selected file. In this window, we can right-click on a specific revision from the list and click "Save as" to export that specific version of the file to a new location on disk.
Unfortunately, in Perforce there is no such command. There is History tab that shows the list of revisions of a selected file. It also offers context menu under right mouse button to do something with a selected revision, but among the commands to diff etc. there is no "Save As", only "Open With". This one allows us to choose some application and open the file with it, which might be useful in case of text files or some other documents (e.g. DOCX, PDF) that we just want to preview using their dedicated app. But what if it is a binary file, having some non-standard extension, that we just want to export to disk?
Here is where the little tool I developed might be useful. SimplySaveAs is a Windows program that you can use to "Open With" a file in Perforce. All it does is show a "Save As" window that lets you choose a place and name where the file should be saved on your disk. This way, the external tool provides the command missing in Perforce visual client (P4V).
The program doesn't need any installation. The repository linked above also contains full source code in C++, but all you need to download is just the file "SimplySaveAs.exe". You can put it in any location on your disk. I like to have a separate directory "C:\PortablePrograms\", where I put all the portable applications that don't need installation, like this one.
First time you want to use it, you need to click on Open With > Choose Application... in Perforce and select "SimplySaveAs.exe" from your disk.
On every next use, Perforce will remember this program and show it available in the context menu, so you can just click Open With > SimplySaveAs.
How does it work? As you may know, opening a file with a program actually needs to save the file on a disk somewhere, likely in a temporary folder, and then launching the program with a path to this file passed as a command-line parameter. This is also what Perforce does when we use "Open With" command. So all my program does is ask the user for a target path and then copy the file from the source, temporary location read from the parameter to the target location selected by the user.
# Understanding Graphs in GPUView and RGP
When optimizing performance of a game or some other program, the most important thing is to get hard data first – to profile it using some tools, to see what is happening and where to focus attention. There are many profiling tools available. When talking about graphics, we realize that GPU is really a co-processor that can execute submitted work at its own pace, therefore GPU profiling tools offer a specific type of graph to visualize it. In this article, I will explain how to read this type of graph.
Let's take Radeon GPU Profiler (RGP) as an example. This program is available for free and is compatible with AMD graphics cards. It can capture data from programs that use Direct3D 12 or Vulkan. When we open a capture file and go to Overview > Frame summary tab, we can see a graph like this one:
It may look scary at first glance, but don't worry and stay with me. I will explain everything step-by-step. I don't know if there is any name for this type of graph, so let's call it a "queue graph" because it shows a queue of tasks submitted to the graphics card and executed by it.
The horizontal axis is time, passing in the right direction at a constant pace. The vertical axis is the queue, with the front of the queue on the bottom and items enqueued later stacked on top.
At each point in time, the item on the bottom row is the one currently executing on the GPU. Everything above this row is waiting for its turn. It means that from the graph we can see and measure when a certain piece of work (like D3D12
ExecuteCommandLists call in this example) was enqueued, when it started executing. and how long it took to execute it. The width of the bottom block represents the amount of time that was required to execute. Note that the work item going “down the stairs” has no meaning in itself. It just means something in front of it finished, so the queue ahead is shorter. Only when it ends up in the bottom row, it really starts executing.
Another thing to note is that some items wait in the queue but don't take any significant time to execute. These are simple and quick commands, like the green call to the
Signal function marked here. When everything in front of it completes, it also completes in no time.
We can make more observations from this graph if we consider the fact that games work with frames, each frame executes commands to draw the whole image from clearing background through 3D objects to UI and finishes with a call to the
Present function, marked here in brown color. By looking for this type of item, we can conclude when a new frame begins. For example, in the point "A" the GPU is still executing commands of frame N, while we have all commands for the next frame N+1 enqueued, including its next
Present, and also the commands for frame N+2 are stacking up at the end of the queue. Thus, we can expect the game to have 2 frames of latency in displaying the image.
The same type of graph is used by GPUView - a free tool from Microsoft that can record and display what is happening in the system on a very low level. (The linked article is very old - right now the way to install the tool is to grab Windows Assessment and Deployment Kit (Windows ADK) and a convenient UI for it is UIforETW). As you can see here, both "3D Hardware Queue" of my graphics card and software "Device Context" of a running game show packets of work submitted for rendering.
One important piece of information that we can extract from this graph is that GPU is not busy 100% of the time. GPUView actually shows the number on the right, which is 77.89% for the current view. It means the game is not GPU-bound. Reducing graphics quality settings would not increase framerate (FPS). This often happens when the game does some heavy computations on the CPU or when it reaches 60 FPS and we have V-sync enabled. Here we have the latter case, as we can see moments of vertical synchronization marked as blue lines, while rendering each frame seems to be blocked until that moment.
Note the graph described here is not the same as flame graphs or flame charts, which show a hierarchy of nested things, not a queue. For example, a call stack of function calls.
# Tips for Using Perforce
Version Control Systems are tools that every programmer should use. Among them, Git is probably the most popular one. Some companies use Perforce instead. Whether it is better or worse is hard to tell, but it has its advantages that make it indispensable in some types of projects, like game development. Perforce handles large binary files very well. Even if the files have tens or a hundred of gigabytes, it still works fine. I talk about the size of one local copy here, not the entire repository on the server.
From user’s perspective, Perforce differs greatly from Git or SVN. Not only commands are named differently (e.g. there is “Submit” instead of “Commit”), but the whole concept of “changelists” is something that needs to be well understood to be used efficiently. While working with Perforce for many years in different companies and projects, I learned some good practices that I would like to share here. Writing them down was difficult as they seem obvious to me, but hopefully some of them are not obvious to you so you will learn something new.
1. Paste paths to address bar
Let’s start with a simple one. Perforce window has a text box on the top that resembles address bar in web browsers. It shows the path of the currently selected file or directory in Depot or Workspace tab. It can also accept input.
When you work on some file in another tool and you want to jump quickly to it in Perforce, e.g. to check it out, just copy the full path of the file to system clipboard and paste it in this “address bar”. Selection in Workspace tab will switch to it immediately.
# Iteration time is everything
I still remember Demobit 2018 in February in Bratislava, Slovakia. During this demoscene party, one of the talks was given by Matt Swoboda "Smash", author of Notch. Notch is a program that allows to create audio-visual content, like demos or interactive visual shows accompanying concerts, in a visual way - by connecting blocks, somewhat like blueprints in Unreal Engine. (The name not to be confused with nickname of the author of Minecraft.) See also Number one / Another one by CNDC/Fairlight - latest demo made in it.
During his talk, Smash referred to music production. He said that musicians couldn't imagine working without a possibility to instantly hear the effect of changes they make to their project. He said that graphics artists deserve same level of interactivity - WYSIWYG, instant feedback, without a need for a lengthy "build" or "render". That's why Notch was created. Then I thought: What about programmers? Don't they deserve it too? Shorter iteration times mean better work efficiency and higher quality of the result. Meanwhile, a programmer sometimes has to wait minutes or even hours to be able to test a change in his code, no matter how small it is. I think it's a big problem.
This is exactly what I like about development of desktop Windows applications and games: they can usually be built, ran, and tested locally within few seconds. Same applies to games made in Unity and Unreal Engine - developer can usually hit "Play" button and quickly test his gameplay. It is often not the case with development for smaller devices (like mobile or embedded) or larger (like servers/cloud).
I think that iteration time - time after which we can observe effects of our changes - is critical for developers' work efficiency, as well as their well-being. We programmers should demand better tools. All of us - including low-level C and C++ programmers. Currently we are at the good position in the job market so we can choose companies and projects to work on. Let's use it and vote with our feet. Decision makers and architects of software/hardware platforms may think that developers are smart, so they can work efficiently even in harsh conditions. They forget that wasting developers' precious time means wasting a lot of money, not to mention their frustration. Creating better tools is an investment that will pay off.
Now, whenever I get a job offer for a developer position, I ask two simple questions:
1. What is the typical iteration time, from the moment when I change something in the code, through compilation, deployment, application launch and loading, until I can observe the effect of my change? If the answer is: "Usually it's just a matter of few seconds. Files you changed are recompiled, then launching the app takes few seconds and that's it." - that's fine. But if the answer is more like: "Well, the whole project needs to be rebuilt. You don't do it locally. You shelve your changes in Perforce so that build server picks it and makes the build. The build is then deployed to the target device, which then needs to reboot and load your app. It takes 15-20 minutes." - then it's a NOPE for me.
2. How do you debug the application? Can you make experiments by setting up breakpoints and watching variables in a convenient way? If the answer is: "Yes, we have debugger nicely integrated with Visual Studio/WinDBG/Eclipse/other IDE and we debug whenever we see a problem." - that's fine. But when I hear: "Well, command-line GDB should work with this environment, but to be honest, it's so hard to setup that no one uses it here. We just put debug console prints in the code and recompile it whenever we want to make a debug experiment." - then that's a red light for me.
# New Version of PVS-Studio
PVS-Studio is particularly good at finding issues with code portability between 32-bit and 64-bit. Out of my personal projects, I already ported CommonLib to 64 bits, and RegScript2 is written to support 64 bits from the start, but porting my main app (music visualization program) to 64 bits is a large task that I still have on my TODO list. Even if I know how to write portable code (use size_t not int etc. :) I made first commits to this repository 8 years ago, when my programming knowledge was much smaller, so I'm sure there are many nasty bugs there. Making it working as 64-bit app will be a difficult task and I'm sure PVS-Studio will help me with that. I will share my experiences and conclusions when I eventually do it.
In the meantime, I recommend to check their Blog, where developers of this tool share many valuable information. They also maintain list of articles describing errors they found in open source projects.
# Review: Deleaker - A tool that finds resource leaks
Deleaker is a tool for programmers that finds resource leaks in C++ programs. It's commercial, with free trial and unconditional 30 day money back. Here is my review of this tool. I've tested version 22.214.171.124.
Deleaker is installed as a plugin for Visual Studio, any version from 2005 to 2013. It also works with Visual Studio Community 2013, as this new free version also supports plugins. There is also standalone Deleaker application (see below).
The purpose of this tool is to augment debugging of native C++ programs with the ability to list all resources that are allocated at the moment (heap memory, virtual memory, OLE memory, GDI objects, USER objects, handles) and so to detect resource leaks. Here is how it works:
The interface is very simple - it can be learned in just few minutes. You can build your program and start debugging it by hitting F5, Deleaker is enabled automatically. Now just open dedicated panel (menu Deleaker > Deleaker Window) and there press "Take snapshot" button. You don't even have to pause execution, but of course the button works as well when your program is paused at a breakpoint. After few seconds, the panel is populated with a list of currently allocated resources, with the place from which it was allocated shown in first column.
After selecting one, bottom panel displays full call stack. Clicking in this call stacks navigates to the place in the source code where the allocation is specified. Finally, after program exit, the list is filled with resources that were not freed - these are actual leaks!
You can filter the list by module (EXE or DLL file that made the call) and by resource type (memory, GDI objects etc.). There is also a column with size of the resource and "Hit Count" - number of resources that were allocated by that particular place in the code (e.g. inside a loop) and stay allocated at the moment.
"Show full stack" button is a nice feature. Clicking it displays full call stack, while by default, the stack is stripped from entries that don't come from your code, but from system libraries. For example, above my function with the actual allocation instruction, there is MSVCR120D.dll!operator new, then there is MSVCR120D.dll!malloc etc... until ntdll.dll!RtlAllocateHeap. It's good that the program can ignore such call stack entries. It also entirely ignores allocations made by system modules outside of your code.
Unfortunately it does this only by identifying module that the function comes from and not it's name, so it cannot ignore templates, like these from STL containers. Maybe ignoring functions by name specified as wildcard or regular expression would help, e.g. "std::*" or "std\:\:.+" - just like Visual Studio debugger can step over specified functions, as I described in How to Make Visual Studio Debugger not Step Into STL.
You can press "Take snapshot" multiple times and save the snapshots for later view. (They are just numbered, you cannot give them names.) By the way, Deleaker captures F5 key, so even when during debugging session, if the focus is in Deleaker panel, this button doesn't resume your program, but instead refreshes the list of allocations (takes new snapshot). You can also select two snapshots and compare them. Then you see only resources that were allocated in the right snapshot and not in the left, which can indicate a leak that happened during some time of the program execution.
Besides heap memory allocations, the tool can also detect other types of resources, like GDI objects. Unfortunately not all interesting types of resources are covered. For example, an opened file of type FILE* f = fopen(...) is shown as normal memory allocation and opened file of type HANDLE f = CreateFile(...) is not shown at all, but I guess it must be due to some system internals.
I didn't find a single leak in my main home project, so I created a dedicated, simple program to test if it can really find leaks. I also checked that it works with programs compiled in Release configuration as well.
Aside from being a Visual Studio plugin, Deleaker can also work as standalone Windows application.
Overall, I like the program. If its price is not a problem for you or your company, I think it can be very useful in improving quality of developed software. I especially like the fact that it's so easy to learn and use.