Untangling Direct3D 12 Memory Heap Types and Pools

Sat 26 Feb 2022

Those of you who follow my blog may say that I am boring, but I can't help it - somehow GPU memory allocation became my thing, rather than shaders and effects, which most graphics programmers focus on. Some time ago I wrote an article "Vulkan Memory Types on PC and How to Use Them", explaining what memory heaps and types are available on various kinds of PC GPUs, as visible through the Vulkan API. This article is, in a way, its Direct3D 12 equivalent.

In how it exposes the memory types that exist in hardware, D3D12 differs greatly from Vulkan. Vulkan defines a 2-level hierarchy of memory "heaps" and "types". A heap represents a physical piece of memory of a certain size, while a type is a "view" of a specific heap with certain properties, like cached versus uncached. This gives great flexibility in how different GPUs can express their memory, but it also makes it hard for developers to ensure they select the optimal type on every kind of GPU. Direct3D 12, on the other hand, offers a fixed set of memory types. Creating a buffer or a texture usually means selecting one of the 3 standard "heap types":

- D3D12_HEAP_TYPE_DEFAULT - memory that is fastest for the GPU to access, with no direct CPU access; this is where most textures and buffers used for rendering should live,
- D3D12_HEAP_TYPE_UPLOAD - memory that the CPU can map and write and the GPU can read, used to upload data to the GPU,
- D3D12_HEAP_TYPE_READBACK - memory that the CPU can map and read, used to download data back from the GPU.
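For illustration, here is a minimal sketch of how such a choice looks in code - creating a small buffer in the UPLOAD heap type with plain D3D12 calls. It assumes "device" is an already created ID3D12Device*; error handling is omitted.

    // Creating a 64 KB upload buffer by selecting one of the standard heap types.
    D3D12_HEAP_PROPERTIES heapProps = {};
    heapProps.Type = D3D12_HEAP_TYPE_UPLOAD; // CPU-writable memory for sending data to the GPU

    D3D12_RESOURCE_DESC resDesc = {};
    resDesc.Dimension = D3D12_RESOURCE_DIMENSION_BUFFER;
    resDesc.Width = 64 * 1024;
    resDesc.Height = 1;
    resDesc.DepthOrArraySize = 1;
    resDesc.MipLevels = 1;
    resDesc.Format = DXGI_FORMAT_UNKNOWN;
    resDesc.SampleDesc.Count = 1;
    resDesc.Layout = D3D12_TEXTURE_LAYOUT_ROW_MAJOR;

    ID3D12Resource* uploadBuffer = nullptr;
    HRESULT hr = device->CreateCommittedResource(
        &heapProps, D3D12_HEAP_FLAG_NONE, &resDesc,
        D3D12_RESOURCE_STATE_GENERIC_READ, // required initial state for the UPLOAD heap type
        nullptr, IID_PPV_ARGS(&uploadBuffer));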

So far, so good... D3D12 seems to simplify things compared to Vulkan. You can stop here and still develop a decent graphics program, but if you make a game with an open world and want to stream your content at runtime, so that you need to check what memory budget is available to your app, or if you want to take advantage of integrated graphics where memory is unified, you will find out that things are not that simple in this API. There are 4 different ways in which D3D12 names the various memory types, and they are not so obvious when we compare systems with discrete versus integrated graphics. The goal of this article is to explain and untangle all this complexity.

Discrete graphics card

According to the D3D12 API, there are two main types of platforms: NUMA and UMA, as reported by D3D12_FEATURE_DATA_ARCHITECTURE::UMA. When this boolean member is FALSE, we talk about Non-Uniform Memory Access (NUMA): a discrete graphics card with separate video memory (VRAM) that is fast for the GPU to access, and separate system RAM that the GPU can reach only through the PCI Express bus.
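A short sketch of this query, assuming "device" is a valid ID3D12Device*:

    D3D12_FEATURE_DATA_ARCHITECTURE arch = {};
    arch.NodeIndex = 0; // first (and usually the only) GPU
    if (SUCCEEDED(device->CheckFeatureSupport(
            D3D12_FEATURE_ARCHITECTURE, &arch, sizeof(arch))))
    {
        if (arch.UMA)
            printf("UMA: integrated graphics, unified memory.\n");
        else
            printf("NUMA: discrete graphics card with separate VRAM.\n");
    }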

We already talked about the D3D12_HEAP_TYPE flags. It turns out they are only a shortcut to a more complex structure called D3D12_HEAP_PROPERTIES. When allocating D3D12 memory, you can specify D3D12_HEAP_TYPE_CUSTOM and fill in the other members of this structure to explicitly define the memory type you want to use. The CPUPageProperty member defines what kind of CPU access (mapping) we want: D3D12_CPU_PAGE_PROPERTY_NOT_AVAILABLE, D3D12_CPU_PAGE_PROPERTY_WRITE_COMBINE, or D3D12_CPU_PAGE_PROPERTY_WRITE_BACK. The MemoryPoolPreference member, which is more interesting to us in this article, defines the kind of memory we want to allocate from, called a "memory pool". In the case of NUMA architectures:

- D3D12_MEMORY_POOL_L1 - the memory local to the GPU: video RAM,
- D3D12_MEMORY_POOL_L0 - the memory further from the GPU: system RAM.
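As a sketch, this is what the explicit CUSTOM equivalent of D3D12_HEAP_TYPE_UPLOAD looks like on a discrete (NUMA) GPU - CPU-mapped, write-combined pages placed in system memory (pool L0):

    D3D12_HEAP_PROPERTIES customProps = {};
    customProps.Type = D3D12_HEAP_TYPE_CUSTOM;
    customProps.CPUPageProperty = D3D12_CPU_PAGE_PROPERTY_WRITE_COMBINE;
    customProps.MemoryPoolPreference = D3D12_MEMORY_POOL_L0;
    customProps.CreationNodeMask = 1; // single-GPU masks
    customProps.VisibleNodeMask = 1;
    // customProps can now be passed to CreateCommittedResource or CreateHeap
    // in place of one of the standard heap types.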

There is a fixed mapping between the "shortcut" D3D12_HEAP_TYPE_ values and their full D3D12_HEAP_PROPERTIES structures, as defined on the page "ID3D12Device::GetCustomHeapProperties method". When UMA == FALSE:

- D3D12_HEAP_TYPE_DEFAULT: CPUPageProperty = NOT_AVAILABLE, MemoryPoolPreference = L1,
- D3D12_HEAP_TYPE_UPLOAD: CPUPageProperty = WRITE_COMBINE, MemoryPoolPreference = L0,
- D3D12_HEAP_TYPE_READBACK: CPUPageProperty = WRITE_BACK, MemoryPoolPreference = L0.
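Instead of hard-coding this table, you can also ask the device for it at runtime. A sketch, assuming "device" is an ID3D12Device*:

    D3D12_HEAP_PROPERTIES defaultProps =
        device->GetCustomHeapProperties(0, D3D12_HEAP_TYPE_DEFAULT);
    // On a discrete card (UMA == FALSE) this returns:
    //   CPUPageProperty      == D3D12_CPU_PAGE_PROPERTY_NOT_AVAILABLE
    //   MemoryPoolPreference == D3D12_MEMORY_POOL_L1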

There is more to it. If you want to query the capacity of each kind of memory, D3D12 offers another 2 ways of naming them. First, there is the function IDXGIAdapter3::QueryVideoMemoryInfo. It lets you query the CurrentUsage and the available Budget of a selected kind of memory, returned in a DXGI_QUERY_VIDEO_MEMORY_INFO structure. This time, the kind of memory is denoted by yet another enum: DXGI_MEMORY_SEGMENT_GROUP_LOCAL for the memory local to the GPU (video RAM) and DXGI_MEMORY_SEGMENT_GROUP_NON_LOCAL for the memory further from the GPU (system RAM). This is the API currently recommended for checking how much memory an application should use.
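A sketch of such a query, assuming "adapter3" is an IDXGIAdapter3* for the adapter your device was created on (error handling omitted):

    DXGI_QUERY_VIDEO_MEMORY_INFO localInfo = {}, nonLocalInfo = {};
    adapter3->QueryVideoMemoryInfo(0, DXGI_MEMORY_SEGMENT_GROUP_LOCAL, &localInfo);
    adapter3->QueryVideoMemoryInfo(0, DXGI_MEMORY_SEGMENT_GROUP_NON_LOCAL, &nonLocalInfo);

    printf("Local (video RAM):      budget %llu MB, usage %llu MB\n",
        localInfo.Budget >> 20, localInfo.CurrentUsage >> 20);
    printf("Non-local (system RAM): budget %llu MB, usage %llu MB\n",
        nonLocalInfo.Budget >> 20, nonLocalInfo.CurrentUsage >> 20);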

We can draw a clear line here, as shown in the picture: on a discrete card, video memory means D3D12_HEAP_TYPE_DEFAULT = D3D12_MEMORY_POOL_L1 = DXGI_MEMORY_SEGMENT_GROUP_LOCAL, while system memory means D3D12_HEAP_TYPE_UPLOAD / D3D12_HEAP_TYPE_READBACK = D3D12_MEMORY_POOL_L0 = DXGI_MEMORY_SEGMENT_GROUP_NON_LOCAL.

There is a fourth way of addressing the memory. It is not recommended, as it matches poorly what actually happens under the hood, but because IDXGIAdapter3 is a newer interface that may not be available on all versions of Windows, and because you may want to know how much memory the GPU physically has rather than how much budget Windows recommends for your application, you may query for DXGI_ADAPTER_DESC. The structure offers 3 members that need to be interpreted correctly. On a discrete card:

- DedicatedVideoMemory - the amount of video memory (VRAM) on the card,
- DedicatedSystemMemory - system memory reserved exclusively for the GPU at boot time, typically 0 on discrete cards,
- SharedSystemMemory - the amount of system RAM that the GPU can additionally use.
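Reading them is trivial - a sketch, assuming "adapter" is any IDXGIAdapter*:

    DXGI_ADAPTER_DESC desc = {};
    adapter->GetDesc(&desc);
    printf("DedicatedVideoMemory:  %llu MB\n", (unsigned long long)(desc.DedicatedVideoMemory >> 20));
    printf("DedicatedSystemMemory: %llu MB\n", (unsigned long long)(desc.DedicatedSystemMemory >> 20));
    printf("SharedSystemMemory:    %llu MB\n", (unsigned long long)(desc.SharedSystemMemory >> 20));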

Integrated graphics

The second option is a graphics chip integrated with the CPU, also called UMA. Some sources expand it as Uniform Memory Access, others as Unified Memory Access, and Microsoft calls it Universal Memory Access. When D3D12_FEATURE_DATA_ARCHITECTURE::UMA is TRUE, a lot of things change, as can be seen in the bottom part of the picture. Now, the page "ID3D12Device::GetCustomHeapProperties method" defines the "shortcut" heap types as:

- D3D12_HEAP_TYPE_DEFAULT: CPUPageProperty = NOT_AVAILABLE, MemoryPoolPreference = L0,
- D3D12_HEAP_TYPE_UPLOAD: CPUPageProperty = WRITE_COMBINE (or WRITE_BACK when the architecture also reports CacheCoherentUMA), MemoryPoolPreference = L0,
- D3D12_HEAP_TYPE_READBACK: CPUPageProperty = WRITE_BACK, MemoryPoolPreference = L0.

As you can see, on platforms with integrated graphics we use only D3D12_MEMORY_POOL_L0, which represents the unified memory shared between the CPU and the GPU. L1 is never used here. When querying for the budget, we also have only one kind of memory to care about: DXGI_MEMORY_SEGMENT_GROUP_LOCAL. Resource allocations of any type increase the CurrentUsage of this one, while NON_LOCAL always stays at 0. Note, however, that on a discrete graphics card L1 = Local and L0 = Non-Local, while on integrated graphics L0 = Local. What a mess! To make it clear:

- Discrete card (NUMA): video memory = D3D12_MEMORY_POOL_L1 = DXGI_MEMORY_SEGMENT_GROUP_LOCAL; system memory = D3D12_MEMORY_POOL_L0 = DXGI_MEMORY_SEGMENT_GROUP_NON_LOCAL,
- Integrated graphics (UMA): unified memory = D3D12_MEMORY_POOL_L0 = DXGI_MEMORY_SEGMENT_GROUP_LOCAL; NON_LOCAL is unused.
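To avoid mixing these up in code, you can encode the correspondence in a small helper of your own. The function below is only my illustrative sketch, not part of the D3D12 or DXGI API:

    // Which DXGI segment group to query for the budget of a given memory pool,
    // depending on the architecture reported by D3D12_FEATURE_DATA_ARCHITECTURE::UMA.
    DXGI_MEMORY_SEGMENT_GROUP SegmentGroupForPool(D3D12_MEMORY_POOL pool, BOOL adapterIsUMA)
    {
        if (adapterIsUMA)
            return DXGI_MEMORY_SEGMENT_GROUP_LOCAL; // only L0 exists and it counts as "local"
        return pool == D3D12_MEMORY_POOL_L1
            ? DXGI_MEMORY_SEGMENT_GROUP_LOCAL       // video memory on a discrete card
            : DXGI_MEMORY_SEGMENT_GROUP_NON_LOCAL;  // system memory on a discrete card
    }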

If you want to inspect the DXGI_ADAPTER_DESC structure instead of DXGI_QUERY_VIDEO_MEMORY_INFO, you need to know that the 3 members of this structure are also interpreted differently on integrated graphics:

- DedicatedVideoMemory - memory dedicated exclusively to the GPU, usually 0 or only a small carve-out on integrated chips,
- DedicatedSystemMemory - system memory reserved for the GPU at boot time, also usually 0 or small,
- SharedSystemMemory - the part of system RAM that the GPU can use, which on integrated graphics is where virtually all your resources end up.

Conclusions

This is all I wanted to describe in this article. To be honest, I had been planning to investigate this topic for years. Before I finally did it over the last few days, I used to believe that Direct3D 12 was simpler and more convenient to use when it comes to memory allocation. The D3D12_HEAP_TYPE_ flags seem like a simplification, but after digging deeper, I conclude that this API is just rigid and overly complicated for no good reason, convenient neither for game developers nor for GPU manufacturers, especially with the 4 different ways of naming the types of memory. I consider myself a fan of Microsoft and DirectX, but once again I must admit that this is something Vulkan got right, just like when I "unboxed" the new D3D12 Enhanced Barriers API or described the secrets of Direct3D 12 resource alignment.
