Fix inconsistency in reporting device local memory between type and heap on macOS. #789

billhollings · 2019-12-06T15:23:10Z

Fixes issue #786.

…eap on macOS.

cdavis5e · 2019-12-06T17:57:20Z

MoltenVK/MoltenVK/GPUObjects/MVKDevice.mm

-		_memoryProperties.memoryHeaps[1].flags = 0;
+		_memoryProperties.memoryHeaps[1].flags = VK_MEMORY_HEAP_DEVICE_LOCAL_BIT;
 		_memoryProperties.memoryTypes[2].heapIndex = 1;	// Shared memory in the shared heap


I'm wondering if we should actually clear the VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT from the memory type corresponding to MTLStorageModeShared.

@cdavis5e

I thought along those lines at first...but then I realized that it's shared memory we're talking about. It is device local as far as both Vulkan and Metal are concerned.

From an Apple doc on the matter:

In both [unified and discrete] memory models, a resource with a MTLStorageModeShared mode resides in system memory accessible to both the CPU and the GPU.

BTW...what was the original problem that the second heap was trying to solve? The original code seems to be trying to say that shared memory is not accessible to the GPU.

vkd3d checks how many heaps there are, and if at least one is not host-visible, to see if the device is a UMA device.

I was always of the understanding that MTLStorageModeShared was equivalent to, for example, cudaMallocHost - memory that is device visible, but still on the host. That would suggest that it shouldn't be marked device local in Vulkan parlance, as that suggests a certain level of performance that device visible but host local memory doesn't have.

@cdavis5e

if at least one is not host-visible

Do you mean not GPU-visible? How do you test for the host-visibility of a heap?

@mbarriault

At least one heap must include VK_MEMORY_HEAP_DEVICE_LOCAL_BIT in VkMemoryHeap::flags. If there are multiple heaps that all have similar performance characteristics, they may all include VK_MEMORY_HEAP_DEVICE_LOCAL_BIT. In a unified memory architecture (UMA) system there is often only a single memory heap which is considered to be equally “local” to the host and to the device, and such an implementation must advertise the heap as device-local.

So a UMA device must have it's single heap listed as device-local...implying (to me at least) that the device local flags don't imply special fast device-only memory...just memory that the GPU can access.

However...the spec also states that...

Device memory is memory that is visible to the device

...which would imply that ALL heaps and memory types (which are collectively discussed under the Device Memory banner) are assumed to be device-visible...implying also that the requirement for device-local flags is almost redundant in the UMA case above.

...[sigh]...

...[sigh]...

I've posted a question to the Vulkan Working Group about this.

@cdavis5e @mbarriault

I've updated this PR...and the logic should be right now.

Because this PR has been sitting open for a while, there is a bunch of other semi-related stuff in it now...but, if you are reviewing, the areas of interest for the memory logic are:

MVKPhysicalDevice::initMemoryProperties(), which I've made much more procedural, organized, and traceable.

The value of MVK_VK_MEMORY_TYPE_METAL_SHARED no longer includes VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT or VK_MEMORY_PROPERTY_HOST_CACHED_BIT...but VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT is added dynamically for unified memory.

MVKPhysicalDevice::getHasUnifiedMemory() adds a bit of additional logic to test for the low-power GPU and device count in the case the MTLDevice does not respond to hasUnifiedMemory.

… macOS. Remove VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT and VK_MEMORY_PROPERTY_HOST_CACHED_BIT from MVK_VK_MEMORY_TYPE_METAL_SHARED. Refactor MVKPhysicalDevice::initMemoryProperties() to configure memory types and heaps in a more organized, procedural, and traceable manner. Enhance MVKPhysicalDevice::getHasUnifiedMemory() detection of unified memory. Simplify MVKPhysicalDevice::getVRAMSize(). On macOS MVKImage::validateUseTexelBuffer() tests the underlying device memory directly to determine if it has coherent memory and cannot support texel buffer. Add MVKInstance::getPhysicalDeviceCount(). Add mvkClear() template function for clearing structs and replace use of memset(*, 0, sz) with mvkClear(). Refactor mvkCopyStruct() and rename to mvkCopy().

Update What's New document.

Change MVKPhysicalDeviceMetalFeatures::dynamicMTLBuffers from boolean to uint dynamicMTLBufferSize limit and guard dynamic buffer usage sizes against it. Update VK_MVK_MOLTENVK_SPEC_VERSION to 23.

cdavis5e

I meant to say this before you merged, but it totally slipped my mind.

cdavis5e · 2019-12-12T21:28:31Z

MoltenVK/MoltenVK/GPUObjects/MVKDevice.mm

-	return [_mtlDevice respondsToSelector: @selector(hasUnifiedMemory)] && _mtlDevice.hasUnifiedMemory;
+	return (([_mtlDevice respondsToSelector: @selector(hasUnifiedMemory)] && _mtlDevice.hasUnifiedMemory)
+			|| _mtlDevice.isLowPower
+			|| getInstance()->getPhysicalDeviceCount() == 1);


Won't that erroneously include Macs with one discrete card and no integrated graphics?

Won't that erroneously include Macs with one discrete card and no integrated graphics?

Hmmm...you might be right. It was intended to cover the case where there was only integrated graphics which might not be listed as low-power (I figured a Mac with only one integrated GPU might not list that one GPU as low-power...and I don't have a machine to test that on).

I figured a Mac with only one integrated GPU might not list that one GPU as low-power

Hmmm...looks like my assumption was wrong. From Apple's docs on isLowPower...

This property returns YES for integrated GPUs and NO for discrete GPUs.

Pretty straightforward. I'll remove the single-GPU test...but leave the isLowPower test...on the assumption that if hasUnifiedMemory is not available...then isLowPower is a valid indication of integrated, and therefore unified, memory.

I'll remove the single-GPU test...but leave the isLowPower test...on the assumption that if hasUnifiedMemory is not available...then isLowPower is a valid indication of integrated, and therefore unified, memory.

Implemented in PR #795

Fix inconsistency in reporting device local memory between type and h…

98e0b79

…eap on macOS.

cdavis5e reviewed Dec 6, 2019

View reviewed changes

billhollings added 5 commits December 12, 2019 10:43

Support Xcode 11.3.

b213996

Update What's New document.

Update mvkClear() to support optional length parameter.

977a1f4

Fix bug where dynamic shader buffers are overflowing.

3f025bc

Change MVKPhysicalDeviceMetalFeatures::dynamicMTLBuffers from boolean to uint dynamicMTLBufferSize limit and guard dynamic buffer usage sizes against it. Update VK_MVK_MOLTENVK_SPEC_VERSION to 23.

Avoid reading env vars inside library constructor functions.

5ba074a

billhollings merged commit ef8575f into KhronosGroup:master Dec 16, 2019

billhollings mentioned this pull request Dec 16, 2019

Wrong memory types reported #786

Closed

cdavis5e reviewed Dec 16, 2019

View reviewed changes

billhollings mentioned this pull request Dec 16, 2019

Update dependency libraries to match Vulkan SDK 1.1.130 #795

Merged

tmm1 mentioned this pull request Jun 24, 2020

tvOS fixups #934

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix inconsistency in reporting device local memory between type and heap on macOS. #789

Fix inconsistency in reporting device local memory between type and heap on macOS. #789

billhollings commented Dec 6, 2019

cdavis5e Dec 6, 2019

billhollings Dec 6, 2019

cdavis5e Dec 6, 2019 •

edited

Loading

mbarriault Dec 6, 2019

billhollings Dec 7, 2019

billhollings Dec 9, 2019 •

edited

Loading

billhollings Dec 12, 2019

cdavis5e left a comment

cdavis5e Dec 12, 2019

billhollings Dec 16, 2019

billhollings Dec 16, 2019

billhollings Dec 16, 2019

Fix inconsistency in reporting device local memory between type and heap on macOS. #789

Fix inconsistency in reporting device local memory between type and heap on macOS. #789

Conversation

billhollings commented Dec 6, 2019

Choose a reason for hiding this comment

Choose a reason for hiding this comment

cdavis5e Dec 6, 2019 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

billhollings Dec 9, 2019 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

cdavis5e left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

cdavis5e Dec 6, 2019 •

edited

Loading

billhollings Dec 9, 2019 •

edited

Loading