Skip to content

[Metal] Explicit root signature layout for Metal backend#1061

Open
kcloudy0717 wants to merge 5 commits intollvm:mainfrom
kcloudy0717:kcloudy0717/metal_irconverter
Open

[Metal] Explicit root signature layout for Metal backend#1061
kcloudy0717 wants to merge 5 commits intollvm:mainfrom
kcloudy0717:kcloudy0717/metal_irconverter

Conversation

@kcloudy0717
Copy link
Copy Markdown
Contributor

Resolves #351, #744, #305.

This PR changes the automatic linear layout binding model to the more flexible explicit root signature binding model.

Notable changes:

  • We no longer invokes -metal && -Fre to the compiler command line in LIT config. Reason for this is because explicit root signature model requires the RS to be bound to the IR compiler during IR conversion time.
  • Introduced MTLTopLevelArgumentBuffer (TLAB) class that replaces the legacy MTL::Buffer *ArgsBuffer in InvocationState. This class encapsulates a buffer for the backing memory of the data of all IR resources for the given root signature. Provides D3D12-like functions for binding root constants, root descriptors, and descriptor tables.
  • Introduced MTLDescriptorHeap class that replicates D3D12's descriptor heap concept. It serves as a storage for descriptors, binding a descriptor table to the TLAB must come from a descriptor heap just like D3D12. This bridge the gap between D3D12/Metal API differences allowing future tests such as dynamic resource indexing and unbound resource arrays easy to support.
  • Resource/Descriptor creation is similar to the D3D12 backend thanks to MTLTopLevelArgumentBuffer and MTLDescriptorHeap, this allows resource arrays to be supported in the Metal backend now.

@kcloudy0717
Copy link
Copy Markdown
Contributor Author

Pinging @bogner, @bob80905, @farzonl, @manon-traverse for review.

Comment on lines +1 to +3
#define IR_RUNTIME_METALCPP
#include "MTLDescriptorHeap.h"
#include "metal_irconverter_runtime.h"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better to limit the IR_RUNTIME_METALCPP to the code that needs it:

#pragma push_macro("IR_RUNTIME_METALCPP")
#include "metal_irconverter_runtime.h"
#pragma pop_macro("IR_RUNTIME_METALCPP")

If we're going to use these APIs in more places it might even be worth wrapping it in a header that just does that so it's harder to make a mistake.

(I'd also make an argument for adding a MetalIRConverter.cpp and moving the IR_PRIVATE_IMPLEMENTATION inclusion of this header to that instead of Device, but that's quickly getting out of scope for this particular change)

#include "llvm/Support/Error.h"
#include <memory>

struct IRDescriptorTableEntry;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a bit awkward that metal_irconverter_runtime.h just puts all of its definitions in the global namespace. Maybe a comment saying this forward declaration is for an object from there would be helpful.

struct IRDescriptorTableEntry;

namespace offloadtest {
struct METAL_GPU_DESCRIPTOR_HANDLE {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's stick to LLVM naming conventions for these types (ie, MetalGPUDescriptorHandle)

Comment on lines +28 to +31
enum METAL_DESCRIPTOR_HEAP_TYPE {
METAL_DESCRIPTOR_HEAP_TYPE_CBV_SRV_UAV,
METAL_DESCRIPTOR_HEAP_TYPE_SAMPLER,
};
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good place for enum class, then we don't need the METAL_DESCRIPTOR_HEAP_TYPE on all of the values.

I'm also not entirely sure about the CBV or SRV or UAV value - is there a clearer name for this? I suspect we'll either actually want three different values here or that we can get away with something like Sampler and NotSampler

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally we keep the descriptor heap API as close to D3D12 as possible. The enum defined here maps to D3D12's enum:

typedef enum D3D12_DESCRIPTOR_HEAP_TYPE {
  D3D12_DESCRIPTOR_HEAP_TYPE_CBV_SRV_UAV = 0,
  D3D12_DESCRIPTOR_HEAP_TYPE_SAMPLER,
  D3D12_DESCRIPTOR_HEAP_TYPE_RTV,
  D3D12_DESCRIPTOR_HEAP_TYPE_DSV,
  D3D12_DESCRIPTOR_HEAP_TYPE_NUM_TYPES
} ;

Having three separate values for each CBV/SRV/UAV would make dynamic resource indexing via ResourceDescriptorHeap not possible.

FWIW, I think we can just change this to enum class and drop the METAL_DESCRIPTOR_HEAP_TYPE_ prefix for the values. I can't really think of a better name for CBV_SRV_UAV so I would say we just keep as it is.


#include "MTLDescriptorHeap.h"
#include "Metal/Metal.hpp"
#include "metal_irconverter.h"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need all of this for the header?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but I think I can get away with forward declarations for some of them. I'll do that.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We shouldn't be checking a dylib in to the repo. I didn't realize that metal_irconverter wasn't distributed header only like metal_irconverter_runtime when I suggested we could just add the dependency - this makes things more complicated unfortunately.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What would be the preferred approach? Maybe we can use CMake Fetch_Content to make this bit simpler?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just realized the tests aren't being ran probably because of this dylib dependency. Do let me know if there is a better approach for this.

SampledTexture2D,
};

enum class DescriptorKind { UAV, SRV, CBV, SAMPLER };
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure if we should expose this here. This is a concept that only makes sense for DX12 and Metal, but not for Vulkan. I do see the value of deduplicating this code between DX and Metal. Perhaps we should put this in some sort of util header file?

Comment on lines +131 to +138
// Maybe there is a better way to detect GPUAPI?
#ifdef __APPLE__
APIToUse = GPUAPI::Metal;
outs() << "Using Metal API\n";
#else
APIToUse = GPUAPI::DirectX;
outs() << "Using DirectX API\n";
#endif
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems fine to me as Metal is Apple only.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Metal loader logic doesn't account for resources that don't match the automatic linear layout

3 participants