(MTL S01E02) Metal Architecture
To use something effectively, it’s important to understand how it works. In this article, I’ll give a brief overview of how Metal operates and what you need to know to get started. Honestly, I wish I’d had this knowledge back when Metal first launched, as it would’ve made my transition from OpenGL much smoother. So, let’s dive in (for a bit of fun, I’ve used illustrations from Factorio)!
Entities Overview
Metal includes many different entities, most of which are represented by protocols. Their actual implementations are managed by the system and can vary slightly depending on the chipset, platform, or whether you’re running in release or debug mode. Technically, you could write your own implementation if needed (though I’ve never come across a case where it was). For now, let’s focus on the most commonly used ones:
Device
This represents the hardware where all computations take place (rendering is a form of computation too). It’s also where we store our objects and resources.
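To make this concrete, here’s a minimal sketch of obtaining a device in Swift; the system picks a GPU for you, and the call returns `nil` if Metal isn’t supported:

```swift
import Metal

// Ask the system for the default GPU.
guard let device = MTLCreateSystemDefaultDevice() else {
    fatalError("Metal is not supported on this machine")
}

print(device.name)  // e.g. "Apple M1"
```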
Buffer
A block of memory used to store your data (meshes, parameters, etc.). There are various nuances around optimization, memory sharing, and access, but I’ll cover those details in future articles. Buffers are allocated on the GPU side, and from the CPU you work only with references to them.
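Creating a buffer looks like this: you can allocate an empty one of a given size, or pre-fill it with your data. The vertex values here are just illustrative; `.storageModeShared` is one of several storage modes (it makes the memory visible to both CPU and GPU on unified-memory hardware):

```swift
import Metal

guard let device = MTLCreateSystemDefaultDevice() else {
    fatalError("Metal is not supported on this machine")
}

// An empty 1 KiB buffer.
let buffer = device.makeBuffer(length: 1024, options: .storageModeShared)!

// A buffer pre-filled with data (a single triangle, as an example).
var vertices: [Float] = [0, 1, 0,  -1, -1, 0,  1, -1, 0]
let vertexBuffer = device.makeBuffer(
    bytes: &vertices,
    length: vertices.count * MemoryLayout<Float>.stride,
    options: .storageModeShared)!
```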
Texture
Essentially, an image stored in memory. It shares many similarities with a buffer, but includes additional metadata like texture size, format, and other attributes. Like buffers, textures are allocated on the GPU side and operated on through references.
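Textures are created from a descriptor that carries that extra metadata. A small sketch (the size, format, and usage flags here are arbitrary examples):

```swift
import Metal

guard let device = MTLCreateSystemDefaultDevice() else {
    fatalError("Metal is not supported on this machine")
}

// Describe the texture first: dimensions, pixel format, and how it will be used.
let descriptor = MTLTextureDescriptor.texture2DDescriptor(
    pixelFormat: .rgba8Unorm,  // 8 bits per channel, RGBA
    width: 256,
    height: 256,
    mipmapped: false)
descriptor.usage = [.shaderRead, .shaderWrite]

let texture = device.makeTexture(descriptor: descriptor)!
```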
Pipeline
This defines how your function or group of functions (including shaders, which are functions too) is executed on the GPU to process your data.
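Here’s a sketch of building a compute pipeline. The kernel name and source are made up for illustration; I compile it from a string so the example is self-contained, though normally you’d ship precompiled `.metal` files in your app’s default library:

```swift
import Metal

guard let device = MTLCreateSystemDefaultDevice() else {
    fatalError("Metal is not supported on this machine")
}

// A tiny illustrative kernel, compiled at runtime.
let source = """
#include <metal_stdlib>
using namespace metal;

kernel void double_values(device float *data [[buffer(0)]],
                          uint id [[thread_position_in_grid]]) {
    data[id] *= 2.0;
}
"""
let library = try! device.makeLibrary(source: source, options: nil)
let function = library.makeFunction(name: "double_values")!

// The pipeline state is the compiled, ready-to-run form of the function.
let pipeline = try! device.makeComputePipelineState(function: function)
```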
Command Queue
The command queue is responsible for sending commands from the CPU to the GPU. Think of it like a railway system, where the CPU is one station, the GPU is the other, and the command queue serves as the tracks connecting them.
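Laying the tracks takes one line. A single queue is usually enough for a whole app; it’s thread-safe and typically lives as long as the device does:

```swift
import Metal

guard let device = MTLCreateSystemDefaultDevice() else {
    fatalError("Metal is not supported on this machine")
}

// The "railway" between CPU and GPU.
let commandQueue = device.makeCommandQueue()!
```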
Command Buffer
You can’t send individual commands directly; they need to be grouped together. In our analogy, think of a command buffer as a train: you attach different cars (commands), load them up, and send them off to the GPU.
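The lifecycle of a "train" looks like this: command buffers are cheap to create, so you typically make a fresh one per frame (or per batch of work), load it, and send it off:

```swift
import Metal

guard let device = MTLCreateSystemDefaultDevice() else {
    fatalError("Metal is not supported on this machine")
}
let commandQueue = device.makeCommandQueue()!

let commandBuffer = commandQueue.makeCommandBuffer()!
// ... encoders (the "cars") would be created and loaded here ...
commandBuffer.commit()              // send the train off to the GPU
commandBuffer.waitUntilCompleted()  // optional: block until the GPU is done
```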
Command Encoder
This represents a specific action we need the GPU to perform, such as compute, blit, or render (which we’ll cover in more detail in the next series). In our analogy, it’s like a train car where we load several actions of the same type, as long as there are no conflicts between the targets. All the necessary resources are also loaded into this car.
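As a small sketch, here’s a blit encoder copying one buffer into another; compute and render encoders are created and "loaded" following the same pattern (the buffer contents are just example values):

```swift
import Metal

guard let device = MTLCreateSystemDefaultDevice() else {
    fatalError("Metal is not supported on this machine")
}
let queue = device.makeCommandQueue()!
let commandBuffer = queue.makeCommandBuffer()!

// Two small buffers: we'll copy the contents of one into the other.
var values: [Float] = [1, 2, 3, 4]
let byteCount = values.count * MemoryLayout<Float>.stride
let source = device.makeBuffer(bytes: &values, length: byteCount,
                               options: .storageModeShared)!
let destination = device.makeBuffer(length: byteCount,
                                    options: .storageModeShared)!

// The blit "car": loads a copy operation into the command buffer.
let blit = commandBuffer.makeBlitCommandEncoder()!
blit.copy(from: source, sourceOffset: 0,
          to: destination, destinationOffset: 0,
          size: byteCount)
blit.endEncoding()  // the car is loaded

commandBuffer.commit()
commandBuffer.waitUntilCompleted()
```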
How it works (basic examples)
1. Create a device object: `MTLDevice`
2. Set up the necessary objects, such as buffers, textures, and the basic pipeline.
3. Create a command queue on the device: `MTLCommandQueue`
4. Create a command buffer on the queue: `MTLCommandBuffer`
    1. Inside the command buffer, create a compute encoder: `MTLComputeCommandEncoder`
        1. Set your compute pipeline in the encoder.
        2. Attach a buffer to receive the results (e.g. a mesh).
        3. Dispatch threadgroups (the GPU's units of computation).
    2. End encoding (this finalizes the "loading" of the compute wagon).
    3. Create a render encoder: `MTLRenderCommandEncoder`, for rendering into a texture
        1. Set your render pipeline in the encoder.
        2. Attach the buffer containing the mesh.
        3. Call a draw method.
        4. End encoding (this finalizes the "loading" of the render wagon).
5. Commit the command buffer (this sends the "train" to its destination).
6. Optionally, wait for the buffer to start or finish executing.
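To tie the steps together, here’s a complete, runnable sketch of the compute half of the walkthrough: doubling an array of floats on the GPU and reading the results back, all off-screen. The kernel and its name are made up for the example:

```swift
import Metal

// 1. Device
guard let device = MTLCreateSystemDefaultDevice() else {
    fatalError("Metal is not supported on this machine")
}

// 2. Resources: input data in a CPU-visible buffer
var input: [Float] = [1, 2, 3, 4]
let buffer = device.makeBuffer(
    bytes: &input,
    length: input.count * MemoryLayout<Float>.stride,
    options: .storageModeShared)!

// ...and a pipeline, built here from a tiny kernel compiled at runtime
let source = """
#include <metal_stdlib>
using namespace metal;

kernel void double_values(device float *data [[buffer(0)]],
                          uint id [[thread_position_in_grid]]) {
    data[id] *= 2.0;
}
"""
let library = try! device.makeLibrary(source: source, options: nil)
let pipeline = try! device.makeComputePipelineState(
    function: library.makeFunction(name: "double_values")!)

// 3. Queue, command buffer, and compute encoder
let queue = device.makeCommandQueue()!
let commandBuffer = queue.makeCommandBuffer()!
let encoder = commandBuffer.makeComputeCommandEncoder()!
encoder.setComputePipelineState(pipeline)
encoder.setBuffer(buffer, offset: 0, index: 0)
encoder.dispatchThreadgroups(
    MTLSize(width: 1, height: 1, depth: 1),
    threadsPerThreadgroup: MTLSize(width: input.count, height: 1, depth: 1))
encoder.endEncoding()  // the wagon is loaded

// Send the train and wait for it to arrive
commandBuffer.commit()
commandBuffer.waitUntilCompleted()

// Read the results back on the CPU — no view or surface needed
let results = buffer.contents().bindMemory(to: Float.self, capacity: input.count)
print((0..<input.count).map { results[$0] })  // [2.0, 4.0, 6.0, 8.0]
```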
As you can see, it’s quite simple if you focus on the principles rather than the boilerplate code (which isn’t too heavy anyway). You might ask: how do you see the results? You can read back the contents of your buffers or textures (when their storage mode allows it), so there’s no need for views or surfaces. This lets you perform off-screen computing or rendering, even in a console application.
Conclusion
- Metal’s principles are quite straightforward (but if anything is unclear, please share your feedback and I’ll do my best to explain it more clearly).
- The basic resources include buffers (for data), textures (for images), and pipelines (for functions).
- Think of the device as having a railroad (command queue) between the CPU and GPU. This railroad is used to send trains (command buffers) with wagons (command encoders) that carry specific operations for the GPU to execute.