Metal Tutorial 5: Tiled Rendering
Prerequisites
Make sure you have completed Tutorial 4: Shadows.
Setting up the project
In MTMetalTutorialsApp.swift, set MT5ContentView() as the root view, then build and run:
import SwiftUI

@main
struct MetalTutorialsApp: App {
    var body: some Scene {
        WindowGroup {
            // Root view for this tutorial
            MT5ContentView()
        }
    }
}

What changes from Tutorial 4
This tutorial keeps the Tutorial 4 scene (bunny, plane, and shadows) unchanged and only changes how the GBuffer is stored and read, in order to improve performance. The changes are:
- GBuffer textures use .memoryless storage mode: they live in GPU tile memory and are never written to DRAM
- The GBuffer pass and the Lighting pass are merged into a single MTLRenderPassDescriptor derived from view.currentRenderPassDescriptor
- The lighting fragment shader receives the GBuffer as a direct GBuffer struct parameter instead of reading from textures
The result: the GBuffer data never leaves the tile, so it costs no DRAM bandwidth.
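As a minimal sketch of the first change, a memoryless GBuffer attachment could be described as follows. The helper name makeGBufferDescriptor is hypothetical and not taken from MT5TiledDeferredRenderer.swift; the pixel format is left to the caller.

```swift
import Metal

/// Hypothetical helper (not from the tutorial sources): describes one GBuffer
/// attachment that exists only in tile memory for the duration of a render pass.
func makeGBufferDescriptor(format: MTLPixelFormat, width: Int, height: Int) -> MTLTextureDescriptor {
    let descriptor = MTLTextureDescriptor.texture2DDescriptor(
        pixelFormat: format, width: width, height: height, mipmapped: false)
    // The texture is only ever a render target; the lighting shader reads it
    // through [[color(N)]] rather than as a sampled texture.
    descriptor.usage = .renderTarget
    // No DRAM backing: contents live in tile memory only.
    descriptor.storageMode = .memoryless
    return descriptor
}
```

A renderer would typically pass the result to device.makeTexture(descriptor:) once per attachment and recreate the textures whenever the drawable size changes.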
Files
| File Name | Description |
|---|---|
| MT5ContentView.swift | Sets up Metal view and starts rendering. |
| MT5DeferredMetalView.swift | Manages Metal layer, frame timing, and render loop. |
| MT5TiledDeferredRenderer.swift | Handles GBuffer creation, texture setup, and command buffer encoding. |
| MT5DeferredRendering.metal | Contains vertex and fragment shaders for geometry pass and lighting pass. |
| MT5RenderTargets.h | Defines constants for render targets (e.g., albedo, normal, position). |
| MT5Uniforms.h | Declares uniform structs used in the Metal shaders. |
Code
GBuffer fragment: unchanged
The geometry-pass shaders (vertex_main, gbuffer_fragment) are identical to Tutorial 4. Metal infers the tile memory write from the [[color(N)]] attributes on the GBuffer struct.
Lighting fragment: new GBuffer struct parameter
In Tutorial 4 the lighting shader read from three texture2d bindings. Tutorial 5 replaces that with a direct GBuffer parameter:
struct GBuffer {
    float4 albedo   [[color(MT5RenderTargetAlbedo)]];
    float4 normal   [[color(MT5RenderTargetNormal)]];
    float4 position [[color(MT5RenderTargetPosition)]];
};

fragment float4 deferred_lighting_fragment(
    QuadInOut in [[stage_in]],
    GBuffer gBuffer, // ← reads from tile, not DRAM
    constant MT5FragmentUniforms &uniforms [[buffer(1)]])
{
    float4 albedo_vis_at_pix = gBuffer.albedo;
    float4 normal_at_pix = gBuffer.normal;
    float4 position_at_pix = gBuffer.position;
    // ... same GGX lighting calculation as Tutorial 3/4
}
The GBuffer parameter with [[color(N)]] attributes reads directly from the tile memory render targets. No texture.read() call, no DRAM access.
Note: This tile-memory read only works inside a single merged render pass. If the passes are split (as in Tutorial 3), tile memory is flushed between passes and DRAM textures must be used instead.
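On the CPU side, the merged pass amounts to encoding both stages with one MTLRenderCommandEncoder. The sketch below illustrates that shape only; encodeMergedPass and its closure parameters are hypothetical names, and the real renderer may organize its encoding differently.

```swift
import Metal

/// Hypothetical helper: encodes the geometry stage and the lighting stage with
/// a single encoder, so the GBuffer stays in tile memory between the two.
/// Returns false if the encoder could not be created.
func encodeMergedPass(into commandBuffer: MTLCommandBuffer,
                      descriptor: MTLRenderPassDescriptor,
                      encodeGeometry: (MTLRenderCommandEncoder) -> Void,
                      encodeLighting: (MTLRenderCommandEncoder) -> Void) -> Bool {
    guard let encoder = commandBuffer.makeRenderCommandEncoder(descriptor: descriptor) else {
        return false
    }
    encoder.label = "GBuffer + Lighting"
    // 1. Geometry stage: the caller sets the geometry pipeline and draws the
    //    scene, filling the GBuffer attachments.
    encodeGeometry(encoder)
    // 2. Lighting stage: the caller sets the lighting pipeline and draws the
    //    full-screen quad, whose fragment shader reads the GBuffer via [[color(N)]].
    encodeLighting(encoder)
    // Ending the encoder once is what keeps both stages in the same pass.
    encoder.endEncoding()
    return true
}
```

Each closure would call setRenderPipelineState with the matching pipeline state and then issue its draw calls. The important property is that endEncoding() runs only after both stages have been encoded.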
What .storeAction = .dontCare actually means
| Store action | What happens | When to use |
|---|---|---|
| .store | GPU writes tile → DRAM | When you need the texture in a later pass or on the CPU |
| .dontCare | GPU discards tile contents | When the attachment is intermediate (GBuffer, intermediate depth) |
| .multisampleResolve | Resolve MSAA | For MSAA render targets |
Using .dontCare on the GBuffer is not just "safe"; it is required for .memoryless textures. If you set .store on a memoryless texture, Metal will report an error at validation time.
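A minimal sketch of how the GBuffer attachments might be wired into the pass descriptor with these actions is shown below. The helper attachGBuffer is hypothetical, and the attachment indices are supplied by the caller because the values of the MT5RenderTarget constants are not shown here.

```swift
import Metal

/// Hypothetical helper: binds GBuffer textures to the given color attachment
/// indices, clearing them on load and discarding them on store.
func attachGBuffer(_ targets: [Int: MTLTexture], to pass: MTLRenderPassDescriptor) {
    for (index, texture) in targets {
        let attachment = pass.colorAttachments[index]!
        attachment.texture = texture
        // A memoryless texture has no previous contents to load, so start from a clear.
        attachment.loadAction = .clear
        attachment.clearColor = MTLClearColor(red: 0, green: 0, blue: 0, alpha: 0)
        // Discard the tile contents at the end of the pass (required for memoryless).
        attachment.storeAction = .dontCare
    }
}
```

The drawable's own attachment is left untouched by this helper, since it still needs .store so the lit image reaches the screen.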
Performance comparison
On Apple Silicon, the bandwidth saving from .memoryless GBuffers is significant:
| Scenario | Tutorial 3 GBuffer bandwidth | Tutorial 5 GBuffer bandwidth |
|---|---|---|
| 1080p, rgba8 albedo | ~8 MB/frame written + ~8 MB/frame read | 0 |
| 1080p, rgba16f normals + positions (two textures) | ~33 MB/frame written + ~33 MB/frame read | 0 |
| Total at 60 fps | ~5 GB/s for GBuffer alone | 0 |
The saved bandwidth also reduces power consumption, which matters for mobile (iPhone/iPad) targets.
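The arithmetic behind such an estimate can be sketched as below. It assumes a 1920×1080 target, one rgba8 texture (4 bytes per pixel) and two separate rgba16f textures (8 bytes per pixel each), with every texture written to DRAM once and read back once per frame; real traffic also depends on compression and caching, so treat the result as a rough upper bound.

```swift
/// Bytes moved per frame when one attachment is written to DRAM once and read back once.
func roundTripBytes(width: Int, height: Int, bytesPerPixel: Int) -> Int {
    let textureSize = width * height * bytesPerPixel
    return textureSize * 2 // one write + one read
}

let width = 1920, height = 1080, fps = 60

let albedo = roundTripBytes(width: width, height: height, bytesPerPixel: 4)   // rgba8
let normal = roundTripBytes(width: width, height: height, bytesPerPixel: 8)   // rgba16f
let position = roundTripBytes(width: width, height: height, bytesPerPixel: 8) // rgba16f

// 82_944_000 bytes per frame, 4_976_640_000 bytes per second (about 4.98 GB/s)
let bytesPerFrame = albedo + normal + position
let bytesPerSecond = bytesPerFrame * fps
print("GBuffer traffic:", Double(bytesPerSecond) / 1e9, "GB/s")
```

With memoryless attachments in a merged pass, all of this traffic disappears because none of the three textures is ever stored or loaded.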
When NOT to use .memoryless
Memoryless textures are a great default for intermediate attachments, but they can't be used when:
- You need to read the texture in a later separate pass (e.g., shadow map reads in the geometry pass)
- You need to read the texture on the CPU (e.g., for image capture, screenshots)
- You're running on a non-tile GPU (Mac with AMD/NVIDIA): memoryless storage is only supported on Apple GPUs, so allocate the texture with .private storage there instead
Metal provides device.supportsFamily(.apple1) to detect tile GPU support if you want to conditionally enable memoryless storage.
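One way to apply that check is sketched below. Both function names are hypothetical; the only Metal calls used are supportsFamily(_:) and the MTLStorageMode cases.

```swift
import Metal

/// Hypothetical helper: picks the GBuffer storage mode from a capability flag.
func gBufferStorageMode(supportsTileMemory: Bool) -> MTLStorageMode {
    // .memoryless needs a tile-based Apple GPU; .private is the ordinary
    // GPU-only storage mode with a normal memory allocation.
    return supportsTileMemory ? .memoryless : .private
}

/// Convenience overload that queries the device with the family check from the text.
func gBufferStorageMode(for device: MTLDevice) -> MTLStorageMode {
    return gBufferStorageMode(supportsTileMemory: device.supportsFamily(.apple1))
}
```

Keeping the decision in a pure function makes it easy to exercise both branches without a specific GPU. On the .private path the renderer would also need the split-pass approach described in the note above, with .store on the GBuffer attachments.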
Key concepts recap
| Concept | What it is |
|---|---|
| TBDR | GPU renders one tile at a time using fast on-chip memory |
| .memoryless | Texture that exists only in tile memory, with no DRAM backing |
| .dontCare store action | Discards tile contents after the pass; required for memoryless |
| Merged render pass | GBuffer + Lighting in one MTLRenderCommandEncoder |
| GBuffer [[color(N)]] parameter | Lighting shader reads GBuffer from tile memory directly |
| Tile memory | Ultra-fast GPU-local memory, ~10 to 100× faster than DRAM access |
GPU Render Pipeline
The GPU render pipeline processes each frame through a series of stages, starting with vertex fetch. Each stage is responsible for processing data in preparation for the next step:
flowchart TD
A["Vertex Fetch"] --> B["Vertex Shader"]
B --> C["Primitive Assembly"]
C --> D["Rasterization"]
D --> E["Fragment Shader"]
E --> F["Blending"]
F --> G["Final Framebuffer Write"]
CPU → GPU Data Flow
flowchart LR
A["MTLCommandBuffer"] --> B["MTLRenderPassDescriptor"]
B --> C["GBufferTextures"]
B --> D["DepthTexture"]
B --> E["LightingPSO"]
B --> F["GeometryPSO"]
Concept Summary
flowchart TD
A["MTLDevice"] --> B["MTLLibrary"]
B --> C["Vertex Shader"]
B --> D["Fragment Shader"]
C --> E["Render Pipeline State Object (PSO)"]
D --> E
A --> F["Depth Stencil State"]
A --> G["GBuffer Textures"]
A --> H["Depth Texture"]
Next: Tutorial 6 (GPU Rendering), which moves draw call generation onto the GPU with MTLIndirectCommandBuffer.
Congratulations!
You've successfully implemented Tile-Based Deferred Rendering (TBDR) in Metal, optimizing your rendering pipeline for Apple Silicon GPUs. Now that you have a solid foundation, try experimenting with different tile sizes and observe how they affect performance and memory usage.
Happy coding!