Shadows | 🌵 Federico Forti (fe0437)

In this tutorial we add shadows to the deferred renderer using shadow mapping — a technique that asks “is this point visible from the light?” before shading it:

Allocate a depth-only shadow map texture (rendered from the light’s point of view)
Shadow pass: render the scene with no fragment shader, writing only depth
Compute a light-space MVP matrix and pass it alongside the regular camera matrices
GBuffer pass: transform each fragment to light space and compare depth against the shadow map
Add a ground plane so the shadows have somewhere to land

Prerequisite: Make sure you have completed Tutorial 3 — Deferred Rendering.

⚙️ Setting up the project

In MTMetalTutorialsApp.swift set MT4ContentView() and run:

@main
struct MetalTutorialsApp: App {
    var body: some Scene {
        WindowGroup {
            MT4ContentView()
        }
    }
}

🔗 Metal API in this tutorial

Object	Docs	Scope	Role
Depth-only `MTLRenderPipelineState`	↗	Scene lifetime	Pipeline with no fragment function; writes only depth
`depth2d<float>` (MSL)	↗	Shader	Depth texture type in Metal Shading Language; supports `compare_func` sampling
`compare_func::less` (MSL sampler)	↗	Shader	Hardware-accelerated depth comparison inside the texture sampler
`MTLSamplerDescriptor`	↗	Pipeline setup	Configures filtering, addressing, and compare function for samplers

💡 New concepts in this tutorial

Shadow map — a depth texture (MTLTexture with .depth32Float pixel format) rendered from the light’s point of view
Shadow pass — a new render pass before the GBuffer pass, depth-only, no fragment shader
Light-space transform — a second MVP matrix computed from the light position
PCF shadow sampling — comparing fragment depth against the shadow map with a hardware compare_func sampler
Ground plane — a procedural MDLMesh plane added to the scene

📁 Files

File	Purpose
`MT4ContentView.swift`	Sets up Metal view and render loop
`MT4DeferredMetalView.swift`	Subclass of `MTKView`, sets up Metal device, command queue, etc.
`MT4DeferredRenderer.swift`	Three-pass renderer: shadow → GBuffer → lighting
`MT4DeferredRendering.metal`	Shadow vertex shader + GBuffer with shadow test
`MT4RenderTargets.h`	Adds `MT4RenderTargetShadow = 4`
`MT4Uniforms.h`	Adds `shadowModelViewProjectionMatrix` to vertex uniforms

Three-pass render sequence

sequenceDiagram
    participant CB as "Command Buffer"
    participant Shadow as "Shadow Pass"
    participant GBuffer as "GBuffer Pass"
    participant Lighting as "Lighting Pass"

    CB->>Shadow: vertex_depth only — writes shadow map texture
    CB->>GBuffer: vertex_main + gbuffer_fragment — reads shadow map, writes GBuffer
    CB->>Lighting: full-screen quad — reads GBuffer, writes drawable

🌑 Shadow mapping: step by step

🌑 How shadow mapping works

Shadow mapping is a two-pass technique:

Shadow pass: render the scene from the light’s perspective. Only write depth (no color). The resulting depth texture is the shadow map.
Shadow test: while shading from the camera, transform each fragment’s position into the light’s clip space and sample the shadow map at that location. If the fragment’s light-space depth is greater than the stored depth, something else is closer to the light, so the fragment is in shadow.

Light's eye ──→ scene ┐──→ depth texture (shadow map)
                               │
Camera's eye ┐──→ fragment └────┘ compare depths → lit or shadowed
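The comparison at the heart of the technique can be sketched on the CPU. This is a minimal illustration rather than the tutorial's code: `ShadowMap` and `isInShadow` are hypothetical names, and the shadow map is assumed to be a plain row-major array of depths in [0, 1].

```swift
// Hypothetical CPU stand-in for a depth texture: row-major depths in [0, 1],
// where smaller values are closer to the light.
struct ShadowMap {
    let width: Int
    let height: Int
    var depths: [Float]

    // Nearest-texel lookup for a UV coordinate in [0, 1] x [0, 1].
    func depth(atU u: Float, v: Float) -> Float {
        let x = min(max(Int(u * Float(width)), 0), width - 1)
        let y = min(max(Int(v * Float(height)), 0), height - 1)
        return depths[y * width + x]
    }
}

// A fragment is shadowed when something nearer to the light was recorded
// at the same shadow-map location. `bias` guards against self-shadowing.
func isInShadow(map: ShadowMap, u: Float, v: Float,
                fragmentDepth: Float, bias: Float = 0.00001) -> Bool {
    return fragmentDepth > map.depth(atU: u, v: v) + bias
}

// 2x2 map: an occluder at depth 0.3 covers the left column only.
let shadowMap = ShadowMap(width: 2, height: 2, depths: [0.3, 1.0, 0.3, 1.0])
let groundDepth: Float = 0.8
let leftShadowed = isInShadow(map: shadowMap, u: 0.25, v: 0.25, fragmentDepth: groundDepth)
let rightShadowed = isInShadow(map: shadowMap, u: 0.75, v: 0.25, fragmentDepth: groundDepth)
print(leftShadowed, rightShadowed)
```

The ground fragment behind the occluder is reported as shadowed, while the one beside it is lit. The GPU version later in this tutorial performs the same test per fragment against a real depth texture.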

Shadow texture creation

Before encoding any passes, we allocate the shadow map. It’s a depth-only texture the same resolution as the framebuffer — we recreate it whenever the view resizes, just like the GBuffer.

Depth textures in Metal. A depth texture is an ordinary MTLTexture with a depth pixel format (.depth32Float, .depth16Unorm, etc.) and usage [.shaderRead, .renderTarget]. The .depth32Float format gives 32-bit IEEE float depth precision — important for reducing shadow acne (depth aliasing artifacts). On Apple Silicon, depth textures are stored in a compressed, hierarchical format in the tile memory during rasterization and are only resolved to DRAM when .store is used — so a shadow-only pass that uses .store to make the depth readable is still efficient because the hardware compresses it automatically.

The allocation, repeated on every resize, looks like this:

let shadowDesc = MTLTextureDescriptor
    .texture2DDescriptor(pixelFormat: .depth32Float,
                         width: Int(size.width),
                         height: Int(size.height),
                         mipmapped: false)
shadowDesc.usage       = [.shaderRead, .renderTarget]
shadowDesc.storageMode = .private
_shadowTexture = device.makeTexture(descriptor: shadowDesc)!
_shadowTexture.label = "Shadow Depth Texture"

Note .depth32Float — 32 bits of precision is important for reducing shadow acne (self-shadowing artifacts).
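The `shadowPassDescriptor` used by the render loop later in this tutorial is not shown. A minimal sketch of how such a descriptor could be configured, assuming a hypothetical helper named `makeShadowPassDescriptor` and a clear depth of 1.0 (the far plane):

```swift
import Metal

// Hypothetical helper (not from the tutorial's source): builds a render pass
// descriptor with a single depth attachment and no color attachments.
func makeShadowPassDescriptor(depthTexture: MTLTexture?) -> MTLRenderPassDescriptor {
    let pass = MTLRenderPassDescriptor()
    pass.depthAttachment.texture = depthTexture
    pass.depthAttachment.loadAction = .clear   // start every frame from "far"
    pass.depthAttachment.clearDepth = 1.0      // assumed: 1.0 is the far plane
    pass.depthAttachment.storeAction = .store  // keep depth so a later pass can sample it
    return pass
}
```

The `.store` action matters here: without it the depth written by the shadow pass would not be available when the GBuffer pass samples the texture.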

The shadow pipeline

The shadow pass has no fragment shader — we only care about depth, not color. Metal needs a dedicated pipeline that reflects this:

Depth-only render pipeline. Leaving fragmentFunction as nil on an MTLRenderPipelineDescriptor (here through the helper’s fragmentFunctionName: nil argument) creates a depth-only pipeline in which the fragment stage is skipped entirely. This is generally cheaper than running a no-op fragment shader, because the GPU can rasterize and write depth without scheduling any fragment work. Setting the color attachment pixel format to .invalid additionally tells Metal that the pass writes no color, so Apple’s tile-based deferred rendering (TBDR) GPUs can skip color work for the pass altogether. Prefer depth-only pipelines for shadow passes.

The pipeline is built as follows:

_shadowPSO = _buildPipeline(
    vertexFunctionName:   "MT4::vertex_depth",
    fragmentFunctionName: nil,           // ← no fragment shader
    label: "ShadowPSO"
) { descriptor in
    descriptor.colorAttachments[0]?.pixelFormat       = .invalid
    descriptor.depthAttachmentPixelFormat             = .depth32Float
    descriptor.stencilAttachmentPixelFormat           = .invalid
}

Passing fragmentFunctionName: nil and setting the color attachment format to .invalid is how you tell Metal “depth only”: the pass generates no color output and writes only the depth values needed for shadow mapping.
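The `_buildPipeline` helper itself is not shown in this tutorial. Assuming it wraps `MTLRenderPipelineDescriptor` in the usual way, the depth-only configuration might reduce to something like the following sketch (`makeShadowPipelineDescriptor` is a hypothetical name):

```swift
import Metal

// Hypothetical sketch of a depth-only pipeline descriptor. The vertex function
// would normally come from the shader library via library.makeFunction(name:).
func makeShadowPipelineDescriptor(vertexFunction: MTLFunction?) -> MTLRenderPipelineDescriptor {
    let descriptor = MTLRenderPipelineDescriptor()
    descriptor.label = "ShadowPSO"
    descriptor.vertexFunction = vertexFunction
    descriptor.fragmentFunction = nil                       // no fragment stage
    descriptor.colorAttachments[0].pixelFormat = .invalid   // no color output
    descriptor.depthAttachmentPixelFormat = .depth32Float   // must match the shadow map format
    descriptor.stencilAttachmentPixelFormat = .invalid
    return descriptor
}
```

A real pipeline would also need a vertex descriptor matching the shader's `[[stage_in]]` layout before the descriptor is passed to `device.makeRenderPipelineState(descriptor:)`.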

Light-space matrix

To render the scene from the light’s perspective we need a second MVP matrix, one where the camera sits at the light position. This is computed every frame alongside the regular camera matrices and passed to the shadow and GBuffer passes via the vertex uniforms.

The key new addition in MT4Uniforms.h:

struct MT4VertexUniforms {
    matrix_float4x4 modelViewMatrix;
    matrix_float3x3 modelViewInverseTransposeMatrix;
    matrix_float4x4 modelViewProjectionMatrix;

    // NEW — transforms vertices into the light's clip space
    matrix_float4x4 shadowModelViewProjectionMatrix;
};

In Swift, this is computed in _buildUniforms:

// Camera is placed AT the light position, looking at scene center
let shadowViewMatrix = float4x4(
    origin: lightPosition, target: center, up: SIMD3<Float>(0, 1, 0))

// Narrow FOV — the light is far away and focused
let shadowProjectionMatrix = float4x4(
    perspectiveProjectionFov: Float.pi / 16,
    aspectRatio: 1, nearZ: 0.1, farZ: 10_000)

let shadowModelViewProjection = shadowProjectionMatrix * shadowViewMatrix * modelMatrix

Both the shadow pass and the GBuffer pass receive this matrix via MT4VertexUniforms.
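The `float4x4(origin:target:up:)` and `float4x4(perspectiveProjectionFov:...)` initializers are the tutorial's own extensions and are not shown. As a rough sketch of what such helpers typically compute, assuming a right-handed view space and Metal's [0, 1] depth range (the function names `lookAt` and `perspective` are hypothetical, and the real extensions may differ):

```swift
import Foundation
import simd

// Hypothetical right-handed look-at matrix: moves the world so that `origin`
// becomes the eye and the eye looks down -z toward `target`.
func lookAt(origin: SIMD3<Float>, target: SIMD3<Float>, up: SIMD3<Float>) -> float4x4 {
    let z = normalize(origin - target)
    let x = normalize(cross(up, z))
    let y = cross(z, x)
    return float4x4(columns: (
        SIMD4<Float>(x.x, y.x, z.x, 0),
        SIMD4<Float>(x.y, y.y, z.y, 0),
        SIMD4<Float>(x.z, y.z, z.z, 0),
        SIMD4<Float>(-dot(x, origin), -dot(y, origin), -dot(z, origin), 1)))
}

// Hypothetical perspective projection mapping view-space depth to NDC z in [0, 1].
func perspective(fovY: Float, aspectRatio: Float, nearZ: Float, farZ: Float) -> float4x4 {
    let ys = 1 / tan(fovY * 0.5)
    let xs = ys / aspectRatio
    let zs = farZ / (nearZ - farZ)
    return float4x4(columns: (
        SIMD4<Float>(xs, 0, 0, 0),
        SIMD4<Float>(0, ys, 0, 0),
        SIMD4<Float>(0, 0, zs, -1),
        SIMD4<Float>(0, 0, zs * nearZ, 0)))
}

// Illustrative values only: a light 10 units in front of the origin.
let sketchLightPosition = SIMD3<Float>(0, 0, 10)
let sketchShadowViewProjection =
    perspective(fovY: .pi / 16, aspectRatio: 1, nearZ: 0.1, farZ: 100)
    * lookAt(origin: sketchLightPosition, target: .zero, up: SIMD3<Float>(0, 1, 0))

// The look-at target should land in the centre of the shadow map,
// with a depth between the near (0) and far (1) planes.
let sketchClip = sketchShadowViewProjection * SIMD4<Float>(0, 0, 0, 1)
let sketchNDC = SIMD3<Float>(sketchClip.x, sketchClip.y, sketchClip.z) / sketchClip.w
print(sketchNDC)
```

Checking that the light's target projects to the middle of the map is a quick way to catch a wrong handedness or a transposed matrix before debugging the shaders.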

Three-pass render loop

With the shadow texture, pipeline, and light-space matrix ready, the render function encodes all three passes into a single command buffer. The GPU executes them in order — shadow, GBuffer, lighting — before presenting the frame.

The encoding code, abridged:

let commandBuffer = _commandQueue.makeCommandBuffer()!

// ── Pass 1: Shadow ────────────────────────────────────────────
_encodePass(into: commandBuffer, using: shadowPassDescriptor, label: "Shadow Pass") { enc in
    enc.setRenderPipelineState(_shadowPSO)
    enc.setDepthStencilState(_depthStencilState)
    enc.setVertexBytes(&uniforms.0, …, index: 1)
    _renderMeshes(enc)    // bunny only (not the plane — it can't cast on itself here)
}

// ── Pass 2: GBuffer ───────────────────────────────────────────
_encodePass(into: commandBuffer, using: gBufferPassDescriptor, label: "GBuffer Pass") { enc in
    enc.setRenderPipelineState(_gBufferPSO)
    enc.setDepthStencilState(_depthStencilState)
    enc.setFragmentTexture(_shadowTexture, index: Int(MT4RenderTargetShadow.rawValue))
    _renderMeshes(enc)    // scene geometry, including the ground plane that receives the shadow
}

// ── Pass 3: Lighting ───────────────────────────────────────────
_encodePass(into: commandBuffer, using: lightingPassDescriptor, label: "Lighting Pass") { enc in
    enc.setRenderPipelineState(_lightingPSO)
    enc.setDepthStencilState(_depthStencilState)
    enc.setFragmentTexture(_gbufferTextures[0], index: 0) // diffuse texture
    enc.setFragmentTexture(_gbufferTextures[1], index: 1) // normal texture
    enc.setFragmentTexture(_shadowTexture, index: 2)      // shadow map
    _renderMeshes(enc)
}

commandBuffer.present(drawable!)
commandBuffer.commit()

🎸 Metal shaders — `MT4DeferredRendering.metal`

Three shader changes from Tutorial 3: the vertex shader gains a lightViewPosition output, a new depth-only vertex shader handles the shadow pass, and the GBuffer fragment shader uses lightViewPosition to sample the shadow map and decide lit vs. shadowed.

Vertex shader

The vertex shader now outputs lightViewPosition in addition to the standard clip-space position. This is used by the GBuffer fragment shader to look up the shadow map:

struct VertexOut {
    float4 clipSpacePosition [[position]];
    float3 viewNormal;
    float4 viewPosition;
    float2 texCoords;
    float4 lightViewPosition;   // NEW: position in light's clip space
};

vertex VertexOut vertex_main(VertexIn vertexIn [[stage_in]],
                                    constant MT4VertexUniforms &uniforms [[buffer(1)]])
{
    VertexOut vertexOut;
    vertexOut.clipSpacePosition = uniforms.modelViewProjectionMatrix * float4(vertexIn.position, 1);
    vertexOut.viewNormal = uniforms.modelViewInverseTransposeMatrix * vertexIn.normal;
    vertexOut.viewPosition = uniforms.modelViewMatrix * float4(vertexIn.position, 1);
    vertexOut.lightViewPosition = uniforms.shadowModelViewProjectionMatrix * float4(vertexIn.position, 1);
    return vertexOut;
}

🌑 Shadow-only vertex shader

The shadow pass needs only depth — no normals, no color, just the clip-space position from the light’s perspective:

struct ShadowVertexIn {
    float3 position  [[attribute(0)]];
};

vertex float4
  vertex_depth(ShadowVertexIn in  [[ stage_in ]],
               constant MT4VertexUniforms &uniforms [[buffer(1)]])
{
  return uniforms.shadowModelViewProjectionMatrix * float4(in.position,1);
}

GBuffer fragment — shadow test

The GBuffer fragment shader projects lightViewPosition into the shadow map’s UV space, samples the depth, and dims the albedo if the fragment is occluded:

fragment GBuffer gbuffer_fragment(
                                         VertexOut fragmentIn [[stage_in]],
                                         depth2d<float> shadowTexture [[ texture(MT4RenderTargetShadow) ]],
                                         constant MT4FragmentUniforms &uniforms [[buffer(1)]]
                                         )
{
    float3 lightCoords = fragmentIn.lightViewPosition.xyz / fragmentIn.lightViewPosition.w;
    float2 lightScreenCoords = lightCoords.xy;
    lightScreenCoords = lightScreenCoords * 0.5 + 0.5;
    lightScreenCoords.y = 1 - lightScreenCoords.y; //invert y

Three things happen in those four lines that every beginner gets stuck on:

1. Perspective division (/ .w): lightViewPosition is a 4D homogeneous coordinate (x, y, z, w) produced by the projection matrix. The w component encodes depth — it grows proportionally to how far the point is from the camera. Dividing xyz by w converts homogeneous clip space into 3D NDC (Normalized Device Coordinates): the result is in the range [-1, 1] on X and Y, and [0, 1] on Z. This division is what creates perspective: farther objects shrink toward the center. The GPU does this automatically for your primary camera position ([[position]]), but for the secondary light-space position you have to do it manually.

2. NDC → UV (* 0.5 + 0.5): NDC X/Y range from -1 to +1. Texture UV coordinates range from 0 to 1. The mapping is:

UV = NDC * 0.5 + 0.5
  →  -1 maps to  0.0  (left/bottom edge of texture)
  →   0 maps to  0.5  (center)
  →  +1 maps to  1.0  (right/top edge of texture)

3. Y-axis flip (y = 1 - y): Metal’s NDC has Y = +1 at the top of the screen and Y = -1 at the bottom. Texture UV coordinates have V = 0 at the top and V = 1 at the bottom (following image file convention). So after the * 0.5 + 0.5 conversion, NDC top (+1 → 1.0) needs to become UV 0.0, and NDC bottom (-1 → 0.0) needs to become UV 1.0. The formula y = 1 - y flips the axis.
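The three steps can be mirrored on the CPU to check values by hand. The sketch below uses only standard-library SIMD types; `shadowMapCoordinates` is a hypothetical name and not part of the tutorial's source.

```swift
// CPU mirror of the shader lines discussed above.
func shadowMapCoordinates(lightClipPosition p: SIMD4<Float>) -> (uv: SIMD2<Float>, depth: Float) {
    let ndc = SIMD3<Float>(p.x, p.y, p.z) / p.w          // 1. perspective division
    var uv = SIMD2<Float>(ndc.x, ndc.y) * 0.5 + 0.5      // 2. [-1, 1] -> [0, 1]
    uv.y = 1 - uv.y                                      // 3. flip Y for texture space
    return (uv, ndc.z)
}

// Clip-space (2, 2, 1, 4) divides to NDC (0.5, 0.5, 0.25).
let lookup = shadowMapCoordinates(lightClipPosition: SIMD4<Float>(2, 2, 1, 4))
print(lookup.uv, lookup.depth)   // uv = (0.75, 0.25), depth = 0.25
```

A point in the upper-right quadrant of the light's view ends up in the upper-right quadrant of the texture, where "upper" now means a small V value. The shader listing continues below.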


    GBuffer out;

    if (lightScreenCoords.x < 0.0 || lightScreenCoords.x > 1.0 ||
        lightScreenCoords.y < 0.0 || lightScreenCoords.y > 1.0) {
        out.albedo = float4(1, 0, 0, 1);   // outside the shadow map: flagged in red
    } else {
        constexpr sampler s(
          coord::normalized, filter::linear,
          address::clamp_to_edge,
          compare_func::less);

        float4 albedo = float4(1,1,1,1);

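        // sample() returns the stored depth; the sampler's compare_func is only
        // applied by sample_compare(), so the comparison below runs in shader code.
        float depthValue = shadowTexture.sample(s, lightScreenCoords);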
        if (lightCoords.z > depthValue + 0.00001f) {
            albedo *= 0.4;   // in shadow — darken
        }
        // … fill GBuffer outputs …
    }
}

depth2d<float> vs texture2d<float> in MSL. In Metal Shading Language, a depth texture is declared as depth2d<float> (not texture2d<float>), which gives access to the hardware depth-comparison path through sample_compare. With compare_func::less set on the sampler, shadowTexture.sample_compare(s, lightScreenCoords, lightCoords.z) performs the test inside the texture unit: it reads the stored depth, compares it with the reference value you pass, and returns 1.0 where the test passes (the fragment is nearer to the light than the stored depth, so it is lit) and 0.0 where it fails. This is the PCF (Percentage Closer Filtering) path: with filter::linear, the GPU compares the four nearest texels and blends the results, giving a softer shadow edge for little extra shader cost. Note that the listing above calls sample rather than sample_compare, so it reads the stored depth and does the comparison in shader code; the sampler’s compare_func only takes effect with sample_compare.
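What "compare four texels, then blend" means can be illustrated on the CPU. The following is a simplified sketch with hypothetical names; real hardware differs in details such as rounding and texel addressing, so treat it as an explanation of the idea rather than a model of the GPU.

```swift
// CPU illustration of 2x2 percentage-closer filtering.
// `depths` is a row-major shadow map; the result is 1.0 for fully lit
// and 0.0 for fully shadowed.
func pcfLitFraction(depths: [Float], width: Int, height: Int,
                    u: Float, v: Float, reference: Float) -> Float {
    // Position in texel space, shifted so texel centres sit at integer coordinates.
    let x = u * Float(width) - 0.5
    let y = v * Float(height) - 0.5
    let x0 = Int(x.rounded(.down))
    let y0 = Int(y.rounded(.down))
    let fx = x - Float(x0)
    let fy = y - Float(y0)

    // Equivalent of compare_func::less: 1.0 when the reference is nearer
    // to the light than the stored depth.
    func test(_ tx: Int, _ ty: Int) -> Float {
        let cx = min(max(tx, 0), width - 1)    // clamp to edge
        let cy = min(max(ty, 0), height - 1)
        return reference < depths[cy * width + cx] ? 1 : 0
    }

    // Compare first, then blend the four binary results bilinearly.
    let top = test(x0, y0) * (1 - fx) + test(x0 + 1, y0) * fx
    let bottom = test(x0, y0 + 1) * (1 - fx) + test(x0 + 1, y0 + 1) * fx
    return top * (1 - fy) + bottom * fy
}

// Left column holds an occluder (0.3); right column is empty (1.0).
let pcfDepths: [Float] = [0.3, 1.0,
                          0.3, 1.0]
let edgeLit = pcfLitFraction(depths: pcfDepths, width: 2, height: 2,
                             u: 0.5, v: 0.5, reference: 0.8)
print(edgeLit)   // halfway across the shadow edge
```

The important point is the order of operations: filtering the comparison results gives a fractional lit value at shadow edges, whereas filtering the depths first and comparing once still produces a hard edge.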

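The small bias (+ 0.00001f) is added to the stored depth, so a fragment counts as shadowed only when it lies farther from the light than the stored value by more than the bias. This prevents shadow acne, a self-shadowing artifact in which limited depth precision and the shadow map’s finite resolution make a surface appear to shadow itself. The ideal bias depends on the surface slope relative to the light; some implementations use a slope-scaled bias (glPolygonOffset-style; in Metal, setDepthBias(_:slopeScale:clamp:) on the render command encoder during the shadow pass) for better quality.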
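One common shader-side formulation of a slope-dependent bias scales it by the angle between the surface normal and the direction to the light. The sketch below shows the arithmetic on the CPU; the function name and the `minBias` and `maxBias` values are illustrative assumptions, not taken from the tutorial.

```swift
// Hypothetical helper: grows the bias as the surface turns away from the light.
// Both vectors are assumed to be unit length.
func slopeScaledBias(normal: SIMD3<Float>, directionToLight: SIMD3<Float>,
                     minBias: Float = 0.00001, maxBias: Float = 0.0005) -> Float {
    let cosTheta = max((normal * directionToLight).sum(), 0)   // clamped dot product
    return max(maxBias * (1 - cosTheta), minBias)
}

let facingBias = slopeScaledBias(normal: SIMD3<Float>(0, 1, 0),
                                 directionToLight: SIMD3<Float>(0, 1, 0))
let grazingBias = slopeScaledBias(normal: SIMD3<Float>(0, 1, 0),
                                  directionToLight: SIMD3<Float>(1, 0, 0))
print(facingBias, grazingBias)   // the grazing surface gets the larger bias
```

Surfaces lit head-on keep the small constant bias, while surfaces at grazing angles, where one shadow-map texel covers a large depth range, get more. Too large a bias detaches shadows from their casters, so the values usually need tuning per scene.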