2D Wave Simulation Using Metal Shaders

I pressed my thumb to the glass and a ring of light bloomed outward, struck the far wall, bounced back, crossed a second ring I had already forgotten launching, and threw an interference pattern across the surface that shimmered the exact way creek water does when two stones land close together. I sat on my couch for ten minutes straight, tapping glass and grinning, like someone who had just taught a phone to remember what water feels like.

The short version

PixelWave runs a full 2D wave simulation on the GPU via Metal compute shaders, rendered with a stylized ocean fragment shader
Physics: the 2D wave equation, discretized with a 9 point Laplacian stencil and Verlet integration
Touch injection uses a displacement/velocity decomposition to produce natural expanding ripples in a Verlet scheme
The renderer layers Blinn-Phong lighting, Fresnel reflections, foam, and activity based transparency into a convincing water surface
120 Hz on a modern iPhone, with cycles to spare

Why this project

There is a particular pleasure in typing an equation into a compiler and watching it breathe. You encode one rule, a relationship between curvature and acceleration, and complexity pours out of it unbidden: concentric rings, reflections off walls, interference patterns that ripple into each other and dissolve. Water does all of this from a single partial differential equation, which means the gap between "I understand the math" and "I am holding a living ocean in my palm" is narrower than it has any right to be. I wanted to close that gap.

Four constraints shaped the project:

120 Hz. No dropped frames. The GPU carries the whole load.
Touch reactive. Finger meets glass, ripple expands. No perceptible lag.
Visually convincing. Something that reads as water, not a debug heatmap or a thermal overlay.
Understood, not copied. Every equation derived by hand before it touches a shader.

Metal made this possible in a way that no abstraction layer could have. Apple's GPU API gives you compute kernels for simulation and render shaders for visualization in the same pipeline, and the distance between your math and the silicon that executes it is measured in function calls, not frameworks.

Architecture

SwiftUI (ContentView)
  └── UIViewRepresentable (WaveSimulationView)
        └── MTKView subclass (InteractiveMTKView)
              └── MTKViewDelegate (WaveRenderer)
                    ├── Compute Pipelines (simulation)
                    └── Render Pipeline  (visualization)

Layer	Role
ContentView	SwiftUI host. Sliders and color pickers for live parameter tuning.
WaveSimulationView	Bridges SwiftUI to UIKit. Creates the Metal view. Forwards touches.
InteractiveMTKView	Converts `UITouch` events into normalized UV coordinates with pressure.
WaveRenderer	The engine. Owns the Metal device, command queue, textures. Orchestrates every frame.

The entire GPU codebase lives in WaveShaders.metal, 246 lines of Metal Shading Language split between two compute kernels and one vertex/fragment pair. The Swift renderer that drives them weighs in at 385 lines. The whole simulation, physics and rendering both, fits in two files you could print on a dozen pages.

The wave equation

Physics

Every ripple you have ever watched spread across a pond obeys the 2D wave equation:

$$\frac{\partial^2 u}{\partial t^2} = c^2 \nabla^2 u$$

$u(x, y, t)$ is the wave height at a point in space and time, $c$ is the propagation speed, and $\nabla^2 u$ is the Laplacian, which measures how much a point's height differs from the average height of its neighbors.

The intuition is mechanical and satisfying: a point sitting higher than its surroundings gets pulled down by the neighbors it towers over, and a point sitting lower gets hauled up by the neighbors rising above it. That tug of war moves through the grid cell by cell, generating oscillation, propagation, interference, all the complex behavior that makes water mesmerizing to watch, from a rule you could fit on a Post it note.

Add a friction term so ripples eventually fade:

$$\frac{\partial^2 u}{\partial t^2} + \gamma \frac{\partial u}{\partial t} = c^2 \nabla^2 u$$

Without $\gamma$, every ripple you create lives forever, which is beautiful in theory and unusable on screen because the surface fills up with ghosts of old touches within seconds.

Discretization

Continuous equations live on chalkboards. GPUs live on grids, and they want everything expressed as reads from one memory address and writes to another.

Time gets sliced into fixed steps of $\Delta t = \frac{1}{120}$ seconds. Space becomes a 2D grid where each cell covers pixelSize screen pixels; at the default of 3.0, you get a deliberate chunky look, pixels you can count individually, waves rolling across them like a tide over cobblestones.

For integration I chose Verlet, a scheme whose elegance lies in its thrift. You keep only two snapshots of the height field, current and previous, and velocity is never stored explicitly; it exists as the gap between two frames of height data, a ghost quantity reconstructed whenever you need it:

$$v^n = u^n - u^{n-1}$$

Each simulation step plays out in four lines:

$$a^n = \sigma \cdot \nabla^2 u^n$$

$$v^n \leftarrow v^n + a^n$$

$$v^n \leftarrow v^n \cdot e^{-\gamma \Delta t}$$

$$u^{n+1} = u^n + v^n$$

$\sigma$ is the squared Courant number, clamped against catastrophe:

$$\sigma = \min\!\left((c \cdot \Delta t)^2,\ 0.48\right)$$

The standard 2D scheme is stable only for $\sigma \le 0.5$, so clamping at 0.48 keeps a small margin below the CFL stability limit, and crossing that limit is an immediate catastrophe. Let $\sigma$ reach 0.5 and the simulation stops dissipating energy and starts manufacturing it; numbers climb exponentially, overflow within a handful of frames, and the height field fills with NaN before you can blink. My first prototype lasted five frames.

Frame 1: "looking good..."
Frame 2: "nice ripples..."
Frame 3: "wait, why is that pixel at 847.0?"
Frame 4: "oh no"
Frame 5: NaN NaN NaN NaN NaN NaN NaN

Five frames is a humbling lifespan for something you spent an afternoon building, but it taught me a lesson that colored every decision afterward: numerical stability is a hard boundary, and you do not lean on it to see how much give it has.
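
As a rough CPU side sketch of the four update lines and the clamp, one cell's update might look like this in Swift. The function name `verletStep` and its parameter names are illustrative assumptions, not taken from the shader, and the Laplacian is passed in as a plain number:

```swift
import Foundation

/// One damped Verlet update for a single cell.
/// Hypothetical CPU-side helper for illustration; names are not from the shader.
func verletStep(current: Float, previous: Float, laplacian: Float,
                waveSpeed: Float, dt: Float, damping: Float) -> Float {
    let courant = waveSpeed * dt
    let sigma = min(courant * courant, 0.48)   // clamped squared Courant number
    var velocity = current - previous          // v^n, implicit in the two snapshots
    velocity += sigma * laplacian              // add the acceleration a^n
    velocity *= exp(-damping * dt)             // exponential drag
    return current + velocity                  // u^{n+1}
}
```

With the defaults quoted in this post (wave speed 12, $\Delta t = 1/120$), $\sigma$ works out to about 0.01, so the clamp only matters once the speed is pushed far higher.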

The 9 point Laplacian

The naive way to compute the Laplacian on a grid is the 5 point stencil, which samples the four cardinal neighbors and calls the job done:

$$\nabla^2_5 u = u_{L} + u_{R} + u_{U} + u_{D} - 4\,u_{C}$$

         [ ]
          |
   [ ] -- [C] -- [ ]
          |
         [ ]

It works correctly in the mathematical sense, but it lies about geometry. Because the stencil only reaches along the axes, waves propagate faster horizontally and vertically than they do diagonally, which means a circular impulse expands into a diamond, the kind of subtle wrongness that makes you squint at the screen and mutter "something is off" without being able to name it.

  What I got:          What I wanted:
                       
     *                    ***
    * *                  *****
   *   *                *******
    * *                  *****
     *                    ***

I stared at diamond shaped ripples for two solid days, tweaking damping coefficients, adjusting propagation speed, blaming floating point precision, growing increasingly convinced that something was wrong with my Verlet implementation. Then I stumbled onto a Wikipedia article about grid isotropy, and two days of confusion collapsed into three lines of code.

The 9 point stencil folds in the four diagonal neighbors with a weighted average that restores the rotational symmetry the 5 point version destroys:

$$\frac{1}{6}\begin{bmatrix}1 & 4 & 1 \\4 & -20 & 4 \\1 & 4 & 1\end{bmatrix}$$

In practice I compute it as a blend of the cardinal and diagonal Laplacians, which keeps the shader code clean and the arithmetic cheap:

$$\nabla^2_4 = u_L + u_R + u_U + u_D - 4u_C$$

$$\nabla^2_{\text{diag}} = u_{UL} + u_{UR} + u_{DL} + u_{DR} - 4u_C$$

$$\nabla^2 u \approx \frac{4 \cdot \nabla^2_4 + \nabla^2_{\text{diag}}}{6}$$

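// Cardinal Laplacian: the four axis aligned neighbors.
float laplacian4 = currentL + currentR + currentD + currentU - (4.0 * current);
// Diagonal Laplacian: the four corner neighbors.
float laplacianDiag = currentUL + currentUR + currentDL + currentDR - (4.0 * current);
// Isotropic 9 point blend: weight the cardinal term 4:1 and divide by 6.
float laplacian = (4.0 * laplacian4 + laplacianDiag) / 6.0;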

Three lines. Circles instead of diamonds. Two days of accumulated frustration, dissolved by one Wikipedia page and a weighted average. This was, by a wide margin, the single most valuable change in the entire project.
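
For checking the stencil outside the GPU, a minimal CPU side sketch in Swift follows. The helper `laplacian9` and the row major array layout are assumptions made for this example, and it only handles interior cells:

```swift
/// 9 point Laplacian at (x, y) of a row-major grid.
/// Hypothetical CPU-side helper; valid for interior cells only.
func laplacian9(_ u: [Float], width: Int, x: Int, y: Int) -> Float {
    // Sample a neighbor at an offset from (x, y).
    func at(_ dx: Int, _ dy: Int) -> Float { u[(y + dy) * width + (x + dx)] }
    let center = at(0, 0)
    let cardinal = at(-1, 0) + at(1, 0) + at(0, -1) + at(0, 1) - 4 * center
    let diagonal = at(-1, -1) + at(1, -1) + at(-1, 1) + at(1, 1) - 4 * center
    return (4 * cardinal + diagonal) / 6
}

// u = x^2 + y^2 sampled on a 3x3 grid; its true Laplacian is 4 everywhere.
let bowl: [Float] = [0, 1, 4,
                     1, 2, 5,
                     4, 5, 8]
let bowlLaplacian = laplacian9(bowl, width: 3, x: 1, y: 1)
```

A quadratic bowl is a convenient sanity check because both the cardinal and the blended stencil should reproduce its Laplacian exactly.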

Dispersion correction

The difference between the diagonal and cardinal Laplacians captures high frequency anisotropic content, the energy that the 9 point blending suppresses:

$$\mu = \nabla^2_{\text{diag}} - \nabla^2_4$$

Blending $\mu$ back into the acceleration with a tunable parameter $\delta$ lets you reintroduce controlled amounts of grid aligned noise:

$$a = \sigma \cdot (\nabla^2 u + \delta \cdot \mu)$$

At $\delta = 0$ the surface is glassy smooth. Crank it up and small scale chop appears, the kind of fine grained surface texture that wind and surface tension scatter across real water. The physics behind this is admittedly handwavy, but the visual result is convincing, and on a phone screen that you are poking with your thumb, visual conviction is the only standard that counts.
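
A minimal sketch of how the two Laplacians could combine into the acceleration, written in Swift for readability. The function name and parameter labels are hypothetical; only the formula comes from the equations above:

```swift
/// Acceleration with the dispersion term blended back in.
/// Hypothetical helper; `cardinal` and `diagonal` are the two unnormalized Laplacians.
func acceleration(cardinal: Float, diagonal: Float,
                  sigma: Float, dispersion: Float) -> Float {
    let isotropic = (4 * cardinal + diagonal) / 6   // 9 point blend
    let mu = diagonal - cardinal                    // anisotropic residue
    return sigma * (isotropic + dispersion * mu)
}
```

When the two Laplacians agree, $\mu$ vanishes and the dispersion parameter has no effect, which is why smooth swells stay smooth while only fine detail picks up chop.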

Triple buffering

Verlet integration reads height values at two timesteps to compute the third, which means the simulation needs three R32Float textures that take turns playing past, present, and future:

Texture	Contents
`previousTexture`	Height at $t - \Delta t$
`currentTexture`	Height at $t$
`nextTexture`	Write target for $t + \Delta t$

After each simulation step the three rotate: current becomes previous, next becomes current, and the old previous is recycled as the next write target:

let oldPrevious = previousTexture
previousTexture = currentTexture
currentTexture  = nextTexture
nextTexture     = oldPrevious  // recycled

No allocations anywhere in the frame loop. The GPU reads from two textures and writes to a third with no read write conflicts, no synchronization barriers, just three Swift pointers trading places frame after frame, like a card trick performed at 120 Hz.

Fixed timestep accumulator

The simulation ticks at a locked 120 Hz, but the display refreshes at whatever cadence the hardware provides, which on a hot iPhone can swing between 120 Hz and 60 Hz within seconds. A classic accumulator decouples physics from rendering so the simulation stays deterministic regardless of frame rate:

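accumulator += frameDelta
var steps = 0  // physics steps taken this frame

// Run whole fixed steps until caught up or the cap (maxSteps, ten here) is hit.
while accumulator >= fixedTimeStep && steps < maxSteps {
    encodeWaveStep(commandBuffer)
    rotateTextures()
    accumulator -= fixedTimeStep
    steps += 1
}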

When the display runs at 60 Hz the loop executes two physics steps per frame; at 120 Hz, one. The cap at ten steps prevents the spiral of death, the feedback loop where falling behind generates more work that makes you fall further behind. If the GPU cannot keep up, the simulation sheds time rather than accumulating a debt it can never repay.
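
The step counting can be isolated from Metal and tested on its own. The sketch below is a hypothetical `StepAccumulator` type, not code from the renderer, and the explicit reset after hitting the cap is an assumption about how "shedding time" might be implemented:

```swift
/// Fixed-timestep accumulator, CPU-side only.
/// Hypothetical type for illustration; the real renderer encodes GPU work per step.
struct StepAccumulator {
    let fixedTimeStep: Double
    let maxSteps: Int
    var accumulator: Double = 0

    /// Returns how many physics steps to run for a frame that took `frameDelta` seconds.
    mutating func stepsForFrame(frameDelta: Double) -> Int {
        accumulator += frameDelta
        var steps = 0
        while accumulator >= fixedTimeStep && steps < maxSteps {
            accumulator -= fixedTimeStep
            steps += 1
        }
        // Assumption: if the cap was hit with time still owed, drop the backlog
        // so the debt cannot grow from frame to frame.
        if accumulator >= fixedTimeStep {
            accumulator = 0
        }
        return steps
    }
}

var pacing = StepAccumulator(fixedTimeStep: 1.0 / 120.0, maxSteps: 10)
let stepsAt60Hz = pacing.stepsForFrame(frameDelta: 1.0 / 60.0)   // two steps
let stepsAfterStall = pacing.stepsForFrame(frameDelta: 1.0)      // capped at ten
```

Keeping the counter separate from the encoding calls makes the pacing logic easy to unit test without a GPU.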

Viscosity

Real water has internal friction: fast moving molecules drag on slow moving neighbors, smoothing velocity differences across the surface the way a hand smoothing wet clay blurs sharp ridges into gentle slopes. I model this by blending each cell's velocity toward the average of its four cardinal neighbors:

$$\bar{v} = \frac{1}{4}\sum_{k \in \{L,R,U,D\}} (u_k^n - u_k^{n-1})$$

$$\beta = \text{clamp}(\nu \cdot \Delta t \cdot 12, \; 0, \; 0.5)$$

$$v \leftarrow (1 - \beta) \, v + \beta \, \bar{v}$$

The $\Delta t$ factor makes the smoothing framerate independent, and the $0.5$ clamp prevents over smoothing from triggering yet another instability. There is a pattern here that became impossible to ignore after the third encounter: every knob in a wave simulation has a setting where it transitions from "useful parameter" to "system screaming."

At zero viscosity you get sharp, jittery ripples with a digital edge; higher values blur fine detail while preserving the broad swells, the difference between rain striking a puddle and a spoon turning slowly through warm honey.
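
A minimal Swift sketch of the blend for one cell, assuming the four neighbor velocities have already been reconstructed from the two height snapshots. The function name and labels are illustrative:

```swift
/// Blend a cell's velocity toward the average of its four cardinal neighbors.
/// Hypothetical CPU-side helper; inputs are per-step velocities (u^n - u^{n-1}).
func applyViscosity(velocity: Float,
                    left: Float, right: Float, up: Float, down: Float,
                    viscosity: Float, dt: Float) -> Float {
    let average = (left + right + up + down) / 4
    let beta = min(max(viscosity * dt * 12, 0), 0.5)   // clamp(nu * dt * 12, 0, 0.5)
    return (1 - beta) * velocity + beta * average
}
```

At the default viscosity of 0.18 and $\Delta t = 1/120$, $\beta$ comes out to 0.018, so each step nudges a cell less than 2% of the way toward its neighbors.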

Damping

Exponential drag on velocity, applied per cell per step:

$$v \leftarrow v \cdot e^{-\gamma \cdot \Delta t}$$

At $\gamma = 0.35$ ripples live about two seconds, long enough to watch interference patterns form and dissolve, short enough that the screen clears itself before old touches pile up into visual noise.
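
As a back of the envelope check on that figure (plain arithmetic on the formula above, using $\Delta t = 1/120$), the per step multiplier and its compounded value over two seconds, or 240 steps, are roughly:

$$e^{-\gamma \Delta t} = e^{-0.35/120} \approx 0.9971, \qquad \left(e^{-\gamma \Delta t}\right)^{240} = e^{-0.7} \approx 0.50$$

So the drag factor alone compounds to about one half over two seconds. How quickly a ripple actually stops being visible also depends on geometric spreading, viscosity, and edge absorption, so this is an estimate rather than a measured lifetime.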

Boundary conditions

Waves that reach the edge of the simulation grid need a fate, and the three classical options each have drawbacks: clamping to zero creates hard reflections, wrapping creates ghostly duplicates, and full absorption looks dead. I built a smooth absorption ramp that blends between reflection and absorption through a configurable coefficient:

$$d_{\text{edge}} = \text{distance to nearest edge}$$

$$\eta = \text{smoothstep}\!\left(0, \; 1, \; \frac{d_{\text{edge}}}{\text{edgeZone}}\right)$$

$$\alpha_b = \text{mix}(r, \; 1, \; \eta)$$

$r = 0$ swallows everything. $r = 1$ mirrors everything. The default of $0.42$ absorbs most of the incoming energy but sends back a faint echo, the way a sandy beach returns a whisper of each wave that laps against it.

Both velocity and displacement get multiplied by $\alpha_b$ in the absorption zone:

$$v \leftarrow v \cdot \alpha_b, \qquad u^{n+1} \leftarrow u^{n+1} \cdot \alpha_b$$

The smoothstep transition is load bearing. Without it, the sharp boundary between "absorbing" and "interior" spawns its own parasitic reflections, visible as a bright halo at the grid edge that no amount of parameter tuning can hide.
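
The ramp can be sketched on the CPU in a few lines of Swift. `smoothstep` here mirrors the Metal built in, while `edgeAttenuation`, its labels, and the edge zone width of 8 cells used in the example are assumptions made for illustration:

```swift
/// Hermite smoothstep, matching the shading-language built-in.
func smoothstep(_ edge0: Float, _ edge1: Float, _ x: Float) -> Float {
    let t = min(max((x - edge0) / (edge1 - edge0), 0), 1)
    return t * t * (3 - 2 * t)
}

/// Multiplier applied to velocity and displacement near the grid edge.
/// Hypothetical helper: returns `reflection` at the edge and 1 in the interior.
func edgeAttenuation(distanceToEdge: Float, edgeZone: Float, reflection: Float) -> Float {
    let eta = smoothstep(0, 1, distanceToEdge / edgeZone)
    return reflection + (1 - reflection) * eta   // mix(r, 1, eta)
}

let atEdge = edgeAttenuation(distanceToEdge: 0, edgeZone: 8, reflection: 0.42)
let halfway = edgeAttenuation(distanceToEdge: 4, edgeZone: 8, reflection: 0.42)
```

Because the ramp has zero slope at both ends, the multiplier joins the interior value of 1 without a kink, which is the property that avoids the parasitic reflections.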

Touch injection

This was the problem that cost me two pages of notebook paper, three wrong approaches, and one moment of clarity that arrived so quietly I almost missed it.

Gaussian impulse

Each finger down event radiates a Gaussian falloff from the touch point:

$$f(d) = \begin{cases}\exp\!\left(\frac{-d^2}{r^2 \cdot 0.18}\right) & \text{if } d < r \\0 & \text{otherwise}\end{cases}$$

The 0.18 in the denominator puts the Gaussian's $1/e$ radius at $\sqrt{0.18} \approx 42\%$ of the brush radius, so the falloff has already dropped below 0.4% of its peak by the time it is cut off at the brush edge. That gives you a sharp peak surrounded by a soft skirt: clean, defined ripples instead of a blurry plateau that looks like someone sat on the water.
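
A minimal Swift sketch of the falloff, useful for probing its shape numerically. The function name `brushFalloff` is an assumption for this example:

```swift
import Foundation

/// Gaussian brush falloff, cut off at the brush radius.
/// Hypothetical CPU-side helper mirroring the piecewise formula above.
func brushFalloff(distance: Float, radius: Float) -> Float {
    guard distance < radius else { return 0 }
    return exp(-(distance * distance) / (radius * radius * 0.18))
}

let peak = brushFalloff(distance: 0, radius: 1)
let atOneOverE = brushFalloff(distance: Float(0.18).squareRoot(), radius: 1)
```

Evaluating at $d = \sqrt{0.18}\,r$ returns $e^{-1}$, which is one way to read the constant: it sets where the bump has fallen to about 37% of its peak.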

Displacement/velocity decomposition

Verlet stores no explicit velocity field, and that is both its elegance and, when you try to inject energy, its particular cruelty. Velocity exists only as the gap between two height snapshots, $v = u^n - u^{n-1}$, which raises an uncomfortable question: how do you inject both a position displacement and a velocity kick when the only state you can touch is two textures full of heights?

First attempt: add the impulse to currentTexture only. A shallow dent forms. It barely radiates outward, sitting there like a thumbprint in modeling clay.

Second attempt: add opposite values to current and previous, reasoning that the difference would become velocity. It does, but the splashes are lopsided, asymmetric in a way that immediately reads as wrong.

Third attempt, the one that finally works, decomposes the impulse into a displacement fraction and a velocity fraction, adds the displacement to both textures with the same sign, and splits the velocity kick between them with opposite signs:

$$d_{\text{pos}} = 0.18 \cdot I \quad \text{(displacement)}$$

$$v_{\text{kick}} = 0.82 \cdot I \quad \text{(velocity)}$$

$$u^n \leftarrow u^n + d_{\text{pos}} + \tfrac{1}{2} v_{\text{kick}}$$

$$u^{n-1} \leftarrow u^{n-1} + d_{\text{pos}} - \tfrac{1}{2} v_{\text{kick}}$$

The algebra reveals why: adding equal amounts to both textures shifts position without altering velocity, because the difference $u^n - u^{n-1}$ remains unchanged; adding opposite amounts alters velocity while leaving the midpoint of the two snapshots where it was. Combining both:

$$v_{\text{new}} = (u^n + d + \tfrac{1}{2}k) - (u^{n-1} + d - \tfrac{1}{2}k) = v_{\text{old}} + k$$

The 82/18 split biases hard toward velocity, producing expanding rings that radiate outward from the touch point like ripples from a stone, rather than a static dimple that sits in place like a thumbprint in wet sand.

float displacement = impulse * 0.18;
float velocityKick = impulse * 0.82;

current  += displacement + (0.5 * velocityKick);
previous += displacement - (0.5 * velocityKick);
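
To make the split concrete, here is the same update worked by hand for a unit impulse $I = 1$ landing on a flat, resting cell ($u^n = u^{n-1} = 0$, values chosen purely for illustration):

$$u^n = 0.18 + \tfrac{1}{2}(0.82) = 0.59, \qquad u^{n-1} = 0.18 - \tfrac{1}{2}(0.82) = -0.23$$

$$v = u^n - u^{n-1} = 0.82, \qquad \tfrac{1}{2}\left(u^n + u^{n-1}\right) = 0.18$$

The implicit velocity picks up exactly the kick, and the midpoint of the two snapshots picks up exactly the displacement.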

The math is obvious once you see it laid out. Getting there was not. I filled two notebook pages with wrong derivations, crossed out diagrams, and increasingly frustrated margin notes before the correct decomposition appeared, small and quiet, at the bottom of the second page.

Simulator fallback

A practical annoyance worth mentioning: the iOS Simulator lacks Metal read_write texture support (tier 2), which means the disturbance kernel that reads and writes textures in a single pass on real hardware cannot run as written on the simulator. The workaround is a split pass fallback that reads into a scratch texture and blits back:

#if targetEnvironment(simulator)
    disturbanceStrategy = .splitPass
#else
    if device.readWriteTextureSupport == .tier2 {
        disturbanceStrategy = .readWrite
    } else {
        disturbanceStrategy = .splitPass
    }
#endif

Same physics, twice the memory traffic, identical output.

Rendering

The simulation produces a 2D grid of floating point height values: a grayscale field that tells you everything about the wave's structure and nothing about what water looks like. A raw height map is information, not imagery. The fragment shader's job is to bridge that gap, turning numbers into something the eye accepts as a liquid surface.

Slope, curvature, normal

Five texture samples per pixel, center and four cardinal neighbors, yield the slope and curvature that drive every subsequent lighting decision:

$$s_x = u_R - u_L, \quad s_y = u_U - u_D$$

$$\kappa = |u_L + u_R + u_U + u_D - 4u_C|$$

The slope vector becomes a 3D surface normal:

$$\hat{n} = \text{normalize}(-2.2 \, s_x, \; -2.2 \, s_y, \; 1.0)$$

The 2.2 multiplier exaggerates the normal deflection so that the lighting reads more dramatically than the actual height displacement warrants, a deliberate lie that makes shallow waves catch light the way deep waves would. Honest normals produce honest lighting, which turns out to look flat and lifeless.
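
A small Swift sketch of the slope to normal step, using the standard library's `SIMD3` type so it runs without a GPU. The function name and labels are illustrative:

```swift
/// Surface normal from the four cardinal height samples.
/// Hypothetical CPU-side helper; 2.2 is the exaggeration factor from the text.
func surfaceNormal(left: Float, right: Float, up: Float, down: Float) -> SIMD3<Float> {
    let slopeX = right - left
    let slopeY = up - down
    let n = SIMD3<Float>(-2.2 * slopeX, -2.2 * slopeY, 1)
    return n / (n * n).sum().squareRoot()   // normalize to unit length
}

let flatNormal = surfaceNormal(left: 0, right: 0, up: 0, down: 0)
let tiltedNormal = surfaceNormal(left: 0, right: 1, up: 0, down: 0)
```

A flat patch yields a normal pointing straight at the camera, and a surface rising to the right tips the normal toward negative x.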

Blinn-Phong lighting

A single directional light from above left, camera pointing straight down, gives the surface its sense of depth:

$$\hat{l} = \text{normalize}(-0.28, \; 0.62, \; 0.74)$$

$$\hat{v} = (0, \; 0, \; 1)$$

Diffuse: $I_d = \max(\hat{n} \cdot \hat{l}, \; 0)$

Fresnel (Schlick style falloff, with exponent 4 in place of Schlick's usual 5 and no base reflectance term): $F = (1 - \text{saturate}(\hat{n} \cdot \hat{v}))^4$

You already know Fresnel from a lifetime of looking at water. Gaze straight down into a swimming pool and you see the tiles on the bottom; tilt your line of sight toward the horizon and the surface becomes a mirror that reflects the sky. The power-4 curve approximates that transition, and getting it right does more for the water illusion than any other single technique in the shader.

Specular with curvature adaptive gloss:

$$g = \text{mix}(28, \; 88, \; 1 - \text{saturate}(2.6\kappa))$$

Where the surface is calm, specular highlights are tight, bright pinpoints ($g = 88$); where curvature concentrates, they smear into broad scattered glints ($g = 28$), mimicking the way a choppy ocean breaks one sun reflection into a thousand shards.
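
The lighting terms can be sketched together on the CPU. The post does not spell out the specular formula, so the half vector form below is the textbook Blinn-Phong expression and should be read as an assumption; the helper names `unit`, `dotProduct`, `saturate`, and `shade` are likewise illustrative:

```swift
import Foundation

func unit(_ v: SIMD3<Float>) -> SIMD3<Float> { v / (v * v).sum().squareRoot() }
func dotProduct(_ a: SIMD3<Float>, _ b: SIMD3<Float>) -> Float { (a * b).sum() }
func saturate(_ x: Float) -> Float { min(max(x, 0), 1) }

struct Lighting {
    var diffuse: Float
    var fresnel: Float
    var gloss: Float
    var specular: Float
}

/// Lighting terms for a unit normal and a curvature value (hypothetical helper).
func shade(normal n: SIMD3<Float>, curvature kappa: Float) -> Lighting {
    let l = unit(SIMD3<Float>(-0.28, 0.62, 0.74))   // light direction from the text
    let v = SIMD3<Float>(0, 0, 1)                   // camera looks straight down
    let h = unit(l + v)                             // half vector (textbook Blinn-Phong, assumed)
    let diffuse = max(dotProduct(n, l), 0)
    let fresnel = pow(1 - saturate(dotProduct(n, v)), 4)
    let t = 1 - saturate(2.6 * kappa)
    let gloss = 28 + (88 - 28) * t                  // mix(28, 88, t)
    let specular = pow(max(dotProduct(n, h), 0), gloss)
    return Lighting(diffuse: diffuse, fresnel: fresnel, gloss: gloss, specular: specular)
}

let calm = shade(normal: SIMD3<Float>(0, 0, 1), curvature: 0)
let choppy = shade(normal: SIMD3<Float>(0, 0, 1), curvature: 1)
```

On a perfectly flat patch the Fresnel term is zero and the gloss exponent sits at 88; raising curvature alone pulls the exponent down to 28.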

Water color

Three user configurable colors, exposed through SwiftUI ColorPicker controls, determine the palette:

Color	Role	Default
Deep	Wave troughs	Dark ocean navy
Shallow	Wave crests	Ocean teal
Sky	Fresnel reflection tint	Desaturated sky blue

Height drives the blend between deep and shallow water tones:

$$m = \text{saturate}(0.5 + 0.45 \cdot u)$$

$$C_{\text{water}} = \text{mix}(C_{\text{deep}}, \; C_{\text{shallow}}, \; m)$$
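
A minimal Swift sketch of the height driven blend. The function name is hypothetical, and the black and white endpoints in the example are placeholders rather than the app's default palette:

```swift
/// Blend between the deep and shallow colors based on wave height.
/// Hypothetical CPU-side helper mirroring the two formulas above.
func waterColor(height: Float, deep: SIMD3<Float>, shallow: SIMD3<Float>) -> SIMD3<Float> {
    let m = min(max(0.5 + 0.45 * height, 0), 1)   // saturate(0.5 + 0.45 * u)
    return deep + (shallow - deep) * m            // mix(deep, shallow, m)
}

// Placeholder endpoints (not the real defaults) to make the blend easy to read.
let deepTone = SIMD3<Float>(0, 0, 0)
let shallowTone = SIMD3<Float>(1, 1, 1)
let restingColor = waterColor(height: 0, deep: deepTone, shallow: shallowTone)
```

A resting surface lands exactly halfway between the two tones, so crests and troughs shift the color symmetrically around that midpoint until the clamp takes over.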

Foam

White foam collects where curvature concentrates, along wave crests and at collision zones where two ripples meet head on:

$$\text{foam} = \text{smoothstep}(0.22, \; 0.62, \; \kappa + 0.15 \cdot |\mathbf{s}|)$$

Viewed in isolation, the foam contribution is barely visible: a faint whitening along ridgelines that you might dismiss as noise. But its absence is immediately felt. Without foam the surface looks digital, synthetic, manufactured. With it, the surface looks wet. The margin between those two impressions is thinner than the effect that creates it.
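
The foam threshold is easy to probe in isolation. This Swift sketch assumes a hypothetical `foamAmount` helper and re-declares `smoothstep` so the snippet stands alone:

```swift
/// Hermite smoothstep, matching the shading-language built-in.
func smoothstep(_ edge0: Float, _ edge1: Float, _ x: Float) -> Float {
    let t = min(max((x - edge0) / (edge1 - edge0), 0), 1)
    return t * t * (3 - 2 * t)
}

/// Foam coverage from curvature and slope (hypothetical CPU-side helper).
func foamAmount(curvature: Float, slopeX: Float, slopeY: Float) -> Float {
    let slopeMagnitude = (slopeX * slopeX + slopeY * slopeY).squareRoot()
    return smoothstep(0.22, 0.62, curvature + 0.15 * slopeMagnitude)
}

let calmFoam = foamAmount(curvature: 0, slopeX: 0, slopeY: 0)
let crestFoam = foamAmount(curvature: 1, slopeX: 0, slopeY: 0)
```

Anything below the 0.22 threshold produces no foam at all, which is consistent with the effect being confined to ridgelines and collision zones.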

Pixel grid accent

A 2% brightness modulation at cell boundaries reinforces the pixel art aesthetic that the chunky pixelSize establishes:

float2 cell = abs(fract(in.uv * uniforms.textureSize) - 0.5);
float pixelAccent = smoothstep(0.45, 0.5, max(cell.x, cell.y));
color *= mix(0.98, 1.02, pixelAccent);

Two percent. Invisible in a screenshot, unfelt by anyone who is not looking for it. But it gives the whole surface a tactile, gridded quality, a sense that the waves are built from distinct blocks rather than rendered from smooth gradients, and that quality is what separates "tech demo" from "something that has a visual identity." Details this small are where aesthetic coherence lives.

Activity based transparency

My favorite piece of the whole shader, the trick that makes PixelWave feel like a layer rather than an app. Calm areas output transparent black:

$$A = \text{saturate}(6|u| + 12|\mathbf{s}| + 8\kappa)$$

If $A$ drops below $10^{-4}$, the pixel returns float4(0, 0, 0, 0), full transparency. Otherwise:

$$\alpha = \text{smoothstep}(0, \; 0.15, \; A)$$

$$\text{output} = (\alpha \cdot C, \; \alpha) \quad \text{(premultiplied)}$$

The MTKView itself is transparent, which means SwiftUI renders whatever it likes behind it: a gradient, an image, a solid color. Ripples float on top and dissolve to nothing wherever the surface goes still, and the simulation composes over the world rather than replacing it.
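
The activity test and the premultiplied output can be sketched as one Swift function. The name `activityOutput` and its labels are assumptions; the constants are the ones quoted above:

```swift
/// Hermite smoothstep, matching the shading-language built-in.
func smoothstep(_ edge0: Float, _ edge1: Float, _ x: Float) -> Float {
    let t = min(max((x - edge0) / (edge1 - edge0), 0), 1)
    return t * t * (3 - 2 * t)
}

/// Premultiplied RGBA for a pixel, or transparent black where the surface is still.
/// Hypothetical CPU-side helper mirroring the formulas above.
func activityOutput(height: Float, slopeMagnitude: Float, curvature: Float,
                    color: SIMD3<Float>) -> SIMD4<Float> {
    let raw = 6 * abs(height) + 12 * slopeMagnitude + 8 * curvature
    let activity = min(max(raw, 0), 1)                 // saturate
    if activity < 1e-4 { return SIMD4<Float>(0, 0, 0, 0) }
    let alpha = smoothstep(0, 0.15, activity)
    return SIMD4<Float>(color * alpha, alpha)          // premultiplied color
}

let stillPixel = activityOutput(height: 0, slopeMagnitude: 0, curvature: 0,
                                color: SIMD3<Float>(0.1, 0.5, 0.6))
let activePixel = activityOutput(height: 1, slopeMagnitude: 0, curvature: 0,
                                 color: SIMD3<Float>(0.1, 0.5, 0.6))
```

Still water returns all zeros, and a strongly displaced pixel returns its color at full opacity.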

Premultiplied alpha blending (source = .one, destination = .oneMinusSourceAlpha) prevents the dark fringe that appears at the edges of semi transparent geometry when premultiplied color is fed through standard alpha blending. Standard blending (source = .sourceAlpha) would multiply the color by alpha a second time, once in the shader and once in the blend unit, which darkens translucent edges. Premultiplied blending applies alpha once, in the shader, and trusts the blend unit to leave it alone.
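
The fringe is easy to reproduce with plain arithmetic. The Swift sketch below models the two blend equations on the CPU; both function names are hypothetical, and the white on white case is chosen only because the right answer is obvious:

```swift
/// Blend unit configured as source = .one, destination = .oneMinusSourceAlpha.
func blendPremultiplied(source: SIMD4<Float>, destination: SIMD3<Float>) -> SIMD3<Float> {
    let rgb = SIMD3<Float>(source.x, source.y, source.z)
    return rgb + destination * (1 - source.w)
}

/// Blend unit configured as source = .sourceAlpha, destination = .oneMinusSourceAlpha.
func blendStraight(source: SIMD4<Float>, destination: SIMD3<Float>) -> SIMD3<Float> {
    let rgb = SIMD3<Float>(source.x, source.y, source.z)
    return rgb * source.w + destination * (1 - source.w)
}

// 50% opaque white, already premultiplied by the shader, over a white background.
let background = SIMD3<Float>(1, 1, 1)
let halfWhite = SIMD4<Float>(0.5, 0.5, 0.5, 0.5)
let correct = blendPremultiplied(source: halfWhite, destination: background)  // stays white
let fringed = blendStraight(source: halfWhite, destination: background)       // darkens to 0.75
```

White over white should stay white; the second multiply in the straight path is what drags the result down to 0.75 and shows up on screen as a dark edge.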

Composition

$$C = C_{\text{water}} \cdot (0.3 + 0.65 \cdot I_d) + 0.42 \cdot C_{\text{reflect}} + I_s + \text{foam} \cdot (0.2, \; 0.24, \; 0.25)$$

No single layer in this stack sells the illusion of water, and removing any one of them makes the absence obvious. Diffuse shading alone produces a flat, matte surface. Fresnel alone looks plastic. Specular alone is a white smear. Foam alone is noise. Together, balanced against each other, they produce something the eye accepts as liquid. The fragment shader is where numerical simulation ends and something much closer to painting begins.

Tuning

Every default value in PixelWave was arrived at the same way: I sat on my couch, changed a number, watched waves for thirty seconds, and asked myself whether it felt like water. No metrics. No A/B tests. No analytics. Just a human staring at a screen, calibrating against a lifetime of watching rain on puddles, wind on ponds, waves at beaches, and shower spray on tile.

Parameter	Value	Reasoning
Wave Speed	`12.0`	Fast enough that touches feel responsive. Slow enough to watch individual wavefronts.
Damping	`0.35`	Ripples live two to three seconds before fading.
Viscosity	`0.18`	Smooths grid scale noise without making the water feel thick.
Dispersion	`0.12`	Faint surface chop. Just enough to break visual uniformity.
Edge Reflection	`0.42`	Absorbs most energy. Returns a soft echo.
Brush Radius	`0.035`	Precise enough for a fingertip. Large enough to see.
Impulse	`0.55`	Visible ripples without clipping at ±2.0.
Pixel Size	`3.0`	The balance point between resolution and pixel art character.

You know when water looks right. You also know, immediately and wordlessly, when it does not. The numbers in this table are what falls out when you try to put language on that feeling.

Lessons

Stability. The CFL condition is a wall, and my first prototype hit it in five frames. Every knob in a wave simulation has a setting where it turns from "useful parameter" to "simulation screaming and dying." Clamp every value that can grow. Leave margins between your parameters and the cliffs that would destroy them.

Isotropy. Two days debugging diamond shaped ripples, convinced the problem was my integration, before a three line Laplacian change turned diamonds into circles. Grid geometry matters far more than intuition suggests, and the 9 point weighted average is one of those rare changes where the effort to impact ratio approaches zero.

Verlet impulses. The standard Verlet update is two lines of code. Injecting external forces into a scheme with no explicit velocity field is a different problem entirely, and no tutorial I found addressed it. I worked it out on paper, and the correct decomposition appeared only after two pages of wrong ones.

Rendering matters more than simulation. A physically correct simulation with flat shading still reads as a debug view. Wrap a rougher physics model in good normals, Fresnel, adaptive specular, foam, and a pixel grid accent and the result reads as water. The fragment shader is where simulation turns into imagery. I learned to spend proportionally more attention there.

Mobile GPUs are quietly enormous. This simulation runs a 9 point stencil across roughly 200,000 cells, 120 times per second, with enough thermal headroom that the phone barely warms up. Getting it fast was straightforward. Getting it correct took most of the work.

Source

The full project is on GitHub. Clone it, crank dispersion to 1.0, override the CFL clamp past 0.5 on purpose, and watch what five frames of exponential instability look like from the inside. It is educational in the way that touching a hot stove is educational: brief, vivid, and unlikely to require a second lesson.

 .--. 
/    \
\    /  ~ ~ ~ ~ ~
 '--'  ~ ~ ~ ~ ~ ~