Inside the Apple M1 is an incredibly quirky GPU


In context: Apple is keeping the inner workings of the M1 family of processors a secret from the public, but dedicated developers have reverse-engineered it to create open-source drivers and a Linux distribution, Asahi Linux, for the M1 Macs. In the process, they discovered some cool features.

While developing an open-source graphics driver for the M1, Alyssa Rosenzweig recently uncovered a quirk in the M1's GPU rendering pipeline. She was rendering increasingly complicated 3D geometry when she fell down a rabbit hole that exposed the bug.

Basically – and please note that this and everything else I'm about to say is an oversimplification – the problem starts with the GPU's limited access to memory. It's a powerful GPU, but like the A-series iPhone SoCs it shares an ancestry with, it takes shortcuts to stay efficient.

Instead of rendering directly to the framebuffer like a discrete GPU would, the M1 takes two passes over an image: the first processes the vertices, and the second does everything else. The second pass is by far the more intensive one, so between passes dedicated hardware segments the frame into tiles (mini-frames, basically), and the second pass handles one tile at a time.
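The control flow can be sketched roughly like this. Note that this is a hypothetical illustration, not Apple's driver code: the tile size, the bounding-box binning, and the function names are all assumptions made for clarity.

```python
# Hypothetical sketch of two-pass, tile-based rendering (not Apple's actual
# driver code). Pass 1 bins geometry into tiles; pass 2 shades each tile
# independently, so only one tile's pixels need to live in fast memory.

TILE = 32  # tile size in pixels; real hardware's tile size may differ

def bin_triangles(triangles):
    """Pass 1: assign each triangle (given as a pixel bounding box
    x0, y0, x1, y1) to every tile its bounding box overlaps."""
    tiles = {}
    for tri in triangles:
        x0, y0, x1, y1 = tri
        for ty in range(y0 // TILE, y1 // TILE + 1):
            for tx in range(x0 // TILE, x1 // TILE + 1):
                tiles.setdefault((tx, ty), []).append(tri)
    return tiles

def render(triangles):
    """Pass 2: shade one tile (mini-frame) at a time."""
    shaded = {}
    for coord, tris in bin_triangles(triangles).items():
        shaded[coord] = len(tris)  # stand-in for real per-tile shading work
    return shaded

# One large triangle spanning four tiles, one small triangle in a fifth.
frame = render([(0, 0, 40, 40), (100, 100, 110, 110)])
```

Because each tile is shaded on its own, the GPU never needs the whole frame's working data in memory at once – which is exactly why the technique suits memory-constrained designs.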

Tiling solves the problem of limited memory resources, but in order to piece the tiles back into a frame later, the GPU needs to keep a buffer of per-vertex data between the two passes. Rosenzweig found that whenever this buffer overflowed, rendering failed. See the first rabbit, above.
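The failure mode can be illustrated with a toy model. Everything here – the capacity, the names, the structure – is an assumption for illustration; it is not how Apple's hardware or driver actually stores vertex data.

```python
# Toy model of the overflow condition: per-vertex data accumulates in a
# fixed-size buffer between the two passes. Geometry that arrives after
# the buffer fills is simply not recorded, which is the failure mode
# described above. Capacity and names are assumptions for illustration.

PARAM_BUFFER_CAPACITY = 4  # vastly smaller than real hardware

def fill_parameter_buffer(vertex_data):
    buffer, overflowed = [], False
    for item in vertex_data:
        if len(buffer) >= PARAM_BUFFER_CAPACITY:
            overflowed = True  # remaining geometry is lost in a naive driver
            break
        buffer.append(item)
    return buffer, overflowed

# Ten vertices' worth of data, but room for only four: overflow.
buf, overflowed = fill_parameter_buffer(list(range(10)))
```

With a complicated enough model, some overflow is unavoidable no matter how large the buffer is, so the hardware needs a recovery strategy – which is where the partial renders described next come in.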

In one of Apple's presentations, the company explains that when the buffer fills up, the GPU produces a partial render – in this case, half of the bunny. In Apple's software, the buffer in question is called the parameter buffer, a name apparently taken from Imagination's PowerVR documentation.

Imagination is a UK-based company that, like Arm, designs processors that it licenses to other companies. Apple signed an agreement with the company in early 2020 that allows Apple to license a wide range of its intellectual property. It's clear that the M1, which was released in late 2020, uses Imagination's PowerVR GPU architecture as a kind of foundation for its GPU.

Anyway, back to the rabbit. As you might have guessed, the partial renders can be added together to create a render of the whole bunny (but with a dozen extra steps in between, of course).
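A minimal sketch of the stitching idea – not Apple's actual algorithm, and with those dozen extra steps omitted – looks like this: each partial render covers only the geometry processed before the buffer filled, so later partials are composited over earlier ones, replacing only the pixels they actually drew.

```python
# Hedged sketch of combining partial renders into a full frame (not
# Apple's algorithm). Each partial render is modeled as a sparse map of
# pixel coordinates to colors; None marks a pixel no partial touched.

def composite(partials, width, height):
    frame = [[None] * width for _ in range(height)]
    for partial in partials:
        for (x, y), color in partial.items():
            frame[y][x] = color  # later partials win where they drew
    return frame

first = {(0, 0): "ear"}    # partial render: geometry before the overflow
second = {(1, 1): "foot"}  # partial render: the rest of the bunny
full = composite([first, second], 2, 2)
```

The key point is that combining partials only works if each one faithfully records what it drew – which is exactly where the next problem appears.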

But this render still isn't quite right: there are artifacts on the rabbit's foot. It turns out the frame's data is split between a color buffer and a depth buffer, and the latter misbehaves when partial renders are loaded back in.
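The depth-buffer pitfall can be shown with a toy depth test. This is a hypothetical illustration of the general principle, not the actual M1 bug or driver logic: if a pass starts from a cleared depth buffer instead of reloading the values the previous partial render stored, occlusion information from earlier geometry is lost and farther surfaces can wrongly overwrite nearer ones.

```python
# Toy depth test across two partial renders (illustrative only). A depth
# buffer records, per pixel, the distance of the nearest surface drawn so
# far; a draw only lands if it is nearer than what is already there.

FAR = float("inf")  # "nothing drawn yet" sentinel

def partial_pass(draws, previous_depth, reload_previous):
    # Start either from a cleared buffer (buggy) or from the depth values
    # the previous partial render stored (correct).
    depth = list(previous_depth) if reload_previous else [FAR] * len(previous_depth)
    for pixel, z in draws:
        if z < depth[pixel]:  # standard depth test: nearer surface wins
            depth[pixel] = z
    return depth

# First partial render draws a near surface (z = 1.0) at pixel 0.
d1 = partial_pass([(0, 1.0)], [FAR, FAR], reload_previous=True)
# Buggy: clearing the depth buffer lets a farther surface (z = 5.0) win.
bad = partial_pass([(0, 5.0)], d1, reload_previous=False)
# Correct: reloading the stored depth keeps the nearer surface in front.
good = partial_pass([(0, 5.0)], d1, reload_previous=True)
```

In the buggy path the far surface ends up visible at pixel 0 – the kind of localized occlusion error that shows up as artifacts like the ones on the rabbit's foot.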

Configuring the hardware the way Apple's own driver does – a setup found through reverse engineering – fixes the problem, and the bunny finally renders correctly (below).

It's not just Rosenzweig's open-source graphics driver that jumps through all these hoops to render an image: this is simply how the M1's GPU works. Its architecture probably wasn't designed with desktop-class 3D rendering in mind, but despite that, Apple has turned it into something that, the company claims, can rival or even surpass the latest discrete GPUs. It's cool.

For a more in-depth (and technically accurate) explanation of rabbit rendering and further exploration of the M1, be sure to check out Rosenzweig's blog and the Asahi Linux website.

Header credit: Walling

