Scene.org Demoscene News Service

GCC for asm Experts (and C/C++ Intermediates) - Part 5

Fixing Post-Increment Addressing

Some years after high school, AiO and I worked at the same company for a while. I spent a lot of time at his apartment in Vimmerby, watching demos on my Falcon030 and his accelerated Amiga, playing Elite Frontier, and making grand plans for projects that mostly never shipped. I was going to write a Worms clone called Grubs, built around fractal-generated terrain and a neat parallax scrolling trick. In the end the only game released from that apartment was DB Phone Home, a 4K side-scrolling platformer for the Falcon. But the thing I remember most is reading AiO's copy of the MC68060 User's Manual.

The 68060 can not only do a multiply in two clock cycles, but execute two instructions at once!? Motorola called it superscalar, and I thought it was the most exciting thing in the world. I imagined what an Atari with this beast could do, and where the 68070 would take this: even wider issue, more parallelism, the same trajectory the industry was already on with the Pentium and the PowerPC. Of course, the 68070 never came. ColdFire does not quite count for me. Our beloved CPU family ended with the 68060, and the dream of a wider superscalar m68k died with it.

But the industry kept going. GCC optimizes for that dream-made-real on other architectures: x86-64, ARM, RISC-V. There, the hardware can overlap or even reorder independent instructions, executing more than half a dozen of them per cycle on, for example, an Apple Silicon M3 or M4. This all works as long as there are no data dependencies between consecutive operations. And this is where it goes wrong for us.