Mar 04, 2025
Why fastDOOM is fast
During the winter of 2024, I restored an IBM PS/1 486-DX2 66MHz, “Mini-Tower”, model 2168. It was the computer I always wanted as a teenager but could never afford. Words cannot do justice to the joy I felt while working on this machine.
As soon as I got something able to boot, I benchmarked the one piece of software I wanted to run.
C:\DOOM>doom.exe -timedemo demo1
timed 1710 gametics in 2783 realtics
Doom doesn’t give the fps right away. You have to do a bit of math to get the framerate. The game clock runs at 35 tics per second, so in this instance that’s 1710/2783*35 = 21.5 fps. An honorable performance for the best machine money could (reasonably) buy in Dec 1993 (specs, chipset, video, disk1, disk2, speedsys).
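The conversion can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this example, and the regular expression assumes the output looks like the two result lines quoted in this article.

```python
import re

TICRATE = 35  # DOOM's game clock: 35 tics per second


def timedemo_fps(gametics, realtics, ticrate=TICRATE):
    """Convert a timedemo result into frames per second.

    realtics is the elapsed time counted in 1/35 s tics, so
    realtics / ticrate is the duration of the run in seconds.
    """
    return gametics / realtics * ticrate


def parse_timedemo(line):
    """Extract (gametics, realtics) from a timedemo result line."""
    match = re.search(
        r"timed\s+(\d+)\s+gametics\s+in\s+(\d+)\s+realtics",
        line,
        re.IGNORECASE,
    )
    if match is None:
        raise ValueError(f"not a timedemo result: {line!r}")
    return int(match.group(1)), int(match.group(2))


if __name__ == "__main__":
    for line in (
        "timed 1710 gametics in 2783 realtics",
        "Timed 1710 gametics in 1988 realtics. FPS: 30.1",
    ):
        gametics, realtics = parse_timedemo(line)
        print(f"{timedemo_fps(gametics, realtics):.1f} fps")
    # prints "21.5 fps" then "30.1 fps"
```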
I was resigned to playing under Ibuprofen until I heard of fastDOOM. I am usually not a fan of ports because they tend to add features without cohesion (except for the dreamy Chocolate DOOM) but I gave it a try out of curiosity.
C:\DOOM>fdoom.exe -timedemo demo1
Timed 1710 gametics in 1988 realtics. FPS: 30.1
That is 40% more frames per second (or about 29% less time for the same demo) without cutting any features[1]! On a demanding map like doom2’s demo1, the gain is even higher, from 16.8 fps to 24.9 fps. That is 48% faster!
I did not suspect that DOOM had left that much on the table. Obviously shipping within one year left little time to optimize. I had to understand how this magic trick happened.
A byte of history
Before digging into fastDOOM, let’s understand where the code comes from. DOOM was originally developed on NeXT workstations. The game was structured to be easy to port, with most of the code in a core surrounded by small sub-systems performing I/O.
During development, DOS I/Os were written by id Software. This became the commercial release of DOOM.
But that version could not be open sourced in 1997 because it relied on a proprietary sound library called DMX.
What ended up being open sourced was the Linux version, cleaned up by Bernd Kreimeier when he was working on a book project to explain the engine.
A DOS version of DOOM was reconstructed by using the Linux version’s core, Heretic I/O, and APODMX (Apogee Sound wrapper) to emulate DMX.
Because Heretic used video mode 13h while DOOM used video mode Y, the graphic I/O (i_ibm.c) was reverse-engineered from DOOM.EXE disassembly. That is how the community got PCDOOM v2[2].
fastDOOM’s starting point was PCDOOM v2.
          ┌───────────────┐
          │ NeXTStep DOOM │
          └─────┬────┬────┘
                │    │
                │    │
                │    │
┌────────────┐  │    │  ┌──────┐      ┌─────────┐
│ Linux DOOM │◄─┘    └─►│ DOOM ├─────►│ Heretic │
└──────┬─────┘          └──────┘      └────┬────┘
       │                    ⁞              │
       │                    ▼              │
       │              ┌──────────┐         │
       └─────────────►│ PCDOOMv2 │◄────────┘
                      └─────┬────┘
                            ▼
                      ┌──────────┐
                      │ fastDOOM │     fastDoom genealogy
                      └──────────┘     ──────────────────
The big performance picture
Victor “Viti95” Nieto wrote release notes describing the performance improvement of each version, but he seemed more interested in making FDOOM.EXE awesome than in detailing how he did it.
To get the big picture of performance evolution over time, I downloaded all 52 releases of fastDOOM, PCDOOMv2, and the original DOOM.EXE, wrote a Go program to generate a RUN.BAT running -timedemo demo1 on all of them, and mounted it all with mTCP’s NETDRIVE.
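The generator itself was written in Go and is not shown here. The Python sketch below only illustrates the idea of emitting one timedemo line per executable. The drive letter, directory layout, and log file name are hypothetical, and the sketch assumes the timedemo result can be captured by redirecting standard output, which has not been verified for every build.

```python
def make_run_bat(executables, demo="demo1", log="C:\\RESULTS.TXT"):
    """Return the text of a DOS batch file that timedemos each executable.

    executables: DOS paths, e.g. "D:\\V010\\FDOOM.EXE" (hypothetical layout).
    """
    lines = ["@ECHO OFF"]
    for exe in executables:
        # Label each result so the log can be matched back to a build.
        lines.append(f"ECHO {exe} >> {log}")
        lines.append(f"{exe} -timedemo {demo} >> {log}")
    # DOS batch files use CRLF line endings.
    return "\r\n".join(lines) + "\r\n"


if __name__ == "__main__":
    bat = make_run_bat(["D:\\DOOM19\\DOOM.EXE", "D:\\V010\\FDOOM.EXE"])
    print(bat, end="")
```

Writing the returned string to RUN.BAT with `open("RUN.BAT", "w", newline="")` keeps the CRLF endings intact.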
I chose to timedemo DOOM.WAD with sound on and screen size = 10 (fullscreen with status bar). After several hours of shotguns and imps agony, I had run the whole suite five times and graphed the average fps with chart.js.
The first thing this graph lets us rule out is that fastDOOM’s improvements were mostly due to using a modern compiler. PCDOOMv2 is built with OpenWatcom 2 but only gets a marginal improvement over DOOM.EXE.
git archeology
On top of releasing often, Viti95 displayed outstanding git discipline: each commit does one thing and each release is tagged. fastDOOM’s git history is made of 3,042 commits, which makes it possible to benchmark each feature.
I wrote another Go program to build every single commit. I will pass on the gory details of handling the many build system changes (especially from DOS to Linux). After an hour I had the ugliest program I ever wrote and 3,042 DOOM.EXE files. I was pleased to see the build was almost never broken.
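Walking a history commit by commit can be sketched as follows. This is not the program described above (that one was written in Go and is not shown). The build step is left as a caller-supplied function because the build system changed many times, and the `build_label` naming scheme and zero-based numbering are assumptions made for this example.

```python
import subprocess


def list_commits(repo):
    """Return all commit hashes reachable from HEAD, oldest first."""
    result = subprocess.run(
        ["git", "-C", repo, "rev-list", "--reverse", "HEAD"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.split()


def build_label(index, sha):
    """Hypothetical output name: zero-padded build number + short hash."""
    return f"{index:04d}_{sha[:7]}"


def build_all(repo, build_one):
    """Check out every commit in order and hand it to build_one().

    build_one(repo, label) is supplied by the caller and must cope
    with whatever build system that commit uses.
    """
    for index, sha in enumerate(list_commits(repo)):
        subprocess.run(
            ["git", "-C", repo, "checkout", "--quiet", sha],
            check=True,
        )
        build_one(repo, build_label(index, sha))
```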
Graphing the file sizes shows that the early effort was to be lean by cleaning and deleting code. There are major drops with bf0e983 (build 239 where sound recording was removed), 5f38323 (build 0340 where error code strings were deleted), and 8b9cac5 (build 1105 where TASM was replaced with NASM).
Going deep
Timedemoing all builds would have taken a very long time (3,042 builds × 1.5 minutes × 3 passes ≈ 9.5 days) so I focused on the releases where most of the speed was gained. I wrote yet another Go program to generate a .BAT file running timedemo for all commits in v0.1, v0.6, v0.8, v0.9.2, and v0.9.7.
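The back-of-the-envelope estimate is easy to replay. The sketch below assumes the 1.5 in the estimate stands for minutes per timedemo run, which is what the division by 60 and then 24 suggests. The result lands between nine and ten days.

```python
COMMITS = 3042
MINUTES_PER_RUN = 1.5  # assumption: minutes per timedemo run
PASSES = 3


def total_days(commits, minutes_per_run, passes):
    """Total benchmark time in days: minutes -> hours -> days."""
    return commits * minutes_per_run * passes / 60 / 24


print(f"{total_days(COMMITS, MINUTES_PER_RUN, PASSES):.1f} days")  # prints "9.5 days"
```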
I mounted 1.4 GiB of FDOOM.EXE with mTCP and ran it. It took a while because, for versions with 200+ commits, the runtime was 8h/pass.
fastDOOM v0.1
This release featured 220 commits.
$ git log --reverse --oneline "0.1" | wc -l
220
The MVP patch of v0.1 is without a doubt build 36 (e16bab8). The “Cripy optimization” turns status bar percentage rendering into a noop if the values have not changed. This avoids rendering to a scrap buffer and blitting to the screen, for a total boost of 2 fps. At first I could not believe it. I assumed my toolchain had a bug. But cherry-picking this patch on PCDOOMv2 confirmed the tremendous speed gain.
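The general technique, skipping a redraw when the displayed value has not changed, can be illustrated with a small sketch. This is not fastDOOM’s code, which is written in C. The class and method names are hypothetical, and the "render" step is reduced to a counter standing in for the scrap-buffer draw and the blit.

```python
class PercentWidget:
    """Status bar number that is only redrawn when its value changes."""

    def __init__(self):
        self.last_drawn = None  # nothing on screen yet
        self.redraws = 0        # how many times we paid for a render

    def update(self, value):
        """Redraw if needed; return True when a redraw happened."""
        if value == self.last_drawn:
            return False  # unchanged: the whole update is a no-op
        self._render(value)
        self.last_drawn = value
        return True

    def _render(self, value):
        # Stand-in for drawing to a scrap buffer and blitting it.
        self.redraws += 1


if __name__ == "__main__":
    health = PercentWidget()
    # Health stays at 100 for three frames, then drops to 87.
    for value in (100, 100, 100, 87, 87):
        health.update(value)
    print(health.redraws)  # prints 2
```

In this toy run only two of the five frames pay for a redraw, since the value changes just once after the first draw.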
Next is build 167 (a9359d5) which inlines FixedDiv via macro.
Near the end, we see a series of optimizations granting 0.5 fps.
Build 207 (9bd3f20): A PSX Doom optimization which improves the way the BSP is traversed.
Build 212 (dc0f48e): “Inlined R_MakeSpans”, the function which renders horizontal surfaces.
Overall this version saw a lot of code being deleted (50% of commi
7 Comments
ge96
> I always wanted as a teenager but could never afford
Funny how that is, for me it was a Sony Alpha camera (flagship at the time) and 10 years later I finally buy it for $50.
ndegruchy
The linked GitHub thread with Ken Silverman is gold. Watching the FastDOOM author and Ken work through the finer points of arcane 486 register and clock cycle efficiencies is amazing.
Glad to see someone making sure that Doom still gets performance improvements :D
prox
Is there a recommended place where I can play Doom in the browser?
If such a thing exists!
yjftsjthsd-h
> To get the big picture of performance evolution over time, I downloaded all 52 releases of fastDOOM, PCDOOMv2, and the original DOOM.EXE, wrote a go program to generate a RUN.BAT running -timedemo demo1 on all of them, and mounted it all with mTCP's NETDRIVE.
I'm probably not the real target audience here, but that looked interesting; I didn't think there were good storage-over-network options that far back.
A little searching turns up https://www.brutman.com/mTCP/mTCP_NetDrive.html – that's really cool:)
> NetDrive is a DOS device driver that allows you to access a remote disk image hosted by another machine as though it was a local device with an assigned drive letter. The remote disk image can be a floppy disk image or a hard drive image.
kingds
> I was resigned to playing under Ibuprofen until I heard of fastDOOM
i don't get the ibuprofen reference ?
hinkley
So what does one do with a faster Doom, besides bragging, larger maps and more simultaneous players?
sedatk
If the author reads this: John Carmack's last name was mistyped as "Carnmack" throughout the document.