700,000 lines of code, 20 years, and one developer: How Dwarf Fortress is built

[Ed. note: While we take some time to rest up over the holidays and prepare for next year, we are re-publishing our top ten posts for the year. This is our number one post of 2021! Thanks for reading and we'll see you in the new year. ]

Dwarf Fortress is one of those oddball passion projects that’s broken into Internet consciousness. It’s a free game where you play either an adventurer or a fortress full of dwarves in a randomly generated fantasy world. The simulation runs deep, with new games creating multiple civilizations with histories, mythologies, and artifacts.

It has become notorious, and rightly so. Individual dwarves have emotional states, favorite gems, and grudges. And it all takes place in an ASCII interface that looks imposing to newbies, but feels like the text crawl in The Matrix: craftsdwarf, river, legendary megabeast.

The entire game is product of one developer, Tarn Adams, aka Toady One, who has been working on Dwarf Fortress since 2002. For the first four years it was a part time project, but since 2006 it’s been full time. He writes all the code himself, although his brother helps out with design and creates stories based on the game. Up until now, he’s relied on donations to keep him going, but he’s currently working on a version with pixel graphics and a revamped UI that will be available for purchase on Steam.

I reached out to Tarn Adams to see how he’s managed a single, growing codebase over 15+ years, the perils of pathing, and debugging dead cats. Our conversation below has been edited for clarity. If you want more, we also spoke with Tarn on the podcast.

Q: What programming languages and other technologies do you use? Basically, what's your stack? Has that changed over the 15-20 years you've been doing this?

A: DF is some combination of C and C++, not in some kind of standard obeying way, but sort of a mess that's accreted over time. I've been using Microsoft Visual Studio since MSVC 6, though now I'm on some version of Visual Studio Community.

I use OpenGL and SDL to handle the engine matters. We went with those because it was easier to port them to OSX and Linux, though I still wasn't able to do that myself of course. I'm not sure if I'd use something like Unity or Unreal now if I had the choice since I don't know how to use either of them. But handling your own engine is also a real pain, especially now that I'm doing something beyond text graphics. I use FMOD for sound.

All of this has been constant over the course of the project, except that SDL got introduced a few years in so we could do the ports. On the mechanical side of the game, I don't use a lot of outside libraries, but I've occasional picked up some random number gen stuff—I put in a Mersenne Twister a long while ago, and most recently I adopted SplitMix64, which was featured in a talk at the last Roguelike Celebration.

Q: What are the challenges in developing a single project for so long? Do you think this is easier to do by yourself? That is, because you wrote every line, is it easier to maintain and change?

A: It's easy to forget stuff! Searching for ';', which is a loose method but close enough, we're up to 711,000 lines, so it's just not possible to keep it all in my head now. I try to name my variables and objects consistently and memorably, and I leave enough comments around to remind myself of what's going on when I arrive at a spot of code. Sometimes it takes several searches to find the exact thread I'm trying to tug on when I go and revisit some piece of the game I haven't touched for a decade, which happens quite a bit. I'd say most changes are focused only on certain parts of the game, so there is kind of an active molten core that I have a much better working knowledge of. There are a few really crusty bits that I haven't looked at since before the first release in 2006.

Regarding the relative ease of doing things by myself, certainly for me, who has no experience working on a large multi-person project, this is the way to go! People obviously get good at doing it the other way, for example over in the AAA games context, and clearly multiple engineers are needed over there to get things done on time. I'd be hesitant to say I can go in and change stuff faster than they can, necessarily, since I haven't worked in that context before, but it's true that I don't have any team-oriented or bureaucratic hurdles to jump through when I want to make an alteration. I can just go do it. But I also have to do it alone.

Q: What's the biggest refactor/change that you had to make?

A: There have been some refactors that have lasted for months, redoing certain data structures and so forth, though I'm not sure anything is ever a refactor strictly here since there's always opportunities to push the mechanics forward simultaneously and it makes sense to do so when the code knowledge is fresh.

Adding the Z coordinate to make the game mechanically 3D (while still being text) was another one, and really the most mind-numbing thing I've probably ever done. Just weeks and weeks and weeks of taking logic and function calls that relied on X and Y and seeing how a Z fits in there.

Making the item system polymorphic was ultimately a mistake, but that was a big one.

Q: Why was this was a mistake?

A: When you declare a class that's a kind of item, it locks you into that structure much more tightly than if you just have member elements. It's nice to be able to use virtual functions and that kind of thing, but the tradeoffs are just too much. I started using a "tool" item in the hierarchy, which started to get various functionality, and can now support anything from a stepladder to a beehive to a mortar (and pestle, separately, ha ha), and it just feels more flexible, and I wish every crafted item in the game were under that umbrella.

We do a lot of procedural generation, and if we wanted to, say, generate an item that acts partially like one thing and partially like another, it's just way harder to do that when you are locked down in a class hierarchy. Adding things like diamond dependencies and all that just end up tying you in knots when there are cleaner ways to do it. If different components can just be turned off and on, it's easier, and allows you to do more.

I think some game developers refer to this as an entity component system, though it's my understanding that harder-core optimizer people think of that as something else where you're actually breaking things down by individual fields. Using a single object with different allocated subobjects is almost certainly worse for cache misses, which is a whole other thing, but the benefits in organization, flexibility, and extensibility just can't be ignored, and the different subfields of the tool item aren't used so often that it becomes an optimization issue.

Q: Did you run into any issues moving from 32 bit to 64 bit? That feels like one of those things that was huge at the time but has become pretty accepted.

A: Not at all! I'm struggling to think of a single issue. Fortunately for us, we already had our byte sizes under control pretty well, since it comes up saving and loading the worlds; the format needed to be nailed down back when we set that up, especially because we've had to deal with endian stuff between OSes and all that. And we don't do any gnarly pointer operations or other stuff that might have gotten us in trouble. It just ended up being really good code for 64 bit conversion due to our other practices, entirely by accident. The main issue was just getting the time together to make the change, and then it didn't end up taking nearly as long as I thought it would.

Q: I've seen other games similar to DF die on their pathfinding algorithms.What do you use and how do you keep it efficient?

A: Yeah, the base algorithm is only part of it. We use A*, which is fast of course, but it's not good enough by itself. We can't take advantage of some of the innovations on that (e.g. jump point) since our map changes so much. Generally, people have used approaches that add various larger structures on top of the map to cut corners, and because of the changing map, these just take too long to maintain, or are otherwise a hassle. So our approach has been to just keep track of connected components reachable by walking. These are pretty easy to update even when the map changes quickly, though it does involve some flood-filling. For instance, if water cuts the fortress in half, it needs to flood out from one side and update a whole half of the fortress to a new index, but once that's done, it's good, generally. Then that allows us to cut almost all failed A* calls from the game—our agents just need to query component numbers, and if the component numbers are the same, they know the call will succeed.

It's fast to maintain, but the downside is that the component indices are maintained for walking only. This means that flying creatures, for instance, don't have global pathfinding intelligence that's any different from a walker. In combat and a few other situations, we use short-range flood fills with their actual logic to give them some advantages though. But it's not ideal for them.

I'm not sure we'll attempt other structures here to make it work any better. For our map sizes, they've all failed, including some outside attempts. Of course, it might be possible with a really concerted effort, and I've seen other games that have managed, for instance, some rectangular overlays and so forth that seem promising, but I'm not sure how volatile or large their maps were.

The most simple idea would just be something like adding a new index for fliers, but that's a large memory and speed hit, since we'd need to maintain two indices at once, and one is bad enough. More specific overlays can track their pathing properties (and then you path through the overlays instead of the tiles), but they are hard and slow to maintain as the map changes. There are various other ideas floating around, like tracking stairs, or doing some limited path caching, and there are probably some gains to be made there. We are certainly at the edge of what we can currently support in terms of agents and map complexity, so something'll have to give if we want to get more out of it.

Q: On that note, you're simulating a lot of things all at once—how do you manage so many so many actors asynchronously (or do you)?

A: If we're talking about asynchronous as in multithreading, then no, we don't do any of that, aside from the graphical display itself. There's a lot of promise here, even with microthreading, which the community has helped me out with, but I haven't had time to dive into. I don't have any experience and it's a bug-prone thing.

Q: Have you tried other projects/technologies alongside DF?

A: Sure! The side project folder that's migrated between computers for the last ten years or so has about 90 projects in it. Some of them lasted for days, some for multiple years. They are mostly other games, almost always in other genres, but there are also a few DF helper projects, like the myth generator prototype. Nothing close to seeing the light of day, but it's fun to play around.

Q: With your ~90 side projects, have you explored any other programming languages? If so, any favorites?

A: Ha ha, nope! I'm more of a noodler over on the design side, rather than with the tech. I'm sure some things would really speed up the realization of my designs though, so I should probably at least learn some scripting and play around with threading more. People have even been kind enough to supply some libraries and things to help out there, but it's just difficult to block side project time out for tech learning when my side project time is for relaxing.

Q: You have the most interesting release notes. What's your favorite bug and what caused it?

A: It's probably boring for me to say, but I just can't beat the drunken cat bug. There've been a few videos made about it by this point. That was the one where the cats were showing up dead all over the tavern floor, and it turned out they were ingesting spilled alcohol when they cleaned their paws. One number was off in the ingest-while-cleaning code, and it sent them through all the symptoms of alcohol poisoning (which we added when we spruced up venomous creatures.)

If you want to try Dwarf Fortress for yourself, you can download it from their website.

Add to the discussion