At my first job out of college (pre-Y2K), I got my first taste of version control systems. We used Microsoft’s Visual SourceSafe (VSS), which had a repository of all the files needed for a release, which was then burned onto a disk and sent to people through the mail. If you wanted to work on one of those files, you had to check it out from the repo—literally, like a library book. That file would be locked until you checked it back in; no one else could edit it. In essence, VSS was a shield on top of a shared file folder.
Microsoft discontinued VSS in 2005, coincidently the same year as the first release of Git. While technology has shifted and improved quite a bit since then git has come out as the dominant choice for version control systems. This year, we asked what version control systems people used, and git came out as the clear overall winner.
But it’s not quite a blow out; there are two other systems on the list: SVN (Apache Subversion) and Mercurial. There was a time when both of these were prominent in the market, but not everyone remembers those days. Stack Overflow engineering has used both of these in the past, though we now use git like almost everybody else.
This article will look at what those version control systems are and why they still have a hold of some engineering teams.
Subversion (SVN) is an open-source version control system that maintains source code in a central server; anyone looking to change code accesses these files from clients. This client server model is an older style, compared to the distributed model git uses, where changes can be stored locally then distributed to the central history (and other branches) when pushed to an upstream repository. In fact, SVN build on historical version control—it was initially intended to be a mostly compatible successor to CVS (Concurrent Versions System), which is itself a front end and expansion to Revision Control System (RCS), initially released way back in 1982.
This earlier generation of version control worked great for the way software was built ten to fifteen plus years ago. A piece of software would be built as a central repository, with any and all feature additions merged into a trunk. Branches were rare and eventually absorbed into the mainline. Important files, particularly large binaries, could be “locked” to prevent other developers from changing them while you worked on them. And everything existed as directories—files, branches, tags, etc. This model worked great for a centrally located team that eventually shipped a release, whether as a disc or a download.
SVN is a free, open-source version of this model. One of the paid client-server version control systems, Perforce (more on this below), had some traction at enterprise-scale companies, notably Google, but for those unwilling to pay the price for it, SVN was a good option. Plenty of smaller companies (including us at the beginning) used centralized version control to manage their code, and I’m sure plenty of folks still do, whether out of habit or preference.
But the ways that engineering organizations work has changed pretty drastically in the last dozen years. There is no longer a central dev team working on a single codebase; you have multiple independent teams each responsible for one or more services. Stack Overflow user VonC has made himself a bit of a version control expert and has guided plenty of companies away from SVN. He sees it a technology built for a less agile way of working. “It does get in the way, in term of management, repository creation, registration, and the general development workflow. As opposed to a distributed model, which is much more agile in those aspects. I suspect the recent developments with remote working will not help those closed environment systems.”
The other reason that SVN grew less used was that git showed how things could be better. Quentin Headen, Senior Software Engineer here at Stack Overflow, used SVN early in his career. “In my opinion, the two biggest drawbacks of SVN are that first, it is centralized, which requires a the SVN server to be up for you to commit changes. If your internet is down, you can't commit at all. Second, the branching is very heavy. Once a branch is created, you can't delete it (if I remember correctly). I think there is a command to remove, but it stays in history regardless. Git branches are cheap and can be deleted easily if need be.”
Clearly, SVN lost prominence when the new generation of version control arrived. But git wasn’t the only member of that generation.
Git wasn’t the only member of the distributed version control generation. Mercurial first arrived the same year as Git—2005—and became the two primary players. Early on, many people wondered what differences, if any, the two systems had. When Stack Overflow moved away from SVN, Mercurial won out mostly because we had easy access to hosting through Fog Creek Software (now Glitch), another of our co-founder Joel Spolsky’s companies. Eventually, we too gave in to Git.
Initially, Mercurial seemed to be the natural fit for developers coming from earlier VC systems. VonC notes, “It's the story of VHS versus Betamax.”
I reached out to Raphaël Gomès and Pierre-Yves David, both Mercurial core developers, about where Mercurial fits into the VC landscape. They said that plenty of large companies still use Mercurial in one form or another, including Mozilla, Facebook (though they may have moved to a Mercurial fork ported to Rust called Eden), Google (though as part of a custom VC codebase called Piper), Nokia, and Jane Street. “One of main advantages of Mercurial these days is its ability to scale on a very large project (millions of commits, millions of files). Over the years, companies have contributed performance improvements and dedicated features that make Mercurial a viable option for extreme scale monorepos.”
Ry4an Brase, who works at Google and uses their VC, expanded on why: “git is wed to the file system. Even GitHub accesses repositories as files on disk. The concurrency requirements of very large user bases on a single repo scale past filesystem access, and both Google and Facebook found Mercurial could be adapted to a database-like datastore and git could not.” However, with the recent release of Git v2.38 and Scalar, that advantage may be lessened.
But another reason that Mercurial may stay at these companies with massive monorepos is that it’s portable and extendable. It’s written in Python, which means it doesn’t need to be compiled to native code, and therefore it can be a viable VC option on any OS with a Python interpreter. It also has a robust extension system. “The extension system allows modifying any and all aspects of Mercurial and is usually greatly appreciated in corporate contexts to customize behavior or to connect to existing systems,” said Gomès and David.
Mercurial still has some big fans. Personally, I had never heard of it until some very enthusiast Mercurialists commented on an article of ours, A look under the hood: how branches work in Git.
babaloomer: Branches in mercurial are so simple and efficient! You never struggle to find the origin of a branch. Each commit has the name of its branch embedded in it, you can’t get lost! I don’t know how many times I had to drill down git history just to find the origin of a branch.
Scott: Mercurial did this much more intuitively than Git. You can tell the system is flawed when the standard practice in many workflows is to use “push -f” to force things. As with any tool, if you have to force it something is wrong.
Of course, different developers have different takes on this. Brase doesn’t think that Mercurial’s branching is necessary better. “Mercurial has four ways to do branches,” he said, “and the one that was exactly like git's was called ‘bookmarks’, which the core developers were slow to support. What Mercurial called branches have no equivalent in git (every commit is on one and only one branch and it's part of the commit info and revision hash), but no one wanted that kind.” Well, maybe not no one.
Mercurial is still and active project, as Gomès and David attest. They contribute to the code, manage the release cycles, and hold yearly conferences. While not the leading tool, it still has a place.
Other version control systems
In talking to people about version control, I found a few other interesting use cases, primarily around paid version control products.
Remember when I said I’d have more on Perforce? It turns out that several people mentioned it even though it didn’t even register on our survey. It turns out that Perforce has a strong presence in the video game industry—some even consider it the standard there. Rob Oates, an industry veteran who is currently the senior director of technology and partnerships at Exploding Kittens said, “Perforce still sees use in the game industry because c video game projects (by variety, count, and total volume of assets) are almost entirely not code.”
He gave four requirements that any version control system would need to fulfill in order to work for video game development:
- Must be useable by laypersons - Artists and designers will be working in this system day-to-day.
- Must lock certain files/types on checkout - Many of our files cannot be conceptually or technically merged.
- Must be architected to handle many large files as the primary use case - Many of our files will be dozens or hundreds of megabytes.
- Must avoid degenerate case with delta compression schemes - Many of our large files change entirely between revisions.
Perforce, because of its centralized server and file locking mechanism, fits perfectly. So why not separate the presentation layer from the simulation logic and store the big binary assets in one place and the code in a distributed system that excels at merging changes? The code in video games often depends on the assets. “For example, it would not be unusual for a game's combat system to depend on the driving code, the animations, the models, and the tuning data,” said Oates. “Or a pathfinding system may depend on a navigation mesh generated from the level art. Keeping these concerns in one repo is faster and less confusing when a team of programmers, artists, and designers are working to rapidly iterate on the ‘feel’ of the combat system.”
The engineers at these companies often prefer git. When they have projects that don’t have artists and designers, they can git what they want. “Game engines and middleware have an easier time living on distributed version control as their contributors are mostly, if not entirely, engineers,” said Oates. Unfortunately for the devs on video games, most projects have a lot of people creating non-code assets.
Another one mentioned was Team Foundation Version Control (TFVC). This was a Microsoft product originally included in Team Foundation Server and still supported in Azure DevOps. It’s considered the spiritual successor to VSS and is another central server style VC system. Art Gola, a solutions architect with Federated Hermes, told me about it. “It was great for its time. It had an API, was supported on Linux (Team Foundation Everywhere) and tons of people using it that no one ever heard from since they were enterprise devs.”
But Gola’s team is actively trying to move their code out of the TFVC systems they have, and he suspects that a lot of other enterprise shops are too. Compared to the agility git provides, TFVC felt clunky. “It requires you to have a connection to the central server. Later versions allow you to work offline, but you only had the latest version of the code, unlike git. There is no built in pull request type of process. Branching was a pain.”
One could assume that now that the age of centralized version control is waning and distributed version control is ascendant, there is no innovation in the VC space. But you’d be mistaken. “There are a lot of cool experiments in the VCS space,” said Patrick Thomson, a GitHub engineer who compared Git and Mercurial in 2008, “Pijul and the theory of patch algebra, especially—but Git, being the most performant DVCS, is the only one I use in industry. I work on very large codebases.”
Why did Git win?
After seeing what the version control landscape looks like in 2022, it may be obvious why distributed version control won out as the VC of choice for software developers. But it may not be immediately obvious why Git has such a commanding share of the market over Mercurial. Both of them first came out around the same time and have similar features, though certainly not one to one. Certainly, many people prefer it. “For personal projects, I pick Mercurial. If I was starting another company, I'd use Git to avoid having to retrain and argue with new hires,” said Brase.
In fact, it should have had an advantage because it was familiar to SVN users and the centralized way of doing things. “Mercurial was certainly the most easy to use and more familiar to use because it was a bit like using subversion, but in a distributed fashion,” said VonC. But that fealty to the old ways may have hurt it as well. “That is also one aspect which was ultimately against Mercury because just having the vision of using an old tool in a distributed fashion was not necessarily the be best fit to develop in a decentralized way.”
The short answer why it won comes down to a strong platform and built-in user base. “Mercurial lost the popularity battle in the early 2010s to Git. It's something we attribute in large part to the soaring rise of GitHub at that time, and to the natural endorsement of Git by the Linux community,” said Gomès and David.
Mercurial may have started out in a better position, but it may have lost ground over time. “Mercurial's original fit was a curated, coherent user experience with a built-in web UI,” said Brase. “GitHub gave git the good web UI and coherent couldn't beat the feature avalanche from Git contributors and the star power of its founder.”
That feature avalanche and focus on user needs may have been a hidden factor in pushing adoption. Thomson, in his comparison nearly fifteen years ago, likened Git to MacGyver and Mercurial to James Bond. Git let you scrape together a bespoke solution to nearly every problem if you were a command-line wizard, while Mercurial—if given the right job—could be fast and efficient. So where does Thomson stand now? “My main objection to Git—the UI—has improved over the years (I now use an Emacs-based Git frontend, which is terrific), whereas Mercurial’s primary drawback, its slow speed on large repositories, is still, as far as I can tell, an extant problem.”
Like MacGyver, Git has been improvising and adapting to fit whatever challenges come its way. Like James Bond, Mercurial has its way of doing things. It works great for some situations, but it has a distinct point of view. “My favorite example of a difference in how git and Mercurial approach new features is the `config` command,” said Brase. “Both `git config` and `hg config` are commands to edit settings such as the user's email address. The `git config` command modifies `~/.gitrc` for you and usually gets it right. The Mercurial author refused all contributions that edited a config file for you. Instead `hg config` launched your text editor on `~/.hgrc`, saying ‘What is it with coders who are intimidated by text-based config files? Like doctors that can't stand blood.’”
Regardless, it seems that while Git feels like the only version control game in town, it isn’t. Options for how to solve your problems are always a plus, so if you’ve been frustrated with the way it seems that everyone does things, know that there are other ways of working, and commit to learning more.