Best practices can slow your application down
[Ed. note: While we take some time to rest up over the holidays and prepare for next year, we are re-publishing our top ten posts for the year. Please enjoy our favorite work this year and we’ll see you in 2022.]
Update: I realize we didn’t add a lot of context here when telling the story of the engineering decisions we made years ago, and why we’re moving away from some of them now. This is an attempt to fix that omission.
Over the past 13 years, we have progressively shifted our priorities as a business. Early on, scaling to millions of users was our main concern. We made some tough calls and consciously decided to trade off testability for performance. After successfully achieving that scale, much of the context has changed: we have a much faster base framework now, given all the latest improvements in the .NET world, meaning we don’t have to focus as much on the raw performance of our application code. Our priorities have since steered towards testability. We got away with “testing in production” for a long time, largely thanks to our (very active) meta community. But now that we’re supporting paying customers, identifying bugs early reduces the cost of fixing them, and therefore the cost of doing business. Paying down the accumulated tech debt takes time, but it’s already helping us get to more reliable and testable code. It’s a sign of the company maturing and our engineering division re-assessing its goals and priorities to better suit the business that we’re building for.
In software engineering, a number of fairly non-controversial best practices have evolved over the years, which include decoupled modules, cohesive code, and automated testing. These are practices that make for code that’s easy to read and maintain. Many best practices were developed by researchers like David Parnas as far back as the 1970s, people who thought long and hard about what makes maintainable high quality systems.
But in building the codebase for our public Stack Overflow site, we didn’t always follow them.
The Cynefin framework can help put our decisions into context. It categorizes problems as obvious, complicated, complex, or chaotic. From today’s perspective, building a Q&A site is a pretty well-defined—obvious—problem, and a lot of best practices have emerged over the years. And if you’re faced with a well-defined problem, you should probably stick to those best practices.
But back in 2008, building a community-driven Q&A site at this scale was far from being obvious. Instead, it fell somewhere in the “complex” quadrant (with some aspects in the “complicated” quadrant, like tackling the scaling issues we had). There were no good answers on how to build this yet, no experts who could show us the way. Only a handful of people out there faced the same issues.
For over a decade, we addressed our scaling issues by prioritizing performance everywhere. As one of our founders, Jeff Atwood, has famously said, “Performance is a feature.” For much of our existence, it has been the most important feature. As a consequence, we glossed over other things like decoupling, high cohesion, and test automation—all things that have become accepted best practices. You can only do so much with the time and resources at hand. If one thing becomes super important, others have to be cut back.
In this article, we walk through the choices we made and the tradeoffs they entailed. Sometimes we opted for speed and sacrificed testing. With more than a decade of history to reflect on, we can examine why best practices aren’t always the best choice for particular projects.
In the beginning…
When Stack Overflow launched in 2008, it ran on a few dedicated servers. Because we went with the reliability of a full Microsoft stack—.NET, C#, and MSSQL—our costs grew with the number of instances: each server required a new license. Our scaling strategy was to scale up, not out. Here’s what our architecture looks like now.
To keep costs down, the site was engineered to run very fast, particularly in accessing the database. So we were very slim then, and we still are—you can run Stack Overflow on a single web server. The first site was a small operation put together by less than half a dozen people. It initially ran on two rented servers in a colocation facility: one for the site and one for the database. That number soon doubled: in early 2009, Atwood hand-built servers (two web, one utility, one database) and shipped them to Corvallis, OR. We rented space in the PEAK datacenter there, which is where we ran Stack Overflow for a long time.
The initial system design was very slim, and it stayed that way for most of the site’s history. Eventually, maintaining a fast and light site became a natural obsession for the team.
Our codebase works the same way. We’ve optimized for speed, so some parts of our codebase used to look like C: we used a lot of the patterns C programs use, like direct access to memory, to make it fast. We use a lot of static methods and fields to minimize allocations whenever we have to. By minimizing allocations and keeping the memory footprint as slim as possible, we reduce application stalls due to garbage collection. A good example of this is our open source StackExchange.Redis library.
To make regularly accessed data faster, we use both memoization and caching. Memoization means we store the results of expensive operations; if we get the same inputs, we return the stored values instead of running the function again. We also use a lot of caching (at different levels, both in-process and external, with Redis), as some SQL operations can be slow, while Redis is fast. Translating from relational data in SQL to object-oriented data in an application can be a performance bottleneck, so we built Dapper, a high-performance micro-ORM that suits our performance needs.
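To illustrate the memoization idea, here is a minimal sketch in Python (used for brevity; the actual Stack Overflow codebase is C#, and the function name and return value here are made up):

```python
import functools

# Hypothetical expensive operation: the real site caches things like
# rendered posts and SQL results, not this toy lookup.
@functools.lru_cache(maxsize=1024)
def expensive_lookup(post_id: int) -> str:
    # Imagine a slow database call or a heavy computation here.
    return f"rendered-post-{post_id}"

print(expensive_lookup(42))  # computed on the first call
print(expensive_lookup(42))  # same input: served from the memoized cache
print(expensive_lookup.cache_info().hits)  # 1 cache hit so far
```

A common C# equivalent of this pattern is a static `ConcurrentDictionary` keyed by the function’s inputs.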
We use a lot of patterns—memoization, static methods, and other tricks to minimize allocations—to make our code run fast. As a trade-off, our code is often harder to test and harder to maintain.
One of the most uncontroversial good practices in the industry is automated testing. We don’t write a lot of tests because our code doesn’t follow standard decoupling practices; while those principles make for easy-to-maintain code for a team, they add extra steps at runtime and allocate more memory. It’s not much on any given transaction, but over thousands per second, it adds up. Things like polymorphism and dependency injection have been replaced with static fields and service locators. Those are harder to replace for automated testing, but they save us some precious allocations in our hot paths.
Similarly, we don’t write unit tests for every new feature. The thing that hinders our ability to unit test is precisely the focus on static structures. Static methods and properties are global, harder to replace at runtime, and therefore, harder to “stub” or “mock.” Those capabilities are very important for proper isolated unit testing. If we cannot mock a database connection, for instance, we cannot write tests that don’t have access to the database. With our code base, you won’t be able to easily do test driven development or similar practices that the industry seems to love.
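A minimal sketch of that testability gap, in Python for neutrality (the names are illustrative, not from the actual codebase): the static-style function is hard-wired to its dependency, while the injected version can be stubbed.

```python
# Style A: static/global access. Simple and allocation-free to call,
# but the dependency is hard-wired, so a test cannot run without a
# real database behind it.
class Db:
    @staticmethod
    def query(sql: str) -> list:
        raise RuntimeError("needs a real database connection")

def top_questions_static() -> list:
    return Db.query("SELECT title FROM questions LIMIT 10")

# Style B: the dependency is injected, so a test can pass in a stub.
def top_questions_injected(db) -> list:
    return db.query("SELECT title FROM questions LIMIT 10")

class StubDb:
    def query(self, sql: str) -> list:
        return ["How do I exit vim?", "What is a NullReferenceException?"]

print(top_questions_injected(StubDb()))  # testable without any database
```

The injected version is what dependency injection buys you; the static version is what the article describes trading that for.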
That does not mean we believe a strong testing culture is a bad practice. Many of us have actually enjoyed working under test-first approaches before. But it’s no silver bullet: your software is not going to crash and burn if you don’t write your tests first, and the presence of tests alone does not mean you won’t have maintainability issues.
Currently, we’re trying to change this. We’re actively trying to write more tests and make our code more testable. It’s an engineering goal we aim to achieve, but the changes needed are significant. It was not our priority early on. Now that we have had a product up and running successfully for many years, it’s time to pay more attention to it.
Best practices, not required practices
So, what’s the takeaway from our experience building, scaling, and ensuring Stack Overflow is reliable for the tens of millions who visit every day?
The patterns and behaviors that have made it into best practices in the software engineering industry did so for a reason. They make building software easier, especially on larger teams. But they are best practices, not required practices.
There’s a school of thought that believes best practices only apply to obvious problems. Complex or chaotic problems require novel solutions. Sometimes you may need to intentionally break one of these rules to get the specific results that your software needs.
Special thanks to Ham Vocke and Jarrod Dixon for all their input on this post.
For the less experienced coders reading this article:
The takeaway is NOT “I should ignore best practices because they make my code slower.”
The takeaway is “in certain situations where I know that a particular best practice is causing a performance bottleneck, I should carefully consider how to optimize my code in a structured way that will improve performance without making the code unmaintainable.”
In order to find bottlenecks, you need to do performance profiling. The code often bottlenecks in different places than you’d expect, and optimizing in the wrong place won’t accomplish anything.
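For example, a minimal profiling session in Python (`slow_serializer` is an invented stand-in for whatever your real hot path turns out to be):

```python
import cProfile
import io
import pstats

def slow_serializer(items):
    # Deliberately wasteful: repeated string concatenation.
    out = ""
    for item in items:
        out += str(item) + ","
    return out

def handler():
    return slow_serializer(range(10_000))

profiler = cProfile.Profile()
profiler.enable()
handler()
profiler.disable()

# Sort by cumulative time to see which calls dominate end-to-end.
buf = io.StringIO()
stats = pstats.Stats(profiler, stream=buf).sort_stats("cumulative")
stats.print_stats(5)  # top 5 entries show where the time actually goes
print(buf.getvalue())
```

Only after a report like this names the real bottleneck is it worth bending a best practice to fix it.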
Imagine building Stack Overflow without Stack Overflow!
imagine being a programmer not a “copy/paste”-er
What is a programmer?
Thanks for saying what is already in the text.
Well, that’s what a takeaway is, a.k.a. the TL;DR.
I completely agree with this. “Best practices” is a term that means “thousands of people have run into the same issues over and over again, and these are the solutions that prevent those problems.” If you are going to deviate from them, you need to have ironclad reasoning for doing so. Put another way, what are you doing that makes you think you know better than the rest of the industry?
Testing is how your code proves that it does what it says it does. I’ve worked for many organizations that didn’t test and now it’s a personal rule that I won’t work anywhere that doesn’t. I’ve spent so much time debugging stupid things that a test would have caught. Or I’ve made a change that had an unexpected consequence that a test run by a robot alerted me to. It’s such a time saver, prevents bugs, and I enjoy my work more.
As much as I respect the SO team, I still question whether they are making the right decision. Not having tests is like driving without a seat belt, fuel gauge, or speedometer.
I agree Anthony. The article has two paragraphs defending the lack of tests followed by a paragraph explaining how they’re making major changes in order to *add* tests.
The conclusion drawn is that adding tests is something that can be done *after* the product is stable: “Now that we have had a product up and running successfully for many years, it’s time to pay more attention to it.”
While I don’t know what types of bugs/issues that the team dealt with from 2007 until now, almost assuredly they would’ve been discovered or prevented had a proper testing framework been in place.
Perhaps we were not super clear in that piece – but testing is one of the things we “punted” back then and are definitely fixing now. One of our major engineering goals in the past few years was writing comprehensive test suites – and all new features are getting their share of automated tests as well. We still have way more integration tests than unit tests, but our test coverage is increasing progressively.
Maybe you dodged a “silver” bullet by not having written a bazillion unit tests along the way. In most cases, good integration tests are all you need to be able to operate safely. They are also easier to map to requirements (“which business requirement does this specific test verify?”), and they don’t impose a tax on refactoring like the nitty-gritty unit tests do.
But I’m sure that you already know that 😉
I am firmly in the ‘there shall be tests’ camp, but that does not necessarily mean they have to involve a lot of mocking.
Most important is to have automated functional tests. They ensure it all fits together, so you really should have them. And fortunately, they can be written for any code, no matter how ugly and tightly coupled, because they only use the external interfaces and sniff logs. So there is no excuse for not having them.
But these end-to-end tests are difficult to debug when they fail, so it makes sense to add tests for individual parts, the unit tests. This is what requires decoupling and internal interfaces that you can mock. But writing the mocks takes time too, the interfaces may get in the way of optimization, and if you get the split wrong, you’ll be updating the tests all the time as the code evolves anyway, so you’ll get diminishing returns.
And how far it makes sense to go depends very much on the kind of code you are writing. In most projects I’ve seen, it was concluded that mocking out the DAL is not worth it. Create a local database instance with known test data, test the DAL itself against it, and then, with confidence that the DAL is returning the test data you want, test the business logic against the actual DAL and test database.
That still works even if the code is tightly coupled, or if a lot of the logic is actually in complex SQL (or LINQ) queries. And if you test the lower layers separately, then when the higher-level test fails but the lower-level one does not, you know you need to debug the higher layer, so the debugging isn’t that much harder.
So forgoing interfaces for the sake of optimization (here, languages with compile-time generics have an advantage, because you can have interfaces but inline them during compilation for performance—but that’s basically just C++ and Rust) is not a valid argument against testing.
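The local-test-database approach described above can be sketched with an in-memory SQLite instance (the table, data, and function names are illustrative):

```python
import sqlite3

# A minimal DAL: just enough to exercise business logic against a real
# database rather than a mock.
def get_scores(conn):
    return [row[0] for row in
            conn.execute("SELECT score FROM posts ORDER BY score DESC")]

def top_score(conn):
    # Business logic tested on top of the actual DAL, not a mock of it.
    scores = get_scores(conn)
    return scores[0] if scores else None

# Local database instance with known test data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, score INTEGER)")
conn.executemany("INSERT INTO posts (score) VALUES (?)", [(5,), (12,), (3,)])

assert get_scores(conn) == [12, 5, 3]  # DAL test against known data
assert top_score(conn) == 12           # business-logic test via the DAL
```

No interfaces or mocks are needed here, which is the commenter’s point: tightly coupled code can still be tested this way.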
Perhaps they have gotten away with minimal testing because their product has been so stable. I’m sure there have been changes behind the scenes, but the base Q&A idea has not varied as much as it does for most startups.
Don’t use JS in the first place if performance is the concern.
I appreciate the main thrust of the article, that “best practices” are not “mandatory practices,” nor are they always best for you. What I don’t understand, however, is why you are trying to rework your code base. Since it is a) proven and tested, b) fast and efficient, and c) able to be upgraded and extended (I’m assuming without too much pain), then why change it? It seems like a major rework to allow for easier unit testing is likely to only compromise performance. Why “fix” something that isn’t broken? The site seems to be doing just fine despite the fact that parts of it do not easily fit into a unit testing framework.
Eventually, upgrading and extending our codebase did become too much pain without the safety net some well-placed tests could give us. We have a multi-tenant application that powers not only Stack Exchange sites (where Meta users are always kind enough to let us know when bugs get in!), but also our private Stack Overflow for Teams instances. We need to be very mindful now that we have SLAs and manage private data, and automated testing has definitely helped us evolve the product with confidence.
I don’t think you quite understand refactoring. You can have a rock-solid, barely tested codebase. It can be tightly coupled and work *great*.
Right up to the moment requirements change and you need to change it.
At this point, if you don’t have a decent set of tests in place, you’ll often not realize that you have broken areas you didn’t think were related to the code you changed.
Retrofitting tests and doing things like decoupling code (often needed to write tests!) helps you to have confidence if you need to change code because requirements changed.
The point here is, you shouldn’t be shackled by best practices. What you need to do is to balance them with your reality. If it makes sense to you, go for it. If you see it is more of a burden, adapt it.
Keep best practices in mind, but don’t take them as universal rule.
I had a similar situation managing Git recently at my job. We work on a lot of legacy systems coming from SVN, and there wasn’t a very clear Git management strategy company-wide, so I decided to try out gitflow and suggest a change company-wide.
Gitflow on its own couldn’t deliver, because our testing cycle depends on branches that don’t exist in vanilla gitflow, so I adapted it, and it has been working quite well since. After almost a year, gitflow is slowly being applied to other projects in the company, because it was proven successful even with these changes.
I have always believed in “there is no silver bullet.”
All the principles have given me doubts at times. Thanks for the very insightful article.
Not recommended for 99% of developers out there…
@Kevin, plz can you teach me to become a good developer?
Users don’t care about how hard or easy it is for you to develop the software; they care about the software helping them achieve most of their goals in the least amount of time. With that being said, all of the best practices that exist are about making the life of a programmer easier.
If you follow “best practices,” it won’t mean that your product will be better for the user. On the contrary, your software will be worse. But it will definitely be easier for you to test and maintain it.
And there we have it – today, almost all software works slower than the same kind of software did 25 years ago (while the computers have become tens of thousands times faster), and it’s also generally less useful. Meanwhile, programmers spend almost all of their time finding new ways to make their lives easier.
Don’t know how I feel about this article, because on one hand it’s Stack Overflow, which does so many things right, and on the other hand the development process seems very wrong. You can have unit tests and still have performant code, there’s no doubt about it, and the fact that you are converting some of the codebase to add unit tests back shows that it can be done.
Another question is: if SE had been a bit less performant, would it have failed? I don’t think so, and, with a cleaner codebase, it would have been possible to optimise the places that cause problems anyway.
Overall I’m not very convinced by this article, but it’s still interesting to get a different and unusual view on these issues, thanks for sharing.
Reading between the lines I feel like a key issue here was they were using a Microsoft stack, and thus had to pay an additional license fee for every instance. With a different software stack that didn’t require per-instance licensing it might have made more sense to give up a little performance in exchange for cleaner code, but the Microsoft tax tied their hands.
I personally believe that unit testing speeds up development. I kind of see what you’re saying about DI ruining performance, but with many newer libraries that depend on aspects instead of reflection (think Micronaut), the cost shifts from run time to compile time. Honestly, I think skipping testing is a huge mistake in any scenario. With a proper IDE it’s easy to refactor and maintain; it just takes discipline.
Having 3 people jump on a call to try and find a bug in a piece of wildly disorganized, cryptic code none of them wrote… what a waste of time. Bugs detract so much from smoothly flowing development that they should be avoided at all costs. The test serves as documentation too. I’ve skipped a lot of tests on my current project, and I’ve been slowly eating my own words as the tech debt accumulates. Every single time, it’s like a month later, when bugs come up, I end up kicking myself and basically starting from scratch with a TDD process to get the rules sorted properly. It’s just a more sane approach all around.
Nice article, and I agree with most of it.
Although it is funny that actually searching for more info to understand some topic in the article led me to a Stack Overflow page explaining that making classes and methods static does not significantly reduce memory allocation and also does not result in any other significant code optimization.
I agree with the above answers: use static methods, for example, for a method that does not depend on the objects of that class, not for optimization.
So I am still wondering how exactly making classes and methods static reduced the memory footprint at stackoverflow then???
Maybe they should ask for answers on their own site in terms of best practices LOL
In my experience, developers at these “big name” companies tend to brag a lot about their knowledge of optimizing things, but actually they make the same mistakes, and have the same gaps in knowledge on some topics, as us everyday developers…
Static methods don’t require additional memory but allocating objects and garbage collecting does, as mentioned in the article.
I don’t think they meant using static methods to save on object allocation. I think they meant static fields to share objects across classes instead of DI. So shared static database connections and such.
Nice article, but what I’d like to hear more about is how to decide where it’s best to ignore best practices. As in actual profiling tips, tools, and methodologies used to make the decision on what and how to optimize. Or a case study of how it was done in the case of Stack Overflow.
Well, I wouldn’t want to work on it. It must be a nightmare to document and for new developers to get to grips with.
Hmm, these days I am finding many articles which advocate against the language that perhaps introduced the term ‘Stop the World’ 😀
Microsoft and reliability are not two words always found in the same sentence. It’s refreshing to see that MS is good at something (apart from holding their monopoly on OSes for commodity computers) 😛
This post is software engineering. It’s learning to manage trade-offs and making the best decision for your product. In cinematography, you have to know the rules so you know when to break them, and I see a lot of similarities in software engineering; software engineering sits between an art and a science.
“Currently, we’re trying to change this. We’re actively trying to write more tests and make our code more testable. It’s an engineering goal we aim to achieve, but the changes needed are significant. It was not our priority early on. Now that we have had a product up and running successfully for many years, it’s time to pay more attention to it. ”
I’d be curious to know how you accomplish this without sacrificing performance. Maybe you can’t, but the speed trade-off is justifiable.
This is the key: “Currently, we’re trying to change this. We’re actively trying to write more tests and make our code more testable. It’s an engineering goal we aim to achieve, but the changes needed are significant.”
The best practices are not required if/when testing is not required either. If it is part of the requirements, then patterns and SOLID will emerge naturally. Though, of course, it will take longer for them to emerge; that’s why developers start using them right from the beginning… just to save time.
I think I have a clearer conscience now about building the application before testing because that is how I have been building my own projects. I’m usually excited to build a new feature, but testing always seemed to kill the excitement.
I feel guilty about this, but writing tests has always been boring to me; it’s always been so, and I’ve had to deal with a lot of procrastination in that domain.
Testing is good, but if you find it boring as I do, I think going for best performance covers the guilt.
Yeah, it’s boring, especially when it’s not your code. My technique to keep some of the excitement is to start with a very basic unit test and add more complexity as I make progress on the feature. Also, seeing a test pass after it has been failing for a while is exciting, and even a test passing on the first attempt helps me keep the hype up, or suspect that something may be wrong.
I’ve never liked the term “Best Practice”, I’ve always preferred the term “Good Practice”. There is no technique/framework/process etc. that is the best for every situation, yet a lot of “Best Practices” are spoken about as if by not doing them your application will fail, your company will go bankrupt and maybe the earth will set on fire.
“Good Practice” on the other hand is more like saying “In most cases, this will avoid the worst/common problems in this situation, so we recommend trying it”, which IMO is much more realistic.
Don’t we all try to get the best performance out of our code? LOL! Good article, though; finding the happy medium comes with practice.
This article makes a false dichotomy IMHO; either you have testable code or you have fast code. There are plenty of ways to achieve both, that may be more or less difficult depending on your architecture. Test-builds that change which dynamic libraries they link to, compile-time switches that alter a block of code to use a stub or a mock, and the list goes on.
The statement “We don’t write a lot of [automated unit tests]” is deeply concerning to me. The codebase will accumulate technical debt, subtle bugs, whack-a-mole bugs, and plenty of regression bugs (bugs that appeared before are appearing again). This kind of tech debt can paralyze and eventually kill a project, even one as large and successful as StackOverflow (especially one that is large). How can you possibly change anything without incredible uncertainty of what will break?
I’m disappointed, StackOverflow. The article reads a little bit like you’re in the “denial” stage of grief.
The site is still working very well anyway; your disappointment makes no difference.
One place where I find a best practice may not be so useful is when there is a lot of serialization/deserialization and mapping to pass objects between layers. While separate objects per layer can give a good separation between layers and improve maintainability a lot, in certain scenarios they may have a bad impact on performance, and it would be better to share the same model between layers.
Oh, look! We got hacked!
Oh, look! The application is crashing like popcorn!
Oh, look! We have unmaintainable code!
Oh, look! We can’t find or reproduce a bug!
Oh, look! We have to refactor the code to implement a new functionality!
Oh, look! The backend functional points don’t reflect the frontend functional points!
Oh, look! We have 4 different patterns on the same application because each developer implements their own!
Oh, look! The developer bypassed the logic engine to implement an enhancement instead of doing it inside the logic engine!
If you really think best practices slow you down, you are not understanding why and when to use them. You are just blindly following rules.
I respect people that say: look, by doing this (insert something I don’t usually do), we get this good result.
But when somebody says a widely accepted good practice doesn’t work in their experience, or is wrong, without offering a solution, and encourages others not to use it, that usually means this person doesn’t know all that well how it actually works, nor how to use it.
I think the term “best practices” is at best a misnomer and at worst an active hindrance to people pursuing the full solution space. Software development, as the article points out, is the art of balancing various and often conflicting requirements. “Best” is an absolute. “Best” implies that by choosing anything else you’d end up with a sub-optimal approach or solution. The plural also gives it away: we talk about multiple best practices, which only makes sense if software development has several dimensions that are completely independent of each other, for each of which we just have to choose the single known best practice. Obviously, that is not the case.
I’d very much welcome it if we stopped calling them “best practices” and started calling them “proven good practices” with respect to the individual dimension that the practice aims to improve (which, admittedly, is just not as catchy as “best practice”). This means you’re well-advised to choose these practices absent any information to the contrary but even then have to weigh their costs and benefits against each other. In our profession, we need to remain open to the idea, that any new information, any new requirement, any change in context, can call into question decisions we have made earlier. At any such point in time, we need to be willing to re-evaluate our past decision in light of the new context, and it is important we remember that following a proven practice is just one of these decisions. If we have reason to believe that stopping such a practice is a beneficial tradeoff in our current context, we should not just feel free, we should feel obliged to make that change.
And, as the article also pointed out very well, at a later point in time the context will yet again have changed to an extent that makes it worthwhile to (re)implement a proven practice.
So you’ve essentially faced the startup dilemma: scale fast, solve problems later. Actually, this is why most startups fail: not because they can’t scale initially, but because the mess they create along the way slows them down so much that their business model eventually fails. The reason this didn’t happen to you is most probably that you had a strong enough / innovative business case for which this boundary didn’t come too early (#SurvivorshipBias). “Weaker” business models cannot survive the mess. This puts a boundary on what can be done within a certain software engineering culture. Also, you most definitely had some very important core developers who didn’t leave the company and knew the code from the very start. They saved you. Performance is not a feature. It’s a constraint, which you should only optimize if you can prove the necessity.
This is half of what I was going to say. I disagree with “this is why most startups fail” – unless you have some evidence, in which case I would be happy to revise my thinking. The other half of the start-up dilemma is that if you solve problems now and scale slow you also might fail, because by the time you get anywhere, lots of others are ahead of you and you will struggle to get any market share regardless of how much better designed or easy to maintain your product is, or because the investors have long since gone elsewhere. In the real world, having the time to properly design and build software can be a luxury. There is a balance between planning for future development and just getting stuff done fast enough, particularly if you have specific functional milestones with due dates. That balance also shifts depending on the circumstances – if someone wants something for a particular event and it will never be used again and you have a budget of 5 days, how much unit testing are you going to build into that?
“Performant” is not a word. Stop using it.
While I have plenty of grammatical complaints – for example using “less” rather than “fewer” when referring to a countable collection, or using “impactful” in any circumstances – I also have to recognise that language evolves. “Performant” is in the Oxford English dictionary and has a very specific meaning that is useful, so people use it. It is usual (for good reason) in this context to note that Shakespeare invented words – “critic”, “dauntless” and “lackluster” to name a few.
It is funny when most of your job ads say:
“You are adept at shipping high quality, well-tested code in a fast paced environment.”
Yeah, sure: preaching what you don’t practice, at its finest.