community November 16, 2010

Dr. Strangedupe: Or, How I Learned to Stop Worrying And Love Duplication

Jeff Atwood
Co-Founder (Former)

As Stack Overflow grows (or any other Q&A site in the Stack Exchange network, really), there’s a natural pressure to discover and link duplicate questions. The more questions you have, the higher the possibility a given new question isn’t in fact a new question, but a duplicate of an older existing question. Because of this, we’ve continually enhanced the tools for finding, linking, and merging duplicate questions.

One thing I want to be clear about, though, is that duplication is not necessarily bad. Quite the contrary: some duplication is desirable. There’s often benefit to having multiple subtle variants of a question around, as people tend to ask and search using completely different words, and the better our coverage, the better the odds that people can find the answer they’re looking for. And isn’t that, really, the whole point of this exercise?

Furthermore, it’s OK for duplicate questions to have duplicate answers. While you could argue that the duplicate questions could all be merged into one question with a “master” set of answers, this is kind of irritating from the perspective of the user looking for an answer. Put yourself in their shoes. Instead of finding …

Duplicate Question
Duplicate Answer

They have to deal with finding:

Duplicate Question
[closed as duplicate of Question] click here to see answers

Now, what other site requires users to do some sort of weird scroll-down, click-here-first-to-see-the-answer nonsense on the search results before it will reveal the answer? Oh yes, our old hyphenated pals. Do we really want our site to work like theirs?

Furthermore, I’ve found that the perfect duplicate question is a … bit of a mythical beast. There are similar questions, yes, and so-called “exact” duplicates do happen, but they are kind of rare in my experience. It’s far more common to have many subtle variations of a question. I think that’s OK, because that’s how the world works. Trying to shoehorn a bunch of semi-related things into one arbitrary container in service of some Highlander-ish “there can be only one” rule is ultimately harmful. Remember: while there are aspects of wiki to our system, we are not Wikipedia. There is not one canonical question about every possible subject. Rather, there are many.

In other words, over time, I have learned to stop worrying and love (some) duplication. And you should too.

Goldie, how many times have I told you guys that I don't want no horsing around on the airplane?

Here are my official guidelines on question duplication:

  1. Having one “perfect” form of a question that contains every possible answer to every slight variation of that question is a myth at best and actively harmful at worst.

  2. Having dozens and dozens of variations of the same question is clearly bad.

  3. What we want is on the order of 4 or 5 similar-but-not-quite-the-same duplicates to cover all possible search terms and common permutations of the question. It is also OK for these duplicates to have their own answers so people who find them don’t have to click yet again to get to a good answer.

Let me be clear — too much question duplication is bad. Absolutely. You’ll get no argument whatsoever from me on that. But not enough question duplication is also bad. I know this does not sit well with programmers who love to think in binary black and white and cannot abide a single atom of duplicated content in the entire omniverse. But the honest, realistic answer to how much question duplication there should be is … “enough”. Question duplicates aren’t necessarily our enemy. They’re more like our, y’know, frenemies.

So, as always, use your good judgment and please continue to close and merge duplicates as you see fit. However, bear in mind that cultivating and supporting a moderate amount of natural duplication actively helps the community. I wasn’t kidding when I said learn to stop worrying and love (some) duplication. Use the above guidelines and try to find a happy, reasonable medium somewhere in the middle there.
