Does Anyone Actually Visit Stack Overflow’s Home Page?
Yesterday we were amused to see this post on Reddit’s sysadmin forum:
Our architecture lead Nick Craver looked into this and gave a great answer, including that about 29% of the previous day’s Stack Overflow traffic was to the home page.
From his perspective as a system administrator, that’s exactly the right analysis. Nick’s team has to ensure the site renders reliably and quickly, so counting the number of requests makes sense. However, as a data scientist I’m more interested in how users view the site, and the vast majority of our traffic is automated, including search engines and scrapers.
When real people view Stack Overflow, how often are they visiting just questions? Who does and doesn’t visit the homepage, or other parts of the site?
What pages do people view?
If we look at yesterday’s traffic (2017-03-08), including automated traffic, here are the most common types of page views.
(Note that this counts only pageviews, not asynchronous requests that load after a page or internal API calls; sysadmins do care about the latter.)
Each of these represents one route: typically one possible type of page. The most common include:
- Questions/Show: question pages like this one, the kind of page most people associate with Stack Overflow
- Home/Index: our homepage
- UsersShow/Show: visits to a user’s profile (here’s mine)
- Questions/ListByTag: Pages with the most recent questions from a tag, like this one
It does look like a lot of people are visiting the homepage, nearly as many as are visiting questions. However, that’s because we’re counting automated traffic! As a next step, I’ll take a few simple measures to remove automated traffic and include only people we think are real users. These measures won’t catch a savvy scraper who “doesn’t want to be caught”, but they easily filter out your “everyday” bot.
Almost 95% of our non-automated traffic is to question views, with only a small percentage to the home page or other views. This isn’t surprising (it’s the motivation behind the Reddit post), but it’s a good insight into typical visitor behavior.
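For readers who want to try this kind of breakdown on their own logs, here is a minimal Python sketch. It is not the pipeline behind the numbers above: the post doesn’t list the actual bot-filtering measures, so the user-agent check (`looks_automated`, `BOT_MARKERS`), the record fields (`route`, `user_agent`), and the sample rows are all assumptions made up for illustration.

```python
from collections import Counter

# Hypothetical bot markers; the post does not say which signals were used.
BOT_MARKERS = ("bot", "crawler", "spider", "scraper")


def looks_automated(user_agent: str) -> bool:
    """Crude heuristic: flag empty user agents or ones containing a bot marker."""
    ua = user_agent.lower()
    return not ua or any(marker in ua for marker in BOT_MARKERS)


def route_shares(pageviews):
    """Return {route: percent of pageviews}, counting non-automated traffic only."""
    counts = Counter(
        pv["route"] for pv in pageviews if not looks_automated(pv["user_agent"])
    )
    total = sum(counts.values())
    if total == 0:
        return {}
    return {route: 100.0 * n / total for route, n in counts.items()}


# Made-up sample rows, for illustration only.
sample = [
    {"route": "Questions/Show", "user_agent": "Mozilla/5.0 (Windows NT 10.0)"},
    {"route": "Questions/Show", "user_agent": "Mozilla/5.0 (Macintosh)"},
    {"route": "Questions/Show", "user_agent": "Mozilla/5.0 (X11; Linux)"},
    {"route": "Home/Index", "user_agent": "Mozilla/5.0 (Windows NT 10.0)"},
    {"route": "Home/Index", "user_agent": "Googlebot/2.1"},
    {"route": "Home/Index", "user_agent": ""},
]
shares = route_shares(sample)
```

A substring check like this only catches bots that identify themselves, which matches the caveat above about savvy scrapers.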
How do registered visitors use the site?
When I tell a software developer I work at Stack Overflow, they almost always recognize it but often admit that they don’t have an account. That’s true of most of our users; only about 15% of our daily traffic comes from visitors who are logged in. Most developers visit Stack Overflow questions from a Google search as they solve their problems, then move on with their work.
We owe a lot to the community that’s registered on the site, however, because they’re responsible for our terrific set of questions and answers. As you might expect, this part of our community visits the site in a different way.
Within unregistered traffic, the share that goes directly to questions rises to 97%. The Reddit question was right! But a typical registered visitor, who might have answered some questions, does spend some time on the home page, as well as visiting user profiles and tag lists of recent questions.
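The registered versus unregistered comparison is the same kind of count, computed separately within each group. Here is a hedged sketch of that step; the `logged_in` field, the `shares_by_group` helper, and the sample rows are hypothetical and are not taken from the real data.

```python
from collections import Counter, defaultdict


def shares_by_group(pageviews, group_key):
    """Return {group: {route: percent of that group's pageviews}}."""
    counts = defaultdict(Counter)
    for pv in pageviews:
        counts[pv[group_key]][pv["route"]] += 1

    result = {}
    for group, routes in counts.items():
        total = sum(routes.values())
        result[group] = {route: 100.0 * n / total for route, n in routes.items()}
    return result


# Made-up sample rows, for illustration only.
sample = [
    {"route": "Questions/Show", "logged_in": True},
    {"route": "Questions/Show", "logged_in": True},
    {"route": "Home/Index", "logged_in": True},
    {"route": "Questions/ListByTag", "logged_in": True},
    {"route": "Questions/Show", "logged_in": False},
    {"route": "Questions/Show", "logged_in": False},
    {"route": "Questions/Show", "logged_in": False},
    {"route": "Questions/Show", "logged_in": False},
]
by_login = shares_by_group(sample, "logged_in")
```

Percentages are normalized within each group, so a small group (registered visitors) is not drowned out by a large one (unregistered visitors).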
Power users
While we’re grateful to everyone who contributes to the site, some within the community (“power users”) contribute an enormous amount of knowledge. We usually represent this in terms of reputation, gained when content you’ve contributed is upvoted.
Two thresholds we use for user privileges mark established users, who have at least 1,000 reputation, and trusted users, who have at least 20,000 reputation. To give a sense of these levels: among users who visit in a typical week, the median user with less than 1,000 rep has answered about 3 Stack Overflow questions, the median established user has answered about 50, and the median trusted user has answered about 750 (!).
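As a rough illustration of how those tiers could be applied in code, the sketch below buckets users by the two thresholds named above and takes the median answer count per bucket. Only the 1,000 and 20,000 cutoffs come from this post; the function names, the `reputation` and `answers` fields, and the sample users are invented for the example.

```python
from collections import defaultdict
from statistics import median

ESTABLISHED_MIN = 1_000   # threshold stated in the post
TRUSTED_MIN = 20_000      # threshold stated in the post


def reputation_tier(reputation: int) -> str:
    """Map a reputation score to one of the three tiers discussed in the post."""
    if reputation >= TRUSTED_MIN:
        return "trusted"
    if reputation >= ESTABLISHED_MIN:
        return "established"
    return "under 1,000"


def median_answers_by_tier(users):
    """Return {tier: median number of answers among users in that tier}."""
    answers = defaultdict(list)
    for user in users:
        answers[reputation_tier(user["reputation"])].append(user["answers"])
    return {tier: median(values) for tier, values in answers.items()}


# Made-up users, for illustration only.
sample_users = [
    {"reputation": 150, "answers": 2},
    {"reputation": 800, "answers": 4},
    {"reputation": 5_000, "answers": 40},
    {"reputation": 12_000, "answers": 60},
    {"reputation": 45_000, "answers": 700},
]
medians = median_answers_by_tier(sample_users)
```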
How do power users visit the site?
Trusted users (20,000+ reputation) spend about half their time viewing actual questions, and most of the other half visiting the homepage, question lists, or user pages. The home page offers suggestions of questions that are both new and likely to be relevant to you, so this makes sense. The ListByTag route is also relevant as an insight into how “power answerers” use the site; many stick to a particular tag that they’re familiar with, and watch for brand new questions they can answer.
We’re happy that we make the lives of developers easier, even if they’re just getting a quick solution after a search. But we’d also encourage you to join the world’s largest developer community, whether to ask and answer questions, get your next job, or build your online presence with a Developer Story. In any case, next time you solve your problem through Stack Overflow, remember the hundreds of thousands of users who regularly ask, answer, edit, and moderate the site to make it all possible.
45 Comments
The chart says 38.9% (so closely 39%), the text 29%. Typo, or am I reading it wrong?
29% includes all traffic (i.e., lots of scrapers). 38.9% includes only human users.
The 29% figure that Nick cited includes non-pageview requests (for example, ads that are requested after the rest of the page loads), which I didn’t include in this analysis.
You’ve given good reasons why power users visit the main page more often.
But why ever does so much automatic traffic go to Home/Index?!
No matter what your bot does, it has to start somewhere – root seems like a good place to start.
If the bot is a crawler (and “starts” at the home page), I’d expect it to visit a huge lot of questions before repeating to load the homepage. (Or are there bots constantly polling the main page for new questions?)
So if it is something else, what is it? A crawler that stops at the home page because it’s not the target domain (but checks for link rot)? A DOS attack? Or just a port scanner looking for HTTP servers in root?
One thing I’m curious to see: percentages by “users EVER do this in a day”. IE, I always visit stackoverflow.com once (to load the site); after that I almost exclusively use the search or tagged depending on how you classified that (I visit /questions/tagged/sas-macro%20sas%20enterprise-guide%20sas-ods%20sas-gtl?mode=any ). But what percent never visit stackoverflow.com – they have a bookmark (as a trusted or power user) or they find it from google (as a non-registered or basic level user) and so never load it? Conversely what percent never view questions by tag, or use search?
I muck around Stack Exchange to waste time. I suspect I visit the homepages of various sites more than most.
I didn’t know stack overflow had a home page LOL
On Stack Overflow (as opposed to low traffic Stack Exchange sites like Travel.SE), when I visit the home page, it’s not to browse questions. I either start a search, go to a notification (or close the window because there aren’t any), view the latest punny burnination request hot meta post, or the like. I don’t recall clicking on a question because it’s on the home page. Probably because I nowadays mainly ask questions, rather than post answers. It’s just the page that has the shortest URL.
Can you look at the statistics of how much people go from the home page to a question? I suspect that high rep users, people who post answers, are more likely to do so than low rep users.
What exactly is automated traffic?
bots and search engine crawlers
‘Tis funny; unless I’m looking at a specific question via a link of some sort, I always start at the home page. It gives me a nice list of questions related to the tags I’ve said I’m interested in — usually with a few bonus extras — and makes it dead easy to find the stuff that interests me.
You are the browsing type of user.
Most of the SO users are users looking for answers to specific problems and come straight from search engines to questions.
And this is why it is idiotic to carry on closing questions because “they’re not a good fit”. Questions won’t turn up in search results unless someone _else_ has the same question, which is in and of itself proof that the question as framed is relevant and of interest. Gamification is a curse, it attracts OCD button collecting jerks.
A question being a good fit for a specific site and the question being relevant to people out there are quite different matters, though?
What would the practical solution be? Broadening the scope of existing SE sites? Merging “related” sites?
I’m skeptical of both those ideas as it appears that the SE model starts to fall apart as a site grows beyond a certain point; it rather seems we need smaller SE sites, not bigger sites.
I mean that if a question really isn’t a good fit then it will be ignored, both by answerers and by the passing public. Actively suppressing them is unnecessary when it’s justified and destructive when it isn’t. As for a practical solution, I’m still pondering alternatives to gamification for creating engagement. What we are seeing is the long term instability of social groups, rise and fall of empire, you might say.
Wait, Stack Overflow has a homepage? 😉
Those of us who actually partake in the community by watching for interesting questions to answer for free spend a lot of time on the homepage.
Those who just pop in from a Google results page to get their homework solved for them do not. 🙂
I usually hit Stack Overflow via the homepage, to see if there’s anything interesting on there / check on my notifications, including from other sites in the network.
From the home page, I usually hit the other sites via the “Hot Network Questions” list.
I wonder if other users have a preferred “home site” and then do the same?
I have SO’s home page set as my browsers’ home pages…
I go to the home page to see if I have anything in my inbox, but mostly don’t read anything else on the page.
You are saying “time spend”, while all you count are hits if I am not wrong, or are you using Google analytics information about the time a user spend in each page together with cookies of registered users?
In that side-by-side “registered vs unregistered” chart, it might’ve been better to order the data by the page type, not largest-to-smallest on % of page loads. It’s a lot harder to do any fine-grained comparison of reg’d vs unreg’d behavior for a particular page type the way it’s plotted now.
I’d be interested to know how the traffic *flow* occurs as well. For instance, do established/trusted users tend to visit the homepage first *then* navigate off to questions, or do they tend to take direct search result links to question show pages? I’d expect unregistered users to land directly on question/show pages, but perhaps more avid registered users may spend more time navigating the site through the index pages.
It would be really interesting to have confirmation on this.
My own impressions and experience leads to this theory:
When looking for answers (ie, in need of help), I expect that the vast majority come from external searches (Google, etc) directly to specific questions, regardless what class of user they belong to.
When looking for questions (ie, deliberately on the site to contribute answers), first of all it’s mostly the “established” and “trusted” classes of users that even do this, and in my experience the only feasible means of finding questions of interest but without having very specific criteria is by using the index, by tag, etc pages directly on the SE sites.
Dave! Why’d you use the wrong type of graph in the last Breakdown image? #bargraphmasterrace
To check if you were paying attention.
Now you’ve got to add % of question views that are followed by a close vote, by user segment. I hypothesize that we get grumpier as we get older.
…yes. In fact, one massive data dump of statistics like these would be awesome, just to be able to look at certain correlations like this.
On the note of close votes, I’d say more established users probably submit close votes more often, simply because various circumstances make them some of the first to: see a question, be able to tell whether it’s a good question, and be able to downvote & close-vote it.
You should be able to compute them yourself from the publicly available data. See http://data.stackexchange.com/ and https://archive.org/details/stackexchange.
Lol I disagree. My dad is 85. He was a pain in our collective butts when we were kids. Today he’s the nicest man you’ll ever meet. I can’t tell you the number of times my siblings and I have looked at each other and said “is he really the same man?” and other things to the same effect. He even got a dog (forbidden when we were kids; I think he forgot). Since I hear similar stories from too many of my friends, I hypothesize just a little differently. We get grumpier for a number of years, then we forget why we’re grumpy and the world’s suddenly a wonderful place! You’ll have to add age to those question views!
There’s a home page?
I’m wondering – has traffic to the home page increased since this article was posted? 😛
I think StackOverflow should improve their homepage.
I come to SO (and some of the network sites) when facebook gets too boring or preachy
What’s with the line graph in the last picture? As a Data Scientist, you should know that you don’t use a line graph when the x-axis consists of discrete values and not continuous ones!
Do you feel that it is a poor method of representing that data? How would you recommend representing it?
Data scientist or not, what’s important is showing the data in a way that readers can understand. If it’s incorrect I wouldn’t recommend this method for publication, but the writer in this case is showing empirical data to people whose focus is the English language, not math. While I can’t speak for everyone, I think for this type of thing it’s more important that it be understandable than that it be exactly mathematically correct. If you can do both, more power to you. But I think it’s a good job as is.
A bar graph would be more appropriate. There are no actual values between “Questions/Show” and “Home/Index” (for example), so a line graph overemphasizes the space in between the different Page RouteNames (where no actual data exists at all), and doesn’t highlight the existing data points well enough.
The graph does clearly show that “Questions/Show” is higher than all the others, but it’s hard to pick out individual data points from the graph, and not as easy to compare the values for different kinds of users when the colors actually swap places at one point in the graph.
I replied a bit here: https://www.reddit.com/r/programming/comments/5yhnzm/does_anyone_actually_visit_stack_overflows_home/derai06/?context=3
Thank you.
Kingroot thank you.
Meh, I actually think dumping new users onto a list of questions with no context is a pretty poor landing page; it’s like if Google’s home page was a random selection of other people’s search results. So I actually approve of having a proper landing page; it just needs to put public Q&A front and centre, as the original product around which everything else is built, and which most users will want to visit first, including people who are deciding whether to buy one of the other products. IMSoP Jul 1 at 9:54
Registered 3 years ago still not visited the home page