\u003C/figure>\u003C/li>\u003C/ul>\n\n\u003C!-- /wp:gallery -->\n\n\u003C!-- wp:paragraph -->\nBefore launching the wizard, we tested a final version of it in an A/B test. We found that \u003Cstrong>question quality\u003C/strong>, \u003Ca href=\"https://meta.stackexchange.com/questions/302970/how-is-question-quality-measured-in-a-b-tests\">which we define and explain in detail here\u003C/a>, improved in an absolute sense by modest amounts for askers with reputation less than 111 (the asker population we included in that test). There was a 5.12% decrease in bad quality questions, and a 1.12% increase in good quality questions.\u003Cbr>\u003Cbr>When the wizard launched more broadly, it was the default option for users with reputation less than 111, but such users could opt out if they wanted. Also, users with reputation above that threshold could opt in.\u003Cbr>\u003Cbr>\n\u003C!-- /wp:paragraph -->\n\n\u003C!-- wp:heading {\"level\":1} -->\n\n\u003Ch1>Who has used the wizard?\u003C/h1>\n\n\u003C!-- /wp:heading -->\n\n\u003C!-- wp:paragraph -->\nBetween when the wizard was launched and the beginning of this week, 777,644 questions have been asked using guided mode. What level of opting in/out do we see, specifically on people's first questions?\u003Cbr>\u003Cbr>\n\u003C!-- /wp:paragraph -->\n\n\u003C!-- wp:gallery {\"ids\":[12480]} -->\n\n\u003Cul class=\"wp-block-gallery columns-1 is-cropped\">\u003Cli class=\"blocks-gallery-item\">\u003Cfigure>\u003Cimg src=\"https://stackoverflow.blog/wp-content/uploads/2019/08/wizard_use-1-1181x675.png\" alt=\"\" data-id=\"12480\" data-link=\"https://stackoverflow.blog/?attachment_id=12480\" class=\"wp-image-12480\"/>\u003C/figure>\u003C/li>\u003C/ul>\n\n\u003C!-- /wp:gallery -->\n\n\u003C!-- wp:paragraph -->\nA few percent of higher rep question askers opted in, and about 15% of users with reputation below 111 opted out. 
In this time period, 99.1% of first questions were asked by users with reputation less than 111, so using the wizard has been the main new question-asking experience over the past several months. Users are largely not opting out.\u003Cbr>\u003Cbr>\n\u003C!-- /wp:paragraph -->\n\n\u003C!-- wp:heading {\"level\":1} -->\n\n\u003Ch1>Confounders and models\u003C/h1>\n\n\u003C!-- /wp:heading -->\n\n\u003C!-- wp:paragraph -->\nIt is quite challenging to measure what effect an intervention like this wizard has because users can opt in/out of it. We expect that the very characteristics that lead a user to have a hard time writing a high-quality question may make them more likely to opt in. A difference we see between using the wizard vs. not may be caused by a factor that \u003Cem>predicts\u003C/em> intervention (the wizard) rather than the intervention itself. This is an example of \u003Ca href=\"https://stats.stackexchange.com/questions/267158/difference-between-confounding-and-interaction/267185\">confounding\u003C/a>, and is exactly why we typically use A/B tests when planning changes for our site. However, we would still like to see what we can learn from the last six months of data, which means we need to use methods appropriate for \u003Cstrong>observational data\u003C/strong>. My academic background is astronomy, so this is a pretty familiar situation for me! (There are not a lot of randomized controlled trials in space.) I'll focus on a few very important factors in this blog post, but predictors we explored included reputation, account age, user location, and more.\u003Cbr>\u003Cbr>For this analysis, I am going to share results using \u003Ca href=\"https://en.wikipedia.org/wiki/Bayesian_inference\">Bayesian generalized linear multilevel modeling\u003C/a> to understand the impact of the question wizard. 
\u003Ca href=\"https://stats.stackexchange.com/questions/372048/xkcds-modified-bayes-theorem-actually-kinda-reasonable\">Why use a Bayesian approach for this data?\u003C/a> The main reasons are that only a very small number of users with higher rep ever used the wizard, and we'd like to explore whether the wizard \u003Cem>only\u003C/em> helps lower rep users, among other questions about when and for whom the wizard is helpful. Bayesian modeling provides a framework well-suited to those kinds of questions.\u003Cbr>\u003Cbr>\n\u003C!-- /wp:paragraph -->\n\n\u003C!-- wp:heading {\"level\":1} -->\n\n\u003Ch1>Question quality\u003C/h1>\n\n\u003C!-- /wp:heading -->\n\n\u003C!-- wp:paragraph -->\nFirst, let's look at model results for \u003Ca href=\"https://meta.stackexchange.com/questions/302970/how-is-question-quality-measured-in-a-b-tests\">question quality\u003C/a>.\n\u003C!-- /wp:paragraph -->\n\n\u003C!-- wp:image {\"id\":12482} -->\n\u003Cfigure class=\"wp-block-image\">\u003Cimg src=\"https://stackoverflow.blog/wp-content/uploads/2019/08/quality_results-1-1200x533.png\" alt=\"\" class=\"wp-image-12482\"/>\u003C/figure>\n\u003C!-- /wp:image -->\n\n\u003C!-- wp:paragraph -->\nThis plot shows results for a straightforward classification model predicting whether a question is good or bad, trained using \u003Ca href=\"https://mc-stan.org/users/interfaces/brms\">brms\u003C/a> and \u003Ca href=\"https://mc-stan.org/\">Stan\u003C/a>. I chose the reputation threshold at 11 (rather than, say, at 111) because the median reputation of a first-time question asker is 8.1. A \"new account\" is one created within the past day. The size of these effects does change with the thresholds, but the finding that these predictors have an impact at all is robust to such changes.\u003Cbr>\u003Cbr>How can you interpret this plot? Reputation above the threshold, older accounts, and using the question wizard are all associated with improvements in question quality. 
Specifically, because of the modeling approach used here, we see that, for example, people with accounts older than one day write better first questions, controlling for other factors like reputation and using the wizard.\u003Cbr>\u003Cbr>\n\u003C!-- /wp:paragraph -->\n\n\u003C!-- wp:html -->\n[table id=12 /]\n\u003C!-- /wp:html -->\n\n\u003C!-- wp:paragraph -->\nLet's walk through how to interpret these \u003Ca href=\"https://en.wikipedia.org/wiki/Risk_ratio\">risk ratios\u003C/a>, which are relative changes.\u003Cbr>\u003Cbr>\n\u003C!-- /wp:paragraph -->\n\n\u003C!-- wp:list -->\n\n\u003Cul>\u003Cli>Someone with rep <= 11 is about 0.6x as likely (~40% less likely) to ask a good quality first question, and about 2x as likely (100% more likely) to ask a bad quality first question, than someone with higher reputation, controlling for other factors.\u003C/li>\u003Cli>A first-time question asker who has a new account (has created their account within the past day) is about 0.9x as likely (10% less likely) to ask a good quality first question and about 1.2x as likely (20% more likely) to ask a bad quality first question than someone with an older account.\u003C/li>\u003C/ul>\n\n\u003C!-- /wp:list -->\n\n\u003C!-- wp:paragraph -->\nControlling for these confounders, when a first-time question asker uses the wizard, their question is \u003Cstrong>about 6% more likely to be good quality and 20% less likely to be bad quality\u003C/strong>. We consider this a real success of the wizard, because when people ask better quality questions, they are more likely to get answers and have an overall more positive experience. We also know people who ask good quality questions are more likely to ask a question again. Confirming what we found during the A/B test, the wizard has a bigger impact on reducing bad quality questions than on increasing good quality questions.\u003Cbr>\u003Cbr>I explored whether we see evidence that the wizard only helps users with lower reputation or first-time question askers, and we don't see evidence for that. 
Our evidence that the wizard can help users with higher reputation is weaker (we don't have much data on this), but we can put some limits on how large the difference between higher and lower rep question askers would need to be for us to detect it with the data we have.\u003Cbr>\u003Cbr>\n\u003C!-- /wp:paragraph -->\n\n\u003C!-- wp:heading {\"level\":1} -->\n\n\u003Ch1>Comments galore\u003C/h1>\n\n\u003C!-- /wp:heading -->\n\n\u003C!-- wp:paragraph -->\nWe know from multiple sources of feedback and research that the comment section of Stack Overflow can be a pain point in using our site. Putting aside the issue of unfriendly or otherwise problematic comments, comments are intended to be a venue for clarifying questions that help improve the quality of a question. This means that from our perspective, in general and on average, the fewer comments, the better.\u003Cbr>\u003Cbr>How has the question wizard affected the \u003Cstrong>number\u003C/strong> of comments on first-time questions, and the number of \u003Cstrong>unfriendly\u003C/strong> comments? Instead of a classification model, this analysis uses a \u003Ca href=\"https://stats.stackexchange.com/questions/3024/why-is-poisson-regression-used-for-count-data/3027\">Poisson regression model\u003C/a>.\n\u003C!-- /wp:paragraph -->\n\n\u003C!-- wp:image {\"id\":12483} -->\n\u003Cfigure class=\"wp-block-image\">\u003Cimg src=\"https://stackoverflow.blog/wp-content/uploads/2019/08/comment_results-1-1200x533.png\" alt=\"\" class=\"wp-image-12483\"/>\u003C/figure>\n\u003C!-- /wp:image -->\n\n\u003C!-- wp:paragraph -->\nPeople use \u003Ca href=\"https://meta.stackexchange.com/questions/313754/updated-comment-flagging-supporting-the-new-code-of-conduct\">several different categories\u003C/a> to flag comments that are inappropriate for our site, such as rude/abusive and unfriendly/unwelcoming. 
Recently, my team, especially my colleague \u003Ca href=\"https://stackoverflow.com/users/6212/jason-punyon\">Jason Punyon\u003C/a>, has used the human-generated flags to build a deep learning model to automatically detect unfriendly comments on Stack Overflow. We'll share more soon about this model, how it works, and how we're using it on our site to make Stack Overflow a safer and more inclusive community. For this analysis, I used the unfriendly comment \u003Cem>model\u003C/em> (not comments flagged by human beings, which \u003Ca href=\"https://stackoverflow.blog/2018/12/04/welcome-wagon-community-and-comments-on-stack-overflow/\">we know are dramatically underflagged\u003C/a>) to understand what impact the wizard has on unfriendly comments.\u003Cbr>\u003Cbr>\n\u003C!-- /wp:paragraph -->\n\n\u003C!-- wp:html -->\n[table id=11 /]\n\u003C!-- /wp:html -->\n\n\u003C!-- wp:paragraph -->\nWhat do these coefficients mean? They are multiplicative factors on the expected number of comments, because this was \u003Ca href=\"https://en.wikipedia.org/wiki/Poisson_regression\">Poisson regression\u003C/a>.\u003Cbr>\u003Cbr>\n\u003C!-- /wp:paragraph -->\n\n\u003C!-- wp:list -->\n\n\u003Cul>\u003Cli>Having rep <= 11 is associated with receiving 6% more comments and about 40% more unfriendly comments.\u003C/li>\u003Cli>Having a new account (less than one day old) is associated with receiving 5% more comments on a first question and about 20% more unfriendly comments.\u003C/li>\u003Cli>Writing a good quality first question is associated with about a 15% reduction in comments on that question and a reduction of over 70% in unfriendly comments on that question.\u003C/li>\u003C/ul>\n\n\u003C!-- /wp:list -->\n\n\u003C!-- wp:paragraph -->\nUsing the wizard is associated with \u003Cstrong>5% fewer comments and over 20% fewer unfriendly comments\u003C/strong>. 
These factors and their impact come from models controlling for the other factors, so think about the wizard reducing unfriendly comments by over 20%, controlling for other factors, including reputation and quality of the question. We consider this another huge success of the wizard.\u003Cbr>\u003Cbr>\n\u003C!-- /wp:paragraph -->\n\n\u003C!-- wp:heading {\"level\":1} -->\n\n\u003Ch1>Next steps\u003C/h1>\n\n\u003C!-- /wp:heading -->\n\n\u003C!-- wp:paragraph -->\nWe are really happy to see that the question wizard has been having a positive influence on Stack Overflow, both in terms of question quality and interactions via comments. 🎉 This was the first time we had shipped changes of this magnitude for the question-asking workflow in a decade, and it is great to be able to celebrate its impact.\u003Cbr>\u003Cbr>The question wizard has proven successful enough that we want to iterate on its design and see how we can better scaffold people with coding problems to either write a successful question or realize that they don't need to ask a question at all because the answer is already available. We also need to revisit some of the technical decisions made when launching a two-mode question workflow, as some aspects of the current system are brittle and inappropriate for a growing system, especially when we consider the network beyond Stack Overflow.\u003Cbr>\u003Cbr>What can you expect in the near future if you ask a question soon? 
We will be using A/B testing, so not every user will experience changes at the same time, but some of the changes we are working on so far include:\u003Cbr>\u003Cbr>\n\u003C!-- /wp:paragraph -->\n\n\u003C!-- wp:list -->\n\n\u003Cul>\u003Cli>More upfront guidance for first-time question-askers\u003C/li>\u003Cli>Setting expectations about what happens after asking a question\u003C/li>\u003Cli>Improved \"how-to-ask\" guidance while drafting a question\u003C/li>\u003Cli>Consolidating the many dozens of validation warnings into a single review interface\u003C/li>\u003C/ul>\n\n\u003C!-- /wp:list -->\n\n\u003C!-- wp:paragraph -->\nFor both technical and design reasons, we plan for the question workflow to change for all users, not only those with lower reputation. As we move forward, we can use both data and feedback from users to assess how successful changes are. How do we know when we are successful? We use both qualitative and quantitative research in making decisions about our site. This blog post is an example of the kind of quantitative analysis that we use, involving large samples of our users broadly. 
For more details on what's coming to Stack Overflow soon, check out my colleague Meg's \u003Ca href=\"https://stackoverflow.blog/2019/08/20/upcoming-on-stack-overflow/\">blog post from earlier this week\u003C/a>!\n\u003C!-- /wp:paragraph -->","html","2019-08-22T14:55:33.000Z",{"current":650},"impact-of-ask-question-wizard",[652,660,665,670,675],{"_createdAt":653,"_id":654,"_rev":655,"_type":656,"_updatedAt":653,"slug":657,"title":659},"2023-05-23T16:43:21Z","wp-tagcat-bulletin","9HpbCsT2tq0xwozQfkc4ih","blogTag",{"current":658},"bulletin","Bulletin",{"_createdAt":653,"_id":661,"_rev":655,"_type":656,"_updatedAt":653,"slug":662,"title":664},"wp-tagcat-community",{"current":663},"community","Community",{"_createdAt":653,"_id":666,"_rev":655,"_type":656,"_updatedAt":653,"slug":667,"title":669},"wp-tagcat-insights",{"current":668},"insights","Insights",{"_createdAt":653,"_id":671,"_rev":655,"_type":656,"_updatedAt":653,"slug":672,"title":674},"wp-tagcat-stackoverflow",{"current":673},"stackoverflow","Stackoverflow",{"_createdAt":653,"_id":676,"_rev":655,"_type":656,"_updatedAt":653,"slug":677,"title":679},"wp-tagcat-company",{"current":678},"company","Company","Research Update: Impact of the Ask Question Wizard",[682,684,690,696],{"_id":16,"publishedAt":17,"slug":683,"sponsored":12,"title":20},{"_type":10,"current":19},{"_id":685,"publishedAt":686,"slug":687,"sponsored":12,"title":689},"f0807820-02d7-4fc5-845f-3d76514b81c0","2025-08-11T16:00:00.000Z",{"_type":10,"current":688},"renewing-chat-on-stack-overflow","Renewing Chat on Stack Overflow ",{"_id":691,"publishedAt":692,"slug":693,"sponsored":12,"title":695},"e33464c4-b21b-4019-8b86-64a46335a95e","2025-08-07T16:00:00.000Z",{"_type":10,"current":694},"a-new-worst-coder-has-entered-the-chat-vibe-coding-without-code-knowledge","A new worst coder has entered the chat: vibe coding without code 
knowledge",{"_id":697,"publishedAt":698,"slug":699,"sponsored":12,"title":701},"8b04b236-51d5-4747-9de8-2fe6e6a2512e","2025-08-04T16:00:00.000Z",{"_type":10,"current":700},"cross-pollination-as-a-strategic-advantage-for-forward-thinking-organizations","Cross-pollination as a strategic advantage for forward-thinking organizations",{"count":703,"lastTimestamp":704},18,"2023-05-25T09:46:48Z",["Reactive",706],{"$sarticleModal":644},["Set"],["ShallowReactive",709],{"sanity-hQbOQhmIOfebemxSpCSmDx4AwNOr4K-AWA_fy_2E8Kk":-1,"sanity-comment-wp-post-12458-1755678384720":-1},"/2019/08/22/impact-of-ask-question-wizard/?cb=1"]