community November 12, 2019

Research Update: A/B Testing the New Question Form

This month’s research update shows how the new question asking experience on Stack Overflow, now live for everyone, helps askers be more successful with quality questions.
Julia Silge
Data Scientist (former)

Welcome to November’s installment of Stack Overflow research updates! This month marks one year since my colleagues in UX research and I started sharing bite-size updates about the quantitative and qualitative research we use to understand our communities and make decisions.

In recent months, we have invested time and energy in improving the question-asking experience on Stack Overflow, one of the most fundamental interactions on our site. In August, I outlined what we learned from the question wizard, our first major change to the question-asking workflow in a decade. In September, Lisa shared the results of her qualitative research that has informed our next steps. Today, I want to present the results of A/B testing for the changes currently live on the site.

Wizards and a unified experience

The question wizard represented a move in the right direction in terms of question quality and interactions via comments, but some of the decisions made for the wizard turned out to be brittle and inappropriate for our scale. For both technical and design reasons, we have chosen to pursue a single question design with modals specific to different kinds of users, not a two-mode workflow based on reputation.

To measure the impact of changes to the question workflow, we use A/B testing. People in the baseline arm of the test had the old version of the question workflow as it already existed. We shipped changes to the question workflow iteratively so that people in the experiment arm of the test experienced a new workflow; these iterative changes in the experiment arm were necessary because the changes we wanted to test against the old workflow were so extensive and had complex dependencies. For simplicity, we can summarize the changes in two “steps”:

  • Step 1: The first group of changes launched in September and included pretty dramatic UI changes, along with a welcome modal and what-to-expect modal for new users.
  • Step 2: The next group of changes launched in October and focused mostly on a review interface, consolidating and organizing validation warnings.

People in the baseline arm saw none of these changes; they had only the old question workflow.
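This post does not describe how people were assigned to the two arms. As a minimal sketch of one common approach, assuming a stable user ID and a made-up experiment name, hashing gives each person the same arm on every visit, so nobody switches between the old and new workflow partway through the test:

```python
import hashlib

ARMS = ("baseline", "experiment")

def assign_arm(user_id: str, experiment: str = "new-question-form") -> str:
    """Deterministically map a user to an arm.

    Illustrative only: the experiment name and the hashing scheme are
    assumptions, not Stack Overflow's actual assignment method.
    """
    key = f"{experiment}:{user_id}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    # Use the first 8 bytes of the hash as a large integer, then pick a bucket.
    bucket = int.from_bytes(digest[:8], "big") % len(ARMS)
    return ARMS[bucket]
```

Because the arm depends only on the experiment name and the user ID, no assignment table needs to be stored, and a different experiment name reshuffles everyone independently.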

Posting your question

One of the most important metrics for us when we work with the question workflow is the conversion from clicking the “Ask Question” button to finally posting a question. The new question workflow, compared to the old, allows users to be more successful in this task, with an increase of 3% in this conversion that held throughout the entire process (both Step 1 and Step 2). Adding the review interface did not affect the ease of use of the new question form, as measured by this conversion from initial click to final post.
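To make the conversion metric concrete, here is a rough sketch of how a click-to-post conversion could be compared between two arms. The function names are hypothetical, the pooled two-proportion z-test is a standard textbook choice rather than the method confirmed for this analysis, and the sketch reports both the absolute and the relative difference because the two are easy to confuse:

```python
from math import erf, sqrt

def conversion_rate(posted: int, clicked: int) -> float:
    """Share of "Ask Question" clicks that ended in a posted question."""
    if clicked <= 0:
        raise ValueError("clicked must be positive")
    return posted / clicked

def compare_arms(posted_a: int, clicked_a: int, posted_b: int, clicked_b: int):
    """Compare baseline (a) with experiment (b).

    Returns (absolute_diff, relative_diff, z, p_value) using a pooled
    two-proportion z-test, which is an assumption made for illustration.
    """
    p_a = conversion_rate(posted_a, clicked_a)
    p_b = conversion_rate(posted_b, clicked_b)
    pooled = (posted_a + posted_b) / (clicked_a + clicked_b)
    se = sqrt(pooled * (1 - pooled) * (1 / clicked_a + 1 / clicked_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal CDF.
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return p_b - p_a, (p_b - p_a) / p_a, z, p_value
```

Calling `compare_arms(400, 1000, 450, 1000)` with these invented counts gives an absolute difference of five percentage points and a relative difference of 12.5%, which shows why it matters to state which kind of change a figure describes.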

The gray error bars in this graph show the uncertainty in our measurement of this proportion. They are so small that they may be difficult to see.
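Error bars this small are what you would expect when a proportion is measured over many people. As a hedged sketch (the post does not say how its error bars were computed, so the normal-approximation interval below is an assumption), the width of the interval shrinks with the square root of the sample size:

```python
from math import sqrt

def proportion_interval(successes: int, trials: int, z: float = 1.96):
    """Approximate 95% interval for a proportion (normal approximation).

    Illustrative only; other interval methods exist and may have been
    used for the original graph.
    """
    if trials <= 0:
        raise ValueError("trials must be positive")
    p = successes / trials
    half_width = z * sqrt(p * (1 - p) / trials)
    # Clamp to the valid range for a proportion.
    return max(0.0, p - half_width), min(1.0, p + half_width)

# Same observed proportion (40%), very different sample sizes (invented numbers).
small_low, small_high = proportion_interval(40, 100)
large_low, large_high = proportion_interval(40_000, 100_000)
```

With the larger sample, the interval is roughly thirty times narrower than with the smaller one, which on a chart would be nearly invisible.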

Another important metric for the question workflow is question quality, which we define and explain here. More questions are being asked with the new workflow, but what are these questions like?

During Step 1 (the major UI changes plus modals for new question askers), we saw a 1.5% decrease in good-quality questions. Not great news! During that part of this major revamp of the question-asking experience, we had increased the number of questions (and the overall number of good questions), but the proportion of questions that were good was down slightly.
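It can seem odd that the number of good questions went up while the proportion went down. A small worked example, using invented counts that are not the real experiment data, shows how both can be true at once:

```python
def good_share(good: int, total: int) -> float:
    """Proportion of questions classified as good quality."""
    return good / total

# Hypothetical counts chosen only to illustrate the arithmetic.
baseline = {"questions": 10_000, "good": 5_000}
experiment = {"questions": 10_300, "good": 5_070}

baseline_share = good_share(baseline["good"], baseline["questions"])
experiment_share = good_share(experiment["good"], experiment["questions"])

more_questions = experiment["questions"] > baseline["questions"]
more_good_questions = experiment["good"] > baseline["good"]
lower_share = experiment_share < baseline_share
```

Here the experiment arm has 70 more good questions than the baseline, but because total questions grew faster than good ones, the share of good questions slips below the baseline's 50%.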

Fortunately, one of the main reasons we are redesigning the question workflow is that our new approach is more flexible and easier to iterate on. In fact, that’s exactly what we did next. Step 2 of our rollout focused on consolidating and organizing validation warnings, and during this step, the quality gap between the baseline and experiment groups decreased to virtually zero. We fixed the regression in question quality by iterating in this more flexible framework. We see similar results during the test if we measure bad quality instead of good quality.

Next steps

As of today, the new question workflow performs better in terms of task success (people who intend to ask a question successfully posting their question) and the same in terms of question quality. From a technical perspective, the new workflow is easier to maintain and build on moving forward. Our next steps will include more iteration to continue improving question quality and to address other concerns of all kinds of users, from the most to the least experienced. We have graduated this new question workflow, and in the future we’ll be testing any further changes against this new baseline. The next time you ask a question on Stack Overflow, look for the results of these carefully planned and tested changes!
