cc-wiki-dump May 31, 2010

Academic Papers Using Stack Overflow Data

Jeff Atwood
Co-Founder (Former)

One unanticipated benefit of releasing our data under a Creative Commons license is that the Stack Overflow dataset has already been the subject of several academic papers:

Evolution of Two Sided Markets

Ravi Kumar, Yury Lifshits (Yahoo! Research), and Andrew Tomkins (Google)

Presented at WSDM 2010, Session 7: Temporal Interaction

Causal Discovery in Social Media Using Quasi-Experimental Designs

Hüseyin Oktay, Brian J. Taylor, David D. Jensen (Knowledge Discovery Laboratory, Department of Computer Science, University of Massachusetts Amherst)

To be presented at the 2010 ACM/SIGKDD conference

There’s also a third study starting up with Lena Mamykina, a researcher in Human-Centered Computing at Columbia University, who is working in conjunction with Björn Hartmann, a professor from UC Berkeley:

“The success of stackoverflow.com is making all my research community wonder what is it that makes it work so well for the users. Would you be interested in participating in a research study to answer some of these questions? The study would probably involve things like interviews (phone) with your development team, moderators and selected users. The results will be submitted for publication at one of the ACM (Association of Computing and Machinery) conferences (for example a conference on human factors in computing systems, CHI or a conference on computer-supported cooperative work, CSCW). Of course you will have a chance to review and provide your feedback on all the materials before they are published.”

We’ll of course be contributing to the interviews, as well as introducing Lena to selected community members who indicate that they are willing to be interviewed for … science!

It’s exciting to be a part of this research, which lets everyone benefit from the slices of time that we’ve all collectively contributed, not just to Stack Overflow but to every site in our network. If there’s anything else we can do to assist research using the public Creative Commons data we expose, just contact us.
