cc-wiki-dump May 31, 2010

Academic Papers Using Stack Overflow Data

Jeff Atwood, Co-Founder (Former)

One unanticipated benefit of releasing our data under a Creative Commons license is that the Stack Overflow dataset has already been the subject of several academic papers:

Evolution of Two Sided Markets

Ravi Kumar, Yury Lifshits (Yahoo! Research), and Andrew Tomkins (Google)

Presented at WSDM 2010, Session 7: Temporal Interaction

Causal Discovery in Social Media Using Quasi-Experimental Designs

Hüseyin Oktay, Brian J. Taylor, David D. Jensen (Knowledge Discovery Laboratory, Department of Computer Science, University of Massachusetts Amherst)

To be presented at the 2010 ACM/SIGKDD conference

There’s also a third study starting up with Lena Mamykina, a researcher in Human-Centered Computing at Columbia University, who is working in conjunction with Björn Hartmann, a professor from UC Berkeley:

The success of stackoverflow.com is making all my research community wonder what it is that makes it work so well for the users. Would you be interested in participating in a research study to answer some of these questions? The study would probably involve things like interviews (phone) with your development team, moderators and selected users. The results will be submitted for publication at one of the ACM (Association for Computing Machinery) conferences (for example a conference on human factors in computing systems, CHI, or a conference on computer-supported cooperative work, CSCW). Of course you will have a chance to review and provide your feedback on all the materials before they are published.

We’ll of course be contributing to the interviews, as well as introducing Lena to selected community members who indicate that they are willing to be interviewed for … science!

It’s exciting to be a part of this research, which lets everyone benefit from the slices of time that we’ve all collectively contributed to not just Stack Overflow, but every site in our network. If there’s anything else we can do to assist research using the public Creative Commons data we expose, just contact us.
