Defining socially responsible AI: How we select partners

Stack Overflow is on a journey to build a new era in the practice of AI: the era of social responsibility. All products based on models that consume public Stack Overflow data are required to provide attribution back to the highest relevance posts that influenced the summary given by the model.

Stack Overflow is the world’s largest developer community, with more than 59 million questions and answers. By integrating human capabilities and AI advancements in our community, we aim to elevate the overall experience by providing a platform that facilitates engagement and empowers developers to achieve their goals with greater effectiveness.

Together with Stack's strong developer community and partnerships with the world’s leading AI providers, we strive to redefine the developer experience, fostering efficiency and collaboration through the power of community, best-in-class data, and AI experiences. Our mission is to set new standards with vetted, trusted, and accurate data that will be the foundation on which technology solutions are built and delivered to our users.

The problem

With so many GenAI solutions plagued by hallucinations and misinformation, a user’s trust in data, in technology products, and in community knowledge is more crucial than ever. Pressures from within the technology community and the larger society are driving LLM developers to consider their impact on the data sources used to generate answers. This has created an urgency around procuring high-quality training data that is better than what is publicly available. The race by hundreds of companies to produce their own LLMs and to integrate them into a myriad of products is driving a highly competitive environment. As LLM providers focus more on enterprise customers, multiple levels of data governance are required: corporate customers are much less accepting of lapses in accuracy and are demanding accountability for the information that models provide.

The vision

Stack Overflow, as a company, strongly believes that the community of the world's most engaged developers and technologists and the answers they share are what will ensure the success of AI's future. AI needs to evolve from being a tool of developers to being a part of the community itself. AIs are more than toolsets based on data layers and more than just an "experience" layer for users. Reducing an AI to these types of silos introduces risk and inefficiency because it fails to capture the full value the AI can bring. If, instead, individuals interact with AIs by transparently collaborating, building, and contributing knowledge, the entire ecosystem benefits. By combining data, human experience, and community, we can more effectively support developer needs.

Through our product partnerships, we seek to develop and grow a virtuous cycle of knowledge development, discovery, and refinement for the developer community, enabling developers to find solutions more efficiently and effectively. In order to work with Stack Overflow at an API level, partners commit to the following:

Attribution is non-negotiable

All products based on models that consume public Stack Overflow data are required to provide attribution back to the highest relevance posts that influenced the summary given by the model. Given the lack of trust in AI-generated content, it is critical to give credit to the author or subject matter expert and to the larger community who created and curated the content being shared by an LLM. This also ensures LLMs use the most relevant and up-to-date information and content, ultimately presenting the Rosetta Stone needed by a model to build trust in sources and resulting decisions. Sharing credit depends on attribution, and trust is at the heart of attribution.
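As a rough illustration of what attribution could look like inside a partner product, the sketch below appends links to the highest-relevance source posts to a model-generated summary. Everything in it is hypothetical: the `SourcePost` fields, the relevance scores, the `attach_attribution` helper, and the example URLs are assumptions made for illustration and do not describe an actual Stack Overflow API or a specific partner requirement.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SourcePost:
    """A post that influenced a generated summary (hypothetical shape)."""
    url: str
    author: str
    relevance: float  # assumed score produced by the partner's own retrieval step


def attach_attribution(summary: str, posts: list[SourcePost], top_k: int = 3) -> str:
    """Append credit lines for the top_k most relevant posts to the summary."""
    # Sort by descending relevance; sorted() is stable, so ties keep input order.
    ranked = sorted(posts, key=lambda p: p.relevance, reverse=True)[:top_k]
    if not ranked:
        # Nothing to credit: return the summary unchanged.
        return summary
    lines = [f"- {p.url} (posted by {p.author})" for p in ranked]
    return summary + "\n\nSources:\n" + "\n".join(lines)


# Example usage with made-up posts and scores.
posts = [
    SourcePost("https://example.com/a/1", "alice", 0.42),
    SourcePost("https://example.com/a/2", "bob", 0.91),
    SourcePost("https://example.com/a/3", "carol", 0.77),
]
print(attach_attribution("Use a context manager to close the file.", posts, top_k=2))
```

In this sketch the credit travels with the answer text itself, so a reader can always follow a link back to the people who wrote the underlying posts. How relevance is actually scored is left to the partner and is not specified here.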

Content development is human driven

The best content has a human at its core. GenAI can only respond with what has already been published; it won’t provide any data or insightful feedback on anything created after its last data ingestion. What does a human component provide? Humans provide up-to-date information, identify patterns early on, and add signals around the social value of knowledge within a community, providing context to the content. As such, questions on Stack Overflow (whether written by a community member or assisted and curated by AI) are posted only after human review. In addition, community answers should be derived from quality, accurate, sourced data. These practices will ultimately give LLM products the opportunity to interact with the community directly, elevating their training and reconfirming their sourced data via human interaction.

Feedback on AI from communities enables innovation

The two-way communication loop is as important as ever. Whether it be a tech tool or the person standing in front of you, no one likes being spoken at; we need an open dialogue. In working with GenAI, as with any other process, meaningful conversation, or learning experience, only through feedback and transparent dialogue can we expect this technology to improve and benefit its user base. Community platforms that fuel LLMs should be recognized for their contributions so they continue to see the value both human and AI counterparts bring to larger communities. In return, community feedback can provide more efficient and effective ways to build and improve models.

Technical communities and AI are “better together.”

We believe that a community can play a crucial role in how AI accelerates, ultimately helping with the quality coming out of the AI offerings. Stack Overflow’s role is to bring the power of the developer community and the technological power of AI together.

AI-powered code generators should help developers spend less time writing boilerplate and repetitive code, so they can focus on solving more complex problems. While GenAI tools can add to the resources available to developers, only Stack Overflow's community expertise, grounded in accurate and sourced content, rounds out these tools as an invaluable resource for coding teams. Our holistic ‘better together’ developer approach of community + AI allows your team to:

  • Accelerate the code writing process, making developers more productive while ensuring they follow coding best practices.
  • Ensure code makes it to production more quickly. When code is created correctly the first time, code review and CI are less painful.

Our north star is offering a true collaboration between the individual, AIs, and a global community—working in unison to solve problems, save time/frustration for developers, and speed up innovation responsibly.

Continue to visit Stack Overflow Labs, our hub for innovation and experimentation, for further updates and details as we build out our roadmap and share insights into what is coming next on this new journey in AI.

