From training to inference: The new role of web data in LLMs
Data has always been key to LLM success, but it's becoming key to inference-time performance as well.

Stack Overflow CEO Prashanth Chandrasekar sat down with Ryan at HumanX 2025 to talk about how Stack is integrating AI into its public platform, the enormous importance of a high-quality knowledge base in your AI journey, how AI tools are empowering junior developers to build better software, and much more.
Jeremy “Jezz” Kellway, VP of Engineering for Analytics and Data & AI at EDB (Enterprise Database), joins Ryan for a conversation about Postgres and AI. They unpack how Postgres is becoming the standard database for AI applications, the importance of managing unstructured data, and the implications of data sovereignty and governance in AI.
Minh Nguyen, VP of Engineering at Transcend, joins Ryan for a conversation about the complexities of privacy and consent in tech, from the challenges organizations face in managing data privacy to the importance of consent management tools to the evolving landscape of privacy regulations.
Ken Stott, Field CTO of API platform Hasura, tells Ryan about the data doom loop: the concept that organizations are spending lots of money on data systems without seeing improvements in data quality or efficiency.
Ben and Ryan sit down with public interest technologist Sukhi Gulati Gilbert, a senior product manager at Consumer Reports, for a conversation about digital data privacy. They talk about why digital privacy matters, the challenges consumers face in safeguarding their data, and the legislative gaps in privacy protection, along with the app Sukhi is working on, Permission Slip, which helps users exercise their rights to digital data privacy. Plus: Why it might be worth reducing your digital footprint.
Ben and Ryan are joined by Matt Zeiler, founder and CEO of Clarifai, an AI workflow orchestration platform. They talk about how the transformer architecture supplanted convolutional neural networks in AI applications, the infrastructure required for AI implementation, the implications of regulating AI, and the value of synthetic data.
Or Lenchner, CEO of Bright Data, joins Ben and Ryan for a deep-dive conversation about the evolving landscape of web data. They talk through the challenges involved in data collection, the role of synthetic data in training large AI models, and how public data access is becoming more restrictive. Or also shares his thoughts on the importance of transparency in data practices, the likely future of data regulation, and the philosophical implications of more people using AI to innovate and solve problems.
Ben chats with Shayne Longpre and Robert Mahari of the Data Provenance Initiative about what GenAI means for the data commons. They discuss the decline of public datasets, the complexities of fair use in AI training, the challenges researchers face in accessing data, potential applications for synthetic data, and the evolving legal landscape surrounding AI and copyright.
In this episode, Ben interviews Jannis Kallinikos, a professor at Luiss University in Rome, Italy, about his new book Data Rules: Reinventing the Market Economy, coauthored with Cristina Alaimo. They discuss the social impact of data, explore the idea that data filters how we see the world and interact with each other, and highlight the need for social accountability in data tracking and surveillance.
With the ever-increasing importance of data, we’re always looking for expert voices that can expand our view of what data and our reliance on data means for software development and society as a whole. More and more of our lives are becoming data-driven. Is that a good thing?
On this episode: Stack Overflow senior data scientist Michael Geden tells Ryan and Ben about how data scientists evaluate large language models (LLMs) and their output. They cover the challenges involved in evaluating LLMs, how LLMs are being used to evaluate other LLMs, the importance of data validation, the need for human raters, and the tradeoffs involved in selecting and fine-tuning LLMs.
If you’re building experimental GenAI features that haven’t proven their product market fit, you don’t want to commit to a model that runs up costs without a return on that investment.
AI systems obey the golden rule: garbage in, garbage out. Want good results? Feed them good data.
If we can make operational data easier to manage and easier to access through simple, standardized APIs, everyone can transform their companies into sustainable data-driven organizations.
Tim Tutt, CEO and cofounder of Night Shift Development, tells the home team about his work in deploying large-scale search and discovery analytics, why he’s working to help nontechnical users understand and utilize their business data, and how GenAI is teaching people to ask better questions.
Machine learning uses data structures that don't always resemble the ones used in standard computing. You'll need to process your data first if you want efficient machine learning.
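As a minimal sketch of the kind of preprocessing that piece is about (the records and columns here are hypothetical, not from the article), here's how mixed tabular data might be turned into the dense numeric arrays most ML libraries expect:

```python
# A minimal sketch: converting tabular records into a numeric
# feature matrix. The "age"/"plan" schema is purely illustrative.
import numpy as np

records = [
    {"age": 34, "plan": "pro"},
    {"age": 27, "plan": "free"},
    {"age": 45, "plan": "pro"},
]

# Categorical values become one-hot columns; numbers pass through.
plans = sorted({r["plan"] for r in records})
features = np.array(
    [[r["age"]] + [1.0 if r["plan"] == p else 0.0 for p in plans]
     for r in records]
)
print(features)  # shape (3, 3): age, plan=free, plan=pro
```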
Bigeye cofounders Kyle Kirwan (CEO) and Egor Gryaznov (CTO) join the home team to discuss their data observability platform, what it’s like to go from coworkers to cofounders, and the surprising value of boring technology.
Distributed work may hold the key to creating forward-thinking metropolises.
When APIs send data, chances are they send it as JSON objects. Here's a primer on why JSON is how networked applications send data.
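For a concrete sense of what that looks like on the wire, here's a quick sketch using only Python's standard library: native data structures serialize to a compact, human-readable string and deserialize back losslessly, which is a big part of JSON's appeal for APIs.

```python
# JSON round trip: what an API sends, and what the receiver gets back.
import json

payload = {"user": "ada", "roles": ["admin", "editor"], "active": True}

wire = json.dumps(payload)  # the string actually sent over the network
print(wire)                 # {"user": "ada", "roles": ["admin", "editor"], "active": true}

assert json.loads(wire) == payload  # the receiver reconstructs the same object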
As with any good joke, the most important part is the resulting data.
As May is Mental Health Awareness Month, we wanted to see what developers are doing to decrease stress and prioritize their own wellness. Earlier this year, we surveyed over 800 developers to see if they are happy at work and what they are doing to maintain or improve mental health.
When the Log4j security issue was disclosed, developers came looking for answers. We took a look at our site data around it.
At this point, most software engineers see the value of testing their software regularly. But are you testing your data engineering as well?
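As one hedged example of what a data test can look like (the "orders" schema below is hypothetical, not from the article), a pipeline check might assert basic invariants before rows move downstream:

```python
# A minimal data-quality test; the order fields are illustrative only.
def test_orders_are_clean(rows):
    """Assert basic invariants before rows move downstream."""
    ids = [r["order_id"] for r in rows]
    assert len(ids) == len(set(ids)), "duplicate order_id"
    assert all(r["amount"] >= 0 for r in rows), "negative amount"
    assert all(r.get("customer_id") for r in rows), "missing customer_id"

test_orders_are_clean([
    {"order_id": 1, "amount": 19.99, "customer_id": "c-42"},
    {"order_id": 2, "amount": 0.00, "customer_id": "c-7"},
])
```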