data science

May 9, 2025

Using AI to find patient zero in marketing campaigns

Ben Popper chats with CTO Abby Kearns about how Alembic is using composite AI and lessons learned from contact tracing and epidemiology to help companies map customer journeys and understand the ROI of their marketing spend. Ben and Abby also talk about where open-source models have the edge and the challenges startups face in building trust with big companies and securing the resources they need to grow.

Eira May

0 comments

The Stack Overflow Podcast AI Open Source startups data

April 3, 2025

From training to inference: The new role of web data in LLMs

Data has always been key to LLM success, but it's becoming key to inference-time performance as well.

Or Lenchner

2 comments

data science

April 2, 2025

Not all AI is generative: Efficient scheduling with mathematics

Efficiently solving a complex scheduling problem using simulated annealing.

Subbu Sailappan, Sidharth Kumar

2 comments

What a year building AI has taught Stack Overflow

We sit down with Jessica Clark, a senior data scientist at Stack Overflow, to discuss how our company approaches generative AI and data quality.

Ben Popper

0 comments

The Stack Overflow Podcast

April 3, 2023

"Data driven" decisions aren't innovative decisions

If you want to innovate new solutions, you can't rely on data about existing solutions.

Chelsea Troy

2 comments

Code for a Living

February 24, 2023

ML and AI consulting-as-a-service (Ep. 542)

The home team talks with Jaclyn Rice Nelson, cofounder and CEO of Tribe AI, about the explosion of hype surrounding generative AI, what it’s like to work at a startup after working at Google, and how Tribe is leveraging the power of a specialist network.

Eira May

0 comments

AI Engineering machine learning ML The Stack Overflow Podcast tribe ai

March 3, 2022

Stop aggregating away the signal in your data

By aggregating our data in an effort to simplify it, we lose the signal and the context we need to make sense of what we’re seeing.

Zan Armstrong

9 comments

Code for a Living visualization contributed

January 12, 2022

Podcast 406: Making Agile work for data science

Data scientists and engineers don’t always play well together. We discuss an approach to your tech stack that can bring them together.

Ryan Donovan

0 comments

agile The Stack Overflow Podcast

December 30, 2021

How often do people actually copy and paste from Stack Overflow? Now we know.

April Fool's may be over, but once we set up a system to react every time someone typed Command+C, we realized there was also an opportunity to learn about how people use our site. Here’s what we found.

Ben Popper, David Gibson

101 comments

april fools Community copying code Insights

November 15, 2021

Building a QA process for your deep learning pipeline in practice

Deep learning models still need testing, but many common testing approaches don't apply. With the right methods, you can still make sure your pipeline produces good results.

Tobias Kupek

1 comment

Code for a Living data testing machine learning testing

September 13, 2021

Why your data needs a QA process

At this point, most software engineers see the value of testing their software regularly. But are you testing your data engineering as well?

Corissa E Haury

4 comments

Code for a Living data data testing

March 9, 2021

Level Up: Mastering statistics with Python - part 5

Rather than dig into complex math or over-simplify by using a pre-written function, we'll write our own binomial test function, primarily using base Python. In the process, we'll learn more about how hypothesis testing works and build intuition for how to interpret a p-value.

Ben Popper, Sophie Sommer

0 comments

Code for a Living codecademy level up python statistics

February 23, 2021

Level Up: Mastering statistics with Python - part 2

Investigate a dataset with summary statistics and some basic data visualizations using the Python libraries NumPy, pandas, matplotlib, and Seaborn.

Ben Popper, Sophie Sommer

0 comments

Code for a Living codecademy Engineering statistics

February 16, 2021

Level Up: Mastering statistics with Python

In today’s tech industry, statistics and data science are becoming increasingly important and valuable skills.

Ben Popper

5 comments

Code for a Living codecademy education learning to code statistics

October 12, 2020

How to put machine learning models into production

The goal of building a machine learning model is to solve a problem, and a machine learning model can only do so when it is in production and actively in use by consumers. As such, model deployment is as important as model building.