
data

Tragedy of the (data) commons

Ben chats with Shayne Longpre and Robert Mahari of the Data Provenance Initiative about what GenAI means for the data commons. They discuss the decline of public datasets, the complexities of fair use in AI training, the challenges researchers face in accessing data, potential applications for synthetic data, and the evolving legal landscape surrounding AI and copyright.

On the web, data doesn’t define us. It creates us.

In this episode, Ben interviews Jannis Kallinikos, a professor at Luiss University in Rome, Italy, about his new book Data Rules: Reinventing the Market Economy, coauthored with Cristina Alaimo. They discuss the social impact of data, explore the idea that data filters how we see the world and interact with each other, and highlight the need for social accountability in data tracking and surveillance.

How do you evaluate an LLM? Try an LLM.

On this episode: Stack Overflow senior data scientist Michael Geden tells Ryan and Ben about how data scientists evaluate large language models (LLMs) and their output. They cover the challenges involved in evaluating LLMs, how LLMs are being used to evaluate other LLMs, the importance of data validation, the need for human raters, and the other needs and tradeoffs involved in selecting and fine-tuning LLMs.