Ongoing community data protection
Socially responsible use of community data needs to be mutually beneficial: the more potential partners are willing to contribute to community development, the more access to community content they receive.
The entire AI ecosystem is at risk without trust.
The internet is changing once again: it is becoming more fragmented as the separation between sources of knowledge and how users interact with that knowledge grows.
If you’re weary of reading about the latest chatbot innovations and the nine ways AI will change your daily life next year, this series of posts may be for you.
Masked self-attention is the key building block that allows LLMs to learn rich relationships and patterns between the words of a sentence. Let’s build it together from scratch.
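As a taste of what building it from scratch looks like, here is a minimal sketch of masked (causal) self-attention in NumPy. The function name, shapes, and random weights are illustrative assumptions, not code from the post itself:

```python
import numpy as np

def masked_self_attention(x, w_q, w_k, w_v):
    """x: (seq_len, d_model); w_q/w_k/w_v: (d_model, d_head)."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v           # project tokens to queries/keys/values
    scores = q @ k.T / np.sqrt(k.shape[-1])       # scaled dot-product similarity
    # Causal mask: position i may only attend to positions <= i,
    # so the model can't "peek" at future words during training.
    mask = np.triu(np.ones(scores.shape, dtype=bool), k=1)
    scores[mask] = -np.inf
    # Row-wise softmax turns scores into attention weights.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v                            # weighted sum of value vectors

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))                      # 4 tokens, 8-dim embeddings
w = [rng.normal(size=(8, 8)) for _ in range(3)]
out = masked_self_attention(x, *w)
print(out.shape)  # (4, 8): one output vector per token
```

Because of the mask, each token's output depends only on itself and earlier tokens, which is exactly what lets a decoder-only LLM be trained to predict the next word.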
How are developers actually using GenAI-powered coding tools now that some of the initial hype has faded?
Founder and entrepreneur Jyoti Bansal tells Ben, Cassidy, and Eira about the developer challenges he aims to solve with his new venture, Harness, an AI-driven software development platform meant to take the pain out of DevOps. Jyoti shares his journey as a founder, his perspective on the venture capital landscape, and the reasoning behind his decision to raise debt capital for Harness.
Ben chats with Gias Uddin, an assistant professor at York University in Toronto, where he teaches software engineering, data science, and machine learning. His research focuses on designing intelligent tools for testing, debugging, and summarizing software and AI systems. He recently published a paper about detecting errors in code generated by LLMs. Gias and Ben discuss the concept of hallucinations in AI-generated code, the need for tools to detect and correct those hallucinations, and the potential for AI-powered tools to generate QA tests.
Ryan chats with Russ d’Sa, cofounder and CEO of LiveKit, about multimodal AI and the technology that makes it possible. They talk through the tech stack required, including the use of WebRTC and UDP protocols for real-time audio and video streaming. They also explore the big challenges involved in ensuring privacy and security in streaming data, namely end-to-end encryption and obfuscation.
Ben and Ryan talk to Scott McCarty, Global Senior Principal Product Manager for Red Hat Enterprise Linux, about the intersection between LLMs (large language models) and open source. They discuss the challenges and benefits of open-source LLMs, the importance of attribution and transparency, and the revolutionary potential for LLM-driven applications. They also explore the role of LLMs in code generation, testing, and documentation.
More code isn't always a good thing, but fewer bugs always are.
The decoder-only transformer architecture is one of the most fundamental ideas in AI research.
Ben and Ryan are joined by Robin Gupta for a conversation about benchmarking and testing AI systems. They talk through the lack of trust and confidence in AI, the inherent challenges of nondeterministic systems, the role of human verification, and whether we can (or should) expect an AI to be reliable.
Product manager Ash Zade joins the home team to talk about the journey to OverflowAI, a GenAI-powered add-on for Stack Overflow for Teams that’s available now. Ash describes how his team built Enhanced Search, the problems they set out to solve, how they ensured data quality and accuracy, the role of metadata and prompt engineering, and the feedback they’ve gotten from users so far.
Ben and Ryan explore why configuration is so complicated, the right to repair, the best programming languages for beginners, how AI is grading exams in Texas, Automattic’s $125M acquisition of Beeper, and why a major US city’s train system still relies on floppy disks. Plus: The unique challenge of keeping up with a field that’s changing as rapidly as GenAI.
In the wake of the XZ backdoor, Ben and Ryan unpack the security implications of relying on open-source software projects maintained by small teams. They also discuss the open-source nature of Linux, the high cost of education in the US, the value of open-source contributions for job seekers, and what Apple is up to AI-wise.
The home team is joined by Michael Foree, Stack Overflow’s director of data science and data platform, and occasional cohost Cassidy Williams, CTO at Contenda, for a conversation about long context windows, retrieval-augmented generation, and how Databricks’ new open LLM could change the game for developers. Plus: How will FTX co-founder Sam Bankman-Fried’s sentence of 25 years in prison reverberate in the blockchain and crypto spaces?
Ben and Ryan talk about how tiny nations are making huge money from their domain names, the US government’s antitrust case against Apple, the implications of a four-day work week, Reddit’s IPO, and more.
Ben and Ryan are joined by Bill Harding, CEO of GitClear, for a discussion of AI-generated code quality and its impact on productivity. GitClear’s research has highlighted the fact that while AI can suggest valid code, it can’t necessarily reuse and modify existing code—a recipe for long-term challenges in maintainability and test coverage if devs are too dependent on AI code-gen tools.
The home team discusses the challenges (hardware and otherwise) of building AI models at scale, why major players like Meta are open-sourcing their AI projects, what Apple’s recent changes mean for developers in the EU, and Perplexity AI’s new approach to search.
Ben talks with Ryan Polk, Chief Product Officer at Stack Overflow, about our strategic partnership with Google Cloud, the importance of collaboration between AI companies and the Stack Overflow community, and why Stack Overflow’s Q&A format is so suitable for training AI models.
Machine learning scientist, author, and LLM developer Maxime Labonne talks with Ben and Ryan about his role as lead machine learning scientist, his contributions to the open-source community, the value of retrieval-augmented generation (RAG), and the process of fine-tuning and unfreezing layers in LLMs. The team talks through various challenges and considerations in implementing GenAI, from data quality to integration.
Ryan and Ben chat with Shivang Shah, Chief Architect, and Jon Fasoli, Chief Design & Product Officer, both of Intuit Mailchimp, about implementing GenAI and how all the pieces came together to make a better end user experience.
This is part two of our conversation with Roie Schwaber-Cohen, Staff Developer Advocate at Pinecone, about retrieval-augmented generation (RAG) and why it’s crucial for the success of your AI initiatives.