From the blog
How do you evaluate an LLM? Try an LLM.
On this episode: Stack Overflow senior data scientist Michael Geden tells Ryan and Ben about how data scientists evaluate large language models (LLMs) and their output. They cover the challenges involved in evaluating LLMs, how LLMs are being used to evaluate other LLMs, the importance of data validation, the need for human raters, and the needs and tradeoffs involved in selecting and fine-tuning LLMs.
How to succeed as a data engineer without the burnout
The key strategies for building a headache-free data platform.
Diverting more backdoor disasters
In the wake of the XZ backdoor, Ben and Ryan unpack the security implications of relying on open-source software projects maintained by small teams. They also discuss the open-source nature of Linux, the high cost of education in the US, the value of open-source contributions for job seekers, and what Apple is up to AI-wise.
Move faster and safer using feature flags on AWS
Learn to implement practical DevOps techniques based on how Amazon and AWS enhance the speed, availability, and security of their software through the use of feature flags. The discussion will cover topics such as release flags, trunk-based development, and the A/B testing methods employed by Amazon.com and AWS services.
If everyone is building AI, why aren't more projects in production?
Ben talks with Shane McAllister, lead developer advocate at MongoDB, Stanimira Vlaeva, senior developer advocate at MongoDB, and Miku Jha, director, AI/ML and generative AI at Google Cloud, about the challenges and opportunities of operationalizing and scaling generative AI models in enterprise organizations.
Interesting questions
How was Rome able to conscript and equip 400k soldiers during 2nd Punic War in a pre-industrial society?
If you want people to fight your wars, you need to make it worth their while.
Should I disclose a mental disorder that's been impacting my job to HR/my boss?
“Your employer doesn’t need to know the nature of the condition, but let them know if there are any accommodations you need in the meantime.”
Does the success of AI (Large Language Models) support Wittgenstein's position that "meaning is use"?
Well, that all depends on what you mean by "use." And "meaning."
After creating HTML, why did Tim Berners-Lee bother creating HTTP? Why didn't he just write a HTML renderer for a FTP client?
Have you ever tried using FTP without a nice client?
Links from around the web
A single atom layer of gold
Scientists were able to make a thin sheet of gold that is literally a single atom thick. There are some cool applications for chemical production and conversion.
Anchor position tool
CSS Anchor Positioning is coming soon to a browser near you, and here's how it works!
America’s young farmers are burning out. I quit, too
A lot of us have fantasies of leaving tech and running off and starting a farm...but that's not as easy as it sounds.
Trip report: Node.js collaboration summit (2024 London)
Node runs a lot of the internet. Here's what's next.
Looking for the tools, technologies, and skills your team needs to evolve in the AI era? Stack Overflow's Industry Guide to AI has your answers.