May 24, 2024

Would you board a plane safety-tested by GenAI?

Ben and Ryan are joined by Robin Gupta for a conversation about benchmarking and testing AI systems. They talk through the lack of trust and confidence in AI, the inherent challenges of nondeterministic systems, the role of human verification, and whether we can (or should) expect an AI to be reliable.

Credit: Alexandra Francis

Robin is the author of a practical handbook for Selenium test automation.

Connect with Robin on LinkedIn, Twitter, or via his website.

Shoutout to user2651084, who earned a Great Question badge by asking "How do I reset the Jupyter/IPython input prompt numbering?"
