background July 10, 2009

This Just In: Stack Overflow Defeats Google

By Jeff Atwood, Co-Founder (Former)

We knew in our hearts this day would come: Stack Overflow has defeated Google!

On July 2, from 6:45 AM PDT until 12:35 PM PDT, Google App Engine (App Engine) experienced an outage that ranged from partial to complete.

Following is a timeline of events, an analysis of the technology and process failures, and a set of steps the team is committed to taking to prevent such an outage from happening again. The App Engine outage was due to complete unavailability of the datacenter’s persistence layer, GFS, for approximately three hours.

The GFS failure was abrupt for reasons described below, and as a consequence the data belonging to App Engine applications remained resident on GFS servers and was unreachable during this period. Since needed application data was completely unreachable for a longer-than-expected time period, we could not follow the usual procedure of serving App Engine applications from an alternate datacenter, because doing so would have resulted in inconsistent or unavailable data for applications.

The root cause of the outage was a bug in the GFS Master server caused by another client in the datacenter sending it an improperly formed filehandle which had not been safely sanitized on the server side, and thus caused a stack overflow on the Master when processed.

This is excerpted from a newsgroup posting by App Engine PM Chris Beckmann, and was forwarded along to me by Lenny Rachitsky.
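
The posting gives no detail about the Master's code or the filehandle format, so the following is only a toy sketch of the general failure mode it names: input that is processed recursively without validation can exhaust the stack. Everything here is hypothetical, including the idea of a handle as a chain of parent links, the `table` lookup, and the `MAX_DEPTH` limit. Python raises `RecursionError` rather than crashing outright, which stands in for the stack overflow.

```python
# Hypothetical illustration only: this is not GFS code, and the real
# filehandle format is not described in the posting.

MAX_DEPTH = 64  # assumed limit, chosen arbitrarily for this sketch


class MalformedHandleError(ValueError):
    """Raised when a handle fails validation."""


def resolve_unchecked(handle, table):
    """Follow parent links recursively, trusting the input completely."""
    parent = table.get(handle)
    if parent is None:
        return [handle]
    # A cyclic or very long chain recurses until the stack is exhausted.
    return resolve_unchecked(parent, table) + [handle]


def resolve_checked(handle, table, max_depth=MAX_DEPTH):
    """Follow parent links iteratively, rejecting cycles and deep chains."""
    path = []
    seen = set()
    current = handle
    while current is not None:
        if current in seen:
            raise MalformedHandleError(f"cycle detected at {current!r}")
        if len(path) >= max_depth:
            raise MalformedHandleError("handle chain too deep")
        seen.add(current)
        path.append(current)
        current = table.get(current)
    path.reverse()  # root first, requested handle last
    return path


if __name__ == "__main__":
    well_formed = {"c": "b", "b": "a"}
    print(resolve_checked("c", well_formed))

    malformed = {"x": "y", "y": "x"}  # a cycle: x -> y -> x -> ...
    try:
        resolve_unchecked("x", malformed)
    except RecursionError:
        print("unchecked version ran out of stack")
    try:
        resolve_checked("x", malformed)
    except MalformedHandleError as err:
        print(f"checked version rejected the input: {err}")
```

The checked version trades recursion for a loop with explicit bounds, so a bad handle costs one rejected request instead of taking the whole process down.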

In other, less amusing news, there will be no podcast this week. But don’t fret — next week, we will have the ineffable Miguel de Icaza of Mono fame. Joel and I are both big fans, so this one should be fun.

