Ops teams are pets, not cattle (Ep. 562)

A common refrain you’ll hear these days is that servers should be scaled out, easy to replace, and interchangeable—cattle, not pets. But for the ops folks who run those servers the opposite is true. You can’t just throw any of them into an incident where they may not know the stack or system and expect everything to work out. Every operator has a set of skills that they’ve built up through research or experience, and teams should value them as such. They’re people, not pets, and certainly not cattle—you can’t just get a new one when you burn out your existing ones.

On this episode of the podcast—sponsored by Chronosphere—we talk with Paige Cruz, Senior Developer Advocate at Chronosphere, about how teams can reduce the cognitive load on ops, the best ways to prepare for inevitable failures, and where the worst place to page Paige is.

Episode notes:

Chronosphere provides an observability platform for ops people, so naturally, the company has an interest in the happiness of those people.

If you’re interested in the history of the pets vs. cattle concept, this covers it pretty well.

Previously, we spoke with the CEO of Chronosphere about making incidents easier to manage.

We’ve covered this topic on the blog before, and two articles came up during our conversation with Paige.

You can connect with Paige on Twitter, where she has a pretty apropos handle.

Congrats to Stellar Question badge winner Bruno Rocha for asking “How can I read large text files line by line, without loading them into memory?”, which at least 100 users liked enough to bookmark.
