partner-content October 27, 2021

A murder mystery: who killed our user experience?

On this sponsored episode of the Stack Overflow Podcast, we talk with Greg Leffler of Splunk about the keys to instrumenting an observable system and how the OpenTelemetry standard allows you to avoid vendor lock-in.
Ryan Donovan
Content Marketer

The infrastructure that networked applications live on is getting more and more complicated. There was a time when you could serve an application from a single machine on premises. But now, with cloud computing offering painless scaling to meet your demand, your infrastructure becomes abstracted and not really something you have contact with directly. Compound that problem with architecture spread across dozens, even hundreds, of microservices, replicated across multiple data centers in an ever-changing cloud, and tracking down the source of system failures becomes something like a murder mystery. Who shot our uptime in the foot?

A good observability system helps with that. On this sponsored episode of the Stack Overflow Podcast, we talk with Greg Leffler of Splunk about the keys to instrumenting an observable system and how the OpenTelemetry standard makes observability easier, even if you aren’t using Splunk’s product. 

Observability is really an outgrowth of traditional monitoring. You expect that some service or system could break, so you keep an eye on it. But observability applies that monitoring to an entire system and gives you the ability to answer the unexpected questions that come up. It uses three principal ways of viewing system data: logs, traces, and metrics.

A metric is a number paired with a timestamp that tells you one particular detail about the system. Traces follow a request through a system. And logs are the causes and effects recorded from a system in motion. Splunk wants to add a fourth, events, that would track specific user events and browser failures.
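To make those three views concrete, here is a minimal Python sketch of what each kind of record might carry. The class and field names are hypothetical and chosen for illustration; this is not the OpenTelemetry data model or Splunk's schema.

```python
from dataclasses import dataclass, field
from typing import Optional
import time
import uuid


@dataclass
class Metric:
    """A named number paired with the time it was measured."""
    name: str
    value: float
    timestamp: float = field(default_factory=time.time)


@dataclass
class Span:
    """One step of a request; spans that share a trace_id form a trace."""
    trace_id: str
    name: str
    parent_id: Optional[str] = None
    span_id: str = field(default_factory=lambda: uuid.uuid4().hex[:16])


@dataclass
class LogRecord:
    """A timestamped message, optionally tied to the trace that produced it."""
    message: str
    trace_id: Optional[str] = None
    timestamp: float = field(default_factory=time.time)


# One hypothetical request seen through all three views.
trace_id = uuid.uuid4().hex
root = Span(trace_id, "GET /checkout")
child = Span(trace_id, "charge-card", parent_id=root.span_id)
latency = Metric("checkout.latency_ms", 182.0)
log = LogRecord("card declined", trace_id=trace_id)
```

The shared `trace_id` is what lets you jump from a log line to the request that produced it, which is the kind of unexpected question observability is meant to answer.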

Before you can observe all that data, you have to instrument your system to produce it. Greg and his colleagues at Splunk are huge fans of OpenTelemetry. It’s an open standard for collecting telemetry data that isn’t tied to any one observability platform. You instrument your application once and don’t have to redo that work, even if you need to change your observability platform.
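The "instrument once, switch backends later" idea can be sketched in a few lines of Python. This is only an illustration of the pattern under simplifying assumptions, not the OpenTelemetry API: `Exporter`, `ConsoleExporter`, `InMemoryExporter`, `Telemetry`, and `handle_checkout` are all hypothetical names.

```python
from typing import List, Protocol, Tuple


class Exporter(Protocol):
    """Anything that can ship a measurement to some backend."""
    def export(self, name: str, value: float) -> None: ...


class ConsoleExporter:
    """Stand-in for one vendor's backend: prints each measurement."""
    def export(self, name: str, value: float) -> None:
        print(f"{name}={value}")


class InMemoryExporter:
    """Stand-in for a different backend: keeps measurements in a list."""
    def __init__(self) -> None:
        self.records: List[Tuple[str, float]] = []

    def export(self, name: str, value: float) -> None:
        self.records.append((name, value))


class Telemetry:
    """The only object application code talks to."""
    def __init__(self, exporter: Exporter) -> None:
        self.exporter = exporter

    def record(self, name: str, value: float) -> None:
        self.exporter.export(name, value)


def handle_checkout(telemetry: Telemetry, cart_total: float) -> None:
    # Application code is instrumented once and never names a backend.
    telemetry.record("checkout.total", cart_total)


# Swapping backends changes one line of setup, not the application code.
backend = InMemoryExporter()
handle_checkout(Telemetry(backend), 42.5)
handle_checkout(Telemetry(ConsoleExporter()), 42.5)
```

The design choice is that the application depends only on the neutral `Telemetry` interface, so changing where the data goes is a configuration decision rather than a rewrite.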

Why use an approach that makes it easy for a client to switch vendors? Leffler and Splunk argue that it’s not only better for customers, but for Splunk and the observability industry as a whole. If you’ve instrumented your system with a vendor-locked solution, then you may not switch; you may just let your observability program fall by the wayside. That helps exactly no one.

As we’ve seen, people are moving to the cloud at an ever faster pace. That’s no surprise; it offers automatic scaling for arbitrary traffic volumes, high availability, and worry-free infrastructure failure recovery. But moving to the cloud can be expensive, and you have to do some work with your application to be able to see everything that’s going on inside it. Plenty of people just throw everything into the cloud and let the provider handle it, which is fine until they see the bill.

Observability based on an open standard makes it easier for everyone to build a more efficient and robust service in the cloud. Give the episode a listen and let us know what you think in the comments.

