Building out a managed Kubernetes service is a bigger job than you think (Ep.443)

Infrastructure as code is complicated enough, but building a managed IaC service is a whole other level of complicated.


You may be running your code in containers. You might even have taken the plunge and orchestrated it all with YAML code through Kubernetes. But infrastructure as code becomes a whole new level of complicated when setting up a managed Kubernetes service.

On this sponsored episode of the Stack Overflow podcast, Ben and Ryan talk with David Dymko and Walt Ribeiro of Vultr about what they went through to build their managed Kubernetes service as a cloud offering. It was a journey that ended not just with a managed K8s service, but also with a wealth of additional tooling, upgrades, and open sourcing.

When building out a Kubernetes implementation, you can abstract away some of the complexity, especially if you use some of the more popular tools like kubeadm or Kubespray. But when using a managed service, you want to be able to focus on your workloads and only your workloads, which means taking away the control plane. The user doesn’t need to care about the underlying infrastructure, but for those designing it, the missing control plane opens up a whole heap of trouble.

Once you remove this abstraction, your cloud cluster is treated as a single solid unit of compute. But then how do you do upgrades? How do you maintain X.509 certificates for HTTPS calls? How do you get metrics? Without the control plane, Vultr needed to communicate with their Kubernetes worker nodes through the API. And wouldn’t you know it: the API isn’t all that well-documented.

They took it back to bare necessities, the MVP feature set of their K8s cloud service. They’d need the Cloud Controller Manager (CCM) and the Container Storage Interface (CSI) as core components to have Vultr be a first-class citizen on a Kubernetes cluster. They built a Go client to interface using those components and figured, hey, why not open-source this? That led to a few other open-source projects, like a Terraform integration and a command-line interface.

This was the start of a two-year journey connecting all the dots that this project required. They needed a managed load balancer that could work without the control plane or any of the tools that interfaced with it. They built it. They needed a quality-of-life update to their API to catch up with everything that today’s developer expects: modern CRUD actions, REST best practices, and pagination. All the while, they kept listening to their customers to make sure they didn’t stray too far from the original product.
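
For readers who have not worked with a paginated API, here is a minimal sketch of what following pages looks like from the client side. It is not Vultr's API or client: the `list_nodes` function, the `next_cursor` field, and the in-memory `NODES` list are all hypothetical stand-ins for a real REST endpoint, chosen only to show the general cursor-following pattern.

```python
from typing import Callable, Iterator, Optional

# Hypothetical in-memory data standing in for a server-side collection.
NODES = [f"node-{i}" for i in range(1, 8)]


def list_nodes(cursor: Optional[str] = None, per_page: int = 3) -> dict:
    """Return one page of items plus a cursor pointing at the next page.

    Assumption: the cursor is just a stringified offset. Real APIs usually
    hand back an opaque token instead.
    """
    start = int(cursor) if cursor else 0
    end = start + per_page
    next_cursor = str(end) if end < len(NODES) else None
    return {"items": NODES[start:end], "next_cursor": next_cursor}


def iter_all(fetch_page: Callable[..., dict], per_page: int = 3) -> Iterator[str]:
    """Yield every item by following next_cursor until none is returned."""
    cursor = None
    while True:
        page = fetch_page(cursor=cursor, per_page=per_page)
        yield from page["items"]
        cursor = page["next_cursor"]
        if cursor is None:
            break
```

Calling `list(iter_all(list_nodes))` walks every page and returns the full list, so the caller never has to think about page boundaries. That is the kind of convenience a client library can build on top of a paginated endpoint.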

To see the results of their journey, listen to the podcast and check out Vultr.com for all of their cloud offerings, available in 25 locations worldwide.


The Stack Overflow blog is committed to publishing interesting articles by developers, for developers. From time to time that means working with companies that are also clients of Stack Overflow’s through our advertising, talent, or teams business. When we publish work from clients, we’ll identify it as Partner Content with tags and by including this disclaimer at the bottom.
