code-for-a-living May 29, 2020

Why is Kubernetes getting so popular?

At the time of this article, Kubernetes is about six years old, and over the last two years, it has risen in popularity to consistently be one of the most loved platforms. This year, it comes in as the number three most loved platform. If you haven’t heard about Kubernetes yet, it’s a platform that allows you to run and orchestrate container workloads.

Containers began as a Linux kernel process isolation construct built on namespaces (2002) and cgroups (2007). They gained traction when LXC became available in 2008, and Google developed its own internal ‘run everything in containers’ mechanism called Borg. Fast forward to 2013, when Docker was released and popularized containers for the masses. At the time, Mesos was the primary tool for orchestrating containers, but it wasn’t as widely adopted. Kubernetes was open-sourced in 2014, reached version 1.0 in 2015, and quickly became the de facto container orchestration standard.

To try to understand the popularity of Kubernetes, let’s consider some questions. When was the last time developers could agree on the way to deploy production applications? How many developers do you know who run tools as is out of the box? How many cloud operations engineers today don’t understand how applications work? We’ll explore the answers in this article.

Infrastructure as YAML

Coming from the world of Puppet and Chef, one of the big shifts with Kubernetes has been the move from infrastructure as code towards infrastructure as data, specifically YAML. All Kubernetes resources, including Pods, Configurations, Deployments, and Volumes, can be expressed in a YAML file. For example:

apiVersion: v1
kind: Pod
metadata:
  name: site
  labels:
    app: web
spec:
  containers:
    - name: front-end
      image: nginx
      ports:
        - containerPort: 80

This representation makes it easier for DevOps or site reliability engineers to fully express their workloads without the need to write code in a programming language like Python, Ruby, or JavaScript.
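
One consequence of treating a manifest as data is that ordinary tooling can build and inspect it without deploying anything. As a minimal sketch, the Pod above can be modeled as a plain Python dictionary and serialized to JSON, which Kubernetes accepts alongside YAML. The `pod_manifest` and `container_images` helpers are hypothetical names chosen for illustration and are not part of any Kubernetes client library:

```python
import json


def pod_manifest(name, app, image, port):
    """Return a Pod manifest equivalent to the YAML example, as plain data."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name, "labels": {"app": app}},
        "spec": {
            "containers": [
                {
                    "name": "front-end",
                    "image": image,
                    "ports": [{"containerPort": port}],
                }
            ]
        },
    }


def container_images(manifest):
    """List the images a Pod manifest would run, without contacting a cluster."""
    return [container["image"] for container in manifest["spec"]["containers"]]


if __name__ == "__main__":
    pod = pod_manifest("site", "web", "nginx", 80)
    # The same structure as the YAML file, printed as JSON.
    print(json.dumps(pod, indent=2))
    print(container_images(pod))
```

Nothing here runs a workload: the script only produces and reads a document, which is the point of the infrastructure-as-data approach.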

Other benefits from having your infrastructure as data include:

  • GitOps or Git Operations Version Control. With this approach, you can keep all your Kubernetes YAML files under git repositories, which allows you to know precisely when a change was made, who made the change, and what exactly changed. This leads to more transparency across the organization and improves efficiency by avoiding ambiguity as to where members need to go to find what they need. At the same time, it can make it easier to automatically make changes to Kubernetes resources by just merging a pull request.
  • Scalability. Defining resources as YAML makes it easy for cluster operators to change the scaling behavior by editing one or two numbers in a Kubernetes resource. Kubernetes has Horizontal Pod Autoscalers, which let you set the minimum and maximum number of pods a specific deployment needs in order to handle low- and high-traffic periods. For example, if you are running a deployment that may need more capacity because traffic suddenly increases, you could change maxReplicas from 10 to 20:

apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: myapp
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myapp-deployment
  minReplicas: 1
  maxReplicas: 20
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50

  • Security and Controls. YAML is a great way to validate what gets deployed in Kubernetes and how. For example, one of the significant security concerns is whether your workloads run as a non-root user. We can use tools like conftest, a YAML/JSON validator, together with the Open Policy Agent, a policy validator, to check that the SecurityContext of your workloads doesn’t allow a container to run as root. For that, users can use a simple Open Policy Agent rego policy like this:

package main

deny[msg] {
  input.kind = "Deployment"
  not input.spec.template.spec.securityContext.runAsNonRoot = true
  msg = "Containers must not run as root"
}

  • Cloud Provider Integrations. One of the major trends in the tech industry is to run workloads on public cloud providers. With the help of the cloud-provider component, Kubernetes allows every cluster to integrate with the cloud provider it’s running on. For example, if a user is running an application in Kubernetes on AWS and wants that application to be accessible through a service, the user can create a service of type LoadBalancer, and the cloud provider integration will automatically provision an Amazon Elastic Load Balancer to forward traffic to the application pods.
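
To make the Scalability item above concrete, here is a minimal sketch of the replica calculation that the Kubernetes documentation describes for the Horizontal Pod Autoscaler: the desired count is the current count scaled by the ratio of observed to target utilization, rounded up, and then clamped to minReplicas and maxReplicas. The function name `desired_replicas` is hypothetical, and the sketch deliberately ignores details of the real controller such as pod readiness and stabilization windows:

```python
import math


def desired_replicas(current_replicas, current_utilization, target_utilization,
                     min_replicas, max_replicas, tolerance=0.1):
    """Approximate the HPA replica calculation for a single CPU metric.

    Utilization values are percentages, e.g. 50 for averageUtilization: 50.
    """
    ratio = current_utilization / target_utilization
    if abs(1.0 - ratio) <= tolerance:
        # Close enough to the target: leave the replica count unchanged.
        desired = current_replicas
    else:
        desired = math.ceil(current_replicas * current_utilization / target_utilization)
    # The autoscaler never goes outside the configured bounds.
    return max(min_replicas, min(max_replicas, desired))


if __name__ == "__main__":
    # 10 pods at 90% CPU against a 50% target want 18 pods.
    print(desired_replicas(10, 90, 50, min_replicas=1, max_replicas=10))  # 10
    print(desired_replicas(10, 90, 50, min_replicas=1, max_replicas=20))  # 18
```

With maxReplicas at 10 the deployment stays pinned at 10 pods even though the load calls for 18, which is why raising the limit to 20 in the YAML changes the outcome.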

Extensibility

Kubernetes is very extensible, and developers love that. There are a set of existing resources like Pods, Deployments, StatefulSets, Secrets, ConfigMaps, etc. However, users and developers can add more resources in the form of Custom Resource Definitions. For example, if we’d like to define a CronTab resource, we could do it with something like this:

apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: crontabs.my.org
spec:
  group: my.org
  versions:
    - name: v1
      served: true
      storage: true
      schema:
        openAPIV3Schema:
          type: object
          properties:
            spec:
              type: object
              properties:
                cronSpec:
                  type: string
                  pattern: '^(\d+|\*)(/\d+)?(\s+(\d+|\*)(/\d+)?){4}$'
                image:
                  type: string
                replicas:
                  type: integer
                  minimum: 1
                  maximum: 10
  scope: Namespaced
  names:
    plural: crontabs
    singular: crontab
    kind: CronTab
    shortNames:
    - ct

We can create a CronTab resource later with something like this:

apiVersion: "my.org/v1"
kind: CronTab
metadata:
  name: my-cron-object
spec:
  cronSpec: "* * * * */5"
  image: my-cron-image
  replicas: 5
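
The schema in the CustomResourceDefinition is what lets the API server reject malformed CronTab objects. As a rough local approximation of that validation (not the API server's actual implementation), the sketch below applies the same `cronSpec` pattern and `replicas` bounds to a spec held as a Python dictionary. The `validate_crontab_spec` helper is a hypothetical name used only for illustration:

```python
import re

# Same pattern and bounds as the openAPIV3Schema in the CRD above.
CRON_SPEC_PATTERN = re.compile(r'^(\d+|\*)(/\d+)?(\s+(\d+|\*)(/\d+)?){4}$')
MIN_REPLICAS = 1
MAX_REPLICAS = 10


def validate_crontab_spec(spec):
    """Return a list of validation errors for a CronTab spec (empty if valid)."""
    errors = []

    cron_spec = spec.get("cronSpec")
    if cron_spec is not None:
        if not isinstance(cron_spec, str):
            errors.append("cronSpec must be a string")
        elif not CRON_SPEC_PATTERN.search(cron_spec):
            errors.append("cronSpec does not match the required pattern")

    replicas = spec.get("replicas")
    if replicas is not None:
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(replicas, bool) or not isinstance(replicas, int):
            errors.append("replicas must be an integer")
        elif not MIN_REPLICAS <= replicas <= MAX_REPLICAS:
            errors.append("replicas must be between 1 and 10")

    return errors


if __name__ == "__main__":
    print(validate_crontab_spec({"cronSpec": "* * * * */5", "replicas": 5}))
    print(validate_crontab_spec({"cronSpec": "every minute", "replicas": 11}))
```

The first call returns an empty list because the spec from the example satisfies both constraints; the second reports one error for each field.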

Another form of Kubernetes extensibility is the ability for developers to write their own Operators: processes that run in a Kubernetes cluster and follow the control loop pattern. An Operator lets users automate the management of custom resources, the objects defined by CRDs (custom resource definitions), by talking to the Kubernetes API.
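
The control loop pattern is easier to see in code. The sketch below is a simplified, in-memory illustration: plain dictionaries stand in for the desired state (what the custom resources declare) and the observed state (what actually exists), and no Kubernetes API is called. All function names are hypothetical, and a real Operator would watch the API server and run indefinitely rather than stop after a few passes:

```python
def reconcile(desired, observed):
    """Return the actions needed to move observed state toward desired state."""
    actions = []
    for name in sorted(desired):
        if name not in observed:
            actions.append(("create", name))
        elif observed[name] != desired[name]:
            actions.append(("update", name))
    for name in sorted(observed):
        if name not in desired:
            actions.append(("delete", name))
    return actions


def apply_actions(actions, desired, observed):
    """Apply reconcile actions to the in-memory observed state."""
    for verb, name in actions:
        if verb == "delete":
            del observed[name]
        else:
            # Both "create" and "update" copy the desired spec into place.
            observed[name] = dict(desired[name])


def control_loop(desired, observed, max_iterations=5):
    """Reconcile until nothing is left to do; return the number of passes that made changes."""
    for iteration in range(max_iterations):
        actions = reconcile(desired, observed)
        if not actions:
            return iteration
        apply_actions(actions, desired, observed)
    return max_iterations


if __name__ == "__main__":
    desired = {"my-cron-object": {"replicas": 5}}
    observed = {"stale-object": {"replicas": 1}}
    print(reconcile(desired, observed))
    control_loop(desired, observed)
    print(observed == desired)
```

The key design choice is that `reconcile` only compares states and never assumes what happened earlier, so the loop converges no matter how the observed state drifted.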

The community has several tools that allow developers to create their own Operators. One of those tools is the Operator Framework and its Operator SDK. The SDK provides a skeleton for developers to get started creating an operator very quickly. For example, you can get started on its command line with something like this:

$ operator-sdk new my-operator --repo github.com/myuser/my-operator

This creates the whole boilerplate for your operator, including YAML files and Golang code:

.
|____cmd
| |____manager
| | |____main.go
|____go.mod
|____deploy
| |____role.yaml
| |____role_binding.yaml
| |____service_account.yaml
| |____operator.yaml
|____tools.go
|____go.sum
|____.gitignore
|____version
| |____version.go
|____build
| |____bin
| | |____user_setup
| | |____entrypoint
| |____Dockerfile
|____pkg
| |____apis
| | |____apis.go
| |____controller
| | |____controller.go

Then you can add APIs and a controller like this:

$ operator-sdk add api --api-version=myapp.com/v1alpha1 --kind=MyAppService

$ operator-sdk add controller --api-version=myapp.com/v1alpha1 --kind=MyAppService

And finally build and push the operator to your container registry:

$ operator-sdk build your.container.registry/youruser/myapp-operator

If developers need to have even more control, they can modify the boilerplate code in the Golang files. For example, to modify the specifics of the controller, they can make changes to the controller.go file.

Another project, KUDO, allows you to create operators using only declarative YAML files. For example, its operator for Apache Kafka is defined this way, and it allows users to install a Kafka cluster on top of Kubernetes with a couple of commands:

$ kubectl kudo install zookeeper
$ kubectl kudo install kafka

You can then tune it with another command:

$ kubectl kudo install kafka --instance=my-kafka-name \
            -p ZOOKEEPER_URI=zk-zookeeper-0.zk-hs:2181 \
            -p ZOOKEEPER_PATH=/my-path -p BROKER_CPUS=3000m \
            -p BROKER_COUNT=5 -p BROKER_MEM=4096m \
            -p DISK_SIZE=40Gi -p MIN_INSYNC_REPLICAS=3 \
            -p NUM_NETWORK_THREADS=10 -p NUM_IO_THREADS=20

Innovation

Over the last few years, Kubernetes has shipped a new release every three or four months, which means three or four releases every year. The number of new features being introduced hasn’t slowed, as evidenced by the more than 30 additions and changes in its latest release. Furthermore, contributions show no sign of slowing down even during these difficult times, as the Kubernetes project’s GitHub activity indicates.

The new features allow cluster operators more flexibility when running a variety of different workloads. Software engineers also love to have more controls to deploy their applications directly to production environments.

Community

Another big aspect of Kubernetes’ popularity is its strong community. For starters, Kubernetes was donated to a vendor-neutral home, the Cloud Native Computing Foundation, in 2015 as it hit version 1.0.

There is also a wide range of community SIGs (special interest groups) that target different areas of Kubernetes as the project moves forward. They continuously add new features and make it even more user friendly.

The Cloud Native Computing Foundation also organizes CloudNativeCon/KubeCon, which, as of this writing, is the largest open-source event in the world. The event, which is normally held up to three times a year, gathers thousands of technologists and professionals who want to improve Kubernetes and its ecosystem as well as make use of some of the new features released every three months.

Furthermore, the Cloud Native Computing Foundation has a Technical Oversight Committee that, together with its SIGs, looks at the foundation’s new and existing projects in the cloud-native ecosystem. Most of the projects help enhance the value proposition of Kubernetes.

Finally, I believe that Kubernetes would not have the success that it does without the community’s conscious effort to be inclusive of one another and welcoming to newcomers.

Future

One of the main challenges developers will face in the future is how to focus more on the details of their code than on the infrastructure that code runs on. Serverless is emerging as one of the leading architectural paradigms to address that challenge. There are already very advanced frameworks, such as Knative and OpenFaaS, that use Kubernetes to abstract the infrastructure away from the developer.

We’ve shown a brief peek at Kubernetes in this article, but this is just the tip of the iceberg. There are many more resources, features, and configurations users can leverage. We will continue to see new open-source projects and technologies that enhance or evolve Kubernetes, and as we mentioned, the contributions and the community aren’t going anywhere.
