Before diving into Kubernetes, it’s important to have a good understanding of containers. A good way to build that understanding is to contrast containers with virtual machines.
Virtual machines run on a physical server via a hypervisor, and VM images are not light-weight, because each one contains a complete guest operating system. In contrast, the hardware underlying a Kubernetes cluster is not a single physical or virtual server. It’s a pool of computational resources (CPU, RAM, I/O) that represents a subset of the entire hardware platform (for instance Azure). The container host’s kernel is the intermediary layer between the containers and the hardware. I would say: it’s not the OS, but a set of cross-machine OS services. Only the application-level OS dependencies (libraries and binaries) are part of the container image itself, and they are consumed by the Docker container engine.
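To make this concrete, here is a minimal Dockerfile sketch (the base image and file names are illustrative) showing how only application-level OS dependencies end up in the image, not a full operating system:

```dockerfile
# Base layer: a slim OS userland (libraries and binaries), not a full OS with its own kernel
FROM python:3.12-slim

# Application-level dependencies are baked into the image
COPY requirements.txt .
RUN pip install -r requirements.txt

# The application itself
COPY app.py .
CMD ["python", "app.py"]
```

The resulting image shares the host kernel at runtime, which is why it stays far smaller than a VM image.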
Another valuable angle is to look at containers from an infrastructure viewpoint or a development standpoint.
So, what are the container benefits? Containers are an excellent choice when developing software based on microservice architectures. The decoupled design of microservices makes a perfect combination with the atomicity of containers. Containers make efficient use of hardware (small images) and provide isolation. Isolation means you can run multiple containers simultaneously on the same host without them affecting each other. And last but not least, there’s the advantage of consistent deployments. No more “works on my machine, why doesn’t it work in production?”.
Now, let’s turn to Kubernetes. Kubernetes (K8s) is an open-source system for automating deployment, scaling and management of containerized applications (https://kubernetes.io). The picture below shows what a Kubernetes cluster looks like. Note that the control plane is also referred to as the master node. The master node controls a cluster of worker nodes. In production you can run multiple master nodes for high availability.
The head of the Kubernetes cluster is the control plane, which takes care of scheduling and distributing workloads across the cluster. The worker nodes are the compute power of your cluster. A worker node can be a physical machine or a virtual machine. Note that compute and storage are completely separated. More on this later. The worker nodes run pods. A pod is a logical collection of one or more containers. Typically, a pod contains only one container, which runs one specific app or microservice. Pods, in turn, can be organized into namespaces. A namespace is a logical unit within your cluster that can have its own resource quota, permissions, and so on.
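As a sketch, a minimal pod manifest (all names here are illustrative, not required values) that runs a single container inside a namespace could look like this:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: demo-app          # illustrative name
  namespace: team-a       # pods can be grouped into namespaces
spec:
  containers:
    - name: demo-app
      image: nginx:1.27   # one container per pod is the typical setup
      ports:
        - containerPort: 80
```

The `namespace` field is what ties the pod to the logical unit described above; resource quotas and permissions defined on `team-a` then apply to it.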
A Kubernetes cluster runs its own internal network and can be exposed to the outside world using services. A service is basically an IP address and a TCP port. Pods may come and go, but the service IP address is static, i.e. you have a persistent service endpoint (IP address + DNS name).
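A minimal service manifest illustrates this persistent endpoint; the names and label `app: demo` are assumptions for the sketch, matching whatever labels your pods carry:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: demo-service      # stable DNS name inside the cluster
spec:
  selector:
    app: demo             # traffic is routed to pods labeled app=demo
  ports:
    - port: 80            # the stable service port
      targetPort: 80      # the container port behind it
```

Pods matching the selector can be replaced freely; the service IP and DNS name stay the same.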
Let’s go one step deeper.
We can use the command-line tool kubectl, or Helm, to talk to the master node.
The master node is exposed via a REST API, which is used to post commands. These commands typically take the form of YAML-based configuration files that declare the desired state of the cluster. Kubernetes compares this desired state with the current state and uses the scheduler to actually move from the current state to the desired state.
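A deployment manifest is a typical example of such a desired-state declaration (names and image are illustrative):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: demo-deployment   # illustrative name
spec:
  replicas: 3             # desired state: three running pod instances
  selector:
    matchLabels:
      app: demo
  template:               # pod template the cluster should converge to
    metadata:
      labels:
        app: demo
    spec:
      containers:
        - name: demo-app
          image: nginx:1.27
```

Posting this file with `kubectl apply -f deployment.yaml` hands the desired state to the API server; the scheduler then places pods on worker nodes until the cluster matches it.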
Kubernetes uses a highly available, distributed, and reliable key-value store called etcd. This key-value store holds the current state and the desired state of all objects within your cluster. Finally, Kubernetes uses controllers to track the state of objects in the cluster. Each controller runs in a non-terminating loop, watching nodes and containers and spinning them up again when needed.
Each worker node runs a so-called kubelet, which registers the node within the cluster. To enable communication with the master, the kubelet watches for work requests from the API server and executes those units of work. The kube-proxy maintains the network rules on each node, so that traffic is routed to the right pods.
So why do I need Kubernetes as a developer? The advantages of Kubernetes relate to:
- Load balancing pods using services.
- Network traffic across containers in different pods and on different nodes.
- Zero-downtime deployment via rolling updates and rollbacks.
- Self-healing by automatically restarting or replacing containers.
- Scalability: responding to increased demand by deploying more container instances, and removing instances when demand decreases.
- Portability. See figure below.
- Management of shared storage.
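The rolling-update and scalability points can be declared right in a deployment manifest; the values below are a sketch, not defaults you must use:

```yaml
spec:
  replicas: 5
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1   # at most one pod down during an update
      maxSurge: 1         # at most one extra pod above the replica count
```

Scaling can then be done declaratively by changing `replicas` and re-applying the file, or imperatively with `kubectl scale`; a bad rollout can be reverted with `kubectl rollout undo`.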
Kubernetes is still quite complex. You can opt for a managed Kubernetes environment via Azure Kubernetes Service (AKS) to simplify the deployment and management of containerized apps. An alternative is the Red Hat OpenShift Container Platform (OCP), which can run on premises as well as on Azure, AWS or Google Cloud Platform. OCP is the stronger option when portability is the priority.