Kubernetes: An Overview

Kubernetes is a powerful open-source system, initially developed by Google, for managing containerized applications in a clustered environment. It aims to provide better ways of managing related, distributed components and services across varied infrastructure.

This article walks through the system architecture and how Kubernetes solves containerized deployment and scaling problems, with hands-on use cases.

Kubernetes official site: https://kubernetes.io/

1) Architecture

A common example of a Kubernetes architecture:

Master Components

The services running on a Master Node are collectively called the Kubernetes Control Plane. The Master Node is used for administrative tasks only, while the containers running your services are created on the Worker Node(s).

  • API Server: The API server implements a RESTful interface, which means that many different tools and libraries can readily communicate with it. The API server is the only component that communicates with etcd, and it stores all cluster information there. A common example of interaction with the API is the kubectl command (see the example after this list).
  • Controller Manager: The controller manager is responsible for monitoring replication controllers and creating corresponding pods to achieve the desired state. It uses the API to listen for new controllers and to create/delete pods.
  • Kube Scheduler: The scheduler is responsible for tracking available capacity on each host to make sure that workloads are not scheduled in excess of the available resources.
  • etcd: etcd stores the entire state of the cluster: its configuration, specifications, and the statuses of the running workloads. It can be deployed on dedicated nodes outside the Master nodes.
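
As a quick illustration of that RESTful interface, kubectl can call API paths directly. A minimal sketch, using the standard /healthz endpoint and the core v1 pods path (exact output varies per cluster):

# Query the API server health endpoint
master $ kubectl get --raw /healthz
ok

# List pods in the default namespace straight from the REST API
master $ kubectl get --raw /api/v1/namespaces/default/pods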

The image below shows the worker nodes interacting with a load balancer that distributes requests across the 3 masters at the front of the cluster.

Worker Node

kubelet: the main Kubernetes component on each cluster node. It talks to the API server to check whether there are new Pods to be created on the current Worker Node.

  • It communicates with the container runtime (containerd or Docker) via its API to create and manage containers.
  • It reports any changes in a Pod on the Node to the API server, which in turn saves them to the etcd database.
  • It monitors the containers running on the node (see the example below).
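
You can inspect the node status that kubelet reports back through the API server; worker1 here is the node name that appears later in this lab:

master $ kubectl get nodes
master $ kubectl describe node worker1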

kube-proxy:

  • Acts like a reverse proxy, forwarding requests to the appropriate service or application inside the Kubernetes private network.
  • Uses iptables by default, as in the example below:
kubectl -n kube-system exec -ti kube-proxy-5ctt2 -- iptables --table nat --list

2) Environment to play with

I suggest a few small Kubernetes environments for playing with the hands-on examples, such as Minikube or the Katacoda online labs (both referenced later in this post).

3) Key concepts for using Kubernetes

It is important to understand how Kubernetes manages containers inside the cluster.

  • Pod: The Pod is the smallest object in Kubernetes. Containers are organized into Pods, which are abstractions that share the same resources, such as network addresses, CPU cycles, and memory. A Pod can contain multiple containers, but that is not the common case.
  • Controller: A controller is the object responsible for interacting with the API Server and orchestrating some other object, e.g. Deployments and Replication Controllers.
  • ReplicaSets: A ReplicaSet is an object responsible for ensuring that a specified number of Pods is running on the nodes.
  • Deployment: One of the main controllers in use. A Deployment ensures, through another controller called a ReplicaSet, that a given number of replicas of a Pod is running on the worker nodes of the cluster (see the manifest sketch after this list).
  • Jobs and CronJobs: As described by the official documentation, a Job creates one or more Pods and ensures that a specified number of them successfully terminate.
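
To tie these concepts together, below is a minimal Deployment manifest sketch (the name nginx-deployment and the labels are illustrative, not taken from the lab): the Deployment creates a ReplicaSet, which in turn keeps two nginx Pods running.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx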

4) Components interaction

If you don't have hardware to test the examples, try the Katacoda lab.

The command below shows the cluster status:

master $ kubectl cluster-info

Kubernetes master is running at https://172.17.0.11:6443
KubeDNS is running at https://172.17.0.11:6443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy
To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.

Get the component status:

master $ kubectl get cs


NAME                 STATUS    MESSAGE             ERROR
scheduler            Healthy   ok
controller-manager   Healthy   ok
etcd-0               Healthy   {"health":"true"}

Pods

A pod is a collection of containers sharing a network and mount namespace and is the basic unit of deployment in Kubernetes.

Before running the first Pod in the lab, let me show the Pod creation cycle from birth:

  1. kubectl sends a request to the API server
  2. The API server validates it and passes it to etcd
  3. etcd reports back to the API server that the request has been accepted and saved
  4. API server accesses kube-scheduler
  5. kube-scheduler selects the node(s) on which the pod will be created, and returns this information to the API server
  6. The API server sends this data to etcd
  7. etcd reports back to the API server that the request has been accepted and saved
  8. The API server contacts the kubelet on the corresponding node(s)
  9. kubelet calls the container runtime (Docker or another) through its API, over the runtime's socket on the node, with the task of starting the container
  10. kubelet sends pod status to the API server
  11. API server updates data in etcd
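
Each of these steps leaves a trace in the cluster's event stream, so you can follow a Pod's birth in real time from another terminal:

master $ kubectl get events --watch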

I created the first Pod example with this command:

master $ kubectl create deployment nginx --image=nginx


deployment.apps/nginx created

List the pods running in the cluster:

master $ kubectl get pods


NAME                     READY   STATUS    RESTARTS   AGE
nginx-65f88748fd-4ng57   1/1     Running   0          2m43s

Get all information for the nginx Deployment:

master $ kubectl describe deployment nginx


Name:                   nginx
Namespace:              default
CreationTimestamp:      Fri, 22 May 2020 15:58:05 +0000
Labels:                 app=nginx
Annotations:            deployment.kubernetes.io/revision: 1
Selector:               app=nginx
Replicas:               1 desired | 1 updated | 1 total | 1 available | 0 unavailable
StrategyType:           RollingUpdate
MinReadySeconds:        0
RollingUpdateStrategy:  25% max unavailable, 25% max surge
Pod Template:
  Labels:  app=nginx
  Containers:
   nginx:
    Image:        nginx
    Port:         <none>
    Host Port:    <none>
    Environment:  <none>
    Mounts:       <none>
  Volumes:        <none>
Conditions:
  Type           Status  Reason
  ----           ------  ------
  Available      True    MinimumReplicasAvailable
  Progressing    True    NewReplicaSetAvailable
OldReplicaSets:  <none>
NewReplicaSet:   nginx-65f88748fd (1/1 replicas created)
Events:
  Type    Reason             Age   From                   Message
  ----    ------             ----  ----                   -------
  Normal  ScalingReplicaSet  10m   deployment-controller  Scaled up replica set nginx-65f88748fd to 1
master $

We can create a custom example like the example.yaml below:

apiVersion: v1
kind: Pod
metadata:
  name: my-pod
  labels:
    app: my-app
    type: front-app
spec:
  containers:
  - name: nginx-container
    image: nginx

The Pod named my-pod will be created by this command:

master $ kubectl apply -f  example.yaml
pod/my-pod created

master $ kubectl get pods


NAME                     READY   STATUS    RESTARTS   AGE
my-pod                   1/1     Running   0          6s
nginx-65f88748fd-4ng57   1/1     Running   0          16m

Deployments

A Deployment is used for reliability, load balancing, and autoscaling.

I use the command below to scale up my nginx Deployment:

master $ kubectl  scale deploy nginx  --replicas=2


deployment.extensions/nginx scaled

Check the new replicas of the nginx Pod:

# Get Pods 
master $ kubectl get pods
NAME                     READY   STATUS    RESTARTS   AGE
my-pod                   1/1     Running   0          4m9s
nginx-65f88748fd-79gjm   1/1     Running   0          4m59s
nginx-65f88748fd-8bmtk   1/1     Running   0          74s


# Get deployments for Nginx 
master $ kubectl get deployments nginx
NAME    READY   UP-TO-DATE   AVAILABLE   AGE
nginx   2/2     2            2           107s
# Get the replicasets
master $  kubectl get replicasets
NAME               DESIRED   CURRENT   READY   AGE
nginx-86c57db685   2         2         2       3m3s

We can enable autoscaling with the command below:

kubectl autoscale deployment <DEPLOYMENT NAME> --min=2 --max=10
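
This creates a HorizontalPodAutoscaler object. Note that scaling on CPU usage needs a metrics source (such as metrics-server) running in the cluster; assuming that, the nginx Deployment could be autoscaled at 80% average CPU like this:

kubectl autoscale deployment nginx --min=2 --max=10 --cpu-percent=80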

Detail the replica sets:

master $ kubectl describe replicasets


Name:           nginx-86c57db685
Namespace:      default
Selector:       app=nginx,pod-template-hash=86c57db685
Labels:         app=nginx
                pod-template-hash=86c57db685
Annotations:    deployment.kubernetes.io/desired-replicas: 2
                deployment.kubernetes.io/max-replicas: 3
                deployment.kubernetes.io/revision: 1
Controlled By:  Deployment/nginx
Replicas:       2 current / 2 desired
Pods Status:    2 Running / 0 Waiting / 0 Succeeded / 0 Failed
Pod Template:
  Labels:  app=nginx
           pod-template-hash=86c57db685
  Containers:
   nginx:
    Image:        nginx
    Port:         <none>
    Host Port:    <none>
    Environment:  <none>
    Mounts:       <none>
  Volumes:        <none>
Events:
  Type    Reason            Age    From                   Message
  ----    ------            ----   ----                   -------
  Normal  SuccessfulCreate  3m33s  replicaset-controller  Created pod: nginx-86c57db685-ng2gg
  Normal  SuccessfulCreate  2m56s  replicaset-controller  Created pod: nginx-86c57db685-sgvx8

Services

A Service routes traffic across a set of Pods. Services are the abstraction that allow pods to die and replicate in Kubernetes without impacting your application. Discovery and routing among dependent Pods (such as the frontend and backend components in an application) is handled by Kubernetes Services.

There are 4 types of Services:

  • ClusterIP (default): Exposes the Service on an internal IP in the cluster.
  • NodePort: Exposes the Service on the same port of each selected Node in the cluster using NAT.
  • LoadBalancer: Creates an external load balancer in the current cloud (if supported) and assigns a fixed, external IP to the Service. Superset of NodePort.
  • ExternalName: Exposes the Service using an arbitrary name (specified by externalName in the spec) by returning a CNAME record with the name. No proxy is used. This type requires v1.7 or higher of kube-dns.
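
A Service can also be created declaratively. Below is a sketch of a ClusterIP Service that routes port 80 traffic to the Pods labeled app=nginx (the service name nginx-svc is illustrative):

apiVersion: v1
kind: Service
metadata:
  name: nginx-svc
spec:
  type: ClusterIP
  selector:
    app: nginx
  ports:
  - port: 80
    targetPort: 80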

In this hands-on document I only show how to configure communication with the nginx Pod, using the ClusterIP type; the LoadBalancer mode is discussed briefly afterwards.

I intend to write with more detail and more hands-on practice in real scenarios in other posts.

Get the list of services:

master $ kubectl get svc
NAME         TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
kubernetes   ClusterIP   10.233.0.1   <none>        443/TCP   4d18h

Expose the nginx app with the ClusterIP type:

master $ kubectl expose deployment nginx --port=80 --type=ClusterIP
service/nginx exposed

Testing our ClusterIP example:

master $ kubectl get svc
NAME         TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)   AGE
kubernetes   ClusterIP   10.233.0.1     <none>        443/TCP   4d19h
nginx        ClusterIP   10.233.7.197   <none>        80/TCP    23m


master $ curl -o /dev/null -s -w "%{http_code}\n"  http://10.233.7.197  
200

The LoadBalancer Service type requires some form of external provider to map external IPs to services. When deploying on a cloud provider, this is usually a provisioned external load balancer (e.g. AWS ELB). For on-prem deployments, options are somewhat limited; a good option is MetalLB, which can manage an external IP pool via BGP or Layer 2 (ARP).
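
Assuming such a provider is available, exposing the same Deployment as a LoadBalancer is a single command (the service name nginx-lb is illustrative, needed because we already created a ClusterIP service named nginx):

kubectl expose deployment nginx --name=nginx-lb --port=80 --type=LoadBalancer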

Volumes

Data in containers is ephemeral, i.e. if the container dies and kubelet recreates it, all data from the old container is lost. To overcome this problem, Kubernetes uses Volumes.

  • PersistentVolume resources are used to manage durable storage in a cluster.
  • A PersistentVolumeClaim is a request for and claim to a PersistentVolume resource. PersistentVolumeClaim objects request a specific size, access mode, and StorageClass for the PersistentVolume. If a PersistentVolume that satisfies the request exists or can be provisioned, the PersistentVolumeClaim is bound to that PersistentVolume.
  • Pods use claims as Volumes. The cluster inspects the claim to find the bound Volume and mounts that Volume for the Pod.

Kubernetes has a lot of storage types, and a good approach is to understand the application's write requirements and whether the access mode supports them.

Access Mode

A PersistentVolume can be mounted on a host in any way supported by the resource provider.

The access modes are:

  • ReadWriteOnce — the volume can be mounted as read-write by a single node
  • ReadOnlyMany — the volume can be mounted read-only by many nodes
  • ReadWriteMany — the volume can be mounted as read-write by many nodes

In the CLI, the access modes are abbreviated to:

  • RWO — ReadWriteOnce
  • ROX — ReadOnlyMany
  • RWX — ReadWriteMany

For my hands-on example I use local storage; I will show more about distributed filesystem solutions in other posts.

Local Example

Local persistent volume file.

pv.yml

apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-volume
  labels:
    type: local
spec:
  storageClassName: manual
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/tmp"

Local persistent volume claim

pvc.yml

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pv-claim
spec:
  storageClassName: manual
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi

Configure a pod (pod.yml) to write inside the local volume.

apiVersion: v1
kind: Pod
metadata:
  name: volume-test
  namespace: default
spec:
  containers:
  - name: centos
    image: centos
    command: ["/bin/sh"]
    args: ["-c", "while true; do echo 'storage write' >>/tmp/file && cat /tmp/file; sleep 5; done"]
    volumeMounts:
      - name: pv-volume
        mountPath: /tmp
  volumes:
    - name: pv-volume
      persistentVolumeClaim:
        claimName: pv-claim

Apply all the YAML files:

kubectl apply -f pv.yml
kubectl apply -f pvc.yml
kubectl apply -f pod.yml

Check all the resources:

master $ kubectl get pv
NAME        CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM              STORAGECLASS   REASON   AGE
pv-volume   2Gi        RWO            Retain           Bound    default/pv-claim   manual                  31m


master $ kubectl get pvc
NAME             STATUS    VOLUME      CAPACITY   ACCESS MODES   STORAGECLASS   AGE
local-path-pvc   Pending                                         local-path     49m
pv-claim         Bound     pv-volume   2Gi        RWO            manual         27m
master $ kubectl get pods
NAME                     READY   STATUS    RESTARTS   AGE
nginx-86c57db685-2h6q8   1/1     Running   1          14h
volume-test              1/1     Running   0          7m9s
[root@master1 ~]#

Then check the logs from the Pod to read the output of its writes:

[root@master1 ~]# kubectl logs volume-test
storage write
storage write
storage write

Check all the detailed information about the Pod and the volume:

master $  kubectl describe pod volume-test 
...
Volumes:
  pv-volume:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  pv-claim
    ReadOnly:   false
  default-token-ntvp4:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-ntvp4
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type    Reason     Age    From               Message
  ----    ------     ----   ----               -------
  Normal  Scheduled  9m40s  default-scheduler  Successfully assigned default/volume-test to worker1
  Normal  Pulling    9m38s  kubelet, worker1   Pulling image "centos"
  Normal  Pulled     9m36s  kubelet, worker1   Successfully pulled image "centos"
  Normal  Created    9m36s  kubelet, worker1   Created container centos
  Normal  Started    9m35s  kubelet, worker1   Started container centos

As shown in the command output, we can see all the details about the volume, the volume claim, and how it is attached to the container.

I think a good next step is to try to create a WordPress configuration inside a Minikube installation, because it will use all the components described here.

https://kubernetes.io/docs/tutorials/stateful-application/mysql-wordpress-persistent-volume/

Clean up resources

To clean up all the resources created during the hands-on examples, run the following commands:

kubectl delete pod <POD NAME>
kubectl delete pvc <PVC NAME>
kubectl delete pv <PV NAME>
kubectl delete service <SERVICE NAME>
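
For the objects created in this post that translates, for example, to the following (deleting the Deployment also removes its ReplicaSet and Pods):

kubectl delete service nginx
kubectl delete deployment nginx
kubectl delete pod my-pod volume-test
kubectl delete pvc pv-claim
kubectl delete pv pv-volume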