Kubernetes - Overview
Kubernetes is an open source container orchestration framework that depends on a container runtime such as docker or containerd.
Most resources exist in a namespace (a few, such as nodes and namespaces themselves, are cluster-wide). By default, it is the namespace named default. It is also possible to create new namespaces. If you have used Docker Swarm before, you can think of a namespace as a stack. A new namespace is defined as follows.
apiVersion: v1
kind: Namespace
metadata:
  name: development
  labels:
    name: development
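
To place a resource in a namespace other than default, its metadata can name the namespace explicitly. The sketch below assumes the development namespace defined above already exists; the ConfigMap name app-settings and its content are hypothetical and chosen only for illustration.

```yaml
# Hypothetical ConfigMap placed in the "development" namespace.
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-settings        # assumed name, not from the original text
  namespace: development    # omit this field to use the "default" namespace
data:
  log_level: debug          # assumed example value
```
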
A pod is a set of one or more containers sharing the same namespace and volumes.
In Kubernetes, containers are not used directly. Instead, so-called pods are created, which control the containers. This has the benefit that the same pod can stay alive while its container(s) restart or change.
A pod is created from a template such as the one below. Usually, there is a single container per pod, but occasionally tightly coupled containers are put in the same pod.
template:
  spec:
    containers:
    - name: hello
      image: hello-world
    restartPolicy: OnFailure
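
As a rough illustration of tightly coupled containers in one pod, the sketch below runs two containers that share an emptyDir volume. All names (shared-data-pod, writer, reader, scratch), the busybox image, and the commands are assumptions made for this example.

```yaml
# Hypothetical two-container Pod; both containers mount the same volume.
apiVersion: v1
kind: Pod
metadata:
  name: shared-data-pod
spec:
  containers:
  - name: writer
    image: busybox
    # Appends a timestamp to a file on the shared volume every 5 seconds.
    command: ["sh", "-c", "while true; do date >> /data/out.log; sleep 5; done"]
    volumeMounts:
    - name: scratch
      mountPath: /data
  - name: reader
    image: busybox
    # Makes sure the file exists, then follows what the writer appends.
    command: ["sh", "-c", "touch /data/out.log; tail -f /data/out.log"]
    volumeMounts:
    - name: scratch
      mountPath: /data
  volumes:
  - name: scratch
    emptyDir: {}
```
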
Usually, pod templates are used by so-called workload controllers. They are responsible for managing the life cycle of a workload such as a job or a deployment. Below is a workload definition. The corresponding controller will make sure that the state of the system matches the definition.
apiVersion: batch/v1
kind: Job
metadata:
  name: hello-world
spec:
  template:
    spec:
      containers:
      - name: hello
        image: hello-world
      restartPolicy: OnFailure
Depending on the type of application and pod configuration, different workloads can be used. Below are the default workloads. It is also possible to use custom workloads with specialized behavior.
- Deployment and ReplicaSet (replacing the legacy resource ReplicationController). Deployment is a good fit for managing a stateless application workload on your cluster, where any Pod in the Deployment is interchangeable and can be replaced if needed.
- StatefulSet lets you run one or more related Pods that do track state somehow. For example, if your workload records data persistently, you can run a StatefulSet that matches each Pod with a PersistentVolume. Your code, running in the Pods for that StatefulSet, can replicate data to other Pods in the same StatefulSet to improve overall resilience.
- DaemonSet defines Pods that provide node-local facilities. These might be fundamental to the operation of your cluster, such as a networking helper tool, or be part of an add-on. Every time you add a node to your cluster that matches the specification in a DaemonSet, the control plane schedules a Pod for that DaemonSet onto the new node.
- Job and CronJob define tasks that run to completion and then stop. Jobs represent one-off tasks, whereas CronJobs recur according to a schedule.
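
For comparison with the Job above, a Deployment for a stateless workload might look like the following sketch. The name my-app, the image localhost/my-app, and the replica count are assumptions; the label app: MyApp and container port 9376 are chosen to line up with the Service definition that follows.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app                    # hypothetical name
spec:
  replicas: 3                     # three interchangeable pods
  selector:
    matchLabels:
      app: MyApp                  # must match the pod template labels below
  template:
    metadata:
      labels:
        app: MyApp
    spec:
      containers:
      - name: my-app
        image: localhost/my-app   # hypothetical image
        ports:
        - containerPort: 9376
```
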
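A Service gives a set of pods, selected by their labels, a stable name and port. Below is a service definition that forwards port 80 to port 9376 of all pods labeled app: MyApp.
apiVersion: v1
kind: Service
metadata:
  name: my-service
spec:
  selector:
    app: MyApp
  ports:
  - protocol: TCP
    port: 80
    targetPort: 9376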
The default load balancing for services and pods is performed by the kube-proxy and happens on layer 4.
The chosen proxy mode for the kube-proxy determines the load balancing algorithm.
- userspace mode chooses a backend via a round-robin algorithm.
- iptables mode chooses a backend at random.
- IPVS mode provides more options for balancing traffic to backend Pods; these are:
  - rr: round-robin
  - lc: least connection (smallest number of open connections)
  - dh: destination hashing
  - sh: source hashing
  - sed: shortest expected delay
  - nq: never queue
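
The proxy mode and, for IPVS, the scheduler are part of the kube-proxy configuration. The following is a minimal sketch, assuming kube-proxy is configured through a KubeProxyConfiguration file and that the IPVS kernel modules are available on the nodes; the choice of lc is only an example.

```yaml
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: "ipvs"        # proxy mode, for example "iptables" or "ipvs"
ipvs:
  scheduler: "lc"   # one of the schedulers listed above, e.g. rr, lc, sh
```
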
Kubernetes provides DNS. Depending on where a request originates, a DNS query yields different results. For example, resources in the same namespace can find each other without their fully qualified domain name (FQDN).
Containers in the same pod can reach each other via the loopback interface.
A pod's DNS entry has the following form.
pod-ip-address.my-namespace.pod.cluster-domain.example.
Pods created by a Deployment or DaemonSet exposed by a Service have the following DNS resolution.
pod-ip-address.deployment-name.my-namespace.svc.cluster-domain.example
Workloads such as deployments do not have a DNS name themselves. That is why they are usually coupled with services if they need to be reachable for other services or external requests.
Services are resolved as follows.
<service-name>.<namespace-name>.svc.cluster.local
Pods use their own namespace by default. This means that, for example, a query for only <service-name> resolves to the service in the same namespace as the pod making the DNS query.
This is possible because of the entry in each container's /etc/resolv.conf that has the following form.
search <namespace>.svc.cluster.local svc.cluster.local cluster.local
Note that not all DNS-related tools use the search list by default. For example, to get the service IP without the FQDN from within a container using dig, the +search flag has to be used.
dig +search <service-name>
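
One way to observe the search list in action is a throwaway pod that looks up a service by its short name. This is only a sketch: the pod name dns-check is hypothetical, it assumes a Service named my-service exists in the same namespace as the pod, and it relies on the nslookup applet shipped in the busybox image. The image is pinned to 1.28 on the assumption that this release honors the search list; some later busybox releases have been reported to handle search domains differently.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: dns-check             # hypothetical name
spec:
  restartPolicy: Never        # run once, do not restart after completion
  containers:
  - name: dns-check
    image: busybox:1.28
    # Print the resolver configuration, then resolve the short service name.
    command: ["sh", "-c", "cat /etc/resolv.conf; nslookup my-service"]
```

The output, including the search line and the resolved address, can then be read from the pod's logs.
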
By default, Kubernetes uses the process of each container to determine whether a pod is alive and ready to accept requests. As long as every specified container has a corresponding process ID (PID), the pod is considered healthy.
Custom health probes can be specified. For example, a livenessProbe via HTTP.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: healthcheck-me
spec:
  # apps/v1 requires a selector that matches the pod template labels
  selector:
    matchLabels:
      app: healthcheck-me
  template:
    metadata:
      labels:
        app: healthcheck-me
    spec:
      containers:
      - name: healthcheck-me
        image: localhost/checkme
        livenessProbe:
          httpGet:
            path: /healthz
            port: 80
          initialDelaySeconds: 0
          periodSeconds: 10
          timeoutSeconds: 1
          failureThreshold: 3
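
Readiness can be probed in the same way as liveness. The container fragment below is a sketch that places a readinessProbe next to a livenessProbe; the /ready path and the timing values are assumptions, not taken from the original example.

```yaml
# Fragment of a pod template's container list (not a complete manifest).
containers:
- name: healthcheck-me
  image: localhost/checkme
  livenessProbe:            # failing this probe restarts the container
    httpGet:
      path: /healthz
      port: 80
    periodSeconds: 10
  readinessProbe:           # failing this probe removes the pod from Service endpoints
    httpGet:
      path: /ready          # hypothetical endpoint
      port: 80
    periodSeconds: 5
    failureThreshold: 3
```
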
When a pod ceases to exist, Kubernetes destroys ephemeral volumes; however, Kubernetes does not destroy persistent volumes. For any kind of volume in a given pod, data is preserved across container restarts.
apiVersion: v1
kind: Pod
metadata:
  name: configmap-pod
spec:
  containers:
  - name: test
    image: busybox
    volumeMounts:
    - name: config-vol
      mountPath: /etc/config
  volumes:
  - name: config-vol
    configMap:
      name: log-config
      items:
      - key: log_level
        path: log_level
Below are some of the most common types of volumes. More types are available, though. For example, the big cloud providers AWS, GCP and Azure have their own volume types, which provision storage in the respective cloud platform.
Volume Type | Description |
---|---|
configMap (ephemeral) | A ConfigMap provides a way to inject configuration data into pods. |
emptyDir (ephemeral) | An emptyDir volume is first created when a Pod is assigned to a node, and exists as long as that Pod is running on that node. |
hostPath | A hostPath volume mounts a file or directory from the host node's filesystem into your Pod. |
local | A local volume represents a mounted local storage device such as a disk, partition or directory. |
nfs | An nfs volume allows an existing NFS (Network File System) share to be mounted into a Pod. |
persistentVolumeClaim | A persistentVolumeClaim volume is used to mount a PersistentVolume into a Pod. |
secret (ephemeral) | A secret volume is used to pass sensitive information, such as passwords, to Pods. |
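
To illustrate the persistentVolumeClaim row, the following sketch requests storage through a claim and mounts it into a pod. The names data-claim and pvc-pod, the 1Gi size, and the mount path are assumptions; it also assumes the cluster has a default StorageClass that can provision the volume.

```yaml
# Hypothetical claim for 1Gi of storage.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-claim
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
---
# Hypothetical pod that mounts the claimed volume at /data.
apiVersion: v1
kind: Pod
metadata:
  name: pvc-pod
spec:
  containers:
  - name: test
    image: busybox
    command: ["sh", "-c", "sleep 3600"]
    volumeMounts:
    - name: data
      mountPath: /data
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: data-claim
```

Data written under /data outlives the pod, because the claim and its volume are not destroyed when the pod is deleted.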