Improving Application Availability with Pod Readiness Gates

Making sure your application running in Kubernetes is available and ready to serve traffic can be very easy with Pod liveness and readiness probes. However, not all applications are built to work with probes, and some require more complex readiness checks than these probes can perform. So is there another solution if Pod probes just aren't good enough?

Readiness Gates

The answer is yes. It's possible to implement complex custom readiness checks for Kubernetes Pods with the help of readiness gates.

Readiness gates allow us to create custom status condition types similar to PodScheduled or Initialized. Those conditions can then be used to evaluate Pod readiness.

Normally, Pod readiness is determined only by the readiness of all the containers in a Pod, meaning that if all containers are Ready, then the whole Pod is Ready too. If a readiness gate is added to a Pod, then its readiness is determined by the readiness of all containers and the status of all readiness gate conditions.
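The rule above can be sketched in a few lines of Python. This is only an illustration of the evaluation logic, not the kubelet's actual implementation. The function name pod_is_ready and the dictionary layout (which mirrors the spec and status fields of a Pod manifest) are assumptions made for this example.

```python
def pod_is_ready(pod: dict) -> bool:
    """Return True if all containers are ready and every readiness gate is "True"."""
    conditions = {
        c["type"]: c["status"]
        for c in pod.get("status", {}).get("conditions", [])
    }
    # Containers have to be ready before gates are even considered.
    if conditions.get("ContainersReady") != "True":
        return False
    # A gate whose condition is missing from the status counts as not satisfied.
    gates = pod.get("spec", {}).get("readinessGates", [])
    return all(conditions.get(g["conditionType"]) == "True" for g in gates)


# Hypothetical Pod data, trimmed down to the fields the function looks at.
pod = {
    "spec": {"readinessGates": [{"conditionType": "www.example.com/some-gate-1"}]},
    "status": {"conditions": [
        {"type": "ContainersReady", "status": "True"},
        {"type": "www.example.com/some-gate-1", "status": "False"},
    ]},
}
print(pod_is_ready(pod))  # False: containers are ready, but the gate is not
```

Treating a missing condition as "not satisfied" matches the behaviour described later in this article, where a gate that has never been set keeps the Pod from becoming Ready.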

Let's look at an example to get a better idea of how this works:

kind: Pod
...
spec:
  readinessGates:
    - conditionType: "www.example.com/some-gate-1"
...
status:
  conditions:
    - type: Ready
      status: "False"
      lastProbeTime: null
      lastTransitionTime: 2021-11-01T00:00:00Z
    - type: ContainersReady
      status: "True"
      lastProbeTime: null
      lastTransitionTime: 2021-11-01T00:00:00Z
    - type: "www.example.com/some-gate-1"
      status: "False"
      lastProbeTime: null
      lastTransitionTime: 2021-11-01T00:00:00Z

The above manifest shows a Pod with a single readiness gate named www.example.com/some-gate-1. Looking at the conditions in the status stanza, we can see that the ContainersReady condition is True, meaning that all containers are ready, but the custom readiness gate condition is False, and therefore the Pod's Ready condition must also be False.

If you use kubectl describe pod ... on such a pod you would also see the following in the Conditions section:

...
Conditions:
  Type                          Status
  www.example.com/some-gate-1   False
  Initialized                   True
  Ready                         False
  ContainersReady               True
  PodScheduled                  True

Rationale

We now know that it's possible to implement these additional readiness conditions, but are they really necessary? Shouldn't it be enough to just leverage health checks using probes?

In most cases probes should be sufficient. There are, however, situations where more complex readiness checks are necessary. Probably the most common use case for readiness gates is syncing up with an external system such as a cloud provider's load balancer. Examples of that are the AWS Load Balancer Controller and container-native load balancing in GKE. In these cases readiness gates allow us to make the workloads network-aware.

Another reason to use readiness gates is if you have an external system that can perform more thorough health checks on your workloads using, for example, application metrics. This can help integrate your system into the Kubernetes workload lifecycle without requiring changes to the kubelet. It also allows the external system to subscribe to Pod condition changes and act upon them, possibly applying changes to remediate any availability issues.

Finally, readiness gates can be a lifesaver if you have a legacy application deployed to Kubernetes that is not compatible with liveness or readiness probes, yet whose readiness can be checked in a different way.

For the complete rationale behind this feature, check out the original KEP on GitHub.

Creating First Gate

Enough talking, let's create our first readiness gate. All we need to do is add a readinessGates stanza to the Pod spec with the name of our desired condition:

# kubectl run nginx \
#     --image=nginx \
#     --overrides='{"spec": {"readinessGates": [{"conditionType": "www.example.com/gate-1"}]}}'

apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  readinessGates:
  - conditionType: www.example.com/gate-1
  containers:
  - name: nginx
    image: nginx:latest

Adding the gate is easy, but updating it is a little more complicated. kubectl has historically not supported patching an object's status (newer releases add a --subresource flag to kubectl patch), so we cannot rely on plain kubectl patch to set the condition to True/False. Instead, we send a PATCH HTTP request directly to the API server.

The simplest way to access the cluster's API server is using kubectl proxy, which allows us to reach the server on localhost:

kubectl proxy --port 12345 &

curl -s http://localhost:12345/
curl -k -H 'Accept: application/json' http://localhost:12345/api/v1/namespaces/default/pods/nginx/status

In addition to starting the proxy in the background, we also used curl to check if the server is reachable and queried the server for manifest/status of the pod we will be updating.

Now that we have a way to reach the API server, let's try updating the Pod status. Every readiness gate status condition defaults to False, but let's start by explicitly setting it:

# Explicitly set status to "False"
curl -k \
     -H "Content-Type: application/json-patch+json" \
     -X PATCH http://localhost:12345/api/v1/namespaces/default/pods/nginx/status \
     --data '[ { "op": "add", "path": "/status/conditions/-", "value": { "lastProbeTime": null, "lastTransitionTime": "2020-03-05T15:50:51Z", "status": "False", "type": "www.example.com/gate-1" }}]'


kubectl get pods -o wide
NAME    READY   STATUS    RESTARTS   AGE   IP           NODE                 NOMINATED NODE   READINESS GATES
nginx   1/1     Running   0          25s   10.244.0.6   kind-control-plane   <none>           0/1

kubectl describe pod nginx
...
Readiness Gates:
  Type                     Status
  www.example.com/gate-1   False 
Conditions:
  Type                     Status
  www.example.com/gate-1   False
  Initialized              True 
  Ready                    False 
  ContainersReady          True 
  PodScheduled             True

In this snippet we first sent a PATCH request through the API proxy to apply a JSON patch to the status.conditions field of the Pod. In this case we used the add operation because the condition was not set yet. You can also see that when we list the Pods with -o wide, the READINESS GATES column shows 0/1, indicating that the gate is set to False. The same can be seen in the output of kubectl describe.
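For readers who prefer to stay in Python, the same request can be sketched with the standard library only. The helper name build_gate_patch_request is made up for this example, and the base URL assumes the kubectl proxy started earlier on port 12345. The request is only built here; nothing is sent until urlopen is called.

```python
import json
import urllib.request
from datetime import datetime, timezone


def build_gate_patch_request(base_url, namespace, pod, condition_type, status):
    """Build (but do not send) a JSON Patch request that appends a condition."""
    now = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    patch = [{
        "op": "add",
        "path": "/status/conditions/-",  # "-" appends to the end of the list
        "value": {
            "type": condition_type,
            "status": status,
            "lastProbeTime": None,
            "lastTransitionTime": now,
        },
    }]
    url = f"{base_url}/api/v1/namespaces/{namespace}/pods/{pod}/status"
    return urllib.request.Request(
        url,
        data=json.dumps(patch).encode("utf-8"),
        headers={"Content-Type": "application/json-patch+json"},
        method="PATCH",
    )


request = build_gate_patch_request(
    "http://localhost:12345", "default", "nginx", "www.example.com/gate-1", "False"
)
print(request.get_method(), request.full_url)
# To actually send it (requires the proxy to be running):
# urllib.request.urlopen(request)
```

Generating lastTransitionTime from the current UTC time, rather than hard-coding a date as in the curl example, is a choice of this sketch.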

Next, let's see how we can toggle the value to True:

curl -k \
     -H "Content-Type: application/json-patch+json" \
     -X PATCH http://localhost:12345/api/v1/namespaces/default/pods/nginx/status \
     --data '[{ "op": "replace", "path": "/status/conditions/0", "value": { "type": "www.example.com/gate-1", "status": "True" }}]'

kubectl get pods -o wide
NAME    READY   STATUS    RESTARTS   AGE   IP           NODE                 NOMINATED NODE   READINESS GATES
nginx   1/1     Running   0          58s   10.244.0.6   kind-control-plane   <none>           1/1

As in the previous snippet, we again use a PATCH request to update the condition. This time, however, we use the replace operation, specifically on the first condition in the list, as specified by /status/conditions/0. Be aware, though, that the custom condition doesn't necessarily have to be first in the list, so if you use a script to update conditions, you should first check which condition you need to update.
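One way to avoid hard-coding the index is to look it up from the conditions returned by the /status endpoint before building the patch. The sketch below assumes the conditions have already been parsed from JSON into a list of dictionaries; the function name replace_condition_patch is hypothetical. It also prepends a JSON Patch "test" operation (part of RFC 6902) so the patch fails instead of overwriting the wrong entry if the list changed in the meantime. That guard is a defensive addition of this sketch, not something the curl example does.

```python
import json


def replace_condition_patch(conditions, condition_type, status):
    """Return a JSON Patch that replaces the condition of the given type.

    Raises LookupError if the condition is not present yet (use "add" then).
    """
    for index, condition in enumerate(conditions):
        if condition["type"] == condition_type:
            path = f"/status/conditions/{index}"
            return [
                # Guard: abort the patch if this index no longer holds our condition.
                {"op": "test", "path": f"{path}/type", "value": condition_type},
                {"op": "replace", "path": path,
                 "value": {"type": condition_type, "status": status}},
            ]
    raise LookupError(f"condition {condition_type!r} not found")


# Made-up conditions, shaped like the JSON returned by the /status endpoint.
conditions = [
    {"type": "Initialized", "status": "True"},
    {"type": "www.example.com/gate-1", "status": "False"},
    {"type": "Ready", "status": "False"},
]
patch = replace_condition_patch(conditions, "www.example.com/gate-1", "True")
print(json.dumps(patch, indent=2))
```

The resulting list can be passed as the --data payload of the PATCH request shown above.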

Using Client Libraries

Updating conditions with curl as we saw above works for simple scripts or quick manual updates, but generally you will probably need a more robust solution. Since we are already talking to the API server directly, your best bet will be one of the Kubernetes client libraries. For demonstration purposes, let's see how it can be done in Python:

# pip install kubernetes
import time
from kubernetes import client, config

config.load_kube_config()

pod_manifest = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {
        "name": "nginx"
    },
    "spec": {
        "readinessGates": [
            {"conditionType": "www.example.com/gate-1"}
        ],
        "containers": [{
            "image": "nginx",
            "name": "nginx",
        }]
    }
}

v1 = client.CoreV1Api()

response = v1.create_namespaced_pod(body=pod_manifest, namespace="default")
while True:
    response = v1.read_namespaced_pod(name="nginx", namespace="default")
    if response.status.phase != "Pending":
        break
    time.sleep(1)

print("Pod is 'Running'...")

The first thing we need to do is authenticate to the cluster and create the Pod. Authentication is done here using config.load_kube_config(), which loads your credentials from ~/.kube/config. In general, though, it's better to use service accounts and tokens to authenticate to the cluster; a sample for that can be found in the docs.

As for the second part, Pod creation, that's pretty straightforward: we just apply the Pod manifest and then wait until its status phase changes from Pending.
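The polling loop above waits forever if the Pod never leaves Pending (for example, when the image cannot be pulled). A bounded variant might look like the sketch below. The function name wait_until_not_pending and its parameters are assumptions for illustration; the phase lookup is passed in as a callable so the example runs without a cluster.

```python
import time


def wait_until_not_pending(read_phase, timeout=60.0, interval=1.0):
    """Poll read_phase() until it returns something other than "Pending".

    read_phase is any zero-argument callable returning the Pod phase as a string.
    Raises TimeoutError if the Pod is still Pending once `timeout` seconds passed.
    """
    deadline = time.monotonic() + timeout
    while True:
        phase = read_phase()
        if phase != "Pending":
            return phase
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Pod still Pending after {timeout} seconds")
        time.sleep(interval)


# Stand-in for the API call so the example runs without a cluster.
phases = iter(["Pending", "Pending", "Running"])
print(wait_until_not_pending(lambda: next(phases), timeout=5, interval=0.01))

# With a CoreV1Api client named v1, the callable could be:
# lambda: v1.read_namespaced_pod(name="nginx", namespace="default").status.phase
```

Returning the phase lets the caller distinguish Running from Failed instead of assuming that any non-Pending phase means the Pod is up.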

With the Pod running, we can continue by setting the gate's condition to the initial False value:

response = v1.patch_namespaced_pod_status(name="nginx", namespace="default", body=[{
    "op": "add", "path": "/status/conditions/-",
    "value": {
        "lastProbeTime": None,
        "lastTransitionTime": "2020-03-05T15:50:51Z",
        "status": "False",
        "type": "www.example.com/gate-1"
    }}])

pod = v1.read_namespaced_pod_status(name="nginx", namespace="default")
for i, condition in enumerate(pod.status.conditions):
    if condition.type == "Ready":
        gate, index = condition, i

print(f"ReadinessGate '{gate.type}' has readiness status: {gate.status}, Reason: {gate.message}.")
# ReadinessGate 'Ready' has readiness status: False, Reason: the status of pod readiness gate "www.example.com/gate-1" is not "True", but False.

In addition to setting the status, we also queried the cluster for the current Pod status after the update. We looked up the section that corresponds to the Ready condition and printed its status.

And finally, we can flip the value to True with the following code:

pod = v1.read_namespaced_pod_status(name="nginx", namespace="default")
for i, condition in enumerate(pod.status.conditions):
    if condition.type == "www.example.com/gate-1":
        gate, index = condition, i

print(f"ReadinessGate '{gate.type}' has readiness status: {gate.status}, Reason: {gate.message}.")
# ReadinessGate 'www.example.com/gate-1' has readiness status: False, Reason: None.

response = v1.patch_namespaced_pod_status(name="nginx", namespace="default", body=[{
    "op": "replace", "path": f"/status/conditions/{index}",
    "value": {
        "lastProbeTime": None,
        "lastTransitionTime": "2020-03-05T15:50:51Z",
        "status": "True",
        "type": "www.example.com/gate-1"
    }}])

The code here is very similar to the earlier example. This time, however, we look for the condition of type www.example.com/gate-1, verify its current state with print, and then apply the change using the replace operation on the condition at position index.
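The add and replace steps can be folded into a single helper that picks the right operation based on the Pod's current conditions. This is a minimal sketch: set_gate is a hypothetical name, and the api argument is assumed to behave like kubernetes.client.CoreV1Api, using only the read_namespaced_pod_status and patch_namespaced_pod_status calls already shown above.

```python
from datetime import datetime, timezone


def set_gate(api, name, namespace, condition_type, status):
    """Create or update a readiness gate condition on a Pod."""
    pod = api.read_namespaced_pod_status(name=name, namespace=namespace)
    value = {
        "type": condition_type,
        "status": status,
        "lastProbeTime": None,
        "lastTransitionTime": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
    }
    # Default to appending; switch to replace if the condition already exists.
    operation = {"op": "add", "path": "/status/conditions/-", "value": value}
    for index, condition in enumerate(pod.status.conditions or []):
        if condition.type == condition_type:
            operation = {
                "op": "replace",
                "path": f"/status/conditions/{index}",
                "value": value,
            }
            break
    return api.patch_namespaced_pod_status(
        name=name, namespace=namespace, body=[operation]
    )


# Usage, assuming v1 = client.CoreV1Api() on a configured cluster:
# set_gate(v1, "nginx", "default", "www.example.com/gate-1", "True")
```

Note that this sketch refreshes lastTransitionTime on every call; a stricter version would only change it when the status actually flips.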

Closing Thoughts

Both the shell scripts and the Python code above demonstrate how you can go about implementing readiness gates and their updates. In a real-world application you will probably need a more robust solution, though.

The ideal solution would be a custom controller that watches Pods whose readinessGates stanza lists the relevant conditionTypes. The controller would then be able to update the condition(s) based on the observed state of the Pod, whether that comes from custom Pod metrics, external network state, or anything else.
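To make the idea more concrete, here is a rough sketch of what the core of such a controller could look like with the Python client. It is not a production controller (no retries, leader election, or resync), and all function names are made up for this example. The check callback stands in for whatever external signal decides readiness. The watch loop needs the kubernetes package and cluster access, so it is defined but not called here.

```python
def has_gate(pod, condition_type):
    """True if the Pod (a V1Pod-like object) declares the given readiness gate."""
    gates = pod.spec.readiness_gates or []
    return any(gate.condition_type == condition_type for gate in gates)


def gate_patch(pod, condition_type, desired):
    """Return a JSON Patch bringing the gate to `desired`, or None if in sync."""
    value = {"type": condition_type, "status": desired}
    for index, condition in enumerate(pod.status.conditions or []):
        if condition.type == condition_type:
            if condition.status == desired:
                return None  # nothing to do; also avoids a patch/event loop
            return [{"op": "replace", "path": f"/status/conditions/{index}",
                     "value": value}]
    return [{"op": "add", "path": "/status/conditions/-", "value": value}]


def run_controller(condition_type, check, namespace="default"):
    """Watch Pods in one namespace and keep the gate condition in sync.

    `check(pod)` is a hypothetical callback returning True when the external
    system considers the Pod ready.
    """
    from kubernetes import client, config, watch  # imported lazily on purpose

    config.load_kube_config()
    v1 = client.CoreV1Api()
    for event in watch.Watch().stream(v1.list_namespaced_pod, namespace=namespace):
        pod = event["object"]
        if event["type"] == "DELETED" or not has_gate(pod, condition_type):
            continue
        if not pod.status.conditions:
            continue  # no conditions list to append to yet
        patch = gate_patch(pod, condition_type, "True" if check(pod) else "False")
        if patch is not None:
            v1.patch_namespaced_pod_status(
                name=pod.metadata.name,
                namespace=pod.metadata.namespace,
                body=patch,
            )


# Example call (requires a cluster):
# run_controller("www.example.com/gate-1", check=lambda pod: True)
```

Skipping the patch when the condition already has the desired status matters in a watch loop, because every status update produces a new MODIFIED event for the same Pod.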

If you're thinking about implementing such a controller, you can get some inspiration from existing solutions such as the AWS and GKE load balancers mentioned earlier or the (now archived) kube-conditioner.
