I’m back after a long time with the fifth episode of this mini-series about STRIDE threat modeling in Kubernetes. In the previous one we talked about Information disclosure. This part is about the D that stands for Denial Of Service.
DoS is the attempt to make a resource unavailable. For instance, a Kubernetes dashboard left exposed on the Internet allows anyone to deploy containers on your company’s infrastructure to mine cryptocurrency and starve your legitimate applications of CPU (this really happened - thanks Peter).
Therefore, an induced lack of resources is what generally leads to unavailability.
So, how can we prevent it? We can do it with:
- Increased Availability
- Resource isolation
- Resource monitoring
- Vulnerability-specific patches
Now let’s jump into the Kubernetes world and think about splitting up the different layers on which to guarantee availability:
- Nodes
- Network
- Control plane
- Workload
As availability can be increased on all of these resources, I’ll briefly sum up what we can do.
- Deploy multiple master nodes to provide HA on the control plane (for instance to protect from direct attacks to the API server);
- Deploy on multiple datacenters (to protect from attacks on the network to a particular datacenter).
- Configure resource limits per namespace by using:
  - CPU and memory;
  - Storage;
  - Object count;
  - Extended resources (limits only);
- Configure resource limits per container;
- Configure the Cluster Autoscaler to gain availability based on your workload.
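As a sketch of the namespace- and container-level limits above (all names and values are illustrative assumptions, not prescriptions):

```yaml
# ResourceQuota: caps aggregate resource usage in a namespace.
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-a-quota          # example name
  namespace: team-a           # example namespace
spec:
  hard:
    requests.cpu: "4"
    requests.memory: 8Gi
    limits.cpu: "8"
    limits.memory: 16Gi
    persistentvolumeclaims: "10"   # object count
    pods: "50"                     # object count
---
# LimitRange: per-container defaults and caps in the same namespace.
apiVersion: v1
kind: LimitRange
metadata:
  name: container-limits
  namespace: team-a
spec:
  limits:
  - type: Container
    default:            # applied as limits when a container sets none
      cpu: 500m
      memory: 256Mi
    defaultRequest:     # applied as requests when a container sets none
      cpu: 100m
      memory: 128Mi
    max:                # hard cap per container
      cpu: "1"
      memory: 1Gi
```

Without a LimitRange, a container that specifies no limits can consume everything the quota allows; together the two objects bound both the aggregate and the individual container.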
- Configure the Horizontal Pod Autoscaler;
- Configure correct resource limits in addition to requests;
- Configure the Vertical Pod Autoscaler or addon-resizer; you can also leverage the VPA in Off mode in order to get only recommendations for setting appropriate resources for your workload;
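A minimal sketch of that autoscaling setup (the Deployment name `web` is a placeholder, and the VPA object requires the Vertical Pod Autoscaler CRDs to be installed in the cluster):

```yaml
# HPA: scale the Deployment between 2 and 10 replicas based on CPU usage.
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: web
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web
  minReplicas: 2
  maxReplicas: 10
  targetCPUUtilizationPercentage: 70
---
# VPA in "Off" mode: compute resource recommendations only, never evict pods.
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: web-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web
  updatePolicy:
    updateMode: "Off"
```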
- Define Pod-to-Pod and Pod-to-external Network Policies;
- Configure mutual TLS and proper API authentication mechanisms.
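For instance, Pod-to-Pod policies could start from a default-deny plus an explicit allow (labels and namespace below are example assumptions):

```yaml
# Deny all incoming Pod-to-Pod traffic in the namespace by default.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: team-a
spec:
  podSelector: {}        # empty selector: applies to every pod in the namespace
  policyTypes:
  - Ingress
---
# Then allow only frontend pods to reach backend pods.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-to-backend
  namespace: team-a
spec:
  podSelector:
    matchLabels:
      app: backend
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: frontend
```

Note that NetworkPolicies only take effect if the cluster’s CNI plugin enforces them.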
For the API server:
- Configure high availability;
- Configure monitoring and alerting on requests;
- Isolate: do not expose the endpoint on the Internet; otherwise, for instance, SYN flood attacks could take place.
For etcd:
- Configure HA;
- Configure monitoring and alerting on requests;
- Isolate: make sure that only the control plane members can access it;
- As a plus, configure a dedicated etcd cluster, since etcd is one of the main bottlenecks, and to provide resilience from the other control plane components (e.g. if they are compromised).
- Configure rate limiting at the Ingress Controller level to limit connections and requests per second/minute per IP (for example with the NGINX ingress controller);
- Deny source IPs with Network policies.
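With the NGINX ingress controller, per-IP rate limiting can be sketched with annotations like these (host, service name, and thresholds are placeholders):

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web
  annotations:
    # Max requests per second accepted from a single client IP.
    nginx.ingress.kubernetes.io/limit-rps: "10"
    # Max concurrent connections allowed from a single client IP.
    nginx.ingress.kubernetes.io/limit-connections: "20"
spec:
  rules:
  - host: example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: web
            port:
              number: 80
```

Clients exceeding the thresholds receive an error response instead of reaching the backend, keeping a single noisy IP from starving the service.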
Then, besides following all the best practices, there can also be vulnerabilities in components that we generally consider already secure; so let’s sum up a couple of them.
CVE-2019–9512: Ping Flood with HTTP/2
The attacker hammers the HTTP/2 listener with a continuous flow of ping requests. To respond, the recipient starts queuing the responses, leading to growing queues and then to allocating more and more memory and CPU.
CVE-2019–9514: Reset Flood with HTTP/2
The attacker opens several streams to the server and sends invalid data through them. Having received invalid data, the server sends HTTP/2 RST_STREAM frames to the attacker to cancel the “invalid” streams. With lots of RST_STREAM responses, they start to queue. As the queue grows, more and more CPU and memory get allocated to the application until it eventually crashes.
Kubernetes has released the required patches to mitigate the issues mentioned above. The new versions were built using the patched versions of Go, so that the required fixes are applied to the net/http library.
- Kubernetes v1.15.3 - go1.12.9
- Kubernetes v1.14.6 - go1.12.
- Kubernetes v1.13.10 - go1.11.13
CVE-2020–8557: Node disk DOS
The /etc/hosts file mounted in a pod by kubelet is not included by the kubelet eviction manager when calculating ephemeral storage usage by a pod. If a pod writes a large amount of data to the /etc/hosts file, it could fill the storage space of the node. Affected versions:
- kubelet v1.18.0–1.18.5
- kubelet v1.17.0–1.17.8
- kubelet < v1.16.13
Fixed versions:
- kubelet master - fixed by #92916
- kubelet v1.18.6 - fixed by #92921
- kubelet v1.17.9 - fixed by #92923
- kubelet v1.16.13 - fixed by #92924
Prior to upgrading, this vulnerability can be mitigated by using PodSecurityPolicies or other admission webhooks to force containers to drop
CAP_DAC_OVERRIDE or to prohibit privilege escalation and running as root.
Consider anyway that these measures may break existing workloads that rely upon these privileges to function properly.
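A sketch of such a PodSecurityPolicy (whether PSPs are available depends on your cluster version and admission configuration; the policy name is an example):

```yaml
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: restricted-no-dac-override
spec:
  privileged: false
  allowPrivilegeEscalation: false   # prohibit privilege escalation
  requiredDropCapabilities:
  - DAC_OVERRIDE                    # force containers to drop CAP_DAC_OVERRIDE
  runAsUser:
    rule: MustRunAsNonRoot          # prohibit running as root
  seLinux:
    rule: RunAsAny
  supplementalGroups:
    rule: RunAsAny
  fsGroup:
    rule: RunAsAny
  volumes:
  - '*'
```

Remember that a PSP only applies to workloads whose service account is authorized to use it via RBAC.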
CVE-2020–8551: Kubelet DoS via API
kubelet has been found to be vulnerable to a denial of service attack via kubelet API, including the unauthenticated HTTP read-only API typically served on port 10255, and the authenticated HTTPS API typically served on port 10250.
Affected versions:
- kubelet v1.17.0 - v1.17.2
- kubelet v1.16.0 - v1.16.6
- kubelet v1.15.0 - v1.15.9
Fixed versions:
- kubelet v1.17.3
- kubelet v1.16.7
- kubelet v1.15.10
In order to mitigate this issue, limit access to the kubelet API or patch the kubelet.
CVE-2020–8552: Kubernetes API Server OOM
The API server has been found to be vulnerable to a denial of service attack via authorized API requests.
Affected versions:
- kube-apiserver v1.17.0 - v1.17.2
- kube-apiserver v1.16.0 - v1.16.6
- kube-apiserver < v1.15.10
Fixed versions:
- kube-apiserver v1.17.3
- kube-apiserver v1.16.7
- kube-apiserver v1.15.10
Prior to upgrading, this vulnerability can be mitigated by preventing unauthenticated or unauthorized access to all APIs and by ensuring that the API server automatically restarts if it OOMs.
CVE-2019–1002100: Kubernetes API Server JSON-patch parsing
Users that are authorized to make patch requests to the Kubernetes API server can send a specially crafted patch of type json-patch (e.g. kubectl patch --type json, or Content-Type: application/json-patch+json) that consumes excessive resources while processing, causing a denial of service on the API server.
Affected versions:
- Kubernetes v1.0.x-1.10.x
- Kubernetes v1.11.0–1.11.7
- Kubernetes v1.12.0–1.12.5
- Kubernetes v1.13.0–1.13.3
Fixed versions:
- Kubernetes v1.11.8
- Kubernetes v1.12.6
- Kubernetes v1.13.4
Prior to upgrading, this vulnerability can be mitigated by removing patch permissions from untrusted users.
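As an illustration of removing patch permissions, a namespaced Role that deliberately omits the patch verb (role, resource, and namespace names are examples):

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: deployment-editor-no-patch
  namespace: team-a
rules:
- apiGroups: ["apps"]
  resources: ["deployments"]
  # "patch" is intentionally left out of the verbs list,
  # so subjects bound to this Role cannot send patch requests.
  verbs: ["get", "list", "watch", "create", "update", "delete"]
```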
CVE-2019–11253: Kubernetes API Server JSON/YAML parsing
This is a vulnerability in the API server, allowing authorized users sending malicious YAML or JSON payloads to cause kube-apiserver to consume excessive CPU or memory, potentially crashing and becoming unavailable.
Prior to v1.14.0, default RBAC policy authorized anonymous users to submit requests that could trigger this vulnerability.
Clusters upgraded from a version prior to v1.14.0 keep the more permissive policy by default for backwards compatibility. Here you can find the more restrictive RBAC rules that can mitigate the issue.
Affected versions:
- Kubernetes v1.0.0–1.12.x
- Kubernetes v1.13.0–1.13.11
- Kubernetes v1.14.0–1.14.7
- Kubernetes v1.15.0–1.15.4
- Kubernetes v1.16.0–1.16.1
Fixed versions:
- Kubernetes v1.13.12
- Kubernetes v1.14.8
- Kubernetes v1.15.5
- Kubernetes v1.16.2
Consider that if you are running a version prior to v1.14.0, in addition to installing the restrictive policy, you should turn off autoupdate for the applied ClusterRoleBinding so that your changes aren’t replaced on an API server restart.
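A sketch of what disabling autoupdate on such a binding can look like; here the system:basic-user binding is restricted to authenticated users only (check the published mitigation for the exact roleRef/subjects before applying anything like this):

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: system:basic-user
  annotations:
    # Prevent the API server from restoring the default
    # (more permissive) binding on restart.
    rbac.authorization.kubernetes.io/autoupdate: "false"
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:basic-user
subjects:
- apiGroup: rbac.authorization.k8s.io
  kind: Group
  name: system:authenticated   # drops the system:unauthenticated group
```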
On the related GitHub issue you can find more details that I didn’t include here for conciseness.
So, that’s all folks! Following all these rules and applying the released patches is a good starting point for prevention, and it can also help with detection and remediation.
Stay tuned for the next and final episode about the E of STRIDE: Elevation of privilege!