Author CKA: Vincenzo Tagliavia (CKA, CKAD, CKS)
Last modified: 20 Dec 2024
Security is a top priority in Kubernetes environments, but it is hard. Whether in the cloud or on-premise, Kubernetes security involves proactive management and automation of many different processes and components. Because of this complexity, we've compiled a checklist of Kubernetes security best practices to help simplify and improve your security posture. We recommend taking small steps. Each step is like a layer of an onion: each layer provides its own isolation mechanisms and builds on the layers beneath it. With these layers in mind, we break our Kubernetes security best practices into 3 main architecture layers: 1. system, 2. orchestrator (cluster) and 3. application. Let's get started.
Kubernetes is an open-source platform to orchestrate containerized applications. As an orchestrator, Kubernetes offers powerful abstractions on top of the underlying operating system. For example, Kubernetes uses the host cgroups to manage resource capacity for containers. It can also configure workloads to make use of CPU "pools" for latency-sensitive applications. Or it can use the local file system to persist data out-of-the-box. With these abstractions, Kubernetes allows you to design highly available, scalable and self-healing systems.
The Kubernetes API server exposes a RESTful HTTP API to configure and run workloads. To configure workloads, authenticated clients can interact with the server through the native command-line tool (kubectl), or through Helm, the Kubernetes package manager, for more complex use cases. To run workloads, the API server interacts with other components, such as controllers, that constantly observe and act upon changes in application state. In short, the API server acts like the brain of a much larger ecosystem, also called a "cluster".
Besides the technical aspects of a cluster, Kubernetes provides substantial benefits that boost operational efficiency. For example, Kubernetes supports self-healing and scalability features that help reduce toil and manual intervention. With self-healing, Kubernetes can automatically replace or restart containers that fail. It can reschedule containers when nodes die. It can kill containers that do not respond to user-defined health checks. As for scalability, Kubernetes allows users to scale their applications up or down with minimal intervention. These capabilities make Kubernetes an essential tool for modern DevOps practices and cloud-native application development.
Yes. Kubernetes provides authentication, authorization, admission controls and auditing mechanisms (AAAA) to secure your cluster. However, security configurations will be different depending on the deployment model you use (i.e. cloud-based, on-premise or hybrid).
In a cloud-managed service model, Kubernetes security is shared between the cloud provider and you. This means that you still need to allocate internal resources to security so that you cover what the cloud provider doesn't. In contrast, the on-premise/bare-metal deployment model shifts all the security concerns over to you. You still need to allocate resources, but securing native Kubernetes also requires specialized knowledge that you either build in-house or outsource. The cost/benefit analysis of using one deployment model or the other will be contingent upon other factors besides security.
To summarize, Kubernetes supports various security controls whether in the cloud, hybrid or full on-premise environments. These security controls offer different levels of granularity at different layers of your infrastructure. Whichever deployment model you use, knowledge and core competencies in these different environments will directly impact your security posture. Why do Kubernetes security breaches come mainly from misconfigurations? Knowledge (or a lack thereof) is the answer.
1. Misconfiguration
Lack of specialized knowledge and human error are the most common causes of Kubernetes security breaches. The Kubernetes architecture is designed to orchestrate containers. More advanced features require deep know-how of the platform. Take Admission Controllers, for example. As of Kubernetes v1.31, there are more than 30 native Admission Controllers that support several use cases to validate or modify requests. Depending on the Kubernetes distribution you use, only a few Admission Controllers are enabled during cluster initialization. This means that your Kubernetes administrators must enable the rest based on business and technical requirements.
2. Encryption & Secrets Management
Kubernetes Secrets aren't encrypted by default. Secret objects instead use base64 encoding, which anyone who can read the object can reverse. This means that the default configuration for Secret objects leaves the Kubernetes cluster open to a wide range of exploits. We recommend 4 immediate steps to mitigate this issue. First, enforce strict RBAC rules to restrict access for human users and system components; GET, WATCH and LIST permissions on Secrets must be explicitly allow-listed. Second, enforce encryption at rest for Secrets stored in etcd, which is disabled by default. Third, enforce TLS so that Secrets don't travel to etcd in plain text. Fourth, use a third-party secret provider (recommended) to move secret storage and management outside the cluster. This is not an exhaustive list of recommendations, but it will get you started.
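To make the base64 point concrete, here is a minimal Python sketch. The Secret manifest is a made-up example (the name `db-credentials` and its value are hypothetical); it only illustrates that no key is needed to recover the original value from a Secret's `data` field.

```python
import base64

# Hypothetical Secret manifest, shaped like what the API returns.
# The name and the value are made up for illustration.
secret_manifest = {
    "apiVersion": "v1",
    "kind": "Secret",
    "metadata": {"name": "db-credentials"},
    "type": "Opaque",
    "data": {"password": base64.b64encode(b"s3cr3t-value").decode("ascii")},
}


def decode_secret_data(manifest: dict) -> dict:
    """Return the Secret's data with every value base64-decoded to text."""
    return {
        key: base64.b64decode(value).decode("utf-8")
        for key, value in manifest.get("data", {}).items()
    }


if __name__ == "__main__":
    # Decoding is a plain, reversible transformation: no key is involved.
    print(decode_secret_data(secret_manifest))
```

Anyone with read access to the object, or to an unencrypted etcd backup, can do the same, which is why the access and encryption controls matter.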
3. Weak Role-Based Access Control
Kubernetes Role-Based Access Control (RBAC) is a security control that lets you grant individual users or service accounts only the minimum permissions they need to operate on specific namespaces and workloads. Without RBAC management, users and service accounts can run cluster operations across resources they shouldn't have access to. RBAC therefore allows cluster administrators to implement the Principle of Least Privilege (PoLP) and, in doing so, think proactively about Kubernetes security best practices.
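As an illustration of least privilege in practice, the sketch below scans a Role or ClusterRole manifest (as a Python dict) for rules that are broader than they need to be. The function name `find_risky_rules`, the example Role and the choice of what counts as "risky" are our own assumptions for this sketch, not an official Kubernetes check.

```python
# Verbs that expose Secret contents when granted on the "secrets" resource.
SECRET_READ_VERBS = {"get", "list", "watch", "*"}


def find_risky_rules(role: dict) -> list:
    """Return human-readable findings for overly broad rules in a Role/ClusterRole."""
    findings = []
    for index, rule in enumerate(role.get("rules", [])):
        groups = set(rule.get("apiGroups", []))
        verbs = set(rule.get("verbs", []))
        resources = set(rule.get("resources", []))
        if "*" in verbs:
            findings.append(f"rule {index}: wildcard verb")
        if "*" in resources:
            findings.append(f"rule {index}: wildcard resource")
        # Secrets live in the core API group, written as "" in RBAC rules.
        in_core_group = bool(groups & {"", "*"})
        if in_core_group and resources & {"secrets", "*"} and verbs & SECRET_READ_VERBS:
            findings.append(f"rule {index}: can read Secrets")
    return findings


# Hypothetical Role used only to exercise the check.
example_role = {
    "apiVersion": "rbac.authorization.k8s.io/v1",
    "kind": "Role",
    "metadata": {"name": "app-reader", "namespace": "demo"},
    "rules": [
        {"apiGroups": [""], "resources": ["pods"], "verbs": ["get", "list"]},
        {"apiGroups": [""], "resources": ["secrets"], "verbs": ["list"]},
        {"apiGroups": ["apps"], "resources": ["*"], "verbs": ["*"]},
    ],
}

if __name__ == "__main__":
    for finding in find_risky_rules(example_role):
        print(finding)
```

A check like this fits naturally in a CI step that reviews manifests before they reach the cluster.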
We outline the baseline recommendations and Kubernetes RBAC best practices as follows:
4. Supply Chain Vulnerabilities
Supply chain vulnerabilities creep into your development lifecycle from different sources. They are upstream problems that your DevSecOps team imports into your artifacts. You tackle supply chain risks by taking control of the application development lifecycle. Build, distribution, and deployment phases are all different but interrelated processes of the application lifecycle.
Each phase in the application development lifecycle requires its own tools and strategies. For example, the build phase should only use minimal base images that contain the minimum amount of dependencies required to run (e.g. Alpine Linux-based images). Distroless images are another possible solution. The distribution and deployment phases, as another example, should only push and pull images to and from allow-listed registries you can trust. Static analysis of file systems, binaries, images and configuration files substantially improves your security posture internally, upstream and downstream. In short, if we can't manage what happens on the outside, we harden the inside.
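The registry allow-list idea can be sketched in a few lines of Python. The registry names in `TRUSTED_REGISTRIES` are placeholders, and the parsing follows the common Docker convention for image references rather than a full reference parser, so treat this as an illustration of the check, not a production validator.

```python
# Hypothetical allow-list; replace with the registries your organization trusts.
TRUSTED_REGISTRIES = {"registry.example.com", "ghcr.io"}


def registry_of(image: str) -> str:
    """Extract the registry host from an image reference.

    Common Docker convention: the first path component is a registry only if
    it contains "." or ":" or equals "localhost"; otherwise the image is
    assumed to come from Docker Hub (docker.io).
    """
    first, separator, _rest = image.partition("/")
    if separator and ("." in first or ":" in first or first == "localhost"):
        return first
    return "docker.io"


def is_trusted(image: str, trusted=frozenset(TRUSTED_REGISTRIES)) -> bool:
    """Return True when the image comes from an allow-listed registry."""
    return registry_of(image) in trusted


if __name__ == "__main__":
    for image in ("nginx:1.27", "registry.example.com/team/app:1.0"):
        print(image, "trusted" if is_trusted(image) else "blocked")
```

In a real cluster the same rule is usually enforced by an admission policy, so untrusted images are rejected at deploy time rather than only flagged in CI.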
1. Install open-source security forensic tools
If you have limited resources to support security processes in your organization, install a security forensic tool, such as kubesec or Falco. The Center for Internet Security (CIS) Kubernetes Benchmarks provide a rich set of security checks and best practices for Kubernetes environments.
Alternatively, if your security posture is more proactive and you design policies that support your security processes, the Open Policy Agent (OPA) is an additional open-source tool to consider. OPA has a learning curve but its advantages stack up in your favor.
2. Apply redundant and defense-in-depth features
Each layer in your Kubernetes architecture should include both redundant security measures and defense-in-depth strategies. When you combine and apply security frameworks such as AAA, the Principle of Least Privilege, or Zero-Trust architecture, you strengthen your security posture considerably. Redundancy and defense-in-depth together have a compounding effect.
Redundancy requires duplication of security resources to support failover and high availability. Redundancy, however, increases your resource consumption and may impact performance and throughput in some cases. This trade-off depends on your design decisions and specific requirements.
Defense-in-Depth involves different security measures to enhance system robustness and resilience. Like with redundancy measures, you need to consider trade-offs at the design stage.
3. Focus on Supply Chain Vulnerabilities, not just Kubernetes
The CNCF Security Model is an extension of the CISA Security Whitepaper and represents a DevOps pipeline model with four interrelated phases: development, distribution, deployment and runtime.
Each phase requires its own security attestations and controls. But what happens if you have no direct control over these phases? Security integrations and decentralized processes become major pain points. These pain points will determine your security implementations at runtime.
Kubernetes security best practices is a curated list of "best-of-breed" recommendations from security specialists in the Kubernetes space. For simplicity, we break Kubernetes security into 3 main parts reflecting each Kubernetes architecture layer: system, cluster and application layers. Each layer has its own peculiarities and challenges.
System security includes hardware components and the host OS:
1. Use hardened OS images
Hardened OS images are security-enhanced operating systems that reduce the attack surface. For example, such operating systems can restrict write operations or ship only the binaries needed to perform essential functions. Talos Linux and Bottlerocket from AWS are two examples of hardened operating systems.
2. Employ security-enhanced kernel modules
Enable SELinux, AppArmor or seccomp to restrict what workloads can ask of the kernel. SELinux and AppArmor enforce Mandatory Access Control (MAC) in addition to the OS’s default Discretionary Access Control (DAC), which is more permissive, while seccomp filters the system calls a process may make. These mechanisms are more difficult to operationalize at scale, so specific expertise is key to implementing them.
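As a small example of checking this from the workload side, the sketch below inspects a Pod manifest (as a Python dict) and lists containers with no seccomp profile in effect. It assumes that a container without any profile runs unconfined, which is the behavior when the kubelet is not configured to apply a default profile; the function name and the example Pod are hypothetical.

```python
def containers_without_seccomp(pod: dict) -> list:
    """Return names of containers with no seccomp profile, judging by the manifest.

    A container-level seccompProfile overrides the pod-level one. A missing
    profile or type "Unconfined" is treated as no protection (an assumption
    that holds when the kubelet applies no default profile).
    """
    spec = pod.get("spec", {})
    pod_profile = spec.get("securityContext", {}).get("seccompProfile", {})
    missing = []
    for container in spec.get("containers", []):
        own_profile = container.get("securityContext", {}).get("seccompProfile")
        profile = own_profile or pod_profile
        if profile.get("type", "Unconfined") == "Unconfined":
            missing.append(container["name"])
    return missing


# Hypothetical Pod: one container opts into the runtime's default profile.
example_pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "demo"},
    "spec": {
        "containers": [
            {
                "name": "app",
                "image": "registry.example.com/app:1.0",
                "securityContext": {"seccompProfile": {"type": "RuntimeDefault"}},
            },
            {"name": "sidecar", "image": "registry.example.com/sidecar:1.0"},
        ]
    },
}

if __name__ == "__main__":
    print(containers_without_seccomp(example_pod))
```

Setting the profile once at the pod level (`spec.securityContext.seccompProfile`) covers every container that does not override it.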
3. Incorporate hardware security modules
Hardware Security Modules (HSM) and Trusted Platform Modules (TPM) are examples of hardware modules. These modules are tough to compromise because they operate at the hardware level and provide cryptographic protection and isolation for your workloads. Hardware modules require cost/benefit analysis and specific knowledge to implement. Another factor to consider is potential compatibility issues with some operating systems.
Security at this layer involves control plane configuration and hardening. The main components that require interventions are the Kubernetes API Server, ETCD and the Kubelet. The following list is not exhaustive.
1. Implement Native Admission Controllers
Admission Controllers enforce security policies by validating or modifying requests to the cluster after those requests have been authenticated and authorized. These plugins allow more fine-grained control than authentication and authorization alone. Kubernetes provides Admission Controllers natively, but you can also use external policy frameworks (e.g. OPA).
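To show what a validating policy decides, here is a minimal Python sketch of the logic behind a validating admission webhook. It takes an `AdmissionReview` request body (as a dict) and builds the matching response, rejecting Pods that ask for privileged containers. The function name `review_pod` and the policy itself are our own example; serving it over HTTPS and registering it with the cluster are left out.

```python
def review_pod(admission_review: dict) -> dict:
    """Build an AdmissionReview response that rejects privileged containers."""
    request = admission_review["request"]
    pod = request.get("object", {})
    privileged = [
        container["name"]
        for container in pod.get("spec", {}).get("containers", [])
        if container.get("securityContext", {}).get("privileged", False)
    ]
    # The response must echo the request uid so the API server can match it.
    response = {"uid": request["uid"], "allowed": not privileged}
    if privileged:
        response["status"] = {
            "code": 403,
            "message": "privileged containers are not allowed: " + ", ".join(privileged),
        }
    return {
        "apiVersion": "admission.k8s.io/v1",
        "kind": "AdmissionReview",
        "response": response,
    }


if __name__ == "__main__":
    # Hypothetical request, trimmed to the fields this sketch reads.
    sample = {
        "apiVersion": "admission.k8s.io/v1",
        "kind": "AdmissionReview",
        "request": {
            "uid": "example-uid",
            "object": {
                "spec": {
                    "containers": [
                        {"name": "app", "securityContext": {"privileged": True}}
                    ]
                }
            },
        },
    }
    print(review_pod(sample)["response"])
```

Policy frameworks such as OPA express the same kind of rule declaratively, which is easier to review and reuse than hand-written webhook code.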
2. Enable Kubernetes API Server audit logging
This requires configuration of the Kubernetes API Server, and the resulting logs are compatible with some open-source static and dynamic scanners. Audit logging only needs a reference to a policy file and a backend volume for storage. With knowledge of Linux systems, log management is much simpler.
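For illustration, the sketch below builds a small audit policy as a Python dict and renders it as JSON. The rule set is an example of ours, not a recommended baseline. We assume the rendered file is then referenced through the API server's `--audit-policy-file` flag, with `--audit-log-path` pointing at the storage location; since JSON is a subset of YAML, the output can be stored as the policy file, but verify this against your distribution.

```python
import json

# Example policy: rules are evaluated in order and the first match wins.
audit_policy = {
    "apiVersion": "audit.k8s.io/v1",
    "kind": "Policy",
    # Skip the early stage to avoid logging every request twice.
    "omitStages": ["RequestReceived"],
    "rules": [
        # Record who touched Secrets and ConfigMaps, but never their payloads.
        {
            "level": "Metadata",
            "resources": [{"group": "", "resources": ["secrets", "configmaps"]}],
        },
        # Record request bodies for changes to RBAC objects.
        {
            "level": "Request",
            "verbs": ["create", "update", "patch", "delete"],
            "resources": [{"group": "rbac.authorization.k8s.io"}],
        },
        # Catch-all: log metadata for everything else.
        {"level": "Metadata"},
    ],
}


def render_policy(policy: dict) -> str:
    """Serialize the policy so it can be written to the policy file."""
    return json.dumps(policy, indent=2)


if __name__ == "__main__":
    print(render_policy(audit_policy))
```

Keeping Secrets at the `Metadata` level matters: a more verbose level would copy secret values into the audit log itself.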
3. Use a Kubernetes Administrator (CKA or CKS)
We recommend that you consult a Kubernetes specialist to configure and harden your clusters. Running Kubernetes on-premise or on bare metal is a completely different beast from managing Kubernetes in a cloud environment.
You secure your applications by managing supply chain and application lifecycle stages: development, distribution, deployment and runtime. Multi-tenancy setups and distributed teams are common challenges.
1. Adopt the 12-Factor-App development framework
The 12-Factor-App development framework is a set of best practices and guidelines to ship high-quality code. If you have control over development (the first phase of CNCF's Security Model), this framework supports portability, maintainability and testability. It can slow down development and has a learning curve.
2. Adopt a “shift-left” security approach
Integrate security measures early in the development process. For example, use container scanning tools to detect vulnerabilities in images before they are deployed. Or enforce checks on open-source dependencies to ensure only high-quality inputs enter your code.
3. Implement Network Policies
Think of Network Policies as native lightweight firewall rules. They allow you to control how traffic flows into, out of and within the cluster. Network Policies operate at OSI Layers 3 and 4. You can specify rules by namespace, pod labels and CIDR block ranges.
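A common starting point is to deny all traffic in a namespace and then open only what is needed. The sketch below builds both manifests as Python dicts and prints them as JSON. The helper names, the `app` label key and the example values are assumptions for illustration; note also that Network Policies only take effect when the cluster's network plugin enforces them.

```python
import json


def default_deny_policy(namespace: str) -> dict:
    """Deny all ingress and egress for every pod in the namespace."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "default-deny-all", "namespace": namespace},
        "spec": {
            # An empty podSelector selects all pods in the namespace.
            "podSelector": {},
            "policyTypes": ["Ingress", "Egress"],
        },
    }


def allow_ingress_policy(namespace: str, target_app: str, client_app: str, port: int) -> dict:
    """Allow TCP traffic on one port from client pods to target pods."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": f"allow-{client_app}-to-{target_app}", "namespace": namespace},
        "spec": {
            "podSelector": {"matchLabels": {"app": target_app}},
            "policyTypes": ["Ingress"],
            "ingress": [
                {
                    "from": [{"podSelector": {"matchLabels": {"app": client_app}}}],
                    "ports": [{"protocol": "TCP", "port": port}],
                }
            ],
        },
    }


if __name__ == "__main__":
    # Hypothetical namespace and labels.
    print(json.dumps(default_deny_policy("demo"), indent=2))
    print(json.dumps(allow_ingress_policy("demo", "api", "frontend", 8080), indent=2))
```

Policies are additive: once the default-deny policy selects a pod, each allow policy punches one narrow hole, which keeps the rule set easy to audit.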
Use a service mesh for more complex requirements. For example, if you require more control over TLS or if logging network security events is crucial to your organization, service meshes support both and much more.
Security implementations don’t act in a vacuum. If your strategy doesn’t prioritize and support security, for example with the right policies, processes and people, implementations will be affected.
The majority of security breaches are caused by misconfigurations. Therefore, if we invest in knowledge and talent, we can prevent many of these breaches.
DevSecOps supports cross-team collaboration and security integrations. If you support DevSecOps at each stage of your application lifecycle, you improve your security posture.
AI tools predict, analyze and identify patterns across millions of data points. This can be a source of competitive advantage for years to come.
AI/ML models can learn from extensive datasets of known vulnerabilities to predict and identify potential security issues in code or configuration files. If we feed these models with new knowledge and new security breaches as they happen, we could build the next generation of security tools with more precision and power than before.
Kubernetes has security mechanisms baked into the platform but we have to configure and implement them. Lack of know-how – resulting in misconfigurations – and poor allocation of resources to security are common causes of security breaches.
Why are security concerns growing?
Because we lack high-quality specialists and meticulous design in infrastructure deployments. We suggest Kubernetes security best practices with a simplified framework that helps secure your infrastructure. But business policies and processes must support implementations. Stopping or slowing down projects will not make your cluster more secure. Quite the contrary.