Security Features
VCPS uses and supports security features on different levels of the deployment stack. The first level is the infrastructure level. This means physical or virtual (networking) resources up to the operating system stack of the (virtual) machines.
The second level is the Kubernetes cluster itself. There is a wide range of options for operators or users of Kubernetes to specify what containerized workloads are allowed to do or what they are allowed to look like.
Infrastructure Level
For all supported cloud providers (and for on-prem deployments where applicable), care is taken to create infrastructure that minimizes the risk of security breaches. This is done on the network and machine level.
The goal on the network level is to prevent unauthorized access to services running on the control plane or worker nodes. On the machine level, unauthorized access should likewise be prevented, and software or libraries with known security vulnerabilities should not be run.
Network Infrastructure
The following image illustrates a typical infrastructure topology that gets created by VCPS:
Each VCPS base cluster consists of a control plane and at least one worker node pool. The control plane is deployed in a high availability configuration and is made up of a number of (virtual) control plane machines that run the control plane components. Each node pool consists of a number of worker nodes that run the node components.
Where possible all virtual machines in this topology are placed in a private subnet that is not directly connected to the public internet. Communication between worker nodes (pod-to-pod traffic, etc.) is routed within the private subnet using the private IP addresses of the machines.
Communication with the Kubernetes API is done via a load balancer. This load balancer provides a stable IP for accessing the Kubernetes API even if control plane machines are replaced or become temporarily unavailable. The load balancer is also the entrypoint for external access, because the firewall configuration created by VCPS only allows traffic from the internet to reach the load balancer.
Even if the control plane machines or worker nodes have a public IP address (this depends on the chosen infrastructure provider), the firewall configuration restricts access to network ports on those machines to SSH for remote management.
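The firewall behavior described above can be pictured as a small decision function. The following is only an illustrative model, not the actual firewall configuration that VCPS creates: the `Packet` type, the target names, and the assumption that traffic inside the private subnet is unfiltered are all hypothetical simplifications.

```python
from dataclasses import dataclass

SSH_PORT = 22  # standard SSH port, used for remote management


@dataclass(frozen=True)
class Packet:
    """Hypothetical, simplified description of an incoming connection."""
    from_internet: bool  # True if the source is outside the private subnet
    target: str          # "load_balancer", "control_plane" or "worker"
    port: int


def is_allowed(packet: Packet) -> bool:
    """Return True if this simplified firewall model lets the packet through."""
    if not packet.from_internet:
        # Assumption: traffic inside the private subnet (pod-to-pod, etc.)
        # is not filtered in this model.
        return True
    if packet.target == "load_balancer":
        # The load balancer is the entrypoint for external access.
        return True
    # Machines that happen to have a public IP only expose SSH.
    return packet.port == SSH_PORT
```

In this model a connection from the internet to a worker node on any port other than 22 is dropped, while the same connection from inside the private subnet is accepted.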
Machine Hardening
Each (virtual) machine for control plane or worker nodes is configured to increase security. SSH is configured to only allow public key authenticated connections. The authorized SSH key is unique per cluster and either generated specifically for the cluster by VCPS (managed) or provided by the user (self-managed).
Additionally, machines are configured to perform unattended security upgrades, including reboots if necessary. Unattended upgrades and automatic reboots carry some risk, because a defective upgrade provided by the operating system maintainers might leave machines unable to boot properly. VCPS mitigates this risk by automatically provisioning new (virtual) machines whenever a persistent failure is detected. This means that under normal circumstances the cluster should be able to heal itself without downtime.
Cluster Level
VCPS makes use of several Kubernetes security features to allow users fine-grained control over how much to restrict access to resources used by their containerized workloads. Additionally VCPS configures Kubernetes to ensure data security by encrypting all sensitive data at rest.
Pod Security Standards
Pod Security Standards define policies that determine which features and access levels a given pod can use. These policies are cumulative and range from least to most restrictive. The following policies are available:
Policy | Description |
---|---|
Privileged | Unrestricted policy that allows all levels of access. This is the least secure policy. |
Baseline | Minimally restrictive policy that limits certain aspects of pod access to prevent known privilege escalations. |
Restricted | Most restrictive policy that follows current best practices. Some workloads might not work correctly with this policy. |
Depending on the chosen policy, features like access to volume mounts of the host machine filesystem or processes inside the container running as root might be unavailable. An exhaustive list of restricted features can be found on this page.
Pod Security Admission
To enforce one of the Pod Security Standards described above, the built-in Pod Security admission controller is used. Whenever a pod is created or edited, this component checks the pod specification for compliance with the chosen Pod Security Standard, if one is set.
Pod Security Standards are chosen per namespace and configured by setting specific labels on the Kubernetes namespace resource.
The main label for configuration is the following:
# The per-mode level label indicates which policy level to apply for the mode.
#
# MODE must be one of `enforce`, `audit`, or `warn`.
# LEVEL must be one of `privileged`, `baseline`, or `restricted`.
pod-security.kubernetes.io/<MODE>: <LEVEL>
The chosen mode determines what happens if the given pod specification violates the chosen policy:
Mode | Result |
---|---|
enforce | Violating pod specifications will be rejected. |
audit | Violations will be recorded in the audit log but will otherwise be allowed. |
warn | Violations will result in a user-facing warning but will otherwise be allowed. |
More details can be found here.
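To make the interplay of labels, modes, and levels more concrete, here is a minimal sketch in Python. It builds the namespace labels described above and models the outcome of each mode. The function names are hypothetical, and the `violates` helper is a deliberate simplification: it assumes a pod can be summarized by the most restrictive level it complies with, whereas the real admission controller checks individual fields of the pod specification.

```python
MODES = ("enforce", "audit", "warn")
# Ordered from least to most restrictive.
LEVELS = ("privileged", "baseline", "restricted")


def pod_security_labels(settings: dict[str, str]) -> dict[str, str]:
    """Translate {mode: level} into namespace labels, validating both parts."""
    labels = {}
    for mode, level in settings.items():
        if mode not in MODES:
            raise ValueError(f"unknown mode: {mode!r}")
        if level not in LEVELS:
            raise ValueError(f"unknown level: {level!r}")
        labels[f"pod-security.kubernetes.io/{mode}"] = level
    return labels


def namespace_manifest(name: str, settings: dict[str, str]) -> dict:
    """Build a Namespace manifest (as a dict) carrying the labels."""
    return {
        "apiVersion": "v1",
        "kind": "Namespace",
        "metadata": {"name": name, "labels": pod_security_labels(settings)},
    }


def violates(pod_level: str, required_level: str) -> bool:
    """Simplification: pod_level is the strictest level the pod satisfies."""
    return LEVELS.index(pod_level) < LEVELS.index(required_level)


def evaluate(settings: dict[str, str], pod_level: str) -> tuple[bool, list[str]]:
    """Return (admitted, side_effects) for a pod in a namespace."""
    admitted, side_effects = True, []
    for mode, level in settings.items():
        if not violates(pod_level, level):
            continue
        if mode == "enforce":
            admitted = False           # violating pods are rejected
        else:
            side_effects.append(mode)  # "audit" or "warn": allowed, but noted
    return admitted, side_effects
```

For example, a namespace configured with `{"enforce": "baseline", "warn": "restricted"}` would admit a pod that only satisfies `baseline`, but the user would see a warning; a pod that only satisfies `privileged` would be rejected.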
Kubernetes API Access Control
To secure communication with the Kubernetes API on the transport level TLS is used. The certificate used for this can be signed by a private certificate authority (CA) as all common API clients (e.g. kubectl) are able to validate TLS certificates using a private CA.
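As an illustration of how a client can validate a certificate signed by a private CA, the following sketch builds a TLS context with Python's standard `ssl` module. The function name and the `ca_file` parameter are hypothetical; the path to the CA certificate would come from the cluster's client configuration.

```python
import ssl
from typing import Optional


def api_tls_context(ca_file: Optional[str] = None) -> ssl.SSLContext:
    """Build a client-side TLS context for talking to the Kubernetes API.

    ca_file is a hypothetical path to the PEM-encoded certificate of the
    private CA. If it is None, the system trust store is used instead.
    """
    # create_default_context enables certificate verification and hostname
    # checking. When cafile is given, only that CA bundle is trusted.
    return ssl.create_default_context(ssl.Purpose.SERVER_AUTH, cafile=ca_file)
```

A connection made with this context fails during the TLS handshake if the server certificate is not signed by the trusted CA or does not match the requested hostname.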
Once a secure connection is established, performing any operation on the Kubernetes API, either as a user or as a service, entails a series of internal steps that are illustrated in the following diagram (source):
The first step of the process is authentication (1). The task here is to establish the identity of the client. Kubernetes supports a variety of mechanisms for authentication like:
- client TLS certificates
- passwords
- plain tokens
- bootstrap tokens
- JSON Web Tokens (JWT)
More details can be found here. If the client is unable to present valid credentials according to one of the configured mechanisms, its request will be denied with a 401 HTTP status code. If the authentication is successful, part of the result of this step is the username of the client, which will then be used by subsequent steps for their decision making.
The second step is authorization (2). Each client request contains the client username, the operation (e.g. create or update) and the Kubernetes resource(s) this operation should be applied to. In this step Kubernetes determines whether the combination of operation and Kubernetes resource should be allowed for the given client. Again there are multiple supported authorization modules. VCPS uses role-based access control (RBAC) in the base clusters.
In RBAC mode the cluster configures Roles or ClusterRoles. These roles define allowed operations on a certain set of resources and are then assigned to certain clients by using RoleBindings or ClusterRoleBindings. Only combinations of operation and resource that are explicitly mentioned in one of the assigned roles are allowed for a given client. Permissions granted by roles are purely additive: a role cannot explicitly disallow an operation, it can only allow additional ones. If the client tries to perform an operation on a resource that is not allowed by any of its roles, the request will be denied with a 403 HTTP status code.
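The additive nature of RBAC can be sketched in a few lines of Python. This is a toy model, not the Kubernetes implementation: it ignores namespaces, API groups, groups, and service accounts, and the `Rule`, `Role`, and `is_authorized` names are hypothetical. Bindings are represented as a plain mapping from username to assigned roles.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    verbs: frozenset      # allowed operations, e.g. {"get", "list"}
    resources: frozenset  # resource types, e.g. {"pods"}


@dataclass(frozen=True)
class Role:
    name: str
    rules: tuple  # tuple of Rule objects


def is_authorized(bindings: dict, user: str, verb: str, resource: str) -> bool:
    """Allow a request only if some rule of some bound role matches it."""
    for role in bindings.get(user, ()):
        for rule in role.rules:
            if verb in rule.verbs and resource in rule.resources:
                return True
    # There are no deny rules: the absence of a matching allow rule is what
    # leads to the 403 response.
    return False


# Hypothetical example data.
pod_reader = Role("pod-reader", (Rule(frozenset({"get", "list"}), frozenset({"pods"})),))
bindings = {"alice": [pod_reader]}
```

With these bindings, `alice` may `get` or `list` pods, but a `delete` on pods, or any request from a user without a binding, is not authorized. Granting more access means binding an additional role; there is no way to subtract a permission.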
The third step is admission control (3). In this step additional admission controllers (normally running within the cluster) can make the final determination whether the client request should be approved or denied. This mechanism can, for example, be used to enforce certain policies, as is done by the Pod Security admission controller. Only once all configured admission controllers have approved the request is it accepted and the changes to the Kubernetes resource written to the object store (4).
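The order of the steps described in this section can be summarized as a short Python sketch. It is a simplified model rather than the API server's actual code: `handle_request` and its callback parameters are hypothetical, and since this document does not state which status an admission rejection produces, that outcome is returned as a plain description.

```python
from typing import Callable, Iterable, List, Optional


def handle_request(
    request: dict,
    authenticate: Callable[[dict], Optional[str]],
    authorize: Callable[[str, str, str], bool],
    admission_controllers: Iterable[Callable[[dict], bool]],
    object_store: List[dict],
) -> str:
    """Run a request through the steps in order and report the outcome."""
    # (1) Authentication: establish the identity of the client.
    username = authenticate(request)
    if username is None:
        return "401 Unauthorized"
    # (2) Authorization: is this operation on this resource allowed?
    if not authorize(username, request["verb"], request["resource"]):
        return "403 Forbidden"
    # (3) Admission control: every configured controller must approve.
    if not all(admit(request) for admit in admission_controllers):
        return "rejected by admission control"
    # (4) Persist the change in the object store.
    object_store.append(request)
    return "stored"


# Hypothetical wiring, for illustration only.
tokens = {"token-abc": "alice"}
store: List[dict] = []
result = handle_request(
    {"token": "token-abc", "verb": "create", "resource": "pods"},
    authenticate=lambda req: tokens.get(req.get("token")),
    authorize=lambda user, verb, res: (user, verb, res) == ("alice", "create", "pods"),
    admission_controllers=[lambda req: not req.get("privileged", False)],
    object_store=store,
)
```

Each step only runs if the previous one succeeded, so an unauthenticated request never reaches authorization, and nothing is written to the object store unless every admission controller approves.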