Running containers as root has long been a security concern in self-hosted infrastructure. When a container process runs as root on the host, a container escape vulnerability can grant full system-level access. Rootless container technology greatly reduces this risk by running the runtime and its containers under an unprivileged user account, so a successful escape yields only that user's privileges rather than root.
Three technologies commonly enable rootless containers on Linux: RootlessKit for user and mount namespaces, fuse-overlayfs for unprivileged filesystem overlays, and slirp4netns for unprivileged networking. They serve different purposes, and rootless setups of Docker, nerdctl, and Podman draw on some or all of them.
In this guide, we examine each technology, explain how they work together, and provide deployment configurations for self-hosted rootless container infrastructure.
The Rootless Container Problem
Traditional container runtimes (Docker, containerd) require root privileges because they need to:
- Create user namespaces: mapping container UIDs to host UIDs requires CAP_SYS_ADMIN
- Mount overlay filesystems: the Linux overlayfs driver historically required root
- Configure networking: creating virtual Ethernet pairs and bridge interfaces requires root
- Set cgroup limits: resource accounting and limits are managed by the root-owned cgroup hierarchy
Each of these requirements represents a potential attack surface. If an attacker exploits a container escape vulnerability, they gain root access to the host.
Rootless container technology addresses the first three of these requirements with unprivileged alternatives, one per requirement. Cgroup limits are the exception: they depend on cgroup v2, as noted in the FAQ.
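Whether a host is ready for these alternatives can be checked from a shell. The sketch below is an illustration, not an official checklist. It assumes the conventional `/etc/subuid` format (`user:start:count`) and the `newuidmap`/`newgidmap` helpers from the shadow package, neither of which is named in the text above. The function names are made up for this example.

```shell
# Print "start count" for a user's first subordinate-ID range.
# File format (one range per line): user:start:count
subid_range() {
  awk -F: -v u="$1" '$1 == u { print $2, $3; exit }' "${2:-/etc/subuid}" 2>/dev/null
}

# Report the pieces that rootless runtimes commonly rely on.
check_rootless_prereqs() {
  user="$(id -un)"
  range="$(subid_range "$user" || true)"
  echo "subuid range for $user: ${range:-none}"
  if [ -e /dev/fuse ]; then echo "/dev/fuse: present"; else echo "/dev/fuse: missing"; fi
  for tool in newuidmap newgidmap; do
    if command -v "$tool" >/dev/null 2>&1; then echo "$tool: found"; else echo "$tool: not found"; fi
  done
}

check_rootless_prereqs
```

A missing subuid range or a missing `/dev/fuse` does not always rule out rootless operation, but it usually explains why a runtime falls back to a more limited mode.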
Comparison Table
| Feature | RootlessKit | fuse-overlayfs | slirp4netns |
|---|---|---|---|
| Purpose | User/mount namespace setup | Unprivileged overlay filesystem | Unprivileged networking |
| GitHub Repo | rootless-containers/rootlesskit | containers/fuse-overlayfs | rootless-containers/slirp4netns |
| Stars | ~1,256 | ~657 | ~911 |
| Maintainer | rootless-containers project | containers (Podman) project | rootless-containers project |
| Language | Go | C | C |
| License | Apache 2.0 | GPL-2.0-or-later | GPL-2.0-or-later |
| Requires FUSE | No | Yes | No |
| Kernel Version | Linux 3.8+ | Linux 4.18+ (FUSE overlay) | Linux 3.8+ |
| Used By | Docker Rootless, nerdctl (Podman reuses its port forwarder) | Podman, Docker Rootless, Buildah | Podman, Docker Rootless, nerdctl |
| Networking | Port forwarding only | N/A | Full TCP/UDP/ICMP stack |
| Performance | Minimal overhead | ~10-15% slower than native overlay | ~5-10% slower than native veth |
RootlessKit
RootlessKit is the foundational namespace setup tool for rootless containers. It creates the user namespace, mount namespace, and network namespace that allow an unprivileged user to run containers. Think of it as the “entry point” that establishes the sandbox environment.
How It Works
RootlessKit performs three key operations:
- User namespace creation: Uses unshare(CLONE_NEWUSER) to create a new user namespace where the current user becomes UID 0 (root) inside the namespace, while remaining an unprivileged user on the host.
- Mount namespace setup: Creates a new mount namespace with a private root filesystem, allowing the container runtime to mount filesystems without affecting the host.
- Port forwarding: Sets up a built-in port forwarder that maps host ports to container ports without requiring iptables rules or root-level network configuration.
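The first of these operations can be observed without RootlessKit at all. As a rough sketch, assuming the util-linux `unshare` command is installed and the kernel permits unprivileged user namespaces, the following compares the UID seen on the host with the UID seen inside a fresh user namespace. The function name is only illustrative.

```shell
# Run `id -u` inside a new user namespace in which the caller is mapped to UID 0.
uid_inside_userns() {
  unshare --user --map-root-user id -u
}

echo "UID on the host: $(id -u)"
if inside="$(uid_inside_userns 2>/dev/null)"; then
  echo "UID inside the namespace: $inside"
else
  echo "user namespaces are not available to this user"
fi
```

RootlessKit does considerably more than this (subordinate ID ranges, mount and network namespaces, port forwarding), but the UID remapping shown here is the core idea.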
Docker Compose Deployment (RootlessKit + containerd)
[Docker Compose listing missing from the source.]
Manual Setup
RootlessKit’s --port-driver=builtin option provides efficient port forwarding without requiring a separate port forwarding daemon. The --copy-up flag ensures that modifications to specified directories within the namespace don’t affect the host filesystem.
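A minimal invocation using these flags might look like the sketch below. It assumes `rootlesskit` and `slirp4netns` are installed and on the PATH. The `ROOTLESSKIT_BIN` variable, the function name, and the 8080-to-80 port mapping are illustrative choices, not part of RootlessKit.

```shell
# Hypothetical override so the command line can be inspected without RootlessKit.
ROOTLESSKIT_BIN="${ROOTLESSKIT_BIN:-rootlesskit}"

# Run a command inside new user, mount, and network namespaces.
run_in_rootlesskit() {
  "$ROOTLESSKIT_BIN" \
    --net=slirp4netns \
    --disable-host-loopback \
    --copy-up=/etc \
    --port-driver=builtin \
    --publish=127.0.0.1:8080:80/tcp \
    "$@"
}
```

For example, `run_in_rootlesskit bash` would open a shell that appears to run as root inside the namespaces, with host port 8080 forwarded to port 80 inside. The built-in port driver needs a non-host network, which is why `--net=slirp4netns` is set here.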
fuse-overlayfs
fuse-overlayfs implements the Linux overlay filesystem in userspace using FUSE (Filesystem in Userspace). This allows unprivileged users to create overlay filesystems — the core mechanism that container runtimes use to provide each container with its own writable layer on top of a read-only image.
How It Works
Overlay filesystems combine multiple directories into a single unified view:
- Lower layers: Read-only directories from the container image (shared across containers)
- Upper layer: Writable directory unique to each container
- Work directory: Temporary workspace for atomic operations
The native Linux overlayfs kernel module required root privileges until Linux 5.11, which allowed overlay mounts inside user namespaces. fuse-overlayfs replicates this functionality in userspace using FUSE, which can be mounted from within a user namespace since Linux 4.18.
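To make the three directory roles concrete, here is a small sketch that prepares them and mounts the merged view with fuse-overlayfs. It assumes `fuse-overlayfs` is installed and `/dev/fuse` is usable. The `FUSE_OVERLAYFS_BIN` variable, the function name, and the directory layout are assumptions made for this example.

```shell
# Hypothetical override so the mount command can be inspected without FUSE.
FUSE_OVERLAYFS_BIN="${FUSE_OVERLAYFS_BIN:-fuse-overlayfs}"

# Create lower/upper/work/merged under a base directory and mount the overlay.
mount_overlay() {
  base="$1"
  mkdir -p "$base/lower" "$base/upper" "$base/work" "$base/merged"
  "$FUSE_OVERLAYFS_BIN" \
    -o "lowerdir=$base/lower,upperdir=$base/upper,workdir=$base/work" \
    "$base/merged"
}

# To unmount afterwards: fusermount -u "$base/merged"
# (the command is named fusermount3 on FUSE 3 systems)
```

Files written under `merged` land in `upper`, while `lower` stays untouched, which is the behavior container runtimes depend on for image layers.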
Docker Compose Deployment
[Docker Compose listing missing from the source.]
Configuration for Podman
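One way to point rootless Podman at fuse-overlayfs is a per-user `storage.conf`. The sketch below writes such a file. It assumes the binary lives at `/usr/bin/fuse-overlayfs` and that the per-user configuration directory is `~/.config/containers`. The function name is illustrative, and the function overwrites any existing file at that path.

```shell
# Write a per-user storage.conf that selects fuse-overlayfs as the mount helper.
# Assumption: fuse-overlayfs is installed at /usr/bin/fuse-overlayfs.
# Warning: this replaces an existing storage.conf in the target directory.
write_storage_conf() {
  conf_dir="${1:-${XDG_CONFIG_HOME:-$HOME/.config}/containers}"
  mkdir -p "$conf_dir"
  cat > "$conf_dir/storage.conf" <<'EOF'
[storage]
driver = "overlay"

[storage.options.overlay]
mount_program = "/usr/bin/fuse-overlayfs"
EOF
}
```

Calling `write_storage_conf` with no argument targets the default per-user location. Passing a directory writes the file there instead, which is useful for reviewing it first.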
Performance Considerations
fuse-overlayfs introduces approximately 10-15% I/O overhead compared to native overlayfs due to the FUSE userspace/kernel boundary crossing. For most workloads (web servers, application servers, databases with moderate I/O), this overhead is negligible. For high-I/O workloads (databases with heavy write patterns, log processing), consider:
- Using native overlayfs with rootful containerd if security allows
- Mounting high-I/O directories as tmpfs or bind mounts to bypass the FUSE layer
- Enabling metacopy=on in the overlay options to reduce metadata operations
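The second option above can be sketched with Podman's `--volume` and `--tmpfs` flags, which place the chosen paths outside the overlay. The image name, the `/data` and `/scratch` paths, the `PODMAN_BIN` variable, and the function name are all placeholders for illustration.

```shell
# Hypothetical override so the command line can be inspected without Podman.
PODMAN_BIN="${PODMAN_BIN:-podman}"

# Run an image with a bind-mounted data directory and a tmpfs scratch area,
# so writes to /data and /scratch do not pass through the FUSE overlay.
run_with_fast_paths() {
  image="$1"
  data_dir="$2"
  mkdir -p "$data_dir"
  "$PODMAN_BIN" run --rm \
    --volume "$data_dir:/data" \
    --tmpfs /scratch \
    "$image"
}
```

A call such as `run_with_fast_paths my-database-image "$HOME/dbdata"` keeps the write-heavy directory on the host filesystem. Note that tmpfs contents live in memory and disappear when the container stops.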
slirp4netns
slirp4netns provides unprivileged networking for rootless containers by implementing a userspace TCP/IP stack. It creates a virtual network interface inside the user namespace and routes traffic through a userspace NAT, without requiring any root-level network configuration.
How It Works
slirp4netns operates by:
- Creating a TAP device inside the user namespace — a virtual network interface that appears as a real Ethernet device to processes in the namespace.
- Running a userspace TCP/IP stack that handles packet routing, NAT, and DNS resolution without kernel-level networking privileges.
- Forwarding traffic between the TAP device and the host’s network through standard file I/O, which doesn’t require root.
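These steps can be tried by hand, following the pattern in the slirp4netns documentation: start a process in new user and network namespaces, then attach slirp4netns to it from the host side. The sketch assumes `slirp4netns` and util-linux `unshare` are installed. `SLIRP4NETNS_BIN` and the function name are illustrative.

```shell
# Hypothetical override so the command line can be inspected without slirp4netns.
SLIRP4NETNS_BIN="${SLIRP4NETNS_BIN:-slirp4netns}"

# Attach a userspace network stack to the namespaces of an existing process.
# --configure brings up the tap0 interface, addresses, and routes automatically.
attach_slirp() {
  target_pid="$1"
  "$SLIRP4NETNS_BIN" --configure --mtu=65520 --disable-host-loopback \
    "$target_pid" tap0
}

# Terminal 1: unshare --user --map-root-user --net --mount
#             echo $$            # note the PID printed here
# Terminal 2: attach_slirp <PID from terminal 1>
```

Once attached, the shell in the first terminal sees a `tap0` interface and can reach the outside network, even though no root-level network configuration took place.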
Docker Compose Deployment
[Docker Compose listing missing from the source.]
Manual Setup with RootlessKit
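RootlessKit can launch slirp4netns itself, which avoids juggling PIDs by hand. The sketch below assumes `rootlesskit`, `slirp4netns`, and the iproute2 `ip` command are installed, and that RootlessKit uses its default interface name `tap0`. `ROOTLESSKIT_BIN` and the function name are illustrative.

```shell
# Hypothetical override so the command line can be inspected without RootlessKit.
ROOTLESSKIT_BIN="${ROOTLESSKIT_BIN:-rootlesskit}"

# Enter the namespaces with slirp4netns networking and print the TAP interface.
show_tap_interface() {
  "$ROOTLESSKIT_BIN" --net=slirp4netns --mtu=65520 --copy-up=/etc \
    ip addr show tap0
}
```

Replacing `ip addr show tap0` with `bash` gives an interactive shell on the same network.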
Advanced Configuration
The --enable-sandbox and --enable-seccomp flags activate additional security measures that restrict the userspace networking stack from accessing files and making system calls beyond what’s necessary for packet forwarding.
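As a sketch of how these flags combine with a basic invocation, the function below attaches a hardened slirp4netns instance to an existing process. It assumes a slirp4netns build with sandbox and seccomp support. `SLIRP4NETNS_BIN` and the function name are illustrative.

```shell
# Hypothetical override so the command line can be inspected without slirp4netns.
SLIRP4NETNS_BIN="${SLIRP4NETNS_BIN:-slirp4netns}"

# Attach slirp4netns with its mount-namespace sandbox and seccomp filter enabled.
attach_slirp_hardened() {
  target_pid="$1"
  "$SLIRP4NETNS_BIN" --configure --mtu=65520 --disable-host-loopback \
    --enable-sandbox --enable-seccomp \
    "$target_pid" tap0
}
```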
How the Three Work Together
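In a complete rootless container runtime, these three technologies form a layered stack: RootlessKit establishes the namespaces, fuse-overlayfs supplies the filesystem inside them, and slirp4netns connects them to the network.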
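When you run dockerd-rootless.sh as an unprivileged user, the sequence is as follows (podman run follows the same pattern, with Podman's own namespace setup code in place of RootlessKit):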
- RootlessKit sets up the namespace sandbox
- fuse-overlayfs provides the container’s filesystem layers
- slirp4netns provides network connectivity
- The container runtime (runc, crun, or youki) starts the process inside this environment
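On a Podman host, the pieces actually in use can be inspected with `podman info`. The sketch below queries two fields that Podman exposes in its info output. Treat the field names as correct for recent Podman releases rather than guaranteed for every version. `PODMAN_BIN` and the function name are illustrative.

```shell
# Hypothetical override so the command line can be inspected without Podman.
PODMAN_BIN="${PODMAN_BIN:-podman}"

# Print whether Podman is running rootless and which storage driver it uses.
show_rootless_stack() {
  "$PODMAN_BIN" info \
    --format 'rootless={{.Host.Security.Rootless}} storage={{.Store.GraphDriverName}}'
}
```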
Why Rootless Containers Matter for Self-Hosting
Security isolation is the primary motivation. When containers run as root, a single vulnerability in any containerized application can compromise the entire host. Rootless containers confine potential damage to the unprivileged user’s permissions — no access to host system files, no ability to modify kernel parameters, no access to other users’ data.
Multi-tenant self-hosting becomes practical with rootless containers. When different users or services run in their own rootless namespaces, they cannot affect each other’s containers or filesystems. This is essential for shared hosting environments, developer workstations, and CI/CD platforms where multiple untrusted workloads run on the same machine.
Compliance requirements increasingly mandate non-root container execution. SOC 2, PCI DSS, and various industry standards require principle of least privilege. Rootless container technology makes compliance achievable without sacrificing the convenience of container-based deployment.
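Smaller privileged footprint: traditional Docker depends on a daemon that runs as root, a well-known attack target. RootlessKit, fuse-overlayfs, and slirp4netns themselves run as the unprivileged user, which reduces the attack surface of your container infrastructure. The usual exception is the pair of newuidmap and newgidmap helpers, which apply subordinate ID ranges and are typically installed setuid or with file capabilities.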
For related reading, see our Rootless Container Runtimes comparison and OCI Container Runtimes deep dive.
FAQ
What is a rootless container?
A rootless container runs entirely under an unprivileged user account, without requiring root access or setuid binaries on the host. The container process appears as a normal user process to the host kernel, and any container escape is limited to the permissions of that unprivileged user.
Do I need to modify my applications to run them rootlessly?
No. Rootless containers run the same container images as rootful containers. The difference is entirely in how the container runtime sets up the execution environment. Your application code, Dockerfiles, and compose files work identically.
Is rootless container performance slower?
There is a small performance overhead, typically 5-15% depending on the workload. fuse-overlayfs adds I/O overhead due to the FUSE userspace boundary, and slirp4netns adds network latency compared to native veth pairs. For most self-hosted workloads (web services, databases, monitoring), this overhead is imperceptible.
Can I use rootless containers with Kubernetes?
Kubernetes itself requires root-level setup, but individual pods can run as non-root users using security contexts. For self-hosted single-node clusters, you can run the entire Kubernetes node in rootless mode using RootlessKit as the init process.
Which container runtimes support rootless mode?
Podman has first-class rootless support (it was designed rootless from the start). Docker supports rootless mode via dockerd-rootless.sh (since Docker 20.10). nerdctl (containerd's CLI) also supports rootless operation. Docker and nerdctl build on RootlessKit, with slirp4netns and fuse-overlayfs available for networking and storage; Podman performs its own namespace setup and can use slirp4netns and fuse-overlayfs alongside it.
How do I expose ports from rootless containers?
RootlessKit provides a built-in port forwarder (--port-driver=builtin) that maps host ports to container ports. Alternatively, slirp4netns can be configured with port forwarding rules. Both approaches work without root-level iptables configuration.
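Ports can also be added to a running RootlessKit instance through its `rootlessctl` client. The sketch below adds a forward after checking the kernel's unprivileged port threshold, since host ports below it (1024 by default) cannot be bound without extra configuration. The socket path is whatever `api.sock` your RootlessKit state directory contains. `ROOTLESSCTL_BIN` and the function names are assumptions made for this example.

```shell
# Hypothetical override so the command line can be inspected without RootlessKit.
ROOTLESSCTL_BIN="${ROOTLESSCTL_BIN:-rootlessctl}"

# Lowest port an unprivileged process may bind (falls back to the default 1024).
unprivileged_port_start() {
  cat /proc/sys/net/ipv4/ip_unprivileged_port_start 2>/dev/null || echo 1024
}

# Succeeds when the given host port can be bound without extra privileges.
can_publish() {
  [ "$1" -ge "$(unprivileged_port_start)" ]
}

# Forward a TCP host port to a port inside the RootlessKit namespace.
add_port() {
  api_socket="$1"
  host_port="$2"
  inner_port="$3"
  if ! can_publish "$host_port"; then
    echo "host port $host_port is below the unprivileged port start" >&2
    return 1
  fi
  "$ROOTLESSCTL_BIN" --socket="$api_socket" add-ports "0.0.0.0:$host_port:$inner_port/tcp"
}
```

With Docker Rootless or Podman, the usual `-p 8080:80` syntax drives the same forwarding, so calling `rootlessctl` directly is mostly useful for bare RootlessKit setups.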
Are there any limitations with rootless containers?
Yes. Some features require root privileges and are unavailable in rootless mode:
- Cgroup v1 resource limits (cgroup v2 works with rootless)
- Certain network plugins (bridge, macvlan)
- Mounting block devices
- Setting real-time scheduling priorities
- Using privileged containers
Most of these limitations can be worked around or are acceptable for typical self-hosted workloads.
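Because the first limitation depends on the cgroup version, it helps to know which one a host runs. A small sketch, relying on the fact that a cgroup v2 unified hierarchy exposes a `cgroup.controllers` file at its root; the function name and the optional path argument are illustrative:

```shell
# Print v2 for a unified cgroup hierarchy, v1 for a legacy layout, else unknown.
cgroup_version() {
  root="${1:-/sys/fs/cgroup}"
  if [ -f "$root/cgroup.controllers" ]; then
    echo v2
  elif [ -d "$root" ]; then
    echo v1
  else
    echo unknown
  fi
}

echo "cgroup hierarchy: $(cgroup_version)"
```

On a v2 host, resource limits for rootless containers generally also need the relevant controllers delegated to the user, which systemd-based distributions handle through user sessions.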