Linux capabilities split the all-powerful root privilege into distinct units. Instead of running a container as full root (with access to every kernel operation), you can grant only the specific capabilities it needs. This guide compares three approaches to managing container capabilities: native Docker/Kubernetes capability controls, capsh (capability shell), and bubblewrap (unprivileged sandbox).
Understanding Linux Capabilities
Before Linux 2.2, every privileged operation required superuser (UID 0) privileges. This meant a compromised root process had unlimited access to the system. Linux capabilities introduced a finer-grained model: root’s power is split into roughly 40 distinct privileges (41 as of Linux 5.9), each controlling specific kernel operations.
Key capabilities for container workloads include:
- CAP_NET_ADMIN: Network configuration (interfaces, routes, firewall rules)
- CAP_SYS_ADMIN: Broad system administration (mount, namespace, cgroup operations)
- CAP_DAC_OVERRIDE: Bypass file read/write/execute permission checks
- CAP_NET_RAW: Use RAW and PACKET sockets (packet sniffing)
- CAP_SYS_PTRACE: Trace processes (debugging, monitoring)
- CAP_CHOWN: Change file ownership
- CAP_FOWNER: Bypass permission checks on operations requiring file owner ID match
A container running with --privileged gets ALL capabilities and also loses most other confinement (device restrictions and the default seccomp and AppArmor profiles), which makes it roughly equivalent to running as root on the host. A non-privileged Docker container gets a default subset of 14 capabilities. The security best practice is to drop all capabilities and add back only what’s needed.
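The kernel reports each process's capability sets as hexadecimal bitmasks in /proc/<pid>/status (the CapInh, CapPrm, CapEff, CapBnd and CapAmb fields). The following is a minimal sketch of decoding such a mask in POSIX shell. The helper names cap_bit and has_cap are made up for this example, the bit numbers are taken from <linux/capability.h>, and only a handful of capabilities are covered.

```shell
# Map a capability name to its bit number (values from <linux/capability.h>).
# Only the capabilities discussed in this guide are listed.
cap_bit() {
  case "$1" in
    CAP_CHOWN)            echo 0 ;;
    CAP_DAC_OVERRIDE)     echo 1 ;;
    CAP_FOWNER)           echo 3 ;;
    CAP_NET_BIND_SERVICE) echo 10 ;;
    CAP_NET_ADMIN)        echo 12 ;;
    CAP_NET_RAW)          echo 13 ;;
    CAP_SYS_PTRACE)       echo 19 ;;
    CAP_SYS_ADMIN)        echo 21 ;;
    *) return 1 ;;
  esac
}

# has_cap <hex-mask> <CAP_NAME>: succeed if the bit for CAP_NAME is set.
# Returns 2 for a capability name this sketch does not know.
has_cap() {
  bit=$(cap_bit "$2") || return 2
  [ $(( (0x$1 >> $bit) & 1 )) -eq 1 ]
}

# 00000000a80425fb is the mask of Docker's 14 default capabilities.
for name in CAP_NET_RAW CAP_CHOWN CAP_SYS_ADMIN CAP_NET_ADMIN; do
  if has_cap 00000000a80425fb "$name"; then
    echo "$name: present"
  else
    echo "$name: absent"
  fi
done
```

To inspect a live process, pass the CapEff value from /proc/<pid>/status instead of the hard-coded mask.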
For related container security topics, see our Container Security Hardening guide and our Container Seccomp Profile Management guide. For runtime security monitoring, check our Kubearmor vs Falco vs Tetragon guide.
Native Docker/Kubernetes Capability Controls
Docker and Kubernetes provide built-in mechanisms to control container capabilities without additional tools.
Docker Capabilities
Docker uses --cap-add and --cap-drop flags to manage capabilities:
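A minimal sketch of the drop-everything, add-back-selectively pattern with these two flags. The wrapper name run_least_priv is hypothetical, and the image names in the comments are only placeholders.

```shell
# run_least_priv "<caps to add back>" <image> [command...]
# Drops every capability, then re-adds only those named in the first argument.
run_least_priv() {
  caps_to_add=$1
  shift
  add_flags=""
  for cap in $caps_to_add; do
    add_flags="$add_flags --cap-add=$cap"
  done
  # $add_flags is deliberately unquoted so each flag becomes its own argument.
  docker run --rm --cap-drop=ALL $add_flags "$@"
}

# Example calls:
#   run_least_priv "" alpine:3 grep Cap /proc/self/status
#   run_least_priv "NET_BIND_SERVICE CHOWN" alpine:3 grep Cap /proc/self/status
```

Comparing the CapEff line from the two example calls shows which bits each --cap-add flag contributes.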
Docker Compose Configuration
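The same policy can be expressed in a Compose file with the cap_drop, cap_add, security_opt and read_only keys. The sketch below writes an illustrative file from the shell; the function name write_compose_file, the service name web and the image reference are assumptions, not taken from any real project.

```shell
# write_compose_file <path>: write an illustrative hardened service definition.
write_compose_file() {
  cat > "${1:?usage: write_compose_file <path>}" <<'EOF'
services:
  web:
    image: registry.example.com/web:1.0   # placeholder image
    cap_drop:
      - ALL
    cap_add:
      - NET_BIND_SERVICE                   # only needed to bind ports below 1024
    security_opt:
      - no-new-privileges:true
    read_only: true
    tmpfs:
      - /tmp
EOF
}
```

After writing the file (for example with write_compose_file docker-compose.yml), docker compose up -d starts the service with the restricted capability set.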
Kubernetes SecurityContext
Kubernetes controls capabilities via the securityContext:
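As an illustration, a Pod manifest along these lines drops every capability and adds back a single one. The function name apply_hardened_pod, the Pod name and the image are hypothetical; the securityContext fields themselves are part of the core Pod API.

```shell
# Apply an example Pod whose container drops ALL capabilities.
apply_hardened_pod() {
  kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: hardened-web
spec:
  containers:
    - name: web
      image: registry.example.com/web:1.0   # placeholder image
      securityContext:
        allowPrivilegeEscalation: false
        readOnlyRootFilesystem: true
        capabilities:
          drop: ["ALL"]
          add: ["NET_BIND_SERVICE"]
EOF
}
```

Capabilities are set per container, so each container in a Pod needs its own securityContext.capabilities block.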
Kubernetes Pod Security Standards
At the cluster level, Pod Security Standards (PSS) enforce capability policies:
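Pod Security Standards are switched on per namespace through labels read by the built-in Pod Security admission controller. A small sketch, assuming kubectl access to the cluster; the function name enforce_restricted and the namespace in the comment are placeholders.

```shell
# enforce_restricted <namespace>: enforce, warn on and audit the restricted profile.
enforce_restricted() {
  kubectl label --overwrite namespace "$1" \
    pod-security.kubernetes.io/enforce=restricted \
    pod-security.kubernetes.io/warn=restricted \
    pod-security.kubernetes.io/audit=restricted
}

# Example call:
#   enforce_restricted payments
```

Setting only the warn and audit labels first is a common way to preview what the enforce label would reject.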
The restricted policy requires containers to drop ALL capabilities and permits adding back only NET_BIND_SERVICE.
capsh (Capability Shell)
capsh is part of the libcap package and provides a shell environment with specific capability sets. It’s used for testing, debugging, and running processes with controlled capabilities.
Installation
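A sketch of installing capsh on a few common distributions. The package names are the ones these distributions are known to use (libcap2-bin on Debian and Ubuntu, libcap on Fedora and Arch Linux); the function name install_capsh is made up, and other distributions may package capsh differently.

```shell
# Install the package that ships capsh, based on the available package manager.
install_capsh() {
  if command -v apt-get >/dev/null 2>&1; then
    sudo apt-get install -y libcap2-bin      # Debian, Ubuntu
  elif command -v dnf >/dev/null 2>&1; then
    sudo dnf install -y libcap               # Fedora and relatives
  elif command -v pacman >/dev/null 2>&1; then
    sudo pacman -S --noconfirm libcap        # Arch Linux
  else
    echo "unsupported package manager" >&2
    return 1
  fi
}
```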
Usage Examples
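Three capsh invocations cover most day-to-day use: --print to show the current sets, --decode to translate a hex mask, and --drop to remove capabilities from the bounding set before starting a shell. The wrapper names below are hypothetical; the flags are standard capsh options.

```shell
# Show the capability sets of the current process.
show_caps() {
  capsh --print
}

# Translate a hex mask (for example a CapEff value) into capability names.
decode_caps() {
  capsh --decode="$1"
}

# run_without <comma-separated caps> <command...>
# Removes the listed capabilities from the bounding set, then runs the command
# through "/bin/bash -c". Changing the bounding set requires root.
run_without() {
  drop_list=$1
  shift
  sudo capsh --drop="$drop_list" -- -c "$*"
}

# Example calls:
#   decode_caps 00000000a80425fb
#   run_without cap_net_raw,cap_chown capsh --print
```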
Dockerfile with capsh for Testing
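One way to test capability settings is a throwaway image whose only job is to print the sets it was started with. The sketch below assumes a Debian base image; the function name build_capsh_image and the tag capsh-test are arbitrary choices for this example.

```shell
# Build a small test image; "docker build -" reads the Dockerfile from stdin
# and uses no build context.
build_capsh_image() {
  docker build -t capsh-test - <<'EOF'
FROM debian:bookworm-slim
RUN apt-get update \
 && apt-get install -y --no-install-recommends libcap2-bin \
 && rm -rf /var/lib/apt/lists/*
# Print the capability sets the container actually received.
CMD ["capsh", "--print"]
EOF
}

# Example calls after building:
#   docker run --rm capsh-test
#   docker run --rm --cap-drop=ALL --cap-add=NET_BIND_SERVICE capsh-test
```

Running the image with different --cap-add and --cap-drop combinations shows the effect of each flag before it is applied to a real workload.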
Key Features
- Part of libcap: Ships with the reference Linux capability library, which all major distributions package
- Interactive testing: Test capability configurations before deploying
- Debugging: Print and verify capability sets at runtime
- Bounding set control: Remove capabilities from the bounding set (the upper limit on what a process can gain across execve)
- No additional image layers: Available through standard package managers
Limitations
- Not a sandbox: capsh doesn’t isolate namespaces or filesystems — it only controls capabilities
- Manual configuration: Requires explicit capability specification for each use case
- Root requirement: To set capabilities, you typically need initial root access
bubblewrap (Unprivileged Sandbox)
bubblewrap is a sandboxing tool that uses Linux namespaces and seccomp to create isolated environments without requiring root privileges. It’s the foundation of Flatpak’s sandboxing.
How bubblewrap Works
bubblewrap combines multiple Linux isolation mechanisms:
- User namespaces: Maps container UIDs to unprivileged host UIDs
- Mount namespaces: Creates isolated filesystem views
- PID namespaces: Isolates process visibility
- seccomp filters: Loads a caller-supplied filter (--seccomp) to block dangerous syscalls; no filter is applied by default
- Capability dropping: Drops all capabilities by default
Installation
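A sketch of installing bubblewrap, assuming the package is named bubblewrap (as it is on Debian, Ubuntu, Fedora and Arch Linux). The function name install_bwrap is hypothetical.

```shell
# Install bubblewrap with whichever supported package manager is present.
install_bwrap() {
  if command -v apt-get >/dev/null 2>&1; then
    sudo apt-get install -y bubblewrap
  elif command -v dnf >/dev/null 2>&1; then
    sudo dnf install -y bubblewrap
  elif command -v pacman >/dev/null 2>&1; then
    sudo pacman -S --noconfirm bubblewrap
  else
    echo "unsupported package manager" >&2
    return 1
  fi
}
```

The installed binary is called bwrap; bwrap --version confirms that it is on PATH.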
Usage Examples
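A minimal sketch of a read-only sandbox built from standard bwrap flags. It assumes a merged-/usr layout, where /bin and /lib are symlinks into /usr; the wrapper name sandbox is made up for this example.

```shell
# sandbox <command...>: run a command with a read-only /usr, private /proc,
# /dev and /tmp, and every namespace unshared (so no network access).
sandbox() {
  bwrap \
    --ro-bind /usr /usr \
    --symlink usr/bin /bin \
    --symlink usr/lib /lib \
    --symlink usr/lib64 /lib64 \
    --proc /proc \
    --dev /dev \
    --tmpfs /tmp \
    --unshare-all \
    --die-with-parent \
    --new-session \
    "$@"
}

# Example calls:
#   sandbox bash
#   sandbox grep Cap /proc/self/status
```

The second example call prints the capability masks as seen from inside the sandbox.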
Docker Compose Alternative: Running Apps via bubblewrap
For applications that don’t need full container isolation, bubblewrap can replace Docker:
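A sketch of running a network service this way: one writable application directory, a read-only system, and host networking kept through --share-net. The function name run_app_sandboxed, the /app mount point and the example command are assumptions, and the layout again presumes a merged-/usr system.

```shell
# run_app_sandboxed <app-dir> <command...>
# Binds <app-dir> read-write at /app, keeps the rest read-only, shares the network.
run_app_sandboxed() {
  app_dir=$1
  shift
  bwrap \
    --ro-bind /usr /usr \
    --symlink usr/bin /bin \
    --symlink usr/lib /lib \
    --symlink usr/lib64 /lib64 \
    --ro-bind /etc/resolv.conf /etc/resolv.conf \
    --proc /proc \
    --dev /dev \
    --tmpfs /tmp \
    --bind "$app_dir" /app \
    --chdir /app \
    --unshare-all \
    --share-net \
    --die-with-parent \
    "$@"
}

# Example call:
#   run_app_sandboxed "$HOME/myapp" python3 -m http.server 8080
```

Unlike a Compose service, nothing restarts the process or manages its lifecycle; a supervisor such as a systemd unit would have to fill that role.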
Key Features
- No root required: Uses unprivileged user namespaces, so no sudo is needed (on systems that disable them, bwrap has to be installed setuid root)
- Multiple isolation layers: Namespaces + seccomp + capabilities combined
- Fine-grained filesystem control: Bind-mount specific paths (read-only or read-write)
- Flatpak foundation: Battle-tested through millions of Flatpak installations
- Lightweight: No daemon, no image layers — just a single binary
Comparison with Docker
| Feature | bubblewrap | Docker |
|---|---|---|
| Root Required | No (user namespaces) | Yes by default (root daemon; rootless mode available) |
| Image System | No (uses host filesystem) | Yes (layers, registry) |
| Networking | Host network by default; isolate with --unshare-net | Built-in (bridge, host, overlay) |
| Capability Control | All dropped by default | Default subset, configurable |
| Process Isolation | PID namespace | PID namespace |
| Filesystem Isolation | Bind mounts only | Full overlay filesystem |
| Best For | Desktop app sandboxing | Server container orchestration |
Comparison Table
| Feature | Docker/K8s Native | capsh | bubblewrap |
|---|---|---|---|
| Capability Control | --cap-add/--cap-drop | --caps, --drop | All dropped by default |
| Namespace Isolation | Full (PID, network, mount, etc.) | None | Opt-in per namespace (--unshare-* flags) |
| seccomp Filtering | Yes (customizable profiles) | No | Yes (caller-supplied filter via --seccomp) |
| Root Required | Yes (for Docker daemon) | Yes (to set caps) | No (user namespaces) |
| Image Management | Yes (Docker registry) | No | No |
| Orchestration | Kubernetes, Docker Compose | Manual | Manual |
| Complexity | Low (built-in) | Low (CLI tool) | Medium (namespace config) |
| Primary Use Case | Production containers | Capability testing/debugging | Desktop sandboxing |
| Multi-container | Yes | No | No |
| GitHub Stars | N/A | N/A (libcap) | 2,500+ |
Security Best Practices
- Always drop ALL capabilities first: Start with --cap-drop=ALL and add back only what’s needed. This follows the principle of least privilege.
- Combine with seccomp: Capabilities control what privileged operations are allowed; seccomp controls which syscalls can be made. Using both provides defense in depth.
- Use no-new-privileges: Set security_opt: no-new-privileges:true in Docker or allowPrivilegeEscalation: false in Kubernetes to prevent processes from gaining additional privileges via setuid binaries.
- Audit capability usage: Use capsh --print inside containers to verify the effective capability set matches your expectations.
- Avoid CAP_SYS_ADMIN: This capability is nearly equivalent to full root. It allows mount operations, namespace manipulation, and many other powerful operations. Only grant it if absolutely necessary.
- Use read-only root filesystems: Combine capability dropping with readOnlyRootFilesystem: true to prevent filesystem modifications even if a capability is misused.
FAQ
What’s the difference between capabilities and seccomp?
Capabilities control what privileged kernel operations a process can perform (mount, network config, etc.). seccomp (secure computing mode) controls which system calls a process can make. They operate at different layers: capabilities are a permission model for specific operations, while seccomp is a syscall filter. Using both together provides stronger security than either alone.
Can I run Docker containers without any capabilities?
Yes. Use --cap-drop=ALL to drop all default capabilities. However, most container images expect at least some capabilities to function. A web server might need NET_BIND_SERVICE to listen on port 80. A database might need CHOWN and FOWNER to manage its data files. Test your application with dropped capabilities and add back only what it needs.
Is bubblewrap a replacement for Docker?
Not for production server workloads. bubblewrap lacks Docker’s image management, networking, orchestration, and multi-container support. It’s designed for sandboxing individual desktop applications (which is what Flatpak uses it for). For server containers, use Docker or Kubernetes with proper capability controls.
How do I find out which capabilities my application needs?
Start by running with all capabilities dropped (--cap-drop=ALL) and observe what breaks. Then add capabilities one by one until the application works. Use capsh --print inside the container to verify. Alternatively, use strace to identify syscalls that fail with EPERM (operation not permitted); these failures often indicate missing capabilities.
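The strace approach can be sketched as a small wrapper that traces a command and keeps only the calls the kernel refused. The function name find_denied_syscalls is hypothetical; EACCES is matched as well because plain file-permission failures show up under that error code.

```shell
# find_denied_syscalls <command...>
# Trace the command and its children, then print the syscalls that failed
# with EPERM or EACCES.
find_denied_syscalls() {
  log=$(mktemp) || return 1
  # -f follows child processes; -o writes the trace to the log file.
  strace -f -o "$log" "$@"
  grep -E 'E(PERM|ACCES)' "$log"
  rm -f "$log"
}

# Example call:
#   find_denied_syscalls ./my-server --port 80
```

Each reported syscall still has to be mapped to a capability by hand, for example a failing bind on a port below 1024 points to NET_BIND_SERVICE.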
What is the capability bounding set?
The bounding set is an upper limit on the capabilities that a process and its children can ever acquire. A capability that is missing from the bounding set cannot be gained by executing a program, even one with file capabilities or a setuid-root bit, and it cannot be added to the inheritable set. Docker sets the bounding set based on --cap-add/--cap-drop flags. capsh can remove capabilities from it with --drop.
Does dropping capabilities affect application performance?
No. Capabilities are a security mechanism, not a performance limiter. Dropping capabilities doesn’t slow down your application — it simply prevents the application from performing certain privileged operations. The only “performance” impact is that operations requiring dropped capabilities will fail with permission errors, which is the intended security behavior.