ebpf · observability · networking · 14 min read

Complete Guide to Self-Hosted eBPF Networking and Observability: Cilium, Pixie, Tetragon 2026

Comprehensive guide to self-hosted eBPF-powered networking, observability, and security tools — Cilium, Pixie, Tetragon, and Inspektor Gadget. Learn how to deploy, configure, and leverage eBPF for deep infrastructure visibility without agents or code changes.

Editorial Team

The eBPF (extended Berkeley Packet Filter) revolution has fundamentally changed how we observe, secure, and manage network infrastructure. Born from the Linux kernel, eBPF allows sandboxed programs to run inside the kernel without modifying kernel source code or loading modules. This means you can intercept network packets, trace system calls, monitor application performance, and enforce security policies — all with near-zero overhead and no instrumentation changes to your applications.

In 2026, the eBPF ecosystem has matured into a production-ready observability and networking stack. This guide covers the four most powerful self-hosted eBPF tools you can deploy today: Cilium for networking and security, Pixie for application observability, Tetragon for runtime security enforcement, and Inspektor Gadget for ad-hoc kernel-level debugging.

Why Self-Hosted eBPF Tools Beat Cloud Observability Vendors

Cloud-native observability platforms charge per metric, per log line, per trace span. As your infrastructure grows, so do your bills. Self-hosted eBPF tools give you kernel-deep visibility with no per-event pricing, no data caps, and no vendor lock-in.

Here is why eBPF-based observability is fundamentally different from traditional monitoring:

  • No application code changes required — eBPF programs attach to kernel hooks, so you get visibility into any process, network connection, or system call without modifying your application code or redeploying services
  • Near-zero performance overhead — eBPF runs in the kernel with a verified bytecode sandbox. Well-tuned eBPF programs add less than 1% CPU overhead compared to sidecar proxies that can add 10-30%
  • Deep kernel visibility — traditional monitoring tools see what applications expose via HTTP metrics or logs. eBPF sees TCP retransmits, DNS queries at the kernel level, file I/O patterns, and process lifecycle events in real time
  • Programmable data collection — instead of pre-defined metrics, eBPF lets you write programs that extract exactly the data you need, reducing cardinality and storage costs dramatically
  • Unified networking and security — eBPF tools replace iptables, implement service meshes without sidecars, enforce network policies, and detect security threats from the same data plane

For teams running Kubernetes clusters, bare-metal servers, or hybrid infrastructure, self-hosted eBPF tools provide the visibility that cloud APM tools charge thousands of dollars per month for — with better depth and full data ownership.

Cilium: eBPF-Powered Networking, Service Mesh, and Security

Cilium is the most widely deployed eBPF project in production. Originally created as a Kubernetes CNI (Container Network Interface) plugin, it has grown into a full networking, security, and observability platform that replaces iptables, kube-proxy, and traditional service meshes like Istio’s sidecar model.

What Cilium Does

Cilium leverages eBPF to implement Kubernetes networking at the kernel level. Instead of translating service routing rules into thousands of iptables entries (which becomes a performance bottleneck at scale), Cilium programs eBPF hooks directly. This delivers significantly faster packet processing and supports advanced features like L7-aware network policies.

Installing Cilium with Helm

The recommended installation method uses the official Helm chart:

# Add the Cilium Helm repository
helm repo add cilium https://helm.cilium.io/
helm repo update

# Install Cilium with default settings
helm install cilium cilium/cilium \
  --namespace kube-system \
  --set hubble.relay.enabled=true \
  --set hubble.ui.enabled=true

Advanced Cilium Configuration

For production deployments, you will want to enable additional features:

# cilium-values.yaml
cluster:
  name: production-cluster
  id: 1

ipam:
  mode: "cluster-pool"
  operator:
    clusterPoolIPv4PodCIDRList: ["10.0.0.0/8"]
    clusterPoolIPv4MaskSize: 24

k8sServiceHost: "192.168.1.100"
k8sServicePort: 6443

kubeProxyReplacement: true

routingMode: native

hubble:
  relay:
    enabled: true
  ui:
    enabled: true
    port: 12000
  metrics:
    enabled:
      - dns:query
      - drop
      - tcp
      - flow
      - port-distribution
      - icmp
      - http

operator:
  replicas: 2
  rollOutPods: true

gatewayAPI:
  enabled: true
  enableAlpn: true

envoy:
  securityContext:
    capabilities:
      keepCapNetBindService: true

Apply this configuration:

helm upgrade cilium cilium/cilium \
  --namespace kube-system \
  -f cilium-values.yaml

Hubble: Network Observability

Hubble is Cilium’s built-in observability layer. It collects network flow metadata from eBPF programs and presents it through a CLI and web UI:

# Forward the Hubble UI port
kubectl port-forward -n kube-system svc/hubble-ui 12000:80

# View real-time network flows
hubble observe --namespace default --follow

# Filter by protocol
hubble observe --protocol http --namespace production

# View DNS queries
hubble observe --type L7 --protocol dns

# Export flows for analysis
hubble observe --since 1h --output json > flows.json

Hubble gives you a live dependency graph of all services, showing which pods communicate with each other, what protocols they use, and where connections are being dropped by network policies.
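
Once flows are exported as JSON lines (as in the last command above), post-processing is straightforward. Here is a minimal sketch that tallies dropped flows per source namespace; the `flow`/`verdict`/`source.namespace` field names follow Hubble's flow format, but treat the exact schema as an assumption to verify against your Hubble version:

```python
import json
from collections import Counter

def drops_by_namespace(lines):
    """Count DROPPED flows per source namespace from JSON-lines flow output."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        flow = record.get("flow", record)  # the CLI wraps each flow in a "flow" key
        if flow.get("verdict") == "DROPPED":
            counts[flow.get("source", {}).get("namespace", "unknown")] += 1
    return counts

# Synthetic sample standing in for flows.json
sample = [
    '{"flow": {"verdict": "DROPPED", "source": {"namespace": "default"}}}',
    '{"flow": {"verdict": "FORWARDED", "source": {"namespace": "default"}}}',
    '{"flow": {"verdict": "DROPPED", "source": {"namespace": "payments"}}}',
]
print(dict(drops_by_namespace(sample)))
```

The same pattern extends to grouping by drop reason or destination port once you confirm which fields your Hubble version emits.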

Pixie: Zero-Instrumentation Application Observability

Pixie takes eBPF observability further by providing automatic, zero-instrumentation application-level telemetry. Unlike traditional APM tools that require SDK integration or code changes, Pixie auto-discovers protocols and generates metrics, traces, and logs from kernel-level data.

Supported Protocols

Pixie automatically detects and parses these protocols without any configuration:

| Protocol | Metrics Captured | Trace Support |
| --- | --- | --- |
| HTTP/1.1, HTTP/2, gRPC | Latency, status codes, throughput | Full distributed tracing |
| PostgreSQL | Query latency, error rates, active connections | Query-level tracing |
| MySQL | Query performance, connection stats | Query-level tracing |
| Redis | Command latency, hit rates, key patterns | Command-level tracing |
| Kafka | Producer/consumer latency, topic metrics | Message-level tracing |
| AMQP (RabbitMQ) | Queue depth, publish/consume rates | Message tracing |
| Cassandra | Query latency, node health | Request tracing |
| DNS | Resolution latency, failure rates | Query tracing |
| NATS | Publish/subscribe latency | Message tracing |

Installing Pixie

Pixie consists of a cloud control plane (optional, can be self-hosted) and a per-cluster data plane:

# Install the Pixie CLI
curl -fsSL https://work.withpixie.dev/install.sh | sh

# Deploy Pixie to your Kubernetes cluster
px deploy

# Verify deployment
px get viziers

# Launch the Pixie Live UI in the terminal
px live

Writing PxL Scripts

Pixie uses its own query language (PxL) to extract data from eBPF-collected telemetry:

# pxl/http_errors.pxl — Find services with high HTTP error rates
import px

# Select HTTP data from the last 5 minutes
df = px.DataFrame(table='http_events', start_time='-5m')

# Group by service and status code
df.service = df['req_headers'][':authority']
df.status = df['resp_status']
df.count = px.count(df.time_)

# Aggregate
result = df.groupby(['service', 'status']).agg(
    request_count=('count', px.sum),
    avg_latency=('latency', px.avg),
    p99_latency=('latency', px.percentile(99))
)

# Filter for 5xx errors
result = result[result['status'] >= 500]
result.error_rate = result['request_count'] / result['request_count'].sum()

px.display(result, 'High Error Rate Services')

Run this script from the CLI:

px run -f pxl/http_errors.pxl

Pixie Live Dashboard

Pixie provides a live dashboard that auto-updates as new data arrives:

# pxl/service_map.pxl — Live service dependency map
import px

http_df = px.DataFrame(table='http_events', start_time='-2m')
http_df.src = http_df['src_workload']
http_df.dst = http_df['dst_workload']
http_df.latency = http_df['latency']

conn_df = http_df.groupby(['src', 'dst']).agg(
    req_count=('latency', px.count),
    p50_latency=('latency', px.percentile(50)),
    p99_latency=('latency', px.percentile(99))
)

px.display(conn_df, 'Service Dependencies')

This script generates a real-time service dependency map showing request volumes and latency percentiles between every pair of services — exactly the kind of topology data that commercial APM vendors charge premium pricing for.
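
For intuition, the aggregation that the PxL query performs can be sketched in plain Python: group request records by (source, destination) edge, then compute counts and latency percentiles per edge. The record shape below is illustrative, not Pixie's actual internal schema:

```python
from collections import defaultdict

def service_map(requests):
    """Build a {(src, dst): stats} edge map from per-request records."""
    edges = defaultdict(list)
    for r in requests:
        edges[(r["src"], r["dst"])].append(r["latency_ms"])
    result = {}
    for edge, lat in edges.items():
        lat.sort()
        # nearest-rank percentiles on the sorted latencies
        p50 = lat[min(len(lat) - 1, int(0.50 * len(lat)))]
        p99 = lat[min(len(lat) - 1, int(0.99 * len(lat)))]
        result[edge] = {"req_count": len(lat), "p50_ms": p50, "p99_ms": p99}
    return result

reqs = [
    {"src": "web", "dst": "api", "latency_ms": 12},
    {"src": "web", "dst": "api", "latency_ms": 30},
    {"src": "api", "dst": "db", "latency_ms": 5},
]
print(service_map(reqs))
```

Pixie runs the equivalent computation continuously over kernel-captured traffic, so the map stays current without any application instrumentation.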

Tetragon: eBPF-Based Runtime Security and Policy Enforcement

Tetragon from the Cilium project focuses on runtime security. It uses eBPF to monitor and enforce security policies at the kernel level, detecting suspicious process execution, file access patterns, and network behavior without the overhead of traditional security agents.

What Tetragon Monitors

Tetragon hooks into these kernel tracepoints:

  • Process execution — tracks every exec() call with full argument visibility
  • File operations — monitors file opens, reads, writes, and deletions
  • Network connections — watches socket creation, binds, and connects
  • Kernel function calls — traces specific kprobe and tracepoint events
  • Linux Security Modules — integrates with AppArmor, SELinux, and seccomp

Installing Tetragon

# Add the Cilium Helm repository (it also hosts the Tetragon chart)
helm repo add cilium https://helm.cilium.io
helm repo update

# Install Tetragon
helm install tetragon cilium/tetragon \
  --namespace kube-system

# Or install with the CLI
kubectl apply -f https://raw.githubusercontent.com/cilium/tetragon/main/install/kubernetes/tetragon/tetragon.yaml

Writing Tracing Policies

Tetragon policies define what to monitor and what actions to take:

# tetragon-policy.yaml — Detect privilege escalation attempts
apiVersion: cilium.io/v1alpha1
kind: TracingPolicy
metadata:
  name: privilege-escalation
spec:
  kprobes:
    - call: "cap_capable"
      syscall: false
      args:
        - index: 0
          type: "int"
      selectors:
        - matchCapabilities:
            - type: Effective
              operator: In
              values:
                - "CAP_SYS_ADMIN"
          # Matched events are logged by default; Sigkill also terminates the process
          matchActions:
            - action: Sigkill

Apply the policy:

kubectl apply -f tetragon-policy.yaml

Monitoring with Tetragon CLI

# Stream security events in real time (JSON by default; -o compact for one-line summaries)
tetra getevents -o compact

# Filter by namespace
tetra getevents -o compact --namespaces production

# Filter by process name
tetra getevents -o compact --processes nginx

# Capture events for later analysis
tetra getevents -o json > security-events.json

# List loaded tracing policies
tetra tracingpolicy list

Tetragon events include full process trees, file paths, network endpoints, and container metadata. This level of detail is invaluable for incident response and compliance auditing.
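
Captured events are easy to summarize offline. A small sketch that counts process executions per binary from the JSON export above; the `process_exec`/`process`/`binary` nesting matches Tetragon's JSON event format, but verify the field names against your Tetragon version:

```python
import json
from collections import Counter

def exec_counts(lines):
    """Count process_exec events per executed binary."""
    counts = Counter()
    for line in lines:
        event = json.loads(line)
        proc = event.get("process_exec", {}).get("process")
        if proc:  # skip process_exit and other event types
            counts[proc.get("binary", "unknown")] += 1
    return counts

# Synthetic sample standing in for security-events.json
events = [
    '{"process_exec": {"process": {"binary": "/usr/bin/curl"}}}',
    '{"process_exec": {"process": {"binary": "/usr/bin/curl"}}}',
    '{"process_exit": {"process": {"binary": "/usr/bin/curl"}}}',
]
print(exec_counts(events).most_common())
```

Sorting the result surfaces unexpected binaries quickly, which is often the first question during an incident review.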

Inspektor Gadget: Ad-Hoc eBPF Debugging and Troubleshooting

Inspektor Gadget provides a collection of pre-built eBPF gadgets (tools) that you can run on demand to diagnose issues in Kubernetes clusters and bare Linux systems. Think of it as a Swiss Army knife for kernel-level debugging.

Available Gadgets

| Gadget | What It Does | Use Case |
| --- | --- | --- |
| trace exec | Monitor process creation | Detect unauthorized processes |
| trace open | Track file open operations | Debug file access issues |
| trace tcp | Monitor TCP connections | Debug network connectivity |
| trace dns | Capture DNS queries | Debug DNS resolution problems |
| snapshot process | List running processes | Audit running workloads |
| snapshot socket | List active sockets | Debug port conflicts |
| network-graph | Build network topology | Map service dependencies |
| profile block-io | Profile disk I/O | Identify I/O bottlenecks |
| profile cpu | Profile CPU usage | Find CPU-intensive operations |
| advise network-policy | Suggest K8s network policies | Harden cluster security |

Installing Inspektor Gadget

# Install the ig binary for standalone Linux hosts
curl -sL https://github.com/inspektor-gadget/inspektor-gadget/releases/latest/download/ig-linux-amd64.tar.gz | tar xz
sudo mv ig /usr/local/bin/

# For Kubernetes, install the kubectl plugin via krew, then deploy the gadgets
kubectl krew install gadget
kubectl gadget deploy

# Verify deployment
kubectl gadget version

Using Gadgets for Troubleshooting

# Trace DNS queries from a specific pod
kubectl gadget trace dns -n default -p my-app

# Monitor all file opens in a namespace
kubectl gadget trace open -n production

# Profile block I/O to find slow disks
kubectl gadget profile block-io --sort total-time

# Snapshot all processes in a pod
kubectl gadget snapshot process -n default -p my-app

# Generate network policy suggestions
kubectl gadget advise network-policy generate --output policy.yaml

# Trace TCP connections across all namespaces
kubectl gadget trace tcp --all-namespaces

Inspektor Gadget shines during incident response. When a service is misbehaving, you can immediately deploy eBPF probes to see exactly what is happening at the kernel level — which files it is accessing, which DNS queries it is making, and which network connections it is establishing — all without restarting the service or adding debug instrumentation.

Comparing eBPF Tools: Which One Should You Use?

These tools are complementary rather than competing. Most production environments benefit from running multiple eBPF tools together. Here is how they map to different needs:

| Feature | Cilium | Pixie | Tetragon | Inspektor Gadget |
| --- | --- | --- | --- | --- |
| Primary Focus | Networking + Service Mesh | Application Observability | Runtime Security | Ad-Hoc Debugging |
| Kernel Hooks | XDP, TC, Socket, L7 | kprobes, uprobes, SSL | kprobes, LSM | Various gadgets |
| Kubernetes Integration | Full CNI replacement | Auto-discovery | Policy enforcement | CLI-driven gadgets |
| Network Policies | L3/L4/L7 policies | No | Security policies | Advisory only |
| Service Mesh | Native (no sidecars) | Observability only | No | No |
| Protocol Parsing | HTTP, gRPC, Kafka | 12+ protocols | Process/file events | DNS, TCP, HTTP |
| Performance Overhead | <1% CPU | 2-5% CPU | <1% CPU | On-demand only |
| Best For | Infrastructure teams | Developer experience | Security teams | SRE troubleshooting |

Complete Self-Hosted eBPF Stack: Docker Compose Setup

For teams not yet on Kubernetes, you can run Cilium, Tetragon, and observability backends on bare metal using Docker Compose:

# docker-compose.yml — Self-hosted eBPF observability stack
version: "3.8"

services:
  # Cilium in standalone mode (non-K8s)
  cilium-agent:
    image: quay.io/cilium/cilium:v1.16.0
    container_name: cilium
    privileged: true
    pid: host
    network_mode: host
    volumes:
      - /sys/fs/bpf:/sys/fs/bpf
      - /var/run/cilium:/var/run/cilium
      - /lib/modules:/lib/modules:ro
      - /var/run/docker.sock:/var/run/docker.sock
    command:
      - --devices=enp0s3
      - --routing-mode=native
      - --enable-ipv4=true
      - --ipv4-native-routing-cidr=192.168.0.0/16

  # Tetragon for security monitoring
  tetragon:
    image: quay.io/cilium/tetragon:v1.2.0
    container_name: tetragon
    privileged: true
    pid: host
    network_mode: host
    volumes:
      - /sys/fs/bpf:/sys/fs/bpf
      - /sys/kernel/debug:/sys/kernel/debug
      - /var/run/docker.sock:/var/run/docker.sock
      - ./tetragon-policies:/etc/tetragon/policies:ro
    environment:
      - TETRAGON_LOG_LEVEL=info

  # Grafana for dashboards
  grafana:
    image: grafana/grafana:11.4.0
    container_name: grafana
    ports:
      - "3000:3000"
    volumes:
      - grafana-data:/var/lib/grafana
      - ./grafana/dashboards:/etc/grafana/provisioning/dashboards
      - ./grafana/datasources:/etc/grafana/provisioning/datasources
    environment:
      - GF_SECURITY_ADMIN_PASSWORD=admin
      - GF_USERS_ALLOW_SIGN_UP=false

  # Prometheus for metrics storage
  prometheus:
    image: prom/prometheus:v2.53.0
    container_name: prometheus
    ports:
      - "9090:9090"
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
      - prometheus-data:/prometheus
    command:
      - --config.file=/etc/prometheus/prometheus.yml
      - --storage.tsdb.retention.time=30d
      - --web.enable-lifecycle

volumes:
  grafana-data:
  prometheus-data:

Prometheus configuration to scrape eBPF metrics:

# prometheus.yml
global:
  scrape_interval: 15s

scrape_configs:
  - job_name: "cilium"
    static_configs:
      # cilium-agent metrics endpoint; enable with --prometheus-serve-addr=:9962
      - targets: ["host.docker.internal:9962"]
    metrics_path: "/metrics"

  - job_name: "tetragon"
    static_configs:
      - targets: ["host.docker.internal:2112"]
    metrics_path: "/metrics"
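
Prometheus can also alert on the collected eBPF metrics. The rule file below is an illustrative sketch: it assumes Hubble's `drop` metric is enabled (as in the Helm values earlier) and that the counter is named `hubble_drop_total`; confirm the metric name against your Hubble version before relying on it.

```yaml
# alerts.yml — example alert on sustained packet drops
groups:
  - name: ebpf-stack
    rules:
      - alert: HubblePacketDrops
        expr: rate(hubble_drop_total[5m]) > 1
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "Hubble is observing sustained packet drops"
```

Reference the file from `prometheus.yml` via `rule_files` and mount it into the Prometheus container alongside the scrape config.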

Start the stack:

docker compose up -d

# Verify all services are running
docker compose ps

# Check Cilium status
docker exec cilium cilium status

# Check Tetragon status
docker exec tetragon tetra status

Best Practices for Production eBPF Deployments

Kernel Requirements

eBPF tools require a modern Linux kernel. Ensure your nodes meet these minimums:

  • Linux 5.10+ for basic eBPF features
  • Linux 5.15+ for BPF CO-RE (Compile Once, Run Everywhere) support
  • Linux 6.1+ for the newest verifier improvements and recently added program types
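
The baselines above can be checked programmatically before rolling a tool out to a node fleet. A small sketch (the thresholds mirror the list; adjust them to each tool's documented minimums):

```python
import platform
import re

def kernel_at_least(release: str, major: int, minor: int) -> bool:
    """Parse a kernel release string like '6.1.0-18-amd64' and compare versions."""
    m = re.match(r"(\d+)\.(\d+)", release)
    if not m:
        return False
    return (int(m.group(1)), int(m.group(2))) >= (major, minor)

release = platform.release()
for label, version in [("basic eBPF", (5, 10)),
                       ("BPF CO-RE", (5, 15)),
                       ("newest features", (6, 1))]:
    ok = kernel_at_least(release, *version)
    print(f"{label} ({version[0]}.{version[1]}+): {'ok' if ok else 'missing'}")
```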

Verify your kernel supports required features:

# Check kernel version
uname -r

# Verify eBPF support is compiled into the kernel
grep -E "CONFIG_BPF(_SYSCALL)?=" /boot/config-$(uname -r)

# Probe which eBPF program and map types the kernel supports
sudo bpftool feature probe

Resource Planning

eBPF tools are lightweight but still require resources:

| Component | CPU | Memory | Disk |
| --- | --- | --- | --- |
| Cilium agent | 100-300m | 256-512 MiB | Minimal |
| Cilium operator | 100m | 128 MiB | Minimal |
| Hubble relay | 100m | 128 MiB | Minimal |
| Pixie PEM | 200-500m | 512 MiB - 1 GiB | 5-10 GiB |
| Tetragon | 50-150m | 128-256 MiB | Minimal |
| Inspektor Gadget | On-demand | On-demand | Minimal |
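
For capacity planning, the per-node overhead of running the full stack is just the sum of the per-component requests. A quick sketch using the midpoints of the ranges in the table (illustrative numbers only; measure on your own workloads):

```python
# (mCPU, MiB) midpoints taken from the resource table above
components = {
    "cilium-agent": (200, 384),   # 100-300m / 256-512 MiB
    "hubble-relay": (100, 128),
    "pixie-pem":    (350, 768),   # 200-500m / 512 MiB-1 GiB
    "tetragon":     (100, 192),   # 50-150m / 128-256 MiB
}

total_cpu = sum(cpu for cpu, _ in components.values())
total_mem = sum(mem for _, mem in components.values())
print(f"~{total_cpu}m CPU, ~{total_mem} MiB memory per node")
```

On a typical 4-core, 16 GiB node that is under a quarter of a core and under 10% of memory, which is why the stack is viable even on modest hardware.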

Security Hardening

  1. Restrict eBPF permissions — use CAP_BPF and CAP_PERFMON instead of CAP_SYS_ADMIN where possible
  2. Enable BPF JIT — ensure net.core.bpf_jit_enable=1 for performance and security
  3. Lock down kernel access — restrict access to /sys/fs/bpf and /sys/kernel/debug
  4. Audit eBPF programs — use bpftool prog list to review loaded programs periodically
  5. Keep kernels updated — eBPF verifier improvements in newer kernels reduce attack surface
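
Points 2 and 3 above can be applied with a sysctl drop-in. The keys below are standard Linux sysctls, but defaults and side effects vary by distribution, so treat this as a starting point rather than a universal recommendation:

```
# /etc/sysctl.d/99-ebpf.conf — example eBPF hardening settings
net.core.bpf_jit_enable = 1
net.core.bpf_jit_harden = 2
kernel.unprivileged_bpf_disabled = 1
```

Apply with `sudo sysctl --system` and confirm your eBPF tools still load their programs afterward, since hardened JIT and disabled unprivileged BPF can affect some workloads.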

Monitoring the Observability Stack Itself

Monitor your eBPF tools to ensure they are not causing issues:

# Check eBPF map usage
bpftool map show | grep cilium

# Monitor eBPF program execution
bpftool prog show

# Check for dropped events in Hubble
kubectl exec -n kube-system ds/cilium -- cilium monitor --type drop

# Review Tetragon policy violations
kubectl logs -n kube-system ds/tetragon | grep -i "violation"

Conclusion

Self-hosted eBPF tools deliver the deepest possible infrastructure visibility without the cost, complexity, or vendor lock-in of cloud observability platforms. Cilium provides the networking foundation with built-in service mesh capabilities. Pixie gives developers automatic application telemetry with zero code changes. Tetragon enforces runtime security policies at the kernel level. Inspektor Gadget provides on-demand debugging when things go wrong.

Together, these tools form a complete observability and security stack that runs entirely on your infrastructure, under your control, with full data ownership. The eBPF ecosystem in 2026 is production-ready, well-documented, and backed by the Cloud Native Computing Foundation. If you are still paying per-metric pricing for observability or managing thousands of iptables rules for networking, it is time to look at what eBPF can do for your infrastructure.

Frequently Asked Questions (FAQ)

Which tool should I choose in 2026?

These tools are complementary, so the real question is which to deploy first:

  • For cluster networking and network policy: start with Cilium, which includes Hubble for observability
  • For application-level telemetry with zero code changes: add Pixie
  • For runtime security detection and enforcement: add Tetragon
  • For ad-hoc kernel-level troubleshooting: keep Inspektor Gadget on hand
  • For privacy: all four are fully open source, self-hosted, and keep data on your infrastructure

Refer to the comparison table above for detailed feature breakdowns.

Can I run these tools together, or migrate between them?

They are designed to coexist: all four consume the same kernel event sources, and Cilium, Hubble, and Tetragon come from the same project. If you are replacing one tool with another:

  1. Back up your current dashboards and exported data
  2. Test the change on a staging environment
  3. Check the official migration and compatibility guides in each project's documentation

Are there free versions available?

All four tools in this guide are free and open source. Commercial offerings exist on top of them, such as enterprise distributions of Cilium and Pixie's hosted control plane, but nothing described here requires a paid plan.

How do I get started?

  1. Review the comparison table to identify your requirements
  2. Visit the official documentation (links provided above)
  3. Start with a Docker Compose setup for easy testing
  4. Join the community forums for troubleshooting