In any self-hosted infrastructure, logs are generated everywhere — web servers, databases, container runtimes, application frameworks, and system daemons. The challenge isn’t generating logs; it’s collecting, transforming, and routing them to centralized storage efficiently. Three open-source tools dominate this space: Fluent Bit, Vector, and the OpenTelemetry Collector.

Each tool acts as a log forwarder and telemetry processor, sitting between log producers and storage backends like Loki, Elasticsearch, or cloud platforms. This guide compares their architectures, performance characteristics, and configuration approaches, then helps you choose the right forwarding pipeline for your stack.

What Is Log Forwarding?

Log forwarding is the process of collecting log data from distributed sources, optionally transforming or filtering it, and delivering it to centralized storage or analysis systems. A log forwarding pipeline typically:

  1. Ingests logs from files, journald, syslog, HTTP endpoints, or application SDKs
  2. Parses unstructured text into structured fields (JSON, regex, grok patterns)
  3. Filters noise, sensitive data, or irrelevant entries
  4. Transforms data — enriches with metadata, renames fields, computes derived values
  5. Routes logs to one or more destinations based on rules (severity, source, tags)
  6. Buffers data during network outages to prevent log loss

Unlike log aggregation servers (which store and query logs), forwarders focus on efficient collection and delivery with minimal resource overhead.
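
To make these stages concrete, here is a minimal sketch in Vector's TOML syntax (component names, paths, and the Loki endpoint are illustrative, not prescriptive); each tool covered below expresses the same stages in its own configuration format:

[sources.app_logs]
  type = "file"                      # 1. ingest: tail application log files
  include = ["/var/log/app/*.log"]

[transforms.shape]
  type = "remap"                     # 2-4. parse, filter, transform
  inputs = ["app_logs"]
  source = '''
    parsed, err = parse_json(.message)
    if err == null { . = parsed }    # parse structured logs when possible
    .host = get_hostname!()          # enrich with metadata
  '''

[sinks.central]
  type = "loki"                      # 5. route to a destination
  inputs = ["shape"]
  endpoint = "http://loki:3100"
  encoding.codec = "json"
  [sinks.central.labels]
    job = "demo"
  [sinks.central.buffer]
    type = "disk"                    # 6. buffer locally during outages
    max_size = 536870912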

Why Use a Dedicated Log Forwarder?

Running log collection as a separate concern from storage provides several advantages:

Resource isolation — Log parsing and buffering consume CPU and memory. A dedicated forwarder on each host prevents log processing from competing with application resources.

Protocol flexibility — Applications may write logs to files, syslog, or journald. A forwarder normalizes these inputs into a unified format before delivery, regardless of the destination's preferred protocol.

Resilience — Forwarders buffer logs locally during network partitions or destination downtime, ensuring no data is lost when the central log store is unavailable.

Centralized configuration — Manage parsing rules, filters, and routing policies in one place rather than configuring each application individually.

For organizations already managing centralized syslog, our rsyslog vs syslog-ng vs Vector comparison covers traditional daemons; adding a specialized forwarder provides the structured parsing and multi-destination routing those daemons lack. And for teams building full observability pipelines, our guide to log sampling techniques covers production traffic-reduction strategies.

Comparison Overview

| Feature | Fluent Bit | Vector | OpenTelemetry Collector |
| --- | --- | --- | --- |
| GitHub Stars | ~7,833 | ~21,796 | ~6,954 |
| Language | C | Rust | Go |
| License | Apache 2.0 | MPL 2.0 | Apache 2.0 |
| Binary Size | ~5 MB | ~30 MB | ~100 MB |
| Memory Usage | 5-15 MB | 30-100 MB | 50-200 MB |
| Log Sources | 30+ plugins | 40+ sources | 20+ receivers |
| Log Destinations | 60+ plugins | 30+ sinks | 25+ exporters |
| Parsing | Regex, JSON, Lua, built-in parsers | VRL (Vector Remap Language) | Transform processor |
| Buffering | Memory + filesystem | Memory + disk | Memory + persistent queue |
| Metrics Support | Yes | Yes | Yes (primary focus) |
| Tracing Support | Limited | No | Yes (primary focus) |
| Configuration | YAML/INI | TOML (YAML/JSON also supported) | YAML |
| Hot Reload | Yes | Yes | Yes |
| Kubernetes | DaemonSet via Helm chart | DaemonSet via Helm chart | DaemonSet via OpenTelemetry Operator |
| CNCF Status | Graduated (Fluentd sub-project) | Not a CNCF project (Datadog-owned) | Incubating |
| Multi-tenancy | Via routing | Via routing | Via pipelines |

Fluent Bit

Fluent Bit is a high-performance, lightweight log processor and forwarder written in C. Originally developed at Treasure Data as a lighter alternative to Fluentd, it is a sub-project of the CNCF-graduated Fluentd project and has become the de facto standard for Kubernetes log collection thanks to its minimal resource footprint.

Key Features

  • Ultra-lightweight — typically uses 5-15MB RAM and minimal CPU, making it ideal for edge devices and sidecar containers
  • Rich plugin ecosystem — 30+ input plugins, 60+ output plugins, and built-in parsers for common log formats
  • Stream processing — supports filtering, record transformation, and conditional routing through its pipeline architecture
  • Lua scripting — embed Lua scripts for complex parsing or transformation logic not covered by built-in filters (see the sketch after this list)
  • Hot reload — configuration changes applied without restarting the process
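
To illustrate the Lua hook, here is a minimal sketch that redacts email addresses from the log field (the script and function names are arbitrary). The filter definition:

[FILTER]
    Name    lua
    Match   app.*
    Script  mask_email.lua
    Call    mask_email

And mask_email.lua itself:

-- Return code 1 tells Fluent Bit the record was modified;
-- 0 keeps it unchanged, -1 drops it.
function mask_email(tag, timestamp, record)
    if record["log"] ~= nil then
        record["log"] = string.gsub(record["log"], "[%w%.%-]+@[%w%.%-]+", "<redacted>")
    end
    return 1, timestamp, record
end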

Docker Compose Configuration

services:
  fluent-bit:
    image: cr.fluentbit.io/fluent/fluent-bit:latest
    container_name: fluent-bit
    restart: unless-stopped
    volumes:
      - ./fluent-bit.conf:/fluent-bit/etc/fluent-bit.conf
      - /var/log:/var/log:ro
      - /var/lib/docker/containers:/var/lib/docker/containers:ro
    ports:
      - "24224:24224"
      - "24224:24224/udp"
    environment:
      - FLUENT_BIT_CONF=/fluent-bit/etc/fluent-bit.conf

Configuration Example (fluent-bit.conf)

[SERVICE]
    Flush        5
    Daemon       Off
    Log_Level    info
    Parsers_File parsers.conf
    HTTP_Server  On
    HTTP_Listen  0.0.0.0
    HTTP_Port    2020

[INPUT]
    Name         tail
    Path         /var/log/*.log
    Parser       json
    Tag          app.*
    Refresh_Interval 10

[INPUT]
    Name         systemd
    Tag          host.*
    Systemd_Filter _SYSTEMD_UNIT=docker.service

[INPUT]
    Name         forward
    Listen       0.0.0.0
    Port         24224

[FILTER]
    Name         parser
    Match        app.*
    Key_Name     log
    Parser       docker

[FILTER]
    Name         modify
    Match        *
    Add          environment production
    Add          cluster prod-east-1

[OUTPUT]
    Name         loki
    Match        *
    Host         loki
    Port         3100
    Labels       job=fluent-bit, env=production

[OUTPUT]
    Name         stdout
    Match        *

Pros and Cons

Pros:

  • Smallest resource footprint of the three tools
  • Part of the CNCF-graduated Fluentd project, with strong community support
  • Extensive plugin library for diverse inputs and outputs
  • Built-in Kubernetes log collection (tail container logs, journald)
  • Proven at massive scale (adopted by AWS, GCP, and major enterprises)

Cons:

  • Limited transformation capabilities — complex logic requires Lua scripting
  • No native tracing support (logs only)
  • Configuration uses a custom INI-style format (YAML support is newer and less widespread)
  • Buffer management is less sophisticated than alternatives

Vector

Vector is a high-performance observability data pipeline written in Rust. Originally developed by Timber.io and later acquired by Datadog, it emphasizes correctness, performance, and a powerful transformation language called VRL.

Key Features

  • VRL (Vector Remap Language) — a domain-specific language for log parsing, transformation, enrichment, and filtering with compile-time safety guarantees
  • End-to-end acknowledgments — data delivery guarantees ensuring no log loss during pipeline failures (see the buffering sketch after this list)
  • First-class observability — Vector instruments itself with metrics, exposing internal pipeline health via Prometheus
  • Unified telemetry — handles logs and metrics in a single pipeline
  • Compile-time config validation — catches configuration errors at startup, not at runtime
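
Acknowledgments and buffering are configured per sink; a minimal sketch for the loki sink used in the configuration example below:

[sinks.loki.buffer]
  type = "disk"              # spill to disk instead of holding everything in memory
  max_size = 536870912       # 512 MiB of on-disk buffer
  when_full = "block"        # apply backpressure upstream rather than dropping events

[sinks.loki.acknowledgements]
  enabled = true             # ack sources only after Loki confirms receipt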

Docker Compose Configuration

services:
  vector:
    image: timberio/vector:latest-alpine
    container_name: vector
    restart: unless-stopped
    volumes:
      - ./vector.toml:/etc/vector/vector.toml:ro
      - /var/log:/var/log:ro
      - /var/lib/docker/containers:/var/lib/docker/containers:ro
    ports:
      - "8383:8383"
    environment:
      - VECTOR_LOG=info

Configuration Example (vector.toml)

[sources.file_input]
  type = "file"
  include = ["/var/log/*.log"]
  read_from = "beginning"
  max_read_bytes = 10240

[sources.docker_logs]
  type = "docker_logs"
  include_containers = ["app-*"]

[transforms.parse_json]
  type = "remap"
  inputs = ["file_input"]
  source = """
    parsed, err = parse_json(.message)
    if err == null {
      . = parsed
    }
    .timestamp = now()
  """

[transforms.filter_health]
  type = "filter"
  inputs = ["parse_json"]
  condition = ".status_code != 200 || .path != "/health""

[sinks.loki]
  type = "loki"
  inputs = ["filter_health"]
  endpoint = "http://loki:3100"
  encoding.codec = "json"

  [sinks.loki.labels]
    job = "vector"
    environment = "production"

Pros and Cons

Pros:

  • VRL provides the most powerful transformation language of the three tools
  • End-to-end delivery guarantees prevent data loss
  • Self-instrumentation with comprehensive internal metrics
  • Rust-based performance with memory safety guarantees
  • Excellent Docker and Kubernetes log collection
  • Unified logs and metrics support

Cons:

  • Higher memory usage than Fluent Bit (larger feature set and buffering defaults)
  • VRL has a learning curve for complex transformations
  • Fewer output plugins compared to Fluent Bit's extensive library
  • Datadog acquisition may concern some organizations
  • Configuration defaults to TOML, which is less common in infra tooling (YAML and JSON are also supported)

OpenTelemetry Collector

The OpenTelemetry Collector is the reference implementation of the OpenTelemetry standard, providing a vendor-agnostic way to receive, process, and export telemetry data. While it was designed primarily for metrics and traces, its log capabilities have matured significantly.

Key Features

  • OpenTelemetry native — first-class support for OTLP protocol, making it the standard for cloud-native observability
  • Unified telemetry pipeline — designed from the ground up for logs, metrics, and traces in a single collector
  • Processing pipeline — rich set of processors for batching, filtering, sampling, and attribute manipulation
  • Vendor neutrality — export to any backend without vendor lock-in
  • Extensible architecture — custom receivers, processors, and exporters are written in Go and compiled into a custom distribution with the OpenTelemetry Collector Builder (see the sketch after this list)
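
The Builder consumes a small manifest and compiles a distribution containing only the listed components; a minimal sketch, with illustrative module versions:

# builder-config.yaml
dist:
  name: my-otelcol
  description: Custom collector with only the components we need
  output_path: ./dist

receivers:
  - gomod: github.com/open-telemetry/opentelemetry-collector-contrib/receiver/filelogreceiver v0.96.0

exporters:
  - gomod: github.com/open-telemetry/opentelemetry-collector-contrib/exporter/lokiexporter v0.96.0

Build it with ocb --config builder-config.yaml.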

Docker Compose Configuration

services:
  otel-collector:
    image: otel/opentelemetry-collector-contrib:latest
    container_name: otel-collector
    restart: unless-stopped
    volumes:
      - ./otel-config.yaml:/etc/otelcol-contrib/config.yaml:ro
      - /var/log:/var/log:ro
    ports:
      - "4317:4317"
      - "4318:4318"
      - "8888:8888"

Configuration Example (otel-config.yaml)

receivers:
  filelog:
    include:
      - /var/log/*.log
    start_at: beginning
    operators:
      - type: json_parser
        timestamp:
          parse_from: attributes.timestamp
          layout: "%Y-%m-%dT%H:%M:%S.%fZ"

  journald:
    directory: /var/log/journal
    units:
      - docker.service
      - kubelet.service

processors:
  batch:
    timeout: 5s
    send_batch_size: 1000

  resource:
    attributes:
      - key: service.name
        value: "otel-collector"
        action: upsert
      - key: environment
        value: "production"
        action: upsert
      # Hint consumed by the loki exporter: promote these resource
      # attributes to Loki labels
      - key: loki.resource.labels
        value: "service.name, environment"
        action: insert

exporters:
  loki:
    endpoint: "http://loki:3100/loki/api/v1/push"

service:
  pipelines:
    logs:
      receivers: [filelog, journald]
      processors: [batch, resource]
      exporters: [loki]

Pros and Cons

Pros:

  • CNCF incubating project with massive ecosystem support
  • Unified pipeline for logs, metrics, and traces
  • OTLP is becoming the industry standard protocol
  • Rich processor library for transformation and filtering
  • Vendor-neutral — no lock-in to specific observability platforms
  • Kubernetes Operator for declarative deployment

Cons:

  • Largest binary size and highest memory usage
  • Log support is less mature than Fluent Bit or Vector (originally focused on metrics/traces)
  • Configuration complexity increases with multi-pipeline setups
  • File log collection (filelog receiver) is newer and less battle-tested
  • Requires the -contrib distribution for most log sources

Performance Comparison

| Metric | Fluent Bit | Vector | OTel Collector |
| --- | --- | --- | --- |
| Throughput | ~500K events/sec | ~300K events/sec | ~200K events/sec |
| CPU (idle) | <0.1% | 0.2-0.5% | 0.3-1.0% |
| Memory (idle) | 5-15 MB | 30-60 MB | 50-150 MB |
| Startup time | <1 second | 1-2 seconds | 2-5 seconds |
| Config reload | <1 second | 1-3 seconds | 2-5 seconds |

Deployment Patterns

Sidecar Pattern (Kubernetes)

Deploy the forwarder as a sidecar container alongside your application. Fluent Bit is the most common choice due to its small footprint. The sidecar tails application logs and forwards them to a central Loki or Elasticsearch cluster.
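
A sketch of the pattern (image names and the shared volume layout are illustrative):

apiVersion: v1
kind: Pod
metadata:
  name: app-with-log-sidecar
spec:
  containers:
    - name: app
      image: my-app:latest                # writes logs to /var/log/app
      volumeMounts:
        - { name: app-logs, mountPath: /var/log/app }
    - name: fluent-bit
      image: cr.fluentbit.io/fluent/fluent-bit:latest
      volumeMounts:
        - { name: app-logs, mountPath: /var/log/app, readOnly: true }
        - { name: fb-config, mountPath: /fluent-bit/etc }
  volumes:
    - name: app-logs
      emptyDir: {}
    - name: fb-config
      configMap: { name: fluent-bit-sidecar-config }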

DaemonSet Pattern

Deploy the forwarder on every node in a Kubernetes cluster. This is the standard pattern for collecting node-level logs (journald, kubelet, container runtime). All three tools support DaemonSet deployment via Helm charts.
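
Each project publishes an official chart; typical installs look like this (release names and namespaces are illustrative, and chart defaults change between versions):

helm repo add fluent https://fluent.github.io/helm-charts
helm install fluent-bit fluent/fluent-bit --namespace logging --create-namespace

helm repo add vector https://helm.vector.dev
helm install vector vector/vector --namespace logging --set role=Agent

helm repo add open-telemetry https://open-telemetry.github.io/opentelemetry-helm-charts
helm install otel-collector open-telemetry/opentelemetry-collector --namespace logging --set mode=daemonset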

Agent Pattern (Bare Metal/VMs)

Install the forwarder directly on each server. This is common for traditional infrastructure where Docker is not the primary workload. Fluent Bit's small package size makes it the easiest to deploy via system package managers (apt, yum).
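
Both Fluent Bit and Vector also ship official install scripts that configure the appropriate package repository for your distribution:

# Fluent Bit convenience installer
curl https://raw.githubusercontent.com/fluent/fluent-bit/master/install.sh | sh

# Vector's equivalent
curl --proto '=https' --tlsv1.2 -sSfL https://sh.vector.dev | bash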

For teams building complete observability stacks, pairing log forwarding with distributed tracing backends like Grafana Tempo, Jaeger, or Zipkin enables correlated log-trace analysis. And for Kubernetes-specific log collection, our Kubernetes logging operators guide covers operator-based deployment patterns.

Choosing the Right Log Forwarder

Choose Fluent Bit if:

  • You need the smallest possible resource footprint (edge devices, sidecars)
  • You primarily collect and forward logs (no metrics/traces needed)
  • You want the most mature Kubernetes log collection
  • Your transformation needs are simple (parsing, filtering, field addition)

Choose Vector if:

  • You need powerful log transformations (VRL is unmatched)
  • End-to-end delivery guarantees are critical for your use case
  • You want a unified pipeline for logs and metrics
  • You prefer Rust-based tooling with memory safety guarantees

Choose OpenTelemetry Collector if:

  • You are building a full observability stack (logs + metrics + traces)
  • OTLP is your preferred telemetry protocol
  • You want vendor-neutral infrastructure with no platform lock-in
  • You need the Kubernetes Operator for declarative lifecycle management

FAQ

Can I run multiple forwarders in the same environment?

Yes. A common pattern is using Fluent Bit as a lightweight agent on each host (collecting and forwarding) and the OpenTelemetry Collector as a centralized gateway (receiving from agents, enriching, and routing to multiple backends). This combines Fluent Bit's efficiency at the edge with the OTel Collector's processing power at the aggregation layer.
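
A minimal sketch of that topology, assuming the gateway is reachable as otel-gateway (hostnames and ports are illustrative). The Fluent Bit agent ships logs over OTLP/HTTP:

[OUTPUT]
    Name      opentelemetry
    Match     *
    Host      otel-gateway
    Port      4318
    Logs_uri  /v1/logs

And the gateway Collector accepts them with a plain OTLP receiver:

receivers:
  otlp:
    protocols:
      http:
        endpoint: 0.0.0.0:4318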

How do I choose between JSON and regex parsing?

JSON parsing is faster and more reliable when your application outputs structured logs. Use regex parsing only for legacy applications producing unstructured text logs. Vector's VRL can attempt JSON parsing first and fall back to regex on failure (sketched below) — a pattern not easily replicated in Fluent Bit without Lua scripting.
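
A sketch of that fallback in VRL, reusing the file_input source from the Vector example above (the regex pattern is illustrative):

[transforms.parse_any]
  type = "remap"
  inputs = ["file_input"]
  source = '''
    structured, err = parse_json(.message)
    if err == null {
      . = structured
    } else {
      # legacy plain-text lines: extract fields with a regex instead
      . |= parse_regex(.message, r'^(?P<level>\w+): (?P<msg>.*)$') ?? {}
    }
  '''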

Do these tools support log compression before sending?

Fluent Bit supports gzip compression for HTTP-based outputs. Vector supports gzip and zstd compression on applicable sinks. The OpenTelemetry Collector supports compression for OTLP exporters (gzip by default). Compression typically reduces network bandwidth by 70-90 percent for text-based logs at the cost of 5-15 percent additional CPU.
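
Vector, for example, exposes compression as a per-sink option; shown here on the loki sink from the earlier configuration:

[sinks.loki]
  type = "loki"
  inputs = ["filter_health"]
  endpoint = "http://loki:3100"
  encoding.codec = "json"
  compression = "gzip"       # trade a little CPU for much less bandwidth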

How do I handle log rotation with these forwarders?

All three tools handle file rotation gracefully. Fluent Bit uses inotify to detect rotated files. Vector tracks file inodes to follow renamed files. The OTel Collector's filelog receiver uses a similar inode-based approach. Configure Rotate_Wait (Fluent Bit) or ignore_older_secs (Vector) to handle edge cases during rotation.
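
In Fluent Bit, the relevant knob lives on the tail input; a minimal sketch (30 seconds is an arbitrary choice):

[INPUT]
    Name         tail
    Path         /var/log/*.log
    # keep watching a rotated file for 30s so slow writers can finish flushing
    Rotate_Wait  30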

What happens when the log destination is unavailable?

All three tools buffer logs locally during network outages. Fluent Bit buffers to memory (default) or filesystem. Vector buffers to memory or disk with configurable capacity limits. The OTel Collector supports persistent queues (disk-backed) that survive process restarts. Configure buffer sizes based on your expected outage duration and log volume.
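
In the OTel Collector, a disk-backed queue requires the file_storage extension from the contrib distribution; a minimal sketch (the storage directory is illustrative and must persist across restarts):

extensions:
  file_storage:
    directory: /var/lib/otelcol/storage

exporters:
  loki:
    endpoint: "http://loki:3100/loki/api/v1/push"
    sending_queue:
      enabled: true
      storage: file_storage   # queue contents survive process restarts

service:
  extensions: [file_storage]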

Can I use these tools with Grafana Loki?

Yes — all three tools have native Loki output support. Fluent Bit's loki output plugin and Vector's loki sink are production-ready. The OTel Collector requires the loki exporter from the contrib distribution. For optimal Loki performance, keep label cardinality low to avoid series explosion.