Running an API gateway without observability is like driving a car with no dashboard. You can send requests through, but you have no idea about latency spikes, error rates, consumer usage patterns, or whether your rate limiting policies are actually working. Commercial API analytics platforms charge per million requests — costs that escalate quickly at scale. Self-hosted observability solutions give you full visibility into your API gateway traffic without metering fees.

What Is API Gateway Observability?

API gateway observability goes beyond basic uptime monitoring. It encompasses:

  • Traffic analytics — request volume, bandwidth, and endpoint popularity over time
  • Latency tracking — p50, p95, and p99 response times per route and per consumer
  • Error analysis — 4xx and 5xx error rates with detailed breakdowns by status code
  • Consumer insights — which API keys, apps, or users are consuming the most resources
  • Rate limiting visibility — how often rate limits are hit and which consumers are throttled
  • Security metrics — authentication failures, blocked requests, and suspicious patterns
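
To make these categories concrete, the sketch below expresses several of them as Prometheus recording rules. It assumes metrics exported by Kong's Prometheus plugin (`kong_http_requests_total` and `kong_request_latency_ms_bucket`, with the `consumer` label enabled); the rule names and file name are illustrative, and other gateways expose different metric names.

```yaml
# gateway-rules.yml (hypothetical file name), loaded via rule_files in prometheus.yml
groups:
  - name: api_gateway_overview
    interval: 30s
    rules:
      # Traffic analytics: requests per second for each route
      - record: gateway:requests:rate5m
        expr: sum by (route) (rate(kong_http_requests_total[5m]))

      # Latency tracking: p95 request latency in milliseconds for each route
      - record: gateway:latency_ms:p95_5m
        expr: |
          histogram_quantile(0.95,
            sum by (le, route) (rate(kong_request_latency_ms_bucket[5m])))

      # Error analysis: share of responses with a 5xx status code
      - record: gateway:errors_5xx:ratio5m
        expr: |
          sum by (route) (rate(kong_http_requests_total{code=~"5.."}[5m]))
            /
          sum by (route) (rate(kong_http_requests_total[5m]))

      # Consumer insights: requests per second for each consumer
      - record: gateway:consumer_requests:rate5m
        expr: sum by (consumer) (rate(kong_http_requests_total[5m]))

      # Rate limiting visibility: throttled (429) responses for each consumer
      - record: gateway:throttled:rate5m
        expr: sum by (consumer) (rate(kong_http_requests_total{code="429"}[5m]))

      # Security metrics: authentication and authorization failures
      - record: gateway:auth_failures:rate5m
        expr: sum by (route) (rate(kong_http_requests_total{code=~"401|403"}[5m]))
```

Precomputing these series keeps dashboard queries cheap, at the cost of a fixed 5-minute rate window.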

Comparison Table

| Feature | Kong Vitals | Tyk Pump | Custom (Grafana + OpenTelemetry) |
| --- | --- | --- | --- |
| Vendor | Kong Inc. | Tyk Technologies | Community |
| Data Source | Kong Gateway (Enterprise) | Tyk Gateway (OSS) | Any gateway via OTel |
| Backend | PostgreSQL / Cassandra | MongoDB / PostgreSQL / Elasticsearch | Prometheus + Grafana |
| Real-time Dashboards | Yes (built-in) | Via Grafana (separate install) | Yes (Grafana) |
| Consumer Analytics | Yes | Yes | Yes (with custom queries) |
| Latency Percentiles | Yes (p50, p95, p99) | Yes (configurable) | Yes (Prometheus histogram) |
| Rate Limit Metrics | Yes | Yes | Yes (custom counters) |
| Alerting | Via PagerDuty/webhook | Via Grafana alerts | Grafana Alerting |
| License | Kong Enterprise (commercial) | MPL 2.0 (free) | Apache 2.0 / AGPLv3 (free) |
| Cost | $$$$ (Enterprise license) | Free (open source) | Free (self-hosted stack) |
| Setup Complexity | Low (bundled) | Medium | High (multiple components) |

Kong Vitals — Enterprise API Analytics

Kong Vitals is Kong’s built-in analytics and monitoring solution, available with Kong Gateway Enterprise Edition. It provides real-time visibility into API traffic, consumer behavior, and gateway health through a polished web dashboard.

Key capabilities:

  • Real-time traffic dashboard — request rates, latency, and error rates updated every 30 seconds
  • Consumer-level analytics — track usage per API key, application, or service
  • Custom time ranges — analyze traffic patterns over hours, days, or months
  • Exportable reports — generate PDF/CSV reports for stakeholder reviews
  • Integrated alerting — set thresholds for error rates and latency, trigger PagerDuty or webhook notifications
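# Kong Gateway Enterprise with Vitals enabled
# Assumes a valid Enterprise license (for example via KONG_LICENSE_DATA) and a
# database that has already been initialized with `kong migrations bootstrap`.
version: '3.8'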
services:
  kong-db:
    image: postgres:15
    container_name: kong-db
    environment:
      POSTGRES_USER: kong
      POSTGRES_PASSWORD: kong_pass
      POSTGRES_DB: kong
    volumes:
      - kong_db_data:/var/lib/postgresql/data
    networks:
      - kong_net

  kong-gateway:
    image: kong/kong-gateway:latest
    container_name: kong-gateway
    environment:
      KONG_DATABASE: postgres
      KONG_PG_HOST: kong-db
      KONG_PG_USER: kong
      KONG_PG_PASSWORD: kong_pass
      KONG_PROXY_ACCESS_LOG: /dev/stdout
      KONG_ADMIN_ACCESS_LOG: /dev/stdout
      KONG_PROXY_ERROR_LOG: /dev/stderr
      KONG_ADMIN_ERROR_LOG: /dev/stderr
      KONG_ADMIN_LISTEN: 0.0.0.0:8001
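      KONG_VITALS: "true"
      # vitals_strategy accepts database, prometheus, or influxdb;
      # "database" stores Vitals data in the PostgreSQL instance configured above
      KONG_VITALS_STRATEGY: database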
    ports:
      - "8000:8000"
      - "8443:8443"
      - "8001:8001"
      - "8444:8444"
    depends_on:
      - kong-db
    networks:
      - kong_net
    restart: unless-stopped

volumes:
  kong_db_data:
networks:
  kong_net:
    driver: bridge

Best for: Teams already using Kong Gateway Enterprise that need a zero-config, out-of-the-box analytics dashboard with consumer-level granularity.

Tyk Pump — Open-Source API Analytics Pipeline

Tyk Pump is the open-source analytics data pump for the Tyk API Gateway. The gateway writes analytics records to Redis; Tyk Pump reads them from there and moves them to your preferred backend (MongoDB, PostgreSQL, Elasticsearch, or Prometheus), where they can be visualized in Grafana or other tools.

Key capabilities:

  • Pluggable backends — write analytics to MongoDB, PostgreSQL, Elasticsearch, Prometheus, or Graylog
  • Real-time streaming — near-real-time data ingestion with configurable batch sizes
  • Aggregated metrics — pre-computed analytics including request counts, latency distributions, and error rates
  • Grafana integration — ready-made dashboards for API traffic visualization
  • Filtering — exclude health check endpoints and internal traffic from analytics
# Tyk Gateway + Pump with PostgreSQL analytics
version: '3.8'
services:
  tyk-redis:
    image: redis:7-alpine
    container_name: tyk-redis
    ports:
      - "6379:6379"
    networks:
      - tyk_net

  tyk-gateway:
    image: tykio/tyk-gateway:latest
    container_name: tyk-gateway
    environment:
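      # Variable names follow Tyk's TYK_GW_ convention; verify them against
      # the gateway configuration reference for your version.
      TYK_GW_STORAGE_TYPE: redis
      TYK_GW_STORAGE_HOST: tyk-redis
      TYK_GW_STORAGE_PORT: "6379"
      # Write analytics records to Redis so that Tyk Pump can collect them
      TYK_GW_ENABLEANALYTICS: "true"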
    ports:
      - "8080:8080"
    depends_on:
      - tyk-redis
    networks:
      - tyk_net
    restart: unless-stopped

  tyk-pump:
    image: tykio/tyk-pump-docker-pub:latest
    container_name: tyk-pump
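    environment:
      TYK_PMP_PURGEDELAY: 10
      # Redis instance the gateway writes analytics records to
      TYK_PMP_ANALYTICSSTORAGETYPE: redis
      TYK_PMP_ANALYTICSSTORAGECONFIG_HOST: tyk-redis
      TYK_PMP_ANALYTICSSTORAGECONFIG_PORT: 6379
      # SQL pump that writes to PostgreSQL; verify the variable names
      # against the Tyk Pump documentation for your version
      TYK_PMP_PUMPS_SQL_TYPE: sql
      TYK_PMP_PUMPS_SQL_META_TYPE: postgres
      TYK_PMP_PUMPS_SQL_META_CONNECTIONSTRING: "host=tyk-analytics-db port=5432 user=tyk password=tyk_pass dbname=tyk_analytics sslmode=disable"
    depends_on:
      - tyk-redis
      - tyk-analytics-db
      - tyk-gateway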
    networks:
      - tyk_net
    restart: unless-stopped

  tyk-analytics-db:
    image: postgres:15
    container_name: tyk-analytics-db
    environment:
      POSTGRES_USER: tyk
      POSTGRES_PASSWORD: tyk_pass
      POSTGRES_DB: tyk_analytics
    volumes:
      - tyk_analytics_data:/var/lib/postgresql/data
    networks:
      - tyk_net

volumes:
  tyk_analytics_data:
networks:
  tyk_net:
    driver: bridge

Best for: Teams using Tyk API Gateway that want open-source analytics with the flexibility to choose their own visualization backend.

Custom API Observability with Grafana + OpenTelemetry

For teams running any API gateway (Kong OSS, Nginx, Traefik, Envoy, or custom), building a custom observability stack with OpenTelemetry and Grafana provides maximum flexibility and avoids vendor lock-in.

Key capabilities:

  • Gateway-agnostic — works with any gateway that supports OpenTelemetry or Prometheus metrics export
  • Custom dashboards — build exactly the views your team needs
  • Correlated tracing — trace requests across gateway → microservices → databases
  • Cost-effective — no per-request fees, runs on your existing Prometheus/Grafana infrastructure
  • Alerting — Grafana Alerting with multi-channel notifications (Slack, PagerDuty, email)
# API Gateway + Prometheus + Grafana observability stack
version: '3.8'
services:
  prometheus:
    image: prom/prometheus:latest
    container_name: api-prometheus
    ports:
      - "9090:9090"
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
      - prometheus_data:/prometheus
    networks:
      - observability_net
    restart: unless-stopped

  grafana:
    image: grafana/grafana:latest
    container_name: api-grafana
    ports:
      - "3000:3000"
    environment:
      GF_SECURITY_ADMIN_PASSWORD: admin
      GF_INSTALL_PLUGINS: grafana-clock-panel
    volumes:
      - grafana_data:/var/lib/grafana
      - ./grafana/dashboards:/etc/grafana/provisioning/dashboards
    depends_on:
      - prometheus
    networks:
      - observability_net
    restart: unless-stopped

volumes:
  prometheus_data:
  grafana_data:
networks:
  observability_net:
    driver: bridge
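
Grafana still needs to be told where Prometheus lives. One way to do that without clicking through the UI is a data source provisioning file, sketched below. The file path is hypothetical, and the Compose file above only mounts a dashboards directory, so this assumes an extra volume mapping such as `./grafana/datasources:/etc/grafana/provisioning/datasources`.

```yaml
# grafana/datasources/prometheus.yml (hypothetical path)
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    # Service name and port taken from the Compose file above
    url: http://prometheus:9090
    isDefault: true
```

With this in place, dashboards can reference the data source by name as soon as the container starts.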
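# prometheus.yml - Scrape API gateway metrics
# The targets below must be reachable from the Prometheus container, so the
# gateways need to share a Docker network with it (each Compose file above
# defines its own network).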
global:
  scrape_interval: 15s

scrape_configs:
  - job_name: 'kong'
    static_configs:
      - targets: ['kong-gateway:8001']
    metrics_path: '/metrics'

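  # Tyk metrics are exposed by Tyk Pump's Prometheus pump rather than by the
  # gateway itself; the target assumes that pump is enabled and listening on 9090.
  - job_name: 'tyk'
    static_configs:
      - targets: ['tyk-pump:9090']
    metrics_path: '/metrics'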
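
The stack above only scrapes Prometheus endpoints. For gateways that emit OTLP instead, an OpenTelemetry Collector can sit between the gateway and Prometheus. The following is a minimal sketch rather than a production configuration: it assumes the Collector's contrib distribution (which ships the `prometheus` exporter), and the listen ports are conventional defaults, not values from this article.

```yaml
# otel-collector-config.yml (hypothetical file name)
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317   # gateways send OTLP over gRPC here
      http:
        endpoint: 0.0.0.0:4318   # or OTLP over HTTP here

processors:
  batch: {}                      # group telemetry before export

exporters:
  prometheus:
    endpoint: 0.0.0.0:8889       # Prometheus scrapes converted metrics here
  debug: {}                      # print traces to the Collector log for inspection

service:
  pipelines:
    metrics:
      receivers: [otlp]
      processors: [batch]
      exporters: [prometheus]
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [debug]
```

Prometheus would then need one more scrape job pointing at the Collector's port 8889, and the `debug` exporter would be swapped for a tracing backend once traces need to be stored.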

Best for: Teams running multiple gateway types, or those who want full control over their analytics pipeline without vendor-specific tooling.

Why Self-Host Your API Observability?

SaaS API analytics platforms such as Moesif, and general-purpose monitoring tools such as Datadog APM, typically price by usage, for example request or event volume. At 100 million requests per month, costs can easily exceed $500-1,000/month. Self-hosting your observability stack means you pay only for the infrastructure, which is typically a fraction of the SaaS cost.

Beyond cost savings, self-hosted API observability keeps your traffic patterns, consumer data, and error logs within your infrastructure. This is critical for organizations in regulated industries where API traffic metadata constitutes sensitive operational data.

For related API management topics, see our API gateway comparison for gateway selection guidance, and our rate limiting guide for traffic control strategies that complement observability.

API Gateway Observability Best Practices

Effective API gateway observability requires a strategic approach to metric collection and dashboard design. Start by identifying the key performance indicators (KPIs) that matter most to your team — typically request rate, error rate, and latency (the RED method). Build dashboards that surface these metrics prominently, with drill-down capabilities for root cause analysis.

Establish baseline metrics during normal operation so you can quickly detect anomalies. Set alerting thresholds that trigger before users notice degradation — for example, alert when p95 latency exceeds 500ms rather than waiting for 5xx errors. Configure dashboards to display data at multiple time granularities: real-time (last 5 minutes) for incident response, daily views for trend analysis, and monthly aggregates for capacity planning.
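
The p95 threshold described above could be written as a Prometheus alerting rule along these lines. This is a sketch under assumptions: the metric name comes from Kong's Prometheus plugin, and the group name, alert name, `for` duration, and severity label are illustrative choices to tune against your own baseline.

```yaml
# gateway-alerts.yml (hypothetical file name), loaded via rule_files in prometheus.yml
groups:
  - name: api_gateway_alerts
    rules:
      - alert: GatewayHighP95Latency
        # p95 request latency per route over the last 5 minutes, in milliseconds
        expr: |
          histogram_quantile(0.95,
            sum by (le, route) (rate(kong_request_latency_ms_bucket[5m]))
          ) > 500
        # Require the condition to hold for 5 minutes to avoid paging on brief spikes
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "p95 latency above 500 ms on route {{ $labels.route }}"
```

The same expression can be used in Grafana Alerting if you prefer to manage rules next to your dashboards.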

FAQ

Can I use Kong Vitals with the open-source Kong Gateway?

No. Kong Vitals is exclusively available with Kong Gateway Enterprise Edition, which requires a commercial license. For Kong OSS, you can use the Prometheus plugin (included in Kong OSS) to export metrics to Prometheus, then visualize them in Grafana — this provides similar analytics capabilities at zero cost.
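
As a rough illustration, the Prometheus plugin can be enabled globally through Kong's declarative configuration. The option names below match the plugin in Kong Gateway 3.x as far as we know; check them against the plugin reference for your version before relying on them.

```yaml
# kong.yml (declarative configuration)
_format_version: "3.0"
plugins:
  - name: prometheus
    config:
      status_code_metrics: true   # request counts labeled by HTTP status code
      latency_metrics: true       # request, Kong, and upstream latency histograms
      bandwidth_metrics: true     # bytes transferred per service and route
      per_consumer: true          # add a consumer label for consumer-level analytics
```

Enabling `per_consumer` increases metric cardinality, so it is worth leaving off if you have a very large number of consumers.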

How much disk space do API analytics require?

For a gateway processing 1 million requests per day, expect approximately 500 MB to 2 GB of analytics data per day, depending on metric granularity and how much detail is recorded per request (roughly 0.5 to 2 KB per request). At that rate, a 30-day retention period requires roughly 15-60 GB of storage. Tyk Pump with PostgreSQL is generally more storage-efficient than MongoDB.

Can I correlate API gateway metrics with application logs?

Yes. With the custom OpenTelemetry approach, you can inject trace IDs at the gateway level and propagate them to backend services. This enables end-to-end request tracing from the API gateway through all microservices. Both Kong and Tyk support distributed tracing headers (W3C Trace Context, Jaeger, and Zipkin formats).

How do I set up alerting for API anomalies?

Kong Vitals supports webhook-based alerts that can trigger PagerDuty, Slack, or custom endpoints. Tyk Pump analytics displayed in Grafana can use Grafana Alerting with multi-channel notifications. The custom Prometheus + Grafana stack supports the most flexible alerting rules, including anomaly detection based on statistical thresholds.

Is it worth migrating from SaaS API analytics to self-hosted?

If your monthly request volume exceeds 10 million and you need more than basic dashboards, self-hosted observability typically pays for itself within 3-6 months in reduced licensing costs. The main trade-off is operational overhead — you are responsible for maintaining the analytics infrastructure, including database backups and dashboard updates.