Metrics Reference

All Gatez services expose Prometheus metrics, which Prometheus (port 9090) scrapes every 15 seconds.

Scrape Targets

| Job | Endpoint | Port | Metrics Path |
|---|---|---|---|
| apisix | apisix:9091 | 9091 | /apisix/prometheus/metrics |
| ai-gateway | ai-gateway:4000 | 4000 | /metrics |
| otel-collector | otel-collector:8889 | 8889 | /metrics |
| clickhouse | clickhouse:9363 | 9363 | /metrics |
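
To confirm that Prometheus is reaching each target in the table above, the built-in up series can be filtered by job name. This is a minimal sketch that assumes the Prometheus job label matches the Job column exactly; adjust the matcher if the scrape config names the jobs differently.

```promql
# 1 = last scrape succeeded, 0 = last scrape failed.
# Assumes the job labels equal the Job column above.
up{job=~"apisix|ai-gateway|otel-collector|clickhouse"}

# Only the targets that are currently failing to scrape
up{job=~"apisix|ai-gateway|otel-collector|clickhouse"} == 0
```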

:::note
Agent Gateway (L3) metrics are exported via the OTel Collector. Metrics from the L2 and L3 Rust services flow through OTel, and the collector exposes them on port 8889 with the gateway_ namespace prefix.
:::

L1 -- APISIX Metrics (Standard Prometheus Plugin)

APISIX exposes metrics via its built-in prometheus plugin at /apisix/prometheus/metrics on port 9091.

| Metric | Type | Labels | Description |
|---|---|---|---|
| apisix_http_status | Counter | code, route, matched_uri, matched_host, service, consumer, node | HTTP response status code counts |
| apisix_http_latency_bucket | Histogram | type (apisix, upstream, request), route, service, consumer, node | Request latency distribution (milliseconds) |
| apisix_http_latency_sum | Histogram | (same as above) | Sum of request latencies |
| apisix_http_latency_count | Histogram | (same as above) | Count of requests |
| apisix_bandwidth | Counter | type (ingress, egress), route, service, consumer, node | Bytes transferred |
| apisix_upstream_status | Counter | code, route, service, consumer, node | Upstream response status codes |
| apisix_http_requests_total | Counter | route, service, consumer | Total HTTP requests processed |
| apisix_nginx_http_current_connections | Gauge | state (active, reading, writing, waiting) | Current Nginx connection states |
| apisix_etcd_modify_indexes | Gauge | key | etcd modification index by config key |
| apisix_node_info | Gauge | hostname | Node information |
| apisix_shared_dict_capacity_bytes | Gauge | name | Shared dictionary capacity |
| apisix_shared_dict_free_space_bytes | Gauge | name | Shared dictionary free space |

L2 -- AI Gateway Metrics

Defined in layers/ai-gateway/src/middleware.rs. Exposed at http://ai-gateway:4000/metrics.

| Metric | Type | Labels | Description |
|---|---|---|---|
| ai_gateway_requests_total | Counter | -- | Total AI gateway requests processed |
| ai_gateway_cache_hits_total | Counter | -- | Total exact-match cache hits (Redis) |
| ai_gateway_cache_misses_total | Counter | -- | Total cache misses |
| ai_gateway_latency_seconds | Histogram | -- | Request latency distribution. Buckets: 1ms, 5ms, 10ms, 25ms, 50ms, 100ms, 250ms, 500ms, 1s, 2.5s, 5s, 10s |
| ai_gateway_tokens_total | Counter | -- | Total tokens consumed across all tenants |
| ai_gateway_active_requests | Gauge | -- | Currently in-flight requests |
| ai_gateway_pii_detected_total | Counter | -- | Total PII detection events (Presidio) |
| ai_gateway_budget_exceeded_total | Counter | -- | Total requests rejected due to token budget exhaustion |
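
Because ai_gateway_latency_seconds has a bucket boundary at 500ms, the share of requests served within that threshold can be read straight from the histogram without interpolation. The sketch below assumes the exporter renders that boundary as le="0.5" and emits the standard _bucket and _count series; check the raw /metrics output if the query returns nothing.

```promql
# Fraction of requests completed in 500ms or less over the last 5 minutes
sum(rate(ai_gateway_latency_seconds_bucket{le="0.5"}[5m]))
/
sum(rate(ai_gateway_latency_seconds_count[5m]))
```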

L3 -- Agent Gateway Metrics

Defined in layers/agent-gateway/src/metrics.rs. Exposed at http://agent-gateway:5001/metrics.

| Metric | Type | Labels | Description |
|---|---|---|---|
| agent_gw_sessions_total | Counter | -- | Total agent sessions created |
| agent_gw_sessions_active | Gauge | -- | Currently active agent sessions |
| agent_gw_tool_calls_total | Counter | -- | Total MCP tool calls executed |
| agent_gw_tool_calls_denied | Counter | -- | Tool calls denied by policy (allowlist/denylist) |
| agent_gw_a2a_hops_total | Counter | -- | Total A2A delegation hops |
| agent_gw_hitl_requests_total | Counter | -- | Total HITL approval requests created |
| agent_gw_hitl_approved_total | Counter | -- | HITL requests approved |
| agent_gw_hitl_denied_total | Counter | -- | HITL requests denied |
| agent_gw_tool_latency_seconds | Histogram | -- | Tool call latency distribution. Buckets: 1ms, 5ms, 10ms, 50ms, 100ms, 500ms, 1s, 5s, 10s, 30s |
| agent_gw_budget_remaining | Gauge | -- | Current remaining budget |
| agent_gw_tool_poisoning_detected | Counter | -- | Tool poisoning detection events (fingerprint mismatch) |

OTel Collector Metrics

The OTel Collector re-exports metrics received via OTLP under the gateway_ namespace (configured in infra/otel/config.yaml).

| Setting | Value |
|---|---|
| Namespace | gateway |
| Endpoint | 0.0.0.0:8889 |
| Resource-to-telemetry | Enabled |
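
The exact series names the collector produces depend on how it translates OTLP metric names, so it can help to list what is actually exposed before building dashboards. A minimal sketch, assuming the collector is scraped under the otel-collector job from the Scrape Targets table:

```promql
# Number of series per metric name under the gateway_ prefix
count by (__name__) ({__name__=~"gateway_.+", job="otel-collector"})
```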

PromQL Examples

L1: Request rate by route (last 5 minutes)

```promql
sum(rate(apisix_http_status[5m])) by (route, code)
```

L1: P99 latency per route

```promql
histogram_quantile(0.99,
  sum(rate(apisix_http_latency_bucket{type="request"}[5m])) by (le, route)
)
```

L1: Error rate (4xx + 5xx)

```promql
sum(rate(apisix_http_status{code=~"4..|5.."}[5m]))
/
sum(rate(apisix_http_status[5m]))
```

L1: Current active connections

```promql
apisix_nginx_http_current_connections{state="active"}
```
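
Two further L1 queries, built only from metrics listed in the APISIX table, cover bandwidth and shared dictionary pressure. They are sketches rather than tuned dashboards; the 5-minute window and the grouping labels are illustrative choices.

```promql
# Bytes per second by route, split into ingress and egress
sum by (route, type) (rate(apisix_bandwidth[5m]))

# Fraction of each shared dictionary currently in use (0 to 1)
1 - (apisix_shared_dict_free_space_bytes / apisix_shared_dict_capacity_bytes)
```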

L2: AI Gateway request rate

```promql
rate(ai_gateway_requests_total[5m])
```

L2: Cache hit ratio

```promql
rate(ai_gateway_cache_hits_total[5m])
/
(rate(ai_gateway_cache_hits_total[5m]) + rate(ai_gateway_cache_misses_total[5m]))
```

L2: P99 AI Gateway latency

```promql
histogram_quantile(0.99,
  sum(rate(ai_gateway_latency_seconds_bucket[5m])) by (le)
)
```

L2: Token consumption rate

```promql
rate(ai_gateway_tokens_total[1h])
```

L2: Budget exceeded events

```promql
rate(ai_gateway_budget_exceeded_total[5m])
```
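
The remaining L2 metrics can be combined in the same way. The sketch below assumes the latency histogram emits the standard _sum and _count series and that the token and request counters carry identical label sets, so the division matches series one to one.

```promql
# Mean request latency in seconds (histogram sum divided by count)
rate(ai_gateway_latency_seconds_sum[5m])
/
rate(ai_gateway_latency_seconds_count[5m])

# Average tokens consumed per request
rate(ai_gateway_tokens_total[5m])
/
rate(ai_gateway_requests_total[5m])

# PII detection events in the last hour
increase(ai_gateway_pii_detected_total[1h])
```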

L3: Active sessions

```promql
agent_gw_sessions_active
```

L3: Tool call throughput

```promql
rate(agent_gw_tool_calls_total[5m])
```

L3: Tool call denial rate

```promql
rate(agent_gw_tool_calls_denied[5m])
/
rate(agent_gw_tool_calls_total[5m])
```

L3: HITL approval rate

```promql
rate(agent_gw_hitl_approved_total[5m])
/
(rate(agent_gw_hitl_approved_total[5m]) + rate(agent_gw_hitl_denied_total[5m]))
```

L3: P99 tool call latency

```promql
histogram_quantile(0.99,
  sum(rate(agent_gw_tool_latency_seconds_bucket[5m])) by (le)
)
```

L3: Tool poisoning alerts

```promql
increase(agent_gw_tool_poisoning_detected[1h]) > 0
```
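
Two more L3 sketches relate the counters in the Agent Gateway table to each other. The HITL backlog estimate rests on assumptions the source does not confirm: that every approval request eventually ends as approved or denied, and that all three counters reset together (for example on a service restart).

```promql
# Average tool calls per newly created session over the last 5 minutes
rate(agent_gw_tool_calls_total[5m])
/
rate(agent_gw_sessions_total[5m])

# Approximate number of HITL requests still awaiting a decision
# (assumption: no requests expire or are cancelled without a verdict)
agent_gw_hitl_requests_total
  - agent_gw_hitl_approved_total
  - agent_gw_hitl_denied_total
```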

Grafana Datasources

Configured in infra/grafana/provisioning/datasources/datasources.yaml:

| Datasource | Type | URL | Default |
|---|---|---|---|
| Prometheus | prometheus | http://prometheus:9090 | Yes |
| Jaeger | jaeger | http://jaeger:16686 | No |
| ClickHouse | grafana-clickhouse-datasource | clickhouse:9000 (native) | No |

Access Grafana at http://localhost:3002 (default credentials: admin / admin).
