Metrics Reference
All Gatez services expose Prometheus metrics. Metrics are scraped by Prometheus (port 9090) at a 15-second interval.
Scrape Targets
| Job | Endpoint | Port | Metrics Path |
|---|---|---|---|
| apisix | apisix:9091 | 9091 | /apisix/prometheus/metrics |
| ai-gateway | ai-gateway:4000 | 4000 | /metrics |
| otel-collector | otel-collector:8889 | 8889 | /metrics |
| clickhouse | clickhouse:9363 | 9363 | /metrics |
:::note
The Agent Gateway (L3) metrics are exported via the OTel Collector. Metrics from L2 and L3 Rust services flow through OTel and are exposed by the collector at port 8889 with the gateway_ namespace prefix.
:::
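To spot-check a scrape target by hand, the table above can be turned into URLs and the response scanned for metric names. The following is a minimal Python sketch, not part of the Gatez codebase: the helper names (`target_url`, `metric_names`, `fetch_metric_names`) are hypothetical, and it assumes the hostnames resolve from wherever the script runs (for example, inside the same Docker network).

```python
from urllib.request import urlopen

# Scrape targets copied from the table above: job -> (host:port, metrics path).
SCRAPE_TARGETS = {
    "apisix": ("apisix:9091", "/apisix/prometheus/metrics"),
    "ai-gateway": ("ai-gateway:4000", "/metrics"),
    "otel-collector": ("otel-collector:8889", "/metrics"),
    "clickhouse": ("clickhouse:9363", "/metrics"),
}


def target_url(job):
    """Build the URL Prometheus would scrape for a job (hypothetical helper)."""
    endpoint, path = SCRAPE_TARGETS[job]
    return f"http://{endpoint}{path}"


def metric_names(exposition_text):
    """Collect metric names from Prometheus text exposition output.

    Comment lines (# HELP, # TYPE) and blank lines are skipped; the name is
    everything before the first '{' or space on a sample line.
    """
    names = set()
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        names.add(line.split("{", 1)[0].split(" ", 1)[0])
    return names


def fetch_metric_names(job, timeout=5.0):
    """Fetch a target over HTTP and return the metric names it exposes."""
    with urlopen(target_url(job), timeout=timeout) as resp:
        return metric_names(resp.read().decode("utf-8"))
```

Calling `fetch_metric_names("otel-collector")` is one way to confirm which `gateway_`-prefixed series the collector currently exposes.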
L1 -- APISIX Metrics (Standard Prometheus Plugin)
APISIX exposes metrics via its built-in prometheus plugin at /apisix/prometheus/metrics on port 9091.
| Metric | Type | Labels | Description |
|---|---|---|---|
| apisix_http_status | Counter | code, route, matched_uri, matched_host, service, consumer, node | HTTP response status code counts |
| apisix_http_latency_bucket | Histogram | type (apisix, upstream, request), route, service, consumer, node | Request latency distribution (milliseconds) |
| apisix_http_latency_sum | Histogram | (same as above) | Sum of request latencies (milliseconds) |
| apisix_http_latency_count | Histogram | (same as above) | Count of requests |
| apisix_bandwidth | Counter | type (ingress, egress), route, service, consumer, node | Bytes transferred |
| apisix_upstream_status | Counter | code, route, service, consumer, node | Upstream response status codes |
| apisix_http_requests_total | Counter | route, service, consumer | Total HTTP requests processed |
| apisix_nginx_http_current_connections | Gauge | state (active, reading, writing, waiting) | Current Nginx connection states |
| apisix_etcd_modify_indexes | Gauge | key | etcd modification index by config key |
| apisix_node_info | Gauge | hostname | Node information |
| apisix_shared_dict_capacity_bytes | Gauge | name | Shared dictionary capacity |
| apisix_shared_dict_free_space_bytes | Gauge | name | Shared dictionary free space |
L2 -- AI Gateway Metrics
Defined in layers/ai-gateway/src/middleware.rs. Exposed at http://ai-gateway:4000/metrics.
| Metric | Type | Labels | Description |
|---|---|---|---|
| ai_gateway_requests_total | Counter | -- | Total AI gateway requests processed |
| ai_gateway_cache_hits_total | Counter | -- | Total exact-match cache hits (Redis) |
| ai_gateway_cache_misses_total | Counter | -- | Total cache misses |
| ai_gateway_latency_seconds | Histogram | -- | Request latency distribution. Buckets: 1ms, 5ms, 10ms, 25ms, 50ms, 100ms, 250ms, 500ms, 1s, 2.5s, 5s, 10s |
| ai_gateway_tokens_total | Counter | -- | Total tokens consumed across all tenants |
| ai_gateway_active_requests | Gauge | -- | Currently in-flight requests |
| ai_gateway_pii_detected_total | Counter | -- | Total PII detection events (Presidio) |
| ai_gateway_budget_exceeded_total | Counter | -- | Total requests rejected due to token budget exhaustion |
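The bucket boundaries of ai_gateway_latency_seconds limit how precisely a percentile can be read back, because Prometheus estimates quantiles by interpolating linearly inside the bucket that contains the target rank. The Python sketch below illustrates that idea with the buckets listed above. It is a simplified illustration of the documented `histogram_quantile` behavior, not the Prometheus implementation, and the function name `estimate_quantile` is hypothetical.

```python
import math

# Finite upper bounds (seconds) of ai_gateway_latency_seconds, from the table above.
AI_GATEWAY_BUCKETS = [0.001, 0.005, 0.01, 0.025, 0.05, 0.1,
                      0.25, 0.5, 1.0, 2.5, 5.0, 10.0]


def estimate_quantile(q, upper_bounds, cumulative_counts):
    """Estimate a quantile from cumulative histogram buckets.

    upper_bounds: finite bucket bounds in ascending order.
    cumulative_counts: one count per finite bucket plus a final +Inf bucket,
        so len(cumulative_counts) == len(upper_bounds) + 1.
    """
    if not 0 < q <= 1:
        raise ValueError("q must be in (0, 1]")
    if len(cumulative_counts) != len(upper_bounds) + 1:
        raise ValueError("expected one count per bound plus the +Inf bucket")

    total = cumulative_counts[-1]
    if total == 0:
        return math.nan  # no observations, nothing to estimate

    rank = q * total
    # First bucket whose cumulative count reaches the target rank.
    index = next(i for i, c in enumerate(cumulative_counts) if c >= rank)

    if index == len(upper_bounds):
        # Rank falls in the +Inf bucket: report the highest finite bound.
        return upper_bounds[-1]

    lower = upper_bounds[index - 1] if index > 0 else 0.0
    below = cumulative_counts[index - 1] if index > 0 else 0
    upper = upper_bounds[index]
    in_bucket = cumulative_counts[index] - below
    # Linear interpolation inside the bucket.
    return lower + (upper - lower) * (rank - below) / in_bucket
```

With 100 observations of which 98 finished within 250 ms and all within 500 ms, the P99 estimate lands halfway through the 250 ms to 500 ms bucket, whatever the true latencies inside that bucket were. This is why wide buckets at the tail make P99 values coarse.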
L3 -- Agent Gateway Metrics
Defined in layers/agent-gateway/src/metrics.rs. Exposed at http://agent-gateway:5001/metrics.
| Metric | Type | Labels | Description |
|---|---|---|---|
| agent_gw_sessions_total | Counter | -- | Total agent sessions created |
| agent_gw_sessions_active | Gauge | -- | Currently active agent sessions |
| agent_gw_tool_calls_total | Counter | -- | Total MCP tool calls executed |
| agent_gw_tool_calls_denied | Counter | -- | Tool calls denied by policy (allowlist/denylist) |
| agent_gw_a2a_hops_total | Counter | -- | Total A2A delegation hops |
| agent_gw_hitl_requests_total | Counter | -- | Total HITL approval requests created |
| agent_gw_hitl_approved_total | Counter | -- | HITL requests approved |
| agent_gw_hitl_denied_total | Counter | -- | HITL requests denied |
| agent_gw_tool_latency_seconds | Histogram | -- | Tool call latency distribution. Buckets: 1ms, 5ms, 10ms, 50ms, 100ms, 500ms, 1s, 5s, 10s, 30s |
| agent_gw_budget_remaining | Gauge | -- | Current remaining budget |
| agent_gw_tool_poisoning_detected | Counter | -- | Tool poisoning detection events (fingerprint mismatch) |
OTel Collector Metrics
The OTel Collector re-exports metrics received via OTLP under the gateway_ namespace (configured in infra/otel/config.yaml).
| Setting | Value |
|---|---|
| Namespace | gateway |
| Endpoint | 0.0.0.0:8889 |
| Resource-to-telemetry | Enabled |
PromQL Examples
L1: Request rate by route (last 5 minutes)
```promql
sum(rate(apisix_http_status[5m])) by (route, code)
```
L1: P99 latency per route
```promql
histogram_quantile(0.99,
  sum(rate(apisix_http_latency_bucket{type="request"}[5m])) by (le, route)
)
```
L1: Error rate (4xx + 5xx)
```promql
sum(rate(apisix_http_status{code=~"4..|5.."}[5m]))
/
sum(rate(apisix_http_status[5m]))
```
L1: Current active connections
```promql
apisix_nginx_http_current_connections{state="active"}
```
L2: AI Gateway request rate
```promql
rate(ai_gateway_requests_total[5m])
```
L2: Cache hit ratio
```promql
rate(ai_gateway_cache_hits_total[5m])
/
(rate(ai_gateway_cache_hits_total[5m]) + rate(ai_gateway_cache_misses_total[5m]))
```
L2: P99 AI Gateway latency
```promql
histogram_quantile(0.99,
  sum(rate(ai_gateway_latency_seconds_bucket[5m])) by (le)
)
```
L2: Token consumption rate
```promql
rate(ai_gateway_tokens_total[1h])
```
L2: Budget exceeded events
```promql
rate(ai_gateway_budget_exceeded_total[5m])
```
L3: Active sessions
```promql
agent_gw_sessions_active
```
L3: Tool call throughput
```promql
rate(agent_gw_tool_calls_total[5m])
```
L3: Tool call denial rate
```promql
rate(agent_gw_tool_calls_denied[5m])
/
rate(agent_gw_tool_calls_total[5m])
```
L3: HITL approval rate
```promql
rate(agent_gw_hitl_approved_total[5m])
/
(rate(agent_gw_hitl_approved_total[5m]) + rate(agent_gw_hitl_denied_total[5m]))
```
L3: P99 tool call latency
```promql
histogram_quantile(0.99,
  sum(rate(agent_gw_tool_latency_seconds_bucket[5m])) by (le)
)
```
L3: Tool poisoning alerts
```promql
increase(agent_gw_tool_poisoning_detected[1h]) > 0
```
Grafana Datasources
Configured in infra/grafana/provisioning/datasources/datasources.yaml:
| Datasource | Type | URL | Default |
|---|---|---|---|
| Prometheus | prometheus | http://prometheus:9090 | Yes |
| Jaeger | jaeger | http://jaeger:16686 | No |
| ClickHouse | grafana-clickhouse-datasource | clickhouse:9000 (native) | No |
Access Grafana at http://localhost:3002 (default credentials: admin / admin).
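Outside Grafana, the queries under PromQL Examples can also be evaluated directly against the Prometheus HTTP API (`/api/v1/query`). The sketch below is a minimal Python illustration using only the standard library. It assumes Prometheus is reachable at the datasource URL listed above, and the helper names (`instant_query_url`, `parse_vector`, `run_query`) are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: same URL as the Prometheus datasource in the table above.
PROMETHEUS_URL = "http://prometheus:9090"


def instant_query_url(promql, base_url=PROMETHEUS_URL):
    """Build an instant-query URL for the Prometheus HTTP API."""
    return f"{base_url}/api/v1/query?{urlencode({'query': promql})}"


def parse_vector(payload):
    """Convert an instant-query response into a list of (labels, value) pairs."""
    if payload.get("status") != "success":
        raise RuntimeError(f"query failed: {payload.get('error', 'unknown error')}")
    data = payload["data"]
    if data["resultType"] != "vector":
        raise ValueError(f"expected a vector result, got {data['resultType']}")
    # Each sample is {"metric": {...labels...}, "value": [timestamp, "string value"]}.
    return [(s["metric"], float(s["value"][1])) for s in data["result"]]


def run_query(promql, base_url=PROMETHEUS_URL, timeout=10.0):
    """Run an instant query over HTTP and return the parsed samples."""
    with urlopen(instant_query_url(promql, base_url), timeout=timeout) as resp:
        return parse_vector(json.load(resp))
```

For example, `run_query("agent_gw_sessions_active")` would return the current session gauge, provided that series is visible to Prometheus under that name.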