Gatez Gateway Platform — Performance Benchmark Report

Test Environment

  • Hardware: Apple Silicon (Docker Desktop), 16GB RAM
  • Docker: Compose with 15 services (no resource limits in dev)
  • Note: Production targets require bare-metal or cloud instances. These numbers represent a local dev baseline.

Layer 1 — APISIX API Gateway

| Metric | Result | Target | Notes |
| --- | --- | --- | --- |
| Throughput | 479 req/s | 50,000 TPS | Local Docker with plugins enabled |
| P50 latency | 19.2ms | <10ms | Includes Redis rate limit check |
| P95 latency | 49.7ms | <30ms | |
| P99 latency | 55.9ms | <50ms | Plugin overhead: ~20ms |
| Error rate | 0% | <0.01% | |

Test: ./scripts/load-test.sh 20 200 — 20 concurrent workers, 200 requests.
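As a sanity check on the figures above, a closed-loop load test (each worker waits for its response before sending the next request) follows Little's law: throughput is roughly concurrency divided by the mean time per request. The sketch below applies that to the reported numbers. It assumes the script runs closed-loop, which the report does not state, and the helper names are illustrative only.

```python
# Assumption: load-test.sh is a closed-loop test (one in-flight request per worker).

def implied_ms_per_request(workers: int, throughput_rps: float) -> float:
    """Little's law for a closed loop: time per request = workers / throughput."""
    return workers / throughput_rps * 1000.0


def implied_duration_s(total_requests: int, throughput_rps: float) -> float:
    """Wall-clock time needed to complete the run at the measured throughput."""
    return total_requests / throughput_rps


WORKERS = 20            # from the test command above
TOTAL_REQUESTS = 200    # from the test command above
THROUGHPUT_RPS = 479.0  # measured throughput from the table

per_request_ms = implied_ms_per_request(WORKERS, THROUGHPUT_RPS)
duration_s = implied_duration_s(TOTAL_REQUESTS, THROUGHPUT_RPS)

print(f"implied time per request per worker: {per_request_ms:.1f} ms")
print(f"implied run duration: {duration_s:.2f} s")
```

Under these assumptions each worker spends about 41.8 ms per request, roughly double the 19.2 ms P50, and the whole run lasts under half a second. One possible reading is that client-side overhead in the script accounts for part of the gap, and that a run this short is sensitive to startup noise.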

Plugins active: tenant-rate-limit (Redis), http-logger, prometheus, request-id.

Production estimate: On a 4-core dedicated instance, expect 20,000-40,000 TPS with the same plugin set (based on APISIX published benchmarks).
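To relate this estimate to the 50,000 TPS target, the following sketch computes how many instances would be needed at each end of the estimated range. It assumes throughput scales linearly with instance count and leaves no headroom; both are simplifications, not measured results.

```python
TARGET_TPS = 50_000             # L1 throughput target from the table
PER_INSTANCE_TPS_LOW = 20_000   # low end of the production estimate
PER_INSTANCE_TPS_HIGH = 40_000  # high end of the production estimate


def replicas_needed(target_tps: int, per_instance_tps: int) -> int:
    """Integer ceiling of target / per-instance throughput (linear scaling assumed)."""
    if per_instance_tps <= 0:
        raise ValueError("per_instance_tps must be positive")
    return -(-target_tps // per_instance_tps)


worst_case = replicas_needed(TARGET_TPS, PER_INSTANCE_TPS_LOW)
best_case = replicas_needed(TARGET_TPS, PER_INSTANCE_TPS_HIGH)
print(f"replicas needed: {best_case} to {worst_case}")
```

Under the linear-scaling assumption this gives 2 to 3 instances, which sits inside the 2-4 APISIX replicas listed under Production Sizing Recommendations.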

Layer 2 — Rust AI Gateway

| Metric | Result | Target | Notes |
| --- | --- | --- | --- |
| Cache-hit throughput | 599 req/s | 2,000+ req/s | Redis exact-match cache path |
| Cache-hit avg latency | 18.43ms | <5ms | Includes tenant extraction + budget check |
| Cache-hit P99 | 61.23ms | <20ms | |
| Error rate | 0% | <0.01% | |

Test: ./scripts/load-test-l2.sh 30 600 — 30 concurrent workers, 600 requests, pre-warmed Redis cache.
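For context on how much data sits behind a tail percentile in a run of this size, here is a minimal nearest-rank percentile sketch. The report does not say which percentile method the scripts use, so nearest-rank is an assumption, and the sample values are synthetic placeholders rather than measured latencies.

```python
def nearest_rank(p: int, sample_count: int) -> int:
    """Return the 1-based nearest-rank position, ceil(p / 100 * n), using integer math."""
    if not 0 < p <= 100 or sample_count <= 0:
        raise ValueError("p must be in 1..100 and sample_count must be positive")
    return (p * sample_count + 99) // 100


def percentile(samples, p: int) -> float:
    """Pick the sample at the nearest-rank position of the sorted list."""
    ordered = sorted(samples)
    return ordered[nearest_rank(p, len(ordered)) - 1]


REQUESTS = 600  # request count of the L2 run above
p99_rank = nearest_rank(99, REQUESTS)
samples_above_p99 = REQUESTS - p99_rank
print(f"P99 is sorted sample #{p99_rank}; {samples_above_p99} samples lie above it")

# Synthetic latencies 1.0 .. 600.0 ms, used only to exercise the function.
synthetic_ms = [float(i) for i in range(1, REQUESTS + 1)]
print(percentile(synthetic_ms, 50), percentile(synthetic_ms, 99))
```

With 600 requests, only 6 samples lie above the P99 position, so a handful of slow requests can move the reported tail latency noticeably.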

Cache-miss path: Depends entirely on LLM provider latency (100ms-2000ms). Gateway overhead adds <20ms.
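The effect of the cache-miss path on average latency can be sketched as a weighted mean of the hit and miss paths. The hit ratio is a hypothetical input (the report does not give one), and the gateway overhead uses the stated upper bound of 20 ms.

```python
CACHE_HIT_AVG_MS = 18.43        # measured cache-hit average from the table
GATEWAY_OVERHEAD_MS = 20.0      # upper bound taken from "<20ms"
PROVIDER_MS_RANGE = (100.0, 2000.0)  # provider latency range quoted above


def blended_latency_ms(hit_ratio: float, provider_ms: float) -> float:
    """Mean latency when a fraction hit_ratio of requests is served from cache."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    miss_ms = provider_ms + GATEWAY_OVERHEAD_MS
    return hit_ratio * CACHE_HIT_AVG_MS + (1.0 - hit_ratio) * miss_ms


# Hypothetical hit ratios, evaluated at both ends of the provider latency range.
for hit_ratio in (0.0, 0.5, 0.9):
    fast, slow = (blended_latency_ms(hit_ratio, ms) for ms in PROVIDER_MS_RANGE)
    print(f"hit ratio {hit_ratio:.0%}: {fast:.1f} ms to {slow:.1f} ms")
```

Even at a high hit ratio, the miss path dominates the mean whenever provider latency is near the top of the range, which is why the cache-hit numbers alone do not describe end-to-end latency.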

Production estimate: On a 2-core instance, expect 2,000-5,000 cache-hit req/s. Rust's zero-copy SSE streaming handles 10,000+ concurrent connections.

Layer 3 — Rust Agent Gateway

| Metric | Result | Notes |
| --- | --- | --- |
| Session creation | <5ms | Redis SET with TTL |
| Tool call (local) | <10ms | MCP JSON-RPC forwarding |
| A2A send | <5ms | Redis agent lookup + HTTP forward |
| HITL creation | <3ms | Redis LPUSH |

12/12 integration tests pass consistently.
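To make the "MCP JSON-RPC forwarding" row concrete, the sketch below builds the kind of JSON-RPC 2.0 envelope that an MCP `tools/call` request uses. The gateway's actual forwarding code is not shown in this report, so this is only an illustration of the message shape; the tool name and arguments are hypothetical.

```python
import json
from itertools import count

_request_ids = count(1)  # simple monotonically increasing request ids


def build_tool_call(tool_name: str, arguments: dict) -> str:
    """Serialize a JSON-RPC 2.0 request for the MCP "tools/call" method."""
    request = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    return json.dumps(request)


# "lookup_order" and its arguments are made-up examples, not tools from this platform.
payload = build_tool_call("lookup_order", {"order_id": "A-123"})
print(payload)
```

Forwarding a message of this size involves little more than one serialization and one network hop, which is in line with the sub-10ms figure reported for local tool calls.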

ClickHouse Write Throughput

| Table | Engine | Write Pattern | Estimated Capacity |
| --- | --- | --- | --- |
| request_log | Buffer → MergeTree | Async batch (16 buffers, 10s flush) | 50,000+ writes/sec |
| ai_request_log | Buffer → MergeTree | Async batch | 10,000+ writes/sec |
| agent_audit_log | Buffer → MergeTree | Async fire-and-forget | 10,000+ writes/sec |
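The request_log figures imply a rough bound on how many rows sit in memory between flushes. The arithmetic below assumes that "16 buffers" refers to 16 buffer layers, that "10s flush" is the longest time rows wait before being flushed, and that writes spread evenly across layers; none of this is confirmed by the report.

```python
WRITE_RATE_PER_S = 50_000  # estimated capacity for request_log
FLUSH_INTERVAL_S = 10      # assumed maximum time between flushes
BUFFER_LAYERS = 16         # assumed meaning of "16 buffers"


def rows_buffered(write_rate_per_s: int, flush_interval_s: int) -> int:
    """Upper bound on rows held in memory across all layers between flushes."""
    return write_rate_per_s * flush_interval_s


def rows_per_layer(total_rows: int, layers: int) -> int:
    """Rows per layer if writes are spread evenly (an idealization)."""
    return total_rows // layers


total_rows = rows_buffered(WRITE_RATE_PER_S, FLUSH_INTERVAL_S)
per_layer = rows_per_layer(total_rows, BUFFER_LAYERS)
print(f"up to {total_rows} rows buffered, about {per_layer} per layer")
```

At the estimated rate, up to 500,000 rows (about 31,250 per layer) could be waiting in memory at any moment, which is worth keeping in mind when sizing ClickHouse memory.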

Redis Performance

| Operation | Latency | Notes |
| --- | --- | --- |
| Rate limit check (Lua script) | <1ms | Atomic ZSET sliding window |
| Cache GET | <1ms | Exact-match lookup |
| Budget check | <1ms | Simple GET |
| Session GET | <1ms | JSON deserialize from Redis |

A single Redis instance is sufficient for 50,000 TPS. Connections are pooled via keepalive in the Lua plugins and ConnectionManager in Rust.
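The "atomic ZSET sliding window" in the table can be illustrated with an in-memory emulation of the usual three steps: drop entries older than the window, count what remains, and record the new request if it is allowed. This is a single-process sketch, not the platform's Lua script. In Redis the same steps would typically map to ZREMRANGEBYSCORE, ZCARD, and ZADD, executed atomically inside one script. The class name and the choice not to record rejected requests are assumptions.

```python
import bisect


class SlidingWindowLimiter:
    """In-memory stand-in for a per-tenant sorted set of request timestamps."""

    def __init__(self, limit: int, window_ms: int) -> None:
        self.limit = limit
        self.window_ms = window_ms
        self._timestamps = {}  # tenant id -> sorted list of timestamps in ms

    def allow(self, tenant: str, now_ms: int) -> bool:
        stamps = self._timestamps.setdefault(tenant, [])
        # Step 1: evict timestamps at or before the start of the window.
        cutoff = now_ms - self.window_ms
        del stamps[:bisect.bisect_right(stamps, cutoff)]
        # Step 2: count the requests still inside the window.
        if len(stamps) >= self.limit:
            return False
        # Step 3: record this request, keeping the list sorted.
        bisect.insort(stamps, now_ms)
        return True


limiter = SlidingWindowLimiter(limit=3, window_ms=1000)
decisions = [limiter.allow("tenant-a", t) for t in (0, 100, 200, 300, 1150)]
print(decisions)
```

The fourth request is rejected because three requests already fall inside the one-second window; by 1150 ms the first two have aged out, so the fifth is allowed. Doing all three steps in one atomic script is what prevents two concurrent requests from both passing the count check.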

Methodology

  1. All tests run on local Docker Desktop (Apple Silicon, 16GB RAM)
  2. Docker resource limits: APISIX 2 CPU / 512MB, others default
  3. Services warm for 60s before test start
  4. Rate limit set to 1,000,000 for load test tenant
  5. Cache pre-warmed for L2 cache-hit tests
  6. Results averaged over 3 runs, outliers discarded
  7. k6 scripts for reproducible benchmarks with threshold validation
```bash
# Install k6
brew install k6  # macOS
# or: https://k6.io/docs/getting-started/installation/

# Start stack
docker compose up -d
./scripts/keycloak-setup.sh
./scripts/setup-routes.sh
./scripts/setup-key-auth.sh

# Wait 60s for warm-up
sleep 60

# L1: APISIX throughput + latency
./scripts/benchmark-l1.sh 50 30s

# L2: AI Gateway cache-hit path
./scripts/benchmark-l2.sh 30 30s

# L3: Agent Gateway session creation
./scripts/benchmark-l3.sh 20 30s
```

Legacy curl-based scripts (no k6 required)

```bash
./scripts/load-test.sh 50 1000
./scripts/load-test-l2.sh 30 600
```

Pass/Fail Thresholds (k6)

| Layer | Metric | Threshold |
| --- | --- | --- |
| L1 | P99 latency | < 100ms |
| L1 | Error rate | < 1% |
| L2 | Cache-hit P99 | < 50ms |
| L2 | Error rate | < 1% |
| L3 | Session create P99 | < 50ms |
| L3 | Error rate | < 5% |
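The threshold logic amounts to a strict "measured value is below the limit" check per metric. The sketch below applies it to the figures reported earlier in this document. Those figures come from the legacy curl-based runs, not from k6, so this is a comparison under that assumption rather than a k6 result. L3 is left out because the report gives only upper bounds for it, not a P99.

```python
# Thresholds from the table above, keyed by (layer, metric).
THRESHOLDS = {
    ("L1", "p99_ms"): 100.0,
    ("L1", "error_rate_pct"): 1.0,
    ("L2", "p99_ms"): 50.0,
    ("L2", "error_rate_pct"): 1.0,
}

# Figures reported in the Layer 1 and Layer 2 tables (legacy script runs).
REPORTED = {
    ("L1", "p99_ms"): 55.9,
    ("L1", "error_rate_pct"): 0.0,
    ("L2", "p99_ms"): 61.23,
    ("L2", "error_rate_pct"): 0.0,
}


def evaluate(measured: dict, thresholds: dict) -> dict:
    """Return pass/fail per metric; a metric passes when it is strictly below its limit."""
    return {
        key: measured[key] < limit
        for key, limit in thresholds.items()
        if key in measured
    }


results = evaluate(REPORTED, THRESHOLDS)
for (layer, metric), passed in results.items():
    print(f"{layer} {metric}: {'PASS' if passed else 'FAIL'}")
```

On these dev-machine figures, the L2 cache-hit P99 of 61.23 ms would not meet the 50 ms threshold, while the other three checks pass.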

Production Sizing Recommendations

| Service | CPU | Memory | Replicas | Notes |
| --- | --- | --- | --- | --- |
| APISIX | 2 cores | 512MB | 2-4 | Scale horizontally for TPS |
| AI Gateway | 1 core | 512MB | 2-4 | Scale for concurrent LLM calls |
| Agent Gateway | 1 core | 512MB | 2 | Scale for concurrent sessions |
| Control Plane API | 1 core | 256MB | 2 | Low traffic, HA only |
| Redis | 1 core | 512MB | 1 (Sentinel for HA) | Single instance handles 100K ops/sec |
| ClickHouse | 2 cores | 2GB | 1-3 | Scale for query complexity |
| etcd | 1 core | 256MB | 3 | Must be odd number for quorum |
| Keycloak | 2 cores | 1GB | 2 | Scale for concurrent logins |
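The table can be rolled up into a total cluster footprint by multiplying each row by its replica range. The sketch below does that, treating 1GB as 1024MB and counting Redis as a single instance; any Sentinel or replica nodes added for HA would come on top, since the table does not size them.

```python
# (service, cores per replica, memory MB per replica, min replicas, max replicas)
SIZING = [
    ("APISIX", 2, 512, 2, 4),
    ("AI Gateway", 1, 512, 2, 4),
    ("Agent Gateway", 1, 512, 2, 2),
    ("Control Plane API", 1, 256, 2, 2),
    ("Redis", 1, 512, 1, 1),  # assumption: Sentinel/HA nodes not counted
    ("ClickHouse", 2, 2048, 1, 3),
    ("etcd", 1, 256, 3, 3),
    ("Keycloak", 2, 1024, 2, 2),
]


def totals(rows, use_max: bool):
    """Sum cores and memory over all services at the low or high replica count."""
    cores = 0
    memory_mb = 0
    for _name, row_cores, row_mb, min_rep, max_rep in rows:
        replicas = max_rep if use_max else min_rep
        cores += row_cores * replicas
        memory_mb += row_mb * replicas
    return cores, memory_mb


min_cores, min_mb = totals(SIZING, use_max=False)
max_cores, max_mb = totals(SIZING, use_max=True)
print(f"cores: {min_cores} to {max_cores}")
print(f"memory: {min_mb / 1024:.2f} to {max_mb / 1024:.2f} GB")
```

Under these assumptions the recommended deployment needs between 20 and 30 cores and between 8.75 and 14.75 GB of memory, before any operating system or orchestration overhead.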
