Gatez Gateway Platform — Performance Benchmark Report

Test Environments

AWS Production (2026-04-09)

Instance type: AWS m6a.xlarge (spot)
CPU: 4 vCPU (AMD EPYC, 3rd gen)
Memory: 16 GB RAM
Cost: ~$25-34/month (spot pricing)
Region: ap-south-1 (Mumbai)
OS: Ubuntu 22.04 LTS
Docker: 19 containers running (all services + observability stack)
Tool: wrk 4.2.0 (10 threads, 200-500 connections, 30s duration)
Network: All tests run localhost on the VM (no internet RTT)

Local Development

Hardware: Apple Silicon M-series (Docker Desktop), 16GB RAM
Docker: Compose with 15 services (no resource limits in dev)
Note: Docker Desktop on macOS throttles I/O significantly — expect 50-100x lower throughput than bare-metal Linux

Layer 1 — APISIX API Gateway

AWS Production Results

Gateway-only (no upstream proxy — measures APISIX + plugin overhead)

Metric	Result	Target	Status
Throughput (bare, no plugins)	46,665 req/s	50,000 TPS	93% of target
Throughput (all plugins active)	43,011 req/s	50,000 TPS	86% of target
P50 latency	5.25ms	<10ms	PASS
P99 latency	~15ms	<50ms	PASS
Plugin overhead	<8%	<10%	PASS
Error rate	0%	<0.01%	PASS
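
The Status and Plugin overhead figures follow directly from the two throughput measurements. A small check, using only the numbers reported in the table (the function and variable names are illustrative):

```python
# Figures copied from the gateway-only table above.
TARGET_TPS = 50_000
BARE_RPS = 46_665          # no plugins
ALL_PLUGINS_RPS = 43_011   # all plugins active


def pct_of_target(rps: float, target: float = TARGET_TPS) -> float:
    """Measured throughput as a percentage of the target."""
    return 100.0 * rps / target


def plugin_overhead_pct(bare: float, with_plugins: float) -> float:
    """Throughput lost when plugins are enabled, relative to the bare run."""
    return 100.0 * (bare - with_plugins) / bare


if __name__ == "__main__":
    print(f"bare:        {pct_of_target(BARE_RPS):.1f}% of target")
    print(f"all plugins: {pct_of_target(ALL_PLUGINS_RPS):.1f}% of target")
    print(f"overhead:    {plugin_overhead_pct(BARE_RPS, ALL_PLUGINS_RPS):.2f}%")
```

This gives roughly 93.3% and 86.0% of target, and a plugin overhead of about 7.8%, consistent with the <8% entry.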

Test: wrk -t10 -c200 -d30s http://APISIX_IP:9080/ — direct to APISIX container, bypassing Caddy TLS.

Plugins active: tenant-rate-limit (Redis), key-auth, clickhouse-logger, prometheus, request-id, consumer-restriction, response-pii-scrub.

Full stack (APISIX → upstream → response)

Metric	Result	Bottleneck	Notes
APISIX → mock-backend (200)	961 req/s	mock-backend	httpbin Python caps at ~1,835 req/s
APISIX → mock-backend + key-auth	848 req/s	mock-backend	~12% below the run without key-auth; upstream remains the bottleneck
APISIX → mock-backend (500 conn)	951 req/s	mock-backend	Flat — confirms upstream saturation
Mock-backend direct (no APISIX)	1,835 req/s	—	Python httpbin baseline

Test: wrk -t10 -c200 -d30s -H "X-Tenant-ID: retail" http://APISIX_IP:9080/smoke/get

Key insight: In the full-stack tests, throughput is capped by the Python mock backend, not by the gateway. APISIX sustains 43k req/s with the full plugin pipeline, so with real production backends (Go/Rust/Java at 10k+ req/s) the gateway is not expected to be the bottleneck until aggregate upstream capacity approaches that figure.
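
The bottleneck reasoning can be sketched as a simple series model: the ceiling of a request path is the capacity of its slowest stage. This is an assumption-laden upper bound, not a prediction. The measured 961 req/s through APISIX sits below the 1,835 req/s direct baseline, so real paths lose more than the model suggests. The stage names and the 10k backend are illustrative.

```python
def pipeline_ceiling(stage_capacities_rps: dict[str, float]) -> tuple[str, float]:
    """Return (bottleneck stage, throughput ceiling) for stages in series."""
    name = min(stage_capacities_rps, key=stage_capacities_rps.get)
    return name, stage_capacities_rps[name]


# Capacities taken from the tables above.
measured = {"apisix_all_plugins": 43_011, "mock_backend_direct": 1_835}

# Hypothetical production backend at the 10k req/s mentioned in the text.
hypothetical = {"apisix_all_plugins": 43_011, "backend_10k": 10_000}

if __name__ == "__main__":
    for label, stages in (("measured", measured), ("hypothetical", hypothetical)):
        stage, ceiling = pipeline_ceiling(stages)
        print(f"{label}: bottleneck={stage}, ceiling={ceiling} req/s")
```

In both cases the model names the upstream as the limiting stage, which matches the flat result at 500 connections.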

To reach 50k TPS gateway-only: Scale to m6a.2xlarge (8 vCPU, $50/mo spot) or add a second APISIX node behind a load balancer.
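
A rough way to size the second option is to divide the target by the measured per-node throughput. The sketch below assumes throughput scales linearly with node count and that the load balancer adds no overhead; neither was measured in this report. The `headroom` parameter is a hypothetical knob for keeping spare capacity.

```python
import math


def nodes_needed(target_rps: float, per_node_rps: float, headroom: float = 0.0) -> int:
    """Nodes required to serve target_rps while keeping `headroom` (0..1) spare."""
    if per_node_rps <= 0:
        raise ValueError("per_node_rps must be positive")
    if not 0.0 <= headroom < 1.0:
        raise ValueError("headroom must be in [0, 1)")
    usable = per_node_rps * (1.0 - headroom)
    return math.ceil(target_rps / usable)


if __name__ == "__main__":
    # 43,011 req/s is the all-plugins result on one m6a.xlarge.
    print(nodes_needed(50_000, 43_011))
    print(nodes_needed(50_000, 43_011, headroom=0.5))
```

With no headroom, two nodes cover the 50k target; keeping half of each node spare raises that to three.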

Local Development Results (baseline)

Metric	Result	Notes
Throughput	479 req/s	Docker Desktop throttles I/O; ~90x slower than the AWS all-plugins result (43,011 req/s)
P50 latency	19.2ms
P99 latency	55.9ms

Test: ./scripts/load-test.sh 20 200 — 20 concurrent workers, 200 requests.

Layer 2 — Rust AI Gateway

Metric	Result	Target	Notes
Cache-hit throughput	599 req/s	2,000+ req/s	Redis exact-match cache path
Cache-hit avg latency	18.43ms	<5ms	Includes tenant extraction + budget check
Cache-hit P99	61.23ms	<20ms
Error rate	0%	<0.01%

Test: ./scripts/load-test-l2.sh 30 600 — 30 concurrent workers, 600 requests, pre-warmed Redis cache.

Cache-miss path: Depends entirely on LLM provider latency (100ms-2000ms). Gateway overhead adds <20ms.
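
The split between the two paths can be illustrated with a minimal exact-match cache. This is a sketch, not the gateway's Rust implementation: a dict stands in for Redis, and the key layout, class name, and `call_provider` callback are assumptions made for illustration.

```python
import hashlib
import json
from typing import Callable


def cache_key(tenant: str, model: str, prompt: str) -> str:
    """Exact-match key: identical (tenant, model, prompt) yields the same key."""
    payload = json.dumps([tenant, model, prompt], separators=(",", ":"))
    return "llm:" + hashlib.sha256(payload.encode("utf-8")).hexdigest()


class ExactMatchCache:
    def __init__(self, call_provider: Callable[[str, str], str]) -> None:
        self._store: dict[str, str] = {}  # in-memory stand-in for Redis
        self._call_provider = call_provider
        self.hits = 0
        self.misses = 0

    def complete(self, tenant: str, model: str, prompt: str) -> str:
        key = cache_key(tenant, model, prompt)
        cached = self._store.get(key)
        if cached is not None:
            # Cache-hit path: no provider call, only the lookup cost.
            self.hits += 1
            return cached
        # Cache-miss path: latency is dominated by the provider call.
        self.misses += 1
        response = self._call_provider(model, prompt)
        self._store[key] = response
        return response
```

Because the key includes the tenant, two tenants sending the same prompt do not share cached responses in this sketch; whether the real gateway scopes keys this way is not stated in the report.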

Production estimate (not measured): On a 2-core instance, expect 2,000-5,000 cache-hit req/s. Rust's zero-copy SSE streaming is expected to handle 10,000+ concurrent connections.

Layer 3 — Rust Agent Gateway

Metric	Result	Notes
Session creation	<5ms	Redis SET with TTL
Tool call (local)	<10ms	MCP JSON-RPC forwarding
A2A send	<5ms	Redis agent lookup + HTTP forward
HITL creation	<3ms	Redis LPUSH
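
The two cheapest operations in the table map onto single Redis commands: a SET with an expiry for session creation, and an LPUSH for queuing a human-in-the-loop request. The sketch below shows that shape with a tiny in-memory stand-in so it runs without a Redis server. The key names, payload fields, and default TTL are hypothetical and not taken from the gateway.

```python
import json
import time
import uuid
from collections import deque
from typing import Callable, Optional


class FakeRedis:
    """In-memory stand-in for the two Redis commands used below."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._values: dict[str, tuple[str, float]] = {}
        self._lists: dict[str, deque] = {}

    def set(self, key: str, value: str, ex: int) -> None:
        # Mirrors SET key value EX seconds.
        self._values[key] = (value, self._clock() + ex)

    def get(self, key: str) -> Optional[str]:
        item = self._values.get(key)
        if item is None or self._clock() >= item[1]:
            self._values.pop(key, None)  # expired or missing
            return None
        return item[0]

    def lpush(self, key: str, value: str) -> int:
        # Mirrors LPUSH: insert at the head, return the new list length.
        queue = self._lists.setdefault(key, deque())
        queue.appendleft(value)
        return len(queue)


def create_session(r: FakeRedis, tenant: str, agent_id: str, ttl_seconds: int = 1800) -> str:
    """Create a session record that expires on its own after ttl_seconds."""
    session_id = str(uuid.uuid4())
    payload = json.dumps({"tenant": tenant, "agent_id": agent_id})
    r.set(f"session:{session_id}", payload, ex=ttl_seconds)
    return session_id


def request_approval(r: FakeRedis, session_id: str, action: str) -> int:
    """Queue a human-in-the-loop approval request; returns the queue length."""
    payload = json.dumps({"session_id": session_id, "action": action})
    return r.lpush("hitl:pending", payload)
```

Each function issues exactly one command, which is consistent with the single-digit millisecond latencies reported above.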

12/12 integration tests pass consistently.

ClickHouse Write Throughput

Table	Engine	Write Pattern	Estimated Capacity
request_log	Buffer → MergeTree	Async batch (16 buffers, 10s flush)	50,000+ writes/sec
ai_request_log	Buffer → MergeTree	Async batch	10,000+ writes/sec
agent_audit_log	Buffer → MergeTree	Async fire-and-forget	10,000+ writes/sec
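
To get a feel for what the request_log row implies, the batch size per flush can be estimated from the write rate and the flush interval. The sketch assumes a sustained rate equal to the estimated capacity, flushes triggered only by the 10s timer, and rows spread evenly over the 16 buffers. ClickHouse Buffer tables can also flush on row or byte thresholds, so real batches may be smaller.

```python
def rows_per_flush(writes_per_sec: float, flush_interval_s: float,
                   num_buffers: int) -> tuple[float, float]:
    """Return (rows accumulated per flush window, rows per buffer)."""
    if num_buffers <= 0:
        raise ValueError("num_buffers must be positive")
    total = writes_per_sec * flush_interval_s
    return total, total / num_buffers


if __name__ == "__main__":
    total, per_buffer = rows_per_flush(50_000, 10, 16)
    print(f"{total:,.0f} rows per window, {per_buffer:,.0f} rows per buffer")
```

At the estimated capacity this is 500,000 rows per window, or 31,250 rows per buffer.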

Redis Performance

Operation	Latency	Notes
Rate limit check (Lua script)	<1ms	Atomic ZSET sliding window
Cache GET	<1ms	Exact-match lookup
Budget check	<1ms	Simple GET
Session GET	<1ms	JSON deserialize from Redis
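
The sliding-window check in the first row can be sketched in a few lines. This is an in-memory illustration of the algorithm, not the Lua script itself: a sorted list per tenant stands in for the ZSET, and the comments name the Redis commands a ZSET-based version would typically use. The class name and parameters are illustrative.

```python
import bisect
from collections import defaultdict


class SlidingWindowLimiter:
    """Allow at most `limit` requests per tenant in any `window_s` seconds."""

    def __init__(self, limit: int, window_s: float) -> None:
        self.limit = limit
        self.window_s = window_s
        # Sorted request timestamps per tenant (stand-in for one ZSET per tenant).
        self._events: dict[str, list[float]] = defaultdict(list)

    def allow(self, tenant: str, now: float) -> bool:
        events = self._events[tenant]
        # Drop timestamps at or before the window start (ZREMRANGEBYSCORE).
        cutoff = now - self.window_s
        del events[: bisect.bisect_right(events, cutoff)]
        # Count what remains (ZCARD) and reject if the window is full.
        if len(events) >= self.limit:
            return False
        # Record this request (ZADD).
        bisect.insort(events, now)
        return True
```

In Redis the three steps run inside one Lua script so that concurrent requests cannot interleave between the count and the insert; the single-threaded sketch gets that atomicity for free.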

A single Redis instance is sufficient for 50,000 TPS. Connections are pooled via keepalive in the Lua plugins and via ConnectionManager in the Rust services.

Methodology

Local results: run on Docker Desktop (Apple Silicon, 16GB RAM). AWS results: run on the m6a.xlarge instance described under Test Environments.
Docker resource limits: APISIX 2 CPU / 512MB, others default
Services warm for 60s before test start
Rate limit set to 1,000,000 for load test tenant
Cache pre-warmed for L2 cache-hit tests
Results averaged over 3 runs, outliers discarded
k6 scripts for reproducible benchmarks with threshold validation
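
The report does not say how outliers were identified. One plausible rule, shown here purely as an assumption, is to drop any run that deviates from the median by more than a fixed tolerance and average the rest. The function name and the 20% default are hypothetical.

```python
from statistics import mean, median


def robust_mean(runs: list[float], tolerance: float = 0.20) -> float:
    """Average the runs that lie within `tolerance` (fractional) of the median."""
    if not runs:
        raise ValueError("need at least one run")
    mid = median(runs)
    kept = [r for r in runs if abs(r - mid) <= tolerance * abs(mid)]
    # With an even number of widely spread runs nothing may survive;
    # fall back to the median in that case.
    return mean(kept) if kept else mid
```

With three runs, this keeps the median run and whichever of the other two fall within the tolerance band.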

k6 Benchmark Scripts (Recommended)

```bash
# Install k6
brew install k6  # macOS
# or: https://k6.io/docs/getting-started/installation/

# Start stack
docker compose up -d
./scripts/zitadel-setup.sh
./scripts/setup-routes.sh
./scripts/setup-key-auth.sh

# Wait 60s for warm-up
sleep 60

# L1: APISIX throughput + latency
./scripts/benchmark-l1.sh 50 30s

# L2: AI Gateway cache-hit path
./scripts/benchmark-l2.sh 30 30s

# L3: Agent Gateway session creation
./scripts/benchmark-l3.sh 20 30s
```

Legacy curl-based scripts (no k6 required)

```bash
./scripts/load-test.sh 50 1000
./scripts/load-test-l2.sh 30 600
```

Pass/Fail Thresholds (k6)

Layer	Metric	Threshold
L1	P99 latency	< 100ms
L1	Error rate	< 1%
L2	Cache-hit P99	< 50ms
L2	Error rate	< 1%
L3	Session create P99	< 50ms
L3	Error rate	< 5%
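
The logic behind these thresholds can be sketched outside k6 as well. The snippet below is not the k6 script: it applies the table's limits to a list of latency samples using a nearest-rank percentile, which may differ slightly from how k6 computes percentiles. The function names are illustrative.

```python
import math

# Thresholds copied from the table above (latency in ms, error rate as a fraction).
THRESHOLDS = {
    "L1": {"p99_ms": 100.0, "error_rate": 0.01},
    "L2": {"p99_ms": 50.0, "error_rate": 0.01},
    "L3": {"p99_ms": 50.0, "error_rate": 0.05},
}


def percentile(samples: list[float], pct: float) -> float:
    """Nearest-rank percentile: smallest sample with at least pct% at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def passes(layer: str, latencies_ms: list[float], errors: int, total: int) -> bool:
    """True when both the P99 latency and the error rate are under the limits."""
    limits = THRESHOLDS[layer]
    p99 = percentile(latencies_ms, 99)
    error_rate = errors / total
    return p99 < limits["p99_ms"] and error_rate < limits["error_rate"]
```

Both comparisons are strict, matching the "<" in the table, so a run that lands exactly on a limit fails.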

Production Sizing Recommendations

Service	CPU	Memory	Replicas	Notes
APISIX	2 cores	512MB	2-4	Scale horizontally for TPS
AI Gateway	1 core	512MB	2-4	Scale for concurrent LLM calls
Agent Gateway	1 core	512MB	2	Scale for concurrent sessions
Control Plane API	1 core	256MB	2	Low traffic, HA only
Redis	1 core	512MB	1 (Sentinel for HA)	Single instance handles 100K ops/sec
ClickHouse	2 cores	2GB	1-3	Scale for query complexity
etcd	1 core	256MB	3	Must be odd number for quorum
Zitadel	2 cores	1GB	2	Scale for concurrent logins
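
Summing the table gives a quick footprint estimate for capacity planning. The sketch below multiplies per-replica CPU and memory by the low and high ends of each replica range. It assumes 1GB = 1024MB and does not count Redis Sentinel processes or the observability stack, neither of which is sized in the table.

```python
# (service, cores per replica, memory MB per replica, min replicas, max replicas)
SIZING = [
    ("APISIX", 2, 512, 2, 4),
    ("AI Gateway", 1, 512, 2, 4),
    ("Agent Gateway", 1, 512, 2, 2),
    ("Control Plane API", 1, 256, 2, 2),
    ("Redis", 1, 512, 1, 1),
    ("ClickHouse", 2, 2048, 1, 3),
    ("etcd", 1, 256, 3, 3),
    ("Zitadel", 2, 1024, 2, 2),
]


def totals(use_max: bool) -> tuple[int, int]:
    """Return (total cores, total memory in MB) at min or max replica counts."""
    cores = 0
    memory_mb = 0
    for _name, cpu, mem, low, high in SIZING:
        replicas = high if use_max else low
        cores += cpu * replicas
        memory_mb += mem * replicas
    return cores, memory_mb


if __name__ == "__main__":
    for label, use_max in (("minimum", False), ("maximum", True)):
        cores, memory_mb = totals(use_max)
        print(f"{label}: {cores} cores, {memory_mb / 1024:.2f} GB")
```

This works out to 20 cores and 8.75 GB at minimum replica counts, and 30 cores and 14.75 GB at the maximum.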
