Gatez Feature Matrix
Complete feature inventory across all three gateway layers and the supporting platform areas. A status of Built means the feature is shipped and tested.
API Gateway (Layer 1 — APISIX + Lua)
| Feature | Status | Details |
|---|---|---|
| HTTP/HTTPS proxying | Built | APISIX core, TLS termination via Caddy/cert-manager |
| Per-tenant rate limiting | Built | Redis-backed Lua plugin, sliding window, rl:{tenant_id}:{route}:{window} |
| Rate limit hierarchy | Built | Global → tenant → route overrides, visual editor in Operator Portal |
| JWT authentication | Built | Keycloak OIDC integration, tenant_id extracted from JWT claims |
| API key authentication | Built | key-auth plugin, request → approve → issue workflow |
| Request logging | Built | ClickHouse via http-logger plugin, async (never blocks request path) |
| Route management | Built | APISIX Admin API CRUD, lifecycle states (draft → published → deprecated → retired) |
| gRPC proxying | Built | grpc-transcode + grpc-web plugins, /grpc/* and /grpc-web/* routes |
| WebSocket proxying | Built | /ws/* route, 60s keepalive, enable_websocket flag |
| Circuit breaker | Built | api-breaker plugin, configurable thresholds, failure injection for testing |
| IP restriction per tenant | Built | APISIX ip-restriction plugin, PUT /api/tenants/:id/ip-allowlist |
| Active + passive health checks | Built | All upstreams, exposed in Operator Portal health page |
| Canary deployments | Built | Traffic splitting via upstream weighting (%), blue-green pattern documented |
| Service discovery | Built | DNS SRV + Consul + K8s, documented in docs/deployment/service-discovery.md |
| Cross-layer trace propagation | Built | W3C traceparent/tracestate forwarding L1 → L2 → L3 |
| Grafana dashboard | Built | Request rate, error rate, P99 latency per tenant |
| Load tested | Built | 479 TPS with 56ms P99 on local dev; target is 50k TPS on production hardware |
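
The per-tenant rate limiting row names a sliding window and the key shape rl:{tenant_id}:{route}:{window}. A minimal sketch of that idea follows, written in Rust for illustration even though the real plugin is Lua on Redis. It assumes `window` is the index of a fixed-size time bucket and approximates the sliding window by weighting the previous bucket; `SlidingWindowLimiter`, `rate_limit_key`, and the in-memory map standing in for Redis are hypothetical.

```rust
use std::collections::HashMap;

/// Builds a counter key in the documented `rl:{tenant_id}:{route}:{window}` shape.
/// Assumption: `window` is the index of a fixed-size time bucket.
pub fn rate_limit_key(tenant_id: &str, route: &str, window: u64) -> String {
    format!("rl:{}:{}:{}", tenant_id, route, window)
}

/// In-memory stand-in for the Redis counters (hypothetical, for illustration only).
pub struct SlidingWindowLimiter {
    window_secs: u64, // must be non-zero
    limit: u64,
    counters: HashMap<String, u64>,
}

impl SlidingWindowLimiter {
    pub fn new(window_secs: u64, limit: u64) -> Self {
        Self { window_secs, limit, counters: HashMap::new() }
    }

    /// Records the hit and returns true if the tenant/route pair is under its limit.
    pub fn allow(&mut self, tenant_id: &str, route: &str, now_secs: u64) -> bool {
        let window = now_secs / self.window_secs;
        let elapsed = now_secs % self.window_secs;

        let current_key = rate_limit_key(tenant_id, route, window);
        let current = self.counters.get(&current_key).copied().unwrap_or(0);
        let previous = if window == 0 {
            0
        } else {
            let prev_key = rate_limit_key(tenant_id, route, window - 1);
            self.counters.get(&prev_key).copied().unwrap_or(0)
        };

        // Weight the previous bucket by the share of it that still overlaps
        // the sliding window ending at `now_secs`.
        let weighted_previous = previous * (self.window_secs - elapsed) / self.window_secs;
        if current + weighted_previous >= self.limit {
            return false;
        }
        *self.counters.entry(current_key).or_insert(0) += 1;
        true
    }
}
```

Because the tenant ID is part of every key, two tenants hitting the same route never share a counter, which is the property the Multi-Tenancy table calls "never shared buckets".
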
AI Gateway (Layer 2 — Custom Rust, axum + tokio)
| Feature | Status | Details |
|---|---|---|
| Multi-model routing | Built | 13 providers: OpenAI, Anthropic, Gemini, Ollama, Azure, Mistral, Cohere, DeepSeek, Together, Groq, Fireworks, vLLM, Bedrock (stub) |
| Model passthrough | Built | Prefix-pattern routing — any model ID routed to correct provider, no whitelist |
| Model aliasing | Built | MODEL_ALIASES=fast=gpt-4o-mini,smart=claude-sonnet env var |
| P2C load balancing | Built | Power of Two Choices across providers, health scoring (error rate + latency + pending) |
| Circuit breaker | Built | 3 failures → open, 30s recovery, auto half-open |
| Retry with backoff | Built | 2 retries, 100ms initial, max 10s, exponential |
| Fallback chains | Built | Circuit breaker open → auto-route to next available provider |
| Redis exact-match cache | Built | Tenant-scoped keys, cache-hit path: 599 req/s, 18ms avg |
| Semantic cache (Qdrant) | Built | Two-tier: Redis exact → Qdrant similarity, hash-based vectors |
| PII redaction | Built | Regex: SSN, email, credit card, phone, IP — runs BEFORE LLM call |
| Multi-layer prompt guards | Built | Pipeline: regex (<1ms) → webhook → action (Pass/Reject/Mask) |
| Token budget enforcement | Built | Per-tenant, pre-request check in Redis, deduct after response |
| Streaming SSE | Built | Zero-copy pass-through, no buffering |
| ClickHouse logging | Built | Async fire-and-forget: model, tokens, latency, cache_hit, pii_detected |
| Prometheus metrics | Built | Requests, cache, latency, tokens, PII, budget, active requests |
| Hot config reload | Built | POST /admin/reload, RwLock config swap, validates before applying |
| Provider health API | Built | GET /v1/providers/health — all provider stats, error rates, latency |
| OpenAI-compatible API | Built | Drop-in replacement: /v1/chat/completions, /v1/models |
| JWT signature validation | Built | Independent JWKS validation (doesn't trust L1 blindly) |
| Auth header scrubbing | Built | Strips Authorization/x-api-key before ClickHouse writes |
| Observability webhooks | Built | Batched LlmEvent export (metadata only, prompts opt-in) |
| Grafana dashboard | Built | Request rate, cache hit rate, latency P50/P95/P99, tokens, PII, budget |
| Load tested | Built | 599 req/s cache-hit, 18ms avg latency (local Docker) |
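
The circuit breaker row above gives three parameters: 3 failures open the breaker, recovery takes 30s, and the breaker then moves to half-open on its own. The sketch below shows one way such a state machine could look. It is not the gateway's actual code; the type and method names are hypothetical, and it assumes "3 failures" means consecutive failures.

```rust
use std::time::{Duration, Instant};

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum BreakerState {
    Closed,
    Open,
    HalfOpen,
}

/// Hypothetical breaker: opens after `failure_threshold` consecutive failures
/// and becomes half-open once `recovery` has elapsed.
pub struct CircuitBreaker {
    failure_threshold: u32,
    recovery: Duration,
    consecutive_failures: u32,
    opened_at: Option<Instant>,
}

impl CircuitBreaker {
    pub fn new(failure_threshold: u32, recovery: Duration) -> Self {
        Self { failure_threshold, recovery, consecutive_failures: 0, opened_at: None }
    }

    /// Derives the state from the time the breaker last opened.
    pub fn state(&self, now: Instant) -> BreakerState {
        match self.opened_at {
            None => BreakerState::Closed,
            Some(t) if now.saturating_duration_since(t) >= self.recovery => BreakerState::HalfOpen,
            Some(_) => BreakerState::Open,
        }
    }

    /// Requests pass when closed, and as probes when half-open.
    pub fn allow_request(&self, now: Instant) -> bool {
        self.state(now) != BreakerState::Open
    }

    /// Any success closes the breaker and clears the failure count.
    pub fn record_success(&mut self) {
        self.consecutive_failures = 0;
        self.opened_at = None;
    }

    /// Reaching the threshold opens the breaker; a failed half-open probe reopens it.
    pub fn record_failure(&mut self, now: Instant) {
        self.consecutive_failures += 1;
        if self.opened_at.is_some() || self.consecutive_failures >= self.failure_threshold {
            self.opened_at = Some(now);
        }
    }
}
```

With `CircuitBreaker::new(3, Duration::from_secs(30))`, a caller would check `allow_request` before each provider call and fall through to the next provider in the fallback chain when it returns false.
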
Agent Gateway (Layer 3 — Custom Rust, axum + tokio)
| Feature | Status | Details |
|---|---|---|
| MCP protocol | Built | Server registry CRUD, tool discovery, JSON-RPC forwarding |
| A2A protocol | Built | Agent registry, send message, task tracking, HTTP forwarding |
| Session lifecycle | Built | Create, list, inspect, terminate — Redis-backed with TTL |
| Tool allowlists | Built | Deny by default, per-session, tenant-scoped |
| CEL expression engine | Built | 776-line built-in evaluator, 30 tests, jwt/mcp/tenant/session vars |
| HITL approval gates | Built | Per-tenant configurable, pending queue, approve/deny API |
| Session token budgets | Built | Per-session limits, budget check before every tool call |
| Tool poisoning protection | Built | Server fingerprinting, naming collision detection (409 on conflict) |
| A2A delegation policies | Built | Cross-tenant block, chain depth limit (max 5), loop detection |
| MCP elicitation | Built | /v1/elicit + /v1/elicit/:id/respond — structured input via HITL |
| OpenAPI-to-MCP translation | Built | Auto-convert OpenAPI 3.x specs into MCP tool definitions |
| Virtual MCP endpoint | Built | GET /v1/mcp federates all servers, tool name prefixing |
| MCP health checks | Built | Background task, 30s interval, Healthy/Degraded/Unhealthy per server |
| stdio transport | Built | Process lifecycle via StdioManager, JSON-RPC over stdin/stdout |
| SSE transport | Built | HTTP POST fallback for MCP SSE servers |
| MCP OAuth | Built | RFC 9728 + RFC 8414, gateway proxy pattern, 3 validation modes |
| JSON schema validation | Built | Validate tool args against MCP input_schema (types, required fields) |
| Agent registry persistence | Built | Redis-backed, survives restarts |
| Cross-layer tracing | Built | OTel + Jaeger, L1 → L2 → L3 span tree |
| ClickHouse audit trail | Built | Tool calls, A2A hops, session events with tenant_id |
| Prometheus metrics | Built | Sessions, tool calls, denied, A2A, HITL, latency, poisoning |
| Hot config reload | Built | POST /admin/reload, health check interval + session TTL |
| Grafana dashboard | Built | 7 panels covering sessions, tool calls, A2A, HITL, latency, poisoning |
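
The A2A delegation policies row lists three checks: a cross-tenant block, a chain depth limit of 5, and loop detection. A rough sketch of how those checks might compose is shown below. The `Agent` struct, the `check_delegation` function, and the convention that `chain` holds the agent IDs already involved in the delegation are assumptions made for illustration, as is the order in which the checks run.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum DelegationError {
    CrossTenant,
    DepthExceeded,
    LoopDetected,
}

/// Depth limit taken from the matrix ("max 5").
pub const MAX_CHAIN_DEPTH: usize = 5;

/// Hypothetical view of a registered agent.
pub struct Agent<'a> {
    pub id: &'a str,
    pub tenant_id: &'a str,
}

/// Decides whether `caller` may delegate to `target`.
/// Assumption: `chain` lists the agent IDs already in this delegation chain.
pub fn check_delegation(
    chain: &[&str],
    caller: &Agent,
    target: &Agent,
) -> Result<(), DelegationError> {
    // Cross-tenant block: both agents must belong to the same tenant.
    if caller.tenant_id != target.tenant_id {
        return Err(DelegationError::CrossTenant);
    }
    // Depth limit: refuse to extend a chain that is already at the maximum.
    if chain.len() >= MAX_CHAIN_DEPTH {
        return Err(DelegationError::DepthExceeded);
    }
    // Loop detection: the target must not already appear in the chain.
    if chain.iter().any(|id| *id == target.id) {
        return Err(DelegationError::LoopDetected);
    }
    Ok(())
}
```
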
Multi-Tenancy
| Feature | Status | Details |
|---|---|---|
| tenant_id on every call | Built | JWT claim extraction, propagated L1 → L2 → L3 |
| Per-tenant rate limiting | Built | Independent quotas, Redis sliding window, never shared buckets |
| Per-tenant token budgets | Built | Pre-request check, post-response deduct, alert at 80% |
| Per-tenant API keys | Built | Namespace-scoped, request → approve → issue workflow |
| Per-tenant tool allowlists | Built | Deny by default, CEL rules per tenant |
| Per-tenant HITL policies | Built | Configurable per-tool, per-tenant |
| Per-tenant branding | Built | Logo (base64, 100KB), portal title, primary color |
| Tenant-scoped cache | Built | Redis: {tenant_id}:cache:*, no cross-tenant sharing |
| Tenant-scoped analytics | Built | ClickHouse row-level filter, per-tenant dashboards |
| Tenant-scoped audit trail | Built | Every log entry includes tenant_id |
| IP restriction per tenant | Built | CIDR allowlist, 403 on violation |
| Cross-tenant isolation | Built | Session isolation, key isolation, analytics isolation — tested |
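
The per-tenant token budget row describes a pre-request check, a post-response deduction, and an alert at 80%. The sketch below illustrates that flow with an in-memory map in place of Redis. `TokenBudgets` and `BudgetDecision` are hypothetical names, and rejecting tenants that have no configured limit is an assumption, not documented behavior.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq, Eq)]
pub enum BudgetDecision {
    Allowed,
    /// Usage is at or above 80% of the limit; the request still proceeds.
    AllowedWithAlert,
    Rejected,
}

/// In-memory stand-in for the Redis-backed budget counters (hypothetical).
pub struct TokenBudgets {
    limits: HashMap<String, u64>,
    used: HashMap<String, u64>,
}

impl TokenBudgets {
    pub fn new() -> Self {
        Self { limits: HashMap::new(), used: HashMap::new() }
    }

    pub fn set_limit(&mut self, tenant_id: &str, limit: u64) {
        self.limits.insert(tenant_id.to_string(), limit);
    }

    /// Pre-request check, run before the LLM call.
    pub fn check(&self, tenant_id: &str) -> BudgetDecision {
        // Assumption: a tenant without a configured limit is rejected.
        let limit = match self.limits.get(tenant_id) {
            Some(l) => *l,
            None => return BudgetDecision::Rejected,
        };
        let used = self.used.get(tenant_id).copied().unwrap_or(0);
        if used >= limit {
            BudgetDecision::Rejected
        } else if (used as u128) * 100 >= (limit as u128) * 80 {
            BudgetDecision::AllowedWithAlert
        } else {
            BudgetDecision::Allowed
        }
    }

    /// Post-response deduction, run once the actual token count is known.
    pub fn deduct(&mut self, tenant_id: &str, tokens: u64) {
        let used = self.used.entry(tenant_id.to_string()).or_insert(0);
        *used = used.saturating_add(tokens);
    }
}
```

Since the deduction happens after the response, a single large request can push usage past the limit; the next pre-request check is what rejects further calls.
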
Control Plane — Operator Portal
| Feature | Status | Details |
|---|---|---|
| Tenant management | Built | List, create (3-step wizard), edit, suspend, delete |
| Rate limit editor | Built | Visual hierarchy: global → tenant → route overrides |
| API catalogue | Built | Route/service browser, search, filter, plugin badges |
| OAS 3 Swagger UI | Built | Upload spec, inline try-it console, curl generator |
| API lifecycle | Built | Draft → published → deprecated → retired |
| API key management | Built | Create, show-once-then-mask, revoke, audit log |
| Key approval queue | Built | Review tenant requests, approve/deny |
| Usage analytics | Built | ClickHouse-backed KPI cards, time-series, drill-downs |
| LLM token analytics | Built | Prompt vs completion, cost per provider, per-model bars |
| Health monitoring | Built | Upstream status, dependency map, alert config |
| Session browser | Built | List, filter, terminate from UI |
| MCP tool registry | Built | Catalog, enable/disable per tenant |
| MCP tool playground | Built | Auto-generated form, execute, history, curl generator |
| Trace explorer | Built | Cross-layer L1 → L2 → L3 span tree |
| A2A topology graph | Built | Agent delegation chains, loop detection |
| HITL approval queue | Built | Pending tool calls, approve/modify/deny |
| Policy editor | Built | Visual tool allowlist + RBAC per tenant |
| CEL playground | Built | Expression editor, context builder, examples, history |
| Audit log | Built | ClickHouse-backed, filters, CSV export |
| Settings | Built | Platform config, notification config, data retention |
| LLM provider management | Built | Add/test/delete providers, secret references, UI tab |
| User management | Built | Context-aware: SCIM/SSO/Keycloak/Bootstrap adaptive UI |
| Service accounts | Built | gtz_sa_ prefixed keys, SHA-256 hashed, show-once modal |
| Webhook management | Built | Register URL + event types, delivery log |
| IP allowlist editor | Built | Per-tenant CIDR management |
| Canary deployment slider | Built | 0-100% traffic split per route |
| Notifications | Built | Bell icon, type-specific icons, polling |
| Custom branding | Built | Logo upload, portal title, color per tenant |
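
The API lifecycle row lists four states in order: draft, published, deprecated, retired. One simple way to model that is an enum with a single forward transition per state, sketched below. The matrix does not say whether states can be skipped or reverted, so the strictly forward, one-step rule here is an assumption, and the `Lifecycle` type is hypothetical.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Lifecycle {
    Draft,
    Published,
    Deprecated,
    Retired,
}

impl Lifecycle {
    /// The only state reachable from `self`, or None once retired.
    /// Assumption: transitions are strictly forward, one step at a time.
    pub fn next(self) -> Option<Lifecycle> {
        match self {
            Lifecycle::Draft => Some(Lifecycle::Published),
            Lifecycle::Published => Some(Lifecycle::Deprecated),
            Lifecycle::Deprecated => Some(Lifecycle::Retired),
            Lifecycle::Retired => None,
        }
    }

    /// True when `target` is the immediate successor of `self`.
    pub fn can_transition_to(self, target: Lifecycle) -> bool {
        self.next() == Some(target)
    }
}
```
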
Control Plane — Developer Portal
| Feature | Status | Details |
|---|---|---|
| API discovery | Built | Browse published APIs, tenant-scoped |
| Swagger try-it console | Built | Inline method selector, headers, body, live response, curl |
| Key management | Built | Request → approval → secure issuance (show-once modal) |
| My keys dashboard | Built | Masked keys, last-used, request count, revoke |
| Usage dashboard | Built | Request volume, error rate, latency, token consumption |
| Token budget visibility | Built | Remaining, burn rate, projected exhaustion |
| Agent session viewer | Built | Sessions, tool call timeline, budget gauges |
| HITL approval | Built | Approve own sessions, amber banner, countdown timer |
| Usage drill-down | Built | LLM tokens by model, cache hit rate, cost estimate, error breakdown |
| Session drill-down | Built | Tool call timeline, duration, tokens, status per call |
| Audit log export | Built | Date range, action filter, CSV export |
| Notifications | Built | Bell with unread count, type-specific icons, filter tabs |
| Settings | Built | Profile, notification prefs, branding (tenant-admin) |
| Custom branding | Built | Tenant logo, title, color |
| Tenant-locked | Built | Cannot see other tenants' data, ever |
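
The token budget visibility row shows three figures: remaining tokens, burn rate, and projected exhaustion. The arithmetic behind such a view could be as simple as the sketch below. It assumes a constant burn rate measured over the elapsed part of the budget period; the portal's actual projection method is not documented here, and `project_budget` and `BudgetProjection` are hypothetical names.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct BudgetProjection {
    pub remaining: u64,
    pub burn_rate_per_hour: f64,
    /// None when nothing has been consumed yet, so no exhaustion can be projected.
    pub hours_to_exhaustion: Option<f64>,
}

/// Linear projection: assumes tokens keep being consumed at the average rate so far.
pub fn project_budget(limit: u64, used: u64, elapsed_hours: f64) -> BudgetProjection {
    let remaining = limit.saturating_sub(used);
    let burn_rate_per_hour = if elapsed_hours > 0.0 {
        used as f64 / elapsed_hours
    } else {
        0.0
    };
    let hours_to_exhaustion = if burn_rate_per_hour > 0.0 {
        Some(remaining as f64 / burn_rate_per_hour)
    } else {
        None
    };
    BudgetProjection { remaining, burn_rate_per_hour, hours_to_exhaustion }
}
```

For example, a tenant with a 1000-token limit that has used 250 tokens in 5 hours burns 50 tokens per hour and would exhaust the remaining 750 in 15 hours.
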
Security
| Feature | Status | Details |
|---|---|---|
| JWT authentication | Built | Keycloak OIDC, validated at L1 + L2 independently |
| JWKS caching | Built | L2: 5-min, L3: 30-min TTL, offline validation |
| API key auth | Built | key-auth plugin, scoped per tenant |
| Master key fallback | Built | Service-to-service calls bypass JWT when needed |
| PII redaction | Built | Pre-LLM: SSN, email, credit card, phone, IP |
| Prompt guards | Built | Regex + webhook pipeline, Reject/Mask actions |
| Auth header scrubbing | Built | Authorization/x-api-key stripped before ClickHouse |
| Tool allowlists | Built | Deny by default, CEL expressions, per-tenant |
| Tool poisoning detection | Built | Fingerprinting, naming collision (409) |
| HITL gates | Built | Human approval for high-risk tool calls |
| Blast radius controls | Built | Session budgets, depth limits, loop detection |
| IP restriction | Built | Per-tenant CIDR allowlist |
| SQL injection protection | Built | Parameterized queries, pre-merge scan |
| XSS protection | Built | React escaping, pre-merge scan |
| Secret management | Built | KeySource enum: EnvVar implemented; Vault, K8s, AWS SM, Azure KV variants are stubs (resolvers listed under Not Yet Built) |
| License key system | Built | JWT-signed, offline validation, tier gating |
| TLS everywhere | Built | cert-manager, inter-service TLS, self-signed CA for dev |
| MCP OAuth | Built | RFC 9728/8414, PKCE, 3 validation modes |
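
The auth header scrubbing row says Authorization and x-api-key are stripped before anything is written to ClickHouse. A minimal sketch of that filter is shown below, assuming headers arrive as name/value pairs and that matching is case-insensitive, as HTTP header names are. `scrub_headers` is a hypothetical helper, not the gateway's actual function.

```rust
/// Header names that must never reach the log store, per the matrix.
const SENSITIVE_HEADERS: [&str; 2] = ["authorization", "x-api-key"];

/// Returns a copy of `headers` without credential-bearing entries.
/// Header names are compared case-insensitively.
pub fn scrub_headers(headers: &[(String, String)]) -> Vec<(String, String)> {
    headers
        .iter()
        .filter(|(name, _)| {
            !SENSITIVE_HEADERS
                .iter()
                .any(|sensitive| name.eq_ignore_ascii_case(sensitive))
        })
        .cloned()
        .collect()
}
```

Dropping the headers entirely, rather than masking their values, keeps even the length or prefix of a credential out of the logs.
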
Observability
| Feature | Status | Details |
|---|---|---|
| Prometheus metrics | Built | All 3 layers export metrics: requests, latency, errors, cache, tokens |
| Grafana dashboards | Built | L1 (4 panels), L2 (10 panels), L3 (7 panels) |
| Jaeger distributed tracing | Built | Cross-layer L1 → L2 → L3 span tree, OTel export |
| ClickHouse analytics | Built | request_log, ai_request_log, agent_audit_log — partitioned by month |
| ClickHouse TTL | Built | 90d for request logs, 365d for AI usage, no TTL on audit logs |
| Buffer engine | Built | Buffer → MergeTree for high-write tables |
| Real-time health monitoring | Built | Upstream status, dependency map, alert config |
| LLM observability webhooks | Built | Batched metadata export (Langfuse/LangSmith compatible) |
| Audit trail | Built | Every tool call, A2A hop, session event logged with tenant_id |
| CSV export | Built | request_log, ai_request_log, agent_audit_log — ClickHouse FORMAT CSVWithNames |
Enterprise & Compliance
| Feature | Status | Details |
|---|---|---|
| License key & feature gates | Built | Community / Pro / Enterprise / Trial tiers |
| SSO federation | Built | Okta, Microsoft Entra ID, Google Workspace via Keycloak OIDC broker |
| SCIM provisioning | Built | Identity source detection, role assignment API |
| Multi-realm Keycloak | Built | Dedicated realm per enterprise tenant |
| HIPAA compliance mapping | Built | docs/compliance/hipaa-mapping.md, controls → features |
| Air-gap deployment | Built | All services from container images, zero internet dependency |
| Performance benchmarks | Built | Documented methodology, L1/L2/L3 numbers |
| Dependency audit | Built | cargo audit + npm audit clean |
| Backup/restore runbook | Built | etcd, ClickHouse, Redis — RTO/RPO documented |
| Disaster recovery | Built | docs/operations/disaster-recovery.md |
| Horizontal autoscaling | Built | HPA for APISIX, AI Gateway, Agent Gateway |
| Canary deployments | Built | Traffic splitting, blue-green documented |
| Webhook system | Built | 6 event types, retry with backoff, delivery log |
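
The webhook system row mentions retry with backoff without giving parameters. The only backoff values in this matrix are the AI Gateway's (2 retries, 100ms initial, 10s max, exponential), so the sketch below borrows those purely for illustration. It assumes the delay doubles per attempt with no jitter; whether webhook delivery uses the same schedule is not stated, and `backoff_delay` is a hypothetical helper.

```rust
use std::time::Duration;

/// Delay before retry number `attempt` (0-based): `initial * 2^attempt`, capped at `max`.
/// Assumption: plain doubling with no jitter.
pub fn backoff_delay(attempt: u32, initial: Duration, max: Duration) -> Duration {
    // Saturate the multiplier instead of overflowing on very large attempt numbers.
    let factor = 2u32.checked_pow(attempt).unwrap_or(u32::MAX);
    initial
        .checked_mul(factor)
        .map_or(max, |delay| delay.min(max))
}
```

With a 100ms initial delay and a 10s cap, the first two retries wait 100ms and 200ms; the cap only matters from the eighth attempt onward, where doubling would reach 12.8s.
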
Infrastructure & Deployment
| Feature | Status | Details |
|---|---|---|
| Docker Compose (local) | Built | All 15 services, started with a single docker compose up -d |
| Kubernetes manifests | Built | Namespace, Deployments, Services, Secrets, Ingress |
| Helm chart | Built | infra/helm/gatez/ with configurable values.yaml |
| Terraform | Built | infra/terraform/ for cloud provisioning |
| Caddy reverse proxy | Built | Auto-TLS, subdomain routing template |
| Environment templates | Built | .env.local, .env.staging, .env.production |
| Environment detection | Built | GATEZ_ENV — refuses default passwords in production |
| Hot config reload | Built | L2 + L3 POST /admin/reload, no restart needed |
| Kong migration tools | Built | CLI translator, Python parser, plugin map, migration guide |
| gRPC + WebSocket | Built | APISIX plugins enabled, routes configured |
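
The environment detection row says that GATEZ_ENV makes the platform refuse default passwords in production. A startup guard along those lines might look like the sketch below. The value "production", the list of default passwords, and the names `validate_secrets` and `StartupError` are all assumptions for illustration; the matrix only states the behavior.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum StartupError {
    /// Carries the name of the offending setting, never its value.
    DefaultPasswordInProduction(String),
}

/// Hypothetical list of values treated as "default" passwords.
const DEFAULT_PASSWORDS: [&str; 3] = ["admin", "password", "changeme"];

/// Fails fast when GATEZ_ENV indicates production and any secret still has a default value.
/// Assumption: the production value of GATEZ_ENV is "production".
pub fn validate_secrets(gatez_env: &str, secrets: &[(&str, &str)]) -> Result<(), StartupError> {
    if !gatez_env.eq_ignore_ascii_case("production") {
        return Ok(());
    }
    for (name, value) in secrets {
        if DEFAULT_PASSWORDS.iter().any(|default| *default == *value) {
            return Err(StartupError::DefaultPasswordInProduction((*name).to_string()));
        }
    }
    Ok(())
}
```

A service would call this once at boot with the value of `std::env::var("GATEZ_ENV")` and exit on `Err`, so a misconfigured production deployment never starts serving traffic.
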
Testing
| Feature | Status | Details |
|---|---|---|
| L2 Rust unit tests | Built | 78 tests (PII, cache, semantic cache, config, providers, logging) |
| L3 Rust unit tests | Built | 106 tests (sessions, security, A2A, MCP, audit, CEL) |
| Cross-layer E2E | Built | 16 tests (L1→L2→L3 health, sessions, tools, metrics) |
| Enterprise test suite | Built | 213 scenarios (isolation, auth, boundary, concurrency) |
| Playwright UI E2E | Built | 208 specs across both portals |
| Pre-merge gate | Built | 10-section security/quality scan (secrets, SQL injection, auth, tenant isolation) |
| Smoke test | Built | scripts/smoke-test.sh — all services healthy |
| Full test runner | Built | scripts/test-all.sh — runs all suites in sequence |
| Performance benchmarks | Built | wrk/k6 based, documented methodology |
| Chaos engineering | Built | Service stop/start resilience tests |
Not Yet Built (Planned)
| Feature | Priority | Details |
|---|---|---|
| Kubernetes Gateway API | Medium | GatewayClass, Gateway, HTTPRoute CRDs |
| Vault secret resolver | Medium | HTTP API with token auth + cache |
| K8s Secret resolver | Medium | kube crate, service account auth |
| AWS Secrets Manager resolver | Medium | aws-sdk-secretsmanager crate |
| Usage metering & billing | Medium | Stripe integration, materialized views |
| SOC 2 Type II | Low | 3-6 month audit process |
| L1 response PII scrubbing | Low | APISIX body_filter plugin, opt-in per route |
| Per-tenant provider preferences | Low | Tenant-specific model routing |
| Per-tenant guard configuration | Low | Custom prompt guard rules per tenant |
| Full stdio bidirectional JSON-RPC | Low | Background stdout reader with reconnection |
| Full SSE streaming with reconnection | Low | Session pinning for stateful MCP servers |
| File watcher config reload | Low | notify crate for automatic reload |