
ADR-005: Multi-tenancy Model

Status

Accepted — 2026-03-21

Context

The platform must provide hard per-tenant isolation from day one: separate rate limit buckets, auth realms, audit logs, API key namespaces, and token budgets.

Decision

Hybrid multi-tenancy with tenant_id as the universal isolation key.

Keycloak: Hybrid realm model

  • Enterprise tenants (ICP 1, 3): One Keycloak realm per tenant for full auth isolation, custom branding, and compliance requirements
  • SMB tenants (ICP 2): Shared realm with group-based isolation for lower operational overhead
  • JWT tokens always contain tenant_id claim regardless of realm model
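
As a minimal sketch of the last point, the gateway could resolve the tenant from the token claims in the same way for both realm models. The function name, the error type, the example issuer URLs, and the group path are hypothetical. Signature and expiry validation are assumed to happen upstream and are not shown.

```python
from typing import Any, Mapping


class MissingTenantError(ValueError):
    """Raised when a token carries no usable tenant_id claim."""


def tenant_id_from_claims(claims: Mapping[str, Any]) -> str:
    """Return the tenant_id from already-verified JWT claims.

    Assumption: the token signature and expiry were checked upstream.
    This function only enforces that the claim is present and non-empty.
    """
    tenant_id = claims.get("tenant_id")
    if not isinstance(tenant_id, str) or not tenant_id.strip():
        raise MissingTenantError("token has no tenant_id claim")
    return tenant_id.strip()


# Enterprise tenant: dedicated realm, tenant_id still present in the token.
enterprise_claims = {
    "iss": "https://auth.example.com/realms/acme",  # hypothetical issuer
    "sub": "user-1",
    "tenant_id": "acme",
}

# SMB tenant: shared realm, isolated by group membership plus the claim.
smb_claims = {
    "iss": "https://auth.example.com/realms/shared",  # hypothetical issuer
    "sub": "user-2",
    "groups": ["/tenants/globex"],  # hypothetical group path
    "tenant_id": "globex",
}
```

Because downstream code reads only the tenant_id claim, it does not need to know which realm model issued the token.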

Redis: Keyspace isolation

  • All keys embed tenant_id, most as a leading prefix: {tenant_id}:cache:*, {tenant_id}:session:*, {tenant_id}:budget:*
  • Rate limit buckets are the exception and lead with rl: instead: rl:{tenant_id}:{route}:{window}. Buckets are never shared between tenants
  • No Redis database-per-tenant (operational overhead too high)
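
One way to keep these key formats consistent is to build every key through a small helper rather than by hand. The sketch below is illustrative only: the function names and the tenant ID character restriction are assumptions, not part of this decision. It produces key strings and does not talk to Redis.

```python
import re

# Assumption: tenant IDs use a restricted alphabet so that an ID containing
# ":" or "*" cannot be crafted to reach another tenant's keyspace.
_TENANT_ID_RE = re.compile(r"[A-Za-z0-9_-]{1,64}")

_NAMESPACES = {"cache", "session", "budget"}


def _checked(tenant_id: str) -> str:
    """Return tenant_id unchanged if it is well-formed, else raise."""
    if not _TENANT_ID_RE.fullmatch(tenant_id):
        raise ValueError(f"invalid tenant_id: {tenant_id!r}")
    return tenant_id


def tenant_key(tenant_id: str, namespace: str, name: str) -> str:
    """Build {tenant_id}:{namespace}:{name} for cache, session, or budget keys."""
    if namespace not in _NAMESPACES:
        raise ValueError(f"unknown namespace: {namespace!r}")
    return f"{_checked(tenant_id)}:{namespace}:{name}"


def rate_limit_key(tenant_id: str, route: str, window: int) -> str:
    """Build rl:{tenant_id}:{route}:{window} for a per-tenant bucket."""
    return f"rl:{_checked(tenant_id)}:{route}:{window}"
```

For example, tenant_key("acme", "cache", "user:42") yields acme:cache:user:42, and rate_limit_key("acme", "/v1/chat", 1700000000) yields rl:acme:/v1/chat:1700000000.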

ClickHouse: Row-level isolation

  • Shared tables with mandatory tenant_id column
  • ORDER BY (tenant_id, timestamp) sorts each tenant's rows together in time order, so per-tenant queries can skip other tenants' data through the primary index
  • Partition large tenants separately if query performance degrades
  • Row-level security is enforced at the application layer: every query must filter on tenant_id
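
The following sketch shows one way application-layer enforcement could look: a query builder that cannot produce a statement without a tenant_id filter. The events table, its route and tokens columns, and the function name are hypothetical. Only the tenant_id column and the ORDER BY clause come from this decision. The {tenant_id:String} placeholder uses ClickHouse's query parameter syntax, so the value is passed separately from the SQL text.

```python
# Hypothetical table definition, kept as a string for reference.
EVENTS_DDL = """
CREATE TABLE IF NOT EXISTS events (
    tenant_id String,
    timestamp DateTime,
    route     String,
    tokens    UInt32
)
ENGINE = MergeTree
ORDER BY (tenant_id, timestamp)
"""


def tenant_scoped_select(columns, table, tenant_id, where=""):
    """Return (sql, params) for a SELECT that always filters on tenant_id.

    Assumption: columns, table, and where come from application code,
    never from user input. Only tenant_id is passed as a bound parameter.
    """
    if not tenant_id:
        raise ValueError("tenant_id is required")
    if not columns:
        raise ValueError("at least one column is required")
    sql = (
        f"SELECT {', '.join(columns)} FROM {table} "
        "WHERE tenant_id = {tenant_id:String}"
    )
    if where:
        # Extra conditions can narrow the result but never widen it.
        sql += f" AND ({where})"
    return sql, {"tenant_id": tenant_id}
```

A caller would hand the returned SQL and parameters to whichever ClickHouse client the platform uses. Routing all reads through one builder gives code review a single place to check.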

API key namespaces

  • Keys scoped to tenant: {tenant_id}:apikey:{key_id}
  • Key rotation per tenant without affecting others
  • Key usage tracked in ClickHouse per tenant
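
To illustrate per-tenant rotation, the sketch below rotates every key under one tenant's namespace and leaves all other tenants untouched. A plain dict stands in for the real key store, and the function names are hypothetical. A production system would more likely store a hash of each secret than the secret itself.

```python
import secrets


def api_key_name(tenant_id: str, key_id: str) -> str:
    """Build the namespaced name {tenant_id}:apikey:{key_id}."""
    return f"{tenant_id}:apikey:{key_id}"


def rotate_tenant_keys(store: dict, tenant_id: str) -> dict:
    """Replace the secret of every API key owned by one tenant.

    Returns the rotated entries. Keys of other tenants are not touched
    because they do not share the {tenant_id}:apikey: prefix.
    """
    prefix = f"{tenant_id}:apikey:"
    rotated = {}
    for name in list(store):
        if name.startswith(prefix):
            store[name] = secrets.token_urlsafe(32)
            rotated[name] = store[name]
    return rotated
```

Including the trailing colon in the prefix prevents a tenant named acme from matching keys that belong to a tenant named acme2.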

Rationale

  • Row-level isolation in ClickHouse is operationally simpler than table-per-tenant
  • Keyspace prefixing in Redis is proven at scale (Slack and Discord use similar patterns)
  • Hybrid Keycloak model balances isolation needs with operational cost
  • Application-layer enforcement gives flexibility without database-level complexity

Consequences

  • Every feature must accept and propagate tenant_id — enforced via code review
  • Bugs that leak data across tenants are treated as P0 security incidents
  • Monitoring must include per-tenant metrics to detect noisy-neighbor issues
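
Code review can be backed by a mechanism that fails loudly when tenant_id is missing. The sketch below is one possible approach, not a prescribed design: it binds the tenant to the current request with contextvars and counts requests per tenant. All names are hypothetical, and the in-memory Counter stands in for a real metrics backend.

```python
from collections import Counter
from contextlib import contextmanager
from contextvars import ContextVar

_current_tenant = ContextVar("current_tenant")

# Stand-in for a per-tenant metric; a real system would export this.
requests_by_tenant = Counter()


@contextmanager
def tenant_scope(tenant_id: str):
    """Bind tenant_id for the duration of one request, then restore."""
    token = _current_tenant.set(tenant_id)
    try:
        yield
    finally:
        _current_tenant.reset(token)


def current_tenant() -> str:
    """Return the bound tenant_id, or raise if none was set."""
    try:
        return _current_tenant.get()
    except LookupError:
        raise RuntimeError("no tenant bound to this request") from None


def record_request() -> None:
    """Increment the request count for the current tenant."""
    requests_by_tenant[current_tenant()] += 1
```

Code that runs outside a tenant_scope raises instead of silently operating without a tenant, and the per-tenant counter gives monitoring a starting point for spotting a tenant that dominates traffic.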

Enterprise API + AI + Agent Gateway