# ADR-002: ClickHouse for Request Logging and Analytics
## Status
Accepted — 2026-03-21
## Context
The platform needs to ingest and query request logs at 50,000 writes/second with per-tenant analytics, retention policies, and dashboard-friendly aggregation.
## Options evaluated
- ClickHouse — Column-oriented OLAP, MergeTree engine, excellent write throughput
- Elasticsearch — Full-text search, Kibana dashboards, widely adopted
- Grafana Loki — Log aggregation, label-based queries, lower resource usage
## Decision
Use ClickHouse for all structured request logging and analytics.
## Rationale
- Write throughput: ClickHouse handles 50k+ inserts/sec on modest hardware via Buffer engine. Elasticsearch typically needs 5-10x more resources for comparable write throughput.
- Storage efficiency: Columnar compression achieves 10-20x compression ratios on structured log data vs Elasticsearch's inverted index overhead.
- SQL interface: Native SQL queries are more accessible than Elasticsearch DSL for analytics.
- Materialized views: Pre-aggregated dashboards without query-time overhead — critical for real-time tenant metrics.
- TTL support: Native per-table TTL with automatic data expiration.
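The rationale above can be sketched as a table definition plus a pre-aggregated view. This is an illustrative sketch, not the production schema: the table name, columns, and the `tenant_requests_per_minute` view are hypothetical, and the 90-day TTL matches the retention policy below.

```sql
-- Hypothetical raw log table; names and columns are illustrative.
CREATE TABLE request_log_raw
(
    timestamp   DateTime,
    tenant_id   String,
    route       String,
    status_code UInt16,
    duration_ms UInt32
)
ENGINE = MergeTree
PARTITION BY toYYYYMM(timestamp)
ORDER BY (tenant_id, timestamp)
TTL timestamp + INTERVAL 90 DAY;

-- Pre-aggregated per-tenant dashboard metrics: rows are rolled up
-- at insert time, so dashboard queries avoid scanning raw logs.
CREATE MATERIALIZED VIEW tenant_requests_per_minute
ENGINE = SummingMergeTree
ORDER BY (tenant_id, minute)
AS SELECT
    tenant_id,
    toStartOfMinute(timestamp) AS minute,
    count() AS requests,
    sum(duration_ms) AS total_duration_ms
FROM request_log_raw
GROUP BY tenant_id, minute;
```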
## Ingestion strategy
- Buffer engine: `Buffer(gateway, request_log_raw, 16, ...)` absorbs bursts and flushes in batches
- Partition by month: `PARTITION BY toYYYYMM(timestamp)` for efficient TTL and query pruning
- Order by tenant: `ORDER BY (tenant_id, timestamp)` for fast per-tenant queries
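A sketch of the Buffer front table wiring these settings together. The `gateway` database and threshold values beyond the 16 layers are illustrative assumptions and would need tuning under real load; the Buffer engine signature is `Buffer(database, table, num_layers, min_time, max_time, min_rows, max_rows, min_bytes, max_bytes)`.

```sql
-- Hypothetical Buffer table: writers insert here; ClickHouse flushes
-- to gateway.request_log_raw when any max threshold is reached, or
-- when all min thresholds are reached.
CREATE TABLE gateway.request_log
AS gateway.request_log_raw
ENGINE = Buffer(gateway, request_log_raw,
    16,                   -- num_layers (per the strategy above)
    10, 100,              -- min/max seconds before flush (illustrative)
    10000, 1000000,       -- min/max rows (illustrative)
    10000000, 100000000   -- min/max bytes (illustrative)
);
```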
## Retention
- Request logs: 90 days
- AI usage logs: 365 days
- Audit logs: no TTL (regulatory compliance)
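These retention windows map directly onto per-table TTL clauses. A minimal sketch, assuming the table names `request_log_raw`, `ai_usage_log`, and `audit_log` (all hypothetical):

```sql
-- Hypothetical: apply the retention policy via native table TTLs.
ALTER TABLE request_log_raw MODIFY TTL timestamp + INTERVAL 90 DAY;
ALTER TABLE ai_usage_log    MODIFY TTL timestamp + INTERVAL 365 DAY;
-- audit_log deliberately gets no TTL: rows are retained indefinitely
-- for regulatory compliance.
```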
## Consequences
- Buffer engine thresholds need tuning under load to prevent memory pressure
- Grafana requires ClickHouse datasource plugin
- No full-text search capability (not needed for structured logs)