Scaling request logging from millions to billions of events forces a hard choice: keep everything in ClickHouse with asynchronous inserts, bolt on a streaming stack built from Kafka and Vector, or lean on a buffering layer such as Redis. The discussion weighs throughput, latency, and operational risk for each path [1].
Buffering bets
Two-layer buffering, a buffer table feeding into a larger aggregation window, comes up as a practical pattern in the discussion [1]. Some readers favor Redis as the buffering layer, paired with periodic flush jobs, arguing it is simpler than wiring up a full Kafka+Vector pipeline [1]. Others suggest starting with a simple buffer table, or exploring forks such as kittenhouse for experimentation [1].
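As a rough illustration of the Redis-plus-periodic-job idea, here is a minimal Python sketch: the hot path pushes JSON events onto a Redis list, and a separate flusher drains them in batches into ClickHouse over its HTTP interface. The request_logs table, the list key, the batch size, and the flush interval are all assumptions for illustration, not details from the discussion.

```python
import json
import time

import redis
import requests

r = redis.Redis(host="localhost", port=6379)
BUFFER_KEY = "log_buffer"               # hypothetical Redis list acting as the buffer
CLICKHOUSE_URL = "http://localhost:8123/"
BATCH_SIZE = 50_000

def log_request(event: dict) -> None:
    """Hot path: one O(1) Redis push, no ClickHouse round trip."""
    r.rpush(BUFFER_KEY, json.dumps(event))

def flush_once() -> int:
    """Periodic job: atomically take up to BATCH_SIZE events and insert them as one batch."""
    pipe = r.pipeline()                  # redis-py pipelines are MULTI/EXEC by default
    pipe.lrange(BUFFER_KEY, 0, BATCH_SIZE - 1)
    pipe.ltrim(BUFFER_KEY, BATCH_SIZE, -1)
    rows, _ = pipe.execute()
    if not rows:
        return 0
    resp = requests.post(
        CLICKHOUSE_URL,
        params={"query": "INSERT INTO request_logs FORMAT JSONEachRow"},
        data=b"\n".join(rows),           # events are already JSON lines
    )
    resp.raise_for_status()
    return len(rows)

if __name__ == "__main__":
    while True:
        if flush_once() < BATCH_SIZE:    # buffer drained; wait for more traffic
            time.sleep(5)
```

The appeal of this pattern is that a Redis list is trivial to operate and the flusher turns many tiny writes into the large batched inserts ClickHouse prefers; the cost is that events sitting in Redis between flushes can be lost if the buffer dies.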
Materialized views
ClickHouse materialized views are praised for updating automatically on every insert, which keeps aggregates fast to query [1]. Two kinds exist: standard materialized views, which transform rows as they arrive, and refreshable materialized views, which run on a schedule and can declare dependencies on one another [1].
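To make the standard kind concrete, here is a minimal sketch that rolls request logs up into per-minute status counts; because the view fires on every insert, the rollup stays current with no scheduled job. The request_logs source table and its ts/status columns are assumed for illustration.

```python
import requests

CLICKHOUSE_URL = "http://localhost:8123/"

def run(query: str) -> None:
    """Send one DDL statement to ClickHouse over its HTTP interface."""
    resp = requests.post(CLICKHOUSE_URL, data=query)
    resp.raise_for_status()

# Target table: SummingMergeTree merges rows sharing a key by summing `hits`.
run("""
CREATE TABLE IF NOT EXISTS request_counts (
    minute DateTime,
    status UInt16,
    hits   UInt64
) ENGINE = SummingMergeTree
ORDER BY (minute, status)
""")

# Standard materialized view: runs on every insert into request_logs.
run("""
CREATE MATERIALIZED VIEW IF NOT EXISTS request_counts_mv
TO request_counts AS
SELECT
    toStartOfMinute(ts) AS minute,
    status,
    count() AS hits
FROM request_logs
GROUP BY minute, status
""")

# A refreshable view would instead run on a schedule, e.g.
#   CREATE MATERIALIZED VIEW mv REFRESH EVERY 1 HOUR TO target AS SELECT ...
# and refreshable views can declare DEPENDS ON relationships to each other.
```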
Asynchronous inserts vs streaming
For many workloads, ClickHouse's asynchronous inserts are a simpler, robust alternative to Kafka+Vector: the server itself buffers small inserts and flushes them as batched parts, so no separate pipeline is needed [1]. What a streaming stack buys back is tighter control over reliability and latency, at the cost of real operational heft [1].
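Asynchronous inserts need no client-side machinery beyond two settings, which ClickHouse accepts as URL parameters on its HTTP interface. A minimal sketch, again assuming a request_logs table with these columns:

```python
import json

import requests

CLICKHOUSE_URL = "http://localhost:8123/"

def log_request(event: dict) -> None:
    """Send one small insert; the server batches it with others before writing a part."""
    resp = requests.post(
        CLICKHOUSE_URL,
        params={
            "query": "INSERT INTO request_logs FORMAT JSONEachRow",
            # Server-side batching of many small inserts into one part.
            "async_insert": "1",
            # Return immediately instead of waiting for the buffer to flush:
            # lowest latency, but rows still in the buffer are lost on a crash.
            "wait_for_async_insert": "0",
        },
        data=json.dumps(event),
    )
    resp.raise_for_status()

log_request({"ts": "2024-01-01 00:00:00", "status": 200, "path": "/health"})
```

Setting wait_for_async_insert to 1 instead trades latency for an acknowledgment that the batch actually landed, which is the knob at the heart of the reliability tradeoff above.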
Bottom line: a pragmatic mix of materialized views, buffering where it helps, and a measured dose of streaming often wins for high-throughput observability pipelines [1].
References
[1] Scaling request logging with ClickHouse, Kafka, and Vector. Discusses scaling request logging using ClickHouse, Kafka, and Vector; compares buffering strategies, materialized views, and alternative stacks for high-scale analytics.