CDC decisions and UUID bets: how ClickHouse engines and ID strategies shape real-time analytics


Real-time analytics keeps landing on one question: which engine and ID scheme makes CDC sing? A Hacker News thread on ClickHouse table engines breaks down how MergeTree, ReplacingMergeTree, and CollapsingMergeTree handle CDC updates and deletes, both on initial write and later during merges. The punchy takeaway: ReplacingMergeTree is the Goldilocks choice for many CDC workloads [1].

ClickHouse engines and CDC

Each engine treats updates and deletes differently, both when rows are first written and when parts are later merged. The post walks through worked examples of CDC behavior across the engines and explains why the author leans toward ReplacingMergeTree as the go-to choice: it strikes a balance that fits most CDC use cases [1].
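To make the "replace on merge" idea concrete, here is a toy model in plain Python. It is not ClickHouse and not the post's own example: `CdcRow`, `merge_replacing`, and `visible` are hypothetical names, and the sketch assumes every CDC event carries a version number and a delete flag. Each event is appended as a new row, and a merge keeps only the highest-version row per key.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CdcRow:
    key: int          # primary key of the source row (assumed)
    version: int      # monotonically increasing per key (assumed)
    is_deleted: bool  # True when the CDC event is a delete
    payload: str


def merge_replacing(rows):
    """Toy merge: keep only the highest-version row for each key."""
    latest = {}
    for row in rows:
        current = latest.get(row.key)
        # On a version tie, the row that arrived later wins.
        if current is None or row.version >= current.version:
            latest[row.key] = row
    return sorted(latest.values(), key=lambda r: r.key)


def visible(rows):
    """Rows a reader should see: merged state minus deleted keys."""
    return [r for r in merge_replacing(rows) if not r.is_deleted]


# Inserts, one update, and one delete, all written as plain appends.
events = [
    CdcRow(1, 1, False, "alice"),
    CdcRow(2, 1, False, "bob"),
    CdcRow(1, 2, False, "alice-updated"),  # update to key 1
    CdcRow(2, 2, True, "bob"),             # delete of key 2
]

if __name__ == "__main__":
    for row in visible(events):
        print(row.key, row.payload)
```

In this model, updates and deletes never modify existing rows. They only append, and stale versions stay around until a merge runs, so readers have to deduplicate at query time or wait for the merge.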

UUID strategy in PostgreSQL

PostgreSQL 18 adds UUIDv7 support. The discussion compares UUIDv7 with UUIDv4, and some commenters ask why not simply stick with serial/bigserial everywhere. Two big advantages drive client-generated IDs: you can insert many rows without round-trips, and IDs stay consistent across distributed or offline-first systems. The usual flow (INSERT ... RETURNING id) hands back the DB-generated key, but it can bottleneck batching and gets awkward with circular dependencies. In practice, client-generated IDs help in offline-first scenarios; collisions are theoretically possible but extremely unlikely [2].
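The batching argument can be sketched in Python. This is an illustration under stated assumptions, not code from the thread: the orders/items shape and the names `build_order_batch` and `uuid7_sketch` are hypothetical. The UUIDv7 helper is hand-rolled from the RFC 9562 bit layout (48-bit millisecond timestamp, version 7, RFC variant, random remainder) and skips the monotonic-counter refinements a production generator would want.

```python
import os
import time
import uuid


def uuid7_sketch(now_ms=None):
    """Hand-rolled UUIDv7: 48-bit ms timestamp, version, variant, random bits."""
    ts = time.time_ns() // 1_000_000 if now_ms is None else now_ms
    rand = int.from_bytes(os.urandom(10), "big")  # 80 random bits
    rand_a = rand >> 68                           # top 12 bits
    rand_b = rand & ((1 << 62) - 1)               # low 62 bits
    value = (
        (ts & ((1 << 48) - 1)) << 80  # timestamp in the top 48 bits
        | 0x7 << 76                   # version 7
        | rand_a << 64
        | 0b10 << 62                  # RFC variant
        | rand_b
    )
    return uuid.UUID(int=value)


def build_order_batch(orders, new_id=uuid.uuid4):
    """Build parent and child rows with IDs assigned on the client.

    Child rows can reference their parent immediately, so both lists
    can go out in one batch with no INSERT ... RETURNING round-trip.
    """
    parent_rows, child_rows = [], []
    for customer, skus in orders:
        order_id = new_id()
        parent_rows.append((str(order_id), customer))
        for sku in skus:
            child_rows.append((str(new_id()), str(order_id), sku))
    return parent_rows, child_rows


if __name__ == "__main__":
    parents, children = build_order_batch(
        [("acme", ["sku-1", "sku-2"]), ("globex", ["sku-3"])],
        new_id=uuid7_sketch,
    )
    print(len(parents), "orders,", len(children), "items")
```

Swapping `new_id` between `uuid.uuid4` and `uuid7_sketch` changes only how the IDs sort: v7 values lead with a timestamp, while v4 values are fully random.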

Closing thought: when CDC workloads meet distributed and offline-first systems, you’ll likely pick ReplacingMergeTree in ClickHouse and a client-generated UUID (v7 or v4) in PostgreSQL, depending on your offline needs and batching reality [1][2].

References

[1] Hacker News: ClickHouse table engines & CDC data (MergeTree, Replacing, Collapsing +)

Explains how ClickHouse table engines influence CDC updates and merges; concludes ReplacingMergeTree best fits many CDC use-cases.

[2] Hacker News: Exploring PostgreSQL 18's new UUIDv7 support

Discusses UUIDv7 vs UUIDv4 vs serial PKs for Postgres; client-generated IDs; distribution, offline use, and security tradeoffs.