DuckLake and the SQL Lakehouse Wave: From DuckLake to Vectorized Streaming

DuckLake is a SQL-powered lakehouse format, and its arrival signals a move toward SQL-native data lake architectures. In a video presentation, Prof. H. Mühleisen frames SQL as the default access layer for lake data ^[1].

Timeplus Proton 3.0, billed in its Show HN announcement as the first vectorized streaming SQL engine, is the other highlight. The open-source release promises enterprise-grade streaming in a single binary with zero dependencies, which the announcement pitches as easier deployment ^[2]. Its headline features:

• Engine: written in modern C++ with JIT compilation, aimed at high-throughput, low-latency processing.
• End-to-end streaming: ETL, joins, aggregation, alerts, and tasks.
• Native connectors: Kafka, Redpanda, Pulsar, ClickHouse, Splunk, Elastic, MongoDB, S3, and Iceberg.
• Native Python UDF/UDAF support.
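To make "vectorized streaming" concrete, here is a minimal Python sketch of the general idea: events are consumed in batches and handled column by column rather than one row at a time. It is not Proton's implementation or API. The event shape, the function names, and the tumbling-window sum are all assumptions chosen for illustration.

```python
from collections import defaultdict
from typing import Iterable, Iterator

# Hypothetical event shape: (timestamp_seconds, key, value).
Event = tuple[int, str, float]


def batches(events: Iterable[Event], size: int) -> Iterator[list[Event]]:
    """Group a stream into fixed-size batches, the unit a vectorized engine works on."""
    batch: list[Event] = []
    for event in events:
        batch.append(event)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        # Flush the final, possibly shorter, batch.
        yield batch


def tumbling_sum(
    events: Iterable[Event], window_s: int, batch_size: int = 4
) -> dict[tuple[int, str], float]:
    """Sum values per (window_start, key), consuming the stream one batch at a time."""
    totals: dict[tuple[int, str], float] = defaultdict(float)
    for batch in batches(events, batch_size):
        # Split the batch into columns so each step runs over a whole column.
        timestamps, keys, values = zip(*batch)
        # Assign every timestamp in the batch to the start of its window.
        window_starts = [t - t % window_s for t in timestamps]
        for window, key, value in zip(window_starts, keys, values):
            totals[(window, key)] += value
    return dict(totals)


stream = [(0, "a", 1.0), (3, "b", 2.0), (7, "a", 4.0), (11, "a", 8.0), (12, "b", 16.0)]
result = tumbling_sum(stream, window_s=10)
```

In a real engine the per-column steps would run as compiled, possibly SIMD, loops over contiguous memory; plain Python lists only mirror the control flow.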

Meanwhile, Dan Cohen's newsletter piece 'The Index and the Vector' weighs indexing and vector search as complementary approaches to fast retrieval ^[3].
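One way the two approaches can complement each other is to let an index narrow the candidate set and let vector similarity rank what remains. The sketch below is a toy illustration of that pattern, not something taken from the newsletter. The documents, the bag-of-words "embedding", and the `hybrid_search` helper are all hypothetical.

```python
import math
from collections import defaultdict

# Hypothetical mini-corpus, keyed by document id.
docs = {
    1: "duckdb lakehouse format with sql catalog",
    2: "streaming sql engine with kafka connectors",
    3: "vector search for retrieval augmented generation",
}


def build_inverted_index(corpus):
    """Map each token to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in corpus.items():
        for token in text.split():
            index[token].add(doc_id)
    return index


def embed(text, vocab):
    """Toy bag-of-words vector; a real system would use a learned embedding."""
    tokens = text.split()
    return [float(tokens.count(term)) for term in vocab]


def cosine(a, b):
    """Cosine similarity, returning 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


vocab = sorted({token for text in docs.values() for token in text.split()})
index = build_inverted_index(docs)
vectors = {doc_id: embed(text, vocab) for doc_id, text in docs.items()}


def hybrid_search(query, required_term):
    """Filter by exact term via the index, then rank survivors by similarity."""
    candidates = index.get(required_term, set())
    query_vector = embed(query, vocab)
    return sorted(candidates, key=lambda d: cosine(query_vector, vectors[d]), reverse=True)


top = hybrid_search("sql engine for streaming", required_term="sql")
```

Here the index guarantees that every result contains the required term, while the vector step orders the candidates by overall similarity to the query.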

Separately, 'Query Decomposition for RAG', an arXiv paper on breaking down RAG queries, ties retrieval techniques back to database querying and practical analytics pipelines ^[4].
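As a rough illustration of what query decomposition means in a RAG pipeline, the sketch below splits a compound question into sub-questions, retrieves a passage for each, and merges the results into one context. It does not reproduce the paper's method. The naive "and" splitter stands in for what would normally be an LLM call, and the word-overlap retriever and all function names are assumptions.

```python
def decompose(question: str) -> list[str]:
    """Naive splitter on ' and '; a hypothetical stand-in for an LLM-based decomposer."""
    parts = [part.strip(" ?") for part in question.replace(" and ", "|").split("|")]
    return [part + "?" for part in parts if part]


def retrieve(sub_question: str, corpus: list[str], k: int = 1) -> list[str]:
    """Rank passages by word overlap with the sub-question and keep the top k."""
    query_words = set(sub_question.lower().strip("?").split())
    ranked = sorted(
        corpus,
        key=lambda passage: len(query_words & set(passage.lower().split())),
        reverse=True,
    )
    return ranked[:k]


def gather_context(question: str, corpus: list[str]) -> list[str]:
    """Retrieve per sub-question and merge results, dropping duplicates."""
    context: list[str] = []
    for sub_question in decompose(question):
        for passage in retrieve(sub_question, corpus):
            if passage not in context:
                context.append(passage)
    return context


# Passages paraphrase descriptions from this digest.
corpus = [
    "DuckLake is a SQL-powered lakehouse format",
    "Timeplus Proton 3.0 is a vectorized streaming SQL engine",
    "Query decomposition breaks a RAG query into sub-questions",
]
context = gather_context("What is DuckLake and what is Timeplus Proton?", corpus)
```

Retrieving per sub-question lets each part of a compound question pull its own best passage, instead of one query vector having to cover both topics at once.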

Taken together, these threads sketch a 2025–2026 analytics landscape where SQL lakehouses, vectorized streaming, and retrieval-augmented workflows collide in real-world pipelines.

References

[1] HackerNews. DuckLake – SQL-Powered Lakehouse Format for the Rest of Us by Prof. H. Mühleisen [video]. Video presentation on DuckLake, a SQL-powered lakehouse format for data lake enthusiasts.

[2] HackerNews. Show HN: Timeplus Proton 3.0 – First vectorized streaming SQL engine. Show HN announcing Timeplus Proton 3.0, a vectorized streaming SQL engine with connectors, UDFs, and end-to-end streaming; the post seeks feedback.

[3] HackerNews. The Index and the Vector. Discusses how indexing relates to vector databases and modern search techniques in databases.

[4] HackerNews. Query Decomposition for RAG. arXiv paper on decomposing queries for RAG, with relevance to database querying and retrieval techniques.
