
Cross-DB distributed analytics: 63-node DuckDB shard with DataFusion Ballista, and PostgreSQL versioning with Pgfeaturediff


A 63-node sharded DuckDB run with DataFusion Ballista completes a one-trillion-row aggregation in about 5 seconds [1]. The setup fits a broader push toward open-source distributed analytics, where sharded query planning is an active area of work [1]. The build uses 63 Azure Standard E64pds v6 nodes, each with 64 vCPUs and 504 GiB of RAM, for a total of 4,032 vCPUs and roughly 31 TiB of memory. The cost works out to about $235.87/hr for the whole cluster, which the discussion compares favorably with a Snowflake 4XL warehouse at $384/hr. The thread also brings up BigQuery as a comparison point [1].

• Spot instances could bring the hourly price down to roughly $45.99 for the cluster, a reminder that cloud pricing and spot availability shape what is feasible at this scale [1].
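
The figures above can be checked with a few lines of arithmetic. The sketch below uses only the numbers quoted in the thread; the variable names are illustrative, and the per-query cost assumes per-second billing with no minimum charge, which the source does not confirm.

```python
# Back-of-the-envelope check of the cluster figures quoted in [1].
NODES = 63
VCPUS_PER_NODE = 64
MEM_GIB_PER_NODE = 504

CLUSTER_USD_PER_HR = 235.872   # quoted hourly price for all 63 nodes
SPOT_USD_PER_HR = 45.99        # quoted spot price for all 63 nodes
SNOWFLAKE_4XL_USD_PER_HR = 384.0

total_vcpus = NODES * VCPUS_PER_NODE
total_mem_gib = NODES * MEM_GIB_PER_NODE
total_mem_tib = total_mem_gib / 1024

per_node_usd = CLUSTER_USD_PER_HR / NODES
per_node_spot_usd = SPOT_USD_PER_HR / NODES

# Relative savings, expressed as fractions of the more expensive option.
spot_discount = 1 - SPOT_USD_PER_HR / CLUSTER_USD_PER_HR
saving_vs_snowflake = 1 - CLUSTER_USD_PER_HR / SNOWFLAKE_4XL_USD_PER_HR

# Assumption: per-second billing, no minimum charge, cluster already warm.
QUERY_SECONDS = 5
query_usd = CLUSTER_USD_PER_HR / 3600 * QUERY_SECONDS

print(f"vCPUs: {total_vcpus}, memory: {total_mem_gib} GiB ({total_mem_tib:.1f} TiB)")
print(f"per node: ${per_node_usd:.3f}/hr, spot: ${per_node_spot_usd:.2f}/hr")
print(f"spot discount: {spot_discount:.1%}, vs Snowflake 4XL: {saving_vs_snowflake:.1%}")
print(f"cost of a {QUERY_SECONDS}s query: ${query_usd:.4f}")
```

The per-query number is a lower bound at best: it ignores cluster start-up, data loading, and idle time, all of which usually dominate the bill for a short query.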

The dataset and the one-trillion-row challenge are public in the 1trc repo on GitHub, so teams can explore trillion-row sharding in practice [1].
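
For readers new to sharded aggregation, the general idea can be sketched as a two-phase computation: each shard reduces its own rows to a small partial result, and a coordinator merges the partials. The example below is a generic illustration in plain Python, not the method used in the run; the source does not describe the query or how work was divided across nodes. The min/mean/max-per-key query and the names `Partial`, `aggregate_shard`, `merge_partials`, and `finalize` are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Partial:
    """Mergeable per-key state: enough to recover min, mean, and max."""
    lo: float
    hi: float
    total: float
    count: int


def aggregate_shard(rows):
    """Phase 1: runs on each shard over its local (key, value) rows."""
    out = {}
    for key, value in rows:
        p = out.get(key)
        if p is None:
            out[key] = Partial(value, value, value, 1)
        else:
            p.lo = min(p.lo, value)
            p.hi = max(p.hi, value)
            p.total += value
            p.count += 1
    return out


def merge_partials(partials):
    """Phase 2: the coordinator combines the per-shard dictionaries."""
    merged = {}
    for shard_result in partials:
        for key, p in shard_result.items():
            m = merged.get(key)
            if m is None:
                merged[key] = Partial(p.lo, p.hi, p.total, p.count)
            else:
                m.lo = min(m.lo, p.lo)
                m.hi = max(m.hi, p.hi)
                m.total += p.total
                m.count += p.count
    return merged


def finalize(merged):
    """Turn merged state into (min, mean, max) per key."""
    return {k: (p.lo, p.total / p.count, p.hi) for k, p in merged.items()}


# Two toy "shards" standing in for nodes.
shards = [
    [("a", 1.0), ("b", 4.0)],
    [("a", 3.0)],
]
result = finalize(merge_partials(aggregate_shard(s) for s in shards))
print(result)
```

The mean is carried as a sum and a count rather than as an average, because averages of averages are wrong when shards hold different numbers of rows per key.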

Meanwhile, a parallel thread spotlights PostgreSQL feature evolution with Pgfeaturediff, a tool that compares features across versions. This version-to-version visibility helps teams gauge compatibility and plan upgrades as they consider moving between releases [2].

Taken together, the two threads show database tooling shaping real decisions about distribution, cost, and compatibility, across both open-source and cloud options, as workloads reach trillions of rows.

References

[1] HackerNews. "A sharded DuckDB on 63 nodes runs 1T row aggregation challenge in 5 sec." Discusses cross-DB sharding, DuckDB, DataFusion Ballista, cost/performance tradeoffs, and comparisons with Snowflake/BigQuery, including one-trillion-row challenge datasets and distributed query planning.
[2] HackerNews. "Pgfeaturediff: Compare PostgreSQL features between versions." A tool to compare PostgreSQL features across versions; highlights changes, capabilities, and feature differences.
