DataFusion

How Query Engines Work 2. Why modern query engines think in columns

·13 mins
Why do modern query engines pass around columns instead of rows? Because the hardware loves it. This post explains why columnar layout is so fast, how Apache Arrow represents it in memory, and how to build and manipulate Arrow arrays in Rust without treating the whole thing like black magic.

Sail. Sailing Through Giants and Sparks

·7 mins
In this article, I share my critical view of the current state of data engineering, dominated by heavyweight platforms like Spark and Databricks, and introduce Sail, an open-source engine built on top of Apache Arrow and DataFusion, written in Rust, that offers a new path: lightweight, efficient, and powerful.