Butch78 / 1BillionRowChallenge

I saw this [Blog Post](https://www.morling.dev/blog/one-billion-row-challenge/) on a Billion Row challenge for Java so naturally I tried implementing a solution in Python & Rust using mainly polars

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

I saw this Blog Post on a Billion Row challenge for Java so naturally I tried implementing a solution in Python & Rust using mainly polars.

To run my code you will need to have Rust & Python installed, or you can run it in a Dev Container.

Thanks to Coriolinus for the full implementation in Rust and the updated generation code.

Thanks to TheBracket for the implemtation in Rust using the STD Library and Rayon.

Thanks to Ragnar for the implementation in Rust using the STD Library and SIMD.

Thanks to Koen Vossen for the Python implementation for both the Polars & Python STD Library solutions.

Generating the data file

There is a feature-gated binary which can create the appropriate measurements list, as follows:

time cargo run --release --features generator --bin generate 1000000000

Run the challenge

cargo build --release && time target/release/1brc >/dev/null

Results

Running the code on my laptop, which is equipped with an i7-1185G7 @ 3.00GHz and 16GB of RAM.

Implementation Time
Python - Crashed
Python + Pandas - Crashed
Python + Polars 33.86 sec
Rust + Polars 39 sec
Rust STD Libray 16 sec
Rust STD Libray + Rayon 12 sec
Rust STD Libray + SIMD 8 sec

About

I saw this [Blog Post](https://www.morling.dev/blog/one-billion-row-challenge/) on a Billion Row challenge for Java so naturally I tried implementing a solution in Python & Rust using mainly polars


Languages

Language:Rust 89.9%Language:Python 10.1%