aheart / datafusion

A modern distributed compute platform implemented in Rust

Home Page: https://datafusion.rs/

DataFusion: Modern Distributed Compute Platform implemented in Rust

DataFusion is a modern distributed compute platform implemented in Rust. It is very much inspired by Apache Spark and has a similar programming style through the use of DataFrames and SQL.

DataFusion can also be used as a crate dependency in your project if you want the ability to perform SQL queries and DataFrame style data manipulation in-process against your own data sources. In that respect, DataFusion is inspired by Apache Calcite in the Java world.
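To make the in-process idea concrete, here is a minimal sketch of DataFrame-style filtering and projection over an application-owned data source. It uses only the Rust standard library and does not call the DataFusion crate: `City`, `DataSource`, `InMemoryCities`, `Frame`, `sample_cities` and `northern_city_names` are all hypothetical names invented for this illustration, and the real crate's API may look quite different.

```rust
// Illustration only: none of these types come from the DataFusion crate.

/// One row of a hypothetical "cities" data source.
#[derive(Debug, Clone, PartialEq)]
struct City {
    name: String,
    lat: f64,
}

/// Anything that can hand rows to an in-process query layer (assumed trait).
trait DataSource {
    fn scan(&self) -> Vec<City>;
}

/// A data source that lives in the application's own memory.
struct InMemoryCities(Vec<City>);

impl DataSource for InMemoryCities {
    fn scan(&self) -> Vec<City> {
        self.0.clone()
    }
}

/// Minimal DataFrame-like wrapper; each transformation returns a new frame.
struct Frame {
    rows: Vec<City>,
}

impl Frame {
    fn from_source(source: &dyn DataSource) -> Self {
        Frame { rows: source.scan() }
    }

    /// Plays the role of a SQL `WHERE` clause.
    fn filter(self, predicate: impl Fn(&City) -> bool) -> Self {
        Frame {
            rows: self.rows.into_iter().filter(|r| predicate(r)).collect(),
        }
    }

    /// Plays the role of `SELECT name`.
    fn select_names(&self) -> Vec<String> {
        self.rows.iter().map(|r| r.name.clone()).collect()
    }
}

/// Small sample data set used by the example.
fn sample_cities() -> Vec<City> {
    vec![
        City { name: "London".to_string(), lat: 51.5074 },
        City { name: "Paris".to_string(), lat: 48.8566 },
        City { name: "Edinburgh".to_string(), lat: 55.9533 },
    ]
}

/// Roughly: SELECT name FROM cities WHERE lat > 51.0
fn northern_city_names(source: &dyn DataSource) -> Vec<String> {
    Frame::from_source(source)
        .filter(|c| c.lat > 51.0)
        .select_names()
}

fn main() {
    let source = InMemoryCities(sample_cities());
    println!("{:?}", northern_city_names(&source));
}
```

The point of the sketch is the shape of the interaction: the application keeps ownership of its data, exposes it through a small scan interface, and composes filter and projection steps in the same process rather than shipping the data to an external engine.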

Project Home Page

The project home page is now at https://datafusion.rs and contains the roadmap as well as documentation for using this crate or running DataFusion as a distributed cluster. I am using GitHub issues to track development tasks and feedback.

Prerequisites

  • Rust nightly
  • Thrift (required by parquet-rs crate) - instructions here

Building DataFusion

See BUILDING.md.

Gitter

There is a Gitter channel where you can ask questions about the project or suggest features.

Contributing

Contributors are welcome! Please see CONTRIBUTING.md for details.

About

License: Apache License 2.0


Languages

  • Rust 97.6%
  • Shell 1.9%
  • HTML 0.4%
  • CSS 0.2%