royadityak / hudi

Spark Library for Hadoop Upserts And Incrementals

Home Page: https://uber.github.io/hudi

Hudi

Hudi (pronounced Hoodie) stands for Hadoop Upserts anD Incrementals. Hudi manages storage of large analytical datasets on HDFS and serves them out via two types of tables:

  • Read Optimized Table - Provides excellent query performance via purely columnar storage (e.g. Parquet)
  • Near-Real-Time Table (WIP) - Provides queries on real-time data, using a combination of columnar and row-based storage (e.g. Parquet + Avro)
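
To make the difference between the two table types concrete, here is a minimal conceptual sketch in plain Java. It does not use the Hudi API: the class `TableViewsSketch`, the `Upsert` type, and both method names are hypothetical, and in-memory maps and lists stand in for the columnar (Parquet) and row-based (Avro) files. It only assumes that the read-optimized view serves the columnar data alone, while the near-real-time view overlays newer row-based records on top of it.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Conceptual sketch only; none of these names come from the Hudi codebase. */
public class TableViewsSketch {

    /** One pending upsert: a record key and its new value (hypothetical type). */
    static final class Upsert {
        final String key;
        final String value;

        Upsert(String key, String value) {
            this.key = key;
            this.value = value;
        }
    }

    /** Read-optimized view: serve only the columnar base data. */
    static Map<String, String> readOptimized(Map<String, String> columnarBase) {
        return new LinkedHashMap<>(columnarBase);
    }

    /** Near-real-time view: overlay row-based upserts on the base, in log order. */
    static Map<String, String> nearRealTime(Map<String, String> columnarBase,
                                            List<Upsert> rowLog) {
        Map<String, String> merged = new LinkedHashMap<>(columnarBase);
        for (Upsert u : rowLog) {
            merged.put(u.key, u.value); // a later entry for the same key wins
        }
        return merged;
    }

    public static void main(String[] args) {
        // Stand-in for data already written in columnar form.
        Map<String, String> base = new LinkedHashMap<>();
        base.put("trip-1", "fare=10");
        base.put("trip-2", "fare=20");

        // Stand-in for recent changes kept in row-based form.
        List<Upsert> log = new ArrayList<>();
        log.add(new Upsert("trip-2", "fare=25")); // update to an existing key
        log.add(new Upsert("trip-3", "fare=30")); // insert of a new key

        System.out.println("read-optimized: " + readOptimized(base));
        System.out.println("near-real-time: " + nearRealTime(base, log));
    }
}
```

In this sketch the read-optimized view stays cheap to query but does not see the two recent changes, whereas the near-real-time view pays a merge cost at read time to include them.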

For more, head over to the project home page: https://uber.github.io/hudi

About

License: Apache License 2.0


Languages

Java 97.3%, Scala 2.7%, Shell 0.0%