esmioley / hoodie

Spark Library for Hadoop Upserts And Incrementals

Home Page: https://uber.github.io/hoodie

Hoodie

Hoodie manages the storage of large analytical datasets on HDFS and serves them out via two types of tables:

  • Read Optimized Table - Provides excellent query performance via purely columnar storage (e.g. Parquet)
  • Near-Real-Time Table (WIP) - Provides queries on real-time data, using a combination of columnar and row-based storage (e.g. Parquet + Avro)
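
The difference between the two table types can be illustrated with a toy model. This is only a conceptual sketch, not Hoodie's API: plain in-memory maps stand in for the columnar (Parquet) data and for the newer row-based (Avro) records, and every name here (`TableViewsSketch`, `readOptimizedView`, `nearRealTimeView`) is hypothetical. It also assumes that newer row-based records replace older columnar ones by key, which the text above does not spell out.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model only: maps stand in for columnar (Parquet) and row-based (Avro) storage.
// None of these names come from the Hoodie library.
final class TableViewsSketch {

    // Read Optimized view: serve only what is already in columnar storage.
    static Map<String, String> readOptimizedView(Map<String, String> columnarBase) {
        return new LinkedHashMap<>(columnarBase);
    }

    // Near-Real-Time view: start from the columnar data, then apply the
    // newer row-based records on top, keyed by record key (assumed semantics).
    static Map<String, String> nearRealTimeView(Map<String, String> columnarBase,
                                                List<Map.Entry<String, String>> rowLog) {
        Map<String, String> merged = new LinkedHashMap<>(columnarBase);
        for (Map.Entry<String, String> record : rowLog) {
            merged.put(record.getKey(), record.getValue()); // insert or overwrite by key
        }
        return merged;
    }

    public static void main(String[] args) {
        Map<String, String> base = new LinkedHashMap<>();
        base.put("trip-1", "fare=10");
        base.put("trip-2", "fare=20");

        // Records that arrived after the columnar data was last written.
        List<Map.Entry<String, String>> log = List.of(
                Map.entry("trip-2", "fare=25"),
                Map.entry("trip-3", "fare=30"));

        System.out.println("read optimized: " + readOptimizedView(base));
        System.out.println("near real time: " + nearRealTimeView(base, log));
    }
}
```

In this sketch the read-optimized view never sees `trip-3` or the updated fare for `trip-2`, while the near-real-time view does, at the cost of a merge at query time.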

For more, head over to https://uber.github.io/hoodie

About

License: Apache License 2.0


Languages

  • Java 99.0%
  • Scala 0.9%
  • Shell 0.0%