david's repositories
carbondata
Mirror of Apache CarbonData
deeplake
Database for AI. Store Vectors, Images, Texts, Videos, etc. Use with LLMs/LangChain. Store, query, version, & visualize any AI data. Stream data in real-time to PyTorch/TensorFlow. https://activeloop.ai
ray
Ray is a unified framework for scaling AI and Python applications. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
velox
A C++ vectorized database acceleration library aimed at optimizing query engines and data processing systems.
ClickHouse
ClickHouse® is a free analytics DBMS for big data
spark-rapids
Spark RAPIDS plugin - accelerate Apache Spark with GPUs
duckdb
DuckDB is an in-process SQL OLAP Database Management System
arrow
Apache Arrow is a multi-language toolbox for accelerated data interchange and in-memory processing
free-programming-books
Freely available programming books
fdb-record-layer
A record-oriented store built on FoundationDB
iceberg
Apache Iceberg
tez
Mirror of Apache Tez
arthas
Alibaba Java Diagnostic Tool Arthas
incubator-iotdb
Apache IoTDB
lipwig
A slightly moist lipstick-on-pig clone for Apache Hive
async-profiler
Sampling CPU and HEAP profiler for Java featuring AsyncGetCallTrace + perf_events
zstd-jni
JNI binding for Zstd
FlameViewer
Tool for flamegraphs visualization
hadoop-ozone
Scalable, redundant, and distributed object store for Apache Hadoop
carbondata_guide
Apache CarbonData source code reading guide
mlflow
Open source platform for the machine learning lifecycle
sqlflow
Brings SQL and AI together.
influxdb
Scalable datastore for metrics, events, and real-time analytics
kudu
Mirror of Apache Kudu
drill
Apache Drill
rocksdb
A library that provides an embeddable, persistent key-value store for fast storage.