Sujith Jay Nair's repositories
data-readings
Reading List in Data Systems
sujithjay.github.io
Personal Blog
incubator-gluten
Gluten is a middle layer responsible for offloading JVM-based SQL engines' execution to native engines.
incubator-iceberg
Apache Iceberg (Incubating)
spark
Fork of Apache Spark
weldj
Java Bindings for Weld
arrow
Apache Arrow is a cross-language development platform for in-memory data. It specifies a standardized language-independent columnar memory format for flat and hierarchical data, organized for efficient analytic operations on modern hardware. It also provides computational libraries and zero-copy streaming messaging and interprocess communication. Languages currently supported include C, C++, Java, JavaScript, Python, and Ruby.
ballista
Distributed compute platform implemented in Rust, using the Apache Arrow memory model.
benchmarks
Benchmarks on Code Snippets
breeze
Breeze is a numerical processing library for Scala.
colabs
https://colab.research.google.com/
datafusion-comet
Apache DataFusion Comet Spark Accelerator
lettuceleaf
A Distributed Task Queue in Java
llama_index
LlamaIndex (GPT Index) is a project that provides a central interface to connect your LLMs with external data.
logseq
A privacy-first, open-source platform for knowledge management and collaboration. Download link: http://github.com/logseq/logseq/releases. Roadmap: http://trello.com/b/8txSM12G/roadmap
rabpubsub-publisher
A simple Publisher wrapper for RabbitMQ
rabpubsub-subscriber
Subscriber Module of RabPubSub
raft-rs
Raft distributed consensus algorithm implemented in Rust.
velox
A C++ vectorized database acceleration library aimed at optimizing query engines and data processing systems.
weld
High-performance runtime for data analytics applications