cj's repositories
JavaFamily
Recording learning notes and sharing practical technical content
arrow
Apache Arrow is a multi-language toolbox for accelerated data interchange and in-memory processing
arrow2
Transmute-free Rust library to work with the Arrow format
statd
A simple, lightweight data statistics API service component built on top of Apache Calcite
trino
Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
CPlusPlusThings
All about C++
data-juicer
A one-stop data processing system to make data higher-quality, juicier, and more digestible for LLMs! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷
devika
Devika is an Agentic AI Software Engineer that can understand high-level human instructions, break them down into steps, research relevant information, and write code to achieve the given objective. Devika aims to be a competitive open-source alternative to Devin by Cognition AI.
dpkb
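A collection of big data material, including distributed storage engines, distributed compute engines, data warehouse construction, and more. Keywords: Hadoop, HBase, ES, Kudu, Hive, Presto, Spark, Flink, Kylin, ClickHouse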
duckdb
DuckDB is an in-process SQL OLAP Database Management System
e2eAIOK
Intel® End-to-End AI Optimization Kit
God-Of-BigData
Focused on big data learning and interview preparation; the road to big data mastery starts here. Flink/Spark/Hadoop/Hbase/Hive...
Jlama
Jlama is a modern Java inference engine for LLMs
jvector
JVector: the most advanced embedded vector search engine
llm-examples
Streamlit LLM app examples for getting started
mlx-examples
Examples in the MLX framework
presto
The official home of the Presto distributed SQL query engine for big data
pytorch-llama
LLaMA 2 implemented from scratch in PyTorch
pytorch_geometric
Graph Neural Network Library for PyTorch
substrait
A cross-platform way to express data transformation, relational algebra, standardized record expression and plans.
velox
A C++ vectorized database acceleration library aimed at optimizing query engines and data processing systems.
vllm
A high-throughput and memory-efficient inference and serving engine for LLMs