heluocs

Location: Beijing

heluocs's starred repositories

paimon

Apache Paimon is a lake format that enables building a Realtime Lakehouse Architecture with Flink and Spark for both streaming and batch operations.

Language: Java | License: Apache-2.0 | Stargazers: 2165 | Issues: 0

ragflow

RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.

Language: Python | License: Apache-2.0 | Stargazers: 12314 | Issues: 0

sparrow

Data processing with ML and LLM

Language: Python | License: GPL-3.0 | Stargazers: 2335 | Issues: 0

pygwalker

PyGWalker: Turn your pandas dataframe into an interactive UI for visual analysis

Language: Python | License: Apache-2.0 | Stargazers: 10754 | Issues: 0

LMFlow

An Extensible Toolkit for Finetuning and Inference of Large Foundation Models. Large Models for All.

Language: Python | License: Apache-2.0 | Stargazers: 8132 | Issues: 0

MediaCrawler

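Crawlers for Xiaohongshu notes and comments, Douyin videos and comments, Kuaishou videos and comments, Bilibili videos and comments, and Weibo posts and comments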

Language: Python | License: NOASSERTION | Stargazers: 15250 | Issues: 0

trino

Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)

Language: Java | License: Apache-2.0 | Stargazers: 9934 | Issues: 0

hudi

Upserts, Deletes And Incremental Processing on Big Data.

Language: Java | License: Apache-2.0 | Stargazers: 5227 | Issues: 0

iceberg

Apache Iceberg

Language: Java | License: Apache-2.0 | Stargazers: 5925 | Issues: 0

data-juicer

A one-stop data processing system to make data higher-quality, juicier, and more digestible for (multimodal) LLMs! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷

Language: Python | License: Apache-2.0 | Stargazers: 1773 | Issues: 0

build-your-own-x

Master programming by recreating your favorite technologies from scratch.

Stargazers: 288354 | Issues: 0

Language: Python | License: Apache-2.0 | Stargazers: 7018 | Issues: 0

ant

Ant game engine

Language: Lua | License: MIT | Stargazers: 3729 | Issues: 0

flink

Apache Flink

Language: Java | License: Apache-2.0 | Stargazers: 23545 | Issues: 0

vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Language: Python | License: Apache-2.0 | Stargazers: 23182 | Issues: 0

spark

Apache Spark - A unified analytics engine for large-scale data processing

Language: Scala | License: Apache-2.0 | Stargazers: 38964 | Issues: 0

TensorRT-LLM

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.

Language: C++ | License: Apache-2.0 | Stargazers: 7524 | Issues: 0

streaming-llm

[ICLR 2024] Efficient Streaming Language Models with Attention Sinks

Language: Python | License: MIT | Stargazers: 6378 | Issues: 0

javascript

JavaScript Style Guide

Language: JavaScript | License: MIT | Stargazers: 143448 | Issues: 0

parquet-format

Apache Parquet Format

Language: Thrift | License: Apache-2.0 | Stargazers: 1700 | Issues: 0

tarp

Fast and simple stream processing of files in tar files, useful for deep learning, big data, and many other applications.

Language: Go | Stargazers: 117 | Issues: 0

stanford_alpaca

Code and documentation to train Stanford's Alpaca models, and generate the data.

Language: Python | License: Apache-2.0 | Stargazers: 29171 | Issues: 0

Language: Python | Stargazers: 1364 | Issues: 0

peft

🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.

Language: Python | License: Apache-2.0 | Stargazers: 15089 | Issues: 0

bark

🔊 Text-Prompted Generative Audio Model

Language: Jupyter Notebook | License: MIT | Stargazers: 33873 | Issues: 0

Awesome-LLMOps

An awesome & curated list of best LLMOps tools for developers

Language: Shell | License: CC0-1.0 | Stargazers: 3451 | Issues: 0

AudioGPT

AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head

Language: Python | License: NOASSERTION | Stargazers: 9905 | Issues: 0

COLA

🥤 COLA: Clean Object-oriented & Layered Architecture

Language: Java | License: LGPL-2.1 | Stargazers: 11613 | Issues: 0

katalyst-core

Katalyst aims to provide a universal solution that improves resource utilization and reduces overall costs in the cloud. This repository contains the core components of the Katalyst system, including multiple agents and centralized components.

Language: Go | License: Apache-2.0 | Stargazers: 394 | Issues: 0

mpire

A Python package for easy multiprocessing, but faster than multiprocessing

Language: Python | License: MIT | Stargazers: 1962 | Issues: 0