kevin__liu's repositories

lightllm

LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.

License: Apache-2.0 | Stargazers: 0 | Issues: 0

CUDA-Learn-Note

🎉 CUDA notes / collection of frequently asked interview questions / C++ notes; personal notes, updated sporadically: sgemm, sgemv, warp reduce, block reduce, dot product, elementwise, softmax, layernorm, rmsnorm, hist, etc.

License: GPL-3.0 | Stargazers: 0 | Issues: 0

vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

License: Apache-2.0 | Stargazers: 0 | Issues: 0

transformers

🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.

License: Apache-2.0 | Stargazers: 0 | Issues: 0

Awesome-LLM-Inference

💻 A small collection of awesome LLM inference resources [Papers|Blogs|Docs] with code, covering TensorRT-LLM, streaming-llm, SmoothQuant, WINT8/4, Continuous Batching, FlashAttention, PagedAttention, etc.

License: GPL-3.0 | Stargazers: 0 | Issues: 0

lmdeploy

LMDeploy is a toolkit for compressing, deploying, and serving LLMs.

License: Apache-2.0 | Stargazers: 0 | Issues: 0

Chinese-CLIP

Chinese version of CLIP which achieves Chinese cross-modal retrieval and representation generation.

License: MIT | Stargazers: 0 | Issues: 0

llama.cpp

Port of Facebook's LLaMA model in C/C++

License: MIT | Stargazers: 0 | Issues: 0

LLMSurvey

The official GitHub page for the survey paper "A Survey of Large Language Models".

Stargazers: 0 | Issues: 0

DeepLearningSystem

An introduction to the core principles of deep learning systems.

License: Apache-2.0 | Stargazers: 0 | Issues: 0

transformer-deploy

Efficient, scalable and enterprise-grade CPU/GPU inference server for 🤗 Hugging Face transformer models 🚀

License: Apache-2.0 | Stargazers: 0 | Issues: 0

whisper.cpp

Port of OpenAI's Whisper model in C/C++

License: MIT | Stargazers: 0 | Issues: 0

Megatron-DeepSpeed

Ongoing research training transformer language models at scale, including: BERT & GPT-2

License: NOASSERTION | Stargazers: 0 | Issues: 0

gloo

Collective communications library with various primitives for multi-machine training.

License: NOASSERTION | Stargazers: 0 | Issues: 0

DeepSpeedExamples

Example models using DeepSpeed

License: Apache-2.0 | Stargazers: 0 | Issues: 0

peft

🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.

License: Apache-2.0 | Stargazers: 0 | Issues: 0

ChatGPT

🔮 ChatGPT Desktop Application (Mac, Windows and Linux)

License: AGPL-3.0 | Stargazers: 0 | Issues: 0

FasterTransformer

Transformer related optimization, including BERT, GPT

License: Apache-2.0 | Stargazers: 0 | Issues: 0

flash-attention

Fast and memory-efficient exact attention

License: BSD-3-Clause | Stargazers: 0 | Issues: 0

server

The Triton Inference Server provides an optimized cloud and edge inferencing solution.

License: BSD-3-Clause | Stargazers: 0 | Issues: 0

tokenizers-cpp

Universal cross-platform tokenizers binding to HF and sentencepiece

License: Apache-2.0 | Stargazers: 0 | Issues: 0

mlc-llm

Enable everyone to develop, optimize and deploy AI models natively on everyone's devices.

License: Apache-2.0 | Stargazers: 0 | Issues: 0

pwndbg

Exploit Development and Reverse Engineering with GDB Made Easy

License: MIT | Stargazers: 0 | Issues: 0

glibc

Unofficial mirror of sourceware glibc repository. Updated daily.

License: NOASSERTION | Stargazers: 0 | Issues: 0

pdfs

Technically-oriented PDF Collection (Papers, Specs, Decks, Manuals, etc)

Stargazers: 0 | Issues: 0

sentencepiece

Unsupervised text tokenizer for Neural Network-based text generation.

License: Apache-2.0 | Stargazers: 0 | Issues: 0

the-algorithm

Source code for Twitter's Recommendation Algorithm

License: AGPL-3.0 | Stargazers: 0 | Issues: 0

triton

Development repository for the Triton language and compiler

License: MIT | Stargazers: 0 | Issues: 0

airflow

Apache Airflow - A platform to programmatically author, schedule, and monitor workflows

License: Apache-2.0 | Stargazers: 0 | Issues: 0

web-stable-diffusion

Bringing stable diffusion models to web browsers. Everything runs inside the browser with no server support.

License: Apache-2.0 | Stargazers: 0 | Issues: 0