zzp_miracle (zzpmiracle)

zzpmiracle

Geek Repo

Company:Alibaba PAI.

Location:Hangzhou, China

Github PK Tool:Github PK Tool

zzp_miracle's starred repositories

vattention

Dynamic Memory Management for Serving LLMs without PagedAttention

Language:CLicense:MITStargazers:161Issues:0Issues:0

ScaleLLM

A high-performance inference system for large language models, designed for production environments.

Language:C++License:Apache-2.0Stargazers:346Issues:0Issues:0
Language:PythonLicense:Apache-2.0Stargazers:53Issues:0Issues:0

DistServe

Disaggregated serving system for Large Language Models (LLMs).

Language:Jupyter NotebookLicense:Apache-2.0Stargazers:228Issues:0Issues:0

llumnix

Efficient and easy multi-instance LLM serving

Language:PythonLicense:Apache-2.0Stargazers:82Issues:0Issues:0

Spec-Bench

Spec-Bench: A Comprehensive Benchmark and Unified Evaluation Platform for Speculative Decoding (ACL 2024 Findings)

Language:PythonLicense:Apache-2.0Stargazers:132Issues:0Issues:0

DARC

Decentralized Autonomous Regulated Company (DARC), a company virtual machine that runs on any EVM-compatible blockchain, with on-chain law system, multi-level tokens and dividends mechanism.

Language:TypeScriptLicense:NOASSERTIONStargazers:9405Issues:0Issues:0

FlexFlow

FlexFlow Serve: Low-Latency, High-Performance LLM Serving

Language:C++License:Apache-2.0Stargazers:1614Issues:0Issues:0

transformers

🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.

Language:PythonLicense:Apache-2.0Stargazers:130388Issues:0Issues:0

flashinfer

FlashInfer: Kernel Library for LLM Serving

Language:CudaLicense:Apache-2.0Stargazers:956Issues:0Issues:0

lmdeploy

LMDeploy is a toolkit for compressing, deploying, and serving LLMs.

Language:PythonLicense:Apache-2.0Stargazers:3801Issues:0Issues:0

Modern-CPP-Programming

Modern C++ Programming Course (C++03/11/14/17/20/23/26)

Language:HTMLStargazers:11597Issues:0Issues:0

rtp-llm

RTP-LLM: Alibaba's high-performance LLM inference engine for diverse applications.

Language:C++License:Apache-2.0Stargazers:475Issues:0Issues:0

how-to-optim-algorithm-in-cuda

how to optimize some algorithm in cuda.

Language:CudaStargazers:1318Issues:0Issues:0

DeepSpeed-MII

MII makes low-latency and high-throughput inference possible, powered by DeepSpeed.

Language:PythonLicense:Apache-2.0Stargazers:1810Issues:0Issues:0

ChatGLM3

ChatGLM3 series: Open Bilingual Chat LLMs | 开源双语对话语言模型

Language:PythonLicense:Apache-2.0Stargazers:13235Issues:0Issues:0

TensorRT-LLM

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.

Language:C++License:Apache-2.0Stargazers:7870Issues:0Issues:0

recom

An Optimizing Compiler for Recommendation Model Inference

Language:C++License:Apache-2.0Stargazers:21Issues:0Issues:0

sd-webui-EasyPhoto

📷 EasyPhoto | Your Smart AI Photo Generator.

Language:PythonLicense:Apache-2.0Stargazers:4854Issues:0Issues:0
Language:PythonLicense:MITStargazers:33Issues:0Issues:0

vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Language:PythonLicense:Apache-2.0Stargazers:24534Issues:0Issues:0

hiq

HiQ - Observability And Optimization In Modern AI Era

Language:PythonLicense:NOASSERTIONStargazers:70Issues:0Issues:0

NeMo

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

Language:PythonLicense:Apache-2.0Stargazers:11198Issues:0Issues:0

NeMo-Framework-Launcher

Provides end-to-end model development pipelines for LLMs and Multimodal models that can be launched on-prem or cloud-native.

Language:PythonLicense:Apache-2.0Stargazers:433Issues:0Issues:0

YesPlayMusic

高颜值的第三方网易云播放器,支持 Windows / macOS / Linux :electron:

Language:VueLicense:MITStargazers:28075Issues:0Issues:0

BMTrain

Efficient Training (including pre-training and fine-tuning) for Big Models

Language:PythonLicense:Apache-2.0Stargazers:541Issues:0Issues:0

Grounded-Segment-Anything

Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect , Segment and Generate Anything

Language:Jupyter NotebookLicense:Apache-2.0Stargazers:14473Issues:0Issues:0

the-algorithm

Source code for Twitter's Recommendation Algorithm

Language:ScalaLicense:AGPL-3.0Stargazers:61865Issues:0Issues:0

ChatGPT-Next-Web

A cross-platform ChatGPT/Gemini UI (Web / PWA / Linux / Win / MacOS). 一键拥有你自己的跨平台 ChatGPT/Gemini 应用。

Language:TypeScriptLicense:MITStargazers:73804Issues:0Issues:0

Pake

🤱🏻 Turn any webpage into a desktop app with Rust. 🤱🏻 利用 Rust 轻松构建轻量级多端桌面应用

Language:RustLicense:MITStargazers:25106Issues:0Issues:0