neos's repositories
ascend-docker
Ascend inference base image, built on Ubuntu 22.04
_cuda_learning
learning how CUDA works
automated-sub-network-selection
Official Repo for "Compressing Large Language Models with Automated Sub-Network Search"
DistServe
Disaggregated serving system for Large Language Models (LLMs).
EAGLE
Official Implementation of EAGLE-1 and EAGLE-2
FunClip
Open-source, accurate, and easy-to-use video clipping tool with integrated LLM-based AI clipping
GPTSwarm
🐝 GPTSwarm: LLM agents as (Optimizable) Graphs
graph-of-thoughts
Official Implementation of "Graph of Thoughts: Solving Elaborate Problems with Large Language Models"
IMCCompiler
logic compiler for SIMD IMC
LazyLLM
Easiest and laziest way to build multi-agent LLM applications.
lectures
Material for cuda-mode lectures
linkandroid
Link Android and PC easily! An all-in-one phone connection assistant!
llm-on-ray
Pretrain, finetune and serve LLMs on Intel platforms with Ray
LLM-Viewer
Analyze the inference of Large Language Models (LLMs). Analyze aspects like computation, storage, transmission, and hardware roofline model in a user-friendly interface.
llm_interview_note
Notes on knowledge and interview questions for large language model (LLM) algorithm (application) engineers
llumnix
Efficient and easy multi-instance LLM serving
lolcats
Repo for "LoLCATs: On Low-Rank Linearizing of Large Language Models"
mem0
The memory layer for Personalized AI
mllm
Fast Multimodal LLM on Mobile Devices
Nanoflow
A throughput-oriented high-performance serving framework for LLMs
ONNXim
ONNXim is a fast cycle-level simulator that can model multi-core NPUs for DNN inference
PromptIR
PromptIR: Prompting for All-in-One Blind Image Restoration [NeurIPS 2023]
punica
Serving multiple LoRA-finetuned LLMs as one
swiftLLM
A tiny yet powerful LLM inference system tailored for research purposes
tiny-universe
"A White-Box Guide to Building Large Models": a fully hand-built Tiny-Universe
vidur
A large-scale simulation framework for LLM inference