wm901115nwpu's repositories

apex

A PyTorch Extension: Tools for easy mixed precision and distributed training in Pytorch

Language:PythonLicense:BSD-3-ClauseStargazers:0Issues:0Issues:0

Awesome-LLM-Inference

📖A curated list of Awesome LLM Inference Paper with codes, TensorRT-LLM, vLLM, streaming-llm, AWQ, SmoothQuant, WINT8/4, Continuous Batching, FlashAttention, PagedAttention etc.

License:GPL-3.0Stargazers:0Issues:0Issues:0

DeepLearningSystem

Deep Learning System core principles introduction.

Language:Jupyter NotebookLicense:Apache-2.0Stargazers:0Issues:0Issues:0

DeepSpeed

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

Language:PythonLicense:Apache-2.0Stargazers:0Issues:0Issues:0

DeepSpeed-MII

MII makes low-latency and high-throughput inference possible, powered by DeepSpeed.

Language:PythonLicense:Apache-2.0Stargazers:0Issues:0Issues:0

DeepSpeedExamples

Example models using DeepSpeed

Language:PythonLicense:Apache-2.0Stargazers:0Issues:0Issues:0

depyf

depyf is a tool to help you understand and adapt to PyTorch compiler torch.compile.

Language:PythonLicense:MITStargazers:0Issues:0Issues:0

intel-extension-for-transformers

⚡ Build your chatbot within minutes on your favorite device; offer SOTA compression techniques for LLMs; run LLMs efficiently on Intel Platforms⚡

Language:PythonLicense:Apache-2.0Stargazers:0Issues:0Issues:0
Language:Jupyter NotebookStargazers:0Issues:1Issues:0

LLaMA-Factory

Unify Efficient Fine-tuning of 100+ LLMs

License:Apache-2.0Stargazers:0Issues:0Issues:0

lmdeploy

LMDeploy is a toolkit for compressing, deploying, and serving LLMs.

License:Apache-2.0Stargazers:0Issues:0Issues:0
License:Apache-2.0Stargazers:0Issues:0Issues:0

Megatron-LM

Ongoing research training transformer models at scale

Language:PythonLicense:NOASSERTIONStargazers:0Issues:0Issues:0

mmcv

OpenMMLab Computer Vision Foundation

Language:PythonLicense:Apache-2.0Stargazers:0Issues:0Issues:0

mmdeploy

OpenMMLab Model Deployment Framework

Language:PythonLicense:Apache-2.0Stargazers:0Issues:0Issues:0

mmengine

OpenMMLab Foundational Library for Training Deep Learning Models

Language:PythonLicense:Apache-2.0Stargazers:0Issues:0Issues:0

NeMo

NeMo: a framework for generative AI

License:Apache-2.0Stargazers:0Issues:0Issues:0

neural-compressor

Provide unified APIs for SOTA model compression techniques, such as low precision (INT8/INT4/FP4/NF4) quantization, sparsity, pruning, and knowledge distillation on mainstream AI frameworks such as TensorFlow, PyTorch, and ONNX Runtime.

Language:PythonLicense:Apache-2.0Stargazers:0Issues:0Issues:0

OnnxSlim

A Toolkit to Help Optimize Large Onnx Model

Language:PythonLicense:MITStargazers:0Issues:0Issues:0

pytorch

Tensors and Dynamic neural networks in Python with strong GPU acceleration

Language:PythonLicense:NOASSERTIONStargazers:0Issues:0Issues:0

pytorch-image-models

PyTorch image models, scripts, pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (ViT), MobileNet-V3/V2, RegNet, DPN, CSPNet, Swin Transformer, MaxViT, CoAtNet, ConvNeXt, and more

Language:PythonLicense:Apache-2.0Stargazers:0Issues:0Issues:0

TensorRT

TensorRT is a C++ library for high performance inference on NVIDIA GPUs and deep learning accelerators.

Language:C++License:Apache-2.0Stargazers:0Issues:0Issues:0

TransformerEngine

A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper GPUs, to provide better performance with lower memory utilization in both training and inference.

Language:PythonLicense:Apache-2.0Stargazers:0Issues:0Issues:0

triton

Development repository for the Triton language and compiler

Language:C++License:MITStargazers:0Issues:0Issues:0

tvm

Open deep learning compiler stack for cpu, gpu and specialized accelerators

Language:PythonLicense:Apache-2.0Stargazers:0Issues:0Issues:0

vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Language:PythonLicense:Apache-2.0Stargazers:0Issues:0Issues:0

wenet

Production First and Production Ready End-to-End Speech Recognition Toolkit

Language:PythonLicense:Apache-2.0Stargazers:0Issues:0Issues:0

workshops

This is a repository for all workshop related materials.

Language:Jupyter NotebookStargazers:0Issues:0Issues:0

xtuner

An efficient, flexible and full-featured toolkit for fine-tuning large models (InternLM, Llama, Baichuan, Qwen, ChatGLM)

License:Apache-2.0Stargazers:0Issues:0Issues:0

zero_nlp

中文nlp解决方案(大模型、数据、模型、训练、推理)

Language:PythonLicense:MITStargazers:0Issues:0Issues:0