zhaoxiandong (hustzxd)

Company: AMD

Location: Beijing

Home Page: https://joyeeo.github.io/about

zhaoxiandong's starred repositories

sparse_gpu_operator

GPU operators for sparse tensor operations

Language: Python · Stargazers: 20 · Issues: 0

LLaMA-Factory

Unify Efficient Fine-Tuning of 100+ LLMs

Language: Python · License: Apache-2.0 · Stargazers: 22812 · Issues: 0

EfficientPaperList

Papers about pruning, quantization, and efficient inference/training.

Language: Python · Stargazers: 3 · Issues: 0

my-tv

My TV: a live TV streaming app, ready to use right after installation.

Language: C · License: Apache-2.0 · Stargazers: 26710 · Issues: 0

FLAP

[AAAI 2024] Fluctuation-based Adaptive Structured Pruning for Large Language Models

Language: Python · License: Apache-2.0 · Stargazers: 26 · Issues: 0

PiPPy

Pipeline Parallelism for PyTorch

Language: Python · License: BSD-3-Clause · Stargazers: 637 · Issues: 0

neurips_llm_efficiency_challenge

NeurIPS Large Language Model Efficiency Challenge: 1 LLM + 1 GPU + 1 Day

Language: Python · Stargazers: 238 · Issues: 0

DeepSpeedExamples

Example models using DeepSpeed

Language: Python · License: Apache-2.0 · Stargazers: 5743 · Issues: 0

LLM-Finetuning

LLM fine-tuning with PEFT

Language: Jupyter Notebook · Stargazers: 1669 · Issues: 0

litgpt

Pretrain, finetune, deploy 20+ LLMs on your own data. Uses state-of-the-art techniques: flash attention, FSDP, 4-bit, LoRA, and more.

Language: Python · License: Apache-2.0 · Stargazers: 6883 · Issues: 0

gpu_poor

Calculate tokens/s and GPU memory requirements for any LLM. Supports llama.cpp/ggml/bnb/QLoRA quantization.

Language: JavaScript · Stargazers: 657 · Issues: 0

TensorRT-LLM

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.

Language: C++ · License: Apache-2.0 · Stargazers: 6884 · Issues: 0

neural-compressor

SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) and sparsity; leading model compression techniques for TensorFlow, PyTorch, and ONNX Runtime.

Language: Python · License: Apache-2.0 · Stargazers: 2013 · Issues: 0

composer

Supercharge Your Model Training

Language: Python · License: Apache-2.0 · Stargazers: 5029 · Issues: 0

SparseFinetuning

Repository for sparse fine-tuning of LLMs via a modified version of the MosaicML llmfoundry.

Language: Python · License: Apache-2.0 · Stargazers: 36 · Issues: 0

text-generation-inference

Large Language Model Text Generation Inference

Language: Python · License: Apache-2.0 · Stargazers: 8090 · Issues: 0

FasterTransformer

Transformer-related optimization, including BERT and GPT

Language: C++ · License: Apache-2.0 · Stargazers: 5529 · Issues: 0

ChatGPT-Academic-Prompt

Use ChatGPT for academic writing

License: MIT · Stargazers: 397 · Issues: 0

chatgpt-prompts-for-academic-writing

This list of writing prompts covers a range of topics and tasks, including brainstorming research ideas, improving language and style, conducting literature reviews, and developing research plans.

Stargazers: 2481 · Issues: 0

triton

Development repository for the Triton language and compiler

Language: C++ · License: MIT · Stargazers: 11437 · Issues: 0

Llama-Chinese

Llama Chinese community: an online Llama 3 demo and fine-tuned models are now available, with the latest Llama 3 learning resources aggregated in real time. All code has been updated for Llama 3. Building the best Chinese Llama LLM; fully open source and commercially usable.

Language: Python · Stargazers: 12209 · Issues: 0

GPU-Puzzles

Solve puzzles. Learn CUDA.

Language: Jupyter Notebook · License: MIT · Stargazers: 5132 · Issues: 0

Awesome-Efficient-LLM

A curated list of resources on efficient large language models

Language: Python · Stargazers: 852 · Issues: 0

pdftitle

A utility to extract the title from a PDF file

Language: Python · License: GPL-3.0 · Stargazers: 129 · Issues: 0

LLMSurvey

The official GitHub page for the survey paper "A Survey of Large Language Models".

Language: Python · Stargazers: 9092 · Issues: 0

PaperListTemplate

This template makes it easy for you to manage papers.

Language: Python · Stargazers: 2 · Issues: 0

llm-awq

[MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration

Language: Python · License: MIT · Stargazers: 1924 · Issues: 0

wanda

A simple and effective LLM pruning approach.

Language: Python · License: MIT · Stargazers: 538 · Issues: 0