hkxIron's starred repositories

sglang

SGLang is a fast serving framework for large language models and vision language models.

Language: Python · License: Apache-2.0 · Stars: 4289

paper-reading

Paragraph-by-paragraph close readings of classic and new deep learning papers

License: Apache-2.0 · Stars: 25774

fairscale

PyTorch extensions for high performance and large scale training.

Language: Python · License: NOASSERTION · Stars: 3119

llama-models

Utilities intended for use with Llama models.

Language: Python · License: NOASSERTION · Stars: 3491

zero_nlp

Chinese NLP solutions (large models, data, models, training, inference)

Language: Jupyter Notebook · License: MIT · Stars: 2792

accelerate

🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (including fp8), and easy-to-configure FSDP and DeepSpeed support

Language: Python · License: Apache-2.0 · Stars: 7554
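For context, a minimal sketch of the usual Accelerate training-loop pattern; `model`, `optimizer`, `train_loader`, and `loss_fn` are placeholders for your own objects, not anything defined in this listing:

```python
# Minimal sketch of the typical Accelerate training loop.
# All arguments are placeholders supplied by the caller.
from accelerate import Accelerator

def train(model, optimizer, train_loader, loss_fn, epochs=1):
    accelerator = Accelerator()  # picks up device/distributed config from `accelerate launch`
    model, optimizer, train_loader = accelerator.prepare(model, optimizer, train_loader)

    model.train()
    for _ in range(epochs):
        for batch, targets in train_loader:
            optimizer.zero_grad()
            outputs = model(batch)
            loss = loss_fn(outputs, targets)
            accelerator.backward(loss)  # replaces loss.backward() so mixed precision / DDP work
            optimizer.step()
```

The same script is then launched on one GPU, multiple GPUs, or TPU via `accelerate launch train.py` without code changes.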

LLM-Pretrain-SFT

Scripts for LLM pre-training and fine-tuning (with/without LoRA and DeepSpeed)

Language: Python · License: Apache-2.0 · Stars: 61

LLM101n

LLM101n: Let's build a Storyteller

Stars: 27162

SPACE

Official implementation of SPACE

Language: Python · License: Apache-2.0 · Stars: 7

Spec-Bench

Spec-Bench: A Comprehensive Benchmark and Unified Evaluation Platform for Speculative Decoding (ACL 2024 Findings)

Language: Python · License: Apache-2.0 · Stars: 142

COMET

A Neural Framework for MT Evaluation

Language: Python · License: Apache-2.0 · Stars: 469
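As a quick illustration, scoring MT output with the unbabel-comet package typically looks like the sketch below; the checkpoint name is one published model and the sentences are made-up examples, so check the repo for the current interface:

```python
# Illustrative COMET scoring sketch (assumes the unbabel-comet package is installed).
from comet import download_model, load_from_checkpoint

model_path = download_model("Unbabel/wmt22-comet-da")  # one published checkpoint
model = load_from_checkpoint(model_path)

data = [{
    "src": "Der Hund schläft auf dem Sofa.",      # source sentence
    "mt":  "The dog sleeps on the sofa.",          # machine translation
    "ref": "The dog is sleeping on the couch.",    # human reference
}]
output = model.predict(data, batch_size=8, gpus=0)  # gpus=0 runs on CPU
print(output.scores)        # per-segment scores
print(output.system_score)  # corpus-level score
```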

LMOps

General technology for enabling AI capabilities with LLMs and MLLMs

Language: Python · License: MIT · Stars: 3509

leedl-tutorial

《李宏毅深度学习教程》 (Prof. Hung-yi Lee's Deep Learning Tutorial; recommended by Prof. Lee 👍, also known as the "Apple Book" 🍎). PDF download: https://github.com/datawhalechina/leedl-tutorial/releases

Language: Jupyter Notebook · License: NOASSERTION · Stars: 12552

MInference

To speed up long-context LLM inference, MInference computes attention with approximate and dynamic sparsity, reducing pre-filling latency by up to 10x on an A100 while maintaining accuracy.

Language: Python · License: MIT · Stars: 658

chat-dataset-baseline

A manually curated Chinese dialogue dataset, plus fine-tuning code for ChatGLM

Language: Jupyter Notebook · Stars: 1129

ChatGLM-Finetuning

Fine-tuning the ChatGLM-6B, ChatGLM2-6B, and ChatGLM3-6B models on specific downstream tasks, covering Freeze, LoRA, P-Tuning, full-parameter fine-tuning, and more

Language: Python · Stars: 2611

Cherry_LLM

[NAACL'24] Self-data filtering of LLM instruction-tuning data using a novel perplexity-based difficulty score, without using any other models

Language: Python · Stars: 258

LLMBook-zh.github.io

《大语言模型》 (Large Language Models), by Xin Zhao, Junyi Li, Kun Zhou, Tianyi Tang, and Ji-Rong Wen

Stars: 2087

LLMTest_NeedleInAHaystack

Simple retrieval from LLMs at various context lengths to measure accuracy

Language: Jupyter Notebook · License: NOASSERTION · Stars: 1400

mlmm-evaluation

Multilingual Large Language Models Evaluation Benchmark

Language: Python · License: Apache-2.0 · Stars: 81

llama3

The official Meta Llama 3 GitHub site

Language: Python · License: NOASSERTION · Stars: 25676

minbpe

Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization.

Language: Python · License: MIT · Stars: 8943
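A stripped-down sketch of the BPE training loop in the spirit of minbpe (not its exact code): repeatedly merge the most frequent adjacent pair of byte/token ids into a new token id until the target vocabulary size is reached.

```python
# Toy BPE trainer: starts from raw bytes and records the learned merges.
from collections import Counter

def get_pair_counts(ids):
    # Count how often each adjacent (left, right) id pair occurs.
    return Counter(zip(ids, ids[1:]))

def merge(ids, pair, new_id):
    # Replace every occurrence of `pair` with `new_id`.
    out, i = [], 0
    while i < len(ids):
        if i < len(ids) - 1 and (ids[i], ids[i + 1]) == pair:
            out.append(new_id)
            i += 2
        else:
            out.append(ids[i])
            i += 1
    return out

def train_bpe(text, vocab_size):
    ids = list(text.encode("utf-8"))       # start from raw bytes (ids 0..255)
    merges = {}                             # (id, id) -> new token id
    for new_id in range(256, vocab_size):
        counts = get_pair_counts(ids)
        if not counts:
            break
        pair = counts.most_common(1)[0][0]  # most frequent adjacent pair
        ids = merge(ids, pair, new_id)
        merges[pair] = new_id
    return merges
```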

Chinese-Mixtral

Chinese Mixtral mixture-of-experts large language models (Chinese Mixtral MoE LLMs)

Language: Python · License: Apache-2.0 · Stars: 571

mistral-inference

Official inference library for Mistral models

Language: Jupyter Notebook · License: Apache-2.0 · Stars: 9442

REST

REST: Retrieval-Based Speculative Decoding, NAACL 2024

Language: C · License: Apache-2.0 · Stars: 154

mamba

Mamba SSM architecture

Language: Python · License: Apache-2.0 · Stars: 12232

Medusa

Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads

Language: Jupyter Notebook · License: Apache-2.0 · Stars: 2146
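A toy sketch of the draft-then-verify idea behind Medusa-style decoding (greedy acceptance only); `draft_next_tokens` and `target_logits` are hypothetical stand-ins for the extra decoding heads and the base model, not Medusa's actual API.

```python
# Generic speculative-decoding step: propose k cheap draft tokens, verify them
# with one pass of the base model, keep the longest matching prefix.
import torch

def speculative_step(tokens, draft_next_tokens, target_logits, k=4):
    draft = draft_next_tokens(tokens, k)                  # k candidate tokens (1D LongTensor)
    logits = target_logits(torch.cat([tokens, draft]))    # one base-model pass over context + draft
    accepted = []
    for i, tok in enumerate(draft):
        # The logits at position len(tokens)+i-1 predict the token at len(tokens)+i.
        pred = logits[len(tokens) + i - 1].argmax()
        if pred.item() == tok.item():
            accepted.append(tok)                          # draft token verified, keep going
        else:
            accepted.append(pred)                         # take the base model's token and stop
            break
    return torch.cat([tokens, torch.stack(accepted)])
```

Each step thus emits at least one token and at most k+? tokens per base-model pass, which is where the speed-up comes from.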

parallel-decoding

Repository of the paper "Accelerating Transformer Inference for Translation via Parallel Decoding"

Language: Python · License: Apache-2.0 · Stars: 99