yty3805595's starred repositories
Chinese-Mixtral
Chinese Mixtral mixture-of-experts large language models (Chinese Mixtral MoE LLMs)
grouped-query-attention-pytorch
(Unofficial) PyTorch implementation of grouped-query attention (GQA) from "GQA: Training Generalized Multi-Query Transformer Models from Multi-Head Checkpoints" (https://arxiv.org/pdf/2305.13245.pdf)
ring-flash-attention
Ring attention implementation built on FlashAttention
Chinese-LLaMA-Alpaca-2
Phase-2 project for Chinese LLaMA-2 & Alpaca-2 large models, plus 64K extra-long-context models (Chinese LLaMA-2 & Alpaca-2 LLMs with 64K long context models)
local-attention
An implementation of local windowed attention for language modeling
alignment-handbook
Robust recipes to align language models with human and AI preferences
BCEmbedding
NetEase Youdao's open-source embedding and reranker models for RAG products.
st-moe-pytorch
Implementation of ST-MoE, the latest incarnation of MoE after years of research at Brain, in PyTorch
PowerInfer
High-speed Large Language Model Serving on PCs with Consumer-grade GPUs
FlagEmbedding
Retrieval and Retrieval-augmented LLMs
export_llama_to_onnx
Export LLaMA models to ONNX
AutoAgents
Complex question answering in LLMs with enhanced reasoning and information-seeking capabilities.
Large-Language-Model-Notebooks-Course
Practical course about Large Language Models.