yty3805595's starred repositories

Chinese-Mixtral

Chinese Mixtral mixture-of-experts large language models (Chinese Mixtral MoE LLMs)

Language: Python · License: Apache-2.0 · Stars: 562 · Issues: 0

yarn

YaRN: Efficient Context Window Extension of Large Language Models

Language: Python · License: MIT · Stars: 1261 · Issues: 0
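Context-extension methods in the YaRN family work by rescaling the rotary position embedding (RoPE) frequencies so that longer positions map into the range the model was trained on. The sketch below shows plain position interpolation, a simpler baseline that YaRN refines with a per-frequency-band ramp; the function names are my own, not the repo's API.

```python
import math

def rope_inv_freqs(dim: int, base: float = 10000.0, scale: float = 1.0):
    """Inverse frequencies for rotary position embeddings (RoPE).

    scale > 1 applies simple position interpolation: positions are
    effectively compressed by `scale`, stretching the usable context.
    (Plain interpolation, not YaRN's per-band ramp -- illustrative only.)
    """
    return [1.0 / (scale * base ** (2 * i / dim)) for i in range(dim // 2)]

def rope_angles(pos: int, inv_freqs):
    """Rotation angle for each frequency band at a given position."""
    return [pos * f for f in inv_freqs]
```

With `scale=4`, position 4096 produces the same rotation angles as position 1024 does without scaling, which is exactly how interpolation fits a 4x longer context into the trained position range.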

grouped-query-attention-pytorch

(Unofficial) PyTorch implementation of grouped-query attention (GQA) from "GQA: Training Generalized Multi-Query Transformer Models from Multi-Head Checkpoints" (https://arxiv.org/pdf/2305.13245.pdf)

Language: Python · License: MIT · Stars: 89 · Issues: 0
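The core of grouped-query attention is that a small number of key/value heads is shared across groups of query heads, cutting the KV-cache size. A minimal NumPy sketch of that mechanism (my own illustrative code, not the repo's PyTorch implementation):

```python
import numpy as np

def grouped_query_attention(q, k, v, n_kv_heads):
    """Minimal grouped-query attention (GQA) sketch.

    q: (n_heads, seq, d) queries, one per attention head.
    k, v: (n_kv_heads, seq, d) shared key/value heads.
    Each KV head is repeated to serve n_heads // n_kv_heads query heads.
    """
    n_heads, seq, d = q.shape
    group = n_heads // n_kv_heads
    # Repeat each KV head along the head axis to match the query heads.
    k = np.repeat(k, group, axis=0)
    v = np.repeat(v, group, axis=0)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d)      # (n_heads, seq, seq)
    weights = np.exp(scores - scores.max(-1, keepdims=True))
    weights /= weights.sum(-1, keepdims=True)           # row-wise softmax
    return weights @ v                                  # (n_heads, seq, d)
```

With `n_kv_heads == n_heads` this reduces to standard multi-head attention, and with `n_kv_heads == 1` to multi-query attention, matching the spectrum described in the GQA paper.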

ring-flash-attention

Ring attention implementation with flash attention

Language: Python · Stars: 447 · Issues: 0
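Ring attention distributes the key/value sequence across devices and passes blocks around a ring; the numerics that make this possible is the same online-softmax accumulation used by flash attention, where each KV block is folded into a running max and normaliser. A single-query, single-process sketch of that accumulation (illustrative only, no relation to the repo's distributed code):

```python
import numpy as np

def blockwise_attention_row(q, k_blocks, v_blocks):
    """Online-softmax accumulation over KV blocks for one query row.

    Each block is folded in with a running max `m` and normaliser `l`,
    so the full score matrix is never materialised -- the numerical core
    shared by flash attention and ring attention.
    """
    d = q.shape[-1]
    m = -np.inf                        # running max of scores
    l = 0.0                            # running softmax denominator
    acc = np.zeros(d)                  # running weighted sum of values
    for k, v in zip(k_blocks, v_blocks):
        s = k @ q / np.sqrt(d)         # scores for this block, shape (block,)
        m_new = max(m, s.max())
        correction = np.exp(m - m_new)  # rescale previous accumulator
        p = np.exp(s - m_new)
        l = l * correction + p.sum()
        acc = acc * correction + p @ v
        m = m_new
    return acc / l
```

The result is exactly equal (up to floating point) to softmax attention computed over the concatenation of all blocks at once.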

Chinese-LLaMA-Alpaca-2

Phase-2 project of the Chinese LLaMA-2 & Alpaca-2 large language models, plus 64K long-context models (Chinese LLaMA-2 & Alpaca-2 LLMs with 64K long context models)

Language: Python · License: Apache-2.0 · Stars: 7003 · Issues: 0

WebCPM

Official code for the ACL 2023 paper "WebCPM: Interactive Web Search for Chinese Long-form Question Answering"

Language: HTML · License: Apache-2.0 · Stars: 962 · Issues: 0

unsloth

Finetune Llama 3, Mistral, Phi & Gemma LLMs 2-5x faster with 80% less memory

Language: Python · License: Apache-2.0 · Stars: 12740 · Issues: 0

swift

ms-swift: use PEFT or full-parameter training to finetune 300+ LLMs or 50+ MLLMs (Qwen2, GLM4v, Internlm2.5, Yi, Llama3, Llava-Video, Internvl2, MiniCPM-V, Deepseek, Baichuan2, Gemma2, Phi3-Vision, ...)

Language: Python · License: Apache-2.0 · Stars: 2410 · Issues: 0
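The PEFT techniques these finetuning frameworks rely on mostly reduce to LoRA: the frozen weight matrix gets a trainable low-rank update. A minimal NumPy sketch of the forward pass (the function and argument names are my own, not ms-swift's or PEFT's API):

```python
import numpy as np

def lora_forward(x, W, A, B, alpha: float):
    """LoRA-style forward pass: y = x W + (alpha / r) * x A B.

    W: frozen (d_in, d_out) base weight.
    A: (d_in, r) and B: (r, d_out) trainable low-rank factors.
    Only A and B are updated during finetuning, so trainable
    parameters drop from d_in*d_out to r*(d_in + d_out).
    """
    r = A.shape[1]
    return x @ W + (alpha / r) * (x @ A) @ B
```

B is conventionally initialised to zero, so at the start of training the adapted model is exactly the base model.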

llama3

The official Meta Llama 3 GitHub site

Language: Python · License: NOASSERTION · Stars: 23273 · Issues: 0

llama-moe

⛷️ LLaMA-MoE: Building Mixture-of-Experts from LLaMA with Continual Pre-training

Language: Python · License: Apache-2.0 · Stars: 804 · Issues: 0

makeMoE

From-scratch implementation of a sparse mixture-of-experts language model, inspired by Andrej Karpathy's makemore :)

Language: Jupyter Notebook · License: MIT · Stars: 555 · Issues: 0
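The defining piece of a sparse MoE layer is the router: a learned gate scores all experts per token, but only the top-k experts actually run. A top-2 sketch in NumPy (illustrative only, not makeMoE's code):

```python
import numpy as np

def top2_moe(x, gate_W, experts):
    """Sparse top-2 mixture-of-experts routing sketch.

    x: (d,) token vector; gate_W: (d, n_experts) router weights;
    experts: list of callables, one per expert. Only the two experts
    with the highest router scores run; their outputs are mixed by
    softmax gate weights renormalised over the chosen pair.
    """
    logits = x @ gate_W
    top2 = np.argsort(logits)[-2:]            # indices of the 2 best experts
    g = np.exp(logits[top2] - logits[top2].max())
    g /= g.sum()                              # renormalise over the chosen 2
    return sum(w * experts[i](x) for w, i in zip(g, top2))
```

Because the gate weights sum to 1 over the selected experts, identical experts make the layer an identity map, which is a handy sanity check when implementing routing.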

LEval

[ACL'24] Data and code for L-Eval, a comprehensive long context language models evaluation benchmark

Language: Python · License: GPL-3.0 · Stars: 312 · Issues: 0

Qwen2

Qwen2 is the large language model series developed by Qwen team, Alibaba Cloud.

Language: Shell · Stars: 6274 · Issues: 0

local-attention

An implementation of local windowed attention for language modeling

Language: Python · License: MIT · Stars: 351 · Issues: 0
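Local windowed attention restricts each token to a fixed-size causal window of recent tokens, which is what makes attention linear in sequence length. The mask is the whole idea; a NumPy sketch (my own, not this repo's API):

```python
import numpy as np

def local_attention_mask(seq_len: int, window: int):
    """Causal sliding-window mask: token i may attend to tokens
    max(0, i - window + 1) .. i, i.e. itself and the window-1
    tokens before it. True marks an allowed position."""
    i = np.arange(seq_len)[:, None]
    j = np.arange(seq_len)[None, :]
    return (j <= i) & (j > i - window)
```

In practice the mask is applied by setting disallowed score entries to -inf before the softmax; efficient implementations never materialise the full seq x seq matrix, but the allowed pattern is the same.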

ChatPLUG

A Chinese Open-Domain Dialogue System

Language: Python · License: Apache-2.0 · Stars: 304 · Issues: 0

alignment-handbook

Robust recipes to align language models with human and AI preferences

Language: Python · License: Apache-2.0 · Stars: 4231 · Issues: 0

DAIL-SQL

An efficient and effective few-shot NL2SQL method built on GPT-4.

Language: Python · License: Apache-2.0 · Stars: 344 · Issues: 0
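Few-shot NL2SQL methods of this kind assemble a prompt from the database schema, a handful of question/SQL demonstrations, and the target question. A generic sketch of that pattern (the function and field layout are my own; DAIL-SQL additionally selects demonstrations by similarity, which is omitted here):

```python
def build_nl2sql_prompt(schema: str, examples, question: str) -> str:
    """Assemble a few-shot text-to-SQL prompt: schema, then
    (question, SQL) demonstration pairs, then the target question
    left open for the model to complete."""
    parts = [f"-- Database schema:\n{schema}\n"]
    for q, sql in examples:
        parts.append(f"Question: {q}\nSQL: {sql}\n")
    parts.append(f"Question: {question}\nSQL:")
    return "\n".join(parts)
```

Ending the prompt at `SQL:` makes the model's continuation directly parseable as the predicted query.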

BCEmbedding

Netease Youdao's open-source embedding and reranker models for RAG products.

Language: Python · License: Apache-2.0 · Stars: 1208 · Issues: 0

QAnything

Question and Answer based on Anything.

Language: Python · License: Apache-2.0 · Stars: 10683 · Issues: 0

st-moe-pytorch

Implementation of ST-MoE, the latest incarnation of MoE after years of research at Brain, in PyTorch

Language: Python · License: MIT · Stars: 256 · Issues: 0

PowerInfer

High-speed Large Language Model Serving on PCs with Consumer-grade GPUs

Language: C++ · License: MIT · Stars: 7661 · Issues: 0

FlagEmbedding

Retrieval and Retrieval-augmented LLMs

Language: Python · License: MIT · Stars: 6085 · Issues: 0

FActScore

A package to evaluate factuality of long-form generation. Original implementation of our EMNLP 2023 paper "FActScore: Fine-grained Atomic Evaluation of Factual Precision in Long Form Text Generation"

Language: Python · License: MIT · Stars: 244 · Issues: 0
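The quantity a FActScore-style metric ultimately reports is simple: the fraction of atomic facts in a generation that a verifier judges as supported by a knowledge source. A sketch of just that final aggregation (the fact extraction and verification steps, done with LLMs in the paper, are omitted, and this is not the package's API):

```python
def atomic_fact_precision(judgements) -> float:
    """Fraction of atomic facts judged supported (True) out of all
    facts extracted from a generation. Returns 0.0 for an empty list
    by convention in this sketch."""
    if not judgements:
        return 0.0
    return sum(judgements) / len(judgements)
```

Scores are typically averaged over many generations, with an optional penalty for responses that abstain.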

export_llama_to_onnx

Export LLaMA models to ONNX.

Language: Python · License: MIT · Stars: 76 · Issues: 0

Qwen

The official repo of Qwen (通义千问), the chat and pretrained large language models proposed by Alibaba Cloud.

Language: Python · License: Apache-2.0 · Stars: 12660 · Issues: 0

tiktoken

tiktoken is a fast BPE tokeniser for use with OpenAI's models.

Language: Python · License: MIT · Stars: 11163 · Issues: 0
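BPE tokenisers like tiktoken segment text by greedily applying learned merge rules, highest-priority (lowest-rank) pair first. A toy pure-Python merge loop illustrating the idea (tiktoken's actual implementation is a heavily optimised Rust core with byte-level pre-tokenisation, none of which appears here):

```python
def bpe_merge(tokens, merges):
    """Greedy BPE merging: repeatedly join the adjacent pair with the
    lowest merge rank until no mergeable pair remains.

    tokens: iterable of string symbols (e.g. single characters).
    merges: dict mapping (left, right) pairs to integer ranks,
            lower rank = merged earlier.
    """
    tokens = list(tokens)
    while True:
        best_rank, best_i = None, None
        for i in range(len(tokens) - 1):
            pair = (tokens[i], tokens[i + 1])
            rank = merges.get(pair)
            if rank is not None and (best_rank is None or rank < best_rank):
                best_rank, best_i = rank, i
        if best_i is None:
            return tokens
        # Replace the winning pair with its concatenation.
        tokens[best_i:best_i + 2] = [tokens[best_i] + tokens[best_i + 1]]
```

Real tokenisers then map each resulting symbol to an integer id from the vocabulary.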

lmppl

Calculate perplexity on a text with pre-trained language models. Supports masked LMs (e.g. DeBERTa), autoregressive LMs (e.g. GPT-3), and encoder-decoder LMs (e.g. Flan-T5).

Language: Python · License: MIT · Stars: 109 · Issues: 0
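Whatever the model type, perplexity reduces to the exponential of the negative mean per-token log-likelihood. A sketch of that final step (how the per-token log-probabilities are obtained differs by model family and is not shown):

```python
import math

def perplexity(token_logprobs) -> float:
    """Perplexity from per-token log-probabilities (natural log):
    exp(-mean(log p)). Lower is better; a uniform probability of
    1/k per token yields perplexity k."""
    n = len(token_logprobs)
    return math.exp(-sum(token_logprobs) / n)
```

For example, if every token was assigned probability 0.25, the perplexity is exactly 4.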

AutoAgents

Complex question answering in LLMs with enhanced reasoning and information-seeking capabilities.

Language: Python · License: MIT · Stars: 164 · Issues: 0

Large-Language-Model-Notebooks-Course

Practical course about Large Language Models.

Language: Jupyter Notebook · License: MIT · Stars: 911 · Issues: 0