yty3805595's starred repositories
Chinese-Mixtral
Chinese Mixtral mixture-of-experts large language models (Chinese Mixtral MoE LLMs)
grouped-query-attention-pytorch
(Unofficial) PyTorch implementation of grouped-query attention (GQA) from "GQA: Training Generalized Multi-Query Transformer Models from Multi-Head Checkpoints" (https://arxiv.org/pdf/2305.13245.pdf)
ring-flash-attention
Ring attention implementation built on FlashAttention
Chinese-LLaMA-Alpaca-2
Phase-2 project for Chinese LLaMA-2 & Alpaca-2 large models, plus 64K extra-long-context models (Chinese LLaMA-2 & Alpaca-2 LLMs with 64K long context models)
local-attention
An implementation of local windowed attention for language modeling
alignment-handbook
Robust recipes to align language models with human and AI preferences
BCEmbedding
NetEase Youdao's open-source embedding and reranker models for RAG products.
st-moe-pytorch
Implementation of ST-MoE, the latest incarnation of MoE after years of research at Brain, in PyTorch
PowerInfer
High-speed Large Language Model Serving on PCs with Consumer-grade GPUs
FlagEmbedding
Retrieval and Retrieval-augmented LLMs
export_llama_to_onnx
Export LLaMA models to ONNX
AutoAgents
Complex question answering in LLMs with enhanced reasoning and information-seeking capabilities.
Large-Language-Model-Notebooks-Course
Practical course about Large Language Models.