There are 168 repositories under the language-model topic.
🤗 Transformers: the model-definition framework for state-of-the-art machine learning models across text, vision, audio, and multimodal domains, for both inference and training.
21 Lessons, Get Started Building with Generative AI
Implement a ChatGPT-like LLM in PyTorch from scratch, step by step
The official gpt4free repository | a collection of various powerful language models | o4, o3 and deepseek r1, gpt-4.1, gemini 2.5
🐙 Guides, papers, lectures, notebooks and resources for prompt engineering
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
OpenAssistant is a chat-based assistant that understands tasks, can interact with third-party systems, and can retrieve information dynamically to do so.
Code and documentation to train Stanford's Alpaca models, and generate the data.
AI orchestration framework to build customizable, production-ready LLM applications. Connect components (models, vector DBs, file converters) to pipelines or agents that can interact with your data. With advanced retrieval methods, it's best suited for building RAG, question answering, semantic search or conversational agent chatbots.
📦 Repomix is a powerful tool that packs your entire repository into a single, AI-friendly file. Perfect for when you need to feed your codebase to Large Language Models (LLMs) or other AI tools like Claude, ChatGPT, DeepSeek, Perplexity, Gemini, Gemma, Llama, Grok, and more.
The AI Toolkit for TypeScript. From the creators of Next.js, the AI SDK is a free open-source library for building AI-powered applications and agents
DocsGPT is an open-source genAI tool that helps users get reliable answers from knowledge sources while avoiding hallucinations. It enables private and reliable information retrieval, with tooling and agentic system capabilities built in.
RWKV (pronounced RwaKuv) is an RNN with great LLM performance, which can also be directly trained like a GPT transformer (parallelizable). We are at RWKV-7 "Goose". It combines the best of RNNs and transformers: great performance, linear time, constant space (no kv-cache), fast training, infinite ctx_len, and free sentence embedding.
An open source implementation of CLIP.
Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models"
💡 All-in-one open-source embeddings database for semantic search, LLM orchestration and language model workflows
💥 Fast State-of-the-Art Tokenizers optimized for Research and Production
A PyTorch-based Speech Toolkit
Large Scale Chinese Corpus for NLP
ChatRWKV is like ChatGPT but powered by the RWKV (100% RNN) language model, and it is open source.
A framework for few-shot evaluation of language models.
An Extensible Toolkit for Finetuning and Inference of Large Foundation Models. Large Models for All.
An implementation of model parallel GPT-2 and GPT-3-style models using the mesh-tensorflow library.
An implementation of model parallel autoregressive transformers on GPUs, based on the Megatron and DeepSpeed libraries
Open Source Neural Machine Translation and (Large) Language Models in PyTorch
PyTorch implementation of Google AI's 2018 BERT
GPT 3.5/4 with a Chat Web UI. No API key required.
An Open-source Framework for Data-centric, Self-evolving Autonomous Language Agents
On-device AI inference in minutes, now for MLX, GGUF, and Qualcomm NPU, with Android and iOS coming soon.
Aligning pretrained language models with instruction data generated by themselves.
[COLM 2024] OpenAgents: An Open Platform for Language Agents in the Wild
Chinese Language Understanding Evaluation Benchmark: datasets, baselines, pre-trained models, corpus and leaderboard