There are 24 repositories under the instruction-tuning topic.
Unify Efficient Fine-Tuning of 100+ LLMs
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
:sparkles::sparkles: Latest Papers and Datasets on Multimodal Large Language Models, and Their Evaluation.
Instruction Tuning with GPT-4
Aligning pretrained language models with instruction data generated by the models themselves (see the Self-Instruct sketch after this list).
We unified the interfaces of instruction-tuning data (e.g., CoT data), multiple LLMs, and parameter-efficient methods (e.g., LoRA, P-Tuning) for easy use, building a fine-tuning platform that makes large models easy for researchers to pick up and use (see the LoRA sketch after this list). We welcome open-source enthusiasts to open any meaningful PR on this repo and integrate as many LLM-related technologies as possible.
Video-LLaVA: Learning United Visual Representation by Alignment Before Projection
A summary of Prompt & LLM papers, open-source data & models, and AIGC applications.
InternLM-XComposer2 is a groundbreaking vision-language large model (VLLM) excelling in free-form text-image composition and comprehension.
A one-stop data processing system to make data higher-quality, juicier, and more digestible for LLMs! 🍎 🍋 🌽 ➡️ ➡️ 🍸 🍹 🍷
A collection of open-source datasets to train instruction-following LLMs (ChatGPT, LLaMA, Alpaca); see the Alpaca-format sketch after this list.
Video Foundation Models & Data for Multimodal Understanding
DataDreamer: Prompt. Generate Synthetic Data. Train & Align Models. 🤖💤
Crosslingual Generalization through Multitask Finetuning
DialogStudio: Towards Richest and Most Diverse Unified Dataset Collection and Instruction-Aware Models for Conversational AI
DISC-FinLLM, a Chinese financial large language model (LLM) designed to provide users with professional, intelligent, and comprehensive financial consulting services in financial scenarios.
Generative Representational Instruction Tuning
Papers and Datasets on Instruction Tuning and Following. ✨✨✨
MindSpore online courses: Step into LLM
Research Trends in LLM-guided Multimodal Learning.
🐳 Aurora is a Chinese-version MoE model: further work based on Mixtral-8x7B that activates the model's Chinese open-domain chat capability.
A curated list of awesome instruction tuning datasets, models, papers and repositories.
Preprint: LESS: Selecting Influential Data for Targeted Instruction Tuning
All available datasets for Instruction Tuning of Large Language Models
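A few of the techniques named in the entries above are concrete enough to sketch. First, the Self-Instruct idea referenced earlier: a model bootstraps its own instruction data from a small seed pool. The sketch below is a minimal illustration of that loop; `llm` is a hypothetical text-completion callable, not the API of any repository listed here.

```python
# Sketch of the Self-Instruct bootstrapping loop: seed tasks prompt a
# model to invent new instructions, and its answers become training data.
# `llm` is a hypothetical text-completion callable (prompt -> str).
import random

def self_instruct(llm, seed_instructions, rounds=3):
    pool = list(seed_instructions)
    data = []
    for _ in range(rounds):
        # Show a few in-context examples and ask for a new instruction.
        examples = "\n".join(random.sample(pool, min(3, len(pool))))
        new_instr = llm(
            f"Here are some task instructions:\n{examples}\n"
            "Write one new, different task instruction:"
        )
        answer = llm(new_instr)  # the model answers its own instruction
        pool.append(new_instr)   # grow the instruction pool for later rounds
        data.append({"instruction": new_instr, "output": answer})
    return data
```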
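Second, the parameter-efficient methods (e.g., LoRA) that several repositories above unify. Below is a minimal LoRA setup using Hugging Face's `peft` library, one common implementation of the technique; GPT-2 stands in as a small base model, and this generic sketch is not the specific API of any repo listed.

```python
# Minimal LoRA fine-tuning setup with Hugging Face's `peft` library.
# Only the low-rank adapter weights are trained; the base model is frozen.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, TaskType, get_peft_model

base = AutoModelForCausalLM.from_pretrained("gpt2")  # small stand-in model

lora_cfg = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=8,                        # rank of the low-rank update matrices
    lora_alpha=16,              # scaling factor applied to the update
    lora_dropout=0.05,
    target_modules=["c_attn"],  # GPT-2's fused attention projection
)

model = get_peft_model(base, lora_cfg)
model.print_trainable_parameters()  # only a small fraction is trainable
# From here, `model` trains like any causal LM (Trainer or a custom loop),
# with gradients flowing only into the LoRA adapter weights.
```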
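Finally, the record format used by many of the instruction datasets collected above. The field names follow the widely used Alpaca convention; the `render_prompt` helper is illustrative, not any particular repository's API.

```python
# One Alpaca-style instruction-tuning record and the common prompt layout.
record = {
    "instruction": "Classify the sentiment of the sentence.",
    "input": "The movie was a delightful surprise.",
    "output": "positive",
}

def render_prompt(rec: dict) -> str:
    """Render one record into the common Alpaca prompt layout."""
    if rec["input"]:
        return (
            "Below is an instruction that describes a task, paired with an "
            "input that provides further context. Write a response that "
            "appropriately completes the request.\n\n"
            f"### Instruction:\n{rec['instruction']}\n\n"
            f"### Input:\n{rec['input']}\n\n"
            "### Response:\n"
        )
    return (
        "Below is an instruction that describes a task. Write a response "
        "that appropriately completes the request.\n\n"
        f"### Instruction:\n{rec['instruction']}\n\n"
        "### Response:\n"
    )

# During training, the rendered prompt is concatenated with the target output.
print(render_prompt(record) + record["output"])
```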