There are 22 repositories under the pretraining topic.
Llama Chinese community. The Llama3 online demo and fine-tuned models are now available, with the latest Llama3 learning resources compiled in real time. All code has been updated for Llama3. Building the best Chinese Llama large model, fully open source and commercially usable.
Papers about pretraining and self-supervised learning on Graph Neural Networks (GNN).
Recent Advances in Vision and Language PreTrained Models (VL-PTMs)
Official PyTorch implementation of the paper "ImageNet-21K Pretraining for the Masses" (NeurIPS 2021)
Official Repository for the Uni-Mol Series Methods
【ICLR 2024🔥】 Extending Video-Language Pretraining to N-modality by Language-based Semantic Alignment
Best practice for training LLaMA models in Megatron-LM
PITI: Pretraining is All You Need for Image-to-Image Translation
PaddlePaddle large-model development suite, providing a full-pipeline development toolchain for large language models, cross-modal large models, biocomputing large models, and other domains.
[ACL 2022] LinkBERT: A Knowledgeable Language Model 😎 Pretrained with Document Links
End-to-End recipes for pre-training and fine-tuning BERT using Azure Machine Learning Service
BigDetection: A Large-scale Benchmark for Improved Object Detector Pre-training
Paper List for Recommend-system PreTrained Models
[NeurIPS 2022] DRAGON 🐲: Deep Bidirectional Language-Knowledge Graph Pretraining
Recent Advances in Vision and Language Pre-training (VLP)
Personal project: MPP-Qwen14B (Multimodal Pipeline Parallel-Qwen14B). Don't let poverty limit your imagination! Train your own 14B LLaVA-like MLLM on a 24GB RTX 3090/4090.
OpenAI GPT-2 pre-training and sequence prediction implementation in TensorFlow 2.0
Research code for EMNLP 2020 paper "HERO: Hierarchical Encoder for Video+Language Omni-representation Pre-training"
Saprot: Protein Language Model with Structural Alphabet
Universal User Representation Pre-training for Cross-domain Recommendation and User Profiling
Collection of training data management explorations for large language models