There are 28 repositories under the pre-training topic.
The official GitHub page for the survey paper "A Survey of Large Language Models".
Data processing for and with foundation models! 🍎 🍋 🌽 ➡️ ➡️ 🍸 🍹 🍷
Papers about pretraining and self-supervised learning on Graph Neural Networks (GNNs).
Awesome resources for in-context learning and prompt engineering: mastery of LLMs such as ChatGPT, GPT-3, and FlanT5, with continuously updated, cutting-edge content. - Professor Yu Liu
Code for the TKDE paper "Self-supervised Learning on Graphs: Contrastive, Generative, or Predictive".
Unified Training of Universal Time Series Forecasting Transformers
Awesome list for research on CLIP (Contrastive Language-Image Pre-Training).
A professionally curated list of Large (Language) Models and Foundation Models (LLM, LM, FM) for Time Series, Spatiotemporal, and Event Data.
Tencent Pre-training framework in PyTorch & Pre-trained Model Zoo
"Pre-training of Deep Bidirectional Transformers for Language Understanding": pre-training TextCNN.
Research code for the ECCV 2020 paper "UNITER: UNiversal Image-TExt Representation Learning".
Code for the ICLR 2020 paper "VL-BERT: Pre-training of Generic Visual-Linguistic Representations".
Large Language Model-enhanced Recommender System Papers
[ICLR 2024] Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning
[NeurIPS 2020] "Graph Contrastive Learning with Augmentations" by Yuning You, Tianlong Chen, Yongduo Sui, Ting Chen, Zhangyang Wang, Yang Shen
The repository of ET-BERT, a network traffic classification model for encrypted traffic. The work was accepted as a paper at The Web Conference (WWW) 2022.
Code for our SIGKDD'22 paper "Pre-training-Enhanced Spatial-Temporal Graph Neural Network for Multivariate Time Series Forecasting".
Conceptual 12M is a dataset containing (image-URL, caption) pairs collected for vision-and-language pre-training.
Paper List of Pre-trained Foundation Recommender Models
The official repo for [NeurIPS'23] "SAMRS: Scaling-up Remote Sensing Segmentation Dataset with Segment Anything Model"
[Survey] Masked Modeling for Self-supervised Representation Learning on Vision and Beyond (https://arxiv.org/abs/2401.00897)
A collection of explorations in training data management for large language models.
Probing the representations of Vision Transformers.
[CVPR 2024 Highlight] Visual Point Cloud Forecasting
[MIR-2023-Survey] A continuously updated paper list for multi-modal pre-trained big models
GearNet and Geometric Pretraining Methods for Protein Structure Representation Learning, ICLR'2023 (https://arxiv.org/abs/2203.06125)
[CVPR2023] All in One: Exploring Unified Video-Language Pre-training
💐Kaleido-BERT: Vision-Language Pre-training on Fashion Domain
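Several of the repositories above (the CLIP resource list, GraphCL, and the vision-language pre-training entries such as UNITER and Kaleido-BERT) revolve around contrastive pre-training. As a minimal sketch of the shared objective, here is a symmetric InfoNCE loss of the kind used in CLIP-style training, assuming paired embeddings from two encoders are already computed; all names and shapes here are illustrative and not taken from any listed repository:

```python
import torch
import torch.nn.functional as F

def clip_style_loss(image_emb: torch.Tensor,
                    text_emb: torch.Tensor,
                    temperature: float = 0.07) -> torch.Tensor:
    """Symmetric InfoNCE loss over a batch of paired embeddings.

    image_emb, text_emb: (batch, dim) tensors; matching rows are
    positive pairs, every other row in the batch is a negative.
    """
    # L2-normalize so the dot product is cosine similarity.
    image_emb = F.normalize(image_emb, dim=-1)
    text_emb = F.normalize(text_emb, dim=-1)

    # (batch, batch) similarity matrix, scaled by temperature.
    logits = image_emb @ text_emb.t() / temperature

    # The i-th image matches the i-th text: targets are the diagonal.
    targets = torch.arange(logits.size(0), device=logits.device)

    # Cross-entropy in both directions (image->text and text->image).
    loss_i2t = F.cross_entropy(logits, targets)
    loss_t2i = F.cross_entropy(logits.t(), targets)
    return (loss_i2t + loss_t2i) / 2
```

The same batch-wise "match the diagonal" structure carries over to graph contrastive learning (as in GraphCL), where the two embeddings come from two augmented views of the same graph rather than from an image and a caption.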