Repositories under the masked-image-modeling topic:
OpenMMLab Pre-training Toolbox and Benchmark
OpenMMLab Self-Supervised Learning Toolbox and Benchmark
[ICLR'23 Spotlight🔥] The first successful BERT/MAE-style pretraining on any convolutional network; PyTorch impl. of "Designing BERT for Convolutional Networks: Sparse and Hierarchical Masked Modeling"
This is an official implementation for "SimMIM: A Simple Framework for Masked Image Modeling".
CAIRI Supervised, Semi- and Self-Supervised Visual Representation Learning Toolbox and Benchmark
[Survey] Masked Modeling for Self-supervised Representation Learning on Vision and Beyond (https://arxiv.org/abs/2401.00897)
Official Codes for "Uniform Masking: Enabling MAE Pre-training for Pyramid-based Vision Transformers with Locality"
This is a PyTorch implementation of "Context AutoEncoder for Self-Supervised Representation Learning"
[NeurIPS2022] Official implementation of the paper 'Green Hierarchical Vision Transformer for Masked Image Modeling'.
MixMIM: Mixed and Masked Image Modeling for Efficient Visual Representation Learning
Official PyTorch implementation of the MOOD series: (1) MOODv1: Rethinking Out-of-Distribution Detection: Masked Image Modeling Is All You Need. (2) MOODv2: Masked Image Modeling for Out-of-Distribution Detection.
PyTorch code for MUST
[CVPR'23 & TPAMI'25] Hard Patches Mining for Masked Image Modeling & Bootstrap Masked Visual Modeling via Hard Patch Mining
A TensorFlow 2.x implementation of Masked Autoencoders Are Scalable Vision Learners
[ICLR2024] Exploring Target Representations for Masked Autoencoders
[NIPS'23] Official Code of the paper "Cross-Scale MAE: A Tale of Multi-Scale Exploitation in Remote Sensing"
PyTorch reimplementation of "A simple, efficient and scalable contrastive masked autoencoder for learning visual representations".
[MedIA 2025] MambaMIM: Pre-training Mamba with State Space Token Interpolation and its Application to Medical Image Segmentation
Official codebase for "Unveiling the Power of Audio-Visual Early Fusion Transformers with Dense Interactions through Masked Modeling".
Recent Advances in Vision-Language Pre-training!
[ICML 2023] Architecture-Agnostic Masked Image Modeling -- From ViT back to CNN
Code to reproduce experiments from the paper "Continual Pre-Training Mitigates Forgetting in Language and Vision" https://arxiv.org/abs/2205.09357
[MedIA 2025] Hi-End-MAE: Hierarchical encoder-driven masked autoencoders are stronger vision learners for medical image segmentation
[ECCV 2022] Official PyTorch implementation of "mc-BEiT: Multi-choice Discretization for Image BERT Pre-training"
[MICCAI 2024] HySparK: Hybrid Sparse Masking for Large Scale Medical Image Pre-Training
PyTorch implementation for "Training and Inference on Any-Order Autoregressive Models the Right Way", NeurIPS 2022 Oral, TPM 2023 Best Paper Honorable Mention
[ICML 2024] Matrix Variational Masked Autoencoder (M-MAE) for ICML paper "Information Flow in Self-Supervised Learning" (https://arxiv.org/abs/2309.17281)
PyTorch implementation of an energy transformer - an energy-based recurrent variant of the transformer.
[WACV'25] Official implementation of "PK-YOLO: Pretrained Knowledge Guided YOLO for Brain Tumor Detection in Multiplane MRI Slices".
[ECCV 2024] Official PyTorch implementation of LUT "Learning with Unmasked Tokens Drives Stronger Vision Learners"
Masked Autoencoder Pretraining on 3D Brain MRI
[TMI'24] "Masked Deformation Modeling for Volumetric Brain MRI Self-supervised Pre-training".
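The common core of the MAE/SimMIM-style repositories above is random patch masking: split an image into patch tokens, hide a large fraction, and feed only the visible patches to the encoder. A minimal NumPy sketch of that masking step (the function name and shapes are illustrative, not from any of the listed codebases):

```python
import numpy as np

def random_mask_patches(patches, mask_ratio=0.75, rng=None):
    """Randomly mask a fraction of patch tokens, MAE-style (illustrative sketch).

    patches: (N, D) array of patch embeddings for one image.
    Returns the visible patches, a binary mask in original patch order
    (1 = masked, 0 = visible), and the restore indices.
    """
    if rng is None:
        rng = np.random.default_rng(0)
    N, D = patches.shape
    n_keep = int(N * (1 - mask_ratio))

    ids_shuffle = rng.permutation(N)        # random ordering of patches
    ids_restore = np.argsort(ids_shuffle)   # inverse permutation

    visible = patches[ids_shuffle[:n_keep]]  # only these enter the encoder

    mask = np.ones(N)
    mask[:n_keep] = 0                        # first n_keep in shuffled order are kept
    mask = mask[ids_restore]                 # back to original patch order
    return visible, mask, ids_restore

# Toy usage: 16 patches with 8-dim embeddings, 75% masked.
x = np.random.default_rng(1).standard_normal((16, 8))
vis, mask, ids = random_mask_patches(x, mask_ratio=0.75)
print(vis.shape)        # (4, 8): 25% of patches survive
print(int(mask.sum()))  # 12 patches masked
```

The decoder later reconstructs pixel values (or discrete tokens, as in the BEiT-style entries) at the masked positions; `ids_restore` lets it reinsert mask tokens in the original patch order.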