sunsmarterjie / beyond_masking

Beyond Masking: Demystifying Token-Based Pre-Training for Vision Transformers

beyond masking

The code is coming soon.

Figure 1: Pipeline of token-based pre-training.

Figure 2: The visualization of the proposed 5 tasks.
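
To make the five task names concrete, here is a minimal NumPy sketch of one possible degradation per task. It assumes that each task feeds a degraded image to the model and asks it to recover the original; the function names, parameters (`factor`, `amplitude`, `period`, `k`), and the specific operations are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def zoom_in(img, factor=2):
    """Crop the central 1/factor region and upsample it back (nearest neighbour)."""
    h, w = img.shape[:2]
    ch, cw = h // factor, w // factor
    top, left = (h - ch) // 2, (w - cw) // 2
    crop = img[top:top + ch, left:left + cw]
    rows = np.arange(h) * ch // h   # source row for each output row
    cols = np.arange(w) * cw // w   # source column for each output column
    return crop[rows][:, cols]

def zoom_out(img, factor=2):
    """Subsample the image and paste it at the centre of a zero canvas."""
    h, w = img.shape[:2]
    small = img[::factor, ::factor]
    sh, sw = small.shape[:2]
    out = np.zeros_like(img)
    top, left = (h - sh) // 2, (w - sw) // 2
    out[top:top + sh, left:left + sw] = small
    return out

def distort(img, amplitude=2.0, period=16.0):
    """Shift each row horizontally by a sinusoidal offset (wrap-around warp)."""
    h, w = img.shape[:2]
    shifts = np.round(amplitude * np.sin(2 * np.pi * np.arange(h) / period)).astype(int)
    cols = (np.arange(w)[None, :] + shifts[:, None]) % w
    rows = np.arange(h)[:, None]
    return img[rows, cols]

def blur(img, k=3):
    """Box blur with an odd k x k window and edge padding."""
    pad = k // 2
    padded = np.pad(img.astype(np.float32), ((pad, pad), (pad, pad), (0, 0)), mode="edge")
    h, w = img.shape[:2]
    out = np.zeros(img.shape, dtype=np.float32)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + h, dx:dx + w]
    return out / (k * k)

def decolorize(img):
    """Replace every channel with the per-pixel channel mean."""
    gray = img.astype(np.float32).mean(axis=2, keepdims=True)
    return np.repeat(gray, img.shape[2], axis=2)

# Hypothetical registry keyed by the task names used in this README.
DEGRADATIONS = {
    "zoomed-in": zoom_in,
    "zoomed-out": zoom_out,
    "distorted": distort,
    "blurred": blur,
    "de-colorized": decolorize,
}

rng = np.random.default_rng(0)
image = rng.random((32, 32, 3)).astype(np.float32)  # stand-in for a real image
degraded = {name: fn(image) for name, fn in DEGRADATIONS.items()}
```

Under the stated assumption, each `(degraded[name], image)` pair would serve as one input and target for the corresponding task.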

main results

All models are pre-trained for 300 epochs, using ViT-Base by default.

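|          | zoomed-in | zoomed-out | distorted | blurred | de-colorized |
|----------|-----------|------------|-----------|---------|--------------|
| finetune | 82.7      | 82.5       | 82.1      | 81.8    | 81.4         |

|          | zoomed-in (a) | mask (m) | (a)+(m) |
|----------|---------------|----------|---------|
| finetune | 82.7          | 82.9     | 83.2    |

We note that the integrated version does not require extra computational cost.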

Efficiency

Figure 3: Efficiency of the integrated task.
