This is the NUS CS5260 course project: https://github.com/KoalaYuFeng/vit_train_benchmark_with_Colossalai. The repository's README.md file provides the following information.
In this repository, we load pretrained Vision Transformer (ViT) weights from HuggingFace and adapt the ViT training code to ColossalAI through its Booster API, which is configured with a chosen plugin. Each plugin corresponds to a specific training strategy. This example supports the following plugins:
- `TorchDDPPlugin` (DDP)
- `LowLevelZeroPlugin` (Zero1/Zero2)
- `GeminiPlugin` (Gemini)
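As a rough illustration of how a plugin name could be mapped onto the Booster API, here is a minimal sketch. The helper names `booster_kwargs_for` and `build_booster` are hypothetical, the plugin names mirror the labels used in the benchmark results, and the constructor arguments are assumptions rather than the settings used in this repository.

```python
SUPPORTED_PLUGINS = ("torch_ddp", "torch_ddp_fp16", "low_level_zero", "gemini")


def booster_kwargs_for(plugin_name: str) -> dict:
    """Extra keyword arguments for Booster (hypothetical helper)."""
    if plugin_name not in SUPPORTED_PLUGINS:
        raise ValueError(f"Unsupported plugin: {plugin_name!r}")
    # Assumption: the fp16 DDP variant is enabled through Booster's
    # mixed_precision option rather than through a separate plugin class.
    return {"mixed_precision": "fp16"} if plugin_name == "torch_ddp_fp16" else {}


def build_booster(plugin_name: str):
    """Create a Booster for the chosen training strategy (hypothetical helper)."""
    kwargs = booster_kwargs_for(plugin_name)  # validates the name first

    # Imported lazily so the name check above works without ColossalAI installed.
    from colossalai.booster import Booster
    from colossalai.booster.plugin import (
        GeminiPlugin,
        LowLevelZeroPlugin,
        TorchDDPPlugin,
    )

    if plugin_name.startswith("torch_ddp"):
        plugin = TorchDDPPlugin()
    elif plugin_name == "low_level_zero":
        plugin = LowLevelZeroPlugin(stage=2)  # assumption: ZeRO stage 2
    else:
        plugin = GeminiPlugin()  # assumption: default Gemini settings
    return Booster(plugin=plugin, **kwargs)
```

In a training script started with a distributed launcher, the returned booster would typically wrap the model and optimizer through `booster.boost(...)` before the training loop begins.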
We use the Beans dataset from HuggingFace.
- First, make sure the installed PyTorch version matches your CUDA version. In my case, with CUDA 11.7, I installed `torch 1.13.0`.
- Install the dependencies listed in `requirements.txt` with the command: `pip install -r requirements.txt`
- Clone the ColossalAI repository from GitHub: `git clone --recursive https://github.com/hpcaitech/ColossalAI.git`
- Navigate to the example directory: `cd ColossalAI/examples/images/vit`
- Run the scripts:
  - To train ViT: `bash run_demo.sh`
  - To benchmark ViT: `bash run_benchmark.sh`
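For the PyTorch installation step in the list above, the sketch below shows one way to derive a CUDA-specific install command and to check the result afterwards. The function names are hypothetical, and the `+cuXYZ` wheel tag with the `download.pytorch.org/whl` index is an assumption about how the matching build is obtained; the original setup only states that `torch 1.13.0` was used with CUDA 11.7.

```python
def cuda_wheel_tag(cuda_version: str) -> str:
    """Turn a CUDA version such as '11.7' into a wheel tag such as 'cu117'."""
    major, minor = cuda_version.split(".")[:2]
    return f"cu{major}{minor}"


def torch_install_command(torch_version: str, cuda_version: str) -> str:
    """Build a pip command for a CUDA-specific PyTorch wheel (hypothetical helper)."""
    tag = cuda_wheel_tag(cuda_version)
    return (
        f"pip install torch=={torch_version}+{tag} "
        f"--extra-index-url https://download.pytorch.org/whl/{tag}"
    )


def installed_cuda_matches(expected_cuda: str) -> bool:
    """Check that the installed PyTorch build targets the expected CUDA version."""
    import torch  # imported lazily; requires PyTorch to be installed

    # torch.version.cuda is None for CPU-only builds.
    return torch.version.cuda == expected_cuda


if __name__ == "__main__":
    print(torch_install_command("1.13.0", "11.7"))
```

Calling `installed_cuda_matches("11.7")` after installation gives a quick sanity check before launching the scripts.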
The training run (`run_demo.sh`) reported the following results per epoch:

Epoch | Average Loss | Accuracy |
---|---|---|
1 | 1.1607 | 85.94% |
2 | 0.2364 | 97.66% |
3 | 0.2099 | 98.44% |
The benchmarking was conducted using different plugins and batch sizes. The results are summarized in the table below:
Plugin | Batch Size per GPU | Throughput (samples/sec) | Maximum Memory Usage per GPU |
---|---|---|---|
torch_ddp | 8 | 43.7168 | 1.80 GB |
torch_ddp_fp16 | 8 | 60.1283 | 1.91 GB |
low_level_zero | 8 | 47.1534 | 1.65 GB |
gemini | 8 | 28.0425 | 663.17 MB |
torch_ddp | 32 | 66.7630 | 2.34 GB |
torch_ddp_fp16 | 32 | 153.6898 | 2.25 GB |
low_level_zero | 32 | 143.5798 | 1.66 GB |
gemini | 32 | 110.6582 | 663.17 MB |
For more detailed configurations and complete benchmark results, please refer to the log file in the repository.
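To make the throughput column easier to interpret, here is a small sketch of how a samples/sec figure is commonly derived and how the plugins compare at batch size 32. The formula (global batch size times steps, divided by elapsed time) is an assumption about the benchmark script, and the `world_size`, `steps`, and `elapsed_seconds` values in the usage line are hypothetical. The throughput numbers in `BATCH_32` are copied from the benchmark results above.

```python
def throughput(batch_size_per_gpu: int, world_size: int, steps: int,
               elapsed_seconds: float) -> float:
    """Samples processed per second across all GPUs (assumed definition)."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return batch_size_per_gpu * world_size * steps / elapsed_seconds


# Throughput (samples/sec) at batch size 32 per GPU, from the results table.
BATCH_32 = {
    "torch_ddp": 66.7630,
    "torch_ddp_fp16": 153.6898,
    "low_level_zero": 143.5798,
    "gemini": 110.6582,
}


def relative_to(baseline: str, results: dict) -> dict:
    """Express each plugin's throughput as a multiple of the baseline plugin."""
    base = results[baseline]
    return {name: value / base for name, value in results.items()}


if __name__ == "__main__":
    # Hypothetical run: 2 GPUs, 20 steps, 8 seconds of wall-clock time.
    print(throughput(32, world_size=2, steps=20, elapsed_seconds=8.0))
    for name, ratio in relative_to("torch_ddp", BATCH_32).items():
        print(f"{name}: {ratio:.2f}x")
```

Measured this way, `torch_ddp_fp16` reaches roughly 2.3 times the `torch_ddp` throughput at batch size 32, while `gemini` keeps the lowest memory footprint in the table.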