hengck23 / solution-predict-ai-model-runtime

Kaggle Competition Solution

Google - Fast or Slow? Predict AI Model Runtime (6th place)

https://www.kaggle.com/competitions/predict-ai-model-runtime/

For discussion, please refer to:
https://www.kaggle.com/competitions/predict-ai-model-runtime/discussion/456084

1. Hardware

  • GPU: 2x Nvidia Quadro RTX 8000, each with VRAM 48 GB
  • CPU: Intel® Xeon(R) Gold 6240 CPU @ 2.60GHz, 72 cores
  • Memory: 376 GB RAM

2. OS

  • Ubuntu 18.04.5 LTS

3. Set Up Environment

  • Install Python >=3.10.9
  • Install the packages listed in requirements.txt into the Python environment
  • Set up the directory structure as shown below.
└── solution
    ├── src
    ├── results
    ├── data
    │   └── predict-ai-model-runtime
    │       ├── sample_submission.csv
    │       └── npz_all
    │           └── npz
    │               ├── layout
    │               │   ├── nlp
    │               │   │   ├── default : train/valid/test
    │               │   │   └── random : train/valid/test
    │               │   └── xla
    │               │       ├── default : train/valid/test
    │               │       └── random : train/valid/test
    │               └── tile
    │                   └── xla : train/valid/test
    ├── LICENSE
    └── README.md
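
Before training, it can help to confirm that the data folders in the tree above are in place. The following is a minimal sketch, not part of the repository: the function name `missing_data_dirs` is hypothetical, and it assumes that `train`, `valid` and `test` are sub-folders of each collection.

```python
from pathlib import Path

# Collection folders taken from the directory tree above.
EXPECTED_DIRS = [
    "layout/nlp/default",
    "layout/nlp/random",
    "layout/xla/default",
    "layout/xla/random",
    "tile/xla",
]
# Assumption: each collection holds one sub-folder per split.
SPLITS = ["train", "valid", "test"]


def missing_data_dirs(solution_root):
    """Return the expected data folders that do not exist under solution_root."""
    npz_root = (
        Path(solution_root) / "data" / "predict-ai-model-runtime" / "npz_all" / "npz"
    )
    missing = []
    for rel in EXPECTED_DIRS:
        for split in SPLITS:
            folder = npz_root / rel / split
            if not folder.is_dir():
                missing.append(str(folder))
    return missing
```

Calling `missing_data_dirs(".")` from inside the `solution` folder returns an empty list when every folder is present.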

4. Training the model

Warning: training output is written to the "solution/results" folder and overwrites any existing files there.

Run the following Python scripts to train the models and write the model files:

>> python src/1a_run_res_graphsage4_layout.py
output model:
- results/final-01/model/4x-graphsage-pair2/layout/nlp-default/checkpoint/swa.pth
- results/final-01/model/4x-graphsage-pair2/layout/nlp-random/checkpoint/swa.pth
- results/final-01/model/4x-graphsage-pair2/layout/xla-default/checkpoint/swa.pth
- results/final-01/model/4x-graphsage-pair2/layout/xla-random/checkpoint/swa.pth

>> python src/1b_run_res_gin4_layout.py
output model:
- results/final-01/model/4x-gin-pair2/layout/xla-default/checkpoint/swa.pth

>> python src/2_run_res_gatconv4_tile.py
output model:
- results/final-01/model/4x-gatconv-listmle/tile/xla/checkpoint/00010013.pth
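
After the three scripts finish, a quick check that every checkpoint listed above was written can save a failed submission run later. This is a small sketch under the assumption that it is run from the `solution` folder; `missing_checkpoints` is a hypothetical helper, not a function from `src`.

```python
from pathlib import Path

# Checkpoint paths copied from the "output model" lists above,
# relative to the solution root.
EXPECTED_CHECKPOINTS = [
    "results/final-01/model/4x-graphsage-pair2/layout/nlp-default/checkpoint/swa.pth",
    "results/final-01/model/4x-graphsage-pair2/layout/nlp-random/checkpoint/swa.pth",
    "results/final-01/model/4x-graphsage-pair2/layout/xla-default/checkpoint/swa.pth",
    "results/final-01/model/4x-graphsage-pair2/layout/xla-random/checkpoint/swa.pth",
    "results/final-01/model/4x-gin-pair2/layout/xla-default/checkpoint/swa.pth",
    "results/final-01/model/4x-gatconv-listmle/tile/xla/checkpoint/00010013.pth",
]


def missing_checkpoints(solution_root="."):
    """Return the expected checkpoint paths that are not files under solution_root."""
    root = Path(solution_root)
    return [p for p in EXPECTED_CHECKPOINTS if not (root / p).is_file()]
```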

Local validation results are also output:

  • 4x-graphsage-pair2

                 opa      kendall_tau
    nlp-default  0.76969  0.53938
    nlp-random   0.96327  0.92654
    xla-default  0.72754  0.45508
    xla-random   0.83563  0.67127

  • 4x-gin-pair2

                 opa      kendall_tau
    xla-default  0.72978  0.45957

  • 2_run_res_gatconv4_tile

         slowndown1  slowndown5  slowndown10
    xla  0.89052     0.97462     0.98351
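
For readers unfamiliar with these columns, the sketch below shows one plausible way to compute them. It is not the code in `src`. It assumes that `opa` means ordered pair accuracy and that the `slowndownK` columns follow the competition's tile metric, 2 - (best runtime among the top-K predictions) / (best runtime overall). In the layout results above, `kendall_tau` equals 2 × `opa` - 1 up to rounding, which is the relation these two definitions give when no runtimes or predictions are tied.

```python
from itertools import combinations


def ordered_pair_accuracy(predicted, runtime):
    """Fraction of pairs with different runtimes that the prediction orders correctly."""
    correct = 0
    total = 0
    for i, j in combinations(range(len(runtime)), 2):
        if runtime[i] == runtime[j]:
            continue  # pairs with tied runtimes are skipped
        total += 1
        if (predicted[i] - predicted[j]) * (runtime[i] - runtime[j]) > 0:
            correct += 1
    return correct / total


def kendall_tau(predicted, runtime):
    """Kendall tau-a: (concordant - discordant) / number of pairs."""
    n = len(runtime)
    concordant = 0
    discordant = 0
    for i, j in combinations(range(n), 2):
        s = (predicted[i] - predicted[j]) * (runtime[i] - runtime[j])
        if s > 0:
            concordant += 1
        elif s < 0:
            discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)


def topk_slowdown_score(predicted, runtime, k):
    """Assumed tile metric: 2 - best runtime in the predicted top-k / best runtime overall."""
    # Lower predicted value means the configuration is predicted to be faster.
    ranked = sorted(range(len(runtime)), key=lambda i: predicted[i])
    best_in_topk = min(runtime[i] for i in ranked[:k])
    return 2.0 - best_in_topk / min(runtime)
```

A score of 1.0 from `topk_slowdown_score` means the truly fastest configuration is among the K predicted fastest.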

5. Submission csv

Please run the following script:

>> python src/3_run_make_kaggle_submission.py
output file:
- results/final-01/submission_06.csv

                     public LB  private LB
  submission_06.csv  0.69424    0.70549
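
Before uploading, the generated file can be compared against `sample_submission.csv` from the data folder. The sketch below is an illustration only: `submission_id_mismatch` is a hypothetical helper, and the column name `ID` is an assumption about the sample file's header, so adjust `id_column` if the header differs.

```python
import csv


def submission_id_mismatch(submission_csv, sample_csv, id_column="ID"):
    """Return (ids missing from the submission, ids not in the sample file)."""

    def read_ids(path):
        # Collect the values of the id column as a set.
        with open(path, newline="") as f:
            return {row[id_column] for row in csv.DictReader(f)}

    submitted = read_ids(submission_csv)
    expected = read_ids(sample_csv)
    return sorted(expected - submitted), sorted(submitted - expected)
```

Both returned lists are empty when the submission covers exactly the ids in the sample file.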

6. Reference trained models and validation results

Authors

License

  • This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgement

"We extend our thanks to HP for providing the Z8-G4 Data Science Workstation, which empowered our deep learning experiments. The high computational power and large GPU memory enabled us to design our models swiftly."
