[BUG] UserWarning: Grad strides do not match bucket view strides. This may indicate grad was not created according to the gradient layout contract, or that the param's strides changed since DDP was constructed. This is not an error, but may impair performance.
TheDarkKnight-21th opened this issue
Describe the bug
When I run the "distributed_train.sh" script, I get the following warning: "UserWarning: Grad strides do not match bucket view strides. This may indicate grad was not created according to the gradient layout contract, or that the param's strides changed since DDP was constructed. This is not an error, but may impair performance."
What does this warning mean?
To Reproduce
Steps to reproduce the behavior:
- model = convnext.fb_in1k
- dataset = imagenet21k_winter (number of classes: 19167)
Expected behavior
A clear and concise description of what you expected to happen.
Screenshots
![image](https://private-user-images.githubusercontent.com/85755635/322349119-10235377-4464-4518-8860-1c25f7a01cca.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MTgzODA2NjgsIm5iZiI6MTcxODM4MDM2OCwicGF0aCI6Ii84NTc1NTYzNS8zMjIzNDkxMTktMTAyMzUzNzctNDQ2NC00NTE4LTg4NjAtMWMyNWY3YTAxY2NhLnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNDA2MTQlMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjQwNjE0VDE1NTI0OFomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPTY2OTMzNjdjYjkxY2NhYmUwMmJkZTNhMTFkZTNjNzcyMTRhODdmMDMxYWE1Njc5MjI0ZDhlYmFlZGI2NjkxZGUmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0JmFjdG9yX2lkPTAma2V5X2lkPTAmcmVwb19pZD0wIn0.Gs1zGsa1Uqw2CD7Bgsq4DVZkC5ko_EaTUL1u8WFrp2s)
Desktop (please complete the following information):
- OS: Ubuntu 22.04.3 LTS
- This repository version [e.g. pip 0.3.1 or commit ref]
- PyTorch version w/ CUDA/cuDNN [e.g. from `conda list`, 1.7.0 py3.8_cuda11.0.221_cudnn8.0.3_0]
Additional context
Add any other context about the problem here.
@TheDarkKnight-21th it's expected. It's perhaps not ideal, but it's not an issue, and altering the ops used to remove the warning would in fact lower performance in most scenarios, based on past analysis.
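Since the warning is expected and not an error, one option is to filter it out of the logs. The following is a minimal sketch using only Python's standard `warnings` module. The helper name `silence_grad_strides_warning` is hypothetical (it is not part of this repository or of PyTorch), and it assumes the warning reaches Python as a `UserWarning` whose text starts with the message quoted above.

```python
import warnings

# Start of the warning text quoted in this issue. The `message` argument of
# warnings.filterwarnings is a regex matched against the start of the text.
GRAD_STRIDES_PREFIX = r"Grad strides do not match bucket view strides"


def silence_grad_strides_warning() -> None:
    """Hypothetical helper: ignore only this specific UserWarning."""
    warnings.filterwarnings(
        "ignore",
        message=GRAD_STRIDES_PREFIX,
        category=UserWarning,
    )
```

Under these assumptions, the helper would be called near the top of the training script, before the first backward pass, and in every worker process, since warning filters are per process. Other warnings, including other `UserWarning`s, are left untouched.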