huggingface / pytorch-image-models

The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (ViT), MobileNetV4, MobileNet-V3 & V2, RegNet, DPN, CSPNet, Swin Transformer, MaxViT, CoAtNet, ConvNeXt, and more

Home Page: https://huggingface.co/docs/timm

[BUG] UserWarning: Grad strides do not match bucket view strides. This may indicate grad was not created according to the gradient layout contract, or that the param's strides changed since DDP was constructed. This is not an error, but may impair performance.

TheDarkKnight-21th opened this issue

Describe the bug
When I run the "distributed_train.sh" script, I get this warning: "UserWarning: Grad strides do not match bucket view strides. This may indicate grad was not created according to the gradient layout contract, or that the param's strides changed since DDP was constructed. This is not an error, but may impair performance."

What does this warning mean?

To Reproduce
Steps to reproduce the behavior:

  1. model = convnext.fb_in1k
  2. dataset = imagenet21k_winter (number of classes: 19167)

Screenshots
image

Desktop (please complete the following information):

  • OS: Ubuntu 22.04.3 LTS
  • This repository version [e.g. pip 0.3.1 or commit ref]
  • PyTorch version w/ CUDA/cuDNN [e.g. from conda list, 1.7.0 py3.8_cuda11.0.221_cudnn8.0.3_0]

@TheDarkKnight-21th it's expected. Not ideal perhaps, but not an issue, and altering the ops used to remove the warning would in fact lower performance in most scenarios, based on past analysis.
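
Since the warning is expected and harmless here, one option is to hide it in the logs without touching the model ops. The following is a minimal sketch using only Python's standard `warnings` module. The helper name `silence_grad_stride_warning` is hypothetical and not part of timm or PyTorch. It also assumes the message reaches Python's warning machinery as a `UserWarning`, as the quoted text suggests; whether the filter catches it in a given PyTorch version is not confirmed in this thread.

```python
import warnings

# Start of the warning text quoted in this issue. The `message` argument of
# warnings.filterwarnings is a regex matched against the beginning of the
# warning message, so this prefix is enough.
GRAD_STRIDE_MSG = r"Grad strides do not match bucket view strides"


def silence_grad_stride_warning() -> None:
    """Ignore only the DDP grad-stride UserWarning (hypothetical helper).

    Other warnings, including other UserWarnings, are left untouched.
    """
    warnings.filterwarnings(
        "ignore",
        message=GRAD_STRIDE_MSG,
        category=UserWarning,
    )
```

The helper would need to be called once in each worker process before training starts, for example near the top of the training script. It only changes what is printed; the gradient layout and the training performance stay exactly as they were.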