fix _FullyShardedDataParallelMapping when running test_fsdp.py
josemlopez opened this issue · comments
Jose Lopez commented
How to reproduce
```shell
python -m torch.distributed.run --nproc_per_node=2 --master_port=2333 ./tests/torch/nn/parallel/data_parallel/test_fsdp.py
```
Environment
- OS :
- Python version :
- Transformers version :
- Whether to use Docker:
- Misc.:
The problem is in `_fsdp`, where `_FullyShardedDataParallelMappingForHuggingFace` is imported and used instead of `_FullyShardedDataParallelMapping`:
```python
from oslo.transformers.mapping_utils import (
    _FullyShardedDataParallelMappingForHuggingFace,
)
```
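A minimal sketch of the proposed change (assuming `_FullyShardedDataParallelMapping` is also exported from `oslo.transformers.mapping_utils`; the exact file path and any other call sites of the mapping class may differ):

```diff
-from oslo.transformers.mapping_utils import (
-    _FullyShardedDataParallelMappingForHuggingFace,
-)
+from oslo.transformers.mapping_utils import (
+    _FullyShardedDataParallelMapping,
+)
```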
Jose Lopez commented
I'm creating a PR to fix this.