BaguaSys / bagua

Bagua Speeds up PyTorch

Home Page: https://tutorials.baguasys.com/


What's wrong with this? Do I need to do anything else? Will it affect my result?

lixiangMindSpore opened this issue

Describe the bug
[screenshot of the warning message attached to the issue]

Environment

  • Your operating system and version: Ubuntu 18.04
  • Your Python version: 3.8.12
  • Your PyTorch version: 11.1
  • How did you install Python (e.g. apt or pyenv)? Did you use a virtualenv?: conda create -n torch python=3.8
  • Have you tried using latest bagua master (python3 -m pip install --pre bagua)?: 0.8.1.post1

Reproducing

Please provide a minimal working example, i.e., runnable code.

Please also list the exact commands required to reproduce your results.

Additional context
Add any other context about the problem here.

commented

The message means the memory layout of your PyTorch tensor is inconsistent with the latest PyTorch, so Bagua will fall back to a less efficient way of getting a tensor's memory address. It will not affect your results; you can safely ignore it if your training runs fine.
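
For illustration, the address lookup mentioned above can be seen in plain PyTorch: torch.Tensor.data_ptr() returns the raw address of a tensor's data. The sketch below is only an analogy, not Bagua's actual code path; the contiguity check is an illustrative stand-in for the "memory layout" inconsistency the warning refers to.

import torch

# Illustrative sketch only -- not Bagua's actual implementation.
# data_ptr() returns the address of the first element of the tensor's
# data; communication libraries use such addresses for fused transfers.
x = torch.randn(4, 4)
print(hex(x.data_ptr()))    # raw memory address of the tensor data

# A transposed view shares storage but has a different memory layout,
# so naive pointer arithmetic over it would walk elements in the wrong
# order -- the kind of case where a slower fallback path is needed.
y = x.t()
print(y.is_contiguous())    # False: layout differs from a fresh tensor
print(hex(y.data_ptr()))    # same starting address as x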

Will it affect my training speed?

commented

Only about a 0.5%-2% difference in training speed in our tests. We actually plan to remove this warning in the next release.

Now I want to suppress this warning. How can I do that?

commented

Try launching your program with the environment variable LOG_LEVEL=error.

Like this:

export LOG_LEVEL=error
python -m bagua.distributed.launch ....
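
Equivalently, you can set the variable from inside Python before Bagua is imported. This is a minimal sketch assuming LOG_LEVEL is read when Bagua initializes its logging:

import os

# Assumption: Bagua reads LOG_LEVEL when it sets up logging, so the
# variable must be in the environment before the import below runs.
os.environ["LOG_LEVEL"] = "error"

import bagua.torch_api as bagua

Note that when launching through python -m bagua.distributed.launch, exporting the variable in the shell (as shown above) is inherited by the launcher and all worker processes, so the shell export is the more thorough option.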