microsoft / pai

Resource scheduling and cluster management for AI

Home Page: https://openpai.readthedocs.io

Unreachable error when adding or deleting a node

JohanOu opened this issue

Organization Name:

Short summary about the issue/question:
When I try to delete a node, I get the error shown in the figure. However, I can SSH to the master from inside the dev-box container. How can I solve this? Thanks.
image

Brief what process you are following:

How to reproduce it:

OpenPAI Environment:

  • OpenPAI version: v1.8.0
  • Cloud provider or hardware configuration:
  • OS (e.g. from /etc/os-release):
  • Kernel (e.g. uname -a):
  • Hardware (e.g. core number, memory size, storage size, GPU type etc.):
  • Others:

Anything else we need to know:

Can you run the script with -vvvv? This flag prints a verbose log, which will help with debugging.