Scheduling fails on a node with 3 GPUs
scxu opened this issue · comments
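I want to implement what is described in the paper https://ieeexplore.ieee.org/abstract/document/8672318. After a successful deployment, scheduling works fine on the node with four GPUs, but on the other machine, which has only three GPUs, pods get stuck in Pending. In other words, resources are clearly available, yet the scheduler reports: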
Events:
  Type     Reason            Age        From               Message
  ----     ------            ----       ----               -------
  Warning  FailedScheduling  <unknown>  default-scheduler  0/4 nodes are available: 1 Insufficient tencent.com/vcuda-core, 3 node(s) didn't match node selector.
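To make the event above easier to read, here is a minimal Python sketch that splits the message into per-reason node counts. The function name `parse_scheduling_message` is hypothetical, and the parsing assumes the simple "count reason, count reason." layout seen in this event; other scheduler messages may not fit it.

```python
import re


def parse_scheduling_message(message: str) -> dict:
    """Split a FailedScheduling message into totals and {reason: node_count}.

    Assumes the layout "<ok>/<total> nodes are available: <n> <reason>, ...".
    """
    header, _, reasons = message.partition(":")
    match = re.match(r"\s*(\d+)/(\d+) nodes are available", header)
    if not match:
        raise ValueError("unrecognized scheduler message")
    result = {
        "available": int(match.group(1)),
        "total": int(match.group(2)),
        "reasons": {},
    }
    # Each part looks like "3 node(s) didn't match node selector".
    for part in reasons.strip().rstrip(".").split(", "):
        count, _, reason = part.strip().partition(" ")
        result["reasons"][reason] = int(count)
    return result


message = (
    "0/4 nodes are available: 1 Insufficient tencent.com/vcuda-core, "
    "3 node(s) didn't match node selector."
)
parsed = parse_scheduling_message(message)
for reason, count in parsed["reasons"].items():
    print(f"{count} node(s): {reason}")
```

Read this way, three of the four nodes were excluded by the node selector, and the single remaining node was rejected for insufficient tencent.com/vcuda-core.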
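Here is the output of kubectl describe node:
You can see that only two of the GPUs have been scheduled.
The node has three GPUs because one card was faulty and I masked it out. The scheduling failure can be partly worked around by force-restarting kube-scheduler: things usually work normally for a while after the restart, but similar problems show up again later.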
Running nvidia-smi topo -mp on the three-GPU machine gives:
        GPU0    GPU1    GPU2    CPU Affinity
GPU0     X      SYS     SYS     0-11
GPU1    SYS      X      PIX     0-11
GPU2    SYS     PIX      X      0-11
Legend:
X = Self
SYS = Connection traversing PCIe as well as the SMP interconnect between NUMA nodes (e.g., QPI/UPI)
NODE = Connection traversing PCIe as well as the interconnect between PCIe Host Bridges within a NUMA node
PHB = Connection traversing PCIe as well as a PCIe Host Bridge (typically the CPU)
PXB = Connection traversing multiple PCIe bridges (without traversing the PCIe Host Bridge)
PIX = Connection traversing at most a single PCIe bridge
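For anyone who wants to inspect this topology programmatically, the following is a rough Python sketch that turns the matrix above into a mapping of GPU pairs to link types. The helper name `parse_topo` is hypothetical, and it assumes the whitespace-separated layout shown in this issue; it is not how any scheduler component reads the topology.

```python
TOPO = """
        GPU0    GPU1    GPU2    CPU Affinity
GPU0     X      SYS     SYS     0-11
GPU1    SYS      X      PIX     0-11
GPU2    SYS     PIX      X      0-11
"""


def parse_topo(text: str) -> dict:
    """Map each unordered GPU pair to its link type (e.g. SYS, PIX)."""
    rows = [line.split() for line in text.strip().splitlines() if line.strip()]
    # Header columns naming GPUs; the trailing "CPU Affinity" columns are skipped.
    gpus = [name for name in rows[0] if name.startswith("GPU")]
    links = {}
    for row in rows[1:]:
        src, cells = row[0], row[1:1 + len(gpus)]
        for dst, link in zip(gpus, cells):
            # Skip the diagonal ("X") and the mirrored half of the matrix.
            if src != dst and (dst, src) not in links:
                links[(src, dst)] = link
    return links


links = parse_topo(TOPO)
for (a, b), link in links.items():
    print(f"{a} <-> {b}: {link}")
```

On this node the result shows GPU1 and GPU2 connected through PIX, while GPU0 reaches either of them only through SYS.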
- k8s version: 1.17.3
- nvidia driver version: 440.59