[Issue]: Creating two HIP streams causes 100% GPU utilization
pxl-th opened this issue
Problem Description
Creating two HIP streams causes 100% GPU utilization.
This is observed on ROCm 5.7-6.0 and on RX 7600, RX 7800 XT and RX 7900 XTX (at least).
A utilization graph captured with resources during execution of the C++ MWE below shows this (rocm-smi reports the same).
Operating System
Ubuntu 22.04.3 LTS (Jammy Jellyfish)
CPU
AMD Ryzen 7 5800X 8-Core Processor
GPU
AMD Radeon RX 7900 XT
ROCm Version
ROCm 6.0.0
Steps to Reproduce
C++ MWE:
#include <hip/hip_runtime.h>
#include <chrono>
#include <iostream>
#include <thread>

// Report any HIP call that did not succeed.
void check(hipError_t res) {
    if (res != hipSuccess) {
        std::cerr << "Fail: " << hipGetErrorString(res) << std::endl;
    }
}

int main() {
    // Create two streams with default flags and priority; no work is submitted.
    hipStream_t s1;
    check(hipStreamCreateWithPriority(&s1, 0, 0));
    hipStream_t s2;
    check(hipStreamCreateWithPriority(&s2, 0, 0));

    // Stay idle so utilization can be observed while the streams exist.
    std::this_thread::sleep_for(std::chrono::seconds(5));
    return 0;
}
Compile with hipcc main.cpp, run ./a.out, and observe GPU utilization during program execution.
I could not reproduce it on Navi 21 (RX 6900 XT).
rocm-smi reads the data from the driver to populate percent usage. Will forward this to relevant teams to get more information.
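For anyone who wants to watch the driver-reported value directly while the MWE sleeps, here is a minimal sketch. It assumes the amdgpu driver exposes the busy percentage at /sys/class/drm/card0/device/gpu_busy_percent and that the GPU is card0; both the path constant and the helper name readBusyPercent are illustrative, not part of ROCm or rocm-smi.

```cpp
#include <fstream>
#include <optional>
#include <string>

// Assumed location of the amdgpu busy counter; the card index varies per system.
const std::string kDefaultBusyPath =
    "/sys/class/drm/card0/device/gpu_busy_percent";

// Hypothetical helper: read one utilization sample (0-100) from a sysfs-style file.
// Returns std::nullopt if the file cannot be opened or does not start with an integer.
std::optional<int> readBusyPercent(const std::string& path = kDefaultBusyPath) {
    std::ifstream file(path);
    int percent = 0;
    if (!(file >> percent)) {
        return std::nullopt;
    }
    return percent;
}
```

Calling readBusyPercent() about once per second from a separate process while a.out is running gives a reading that can be compared against rocm-smi. If the GPU is not card0, pass the matching path explicitly.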
This looks to be a Navi 3 issue. I was also not able to reproduce it on the RX 6700 XT.
Hi! Just curious if there's any update on the issue?
Nothing as of now. I will update here once we have a solution.
This issue seems to be fixed with ROCm 6.0.2 and Linux 6.5.0-18.
Not sure where the fix came from, though.