ROCm / ROCm

AMD ROCm™ Software - GitHub Home

Home Page:https://rocm.docs.amd.com

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

[Documentation]: please keep a single source of truth for "Known Issues". Inconsistency found.

ye-luo opened this issue · comments

Description of errors

I hope there is only one team making the release of ROCm.
There is a section in https://www.amd.com/en/support/kb/release-notes/rn-amdgpu-unified-linux-23-40-rocm-6-0-2

Known Issues

    Running PyTorch with iGPU enabled + Discrete GPU enabled may cause crashes. See the Limitations section within the How To Guide for details.
    GPU reset may occur when running multiple heavy Machine Learning workloads at same time over an extended period of time.
    Intermittent gpureset errors may be seen with Automatic 1111 webUI with IOMMU enabled. Please see https://community.amd.com/t5/knowledge-base/tkb-p/amd-rocm-tkb for suggested resolutions.
    RX 7900 GRE may exhibit a hang rather than Out Of Memory error on BERT FP32 training loads.
    Soft hang observed when running multi-queue workloads.

None of these were mentioned at https://github.com/ROCm/ROCm/blob/develop/CHANGELOG.md

Could you please improve the accuracy and consistency of info in release notes?

Attach any links, screenshots, or additional evidence you think will be helpful.

No response

@ye-luo, thank you for catching this error. I will forward your info to the internal team.