Awesome ROCm

A collection of useful information and tutorials for using ROCm.

Collections

Websites

ROCm Documentation: Main documentation for ROCm, all about its components and how to use them.
GPUOpen: A collection of resources from AMD and GPUOpen partners, including ISA documentation, developer tools, libraries, and SDKs.

ROCm has many GitHub organizations and repositories. Here are some of them:

ROCm Core Technology: Low-level drivers and runtimes for ROCm.
ROCm Developer Tools: Tools for profiling, debugging, and optimizing applications for ROCm.
ROCm Software Platform: High-level libraries and frameworks for ROCm, such as PyTorch, TensorFlow, and MIOpen. xFormers and Flash-Attention are also hosted here.

The Docker Hub organization for ROCm is rocm, where you can find all the official Docker images.

Useful Repositories

HIPIFY: A tool to convert CUDA code to HIP code. You can use it to port your CUDA code to ROCm.
Flash-Attention: Flash-Attention ported to ROCm. Navi GPU support is in the howiejay/navi_support branch.

Inference

FastLLM-ROCm: A simple implementation of FastLLM on ROCm. It is not optimized, but it is easy to maintain and modify.
vLLM: A high-performance LLM inference and serving engine with ROCm support. It has an AMD installation guide and a Docker image for MI GPUs.

Training and Fine-tuning

PEFT: Parameter-Efficient Fine-Tuning library from Hugging Face. Very easy to use.
DeepSpeed: Distributed training and inference library.

Quantization

BitsAndBytes: 8-bit CUDA functions for PyTorch, ported to HIP for use on AMD GPUs. 4-bit support is on the way.

Installation and environment setup

See the env-install folder for useful scripts to install ROCm and set up the environment. All of the scripts need PyTorch to run, so install PyTorch first.

test-rocm.py: A script to test whether ROCm is installed correctly.
test-pytorch.py: A script to test the performance of PyTorch on ROCm using GEMM operations.
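
As a rough idea of what such an installation check involves, here is a minimal sketch. It is not the test-rocm.py script from env-install, and the function name rocm_status is hypothetical. It assumes that ROCm builds of PyTorch expose the GPU through the torch.cuda API and report a HIP version in torch.version.hip.

```python
import importlib


def rocm_status():
    """Return a small report on whether PyTorch can see a ROCm GPU.

    Hypothetical helper for illustration; not the repository's test-rocm.py.
    """
    report = {
        "torch_installed": False,
        "hip_version": None,
        "gpu_available": False,
        "devices": [],
    }
    try:
        torch = importlib.import_module("torch")
    except ImportError:
        return report  # PyTorch is missing, so there is nothing else to check

    report["torch_installed"] = True
    # torch.version.hip is a version string on ROCm builds and None otherwise.
    report["hip_version"] = getattr(torch.version, "hip", None)
    report["gpu_available"] = torch.cuda.is_available()
    if report["gpu_available"]:
        report["devices"] = [
            torch.cuda.get_device_name(i)
            for i in range(torch.cuda.device_count())
        ]
    return report


if __name__ == "__main__":
    for key, value in rocm_status().items():
        print(f"{key}: {value}")
```

A working setup should report a non-empty hip_version and at least one device. If hip_version is None while PyTorch is installed, the installed wheel was probably not built for ROCm.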

Steps to install ROCm

Check system compatibility: ROCm System Requirements
Install via package manager: ROCm Installation Guide

Note: a reboot is required after installing the AMDGPU driver.
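
After the reboot, a quick sanity check is to see whether the GPU device nodes are present. The sketch below assumes the driver exposes /dev/kfd and /dev/dri, the same two device paths that the Docker command in this document passes to the container. The helper name amdgpu_device_nodes is hypothetical.

```python
import os


def amdgpu_device_nodes(paths=("/dev/kfd", "/dev/dri")):
    """Map each expected device path to whether it exists on this machine."""
    return {path: os.path.exists(path) for path in paths}


if __name__ == "__main__":
    for path, present in amdgpu_device_nodes().items():
        print(f"{path}: {'found' if present else 'missing'}")
```

A missing node usually means the driver is not loaded. This check says nothing about whether the user-space ROCm packages are installed correctly.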

Steps to install PyTorch

From Docker image

The easiest way to install PyTorch is to use the Docker image provided by ROCm, which is available on Docker Hub. We use rocm/pytorch:rocm6.0_ubuntu22.04_py3.9_pytorch_2.0.1, which you can pull with:

docker pull rocm/pytorch:rocm6.0_ubuntu22.04_py3.9_pytorch_2.0.1

Check out the Docker tutorial for ROCm for more information about controlling GPU access and about the Docker images.

If you just want to run PyTorch on ROCm, the command is:

docker run -it --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --device=/dev/kfd --device=/dev/dri --group-add video --ipc=host --shm-size 16G rocm/pytorch:rocm6.0_ubuntu22.04_py3.9_pytorch_2.0.1
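
To make the flags in the command above easier to read and reuse, the following sketch builds the same command as an argument list with one flag per entry. The function name docker_run_command and its parameters are hypothetical, and the per-flag comments are a brief summary, not a substitute for the Docker tutorial for ROCm.

```python
import shlex

IMAGE = "rocm/pytorch:rocm6.0_ubuntu22.04_py3.9_pytorch_2.0.1"


def docker_run_command(image=IMAGE, shm_size="16G"):
    """Build the docker run argument list shown above."""
    return [
        "docker", "run", "-it",
        "--cap-add=SYS_PTRACE",                  # allow ptrace, e.g. for debuggers
        "--security-opt", "seccomp=unconfined",  # disable the default seccomp profile
        "--device=/dev/kfd",                     # ROCm compute interface
        "--device=/dev/dri",                     # GPU render nodes
        "--group-add", "video",                  # group that owns the device files
        "--ipc=host",                            # share the host IPC namespace
        "--shm-size", shm_size,                  # shared memory available to the container
        image,
    ]


if __name__ == "__main__":
    # Print the command so it can be inspected or pasted into a shell.
    print(shlex.join(docker_run_command()))
```

The list form can be passed directly to subprocess.run, which avoids shell quoting issues when the image tag or shared memory size is changed.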

From wheel

If you want to install PyTorch on your host machine, you can install it from a wheel. The steps are on the official PyTorch website: PyTorch Get Started.

An example command to install PyTorch for ROCm is:

pip3 install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/rocm5.7

Note: a virtual environment is recommended. We recommend mamba for creating virtual environments, as it is much faster than conda.
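
Before running the pip command, it can help to confirm that an environment is actually active. The sketch below is one possible check, with the hypothetical name in_virtual_environment. It assumes that venv-style environments change sys.prefix and that conda or mamba activation sets the CONDA_PREFIX environment variable.

```python
import os
import sys


def in_virtual_environment():
    """Return True when running inside a venv or a conda/mamba environment."""
    # venv and virtualenv make sys.prefix differ from the base interpreter prefix.
    in_venv = sys.prefix != sys.base_prefix
    # conda and mamba export CONDA_PREFIX when an environment is activated.
    in_conda = bool(os.environ.get("CONDA_PREFIX"))
    return in_venv or in_conda


if __name__ == "__main__":
    if in_virtual_environment():
        print("An environment is active; packages will be installed into it.")
    else:
        print("No environment detected; packages would go to the system Python.")
```

Note that the conda base environment also sets CONDA_PREFIX, so this check cannot tell a dedicated environment apart from base.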

Fine-tuning example

See the fine-tuning folder for fine-tuning examples. You can run them with bash run.sh.

Some notes:

BitsAndBytes is used to quantize the model to 8-bit, but its ROCm port is not finished yet. Use the adamw_torch optimizer to avoid errors.
The base model is meta-llama/Llama-2-7b-chat-hf; you need to request access to it first.
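
The optimizer note can be illustrated as follows, assuming the fine-tuning example is driven by the Hugging Face Trainer, where the optimizer is selected through the optim field of TrainingArguments. This is a sketch and not the contents of run.sh. The helper names, the output directory, the batch size, and the epoch count are hypothetical placeholders.

```python
def training_kwargs(output_dir="outputs"):
    """Keyword arguments for transformers.TrainingArguments (illustrative values)."""
    return {
        "output_dir": output_dir,
        # Select PyTorch's own AdamW so no BitsAndBytes optimizer is required.
        "optim": "adamw_torch",
        "per_device_train_batch_size": 1,  # placeholder value
        "num_train_epochs": 1,             # placeholder value
    }


def build_training_arguments(output_dir="outputs"):
    """Create TrainingArguments; imports transformers only when called."""
    from transformers import TrainingArguments

    return TrainingArguments(**training_kwargs(output_dir))


if __name__ == "__main__":
    print(training_kwargs())
```

Keeping the arguments in a plain dictionary makes the optimizer choice easy to inspect without importing transformers.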

lcpu-club / awesome-rocm