NVIDIA / open-gpu-kernel-modules

NVIDIA Linux open GPU kernel module source

Installing the NVIDIA Driver and CUDA Toolkit failed

seungsoo-lee opened this issue

NVIDIA Open GPU Kernel Modules Version

535.86.06

Operating System and Version

Ubuntu 22.04.2 LTS

Kernel Release

Linux guest 5.19.0-rc6-snp-guest-c4daeffce56e

Build Command

$ wget https://developer.download.nvidia.com/compute/cuda/12.2.1/local_installers/cuda_12.2.1_535.86.10_linux.run
$ sudo sh cuda_12.2.1_535.86.10_linux.run -m=kernel-open

Terminal output/Build Log

[  160.822054] nvidia: loading out-of-tree module taints kernel.
[  160.825258] nvidia: module verification failed: signature and/or required key missing - tainting kernel
[  160.877989] nvidia-nvlink: Nvlink Core is being initialized, major device number 236
[  160.877997] NVRM: The NVIDIA GPU 0000:01:00.0 (PCI ID: 10de:2331)
               NVRM: installed in this system is not supported by open
               NVRM: nvidia.ko because it does not include the required GPU
               NVRM: System Processor (GSP).
               NVRM: Please see the 'Open Linux Kernel Modules' and 'GSP
               NVRM: Firmware' sections in the driver README, available on
               NVRM: the Linux graphics driver download page at
               NVRM: www.nvidia.com.
[  166.261422] nvidia: probe of 0000:01:00.0 failed with error -1
[  166.261517] NVRM: The NVIDIA probe routine failed for 1 device(s).
[  166.261520] NVRM: None of the NVIDIA devices were initialized.
[  166.262025] nvidia-nvlink: Unregistered Nvlink Core, major device number 236
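The NVRM message above names the rejected GPU by its PCI ID. When the kernel log is long, that ID can be pulled out with a small filter. This is only a sketch: `extract_nvrm_pci_ids` is a hypothetical helper name, not part of any NVIDIA tool, and it assumes the message keeps the `(PCI ID: vvvv:dddd)` wording shown in this log.

```shell
# Print the unique "vendor:device" PCI IDs found in NVRM log lines on stdin.
# Hypothetical helper; assumes the "(PCI ID: vvvv:dddd)" format seen above.
extract_nvrm_pci_ids() {
    sed -n 's/.*(PCI ID: \([0-9a-fA-F]\{4\}:[0-9a-fA-F]\{4\}\)).*/\1/p' | sort -u
}

# Possible usage on the affected guest (reading the kernel log may need root):
#   dmesg | extract_nvrm_pci_ids
```

The resulting ID can then be compared against the supported-GPU list in the driver README that the NVRM message points to.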

More Info

SYSTEM: GIGABYTE
CPU: Dual AMD EPYC 9224 16-Core Processor
GPU: H100 10de:2331
Host OS: Ubuntu 22.04 with 5.19.0-rc6-snp-host-c4daeffce56e kernel
Guest OS: Ubuntu 22.04.2 with 5.19.0-rc6-snp-guest-c4daeffce56e kernel

What has been completed? The instructions for installing CUDA 12.3 Update 2 say that, on Ubuntu 22.04, you should run sudo apt-get install -y nvidia-kernel-open-545, and that is currently failing.