ProjectPhysX / FluidX3D

The fastest and most memory efficient lattice Boltzmann CFD software, running on all GPUs via OpenCL.

Home Page: https://youtube.com/@ProjectPhysX

Cannot run after compilation (Error: Memory size is too large)

Fifis opened this issue

I have compiled everything for Linux + X11, and this is what it shows after compilation:

.-----------------------------------------------------------------------------.
|                       ______________   ______________                       |
|                       \   ________  | |  ________   /                       |
|                        \  \       | | | |       /  /                        |
|                         \  \      | | | |      /  /                         |
|                          \  \     | | | |     /  /                          |
|                           \  \_.-"  | |  "-._/  /                           |
|                            \    _.-" _ "-._    /                            |
|                             \.-" _.-" "-._ "-./                             |
|                               .-"  .-"-.  "-.                               |
|                               \  v"     "v  /                               |
|                                \  \     /  /                                |
|                                 \  \   /  /                                 |
|                                  \  \ /  /                                  |
|                                   \  '  /                                   |
|                                    \   /                                    |
|                                     \ /               FluidX3D Version 2.10 |
|                                      '     Copyright (c) Dr. Moritz Lehmann |
|-----------------------------------------------------------------------------|
|----------------.------------------------------------------------------------|
| Device ID    0 | AMD Radeon Pro WX 2100 (polaris12, LLVM 16.0.6, DRM 3.54, 6.6.1-arch1-1) |
|----------------'------------------------------------------------------------|
|----------------.------------------------------------------------------------|
| Device ID      | 0                                                          |
| Device Name    | AMD Radeon Pro WX 2100 (polaris12, LLVM 16.0.6, DRM 3.54, 6.6.1-arch1-1) |
| Device Vendor  | AMD                                                        |
| Device Driver  | 23.2.1-arch1.2                                             |
| OpenCL Version | OpenCL C 1.1                                               |
| Compute Units  | 8 at 1219 MHz (512 cores, 1.248 TFLOPs/s)                  |
| Memory, Cache  | 2048 MB, 0 KB global / 64 KB local                         |
| Buffer Limits  | 512 MB global, 65536 KB constant                           |
|----------------'------------------------------------------------------------|
| Info: OpenCL C code successfully compiled.                                  |
| Error: Memory size is too large at 1216 MB. Device "AMD Radeon Pro WX 2100  |
|        (polaris12, LLVM 16.0.6, DRM 3.54, 6.6.1-arch1-1)" accepts a maximum |
|        buffer size of 512 MB.                                               |
'-----------------------------------------------------------------------------'

Here is the machine in question – it has 128 GB RAM and 24 physical cores:

$ inxi -Fxz
System:
  Kernel: 6.6.1-arch1-1 arch: x86_64 bits: 64 compiler: gcc v: 13.2.1
    Desktop: KDE Plasma v: 5.27.9 Distro: Arch Linux
Machine:
  Type: Desktop System: Dell product: Precision 7920 Tower v: N/A
    serial: <superuser required>
  Mobo: Dell model: 060K5C v: A02 serial: <superuser required> UEFI: Dell
    v: 2.5.0 date: 12/13/2019
CPU:
  Info: 24-core model: Intel Xeon Platinum 8168 bits: 64 type: MT MCP
    arch: Skylake rev: 4 cache: L1: 1.5 MiB L2: 24 MiB L3: 33 MiB
  Speed (MHz): avg: 1200 min/max: 1200/3700 cores: 1: 1200 2: 1200 3: 1200
    4: 1200 5: 1200 6: 1200 7: 1200 8: 1200 9: 1200 10: 1200 11: 1200 12: 1200
    13: 1200 14: 1200 15: 1200 16: 1200 17: 1200 18: 1200 19: 1200 20: 1200
    21: 1200 22: 1200 23: 1200 24: 1200 25: 1200 26: 1200 27: 1200 28: 1200
    29: 1200 30: 1200 31: 1200 32: 1200 33: 1200 34: 1200 35: 1200 36: 1200
    37: 1200 38: 1200 39: 1200 40: 1200 41: 1200 42: 1200 43: 1200 44: 1200
    45: 1200 46: 1200 47: 1200 48: 1200 bogomips: 259296
  Flags: avx avx2 ht lm nx pae sse sse2 sse3 sse4_1 sse4_2 ssse3 vmx
Graphics:
  Device-1: AMD Lexa XT [Radeon PRO WX 2100] vendor: Dell driver: amdgpu
    v: kernel arch: GCN-4 bus-ID: 0000:b3:00.0
  Device-2: Logitech Logitech Webcam C925e driver: snd-usb-audio,uvcvideo
    type: USB bus-ID: 1-13:6
  Display: x11 server: X.Org v: 21.1.9 with: Xwayland v: 23.2.2 driver: X:
    loaded: amdgpu unloaded: modesetting dri: radeonsi gpu: amdgpu
    resolution: 1920x1080~60Hz
  API: EGL v: 1.5 drivers: radeonsi,swrast platforms:
    active: x11,surfaceless,device inactive: gbm,wayland
  API: OpenGL v: 4.6 compat-v: 4.5 vendor: amd mesa v: 23.2.1-arch1.2
    glx-v: 1.4 direct-render: yes renderer: AMD Radeon Pro WX 2100 (polaris12
    LLVM 16.0.6 DRM 3.54 6.6.1-arch1-1)
  API: Vulkan Message: No Vulkan data available.
Audio:
  Device-1: Intel vendor: Dell driver: snd_hda_intel v: kernel
    bus-ID: 0000:00:1f.3
  Device-2: AMD Baffin HDMI/DP Audio [Radeon RX 550 640SP / 560/560X]
    vendor: Dell driver: snd_hda_intel v: kernel bus-ID: 0000:b3:00.1
  Device-3: Logitech Logitech Webcam C925e driver: snd-usb-audio,uvcvideo
    type: USB bus-ID: 1-13:6
  API: ALSA v: k6.6.1-arch1-1 status: kernel-api
  Server-1: sndiod v: N/A status: off
  Server-2: PipeWire v: 0.3.85 status: active
Network:
  Device-1: Intel Ethernet I219-LM vendor: Dell driver: e1000e v: kernel
    port: N/A bus-ID: 0000:00:1f.6
  IF: enp0s31f6 state: up speed: 1000 Mbps duplex: full mac: <filter>
  Device-2: Intel I210 Gigabit Network vendor: Dell driver: igb v: kernel
    port: 2000 bus-ID: 0000:02:00.0
  IF: enp2s0 state: down mac: <filter>
RAID:
  Hardware-1: Intel C600/X79 series SATA RAID Controller driver: ahci v: 3.0
    bus-ID: 0000:00:17.0
  Hardware-2: Intel Volume Management Device NVMe RAID Controller
    driver: vmd v: 0.6 bus-ID: 0000:64:05.5
Drives:
  Local Storage: total: 2.75 TiB used: 474.57 GiB (16.8%)
  ID-1: /dev/sda vendor: Micron model: 1300 SATA 1024GB size: 953.87 GiB
  ID-2: /dev/sdb vendor: Seagate model: ST2000DM001-1ER164 size: 1.82 TiB
Partition:
  ID-1: / size: 913.92 GiB used: 474.57 GiB (51.9%) fs: ext4 dev: /dev/sda3
Swap:
  Alert: No swap data was found.
Sensors:
  System Temperatures: cpu: 32.0 C pch: 34.0 C mobo: N/A gpu: amdgpu
    temp: 38.0 C
  Fan Speeds (rpm): cpu: 0 fan-2: 812 fan-3: 797 gpu: amdgpu fan: 1926
Info:
  Processes: 711 Uptime: 7d 20h 8m Memory: total: 128 GiB
  available: 124.46 GiB used: 24.42 GiB (19.6%) Init: systemd Compilers:
  gcc: 13.2.1 clang: 16.0.6 Packages: 2599 Shell: Bash v: 5.2.21 inxi: 3.3.31

How does one set these limits at compile time or at run time? One of the few places where this problem is mentioned is this OpenBenchmarking page.

Hi @Fifis,

OpenCL has a (somewhat arbitrary) limit on how large a single VRAM buffer allocation can be. This is typically 1/4 of total VRAM capacity, in your case 512 MB. Most new GPU drivers no longer enforce this limit and cause no issues. However, some very old GPU drivers for ancient GPUs still do.

FluidX3D holds the majority of data (the density distribution functions) in one large buffer. You can compute the size of this largest buffer allocation as follows:

  • for FP32: Nx*Ny*Nz cells * 76 Bytes/cell
  • for FP16: Nx*Ny*Nz cells * 38 Bytes/cell

For the standard benchmark case (256³ resolution), this is 1216 MB (FP32) or 608 MB (FP16), both above the 512 MB limit, so AMD's old GPU driver throws OpenCL error -61 (too large single-buffer allocation).
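
As a quick check of the arithmetic above, here is a minimal Python sketch that evaluates the buffer-size formula for the 256³ benchmark grid. The function name `largest_buffer_mb` is hypothetical and not part of FluidX3D. The sketch assumes that "MB" in the FluidX3D output means 1024² bytes, which is what reproduces the 1216 MB figure from the error message.

```python
# Bytes per cell for the largest buffer, as quoted in the reply above.
BYTES_PER_CELL = {"FP32": 76, "FP16": 38}

# Assumption: "MB" in the FluidX3D output means 1024*1024 bytes.
BYTES_PER_MB = 1024 * 1024


def largest_buffer_mb(nx, ny, nz, precision):
    """Size in MB of the largest single buffer for an nx*ny*nz grid."""
    return nx * ny * nz * BYTES_PER_CELL[precision] / BYTES_PER_MB


LIMIT_MB = 512  # buffer limit reported for the Radeon Pro WX 2100

for precision in ("FP32", "FP16"):
    size = largest_buffer_mb(256, 256, 256, precision)
    print(f"{precision}: {size:.0f} MB, fits in {LIMIT_MB} MB: {size <= LIMIT_MB}")
```

Both precisions exceed the 512 MB limit at 256³, which matches the error shown in the original post.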

Unfortunately, you'll have to run FluidX3D at a lower grid resolution, and can't use even the small 2 GB of VRAM to its full extent. The largest supported cubic resolution is 188³ (FP32) or 240³ (FP16). You can change the benchmark resolution here.
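
The resolution limits can be reproduced with a short Python sketch that searches for the largest cube whose main buffer fits under the 512 MB limit. The helper name `largest_cubic_resolution` is hypothetical. Rounding the edge length down to a multiple of 4 is an assumption made here because it reproduces the quoted 188³ and 240³; the thread does not state the rounding rule. The sketch also assumes that 512 MB means 512 × 1024² bytes.

```python
def largest_cubic_resolution(limit_bytes, bytes_per_cell, multiple=4):
    """Largest N (a multiple of `multiple`) with N**3 * bytes_per_cell <= limit_bytes.

    The default multiple of 4 is an assumption, not a documented FluidX3D rule.
    """
    n = 0
    # Step up by `multiple` while the next larger cube still fits.
    while (n + multiple) ** 3 * bytes_per_cell <= limit_bytes:
        n += multiple
    return n


# Assumption: the reported 512 MB buffer limit is 512 * 1024 * 1024 bytes.
LIMIT_BYTES = 512 * 1024 * 1024

print("FP32:", largest_cubic_resolution(LIMIT_BYTES, 76))  # 76 bytes/cell
print("FP16:", largest_cubic_resolution(LIMIT_BYTES, 38))  # 38 bytes/cell
```

Without the rounding (`multiple=1`), the same search gives 191 for FP32 and 241 for FP16, so the quoted values sit slightly below the raw arithmetic limit.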

Kind regards,
Moritz