cupy / cupy

NumPy & SciPy for GPU

Home Page: https://cupy.dev

`cupy.min()` mishandles infinity

leofang opened this issue · comments

I see the same problem with CuPy version 13.0.0:

In [3]: cp.min(cp.array([cp.inf, cp.inf]))
Out[3]: array(1.79769313e+308)

Has the bug come back, is this caused by something else, or did the fix never make it into the release?

Environment info:

OS                           : Linux-5.15.0-94-generic-x86_64-with-glibc2.35
Python Version               : 3.12.1
CuPy Version                 : 13.0.0
CuPy Platform                : NVIDIA CUDA
NumPy Version                : 1.26.3
SciPy Version                : 1.12.0
Cython Build Version         : 0.29.37
Cython Runtime Version       : 3.0.8
CUDA Root                    : /usr/local/cuda
nvcc PATH                    : /usr/local/cuda/bin/nvcc
CUDA Build Version           : 12000
CUDA Driver Version          : 12020
CUDA Runtime Version         : 12000 (linked to CuPy) / 12010 (locally installed)
cuBLAS Version               : (available)
cuFFT Version                : 11002
cuRAND Version               : 10304
cuSOLVER Version             : (11, 4, 4)
cuSPARSE Version             : (available)
NVRTC Version                : (12, 3)
Thrust Version               : 200001
CUB Build Version            : 200200
Jitify Build Version         : c08b8c6
cuDNN Build Version          : None
cuDNN Version                : None
NCCL Build Version           : None
NCCL Runtime Version         : None
cuTENSOR Version             : None
cuSPARSELt Build Version     : None
Device 0 Name                : Tesla T4
Device 0 Compute Capability  : 75
Device 0 PCI Bus ID          : 0000:3B:00.0
Device 1 Name                : Tesla T4
Device 1 Compute Capability  : 75
Device 1 PCI Bus ID          : 0000:5E:00.0
Device 2 Name                : Tesla T4
Device 2 Compute Capability  : 75
Device 2 PCI Bus ID          : 0000:AF:00.0
Device 3 Name                : Tesla T4
Device 3 Compute Capability  : 75
Device 3 PCI Bus ID          : 0000:D8:00.0

Originally posted by @betatim in #7424 (comment)

This reproduces when CUB is activated. With accelerators disabled (empty CUPY_ACCELERATORS), the result is correct:

% export CUPY_ACCELERATORS=       
% python
Python 3.10.13 (main, Nov 10 2023, 16:45:28) [GCC 9.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import cupy as cp
>>> cp.min(cp.array([cp.inf, cp.inf]))
array(inf)
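To compare both configurations side by side, something like the following sketch could be used. It launches the one-line reproducer in a fresh interpreter per setting, on the assumption that `CUPY_ACCELERATORS` is read when CuPy is imported. The helper names `env_with_accelerators` and `run_min_repro` are made up for illustration, and actually running the reproducer requires CuPy and a GPU.

```python
import os
import subprocess
import sys

# One-line reproducer from this thread, run in a child interpreter.
SNIPPET = "import cupy as cp; print(cp.min(cp.array([cp.inf, cp.inf])))"


def env_with_accelerators(value):
    """Return a copy of the current environment with CUPY_ACCELERATORS set."""
    env = dict(os.environ)
    env["CUPY_ACCELERATORS"] = value
    return env


def run_min_repro(accelerators):
    """Run the reproducer with the given accelerator setting.

    Returns the child's stdout, or its stderr if nothing was printed
    (for example when CuPy is not installed).
    """
    result = subprocess.run(
        [sys.executable, "-c", SNIPPET],
        env=env_with_accelerators(accelerators),
        capture_output=True,
        text=True,
    )
    return result.stdout.strip() or result.stderr.strip()


if __name__ == "__main__":
    # "" disables accelerators; "cub" enables the CUB backend.
    for value in ("", "cub"):
        print(repr(value), "->", run_min_repro(value))
```

Using a child process per setting avoids having to reason about whether the variable is still honoured after CuPy has already been imported.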

Maybe it's the same issue as before for argmin/argmax? We can tweak the earlier C++ reproducer from @asi1024 to test this. If so, I am sure Georgii from the CCCL team will be happy to help fix 🙂

Tweaked version of NVIDIA/cub#642 for min:

#include <algorithm>
#include <iostream>
#include <limits>
#include <cub/cub.cuh>

constexpr int N = 10;
constexpr float INF = std::numeric_limits<float>::infinity();

int main() {
  float* in;
  float* d_in;
  float* out;
  float* d_out;

  // Host input: N copies of +inf, so the true minimum is +inf.
  in = new float[N];
  out = new float[1];
  std::fill(in, in + N, INF);

  // Fixed-size scratch buffer; CUB also supports querying the required size
  // by calling Min with d_temp_storage set to nullptr.
  size_t temp_storage_bytes = 1024;
  void* d_temp_storage;
  cudaMalloc((void**)&d_temp_storage, temp_storage_bytes);

  cudaMalloc((void**)&d_in, sizeof(float) * N);
  cudaMalloc((void**)&d_out, sizeof(float) * 1);

  // Copy the input to the device, reduce, and copy the result back.
  cudaMemcpy(d_in, in, sizeof(float) * N, cudaMemcpyHostToDevice);
  cub::DeviceReduce::Min(d_temp_storage, temp_storage_bytes, d_in, d_out, N);
  cudaMemcpy(out, d_out, sizeof(float), cudaMemcpyDeviceToHost);

  std::cout << *out << std::endl;  // Expected: inf
}
% nvcc test.cu
% ./a.out
3.40282e+38

It turns out that someone else has asked the same question, and CUB documents that the initial value for min/max is not $\pm\infty$ but the largest/lowest finite value of the given type: NVIDIA/cub#662 (comment). An all-`inf` input to min therefore returns the type's maximum (3.40282e+38 for float, 1.79769313e+308 for double). This has probably been overlooked in CuPy so far. I'll send a fix.
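The effect of the initial value can be modelled in a few lines of plain Python. This is only an illustrative sketch of a min reduction seeded with a finite value versus `+inf`; it is not CUB's or CuPy's actual implementation, and `min_reduce` is a hypothetical helper name.

```python
import math
import sys
from functools import reduce


def min_reduce(values, init):
    """Fold `values` with min, starting from the initial value `init`."""
    return reduce(min, values, init)


data = [math.inf, math.inf]

# Seeded with the largest finite double: the seed itself wins against inf.
finite_init = min_reduce(data, sys.float_info.max)

# Seeded with +inf, the true identity of min over floats.
inf_init = min_reduce(data, math.inf)

print(finite_init)  # 1.7976931348623157e+308
print(inf_init)     # inf
```

The first result matches the `1.79769313e+308` reported at the top of this issue, which is consistent with the finite initial value being the cause.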