nnstreamer / nntrainer

NNtrainer is a software framework for training neural network models on devices.


Bug in `max_abs()` function in FP16 Tensor

skykongkong8 opened this issue

I found an error in `Tensor::max_abs()` with the FP16 data type.
I strongly believe this issue is related to `isamax()` in `blas_neon.cpp`.
You can reproduce the error with the test case (TC) below.

TEST(nntrainer_Tensor, max_abs_768) {
  nntrainer::TensorDim::TensorType t_type_nchw_fp16 = {
    nntrainer::Tformat::NCHW, nntrainer::Tdatatype::FP16};

  nntrainer::TensorDim::TensorType t_type_nchw_fp32 = {
    nntrainer::Tformat::NCHW, nntrainer::Tdatatype::FP32};

  // 768 x 768 = 589824 elements in total.
  size_t batch = 1;
  size_t channel = 1;
  size_t height = 768;
  size_t width = 768;

  nntrainer::Tensor input(
    nntrainer::TensorDim(batch, channel, height, width, t_type_nchw_fp16));

  nntrainer::Tensor input_fp32(
    nntrainer::TensorDim(batch, channel, height, width, t_type_nchw_fp32));

  const float alpha = 1e-1;
  const int MOD = 10;

  // Fill both tensors with the same pattern of values in [0, 0.9].
  GEN_TEST_INPUT(input, ((k * l * (batch * height * channel) +
                          l * (batch * height) + k * (width) + l + 1) %
                         MOD) *
                          alpha);
  GEN_TEST_INPUT(input_fp32, ((k * l * (batch * height * channel) +
                               l * (batch * height) + k * (width) + l + 1) %
                              MOD) *
                               alpha);

  __fp16 result_neon = input.max_abs();
  float result_fp32 = input_fp32.max_abs();

  float absErrorNeon = std::abs(result_neon - result_fp32);

  const float epsilon = 1e-3;

  EXPECT_EQ(result_neon, result_fp32);       // may differ slightly due to FP16 rounding,
  EXPECT_IN_RANGE(absErrorNeon, 0, epsilon); // but should stay within this range
}

:octocat: cibot: Thank you for posting issue #2593. The person in charge will reply soon.

Could you please check? @s-debadri

Reason for the failing test case:
`uint16_t` was used to store indices in `isamax()`. This causes unexpected behaviour when the total number of elements in the tensor exceeds the `uint16_t` range of 0 - 65535, as in the test case above, which has 768 * 768 = 589824 elements.
Will add a patch to resolve this.
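
For anyone triaging this, below is a minimal, self-contained scalar sketch of the failure mode. It is not the actual `blas_neon.cpp` code; the function `isamax_like` and all variable names are made up for illustration. The point is that once the running best index is stored in a `uint16_t`, it silently wraps past 65535, so the routine can end up reporting the absolute value of the wrong element.

```cpp
// Hypothetical illustration only -- not the nntrainer implementation.
#include <cassert>
#include <cmath>
#include <cstdint>
#include <vector>

// isamax-style scalar loop; the type used to remember the best index is a
// template parameter so the two behaviours can be compared side by side.
template <typename IndexT>
size_t isamax_like(const std::vector<float> &v) {
  IndexT best_idx = 0;
  for (size_t i = 1; i < v.size(); ++i) {
    if (std::fabs(v[i]) > std::fabs(v[best_idx]))
      best_idx = static_cast<IndexT>(i); // silently truncated when IndexT is uint16_t
  }
  return static_cast<size_t>(best_idx);
}

int main() {
  // 768 * 768 = 589824 elements, matching the failing test case.
  std::vector<float> v(768 * 768, 0.1f);
  v[589000] = 9.0f; // the true maximum lives far beyond index 65535

  size_t idx16 = isamax_like<uint16_t>(v); // wrapped index -> wrong element
  size_t idx32 = isamax_like<uint32_t>(v); // wide enough for this tensor

  assert(idx32 == 589000 && std::fabs(v[idx32]) == 9.0f);
  assert(std::fabs(v[idx16]) != 9.0f); // the FP16-path bug in miniature
  return 0;
}
```

Under that assumption, widening the index type used for tracking (e.g. to `uint32_t` or `size_t`) should make the 768 x 768 case agree with the FP32 path.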