[BUG] Weird core dump for uninitialized array (no data, undefined size)
liuy opened this issue · comments
I'm trying the latest stable ArrayFire 3.8.3, installed from the official .sh installer, on Ubuntu 22.04 LTS.
I tried all the backends, and they all behaved the same. What am I doing wrong in handling an uninitialized array as a private data member of a class? It looks like a bug to me, but I can work around the core dump by simply calling af_print() beforehand. The full class and code snippet are as follows:
=== .h file ===
class Tensor {
private:
array data; // <---- here I defined an uninitialized array
bool data_computed = false;
Tensor *lhs;
Tensor *rhs;
forward_fn_t forward_fn;
backward_fn_t backward_fn;
public:
Tensor(array &a)
{
data = a;
data_computed = true;
}
Tensor(Tensor &a, Tensor &b, forward_fn_t ffn, backward_fn_t bfn)
{
lhs = &a;
rhs = &b;
forward_fn = ffn;
backward_fn = bfn;
}
void forward(void);
Tensor matmul(Tensor &t);
Tensor operator+(Tensor &t);
Tensor operator-(Tensor &t);
inline void print()
{
af_print(data);
}
};
=== .cpp file ===
void Tensor::forward(void)
{
if (data_computed)
return;
if (!lhs->data_computed)
lhs->forward();
if (!rhs->data_computed)
rhs->forward();
data = forward_fn(lhs->data, rhs->data);
data_computed = true;
}
static array add(array &a, array &b)
{
return a + b;
}
static array sub(array &a, array &b)
{
return a - b;
}
static array mat_mul(array &a, array &b)
{
return af::matmul(a, b);
}
Tensor Tensor::operator+(Tensor &t)
{
return Tensor(*this, t, add, NULL);
}
Tensor Tensor::operator-(Tensor &t)
{
return Tensor(*this, t, sub, NULL);
}
=== main.cpp ===
int main(int argc, char* argv[])
{
af::info();
float i[] = {1.0, 1.0, 1.0, 1.0};
array A(2, 2, i);
float j[] = {2.0, 2.0, 2.0, 2.0};
array B(2, 2, j);
float k[] = {3.0, 3.0, 3.0, 2.0};
array C(2, 2, k);
Tensor c(C);
Tensor a(A);
Tensor b(B);
Tensor d = a + b - c;
// b.print(); // <---------------- This is the magic line; without it, I get a core dump
d.forward();
d.print();
return 0;
}
================
Without the magic line, I get this:
=====output============
ArrayFire v3.8.3 (CPU, 64-bit Linux, build 987d567)
terminate called after throwing an instance of 'AfError'
what(): Input Array not created on current device
Aborted (core dumped)
=================
With the magic line, it runs without any error:
=======output======
ArrayFire v3.8.3 (CPU, 64-bit Linux, build 987d567)
[0] Intel: Intel(R) Xeon(R) CPU E5-2696 v3 @ 2.30GHzdata
[2 2 1 1]
2.0000 2.0000
2.0000 2.0000
data
[2 2 1 1]
0.0000 0.0000
0.0000 1.0000
Hi @liuy,
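I think the problem is that the temporary Tensor returned from the operator+ function is destroyed at the end of the statement Tensor d = a + b - c;, but the operator- function has already stored a pointer to it in d. That pointer is dangling by the time d.forward() runs, which causes a failure when the sub function is called.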
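One possible way to avoid the dangling pointer is to let each node of the expression graph share ownership of its operands instead of holding raw pointers. The following is only a minimal sketch under some assumptions: `Value` is a plain `double` standing in for `af::array` so that it compiles without ArrayFire, and `Node`, `ForwardFn`, `value()` and `demo()` are hypothetical names that do not appear in the original code.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <utility>

// Assumption: a double stands in for af::array so the sketch builds without ArrayFire.
using Value = double;
using ForwardFn = std::function<Value(const Value&, const Value&)>;

// Hypothetical graph node. Operands are held through shared_ptr, so the node
// created by a temporary such as (a + b) stays alive while a parent refers to it.
struct Node {
    Value data{};
    bool computed = false;
    std::shared_ptr<Node> lhs;
    std::shared_ptr<Node> rhs;
    ForwardFn fn;
};

class Tensor {
public:
    // Leaf tensor: the value is already known.
    explicit Tensor(Value v) : node_(std::make_shared<Node>()) {
        node_->data = v;
        node_->computed = true;
    }

    Tensor operator+(const Tensor& t) const {
        return Tensor(*this, t, [](const Value& x, const Value& y) { return x + y; });
    }
    Tensor operator-(const Tensor& t) const {
        return Tensor(*this, t, [](const Value& x, const Value& y) { return x - y; });
    }

    // Lazily evaluate the graph, operands first.
    void forward() { eval(*node_); }
    Value value() const { return node_->data; }

private:
    // Interior tensor: shares ownership of both operand nodes.
    Tensor(const Tensor& a, const Tensor& b, ForwardFn fn)
        : node_(std::make_shared<Node>()) {
        node_->lhs = a.node_;
        node_->rhs = b.node_;
        node_->fn = std::move(fn);
    }

    static void eval(Node& n) {
        if (n.computed) return;
        assert(n.lhs && n.rhs && n.fn);  // only leaf nodes lack operands
        eval(*n.lhs);
        eval(*n.rhs);
        n.data = n.fn(n.lhs->data, n.rhs->data);
        n.computed = true;
    }

    std::shared_ptr<Node> node_;
};

// Mirrors `Tensor d = a + b - c;` from the report, with scalar stand-ins.
Value demo() {
    Tensor a(1.0), b(2.0), c(3.0);
    Tensor d = a + b - c;  // the (a + b) node is kept alive by d's node
    d.forward();
    return d.value();
}
```

With this layout the temporary Tensor object returned by operator+ is still destroyed at the end of the statement, but the node it pointed to survives because d's node holds a reference to it. Swapping `Value` back to `af::array` should not change the ownership logic, though that has not been tested here.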
There is a separate issue that needs to be addressed in ArrayFire: some of the arithmetic functions are missing a try/catch. Thanks for bringing this to our attention.
Closing the issue, but I will be fixing the reason for the segfault. An exception should be caught in the C API before the program is terminated.
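For readers unfamiliar with the pattern mentioned here, this is a generic sketch of catching C++ exceptions at a C API boundary and turning them into error codes. It is not ArrayFire's actual implementation: `demo_err`, `checked_sub` and `demo_sub` are hypothetical names made up for illustration, and the real `af_err` values are not reproduced.

```cpp
#include <stdexcept>

// Hypothetical error codes for this sketch only.
enum demo_err { DEMO_SUCCESS = 0, DEMO_ERR_ARG = 1, DEMO_ERR_UNKNOWN = 2 };

// Hypothetical C++ implementation that reports bad input by throwing.
static double checked_sub(const double* a, const double* b) {
    if (!a || !b) throw std::invalid_argument("null input");
    return *a - *b;
}

// C-style entry point. Every exception is caught here and mapped to an error
// code, so nothing propagates across the C boundary and reaches std::terminate.
extern "C" demo_err demo_sub(double* out, const double* a, const double* b) noexcept {
    try {
        if (!out) return DEMO_ERR_ARG;
        *out = checked_sub(a, b);
        return DEMO_SUCCESS;
    } catch (const std::invalid_argument&) {
        return DEMO_ERR_ARG;
    } catch (...) {
        return DEMO_ERR_UNKNOWN;
    }
}
```

A C++ wrapper layered on top of such a function can then inspect the returned code and throw an exception that the caller is able to catch, instead of the process aborting.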