svenpeter42 / fastfilters

Old academic project for my PhD - no longer maintained by me: fast gaussian and derivative convolutional filters

NaNs in eigenvalue 3d code with -ffast-math

svenpeter42 opened this issue

Steps to reproduce, by @stuarteberg: ilastik/ilastik#1477 (comment)
Happens to me now as well.

With feenableexcept(FE_DIVBYZERO | FE_INVALID | FE_OVERFLOW); enabled, the first FP exception is raised in the 3D eigenvalue computation. This is possibly because the results of the previous convolutions are already wrong, or because the eigenvalue (EV) code has some instabilities when used with relaxed FP rules.

The bug disappears when I disable -ffast-math, at the cost of a significant speed penalty.

Program received signal SIGFPE, Arithmetic exception.
_ev3d_avx2 (a00=0xc22f80, a01=0xc25c50, a02=0xc24d60, a11=0xc00870, a12=0xc23e70, a22=0xbd25b0, ev0=<optimized out>, ev1=<optimized out>, ev2=<optimized out>, len=<optimized out>) at /home/speter/miniconda3/conda-bld/fastfilters_1507908730174/work/build_conda/linalg_avx2.avx2.c:78
78	        q = _mm256_min_ps(q, zero);
(gdb) bt
#0  _ev3d_avx2 (a00=0xc22f80, a01=0xc25c50, a02=0xc24d60, a11=0xc00870, a12=0xc23e70, a22=0xbd25b0, ev0=<optimized out>, ev1=<optimized out>, ev2=<optimized out>, len=<optimized out>)
    at /home/speter/miniconda3/conda-bld/fastfilters_1507908730174/work/build_conda/linalg_avx2.avx2.c:78
#1  0x00007fffe7bbb301 in (anonymous namespace)::filter_ev_3d_binding<(anonymous namespace)::ConvolveST> (input=..., fn=...) from /home/speter/miniconda3/envs/fastfilter-dev/lib/python3.5/site-packages/fastfilters/core.cpython-35m-x86_64-linux-gnu.so
#2  0x00007fffe7bb7424 in (anonymous namespace)::<lambda(pybind11::array_t<float, 17>&, double, double, float)>::operator()(pybind11::array_t<float, 17> &, double, double, float) const (__closure=0xbcec08, input=..., E#0=1, E#1=0.5, window_ratio=2)
   from /home/speter/miniconda3/envs/fastfilter-dev/lib/python3.5/site-packages/fastfilters/core.cpython-35m-x86_64-linux-gnu.so
#3  0x00007fffe7bc37c6 in pybind11::detail::type_caster<std::tuple<pybind11::array_t<float, 17>&, double, double, float>, void>::call<pybind11::array_t<float>, (anonymous namespace)::bind2d3d_ev(pybind11::module&, std::__cxx11::string) [with ConvolveFunctor = (anonymous namespace)::ConvolveST; args = {double, double}; std::__cxx11::string = std::__cxx11::basic_string<char>]::<lambda(pybind11::array_t<float, 17>&, double, double, float)>&, 0ul, 1ul, 2ul, 3ul>((anonymous namespace)::<lambda(pybind11::array_t<float, 17>&, double, double, float)> &, pybind11::detail::index_sequence<0ul, 1ul, 2ul, 3ul>) (this=0x7fffffffcf70, f=...) from /home/speter/miniconda3/envs/fastfilter-dev/lib/python3.5/site-packages/fastfilters/core.cpython-35m-x86_64-linux-gnu.so
#4  0x00007fffe7bc2a8f in pybind11::detail::type_caster<std::tuple<pybind11::array_t<float, 17>&, double, double, float>, void>::call<pybind11::array_t<float>, (anonymous namespace)::bind2d3d_ev(pybind11::module&, std::__cxx11::string) [with ConvolveFunctor = (anonymous namespace)::ConvolveST; args = {double, double}; std::__cxx11::string = std::__cxx11::basic_string<char>]::<lambda(pybind11::array_t<float, 17>&, double, double, float)>&>((anonymous namespace)::<lambda(pybind11::array_t<float, 17>&, double, double, float)> &) (this=0x7fffffffcf70, f=...)
   from /home/speter/miniconda3/envs/fastfilter-dev/lib/python3.5/site-packages/fastfilters/core.cpython-35m-x86_64-linux-gnu.so
#5  0x00007fffe7bc14b4 in pybind11::cpp_function::<lambda(pybind11::detail::function_record*, pybind11::handle, pybind11::handle, pybind11::handle)>::operator()(pybind11::detail::function_record *, pybind11::handle, pybind11::handle, pybind11::handle) const (__closure=0x0, 
    rec=0xbcebd0, args=..., kwargs=..., parent=...) from /home/speter/miniconda3/envs/fastfilter-dev/lib/python3.5/site-packages/fastfilters/core.cpython-35m-x86_64-linux-gnu.so
#6  0x00007fffe7bc15e5 in pybind11::cpp_function::<lambda(pybind11::detail::function_record*, pybind11::handle, pybind11::handle, pybind11::handle)>::_FUN(pybind11::detail::function_record *, pybind11::handle, pybind11::handle, pybind11::handle) ()
   from /home/speter/miniconda3/envs/fastfilter-dev/lib/python3.5/site-packages/fastfilters/core.cpython-35m-x86_64-linux-gnu.so
#7  0x00007fffe7bc928c in pybind11::cpp_function::dispatcher (self=0x7fffec4cf600, args=0x7fffe75d7818, kwargs=0x0) from /home/speter/miniconda3/envs/fastfilter-dev/lib/python3.5/site-packages/fastfilters/core.cpython-35m-x86_64-linux-gnu.so
#8  0x00007ffff79a0dd1 in PyCFunction_Call (func=0x7fffe9122708, args=0x7fffe75d7818, kwds=<optimized out>) at Objects/methodobject.c:98
#9  0x00007ffff7a294a6 in call_function (oparg=<optimized out>, pp_stack=0x7fffffffd3d8) at Python/ceval.c:4720
#10 PyEval_EvalFrameEx (f=<optimized out>, throwflag=<optimized out>) at Python/ceval.c:3251
#11 0x00007ffff7a29fc9 in _PyEval_EvalCodeWithName (_co=<optimized out>, globals=<optimized out>, locals=<optimized out>, args=<optimized out>, argcount=4, kws=0x7ffff7f79060, kwcount=0, defs=0x7ffff1003e60, defcount=1, kwdefs=0x0, closure=0x0, name=0x0, qualname=0x0)
    at Python/ceval.c:4033
#12 0x00007ffff7a2a158 in PyEval_EvalCodeEx (_co=<optimized out>, globals=<optimized out>, locals=<optimized out>, args=<optimized out>, argcount=<optimized out>, kws=<optimized out>, kwcount=0, defs=0x7ffff1003e60, defcount=1, kwdefs=0x0, closure=0x0) at Python/ceval.c:4054
#13 0x00007ffff797ec91 in function_call (func=0x7fffe91299d8, arg=0x7fffe9116f98, kw=0x7ffff7edd848) at Objects/funcobject.c:627
#14 0x00007ffff794b4c6 in PyObject_Call (func=0x7fffe91299d8, arg=<optimized out>, kw=<optimized out>) at Objects/abstract.c:2166
#15 0x00007ffff7a26286 in ext_do_call (nk=-384733288, na=1, flags=<optimized out>, pp_stack=0x7fffffffd728, func=0x7fffe91299d8) at Python/ceval.c:5049
#16 PyEval_EvalFrameEx (f=<optimized out>, throwflag=<optimized out>) at Python/ceval.c:3290
#17 0x00007ffff7a29fc9 in _PyEval_EvalCodeWithName (_co=<optimized out>, globals=<optimized out>, locals=<optimized out>, args=<optimized out>, argcount=4, kws=0x7ffff7f719c8, kwcount=0, defs=0x0, defcount=0, kwdefs=0x0, closure=0x7fffee902978, name=0x7ffff0bc3370, 
    qualname=0x7ffff0ff7fa8) at Python/ceval.c:4033
#18 0x00007ffff7a2813d in fast_function (nk=<optimized out>, na=4, n=<optimized out>, pp_stack=0x7fffffffd948, func=0x7fffe9129a60) at Python/ceval.c:4828
#19 call_function (oparg=<optimized out>, pp_stack=0x7fffffffd948) at Python/ceval.c:4745
#20 PyEval_EvalFrameEx (f=<optimized out>, throwflag=<optimized out>) at Python/ceval.c:3251
#21 0x00007ffff7a29fc9 in _PyEval_EvalCodeWithName (_co=<optimized out>, globals=<optimized out>, locals=<optimized out>, args=<optimized out>, argcount=0, kws=0x0, kwcount=0, defs=0x0, defcount=0, kwdefs=0x0, closure=0x0, name=0x0, qualname=0x0) at Python/ceval.c:4033
#22 0x00007ffff7a2a158 in PyEval_EvalCodeEx (_co=<optimized out>, globals=<optimized out>, locals=<optimized out>, args=<optimized out>, argcount=<optimized out>, kws=<optimized out>, kwcount=0, defs=0x0, defcount=0, kwdefs=0x0, closure=0x0) at Python/ceval.c:4054
#23 0x00007ffff7a2a19b in PyEval_EvalCode (co=<optimized out>, globals=<optimized out>, locals=<optimized out>) at Python/ceval.c:777
#24 0x00007ffff7a4f410 in run_mod (arena=0x6611b0, flags=0x7fffffffdc90, locals=0x7ffff7f27208, globals=0x7ffff7f27208, filename=0x7ffff7e4e260, mod=0x6c0de0) at Python/pythonrun.c:982
#25 PyRun_FileExFlags (fp=0x660f80, filename_str=<optimized out>, start=<optimized out>, globals=0x7ffff7f27208, locals=0x7ffff7f27208, closeit=<optimized out>, flags=0x7fffffffdc90) at Python/pythonrun.c:935
#26 0x00007ffff7a50a03 in PyRun_SimpleFileExFlags (fp=0x660f80, filename=<optimized out>, closeit=1, flags=0x7fffffffdc90) at Python/pythonrun.c:402
#27 0x00007ffff7a6bce7 in run_file (p_cf=0x7fffffffdc90, filename=0x603320 L"blah.py", fp=0x660f80) at Modules/main.c:318
#28 Py_Main (argc=<optimized out>, argv=<optimized out>) at Modules/main.c:769
#29 0x0000000000400bbd in main (argc=2, argv=0x7fffffffde08) at ./Programs/python.c:65

Also happens when using the non-SIMD linalg code.

The problem seems to be that aDiv3 * aDiv3 * aDiv3 evaluates to -nan:
https://github.com/svenpeter42/fastfilters/blob/master/src/library/linalg.c#L146