Move to better test infrastructure
jdemel opened this issue · comments
Johannes Demel commented
In a lot of issues and PRs we discuss problems with our current tests.
We need to discuss a way forward to improve this situation.
One option would be to introduce gtest. We would write specific tests for some kernels first and adopt an approach where we slowly move to the new system.
Johannes Demel commented
I did some tests with gtest:
https://github.com/jdemel/volk/tree/newtest
At the moment, there are quite a few areas where this can be improved.
- Integration into ctest
- Test output should go into the log instead of the default output
- Copy-pasted code should be reduced where possible.
Thus, this implementation is a proof of concept and open for discussion.
Clayton Smith commented
I don't have an opinion on which test framework to use, but I'll list out some things that could be improved by moving away from one-size-fits-all testing:
- No more puppets!
- Many kernels have fixed-length inputs (e.g. sum_of_poly) or outputs (e.g. dot_prod, index_max, stddev) but the current system always supplies variable-length buffers. This makes it difficult to catch buffer overruns on the fixed-length buffers.
- All buffers are padded by 5 (vlen_twiddle), apparently to help catch out-of-bounds writes and prevent fixed-length buffers from becoming too short (see above). But this prevents tools like ASAN and valgrind from catching buffer overruns, including out-of-bounds reads.
- Some kernels (e.g. index_min, index_max) only make sense for vector lengths >= 1, so length 0 should be disallowed for them.
- The current tolerance options are "relative" (to the output magnitude) and "absolute". Neither of these makes much sense for kernels like dot_prod, where the error magnitude is proportional to the vector length, and is independent of the output magnitude. (If the dot product happens to be close to zero, the relative error becomes large.)
- Kernels with rounded integer output are forced to use tolerance 1, even though very few of the floating point values have a fractional part near 0.5 (e.g. #647).
- All floating-point kernels are tested with uniformly distributed inputs in the range -1 .. +1. For some kernels (e.g. pow, sqrt) such inputs are inappropriate, resulting in bugs like #649.
- Almost all kernels are tested with the fixed scalar value 327.0, which may not be appropriate (e.g. #381).
- Special cases (e.g. 0.0) are untested, allowing bugs like #622, #701, and #730 to slip through.
- The 32fc_index_* kernels can have multiple possible correct answers, so the test framework should allow that. See #700 for more details.
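To make two of the points above concrete, here is a rough sketch of what kernel-specific checks could look like. Nothing here is existing VOLK or gtest code: dot_prod_close and is_valid_index_max are hypothetical helpers, the ulp_factor safety margin is an arbitrary choice, and treating "multiple correct answers" as exact ties in squared magnitude is only an assumption (#700 has the actual details).

```cpp
#include <algorithm>
#include <cmath>
#include <complex>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical helper: accept a dot-product result if its error stays within
// a bound that grows with the vector length and the input magnitudes,
// instead of depending on the magnitude of the output.
bool dot_prod_close(float result,
                    const std::vector<float>& a,
                    const std::vector<float>& b,
                    double ulp_factor = 4.0) // assumed safety margin
{
    if (a.size() != b.size()) {
        return false;
    }
    double reference = 0.0; // reference value accumulated in double
    double abs_sum = 0.0;   // sum of |a[i] * b[i]|
    for (std::size_t i = 0; i < a.size(); ++i) {
        const double p = static_cast<double>(a[i]) * static_cast<double>(b[i]);
        reference += p;
        abs_sum += std::fabs(p);
    }
    const double eps = std::numeric_limits<float>::epsilon();
    const double bound = ulp_factor * eps * static_cast<double>(a.size()) * abs_sum;
    return std::fabs(static_cast<double>(result) - reference) <= bound;
}

// Hypothetical helper: accept any index whose squared magnitude equals the
// maximum, so that ties do not fail the test. Length 0 is rejected.
bool is_valid_index_max(std::size_t index,
                        const std::vector<std::complex<float>>& v)
{
    if (v.empty() || index >= v.size()) {
        return false;
    }
    float max_mag_sq = 0.0f;
    for (const auto& x : v) {
        max_mag_sq = std::max(max_mag_sq, std::norm(x));
    }
    return std::norm(v[index]) == max_mag_sq;
}
```

With a check like the first one, a dot product that happens to be close to zero is still held to a bound derived from the inputs, so the relative-error blowup described above does not occur.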
Clayton Smith commented
Another problematic case: volk_32f_s32f_32f_fm_detect_32f and volk_32f_s32f_s32f_mod_range_32f both involve phase angle calculations, and tolerance checks fail if the angles being compared are -pi+epsilon and +pi-epsilon. (The difference in angle is very small, but the difference in absolute value is large.)
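One possible way around this would be to compare angles modulo 2*pi. The sketch below assumes outputs are in radians; angle_diff and angles_close are hypothetical names, not part of VOLK or of the current test code.

```cpp
#include <cmath>

// Hypothetical helper: signed difference between two angles in radians,
// wrapped into the range [-pi, pi] using std::remainder.
float angle_diff(float a, float b)
{
    const float two_pi = 6.28318530717958647692f;
    return std::remainder(a - b, two_pi);
}

// Hypothetical helper: true if the two angles are within tol radians of each
// other on the circle, even when they sit on opposite sides of +/-pi.
bool angles_close(float a, float b, float tol)
{
    return std::fabs(angle_diff(a, b)) <= tol;
}
```

For angles of -pi+0.001 and +pi-0.001, the plain difference is close to 2*pi, while the wrapped difference is about 0.002, so a tolerance check on the wrapped value would pass.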