Potential false positives in negative assertions of data equality
antimora opened this issue · comments
Description
We have identified an issue with our tensor operation tests, particularly in how they handle data types. This inconsistency can lead to false positives in our test suite, potentially masking real issues in our tensor operations.
Current Behavior
- Some tests are using
f64
data by default, while the test backend usesf32
. - The
assert_eq()
method checks for data type equality, which is good for positive assertions but can lead to false positives in negative assertions. - When using
assert_eq(&expected, false)
, we can't be sure what specifically failed (data values, dimensions, or data type).
Example
In crates/burn-tensor/src/tests/ops/slice.rs
:
#[test]
fn clamp_when_slice_exceeds_dimension() {
let data = TensorData::from([0.0, 1.0, 2.0]); // This uses f64 by default
let tensor = Tensor::<TestBackend, 1>::from_data(data.clone(), &Default::default());
let output = tensor.slice([0..4]);
output.into_data().assert_eq(&data, true);
}
This test fails with:
assertion `left == right` failed: Data types differ (F32 != F64)
left: F32
right: F64
Expected Behavior
- Consistent use of data types across tests and backends.
- Ability to perform negative assertions without risk of false positives due to type mismatches.
Proposed Solutions
- Standardize on a single data type (e.g.,
f32
) across all tests, or explicitly specify the data type in each test. - Consider adding separate methods for asserting value equality and type equality.
Additional Context
- This issue was discovered while working on improving the slice operation.
- The problem is more pronounced after the
TensorData
refactor, which made the struct no longer generic over the element type. - We previously had stricter type checking in tests, but it was relaxed during a code review. We may need to revisit this decision.
There are many tests that use assert_eq(&expected, false)
which can give a false positive assertion, e.g. the data is equal but datatype is not but actually we want the data not to be equal.
@nathanielsimard remember when I first made the tests strict and had to convert every data type for correct comparison then we decided to make it less strict? Where do we want to go from here?
I think the first proposed solution is too restrictive, because not all backends support the same dtypes necessarily. And we might want to run tests in f16 at some point.
But adding separate methods for testing value equality and complete equality (w/ dtype) is a good idea.
Maybe something like .assert_values_eq(other)
which does not test dtype and .assert_eq(other)
which tests both values and dtype. We could also have inequality methods (so the same but _ne
instead of _eq
) for shorthand.