Potential false positives in negative assertions of data equality

antimora opened this issue 3 months ago

Dilshod Tadjibaev commented 3 months ago

Description

We have identified an issue with how our tensor operation tests handle data types: the data type used to build test data does not always match the data type of the test backend. This inconsistency can lead to false positives in our test suite, potentially masking real issues in our tensor operations.

Current Behavior

Some tests are using f64 data by default, while the test backend uses f32.
The assert_eq() method checks for data type equality, which is good for positive assertions but can lead to false positives in negative assertions.
When using assert_eq(&expected, false), we can't be sure what specifically failed (data values, dimensions, or data type).
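
The three points above can be illustrated with a small stand-alone sketch. It does not use Burn: `ToyData`, `ToyDType`, `Mismatch`, and `first_mismatch` are hypothetical names made up for this illustration, and values are stored as f64 purely to keep the comparison simple. The point is only that a bare "not equal" check succeeds when the data types differ, even if the values are identical.

```rust
// Hypothetical stand-ins for illustration; these are NOT Burn's types.
#[derive(Debug, Clone, Copy, PartialEq)]
enum ToyDType {
    F32,
    F64,
}

// The possible reasons two pieces of data compare as different.
#[derive(Debug, PartialEq)]
enum Mismatch {
    DType,
    Shape,
    Values,
}

#[derive(Debug, Clone)]
struct ToyData {
    dtype: ToyDType,
    shape: Vec<usize>,
    values: Vec<f64>, // stored as f64 only to simplify the sketch
}

impl ToyData {
    // Builds one-dimensional data tagged with the given dtype.
    fn new(dtype: ToyDType, values: &[f64]) -> Self {
        Self {
            dtype,
            shape: vec![values.len()],
            values: values.to_vec(),
        }
    }

    // Returns the first reason the two differ, or None if they fully match.
    fn first_mismatch(&self, other: &Self) -> Option<Mismatch> {
        if self.dtype != other.dtype {
            Some(Mismatch::DType)
        } else if self.shape != other.shape {
            Some(Mismatch::Shape)
        } else if self.values != other.values {
            Some(Mismatch::Values)
        } else {
            None
        }
    }
}

#[test]
fn negative_assertion_passes_for_the_wrong_reason() {
    let output = ToyData::new(ToyDType::F32, &[0.0, 1.0, 2.0]);
    let expected = ToyData::new(ToyDType::F64, &[0.0, 1.0, 2.0]);

    // A bare "they are not equal" check passes here...
    assert!(output.first_mismatch(&expected).is_some());
    // ...but only because of the dtype: the values are identical.
    assert_eq!(output.first_mismatch(&expected), Some(Mismatch::DType));
    assert_eq!(output.values, expected.values);
}
```

Reporting the reason for the mismatch, rather than a single boolean, is one possible way to make a negative assertion state what it actually expects to differ.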

Example

In crates/burn-tensor/src/tests/ops/slice.rs:

#[test]
fn clamp_when_slice_exceeds_dimension() {
    let data = TensorData::from([0.0, 1.0, 2.0]);  // This uses f64 by default
    let tensor = Tensor::<TestBackend, 1>::from_data(data.clone(), &Default::default());

    let output = tensor.slice([0..4]);

    output.into_data().assert_eq(&data, true);
}

This test fails with:

assertion `left == right` failed: Data types differ (F32 != F64)
  left: F32
 right: F64

Expected Behavior

Consistent use of data types across tests and backends.
Ability to perform negative assertions without risk of false positives due to type mismatches.

Proposed Solutions

Standardize on a single data type (e.g., f32) across all tests, or explicitly specify the data type in each test.
Consider adding separate methods for asserting value equality and type equality.

Additional Context

This issue was discovered while working on improving the slice operation.
The problem is more pronounced after the TensorData refactor, which made the struct no longer generic over the element type.
We previously had stricter type checking in tests, but it was relaxed during a code review. We may need to revisit this decision.

Many tests use assert_eq(&expected, false), which can give a false positive: for example, the values are equal and only the data type differs, when what the test actually wants to establish is that the values differ.

Guillaume Lagrange · Answer 1 · Mon Jul 08 2024 20:38:50 GMT+0800 (China Standard Time)

@nathanielsimard, remember when I first made the tests strict and had to convert every data type for a correct comparison, and then we decided to make them less strict? Where do we want to go from here?

Guillaume Lagrange · Answer 2 · Mon Jul 08 2024 21:01:50 GMT+0800 (China Standard Time)

I think the first proposed solution is too restrictive, because not all backends necessarily support the same dtypes. And we might want to run tests in f16 at some point.

But adding separate methods for testing value equality and complete equality (w/ dtype) is a good idea.

Maybe something like .assert_values_eq(other) which does not test dtype and .assert_eq(other) which tests both values and dtype. We could also have inequality methods (so the same but _ne instead of _eq) for shorthand.
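
A rough sketch of what that split could look like, written against a toy type rather than Burn's TensorData. The method names assert_values_eq and assert_eq come from the suggestion above; `ToyData`, `ToyDType`, the panic messages, and the exact behavior of assert_values_ne are assumptions made for illustration only.

```rust
// Hypothetical stand-ins for illustration; these are NOT Burn's types.
#[derive(Debug, Clone, Copy, PartialEq)]
enum ToyDType {
    F32,
    F64,
}

#[derive(Debug, Clone)]
struct ToyData {
    dtype: ToyDType,
    values: Vec<f64>, // stored as f64 only to simplify the sketch
}

impl ToyData {
    fn new(dtype: ToyDType, values: &[f64]) -> Self {
        Self {
            dtype,
            values: values.to_vec(),
        }
    }

    // Compares values only; the dtype is deliberately ignored.
    fn assert_values_eq(&self, other: &Self) {
        assert!(
            self.values == other.values,
            "Values differ: {:?} != {:?}",
            self.values,
            other.values
        );
    }

    // Compares both dtype and values.
    fn assert_eq(&self, other: &Self) {
        assert!(
            self.dtype == other.dtype,
            "Data types differ ({:?} != {:?})",
            self.dtype,
            other.dtype
        );
        self.assert_values_eq(other);
    }

    // Negative counterpart of assert_values_eq: a dtype mismatch alone
    // cannot make it pass, so identical values always panic.
    fn assert_values_ne(&self, other: &Self) {
        assert!(
            self.values != other.values,
            "Values are unexpectedly equal: {:?}",
            self.values
        );
    }
}

#[test]
fn values_match_across_dtypes() {
    let output = ToyData::new(ToyDType::F32, &[0.0, 1.0, 2.0]);
    let expected = ToyData::new(ToyDType::F64, &[0.0, 1.0, 2.0]);

    // Passes: only the values are compared.
    output.assert_values_eq(&expected);
    // By contrast, output.assert_eq(&expected) would panic on the dtype,
    // and output.assert_values_ne(&expected) would panic on the equal values.
}
```

With this split, a negative test states explicitly that it is about values, so a dtype mismatch can no longer satisfy it by accident.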