wolneykien / hypothesis-cats

Allows classifying values generated by a Hypothesis strategy

hypothesis-cats

A Hypothesis extension which allows classifying the values generated by a Hypothesis strategy and then making complex assertions based on the categories that make up the example data given to the test.

The approach seems promising for testing validation functions and other objects whose behavior is determined by a number of independent factors.

A basic example. Let us write a simple User() class with a little bug in its validation code:

class User:
    def __init__(self, name: str, role: str = None, age: int = None):
        if not name:
            raise TypeError('Name should not be empty!')
        self.name = name

        if role:
            self.role = role
            return # <---------- BUG!

        if age is not None:
            if age <= 0:
                raise ValueError('Age should be a positive number!')
            self.age = age
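
To see the bug in action before hunting for it with tests, here is a quick manual check (the class is repeated, bug included, so the snippet runs standalone):

```python
class User:
    # The buggy class from above, repeated so this snippet is self-contained.
    def __init__(self, name: str, role: str = None, age: int = None):
        if not name:
            raise TypeError('Name should not be empty!')
        self.name = name
        if role:
            self.role = role
            return  # <---------- BUG!
        if age is not None:
            if age <= 0:
                raise ValueError('Age should be a positive number!')
            self.age = age

# A non-empty role makes __init__ return before the age check,
# so an invalid age slips through without a ValueError:
u = User('alice', role='admin', age=-5)
print(hasattr(u, 'age'))  # → False: the invalid age was silently dropped
```

With an empty role the check still works: User('alice', age=-5) raises ValueError as intended.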

The bug is that a non-empty role shadows the validation of the age. Now, let's try to find that bug using conventional techniques, even powered by Hypothesis. We start by testing the "independent" (as we may think!) error cases:

from hypothesis import given
from hypothesis.strategies import integers, text
import pytest

@given(name=text(max_size=0))
def test_name_empty(name):
    with pytest.raises(TypeError, match='^Name'):
        u = User(name)

@given(age=integers(max_value=0))
def test_wrong_age(age):
    with pytest.raises(ValueError, match='^Age'):
        u = User('test', age=age)

These tests pass as expected: each piece of the validation code works fine when invoked independently of the others. What else can we do? Let's check all known-good combinations:

@given(name=text(min_size=1), role=text(), age=integers(min_value=1))
def test_good_parms(name, role, age):
    u = User(name, role=role, age=age)

That's not a bad test: it shows there are no false-positive reactions from the validation code. But the bug is still not found... :( How could we find it? A naive approach would be to test the code over the whole range of possible parameter values. That, however, makes it necessary to distinguish the good cases from the wrong ones, which in turn involves the same (or nearly the same) logic that was used in the code we want to test! For instance, we could write:

@given(name=text(), role=text(), age=integers())
def test_all_naive(name, role, age):
    if not name: # Bad: same logic as in __init__!
        with pytest.raises(TypeError, match='^Name'):
            u = User(name, role=role, age=age)
    elif age <= 0: # Bad: same logic as in __init__!
        with pytest.raises(ValueError, match='^Age'):
            u = User(name, role=role, age=age)
    else:
        u = User(name, role=role, age=age)

Although it's possible to find the bug with the test function above, such test code is somewhat meaningless: it duplicates the logic of the code under test. How can we do better?

It would be better to mark the "good" and "bad" ranges for each parameter beforehand and then to make assertions based on which category — "good" or "bad" — the value has fallen into. Moreover, it would be nice to declare the expected errors along with the categories. With the present package that can be written as follows:

from hypothesis_cats import given_divided, with_cat_checker

@given_divided(
    name={
        'empty': {
            'raises': {
                'err': TypeError,
                'pattern': '^Name'
            },
            'values': text(max_size=0)
        },
        'non-empty': text(min_size=1)
    },
    role=text(),
    age={
        'non-positive': {
            'raises': {
                'err': ValueError,
                'pattern': '^Age',
            },
            'values': integers(max_value=0)
        },
        'positive': integers(min_value=1)
    }
)
@with_cat_checker()
def test_all_better(name, role, age, cts):
    u = User(name, role, age)

What's good about this function is that it checks good and bad parameters together, making the individual checks unnecessary. See the complete example in examples/validator.py.
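The essence of the category-based approach can also be seen without the package: pick one representative value per category, declare the expected errors next to the categories (rather than re-deriving them from the values), and check every combination. Below is a simplified, hand-rolled sketch — hypothesis-cats instead draws many values from the declared strategies, and the helper here is illustrative, not the package's code:

```python
from itertools import product

class User:
    # The buggy class from above, repeated so this snippet is self-contained.
    def __init__(self, name, role=None, age=None):
        if not name:
            raise TypeError('Name should not be empty!')
        self.name = name
        if role:
            self.role = role
            return  # <---------- BUG!
        if age is not None:
            if age <= 0:
                raise ValueError('Age should be a positive number!')
            self.age = age

# One hand-picked representative value per category.
CATS = {
    'name': {'empty': '', 'non-empty': 'alice'},
    'role': {'empty': '', 'non-empty': 'admin'},
    'age':  {'non-positive': -1, 'positive': 42},
}
# Expected errors, declared per (parameter, category) as in the test above.
RAISES = {
    ('name', 'empty'): TypeError,
    ('age', 'non-positive'): ValueError,
}

def check_all():
    failures = []
    for (ncat, name), (rcat, role), (acat, age) in product(
            CATS['name'].items(), CATS['role'].items(), CATS['age'].items()):
        # The name is validated first, so its error takes precedence.
        expected = RAISES.get(('name', ncat)) or RAISES.get(('age', acat))
        try:
            User(name, role=role, age=age)
            raised = None
        except Exception as e:
            raised = type(e)
        if raised is not expected:
            failures.append((ncat, rcat, acat))
    return failures

print(check_all())  # → [('non-empty', 'non-empty', 'non-positive')]: the bug
```

The single failing combination — non-empty name, non-empty role, non-positive age — is exactly the case where the early return skips the age check.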

It's also possible to declare relationships between the expected errors and other categories. For instance, it's possible to state that the dependence of the age validation on the role in the User class constructor is not a bug, but the desired behavior:

@given_divided(
    name={
        'empty': {
            'raises': {
                'err': TypeError,
                'pattern': '^Name'
            },
            'values': text(max_size=0)
        },
        'non-empty': text(min_size=1)
    },
    role={
        'empty': text(max_size=0),
        'non-empty': text(min_size=1)
    },
    age={
        'non-positive': {
            'raises': {
                'err': ValueError,
                'pattern': '^Age',
                 'requires': {
                     'role': "empty"
                 }
            },
            'values': integers(max_value=0)
        },
        'positive': integers(min_value=1)
    }
)

Note the new 'requires': declaration for the non-positive age category, referencing the "empty" category of role strings (role was also subdivided in order to make that category referenceable).
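The 'requires' semantics can be read as: the declared error is expected only for examples where the referenced parameter's value falls into the named category. A plain-Python sketch of that rule follows — a hypothetical helper for illustration, not the package's actual code, with the 'values' strategies omitted for brevity:

```python
def expected_errors(cats, spec):
    """Collect declared errors whose 'requires' clauses match the drawn categories.

    cats: mapping of parameter name -> category drawn for this example.
    spec: category declarations, shaped like the @given_divided arguments above.
    """
    errors = []
    for param, cat in cats.items():
        decl = spec.get(param, {}).get(cat)
        if not isinstance(decl, dict) or 'raises' not in decl:
            continue  # plain category with no declared error
        raises = decl['raises']
        requires = raises.get('requires', {})
        if all(cats.get(p) == c for p, c in requires.items()):
            errors.append(raises['err'])
    return errors

SPEC = {
    'name': {'empty': {'raises': {'err': TypeError, 'pattern': '^Name'}}},
    'age': {'non-positive': {'raises': {'err': ValueError, 'pattern': '^Age',
                                        'requires': {'role': 'empty'}}}},
}

# The age error is expected only when the role falls into its "empty" category:
print(expected_errors({'name': 'non-empty', 'role': 'empty',
                       'age': 'non-positive'}, SPEC))      # → [ValueError]
print(expected_errors({'name': 'non-empty', 'role': 'non-empty',
                       'age': 'non-positive'}, SPEC))      # → []
```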

About

Allows classifying values generated by a Hypothesis strategy

License:GNU General Public License v2.0

