pytest-dev / pytest-cov

Coverage plugin for pytest.

Maximum coverage in minimal time

masaccio opened this issue

Summary

Given a project where tests have been added incrementally over time and overlap significantly, I'd like to be able to generate a list of tests that achieves maximum coverage in minimal time. Clearly this is a pure code-coverage approach and doesn't guarantee that functional coverage is maintained, but it could be a good way to identify redundant tests.

I have a quick proof-of-concept that's not integrated into pytest that:

  • runs all tests with pytest-cov and --durations=0
  • processes CoverageData and the output of --durations=0 to generate a list of arcs/lines that are covered for each context
  • reduces the list of subsets using a set cover algorithm
  • optionally applies a coverage 'confidence' in the event you want a faster smoke test that has reduced coverage (say 95%).
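The reduction step in the list above can be pictured with a small sketch. This is not the proof-of-concept code from the gist; it is a generic greedy approximation to weighted set cover, assuming the per-test coverage and durations have already been collected into plain dictionaries. The function name `select_tests` and its parameters are hypothetical.

```python
def select_tests(coverage, durations, confidence=1.0):
    """Greedy weighted set cover (a sketch, not the gist's implementation).

    coverage:   dict mapping test id -> set of covered lines/arcs (assumed input shape)
    durations:  dict mapping test id -> runtime in seconds (assumed input shape)
    confidence: fraction of the total coverage that must be retained, e.g. 0.95
    """
    universe = set()
    for items in coverage.values():
        universe |= items
    target = confidence * len(universe)

    covered, chosen = set(), []
    remaining = dict(coverage)
    while remaining and len(covered) < target:
        def gain_per_second(test):
            # Newly covered items per second; the floor avoids division by zero.
            gain = len(remaining[test] - covered)
            return gain / max(durations.get(test, 0.0), 1e-6)

        # sorted() makes tie-breaking deterministic (alphabetical by test id).
        best = max(sorted(remaining), key=gain_per_second)
        if not remaining[best] - covered:
            break  # no remaining test adds anything new
        covered |= remaining[best]
        chosen.append(best)
        del remaining[best]
    return chosen


if __name__ == "__main__":
    cov = {"test_a": {1, 2, 3}, "test_b": {1, 2}, "test_c": {3, 4}, "test_d": {4}}
    secs = {"test_a": 1.0, "test_b": 1.0, "test_c": 1.0, "test_d": 2.0}
    print(select_tests(cov, secs))       # full coverage
    print(select_tests(cov, secs, 0.5))  # reduced-coverage smoke test
```

A greedy choice like this is only an approximation: it is not guaranteed to find the smallest or fastest possible set of tests, but it is simple and usually close enough for pruning overlapping tests.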

I am happy to work on a PR and include tests, but before I do I wanted to gauge whether this fits your project's goals. If you'd rather not have this feature, I can always create a separate plugin for people who want it.

It sounds interesting! Do you have a link to your proof of concept?

Sure: it's pretty basic code in this gist, and a lash-up in terms of integration, but it tells me this is worth looking at for a broader set of packages using pytest.

It works as expected on the tests I was trying to optimise:

# Reports 100% coverage from 140 tests:
poetry run pytest --cov=src/numbers_parser \
    --cov-report=term-missing:skip-covered \
    --cov-context=test \
    --durations=0 -n logical | tee duration.txt
# Generate the cover set
poetry run python3 maxcov.py > maxcov.txt
# Reports 100% coverage from 91 tests:
poetry run pytest --cov=src/numbers_parser \
    --cov-report=term-missing:skip-covered \
    --cov-context=test \
    -n logical `cat maxcov.txt`

The reduced run takes a bit over half as long as the full run. Neither takes very long for this package anyway, but still.

I think this is very interesting! I don't see a reason to add it into pytest-cov though: it operates after the entire test run. If you package it independently, it can be used by people who don't use pytest-cov.

I tried running the code and ran into a few issues: I needed -vvv to get all the durations, splitting the lines needed maxsplit=2 because my parameterized test names sometimes have spaces in them, and I was using a data file other than .coverage. In the end, my output was "none", though I'm not sure whether I was capturing the contexts correctly.
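To illustrate the maxsplit=2 point, here is a minimal sketch of parsing the --durations report, assuming each report line has the shape `<seconds>s <phase> <node id>`, where the phase is setup, call, or teardown. The function name `parse_durations` is hypothetical and is not taken from the gist.

```python
from collections import defaultdict

PHASES = {"setup", "call", "teardown"}


def parse_durations(text):
    """Sum setup/call/teardown times per test id from --durations output.

    Assumes lines shaped like '0.12s call     tests/test_x.py::test_y[a b]'.
    """
    totals = defaultdict(float)
    for line in text.splitlines():
        # maxsplit=2 keeps spaces inside parameterized test ids intact.
        parts = line.split(maxsplit=2)
        if len(parts) != 3 or parts[1] not in PHASES:
            continue  # headers, separators, summary lines
        seconds, _phase, nodeid = parts
        if not seconds.endswith("s"):
            continue
        try:
            totals[nodeid.strip()] += float(seconds[:-1])
        except ValueError:
            continue  # first column was not a number
    return dict(totals)


if __name__ == "__main__":
    sample = (
        "===== slowest durations =====\n"
        "0.50s call     tests/test_a.py::test_x[a b]\n"
        "0.25s setup    tests/test_a.py::test_x[a b]\n"
    )
    print(parse_durations(sample))
```

Summing the phases per test id is a design choice: a test with an expensive fixture is slow to include even if its call phase is quick.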

I will close this issue since I have started development on https://pypi.org/project/pytest-maxcov/. I still need to unpick how well this can work with contexts, given that measurement needs --cov-context=test.
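For readers unfamiliar with how contexts feed into this, the sketch below inverts a "line number to contexts" mapping into a "test to covered lines" mapping. It assumes the input is a dictionary of file name to `{lineno: [context, ...]}`, which is the shape that coverage.py's `CoverageData.contexts_by_lineno(filename)` returns for each file. It also assumes that contexts recorded with --cov-context=test look like `<node id>|<phase>`; that format is an assumption here, and the function name `lines_by_test` is hypothetical.

```python
from collections import defaultdict

PHASES = {"setup", "run", "teardown"}  # assumed context suffixes


def lines_by_test(contexts_by_file):
    """Map each test id to the set of (filename, lineno) pairs it covered.

    contexts_by_file: dict of filename -> {lineno: [context, ...]} (assumed shape).
    """
    per_test = defaultdict(set)
    for filename, by_lineno in contexts_by_file.items():
        for lineno, contexts in by_lineno.items():
            for context in contexts:
                if not context:
                    continue  # empty context: line ran outside any test
                head, sep, tail = context.rpartition("|")
                # Strip the phase suffix only when it is one we recognise.
                nodeid = head if sep and tail in PHASES else context
                per_test[nodeid].add((filename, lineno))
    return dict(per_test)


if __name__ == "__main__":
    data = {
        "src/m.py": {
            1: ["", "t.py::test_a|run"],
            2: ["t.py::test_a|run", "t.py::test_b|setup"],
        }
    }
    print(lines_by_test(data))
```

If no contexts were recorded, every list holds only the empty context and the result is an empty dictionary, which is one way a tool built on this data could end up with nothing to report.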

@nedbat can you think of a reason why running pytest-cov in a subprocess in a different directory would interfere with coverage in the parent process? I am seeing pretty much zero coverage in the plugin's pytest run and I'm wondering whether .coverage is getting clobbered.

@masaccio I'm interested to see where maxcov goes. I'm sorry, but I don't know enough about the internals of pytest-cov to know whether there's interference like you might be seeing.