faif / python-patterns

A collection of design patterns/idioms in Python

[Infrastructure] Testing outputs

gyermolenko opened this issue · comments

update on the idea of testing outputs as docstrings (e.g. with python -m doctest):
I couldn't find an easy way to generate docstrings for def main()

By "testing outputs" I mean to compare script output with ### OUTPUT ### section at the bottom of the file.
Structuring scripts like that:

def main():
    ...

if __name__ == "__main__":
    main()

allows importing main in tests to evaluate and compare outputs.

It would also be more convenient to have the ### OUTPUT ### section as a variable or docstring, so we do not need to parse the file for such comparisons.

Here are example script/test files

# example script file

def main():
    print("abc")


if __name__ == "__main__":
    main()

OUTPUT = """
abc
"""
# example test (the same for all scripts with the structure above)

from contextlib import redirect_stdout
import io

from example_script_file import main, OUTPUT

def test_output():
    f = io.StringIO()
    with redirect_stdout(f):
        main()

    real_output = f.getvalue().strip()
    expected_output = OUTPUT.strip()
    assert real_output == expected_output

@faif Hi. Can you please share your thoughts on this?

Hi,

I like this idea since it makes the outputs unit-testable.

hey @faif
I've been thinking about it a lot lately, and I keep returning to doctests.
Examples are written once and read many times. That's why I strongly believe input/output rows should interleave rather than sit in separate blocks.

Here is a basic before/after example of this idea:

CURRENT (input and then output)

def main():
    template_function(get_text, to_save=True)
    print("-" * 30)
    template_function(get_pdf, converter=convert_to_text)
    print("-" * 30)
    template_function(get_csv, to_save=True)


if __name__ == "__main__":
    main()


OUTPUT = """
Got `plain-text`
Skip conversion
[SAVE]
`plain-text` was processed
------------------------------
Got `pdf`
[CONVERT]
`pdf as text` was processed
------------------------------
Got `csv`
Skip conversion
[SAVE]
`csv` was processed
"""

PROPOSED HERE (interleaved)

def main():
    """
    >>> template_function(get_text, to_save=True)
    Got `plain-text`
    Skip conversion
    [SAVE]
    `plain-text` was processed

    >>> template_function(get_pdf, converter=convert_to_text)
    Got `pdf`
    [CONVERT]
    `pdf as text` was processed

    >>> template_function(get_csv, to_save=True)
    Got `csv`
    Skip conversion
    [SAVE]
    `csv` was processed
    """

if __name__ == "__main__":
    import doctest
    doctest.testmod()

I think the second example is much, much easier to read.

I used a short and simple output, and I understand that some difficulties can occur with longer ones, but I believe it is worth it.

As for testing, it is as simple as python script.py.
If we leave out the if __name__ == "__main__": part, we can still run it with python -m doctest script.py (or pytest --doctest-modules script.py).
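The structure described above can be sketched as one complete, runnable file (the greet function here is a hypothetical stand-in, not one of the repo's patterns):

```python
def greet(name):
    """Print a greeting.

    >>> greet("world")
    Hello, world!
    """
    print(f"Hello, {name}!")


if __name__ == "__main__":
    # Verifies the docstring examples when run as `python script.py`.
    # Silent on success; add -v to see each example being checked.
    import doctest

    doctest.testmod()
```

Running python script.py exits quietly if the examples pass, and python -m doctest script.py works even without the __main__ block.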

As for auto-generated output vs. human-generated output: auto-generated output is not automatically correct; it still needs review.
And as I said at the beginning, examples are written once and read many times. So the burden of writing doctests manually is worth the effort (imho).

I'm not very familiar with doctest, so I'm not sure whether it can replace all of the current functionality. You could pick a more complex use case and give it a try. I'm generally positive about the proposal.

I created #283 for illustration purposes.

Here are some things to point out:

  1. Doctests are a bit harder to write than "usual code" because of the indentation, the ">>> / ..." markers, etc. But for most patterns this is not an issue at all.
  2. There are some things to remember when writing doctests, described in the official docs.
    Here are the tl;dr recommendations:
  • do not rely on dict ordering in reprs (before Python 3.6) (same for the current outputs)
  • do not rely on reprs of user class instances, since they contain object addresses like 0x23abc4 (same for the current outputs)
  • tracebacks need to be shortened with "..." (a shorthand for "anything can go here"). I see this as an improvement over the current way.
  • the module name (main / script_name) and paths (absolute or relative, with or without the home dir) depend on how the script is run, from a dev terminal or on CI. Same for the current situation; the "..." workaround works.
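The traceback point from the list above, as a minimal sketch (parse_age is a hypothetical example): doctest matches only the "Traceback" header line and the final exception line, so the stack frames in between can be replaced with "..." without any extra flags:

```python
def parse_age(value):
    """Convert a string to a non-negative age.

    >>> parse_age("42")
    42

    The stack frames between the header and the exception line
    are elided with "...":

    >>> parse_age("-1")
    Traceback (most recent call last):
        ...
    ValueError: age must be non-negative
    """
    age = int(value)
    if age < 0:
        raise ValueError("age must be non-negative")
    return age
```

This keeps the example readable and stable across environments, since real tracebacks embed machine-specific file paths.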

To update on this issue, what is left to do:

  • there are only 8 files left with the old-style ### OUTPUT ### (to be substituted with doctests)
  • remove append_output.sh
  • update Output section of readme

I think we're almost done with substituting all files with doctests!

The only one left is abstract_factory.py, which is ambiguous due to its random output for the shop = PetShop(random_animal) test case. We could use doctest.ELLIPSIS to check the random outputs (i.e. We have a lovely ... and It says ...), though this might not be the most accurate way to test, as it doesn't ensure that Dog goes with woof and Cat goes with meow. Looking at the doctest documentation, I don't think there is a way to explicitly test for such cases (or none that I am aware of). Any thoughts?
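One possible workaround, as a sketch with hypothetical names (not the actual abstract_factory.py code): phrase the doctest as a boolean check of the animal/sound pairing, which is deterministic even though the chosen animal is random:

```python
import random


class Dog:
    def speak(self):
        return "woof"


class Cat:
    def speak(self):
        return "meow"


def random_animal():
    """Return a randomly chosen pet.

    The doctest asserts the animal/sound *pairing* rather than the
    exact printed output, so it passes whichever animal is drawn:

    >>> pet = random_animal()
    >>> (type(pet).__name__, pet.speak()) in [("Dog", "woof"), ("Cat", "meow")]
    True
    """
    return random.choice([Dog, Cat])()
```

Seeding with random.seed(...) inside the doctest would also pin the choice, but which animal a given seed yields may differ across Python versions, so the pairing check is more robust.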

The other thing left to do would be to update the contributing section of README.md on writing doctests for future patterns.

@gyermolenko got it and on it, thanks for the tip!

Can we close this @gyermolenko @alanwuha? I believe yes

hey @faif , happy holidays!
Only abstract_factory.py is left, and @alanwuha wanted to work on it, I believe.
Then remove append_output.sh and update the Output section of the readme.