danijar / handout

Turn Python scripts into handouts with Markdown and figures

Allow to hide ranges of code lines

danijar opened this issue · comments

We can already hide individual lines from the handout using # handout: exclude.

It would be great to allow excluding ranges via:

# handout: begin-exclude
<Python code here>
# handout: end-exclude

Media added inside such blocks will still be shown.

This will make it easy to create reports that only include text and media, but not code.

I tried putting # handout: exclude right after the closing triple quote of the docstring in example.py, hoping it would exclude the docstring from the HTML, but the HTML output did not come out as intended.

"""
# Python Handout

Turn Python scripts into handouts with Markdown comments and inline figures. An
alternative to Jupyter notebooks without hidden state that supports any text
editor.
""" # handout: exclude

Keeping the docstring intact may be something users expect; see #12 (comment).

This feature would be useful in cases where I am producing a document with the help of results produced by my script and am not interested in showing any code.

Thanks for bringing this up. It means that inside an excluded range, both code and comments should be hidden, but not media.

Your example will be possible via range exclude. The # handout: exclude comment operates on the current line only.

However, I don't think it'll be common to exclude docstrings. More likely, users would like to have them show in the code cell rather than a Markdown cell. I've opened #23 for that.

I suggest that # handout: exclude should operate on the current statement, so if that statement has multiple lines, exclude them all. It would make putting it after triple quotes work naturally. It would also allow conveniently excluding blocks such as for loops or normal statements spread out over many lines such as function calls with many parameters.

Hi @alexmojaki, thanks for your suggestion. However, extracting statements would be both a big effort and likely to be brittle. Having # handout: exclude hide the line it's on, and # handout: begin-exclude and # handout: end-exclude hide ranges, has simple and clear semantics.

No, it's pretty easy to implement robustly by parsing the AST. I'll do it myself if you're happy with the feature. It'd solve #23 / #15 and would generally fit people's expectations better, i.e. they don't try to hide a statement and hide only part of it instead.
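A minimal sketch of what statement-level exclusion via the AST could look like. This is not handout's code: the function name `statement_excludes` and the rule "hide the innermost statement that covers the pragma line" are assumptions made for illustration. It relies on `end_lineno`, which the standard `ast` module provides from Python 3.8 on.

```python
import ast
import io
import tokenize


def statement_excludes(source, pragma="handout: exclude"):
    """Return the 1-based line numbers hidden by statement-level excludes.

    Hypothetical helper: a pragma comment hides the innermost statement
    that covers its line, or just its own line if no statement does.
    """
    tree = ast.parse(source)
    statements = [n for n in ast.walk(tree) if isinstance(n, ast.stmt)]
    hidden = set()
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type != tokenize.COMMENT or pragma not in tok.string:
            continue
        line = tok.start[0]
        # All statements whose line span includes the pragma line.
        covering = [s for s in statements if s.lineno <= line <= s.end_lineno]
        if covering:
            # The statement with the smallest span is the innermost one.
            inner = min(covering, key=lambda s: s.end_lineno - s.lineno)
            hidden.update(range(inner.lineno, inner.end_lineno + 1))
        else:
            hidden.add(line)
    return hidden
```

With this rule, a pragma after a closing triple quote hides the whole string, a pragma on a `for` header hides the whole loop, and a pragma on a line inside the loop body hides only that inner statement.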

Excluding a statement, as @alexmojaki suggests, is indeed useful. The analogy that comes to mind is test coverage exclusion with # pragma: no cover in coverage.py:

class MyObject(object):
    def __init__(self):
        blah1()
        blah2()

    def __repr__(self): # pragma: no cover
        return "<MyObject>"

# handout: begin-exclude and # handout: end-exclude have their own uses too, but putting them around a triple-quoted string, especially the module docstring, is not desirable, I think.

Hi @alexmojaki and @epogrebnyak, I have a hard time seeing how the benefit of one comment line instead of two outweighs the added complexity of parsing the input file's AST. Even though the behavior we could get from looking at the AST would be deterministic, it may still be hard to predict for users.

I'd be happy to accept a PR to support the syntax in the first post though. Otherwise, it's also one of the next things I'd be looking into adding.

@danijar you're right, I thought it would be fun and got a bit carried away. I think it would be valuable to add eventually but shouldn't be a priority. Let me know if you'd like me to implement it in the future, e.g. if users ask for it.

@alexmojaki Sounds good, thanks!

Just as a reminder for code parsing: I think we should check a single-line triple-quoted string. I think it would be taken for code now, unless offset by whitespace.

I'm using single line triple quotes for Markdown regularly.

Thinking of edge cases for script text parsing:

"""one-line docstring"""
def foo():
    """docstring with offset"""
    pass
doc = Handout('.')
doc.add_text('abc'); doc.add_text('zzz')

doc.html('<pre>foo</pre>')
# some code comment here
"""
Multiline
triple quoted 
string
"""
print(True) #handout:exclude

Considerations:

  • docstring may be one line
  • allow whitespace in handout pragma
  • several add_x() calls per line
  • cautiously purging empty lines
  • concat of similar blocks (always or preserving user calls)
  • comment with # is code
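The "allow whitespace in handout pragma" point above can be sketched with a tolerant regular expression. The pattern and the helper name `has_exclude_pragma` are assumptions for illustration, not handout's implementation. Note that a plain regex does not know about string literals, so a pragma-like text at the very end of a line inside a string would also match.

```python
import re

# Assumed pattern: optional whitespace after '#', around ':' and before
# the end of the line, so '#handout:exclude' and '# handout : exclude'
# are both accepted.
EXCLUDE = re.compile(r"#\s*handout\s*:\s*exclude\s*$")


def has_exclude_pragma(line):
    """Return True if the line ends with a single-line exclude pragma."""
    return EXCLUDE.search(line) is not None
```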

The trailing # handout: exclude for docstring seems possible without AST parsing:

"""
This docstring will be excluded
""" # handout: exclude

This can be useful to exclude module docstrings.
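One way this might work without the AST is the standard `tokenize` module, which reports a multi-line string as a single token with start and end positions. The sketch below is an assumption about how it could be done, and `docstring_excludes` is a hypothetical name: if the token right before the pragma comment is a string that ends on the same line, the whole string is hidden; otherwise only the pragma's own line is.

```python
import io
import tokenize


def docstring_excludes(source, pragma="handout: exclude"):
    """Return 1-based line numbers hidden by trailing exclude pragmas."""
    hidden = set()
    previous = None
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.COMMENT and pragma in tok.string:
            line = tok.start[0]
            ends_string = (
                previous is not None
                and previous.type == tokenize.STRING
                and previous.end[0] == line
            )
            if ends_string:
                # Hide every line of the string, from its opening quotes
                # to the line that carries the pragma.
                hidden.update(range(previous.start[0], line + 1))
            else:
                hidden.add(line)
        previous = tok
    return hidden
```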

A test input for block exclude:

# comment
# handout: begin-exclude
import sys; 
sys.path.append('..')
# handout: end-exclude
print()

Should be parsed same as:

# comment
print()

Maybe there are some border cases?

We mostly have to make sure to continue appending to the same code cell once the exclude range ends. The code should already be set up for this. Let's not change the behavior of single line excludes in the same PR.
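The range exclude proposed in the first post can be sketched as a line filter. This is only an illustration under assumed names (`BEGIN`, `END`, `strip_excluded_ranges`), not the code in handout. It affects only which source lines are displayed; the script itself still runs in full, which is why media added inside a range would still appear.

```python
import re

# Assumed marker patterns: the pragma must be the only content on its line.
BEGIN = re.compile(r"^\s*#\s*handout\s*:\s*begin-exclude\s*$")
END = re.compile(r"^\s*#\s*handout\s*:\s*end-exclude\s*$")


def strip_excluded_ranges(lines):
    """Drop the marker lines and everything between them."""
    visible = []
    excluding = False
    for line in lines:
        if BEGIN.match(line):
            excluding = True
        elif END.match(line):
            excluding = False
        elif not excluding:
            visible.append(line)
    return visible
```

Because the lines before and after the range end up adjacent in the result, they would naturally fall into the same code cell, matching the test input above.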

By the way, triple quoted strings that start at the beginning of the line are considered Markdown. This is also true for single line comments. Docstrings are considered code since they are indented. I think module-level docstrings are not a big concern for now.

Single-line code comments are converted to Markdown? I thought they stayed as code. I need to run a conversion myself to see how it currently works.

Would a call to doc.add_something be processed inside a block exclude?

@danijar, would you be open to a PR that uses the Python AST? Looking at this request and others, it might be worth switching to AST parsing sooner rather than later. For instance, one advantage is that you could choose to hide the doc.* code.

AST parsing would add a bit of complexity and would have to live in its own module. We must agree on the interface to the parser, but otherwise it sounds like an improvement.

For instance, one advantage is that you could choose to hide the doc.* code.

Like an option to exclude the code associated with doc.add_*() calls from a handout? Could be neat!

Yeah, it would be nice to automatically exclude all handout-related code, since it's generally not relevant how the document was generated unless otherwise specified.
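As a rough illustration of that idea, the AST makes it possible to find expression statements that call a method on the handout object. The sketch assumes the object is bound to the name `doc`, as in the examples in this thread; `handout_call_lines` is a hypothetical helper, and `end_lineno` requires Python 3.8 or later.

```python
import ast


def handout_call_lines(source, name="doc"):
    """Return 1-based line numbers of statements like doc.add_text(...)."""
    hidden = set()
    for node in ast.walk(ast.parse(source)):
        # Only bare call statements; assignments such as
        # doc = Handout('.') are left visible.
        if not isinstance(node, ast.Expr) or not isinstance(node.value, ast.Call):
            continue
        func = node.value.func
        if (
            isinstance(func, ast.Attribute)
            and isinstance(func.value, ast.Name)
            and func.value.id == name
        ):
            hidden.update(range(node.lineno, node.end_lineno + 1))
    return hidden
```

A limitation of working with whole lines: if a doc.* call shares a line with other code, as in several statements separated by semicolons, hiding that line hides the other code too.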

@eiso, thanks for the suggestion. I've opened #34 to discuss switching to the AST module.