Global cleaning

jnothman opened this issue 7 years ago

I like the API design of tinycss2, but find it frustrating that if I'm interpreting some CSS, I always have to be ready for the possibility of finding a ParseError (or indeed a Comment), even if I have now lost the context in which it was parsed, and hence enough information to report an error to the user.

While I understand the benefits of keeping parse errors local, I would propose one or both of:

a function to traverse all token lists found within a parse, so that they can be cleaned or otherwise handled;
an option to attach the original input CSS or similar to each ParseError node.

Joel Nothman · Answer 1 · Thu May 11 2017 18:53:48 GMT+0800 (China Standard Time)

In particular, it becomes a much greater nuisance to test that my CSS interpretation code works with comments in any part of the CSS.

Or would you recommend just sticking with tinycss, @SimonSapin?

Simon Sapin · Answer 2 · Thu May 11 2017 22:17:16 GMT+0800 (China Standard Time)

parse_* functions have a skip_comments parameter. Does this help?

I don’t quite understand what you’re asking for, sorry. What do you mean by "the context in which it was parsed"? Comment and ParseError inherit from Node. All nodes have source_line and source_column attributes, as well as a serialize() method.
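As a rough illustration of the attributes mentioned above, the sketch below builds error reports from type, source_line and source_column, plus the message attribute that ParseError carries. The helper name report_errors and the SimpleNamespace stand-ins are hypothetical, used only so the snippet runs without tinycss2 installed. With the real library the list would come from a call such as tinycss2.parse_declaration_list(css, skip_comments=True).

```python
from types import SimpleNamespace


def report_errors(nodes):
    """Split a flat node list into (clean nodes, error messages).

    Only duck-typed attributes are used: ``type``, ``source_line``,
    ``source_column`` and, for error nodes, ``message``.
    """
    clean, messages = [], []
    for node in nodes:
        if node.type == 'error':
            messages.append('%d:%d: %s' % (
                node.source_line, node.source_column, node.message))
        elif node.type != 'comment':  # same effect as skip_comments=True
            clean.append(node)
    return clean, messages


# Hypothetical stand-ins for tinycss2 nodes, so the sketch runs on its own.
nodes = [
    SimpleNamespace(type='declaration', source_line=1, source_column=1),
    SimpleNamespace(type='comment', source_line=1, source_column=12),
    SimpleNamespace(type='error', source_line=2, source_column=3,
                    message='expected ":"'),
]
clean, messages = report_errors(nodes)
```

This only inspects the top-level list. Errors nested inside blocks, functions or declaration values would need some form of tree traversal.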

Could you give an example with some code?

As to tinycss, I think it has significant design flaws. But if it works for you ¯\_(ツ)_/¯

Joel Nothman · Answer 3 · Thu May 11 2017 22:53:32 GMT+0800 (China Standard Time)

Thanks, I somehow overlooked skip_comments. Yes, that helps.

To be explicit, I'm writing a library that involves transforming CSS declarations, but many of them are left untransformed and consumed by a client library. My library will end up raising warnings for the parse errors it encounters directly. But then the client is obliged to also catch and handle parse errors, or else trip on them.

I prefer the tinycss2 interface, thanks :)

Simon Sapin · Answer 4 · Thu May 11 2017 23:28:06 GMT+0800 (China Standard Time)

It sounds like maybe you want a tree traversal/rewriting mechanism.

Here is an untested attempt:

_NESTED = {
    'qualified-rule': ['prelude', 'content'],
    'at-rule': ['prelude', 'content'],
    'declaration': ['value'],
    '() block': ['content'],
    '[] block': ['content'],
    '{} block': ['content'],
    'function': ['arguments'],
}

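def _apply_to(node, callback):
    # Rewrite each nested token list of this node in place.
    for attr in _NESTED.get(node.type, []):
        nested_nodes = getattr(node, attr)
        if nested_nodes is not None:  # e.g. an at-rule ending in ';' has no content
            new_nested_nodes = fold(nested_nodes, callback)
            setattr(node, attr, new_nested_nodes)

def _fold_iter(nodes, callback):
    for node in nodes:
        replacement = callback(node)
        if replacement is not None:
            _apply_to(replacement, callback)
            yield replacement

def fold(nodes, callback):
    # The callback returns a replacement node, or None to drop the node.
    return list(_fold_iter(nodes, callback))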

Which could be used like this:

def remove_errors_callback(node):
    if node.type == 'error':
        print_error(node)
        # Implicit: return None
    else:
        return node

stylesheet = tinycss2.parse_stylesheet(…)
stylesheet = fold(stylesheet, remove_errors_callback)

Joel Nothman · Answer 5 · Sun May 14 2017 20:24:17 GMT+0800 (China Standard Time)

Yes, that sort of thing, though I'd design it as a generator, with an interface more like os.walk.

Joel Nothman · Answer 6 · Mon May 15 2017 08:30:56 GMT+0800 (China Standard Time)

I.e. replacement would occur in-place in lists and nodes.

Joel Nothman · Answer 7 · Mon May 15 2017 08:41:57 GMT+0800 (China Standard Time)

_CHILD_ATTRS = {
    'qualified-rule': ['prelude', 'content'],
    'at-rule': ['prelude', 'content'],
    'declaration': ['value'],
    '() block': ['content'],
    '[] block': ['content'],
    '{} block': ['content'],
    'function': ['arguments'],
}


def walk(nodes):
    '''
    >>> import tinycss2, pprint
    >>> pprint.pprint(list(walk(tinycss2.parse_declaration_list('font: rgb(1,2,3) bold; background: red'))))
    [((), [<Declaration font: …>, <WhitespaceToken>, <Declaration background: …>]),
     ((0,), <Declaration font: …>),
     ((0, 'value'),
      [<WhitespaceToken>,
       <FunctionBlock rgb( … )>,
       <WhitespaceToken>,
       <IdentToken bold>]),
     ((0, 'value', 0), <WhitespaceToken>),
     ((0, 'value', 1), <FunctionBlock rgb( … )>),
     ((0, 'value', 1, 'arguments'),
      [<NumberToken 1>,
       <LiteralToken ,>,
       <NumberToken 2>,
       <LiteralToken ,>,
       <NumberToken 3>]),
     ((0, 'value', 1, 'arguments', 0), <NumberToken 1>),
     ((0, 'value', 1, 'arguments', 1), <LiteralToken ,>),
     ((0, 'value', 1, 'arguments', 2), <NumberToken 2>),
     ((0, 'value', 1, 'arguments', 3), <LiteralToken ,>),
     ((0, 'value', 1, 'arguments', 4), <NumberToken 3>),
     ((0, 'value', 2), <WhitespaceToken>),
     ((0, 'value', 3), <IdentToken bold>),
     ((1,), <WhitespaceToken>),
     ((2,), <Declaration background: …>),
     ((2, 'value'), [<WhitespaceToken>, <IdentToken red>]),
     ((2, 'value', 0), <WhitespaceToken>),
     ((2, 'value', 1), <IdentToken red>)]
    '''
    if isinstance(nodes, list):
        yield (), nodes
        for i, node in enumerate(nodes):
            for path, descendant in walk(node):
                yield (i,) + path, descendant
    else:
        node = nodes
        yield (), node
        for attr in _CHILD_ATTRS.get(node.type, []):
            children = getattr(node, attr)
            if children is not None:  # at-rules ending in ';' have content=None
                for path, descendant in walk(children):
                    yield (attr,) + path, descendant
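One way the paths yielded by walk could drive the in-place replacement described earlier is sketched below. It is a minimal sketch under stated assumptions: resolve and remove_in_place are hypothetical helper names, the walk shown is a compact copy of the function above (with a guard for child attributes that are None), and the nodes are SimpleNamespace stand-ins rather than real tinycss2 objects.

```python
from types import SimpleNamespace

# Same idea as _CHILD_ATTRS above, trimmed to what this demo needs.
_CHILD_ATTRS = {'declaration': ['value'], 'function': ['arguments']}


def walk(nodes):
    """Yield (path, item) pairs for every list and node, in pre-order."""
    if isinstance(nodes, list):
        yield (), nodes
        for i, node in enumerate(nodes):
            for path, descendant in walk(node):
                yield (i,) + path, descendant
    else:
        yield (), nodes
        for attr in _CHILD_ATTRS.get(nodes.type, []):
            children = getattr(nodes, attr)
            if children is not None:
                for path, descendant in walk(children):
                    yield (attr,) + path, descendant


def resolve(root, path):
    """Follow a path of list indices and attribute names from root."""
    for step in path:
        root = root[step] if isinstance(step, int) else getattr(root, step)
    return root


def remove_in_place(root, unwanted):
    """Delete every node for which unwanted(node) is true, mutating lists."""
    # Collect paths first: deleting while the generator runs would shift indices.
    paths = [path for path, item in walk(root)
             if path and not isinstance(item, list) and unwanted(item)]
    # Delete in reverse pre-order so the remaining paths stay valid.
    for path in reversed(paths):
        del resolve(root, path[:-1])[path[-1]]


def node(type_, **attrs):
    # Hypothetical stand-in for a tinycss2 node.
    return SimpleNamespace(type=type_, **attrs)


tree = [
    node('declaration', value=[
        node('ident'),
        node('error'),
        node('function', arguments=[node('error'), node('number')]),
    ]),
    node('error'),
]
remove_in_place(tree, lambda n: n.type == 'error')
```

Collecting the paths before deleting keeps the generator simple. An os.walk-style design that lets the caller edit the yielded lists during iteration would need walk itself to tolerate mutation.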

Simon Sapin · Answer 8 · Mon May 15 2017 15:09:13 GMT+0800 (China Standard Time)

Just to let you know: I’m not working on WeasyPrint or tinycss2 anymore. While I don’t mind chatting about it, filing issues is unlikely to get the project moving. Even pull requests I’m unlikely to spend time reviewing etc.

Joel Nothman · Answer 9 · Mon May 15 2017 16:11:11 GMT+0800 (China Standard Time)

I hadn't known of weasyprint. It's not surprising to find it does a lot of what I've been working on in weasyprint.css...

Simon Sapin · Answer 10 · Mon May 15 2017 16:18:59 GMT+0800 (China Standard Time)

A lot of this is not part of tinycss2 on purpose: to make fallback work, only the set of properties and values that are supported (e.g. in layout code) should be parsed.
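The fallback behaviour referred to here is the CSS convention that a declaration the consumer cannot interpret is dropped, so an earlier valid declaration for the same property still applies. A minimal sketch, assuming a hypothetical consumer that understands only a few keywords. The SUPPORTED table and the cascade function are invented for illustration and are not part of tinycss2.

```python
# Hypothetical table of the properties and values this consumer supports.
SUPPORTED = {
    'color': {'red', 'green', 'blue'},
    'display': {'block', 'inline', 'none'},
}


def cascade(declarations):
    """Return {property: value}, ignoring declarations that are not supported.

    declarations is a list of (name, value) string pairs in source order.
    """
    computed = {}
    for name, value in declarations:
        if value in SUPPORTED.get(name, ()):
            computed[name] = value  # later valid declarations win
        # Anything else is dropped, so an earlier valid value survives.
    return computed


result = cascade([
    ('color', 'red'),
    ('color', 'color-mix(in srgb, red, blue)'),  # unsupported here: ignored
    ('display', 'grid'),                         # unsupported, no fallback
])
```

Which values count as valid depends on what the consumer implements, which is why a generic parser cannot make this decision on its own.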

Guillaume Ayoub · Answer 11 · Wed Jul 17 2019 20:28:16 GMT+0800 (China Standard Time)

@jnothman A lot of things have changed since 2017, are you still interested in this issue?

Simon Sapin wrote: "A lot of this is not part of tinycss2 on purpose: to make fallback work, only the set of properties and values that are supported (e.g. in layout code) should be parsed."

Joel Nothman wrote: "I hadn't known of weasyprint. It's not surprising to find it does a lot of what I've been working on in weasyprint.css..."

Yes, I think that most of the features you want are not in TinyCSS2's scope.

Guillaume Ayoub · Answer 12 · Mon Feb 24 2020 02:47:34 GMT+0800 (China Standard Time)

@jnothman Feel free to reopen if you want!