neogeny / TatSu

竜 TatSu generates Python parsers from grammars in a variation of EBNF

Home Page: https://tatsu.readthedocs.io/

Literal ``None`` is interpreted as a string

dnicolodi opened this issue · comments

When using a construct like

null = 'null' value:`None` ;

value takes the string value "None" and not None. `None` is used in TatSu's EBNF grammar in a few places, and I have the impression that it is meant to be the None object, not the "None" string. I find it mildly confusing that both `"foo"` and `foo` are interpreted as strings, but I don't think this can be changed, for backward compatibility.

I would like to suggest using Python's ast.literal_eval() to evaluate the content of the backtick syntax in the TatSu grammar. This would support any Python literal with minimal effort. I'll try to prepare a patch.
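To make the suggestion concrete, here is a minimal sketch of what ast.literal_eval() returns for the kinds of constants discussed in this issue. It only exercises the standard library; how TatSu would call it internally is not shown and is left open here.

```python
import ast

# Text as it might appear between backticks in a grammar (illustrative samples).
samples = ["None", "True", "1", "0xFF", "0.11", '"foo"']

# literal_eval only accepts Python literals; it does not execute arbitrary code.
values = [ast.literal_eval(text) for text in samples]

for text, value in zip(samples, values):
    print(f"{text!r} -> {value!r} ({type(value).__name__})")
```

With this approach `None` would come back as the None object rather than the string "None", in line with how the other literals behave.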

This behavior is as intended (a string), though admittedly not too useful.

To get a Python None, the expression to match should be (), but I don't think that's working.

I'll keep the issue open to think about a good solution.

I found it surprising because `True` is returned as a boolean, and similarly for other literals like integers and floats. Among the basic types, only None seems to be interpreted as a string rather than as the Python literal None.

() works, but it kind of breaks the symmetry:

import tatsu

parser = tatsu.compile('''
test
   =
   | int
   | hex
   | float
   | null
   | bool
   | string
   ;

int = 'int' value:`1` ;

hex = 'hex' value:`0xFF` ;

float = 'float' value:`0.11` ;

null = 'null' value:() ;

bool = ('true' value:`True` | 'false' value:`False`) ;

string = 'string' value:`"foo"` ;

''')

for string in 'int', 'hex', 'float', 'null', 'true', 'false', 'string':
    value = parser.parse(string)
    print(f'{string} : {value}')

The asymmetric behavior is because the TatSu grammar has a boolean rule and a semantic action that takes care of it. Adding symmetry would require a null rule that accepts 'None' and a corresponding semantic action.

I like the idea of using ast.literal_eval(). The back-quote syntax is fairly recent in TatSu, and the change is innocuous enough that most existing grammars should be unaffected.
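One way to keep existing grammars working would be to fall back to the raw text whenever the constant is not a valid Python literal, so that a bare `foo` stays a string. The sketch below assumes that policy; `eval_constant` is a hypothetical helper name, not part of TatSu.

```python
import ast


def eval_constant(text):
    """Hypothetical helper (not TatSu API): evaluate backtick content.

    Returns the Python literal when the text parses as one,
    otherwise returns the text unchanged, as a string.
    """
    try:
        return ast.literal_eval(text)
    except (ValueError, SyntaxError):
        # Not a literal (for example a bare word): keep the string behavior.
        return text


for text in ["None", "True", "0xFF", '"foo"', "foo"]:
    print(f"{text!r} -> {eval_constant(text)!r}")
```

Under this assumption, the only grammars that change behavior are those whose backtick content happens to be a valid literal that was previously kept as text, such as `None`.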

Who writes the pull request?