Literal ``None`` is interpreted as a string


dnicolodi opened this issue a year ago

When using a construct like

null = 'null' value:`None` ;

`value` takes the string value "None" and not None. `None` is used in TatSu's EBNF grammar in a few places, and I have the impression that it is meant to be the None object, not the "None" string. I find it mildly confusing that both `"foo"` and `foo` are interpreted as strings, but I don't think that this can be changed, for backward compatibility.

Daniele Nicolodi · Answer 1 · Sun Aug 27 2023 16:34:10 GMT+0800 (China Standard Time)

I would like to suggest using Python's ast.literal_eval() to evaluate the content of the backtick syntax in the TatSu grammar. This would make it possible to support any Python literal with minimal effort. I'll try to prepare a patch.
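As a rough illustration of the idea (not TatSu's actual implementation), the sketch below evaluates the text found between backticks with ast.literal_eval() and falls back to the raw text when it is not a valid Python literal. The helper name `eval_constant` is hypothetical, and the fallback is an assumption meant to preserve the existing behavior where a bare word such as `foo` is a string.

```python
import ast


def eval_constant(text):
    """Hypothetical helper: evaluate backtick content as a Python literal.

    Falls back to the raw text when it is not a valid literal, so a bare
    word such as foo would still yield the string 'foo' (an assumption
    made here for backward compatibility).
    """
    try:
        return ast.literal_eval(text)
    except (ValueError, SyntaxError):
        return text


# Show what each kind of backtick content would evaluate to.
for text in ['1', '0xFF', '0.11', 'None', 'True', '"foo"', 'foo']:
    print(f'{text!r} -> {eval_constant(text)!r}')
```

With this approach `None` would become the None object, while both `"foo"` and `foo` would still end up as the string 'foo'.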

Juancarlo Añez · Answer 2 · Sun Aug 27 2023 19:53:21 GMT+0800 (China Standard Time)

This behavior is as intended (a string), though admittedly not too useful.

To have a Python None the sequence to match should be (), but I don't think that's working.

I'll keep the issue open to think about a good solution.

Daniele Nicolodi · Answer 3 · Sun Aug 27 2023 19:58:24 GMT+0800 (China Standard Time)

I found it surprising because `True` is returned as a boolean, and similarly for other literals like integers and floats. Among the basic types, only None seems not to be interpreted as the Python literal None, but as a string.

Daniele Nicolodi · Answer 4 · Sun Aug 27 2023 20:04:48 GMT+0800 (China Standard Time)

() works, but it kind of breaks the symmetry:

import tatsu

parser = tatsu.compile('''
test
   =
   | int
   | hex
   | float
   | null
   | bool
   | string
   ;

int = 'int' value:`1` ;

hex = 'hex' value:`0xFF` ;

float = 'float' value:`0.11` ;

null = 'null' value:() ;

bool = ('true' value:`True` | 'false' value:`False`) ;

string = 'string' value:`"foo"` ;

''')

for string in 'int', 'hex', 'float', 'null', 'true', 'false', 'string':
    value = parser.parse(string)
    print(f'{string} : {value}')

Juancarlo Añez · Answer 5 · Tue Aug 29 2023 01:21:44 GMT+0800 (China Standard Time)

The asymmetric behavior is because the TatSu grammar has a boolean rule and a semantic action that takes care of it. Adding symmetry would require a null rule that accepts 'None' and a corresponding semantic action.

Juancarlo Añez · Answer 6 · Tue Oct 17 2023 20:50:38 GMT+0800 (China Standard Time)

I like the idea of using ast.literal_eval(). The back-quote syntax is fairly recent in TatSu, and the change is innocuous enough that most existing grammars should be unaffected.

Who writes the pull request?