Parsing a multi-line conditional expression causes exception - Unexpected token Token('QMARK', '?')
kartikp10 opened this issue · comments
Hello!
I have observed that parsing a ternary (conditional) expression fails if the expression spans multiple lines. For example:
data "aws_iam_policy_document" "sample" {
source_json = (
length(var.sample_value) > 0
? data.aws_iam_policy_document.sample_reader.json
: ""
)
}
Loading this file with hcl2.load(file) will result in this exception:
Traceback (most recent call last):
File "/Users/kpande/.pyenv/versions/3.9.4/lib/python3.9/site-packages/lark/parsers/lalr_parser.py", line 59, in get_action
return states[state][token.type]
KeyError: 'QMARK'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/Users/kpande/Downloads/sampletf/hcl2_test.py", line 4, in <module>
dict = hcl2.load(file)
File "/Users/kpande/.pyenv/versions/3.9.4/lib/python3.9/site-packages/hcl2/api.py", line 9, in load
return loads(file.read())
File "/Users/kpande/.pyenv/versions/3.9.4/lib/python3.9/site-packages/hcl2/api.py", line 18, in loads
return hcl2.parse(text + "\n")
File "/Users/kpande/.pyenv/versions/3.9.4/lib/python3.9/site-packages/lark/lark.py", line 464, in parse
return self.parser.parse(text, start=start)
File "/Users/kpande/.pyenv/versions/3.9.4/lib/python3.9/site-packages/lark/parser_frontends.py", line 115, in parse
return self._parse(token_stream, start)
File "/Users/kpande/.pyenv/versions/3.9.4/lib/python3.9/site-packages/lark/parser_frontends.py", line 63, in _parse
return self.parser.parse(input, start, *args)
File "/Users/kpande/.pyenv/versions/3.9.4/lib/python3.9/site-packages/lark/parsers/lalr_parser.py", line 35, in parse
return self.parser.parse(*args)
File "/Users/kpande/.pyenv/versions/3.9.4/lib/python3.9/site-packages/lark/parsers/lalr_parser.py", line 88, in parse
action, arg = get_action(token)
File "/Users/kpande/.pyenv/versions/3.9.4/lib/python3.9/site-packages/lark/parsers/lalr_parser.py", line 66, in get_action
raise UnexpectedToken(token, expected, state=state, puppet=puppet)
lark.exceptions.UnexpectedToken: Unexpected token Token('QMARK', '?') at line 4, column 5.
Expected one of:
* __ANON_1
* __ANON_0
* RPAR
* __ANON_2
If, however, the expression is on a single line, parsing works fine.
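Since the single-line form parses, one possible stopgap until the grammar is fixed is to fold the `?` and `:` continuation lines onto the previous line before parsing. The following is only a minimal sketch: `fold_conditionals` is a hypothetical helper, not part of python-hcl2, and it is deliberately naive. It does not understand strings, heredocs, or comments, so it would also fold any other line that begins with `?` or `:` followed by whitespace.

```python
import re

# Hypothetical helper, not part of python-hcl2.
# Matches the line break (plus surrounding whitespace) that precedes a line
# starting with "?" or ":" followed by whitespace.
_CONTINUATION = re.compile(r"\s*\n\s*(?=[?:]\s)")


def fold_conditionals(text: str) -> str:
    """Naively join ternary continuation lines onto the previous line.

    Assumption: "?" and ":" at the start of a line only ever continue a
    conditional expression. Heredocs, strings and comments are not handled.
    """
    return _CONTINUATION.sub(" ", text)


source = (
    'source_json = (\n'
    '  length(var.sample_value) > 0\n'
    '  ? data.aws_iam_policy_document.sample_reader.json\n'
    '  : ""\n'
    ')\n'
)
folded = fold_conditionals(source)
```

The folded text could then be passed to hcl2.loads(...) instead of the raw file contents. Folding removes line breaks, so line numbers in any later parse error will no longer match the original file.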
Also running into this. I wonder if it's as simple as editing https://github.com/amplify-education/python-hcl2/blob/master/hcl2/hcl2.lark#L12 to be
conditional : expression new_line_or_comment? "?" new_line_or_comment? expression new_line_or_comment? ":" new_line_or_comment? expression
Yeah this should do it
conditional : expression new_line_or_comment? "?" new_line_or_comment? expression new_line_or_comment? ":" new_line_or_comment? expression new_line_or_comment?
Seems like adding the first new_line_or_comment? confuses the parser, so every time there's an expression followed by a new line it expects a question mark. I assume that's because LALR only looks one token ahead.
As another approach I tried adding new_line_or_comment? to the end of the expression rule so it looks like:
?expression : (expr_term | operation | conditional) new_line_or_comment?
but then some of the parsing also breaks.
Any ideas? @ianvonseggern1 @kartikp10
Yeah I noticed the same when I tried it :( Unfortunately I'm really not familiar with lark so I have very few ideas about what to try next
Raised a PR to fix this - #128
The PR was merged. This issue can be closed now.