amplify-education / python-hcl2

v4.3.1 regression

sidekick-eimantas opened this issue

Hi

In v4.3.1 we started seeing parser failures on one of our files. We reduced the failing case to this:

locals {
  terraform = {
    channels = local.running_in_ci ? local.ci_channels : local.local_channels
    authentication = []
  }
}
(.venv) Eimantas@Eimantas-Gecass-MacBook-Pro skm-cli % python -c "import hcl2, pathlib; hcl2.loads(pathlib.Path('/Users/Eimantas/git/sidekick-money/skm-cli/examples/terraform/modules/terraform-context/terraform-config.tf').read_text())"
Traceback (most recent call last):
  File "/Users/Eimantas/git/sidekick-money/skm-cli/.venv/lib/python3.10/site-packages/lark/parsers/lalr_parser.py", line 126, in feed_token
    action, arg = states[state][token.type]
KeyError: '__ANON_3'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/Users/Eimantas/git/sidekick-money/skm-cli/.venv/lib/python3.10/site-packages/hcl2/api.py", line 27, in loads
    tree = hcl2.parse(text + "\n")
  File "/Users/Eimantas/git/sidekick-money/skm-cli/.venv/lib/python3.10/site-packages/lark/lark.py", line 645, in parse
    return self.parser.parse(text, start=start, on_error=on_error)
  File "/Users/Eimantas/git/sidekick-money/skm-cli/.venv/lib/python3.10/site-packages/lark/parser_frontends.py", line 96, in parse
    return self.parser.parse(stream, chosen_start, **kw)
  File "/Users/Eimantas/git/sidekick-money/skm-cli/.venv/lib/python3.10/site-packages/lark/parsers/lalr_parser.py", line 41, in parse
    return self.parser.parse(lexer, start)
  File "/Users/Eimantas/git/sidekick-money/skm-cli/.venv/lib/python3.10/site-packages/lark/parsers/lalr_parser.py", line 171, in parse
    return self.parse_from_state(parser_state)
  File "/Users/Eimantas/git/sidekick-money/skm-cli/.venv/lib/python3.10/site-packages/lark/parsers/lalr_parser.py", line 193, in parse_from_state
    raise e
  File "/Users/Eimantas/git/sidekick-money/skm-cli/.venv/lib/python3.10/site-packages/lark/parsers/lalr_parser.py", line 184, in parse_from_state
    state.feed_token(token)
  File "/Users/Eimantas/git/sidekick-money/skm-cli/.venv/lib/python3.10/site-packages/lark/parsers/lalr_parser.py", line 129, in feed_token
    raise UnexpectedToken(token, expected, state=self, interactive_parser=None)
lark.exceptions.UnexpectedToken: Unexpected token Token('__ANON_3', 'authentication') at line 4, column 5.
Expected one of: 
	* __ANON_8
	* PLUS
	* __ANON_9
	* PERCENT
	* MORETHAN
	* STAR
	* QMARK
	* LESSTHAN
	* __ANON_6
	* __ANON_7
	* __ANON_1
	* SLASH
	* __ANON_4
	* RBRACE
	* __ANON_2
	* __ANON_5
	* COMMA
	* __ANON_0
	* MINUS

The last working version was v4.3.0.

Thanks

I can confirm the same issue on our side.

Yes, I can also confirm this. 4.3.0 correctly parsed a very large and heterogeneous set of Terraform projects, and parsing started failing on specific cases after the update to 4.3.1.
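For anyone who needs to find out which files in a large set of projects are affected, here is a minimal sketch. It assumes only `hcl2.loads`, the call shown in the traceback above; the helper name `find_parse_failures` and its `loads` parameter are hypothetical and exist so that the parser can be swapped out.

```python
from pathlib import Path


def find_parse_failures(paths, loads=None):
    """Return {path: first line of the error} for every file that fails to parse.

    `loads` defaults to hcl2.loads; any callable taking the file text works.
    """
    if loads is None:
        import hcl2  # python-hcl2, imported lazily so a stub parser can be passed instead

        loads = hcl2.loads

    failures = {}
    for path in paths:
        try:
            loads(Path(path).read_text())
        except Exception as exc:  # lark raises UnexpectedToken in the traceback above
            # Keep only the first line; the "Expected one of" list is long.
            failures[str(path)] = f"{type(exc).__name__}: {exc}".splitlines()[0]
    return failures
```

A possible way to run it against a checkout (the directory name is a placeholder):

```python
for name, error in find_parse_failures(Path("modules").rglob("*.tf")).items():
    print(name, "->", error)
```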

I'm experiencing the same issue when parsing main.tf and locals.tf of https://github.com/aws-ia/terraform-aws-eks-blueprints

Experiencing the same issue here. If you don't want to revert to a version older than 4.3.1, wrapping the ternary operation in a string interpolation is another workaround: "${some_boolean ? opt_1 : opt2}". That only helps if you intend to get a string out of it.

Another solution is moving the problematic line to the very bottom of the block it's in.

It also works if the ternary operation is wrapped in parentheses, like this:

locals {
  terraform = {
    channels = (local.running_in_ci ? local.ci_channels : local.local_channels)
    authentication = []
  }
}
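If editing the Terraform files by hand is not practical, the parentheses workaround can be applied to the text before it is handed to the parser. The sketch below is a rough heuristic, not part of python-hcl2: the function name `parenthesize_ternaries` is hypothetical, it only handles ternaries written on a single line, and it skips quoted values and lines with trailing comments instead of trying to parse them.

```python
import re

# Single-line attribute assignment: indentation, name, "=" (but not "=="), value.
_ASSIGNMENT = re.compile(r"^(\s*[A-Za-z_][\w-]*\s*=(?!=)\s*)(.+?)\s*$")


def parenthesize_ternaries(text: str) -> str:
    """Wrap single-line `cond ? a : b` attribute values in parentheses."""
    out = []
    for line in text.splitlines():
        match = _ASSIGNMENT.match(line)
        if match:
            prefix, value = match.groups()
            is_ternary = " ? " in value and " : " in value
            # Heuristic checks only; a value such as "(a) ? b : (c)" is also skipped.
            already_wrapped = value.startswith("(") and value.endswith(")")
            quoted = value.startswith('"')
            has_comment = "#" in value or "//" in value
            if is_ternary and not (already_wrapped or quoted or has_comment):
                line = f"{prefix}({value})"
        out.append(line)
    return "\n".join(out) + "\n"
```

The result would then be passed on as usual, for example `hcl2.loads(parenthesize_ternaries(text))`. Running the function twice leaves already wrapped lines unchanged.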

It's not a regression in the grammar itself, but the conditional (ternary) rule is now less likely to be matched.
Changing the conditional rule from:

conditional : expression "?" new_line_or_comment? expression new_line_or_comment? ":" new_line_or_comment? expression new_line_or_comment?

to:

conditional : expression "?" new_line_or_comment? expression new_line_or_comment? ":" new_line_or_comment? expression

will solve the issue.

Another option is to use the Earley parser (instead of LALR), which doesn't have those issues.
You can play around with it here (try switching the parser).

By the way, I'm playing around with grammar improvements here, and I also added support for choosing the Earley parser. If anyone is willing to review and merge the improvements, I'm in.

For reference I'm still getting lark.exceptions.UnexpectedToken: Unexpected token Token('__ANON_3', 'DD_SITE') at line 159, column 5. with 4.3.2.

Can also confirm we can replicate this on 4.3.2.

Small reproduction:

# foo.tf
module "foobar" {
  attributes = {
    do_ray_me_far = var.foobar_thing > 1024.0
    blah_blah     = var.foobar_thing / 1024.0
  }
}
Unexpected token Token('__ANON_3', 'blah_blah') at line 4, column 5.
 | Expected one of:
 | 	* __ANON_0
 | 	* __ANON_7
 | 	* __ANON_9
 | 	* STAR
 | 	* __ANON_6
 | 	* PLUS
 | 	* PERCENT
 | 	* MINUS
 | 	* __ANON_2
 | 	* COMMA
 | 	* MORETHAN
 | 	* __ANON_4
 | 	* __ANON_8
 | 	* __ANON_5
 | 	* QMARK
 | 	* SLASH
 | 	* LESSTHAN
 | 	* RBRACE
 | 	* __ANON_1
 |

@IButskhrikidze please can we re-open this issue?

This is still occurring on both v4.3.1 and v4.3.2. Can we reopen this issue?
It fails in my organisation on many different tf files, and I just reproduced it on my personal laptop with the code block from @ascopes.