r2d4 / parserllm

Use context-free grammars with an LLM

Grammar stops whenever there is a valid parse, even if more tokens could be generated

marcotcr opened this issue

Minimum example:

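from lark import Lark
from parserllm import ParserState  # import path assumed; adjust to your install

# An expression is a single terminal or two expressions joined by a space.
grammar = '''start: expression

expression: expression " " expression | terminal
terminal:  "token1"
         | "token2"
'''.strip()
parser = Lark(grammar, parser='lalr')
p = ParserState(parser)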

This works for incomplete parses, e.g.:

p.next_lex('')

{'TOKEN1', 'TOKEN2'}

p.next_lex('token1 ')

{'TOKEN1', 'TOKEN2'}

Now take an input that is itself a valid parse but could still be extended with more tokens (note that the trailing whitespace is removed):

p.next_lex('token1')

[]