Empty list cannot be closed after a newline
AriX opened this issue · comments
Thanks for your awesome work on this project!
I'm seeing an issue with JSON like the following:
{
  "num" : 1,
  "list_of_strings" : [
  ]
}
In particular, if a newline is generated after the array's opening [, lm-format-enforcer will not allow the list to be closed with a ]. It appears that:
- When the [ character is parsed, a UnionParser is added to the stack with a StringParsingState and a ForceStopParser
- When the newline character is parsed, the UnionParser decides that only StringParsingState can accept newlines, and therefore dissolves itself, returning only the StringParsingState onto the stack and removing the ForceStopParser
- Without ForceStopParser on the stack, JsonSchemaParser's allowedCharacters implementation does not evaluate any parsers on the stack above StringParsingState, because StringParsingState returns False for canEnd()
- Therefore, allowedCharacters does not include ] and the list cannot be closed
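To make the sequence above easier to follow, here is a small self-contained toy model of it. This is not lm-format-enforcer's code: the class names (ListCloser, StringStart, ForceStop, Union) and the allowed_characters helper are hypothetical stand-ins that only mimic the behaviour described in the bullets.

```python
WHITESPACE = " \t\n"


class ListCloser:
    """Hypothetical stand-in for the list-level state that accepts ']'."""
    def allowed(self): return "]"
    def can_end(self): return False
    def add(self, ch): return self


class StringStart:
    """Hypothetical stand-in for a string state before its opening quote."""
    def allowed(self): return '"' + WHITESPACE
    def can_end(self): return False
    def add(self, ch): return self  # toy: whitespace leaves the state unchanged


class ForceStop:
    """Hypothetical stand-in for a parser that accepts nothing but may end."""
    def allowed(self): return ""
    def can_end(self): return True
    def add(self, ch): raise ValueError("ForceStop accepts no characters")


class Union:
    """Toy union: keeps the members that accept a character, dissolving if one is left."""
    def __init__(self, members): self.members = members
    def allowed(self): return "".join(m.allowed() for m in self.members)
    def can_end(self): return any(m.can_end() for m in self.members)
    def add(self, ch):
        survivors = [m.add(ch) for m in self.members if ch in m.allowed()]
        return survivors[0] if len(survivors) == 1 else Union(survivors)


def allowed_characters(stack):
    """Walk from the top of the stack down, stopping at the first parser that cannot end."""
    chars = set()
    for parser in reversed(stack):
        chars.update(parser.allowed())
        if not parser.can_end():
            break
    return chars


# State right after '[' has been parsed.
stack = [ListCloser(), Union([StringStart(), ForceStop()])]
before = allowed_characters(stack)

# Parsing the newline: only StringStart accepts it, so the union dissolves.
stack[-1] = stack[-1].add("\n")
after = allowed_characters(stack)

print("]" in before, "]" in after)
```

In this toy model the closing bracket is reachable before the newline, because the union can end, and unreachable after it, because only the string state is left on top of the stack.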
This can be verified by adding this test to test_jsonschemaparser.py:
def test_empty_list_with_newline():
    class EmptyListOKModel(BaseModel):
        num: int
        list_of_strings: Optional[List[str]] = Field(None, min_length=0, max_length=1)
    no_strings = '{"num":1,"list_of_strings":[\n]}'
    _test_json_schema_parsing_with_string(no_strings, EmptyListOKModel.model_json_schema(), True)
I'm not sure what the best solution here is, but some ideas I have are:
- Have ForceStopParser allow newlines/whitespace
- Prevent UnionParser from dissolving itself if one of its parsers is a ForceStopParser
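As a rough illustration of the first idea, the sketch below uses a toy model rather than the library's real classes. StringStart, WhitespaceTolerantForceStop, step_union and allowed_characters are all hypothetical names, and the only assumption is that a force-stop parser which tolerates whitespace would survive the union's filtering step.

```python
WHITESPACE = " \t\n"


class StringStart:
    """Hypothetical string state before its opening quote."""
    allowed = '"' + WHITESPACE
    can_end = False


class WhitespaceTolerantForceStop:
    """Hypothetical force-stop state that also accepts whitespace (the proposed change)."""
    allowed = WHITESPACE
    can_end = True


def step_union(members, ch):
    """Toy version of the union step: keep only the members that accept ch."""
    return [m for m in members if ch in m.allowed]


def allowed_characters(list_level_chars, members):
    """Union members' characters, plus the list-level ones if any member may end."""
    chars = set()
    for m in members:
        chars.update(m.allowed)
    if any(m.can_end for m in members):
        chars.update(list_level_chars)
    return chars


# After '[' the union holds both states; feed it the newline.
members = step_union([StringStart, WhitespaceTolerantForceStop], "\n")
result = allowed_characters("]", members)

print(len(members), "]" in result)
```

Because both members accept the newline here, the union keeps two members and never dissolves, so the closing bracket stays reachable. Whether this or the second idea fits the real parser better is the open question of this issue.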
Any input on what solution would be most idiomatic would be greatly appreciated.
Thanks for the report! I hope to fix this in the near future.
Solved in v0.8.3