mangiucugna / json_repair

A Python module to repair invalid JSON, commonly used to parse the output of LLMs

Home Page: https://pypi.org/project/json-repair/

Missing left quotes in numbers are not parsed properly

AloXado320 opened this issue

Describe the bug
If the opening (left) quote of a JSON value is missing, json_repair doesn't repair it properly

To Reproduce
Steps to reproduce the behavior:
Run this JSON file through json_repair:

{
  "words": abcdef",
  "numbers": 12345",
  "words2": ghijkl"
}

It gets parsed like this:

{'words': 'abcdef', 'numbers': 12345, ',\n ': 'ords2', 'ghijkl': ''}

Expected behavior
Proper output:
{'words': 'abcdef', 'numbers': '12345', 'words2': 'ghijkl'}

Hi! Thanks for reporting the bug. One note about numbers in canonical JSON, though: the expected output here is:

{ "words": "abcdef", "numbers": 12345, "words2": "ghijkl" }

So the library should fix this by removing the dangling " and not eating the following key.
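
For anyone who wants to turn this into a regression check, here is a minimal sketch. It assumes the `json_repair.loads` entry point from the project README; the names `BROKEN`, `EXPECTED`, `is_valid_json` and `check_repair` are made up for illustration and are not part of the library. It only confirms that the strict parser rejects the input and compares a repaired result against the output described above. It does not show how the fix itself would be implemented.

```python
import json

# Input from the report: every value is missing its opening quote.
BROKEN = '''{
  "words": abcdef",
  "numbers": 12345",
  "words2": ghijkl"
}'''

# Output described above as correct: the number stays a number.
EXPECTED = {"words": "abcdef", "numbers": 12345, "words2": "ghijkl"}


def is_valid_json(text: str) -> bool:
    """Return True if the standard-library parser accepts the text."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


def check_repair(repaired: object) -> bool:
    """Hypothetical helper: compare a repaired result against EXPECTED."""
    return repaired == EXPECTED


if __name__ == "__main__":
    # The strict parser should reject the broken input.
    print("input is valid JSON:", is_valid_json(BROKEN))
    try:
        import json_repair  # third-party; only needed for the live check
    except ImportError:
        json_repair = None
    if json_repair is not None:
        # Assumes json_repair.loads returns the repaired Python object.
        print("repair matches expected:", check_repair(json_repair.loads(BROKEN)))
```

The comparison uses plain dict equality, so a result that keeps `12345` as the string `'12345'` would fail the check, as would the output with the swallowed `words2` key.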