Hi from the maintainer of the json-repair library

Question

mangiucugna opened this issue 9 months ago

Stefano Baccianella commented 9 months ago

Hi! I saw that you are using the library and that is great!

I was wondering why you decided to reimplement the repair_json method instead of calling the one provided. Is it because the preprocessing done to the string breaks something? Or is it because of lazy loading?

Just wondering if there's something I can learn to improve the lib

Cheers

Oleksandr Yaremchuk · Answer 1 · Mon Nov 20 2023 17:30:10 GMT+0800 (China Standard Time)

Hey @mangiucugna, thanks for reaching out, and for the library :) I tried multiple solutions, but yours works better than the others.

That way, I don't need to call json.loads twice. I also want repair to be an optional feature. Have you considered using orjson to speed it up?

Cheers
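
A minimal sketch of the pattern described above: parse strictly once, and treat repair as an opt-in fallback. This is not code from either project. The names `parse_json` and `strip_trailing_commas` are hypothetical, and the trailing-comma regex is only a stand-in for a real repair function.

```python
import json
import re
from typing import Any, Callable, Optional


def strip_trailing_commas(text: str) -> str:
    """Hypothetical stand-in for a real repair step: removes trailing commas only."""
    return re.sub(r",\s*([}\]])", r"\1", text)


def parse_json(text: str, repair: Optional[Callable[[str], str]] = None) -> Any:
    """Parse strictly first; run the optional repair step only if that fails."""
    try:
        return json.loads(text)  # valid input is parsed exactly once
    except json.JSONDecodeError:
        if repair is None:
            raise  # repair is opt-in, so surface the original error
        return json.loads(repair(text))


print(parse_json('{"a": 1}'))
print(parse_json('{"a": 1,}', repair=strip_trailing_commas))
```

With this shape, valid input never touches the repair path, and callers who do not pass `repair` get the usual `json.JSONDecodeError`.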

Stefano Baccianella · Answer 2 · Mon Nov 20 2023 18:28:31 GMT+0800 (China Standard Time)

I was expecting json.load() to be the reason :)
Would adding an option to skip the json.load() call work for you? Something like 'skip_json_load=True'.

I decided against orjson because I wanted to keep the library free of external dependencies, especially since orjson brings in PyO3, which sometimes doesn't play well with other libraries.

If you like the idea, I will release 0.4.0 (I recently released 0.3.0 to fix some reported issues with llama) and if you have suggestions please do let me know!
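
For illustration, the option proposed above could look roughly like the sketch below. The flag name is taken from the suggestion in this comment; the parameter name, signature, and return type in the released library may differ. `repair_json_sketch` and `naive_repair` are hypothetical names, and the repair logic is a deliberately naive placeholder.

```python
import json
import re
from typing import Any


def naive_repair(text: str) -> str:
    """Hypothetical placeholder for the real repair logic: drops trailing commas."""
    return re.sub(r",\s*([}\]])", r"\1", text)


def repair_json_sketch(text: str, skip_json_load: bool = False) -> Any:
    """Return the parsed value, optionally skipping the initial strict parse."""
    if not skip_json_load:
        try:
            # Fast path: the input is already valid JSON.
            return json.loads(text)
        except json.JSONDecodeError:
            pass  # fall through to the repair path
    # The caller already knows the input is broken (or the strict parse failed).
    return json.loads(naive_repair(text))


print(repair_json_sketch('[1, 2,]'))
print(repair_json_sketch('[1, 2,]', skip_json_load=True))
```

A caller that has already tried a strict parse can pass `skip_json_load=True` to avoid paying for the same failing parse a second time.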

Oleksandr Yaremchuk · Answer 3 · Mon Nov 20 2023 18:29:20 GMT+0800 (China Standard Time)

Sounds like a plan! I will make a change once you have the new version :) Thank you!

Stefano Baccianella · Answer 4 · Mon Nov 20 2023 18:57:42 GMT+0800 (China Standard Time)

Here you go: https://github.com/mangiucugna/json_repair/releases/tag/0.4.0

Stefano Baccianella · Answer 5 · Tue Nov 21 2023 00:20:29 GMT+0800 (China Standard Time)

FYI, I was curious about the performance of the library, so I did some profiling and found a way to make it 20% faster. The absolute gain is small for small JSON inputs, but it is quite significant if you have large strings or run large batch jobs.
So yeah, 0.4.1 is out.
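
To make the size argument concrete, here is a rough timing sketch. It uses the standard library's `json.loads` as a stand-in workload, not the repair code itself, and it assumes "20% faster" means roughly 20% less time per call. The helper name `time_per_call` and the sample inputs are made up for the example.

```python
import json
import timeit


def time_per_call(func, arg, number=50):
    """Average wall-clock seconds per call of func(arg) over `number` runs."""
    return timeit.timeit(lambda: func(arg), number=number) / number


# A tiny document and a much larger one (sizes chosen arbitrarily).
small = json.dumps({"a": 1})
large = json.dumps([{"id": i, "name": f"item-{i}"} for i in range(5000)])

t_small = time_per_call(json.loads, small)
t_large = time_per_call(json.loads, large)

# The same relative saving is a much larger absolute saving on big inputs.
for label, t in (("small", t_small), ("large", t_large)):
    print(f"{label}: {t:.2e} s per call, 20% saving = {0.2 * t:.2e} s")
```

The absolute numbers depend on the machine; the point is only that a fixed percentage saving scales with the per-call cost and with the number of calls in a batch.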

Oleksandr Yaremchuk · Answer 6 · Tue Nov 21 2023 00:43:03 GMT+0800 (China Standard Time)

Awesome! Updated the library to use your exported function. Thank you!