ahrefs / atd

Static types for JSON APIs

JSON to OCaml

ArulselvanMadhavan opened this issue

Thanks for writing this library!

Has there been any work to infer the types directly from the JSON as opposed to having the user add type annotations?

Example: https://github.com/mholt/json-to-go

I'm not aware of a tool like this that people could use today for ATD. I did similar work over ten years ago but it was never released. It's definitely a fun project that doesn't require knowledge of the atd internals. I imagine it could produce good guesses for the most common patterns. Off the top of my head:

  • [...] → list
  • {...} → record
  • field name foo that is always present → foo: ...;
  • field name bar that is sometimes missing → ?bar: ... option;
  • number literals that are all integers → int
  • string that takes a small number of different values, but takes each value multiple times → enum
  • various JSON types where a single ATD type is expected → ??? (could use any placeholder like unit or ... or TODO)
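As a rough illustration of the rules above, here is a minimal sketch in OCaml. It is not an existing ATD tool: the `json` and `shape` types and the functions `merge`, `infer` and `to_atd` are hypothetical names invented for this example, and a real implementation would more likely build on a JSON library such as Yojson. The sketch covers lists, records, optional fields, int vs. float and the TODO placeholder; it does not attempt enum detection, and it treats null as carrying no type information.

```ocaml
(* Hypothetical minimal JSON value type, used only for this sketch. *)
type json =
  | Null
  | Bool of bool
  | Number of float
  | String of string
  | Array of json list
  | Object of (string * json) list

(* Inferred shape of a JSON value. *)
type shape =
  | Unknown   (* no evidence yet, e.g. an empty array or null *)
  | Conflict  (* incompatible JSON types seen at the same position *)
  | Bool_t | Int_t | Float_t | String_t
  | List_t of shape
  | Record_t of (string * bool * shape) list  (* name, optional?, type *)

(* Merge two shapes observed at the same position. *)
let rec merge a b =
  match a, b with
  | Unknown, x | x, Unknown -> x
  | Int_t, Float_t | Float_t, Int_t -> Float_t
  | List_t x, List_t y -> List_t (merge x y)
  | Record_t xs, Record_t ys -> Record_t (merge_fields xs ys)
  | x, y when x = y -> x
  | _ -> Conflict

(* A field missing from either side becomes optional. *)
and merge_fields xs ys =
  let from_xs =
    List.map (fun (name, opt, t) ->
      match List.find_opt (fun (n, _, _) -> n = name) ys with
      | Some (_, opt', t') -> (name, opt || opt', merge t t')
      | None -> (name, true, t)) xs
  in
  let only_in_ys =
    List.filter (fun (n, _, _) ->
      not (List.exists (fun (m, _, _) -> m = n) xs)) ys
    |> List.map (fun (n, _, t) -> (n, true, t))
  in
  from_xs @ only_in_ys

let rec infer = function
  | Null -> Unknown  (* simplification: null gives no type information *)
  | Bool _ -> Bool_t
  | Number f -> if Float.is_integer f then Int_t else Float_t
  | String _ -> String_t
  | Array items ->
    List_t (List.fold_left (fun acc x -> merge acc (infer x)) Unknown items)
  | Object fields ->
    Record_t (List.map (fun (k, v) -> (k, false, infer v)) fields)

(* Print a shape as ATD-like text; TODO marks what needs a human decision. *)
let rec to_atd = function
  | Unknown | Conflict -> "TODO"
  | Bool_t -> "bool"
  | Int_t -> "int"
  | Float_t -> "float"
  | String_t -> "string"
  | List_t t -> to_atd t ^ " list"
  | Record_t fields ->
    let field (name, opt, t) =
      if opt then Printf.sprintf "?%s: %s option;" name (to_atd t)
      else Printf.sprintf "%s: %s;" name (to_atd t)
    in
    "{ " ^ String.concat " " (List.map field fields) ^ " }"

(* Two sample objects: foo is always present, bar is sometimes missing. *)
let sample =
  Array [
    Object [ ("foo", Number 1.); ("bar", String "x") ];
    Object [ ("foo", Number 2.) ];
  ]

let () = print_endline (to_atd (infer sample))
(* prints: { foo: int; ?bar: string option; } list *)
```

Keeping `Unknown` separate from `Conflict` lets an empty array merge cleanly with later evidence, while genuinely mixed types still surface as a placeholder for the user to resolve.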

Tuples are more exotic but could be inferred given enough data (e.g. [ [1, "a"], [2, "ddd"], [32, "qiwi"], [-999, "jsjs"] ] → (int * string) list).
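One possible heuristic for the tuple case, sketched in OCaml: treat an array of arrays as a list of tuples when every row has the same length, the types agree column by column, and at least two columns differ in type. The `json` subset and the function names `scalar` and `infer_tuple_list` are hypothetical, made up for this example; the exact-match rule per column is an assumption, not something ATD prescribes.

```ocaml
(* Hypothetical subset of JSON, just enough for this sketch. *)
type json =
  | Number of float
  | String of string
  | Array of json list

(* Name of the scalar type of a cell, or None for nested arrays. *)
let scalar = function
  | Number f -> if Float.is_integer f then Some "int" else Some "float"
  | String _ -> Some "string"
  | Array _ -> None

(* Return Some "(t1 * t2 * ...) list" when all rows share one tuple shape. *)
let infer_tuple_list rows =
  let row_types = function
    | Array cells ->
      let ts = List.map scalar cells in
      if List.for_all Option.is_some ts
      then Some (List.map Option.get ts)
      else None
    | _ -> None
  in
  match List.map row_types rows with
  | Some first :: rest
    when List.length first >= 2
      (* same-typed columns would rather suggest a plain nested list *)
      && List.exists (fun t -> t <> List.hd first) first
      && List.for_all (fun r -> r = Some first) rest ->
    Some ("(" ^ String.concat " * " first ^ ") list")
  | _ -> None

(* The rows from the example above. *)
let rows = [
  Array [ Number 1.; String "a" ];
  Array [ Number 2.; String "ddd" ];
  Array [ Number 32.; String "qiwi" ];
  Array [ Number (-999.); String "jsjs" ];
]

let () =
  match infer_tuple_list rows with
  | Some t -> print_endline t  (* prints: (int * string) list *)
  | None -> print_endline "not a tuple list"
```

With only one or two rows this guess is weak, which is why "given enough data" matters: a real tool would probably require a minimum number of consistent rows before preferring a tuple over a list.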

Variants that are not just enums are more problematic as they don't have a standard representation outside of ATD.

If someone wants to start such a project in OCaml, let us know here! It would be a good fit in this repo. Alternatively, extending another project like https://github.com/mholt/json-to-go to use the ATD syntax could also work.

commented

@cyberhuman wrote a tool that did that. But I don't think it's open source (yet?)

I wrote a very basic proof of concept a long time ago but haven't worked on it since:
https://github.com/cyberhuman/fmtcheck/blob/master/json/json.ml

When working on it, I realized that writing an initial ATD file is very easy, but what I really want is to be able to update it easily when the wire format changes, or to fine-tune the auto-detected types. This would require the tool to take the existing ATD file as an input.