outlines-dev / outlines

Structured Text Generation

Home Page: https://outlines-dev.github.io/outlines/

Implement JSON schema field constraints

rlouf opened this issue

The JSON Schema specification lets us attach constraints to individual fields. Only maxLength for strings is currently implemented. Remaining:

Strings

  • minLength
  • pattern

and the default formats that can be specified via the format keyword:

  • date-time
  • time
  • date
  • duration
  • email
  • idn-email
  • hostname
  • idn-hostname
  • ipv4
  • ipv6
  • uuid
  • uri
  • uri-reference
  • iri
  • iri-reference
  • uri-template
  • regex
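
To make the string items above concrete, here is a minimal sketch of how minLength, maxLength, pattern and a few format values could be turned into a regular expression. The function name `string_schema_to_regex` and the `FORMAT_REGEX` table are hypothetical and are not the library's API. The format patterns are deliberately simplified, escape sequences inside strings are ignored, and `pattern` is assumed to describe the whole string.

```python
import re
from typing import Any, Dict

# Hypothetical, simplified patterns; real format validation is stricter.
_OCTET = r"(?:25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9]?[0-9])"
FORMAT_REGEX = {
    "date": r"[0-9]{4}-(?:0[1-9]|1[0-2])-(?:0[1-9]|[12][0-9]|3[01])",
    "uuid": r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}",
    "ipv4": r"(?:" + _OCTET + r"\.){3}" + _OCTET,
}


def string_schema_to_regex(schema: Dict[str, Any]) -> str:
    """Return a regex matching the JSON text of a string allowed by `schema`."""
    if "format" in schema:
        # Raises KeyError for formats this sketch does not cover.
        body = FORMAT_REGEX[schema["format"]]
    elif "pattern" in schema:
        # Assumption: the pattern is meant to match the whole string,
        # so leading/trailing anchors are dropped.
        body = schema["pattern"]
        if body.startswith("^"):
            body = body[1:]
        if body.endswith("$") and not body.endswith(r"\$"):
            body = body[:-1]
    else:
        lo = schema.get("minLength", 0)
        hi = schema.get("maxLength")
        upper = "" if hi is None else str(hi)
        # Any character except a double quote or a backslash.
        body = rf'[^"\\]{{{lo},{upper}}}'
    # The surrounding double quotes are part of the JSON string syntax.
    return f'"(?:{body})"'


length_rx = string_schema_to_regex({"minLength": 2, "maxLength": 3})
print(bool(re.fullmatch(length_rx, '"ab"')))
```

A table lookup keeps each format independent, so formats can be added one at a time as they are implemented.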

Numeric types

  • multipleOf
  • minimum
  • exclusiveMinimum
  • maximum
  • exclusiveMaximum

Arrays

  • minItems
  • maxItems
  • uniqueItems (may only be applicable dynamically)
  • Set length
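
As an illustration of the array items above, minItems and maxItems map onto a bounded repetition of the item pattern. This is only a sketch: `array_regex` is a hypothetical helper, whitespace between tokens is ignored, and "Set length" is read here as a fixed length (minItems equal to maxItems). uniqueItems is left out because a regular expression cannot express it in general.

```python
import re
from typing import Optional


def array_regex(item: str, min_items: int = 0, max_items: Optional[int] = None) -> str:
    """Regex for a JSON array whose elements all match `item` (no whitespace)."""
    if min_items < 0 or (max_items is not None and max_items < min_items):
        raise ValueError("inconsistent bounds")
    if max_items == 0:
        return r"\[\]"
    item = f"(?:{item})"
    # One leading item, then between lo and hi repetitions of ",item".
    lo = max(min_items - 1, 0)
    hi = "" if max_items is None else str(max_items - 1)
    non_empty = f"{item}(?:,{item}){{{lo},{hi}}}"
    if min_items == 0:
        # The empty array is allowed as well.
        return r"\[(?:" + non_empty + r")?\]"
    return r"\[" + non_empty + r"\]"


one_to_three = array_regex(r"[0-9]+", min_items=1, max_items=3)
print(bool(re.fullmatch(one_to_three, "[1,2,3]")))
```

Treating the first item separately from the following ",item" groups avoids leading or trailing commas in the generated text.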

Tuples

See https://json-schema.org/understanding-json-schema/reference/array#tupleValidation
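
For tuples, a rough sketch of the idea is to concatenate one pattern per position. `tuple_regex` is a hypothetical helper, and it assumes a closed tuple: every position is present and no additional items are allowed, which is stricter than what JSON Schema's tuple validation requires by default.

```python
import re
from typing import List


def tuple_regex(position_patterns: List[str]) -> str:
    """Regex for a JSON array with exactly one element per pattern, in order."""
    items = ",".join(f"(?:{p})" for p in position_patterns)
    return r"\[" + items + r"\]"


# A (number, lowercase string) pair; whitespace is ignored in this sketch.
pair = tuple_regex([r"[0-9]+", r'"[a-z]+"'])
print(bool(re.fullmatch(pair, '[1,"ab"]')))
```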

Required fields

We should also handle optional fields, i.e. properties that are not listed in the schema's required array.
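
The difficulty with optional fields is comma placement. One possible approach, sketched below with a hypothetical `object_regex` helper, anchors on the first required property so that every other member carries its own comma. It assumes properties are emitted in a fixed order, without whitespace, and that each value pattern is supplied by the caller.

```python
import re
from typing import Dict, Iterable


def object_regex(properties: Dict[str, str], required: Iterable[str] = ()) -> str:
    """Regex for a JSON object with properties in declaration order."""
    required = set(required)
    members = [
        (name in required, f'"{re.escape(name)}":(?:{value})')
        for name, value in properties.items()
    ]
    if not members:
        return r"\{\}"
    if any(req for req, _ in members):
        first_req = next(i for i, (req, _) in enumerate(members) if req)
        body = ""
        for i, (req, member) in enumerate(members):
            if i < first_req:
                body += f"(?:{member},)?"   # optional, trailing comma
            elif i == first_req:
                body += member              # the anchor carries no comma
            elif req:
                body += f",{member}"        # required, leading comma
            else:
                body += f"(?:,{member})?"   # optional, leading comma
    else:
        # No required property: pick which optional member comes first.
        alternatives = []
        for i, (_, member) in enumerate(members):
            tail = "".join(f"(?:,{m})?" for _, m in members[i + 1:])
            alternatives.append(member + tail)
        body = "(?:" + "|".join(alternatives) + ")?"
    return r"\{" + body + r"\}"


rx = object_regex({"a": "[0-9]+", "b": "[0-9]+", "c": "[0-9]+"}, required=["b"])
print(bool(re.fullmatch(rx, '{"a":1,"b":2}')))
```

When nothing is required, the pattern grows quadratically with the number of properties, which seems acceptable for small schemas.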

Here are some examples of integer range constraints expressed as regular expressions: https://stackoverflow.com/a/34680927/3006474, https://3widgets.com/
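
In the same spirit as those links, here is a small sketch that builds a regular expression for a closed range of non-negative integers, which would cover minimum and maximum on integers (exclusiveMinimum and exclusiveMaximum reduce to shifting a bound by one). The names `int_range_regex` and `_same_width` are hypothetical, and this is a straightforward digit-by-digit construction, not the algorithm used by the linked pages. Negative numbers and leading zeros are not handled.

```python
import re


def _same_width(lo: str, hi: str) -> str:
    """Regex for digit strings s with len(s) == len(lo) == len(hi) and lo <= s <= hi."""
    if not lo:
        return ""
    if lo[0] == hi[0]:
        # Shared leading digit: constrain only the remaining digits.
        return lo[0] + _same_width(lo[1:], hi[1:])
    n = len(lo) - 1
    if n == 0:
        return f"[{lo}-{hi}]"
    # Lowest leading digit: the rest must be at least lo[1:].
    parts = [lo[0] + _same_width(lo[1:], "9" * n)]
    # Leading digits strictly in between: the rest is unconstrained.
    if int(hi[0]) - int(lo[0]) >= 2:
        parts.append(f"[{int(lo[0]) + 1}-{int(hi[0]) - 1}][0-9]{{{n}}}")
    # Highest leading digit: the rest must be at most hi[1:].
    parts.append(hi[0] + _same_width("0" * n, hi[1:]))
    return "(?:" + "|".join(parts) + ")"


def int_range_regex(lo: int, hi: int) -> str:
    """Regex matching the decimal form of every integer n with lo <= n <= hi."""
    if not 0 <= lo <= hi:
        raise ValueError("expected 0 <= lo <= hi")
    parts = []
    # Handle each number of digits separately.
    for width in range(len(str(lo)), len(str(hi)) + 1):
        start = max(lo, 10 ** (width - 1) if width > 1 else 0)
        end = min(hi, 10 ** width - 1)
        parts.append(_same_width(str(start), str(end)))
    return "(?:" + "|".join(parts) + ")"


rx = int_range_regex(17, 1234)
print(bool(re.fullmatch(rx, "1234")), bool(re.fullmatch(rx, "1235")))
```

The resulting patterns are not minimal, but they are easy to check exhaustively against the range they are supposed to describe.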

What is the status of this? Are length constraints for tuples/lists also implemented?

To be honest, I switched to ggml’s EBNF for grammar constraints.

How does an EBNF-specified grammar provide these constraints?