json-schema-org / vocab-idl

Help and clarify how JSON Schema can be interpreted from validation rules to data definition. This extends to how those data definitions can be represented in any programming language.

Define how properties with simple array type can be interpreted

jonaslagoni opened this issue · comments

We need to define how array type properties should be interpreted, such as for the following schema:

{
  "$schema": "https://json-schema.org/draft/2020-12/idl-schema",
  "name": "SomeTitle",
  "type": "object",
  "properties": {
    "ArrayProperty": {
      "type": "array",
      "items": { "type": "string" }
    }
  }
}

I'm confused. This seems pretty straightforward to me. In what way, other than the obvious one, could this be interpreted?

I would say so myself. One question that might come up is simple arrays vs. lists. Of course, some implementations might want to offer this as an option during generation rather than hardcoding it. We just need to define the default 🙂

Regardless, I created the task to ensure we have test cases for it 🙂

Ahhh, now I see what you mean. Generally, I think it should be an array, except that when it has "uniqueItems": true it would be a Set. However, there will need to be some language-specific leeway for common practices in each language. For example, I would probably expect Java to use ArrayList and many functional languages to use linked lists. An idl-vocab keyword can be introduced if people want something other than the default.

one question that might come up is simple arrays vs lists

What does this mean? Is this referring to a particular language's use of these terms?

Ahhh, now I see what you mean. Generally, I think it should be an array, except that when it has "uniqueItems": true it would be a Set. However, there will need to be some language-specific leeway for common practices in each language. For example, I would probably expect Java to use ArrayList and many functional languages to use linked lists. An idl-vocab keyword can be introduced if people want something other than the default.

Did not even think about uniqueItems, that's a good point 👍 I think we should discuss that point in another issue and focus on the core array type here (just to keep the discussion clear).

For the array type, would you not expect the simple array type by default in all languages, instead of a more advanced array type?

For the vocabulary keyword, what would make sense to introduce, and any suggestions if we can make it language-agnostic? I can only think of non-agnostic variants.

Non language-agnostic: arrayType: 'List'

Maybe as object:

arrayType: {
   java: 'List',
   csharp: 'ArrayList'
}
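
To make the two variants concrete, here is a rough sketch of how a generator might read such a keyword. Everything in it is an assumption for illustration: `arrayType` is only the keyword floated above, not part of any published vocabulary, and `resolve_array_type` is a hypothetical helper name.

```python
from collections.abc import Mapping
from typing import Any


def resolve_array_type(schema: Mapping[str, Any], target_language: str, default: str) -> str:
    """Return the collection type name a generator would emit for an array schema."""
    hint = schema.get("arrayType")  # hypothetical keyword, not a published one
    if isinstance(hint, str):
        # Non-agnostic form: the same name is used for every target language.
        return hint
    if isinstance(hint, Mapping):
        # Object form: pick the entry for the target language, else the default.
        return hint.get(target_language, default)
    # Keyword absent (or unusable): fall back to the generator's default.
    return default


schema = {
    "type": "array",
    "items": {"type": "string"},
    "arrayType": {"java": "List", "csharp": "ArrayList"},
}
print(resolve_array_type(schema, "java", default="Array"))
print(resolve_array_type(schema, "go", default="Array"))
```

The object form keeps the schema usable for languages it does not mention, since those fall back to whatever default gets defined here.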

What does this mean? Is this referring to a particular language's use of these terms?

In languages such as Java, we have simple arrays such as String[], which are not dynamically sized: you initialize them with a specific size. The other option is a dynamic array type in the form of a List, such as ArrayList, which has helper methods so you can add entries dynamically, e.g. arrayProperty.add("value"). Did that clarify it @karenetheridge 🙂?

For the array type, would you not expect the simple array type by default in all languages, instead of a more advanced array type?

Not necessarily. In JSON Schema, { "type": "array", "items": { "type": "string" } } represents an open-ended array of strings. There might be zero, one, twenty, or thousands. A fixed-size, sequential memory array isn't capable of that. Therefore, I would expect a dynamically sized array list to be used for that schema. However, if the array included a maxItems keyword, a fixed-size sequential memory array would make sense.
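
As a sketch of that rule, a generator could pick a default collection kind from the keywords present on the schema. This is only one reading of the suggestions in this thread, not an agreed default: the function name and the three return labels are made up for illustration, and treating any `maxItems` as "fixed-size" follows the comment above even though such an array may still hold fewer items than the bound.

```python
from collections.abc import Mapping
from typing import Any


def default_collection_kind(schema: Mapping[str, Any]) -> str:
    """Pick a default collection kind for an array schema (illustrative labels)."""
    if schema.get("type") != "array":
        raise ValueError("expected a schema with \"type\": \"array\"")
    if schema.get("uniqueItems") is True:
        # Suggested earlier in the thread: unique items map to a Set.
        return "set"
    if "maxItems" in schema:
        # An upper bound is known, so a fixed-size array can hold every instance.
        return "fixed_array"
    # Open-ended: zero, one, twenty, or thousands of items.
    return "dynamic_list"


print(default_collection_kind({"type": "array", "items": {"type": "string"}}))
print(default_collection_kind({"type": "array", "items": {"type": "string"}, "maxItems": 3}))
```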

For the vocabulary keyword, what would make sense to introduce, and any suggestions if we can make it language-agnostic?

This shouldn't be a problem. We just have to make up our own words and define them. Then implementations should translate those into whatever data structure most closely fits the definition in the target language. For example:

  • FixedSizedIterable - A fixed-size ordered collection of items with sequential numeric indexes that start at 0 or 1. (Example: A traditional fixed memory array)
  • Iterable - An indefinitely-sized ordered collection of items with sequential numeric indexes that start at 0 or 1. (Example: ArrayList)
  • ForwardIterable - An indefinitely-sized ordered collection of items. (Example: A standard linked list)
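
A rough sketch of how an implementation might translate these terms into target-language types. The table entries are illustrative picks (each implementation would choose what fits its language best), and `TYPE_TABLE` and `render_collection_type` are hypothetical names, not part of any existing tool.

```python
# Abstract term -> type template per target language. The templates are
# illustrative choices, not a recommendation from the vocabulary.
TYPE_TABLE = {
    "java": {
        "FixedSizedIterable": "{item}[]",
        "Iterable": "ArrayList<{item}>",
        "ForwardIterable": "LinkedList<{item}>",
    },
    "csharp": {
        "FixedSizedIterable": "{item}[]",
        "Iterable": "List<{item}>",
        "ForwardIterable": "LinkedList<{item}>",
    },
}


def render_collection_type(term: str, language: str, item_type: str) -> str:
    """Render the concrete type for an abstract term in a target language."""
    try:
        template = TYPE_TABLE[language][term]
    except KeyError:
        raise ValueError(f"no mapping for {term!r} in {language!r}") from None
    return template.format(item=item_type)


print(render_collection_type("Iterable", "java", "String"))
print(render_collection_type("FixedSizedIterable", "csharp", "string"))
```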

What does this mean? Is this referring to a particular language's use of these terms?

This was a good question because I had assumed "array" to mean anything iterable with sequential numeric indexing and "list" to mean a linked list data structure.

except when it has "uniqueItems": true it would be a Set - @jdesrosiers

We need to be careful that we don't specify things that languages don't have native support for. For instance, C/C++ defines arrays, but they don't define complex collection types. They can be created and included, but it's not built-in.

I think the best we could do is follow @jdesrosiers' suggestion and define our own words and let implementations decide what types best fit.

I had assumed "array" to mean anything iterable with sequential numeric indexing and "list" to mean a linked list data structure.

In C#, lists are iterable, but not necessarily a linked list data structure. The difference between arrays and lists there is that arrays are statically allocated while lists are dynamically allocated.

We need to be careful that we don't specify things that languages don't have native support for. For instance, C/C++ defines arrays, but they don't define complex collection types. They can be created and included, but it's not built-in.

I think the best we could do is follow @jdesrosiers' suggestion and define our own words and let implementations decide what types best fit.

That's a good point, and there will definitely be multiple cases like this across languages. Therefore, suggestions per language are all we can do, I think 🙂 We could even suggest less restrictive types in certain scenarios 🤔 But yeah, in the end it's the end users' choice which types they want to use, and that might vary from person to person tbh.