json-schema-org / community

A space to discuss community and organisational related things

GSoC: Define upgrade/downgrade language agnostic declarative transformation rules for all JSON Schema dialects

jviotti opened this issue

Project title

Define upgrade/downgrade language agnostic declarative transformation rules for all JSON Schema dialects.

Brief Description

The Alterschema project defines a set of JSON-based formal transformation rules for upgrading schemas between Draft 4 and 2020-12, and all dialects in between. These rules are defined using JSON Schema and JSON-e and live within the Alterschema project.

We would like to revise these rules, extend them to support every dialect of JSON Schema (potentially including OpenAPI's old dialects too), and attempt to support some level of downgrading.

Instead of having these rules on the Alterschema repository, we want to have them on the JSON Schema organization for everybody to consume, including Alterschema itself.

Revising the rule format should consider currently unresolved edge cases in Alterschema like tweaking references after a subschema is moved.

Expected Outcomes

A new repository in the JSON Schema organization with upgrade/downgrade rules defined using JSON.

Skills Required

Understanding of various dialects of JSON Schema and their differences.

Mentors

@jviotti

Expected Difficulty

Medium

Expected Time Commitment

350 hours

Thanks Juan. This looks amazing!

Hey @jviotti, I read through the problem statement and loved how clearly the description lays out the problem. I would love to work on this under GSoC with the mentors. Can you guide me on how to understand it better and where to start? 😁
Also, would it be a good idea to read through all of the repositories?

Hey there! I'd first suggest getting acquainted with https://github.com/sourcemeta/alterschema. This is the original project where I prototyped something like what we want to do here, using JSON-e (https://json-e.js.org), but ended up hitting some blockers. You can take a look at all the upgrade transformation rules I support here: https://github.com/sourcemeta/alterschema/tree/master/rules. Try to read them, and understand them mainly in conjunction with JSON Schema's official migration guide: https://json-schema.org/specification#migrating-from-older-drafts.

The way Alterschema works is pretty simple. It recursively traverses every subschema of the given schema in a top-down manner, applying all the rules it knows about to every subschema over and over again until no more transformation rules can be executed. Its core business logic is literally a small JavaScript file: https://github.com/sourcemeta/alterschema/blob/master/bindings/node/index.js
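As a rough illustration of that traversal (this is not the actual Alterschema code), a fixpoint loop could look like the sketch below. The rule shape (`condition` and `transform` functions) and the short list of visited keywords are assumptions made for the example; the real rules are JSON-e templates.

```javascript
// Hypothetical rule shape: { condition(subschema), transform(subschema) }.
const rules = [
  {
    // 2019-09 array-form `items` becomes `prefixItems`
    condition: (s) => Array.isArray(s.items),
    transform: ({ items, ...rest }) => ({ ...rest, prefixItems: items }),
  },
];

// Apply every rule to one subschema until none of them matches anymore.
function applyRules(subschema, rules) {
  let current = subschema;
  let changed = true;
  while (changed) {
    changed = false;
    for (const rule of rules) {
      if (rule.condition(current)) {
        current = rule.transform(current);
        changed = true;
      }
    }
  }
  return current;
}

// Top-down walk: transform the current subschema first, then recurse.
// Simplification: only a handful of subschema-holding keywords are visited.
function walk(schema, rules) {
  if (typeof schema !== 'object' || schema === null) return schema; // boolean schemas
  const result = { ...applyRules(schema, rules) }; // copy, so the input is not mutated
  for (const key of ['items', 'additionalItems', 'not']) {
    if (key in result && !Array.isArray(result[key])) result[key] = walk(result[key], rules);
  }
  for (const key of ['prefixItems', 'allOf', 'anyOf', 'oneOf']) {
    if (Array.isArray(result[key])) result[key] = result[key].map((s) => walk(s, rules));
  }
  for (const key of ['properties', '$defs']) {
    if (key in result) {
      result[key] = Object.fromEntries(
        Object.entries(result[key]).map(([name, s]) => [name, walk(s, rules)])
      );
    }
  }
  return result;
}

console.log(JSON.stringify(walk({ type: 'array', items: [{ type: 'string' }] }, rules)));
// {"type":"array","prefixItems":[{"type":"string"}]}
```

Note that each rule only ever sees the subschema it is applied to, which is exactly why references elsewhere in the document go unnoticed.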

For example, Alterschema rules for upgrading JSON Schema 2019-09 to 2020-12 are defined here: https://github.com/sourcemeta/alterschema/blob/master/rules/jsonschema-2019-09-to-2020-12.json, based on what JSON Schema published here: https://json-schema.org/draft/2020-12/release-notes.

Now, what we would like to do in this GSoC initiative is learn from what we did in Alterschema to do another take on the problem that solves the limitations of Alterschema. The main limitation is this one: sourcemeta/alterschema#43.

In summary, a JSON Schema may reference other parts of itself using URI encoded JSON Pointers along with the $ref and $dynamicRef keywords. The current JSON-e rules that I have on Alterschema will only look at the current subschema and blindly transform it according to what the template says.

However, what happens if a reference in another part of the schema is now invalid after a schema transformation you did somewhere else? We don't have a deterministic way of detecting this, let alone of knowing how to "fix up" the reference pointers.

The conclusion I got from this is that JSON-e, while powerful, is too low level and doesn't carry semantics about what the transformation actually did. For example, if you upgrade definitions to $defs, that's a simple rename. Knowing that it is indeed just a simple rename, it's easy to know how to fix any pointers that included /definitions in it.

So what I'm thinking about is that we can study the transformation rules that we want to perform, and break them down into higher level sub transformations. For example, are you completely deleting something? Are we performing just a rename? Are we moving the contents around? If we design a JSON language that works at a higher level of abstraction, we can deterministically know how we should fix any affected pointer.

So I'd say the phases in this project are like this:

  • Research JSON Schema transformation rules, categorize them, etc
  • Come up with a higher-level transformation language than JSON-e that carries semantics about how we are actually transforming the schema (I was thinking something similar to JSON Patch (https://jsonpatch.com))
  • Then do a prototype of implementing upgrade rules with this language, ensuring it solves the limitations of Alterschema
  • If we have more time, we use this language to attempt some level of downgrading support, etc

As an initial qualifying task for this project (cc @benjagm), I propose:

  • Go through every upgrade transformation rule from JSON Schema 2019-09 to 2020-12 in the official upgrade guide (https://json-schema.org/draft/2020-12/release-notes) and in Alterschema (https://github.com/sourcemeta/alterschema/blob/master/rules/jsonschema-2019-09-to-2020-12.json) and categorize them in a spreadsheet/table based on what they are doing. For example, are they simple renames? Are they completely moving stuff around? Are they doing something even more complicated? It is up to you to figure out how to categorize them

  • Propose a toy JSON-based DSL transformation language (perhaps inspired by JSON-e and JSON Patch) that encapsulates how to perform these 2019-09 to 2020-12 upgrade rules in a way that you can algorithmically tell how to fix any $ref JSON Pointer that went through the transformed schema

  • Describe a pseudo-algorithm to fix up $refs

As a more specific (though probably a bit artificial and silly 😅) example of the $ref issue, consider the following JSON Schema 2019-09:

{
  "$schema": "https://json-schema.org/draft/2019-09/schema",
  "type": "array",
  "items": [
    { "type": "string" },
    { "type": "number" }
  ],
  "additionalItems": { 
    "$ref": "#/items/0" 
  }
}

To turn it into a JSON Schema 2020-12, we need to:

  • Replace $schema with https://json-schema.org/draft/2020-12/schema
  • Rename /items to /prefixItems
  • Rename /additionalItems to /items

However, if you blindly perform these transformations, you would end up with the following schema:

{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "type": "array",
  "prefixItems": [
    { "type": "string" },
    { "type": "number" }
  ],
  "items": { 
    "$ref": "#/items/0" 
  }
}

However note that the /items/$ref, which still says #/items/0 is now invalid. We first renamed prefixItems to items, so the $ref should have been updated to #/prefixItems/0 too.

This one is a bit simple, but think about more complex variations of the same problem. You might have long references where many of its components will need to be updated, and in some cases, it will be more than just a component rename.

Or if you can think of a better way to deterministically solve this problem, please propose it and we can work on it together!

However note that the /items/$ref, which still says #/items/0 is now invalid. We first renamed prefixItems to items, so the $ref should have been updated to #/prefixItems/0 too.

I'm confused by this line. Are we supposed to convert prefixItems to items for the reference to be #/prefixItems/0 as part of the conversion from 2019-09 to 2020-12?

Perhaps you meant items to prefixItems, or maybe I am misunderstanding? 😕

@MeastroZI The reference was originally #/items/0, but because we rename items to prefixItems, for the schema to be valid, we should have also adjusted the reference from #/items/0 to #/prefixItems/0. The expected end result should have been this:

{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "type": "array",
  "prefixItems": [
    { "type": "string" },
    { "type": "number" }
  ],
  "items": { 
    "$ref": "#/prefixItems/0" 
  }
}

Hasn't this problem already been addressed with the pattern

"pattern": "/items/\\d+"

"$eval": "replace(schema['$ref'], '/items/(\\d+)', '/prefixItems/$1')"

or is there a possibility that this approach might not cover all cases? If so, could you please specify which cases it might not handle, so I can gain a better understanding of the issue?

@MeastroZI For this very trivial rename case yes, but it's very easy to construct valid JSON Schemas where that simple pattern won't do. Take this one as a silly example:

{
  "$schema": "https://json-schema.org/draft/2019-09/schema",
  "type": "object",
  "properties": {
    "items": {
      "items": [
        { "type": "string" }
      ]
    },
    "extra": {
      "$ref": "#/properties/items/items/0" 
    }
  }
}

It has an object property called items which is not the actual JSON Schema keyword. In this case, you need to rename only /properties/items/items to /properties/items/prefixItems, and thus only rename the second occurrence of items in the JSON Pointer. In JSON Schema 2019-09, items can also be either a schema or an array of schemas, so you can have items be a schema that declares items as an array inside and get into a similar situation. You can probably come up with more edge cases around it.

In any case, items to prefixItems is just a simple rename upgrade example. Other JSON Schema keywords may require more than just a simple renaming, making this even harder to resolve for all cases.

Keep in mind that a tool that upgrades schemas must be able to handle ANY valid JSON Schema document that the user passes to it, and handle these tricky edge cases accordingly.

For example, the definitions to $defs case in the Alterschema issue I shared is even trickier, because you cannot rely on the next component being an integer to improve the pattern like we do for items to prefixItems.

Here is a fun one that is valid and breaks the \\d part of the regex:

{
  "$schema": "https://json-schema.org/draft/2019-09/schema",
  "type": "object",
  "properties": {
    "foo": {
      "$ref": "#/$defs/items/0" 
    }
  },
  "$defs": {
    "items": {
      "0": {
        "type": "string"
      }
    }
  }
}

What I'm thinking about is that we can statically analyze the schema first, and know what each component of the pointers means (i.e. does the /items part of #/$defs/items correspond to the actual items 2019-09 applicator in array form?). That, plus additional semantics around what the transformation does, could help us resolve every case.
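To make the static analysis idea more concrete, here is a minimal sketch that walks a same-document reference token by token against the original schema, tracking whether each token is a keyword, a property name, or an array index. The helper name `fixItemsRef` and the (incomplete) keyword lists are assumptions for illustration, and it only covers the items to prefixItems rename.

```javascript
// 2019-09 keywords grouped by the shape of their value (not an exhaustive list).
const SCHEMA_MAPS = new Set(['properties', 'patternProperties', '$defs', 'dependentSchemas']);
const SCHEMA_LISTS = new Set(['allOf', 'anyOf', 'oneOf']);
const SINGLE_SCHEMAS = new Set(['items', 'additionalItems', 'additionalProperties', 'not', 'if', 'then', 'else', 'contains']);

// Hypothetical helper: rewrite a "#/..." reference so that it still resolves
// once array-form `items` has been renamed to `prefixItems`.
// `schema` is the ORIGINAL (not yet upgraded) 2019-09 document.
function fixItemsRef(schema, ref) {
  if (!ref.startsWith('#/')) return ref; // only same-document JSON Pointers
  let node = schema;
  let kind = 'schema'; // what `node` is: 'schema', 'map', 'list' or 'other'
  const tokens = [];
  for (const raw of ref.slice(2).split('/')) {
    // Undo URI percent-encoding, then JSON Pointer escaping (~1 and ~0)
    const token = decodeURIComponent(raw).replace(/~1/g, '/').replace(/~0/g, '~');
    const isObject = typeof node === 'object' && node !== null;
    let renamed = raw;
    let nextKind = 'other';
    if (kind === 'map' || kind === 'list') {
      nextKind = 'schema'; // a property name or an array index, never a keyword
    } else if (kind === 'schema') {
      if (token === 'items' && isObject && Array.isArray(node.items)) {
        renamed = 'prefixItems'; // the actual array-form applicator
        nextKind = 'list';
      } else if (SCHEMA_MAPS.has(token)) nextKind = 'map';
      else if (SCHEMA_LISTS.has(token)) nextKind = 'list';
      else if (SINGLE_SCHEMAS.has(token)) nextKind = 'schema';
    }
    tokens.push(renamed);
    node = isObject ? node[token] : undefined;
    kind = nextKind;
  }
  return '#/' + tokens.join('/');
}

const tricky = { $defs: { items: { '0': { type: 'string' } } } };
const naive = (ref) => ref.replace(/\/items\/(\d+)/, '/prefixItems/$1');

console.log(naive('#/$defs/items/0')); // #/$defs/prefixItems/0 (broken)
console.log(fixItemsRef(tricky, '#/$defs/items/0')); // #/$defs/items/0 (untouched)
```

The point of the sketch is only that the decision depends on the schema structure around each token, not on the text of the pointer.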

What I'm thinking about is that we can statically analyze the schema first, and know what each component of the pointers means (i.e. does the /items part of #/$defs/items correspond to the actual items 2019-09 applicator in array form?). That, plus additional semantics around what the transformation does, could help us resolve every case.

Hi, so instead of handling every single case for each keyword to be transformed, it is better to make checks based on the semantic hierarchical flow. Am I right? Like checking whether it's an array or an object, whether it's a real items keyword, and then casting the 0 to a string? Is that what semantics means?

Not 100% sure what you mean, but what I mean by semantics is being able to statically analyze the actual transformation DSL and actually understand what it does. For example, you cannot very easily tell from a JSON-e template that such a template is actually a property rename. And if we can tell that, for example, a rule is actually a rename from A to B, then we might know how to handle the reference fix-ups.

Coming back to the items to prefixItems example we've been discussing so far, this is the corresponding JSON-e rule we have in Alterschema:

{
  "$merge": [
    { "$eval": "omit(schema, 'items')" },
    {
      "prefixItems": {
        "$eval": "schema.items"
      }
    }
  ]
}

What if instead of that weird-looking low-level complex JSON template, we instead had:

[
  { "type": "rename", "from": "items", "to": "prefixItems" }
]

The latter is a LOT more machine readable.

I guess the main challenge is that leaving the simple prefixItems rule aside, some upgrade rules are more complex and involve even more cryptic JSON-e templates that do more than just renames. So the problem statement is: can we come up with a set of higher level operations that capture everything we need, AND that is machine readable enough for us to deterministically do $ref fix-ups in every possible case?
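As a small sketch of what "machine readable" buys us, the rename rule above can be both executed and inspected with a few lines of code. Only the `rename` operation shape comes from the example above; the function names are made up for illustration, and rule conditions (for instance, "only when items is an array") are left out.

```javascript
// The rule from the example above. Any other operation type would be a
// design decision still to be made, so this sketch rejects them.
const rule = [{ type: 'rename', from: 'items', to: 'prefixItems' }];

// Apply the operations to a single subschema, preserving keyword order.
function applyOperations(subschema, operations) {
  let result = subschema;
  for (const op of operations) {
    if (op.type !== 'rename') throw new Error(`Unsupported operation: ${op.type}`);
    result = Object.fromEntries(
      Object.entries(result).map(([key, value]) => [key === op.from ? op.to : key, value])
    );
  }
  return result;
}

// Because the operations are declarative, their effect on JSON Pointer
// tokens can be read off the rule itself, without executing it.
function pointerRenames(operations) {
  return new Map(
    operations.filter((op) => op.type === 'rename').map((op) => [op.from, op.to])
  );
}

console.log(JSON.stringify(applyOperations({ type: 'array', items: [{ type: 'string' }] }, rule)));
// {"type":"array","prefixItems":[{"type":"string"}]}
console.log(pointerRenames(rule).get('items')); // prefixItems
```

Doing the same inspection on the `$merge`/`$eval` JSON-e template would require understanding arbitrary expressions.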

So I'd say the phases in this project are like this:

  • Research JSON Schema transformation rules, categorize them, etc
  • Come up with a higher-level transformation language than JSON-e that carries semantics about how we are actually transforming the schema (I was thinking something similar to JSON Patch (https://jsonpatch.com))
  • Then do a prototype of implementing upgrade rules with this language, ensuring it solves the limitations of Alterschema
  • If we have more time, we use this language to attempt some level of downgrading support, etc

@jviotti one question on this: should the high-level transformation language call JSON-e in the backend (that is, should the high-level language be written on top of JSON-e itself)?

@Era-cell Maybe. I'm open to both building it on top of JSON-e or as a standalone thing. Whatever is easier I guess

Thanks a lot for joining JSON Schema org for this edition of GSoC!!

Qualification tasks will be published as comments in the project ideas by Thursday/Friday of this week. In addition, I'd like to invite you to an office hours session this Thursday at 18:30 UTC, where we'll present the ideas and the relevant dates to consider at this stage of the program.

Please use this link to join the session:
🌐 Zoom
📅 2024-02-29 18:30 UTC

See you there!

For the qualifying task, just to echo back what I said before: the main thing we want to see on proposals is that you have a good grasp on what the problem of upgrading JSON Schemas is and are capable of understanding the upgrade rules that would need to be implemented.

So for that, you can focus only on 2019-09 to 2020-12 for the proposal (we'll cover other drafts later), list down the transformation rules that need to happen on all those drafts, and try to categorize them based on different criteria to understand them better. For example, what vocabulary they involve, what type of operation they are (rename, wrap, etc), whether they affect other sibling or non sibling keywords, etc. Be creative! Good grouping criteria can surface patterns that we might not be thinking about and that could influence the DSL. You can present this as a spreadsheet, list, or any form you want.

Then, once accepted, we will continue building on this analysis to design the DSL, and finally implement it. If we do the previous phases well (mainly the one on understanding and categorizing the transformation rules), the rest will be easy.

{
  "$schema": "https://json-schema.org/draft/2020-12",
  "$id": "https://example.com/anotherthing/agains/customer",

  "type": "object",
  "properties": {
    "name": { "type": "string" },
    "phone": { "$ref": "/schema/common#/$defs/phone" },
    "address": { "$ref": "/schema/address" }
  },

  "$defs": {
    "https://example.com/schema/address": {
      "$id": "https://example.com/schema/address",

      "type": "object",
      "properties": {
        "address": { "type": "string" },
        "city": { "type": "string" },
        "postalCode": { "$ref": "/schema/common#/$defs/usaPostalCode" },
        "state": { "$ref": "#/$defs/states" }
      },

      "$defs": {
        "states": {
          "enum": [4, 4]
        }
      }
    },
    "https://example.com/schema/common": {
      "$schema": "https://json-schema.org/draft/2019-09",
      "$id": "https://example.com/schema/common",

      "$defs": {
        "phone": {
          "type": "number"
        },
        "usaPostalCode": {
          "type": "string",
          "pattern": "^[0-9]{5}(?:-[0-9]{4})?$"
        },
        "unsignedInt": {
          "type": "integer",
          "minimum": 0
        }
      }
    }
  }
}

@jviotti I am not able to understand how, in this case, this $ref under:

"phone": { "$ref": "/schema/common#/$defs/phone" }

which has the relative path, gets resolved by the schema validator. I mean, how is the base URL for this calculated even if there is nothing common in the relative path under $ref and the $id of the root?

Did you try to run it? I am thinking this is related to how schemas are stored

@Era-cell, I have read somewhere that a $ref is resolved by pointing directly to the part of the schema it refers to. So my question is: how does the schema validator resolve this $ref with a relative path? Even if the validator stores these schemas under definitions, or in some other way under the hood, it still needs to resolve the $ref to find them.

The schema I provided is not invalid; it works and successfully validates the JSON data.

You can try it here:
https://www.jsonschemavalidator.net/

Edited: Sorry, I am typing from my phone, so you may find typos in my messages.

@MeastroZI Your reference, /schema/common#/$defs/phone is a URI reference, where /schema/common is the URI path and #/$defs/phone is the URI fragment. Furthermore, that URI reference is relative.

According to JSON Schema use of URI and the URI RFC, that relative URI is resolved taking https://example.com/anotherthing/agains/customer (the $id of the schema resource that contains such reference), as the base URI.

Following standard URI behavior, the result of resolving /schema/common#/$defs/phone against https://example.com/anotherthing/agains/customer results in https://example.com/schema/common#/$defs/phone. Then, when resolving that reference, JSON Schema will look for https://example.com/schema/common, which is an embedded schema resource in the schema you shared, and from then, resolve #/$defs/phone as a JSON Pointer.

If URI behavior is the confusing part, I recommend reading the URI RFC: https://www.rfc-editor.org/rfc/rfc3986
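For a quick way to see that resolution in action, the standard `URL` class available in Node.js and browsers resolves a relative reference against a base in the same way for this case. This is only a sketch of the URI step; a real validator would then look up the resulting resource among the schemas it knows about.

```javascript
// The $id of the schema resource that contains the reference acts as the base URI.
const base = 'https://example.com/anotherthing/agains/customer';
const absolute = new URL('/schema/common#/$defs/phone', base).href;

// Split the absolute URI into the schema resource and the fragment.
const [resourceUri, fragment] = absolute.split('#');
const pointer = decodeURIComponent(fragment); // JSON Pointer inside that resource

console.log(absolute);    // https://example.com/schema/common#/$defs/phone
console.log(resourceUri); // https://example.com/schema/common
console.log(pointer);     // /$defs/phone
```

Because the reference starts with `/`, it replaces the whole path of the base URI, which is why nothing in `/anotherthing/agains/customer` needs to match.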

const transformRule = [
    {
        referencTraverser: true,
        path: "properties/*",
        conditions: [{ "isKey": "$ref" }],
        refConditions: [{ "isKey": "items", "hasSibling": ["type", "array"] }],
        updateRefPart: "prefixItems"
    },
    {
        path: "*",
        conditions: [{ "isKey": "items", "hasSibling": ["type", "array"] }],
        operations: {
            "editKey": "prefixItems"
        }
    },
    {
        path: "$schema",
        operations: {
            "updateValue": "https://json-schema.org/draft/2020-12/schema"
        }
    }
]

const jasonobj = {
    "$schema": "https://json-schema.org/draft/2019-09/schema",
    "type": "object",
    "properties": {
        "items": {
            "type": "array",
            "items": [
                { "type": "string" }
            ]
        },
        "extra": {
            "$ref": "#/properties/items/items/0"
        }
    },
    "ooos": {
        "items2": {
            "type": "array",
            "items": []
        },
        "item3": {
            "items4": {
                "items5": {
                    "type": "array",
                    "items": []
                }
            }
        }
    }
}

const result = convert(transformRule, jasonobj)
console.log('\n')
console.log('*******************************Logs*****************************************')
console.log('\n\n\n\n\n\n')
console.log('*******************************Result****************************************')
console.log( JSON.stringify (result , null , 2))
console.log('*******************************Result****************************************')
console.log('\n')

and here is the output

{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "type": "object",
  "properties": {
    "items": {
      "type": "array",
      "prefixItems": [
        {
          "type": "string"
        }
      ]
    },
    "extra": {
      "$ref": "#/properties/items/prefixItems/0"
    }
  },
  "ooos": {
    "items2": {
      "type": "array",
      "prefixItems": []
    },
    "item3": {
      "items4": {
        "items5": {
          "type": "array",
          "prefixItems": []
        }
      }
    }
  }
}

Hi @jviotti, I have a doubt about the meaning of the JSON DSL. Could you please take a look at this code? It's a snippet of my work towards DSL. Actually, I want to know if my code can do something like this. Is it considered as a DSL? If not, how would you technically define a DSL?

And sorry for the previous comment. One more thing I am hesitant about is asking this many questions. Is it okay to ask this many questions or are they silly? I want to openly express my concern about it.

@MeastroZI

I have a doubt about the meaning of the JSON DSL. Could you please take a look at this code? It's a snippet of my work towards DSL. Actually, I want to know if my code can do something like this. Is it considered as a DSL? If not, how would you technically define a DSL?

Yeah, exactly, you are thinking about it in the right direction. Your transformRule JSON example is definitely a valid DSL.

And sorry for the previous comment. One more thing I am hesitant about is asking this many questions. Is it okay to ask this many questions or are they silly? I want to openly express my concern about it.

Please ask as many questions as you need. That's the whole point of this phase and I'm sure other people reading this thread would benefit as well. Asking lots of questions is definitely better than not asking them.

@jviotti can you explain this

{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "type": "array",
  "prefixItems": [{ "type": "string" }, { "type": "string" }],
  "not": {
    "items": {
      "not": { "type": "string", "minLength": 3 }
    }
  },
  "unevaluatedItems": false
}

especially this part

"not": {
    "items": {
      "not": { "type": "string", "minLength": 3 }
    }
}

My understanding is that it dictates that there must not be any items in the array that are strings with a length less than 3. Therefore, the schema should only accept arrays where all elements have a minimum length of 3. However, it seems to also accept arrays like ["axd", "d"]. Could you clarify this?

Also, the unevaluatedItems behaviour is a bit weird:

registerSchema({
    "$schema": "https://json-schema.org/draft/2020-12/schema",
    "$id": "http://example.com/lets_move",
    "type": "array",
    "prefixItems": [{
        "const": "aaa"
    }],
    "items": {
        "type": "string",
        "anyOf": [
            { "pattern": "^a" },
            { "pattern": "^b" }
        ]
    },
    "uniqueItems": true,
    "unevaluatedItems": {
        "type": "string",
        "pattern": "^y"
    }
})

For the instance ["aaa", "ya"], shouldn't "ya" be handled by unevaluatedItems (pattern "^y") and produce true? Why does it give false here?
In both examples, the presence of the items keyword is what makes it confusing.

@Era-cell
unevaluatedItems only applies to array elements that have not been evaluated. Because you use items before unevaluatedItems, every array element gets evaluated by items, so no element is left unevaluated. That's why your instance does not validate. You have to apply the keyword logically: there is no point in adding unevaluatedItems if every element already gets evaluated.
🙂
If I am wrong, please correct me.

But the order of keywords doesn't matter as per the docs, and:
These instance items or properties may have been unsuccessfully evaluated against one or more adjacent keyword subschemas, such as when an assertion in a branch of an "anyOf" fails. Such failed evaluations are not considered to contribute to whether or not the item or property has been evaluated. Only successful evaluations are considered.
-- it says only successful evaluations are considered to be evaluated

@Era-cell when you set unevaluatedItems to false in your schema and then run your instance, you will not get an error related to unevaluated elements; you will get an error related to the items keyword.

That means items takes care of all the elements that are not covered by prefixItems, and does not let the flow reach the unevaluatedItems keyword.

Try it here: https://json-schema.hyperjump.io/

And even if you remove the unevaluatedItems keyword, you will get the same error.
Guess why!

Same thing: because the items keyword takes care of all the elements that are not covered by prefixItems.

Yeah, this was my initial thought.
But then, as per your assumption, the presence of the "items" keyword will never let any of the values be unevaluated.

But the order of keywords doesn't matter as per the docs, and: These instance items or properties may have been unsuccessfully evaluated against one or more adjacent keyword subschemas, such as when an assertion in a branch of an "anyOf" fails. Such failed evaluations are not considered to contribute to whether or not the item or property has been evaluated. Only successful evaluations are considered. -- it says only successful evaluations are considered to be evaluated

Is it possible to make this statement clearer? 😁

{
    "$schema": "https://json-schema.org/draft/2020-12/schema",
    "$id": "http://example.com/lets_uneval",
    "type": "array",
    "prefixItems": [{
        "const": "aaa"
    }],
    "items": {
        "type": "string",
        "allOf": [
            { "pattern": "^a" },
            { "pattern": "^b" }
        ]
    },
    "uniqueItems": true,
    "unevaluatedItems": {
        "type": "string",
        "pattern": "^an"
    }
}

Now, for ["aaa", "a", "bn", "an"], "an" should be left unevaluated because "a" took care of it.
I expected the result to be true, but it is false. If even this is evaluated, can I get an example where "items" is present and some values are left unevaluated?

Just tell me one thing: is it possible for a string to start with a and simultaneously start with b? Because no string can start with a and also start with b, you are getting an error. Try this:

{
    "$schema": "https://json-schema.org/draft/2020-12/schema",
    "$id": "http://example.com/lets_uneval",
    "type": "array",
    "prefixItems": [{
        "const": "aaa"
    }],
    "items": {
        "type": "string",
        "allOf": [
            { "pattern": "^a" }, 
            { "pattern": "b$" }  
        ]
    },
    "uniqueItems": true,
    "unevaluatedItems": {
        "type": "string",
        "pattern": "^an"
    }
}

On this instance:
["aaa" ,"aab" ,"aaab" ]

it will give the result true. But if you add any string that does not start with a and end with b, that element gets caught by the items keyword. As I said earlier, items checks all the elements that are not covered by prefixItems, and does not let those elements reach unevaluatedItems!

Please correct me if I am wrong 😺

@jviotti, I have some more questions about Alterschema:
Why are there rules for 2019 to 2019 and 2020 to 2020? What are they needed for?
Why did you choose JSON-e over JavaScript functions? Because it was more intuitive?
Is there a need for an imperative DSL, or is a declarative DSL, like OOP, what you meant (which gives a higher level of abstraction)?
Are you going to keep using Alterschema, or will it be abandoned?

@MeastroZI

@jviotti can you explain this

{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "type": "array",
  "prefixItems": [{ "type": "string" }, { "type": "string" }],
  "not": {
    "items": {
      "not": { "type": "string", "minLength": 3 }
    }
  },
  "unevaluatedItems": false
}

My understanding is that it dictates that there must not be any items in the array that are strings with a length less than 3. Therefore, the schema should only accept arrays where all elements have a minimum length of 3. However, it seems to also accept arrays like ["axd", "d"]. Could you clarify this?

That schema looks overly complicated. Maybe what you want is this instead?

{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "type": "array",
  "items": {
    "minLength": 3
  }
}

@Era-cell

Also, the unevaluatedItems behaviour is a bit weird:

The unevaluatedItems behavior depend on other adjacent array-related keywords. As it name implies, unevaluatedItems will only kick-in for array items that have not been evaluated by adjacent array keywords, so the precent of items and prefixItems will indeed affect its behavior

@Era-cell

I have some more questions about Alterschema: Why are there rules from 2019 to 2019 and 2020 to 2020? What is the need for these?

These perform simplifications within the same version to make it easier to process the other rules. For example, you could simplify the use of certain keywords on the input schema without changing the version, before you attempt to upgrade it.

Why did you choose JSON-e over JavaScript functions? Because it was more intuitive?

The whole point of this project is to make rule definitions programming language agnostic. We don't want to just create an upgrade tool for JavaScript, but one that is embeddable and implementable on ANY language out there. That's why the rules are pure JSON.

Is there a need for an imperative DSL, or is a declarative DSL, like OOP, what you meant (which gives a higher level of abstraction)?

Not sure I follow this. Can you give me an example?

Are you going to use Alterschema, or will it be abandoned?

I will. The idea is for the JSON-based rules to be moved to the JSON Schema org while Alterschema is (one of many, potentially?) an implementation of the actual engine.

@jviotti
My query on this is:
in the presence of the items keyword, wouldn't items evaluate each and every instance value, so that none of them is left unevaluated?
(Can you give an example where, even in the presence of the "items" keyword, some values are left unevaluated?)

In the presence of the items keyword, wouldn't items evaluate each and every instance value, so that none of them is left unevaluated?

Correct. Maybe this example helps clarify that: https://github.com/json-schema-org/JSON-Schema-Test-Suite/blob/main/tests/draft2020-12/unevaluatedItems.json#L64-L78
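
As a rough illustration of the point above, here is a minimal sketch that computes which array indices would be left for unevaluatedItems in 2020-12. The helper name `unevaluated_indices` is made up for this example. It only looks at prefixItems and items adjacent to unevaluatedItems, and it assumes those keywords validate successfully. Annotations coming from subschemas such as allOf or $ref are deliberately ignored.

```python
def unevaluated_indices(schema, instance):
    """Return the indices that adjacent prefixItems/items do not evaluate."""
    if "items" in schema:
        # items evaluates every element after the prefix, so nothing is left.
        return []
    # Without items, only the elements matched by prefixItems are evaluated.
    prefix_count = min(len(schema.get("prefixItems", [])), len(instance))
    return list(range(prefix_count, len(instance)))


instance = ["aaa", "aab", "aaab"]
print(unevaluated_indices({"prefixItems": [{"const": "aaa"}]}, instance))
print(unevaluated_indices({"prefixItems": [{"const": "aaa"}], "items": {"type": "string"}}, instance))
```

In the first call the last two elements remain for unevaluatedItems. In the second call items has already evaluated them, so unevaluatedItems has nothing to apply to.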

For example: do we need to use parsers, lexers, and a new grammar defining the language, or an abstraction over JSON-e or JavaScript (or any other language, to create functions with arguments) itself?

@Era-cell

Do we need to use parsers, lexers, and a new grammar defining the language, or an abstraction over JSON-e or JavaScript (or any other language, to create functions with arguments) itself?

It should be all JSON based. No need for a new grammar. Just use JSON's grammar. But don't embed an actual programming language like JavaScript in the JSON. JSON-e is one valid way of doing it. It expresses the transformations purely using JSON.
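
To make the "rules as pure JSON" idea concrete, here is a minimal sketch. The rule shape (`when` / `then` / `rename`) is invented for this illustration only. It is neither Alterschema's actual rule format nor JSON-e syntax. The point is that the rule itself is plain data, and the small interpreter that applies it could be written in any language. Python is used here only as an example host.

```python
import copy
import json

# A hypothetical rule, expressed as JSON only (no embedded code).
RULE_JSON = """
{
  "description": "Rename definitions to $defs (hypothetical rule format)",
  "when": { "hasKeyword": "definitions" },
  "then": { "rename": { "from": "definitions", "to": "$defs" } }
}
"""


def apply_rule(rule, schema):
    """Return a transformed copy of schema; the input is never modified."""
    result = copy.deepcopy(schema)
    if isinstance(result, dict) and rule["when"]["hasKeyword"] in result:
        rename = rule["then"]["rename"]
        result[rename["to"]] = result.pop(rename["from"])
    return result


rule = json.loads(RULE_JSON)
old_schema = {"type": "object", "definitions": {"name": {"type": "string"}}}
print(json.dumps(apply_rule(rule, old_schema), indent=2))
```

This sketch only touches the top-level object and does not rewrite any $ref that points into the renamed location (for example "#/definitions/name"), which is one of the edge cases a real rule format would need to handle.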

Hi @jviotti, when the algorithm/DSL is included in the JSON Schema org, will access to external JSON Schema documents be provided? For example:

"$ref":"other.json#/$defs/items/0"

Here the schema resource isn't present in the document being altered. At this point, does the external schema document (which is an external resource) also need to be altered?

Hi @Era-cell

The schema resource isn't present in the document being altered. At this point, does the external schema document (which is an external resource) also need to be altered?

Great question! Yes in both cases:

  • A JSON Schema is allowed to externally reference another JSON Schema that makes use of a different draft, i.e. you can have a JSON Schema 2020-12 that externally references a JSON Schema Draft 4. So in that case, it is not really required to upgrade the other schema, and we can simply ignore it if we don't have access to it

  • That said, while this cross-version referencing is supposed to work, I think many implementations out there don't properly support it, and the JSON Schema test suite doesn't cover it either. For these cases, what you can do is perform JSON Schema Bundling (https://json-schema.org/blog/posts/bundling-json-schema-compound-documents) before upgrading that schema. Bundling will bring in all externally referenced schema into a single schema with nested schema resources, and then we upgrade them all together

But in both cases, our upgrader shouldn't really mind. If it's passed a schema with unresolved remote references, it will do what it can, and if it's passed a bundled schema, it will transform the entire thing.

Hi @jviotti! I have one more question about bundling schemas. Can I assume that the name (key) of a schema in $defs will always be the $id of that schema, or can it be anything? For example, in this schema the names under $defs are set to the $id of each schema:

{
  "$id": "https://jsonschema.dev/schemas/examples/non-negative-integer-bundle",
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "description": "Must be a non-negative integer",
  "$comment": "A JSON Schema Compound Document. Aka a bundled schema.",
  "$defs": {
    "https://jsonschema.dev/schemas/mixins/integer": {
      "$schema": "https://json-schema.org/draft/2020-12/schema",
      "$id": "https://jsonschema.dev/schemas/mixins/integer",
      "description": "Must be an integer",
      "type": "integer"
    },
    "https://jsonschema.dev/schemas/mixins/non-negative": {
      "$schema": "https://json-schema.org/draft/2020-12/schema",
      "$id": "https://jsonschema.dev/schemas/mixins/non-negative",
      "description": "Not allowed to be negative",
      "minimum": 0
    },
    "nonNegativeInteger": {
      "allOf": [
        {
          "$ref": "/schemas/mixins/integer"
        },
        {
          "$ref": "/schemas/mixins/non-negative"
        }
      ]
    }
  },
  "$ref": "#/$defs/nonNegativeInteger"
}

It can be anything.

(The value of the $ref is resolved against the current base URI, and the target schema is then located by its $id, not by its key under $defs.)
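
A small sketch may help show why the $defs key does not matter. It assumes a simplified resolver (the hypothetical `index_resources` helper) that indexes every embedded resource by its absolute $id and then resolves a $ref against the base URI with Python's standard `urllib.parse.urljoin`. It walks every nested object, which a real implementation would restrict to actual schema locations.

```python
from urllib.parse import urljoin


def index_resources(schema, base=""):
    """Map absolute $id -> subschema for every embedded schema resource."""
    index = {}

    def walk(node, current_base):
        if isinstance(node, dict):
            if isinstance(node.get("$id"), str):
                # A nested $id establishes a new base URI for its subtree.
                current_base = urljoin(current_base, node["$id"])
                index[current_base] = node
            for value in node.values():
                walk(value, current_base)
        elif isinstance(node, list):
            for value in node:
                walk(value, current_base)

    walk(schema, base)
    return index


bundle = {
    "$id": "https://jsonschema.dev/schemas/examples/non-negative-integer-bundle",
    "$defs": {
        "any-key-works": {
            "$id": "https://jsonschema.dev/schemas/mixins/integer",
            "type": "integer",
        }
    },
    "allOf": [{"$ref": "/schemas/mixins/integer"}],
}

resources = index_resources(bundle)
target = urljoin(bundle["$id"], bundle["allOf"][0]["$ref"])
print(target, "->", resources[target])
```

Even though the bundled schema sits under an arbitrary key, the reference still resolves because the lookup goes through the $id.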

Okay, so if we have access to the external resource and it is resolved, we don't change the external schema,
but bundle it into the present document itself, right?
Because the user may use the external schema for other purposes too, right?

Keep in mind the project would not be able to "modify" any schema in place. What it does is create a copy of the input schema with the given transformations. So:

  • If the schema is bundled, you transform the entire thing, including the bundled resources
  • If the schema is NOT bundled, you just transform the immediate schema only
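
The two cases above can be sketched as follows. This is only an illustration under stated assumptions: `transform` and `rename_definitions` are hypothetical names, the rewrite is a single hard-coded rename, and every nested object is treated as a schema, which a real engine would not do.

```python
import copy


def rename_definitions(node):
    """Example rewrite step: definitions -> $defs on a single schema object."""
    if "definitions" in node:
        node["$defs"] = node.pop("definitions")


def transform(schema, rewrite):
    """Apply rewrite to a deep copy of schema, so the input stays untouched."""
    result = copy.deepcopy(schema)

    def walk(node):
        if isinstance(node, dict):
            rewrite(node)
            for value in node.values():
                walk(value)
        elif isinstance(node, list):
            for value in node:
                walk(value)

    walk(result)
    return result


bundled = {
    "definitions": {
        "embedded": {
            "$id": "https://example.com/embedded",
            "definitions": {"inner": {"type": "string"}},
        }
    },
    # An unresolved remote reference is just a string and is left as-is.
    "properties": {"other": {"$ref": "https://example.com/not-bundled.json"}},
}

upgraded = transform(bundled, rename_definitions)
print(upgraded)
```

The bundled resource is transformed together with its parent, the remote reference is carried over unchanged, and the original `bundled` object still has its `definitions` key.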

🚩 IMPORTANT INSTRUCTIONS REGARDING HOW AND WHERE TO SUBMIT YOUR APPLICATION 🚩

Please join this discussion in the JSON Schema Slack to get the latest, very important details on how best to submit your application to JSON Schema.

See communication here.

Hi, @jviotti where should the qualification task be submitted, and what is the deadline for it?

@Era-cell I believe there is a GSoC portal that you should use. @benjagm Can you clarify?

@jviotti I guess that is for the proposal. Should I embed the qualification task inside the proposal itself?
@benjagm

@Era-cell yes please. Make sure you add the details of the qualification task to the proposal! Feel free to join the #gsoc channel in our Slack workspace to get an immediate response to these types of questions.

Hi @jviotti,

First of all, I apologize for using the Alterschema UI to display my DSL transformation. It's only temporary!

Could you please review the transformation from the 2019 to the 2020 draft on this site? I've embedded the qualification task's DSL transformation code and have tried my best to cover all edge cases. However, if I've missed any, please let me know.

@MeastroZI Not much I can comment on given a single example, but looking forward to the explanations, proposed rules, etc in the proposal!

@jviotti, I submitted my proposal (Name: Pandit Vinit) to JSON Schema. Could you please review it and provide any suggestions if possible?

I will, thanks a lot for the submission! ❤️

@jviotti in the 2019-09 draft I am not able to find any difference between additionalItems and unevaluatedItems.
Here it is written as "Similar to additionalItems, but can "see" into subschemas and across references", but when I tested this schema, additionalItems also did all of this.
Here is the example:

{
  "$schema": "https://json-schema.org/draft/2019-09/schema",
  "$defs": {
    "stringArray": {
      "type": "array",
      "items": {
        "type": "string"
      }
    },
    "numberArray": {
      "oneOf": [
        {
          "type": "array",
          "items": [
            {
              "type": "number"
            },
            {
              "$ref": "#/$defs/stringArray"
            }
          ]
        },
        {
          "type": "boolean"
        }
      ]
    }
  },
  "type": "array",
  "items": [
    {
      "$ref": "#/$defs/stringArray"
    }
  ],
  "additionalItems": {
    "$ref": "#/$defs/numberArray"
  }
}

Validate against: [[""], [5, [""]]] and [[""], true]

So my question is: what is the difference between additionalItems and unevaluatedItems in the 2019-09 draft, and is there an example schema that shows the difference between them?

@MeastroZI Take a look at the official test suite examples: https://github.com/json-schema-org/JSON-Schema-Test-Suite/blob/main/tests/draft2019-09/unevaluatedItems.json. additionalItems matches any array element not covered by an adjacent items. unevaluatedItems applies to array items that were not evaluated (as its name implies) by any other relevant keyword (whether adjacent or not).
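
The distinction can be sketched with a schema in the style of the test suite's nested tuple case. The two helper functions below are made up for this illustration. They only count how many leading elements are covered by the array form of items (2019-09), either adjacent to the keyword or also inside allOf branches, and they assume every subschema validates successfully.

```python
def covered_by_adjacent_items(schema, instance):
    """Elements covered by the tuple form of items in this schema object only."""
    items = schema.get("items")
    if isinstance(items, list):
        return min(len(items), len(instance))
    return 0  # Simplification: the single-schema form of items is not modelled.


def covered_including_allof(schema, instance):
    """Elements covered when allOf subschemas are also taken into account."""
    covered = covered_by_adjacent_items(schema, instance)
    for sub in schema.get("allOf", []):
        if isinstance(sub, dict):
            covered = max(covered, covered_including_allof(sub, instance))
    return covered


schema = {
    "items": [{"type": "string"}],
    "allOf": [{"items": [True, {"type": "number"}]}],
}
instance = ["foo", 42]

# additionalItems starts after what the adjacent items covers.
print(covered_by_adjacent_items(schema, instance))
# unevaluatedItems starts after what any evaluated subschema covers.
print(covered_including_allof(schema, instance))
```

With "additionalItems": false next to the top-level items, index 1 would be rejected, because only index 0 is covered by the adjacent keyword. With "unevaluatedItems": false instead, both elements count as evaluated thanks to the allOf branch, so nothing is rejected.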

@jviotti, I need direction on how to approach downgrading of JSON Schema. Is it even possible to do this for all the dialects? With each new version, new keywords are introduced, and I'm unsure if it's feasible to replicate their behavior using the previous version.

Regarding upgrading, I've developed the DSL, and I believe it's capable of handling all upgrades. Please review the recent changes I made in the repository and provide feedback if possible.

@MeastroZI It is not always feasible, but I think you can go a long way with it, and we can think how to handle the problematic cases. I think if the resulting downgraded schema is a superset of the schema (i.e. it doesn't add more constraints), then it's probably acceptable.
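
One possible reading of the "superset" criterion, sketched under stated assumptions: when a keyword has no equivalent in the target dialect, drop it and report it, so the downgraded schema accepts everything the original accepted and possibly more. The helper name and the keyword set below are illustrative only. The sketch handles top-level keywords only, because removing a constraint nested under `not` would make the overall schema stricter rather than looser.

```python
import copy

# Illustrative set: keywords introduced in 2019-09 that older drafts ignore.
UNSUPPORTED_IN_TARGET = {"unevaluatedItems", "unevaluatedProperties"}


def relax_top_level(schema):
    """Return (downgraded copy, dropped keywords) for a single schema object."""
    result = copy.deepcopy(schema)
    dropped = [key for key in result if key in UNSUPPORTED_IN_TARGET]
    for key in dropped:
        # Removing a top-level assertion can only widen the accepted set.
        del result[key]
    return result, dropped


downgraded, dropped = relax_top_level({"type": "array", "unevaluatedItems": False})
print(downgraded, dropped)
```

Reporting the dropped keywords gives the caller a chance to warn that the result is looser than the input.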

Hello! 👋

This issue has been automatically marked as stale due to inactivity 😴

It will be closed in 180 days if no further activity occurs. To keep it active, please add a comment with more details.

There can be many reasons why a specific issue has no activity. The most probable cause is a lack of time, not a lack of interest.

Let us figure out together how to push this issue forward. Connect with us through our slack channel : https://json-schema.org/slack

Thank you for your patience ❤️