Generic CALM domain definitions via external schemas

Question

willosborne opened this issue 4 months ago · comments

Feature Request

Description of Problem

In the last 6 months we have successfully built out the core of CALM to model architecture.

We are now seeing discussion around how to model extra data in order to capture security, resiliency, observability, etc. in a CALM architecture.

We need a solution to attach additional data to CALM in a structured way, without growing the core manifest too much.
In the same vein as our solution for interfaces (#48), this solution also needs to allow individual organisations to define their own domain information, extending the core set offered by the CALM metaschema.

Potential Solutions:

Create the concept of a domain object in CALM.
A CALM document can optionally contain a map of domains.
Each of these domains will then contain a list of structured objects, each of which decorates an object (node, relationship, interface etc.) in the document.

The data it annotates with will be defined by another JSON schema.
CALM will provide a core set of basic extension types, and organisations can then provide their own.

NOTE: this will NOT replace metadata. Nor is this intended to capture the notion of when data is required - it just models the structure of the data that might be required. See final section for more detail here.

This is very much an early idea and needs some refinement, but wanted to see what people think.

Example: Resiliency domain

NOTE: this is a very simplified look at resiliency - I'm only considering very basic properties. The idea is, after all, that CALM itself doesn't have an opinion about exactly what you want to define.

Here I'll be documenting the number of replica sets and deployment strategy for a system in the document. I'll first define this as an instantiation, and then again as a pattern to show how the types work.

See a very simple CALM pattern document:

{
    "nodes": [
        {
            "name": "API producer service",
            "unique-id": "api-producer",
            "interfaces": [
                // ...
            ]
        }
    ],
    "relationships": [
        // ...
    ],
    "domains": {
        "resiliency": {
            "nodes": [
                {
                    "element-id": "api-producer",
                    // custom properties defined by the external schema
                    "replica-sets": 4,
                    "deployment-strategy": "rolling-release"
                }
            ]
            // relationships, interfaces can also optionally be defined on a domain
            // they are defined separately so that each can have its own type in the schema
        }
    }
}

Note that rather than defining the domains on each element, we are defining them at the bottom and referencing elements by unique ID.
This is more in keeping with the general approach and helps keep things flat.

I decided to keep nodes, relationships, interfaces separate; we can potentially combine this all into one list but it makes the types a bit nicer this way, since you can apply a single type to all elements in the list. i.e. all resiliency nodes have this base type, all relationships have another base type and so on.
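
Because decorations point at elements by unique ID rather than living on them, tooling would probably want to confirm that every element-id resolves. A minimal sketch of that check, assuming the document shape shown above; the function name find_dangling_refs and the api-consumer entry are hypothetical, and interfaces are skipped here because they are nested inside nodes:

```python
def find_dangling_refs(doc):
    """Return (domain, kind, element-id) tuples whose element-id matches no element."""
    # Collect the unique-ids declared by the core document, per element kind.
    known = {
        kind: {el["unique-id"] for el in doc.get(kind, [])}
        for kind in ("nodes", "relationships")
    }
    dangling = []
    for domain_name, domain in doc.get("domains", {}).items():
        for kind, ids in known.items():
            for decoration in domain.get(kind, []):
                if decoration["element-id"] not in ids:
                    dangling.append((domain_name, kind, decoration["element-id"]))
    return dangling


doc = {
    "nodes": [{"name": "API producer service", "unique-id": "api-producer"}],
    "relationships": [],
    "domains": {
        "resiliency": {
            "nodes": [
                {"element-id": "api-producer", "replica-sets": 4,
                 "deployment-strategy": "rolling-release"},
                # Hypothetical decoration pointing at a node that does not exist.
                {"element-id": "api-consumer", "replica-sets": 2},
            ]
        }
    },
}

print(find_dangling_refs(doc))
```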

If we wanted to also extend this document with say a data domain:

{
    //...
    "domains": {
        "resiliency": {
            "nodes": [
                // .. as before
            ]
        },
        "data": {
            "nodes": [
                {
                    "element-id": "api-producer",
                    // custom properties
                    "data-classification": "PII"
                }
            ]
        }
    }
}

Modelling this in JSON Schema

The changes to the core CALM schema are small:

Add an optional top-level domains property
Each domain then has a fixed structure: optional nodes, relationships and interfaces properties, each holding a list of decorations. This part is rigid.
We have a domain-decoration-type definition, which just contains element-id. This type is intended to be extended to provide specific domain types for nodes, relationships and so on.
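
To make the extension idea concrete, here is a rough sketch of how a base decoration type and an organisation-specific resiliency type might relate. The schemas are written as Python dicts so they can be exercised without extra libraries. Everything here is an assumption for illustration: the actual definitions are in the PR and may differ, the "blue-green" enum value and the required replica-sets are made up, and errors is a hypothetical helper that handles only the handful of keywords used below. Real tooling would use a proper JSON Schema validator and $ref rather than embedding the base type directly.

```python
# Hypothetical base type: every decoration must name the element it decorates.
DOMAIN_DECORATION_TYPE = {
    "type": "object",
    "properties": {"element-id": {"type": "string"}},
    "required": ["element-id"],
}

# Hypothetical organisation-defined type for resiliency nodes, extending the base.
RESILIENCY_NODE_TYPE = {
    "allOf": [DOMAIN_DECORATION_TYPE],  # a real schema would use {"$ref": ...}
    "properties": {
        "replica-sets": {"type": "integer"},
        "deployment-strategy": {"enum": ["rolling-release", "blue-green"]},
    },
    "required": ["replica-sets"],
}

PY_TYPES = {"string": str, "integer": int, "object": dict}


def errors(instance, schema):
    """Check only the keywords used above: allOf, type, enum, required, properties."""
    found = []
    for sub in schema.get("allOf", []):
        found += errors(instance, sub)
    expected = schema.get("type")
    if expected and not isinstance(instance, PY_TYPES[expected]):
        return found + [f"expected {expected}"]
    if "enum" in schema and instance not in schema["enum"]:
        found.append(f"{instance!r} not in {schema['enum']}")
    if isinstance(instance, dict):
        for key in schema.get("required", []):
            if key not in instance:
                found.append(f"missing {key}")
        for key, sub in schema.get("properties", {}).items():
            if key in instance:
                found += errors(instance[key], sub)
    return found


good = {"element-id": "api-producer", "replica-sets": 4,
        "deployment-strategy": "rolling-release"}
bad = {"replica-sets": "four"}

print(errors(good, RESILIENCY_NODE_TYPE))
print(errors(bad, RESILIENCY_NODE_TYPE))
```

The point of the layering is that the core schema only ever knows about element-id, while each organisation's type adds its own properties on top.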

Then, in a CALM pattern, developers can reference the appropriate types via $ref, linking out to their organisation-specific schemas.
This lets them define a big list of potential properties in a way that doesn't bloat the core schema.

We can also provide some pre-defined domain types in CALM if need be.

This also means that the CALM CLI can generate, validate and visualize these properties, as long as they have the right schemas loaded.
(This can be done via the option to select a schema directory.)
This would allow developers to get a pre-populated starter instantiation with the properties required by the various domains inserted as placeholders, such as {{ REPLICA_SETS }} in our example. Tooling would then pick this up and report potentially missed values.
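
As a rough idea of what "report potentially missed values" could look like, the sketch below walks a generated instantiation and lists any string that still contains a {{ NAME }} placeholder. This is not how the CALM CLI does it; find_placeholders, the regular expression and the path format are all assumptions made for illustration.

```python
import re

# Assumed placeholder syntax, based on the {{ REPLICA_SETS }} example.
PLACEHOLDER = re.compile(r"\{\{\s*([A-Z0-9_]+)\s*\}\}")


def find_placeholders(value, path="$"):
    """Yield (path, placeholder-name) for every string still holding {{ NAME }}."""
    if isinstance(value, dict):
        for key, child in value.items():
            yield from find_placeholders(child, f"{path}.{key}")
    elif isinstance(value, list):
        for index, child in enumerate(value):
            yield from find_placeholders(child, f"{path}[{index}]")
    elif isinstance(value, str):
        for match in PLACEHOLDER.finditer(value):
            yield path, match.group(1)


instantiation = {
    "domains": {
        "resiliency": {
            "nodes": [
                {
                    "element-id": "api-producer",
                    "replica-sets": "{{ REPLICA_SETS }}",  # not filled in yet
                    "deployment-strategy": "rolling-release",
                }
            ]
        }
    }
}

missed = list(find_placeholders(instantiation))
print(missed)
```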

For some examples see the PR I raised here - #309 :

What the CALM definition of domain-decoration-type might look like: here
An example of what the resiliency domain might look like: here
An example of what the data domain might look like: here
An example pattern file using these - this definitely needs some work as it looks very clunky right now: here

A note on requirements

I'm intentionally not considering the problem of deciding whether a certain element needs to specify certain domain properties.
This is because the logic for making these decisions is way too complex for JSON schema.

e.g.

"all nodes that process PII should have encryption on all HTTP traffic."
"all nodes with 24/7 availability should have at least 4 replica sets and multi-regional resiliency"

This is a problem for further down the line.
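
For a sense of why this sits outside the schema, here is a sketch of the second rule (replica sets only) written as ordinary code over the domain data. It is purely illustrative and not a proposal: the availability property, its "24x7" value, the extra node ids and the function check_availability_rule are all hypothetical.

```python
def check_availability_rule(doc, min_replicas=4):
    """Return element-ids of 24x7 nodes with fewer than min_replicas replica sets."""
    decorations = doc.get("domains", {}).get("resiliency", {}).get("nodes", [])
    return [
        d["element-id"]
        for d in decorations
        # "availability" is a hypothetical property, not part of the proposal.
        if d.get("availability") == "24x7" and d.get("replica-sets", 0) < min_replicas
    ]


doc = {
    "domains": {
        "resiliency": {
            "nodes": [
                {"element-id": "api-producer", "availability": "24x7", "replica-sets": 4},
                {"element-id": "batch-job", "availability": "business-hours", "replica-sets": 1},
                {"element-id": "api-gateway", "availability": "24x7", "replica-sets": 2},
            ]
        }
    }
}

print(check_availability_rule(doc))
```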

Will Osborne · Answer 1 · Wed Jul 24 2024 00:48:29 GMT+0800 (China Standard Time)

@yt-ms has suggested an alternative way of decorating elements - by simply putting them in-line, in the same fashion as interfaces are defined on a node.
I'll post an example of both with pros/cons so we can have a think about which we'd prefer. NB I've added a relationship here too to make it clearer.

Current proposal - decorated at the bottom, linked by unique-id

{
    "nodes": [
        {
            "name": "API producer service",
            "unique-id": "api-producer",
            "interfaces": [
                // ...
            ]
        }
    ],
    "relationships": [
        {
            "unique-id": "relationship-id",
            "relationship-type": {
                "connects": {
                    // etc...
                }
            }
        }
    ],
    "domains": {
        "resiliency": {
            "nodes": [
                {
                    "element-id": "api-producer",
                    // custom properties defined by the external schema
                    "replica-sets": 4,
                    "deployment-strategy": "rolling-release"
                }
            ],
            "relationships": [
                {
                    "element-id": "relationship-id",
                    "uses-load-balancer": true
                }
            ]
        }
    }
}

Pros:

All domain information is specified in one place. Anyone who wants to look at e.g. security domain information can go to the relevant block.
Design has less nesting/hierarchy. This is more in line with existing design of CALM and will keep the patterns simpler (nesting gets confusing very quickly in JSON schema)
May be marginally easier to write tools for but this is minor.
Typing is quite nice. You can define a type for the security info shared by every node, and then use the 'items' property on the 'nodes' array

Cons:

Domain information for an element is in a completely different place in the document - a long way away in a large file. This will harm the editing/reading experience. (This could be improved with tooling e.g. unique-id jump-to via a plugin)
Nodes, relationships and interfaces being explicitly defined per domain is a little clunky. This could be improved
Hard to see at a glance when a node has no domain info specified for it. Tooling would need to make this easier or it would be very easy to miss things.
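
The last con is the kind of thing a small tooling check could cover. A minimal sketch, assuming the bottom-of-document layout above; undecorated_nodes and the api-consumer node are hypothetical:

```python
def undecorated_nodes(doc, domain_name):
    """Return unique-ids of nodes that have no decoration in the given domain."""
    domain = doc.get("domains", {}).get(domain_name, {})
    decorated = {d["element-id"] for d in domain.get("nodes", [])}
    return [n["unique-id"] for n in doc.get("nodes", []) if n["unique-id"] not in decorated]


doc = {
    "nodes": [
        {"name": "API producer service", "unique-id": "api-producer"},
        {"name": "API consumer service", "unique-id": "api-consumer"},  # hypothetical
    ],
    "domains": {
        "resiliency": {
            "nodes": [
                {"element-id": "api-producer", "replica-sets": 4,
                 "deployment-strategy": "rolling-release"}
            ]
        }
    },
}

print(undecorated_nodes(doc, "resiliency"))
print(undecorated_nodes(doc, "data"))
```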

Domain decorations applied in-line, directly on the object

{
    "nodes": [
        {
            "name": "API producer service",
            "unique-id": "api-producer",
            "interfaces": [
                // ...
            ],
            "domains": {
                "resiliency": {
                    "replica-sets": 4,
                    "deployment-strategy": "rolling-release"
                }
            }
        }
    ],
    "relationships": [
        {
            "unique-id": "relationship-id",
            "relationship-type": {
                "connects": {
                    // etc...
                }
            },
            "domains": {
                "resiliency": {
                    "uses-load-balancer": true
                }
            }
        }
    ]
}

Pros:

Definitely cleaner to look at.
Locality is improved - domain info is specified right next to the element it's annotating
Easy to visually see when an element has missing domain info.

Cons:

More nesting. This will make the patterns more complex to edit and read and make manipulation/parsing of the document harder.
Domain information is split across every element, so a reader can't look at it all in one place.
Potentially a lot of extra code on every element if there are multiple domains - especially on interfaces

Budlee · Answer 2 · Wed Jul 24 2024 01:57:01 GMT+0800 (China Standard Time)

Personally, I am a fan of the first approach. The domains are separate and contain what you need. Putting domain information in-line with the elements that I would say are first class, the relationships and nodes, would I believe make it harder to read and find.

With the first approach, if you are the owner of a domain and it is referenced, then handling it is straightforward. If you are a domain owner, are there additional complexities if your requirements need to be added to the first-class elements?

Also, CALM is being consumed by applications; which approach is easier for a machine to process?

Will Osborne · Answer 3 · Thu Jul 25 2024 17:55:19 GMT+0800 (China Standard Time)

@Budlee I agree, you can add everything in one place in the first approach.

Regarding machine processing, there isn't too much difference since it's all JSON; most of the challenge here is making sure you can parse the custom domain objects - ideally via codegen.

I'd say that in an untyped language, the second approach is easier to consume, since there are no lookups required.
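
To illustrate the lookup point: with the first approach a consumer has to join decorations back to elements by ID, which amounts to folding the top-level domains block into the in-line shape. A minimal sketch of that join, assuming the two layouts shown earlier in the thread; inline_domains is a hypothetical helper, and it raises KeyError if an element-id does not resolve.

```python
import copy


def inline_domains(doc):
    """Return a copy of doc with top-level domains folded onto each element."""
    result = copy.deepcopy(doc)
    domains = result.pop("domains", {})
    for kind in ("nodes", "relationships"):
        # Index elements by unique-id so each decoration is a single lookup.
        by_id = {el["unique-id"]: el for el in result.get(kind, [])}
        for domain_name, domain in domains.items():
            for decoration in domain.get(kind, []):
                props = {k: v for k, v in decoration.items() if k != "element-id"}
                element = by_id[decoration["element-id"]]
                element.setdefault("domains", {})[domain_name] = props
    return result


doc = {
    "nodes": [{"name": "API producer service", "unique-id": "api-producer"}],
    "relationships": [{"unique-id": "relationship-id"}],
    "domains": {
        "resiliency": {
            "nodes": [{"element-id": "api-producer", "replica-sets": 4,
                       "deployment-strategy": "rolling-release"}],
            "relationships": [{"element-id": "relationship-id",
                               "uses-load-balancer": True}],
        }
    },
}

inlined = inline_domains(doc)
print(inlined["nodes"][0]["domains"])
print(inlined["relationships"][0]["domains"])
```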

In a typed language like Java, probably the first is easier, because you can parameterise the 'domains' property on your nodes/relationships with the custom types for your domains - and it's only parsed from a single place, rather than all across the whole document. But this is guesswork.
(In general structured parsing of CALM documents is an interesting challenge that I haven't made too much progress with since there are fields like metadata, etc that are unstructured according to the core manifest. The CLI is not fully using static typing yet for this reason.)

jpgough-ms · Answer 4 · Wed Jul 31 2024 18:19:09 GMT+0800 (China Standard Time)

@yt-ms and I had an offline discussion today about the requirements for a minimal controls domain that we are looking at for some gating work. I plan to go with option 2 as the approach and will create a separate issue to put this proposal together for an August version of the schema.