6.X-12 ISO 19115 files

Question

rmalyankar opened this issue 3 years ago · comments

In Ed. 4.0.0

The ISO 19115 metadata file is referenced from S100_DatasetDiscoveryMetadata by the element S100_19115DatasetMetadata which is a role in Figure 4a-D-2 implemented as a gcx:FileName reference and carries the name of the ISO 19115 metadata file. So even in 4.0.0 there is an indication whether an ISO 19115 metadata file is present for a dataset as well as its name.

In Ed. 5.0.0

The 4.0.0 modelling is carried over to Figure 4a-D-5. The implementation would be similar, with a change to the role name. Since Figure 4a-D-5 describes the exchange catalog (CATALOG.XML), there is a note saying it is an external file (meaning, external to the exchange catalog).

I can add the S100_19115DatasetMetadata box as a (white) block in Figure 4a-D-2 and aggregate it to S100_ExchangeSet (and change its background in 4a-D-5 to conform).

On aggregations vs. compositions: Portrayal and feature catalogues being common to all datasets in the product, they need not be deleted when the exchange set is removed. Since some support files are going to be shared based on the support file discussion we have been having, some of them (and their discovery metadata) would also be retained. In fact, since an exchange set could be an archive acting as a "delivery package", perhaps even datasets and their discovery metadata should be retained until the dataset is cancelled, or voluntarily removed, or goes out of license?

HolgerBothien · Answer 1 · Tue Jan 18 2022 17:27:08 GMT+0800 (China Standard Time)

The reference from the dataset is ok, but there must be an element for the 19115 metadata file itself. The reason is that there should not be a single file in the exchange set that does not carry a digital signature. I think S100_SupportFile_Metadata would do the job.
This is all under the assumption that the 19115 Metadata file is part of the exchange set.

rmalyankar · Answer 2 · Wed Jan 19 2022 03:30:07 GMT+0800 (China Standard Time)

I also think S100_SupportFileDiscoveryMetadata would do. Also, add a value to the S100_SupportFileFormat enumeration:
ISOMetadata: Dataset metadata in ISO format

There is a generic "XML" in the S100_SupportFileFormat enumeration but indicating that it is ISO metadata would be better.

The supportFileSpecification attribute of S100_SupportFileDiscoveryMetadata can provide more information about which edition of 19115-x is being used.
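
As a rough illustration of the proposal above, the sketch below models a support-file entry whose format is the proposed ISOMetadata value. The Python class and field names, the file name, and the free-text treatment of supportFileSpecification are assumptions made for the example; only two enumeration values are shown.

```python
from dataclasses import dataclass
from enum import Enum


class SupportFileFormat(Enum):
    # Only the two values discussed in this thread; the real enumeration has more.
    XML = "XML"                   # generic value already present
    ISO_METADATA = "ISOMetadata"  # value proposed in this answer


@dataclass
class SupportFileDiscoveryMetadata:
    file_name: str                        # hypothetical field name
    file_format: SupportFileFormat
    support_file_specification: str = ""  # simplified to free text (assumption)


def is_iso_metadata(entry: SupportFileDiscoveryMetadata) -> bool:
    """True when the entry is flagged as ISO-format dataset metadata."""
    return entry.file_format is SupportFileFormat.ISO_METADATA


# Illustrative entry; the file name and specification text are made up.
iso_entry = SupportFileDiscoveryMetadata(
    "dataset_metadata.xml", SupportFileFormat.ISO_METADATA, "ISO 19115-1"
)
generic_entry = SupportFileDiscoveryMetadata("notes.xml", SupportFileFormat.XML)
```

A dedicated value lets a consumer pick out ISO metadata files without opening every generic XML support file.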

rmalyankar · Answer 3 · Fri Jan 28 2022 12:27:46 GMT+0800 (China Standard Time)

New figures. The white boxes are "structural" classes that represent objects such as dataset files, rather than blocks of XML.

Figure 17-2 S-100 Exchange Set:

Figure 17-5 (S-100 Exchange Set Catalogue):

HolgerBothien · Answer 4 · Fri Jan 28 2022 16:52:07 GMT+0800 (China Standard Time)

Hi, figures look good. The only issue I see so far is that they do not describe the use case that support files may support catalogues (e.g. the translation files). I think there should be an association between S100_SupportFileDiscoveryMetadata and S100_CatalogueDiscoveryMetadata. A better way would be to introduce an abstract class S100_DiscoveryMetadata with the common members of S100_DatasetDiscoveryMetadata, S100_SupportFileMetadata, and S100_CatalogueDiscoveryMetaData. Then derive the latter three classes from the abstract base class and have an association ‘supports’ between S100_SupportFileMetadata and S100_DiscoveryMetadata. This would allow a support file to support any other file, even other support files.
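
A minimal sketch of the abstract-base-class idea, assuming Python dataclasses stand in for the UML classes. The single common member (file_name) and the example file names are placeholders; the actual common members are not listed in this thread.

```python
from abc import ABC
from dataclasses import dataclass, field
from typing import List


@dataclass
class DiscoveryMetadata(ABC):
    """Stand-in for the proposed abstract S100_DiscoveryMetadata."""
    file_name: str  # placeholder for the common members


@dataclass
class DatasetDiscoveryMetadata(DiscoveryMetadata):
    pass


@dataclass
class CatalogueDiscoveryMetadata(DiscoveryMetadata):
    pass


@dataclass
class SupportFileMetadata(DiscoveryMetadata):
    # The 'supports' association targets the base class, so a support file
    # can point at a dataset, a catalogue, or another support file.
    supports: List[DiscoveryMetadata] = field(default_factory=list)


# Hypothetical entries: a translation file supporting a feature catalogue,
# and a second support file supporting the translation file.
fc = CatalogueDiscoveryMetadata("feature_catalogue.xml")
translation = SupportFileMetadata("fc_translation.xml", supports=[fc])
glossary = SupportFileMetadata("glossary.txt", supports=[translation])
```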

DavidGrant-NIWC · Answer 5 · Fri Jan 28 2022 23:12:24 GMT+0800 (China Standard Time)

I like the white boxes, but suggest renaming to avoid confusion with class names (the "structural class names" are not defined anywhere):

S100_ExchangeSet -> S-100 Exchange Set
ISOMetadataFile -> ISO Metadata
S100_Dataset -> Dataset
S100_SupportFile -> Support File
S100_SupportCatalogue -> Catalogue
S100_CatalogueSignature -> Exchange Set Signature

It seems misleading to give attribute names to the file relationships (e.g., "supportFile" for S100_Dataset->S100_SupportFile).

Also, the support file will reference a dataset and/or catalogue (or possibly a support file?) so this will need further updating. Will need to get the actual attribute names from JP.

Multiplicity on datasetReference should be 0..*
Add "catalogueReference" for S100_SupportFileDiscoveryMetadata->S100_CatalogueDiscoveryMetadata, 0..*
Possibly add "supportFileReference" for S100_SupportFileDiscoveryMetadata->S100_SupportFileDiscoveryMetadata, 0..*
Need a relationship between a support file and a catalogue (S100_SupportCatalogue->S100_SupportFile), 0..*
Possibly need a relationship between support files (S100_SupportFile->S100_SupportFile), 0..*

Consider if ISO Metadata would need support files in order to deliver schemas / codelists. If so, add ISOMetadataFile->S100_SupportFile, 0..*


rmalyankar · Answer 6 · Sat Jan 29 2022 03:41:15 GMT+0800 (China Standard Time)

I'm going to apply Occam's razor to this discussion, and add two attributes to S100_SupportFileDiscoveryMetadata instead. The associations in question will be removed altogether or changed to dependencies (dependencies are conceptual links and are not mapped to XML tags in the XML exchange catalogue).

supportedResource: CharacterString [0..*] Definition: Name of the resource supported by this support file.
supportedResourceType: Enumeration {dataset, featureCatalogue, portrayalCatalogue, interoperabilityCatalogue, product, application, supportFile, other(?)}
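
For illustration only, the two proposed attributes could be modelled as below. The Python names are assumptions, and since no multiplicity is stated for supportedResourceType it is treated here as optional.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class SupportedResourceType(Enum):
    # Values as listed in the proposal, including the tentative "other".
    DATASET = "dataset"
    FEATURE_CATALOGUE = "featureCatalogue"
    PORTRAYAL_CATALOGUE = "portrayalCatalogue"
    INTEROPERABILITY_CATALOGUE = "interoperabilityCatalogue"
    PRODUCT = "product"
    APPLICATION = "application"
    SUPPORT_FILE = "supportFile"
    OTHER = "other"


@dataclass
class SupportFileDiscoveryMetadata:
    file_name: str  # hypothetical field name
    # supportedResource: CharacterString [0..*]
    supported_resource: List[str] = field(default_factory=list)
    # Multiplicity not given in the proposal; optional here (assumption).
    supported_resource_type: Optional[SupportedResourceType] = None


# Hypothetical entry: one support file naming the resource it supports.
pack = SupportFileDiscoveryMetadata(
    "fc_translation.xml",
    supported_resource=["feature_catalogue.xml"],
    supported_resource_type=SupportedResourceType.FEATURE_CATALOGUE,
)
```

Because the links are plain strings rather than associations, the target does not have to be present in the same exchange catalogue.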

I do not agree that the structural classes should have different names. The S100_ structure classes all represent things that are defined in the S-100 context and Part 17 4.1 explains them. ISOMetadataFile represents exactly what its name indicates: an ISO-format metadata file.

jonP · Answer 7 · Sat Jan 29 2022 16:33:29 GMT+0800 (China Standard Time)

I like the application of Occam's razor in this context. The diagrams look ok to me and I don't have any strong feelings on how these relationships are represented as the UML is largely illustrative and most OEMs will be guided by the schemas and example datasets (as our IEC representative will tell us).

I think supportedResource is concise and to the point, and we can just use it as a generic relationship name. The enumeration values are OK, I think, but I'm not sure about "product" or "application". I think "other" is fine - it's not actually massively important what the supportedResourceType is, is it? The multiplicities are fine with me: 0..* means you can have supportingResources which support nothing (they support the "exchange set"). This accounts for README.TXT type elements, and other elements aggregators / distributors may wish to insert at a later date.

The diagrams I'm not so concerned about - it does strike me that the easier way of doing this is to dispense with exchange set metadata as a separate concept and just design a product specification for it. Then you have a complete UML language to describe the structure and its interrelationships. Dataset metadata are just FeatureTypes and supporting resources are just information types. You'd need to define a generic XML schema to encode it, but much of the hard work is done and you don't need a completely separate diagramming language to describe such structures. We did a testbed project with Geonovum doing S-100 metadata in this fashion so it could be accessed via OGC API Records - in the OGC API methodology the metadata API (OGC API Records) is an extension of the generic OGC API Features - there's simply no concept of having different structures for metadata and data... Just a thought - possibly for a future incarnation of these things.

rmalyankar · Answer 8 · Tue Feb 01 2022 12:19:34 GMT+0800 (China Standard Time)

Updated Figure 17-1:

Updated Figure 17-2:

(Is there a plausible use case for a support file that supports more than one catalogue? Even successive versions of catalogues should be complete packages that don't depend on something packaged with a previous version.)

Updated Figure 17-5:

DavidGrant-NIWC · Answer 9 · Wed Feb 02 2022 00:56:57 GMT+0800 (China Standard Time)

Looks good, I think these are a big improvement over what was in version 4.

> (Is there a plausible use case for a support file that supports more than one catalogue? Even successive versions of catalogues should be complete packages that don't depend on something packaged with a previous version.)

Catalog updates will likely require more work in the future; recommend that these are 0..* so as not to restrict future developments.

Support files which haven't changed could be shared with multiple versions of a catalog.
Support files for IENC and S-101 could be shared.
Support files for interoperability catalogs could be shared with many portrayal catalogs.
In the future, may want to deliver dataset/catalog/support file updates via an update mechanism which provides the new content via support file (e.g. some type of patch / delta)

rmalyankar · Answer 10 · Wed Feb 02 2022 03:25:08 GMT+0800 (China Standard Time)

Draft enumeration for supported resource types added here

DavidGrant-NIWC · Answer 11 · Wed Feb 02 2022 03:45:15 GMT+0800 (China Standard Time)

Using the file name as an identifier here will require unique file names for all of these file types. I don't think there is any requirement or convention to enforce this currently - I think most portrayal catalogs are just named "portrayal_catalogue.xml".
- Consider using a URI/hash/MRN/etc.
productId is a non-version specific product type identifier. number is a major version specific product type identifier.
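
One way to picture the hash option: derive the identifier from the file content, so two catalogues that share a file name still get distinct identifiers. This is only a sketch; the "sha256:" prefix is an illustrative convention, not something defined in S-100, and the XML content is made up.

```python
import hashlib


def content_identifier(data: bytes) -> str:
    """Return an identifier derived from file content (illustrative scheme)."""
    return "sha256:" + hashlib.sha256(data).hexdigest()


# Two hypothetical catalogues, both named portrayal_catalogue.xml on disk.
pc_a = b"<portrayalCatalogue version='1'/>"
pc_b = b"<portrayalCatalogue version='2'/>"

id_a = content_identifier(pc_a)
id_b = content_identifier(pc_b)
```

A content hash changes with every edit, so it identifies one exact file rather than a resource across reissues; a URI or MRN would behave differently in that respect.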

rmalyankar · Answer 12 · Wed Feb 02 2022 12:04:06 GMT+0800 (China Standard Time)

Given that there is no convention, and given that even datasets can be reissued, I can change the remarks for dataset through product to just supportedResource = identifier for the dataset, etc., and leave the question of exactly how they are identified open until a convention is published. Also, add a note saying conventions for identifiers are still to be developed.

DavidGrant-NIWC · Answer 13 · Thu Feb 03 2022 00:32:41 GMT+0800 (China Standard Time)

> [...] change the remarks for dataset through product to just supportedResource = identifier for the dataset, etc. [...]

Sounds good, but thinking about this some more it seems like most of the values of S100_SupportedResourceType are duplicative and the use of the enumeration may limit the ability to share a supportFile among multiple resource types (it's not clear how this will be added to the model - will it be an attribute of each reference from a supportFile to a resource, or is it just a single attribute of the supportFile?)

For dataset, featureCatalogue, portrayalCatalogue, interoperabilityCatalog, and supportFile the referenced resource type can be determined based on which section of *DiscoveryMetadata contains the resource
- It's a dataset if it's in S100_DatasetDiscoveryMetadata
- It's a catalog if it's in S100_CatalogueDiscoveryMetadata
  - S100_CatalogueScope identifies the catalog type
- It's a support file if it's in S100_SupportFileDiscoveryMetadata
Recommend remove all these file type values and replace with a single value: file
- Retain product, software, system, and other
Recommend add a value to specify an exchange catalog:
- exchangeCatalogue (see S100_ExchangeCatalogueIdentifier.identifier)
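
The section-based inference described above could look roughly like this, assuming the exchange catalogue has already been parsed into a dict of sections. The dict layout and the "id" and "scope" keys are hypothetical.

```python
# Map each discovery-metadata section to the resource type it implies.
SECTIONS = {
    "S100_DatasetDiscoveryMetadata": "dataset",
    "S100_CatalogueDiscoveryMetadata": "catalogue",
    "S100_SupportFileDiscoveryMetadata": "supportFile",
}


def infer_resource_type(exchange_catalogue, identifier):
    """Return the type implied by the section holding `identifier`, or None."""
    for section, kind in SECTIONS.items():
        for entry in exchange_catalogue.get(section, []):
            if entry["id"] == identifier:
                if kind == "catalogue":
                    # S100_CatalogueScope narrows the catalogue type.
                    return entry.get("scope", kind)
                return kind
    return None  # target is not described in this exchange catalogue


# Hypothetical parsed exchange catalogue.
catalogue = {
    "S100_DatasetDiscoveryMetadata": [{"id": "ds-1"}],
    "S100_CatalogueDiscoveryMetadata": [{"id": "fc-1", "scope": "featureCatalogue"}],
    "S100_SupportFileDiscoveryMetadata": [{"id": "sf-1"}],
}
```

The None branch matters for updates, where the referenced resource may have been delivered in an earlier exchange set.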

productId is a non-version specific product type identifier. number is a major version specific product type identifier.

I'm not sure what the use case for product is (maybe something like a copyright notice or LGPL license file?). Consider:

should it apply to resource types other than just datasets?
does the supportFile still need to individually reference all of the files to which it applies?
- I don't think so, but it should be clarified.
If we need a version-specific product identifier then:
- productId: All resources of an S-100 product type e.g., "S-101". See S100_ProductSpecification.productId.
- productNumber: All resources of a registered version of an S-100 Product type e.g., "5". See S100_ProductSpecification.number.

jonP · Answer 14 · Thu Feb 03 2022 01:19:09 GMT+0800 (China Standard Time)

Just to flag - Holger and I are working on language packs with CHS/CCG. This will generate a number (0..*) of language packs containing translations of feature catalogue items. This is being done under a new Part 18 (as decided by S100WG at the meeting). The only issue for this group, I believe, is that a language pack supports a feature catalogue, providing a mechanism for translating some of its content. The nature of the supporting resource in this case (the language pack) can be determined either from the content (it's an XML file with a root element of "languagePack", for instance) or via the metadata, where the S100_SupportedResourceType attribute... Worth consideration - I'll put some thoughts together on it but just flagging it now before Part 18 is drafted.
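
The content-based option can be sketched as a check on the XML root element. The root name "languagePack" is the example given above; the namespace handling and sample documents are assumptions, since Part 18 is not yet drafted.

```python
import xml.etree.ElementTree as ET


def is_language_pack(xml_text: str) -> bool:
    """True if the document's root element is named languagePack."""
    root = ET.fromstring(xml_text)
    # ElementTree reports namespaced tags as "{uri}local"; keep the local part.
    local_name = root.tag.rsplit("}", 1)[-1]
    return local_name == "languagePack"


# Hypothetical documents; the namespace URI is a placeholder.
plain = "<languagePack><item/></languagePack>"
namespaced = '<lp:languagePack xmlns:lp="urn:example:lp"/>'
other = "<featureCatalogue/>"
```

This requires opening and parsing each support file, whereas the metadata route lets a consumer decide from the exchange catalogue alone.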

jonP · Answer 15 · Thu Feb 03 2022 01:26:17 GMT+0800 (China Standard Time)

The other thing to note is that the meeting we had re: identification of resources concluded there's so much to consider as regards the maintenance of state on the implementing system that trying to get it perfect now is impossible. Some thorough testing of this part is likely to result in some changes given there's a need to produce full exchange sets, partial exchange sets, delta exchange sets etc... and the various encodings and product specs have different updating models in them. Add in reissues and cancellations to the requirement to update/replace supporting resources and I think the general consensus was we can't test all these things exhaustively now - we will need to do scenario testing and then recommend either corrections/clarifications to this part of S-100 and/or clarifications in S-98 Annex C in regard to ECDIS... I think we're close here and reversing the associations basically works but until we've done it in practice we simply won't be able to exercise all the possibilities....

rmalyankar · Answer 16 · Thu Feb 03 2022 01:27:04 GMT+0800 (China Standard Time)

There may not be a target discovery block in the exchange set especially if it is an update. Also, not mentioning the type means that all discovery blocks must be examined to find out what it supports.

Codelist dictionaries are support files and would support all datasets for a product version.

Is there a realistic example of a support file that supports multiple types of resources? If so, it should be split into multiple files that each support one type, otherwise management of resources grows over-complicated.

I will rename product to productVersion and add productFamily (Datasets for any active version of a product specification.)

DavidGrant-NIWC · Answer 17 · Thu Feb 03 2022 03:51:05 GMT+0800 (China Standard Time)

> There may not be a target discovery block in the exchange set especially if it is an update. Also, not mentioning the type means that all discovery blocks must be examined to find out what it supports.

You need to examine all the discovery blocks regardless.
The target may not exist in the current exchange set, but if not then it would have been present in a previous exchange set. Either way, it's a simple dictionary lookup.

Recommend add a value to specify an exchange catalog:
exchangeCatalogue (see S100_ExchangeCatalogueIdentifier.identifier)

What Raphael has added identifies the type of the file referenced from the support file. What Jonathan seems to want is to identify the type of the support file itself.

E.g., the support file is a "translation file", rather than the support file references a "feature catalogue".

rmalyankar · Answer 18 · Fri Feb 04 2022 00:00:39 GMT+0800 (China Standard Time)

> You need to examine all the discovery blocks regardless.

No, a comprehensive search is not needed if you know what type of resource it supports. You can use the type to limit it.
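
To make the difference concrete, here is a sketch under one assumed in-memory layout (discovery blocks grouped by resource type, then keyed by identifier). The layout and names are purely illustrative, not a recommended implementation.

```python
def find_any(blocks_by_type, identifier):
    """Comprehensive search: check every group of discovery blocks."""
    for resource_type, blocks in blocks_by_type.items():
        if identifier in blocks:
            return resource_type, blocks[identifier]
    return None


def find_typed(blocks_by_type, identifier, resource_type):
    """Type-limited search: look only in the group named by the type."""
    block = blocks_by_type.get(resource_type, {}).get(identifier)
    return (resource_type, block) if block is not None else None


# Hypothetical discovery blocks.
blocks_by_type = {
    "dataset": {"ds-1": {"fileName": "ds1.000"}},
    "featureCatalogue": {"fc-1": {"fileName": "fc.xml"}},
    "supportFile": {"sf-1": {"fileName": "readme.txt"}},
}
```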

> The target may not exist in the current exchange set, but if not then it would have been present in a previous exchange set. Either way, it's a simple dictionary lookup.

This assumes or probably forces a particular implementation of metadata, which is not a good idea.

> Recommend add a value to specify an exchange catalog: exchangeCatalogue (see S100_ExchangeCatalogueIdentifier.identifier)

Don't see the need for this, but don't mind adding it.

> What Raphael has added identifies the type of the file referenced from the support file. What Jonathan seems to want is to identify the type of the support file itself.
>
> E.g., the support file is a "translation file", rather than the support file references a "feature catalogue".

Language packs should have file name conventions that identify them as language packs. Or the S100_SupportFileFormat enumeration can be extended with "LanguagePack".