IHO-S100WG / TSM8

6.X-12 ISO 19115 files

rmalyankar opened this issue · comments

In Ed. 4.0.0

The ISO 19115 metadata file is referenced from S100_DatasetDiscoveryMetadata by the element S100_19115DatasetMetadata, a role in Figure 4a-D-2 that is implemented as a gcx:FileName reference and carries the name of the ISO 19115 metadata file. So even in 4.0.0 there is an indication of whether an ISO 19115 metadata file is present for a dataset, as well as its name.

In Ed. 5.0.0

The 4.0.0 modelling is carried over to Figure 4a-D-5. The implementation would be similar, with a change to the role name. Since Figure 4a-D-5 describes the exchange catalog (CATALOG.XML), there is a note saying it is an external file (meaning, external to the exchange catalog).

I can add the S100_19115DatasetMetadata box as a (white) block in Figure 4a-D-2 and aggregate it to S100_ExchangeSet (and change its background in 4a-D-5 to conform).

On aggregations vs. compositions: Portrayal and feature catalogues being common to all datasets in the product, they need not be deleted when the exchange set is removed. Since some support files are going to be shared based on the support file discussion we have been having, some of them (and their discovery metadata) would also be retained. In fact, since an exchange set could be an archive acting as a "delivery package", perhaps even datasets and their discovery metadata should be retained until the dataset is cancelled, or voluntarily removed, or goes out of license?

The reference from the dataset is ok, but there must be an element for the 19115 metadata file itself. The reason is that there should not be a single file in the exchange set that does not carry a digital signature. I think S100_SupportFile_Metadata would do the job.
This is all under the assumption that the 19115 Metadata file is part of the exchange set.
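
The "no unsigned file" requirement lends itself to a mechanical check. Below is a minimal sketch in Python, assuming the list of files in the exchange set and the list of files that have a signature entry are already available as plain file names. The function name `unsigned_files` and the example file names are hypothetical and not taken from S-100.

```python
def unsigned_files(files_in_exchange_set, signed_files):
    """Return the files that have no digital signature entry, sorted by name.

    Both arguments are iterables of file names (strings). How the signed
    names are extracted from the exchange catalogue is out of scope here.
    """
    return sorted(set(files_in_exchange_set) - set(signed_files))


# Hypothetical file names, for illustration only.
files = ["CATALOG.XML", "dataset.000", "dataset_iso19115.xml"]
signed = ["CATALOG.XML", "dataset.000"]

# The ISO 19115 metadata file is part of the exchange set but has no
# metadata element of its own, so nothing carries its signature.
print(unsigned_files(files, signed))
```

If the ISO 19115 file gets its own discovery metadata element, it would appear in `signed` and the result would be empty.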

I also think S100_SupportFileDiscoveryMetadata would do. Also, add a value to the S100_SupportFileFormat enumeration:
ISOMetadata: Dataset metadata in ISO format

There is a generic "XML" in the S100_SupportFileFormat enumeration but indicating that it is ISO metadata would be better.

The supportFileSpecification attribute of S100_SupportFileDiscoveryMetadata can provide more information about which edition of 19115-x is being used.
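
As a rough illustration of the proposal, the sketch below models the extended enumeration and a support file entry in Python. It is a simplification under stated assumptions: only the two enumeration values mentioned in this thread are shown (the other values are omitted), `supportFileSpecification` is reduced to a plain string, and the class and function names are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class SupportFileFormat(Enum):
    """Partial stand-in for S100_SupportFileFormat (other values omitted)."""
    XML = "XML"                   # generic value already in the enumeration
    ISO_METADATA = "ISOMetadata"  # proposed: dataset metadata in ISO format


@dataclass
class SupportFileDiscoveryMetadata:
    """Simplified stand-in; the real class has more attributes."""
    file_name: str
    file_format: SupportFileFormat
    # Simplified to a string here; can say which edition of 19115-x is used.
    support_file_specification: Optional[str] = None


def is_iso_metadata(entry):
    """True if the entry is flagged as ISO-format dataset metadata."""
    return entry.file_format is SupportFileFormat.ISO_METADATA


# Hypothetical entry, for illustration only.
entry = SupportFileDiscoveryMetadata(
    file_name="dataset_iso19115.xml",
    file_format=SupportFileFormat.ISO_METADATA,
    support_file_specification="ISO 19115-1:2014",
)
print(is_iso_metadata(entry))
```

With only the generic "XML" value, a consumer would have to open the file to find out that it is ISO metadata; the dedicated value makes that visible from the catalogue alone.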

New figures. The white boxes are "structural" classes that represent objects such as dataset files, rather than blocks of XML.

Figure 17-2 S-100 Exchange Set:
V5 0 Fig 17-2 S100 ExchangeSet

Figure 17-5 (S-100 Exchange Set Catalogue):

V5 0 Fig 17-5 S100 ExchangeSetCatalogue

I like the white boxes, but suggest renaming to avoid confusion with class names (the "structural class names" are not defined anywhere):

  • S100_ExchangeSet -> S-100 Exchange Set
  • ISOMetadataFile -> ISO Metadata
  • S100_Dataset -> Dataset
  • S100_SupportFile -> Support File
  • S100_SupportCatalogue -> Catalogue
  • S100_CatalogueSignature -> Exchange Set Signature

It seems misleading to give attribute names to the file relationships (e.g., "supportFile" for S100_Dataset->S100_SupportFile).

Also, the support file will reference a dataset and/or catalogue (or possibly a support file?) so this will need further updating. Will need to get the actual attribute names from JP.

  • Multiplicity on datasetReference should be 0..*
  • Add "catalogueReference" for S100_SupportFileDiscoveryMetadata->S100_CatalogueDiscoveryMetadata, 0..*
  • Possibly add "supportFileReference" for S100_SupportFileDiscoveryMetadata->S100_SupportFileDiscoveryMetadata, 0..*
  • Need a relationship between a support file and a catalogue (S100_SupportCatalogue->S100_SupportFile), 0..*
  • Possibly need a relationship between support files (S100_SupportFile->S100_SupportFile), 0..*

Consider if ISO Metadata would need support files in order to deliver schemas / codelists. If so, add ISOMetadataFile->S100_SupportFile, 0..*

I'm going to apply Occam's razor to this discussion, and add two attributes to S100_SupportFileDiscoveryMetadata instead. The associations in question will be removed altogether or changed to dependencies (dependencies are conceptual links and are not mapped to XML tags in the XML exchange catalogue).

  • supportedResource: CharacterString [0..*] Definition: Name of the resource supported by this support file.
  • supportedResourceType: Enumeration {dataset, featureCatalogue, portrayalCatalogue, interoperabilityCatalogue, product, application, supportFile, other(?)}
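
The two proposed attributes can be sketched as a small data structure. This is an illustration only, written in Python rather than as schema: the class name `SupportFileEntry` and the helper `supports` are hypothetical, and treating `supportedResourceType` as a single optional value is an assumption, since the proposal above does not state its multiplicity.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class SupportedResourceType(Enum):
    """Values as listed in the proposal; "other" is marked "other(?)" there."""
    DATASET = "dataset"
    FEATURE_CATALOGUE = "featureCatalogue"
    PORTRAYAL_CATALOGUE = "portrayalCatalogue"
    INTEROPERABILITY_CATALOGUE = "interoperabilityCatalogue"
    PRODUCT = "product"
    APPLICATION = "application"
    SUPPORT_FILE = "supportFile"
    OTHER = "other"


@dataclass
class SupportFileEntry:
    """Simplified stand-in for S100_SupportFileDiscoveryMetadata."""
    file_name: str
    # supportedResource: CharacterString [0..*]
    supported_resource: List[str] = field(default_factory=list)
    # Assumed optional and single-valued for this sketch.
    supported_resource_type: Optional[SupportedResourceType] = None


def supports(entry, resource_name):
    """True if the support file names the given resource."""
    return resource_name in entry.supported_resource


# Hypothetical entries, for illustration only.
legend = SupportFileEntry(
    file_name="legend.txt",
    supported_resource=["dataset_a.000", "dataset_b.000"],
    supported_resource_type=SupportedResourceType.DATASET,
)
unattached = SupportFileEntry(file_name="notes.txt")  # 0 supported resources

print(supports(legend, "dataset_a.000"), supports(unattached, "dataset_a.000"))
```

Because the link is a plain name rather than an association, nothing in the structure guarantees that the named resource exists; that has to be checked separately.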

I do not agree that the structural classes should have different names. The S100_ structure classes all represent things that are defined in the S-100 context and Part 17 4.1 explains them. ISOMetadataFile represents exactly what its name indicates: an ISO-format metadata file.

commented

I like the application of Occam's razor in this context. The diagrams look ok to me and I don't have any strong feelings on how these relationships are represented as the UML is largely illustrative and most OEMs will be guided by the schemas and example datasets (as our IEC representative will tell us).

I think supportedResource is concise and to the point, and we can just use it as a generic relationship name. The enumeration values are ok, I think, but I'm not sure about "product" or "application". I think "other" is fine - it's not actually massively important what the supportedResourceType is, is it? The multiplicities are fine with me: 0..* means you can have supporting resources which support nothing (they support the "exchange set" - this accounts for README.TXT type elements, and other elements aggregators / distributors may wish to insert at a later date).

The diagrams I'm not so concerned about - it does strike me that the easier way of doing this is to dispense with exchange set metadata as a separate concept and just design a product specification for it. Then you have a complete UML language to describe the structure and its interrelationships. Dataset metadata are just FeatureTypes and supporting resources are just information types. You'd need to define a generic XML schema to encode it, but much of the hard work is done and you don't need a completely separate diagramming language to describe such structures. We did a testbed project with Geonovum doing S-100 metadata in this fashion so it could be accessed via OGC API Records. In the OGC API methodology the metadata API (OGC API Records) is an extension of the generic OGC API Features - there's simply no concept of having different structures for metadata and data. Just a thought - possibly for a future incarnation of these things.

Updated Figure 17-1:
V5 0 Fig 17-1 Realization of the Exchange set Classes

Updated Figure 17-2:
V5 0 Fig 17-2 S100 ExchangeSet
(Is there a plausible use case for a support file that supports more than one catalogue? Even successive versions of catalogues should be complete packages that don't depend on something packaged with a previous version.)

Updated Figure 17-5:
V5 0 Fig 17-5 S100 ExchangeSetCatalogue

Looks good, I think these are a big improvement over what was in version 4.

(Is there a plausible use case for a support file that supports more than one catalogue? Even successive versions of catalogues should be complete packages that don't depend on something packaged with a previous version.)

Catalog updates will likely require more work in the future; recommend that these are 0..* so as not to restrict future developments.

  • Support files which haven't changed could be shared with multiple versions of a catalog.
  • Support files for IENC and S-101 could be shared.
  • Support files for interoperability catalogs could be shared with many portrayal catalogs.
  • In the future, may want to deliver dataset/catalog/support file updates via an update mechanism which provides the new content via support file (e.g. some type of patch / delta)

Draft enumeration for supported resource types added here

  • Using the file name as an identifier here will require unique file names for all of these file types. I don't think there is any requirement or convention to enforce this currently - I think most portrayal catalogs are just named "portrayal_catalogue.xml".

    • Consider using a URI/hash/MRN/etc.
  • productId is a non-version specific product type identifier. number is a major version specific product type identifier.

Given that there is no convention, and given that even datasets can be reissued, I can change the remarks for dataset through product to just supportedResource = identifier for the dataset, etc., and leave the question of exactly how they are identified open until a convention is published. Also, add a note saying conventions for identifiers are still to be developed.

[...] change the remarks for dataset through product to just supportedResource = identifier for the dataset, etc. [...]

Sounds good, but thinking about this some more it seems like most of the values of S100_SupportedResourceType are duplicative and the use of the enumeration may limit the ability to share a supportFile among multiple resource types (it's not clear how this will be added to the model - will it be an attribute of each reference from a supportFile to a resource, or is it just a single attribute of the supportFile?)

  • For dataset, featureCatalogue, portrayalCatalogue, interoperabilityCatalogue, and supportFile, the referenced resource type can be determined based on which section of *DiscoveryMetadata contains the resource:
    • It's a dataset if it's in S100_DatasetDiscoveryMetadata
    • It's a catalog if it's in S100_CatalogueDiscoveryMetadata
      • S100_CatalogueScope identifies the catalog type
    • It's a support file if it's in S100_SupportFileDiscoveryMetadata
  • Recommend removing all these file type values and replacing them with a single value: file
    • Retain product, software, system, and other
  • Recommend add a value to specify an exchange catalog:
    • exchangeCatalogue (see S100_ExchangeCatalogueIdentifier.identifier)
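
The rule in the list above (the type follows from the discovery block that contains the resource) can be sketched as follows. This is a minimal illustration in Python, assuming the exchange catalogue has already been parsed into a dictionary; the dictionary layout, the key names, and the function `resource_type_of` are hypothetical.

```python
def resource_type_of(identifier, exchange_catalogue):
    """Infer the type of a referenced resource from the block that holds it.

    exchange_catalogue is assumed to look like:
      {"datasets": set of identifiers,
       "catalogues": dict of identifier -> catalogue scope,
       "support_files": set of identifiers}
    Returns None if no discovery block describes the identifier.
    """
    if identifier in exchange_catalogue["datasets"]:
        return "dataset"
    scope = exchange_catalogue["catalogues"].get(identifier)
    if scope is not None:
        return scope  # S100_CatalogueScope identifies the catalogue type
    if identifier in exchange_catalogue["support_files"]:
        return "supportFile"
    return None


# Hypothetical parsed catalogue, for illustration only.
catalogue = {
    "datasets": {"dataset_a.000"},
    "catalogues": {"fc.xml": "featureCatalogue"},
    "support_files": {"legend.txt"},
}

for name in ("dataset_a.000", "fc.xml", "legend.txt", "missing.xml"):
    print(name, resource_type_of(name, catalogue))
```

The `None` case is the one that matters for the design choice: when the target is not described in the same exchange catalogue, the type cannot be inferred this way.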

productId is a non-version specific product type identifier. number is a major version specific product type identifier.

I'm not sure what the use case for product is (maybe something like a copyright notice or LGPL license file?). Consider:

  • should it apply to resource types other than just datasets?
  • does the supportFile still need to individually reference all of the files to which it applies?
    • I don't think so, but it should be clarified.
  • If we need a version-specific product identifier then:
    • productId: All resources of an S-100 product type e.g., "S-101". See S100_ProductSpecification.productId.
    • productNumber: All resources of a registered version of an S-100 Product type e.g., "5". See S100_ProductSpecification.number.
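
To make the difference between the two identifiers concrete, here is a small sketch in Python. It assumes a support file can name a product by `productId` alone (any version) or by `productId` plus `productNumber` (one registered version). The class `ProductSpecification` is a reduced stand-in and `applies_to` is a hypothetical helper.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ProductSpecification:
    """Reduced stand-in for S100_ProductSpecification."""
    product_id: str  # e.g. "S-101"
    number: str      # e.g. "5"


def applies_to(spec, product_id, product_number: Optional[str] = None):
    """True if a support file scoped to the given product covers `spec`.

    product_id alone matches every version of the product type;
    adding product_number narrows the match to one registered version.
    """
    if spec.product_id != product_id:
        return False
    if product_number is None:
        return True
    return spec.number == product_number


spec = ProductSpecification(product_id="S-101", number="5")
print(applies_to(spec, "S-101"), applies_to(spec, "S-101", "4"))
```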
commented

Just to flag - Holger and I are working on language packs with CHS/CCG. This will generate a number (0..*) of language packs containing translations of feature catalogue items. This is being done under a new Part 18 (as decided by S100WG at the meeting). The only issue for this group, I believe, is that a language pack supports a feature catalogue by providing a mechanism for translating some of its content. The nature of the supporting resource in this case (the language pack) can be determined either from the content (it's an XML file with a root element of "languagePack", for instance) or via the metadata, where the S100_SupportedResourceType attribute... Worth consideration - I'll put some thoughts together on it but just flagging it now before Part 18 is drafted.

commented

The other thing to note is that the meeting we had re: identification of resources concluded there's so much to consider as regards the maintenance of state on the implementing system that trying to get it perfect now is impossible. Some thorough testing of this part is likely to result in some changes, given there's a need to produce full exchange sets, partial exchange sets, delta exchange sets, etc., and the various encodings and product specs have different updating models in them. Add in reissues and cancellations to the requirement to update/replace supporting resources, and I think the general consensus was that we can't test all these things exhaustively now - we will need to do scenario testing and then recommend either corrections/clarifications to this part of S-100 and/or clarifications in S-98 Annex C in regard to ECDIS. I think we're close here and reversing the associations basically works, but until we've done it in practice we simply won't be able to exercise all the possibilities.

There may not be a target discovery block in the exchange set especially if it is an update. Also, not mentioning the type means that all discovery blocks must be examined to find out what it supports.

Codelist dictionaries are support files and would support all datasets for a product version.

Is there a realistic example of a support file that supports multiple types of resources? If so, it should be split into multiple files that each support one type, otherwise management of resources grows over-complicated.

I will rename product to productVersion and add productFamily (Datasets for any active version of a product specification.)

There may not be a target discovery block in the exchange set especially if it is an update. Also, not mentioning the type means that all discovery blocks must be examined to find out what it supports.

  • You need to examine all the discovery blocks regardless.
  • The target may not exist in the current exchange set, but if not then it would have been present in a previous exchange set. Either way, it's a simple dictionary lookup.
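
The "dictionary lookup" described here could look roughly like the sketch below. It is one possible receiving-side implementation, not something the standard prescribes: it assumes the system keeps an index of discovery blocks across exchange sets, that each block has a usable identifier, and that a later block with the same identifier replaces the earlier one. The class `ResourceIndex` and the block layout are hypothetical.

```python
class ResourceIndex:
    """Identifier -> discovery block, accumulated across exchange sets."""

    def __init__(self):
        self._blocks = {}

    def ingest(self, discovery_blocks):
        """Add every discovery block of one exchange set to the index."""
        for block in discovery_blocks:
            # A later block with the same identifier replaces the earlier one.
            self._blocks[block["identifier"]] = block

    def resolve(self, supported_resource):
        """Return the block a supportedResource value points at, or None."""
        return self._blocks.get(supported_resource)


# Hypothetical exchange sets, for illustration only.
base_set = [{"identifier": "dataset_a.000", "kind": "dataset"}]
update_set = [
    {"identifier": "legend.txt", "kind": "supportFile",
     "supportedResource": ["dataset_a.000"]},
]

index = ResourceIndex()
index.ingest(base_set)    # earlier exchange set
index.ingest(update_set)  # update: target dataset is not in this set

target = index.resolve(update_set[0]["supportedResource"][0])
print(target["kind"] if target else "unresolved")
```

The sketch only works if identifiers are unique and stable across exchange sets, which ties back to the open question of identifier conventions raised earlier in the thread.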

Recommend add a value to specify an exchange catalog:
exchangeCatalogue (see S100_ExchangeCatalogueIdentifier.identifier)

What Raphael has added identifies the type of the file referenced from the support file. What Jonathan seems to want is to identify the type of the support file itself.

  • E.g., the support file is a "translation file", rather than the support file references a "feature catalogue".
  • You need to examine all the discovery blocks regardless.

No, a comprehensive search is not needed if you know what type of resource it supports. You can use the type to limit it.

  • The target may not exist in the current exchange set, but if not then it would have been present in a previous exchange set. Either way, it's a simple dictionary lookup.

This assumes or probably forces a particular implementation of metadata, which is not a good idea.

Recommend add a value to specify an exchange catalog: exchangeCatalogue (see S100_ExchangeCatalogueIdentifier.identifier)

Don't see the need for this, but don't mind adding it.

What Raphael has added identifies the type of the file referenced from the support file. What Jonathan seems to want is to identify the type of the support file itself.

  • E.g., the support file is a "translation file", rather than the support file references a "feature catalogue".

Language packs should have file name conventions that identify them as language packs. Or the S100_SupportFileFormat enumeration can be extended with "LanguagePack".