adiwg / mdTranslator

Metadata translation tool built using Ruby

Home Page: https://www.adiwg.org/mdTranslator/

Add 'forceValid' metadata flag to mdTranslator

stansmith907 opened this issue

Add a 'forceValid' flag to the mdTranslator module. If the flag is true, missing required ISO elements will have the nilReason attribute set to "missing", and missing required FGDC elements will be assigned the free-text value "missing". If the flag is false, these elements will be omitted when the value is missing, which allows the record to be invalid according to the published standard.

Setting forceValidOutput to true will force showEmptyTags=false.
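
As a rough illustration of the behavior described above, here is a minimal Ruby sketch. The helper names (effective_options, iso_required_element, fgdc_required_element) are hypothetical and are not mdTranslator's actual writer API. The XML is built as plain strings with no escaping, ISO wrapper elements such as gco:CharacterString are left out, and the gco: prefix on nilReason is an assumption based on the usual ISO 19139 encoding.

```ruby
# Hypothetical helpers for illustration only; not mdTranslator's real API.

# Forcing valid output switches showEmptyTags off, whatever the caller asked for.
def effective_options(force_valid:, show_empty_tags:)
  show_empty_tags = false if force_valid
  { force_valid: force_valid, show_empty_tags: show_empty_tags }
end

# ISO writers: a missing required element carries nilReason="missing".
def iso_required_element(tag, value, force_valid:)
  return "<#{tag}>#{value}</#{tag}>" unless value.nil? || value.empty?
  return "<#{tag} gco:nilReason=\"missing\"/>" if force_valid
  nil # element omitted; the record may be invalid against the standard
end

# FGDC writers: a missing required element gets the free text "missing".
def fgdc_required_element(tag, value, force_valid:)
  return "<#{tag}>#{value}</#{tag}>" unless value.nil? || value.empty?
  return "<#{tag}>missing</#{tag}>" if force_valid
  nil # element omitted
end
```

With the flag off, both helpers return nil for a missing value, so the caller simply skips the element.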

All FGDC and ISO writers will be affected by this change.

  • add flag to CLI and mdTranslator response object
  • refactor FGDC writer
  • refactor ISO 19110 writer
  • refactor ISO 19115-2 writer

This flag causes some rethinking about what constitutes an error or warning.

  • If forceValid = true, tags are generated to cover missing required elements, so all messages related to missing data should be WARNING level.
  • If forceValid = false, missing required elements will cause generation of invalid metadata (at least according to the selected standard). These messages, even though they have the same content as above, should be ERROR level.
  • UNLESS we adopt a policy of lax standard adherence such that all mdTranslator output is assumed to be healthy even when the chosen standard is not precisely met. This is probably the safest option, since we know from experience that not many types of data offer the opportunity to fully conform to a standard. In this scenario, only issues that would prevent the translator from generating metadata (e.g. referencing an invalid contact or domain ID) would trigger an ERROR.
  • CAVEAT: writer-required elements that are also required by mdJson (e.g. addressType) will never be examined by the writer, because mdJson schema validation (or, if validation is set to 'none', the mdJson reader) will reject the input file until it is compliant with mdJson.
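
The severity rules in the bullets above could be sketched roughly as follows. This is only an illustration: message_level, the message kinds (:missing_required, :blocking), and the lax_policy switch are hypothetical names, not part of mdTranslator.

```ruby
# Hypothetical classifier for illustration; names are assumptions.
def message_level(kind, force_valid:, lax_policy: false)
  case kind
  when :blocking
    # e.g. a reference to an invalid contact or domain ID: no output possible
    :error
  when :missing_required
    # Placeholder tags were written (force_valid), or strict adherence is
    # not demanded (lax_policy): report the gap but do not fail.
    return :warning if force_valid || lax_policy
    :error # output would be invalid against the selected standard
  else
    raise ArgumentError, "unknown message kind: #{kind.inspect}"
  end
end
```

Under the lax policy the forceValid flag no longer changes the level of a missing-element message; it only changes what gets written to the output.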

Note: the writers will be refactored while working on issue #183.

commented

Yes, that's true. It is quite common that records are not squeaky clean in adherence to FGDC standards. Even the authoring tools don't stop someone from, say, using biological profile elements in a record that claims the standard profile. That is technically an error, but nobody really cares, and in some cases the infractions were made consciously to serve a purpose. I think this is why the Metadata Parser classifies errors according to three levels of severity. At the first level a warning is issued, and it is up to the author to decide whether it needs a response. The second level states that the error is due to non-compliance with the standard or with optionally applied best practices (thesauri validation, etc.). This level expects the author to address the issue. These two levels still produce an output, and do not introduce a statement into the XML. The highest severity is usually due to an invalid XML structure, and in this case an output is not generated.
If we were to follow this example, we would allow output from the translator except for invalid mdJSON.