adiwg / mdTranslator

Metadata translation tool built using Ruby

Home Page: https://www.adiwg.org/mdTranslator/

DCAT-US writer: required U.S. Government fields

hmaier-fws opened this issue

Issue

The DCAT-US schema has several fields that are identified for specific use by the U.S. Government. Two of these fields are flagged as mandatory. The translator should probably support the [bureauCode] and [programCode] fields.

Related issues: #251, #264

Discussion

  • It seems that this would only need to be a priority update if the editor or translator were being used to generate the data.json files for direct ingest into data.gov.
  • The above use case seems unlikely given the guidance that states: "In accordance with the requirements of the OPEN Government Data Act, the Data.gov catalog is populated by harvesting federal agency harvest sources that have the metadata inventories of the agency datasets"
  • I don't think I've ever seen these fields exposed to the users of federal data catalogs. As far as I know, the program and bureau codes are injected into the data.json by the catalogs prior to creating the export for data.gov harvest.

Potential mapping

1) Extend codelists

bureauCode

programCode

  • Add resourceInfo.programCode element to schema of program
  • Add new codelist of "programCode"
  • resourceInfo.program[0,n]

2) Use contactIdentifier

Another approach might be to use the contact identifier. This would not require a schema update and would be more in line with what these codes actually represent: identifiers for bureaus and programs.

  • Add respective code to an organization (Publisher?) as an identifier
  • Extend the ADIwg_Namespace codelist to include the respective namespaces
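A rough sketch of how a writer could pick these codes up from an organization's identifiers is shown here. It is not mdTranslator code: the contact hash is a simplified, hypothetical structure (the `:identifiers` key and its layout are assumptions), the helper names `codes_for` and `government_fields` are made up for illustration, and the code values are placeholders. The namespace values `bureauCode` and `programCode` are assumed to be the ones added to the codelist.

```ruby
require 'json'

# Hypothetical, simplified contact record. The real mdJson contact
# structure may differ; the code values below are placeholders.
publisher = {
  name: 'Example Bureau',
  isOrganization: true,
  identifiers: [
    { identifier: '000:00', namespace: 'bureauCode' },
    { identifier: '000:000', namespace: 'programCode' },
    { identifier: 'example-org-1', namespace: 'other' }
  ]
}

# Collect the identifier values whose namespace matches the requested code type.
def codes_for(contact, namespace)
  contact.fetch(:identifiers, [])
         .select { |id| id[:namespace] == namespace }
         .map { |id| id[:identifier] }
end

# Build only the two government-specific DCAT-US fields,
# leaving out any field for which no identifier was found.
def government_fields(contact)
  {
    'bureauCode' => codes_for(contact, 'bureauCode'),
    'programCode' => codes_for(contact, 'programCode')
  }.reject { |_key, values| values.empty? }
end

puts JSON.pretty_generate(government_fields(publisher))
```

Because the fields are dropped when no matching identifier exists, records for non-federal users would be unaffected under this approach.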

Summary of bureau and program code fields

bureauCode

Description:

Federal agencies, combined agency and bureau code from OMB Circular A-11, Appendix C.

Accepted values:

An array of strings from OMB Circular A-11, Appendix C. Codes are also available as a CSV file.

Usage Notes:

Represent each bureau responsible for the dataset according to the codes found in OMB Circular A-11, Appendix C (PDF, CSV). Start with the agency code, then a colon, then the bureau code.
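The "agency code, colon, bureau code" rule can be sketched in Ruby as follows. The helper names are hypothetical, the values are placeholders, and the pattern only assumes that both parts are digit strings; it does not check the number of digits or look the code up in Appendix C.

```ruby
# Assumption: both parts are digit strings; digit counts are not checked,
# and membership in OMB Circular A-11, Appendix C is not verified.
BUREAU_CODE_PATTERN = /\A\d+:\d+\z/

# Join an agency code and a bureau code in the "agency:bureau" form.
def build_bureau_code(agency, bureau)
  "#{agency}:#{bureau}"
end

# Loose shape check for a single bureauCode entry.
def valid_bureau_code?(value)
  value.is_a?(String) && BUREAU_CODE_PATTERN.match?(value)
end

code = build_bureau_code('000', '00') # placeholder values
puts code
puts valid_bureau_code?(code)
puts valid_bureau_code?('00000')
```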

programCode

Description:

Federal agencies, list the primary program related to this data asset, from the Federal Program Inventory. Use the format of 015:001.

Accepted values:

An array of strings from the Federal Program Inventory list.

Usage Notes:

Provide an array of programs related to this data asset, from the Federal Program Inventory.
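A minimal shape check for a programCode array might look like the sketch below. It assumes that "the format of 015:001" means three digits, a colon, and three digits; the helper name `invalid_program_codes` is hypothetical, and nothing here checks whether a code actually appears in the Federal Program Inventory.

```ruby
# Assumption: "the format of 015:001" means three digits, a colon, three digits.
PROGRAM_CODE_PATTERN = /\A\d{3}:\d{3}\z/

# Return the entries that do not match the assumed format.
def invalid_program_codes(codes)
  Array(codes).reject do |code|
    code.is_a?(String) && PROGRAM_CODE_PATTERN.match?(code)
  end
end

codes = ['015:001', '15:1']
bad = invalid_program_codes(codes)
if bad.empty?
  puts 'all program codes look well-formed'
else
  puts "malformed: #{bad.join(', ')}"
end
```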

@dwalt, @jwaspin: as per our previous discussion, the proposed solution is to extend the adiwg_namespace codelist to include:

  • codename: bureauCode
  • codename: programCode

A related issue has been created in the mdCodes repository: adiwg/mdCodes#82