DCAT-US writer: required U.S. Government fields
hmaier-fws opened this issue
Issue
The DCAT-US schema has several fields that are identified for specific use by the U.S. Government. Two of these fields are flagged as mandatory. The translator should probably support the [bureauCode] and [programCode] fields.
Related issues: #251, #264
Discussion
- It seems that this would only need to be a priority update if the editor or translator were being used to generate the data.json files for direct ingest into data.gov.
- The above use case seems unlikely given the guidance that states: "In accordance with the requirements of the OPEN Government Data Act, the Data.gov catalog is populated by harvesting federal agency harvest sources that have the metadata inventories of the agency datasets"
- I don't think I've ever seen these fields exposed to the users of federal data catalogs. As far as I know, the program and bureau codes are injected into the data.json by the catalogs prior to creating the export for data.gov harvest.
Potential mapping
1) Extend codelists
bureauCode
- Extend role codelist to include "bureau", extend namespace codelist to include "bureauCode"
- For each resourceInfo.citation.responsibleParty WHERE role = "bureau"
- SELECT contactId -> contact.identifier
- identifier must conform to https://resources.data.gov/schemas/dcat-us/v1.1/omb_bureau_codes.csv
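A minimal sketch of the bureauCode lookup described above, written in Python purely for illustration. The dictionary layout is a simplified, hypothetical stand-in for the metadata record: the key names follow the paths quoted in the list, `contact.identifier` is treated as a plain string, and the function name `bureau_codes` is made up for this example.

```python
# Hypothetical, simplified record layout; the real schema may differ.
def bureau_codes(metadata: dict) -> list:
    """Collect contact identifiers for responsible parties whose role is 'bureau'."""
    # Index contacts by contactId so each responsible party can be resolved.
    contacts = {c["contactId"]: c for c in metadata.get("contact", [])}
    parties = (
        metadata.get("resourceInfo", {})
        .get("citation", {})
        .get("responsibleParty", [])
    )
    codes = []
    for party in parties:
        # WHERE role = "bureau"
        if party.get("role") != "bureau":
            continue
        # SELECT contactId -> contact.identifier
        contact = contacts.get(party.get("contactId"))
        if contact and contact.get("identifier"):
            codes.append(contact["identifier"])
    return codes
```

Parties with other roles, unresolved contact IDs, and contacts without an identifier are skipped silently here; a real writer would probably want to report those cases, and would still need to check each value against the OMB bureau codes CSV.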
programCode
- Add resourceInfo.programCode element to schema of program
- Add new codelist of "programCode"
- resourceInfo.program[0,n]
2) Use contactIdentifier
Another approach might be to use the contact identifier. This would not require a schema update and would be more in line with what these codes actually represent: identifiers for bureaus and programs.
- Add respective code to an organization (Publisher?) as an identifier
- extend ADIwg_Namespace codelist to include the respective namespaces
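As a rough illustration of the contact identifier approach, the sketch below (Python, for illustration only) groups an organization's identifiers by namespace. The namespace names "bureauCode" and "programCode" are the ones proposed in this issue; the `identifiers`, `namespace`, and `identifier` keys and the function name are assumptions, not the actual schema.

```python
# Namespaces proposed in this issue as additions to the namespace codelist.
CODE_NAMESPACES = ("bureauCode", "programCode")


def codes_from_contact(contact: dict) -> dict:
    """Group a contact's identifiers by the proposed code namespaces."""
    result = {ns: [] for ns in CODE_NAMESPACES}
    # "identifiers" is a hypothetical key holding {namespace, identifier} pairs.
    for ident in contact.get("identifiers", []):
        ns = ident.get("namespace")
        value = ident.get("identifier")
        # Ignore identifiers in other namespaces and entries with no value.
        if ns in result and value:
            result[ns].append(value)
    return result
```

With this shape, the writer only needs to read identifiers off the publishing organization, and no new schema element is required.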
Summary of bureau and program code fields
bureauCode
Description:
Federal agencies, combined agency and bureau code from OMB Circular A-11, Appendix C.
Accepted values:
An array of strings from OMB Circular A-11, Appendix C. Codes are also available as a CSV file.
Usage Notes:
Represent each bureau responsible for the dataset according to the codes found in OMB Circular A-11, Appendix C (PDF, CSV). Start with the agency code, then a colon, then the bureau code.
programCode
Description:
Federal agencies, list the primary program related to this data asset, from the Federal Program Inventory. Use the format of 015:001.
Accepted values:
An array of strings from the Federal Program Inventory list.
Usage Notes:
Provide an array of programs related to this data asset, from the Federal Program Inventory.
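A quick format check for both fields could look like the sketch below. The programCode pattern follows the 015:001 example quoted above. The bureauCode pattern (three-digit agency code, colon, two-digit bureau code) is an assumption about the code widths and should be confirmed against the OMB bureau codes CSV. This only checks the shape of a value, not whether the code actually exists in either list.

```python
import re

# Assumed widths: three-digit agency code, colon, two-digit bureau code.
BUREAU_CODE_RE = re.compile(r"[0-9]{3}:[0-9]{2}")
# Matches the 015:001 format given in the programCode description.
PROGRAM_CODE_RE = re.compile(r"[0-9]{3}:[0-9]{3}")


def invalid_codes(codes, pattern):
    """Return the codes that do not fully match the expected format."""
    return [code for code in codes if not pattern.fullmatch(code)]
```

For example, `invalid_codes(["015:001", "15:1"], PROGRAM_CODE_RE)` flags only the second value.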
@dwalt, @jwaspin: as per our previous discussion, the proposed solution is to extend the adiwg_namespace codelist to include:
- codename: bureauCode
- codename: programCode
A related issue has been created in the mdCodes repository: adiwg/mdCodes#82