DCAT-US writer: optional fields
hmaier-fws opened this issue
Issue:
Add support for DCAT-US schema optional fields.
Related issues: #251, #264, #267, #268
For information on the DCAT-US schema see:
- https://resources.data.gov/schemas/dcat-us/v1.1/schema/dataset.json
- https://resources.data.gov/resources/dcat-us
- https://github.com/adiwg/mdTranslator/blob/feature/dcat-us-writer/notes.md
Optional fields
- #276
- #281
- #278
- #284
- #280
- isPartOf (closed by #282)
- issued (closed by #282)
- #277
- landingPage (closed by #282)
- #279
- references (closed by #282)
- systemOfRecords (closed by #282)
- theme (closed by #282)
Proposed mapping of mdJSON to DCAT-US optional fields.
accrualPeriodicity
Description
The frequency with which the dataset is published. See #276
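A minimal sketch of the lookup the mapping table in this issue suggests for this field, translating an ISO maintenance-frequency code into the ISO 8601 repeating-interval string DCAT-US expects. The function name and the table contents are illustrative assumptions, not mdTranslator code; each pair should be checked against the DCAT-US accrualPeriodicity guidance before use.

```python
from typing import Optional

# Illustrative subset only (an assumption, not the agreed mapping):
# ISO maintenance-frequency codes paired with DCAT-US accrualPeriodicity values.
ISO_TO_ACCRUAL = {
    "continual": "R/PT1S",
    "daily": "R/P1D",
    "weekly": "R/P1W",
    "fortnightly": "R/P2W",
    "monthly": "R/P1M",
    "quarterly": "R/P3M",
    "biannually": "R/P6M",
    "annually": "R/P1Y",
    "irregular": "irregular",
}


def accrual_periodicity(frequency_code: Optional[str]) -> Optional[str]:
    """Return the DCAT-US value for an ISO code, or None when there is no match."""
    if frequency_code is None:
        return None
    return ISO_TO_ACCRUAL.get(frequency_code.strip())
```

Codes without a counterpart (for example `asNeeded`) return `None` here, so a writer could simply omit the field in that case.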
conformsTo
Description
URI used to identify a standardized specification the dataset conforms to. See #281
dataQuality
Description
U.S. Government specific. Whether the dataset meets the agency’s Information Quality Guidelines (true/false). See #278
describedBy
Description
URL to the data dictionary for the dataset. Note that documentation other than a data dictionary can be referenced using Related Documents (references). See #284
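The condition proposed for this field in the mapping table (dictionary not included with the resource, and a citation URI present) could be sketched roughly as below. The function name is hypothetical, and the dictionary shape follows the paths as written in the table (`citation[0].onlineResource[0].uri`); the actual mdJSON nesting may differ.

```python
from typing import Optional


def described_by(data_dictionary: dict) -> Optional[str]:
    """Return the external data dictionary URI, or None if the rule does not apply."""
    # Skip dictionaries that ship with the resource itself.
    if data_dictionary.get("dictionaryIncludedWithResource") is True:
        return None
    citations = data_dictionary.get("citation") or []
    if not citations:
        return None
    online_resources = citations[0].get("onlineResource") or []
    if not online_resources:
        return None
    # An empty or missing uri is treated the same as no uri at all.
    return online_resources[0].get("uri") or None
```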
describedByType
Description
The machine-readable file format (IANA Media Type also known as MIME Type) of the dataset’s Data Dictionary (describedBy). See #280
isPartOf
Description
The collection of which the dataset is a subset.
issued
Description
Date of formal issuance.
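As a rough illustration of the rule proposed in the mapping table, the sketch below picks a date only when a "publication" or "distributed" date is present. Two assumptions are baked in: "earliest" is read as the earliest of the matching dates (the table could also mean the earliest of all dates), and dates are ISO 8601 strings in a consistent format so that string comparison orders them. The function name is hypothetical.

```python
from typing import Optional

# Date types named in the mapping table.
ISSUED_DATE_TYPES = {"publication", "distributed"}


def issued(citation: dict) -> Optional[str]:
    """Return the earliest publication/distributed date, or None if there is none."""
    matching = [
        entry["date"]
        for entry in citation.get("date", [])
        if entry.get("dateType") in ISSUED_DATE_TYPES and entry.get("date")
    ]
    # Same-format ISO 8601 strings sort chronologically (assumption).
    return min(matching) if matching else None
```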
language
Description
The language of the dataset. See #277
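The mapping table notes that a language code would need to be combined with a country to form an RFC 5646 tag such as "en-US". A minimal sketch of that combination follows. It assumes the record stores three-letter language and country codes (for example "eng" and "USA"), which is not confirmed in this issue, and the lookup tables are small illustrative samples rather than complete codelists.

```python
from typing import Optional

# Sample lookups only (assumption): three-letter codes to RFC 5646 subtags.
LANGUAGE_SUBTAGS = {"eng": "en", "spa": "es", "fra": "fr"}
REGION_SUBTAGS = {"USA": "US", "MEX": "MX", "CAN": "CA"}


def language_tag(language: str, country: Optional[str] = None) -> Optional[str]:
    """Build a tag like 'en-US'; return None for an unknown language code."""
    primary = LANGUAGE_SUBTAGS.get(language.lower())
    if primary is None:
        return None
    region = REGION_SUBTAGS.get(country.upper()) if country else None
    # Fall back to the bare language subtag when no region is known.
    return f"{primary}-{region}" if region else primary
```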
landingPage
Description
This field is not intended for an agency’s homepage (e.g. www.agency.gov), but rather if a dataset has a human-friendly hub or landing page that users can be directed to for all resources tied to the dataset.
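The mapping table proposes selecting the online resource whose function is "landingPage" (a code that would first have to be added to CI_OnlineFunctionCode). A sketch of that selection, with a hypothetical function name and a simplified citation dictionary:

```python
from typing import Optional


def landing_page(citation: dict) -> Optional[str]:
    """Return the URI of the first online resource flagged as a landing page."""
    for resource in citation.get("onlineResource", []):
        if resource.get("function") == "landingPage" and resource.get("uri"):
            return resource["uri"]
    return None
```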
primaryITInvestmentUII
Description
U.S. Government specific. For linking a dataset with an IT Unique Investment Identifier (UII). See #279
references
Description
Related documents such as technical information about a dataset, developer documentation, etc.
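The mapping table proposes gathering URIs from both associated resources and additional documentation. The sketch below collects them into a list, following the paths as written in the table; the function names are hypothetical, and removing duplicates is an added assumption. Whether the writer should emit a list or a comma-separated string is left open here.

```python
from typing import List


def _uris(citation: dict) -> List[str]:
    """Collect every non-empty onlineResource uri from one citation."""
    return [r["uri"] for r in citation.get("onlineResource", []) if r.get("uri")]


def references(metadata: dict) -> List[str]:
    """Gather related-document URIs from both sources named in the table."""
    uris: List[str] = []
    for associated in metadata.get("associatedResource", []):
        uris.extend(_uris(associated.get("resourceCitation") or {}))
    for document in metadata.get("additionalDocumentation", []):
        for citation in document.get("citation", []):
            uris.extend(_uris(citation))
    # Drop duplicates while keeping first-seen order.
    return list(dict.fromkeys(uris))
```

If a comma-separated string is wanted, `", ".join(references(metadata))` would produce it.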
systemOfRecords
Description
U.S. Government specific. If the system is designated as a system of records under the Privacy Act of 1974, provide the URL to the System of Records Notice related to this dataset.
theme
Description
Main thematic category of the dataset.
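Following the mapping table, themes would come from keyword groups whose thesaurus title is "ISO Topic Category", flattened into a single list. A sketch under the assumption that each keyword entry is either a plain string or an object with a `keyword` key (the function name is hypothetical):

```python
from typing import List

ISO_TOPIC_TITLE = "ISO Topic Category"


def theme(resource_info: dict) -> List[str]:
    """Flatten the keywords of every ISO Topic Category keyword group."""
    themes: List[str] = []
    for group in resource_info.get("keyword", []):
        title = (group.get("thesaurus") or {}).get("title")
        if title != ISO_TOPIC_TITLE:
            continue
        for item in group.get("keyword", []):
            # Accept either {"keyword": "biota"} or a bare "biota".
            value = item.get("keyword") if isinstance(item, dict) else item
            if value:
                themes.append(value)
    return themes
```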
Mapping
No (not required)
Field Name | DCAT Name | Condition | mdJson Source |
---|---|---|---|
Release Date | dcat:issued | if resourceInfo.citation.date[any].dateType = "publication" or "distributed" | resourceInfo.citation.date[earliest] |
Frequency | dcat:accrualPeriodicity | [The ISO codelist MD_maintenanceFrequency can be used; several of its codes overlap with the accrualPeriodicity codelist, but the correspondence is only partial. A column of ISO 8601 code equivalents could be added to MD_maintenanceFrequency to provide the coding expected by https://resources.data.gov/schemas/dcat-us/v1.1/iso8601_guidance/#accrualperiodicity. Community valuation should be determined.] | |
Language | dcat:language | [The language codelist could be used, but each code needs to be combined with a country to match the RFC 5646 format (https://datatracker.ietf.org/doc/html/rfc5646), such as "en-US". Community valuation should be determined.] | |
Data Quality | dcat:dataQuality | [This is a boolean indicating whether the data "conforms" to agency standards; its value seems negligible.] | |
Category | dcat:theme | where resourceInfo.keyword[any].thesaurus.title = "ISO Topic Category" | [resourceInfo.keyword.keyword[0, n] flatten] |
Related Documents | dcat:references | associatedResource[all].resourceCitation.onlineResource[all].uri + additionalDocumentation[all].citation[all].onlineResource[all].uri [comma separated] | |
Homepage URL | dcat:landingPage | [Add code "landingPage" to CI_OnlineFunctionCode] if resourceInfo.citation.onlineResource[any].function = "landingPage" | resourceInfo.citation.onlineResource.uri |
Collection | dcat:isPartOf | for each associatedResource[0, n].initiativeType = "collection" and associatedResource.associationType = "collectiveTitle" | associatedResource.resourceCitation[0].uri |
System of Records | dcat:systemOfRecords | [Add code "sorn" to DS_InitiativeTypeCode] for each associatedResource[0, n].initiativeType = "sorn" | associatedResource.resourceCitation[0].uri |
Primary IT Investment | dcat:primaryITInvestmentUII | [Links data to an IT investment identifier relative to Exhibit 53 docs, community valuation should be determined] | |
Data Dictionary | dcat:describedBy | if dataDictionary.dictionaryIncludedWithResource IS NOT TRUE and citation[0].onlineResource[0].uri exists | dataDictionary.citation[0].onlineResource[0].uri |
Data Dictionary Type | dcat:describedByType | [For simplicity, leave blank, which implies an HTML page. A community decision is needed on whether to support other formats, given as MIME types such as "application/pdf".] | |
Data Standard | dcat:conformsTo | [There is currently no way to identify the schema standard the data conforms to, though this has been proposed. If it is built and the community decides to support it, it can be mapped.] | |
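The isPartOf and systemOfRecords rows above share one pattern: filter associated resources by initiative type, then take a URI from the resource citation. A sketch of that pattern follows. The table writes the source as `associatedResource.resourceCitation[0].uri`; reading that as the first online resource URI of the citation is an assumption, as are the function names and returning only the first match.

```python
from typing import List, Optional


def _first_uri(citation: dict) -> Optional[str]:
    """Return the first non-empty onlineResource uri in a citation (assumed path)."""
    for resource in citation.get("onlineResource", []):
        if resource.get("uri"):
            return resource["uri"]
    return None


def is_part_of(associated_resources: List[dict]) -> Optional[str]:
    """URI of the first collection the dataset belongs to, per the table's condition."""
    for associated in associated_resources:
        if (associated.get("initiativeType") == "collection"
                and associated.get("associationType") == "collectiveTitle"):
            uri = _first_uri(associated.get("resourceCitation") or {})
            if uri:
                return uri
    return None


def system_of_records(associated_resources: List[dict]) -> Optional[str]:
    """URI of the first associated resource with the proposed "sorn" initiative type."""
    for associated in associated_resources:
        if associated.get("initiativeType") == "sorn":
            uri = _first_uri(associated.get("resourceCitation") or {})
            if uri:
                return uri
    return None
```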
Issued is not writing, although the test data has a "publication" date type.
accrualPeriodicity is not writing. Test data has an MD_maintenanceFrequency code of "annual".
References is not writing. Test data has an associated resource that should have been written.
Theme is not writing. Test data has at least one ISO Topic Category keyword.
Described By did not write. Test data had a data dictionary "not contained within the record" and a URI to an external dictionary.