AtlasOfLivingAustralia / biocache-store

Occurrence processing, indexing and batch processing

Incorrect processing of data license

timhicks-ala opened this issue

Example occurrences:
https://biocache.ala.org.au/occurrences/f0020e4d-3d24-4f69-94f8-3e89b785801d
https://biocache.ala.org.au/occurrences/9a5b44ea-0539-466a-b7de-1a934e513d13
Original value: http://creativecommons.org/licenses/by-sa/4.0/
Processed value: CC-BY-NC-Int

A different example:
https://biocache.ala.org.au/occurrences/3be33ebf-a64d-4036-8edf-31ae635aeec6
Original value: https://creativecommons.org/publicdomain/zero/1.0/legalcode
Processed value: CC-BY

One case spotted where the processed value is less restrictive than the supplied value:
https://biocache.ala.org.au/occurrences/f0588d42-f80f-421e-8a52-85f814ef4f95
Original value: http://creativecommons.org/licenses/by-nc-nd/4.0/
Processed value: CC-BY-NC-Int

Discovered while working on a more general data licensing query. This issue is not linked to a specific support ticket as yet.
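
For reference, here is a minimal sketch of the mapping one would expect if the processed value were derived from the supplied URL. It is not the biocache-store implementation; the function name normalise_cc_license, the regular expression and the "CODE VERSION" output format are all assumptions made for illustration.

```python
import re
from typing import Optional

# Hypothetical pattern (an assumption, not taken from biocache-store):
# matches Creative Commons license URLs and the CC0 public domain dedication.
_CC_URL = re.compile(
    r"^https?://creativecommons\.org/"
    r"(?:licenses/(?P<terms>[a-z]+(?:-[a-z]+)*)|publicdomain/(?P<pd>zero))"
    r"/(?P<version>\d+\.\d+)(?:/.*)?$",
    re.IGNORECASE,
)


def normalise_cc_license(url: str) -> Optional[str]:
    """Return a short code such as 'CC-BY-SA 4.0', or None if the URL is not recognised."""
    match = _CC_URL.match(url.strip())
    if match is None:
        return None
    if match.group("pd"):
        label = "CC0"
    else:
        label = "CC-" + match.group("terms").upper()
    return f"{label} {match.group('version')}"


# The three supplied values reported in this issue.
for supplied in (
    "http://creativecommons.org/licenses/by-sa/4.0/",
    "https://creativecommons.org/publicdomain/zero/1.0/legalcode",
    "http://creativecommons.org/licenses/by-nc-nd/4.0/",
):
    print(supplied, "->", normalise_cc_license(supplied))
```

Under these assumptions the supplied values would map to CC-BY-SA 4.0, CC0 1.0 and CC-BY-NC-ND 4.0, none of which match the processed values listed above.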

The first step is to determine where license processing occurs. It may be done inside a GBIF library.

This is a rough list of GBIF libraries that biocache-store depends on currently:

org.gbif:ecat-common:1.6
org.gbif:dwca-io:2.6
org.gbif:dwc-api:1.19
org.gbif:gbif-common:0.42
org.gbif.crawler:crawler:0.50
org.gbif:gbif-wrangler:0.2

The licences come from the collectory and are associated with the data resource.

Relevant code:
https://github.com/AtlasOfLivingAustralia/biocache-store/blob/master/src/main/scala/au/org/ala/biocache/processor/AttributionProcessor.scala#L67

The value used is the one associated with the data resource, which comes from a web service: https://collections.ala.org.au/ws/dataResource/dr1411

"licenseType": "CC-BY-NC-Int",
"licenseVersion": "4.0"
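
To illustrate why the processed value can disagree with the supplied one, here is a rough sketch of the behaviour described above: the processed license is read from the data resource metadata and the value on the record is not consulted. The function name explain_processed_license and the returned dictionary are hypothetical; only the licenseType and licenseVersion fields come from the web service response quoted above.

```python
import json


def explain_processed_license(record_license: str, data_resource_json: str) -> dict:
    """Show the supplied value next to the value taken from the data resource.

    Assumption for illustration: processing copies licenseType from the
    data resource and ignores the license supplied on the record.
    """
    metadata = json.loads(data_resource_json)
    return {
        "supplied": record_license,
        "processed": metadata.get("licenseType"),
        "version": metadata.get("licenseVersion"),
    }


# Fragment of the data resource response quoted in this thread.
dr_fragment = '{"licenseType": "CC-BY-NC-Int", "licenseVersion": "4.0"}'

result = explain_processed_license(
    "http://creativecommons.org/licenses/by-sa/4.0/", dr_fragment
)
print(result)
```

In this sketch every record in the data resource gets the same processed value, whatever license URL the record itself supplies.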

@timhicks-ala @peggynewman This appears to have been corrected in the latest biocache-store

@charvolant @peggynewman Confirmed, it looks correct in production now.