open-contracting / kingfisher-collect

Downloads OCDS data and stores it on disk

Home Page: https://kingfisher-collect.readthedocs.io

peru_osce_bulk: Some releases have no ocid

jpmckinney opened this issue

For example, Kingfisher Process produces compiled releases like:

{"id": "None-2023-04-01T00:00:00-05:00", "tag": ["compiled"], "url": "https://contratacionesabiertas.osce.gob.pe/api/v1/release/ocds-dgv273-seacev3-2023-51-73-2023-12-06T06:29:04.428973-05:00", "date": "2023-04-01T00:00:00-05:00", "details": {"id": "ocds-dgv273-seacev3-2023-51-73-2023-12-06T06:29:04.428973-05:00", "tender": {"id": "892585", "title": "AS-SM-5-2023-GOB.REG.TACNA-1"}}}

Pelican backend then fails.

The ocid might be extractable from details.id, but this OCDS looks very bizarre.
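A rough sketch of what "extractable from details.id" could look like, assuming the value is always the ocid followed by a hyphen and an ISO 8601 timestamp (which matches the one example here and the record URL quoted in this thread, but is not confirmed by the publisher). The function name `ocid_from_release_id` is made up for illustration and is not part of Kingfisher Collect or OCDS Kit.

```python
import re

# Assumption: details.id == "<ocid>-<ISO 8601 timestamp with UTC offset>".
# This pattern is inferred from a single example, not from publisher docs.
TIMESTAMP_SUFFIX = re.compile(
    r"-\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(?:\.\d+)?[+-]\d{2}:\d{2}$"
)


def ocid_from_release_id(release_id):
    """Strip a trailing timestamp from a release id (hypothetical helper).

    Returns None if the id does not end with a timestamp, so callers
    can tell "no ocid recovered" apart from a real value.
    """
    ocid, count = TIMESTAMP_SUFFIX.subn("", release_id)
    return ocid if count else None


details_id = "ocds-dgv273-seacev3-2023-51-73-2023-12-06T06:29:04.428973-05:00"
print(ocid_from_release_id(details_id))
```

This is only a workaround idea; it depends entirely on the publisher keeping that id format.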

cc @fppenna for reporting to the publisher

Ok so the original release, from the release endpoint looks good: https://contratacionesabiertas.osce.gob.pe/api/v1/release/ocds-dgv273-seacev3-2023-51-73-2023-12-06T06:29:04.428973-05:00

The compiled release from the record endpoint also looks good. However, the linked release from that endpoint contains what Kingfisher Process is producing: https://contratacionesabiertas.osce.gob.pe/api/v1/record/ocds-dgv273-seacev3-2023-51-73?format=json (as a linked release it looks fine, except that they need to remove the "details" object)

So, I guess what is happening is that they are generating the bulk files from linked releases instead of compiled releases (and since linked releases sit inside a record, they don't have an ocid). cc @fppenna

Aha. The test for a linked release in OCDS Kit is 'url' in data and len(data) <= 3, so the inclusion of details breaks that.
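To illustrate why the extra key matters, here is a minimal sketch of that check. Only the condition itself is taken from the comment above; the function name `looks_like_linked_release` and the two sample dicts are hypothetical, loosely based on the fields visible in this thread, and are not copied from OCDS Kit or from the publisher's actual output.

```python
def looks_like_linked_release(data):
    # Condition as quoted in this thread; the function name is illustrative.
    return 'url' in data and len(data) <= 3


# Illustrative linked release with three keys (url, date, tag).
conforming = {
    "url": "https://contratacionesabiertas.osce.gob.pe/api/v1/release/ocds-dgv273-seacev3-2023-51-73-2023-12-06T06:29:04.428973-05:00",
    "date": "2023-04-01T00:00:00-05:00",
    "tag": ["compiled"],
}

# The same object with a "details" key added, as the publisher appears to do.
with_details = dict(conforming, details={"id": "...", "tender": {"id": "892585"}})

print(looks_like_linked_release(conforming))    # three keys, passes the check
print(looks_like_linked_release(with_details))  # four keys, fails the check
```

With "details" present, the object is treated as a full release, and since it has no ocid the compiled release ends up with an id starting with "None".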

So, yes, we should report that they ought to remove details (or use embedded releases if they want such details).

I'll update OCDS Kit and Kingfisher Process to change the test for linked releases.

Do we want to close this issue in Collect then or do we want to remove the details object in peru_osce_bulk and peru_osce_records?

We can report it to the publisher, but, yes, this is not an issue for Collect.