`aleph_emit` fails with data validation error
uhhhuh opened this issue · comments
During the store/aleph_emit
stage ingesting documents causes "Error: Error during data validation".
Some of the affected crawlers:
- documentcloud crawler;
- nextcloud crawler.
Some of the documents failing:
https://s3.documentcloud.org/documents/4387966/Gary-Michael-Lawkowski-Financial-Disclosure.pdf
https://s3.documentcloud.org/documents/4387964/Amanda-E-Kaster-Financial-Disclosure.pdf
https://s3.documentcloud.org/documents/269360/haiku-safety-warning-signs.pdf
https://s3.documentcloud.org/documents/269534/boogaard-police-report.pdf