Fix the doc_id field in scraper_v3
tarunima opened this issue
Due to a bug in the article parser in https://github.com/tattle-made/factchecking-sites-scraper/tree/master/scraper_v3, multiple media items were assigned the same doc_id. Media items that were scraped through scraper_v3 need to be assigned a new doc_id.
We also need to go through the older data scraped using v3 and update its doc_id.
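
A rough sketch of what the backfill over older data could look like, in case it helps scope the work. Everything here is an assumption rather than the scraper's actual schema: the record layout (plain dicts with `doc_id` and `scraper` keys), the `"v3"` marker, the `old_doc_id` field, and the helper name `reassign_duplicate_doc_ids` are all hypothetical. It also assumes each affected media item should get its own fresh id; if items from the same article are meant to share a doc_id, the grouping would need to change.

```python
import uuid
from collections import Counter


def reassign_duplicate_doc_ids(items, scraper_version="v3"):
    """Return copies of `items` with fresh doc_ids for duplicated v3 entries.

    `items` is assumed to be a list of dicts with "doc_id" and "scraper" keys
    (hypothetical field names, not taken from the real schema).
    """
    # Count how often each doc_id occurs among items from the affected scraper.
    counts = Counter(
        item["doc_id"] for item in items if item.get("scraper") == scraper_version
    )

    fixed = []
    for item in items:
        new_item = dict(item)  # copy so the input records are left untouched
        if item.get("scraper") == scraper_version and counts[item["doc_id"]] > 1:
            # Keep the old value so the change can be audited or reverted.
            new_item["old_doc_id"] = item["doc_id"]
            new_item["doc_id"] = uuid.uuid4().hex
        fixed.append(new_item)
    return fixed


sample = [
    {"doc_id": "abc", "scraper": "v3", "url": "media-1"},
    {"doc_id": "abc", "scraper": "v3", "url": "media-2"},
    {"doc_id": "xyz", "scraper": "v3", "url": "media-3"},
    {"doc_id": "abc", "scraper": "v2", "url": "media-4"},
]
result = reassign_duplicate_doc_ids(sample)
```

In this sample, only the first two records are rewritten; the v3 record with a unique id and the v2 record are left alone. Keeping the previous id in `old_doc_id` is a design choice to make the migration reversible, not something the issue asks for.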