tattle-made / factchecking-sites-scraper

A repo to store helper functions for scraping + experiments/visualisations

Fix the doc field in scraper_v3

tarunima opened this issue

Due to a bug in the article parser in https://github.com/tattle-made/factchecking-sites-scraper/tree/master/scraper_v3, multiple media items share the same doc id. A new doc id needs to be assigned to each media item that was scraped through scraper_v3.

We also need to go through older data scraped using v3 and change the doc_id.
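A rough sketch of what the clean-up of older data could look like is below. This is not the repo's code. It assumes each media item is a dict with a `doc_id` key, and the function name `reassign_duplicate_doc_ids`, the `url` field, and the use of UUIDs for new ids are all hypothetical choices made for illustration.

```python
import uuid
from collections import Counter


def reassign_duplicate_doc_ids(items, new_id=lambda: uuid.uuid4().hex):
    """Return copies of `items` in which every doc_id shared by more
    than one item is replaced by a fresh id.

    `items` is assumed to be a list of dicts with a "doc_id" key;
    the real storage format may differ.
    """
    # Count how many media items carry each doc_id.
    counts = Counter(item["doc_id"] for item in items)

    fixed = []
    for item in items:
        updated = dict(item)  # copy, so the input is left untouched
        if counts[item["doc_id"]] > 1:
            # This doc_id is shared, so give this item its own id.
            updated["doc_id"] = new_id()
        fixed.append(updated)
    return fixed


# Hypothetical sample data: two media items share the doc_id "abc".
items = [
    {"doc_id": "abc", "url": "https://example.com/1.jpg"},
    {"doc_id": "abc", "url": "https://example.com/2.jpg"},
    {"doc_id": "xyz", "url": "https://example.com/3.jpg"},
]
fixed = reassign_duplicate_doc_ids(items)
```

Items whose doc_id is already unique are left alone here. If every v3 item should get a new id regardless, the `counts` check could be replaced by whatever marks an item as scraped by v3.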