Jackustc / COVID-19-InstaPostIDs

The repository includes an ongoing collection of Instagram post IDs related to the novel coronavirus disease COVID-19.

COVID19 Instagram Post IDs

The repository includes an ongoing collection of Instagram post IDs related to the novel coronavirus disease COVID-19. The first version of the data collection ran from January 5, 2020 to March 30, 2020. Data gathering is still running because, at the time of writing the paper, lockdowns had not ended in many countries around the world.

[Figure: hashtag word cloud]

We hope that the dataset can support diverse research activities. Below is a subset of the topics we believe it could support:

  • The spread of fake news, misinformation, and rumors.
  • Behavioral change analysis during the pandemic.
  • Information sharing related to COVID-19.
  • etc.

The paper describing this dataset (arXiv): A First Instagram Dataset on COVID-19


Data Collection

We have collected public posts from Instagram by crawling all posts associated with a set of COVID-19 hashtags including #coronavirus, #covid19, #covid_19, and #corona.
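The selection criterion above can be illustrated with a minimal sketch. This is not the authors' crawler: the function names are hypothetical, the hashtag set contains only the four tags named in this README (the full set may be larger), and exact, case-insensitive matching is an assumption.

```python
import re

# Hashtags named in this README; the complete set used for crawling may differ.
COVID_HASHTAGS = {"coronavirus", "covid19", "covid_19", "corona"}

# A hashtag is '#' followed by word characters (letters, digits, underscore).
HASHTAG_RE = re.compile(r"#(\w+)")


def extract_hashtags(caption):
    """Return the lowercased hashtags found in a caption, without the '#'."""
    return {tag.lower() for tag in HASHTAG_RE.findall(caption)}


def matches_covid_hashtags(caption):
    """True if the caption carries at least one of the tracked hashtags."""
    return bool(extract_hashtags(caption) & COVID_HASHTAGS)
```

With exact matching, a caption tagged only with a longer variant such as #coronavirusupdate is not selected; whether the original crawl included such variants is not stated here.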

Release v1.0 (April 20, 2020).

This release covers posts collected between January 5, 2020 and March 30, 2020; data gathering is still running. During this period, 18.5K comments and 329K likes were collected from 5.3K public posts. These posts were published by 2.5K publishers.

Language               Code   # of posts   % of total
English                en     3.1K         58.3%
Spanish                es     530          9.9%
Portuguese             pt     378          7.1%
Italian                it     199          3.7%
French                 fr     120          2.2%
Russian                ru     98           1.8%
Farsi                  fa     96           1.8%
Arabic                 ar     79           1.4%
Turkish                tr     68           1.2%
Other & non-detected   -      643          12.1%
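As a rough consistency check, the per-language shares can be recomputed from the post counts. The sketch below assumes 3,100 English posts, since the table only gives the rounded figure 3.1K, so the recomputed percentages may differ slightly from the published ones.

```python
# Post counts per language code, taken from the table.
# "en" is an assumption: the table reports it only as the rounded value 3.1K.
post_counts = {
    "en": 3100,
    "es": 530,
    "pt": 378,
    "it": 199,
    "fr": 120,
    "ru": 98,
    "fa": 96,
    "ar": 79,
    "tr": 68,
    "other": 643,
}

total_posts = sum(post_counts.values())

# Share of each language as a percentage of all posts.
shares = {code: 100 * count / total_posts for code, count in post_counts.items()}

for code, share in shares.items():
    print(f"{code:>5}: {share:.1f}%")
print(f"total: {total_posts}")
```

Under that assumption the counts sum to 5,311, which agrees with the 5.3K public posts reported for release v1.0.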

Inquiries

For any further questions, please contact Koosha Zarei at koosha.zarei@telecom-sudparis.eu.

Licensing

This dataset is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International Public License (CC BY-NC-SA 4.0) and is published in accordance with Instagram's Terms & Conditions.

By using this dataset, you agree to comply with the conditions of the license and with Instagram's Terms & Conditions, and to cite the following paper:

References

Koosha Zarei, Reza Farahbakhsh, Noel Crespi, and Gareth Tyson. 2020. A First Instagram Dataset on COVID-19. arXiv:2004.12226.

About

License: Creative Commons Zero v1.0 Universal