Instructions:
- Download the dataset from here.
- Extract the `.json` file from the `.zip` file, rename it `arxiv.json` and place it in the `data` directory.
- Use `years_and_categories.ipynb` to generate the visualization of the most active research fields in recent years.
- Use `countries.ipynb` to generate the visualization of the provenance of the papers submitted to arXiv in recent months. This notebook requires you to collect data in advance with the script `countries.py`: this will take a very long time (~2 days). The `data/countries` directory contains the data collected for the period 2020/08 to 2020/10.
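
If you prefer to script the extraction step instead of doing it by hand, the following is a minimal sketch using only the Python standard library. It assumes the downloaded archive contains exactly one `.json` file; the function name `extract_dataset` and its parameters are illustrative and not part of this repository.

```python
import shutil
import zipfile
from pathlib import Path


def extract_dataset(zip_path, data_dir="data", target_name="arxiv.json"):
    """Copy the single .json member of zip_path to data_dir/target_name.

    Hypothetical helper: the archive layout (one .json file) is an assumption.
    """
    data_dir = Path(data_dir)
    data_dir.mkdir(parents=True, exist_ok=True)

    with zipfile.ZipFile(zip_path) as archive:
        # Find the .json member, whatever it is called inside the archive.
        json_members = [
            name for name in archive.namelist() if name.lower().endswith(".json")
        ]
        if len(json_members) != 1:
            raise ValueError(
                f"expected exactly one .json file, found {json_members}"
            )

        # Stream the member to disk so a large file is not loaded into memory.
        target = data_dir / target_name
        with archive.open(json_members[0]) as src, open(target, "wb") as dst:
            shutil.copyfileobj(src, dst)

    return target
```

For example, `extract_dataset("archive.zip")` would write `data/arxiv.json`, where `archive.zip` stands in for whatever name the downloaded file has.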