DuyenDo / data-study-google-scholar

The First Data Study of Google Scholar

[Semester Project] The First Data Study of Google Scholar

Members:

DO Thi Duyen, LE Ta Dang Khoa

Outline

1. Collect data from Google Scholar

Packages:

Scrapy for crawling: http://doc.scrapy.org/en/latest/

$ pip install scrapy

Selenium and Webdriver for JavaScript actions (e.g. click 'Show more'): https://pypi.org/project/selenium/

$ pip install -U selenium

Download a WebDriver binary (chromedriver.exe/geckodriver.exe/...) and put it in google_scholar/libs

For a Linux server: $ sudo apt-get install -y chromium-browser
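As a rough illustration of how the driver in google_scholar/libs could be used, the sketch below locates the binary for the current platform, starts Chrome through Selenium 4, and clicks a button such as 'Show more'. The helper names (`driver_path`, `make_chrome`, `click_show_more`) and the `button_id` argument are hypothetical and not taken from this repository; the element id of the real button has to be looked up on the page.

```python
import sys
from pathlib import Path


def driver_path(libs_dir="google_scholar/libs", name="chromedriver",
                platform=sys.platform):
    """Return the expected location of the WebDriver binary.

    Assumption: the binary keeps its default file name, with an .exe
    suffix on Windows only.
    """
    suffix = ".exe" if platform.startswith("win") else ""
    return Path(libs_dir) / (name + suffix)


def make_chrome(path, headless=True):
    """Start Chrome through Selenium 4 using the driver binary at `path`."""
    # Imported lazily so that driver_path() works without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service

    options = webdriver.ChromeOptions()
    if headless:
        # A server usually has no display, so run without a window.
        options.add_argument("--headless")
    return webdriver.Chrome(service=Service(str(path)), options=options)


def click_show_more(driver, button_id):
    """Click the button with the given id if it is enabled.

    `button_id` is supplied by the caller (hypothetical parameter).
    Returns True if a click was sent, False otherwise.
    """
    from selenium.webdriver.common.by import By

    button = driver.find_element(By.ID, button_id)
    if button.is_enabled():
        button.click()
        return True
    return False
```

A typical call would be `driver = make_chrome(driver_path())`, followed by `click_show_more(driver, ...)` in a loop until it returns False, and finally `driver.quit()`.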

Spiders: google_scholar/spiders

Get the list of papers for the given user URLs: google_scholar/spiders/papers_spider.py

Get the list of papers that cite a given paper: google_scholar/spiders/citations_spider.py
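Both spiders start from URLs, so one way to prepare their requests is sketched below. It is not the code in papers_spider.py or citations_spider.py. The function names are hypothetical, and the query parameters (`user`, `cstart`, `pagesize`, `cites`, `start`) are assumptions based on the URLs Google Scholar shows in the browser, which may change.

```python
from urllib.parse import parse_qs, urlencode, urlparse

SCHOLAR = "https://scholar.google.com"


def user_id_from_url(url):
    """Extract the `user` query parameter from a profile URL."""
    ids = parse_qs(urlparse(url).query).get("user")
    if not ids:
        raise ValueError(f"no 'user' parameter in {url!r}")
    return ids[0]


def profile_page_url(user_id, start=0, page_size=100):
    """Build the URL of one page of a user's paper list (assumed format)."""
    params = {"user": user_id, "cstart": start, "pagesize": page_size}
    return f"{SCHOLAR}/citations?{urlencode(params)}"


def citations_url(cites_id, start=0):
    """Build the URL listing papers that cite `cites_id` (assumed format)."""
    params = {"cites": cites_id, "start": start}
    return f"{SCHOLAR}/scholar?{urlencode(params)}"
```

For example, `profile_page_url(user_id_from_url(url), start=100)` would request the second page of 100 papers for the profile at `url`.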

Run: $ python3 google_scholar/runner.py
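The contents of runner.py are not shown here. A runner script of this kind is commonly built on Scrapy's `CrawlerProcess`, and the following is a minimal sketch under that assumption. `build_settings`, `run`, the output file name, and the delay value are all hypothetical; the spider classes are passed in by the caller.

```python
def build_settings(output_path="papers.jsonl", delay_seconds=5.0):
    """Return Scrapy settings for a slow, polite crawl (illustrative values)."""
    return {
        # Wait between requests to avoid hammering the site.
        "DOWNLOAD_DELAY": delay_seconds,
        "AUTOTHROTTLE_ENABLED": True,
        # Write scraped items as one JSON object per line.
        "FEEDS": {output_path: {"format": "jsonlines"}},
        "LOG_LEVEL": "INFO",
    }


def run(spider_classes, settings=None):
    """Run the given spider classes in a single Scrapy process."""
    # Imported lazily so build_settings() works without Scrapy installed.
    from scrapy.crawler import CrawlerProcess

    process = CrawlerProcess(settings=settings or build_settings())
    for spider_cls in spider_classes:
        process.crawl(spider_cls)
    process.start()  # blocks until every scheduled crawl has finished
```

Running several spiders in one process keeps a single entry point, which matches the single `runner.py` command above; the real script may be organised differently.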

2. Process and analyse
