compscidr / scholar

A parser for Google Scholar, written in Go

scholar

scholar is a work-in-progress Go module that implements a querier and parser for Google Scholar's output. Its types can be used independently, and it can also be invoked as a command-line tool.

This tool is inspired by scholar.py.

Usage

import "github.com/compscidr/scholar"

sch := scholar.New()
articles := sch.QueryProfile("SbUmSEAAAAAJ", 1)

for _, article := range articles {
	// do something with the article
}

Features

Working:

  • Queries and parses a user profile by user ID to get basic publication data
  • Queries each of the listed articles (up to 80) and parses the results for extra information
  • Caches the profile for a day and each article for a week (this behavior still needs to be confirmed)
    • The cache is held in memory, so it is lost when the program restarts
  • Configurable limit on the number of articles queried in one run
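The caching behavior listed above can be sketched as an in-memory map whose entries expire after a fixed time-to-live. This is only an illustration of the idea, not the module's actual implementation: `ttlCache`, `manualClock`, `profileTTL`, and `articleTTL` are hypothetical names, and the cached values are plain strings rather than parsed profiles or articles.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// Lifetimes taken from the feature list: a day for profiles, a week for articles.
const (
	profileTTL = 24 * time.Hour
	articleTTL = 7 * 24 * time.Hour
)

// entry pairs a cached value with the moment it stops being valid.
type entry struct {
	value     string
	expiresAt time.Time
}

// ttlCache is a hypothetical in-memory cache; it is NOT a type from the scholar module.
type ttlCache struct {
	mu    sync.Mutex
	ttl   time.Duration
	now   func() time.Time // injectable clock so expiry can be exercised without waiting
	items map[string]entry
}

func newTTLCache(ttl time.Duration, now func() time.Time) *ttlCache {
	return &ttlCache{ttl: ttl, now: now, items: make(map[string]entry)}
}

// Set stores value under key and stamps it with an expiry time.
func (c *ttlCache) Set(key, value string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.items[key] = entry{value: value, expiresAt: c.now().Add(c.ttl)}
}

// Get returns the value if it is present and not yet expired.
func (c *ttlCache) Get(key string) (string, bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	e, ok := c.items[key]
	if !ok {
		return "", false
	}
	if !c.now().Before(e.expiresAt) {
		delete(c.items, key) // drop stale entries lazily on lookup
		return "", false
	}
	return e.value, true
}

// manualClock is a stand-in clock that only moves when Advance is called.
type manualClock struct{ t time.Time }

func (m *manualClock) Now() time.Time          { return m.t }
func (m *manualClock) Advance(d time.Duration) { m.t = m.t.Add(d) }

func main() {
	clk := &manualClock{}
	profiles := newTTLCache(profileTTL, clk.Now)
	profiles.Set("SbUmSEAAAAAJ", "parsed profile")

	_, hit := profiles.Get("SbUmSEAAAAAJ")
	fmt.Println("fresh entry found:", hit)

	clk.Advance(profileTTL) // one day later the entry has expired
	_, hit = profiles.Get("SbUmSEAAAAAJ")
	fmt.Println("entry found after a day:", hit)
}
```

Passing the clock in as a function is one way to check the "need to confirm this is working" item without waiting a day; in real use `time.Now` would be passed instead.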

TODO:

  • Pagination of articles
  • Add throttling to avoid hitting the rate limit (the actual limit still needs to be determined)
  • Add on-disk caching so that the cache survives a program restart
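For the on-disk caching item, one possible approach is to serialize the cache to a JSON file and reload it at startup. The sketch below assumes a simple map of string values; `record`, `saveCache`, `loadCache`, and `demoPath` are hypothetical names that do not exist in the module, and the file location is arbitrary.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"os"
	"path/filepath"
	"time"
)

// record is a hypothetical cache entry; the fetch time lets a loader discard stale data.
type record struct {
	Value     string    `json:"value"`
	FetchedAt time.Time `json:"fetched_at"`
}

// saveCache writes the cache to a temporary file and renames it into place,
// so an interrupted write does not leave a half-written cache file behind.
func saveCache(path string, cache map[string]record) error {
	data, err := json.MarshalIndent(cache, "", "  ")
	if err != nil {
		return err
	}
	tmp := path + ".tmp"
	if err := os.WriteFile(tmp, data, 0o600); err != nil {
		return err
	}
	return os.Rename(tmp, path)
}

// loadCache reads the cache back. A missing file is treated as an empty cache,
// which is the expected state on the very first run.
func loadCache(path string) (map[string]record, error) {
	cache := make(map[string]record)
	data, err := os.ReadFile(path)
	if errors.Is(err, os.ErrNotExist) {
		return cache, nil
	}
	if err != nil {
		return nil, err
	}
	if err := json.Unmarshal(data, &cache); err != nil {
		return nil, err
	}
	return cache, nil
}

// demoPath returns an arbitrary location in the temp directory for this example.
func demoPath() string {
	return filepath.Join(os.TempDir(), "scholar-cache-demo.json")
}

func main() {
	path := demoPath()
	defer os.Remove(path)

	cache := map[string]record{
		"SbUmSEAAAAAJ": {Value: "parsed profile", FetchedAt: time.Now()},
	}
	if err := saveCache(path, cache); err != nil {
		fmt.Println("save failed:", err)
		return
	}
	restored, err := loadCache(path)
	if err != nil {
		fmt.Println("load failed:", err)
		return
	}
	fmt.Println("entries restored:", len(restored))
}
```

Storing the fetch time alongside each value would let the day and week expiry rules be re-applied after a restart.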

Possible throttle info:

https://stackoverflow.com/questions/60271587/how-long-is-the-error-429-toomanyrequests-cooldown
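Until the real limit is known, throttling could be approximated by enforcing a minimum gap between outgoing requests. The following is a minimal sketch under that assumption; `throttle`, `newThrottle`, and `demoInterval` are hypothetical names, and the interval is an arbitrary placeholder rather than a measured Google Scholar limit.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// demoInterval is a placeholder gap between requests, NOT a known rate limit.
const demoInterval = 100 * time.Millisecond

// throttle enforces a minimum interval between the starts of successive requests.
type throttle struct {
	mu       sync.Mutex
	interval time.Duration
	next     time.Time // earliest time the next request may start
}

func newThrottle(interval time.Duration) *throttle {
	return &throttle{interval: interval}
}

// Wait blocks until the next request is allowed and returns how long it slept.
// The lock is held while sleeping, so concurrent callers are released one at a time.
func (t *throttle) Wait() time.Duration {
	t.mu.Lock()
	defer t.mu.Unlock()

	now := time.Now()
	var waited time.Duration
	if now.Before(t.next) {
		waited = t.next.Sub(now)
		time.Sleep(waited)
		now = t.next
	}
	t.next = now.Add(t.interval)
	return waited
}

func main() {
	th := newThrottle(demoInterval)
	for i := 1; i <= 3; i++ {
		waited := th.Wait()
		// A real caller would issue the HTTP request here.
		fmt.Printf("request %d slept %v before starting\n", i, waited.Round(time.Millisecond))
	}
}
```

A fixed gap like this does not react to an HTTP 429 response; backing off for longer after such a response would be a natural extension once the cooldown behavior is understood.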

License: Apache License 2.0

