amnonkhen / google-patents-scraper

A simple scraper for the Google patents website I wrote as a freelance project

google-patents-scraper

A simple scraper for the Google patents website I wrote as a freelance project. Saves each patent's HTML, images and PDF in a directory.

  1. Requirements
  2. Command line parameters:
  -h, --help            show this help message and exit
  --start START         start patent id (default: None)
  --end END             end patent id (inclusive) (default: None)
  --output_dir OUTPUT_DIR
                        output directory (default: ./)
  --org {EP,US,WO,DE}   prefix of the organization publishing the patent
                        (default: EP)

Example command line, using the default organization prefix (EP) and output directory (./):
python scraper.py --start 234 --end 1872
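
The options listed above could be declared with `argparse` roughly as follows. This is only a sketch reconstructed from the help text, not the actual contents of `scraper.py`: the `int` type for `--start` and `--end`, the description string, and the helper names `build_parser` and `patent_ids` are assumptions, and the `<org><number>` id format is purely hypothetical.

```python
import argparse


def build_parser():
    """Build a parser that mirrors the documented options (a reconstruction)."""
    parser = argparse.ArgumentParser(
        description="Scrape patents from the Google patents website",  # assumed wording
        # This formatter appends "(default: ...)" to each help string,
        # which matches the style of the help output shown above.
        formatter_class=argparse.ArgumentDefaultsHelpFormatter,
    )
    # Assumption: patent ids are parsed as integers.
    parser.add_argument("--start", type=int, help="start patent id")
    parser.add_argument("--end", type=int, help="end patent id (inclusive)")
    parser.add_argument("--output_dir", default="./", help="output directory")
    parser.add_argument(
        "--org",
        choices=["EP", "US", "WO", "DE"],
        default="EP",
        help="prefix of the organization publishing the patent",
    )
    return parser


def patent_ids(start, end, org):
    """Yield ids from start to end, including end itself.

    Hypothetical id format: the organization prefix followed by the number.
    """
    for number in range(start, end + 1):
        yield f"{org}{number}"


# Parse the same arguments as the example command line.
args = build_parser().parse_args(["--start", "234", "--end", "1872"])
ids = list(patent_ids(args.start, args.end, args.org))
print(len(ids), "patents to fetch into", args.output_dir)
```

Because `--end` is inclusive, the range has to run to `end + 1`; forgetting this would silently skip the last patent.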

Languages

Language:Python 100.0%