AnarchoTechNYC / cli-scraper

Simple containerized Web scraping framework.

CLI Scraper

Simple containerized Web scraper framework with a minimal plug-in architecture for quickly creating Web scrapers for various sites of interest.

Mostly intended to support Anarchism.NYC right now.

Using

To use this container, supply at least the name of a scraper as its command argument. Each scraper lives in its own subdirectory of the scrapers directory, and the scraper's name is the name of that subdirectory. For example, to invoke the facebook.com scraper, which scrapes data from Facebook.com:

docker container build -t scraper .       # Build this container and call it `scraper`.
docker container run scraper facebook.com # Invoke the `facebook.com` scraper.
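
As a rough illustration of the plug-in idea described above, the following is a minimal sketch of how a container entrypoint could map a scraper name onto a subdirectory. It is not taken from this repository: the `SCRAPERS_DIR` variable, the `list_scrapers` and `run_scraper` functions, and the per-scraper file name `scrape.sh` are all assumptions made for the example.

```shell
#!/bin/sh
# Hypothetical dispatcher sketch; not the repository's actual entrypoint.
# SCRAPERS_DIR and the per-scraper file name `scrape.sh` are assumptions.

SCRAPERS_DIR="${SCRAPERS_DIR:-./scrapers}"

# Print the names of available scrapers (one subdirectory per scraper).
list_scrapers() {
    for dir in "$SCRAPERS_DIR"/*/; do
        [ -d "$dir" ] || continue   # Skip the unexpanded glob if nothing matches.
        basename "$dir"
    done
}

# Run the named scraper, passing any remaining arguments through to it.
run_scraper() {
    if [ "$#" -lt 1 ]; then
        echo "usage: run_scraper <scraper-name> [args...]" >&2
        return 2
    fi
    name="$1"
    shift
    script="$SCRAPERS_DIR/$name/scrape.sh"
    if [ ! -f "$script" ]; then
        echo "unknown scraper: $name" >&2
        return 1
    fi
    sh "$script" "$@"
}
```

Under these assumptions, an entrypoint script could end with `run_scraper "$@"`, so that the command argument given to `docker container run` selects the scraper, and adding a new scraper would only require adding a new subdirectory.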


Languages

Shell 96.4%, Dockerfile 3.6%