darthbear / scrapy-proxynova

Use scrapy with a list of proxies generated from proxynova.com

scrapy-proxynova

The first run generates the list of proxies from http://proxynova.com and stores it in a cache file.

Each proxy is checked individually, and the ones that time out or cannot be connected to are removed from the list.
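The per-proxy check could be sketched roughly as follows. This is a minimal illustration using only the standard library, not the code in proxies.py; the names proxy_works and filter_working, the test URL, and the 5-second timeout are all assumptions made for the example.

```python
import http.client
import urllib.request


def proxy_works(proxy, test_url="http://example.com", timeout=5.0):
    """Return True if test_url can be fetched through proxy within timeout.

    Hypothetical helper: the test URL and timeout are illustrative defaults.
    """
    handler = urllib.request.ProxyHandler({"http": proxy, "https": proxy})
    opener = urllib.request.build_opener(handler)
    try:
        with opener.open(test_url, timeout=timeout) as response:
            return response.status == 200
    except (OSError, http.client.HTTPException):
        # Timeouts, refused connections and URL errors are OSError subclasses;
        # malformed proxy responses raise HTTPException.
        return False


def filter_working(proxies, check=proxy_works):
    """Keep only the proxies for which check(proxy) is true, preserving order."""
    return [proxy for proxy in proxies if check(proxy)]
```

Passing the check as a parameter keeps the filtering logic testable without network access, for example filter_working(candidates, check=lambda p: p in known_good).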

Example:

./run_example.sh

To regenerate the proxy list, run: python proxies.py
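The generate-once, reuse-afterwards behaviour can be pictured with a small sketch. It assumes a cache file with one proxy URL per line, which the README does not specify; load_proxies, its fetch callback, and the force flag are hypothetical names for illustration only.

```python
from pathlib import Path


def load_proxies(cache_file, fetch, force=False):
    """Return the cached proxy list, calling fetch() only when needed.

    Assumption: the cache holds one proxy URL per line.
    fetch is any callable returning an iterable of proxy URLs.
    """
    path = Path(cache_file)
    if path.exists() and not force:
        # Reuse the cached list, skipping blank lines.
        lines = path.read_text().splitlines()
        return [line.strip() for line in lines if line.strip()]
    # First run, or explicit regeneration: fetch and overwrite the cache.
    proxies = list(fetch())
    path.write_text("\n".join(proxies) + "\n")
    return proxies
```

In this sketch, calling load_proxies(path, fetch, force=True) plays the role of regenerating the list.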

In settings.py, add the following line:

DOWNLOADER_MIDDLEWARES = {'scrapy_proxynova.middleware.HttpProxyMiddleware': 543}

Options

Set these options in settings.py.

  • PROXY_SERVER_LIST_CACHE_FILE: the file in which the proxy list is stored. Default: proxies.txt.
  • PROXY_BYPASS_PERCENT: the probability that a request connects directly instead of going through a proxy.
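To make the bypass option concrete, here is a rough sketch of how a downloader middleware might pick between a direct connection and a random proxy. It is not the package's middleware: the class name RandomProxyMiddleware and its constructor are hypothetical, and it assumes PROXY_BYPASS_PERCENT is on a 0 to 100 scale, which the README does not state. The one Scrapy convention relied on is that a request is routed through the proxy URL stored in request.meta["proxy"].

```python
import random


class RandomProxyMiddleware:
    """Illustrative downloader middleware (hypothetical, not the package's class)."""

    def __init__(self, proxies, bypass_percent=0, rng=None):
        self.proxies = list(proxies)
        # Assumption: a value on a 0-100 scale.
        self.bypass_percent = bypass_percent
        self.rng = rng or random.Random()

    def process_request(self, request, spider):
        # With no proxies, or when the random draw falls under the bypass
        # threshold, leave the request untouched so it connects directly.
        if not self.proxies or self.rng.random() * 100 < self.bypass_percent:
            return None
        # Otherwise route this request through a randomly chosen proxy.
        request.meta["proxy"] = self.rng.choice(self.proxies)
        return None
```

Returning None from process_request tells Scrapy to keep processing the request normally, so the only effect of the sketch is whether the proxy key is set.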


Languages

Python 98.0%, Shell 2.0%