proxy_ip_crawler

## Overview: a crawler for scraping proxy IPs
Supported storage backends: MySQL, SQLite, JSON
Runs on _Python 2.7_
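
To make the storage options concrete, here is a minimal sketch of writing scraped proxies to SQLite and to JSON using only the standard library. The table name `proxy_ip`, its columns, and the helper names `save_sqlite` and `to_json` are assumptions for illustration; the project's real schema and storage code may differ.

```python
import json
import sqlite3


def save_sqlite(conn, proxies):
    """Insert proxy dicts into a (hypothetical) proxy_ip table, skipping duplicates."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS proxy_ip "
        "(ip TEXT, port INTEGER, PRIMARY KEY (ip, port))"
    )
    # INSERT OR IGNORE leaves existing (ip, port) rows untouched.
    conn.executemany(
        "INSERT OR IGNORE INTO proxy_ip (ip, port) VALUES (?, ?)",
        [(p["ip"], p["port"]) for p in proxies],
    )
    conn.commit()


def to_json(proxies):
    """Serialize the same proxy dicts as a JSON array."""
    return json.dumps(proxies, sort_keys=True)
```

With this layout, the same list of `{"ip": ..., "port": ...}` dicts can be handed to either backend, so switching storage would only change which helper is called.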

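### A. Crawling with __scrapy__
Required modules: scrapy, requests, lxml, pybloom. Optional module: mysql
Runtime configuration: setting.py
To store results in _MySQL_, first run the SQL file proxy_ip.sql, then set the connection parameters MYSQL_CONNECT in setting.py.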

  python launchScrapy.py

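### B. Crawling with __requests__
Adds more sites to crawl and reduces the required dependencies.
Required modules: requests, lxml, and either redis or pybloom. Optional module: mysql
Runtime configuration: simple_crawler_config.py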

  python launchSimpleCrawler.py
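
The choice between redis and pybloom suggests that either one acts as a "seen" set for de-duplicating scraped entries, although this README does not say so; treat that as an assumption. The sketch below illustrates the idea with a tiny pure-Python Bloom filter. `TinyBloom` and `dedupe` are hypothetical names and are not the pybloom or redis APIs.

```python
import hashlib


class TinyBloom(object):
    """Hypothetical stand-in for a Bloom filter; not the pybloom API."""

    def __init__(self, size_bits=8192, num_hashes=4):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits // 8 + 1)

    def _positions(self, item):
        # Derive num_hashes bit positions by salting a SHA-1 digest with the index.
        for i in range(self.num_hashes):
            data = ("%d:%s" % (i, item)).encode("utf-8")
            yield int(hashlib.sha1(data).hexdigest(), 16) % self.size_bits

    def add(self, item):
        """Record item; return True if it was (probably) seen before."""
        seen = True
        for pos in self._positions(item):
            byte, bit = divmod(pos, 8)
            if not self.bits[byte] & (1 << bit):
                seen = False
                self.bits[byte] |= 1 << bit
        return seen


def dedupe(proxies, seen=None):
    """Yield each "ip:port" string the first time it appears."""
    seen = seen if seen is not None else TinyBloom()
    for proxy in proxies:
        if not seen.add(proxy):
            yield proxy
```

A Bloom filter trades a small false-positive rate (an unseen proxy occasionally reported as seen) for constant memory, which is why it is a common in-process alternative to an external store such as redis.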

About

A simple crawler that crawls and checks proxy IPs.
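
As a rough illustration of what checking a proxy can involve, the sketch below sends one request through a candidate proxy and reports whether it answered with HTTP 200. It uses the standard library's urllib rather than requests so that it runs without extra dependencies. The function names, the default `test_url`, and the timeout are assumptions and do not reflect the project's actual checker.

```python
try:
    from urllib.request import ProxyHandler, build_opener  # Python 3
except ImportError:
    from urllib2 import ProxyHandler, build_opener  # Python 2.7


def proxy_url(ip, port, scheme="http"):
    """Format a proxy address as a URL, e.g. http://1.2.3.4:8080."""
    return "%s://%s:%d" % (scheme, ip, port)


def check_proxy(ip, port, test_url="http://example.com/", timeout=5):
    """Return True if test_url is reachable through the proxy with status 200."""
    url = proxy_url(ip, port)
    opener = build_opener(ProxyHandler({"http": url, "https": url}))
    try:
        response = opener.open(test_url, timeout=timeout)
        return response.getcode() == 200
    except Exception:
        # Any failure (refused, timeout, bad response) marks the proxy as unusable.
        return False
```

A real checker would likely also record latency and verify that the response body is the expected page, since some proxies return 200 with injected content.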
