1st-PyCrawlerMarathon

Day	Contents	Remarks
day 001	download file, file I/O
day 002	csv file handling
day 003	xml file handling
day 004	API	POKE
day 005	API + JSON
day 006	Headers
day 007
day 008	Static Webpage Crawling
day 009	download images
day 010	Packages: PyQuery/grab
day 011	Regular Expression
day 012	Ex. ETtoday
day 013	Ex. PTT
day 014	Ex. Yahoo! movie
day 015	Ex. Bank of Taiwan
day 016	Ex. Wiki	recursive crawling
day 017
day 018	about "headers"...
day 019	Ex. ETtoday	Selenium + BeautifulSoup
day 020	API operation
day 021	Ex. ETtoday	Active Web Pages
day 022	Ex. Air Quality Website
day 023	Ex. ETtoday.net	Get external website content
day 024	Ex. 104 HR
day 025	Scrapy Intro.	no HW
day 026	Scrapy: Request
day 027	Scrapy: XPath + Item Pipeline
day 028	Scrapy: API
day 029	Scrapy: multiple webpages
day 030	some challenges
day 031	headers
day 032	captcha
day 033	login
day 034	proxy IP
day 035	multithreading
day 036	asynchronous
day 037	scheduled
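
As a rough illustration of the kind of exercise listed for static webpage crawling (day 008) and recursive crawling (day 016), here is a minimal sketch that collects links from an HTML page. It is not taken from the course notebooks: the `LinkCollector` class, the `extract_links` helper, the sample HTML, and the example.com URLs are all hypothetical, and only the Python standard library is used so that it runs without network access.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect absolute URLs from <a href="..."> tags (hypothetical helper)."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative paths against the URL of the page.
                self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url):
    """Return every link found in `html`, resolved against `base_url`."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    return parser.links


# Inline sample page (an assumption for illustration, not a real site).
SAMPLE_HTML = '<p><a href="/news/1">One</a> <a href="https://example.org/x">Two</a></p>'
print(extract_links(SAMPLE_HTML, "https://example.com/index.html"))
```

In a real crawler the HTML would come from an HTTP response, and a recursive crawl would feed the returned links back into a queue while tracking which pages have already been visited.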
