WPCrawler

A web crawler for a single WordPress site.

Open source libraries used:

Apache HttpComponents 4.3

HTML Parser 2.0

MySQL Connector/J 5.1.27

Requires a local XAMPP environment.

The program assumes that a database named "crawler" exists on your local MySQL server at XAMPP's default address, localhost:3306.

UTF-8 encoding is used so that Chinese tags are recorded correctly.

The next update will add analysis of the tags used by each article.

You can read about how it works in detail on my blog:

My blog was set up recently and may still be unstable. If you have any trouble accessing it, please let me know and I will try to fix it. :)
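
The database setup described above (a "crawler" database on localhost:3306, with UTF-8 for Chinese tags) could be wired up roughly as in the sketch below. This is not code from WPCrawler itself: the class name `CrawlerDb`, its methods, and the `root` user with an empty password (the usual XAMPP default) are assumptions made for illustration. `useUnicode` and `characterEncoding` are standard Connector/J connection properties.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

// Hypothetical helper, not taken from the WPCrawler source.
class CrawlerDb {

    // Builds a Connector/J URL that asks the driver to use UTF-8,
    // so that Chinese tag names are sent to MySQL without being mangled.
    static String jdbcUrl(String host, int port, String database) {
        return "jdbc:mysql://" + host + ":" + port + "/" + database
                + "?useUnicode=true&characterEncoding=UTF-8";
    }

    // Opens a connection to the "crawler" database on localhost:3306.
    // User "root" with an empty password is an assumption (XAMPP default);
    // MySQL Connector/J must be on the classpath for this call to succeed.
    static Connection open() throws SQLException {
        return DriverManager.getConnection(
                jdbcUrl("localhost", 3306, "crawler"), "root", "");
    }
}
```

The tables that store tags would also need a UTF-8 character set on the MySQL side; the connection settings alone do not change how existing columns are stored.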

xiaodin1 / WPCrawler