grangier / python-goose

HTML Content / Article Extractor, web scraping lib in Python

Cannot get the image, or even the text, from a Chinese page

SheldonWang3000 opened this issue

I wrote my code just like the sample, but I cannot get the image or the text from some pages. Is that a bug, or do I need some other configuration?
I have already configured StopWordsChinese.
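
For reference, the setup described above might look roughly like the sketch below. It assumes the `stopwords_class` option and the `StopWordsChinese` class shown in the python-goose README. The helper names `summarize` and `extract_chinese_article` are hypothetical and exist only for illustration; this is not necessarily the exact code that was run.

```python
def summarize(article):
    """Collect the extracted text and top-image URL (or None) from a Goose article."""
    image = getattr(article, "top_image", None)
    return {
        "text": article.cleaned_text or "",
        "image_src": image.src if image is not None else None,
    }


def extract_chinese_article(url):
    """Hypothetical helper: extract a Chinese page using StopWordsChinese."""
    # Imported lazily so this module still loads where goose is not installed.
    from goose import Goose
    from goose.text import StopWordsChinese

    g = Goose({"stopwords_class": StopWordsChinese})
    return summarize(g.extract(url=url))
```

Calling `extract_chinese_article` on a failing URL and checking whether `text` is empty and `image_src` is `None` would be one way to confirm the behaviour reported here.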

Please post the test URL.

http://wz.sun0769.com/html/question/201506/278531.shtml
For example, the page above.
However, I found that some Chinese pages can be extracted while others cannot.