karussell / snacktory

Readability clone in Java

Text content is removed when a news webpage contains an image.

avi20072008 opened this issue

Hi,

I have tried snacktory, and it works well on webpages that do not contain images. When I tried it on a newspaper site, I found that whenever there is an image, snacktory removes the text block close to the image.

Try this URL: http://articles.timesofindia.indiatimes.com/2013-09-17/rest-of-world/42147651_1_tropical-depression-mexico-city-heavy-rains

Would be nice if you could dig into it and provide a fix via pull request :) !