aaronsw / html2text

Convert HTML to Markdown-formatted text.

Home Page: http://www.aaronsw.com/2002/html2text/

Please enable cookies

m0o0scar opened this issue and commented:

When I run html2text on https://davidwalsh.name/2016s-most-important-web-apps-tools, the following error shows up:

Please enable cookies.

Error 1010 Ray ID: 272e8783e7f122e2 • 2016-02-11 08:02:03 UTC

Access denied

What happened?

The owner of this website (davidwalsh.name) has banned your access based on
your browser's signature (272e8783e7f122e2-ua48).

CloudFlare Ray ID: 272e8783e7f122e2 • Your IP: xxxx •
Performance & security by CloudFlare

Did you open that page in a regular browser? My Firefox gives me a "Page not found" error for that URL. Besides, downloading HTML directly from the web is something of a hack in html2text, and it certainly doesn't offer the same functionality as a full browser. It is usually better to download HTML pages to the disk and process them from there.

@moscartong The html2text feature for reading from a URL is just a tiny helper and should not be considered a reliable method for situations like this.

As @mcepl said:

It is usually better to download HTML pages to the disk and process them from there.
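
For anyone landing here, a minimal sketch of that "download first, convert second" workflow might look like the following. The helper names (`build_request`, `download_html`, `convert_file`) and the `USER_AGENT` string are illustrative assumptions, not part of html2text. A browser-like User-Agent is also only a guess at what the site's CloudFlare rule objects to, so it may not get past the block.

```python
import urllib.request
from pathlib import Path

# Assumption: an illustrative browser-like User-Agent string. The site may
# still refuse the request; this is not a guaranteed workaround.
USER_AGENT = "Mozilla/5.0 (X11; Linux x86_64; rv:115.0) Gecko/20100101 Firefox/115.0"


def build_request(url):
    """Build a request that carries an explicit User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": USER_AGENT})


def download_html(url, dest):
    """Fetch `url` and save the raw bytes to `dest` on disk."""
    dest = Path(dest)
    with urllib.request.urlopen(build_request(url), timeout=30) as resp:
        dest.write_bytes(resp.read())
    return dest


def convert_file(path, encoding="utf-8"):
    """Read a saved HTML file and convert it to Markdown-formatted text."""
    import html2text  # third-party; imported here so downloading works without it

    html = Path(path).read_text(encoding=encoding, errors="replace")
    return html2text.html2text(html)
```

Calling `download_html(url, "page.html")` and then `convert_file("page.html")` keeps the fetch separate from the conversion, so a page saved by any other means (a browser's "Save Page As", `curl`, `wget`) can be passed to `convert_file` in the same way.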