IonicaBizau / scrape-it

🔮 A Node.js scraper for humans.

Home Page: http://ionicabizau.net/blog/30-how-to-write-a-web-scraper-in-node-js

No space between tags

topikuning opened this issue

I don't know whether this is an issue or not, but I'm seeing strange behaviour.
Let's say I have the data <div class="content"><h3>some title</h3> <p>some content of it </p></div>. If I select ".content" when scraping, the result is: some titlesome content of it. This is quite annoying, since normally we would want it to be:
some title some content of it
or
some title \r\n some content of it
or at least have some separation between these elements.

If the original HTML has a space between the tags, that space should be preserved. I suspect there is no space in the original HTML, in which case the behaviour is correct.
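
For anyone who wants a separator regardless of the whitespace in the source, one possible workaround is to start from the element's HTML instead of its text and replace each tag with a space. The following is only a minimal sketch: textWithSpaces is a hypothetical helper, not part of scrape-it, and it uses a plain regular expression rather than a real HTML parser, so it does not decode entities and it will also split text around inline tags such as <b>.

```javascript
// Hypothetical helper (not part of scrape-it): convert an HTML fragment to
// text, leaving a single space wherever a tag used to be.
function textWithSpaces(html) {
  return html
    .replace(/<[^>]*>/g, " ") // replace every tag with a space
    .replace(/\s+/g, " ")     // collapse runs of whitespace into one space
    .trim();                  // drop leading and trailing whitespace
}

// The markup from the report, with no space between </h3> and <p>.
const html =
  '<div class="content"><h3>some title</h3><p>some content of it </p></div>';

console.log(textWithSpaces(html)); // some title some content of it
```

With scrape-it, something like this could presumably be wired in by reading the element's HTML (how: "html") and passing the helper as the convert function. That wiring is an assumption and should be checked against the scrape-it README.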