ZTF666 / web-scraper

A small page scraper. NO DYNAMIC SCRAPING, though :tired_face:


💩Scrapy💩

A small page scraper, still a WIP. No dynamic scraping. This script uses:

Cheerio Javascript
Axios

How to use

  • Install and run
npm install
npm run scrapy
  • Replace the website URL with your own
axios.get("https://chouftv.ma/press");
  • Replace the selectors with the elements you want to scrape
$(".description").each((index, element) => {
  const title = $(element).children().first().text();
  const links = $(element).children("a").attr("href");
});

Screenshot

It looks weird because I used it on a local news website.
  • Limitations

    This is a shitty scraper, I'm still learning.

    It doesn't scrape links that haven't been loaded yet.

    Screenshot

In the screenshot above, the button literally translates to: LOAD MORE

Since I suck at this, I can't make it load more so I can grab the links.

So it only grabs the latest news articles.

That's a blessing and a curse, because if clicked, it will load EVERY ARTICLE WRITTEN since the deployment of the website...

Contact

You can contact me at ZTF666@protonmail.ch

License

💩Scrapy💩 is released under the MIT License.

Made with 💘 by a 👨‍💻 on a 💻 | 2020 | ZTF666 - N.EA


