tahmim-16 / BlogCrawler

Scraping Amrabondhu blog

Here I scrape the blog content of the Amrabondhu website using Scrapy.

Each scraped item includes:

  1. ID (just to number the contents)
  2. Title
  3. Author
  4. Text, i.e. the whole article
  5. URL of the page
  6. Published date of the article
  7. Accessed time
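
As an illustration of the fields listed above, here is a minimal sketch of one scraped record as a Python dataclass (Scrapy 2.2 and later accept dataclass objects as items). The class name `BlogPost`, the helper `make_post`, and the choice to store dates as strings are assumptions made for this example, not taken from the repository.

```python
from dataclasses import dataclass, asdict
from datetime import datetime, timezone
from typing import Optional


@dataclass
class BlogPost:
    """One scraped article; field order follows the list above."""
    id: int          # running number, just to count the contents
    title: str
    author: str
    text: str        # the whole article
    url: str
    published: str   # kept as the scraped string; the site's date format is not assumed
    accessed: str    # ISO 8601 timestamp of when the page was fetched


def make_post(post_id: int, title: str, author: str, text: str, url: str,
              published: str, accessed: Optional[str] = None) -> BlogPost:
    """Build a BlogPost, trimming whitespace and defaulting `accessed` to now (UTC)."""
    if accessed is None:
        accessed = datetime.now(timezone.utc).isoformat()
    return BlogPost(post_id, title.strip(), author.strip(), text.strip(),
                    url, published.strip(), accessed)
```

Calling `asdict(post)` on the result gives a plain dictionary that can be written to JSON or CSV.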

It also saves every HTML page it visits and follows the "next page" links, so all subsequent pages are parsed as well.

Requirements:

  1. PyCharm IDE
  2. Python 3.10
  3. Scrapy 2.4.1

These are just the tools I used to write and run the script.

Languages

Python 100.0%