mahendra-shah / news_scrapy

Extracts data from the news articles.

News Scrap

MIT license.


Description

This project uses Python and the Scrapy framework to scrape news websites for the latest articles and updates. It extracts data from each article and stores it in a CSV file for further analysis.
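One way the "store in a CSV file" step could be done is with a Scrapy item pipeline. The sketch below is an illustration only, not the project's actual `pipelines.py`: the class name `CsvExportPipeline`, the output file name `news.csv`, and the column names are all assumptions. It relies only on the standard library, since Scrapy pipelines are plain classes that expose `open_spider`, `process_item`, and `close_spider`.

```python
import csv


class CsvExportPipeline:
    """Hypothetical item pipeline that writes each scraped item to a CSV file."""

    # Assumed column names; the project's real item fields are not documented.
    fieldnames = ["title", "url", "published"]

    def __init__(self, path="news.csv"):
        # Assumed output file name.
        self.path = path

    def open_spider(self, spider):
        # Scrapy calls this once when the spider starts.
        self.file = open(self.path, "w", newline="", encoding="utf-8")
        self.writer = csv.DictWriter(
            self.file,
            fieldnames=self.fieldnames,
            extrasaction="ignore",  # silently skip fields not listed above
        )
        self.writer.writeheader()

    def process_item(self, item, spider):
        # Scrapy calls this for every item the spider yields.
        self.writer.writerow(dict(item))
        return item

    def close_spider(self, spider):
        # Scrapy calls this once when the spider finishes.
        self.file.close()
```

A pipeline like this would be enabled through the `ITEM_PIPELINES` setting in `settings.py`.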

Project Structure

    ├── spiders                  # Contains spiders
    │   └── news.py              # Contains main logic of extracting data
    ├── LICENSE
    ├── README.md                # Documentation
    ├── items.py
    ├── middlewares.py
    ├── pipelines.py
    ├── requirements.txt
    └── settings.py              # Configuration file for the Scrapy project.
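For readers new to Scrapy, `settings.py` is an ordinary Python module of uppercase variables. The excerpt below is a hypothetical sketch of what such a file might contain. Every value is an assumption rather than the project's real configuration, though the setting names themselves (`ROBOTSTXT_OBEY`, `DOWNLOAD_DELAY`, `FEEDS`) are standard Scrapy settings.

```python
# Hypothetical settings.py excerpt; all values are assumptions.

BOT_NAME = "news_scrapy"  # assumed bot name

# Respect each site's robots.txt rules.
ROBOTSTXT_OBEY = True

# Wait this many seconds between requests to the same site.
DOWNLOAD_DELAY = 1.0

# Export scraped items to a CSV file (assumed file name).
FEEDS = {
    "news.csv": {"format": "csv", "overwrite": True},
}
```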

Installation

  1. Clone the repository:
      git clone https://github.com/mahendra-shah/news_scrapy.git

  2. Install the required dependencies:
      pip install -r requirements.txt

Usage

  • Navigate to the project directory.
  • Run the following command to start the news scraping process:
     scrapy crawl news
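Once a crawl has finished, the resulting CSV can be inspected with the standard library. The helper below is a minimal sketch of such "further analysis": the function name `count_by_column` is hypothetical, and the file and column names passed to it depend on what the spider actually writes.

```python
import csv
from collections import Counter


def count_by_column(path, column):
    """Count how many rows of a CSV file share each value of `column`."""
    with open(path, newline="", encoding="utf-8") as f:
        return Counter(row[column] for row in csv.DictReader(f))
```

For example, `count_by_column("news.csv", "source")` would tally articles per source, assuming the output file is called `news.csv` and has a `source` column.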

Contact

For any questions or feedback, feel free to contact the project owner at mahendra21@navgurukul.org.
