Santhin / TorScrapy

Simple crawler made with scrapy and TorIpChanger package

Simple app for scraping data from Gumtree.

🧐 About

The project was created for learning purposes, to explore how to combine the Scrapy framework with TorIpChanger.
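
As a rough illustration of how the two pieces can fit together, the sketch below shows a Scrapy-style downloader middleware that sends every request through a local HTTP proxy and asks an IP changer for a new Tor exit IP after a fixed number of requests. It is not taken from this repository's middlewares.py: the class name, the proxy address 127.0.0.1:8118 and the rotate_every parameter are assumptions, and the IP changer is replaced by a stand-in object so the snippet runs without Scrapy or Tor installed.

```python
class RotatingProxyMiddleware:
    """Scrapy-style downloader middleware (hypothetical name).

    Routes each request through a local HTTP proxy and asks the IP changer
    for a fresh Tor exit IP once `rotate_every` requests have used the
    current one.
    """

    def __init__(self, ip_changer, proxy_url="http://127.0.0.1:8118", rotate_every=10):
        self.ip_changer = ip_changer      # any object with a get_new_ip() method
        self.proxy_url = proxy_url        # assumed address of the local HTTP proxy
        self.rotate_every = rotate_every  # requests allowed per exit IP
        self._count = 0                   # requests sent on the current IP

    def process_request(self, request, spider):
        if self._count >= self.rotate_every:
            self.ip_changer.get_new_ip()
            self._count = 0
        self._count += 1
        # Scrapy's built-in HttpProxyMiddleware honours this meta key.
        request.meta["proxy"] = self.proxy_url
        return None  # None tells Scrapy to keep processing the request


class FakeIpChanger:
    """Stand-in for a real IP changer; only counts rotation calls."""

    def __init__(self):
        self.calls = 0

    def get_new_ip(self):
        self.calls += 1
        return "203.0.113.%d" % self.calls  # documentation-range address


class FakeRequest:
    """Stand-in for scrapy.Request; only carries the meta dict."""

    def __init__(self):
        self.meta = {}
```

In a real project the stand-in would presumably be replaced by a TorIpChanger instance and the middleware enabled through DOWNLOADER_MIDDLEWARES in settings.py. Scrapy builds middlewares itself, so a from_crawler classmethod would likely be needed to pass the changer in.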

🏁 Getting Started

Prerequisites

  • Docker Desktop

Project structure

.
├── docker-compose.yml
├── LICENSE
├── README.md
└── src
    ├── crawler
    │   ├── __init__.py
    │   ├── items.py
    │   ├── middlewares.py
    │   ├── pipelines.py
    │   ├── settings.py
    │   └── spiders
    │       ├── __init__.py
    │       ├── mieszkania2.py
    │       └── quotes_spider.py
    ├── Dockerfile
    ├── go_spider.py
    ├── scrapy.cfg
    └── tests
        └── ipchanger_works.py

Installing

Clone repository:

git clone https://github.com/Santhin/TorScrapy.git

To run the crawler, type:

docker-compose up

🔧 Running the tests

To check that the Tor IP changer is working, uncomment the commented-out test in the Dockerfile.
Example output:

(image: Project logo)
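
The contents of tests/ipchanger_works.py are not shown here, so the following is only a guess at the kind of check such a test might perform: request a new IP several times and confirm that no address repeats. The function names are hypothetical, and a stand-in that hands out fixed documentation-range addresses replaces the real IP changer so the snippet runs without Tor.

```python
def exit_ip_changes(get_new_ip, attempts=3):
    """Return True if `attempts` consecutive calls all return different IPs."""
    seen = [get_new_ip() for _ in range(attempts)]
    return len(set(seen)) == len(seen)


def make_fake_changer(ips):
    """Stand-in for a real IP changer: hands out the given IPs in order."""
    iterator = iter(ips)
    return lambda: next(iterator)
```

Against a live setup, the callable passed in would presumably be the get_new_ip method of a configured TorIpChanger instance, assuming it returns the new address.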

🛠️ Todo

  • add control startup for TorIpChanger container in docker-compose
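
One possible way to approach this item is to have the crawler wait until the Tor container accepts TCP connections before it starts. The sketch below is an assumption about how that could look, not something the repository contains; the function name, the timeout values, and the host name in the usage note are all hypothetical.

```python
import socket
import time


def wait_for_port(host, port, timeout=30.0, interval=0.5):
    """Poll until a TCP connection to host:port succeeds.

    Returns True once a connection is accepted, or False if `timeout`
    seconds pass without a successful connection.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # Open and immediately close a connection as a liveness probe.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

A start-up script such as go_spider.py could call something like wait_for_port("toripchanger", 9051) before launching the spider, where "toripchanger" is a hypothetical docker-compose service name and 9051 is Tor's conventional control port.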

⛏️ Built Using

🎉 Acknowledgements

License: MIT License


Languages

Python 96.9%, Dockerfile 3.1%