darixsamani / pdfdrive

I'm building this project to enhance my Python skills after a long time without coding.

Home Page: https://hub.docker.com/r/darixsamani/pdfdrive

pdfdrive

I built this project to enhance my Python skills after a long time without coding.

What does it do

It's a web scraper that collects information from the pdfdrive.com site and saves it to a file and to a MongoDB database.
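
To illustrate the "save to a file and to MongoDB" step, here is a minimal sketch shaped like a Scrapy item pipeline. It is not the project's actual code: the class name `FileAndMongoPipeline`, the JSON Lines output format, and the constructor arguments are all assumptions made for illustration. The `collection` argument can be any object with an `insert_one(dict)` method, such as a pymongo collection.

```python
import json
from pathlib import Path


class FileAndMongoPipeline:
    """Hypothetical pipeline: writes each scraped item to a JSON Lines
    file and to a MongoDB-like collection."""

    def __init__(self, path, collection):
        self.path = Path(path)
        # Assumption: `collection` exposes insert_one(dict), as a
        # pymongo Collection does.
        self.collection = collection

    def process_item(self, item, spider=None):
        record = dict(item)
        # Append one JSON object per line to the output file.
        with self.path.open("a", encoding="utf-8") as fh:
            fh.write(json.dumps(record, ensure_ascii=False) + "\n")
        # Insert a copy, because pymongo adds an "_id" key to the dict
        # it is given.
        self.collection.insert_one(dict(record))
        return item
```

Keeping the collection as a constructor argument means the same class can be exercised with a small stand-in object, without a running database.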

How to install

  1. Install requirements
pip3 install poetry

  2. Launch the spider (first edit .env to set your MongoDB URI and Redis settings)
poetry install && cd pdfdrive && poetry run scrapy crawl pdfdrive

Run with docker

docker pull darixsamani/pdfdrive
docker run -it -e MONGO_URI="mongodb://localhost" -e MONGO_DATABASE="pdfdrive" -e REDIS_HOST="localhost" -e REDIS_PORT=6379 -e REDIS_PASSWORD="" darixsamani/pdfdrive
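
The environment variables passed to the container above could be read on the Python side roughly as follows. This is a sketch under assumptions, not the project's real settings code: the `Settings` class and `load_settings` function are hypothetical names, and the fallback defaults simply reuse the example values from the `docker run` command.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the scraper's connection settings."""
    mongo_uri: str
    mongo_database: str
    redis_host: str
    redis_port: int
    redis_password: str


def load_settings(env=None):
    """Build Settings from a mapping; uses os.environ when none is given."""
    env = os.environ if env is None else env
    return Settings(
        # Defaults below are assumptions taken from the example command.
        mongo_uri=env.get("MONGO_URI", "mongodb://localhost"),
        mongo_database=env.get("MONGO_DATABASE", "pdfdrive"),
        redis_host=env.get("REDIS_HOST", "localhost"),
        # Environment values are strings, so the port is converted to int.
        redis_port=int(env.get("REDIS_PORT", "6379")),
        redis_password=env.get("REDIS_PASSWORD", ""),
    )
```

Calling `load_settings()` with no argument reads the process environment, which is where `docker run -e` places these values.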

MongoDB Screen

Mongo image

Languages

Python 98.3%, Dockerfile 1.7%