- Docker for containerization, including Chrome and chromedriver
- Flask with blueprints for better code organization
- Pydantic for models and input validation
- Selenium Wire and Stealth to avoid being blocked by websites
- Mypy for type checking
- Flake8 for linting
- Pytest for unit testing
- Black for PEP 8 auto-formatting
- Clean code
- Makefile for common tasks
- Retry in case of any error
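The retry behaviour listed above could look roughly like the decorator below. This is a minimal sketch, not the project's actual implementation: the names `retry` and `flaky_fetch`, the attempt count, and the delay are all assumptions made for illustration.

```python
import functools
import time


def retry(attempts=3, delay=0.0, exceptions=(Exception,)):
    """Retry the decorated function up to `attempts` times on the listed exceptions."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, attempts + 1):
                try:
                    return func(*args, **kwargs)
                except exceptions:
                    if attempt == attempts:
                        raise  # out of attempts: surface the last error
                    time.sleep(delay)

        return wrapper

    return decorator


calls = []


@retry(attempts=3)
def flaky_fetch():
    """Hypothetical crawl step that fails twice before succeeding."""
    calls.append(1)
    if len(calls) < 3:
        raise RuntimeError("temporary failure")
    return "ok"


result = flaky_fetch()
```

Catching `Exception` matches the "any error" wording; a narrower tuple of exceptions would avoid retrying on programming mistakes.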
Using docker-compose:
docker-compose up
Without docker-compose:
Build
DOCKER_BUILDKIT=0 docker build --tag crawler:latest .
Run
docker run -p 3000:3000 -it crawler:latest
To run locally, Google Chrome and chromedriver are required in the PATH.
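A quick way to check the PATH requirement before starting is sketched below. The executable names `google-chrome` and `chromedriver` are assumptions; the Chrome binary is named differently on some platforms, so adjust the list as needed.

```python
import shutil


def missing_binaries(names):
    """Return the names that cannot be found on the PATH."""
    return [name for name in names if shutil.which(name) is None]


# Assumed executable names, not taken from the project.
REQUIRED = ["google-chrome", "chromedriver"]

missing = missing_binaries(REQUIRED)
if missing:
    print("Missing from PATH:", ", ".join(missing))
else:
    print("Chrome and chromedriver found")
```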
First, create a new virtual env:
python3 -m venv venv
source venv/bin/activate
Then, install the dependencies:
pip install -r requirements/development.txt
Run with Makefile:
make run
make run will run lint, mypy, tests, and format.
I tried to solve the reCAPTCHA using CapMonster but had no luck: every time a captcha was solved, the website showed another one, again and again.
And tests, more tests!
I am using HTTPie because it is simpler than cURL for sending JSON.
Getting the name and surname of a user without OTP:
http http://localhost:3000/v1/ username=username password=password secret_answer=secret
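For readers without HTTPie, the same call can be sketched with Python's standard library. HTTPie sends `key=value` items as a JSON body in a POST request, which is what this mirrors. The helper name `build_request` is hypothetical, and the response format is not assumed here.

```python
import json
import urllib.request


def build_request(url, **fields):
    """Build a JSON POST request, as HTTPie does for key=value items."""
    body = json.dumps(fields).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_request(
    "http://localhost:3000/v1/",
    username="username",
    password="password",
    secret_answer="secret",
)

# Sending requires the service to be running on port 3000:
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```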
Getting the same information for a user with OTP:
http http://localhost:3000/v1/ username=username password=password authenticator_secret_key="AAAA BBBB CCCC DDD"
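How the service uses `authenticator_secret_key` is not described here. Assuming it is a standard base32 authenticator secret used to derive a time-based one-time password (RFC 6238, HMAC-SHA1), the code generation could be sketched as follows. The function name `totp` and its defaults (6 digits, 30-second period) are assumptions, not the project's code.

```python
import base64
import hashlib
import hmac
import struct
import time


def totp(secret_key, timestamp=None, digits=6, period=30):
    """Compute an RFC 6238 TOTP code (HMAC-SHA1) from a base32 secret."""
    # Authenticator keys are often displayed in space-separated groups.
    cleaned = secret_key.replace(" ", "").upper()
    cleaned += "=" * (-len(cleaned) % 8)  # restore base32 padding
    key = base64.b32decode(cleaned)

    if timestamp is None:
        timestamp = time.time()
    counter = int(timestamp) // period
    digest = hmac.new(key, struct.pack(">Q", counter), hashlib.sha1).digest()

    # Dynamic truncation as defined in RFC 4226
    offset = digest[-1] & 0x0F
    number = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(number % 10 ** digits).zfill(digits)


# RFC 6238 test secret ("12345678901234567890" in base32), not a real key.
code = totp("GEZD GNBV GY3T QOJQ GEZD GNBV GY3T QOJQ", timestamp=59)
```

Passing an explicit `timestamp` makes the function easy to check against the published RFC test vectors; omit it to get the code for the current time.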