Github Scrapper Data

This is an example application that consumes the GitHub API. The project is split into two components: the API and the Scrapper.

API: responsible for exposing a query service.

Scrapper: responsible for consuming the GitHub API, then downloading and saving GitHub repository and user data.
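To illustrate the Scrapper's role, here is a minimal sketch of how an authenticated request to the GitHub REST API could be built from the GITHUB_TOKEN variable mentioned in the Quick Start. It is not taken from this project's code: the helper name build_github_request is hypothetical, and sending the token in an "Authorization: token ..." header is an assumption about how the Scrapper authenticates.

```python
import os
from urllib.request import Request

GITHUB_API = "https://api.github.com"


def build_github_request(path, token=None):
    """Build (but do not send) a GET request for a GitHub REST API path.

    Hypothetical helper: the token falls back to the GITHUB_TOKEN
    environment variable when it is not passed explicitly.
    """
    token = token or os.environ.get("GITHUB_TOKEN", "")
    headers = {"Accept": "application/vnd.github+json"}
    if token:
        # Personal access tokens can be sent in the Authorization header.
        headers["Authorization"] = f"token {token}"
    return Request(f"{GITHUB_API}{path}", headers=headers)
```

Passing the returned object to urllib.request.urlopen would perform the call, for example with the path "/users" to list users.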

Quick Start

1. Install Docker and docker-compose
2. Clone this project
3. Run make install
   - This command builds the images and initializes the database
4. Set the GITHUB_USERNAME and GITHUB_TOKEN variables in the docker-compose.yml file
   - See this link to create a token: https://help.github.com/en/github/authenticating-to-github/creating-a-personal-access-token-for-the-command-line
5. Start the API with make start-api or the Scrapper with make start-scrapper
6. Open http://localhost:8002/ in your browser
7. Open http://localhost:8002/users/_search to search users
   - Search by name. Ex: http://localhost:8002/users/_search?name=Lucas
8. Open http://localhost:8002/repositories/_search to search repositories
   - Search by name. Ex: http://localhost:8002/repositories/_search?name=Git
   - Search by languages. Ex: http://localhost:8002/repositories/_search?languages=Python
   - Search by author. Ex: http://localhost:8002/repositories/_search?author=Lucas
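The search endpoints above can also be called from a script instead of a browser. The following is a minimal sketch using only the Python standard library. The function names build_search_url and search are hypothetical, and because the response format is not documented here, the body is returned as raw text rather than parsed.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "http://localhost:8002"  # host and port from the steps above


def build_search_url(resource, **filters):
    """Build a /<resource>/_search URL with the given query filters."""
    url = f"{BASE_URL}/{resource}/_search"
    query = urlencode(filters)
    return f"{url}?{query}" if query else url


def search(resource, **filters):
    """Send the GET request and return the raw response body as text."""
    with urlopen(build_search_url(resource, **filters), timeout=10) as response:
        return response.read().decode("utf-8")
```

With the API running, search("users", name="Lucas") requests the same URL as the name example in the list, and search("repositories", languages="Python") does the same for the languages filter.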

TODO

- Improve connection management with SQLAlchemy and the async configuration for database queries
- Use PostgreSQL
- Add contract tests to the API
- Add tests to the Scrapper
- Use Elasticsearch as the database to enable a better search query system
- Remove the coupling between the repository and the database query system


MIT License
