# Github Scrapper Data

This is an example application that consumes the GitHub API. The project is split into two components: the API and the Scrapper.
## API

Responsible for exposing a query service through an API.
## Scrapper

Responsible for consuming the GitHub API and downloading and saving repository and user data.
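As a rough sketch of what consuming the GitHub API involves, the snippet below builds an authenticated request for a user's public profile. The endpoint and `Authorization` header follow the public GitHub REST API; the function name is illustrative and the Scrapper's actual implementation may differ.

```python
import urllib.request

GITHUB_API = "https://api.github.com"

def build_user_request(username: str, token: str) -> urllib.request.Request:
    """Build an authenticated request for a user's public profile.
    (Illustrative helper, not the Scrapper's actual code.)"""
    req = urllib.request.Request(f"{GITHUB_API}/users/{username}")
    req.add_header("Authorization", f"token {token}")
    req.add_header("Accept", "application/vnd.github.v3+json")
    return req

# Sending the request requires network access and a valid token:
# import json
# with urllib.request.urlopen(build_user_request("octocat", "<GITHUB_TOKEN>")) as resp:
#     data = json.load(resp)
```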
## Quick Start
- Install `docker` and `docker-compose`
- Clone this project
- Run `make install`. This command will build the images and initialize the database
- Set the `GITHUB_USERNAME` and `GITHUB_TOKEN` variables in the `docker-compose.yml` file. See this link to create a token: https://help.github.com/en/github/authenticating-to-github/creating-a-personal-access-token-for-the-command-line
- Start the API with `make start-api` or the Scrapper with `make start-scrapper`
- Open `http://localhost:8002/` in your browser
- Open `http://localhost:8002/users/_search` to search users
  - Search by `name`. Ex: `http://localhost:8002/users/_search?name=Lucas`
- Open `http://localhost:8002/repositories/_search` to search repositories
  - Search by `name`. Ex: `http://localhost:8002/repositories/_search?name=Git`
  - Search by `languages`. Ex: `http://localhost:8002/repositories/_search?languages=Python`
  - Search by `author`. Ex: `http://localhost:8002/repositories/_search?author=Lucas`
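The search endpoints take their filters as query-string parameters. As a small sketch of how such URLs can be built programmatically (the helper name is illustrative and not part of the project):

```python
from urllib.parse import urlencode

BASE_URL = "http://localhost:8002"  # default address from the Quick Start

def search_url(resource: str, **filters: str) -> str:
    """Build a `_search` URL for the API (illustrative helper)."""
    url = f"{BASE_URL}/{resource}/_search"
    return f"{url}?{urlencode(filters)}" if filters else url

print(search_url("users", name="Lucas"))
# http://localhost:8002/users/_search?name=Lucas
print(search_url("repositories", languages="Python"))
# http://localhost:8002/repositories/_search?languages=Python
```

Using `urlencode` also takes care of escaping filter values that contain spaces or special characters.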
## Improvement points
- Better management of connections with SQLAlchemy / Better async configuration for database queries
- Use Postgres
- Add contract tests to the API
- Add tests to the Scrapper
- Use ElasticSearch as a database and enable a better search query system
- Remove the coupling between the repository and the db query system
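The last point can be illustrated with a minimal sketch: the API layer depends on an abstract repository interface, and the concrete query system (SQLAlchemy, Postgres, Elasticsearch) lives behind it. All names here are hypothetical, not the project's actual classes.

```python
from abc import ABC, abstractmethod

class UserRepository(ABC):
    """Abstract interface the API layer depends on (hypothetical name)."""

    @abstractmethod
    def search_by_name(self, name: str) -> list[dict]:
        ...

class InMemoryUserRepository(UserRepository):
    """In-memory implementation; a Postgres- or Elasticsearch-backed
    class could replace it without touching the API layer."""

    def __init__(self, users: list[dict]):
        self._users = users

    def search_by_name(self, name: str) -> list[dict]:
        return [u for u in self._users if name.lower() in u["name"].lower()]

# The API layer receives the repository through its constructor,
# so the underlying query system can be swapped in tests or deployments.
repo = InMemoryUserRepository([{"name": "Lucas"}, {"name": "Ana"}])
print(repo.search_by_name("luc"))  # [{'name': 'Lucas'}]
```

With this shape, the Elasticsearch idea above becomes a new `UserRepository` subclass rather than a rewrite of the API.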