LucasMagnum / github_scrapper_example

Example of a Github Scrapper application

Github Scrapper Data

This is an example application that consumes the GitHub API. The project is split into two components: API and Scrapper.

API

Responsible for exposing a query service through an API.

Scrapper

Responsible for downloading and saving GitHub repository and user data.
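
As a rough illustration of the downloading half of that job, the sketch below builds an authenticated request to the public GitHub REST API and fetches one page of users. It is not the project's actual code: the helper names build_request and fetch_users are hypothetical, and it assumes the credentials are available as the GITHUB_USERNAME and GITHUB_TOKEN environment variables named in the Quick Start.

```python
import base64
import json
import os
import urllib.request

GITHUB_API = "https://api.github.com"


def build_request(path, username, token):
    """Build a GET request for the GitHub REST API using basic auth.

    Hypothetical helper: the real Scrapper may authenticate differently.
    """
    credentials = base64.b64encode(f"{username}:{token}".encode()).decode()
    return urllib.request.Request(
        f"{GITHUB_API}{path}",
        headers={
            "Authorization": f"Basic {credentials}",
            "Accept": "application/vnd.github+json",
        },
    )


def fetch_users(since=0):
    """Download one page of users whose id is greater than `since`."""
    request = build_request(
        f"/users?since={since}",
        os.environ["GITHUB_USERNAME"],  # assumed to be set in the environment
        os.environ["GITHUB_TOKEN"],
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Saving the results is left out here, since the storage layer is specific to the project.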

Quick Start

  1. Install docker and docker-compose
  2. Clone this project
  3. Run make install
    • This command will build the images and initialize the database
  4. Set GITHUB_USERNAME and GITHUB_TOKEN variables in the docker-compose.yml file
  5. Start the API with make start-api or the Scrapper with make start-scrapper
  6. Open http://localhost:8002/ in your browser
  7. Open http://localhost:8002/users/_search to search users
    • Search by name. Ex: http://localhost:8002/users/_search?name=Lucas
  8. Open http://localhost:8002/repositories/_search to search repositories
    • Search by name. Ex: http://localhost:8002/repositories/_search?name=Git
    • Search by languages. Ex: http://localhost:8002/repositories/_search?languages=Python
    • Search by author. Ex: http://localhost:8002/repositories/_search?author=Lucas
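
The search URLs above can also be built and called from a script. The following is a minimal sketch, assuming the API is running locally on port 8002 and that it answers with JSON (the response format is not documented here, so treat that as an assumption). The function names build_search_url and search are hypothetical.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:8002"


def build_search_url(resource, **filters):
    """Return the _search URL for `resource` ("users" or "repositories")."""
    query = urllib.parse.urlencode(filters)
    url = f"{BASE_URL}/{resource}/_search"
    return f"{url}?{query}" if query else url


def search(resource, **filters):
    """Call the search endpoint and decode the body (assumed to be JSON)."""
    with urllib.request.urlopen(build_search_url(resource, **filters)) as response:
        return json.load(response)
```

With the API started, search("repositories", author="Lucas") would request the same URL as the author example in step 8.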

Possible improvements

  • Better management of SQLAlchemy connections and better async configuration for database queries
  • Use PostgreSQL
  • Add contract tests to the API
  • Add tests to the Scrapper
  • Use ElasticSearch as a database and enable a better search query system
  • Remove the coupling between the repository and the db query system

License: MIT License


Languages

Python 95.5%, Makefile 2.9%, Dockerfile 1.7%