mhamzawey / dalia_challenge_ruby

Project Target:

- A multi-source website agenda: collect all the cultural activities in Berlin into one place.
- The specifications:
    - Collect the information for your website from other web sources:
        - https://www.co-berlin.org/en/calender
        - http://berghain.de/events/
- Events filter: the user can filter the events based on different criteria:
    - Web source
    - Dates
    - Simple partial text search on the title

Implementation:

- Implement the code needed to parse two of the "web sources" into the standardized format (a sketch of that format follows this list).
- Implement the code needed to collect the standardized format and render it on a website.
- Add a simple filtering mechanism (based on backend filtering, not frontend JS filtering).
- Prepare your code for easy deployment.
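
For illustration, here is a minimal sketch of what one parsed event could look like in that standardized format, reusing the attribute names of the `event` model described under Project Setup below; all values are hypothetical.

```ruby
# Hypothetical parsed event in the standardized format. Field names follow the
# `event` model attributes; the values and the use of `category` to hold the
# web source are illustrative assumptions, not taken from the project code.
standardized_event = {
  title:       "Klubnacht",
  description: "Saturday night programme",
  category:    "berghain",
  start_date:  "2019-08-10",
  end_date:    "2019-08-11",
  link:        "http://berghain.de/events/"
}
```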

Prerequisites:

- docker
- docker-compose

Starting Project:

1- git clone https://github.com/mhamzawey/dalia_challenge_ruby

2- cd dalia_challenge_ruby

3- docker-compose up -d --build

4- Wait 2-3 minutes until Docker Compose builds and starts the containers

5- To access the front-end, open http://localhost:3000/ in your browser

Important note: sometimes the cron tasks are not triggered automatically, so for testing purposes execute the following command from your terminal: `docker exec -it api python /app/scrapy_events/core.py`

Project Setup:

The project consists of five Docker containers:

1- api_ruby:

- Built on top of ruby:2.5
- This API is a Rails project with one model:
    1- events:
        - A model called `event` with the attributes `{id, title, description, category, start_date, end_date, link, created_at, updated_at}`
        - The API has one main filter:
            - starts_with:
                - Responsible for filtering by title (see the sketch after this list)
        - Test cases: 16 test cases were added for testing CRUD on events
        - To run the tests manually: `docker exec -it api_ruby bundle exec rspec`
        - Can be automated with any CI/CD tool such as GitLab CI/CD, Jenkins, or Cirrus CI
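
As a rough illustration, the `starts_with` filter could be implemented as a model scope along these lines; the scope name mirrors the filter above, but the controller wiring and parameter name are assumptions rather than the project's exact code.

```ruby
# app/models/event.rb -- illustrative sketch, not necessarily the project's exact code.
class Event < ApplicationRecord
  # Partial match on title, mirroring the `starts_with` filter described above.
  scope :starts_with, ->(term) { where("title LIKE ?", "#{term}%") }
end

# Hypothetical controller usage (parameter name assumed):
#   GET /events?starts_with=Klub
#   events = params[:starts_with].present? ? Event.starts_with(params[:starts_with]) : Event.all
```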

2- scrappy_app:

- A crawler built on the Scrapy framework with two spiders:
    - co_berlin: scrapes the events from the C/O Berlin website
    - berghain: scrapes the events from the Berghain website

- This is scalable: we can define any other spider we need, handle its case, and map it to our own serializer
- Scraping is done as a cron task that gets registered once the api container is up and running
    - The cron job runs every minute; this interval can be adjusted as needed

3- mysqldb_ruby:

- Built on top of mysql:5.7
- Does not persist data; data can be persisted by adding volumes to the docker-compose file
- It is the default database; a separate container is used for test cases

4- mysqldb_ruby_test:

- Built on top of mysql:5.7
- Does not persist data; data can be persisted by adding volumes to the docker-compose file
- Used purely for the RSpec test cases (an illustrative spec is sketched below)
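
For reference, here is a minimal RSpec sketch of the kind of CRUD spec the suite mentioned under `api_ruby` might contain; the file path, attribute values, and expectation are illustrative, not copied from the project.

```ruby
# spec/models/event_spec.rb -- illustrative only; the project's suite has 16 CRUD specs.
require 'rails_helper'

RSpec.describe Event, type: :model do
  it "persists an event with the standardized attributes" do
    event = Event.create!(
      title:      "Ausstellung",
      category:   "co_berlin",
      start_date: Date.new(2019, 8, 10),
      end_date:   Date.new(2019, 8, 11),
      link:       "https://www.co-berlin.org/en/calender"
    )
    expect(event).to be_persisted
  end
end
```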

5- front-end:

- Built on top of node:9.6.1
- A very simple ReactJS app that integrates with the API
- It fetches the events from `api_ruby` and has a search bar that does backend searching on `title`
- There is another endpoint, `events/filter/`, that can filter by dates and is accessible via the Swagger documentation (a sketch follows below)
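
For orientation, a sketch of what the `events/filter/` endpoint could look like on the Rails side; the route name comes from this README, while the parameter names and query logic are assumptions.

```ruby
# app/controllers/events_controller.rb -- illustrative sketch of events/filter/;
# the start_date/end_date parameter names are assumed, not taken from the Swagger docs.
class EventsController < ApplicationController
  # GET /events/filter?start_date=2019-08-01&end_date=2019-08-31
  def filter
    events = Event.all
    events = events.where("start_date >= ?", params[:start_date]) if params[:start_date].present?
    events = events.where("end_date <= ?", params[:end_date])     if params[:end_date].present?
    render json: events
  end
end
```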

To clean up everything afterwards, run `docker-compose down`.

Future Enhancements:

- Writing all unit test cases for the three frameworks and triggering them in the .cirrus.yml CI:
    - Rails
    - Scrapy
    - ReactJS

- Writing the whole deployment cycle in the .cirrus.yml CD or Jenkins:
    - Use ECS (AWS) to deploy the Rails & Scrapy containers
    - Use RDS for the MySQL DB
    - Use S3 (AWS) for the ReactJS app

Languages:

Ruby 56.9%, Python 18.7%, JavaScript 17.9%, HTML 4.0%, CSS 1.3%, Dockerfile 1.0%, Shell 0.2%