rafayet-monon / google-search-extractor

*For educational purposes.* This application extracts large amounts of data from Google search results pages, then displays and reports on it.

Google Search Extractor

Install

Clone the repository

git clone git@github.com:rafayet-monon/google-search-extractor.git
cd google-search-extractor

Check your Ruby version

ruby -v

The output should start with something like ruby 2.6.4

If not, install the right Ruby version using rbenv (or rvm, whichever you prefer):

rbenv install 2.6.4

Rails 6.0.2.1

Install dependencies

Using Bundler and Yarn:

bundle && yarn

Set Credentials

The application uses Rails encrypted credentials (credentials.yml.enc and master.key). Run EDITOR='nano' rails credentials:edit to create new credentials. Sample credentials can be found in config/sample_credentials.yml. Edit the database and email credentials there and paste them into credentials.yml.enc.

Mailer Setup

Mail is currently sent through Google's SMTP server. To configure it, add your Google user_name and password: open credentials.yml.enc and edit the section below.

email:
  user_name: 'your_google_email'
  password: 'your_google_password'
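
As a rough illustration of where these two values end up, the sketch below builds an SMTP settings hash of the kind Action Mailer accepts. The `smtp_settings` helper is hypothetical and not taken from this repository, and the host and port (smtp.gmail.com on 587 with STARTTLS) are assumptions about a typical Gmail setup; the application's own configuration may differ.

```ruby
# Hypothetical helper: builds an SMTP settings hash from the `email`
# section of the credentials. Not taken from this repository.
def smtp_settings(email_credentials)
  {
    address: 'smtp.gmail.com',  # assumption: Gmail SMTP host
    port: 587,                  # assumption: STARTTLS submission port
    user_name: email_credentials.fetch(:user_name),
    password: email_credentials.fetch(:password),
    authentication: :plain,
    enable_starttls_auto: true
  }
end

# In a Rails app the section would typically be read from
# Rails.application.credentials; a plain hash stands in for it here.
settings = smtp_settings({ user_name: 'your_google_email', password: 'your_google_password' })
puts settings[:address]
```

Using `fetch` rather than `[]` makes a missing user_name or password fail immediately with a KeyError instead of surfacing later as an authentication error.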

Initialize the database

rails db:create db:migrate db:seed

Dependencies

This application needs the following dependencies to be installed -

  1. Redis, which backs Sidekiq.
  2. Chromedriver, which Selenium uses to search Google.
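
A quick way to confirm both tools are installed is to look for their executables on the PATH. The script below is a minimal sketch; `find_executable` is a hypothetical helper, and the binary names `redis-server` and `chromedriver` are assumed to match how the tools are installed on your system.

```ruby
# Returns the full path of `name` if an executable file with that name
# exists in one of the directories listed in `path`, otherwise nil.
def find_executable(name, path = ENV.fetch('PATH', ''))
  path.split(File::PATH_SEPARATOR).each do |dir|
    candidate = File.join(dir, name)
    return candidate if File.file?(candidate) && File.executable?(candidate)
  end
  nil
end

# Report the location of each required tool, or that it is missing.
%w[redis-server chromedriver].each do |tool|
  location = find_executable(tool)
  puts location ? "#{tool}: #{location}" : "#{tool}: not found on PATH"
end
```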

Testing

To run the tests, execute the command below from the application directory

rspec

Serve

First, start the Redis server

redis-server

Then, from the application directory, execute the command below to start the application.

foreman start -f Procfile.dev 

The application is deployed on Heroku. Redis on Heroku is a service that requires credit card information, which I don't have, so file uploading and searching through Sidekiq is not available there. Instead, the search runs at request time, right after the file is uploaded.

Please do not upload CSV files with a large keyword set to the Heroku deployment.
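
The difference between the two modes can be sketched as follows. This is an illustration only, not the application's code: `dispatch_searches` and both lambdas are hypothetical, with `enqueue` standing in for handing a keyword to a Sidekiq worker and `search` standing in for the actual Google search.

```ruby
# Handles every keyword either by deferring it through `enqueue`
# (background mode) or by calling `search` inline in the current process.
def dispatch_searches(keywords, background:, enqueue:, search:)
  keywords.map do |keyword|
    background ? enqueue.call(keyword) : search.call(keyword)
  end
end

queue = []
results = dispatch_searches(
  %w[ruby rails],
  background: false,                        # e.g. no Redis available
  enqueue: ->(kw) { queue << kw; :queued }, # stand-in for a Sidekiq job
  search: ->(kw) { "searched #{kw}" }       # stand-in for the real search
)
p results
```

In inline mode the request does not finish until every keyword has been searched, which is why a large keyword set is a problem without a background queue.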

Demo User Credentials

email: john_doe@gmail.com
password: 123123

Screenshots

Pages shown: Signup Page, Login Page, Forgot Password Page, Update Profile Page, Report Home Page, Upload File Page, Google HTML Page.
