byhbt / keywords-scraper

Create a web application that will extract large amounts of data from the Google search results page.

Functionality


The application reads a CSV file of comma-separated keywords, analyzes the Google search results page for each keyword, and returns:

  • Total links on the search page
  • Total advertisements on the search page
  • Total results with the keyword
  • HTML of the page (which you can also preview)
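
As a rough illustration of the CSV step, the sketch below parses comma-separated keywords with Ruby's standard CSV library and drops blanks and duplicates. The method name extract_keywords is hypothetical and not taken from the project; the actual parsing code may differ.

```ruby
require "csv"

# Hypothetical helper (not from the project): turn CSV text into a
# list of unique, trimmed keywords, keeping first-seen order.
def extract_keywords(csv_text)
  CSV.parse(csv_text)     # array of rows, each an array of fields
     .flatten             # one flat list of fields
     .compact             # empty fields are parsed as nil
     .map(&:strip)        # trim whitespace around each keyword
     .reject(&:empty?)    # drop whitespace-only fields
     .uniq                # remove duplicate keywords
end

puts extract_keywords("ruby, rails,ruby\npostgres,,rails\n").inspect
```

Comparison here is case-sensitive, so "Ruby" and "ruby" would count as two keywords; whether the application normalizes case is not stated.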

Versioning


The application was built with the following versions:

  • Ruby - 3.0.0
  • Rails - 7.0.3
  • psql (PostgreSQL) - 14.5

Set up and run the application


  1. Clone the project: git clone https://github.com/sanG-github/Search-Analytics.git
  2. Install gems: bundle install
  3. Set your credentials, using .env.example as a template
    1. Generate JWT_SECRET_KEY with bin/rails secret
  4. Set up the database: bin/rails db:create db:migrate
  5. Precompile assets: rake assets:precompile
  6. Run the server: rails server
  7. Run Sidekiq: bundle exec sidekiq
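
For reference, bin/rails secret prints a random 128-character hexadecimal string. The sketch below produces a value of the same shape using only the standard library, which may be handy when Rails is not yet installed. The method name generate_jwt_secret is hypothetical.

```ruby
require "securerandom"

# Hypothetical helper: 64 random bytes encoded as 128 hex characters,
# the same shape of value that `bin/rails secret` prints.
def generate_jwt_secret
  SecureRandom.hex(64)
end

# Print a line that could be pasted into the environment file.
puts "JWT_SECRET_KEY=#{generate_jwt_secret}"
```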

Database design


[Screenshot: database design diagram]

  • users stores user information
  • attachments stores the name and contents of each file after duplicate keywords are removed
  • results stores the necessary data (such as the counts)
  • source_codes stores the source of the advertisements and links, and the HTML of the Google search page. It is kept in a separate table because it holds large amounts of data.

For index definitions, see db/schema.rb.
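
The split between results and source_codes can be pictured with plain Ruby structs: small counts go in one record, bulky page content in another. This is only a sketch. The field names and the analyze_page method are assumptions, not taken from db/schema.rb. The regex-based link extraction stands in for whatever HTML parsing the application really uses, and advertisement detection is left out because the README does not say how ads are identified.

```ruby
# Hypothetical records; field names are assumptions, not the real schema.
Result     = Struct.new(:keyword, :total_links, :total_keyword_matches, keyword_init: true)
SourceCode = Struct.new(:links, :html, keyword_init: true)

# Hypothetical analysis step: returns a small Result and a large SourceCode.
def analyze_page(keyword, html)
  # Collect href values with a simple regex; a real HTML parser would be more robust.
  links = html.scan(/<a\s[^>]*?href=["']([^"']+)["']/i).flatten

  # Case-insensitive count over the raw HTML, so matches inside URLs count too.
  matches = html.scan(/#{Regexp.escape(keyword)}/i).size

  result = Result.new(keyword: keyword, total_links: links.size, total_keyword_matches: matches)
  source = SourceCode.new(links: links, html: html)
  [result, source]
end

html = %(<html><body><a href="https://example.com/ruby">Ruby</a> <a class="x" href="/docs">docs about ruby</a></body></html>)
result, source = analyze_page("ruby", html)
puts result.to_h.inspect
puts source.links.inspect
```

Keeping the heavy fields in their own record means listing results only touches the small rows, and the HTML is loaded only when a preview is requested.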
