php1301 / NBTechInterview

Tech Interview Checklist and Approach

1. WebUI

  • Sign in - ✅ WITH AUTH0
  • Sign up - ✅ WITH AUTH0
  • Upload a keyword file - ✅ DONE WITH PAPAPARSE
  • View list of keywords - ✅ REACT-TABLE
  • View the search result information for each keyword - ✅ SHOWN WITH REACT-TABLE; each result's htmlCode is stored behind a jsonblob link created through the jsonblob API
  • Search across all reports - ✅ DATABASE ACTION WITH pg
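
The upload step hands parsed CSV rows to the rest of the app. As a minimal sketch of what might happen between parsing and scraping, the helper below flattens rows into a clean keyword list. It assumes the parser returns one string array per CSV line (the shape a CSV parser gives without header mode); `normalizeKeywords` and `MAX_KEYWORDS` are hypothetical names and the cap of 100 is an assumption, not something taken from the repository.

```typescript
// One string[] per CSV line, as produced by a CSV parser without header mode.
type CsvRow = string[];

// Hypothetical cap; the project does not state an upload limit.
const MAX_KEYWORDS = 100;

// Flatten rows into a trimmed, case-insensitively de-duplicated keyword list.
function normalizeKeywords(rows: CsvRow[], max: number = MAX_KEYWORDS): string[] {
  const seen = new Set<string>();
  const keywords: string[] = [];
  for (const row of rows) {
    for (const cell of row) {
      const keyword = cell.trim();
      const key = keyword.toLowerCase();
      // Skip empty cells and keywords already collected.
      if (keyword === "" || seen.has(key)) continue;
      seen.add(key);
      keywords.push(keyword);
      if (keywords.length >= max) return keywords;
    }
  }
  return keywords;
}
```

De-duplicating before scraping keeps the number of outgoing search requests as low as possible, which matters given the rate limiting described in section 4.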

2. API

Built mostly on NextJS API routes, with the scraping running serverless on Lambda

  • For uploading - POST /api/search - integrated with Lambda behind the scenes
  • Lambda Logic Repository: https://github.com/php1301/lambda-google
  • For the sake of quick development - used the Serverless Framework with a free-tier AWS account
  • Searching in Database - POST /api/user-search
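
A search route backed by pg usually comes down to a parameterized query. The sketch below only builds the query object, in the `{ text, values }` shape that node-postgres accepts in `client.query(...)`. The table name `keywords`, its columns, and the function `buildKeywordSearchQuery` are hypothetical; the real schema is the one in the ER diagram.

```typescript
// Shape accepted by node-postgres: client.query({ text, values }).
interface QueryConfig {
  text: string;
  values: unknown[];
}

// Table and column names are hypothetical, for illustration only.
function buildKeywordSearchQuery(userId: string, term: string, limit = 50): QueryConfig {
  // Escape LIKE wildcards (and the escape character itself) so that
  // user input is matched literally instead of acting as a pattern.
  const escaped = term.trim().replace(/[\\%_]/g, (ch) => "\\" + ch);
  return {
    text:
      "SELECT id, keyword FROM keywords " +
      "WHERE user_id = $1 AND keyword ILIKE $2 " +
      "ORDER BY id DESC LIMIT $3",
    values: [userId, `%${escaped}%`, limit],
  };
}
```

Passing the term as a bound parameter rather than concatenating it into the SQL text avoids SQL injection, and escaping `%` and `_` keeps a search for `50%` from matching every row.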

3. Technical Requirements

  • ✅ Use a web framework of your choice - NextJS
  • ✅ Use PostgreSQL.
  • ✅ For the interface, front-end frameworks such as Bootstrap, Tailwind or Foundation can be used. Use SASS as the CSS preprocessor - Used Tailwind
  • ✅ Extra points are provided to the neatness and user-friendliness of the frontend.
  • ✅ Use Git during the development process. Push to a public repository on Github or Gitlab. Make regular commits and merge code using pull requests - Integrated linter, unit testing with Github Actions
  • ✅ Write tests using your framework of choice.
  • Optional: deploy the application to a cloud provider e.g. Heroku, AWS, Google Cloud or Digital Ocean -> In progress; may be dropped because of free-tier eligibility limits

4. Approach

  • Overview of ER Diagram for database: image
  • ⛔⛔ 429 Too Many Requests - the core problem is keeping Google from blocking our requests
  • Request factors: blocking can be triggered by the User-Agent, remote IP address, headers, cookies, fingerprint, headless-browser detection, request timing (random delays help)...
  • The most viable and easiest approaches rotate the factors above, mainly the User-Agent and the IP address, using a paid proxy (costly) or our own SOCKS proxy server (Tor) -> the latter ties into the optional requirement of deploying to a cloud provider
  • => Goals: low cost and speed. A headless browser like Puppeteer is not optimized for this, and a premium proxy is probably overkill for this technical assignment
  • => ✅✅✅ The Lambda free tier suits this workload, thanks to AWS's generous IP pool -> beyond the free tier, EventBridge could be considered for a daily scraping cron job
  • => Rotating the Lambda IP -> the best trick here: update the Lambda configuration, such as its environment image
  • => Not guaranteed 100% of the time (roughly 80-90% of requests avoid a 429) -> implemented axios-retry together with the trick above to get a new IP image
  • => Still not guaranteed -> redeploy the Lambda function via aws-sdk or a serverless script -> tried and worked
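
The retry-and-rotate idea above can be sketched without any dependencies. The project itself uses axios-retry; the code below is a stand-in for the same pattern, not the project's implementation. `fetchWithRotation`, `RetryOptions`, and the `rotate` hook are hypothetical names, and the User-Agent strings are illustrative. In a real setup `rotate` would be where the Lambda configuration update happens.

```typescript
// Minimal response shape; only the HTTP status matters for this sketch.
interface ScrapeResponse {
  status: number;
  body: string;
}

interface RetryOptions {
  maxAttempts: number;
  baseDelayMs: number;
  // Hypothetical hook, e.g. update the Lambda configuration to obtain a new IP.
  rotate: (attempt: number) => Promise<void>;
  // Injectable so tests do not have to wait for real timers.
  sleep?: (ms: number) => Promise<void>;
}

// Illustrative pool; a real pool would be larger and kept up to date.
const USER_AGENTS = [
  "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
  "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.0 Safari/605.1.15",
];

function pickUserAgent(attempt: number): string {
  return USER_AGENTS[attempt % USER_AGENTS.length];
}

// Retry a request on HTTP 429, rotating identity factors between attempts.
async function fetchWithRotation(
  request: (userAgent: string) => Promise<ScrapeResponse>,
  opts: RetryOptions
): Promise<ScrapeResponse> {
  const sleep =
    opts.sleep ?? ((ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms)));
  let last: ScrapeResponse | undefined;
  for (let attempt = 0; attempt < opts.maxAttempts; attempt++) {
    last = await request(pickUserAgent(attempt));
    if (last.status !== 429) return last;
    if (attempt < opts.maxAttempts - 1) {
      // Change the outgoing identity, then back off with random jitter.
      await opts.rotate(attempt);
      const delay = opts.baseDelayMs * 2 ** attempt + Math.random() * opts.baseDelayMs;
      await sleep(delay);
    }
  }
  if (!last) throw new Error("maxAttempts must be at least 1");
  // All attempts were rate limited; return the last 429 to the caller.
  return last;
}
```

Returning the final 429 instead of throwing lets the caller decide on the last resort, such as redeploying the function.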

5. Screenshot

  • Homepage image
  • When uploaded keyword image
  • 94 keywords scraped image
  • Request's time image
  • Database image
  • Searching keyword image

6. Limitations

  • Due to the time limit, the source code may not follow best practices everywhere (being improved through ongoing commits)

7. Reproduction

  • Run yarn and add the necessary env variables listed in .example.env
  • Available routes:
    • Homepage: '/'
    • Keyword searching in database: '/my-keywords'
    • Upload CSV of keywords: '/search'
  • Create the database with the SQL script
  • A sample keywords.csv is in the src/mocks folder
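
A missing env variable tends to surface late, as a failed database or Auth0 call. One way to catch it at startup is a small check like the sketch below. `requireEnv` is a hypothetical helper and `DATABASE_URL` is an assumed variable name; the actual names are the ones in .example.env.

```typescript
type Env = Record<string, string | undefined>;

// Return the requested variables, or throw one error listing everything missing.
function requireEnv(env: Env, names: string[]): Record<string, string> {
  const missing = names.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
  const result: Record<string, string> = {};
  for (const name of names) {
    result[name] = env[name] as string;
  }
  return result;
}

// Usage in server-side code (variable name is an assumption):
//   const { DATABASE_URL } = requireEnv(process.env, ["DATABASE_URL"]);
```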

Available Scripts

Running the development server.

    yarn dev

Building for production.

    yarn build

Running the production server.

    yarn start

TailwindCSS

A utility-first CSS framework packed with classes like flex, pt-4, text-center and rotate-90 that can be composed to build any design, directly in your markup.

Go To Documentation

SASS/SCSS

Sass is a stylesheet language that’s compiled to CSS. It allows you to use variables, nested rules, mixins, functions, and more, all with a fully CSS-compatible syntax.

Go To Documentation

Axios

Promise based HTTP client for the browser and node.js.

Go To Documentation

Environment Variables

Use environment variables in your next.js project for server side, client or both.

Go To Documentation

Reverse Proxy

Proxying some URLs can be useful when you have a separate API backend development server and you want to send API requests on the same domain.

Go To Documentation

React Query

Hooks for fetching, caching and updating asynchronous data in React.

Go To Documentation

react-use

A Collection of useful React hooks.

Go To Documentation

Zustand

A small, fast and scalable bearbones state-management solution using simplified flux principles.

Go To Documentation

ESLint

A pluggable and configurable linter tool for identifying and reporting on patterns in JavaScript. Maintain your code quality with ease.

Go To Documentation

Prettier

An opinionated code formatter; Supports many languages; Integrates with most editors.

Go To Documentation

lint-staged

The concept of lint-staged is to run configured linter (or other) tasks on files that are staged in git.

Go To Documentation

Testing Library

The React Testing Library is a very light-weight solution for testing React components. It provides light utility functions on top of react-dom and react-dom/test-utils.

Go To Documentation

Cypress

Fast, easy and reliable testing for anything that runs in a browser.

Go To Documentation

Docker

Docker simplifies and accelerates your workflow, while giving developers the freedom to innovate with their choice of tools, application stacks, and deployment environments for each project.

Go To Documentation

Github Actions

GitHub Actions makes it easy to automate all your software workflows, now with world-class CI/CD. Build, test, and deploy your code right from GitHub.

Go To Documentation

License

MIT
