incubated-geek-cc / WebScraper

A Node.js Express server that exposes a web scraping API: web content is fetched through a proxy, then parsed and extracted.

Home Page: https://icd-code-web-scraper.glitch.me/api/dexur/icd/panic

🌐 Web Scraper

🛠️ Retrieves HTML text content from https://dexur.com/icd/search/ and returns a JSON-formatted response.

Runs on the Node.js Express framework. 🔌 Requests are routed through a proxy.

🌟 Try it yourself (where query = "panic")

Live Demo :: Link
Live Demo :: Backup Link

🧰 Run on localhost

  • Run npm install to install all Node.js dependencies
  • Double-click the startup.sh file
  • Navigate to localhost:5000 and test the API
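
One way to test the API from a script rather than the browser is sketched below, using only Node's built-in http module. It assumes the local server exposes the same route as the live demo (/api/dexur/icd/<query>); the helper names buildApiUrl and fetchIcd are hypothetical and not part of this repository.

```javascript
// Minimal client sketch for exercising the local API (no third-party packages).
const http = require("node:http");

// Assumption: the local route mirrors the live demo, /api/dexur/icd/<query>.
function buildApiUrl(query, base = "http://localhost:5000") {
  return new URL(`/api/dexur/icd/${encodeURIComponent(query)}`, base).href;
}

// Requests the endpoint and resolves with the parsed JSON body.
function fetchIcd(query) {
  return new Promise((resolve, reject) => {
    http
      .get(buildApiUrl(query), (res) => {
        let body = "";
        res.setEncoding("utf8");
        res.on("data", (chunk) => (body += chunk));
        res.on("end", () => {
          try {
            resolve(JSON.parse(body));
          } catch (err) {
            reject(err); // the server did not return valid JSON
          }
        });
      })
      .on("error", reject); // e.g. the server is not running
  });
}

// Usage, once the server is up:
// fetchIcd("panic").then(console.log, console.error);
```

Encoding the query with encodeURIComponent keeps multi-word searches such as "panic attack" valid in the URL path.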

✍ Read the related post here

Article :: Link

📌 Features

  • Parses HTML content with jsdom
  • Minifies retrieved HTML text with html-minifier (optional)
  • Traverses the HTML node(s) to extract raw data
  • Formats the extracted data as structured JSON, served via a GET API
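
The parse, traverse and format steps listed above could look roughly like the sketch below. It is not the repository's actual code: the "table tr" selector, the code/description field names and the response shape are all assumptions made for illustration, and the optional html-minifier step is left out. jsdom is loaded lazily inside extractRows, so it must be installed (npm install jsdom) before that function is called.

```javascript
// Sketch of the parse -> traverse -> format pipeline. Selector, field names
// and response shape are hypothetical, chosen only for illustration.

// Parses HTML with jsdom and collects the trimmed text of each row's cells.
function extractRows(html, rowSelector = "table tr") {
  const { JSDOM } = require("jsdom"); // lazy load; needs `npm install jsdom`
  const { document } = new JSDOM(html).window;
  return Array.from(document.querySelectorAll(rowSelector), (row) =>
    Array.from(row.querySelectorAll("td"), (cell) => cell.textContent.trim())
  ).filter((cells) => cells.length > 0); // skip header or empty rows
}

// Maps raw cell arrays to structured records (assumed two-column layout).
function toRecords(rows) {
  return rows.map(([code, description]) => ({ code, description }));
}

// Builds the JSON string a GET handler could send back.
function toJsonResponse(query, rows) {
  const results = toRecords(rows);
  return JSON.stringify({ query, count: results.length, results });
}
```

Keeping the formatting functions separate from the DOM traversal means the JSON shape can be checked with plain arrays, without fetching or parsing a real page.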

👀 Preview (e.g. query = "mood")

Join me on 📝 Medium at ~ ξ(🎀˶❛◡❛) @geek-cc


🌮 Please buy me a Taco! 😋

Languages

JavaScript 96.6%, Shell 3.4%