EstebanMqz / Web-scraper

HTTP GET requests that retrieve HTML for scraping dynamic, client-side web pages, as an alternative to traditional static caching methods.


Formatted / indexed web-scraper


Web-scraper tool that extracts metadata using HTTP GET requests.
It allows a complete structural inspection of a page's attributes, down to the granularity of its code.



How the technique differs from those provided by web-development tools:


Traditional web-development methods (View source, and Save as: Complete HTML, Single HTML, or HTML only) generally provide unreliable or incomplete information from websites, particularly those that use dynamic, client-side scripts.

Usage:


.sh

Terminal

$ ./html-extractor.sh
Enter a URL: https://estebanmqz.github.io/EstebanMqz/html/Resume.html
Do you want to extract the raw code to a temporary file? (Y/N): Y
Enter a filename to save the raw code: Resume
Raw code extracted to Resume.html
opening..
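
The session above could be produced by a script along the following lines. This is only a minimal sketch, not the repository's actual html-extractor.sh: it assumes that curl performs the GET request, that ".html" is appended to the chosen filename, and that answering N prints the markup to the terminal. The function names (output_name, fetch_html, main) and the --prompt flag are hypothetical and exist only for illustration.

```shell
#!/bin/sh
# Hypothetical sketch of an interactive HTML extractor.
# Not the repository's real script; curl and all names here are assumptions.

# Append ".html" to a filename unless it already ends with it.
output_name() {
  case "$1" in
    *.html) printf '%s\n' "$1" ;;
    *)      printf '%s\n' "$1.html" ;;
  esac
}

# Fetch a URL ($1) with one HTTP GET request and save the body to a file ($2).
# -f: fail on HTTP errors, -sS: silent but report errors, -L: follow redirects.
fetch_html() {
  curl -fsSL "$1" -o "$2"
}

# Reproduce the prompts shown in the usage session.
main() {
  printf 'Enter a URL: '
  read -r url
  printf 'Do you want to extract the raw code to a temporary file? (Y/N): '
  read -r answer
  case "$answer" in
    [Yy]*)
      printf 'Enter a filename to save the raw code: '
      read -r name
      outfile="$(output_name "$name")"
      fetch_html "$url" "$outfile" && echo "Raw code extracted to $outfile"
      ;;
    *)
      # Assumed behavior for "N": print the markup instead of saving it.
      curl -fsSL "$url"
      ;;
  esac
}

# Run the prompts only when asked to, so the functions can be reused on their own.
if [ "${1:-}" = "--prompt" ]; then
  main
fi
```

Under these assumptions, running the sketch with --prompt and entering the same answers as above would save the page as Resume.html. Opening the saved file afterwards is left out, since the command for that depends on the platform.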


Note: Use this tool in compliance with the applicable open-source licenses and users' privacy rights,
and in accordance with international and local laws such as the GDPR.

About

License: Creative Commons Zero v1.0 Universal


Languages

HTML 100.0%, Shell 0.0%