html-parsing

There are 36 repositories listed under the html-parsing topic.

PuerkitoBio / goquery
A little like that j-thing, only in Go.
goquery html-parsing jquery selector-strings
Language:Go 14774
inikulin / parse5
HTML parsing/serialization toolset for Node.js. WHATWG HTML Living Standard (aka HTML5)-compliant.
html-parsing html html5 serialization serializer parser whatwg
Language:TypeScript 3837
milesj / interweave
🌀 React library to safely render HTML, filter attributes, autowrap text with matchers, render emoji characters, and much more.
autolink emoji emoji-picker emoji-unicode html-parsing interpolation matcher react react-elements
Language:TypeScript 1145
cezheng / Fuzi
A fast & lightweight XML & HTML parser in Swift with XPath & CSS support
xml xml-parsing xml-parser xpath html html-parsing html-parser css ios swift parser parsing
Language:Swift 1099
miso-belica / jusText
Heuristic-based boilerplate removal tool
python text-extraction html-parser html-parsing
Language:Python 803
ruippeixotog / scala-scraper
A Scala library for scraping content from HTML pages
scala scraper dsl html-parsing hacktoberfest
Language:Scala 730
jpjacobpadilla / Stealth-Requests
Undetected web-scraping & seamless HTML parsing in Python!
python http-client data html-parsing http-requests python-scraping python-web-scraper requests web-crawler web-scraping webscraping xpath data-extraction
Language:Python 311
bookieio / breadability
A reworked version of the https://www.readability.com/ parsing library (https://mercury.postlight.com/ is now the living alternative)
html-extraction html-extractor html-parsing python text-extraction text-mining
Language:HTML 205
themm1 / procyclingstats
procyclingstats scraper
cycling html-parsing python python-package scraper sports-analytics web-scraping
Language:Python 89
ange007 / HTMLp
Delphi DOM HTML parser and converter. Fork (not from the original author): https://sourceforge.net/projects/htmlp/
delphi dom dom-parser formatter html html-formatter html-parser html-parsing parser xpath
Language:Pascal 31
digitalfondue / jfiveparse
A Java HTML5-compliant parser
html html-parser html-parsing html5 java java-html5-parser
Language:Java 31
petdance / htmlparsing
htmlparsing.com, a website devoted to helping people parse HTML correctly
hacktoberfest html html-parsing parsing
Language:CSS 30
liuderchi / ide-html
Atom-IDE for HTML, Go Template, Mustache and other templates
atom-ide atom-package atom-editor atom-plugin html-parsing html language-server-protocol
Language:JavaScript 20
ElyaConrad / XML-Parser
A Node.js XML DOM, Parser & Stringifier.
xml xml-parser xml-parsing xml-schema dom html html-parser html-parsing crawler crawling
Language:JavaScript 18
julleboi / fast-wasm-scraper
Faster HTML scraper with WebAssembly
webassembly rust html-scraper html-parsing scraper
Language:Rust 17
shabanali-faghani / IUST-HTMLCharDet
A Java tool for detecting the charset encoding of HTML web pages
java charset charset-detector html html-parsing paper
Language:Java 12
fefit / rphtml
An HTML parser written in Rust that parses HTML into node trees.
html-parser html-parsing html-minify
Language:Rust 11
ktodorov / go-summarizer
Summarizes text and websites and optionally saves the data to a local file
summarizer readability html-parsing parser
Language:Go 10
raymccrae / swift-htmlsaxparser
Swift wrapper around libxml2 HTML Parser to provide SAX style HTML Parsing
html-parser html-parsing libxml2 sax-parser swift
Language:Swift 10
mohaxspb / ScpFoundationRu
Source code for the SCP Foundation app - https://play.google.com/store/apps/details?id=ru.dante.scpfoundation
scp scp-foundation html-parsing offline-first android android-client
Language:Java 8
patmull / disaster-warning-system-scripts
Scripts for an open-source natural-disaster early-warning project: CAP (Common Alerting Protocol) XML alert parsing, HTML parsing, inserting new alerts into a database, and notifications via OneSignal (optional Android and iOS push notifications), Twitter, Facebook, and MailChimp (e-mail).
early-warning-systems facebook-api twitter-api onesignal-notifications onesignal early-warning-signals mailchimp mailchimp-api social-media common-alerting-protocol xml-parsing html-parsing
Language:Python 8
peterhil / slurp
BeautifulSoup4 packaged into a command line tool
bookmarks netscape html-parsing cli-utilities
Language:Python 8
SaurabhSSB / BookMiner
A pipeline to scrape, extract, and analyze book data from web pages to insights.
beautifulsoup book-dataset books csv-export data-analysis data-pipeline data-science-project data-visualization eda html-parsing jupyter-notebook project-portfolio python web-data-extraction web-scraping
Language:HTML 8
siongui / go-facebook-post-parser
Scrapes Facebook posts from the web and extracts data
go parser facebook demo web-scraping goquery html-parsing golang
Language:Go 8
bradmontgomery / django-janitor
django-janitor allows you to use bleach to clean HTML stored in a Model's field.
python django html html-parsing whitelist
Language:Python 6
brianary / SelectHtml
A PowerShell module for extracting data from HTML using XPath
html xpath scrape powershell powershell-module html-parsing fsharp
Language:HTML 6
emmanuelroecker / php-simply-html
Add, delete, modify, and get HTML tags, text, and links using CSS selectors
css-selector html-parser html-parsing html-tags links php
Language:PHP 6
imingyu / forgiving-xml-parser
An XML/HTML parser and serializer for JavaScript.
xml-parsing xml-parser html-parsing html-parser xml2json html2json xml2js html2js parser serializer html xml forgiving-xml-parser json javascript transformation typescript
Language:TypeScript 6
kan01234 / ur-web-spider
Web spider that scans for available UR rooms and outputs them as CSV
python web-crawler web-spider csv html-parsing json
Language:Python 6
LylaCoding / Website-Subpage-Scraper
This Python script scrapes internal links on a webpage. It prompts for a URL, sends a GET request to retrieve HTML, uses BeautifulSoup to parse and filter links. Then it prompts the user for output mode (terminal or file) to either print or write the links. Installs required modules (requests and beautifulsoup4) if not found.
data-extraction hacking hacking-tool html-parsing python python-hacking web-scraping
Language:Python 6
hrbrmstr / drill-html-tools
Apache Drill UDFs for retrieving and working with HTML text
apache-drill jsoup html-parsing web-scraping dom css-selectors
Language:Java 5
rsharifnasab / telegram_export_analyzer
This script analyzes the number of Telegram messages over time
telegram telegram-desktop html html-parser html-parsing python python3 beautifulsoup4 beautifulsoup
Language:Python 5
ubbeg2000 / pars
A simple package for parsing HTML files into DOM trees
parsing-library parser golang golang-package html-parsing html-parser dom
Language:Go 5
decal / cgiaudit
📦 General-purpose, "black box" CGI auditing tool (ARCHIVE)
attack-surface autotools cgi-bin dirbuster form-input fuzz-testing hacking-tool html-form html-parsing http-request-test http-server infosec penetration-testing security-audit spiders web-security-research webappsec
Language:C 4
sidward35 / splunk-messenger
Get insights into your Facebook Messenger activity with Splunk
splunk splunk-enterprise facebook facebook-messenger aws aws-ec2 python python3 html csv html-parsing csv-converter dashboard analysis analytics
Language:Python 4
imamhossain94 / bubt-website-scraping-script
The first public repository to provide a free BUBT website scraping API script on GitHub.
python html-parsing scraping-websites json firestore bubt
Language:Python 3
