There are 5 repositories under the html-parsing topic.
A little like that j-thing, only in Go.
🌀 React library to safely render HTML, filter attributes, autowrap text with matchers, render emoji characters, and much more.
A Scala library for scraping content from HTML pages
Heuristic based boilerplate removal tool
Reworked https://www.readability.com/ parsing library (https://mercury.postlight.com/ is now the living alternative)
procyclingstats scraper
htmlparsing.com, a website devoted to helping people parse HTML correctly
A Java HTML5-compliant parser
A Node.js XML DOM, Parser & Stringifier.
Fully featured Java scraping framework, highly pluggable and customizable
Faster HTML scraper with WebAssembly
A Java tool for detecting the charset encoding of HTML web pages
Fully featured, highly pluggable and customizable Java HTML-to-POJO converter.
Summarizes text and websites and optionally saves the data to a local file
Source code for the SCP Foundation app - https://play.google.com/store/apps/details?id=ru.dante.scpfoundation
Scrapes Facebook posts and extracts data
Swift wrapper around libxml2 HTML Parser to provide SAX style HTML Parsing
django-janitor allows you to use bleach to clean HTML stored in a Model's field.
Add, delete, modify, and get HTML tags, text, and links using CSS selectors
Apache Drill UDFs for retrieving and working with HTML text
CAP (Common Alerting Protocol) XML alert format parsing, HTML parsing, inserting new alerts into a database, and sending notifications via OneSignal (possible Android and iOS push notifications), Twitter, Facebook, and MailChimp (e-mail notifications), for an open source early-warning solution for natural disasters.
An XML/HTML parser and serializer for JavaScript.
A PowerShell module for extracting data from HTML using XPath
Substitution plan and timetable of the Wilhelm-Gymnasium
Web spider that scans UR available rooms and outputs them as CSV
Script that analyzes the number of Telegram messages by time
Get insights into your Facebook Messenger activity with Splunk