There are 129 repositories under the parser topic.
Transforms complex documents like PDFs into LLM-ready markdown/JSON for your Agentic workflows.
Rust-based platform for the Web
The fast, flexible, and elegant library for parsing and manipulating HTML and XML.
An incremental parsing system for programming tools
A high-performance observability data pipeline.
⚓ A collection of JavaScript tools written in Rust.
A high-performance 100% compatible drop-in replacement of "encoding/json"
Repository for the book "Crafting Interpreters"
Rust parser combinator framework
The official repo for “Dolphin: Document Image Parsing via Heterogeneous Anchor Prompting”, ACL, 2025.
Select, put and delete data from JSON, TOML, YAML, XML and CSV files with a single tool. Supports conversion between formats and can be used as a Go package.
An extremely fast CSS parser, transformer, bundler, and minifier written in Rust.
File parser optimised for lossless LLM ingestion 🧠 Parses PDF, DOCX, and PPTX files into a format that is ideal for LLMs.
A web tool to explore the ASTs generated by various parsers.
Java 1-21 Parser and Abstract Syntax Tree for Java with advanced analysis functionalities.
JSqlParser parses an SQL statement and translates it into a hierarchy of Java classes. The generated hierarchy can be navigated using the Visitor Pattern.
Markdown parser, done right. Commonmark support, extensions, syntax plugins, high speed - all in one. Gulp and metalsmith plugins available. Used by Facebook, Docusaurus and many others! Use https://github.com/breakdance/breakdance for HTML-to-markdown conversion. Use https://github.com/jonschlinkert/markdown-toc to generate a table of contents.
Lark is a parsing toolkit for Python, built with a focus on ergonomics, performance and modularity.
One of the fastest alternative JSON parsers for Go that does not require a schema
Node.js body parsing middleware
👼 The ultimate angle brackets parser library parsing HTML5, MathML, SVG and CSS to construct a DOM based on the official W3C specifications.
Picocli is a modern framework for building powerful, user-friendly, GraalVM-enabled command line apps with ease. It supports colors, autocompletion, subcommands, and more. In 1 source file so apps can include as source & avoid adding a dependency. Written in Java, usable from Groovy, Kotlin, Scala, etc.
LIEF - Library to Instrument Executable Formats (C++, Python, Rust)