syntax-tree / hast

Hypertext Abstract Syntax Tree format

Home Page: https://unifiedjs.com

Interested in htmlparser2 AST to HAST?

StarpTech opened this issue · comments

I need htmlparser2 because it's much less strict than parse5. It would allow us to parse any HTML content without the preprocessing that the browser specification prescribes. This is useful, for example, for tools like formatters.

commented

Yes, that would make sense! This was the parser back in 0dbea95, so you could use it as an example.
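
As a rough illustration of what such a conversion involves, here is a minimal sketch. The `DomNode` type is a simplified, hypothetical shape modeled on htmlparser2's DOM output (it is not the library's actual type definition), and `toHast` is a name made up for this example. It is also not the code from the commit mentioned above.

```typescript
// Hypothetical, simplified input shape modeled on htmlparser2's DOM output.
// This is an assumption for illustration, not the library's real typings.
type DomNode =
  | { type: "root"; children: DomNode[] }
  | { type: "tag"; name: string; attribs: Record<string, string>; children: DomNode[] }
  | { type: "text"; data: string }
  | { type: "comment"; data: string };

// Minimal subset of hast nodes: root, element, text, comment.
type HastNode =
  | { type: "root"; children: HastNode[] }
  | { type: "element"; tagName: string; properties: Record<string, string>; children: HastNode[] }
  | { type: "text"; value: string }
  | { type: "comment"; value: string };

// Recursively map each DOM-like node to its hast counterpart.
function toHast(node: DomNode): HastNode {
  switch (node.type) {
    case "root":
      return { type: "root", children: node.children.map((child) => toHast(child)) };
    case "tag":
      return {
        type: "element",
        tagName: node.name,
        // Simplification: attributes are copied verbatim. hast proper uses
        // DOM property names (for example `className` instead of `class`),
        // which this sketch does not attempt.
        properties: { ...node.attribs },
        children: node.children.map((child) => toHast(child)),
      };
    case "text":
      return { type: "text", value: node.data };
    case "comment":
      return { type: "comment", value: node.data };
  }
}

// Usage: convert a small hand-built tree and print the result.
const sample: DomNode = {
  type: "root",
  children: [
    {
      type: "tag",
      name: "p",
      attribs: { id: "intro" },
      children: [{ type: "text", data: "Hello" }],
    },
  ],
};

console.log(JSON.stringify(toHast(sample), null, 2));
```

A real conversion would also need to handle doctypes, positional info, and the attribute-to-property mapping, none of which this sketch covers.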

Note that HAST is for HTML. htmlparser2 is not HTML, it’s XML-like. So maybe a proper XML parser / XML AST would make sense?

> So maybe a proper XML parser / XML AST would make sense?

Yes, XML-like 😄

I'll close this. Right now, I have no interest in an XML spec anymore. I could achieve my goals with some minor tweaks to the parse5 parser, but as soon as that changes I will open an issue. Thanks.

commented

The biggest thing with parse5/DOM is around the <html> element, and the whitespace it removes there. I don’t think most other cases really matter. But yeah, an XML parser is interesting, but I’m wondering how well that works for HTML actually.

Hi @wooorm, would you accept a PR to list my parser and plugins for the hast format?

commented

Yes I would, but they currently don’t have a lot of docs, could you add stuff like examples, api docs, and a license first?

> Yes I would, but they currently don’t have a lot of docs, could you add stuff like examples, api docs, and a license first?

Good point. The license is MIT; it's a monorepo.

commented

> Good point. The license is MIT; it's a monorepo.

It’s good practice to still mention it in the repo, as otherwise that’s not easily visible on npm, or in node_modules!

@wooorm done, please check. After that, I will create a PR.

commented

Nice! Please do. Where should these be added? The plugin here, and rehype-webparser in rehype/doc/plugins perhaps?

P.S. would love an example of the features of webparser in the usage section, and how do the errors / rootnodes look?