syntax-tree / hast

Hypertext Abstract Syntax Tree format

Home Page:https://unifiedjs.com

Remove ancient nodes

wooorm opened this issue

htmlparser2 supports a weird mix of XML and HTML, including processing instructions and directives. The HTML standard does not: processing instructions, directives (other than doctypes), and cdata are not supported in HTML.

There’s a new branch up for rehype which uses parse5, a standards-compliant HTML parser. It doesn’t support processing instructions or cdata, so I’m going to remove support for those. Plus, I’m replacing the directive node with the one allowed directive: a doctype node.
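
To illustrate what dropping those nodes could look like on an existing tree, here is a minimal sketch. The node type names (`processingInstruction`, `cdata`, `directive`), the `TreeNode` shape, and the `removeUnsupported` helper are assumptions made for this example, not confirmed hast or rehype API:

```typescript
// Hypothetical minimal node shape; real hast nodes carry more fields.
interface TreeNode {
  type: string;
  children?: TreeNode[];
  [key: string]: unknown;
}

// Assumed names for the node types that are being removed.
const UNSUPPORTED_TYPES = new Set(["processingInstruction", "cdata", "directive"]);

// Return a copy of the tree without unsupported nodes, at any depth.
function removeUnsupported(node: TreeNode): TreeNode {
  if (!node.children) return node;
  const children = node.children
    .filter((child) => !UNSUPPORTED_TYPES.has(child.type))
    .map(removeUnsupported);
  return { ...node, children };
}
```

The sketch copies nodes rather than mutating them, so the input tree stays usable for comparison.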

As doctypes have a name, public identifier, and system identifier, maybe those should be supported on the interface?

interface Doctype <: Node {
  type: "doctype";
  name: string?;
  public: string?;
  system: string?;
}

Where:

<!DOCTYPE html>

Yields:

{
  "type": "doctype",
  "name": "html",
  "public": null,
  "system": null
}
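
As a rough sketch of how a parser's output might be mapped onto the proposed interface: the `ParsedDoctype` shape (with `name`, `publicId`, and `systemId` as possibly empty strings) and the `toDoctype` helper are assumptions for illustration, loosely modelled on what an HTML parser such as parse5 reports for a document type.

```typescript
// The proposed node, written as a TypeScript interface.
interface Doctype {
  type: "doctype";
  name: string | null;
  public: string | null;
  system: string | null;
}

// Assumed parser output: missing identifiers arrive as empty strings.
interface ParsedDoctype {
  name: string;
  publicId: string;
  systemId: string;
}

// Map empty strings to null so absent fields match the proposal.
function toDoctype(parsed: ParsedDoctype): Doctype {
  return {
    type: "doctype",
    name: parsed.name || null,
    public: parsed.publicId || null,
    system: parsed.systemId || null,
  };
}
```

Under these assumptions, `toDoctype({ name: "html", publicId: "", systemId: "" })` produces the node shown above for `<!DOCTYPE html>`.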