mozilla / readability

A standalone version of the readability lib

H1 is converted into H2?

yagudaev opened this issue

Hi Mozilla team, thanks so much for this amazing library!

I was surprised to see an H1 converted into an H2, like so:

[Screenshot: CleanShot 2024-04-26 at 14 24 43@2x]

Is there a way to turn this off?

Here is a live demo

Here is a quick workaround using classes: preserve an "h1" class through Readability, then rename the matching headings afterwards:

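    // Assumes each original <h1> carries class="h1". Readability rewrites the
    // tag to <h2>, and the class survives via classesToPreserve.
    const parser = new DOMParser();
    const doc = parser.parseFromString(value, "text/html");
    const readability = new Readability(doc, {
      classesToPreserve: ["h1"], // the option is documented as an array
    }).parse().content;
    const readabilityDoc = parser.parseFromString(readability, "text/html");
    // Turn every preserved ".h1" heading back into an <h1>,
    // rewriting both the opening and the closing tag.
    readabilityDoc.querySelectorAll(".h1").forEach((heading) => {
      heading.outerHTML = heading.outerHTML
        .replace(/^<h2/, "<h1")
        .replace(/<\/h2>$/, "</h1>");
    });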
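
If a second DOMParser pass is not wanted (or not available), the same renaming could probably be done on the returned content string instead. This is only a rough sketch: restoreMarkedHeadings is a hypothetical helper, not part of Readability, and it assumes the markup is serialized HTML with double-quoted attributes and that the marker is a plain class name such as "h1". A regular expression is not a general HTML parser, so treat it as a convenience for this one narrow case.

```javascript
// Hypothetical helper (not part of Readability): rewrites
// <h2 class="... h1 ..."> back to <h1> in a serialized HTML string.
// Assumes double-quoted attributes and a plain class name as marker.
function restoreMarkedHeadings(html, marker = "h1") {
  const pattern = new RegExp(
    // group 1: the attribute list, which must contain the marker as a whole class
    // group 2: the heading's inner HTML (headings cannot nest, so lazy is enough)
    `<h2(\\s[^>]*?class="(?:[^"]*\\s)?${marker}(?:\\s[^"]*)?"[^>]*)>([\\s\\S]*?)</h2>`,
    "g"
  );
  return html.replace(pattern, "<h1$1>$2</h1>");
}
```

It would be called on the string returned by parse().content, after passing the marker class in classesToPreserve. Unmarked H2 elements are left untouched.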

See: Demo