googleworkspace / google-docs-hast

Converts the JSON representation of a Google Docs document into an HTML abstract syntax tree (HAST).

What is replacing soft line breaks with ensp?

bwklein opened this issue

In the tree output from the toHast function, there is a string like this:

lineBreak\u000bThis line contains a shift-enter soft line-break here\u000bThis bit is after the line break.

This string contains the "U+000B Line Tabulation" control character.

After running the tree through the following code, this character is replaced by an en space (&ensp;) in the HTML output:

import { unified } from "unified";
import rehypeStringify from "rehype-stringify";

let html = unified()
  .use(rehypeStringify, { collapseEmptyAttributes: true })
  .stringify(tree);

Do you know how I can preserve this control character, or replace it with its HTML hex character reference?
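
One possible workaround, offered as an assumption rather than a confirmed fix, is to split the text nodes at U+000B before stringifying and insert br elements in their place, so the soft line break no longer depends on how the serializer treats the control character. The helper name splitSoftBreaks is hypothetical and is not part of google-docs-hast or rehype. The sketch only assumes the tree follows the usual hast shape: text nodes carry a value, and parent nodes carry a children array.

```javascript
// Hypothetical helper (not part of google-docs-hast or rehype).
// Walks a hast-like tree in place and splits every text node at
// U+000B (Line Tabulation) into text nodes separated by <br> elements.
const VERTICAL_TAB = "\u000b";

function splitSoftBreaks(node) {
  // Leaf nodes (text, comment, ...) have no children to process.
  if (!Array.isArray(node.children)) return node;

  const children = [];
  for (const child of node.children) {
    if (child.type === "text" && child.value.includes(VERTICAL_TAB)) {
      child.value.split(VERTICAL_TAB).forEach((part, index) => {
        // Every boundary between two parts was a soft line break.
        if (index > 0) {
          children.push({
            type: "element",
            tagName: "br",
            properties: {},
            children: [],
          });
        }
        // Skip empty fragments so no empty text nodes are produced.
        if (part !== "") children.push({ type: "text", value: part });
      });
    } else {
      children.push(splitSoftBreaks(child));
    }
  }
  node.children = children;
  return node;
}

// Usage on a small hand-built tree (assumed shape, for illustration only).
const exampleTree = {
  type: "root",
  children: [
    {
      type: "element",
      tagName: "p",
      properties: {},
      children: [{ type: "text", value: "before\u000bafter" }],
    },
  ],
};
splitSoftBreaks(exampleTree);
// The <p> now holds: text "before", element <br>, text "after".
```

The modified tree can then be passed to the same .stringify(tree) call as before. Whether a br element is an acceptable substitute for the original character depends on what the output is used for.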