WebReflection / uhtml

A micro HTML/SVG render

How to parse HTML string content to Hole?

loganvolkers opened this issue

I need a solution for virtual dom diffing when the HTML is provided as a string.

Background

For some background, I am working on an alternative to GrapesJS that solves its longstanding XSS vulnerability in the live preview. To do this, I need an iframe that can accept HTML content over the postMessage API and display it. Since postMessage is a string-based API, I am unable to send rich components to be rendered in the iframe. The current solution relies on setting document.body.innerHTML = contentFromPostMessage to render the content, but this discards state when using stateful elements such as <details> tags.

For this situation I have previously used html-react-parser, but I am dissatisfied with its lack of support for full HTML, including template, style and comment tags.

Other virtual dom libraries have companion libraries for parsing from HTML, such as html2hscript for virtual-dom and toVNode in snabbdom. Solutions that don't use virtual dom, such as morphdom, do not work as they are too eager in updating the dom.

Is there a utility for uhtml to transform raw HTML into a "Hole"?

Parsing HTML is just a matter of using document.createElement("template").innerHTML = content, but there doesn't seem to be a way to turn an HTML element (or document fragment) into a Hole.

To start with, there's no virtual DOM in here, it's all based on udomdiff.

Regarding the Hole, it's exported, so you can play around with it, but you don't really need to ... example:

import {render, html as $html, svg as $svg} from 'uhtml';

const templates = new Map;

// handle explicit html("<p>content</p>"), preserving
// the usage as regular tag`...`
const create = tag => (template, ...values) => (
  typeof template === 'string' ?
    tag(templates.get(template) || templates.set(template, [template]).get(template)) :
    tag(template, ...values)
);

const html = create($html);
const svg = create($svg);

// if used as intermediate module/helper
export {render, html, svg};

If you have a finite number of chunks this should work without issues, but in very long sessions with very dynamic content this solution tends to leak, because the Map keeps every string it has ever seen. However, there's no other way to map each unique template 1:1, like the literal tag function does natively behind the scenes, and at least here parsing happens once per string content, and never again for the same content.
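If the leak is a concern, one option is to cap the cache. The following is only a sketch: `boundedCache` and `templateFor` are hypothetical names, not part of uhtml, and the uhtml tag call is left out so the snippet runs on its own. It relies on Map preserving insertion order to evict the least recently used string. The trade-off, as far as I can tell, is that an evicted string gets a fresh array on its next use, so it would be treated as a brand new template and parsed again.

```javascript
// Hypothetical helper: a size-capped cache relying on Map insertion order.
const boundedCache = (limit = 1000) => {
  const map = new Map();
  return {
    get(key) {
      if (!map.has(key)) return undefined;
      // refresh recency: re-insert the entry as the newest one
      const value = map.get(key);
      map.delete(key);
      map.set(key, value);
      return value;
    },
    set(key, value) {
      map.delete(key);
      map.set(key, value);
      // once over the limit, evict the least recently used entry
      if (map.size > limit)
        map.delete(map.keys().next().value);
      return value;
    },
    get size() { return map.size; }
  };
};

const templates = boundedCache(2);

// same lookup as in the snippet above, minus the uhtml tag call
const templateFor = content =>
  templates.get(content) || templates.set(content, [content]);

const first = templateFor('<p>a</p>');
console.log(templateFor('<p>a</p>') === first); // true
```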

That being said, if you update anything in the passed string, there's no diffing, but you could be smarter there, and post templates and interpolations instead of whole resolved strings with the changes baked in.

The key is to have the template part unique per exact same number of interpolations and exact same static content, so that you can post the html or svg arguments instead, and do something more clever, able to update interpolated values:

import {render, html as $html, svg as $svg} from 'uhtml';

const templates = new Map;

// handle explicit html(...args) in case these are posted/JSON
// example of args: [['a', 'c'], 'b']
const create = tag => (template, ...values) => {
  if ('raw' in template)
    return tag(template, ...values);
  const staticContent = template.join('\x01');
  let cached = templates.get(staticContent);
  if (!cached)
    templates.set(staticContent, cached = template);
  return tag(cached, ...values);
};

const html = create($html);
const svg = create($svg);

// if used as intermediate module/helper
export {render, html, svg};
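To see what the cache buys you, here is the same `create` run against a stand-in tag instead of uhtml. `fakeTag` is hypothetical and only reports which template array it received. Two messages that were deserialized separately carry different arrays, yet they end up sharing the first array seen for that static content, which is what lets the real tag recognize them as the same template.

```javascript
// Stand-in for uhtml's html/svg: it only reports which template array it got.
const fakeTag = (template, ...values) => ({template, values});

const templates = new Map();
const create = tag => (template, ...values) => {
  if ('raw' in template)
    return tag(template, ...values);
  const staticContent = template.join('\x01');
  let cached = templates.get(staticContent);
  if (!cached)
    templates.set(staticContent, cached = template);
  return tag(cached, ...values);
};

const html = create(fakeTag);

// two messages, deserialized separately: different arrays, same static parts
const first = JSON.parse('[["<p>", "</p>"], "one"]');
const second = JSON.parse('[["<p>", "</p>"], "two"]');

const a = html(...first);
const b = html(...second);

console.log(first[0] === second[0]);    // false: distinct arrays on arrival
console.log(a.template === b.template); // true: same cached array
console.log(b.values);                  // [ 'two' ]
```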

It's up to you to find a way to create those arguments and pass them along, but this could be one way (on the sender side, not the iframe receiver one):

// make tag function serializable
const json = (template, ...values) => [template, ...values];

// example
json`a${'b'}c`;
// [[ "a", "c" ], "b"]
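A small check of what survives serialization, using a JSON round trip as a stand-in for postMessage (an assumption: the structured clone used by postMessage should likewise drop the non-enumerable `raw` property). This is why the `'raw' in template` test in `create` can tell a real tagged template call from deserialized arguments.

```javascript
// make tag function serializable
const json = (template, ...values) => [template, ...values];

const sent = json`a${'b'}c`;
const received = JSON.parse(JSON.stringify(sent));

console.log('raw' in sent[0]);     // true: a genuine template strings array
console.log('raw' in received[0]); // false: just a plain array after the trip
console.log(received);             // [ [ 'a', 'c' ], 'b' ]
```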

At that point you have a data structure that always has the same shape to use as template literal arguments, but you need to resolve possibly nested interpolations.

import {render as $render, html as $html, svg as $svg} from 'uhtml';

// the previous code that handles ...args

// from the iframe, render JSON data
const render = (where, json) => {
  return $render(where, unroll(...json));
};

const {isArray} = Array;
const unroll = (template, ...values) => {
  for (let i = 0, {length} = values; i < length; i++) {
    if (isArray(values[i]) && values[i].length > 0 && isArray(values[i][0]))
      values[i] = unroll(...values[i]);
  }
  return html(template, ...values);
};

I hope this makes sense 👋

P.S. since plain objects used as holes are kinda ignored in uhtml, you could use better branding, and more robust checks on render:

// sender side: these tags only describe the template, they render nothing
const html = (template, ...values) => ({type: 'html', template, values});
const svg = (template, ...values) => ({type: 'svg', template, values});

postMessage(html`<h1>Hello ${html`<strong>World</strong>`}</h1>`);

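// receiver side: the render part is easier this way
// (html and svg below are the rendering tags from the previous
// snippets, not the branding helpers used by the sender)
const render = (where, json) => {
  return $render(where, unroll(json));
};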

const unroll = ({type, template, values}) => {
  for (let i = 0, {length} = values; i < length; i++) {
    if (
      typeof values[i] === 'object' &&
      values[i] !== null &&
      'type' in values[i] &&
      'template' in values[i] &&
      'values' in values[i]
    )
      values[i] = unroll(values[i]);
  }
  return (type === 'html' ? html : svg)(template, ...values);
};
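Here is the branded round trip end to end, as a sketch that runs without uhtml. The sender and receiver would normally live in different files, so the names are disambiguated here: `html`/`svg` are the sender's branding tags, while `$html`/`$svg` are hypothetical stand-ins for the receiver's rendering tags that simply concatenate to a string. `isBranded` is also a made-up name, and this `unroll` maps over the values instead of mutating them in place.

```javascript
// sender side: tags that only describe the template
const html = (template, ...values) => ({type: 'html', template, values});
const svg = (template, ...values) => ({type: 'svg', template, values});

// receiver side stand-ins for the rendering tags: plain string concatenation
const concat = (template, ...values) =>
  template.reduce((out, chunk, i) => out + values[i - 1] + chunk);
const $html = concat;
const $svg = concat;

// same branding check as above, extracted into a helper
const isBranded = value =>
  typeof value === 'object' && value !== null &&
  'type' in value && 'template' in value && 'values' in value;

// non-mutating variant of unroll: nested branded values are resolved first
const unroll = ({type, template, values}) =>
  (type === 'html' ? $html : $svg)(
    template,
    ...values.map(value => isBranded(value) ? unroll(value) : value)
  );

// JSON round trip standing in for postMessage
const message = JSON.parse(JSON.stringify(
  html`<h1>Hello ${html`<strong>World</strong>`}</h1>`
));

console.log(unroll(message)); // <h1>Hello <strong>World</strong></h1>
```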

@loganvolkers I've quickly added a uhtml/json export. It exports render, html, and svg, and it handles all of the above automatically. From the sender, you'll use:

postMessage(html.json`<h1>This is some ${'content'}</h1>`);

From the receiver you have:

addEventListener('message', ({data}) => {
  render(document.body, data);
});

The render works regularly with html or svg too, but when it comes to JSON, events/listeners don't survive the postMessage dance, so you need to orchestrate some sort of "hydration" after JSON renders.

I hope this helps.

P.S. test here

Thanks for the update, I'll try it out.

P.S. For event handlers, things are further complicated by the iframe boundary: the iframe needs to send responses back, serialized over postMessage. I have an implementation that adds an event listener to the body and has unique IDs added to every element. It works for my simple case, but doesn't cover all possible event situations.
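For readers following along, the bookkeeping behind such a scheme could look roughly like this. It is not the implementation mentioned above, only a sketch with hypothetical names (`register`, `dispatch`, `toMessage`) and a made-up message shape. Handlers never cross the boundary, only their ids do, and the DOM part (finding the id on the closest annotated element from a single listener on the body) is left out.

```javascript
// Parent side: handlers stay here, only their ids cross the boundary.
let counter = 0;
const handlers = new Map();

// register a handler and get back the id to put on the element
const register = handler => {
  const id = `h${counter++}`;
  handlers.set(id, handler);
  return id;
};

// Parent side: called with the data of a message posted back by the iframe.
const dispatch = ({action, id, type}) => {
  if (action !== 'event' || !handlers.has(id)) return false;
  handlers.get(id)(type);
  return true;
};

// Iframe side: what the delegated body listener would post back,
// given the id found on the element (DOM lookup omitted here).
const toMessage = (id, type) => ({action: 'event', id, type});

const clicks = [];
const id = register(type => clicks.push(type));
dispatch(toMessage(id, 'click'));
console.log(clicks); // [ 'click' ]
```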

@loganvolkers I've updated the pen quite a bit, this is how it looks now:

import {render, html} from '//cdn.skypack.dev/uhtml/json';

addEventListener('message', ({
  data: {action, where, content}
}) => {
  if (action === 'render')
    render(document.querySelector(where), content);
});

const input = value => html.json`<input value=${value} />`;
const greetings = who => html.json`
  <h1>Hello ${who} 👋</h1>
  <p>This demo simply shows uhtml via ${input('JSON')}</p>
  <p>Posted @ ${new Date().toISOString()}</p>
`;

setInterval(
  () => {
    postMessage({
      action: 'render',
      where: 'body',
      content: greetings('There')
    });
  },
  1000
);

Regarding the string to template transformation though, you could also generate templates and arguments via tag-params.

You need to surround the dynamic parts via ${...} and it's not "bullet proof" if it finds those chars where it shouldn't (user input, etc), but maybe a combination of these techniques will take you far.
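To illustrate the idea only, and not how tag-params itself works, here is a hand-rolled sketch. `fromString` is a hypothetical helper that splits a string on `${name}` markers and resolves each name from a data object, producing the same `[template, ...values]` shape discussed earlier. It also shows the weakness: any `${...}` that sneaks in through foreign content becomes a hole too.

```javascript
// Hypothetical helper, not tag-params: split on ${name} and look names up in data.
const fromString = (content, data) => {
  const template = [];
  const values = [];
  const re = /\$\{\s*(\w+)\s*\}/g;
  let last = 0;
  let match;
  while ((match = re.exec(content))) {
    // static chunk before the marker, then the resolved value
    template.push(content.slice(last, match.index));
    values.push(data[match[1]]);
    last = re.lastIndex;
  }
  // trailing static chunk: there is always one more chunk than values
  template.push(content.slice(last));
  return [template, ...values];
};

const args = fromString(
  '<p class="${name}">${text}</p>',
  {name: 'a', text: 'b'}
);
console.log(args); // [ [ '<p class="', '">', '</p>' ], 'a', 'b' ]
```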

Regarding the unique ID, yes, that's the way to go ;-)

Thanks this is great. Having something bullet proof is the dream, so I'll stress test this across a bunch of html content.

If you build the string on your own, be sure that foreign content, not meant to be dynamic, has $ replaced with &#36;, so that no ambiguous parsing happens:

const makeItSafe = content => content.replace(/\$/g, `&#${'$'.charCodeAt(0)};`);

const dynamic = 'class-name';
const layout = `<div class="${dynamic}">${makeItSafe('${nope}')}</div>`;

or something like that 👋