syntax-tree / hast

Hypertext Abstract Syntax Tree format

Home Page: https://unifiedjs.com

Template tag behaviour

StarpTech opened this issue

Hi, why is <template> not handled like every other element? Currently we always have to handle it explicitly, which results in code like this:

minify(tree)
visitor(ast, (node) => {
	if (node.tagName === 'template') {
		minify(node.content)
		...
	}
})

proposal

  1. Handle it like any other element
commented

@StarpTech Because you don’t always want to walk into it (other interesting cases: noscript, svg, and math, though they don’t currently expose content).
The browser sees it as a separate fragment that, if it were placed directly in the document, could be invalid and mess up the document.

I don't understand why we can't handle it as one tree. We aren't browsers. At least for traversing, it's a huge mess; look at the example above, or isn't it?!

commented

Template is not any other element. You wouldn’t want to accidentally select elements inside it. Therefore it’s opt-in.

We aren't browsers.

hast / rehype are not for xml either. They’re for html. Close to the DOM in fact.

At least for traversing, it's a huge mess; look at the example above, or isn't it?!

Huge mess is an exaggeration. And not nice.
Your example is also incomplete or invalid. The <element> element is deprecated and not supported, so I’m not sure why you’re checking for it?

Huge mess is an exaggeration. And not nice.

Sorry about that, this wasn't my intention. I mean that from a developer's perspective it's not nice, because I need to change the pattern of my visitor function. In my opinion, all elements should be handled in the same way.

hast / rehype are not for xml either. They’re for html. Close to the DOM in fact.

What do you mean? I just want to traverse the tree in a consistent way, without having to care about browser-specific rendering constraints. If you don't want to handle it as the same document, you could work with a flag like node.template in your visitor function.

Your example is also incomplete or invalid. The <element> element is deprecated and not supported, so I’m not sure why you’re checking for it?

It wasn't invalid; I'm talking about the <template> tag. It was just a typo 😄

It's possible that I'm not fully aware of what the <template> tag really is, but from my perspective as a HAST user it's just another node. I'm using HAST to traverse HTML in a consistent and uniform structure, but this breaks it. I suggest that a template tag should have to be excluded explicitly in order to ignore it. A new template property on a HAST node could solve it.

interface Element <: Parent {
  type: "element";
  tagName: string;
  properties: Properties;
  isTemplate: boolean?;
}
commented

Sorry about that, this wasn't my intention.

Thank you. It’s easy to come across not-very-nice online. I do know I come across as a dick sometimes 😛

I mean that from a developer's perspective it's not nice, because I need to change the pattern of my visitor function. In my opinion, all elements should be handled in the same way.

That really depends, I think. I’d suggest checking the MDN docs on it then! It’s a fragment, it’s unfinished content. It’s not part of the document!

So the thing is as follows: if we’d use children, we could break plugins that aren’t aware of the difference: opt-out. But by using content, it becomes opt-in.
So a new prop wouldn’t change that. I think it should be opt-in!
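
The opt-in point can be illustrated with a small sketch. It assumes the tree shape described in this thread: a template element with empty children and a separate root in content. The walk function and its includeTemplates option are hypothetical names made up for this illustration, not part of any unist-util-* package.

```javascript
// Hypothetical helper, not a unist-util-* API: walks `children` and, only
// when asked, also walks the `content` root of <template> elements.
function walk(node, visitor, options = {}) {
  visitor(node)
  for (const child of node.children || []) {
    walk(child, visitor, options)
  }
  const isTemplate = node.type === 'element' && node.tagName === 'template'
  if (options.includeTemplates && isTemplate && node.content) {
    walk(node.content, visitor, options)
  }
}

// Assumed tree for: <div><template><p>hi</p></template></div>
const tree = {
  type: 'root',
  children: [{
    type: 'element', tagName: 'div', properties: {},
    children: [{
      type: 'element', tagName: 'template', properties: {}, children: [],
      content: {
        type: 'root',
        children: [{
          type: 'element', tagName: 'p', properties: {},
          children: [{type: 'text', value: 'hi'}]
        }]
      }
    }]
  }]
}

// Collects the tag names of every element the walker reaches.
function tagNames(options) {
  const names = []
  walk(tree, (node) => {
    if (node.type === 'element') names.push(node.tagName)
  }, options)
  return names
}

const defaultNames = tagNames() // template content is skipped
const optInNames = tagNames({includeTemplates: true}) // template content is visited
```

With this shape, a plugin that only knows about children never reaches the p element; a plugin has to ask for it explicitly.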

Thank you. It’s easy to come across not-very-nice online. I do know I come across as a dick sometimes 😛

🤣 good attitude

So the thing is as follows: if we’d use children, we could break plugins that aren’t aware of the difference: opt-out. But by using content, it becomes opt-in.
So a new prop wouldn’t change that. I think it should be opt-in!

It could solve it but it would be a major change. Correct me if I'm wrong, but HAST should make it easy to traverse and transform HTML, right? Do you really like the fact that a user has to treat different nodes differently in order to traverse all nodes? For a browser it's completely understandable, because it has to scope the content, but for a parser with those goals? I think it should be opt-out, and it should be up to the developer whether to ignore some elements. With that change, all your helper packages like unist-* would work the same way for any HTML document. I think this is much more developer friendly.

commented

It could solve it but it would be a major change.

Changing HAST would be bigger!

Correct me if I'm wrong, but HAST should make it easy to traverse and transform HTML, right? Do you really like the fact that a user has to treat different nodes differently in order to traverse all nodes? For a browser it's completely understandable, because it has to scope the content, but for a parser with those goals? I think it should be opt-out, and it should be up to the developer whether to ignore some elements. With that change, all your helper packages like unist-* would work the same way for any HTML document. I think this is much more developer friendly.

I honestly think it should be opt-in. Sure, it can be documented better. But yes, I think that a developer building plugins should make these decisions, and that the default should be sensible (aka, don’t break stuff accidentally), which is how it currently works.

It is also more “correct”. Say you start a document with a template, and it includes an element with an ID. Later, after the template, there’s another element with that ID. Selecting the element corresponding to that ID should return the latter one. Not the first.
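
A minimal sketch of that ID case, assuming the same tree shape (template content in content, not in children). The findById helper is hypothetical and written only for this illustration; it is not an existing hast utility.

```javascript
// Hypothetical helper: returns the first element with the given id,
// following `children` only, so <template> content is never searched.
function findById(node, id) {
  if (node.type === 'element' && node.properties && node.properties.id === id) {
    return node
  }
  for (const child of node.children || []) {
    const found = findById(child, id)
    if (found) return found
  }
  return undefined
}

// Assumed tree for:
// <template><p id="x">in template</p></template><p id="x">in document</p>
const tree = {
  type: 'root',
  children: [
    {
      type: 'element', tagName: 'template', properties: {}, children: [],
      content: {
        type: 'root',
        children: [{
          type: 'element', tagName: 'p', properties: {id: 'x'},
          children: [{type: 'text', value: 'in template'}]
        }]
      }
    },
    {
      type: 'element', tagName: 'p', properties: {id: 'x'},
      children: [{type: 'text', value: 'in document'}]
    }
  ]
}

const match = findById(tree, 'x')
const text = match.children[0].value // 'in document'
```

If the template's nodes lived in children instead, the same naive lookup would return the element inside the template first.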

Hi, I thought it would be very easy to change my code to support template tags, but it is very hard because I lose the parent context. For example:

I use the rehype-format package. It relies heavily on a node's parents, but this state is lost as soon as it enters a template tag because it's another document. I think we need better tooling to support working with template tags in rehype. Sub-documents are very common today.

Try to format nested template tags and you will see it's very hard to implement a solution as simple as the ones you show in all of your examples. The indentation information must be stored elsewhere, and child nodes have to be handled differently because they can contain template tags.

You explained the reasons very well, but from the point of view of a developer who just wants to work with it in a convenient way, it's not easy.

Hi @wooorm,

Currently, I have to maintain some forks of your projects in order to deal with attributes in a less strict way. That situation has improved a lot, but right now I still have the problem of working with <template> nodes. A <template> node creates an additional #document-fragment container and does not expose its children through children but through content.

Do you have an idea how to deal with that in your util packages? My project is based on your https://github.com/rehypejs/rehype-format project, which uses parents.length to check the indentation depth, but this information is broken since a template node is handled as a new document.

Are you open to an optional flag to handle a template node like any other element? E.g. in:

  • hastscript
  • hast-util-from-parse5
  • hast-util-to-html

Purpose: formatters and analysers that don't need such scoping context but do need much easier parsing. Problem: simple template nodes or nested template nodes.

commented

I don’t think that’s a good solution. You’re using visit-parents now, which gives you a stack of parent nodes, and that would still work, except that the whole subtree should be indented by an additional offset.

To deal with templates, you need to visit them as well. For the offset, you could create a visitFactory. Something like this:

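// `visit` is assumed to be unist-util-visit-parents; this runs inside a transformer.
visit(root, factory(0))

return root

function factory(offset) {
  return visitor

  function visitor(node, ancestors) {
    if (node.type === 'element' && node.tagName === 'template') {
      // Template content is a separate root, so pass the accumulated depth on.
      visit(node.content, factory(offset + ancestors.length))
    } else {
      // Effective depth here is `offset + ancestors.length`.
      // ...
    }
  }
}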
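
For readers who want to try the offset idea without installing anything, here is a self-contained sketch. It is an illustration under stated assumptions, not the rehype-format implementation: visitWithAncestors is a hypothetical stand-in for a parent-aware visitor, the report callback is made up for the example, and the offset is accumulated across nesting levels so that nested templates keep the outer depth.

```javascript
// Hypothetical stand-in for a parent-aware visitor: calls
// `visitor(node, ancestors)` for every node reachable through `children`.
function visitWithAncestors(node, visitor, ancestors = []) {
  visitor(node, ancestors)
  for (const child of node.children || []) {
    visitWithAncestors(child, visitor, ancestors.concat(node))
  }
}

// Builds a visitor that reports element depths shifted by `offset`.
function factory(offset, report) {
  return function visitor(node, ancestors) {
    const depth = offset + ancestors.length
    if (node.type === 'element') report(node.tagName, depth)
    if (node.type === 'element' && node.tagName === 'template' && node.content) {
      // The content root restarts the ancestor stack, so pass the depth on.
      visitWithAncestors(node.content, factory(depth, report))
    }
  }
}

// Small constructors for the assumed hast shapes.
const el = (tagName, children = []) =>
  ({type: 'element', tagName, properties: {}, children})
const template = (children) =>
  ({...el('template'), content: {type: 'root', children}})

// Assumed tree for:
// <div><template><p><template><span></span></template></p></template></div>
const tree = {
  type: 'root',
  children: [el('div', [template([el('p', [template([el('span')])])])])]
}

const depths = []
visitWithAncestors(tree, factory(0, (name, depth) => depths.push(name + ':' + depth)))
```

Each template adds the depth at which it was found to everything inside its content, so an indenter built on this would not need a flag that flattens templates into children.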

Thanks for the feedback. I will think about it.