DavidAnson / markdownlint

A Node.js style checker and lint tool for Markdown/CommonMark files.

Enhance rule MD033 to be more specific/targeted

mvilrokx opened this issue · comments

Describe the Enhancement:

I want to be able to narrow the scope of allowable HTML tags beyond just the tag name.

Impacted Rules:

MD033

Describe the Need:

I want to use rule MD033/no-inline-html, but there are genuine reasons to prefer some HTML tags over plain Markdown in certain cases. For example, I only want to allow the <img> tag if it has a style, width or height attribute. Put another way: I do not want to allow the <img> tag if its only attributes are src and/or alt.

The reason behind this is that an <img> tag that has just a src or alt attribute can be perfectly described in plain Markdown:

<img src="./picture.png" alt="a picture" />

becomes

![a picture](./picture.png)

If it has any other attribute, it cannot be converted to Markdown syntax without affecting the way it gets rendered:

<img src="./picture.png" alt="a picture" height="100px" width="100px" />
<img src="./picture.png" alt="a picture" style="display: block; margin-left: auto; margin-right: auto; float: none;" />

I would therefore consider these acceptable uses of the <img> tag.
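The distinction drawn above can be sketched as a small check. This is only an illustration: attributeNames and isPlainImage are hypothetical helpers, not part of markdownlint, and the regular expression is a simplification that assumes a single well-formed tag.

```javascript
// Hypothetical helper (not part of markdownlint): attribute names of one HTML tag.
function attributeNames(tag) {
  // Drop the leading "<img" and the trailing ">" or "/>".
  const inner = tag.replace(/^<\s*[A-Za-z][\w-]*/, "").replace(/\/?\s*>\s*$/, "");
  // A name, optionally followed by "=" and a quoted or unquoted value.
  const attribute = /([^\s"'<>\/=]+)(?:\s*=\s*(?:"[^"]*"|'[^']*'|[^\s"'=<>]+))?/g;
  return Array.from(inner.matchAll(attribute), (match) => match[1].toLowerCase());
}

// True when an <img> tag carries nothing but src and/or alt,
// i.e. when plain Markdown image syntax could express it.
function isPlainImage(tag) {
  if (!/^<\s*img\b/i.test(tag)) return false;
  const names = attributeNames(tag);
  return names.length > 0 && names.every((name) => name === "src" || name === "alt");
}

console.log(isPlainImage('<img src="./picture.png" alt="a picture" />')); // true
console.log(isPlainImage('<img src="./picture.png" alt="a picture" height="100px" width="100px" />')); // false
```

Under this sketch, only the first kind of tag would be reported; any tag with an extra attribute would be left alone.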

The problem is that the rule only allows exceptions at the "tag" level, e.g. "allowed_elements": ["img"]. I want to be able to narrow the scope by adding exceptions to this.

Current Alternative:

Set "allowed_elements": ["img"], but this will allow all <img> tags whereas I only want to allow some.

Can We Help You Implement This?:

Yes, but I'd first like to get some input if this is deemed useful.

If it is, we need to discuss how this would be configured. Some examples:

Original:

{
  "MD033": {
    "allowed_elements": [
      "img"
    ]
  }
}

Enhancement:

{
  "MD033": {
    "allowed_elements": [{
        "img": {
          "exclusive_attributes": ["src", "alt"]
        }
    }]
  }
}

(sorry about the name, suggestions are welcome :-))

This would allow the <img> tag, except when its only attributes are src, alt, or both.
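To make the intended semantics concrete, here is a minimal sketch of how such a setting might be evaluated. The exclusive_attributes option is only the proposal from this issue and does not exist in markdownlint; attributeNames, elementName and isAllowedByExclusive are hypothetical names, and treating a tag with no attributes at all as allowed is an assumption taken from a literal reading of the sentence above.

```javascript
// Hypothetical helpers (not part of markdownlint).
function attributeNames(tag) {
  const inner = tag.replace(/^<\s*[A-Za-z][\w-]*/, "").replace(/\/?\s*>\s*$/, "");
  const attribute = /([^\s"'<>\/=]+)(?:\s*=\s*(?:"[^"]*"|'[^']*'|[^\s"'=<>]+))?/g;
  return Array.from(inner.matchAll(attribute), (match) => match[1].toLowerCase());
}

function elementName(tag) {
  const match = /^<\s*([A-Za-z][\w-]*)/.exec(tag);
  return match ? match[1].toLowerCase() : null;
}

// Sketch of the proposed "exclusive_attributes" semantics.
function isAllowedByExclusive(tag, allowedElements) {
  const name = elementName(tag);
  for (const entry of allowedElements) {
    if (typeof entry === "string") {
      // Plain strings behave like the original config: the element is always allowed.
      if (entry.toLowerCase() === name) return true;
      continue;
    }
    if (name === null || !Object.prototype.hasOwnProperty.call(entry, name)) continue;
    const exclusive = (entry[name].exclusive_attributes || []).map((a) => a.toLowerCase());
    const names = attributeNames(tag);
    // Rejected only when every attribute present is in the exclusive list.
    const onlyExclusive = names.length > 0 && names.every((n) => exclusive.includes(n));
    return !onlyExclusive;
  }
  return false; // Element is not listed at all.
}

const config = {
  MD033: { allowed_elements: [{ img: { exclusive_attributes: ["src", "alt"] } }] }
};
const allowed = config.MD033.allowed_elements;

console.log(isAllowedByExclusive('<img src="./picture.png" alt="a picture" />', allowed)); // false
console.log(isAllowedByExclusive('<img src="./picture.png" alt="a picture" height="100px" />', allowed)); // true
```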

Alternative Enhancement:

{
  "MD033": {
    "allowed_elements": [{
        "img": {
          "allowed_attributes": ["style", "width", "height"]
        }
    }]
  }
}

This would allow the <img> tag, but only if it contains any of "style", "width" or "height" (or any combination).

To be honest, I am not sure how useful the latter suggestion is, since there are many other attributes that could be used on <img> tags that are not convertible to plain Markdown, e.g. title, and I would want to exclude those too. The list in allowed_attributes would then have to contain all possible attributes except src and alt.
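The drawback described above shows up quickly in a sketch of the allow-list variant. As before, allowed_attributes is only the proposal from this issue, attributeNames and isAllowedByAllowList are hypothetical helpers, and the class attribute is picked here purely as an illustration of an attribute missing from the list.

```javascript
// Hypothetical helper (not part of markdownlint): attribute names of one HTML tag.
function attributeNames(tag) {
  const inner = tag.replace(/^<\s*[A-Za-z][\w-]*/, "").replace(/\/?\s*>\s*$/, "");
  const attribute = /([^\s"'<>\/=]+)(?:\s*=\s*(?:"[^"]*"|'[^']*'|[^\s"'=<>]+))?/g;
  return Array.from(inner.matchAll(attribute), (match) => match[1].toLowerCase());
}

// Sketch of the proposed "allowed_attributes" semantics:
// the tag is allowed only if it has at least one listed attribute.
function isAllowedByAllowList(tag, allowedAttributes) {
  const listed = allowedAttributes.map((a) => a.toLowerCase());
  return attributeNames(tag).some((name) => listed.includes(name));
}

const allowedAttributes = ["style", "width", "height"];

console.log(isAllowedByAllowList('<img src="./picture.png" alt="a picture" width="100px" />', allowedAttributes)); // true
console.log(isAllowedByAllowList('<img src="./picture.png" alt="a picture" />', allowedAttributes)); // false
// Rejected as well, because "class" would have to be added to the list by hand:
console.log(isAllowedByAllowList('<img src="./picture.png" alt="a picture" class="logo" />', allowedAttributes)); // false
```

Every attribute that should keep a tag as HTML has to be enumerated, which is the maintenance burden the exclusive variant avoids.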

This seems like a fair amount of complexity for something (HTML in markdown) that is discouraged. Your examples are all for image tags; if that's the only real scenario, it might be easier to accomplish some other way?

Related to #464

This seems like a fair amount of complexity for something (HTML in markdown) that is discouraged.

Agreed. I think the specific use case is that some things just cannot be achieved with Markdown and require HTML (unless I am wrong), e.g. centering an image or controlling its size so it fits properly in the rendered document. New lines inside table cells are another example that, AFAIK, can only be achieved with HTML tables and the <br/> tag.

Maybe those are the only exceptions?

Also, I just realized that the second example is different from the first because it is not an attribute on a tag. It would be a <br/> tag somewhere inside a <table> tag (i.e. only allow HTML tables if they are using a <br/> tag).

If there is a finite list of these exceptions, maybe they can be "hard-coded" somehow. Maybe add a flag to the no-inline-html rule, called strict or something, that is true by default and behaves as it does now; setting it to false would allow certain tags with certain attributes, e.g. the image example I gave. Maybe start with just that and we can add "exceptions" over time?

Your examples are all for image tags; if that's the only real scenario, it might be easier to accomplish some other way?

I'm all ears :-)

Trying to extend MD033 in an abstract way to support these special cases seems difficult – as you outline above by calling out challenges with excluding versus including. On the other hand, if there are just a few special cases, maybe those can be handled specifically. For example, there could be a new rule called "simplify images" (or something) that looks for IMG tags with only SRC and ALT and recommends converting them to Markdown syntax (this would even be auto-fixable). A rule like "breaks inside tables" is something else that could be specifically allowed, probably by MD033 with a parameter to opt in. I'm not sure how many others of these there might be.
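As a rough illustration of the "simplify images" idea, the conversion itself could look something like the sketch below. Nothing here is an existing markdownlint rule or API; parseAttributes and imgToMarkdown are hypothetical names, and the sketch assumes a single well-formed tag whose src and alt values need no escaping in Markdown.

```javascript
// Hypothetical helper (not part of markdownlint): attribute name -> value for one tag.
function parseAttributes(tag) {
  const inner = tag.replace(/^<\s*[A-Za-z][\w-]*/, "").replace(/\/?\s*>\s*$/, "");
  const attribute = /([^\s"'<>\/=]+)(?:\s*=\s*(?:"([^"]*)"|'([^']*)'|([^\s"'=<>]+)))?/g;
  const attributes = new Map();
  for (const match of inner.matchAll(attribute)) {
    // Value comes from whichever quoting style matched; empty if there is none.
    const value = match[2] ?? match[3] ?? match[4] ?? "";
    attributes.set(match[1].toLowerCase(), value);
  }
  return attributes;
}

// Returns the Markdown equivalent of an <img> tag that has only src and alt,
// or null when the tag cannot be converted without losing information.
function imgToMarkdown(tag) {
  if (!/^<\s*img\b/i.test(tag)) return null;
  const attributes = parseAttributes(tag);
  const convertible = attributes.has("src") &&
    [...attributes.keys()].every((name) => name === "src" || name === "alt");
  if (!convertible) return null;
  return `![${attributes.get("alt") ?? ""}](${attributes.get("src")})`;
}

console.log(imgToMarkdown('<img src="./picture.png" alt="a picture" />')); // ![a picture](./picture.png)
console.log(imgToMarkdown('<img src="./picture.png" alt="a picture" height="100px" />')); // null
```

A real fix would also need to escape characters such as brackets or parentheses in the values and to know whether the tag sits inside other HTML, neither of which this sketch attempts.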

and recommends converting them to Markdown syntax (this would even be auto-fixable)

That is literally what I did: some regex-fu in VS Code to find them and then convert them to Markdown images. That is also how I found out that I cannot convert them all.

Maybe we don't even need a new rule for image conversion; it's basically a straight-up fix, like all other fixes, no? I mean, there is zero difference between <img src="./picture.png" alt="a picture" /> and ![a picture](./picture.png), so why not just fix these?

Here's another one: I caught people using <hr/>; those can be converted to ---.

The trick will be that if these tags are used inside other HTML, it needs to stay HTML, otherwise you will break the rendering.

Every fix is done as part of a rule that can be disabled for scenarios where it does not make sense. That said, I'm not sure I want to create a bunch of rules around HTML since my stance is that HTML should be avoided as much as possible.

Sounds reasonable. Feel free to close! Appreciate the discussion.