Make custom attribute rules consistent with custom element name rules

Question

LeaVerou opened this issue 8 years ago · comments

Related WICG discussion: https://discourse.wicg.io/t/relaxing-restrictions-on-custom-attribute-names/1444

Currently, custom attributes need to start with data-. For frameworks with a lot of attributes (Angular, Vue etc), this introduces a serious problem: Either they prefix all attributes with data- and become prone to collisions with other libraries (I've even had two of my own libraries collide!), or they make them extremely verbose (data-ng-*), or they make them non-standard (ng-*, v-*), which is their chosen solution. I'm about to release a library with a lot of attributes and I went for the latter as well. The former two pose serious practical problems, the latter is just conformance.

However, it doesn't have to be this way. Custom elements allow any element name with a hyphen in it, we could do the same for attributes. The cowpaths have been paved: Several very popular libraries follow this practice already. This is not true for proposals like #2250, which introduce a completely new naming scheme.

The main issue with this is all the existing attributes in SVG that come from CSS properties which use hyphens. However, there are several solutions to deal with this:

Exclude these prefixes, or just these names. The SVG working group is dying and these attributes must be manually added to the spec, there's no clause that says all CSS props must automatically be available as attributes.
Only allow prefixes of 1 or 2 letters. This gives us 26*26 + 25 = 701 more prefixes already, and does not clash with any CSS property that is available as an SVG attribute (z-index is the only CSS property that matches this, and it's not an SVG attribute). It also legalizes Angular & Vue's practices.
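The second option can be sketched as a simple name check. This is only an illustration: the thread does not pin down an exact grammar, so the regular expression below (lowercase ASCII, one or two letters, a hyphen, then at least one more character) and the helper name `has_short_prefix` are assumptions.

```python
import re

# Assumed grammar: 1 or 2 ASCII letters, a hyphen, then one or more name characters.
# The thread proposes the idea but does not specify the exact character set.
SHORT_PREFIX_ATTR = re.compile(r"[a-z]{1,2}-[a-z0-9_.-]+")


def has_short_prefix(name: str) -> bool:
    """Return True if `name` fits the sketched 1-or-2-letter prefix rule."""
    return SHORT_PREFIX_ATTR.fullmatch(name.lower()) is not None


# Framework-style names pass; data-* and SVG presentation attributes do not.
for example in ("ng-model", "v-if", "data-foo", "stroke-width", "z-index"):
    print(example, has_short_prefix(example))
```

Note that `z-index` passes this check, which matches the remark above that it is the one CSS property with such a short prefix.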

The more commonplace invalid HTML becomes, the less authors care about authoring valid HTML. Validation becomes pointless in their eyes if they see tons of perfectly good use cases being invalid. Also, if both attributes with and without hyphens are equally invalid, nothing forces developers to stick to any naming scheme. So, I think it would be great if we found a solution for this. And it's a proposal that requires zero effort from implementers, since these attributes already work just fine!

Marat Tanalin · Answer 1 · Wed Jan 18 2017 00:27:06 GMT+0800 (China Standard Time)

I’d be fine with this as long as the pre-hyphen part could be empty, so attributes could have names like -foo, -bar, etc.

Otherwise this does not add much over the existing data- prefix (e. g. da- instead of data-) and is probably too problematic compared with the absolutely issue-free and future-proof underscore/hyphen-prefixed custom attributes.

To be fair, better than nothing anyway though.

Anne van Kesteren · Answer 2 · Wed Jan 18 2017 17:03:00 GMT+0800 (China Standard Time)

I'm supportive of this, but only if we also add an API equivalent to what we added for custom elements. It should be possible for folks to easily observe when such attributes are added, removed, and change in value.

Simon Pieters · Answer 3 · Wed Jan 18 2017 19:10:47 GMT+0800 (China Standard Time)

I’d be fine with this as long as the pre-hyphen part could be empty, so attributes could have names like -foo, -bar, etc.

Starting with a dash is not XML-compatible. Currently the spec requires data-* attribute names to be XML-compatible, and custom element names as well.
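A rough way to see why `-foo` fails while `_foo` passes is to test names against the start-character rule of the XML `Name` production. The sketch below is an ASCII-only approximation (the real production also admits many non-ASCII characters), and it excludes the colon on the understanding that HTML's "XML-compatible" definition does so; the function name is hypothetical.

```python
import re

# ASCII-only approximation of the XML Name production, without the colon.
# Start character: letter or underscore. Later characters also allow digits, "." and "-".
XML_COMPATIBLE_ASCII = re.compile(r"[A-Za-z_][A-Za-z0-9_.-]*")


def is_xml_compatible_ascii(name: str) -> bool:
    """Approximate check; not a full implementation of the XML Name production."""
    return XML_COMPATIBLE_ASCII.fullmatch(name) is not None


for example in ("-foo", "_foo", "ng-model", "1foo"):
    print(example, is_xml_compatible_ascii(example))
```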

Lea Verou · Answer 4 · Wed Jan 18 2017 19:37:00 GMT+0800 (China Standard Time)

Otherwise this does not add much over the existing data- prefix (e. g. da- instead of data-) and is probably too problematic compared with the absolutely issue-free and future-proof underscore/hyphen-prefixed custom attributes.

Clearly, you have not considered collisions between libraries and think everything can have the same prefix and the only problem is how to make the prefix less verbose. I don't blame you, I thought they were an edge case in the past as well, but they absolutely are not. With your proposal, libraries would end up doing things like _ng-*, or (most likely) simply not care and continue using ng-* like they've done for years.

Lea Verou · Answer 5 · Wed Jan 18 2017 19:38:14 GMT+0800 (China Standard Time)

I'm supportive of this, but only if we also add an API equivalent to what we added for custom elements. It should be possible for folks to easily observe when such attributes are added, removed, and change in value.

That would be awesome. So basically, syntactic sugar for MutationObserver?

Anne van Kesteren · Answer 6 · Wed Jan 18 2017 20:31:21 GMT+0800 (China Standard Time)

The problem with MutationObserver for this use case is that you don't know where the attribute is going to be added. So if you want a global custom attribute, you'd have to observe the entire tree and even then you'd miss certain things, such as shadow trees.

Marat Tanalin · Answer 7 · Wed Jan 18 2017 20:54:07 GMT+0800 (China Standard Time)

@LeaVerou

With your proposal, libraries would end up doing things like _ng-*, or (most likely) simply not care and continue using ng-* like they've done for years.

_ was invalid when the design decisions for those libraries were made; that’s most likely why the libraries’ authors decided simply to drop the (only valid at that moment) data- prefix rather than use a generic prefix that would be formally invalid anyway.

As a side note, it’d probably be wrong to assume that the fact that it’s hard for someone who is already a smoker (existing libraries in terms of custom attributes) to leave off smoking is a reason not to try to prevent others (new products and libraries) from starting smoking (provide a valid short unobtrusive generic prefix).

Lea Verou · Answer 8 · Thu Jan 19 2017 03:28:39 GMT+0800 (China Standard Time)

The problem with MutationObserver for this use case is that you don't know where the attribute is going to be added. So if you want a global custom attribute, you'd have to observe the entire tree and even then you'd miss certain things, such as shadow trees.

True, and what you're proposing would solve a HUGE problem and I would cry tears of joy once it gets implemented! I'm just a bit concerned that it requires considerably more implementor effort, so adding it could stall. Whereas just permitting such attribute names at first would let us use them and it's a super easy addition to the spec since it requires no implementation effort.

Anne van Kesteren · Answer 9 · Thu Jan 19 2017 15:15:03 GMT+0800 (China Standard Time)

Fair, I think there is interest to go in this direction once custom elements has shipped. This idea was briefly discussed at the last W3C TPAC. I think the main thing we lack is someone freeing up the time to write the standard. @domenic thoughts?

Ryosuke Niwa · Answer 10 · Thu Jan 19 2017 15:34:00 GMT+0800 (China Standard Time)

I think the fact that custom elements kind of encourage people to add random attributes is already a serious issue, so coming up with some convention for author-defined attributes is a win even if we couldn't add an API for custom attributes yet.

Having said that, we think custom attributes are a much better alternative to the is attribute.

Simon Pieters · Answer 11 · Thu Jan 19 2017 16:58:19 GMT+0800 (China Standard Time)

Pages in httparchive with attributes that start with _ or non-standard attributes containing -:

SELECT * FROM (
SELECT page, url, REGEXP_EXTRACT(LOWER(body), r'(<[a-z][a-z0-9-]*\s+(?:(?:data-|aria-|http-|accept-)?[a-z]+(?:\s*=\s*(?:"[^"]*"|\'[^\']*\'|[^>\s/"\']+\s+)|\s+))*(?:_[a-z]|(?:[b-ce-gj-z]|d[b-z0-9]|a[a-bd-qs-z0-9]|h[a-su-z0-9]|da[a-su-z0-9]|ar[a-hj-z0-9]|ac[a-bd-z0-9]|ht[a-su-z0-9])[a-z0-9]*-)[^>\s]*\s*=[^>]*>)') AS match
FROM [httparchive:har.2017_01_01_chrome_requests_bodies]
)
WHERE page = url
AND match != "null"
AND NOT REGEXP_MATCH(match, r'["\']\s*\+') # exclude JS string concats
AND NOT REGEXP_MATCH(match, r'<(altglyph|animate|circle|clippath|color-profile|cursor|defs|desc|ellipse|feblend|fecolor|fediffuse|fedisplacement|fedistant|feflood|fefunc|fegauss|feimage|femerge|femorph|feoffset|fepoint|fespec|fespot|fetile|feturb|filter|font|foreign|g\s|glyph|hkern|image|line|marker|mask|metadata|missing|mpath|path|pattern|polygon|polyline|radial|rect|set\s|stop|svg|switch|symbol|text\s|textpath|tref|tspan|use\s|view\s|vkern)') # exclude SVG elements

4068 results: https://gist.github.com/zcorpan/b54592e415a2f79f2ef7f79c0c37b2ed

Of those:

26 have _moz_*
531 have an attribute starting with _ (excluding moz prefix).
22 have x-webkit-* or x-ms-* (the HTML spec for a while recommended vendor extensions to be prefixed with x-vendor-).
57 start with x- (excluding webkit/ms prefixes).
2015 have a prefix of 1 or 2 letters and a dash (excluding x-).
1418 have 3+ letters before the dash.

Other things to note:

SVG font-face had x-height and v-alphabetic (etc) attributes. But this element is dead.
The HTML attributes with dash are aria-*, data-*, accept-charset, http-equiv.

I found an instance of a typo of aria -- it would be good if conformance checkers could continue to catch this mistake:

<button area-invalid="true" aria-required="true" aria-controls="checkincontainer" aria-label="checkin" id="checkinbutton" class="checkinbutton">

Simon Pieters · Answer 12 · Thu Jan 19 2017 17:15:45 GMT+0800 (China Standard Time)

For comparison, equivalent query for data-* gives 59,755 results. So data-* is about 15 times more common than non-standard custom attributes (excluding _moz_, x-webkit-, x-ms-).

Simon Pieters · Answer 13 · Thu Jan 19 2017 20:20:36 GMT+0800 (China Standard Time)

So @LeaVerou's proposal is used by ~0.4% of pages in httparchive; @Marat-Tanalin's proposal is used by ~0.1%. data-* is used by ~12.1%. (The data set is 494,956 pages.)
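The percentages can be reproduced from the counts posted earlier in the thread. Mapping the 2015 figure (1 or 2 letter prefix plus dash) to the first proposal and the 531 figure (underscore prefix) to the second is an inference from the tallies, not something stated explicitly.

```python
PAGES = 494_956  # size of the data set quoted above

# Counts taken from the tallies earlier in the thread; the labels are an interpretation.
counts = {
    "1 or 2 letter prefix + dash (excluding x-)": 2015,
    "underscore prefix (excluding _moz_)": 531,
    "data-*": 59_755,
}

for label, count in counts.items():
    print(f"{label}: {100 * count / PAGES:.1f}% of pages")

# The "about 15 times" comparison: data-* versus all non-standard matches
# minus the _moz_ and x-webkit-/x-ms- buckets.
non_standard = 4068 - 26 - 22
print(f"data-* to non-standard ratio: {counts['data-*'] / non_standard:.1f}")
```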

Since the point here is to adopt what people like or use anyway, if we are to do this, it seems most reasonable to me to allow both. But we should disallow _moz_, x-webkit- and x-ms- and 3+ letter prefix followed by dash (to avoid clashes in SVG, and to make it possible to tell if an attribute is a "custom attribute" or not, and to catch typos in aria- or data-), as well as anything not XML-compatible. But no need to restrict the prefix to [a-z], I believe (data-* and custom element names allow other XML-compatible characters).
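Read as a conformance rule, the combination described above might look like the following sketch. Everything here is an assumption layered on the prose: the ASCII-only stand-in for "XML-compatible", the function name, and the choice to treat existing data-* and aria-* names as out of scope (they are handled by their own rules).

```python
import re

# ASCII-only approximation of "XML-compatible"; the real definition is broader.
XML_COMPATIBLE_ASCII = re.compile(r"[A-Za-z_][A-Za-z0-9_.-]*")
DISALLOWED_PREFIXES = ("_moz_", "x-webkit-", "x-ms-")


def is_allowed_custom_attr(name: str) -> bool:
    """Sketch of the combined rule: underscore prefix OR short prefix plus dash."""
    name = name.lower()
    if not XML_COMPATIBLE_ASCII.fullmatch(name):
        return False
    if name.startswith(DISALLOWED_PREFIXES):
        return False
    if name.startswith("_"):
        return len(name) > 1  # underscore scheme, e.g. _foo
    prefix, dash, rest = name.partition("-")
    # Dash scheme: a prefix of 1 or 2 characters followed by a non-empty remainder.
    return bool(dash) and 1 <= len(prefix) <= 2 and bool(rest)


for example in ("ng-model", "_foo", "_moz_dirty", "x-webkit-speech", "area-invalid"):
    print(example, is_allowed_custom_attr(example))
```

Under this reading the `area-invalid` typo is still rejected, because its prefix has more than two letters.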

Domenic Denicola · Answer 14 · Fri Jan 20 2017 00:19:06 GMT+0800 (China Standard Time)

I still feel there's a strong advantage to sticking to a single sanctioned convention (data-) for custom data attributes, at least until we have a processing model for the "custom attributes".

If people want to go against that convention, that's their choice, but we shouldn't give them a free pass; they're making a conscious choice to trade conformance and ecosystem compatibility for convenience.

Marat Tanalin · Answer 15 · Fri Jan 20 2017 00:29:11 GMT+0800 (China Standard Time)

@domenic Sorry, but that’s just a purely theoretical statement totally detached from reality.

As a practicing web developer, I’m quite happy with what we already have currently feature-wise: getAttribute() / setAttribute() / removeAttribute() in JS and [attribute] in selectors.

The only issue here is the artificial validity limitation that could and should be easily removed on spec level. Having (or not) a processing model for custom attributes does not affect the ability to use such attributes right now (to be clear: I’m specifically about _-prefixed attributes that are 100% future-proof).

Lea Verou · Answer 16 · Fri Jan 20 2017 00:38:21 GMT+0800 (China Standard Time)

Thanks for the data @zcorpan!! Very enlightening. I find it surprising that Angular and Vue would only be used by 0.4% of websites. Perhaps a lot of these attributes are added dynamically? Also, I'm not surprised that data- has such a high percentage: Small libraries that only add 1 or 2 attributes can easily use data- and be less worried about either collisions or verbosity. It only takes 1 such library for a page to qualify as having a data- attribute.

It's also an interesting idea to allow both proposals. I don't see any problem with that, flexibility is good!

@domenic Several people have commented about the problems with data-. Developers of popular libraries with many attributes are not using data-. Even those that supported both their own prefix- and a data-prefix- version of each attribute are dropping the latter because nobody is using it, probably because data-prefix- is a verbose abomination. And you resist legalizing anything other than data- because of some theoretical purity argument about "a single sanctioned convention"? What happened to the priority of constituencies? Doesn't author convenience come several levels before theoretical purity?!

Domenic Denicola · Answer 17 · Fri Jan 20 2017 00:45:20 GMT+0800 (China Standard Time)

Several people have commented about the problems with data-. Developers of popular libraries with many attributes are not using data-. Even those that supported both their own prefix- and a data-prefix- version of each attribute are dropping the latter because nobody is using it, probably because data-prefix- is a verbose abomination.

This argument (and I would appreciate if you avoided phrases like "abomination" in reasoned discussion) is based on anecdotes, whereas @zcorpan shows soundly with data that it does not hold in the real world. A small minority of developers using custom attributes are unhappy with data; 15x more are happy with data than are unhappy. They can be vocal, as you are, but saying that this is a widespread problem is just not supported.

And you resist legalizing anything other than data- because of some theoretical purity argument about "a single sanctioned convention"? What happened to the priority of constituencies? Doesn't author convenience come several levels before theoretical purity?!

Sorry, but that’s just a purely theoretical statement totally detached from reality.

I don't think it's helpful or accurate to characterize the argument as one of theoretical purity, or start invoking the priority of constituencies before any such violation is apparent. This is about the practical impact of fracturing the ecosystem into multiple conventions for custom data. That has real impact on tooling, libraries, authors reading other authors' source code, API consistency and predictability (why do some data properties get a dataset API, and others don't?) and much more.

Again, I repeat that there is nothing stopping you from making a conscious choice between conformance and brevity. If you value brevity so much as to start calling data- attributes an abomination, I presume you value it more than conformant documents. That's fine! You can make that choice! As you yourself have noted, there's nothing stopping you. But it doesn't mean the spec should stop trying to keep the ecosystem coherent to the best of its abilities.

Marat Tanalin · Answer 18 · Fri Jan 20 2017 00:56:25 GMT+0800 (China Standard Time)

@domenic

15x more are happy with data than are unhappy.

The obvious reason of prevalence of data--prefixed attributes over other prefixes is that data- is the only formally valid option for now. This has nothing to do with whether people are actually happy with it.

Good web developers usually just prefer to keep their documents valid, and not just because that makes them “feel good”, but also to be able to use validators to more easily see real errors not intermixed with fictitious pseudoerrors related to artificial spec-level limitations not matching reality.

why do some data properties get a dataset API, and others don't?

Because not all custom attributes are data attributes. data- attributes are for data, custom-prefixed attributes are for custom needs whatever those are.

Steve Faulkner · Answer 19 · Fri Jan 20 2017 00:57:51 GMT+0800 (China Standard Time)

@zcorpan wrote:

So @LeaVerou's proposal is used by ~0.4% of pages in httparchive; @Marat-Tanalin's proposal is used by ~0.1%. data-* is used by ~12.1%. (The data set is 494,956 pages.)

does that mean that some other form of prefix is used by the other 87%?

Domenic Denicola · Answer 20 · Fri Jan 20 2017 01:20:00 GMT+0800 (China Standard Time)

The obvious reason of prevalence of data--prefixed attributes over other prefixes is that data- is the only formally valid option for now. This has nothing to do with whether people are actually happy with it.

That's an interesting speculation. Fortunately, it's also one we can answer, or at least upper-bound, with data. That is, what percentage of those ~12.1% of pages are conformant? In other words, what percentage of people using data-* attributes are also people who care about conformance, and thus might have chosen data- over x- because of conformance concerns?

Similarly, what percentage of the ~0.5% using nonstandard prefixes are conformant-except-for-bad-prefixes? This number is especially interesting, because it indicates people who are interested in conformance but just aren't willing to change their prefixes. Certainly you and Lea might fall in that sub-bucket of the ~0.5%. (Although maybe not?) But how many of that ~0.5% are you representing?

Another point worth making is the analogy to a previous push to use <i> for icons. The reasoning was exactly the same: lots of people are doing it, because it's shorter than the recommendation in the spec (<span> with fallback text). We even did a HTTP archive search, and found that many more developers would "benefit" from allowing this than the fraction-of-~0.5% being discussed here. But allowing <i> for icons has many practical downsides, the same ones I listed before for allowing non-data- prefixes for custom data attributes. For that reason, we didn't do it.

does that mean that some other form of prefix is used by the other 87%?

I assume it means they are not using any prefixed attributes (data- or otherwise) at all.

Domenic Denicola · Answer 21 · Fri Jan 20 2017 01:23:53 GMT+0800 (China Standard Time)

Let me also repeat that I do support exploring the concept of custom attributes, with a processing model similar to custom elements. That gives serious benefits beyond just brevity, that IMO outweigh the practical disadvantages. It's the simple conformance change with no processing model that I am not in support of.

Lea Verou · Answer 22 · Fri Jan 20 2017 01:26:30 GMT+0800 (China Standard Time)

Again, I repeat that there is nothing stopping you from making a conscious choice between conformance and brevity. If you value brevity so much as to start calling data- attributes an abomination, I presume you value it more than conformant documents. That's fine! You can make that choice! As you yourself have noted, there's nothing stopping you. But it doesn't mean the spec should stop trying to keep the ecosystem coherent to the best of its abilities.
@Marat-Tanalin

Authors don't typically invent their own attributes, and when they do, data- is fine. Most custom attributes are used because a library/framework will utilize them. Therefore, the person using the attribute is not the same person that decided on its naming. It's not about my choice, it's about making the right choice for the users of my library. I don't want to impose verbosity on them and litter their markup with lengthy prefixes, and I don't want to impose nonconformance on them. Library devs should not be forced into this dilemma.

Re: fracturing the ecosystem, how does that not apply to custom element names?

whereas @zcorpan shows soundly with data that it does not hold in the real world

While I definitely commend the effort to get real data, I would take that percentage with a grain of salt:

We're basically parsing HTML with regexes here
None of this accounts for dynamically added attributes.
It counts occurrence of each naming scheme per page, whereas I suspect that when ng- or v- attributes are used, A LOT of them are used.
As I mentioned above, smaller libraries can use data- just fine. When you only have one or two attributes, the verbosity doesn't matter much and the collisions are more rare. It only takes 1 such library for a page to count in @zcorpan's data.
As @Marat-Tanalin mentioned, data- is the only conformant option right now, don't you think that affects usage?
These stats go against common knowledge: Angular and Vue are very popular, it seems weird that they'd be collectively used by only 0.4% of websites.

Fortunately, it's also one we can answer with data. That is, what percentage of those ~12.1% of pages are conformant? In other words, what percentage of people using data-* attributes are also people who care about conformance, and thus might have chosen data- over x- because of conformance concerns?

You're assuming here that everybody who cares about conformance is actually conformant. A parallel about religions and sins comes to mind. :) Many authors care about conformance, but don't actually validate, so they make mistakes that are never caught. However, conformance still influences their decision making.

Lea Verou · Answer 23 · Fri Jan 20 2017 02:18:40 GMT+0800 (China Standard Time)

Let me also repeat that I do support exploring the concept of custom attributes, with a processing model similar to custom elements. That gives serious benefits beyond just brevity, that IMO outweigh the practical disadvantages. It's the simple conformance change with no processing model that I am not in support of.

Nobody is against that. As I said above, that would be incredible! It would make my life so much easier. What I was suggesting is making the conformance change first, since it's easy, and adding the (harder to design) API as a later step, once it gets implementor interest and a spec editor willing to do it.

Marat Tanalin · Answer 24 · Fri Jan 20 2017 05:25:14 GMT+0800 (China Standard Time)

@domenic

(Although maybe not?)

I would appreciate if you avoided further trolling.

FYI, unlike what you’ve probably naively assumed, I am aware the pubdate attribute is currently not in the HTML spec, so using the formally invalid attribute is not accidental. I use the attribute intentionally since it was previously specced and perfectly valid, but then has been removed on a purely theoretical basis by someone who unfortunately has a sort of overformal logical approach (but who is still able to be respectful and deserves to be respected) somewhat similar to yours, and recommended to use the bolted-on verbose pseudosemantic surrogate called Microdata instead. (Btw, the same person also tried to remove the TIME element in favor of a new cool universal element called… DATA, but fortunately failed thanks to massive web-developers’ objections.) Violating the current version of the HTML spec by continuing to use the pubdate attribute solely on my own site is a sort of my conscious and consistent objection to that (wrong in my opinion) decision. Moreover, according to my experience, at least Google search engine does support the attribute regardless of that it has been removed from the spec, so its use still makes sense in practice.

Marat Tanalin · Answer 25 · Fri Jan 20 2017 05:47:02 GMT+0800 (China Standard Time)

Another point worth making is the analogy to a previous push to use <i> for icons.

Any analogy suffers from inaccuracies, is not a proof or an argument of any kind, and is often actually just irrelevant offtopic noise.

Ian Hickson · Answer 26 · Fri Jan 20 2017 05:51:33 GMT+0800 (China Standard Time)

Let's please remain focused on the technical issues.

Simon Pieters · Answer 27 · Fri Jan 20 2017 06:24:59 GMT+0800 (China Standard Time)

@Marat-Tanalin

Because not all custom attributes are data attributes. data- attributes are for data, custom-prefixed attributes are for custom needs whatever those are.

I think this is incorrect (as I also said in #2250 (comment)). There is no difference in intended use at all -- why would there be? Possibly we should tweak the spec text to clarify that it is not "wrong" to use data-foo as a "boolean" attribute, etc. I can work on a PR for that. What other usages for "custom attributes" are there that you think are not "data"?

Marat Tanalin · Answer 28 · Fri Jan 20 2017 06:32:25 GMT+0800 (China Standard Time)

@zcorpan Is the disabled attribute a data attribute?
(Fwiw, it is clear to me that boolean data- attributes are valid per spec.)

Andrea Giammarchi · Answer 29 · Fri Jan 20 2017 07:04:58 GMT+0800 (China Standard Time)

FWIW, the moment data-* shipped is the moment pretty much every old fashioned MVC library added data-bind to any node, knockout to name one, others following, causing the same name clashing problem data- was supposed to solve for HTML attributes, but in developers-land (TL;DR the problem just moved somewhere else)

Being also impossible to polyfill, in terms of el.dataset.name and similar Proxy kind of sorcery, most advantages initially envisioned for developers got lost in "trans(pi)lation", since it was still an el.get/setAttribute('data-whatever') matter, which is probably in the Top Ten things I really don't want to waste time typing anymore in my life ... but that's another story, sorry ...

That being said, there's a lot of legacy in the wild trusting data-attributes and aria-roles are also untouchable from a "don't break the Web" point of view, so I agree with @domenic there's no way we can just throw away data- and aria- like that, and we honestly shouldn't.

However, since the beginning of time, Custom Elements has had reserved names such as:

annotation-xml
color-profile
font-face
font-face-src
font-face-uri
font-face-format
font-face-name
missing-glyph

Proposing a new attribute standard that defines forbidden prefixes for attributes doesn't seem that different, as long as the provided data is reliable and there are really no huge conflicts with what's already used out there. For instance, I know some major players use prefixes that are not mentioned in here, only because their prefixes are behind the scenes, and not public. Having browsers randomly break prefixes that don't show up on the Open Web is probably not "good enough".

Good news is: these kinds of changes don't happen overnight, so if there is a will to promote custom attributes, providing a list of untouchable prefixes so that others can have time to eventually update their code-base, that'd be ace.

Not A Substitute for Native Extends

I am not sure why @rniwa mentioned it, but this proposal has nothing to do with the is="custom-el" one.

Attributes are per element. Unless you want every website to add a MutationObserver to document.documentElement so that every custom attribute would be intercepted and its node somehow manifested, including the Shadow DOM concerns already mentioned, there's no way this is going to solve anything at all regarding the ability to define Custom Elements that extend natives: you would still need to declare a prototype that should react when only that kind of node had a custom attribute changed.

Possible Custom Element Seppuku

One thing to be concerned about is the dual binding: at that point a custom prefixed attribute is going to be readable, and writable, through the element.
In the current Custom Elements specification, if I have ['my-attr'] as the list of observed attributes, I expect that whoever uses setAttribute on it would trigger an attributeChangedCallback.
If I have something reflected per instance that, as soon as it is accessed, would eventually trigger a possible callback, we'll be in infinite loop/recursion land for a basic attribute set, something that at least el.dataset.attrName = "value" wouldn't cause.
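The recursion risk can be illustrated with a toy model. This is not the custom elements API: `ToyElement`, `set_attribute`, and `attribute_changed` are hypothetical stand-ins, and the model simply assumes that every attribute write fires the change callback, even when the value is unchanged.

```python
class ToyElement:
    """Toy stand-in for an element with one observed, reflected attribute."""

    observed = ("my-attr",)

    def __init__(self):
        self._attrs = {}
        self.callback_calls = 0

    def set_attribute(self, name, value):
        old = self._attrs.get(name)
        self._attrs[name] = value
        if name in self.observed:
            # Assumed behavior: the callback fires on every write.
            self.attribute_changed(name, old, value)

    def attribute_changed(self, name, old, new):
        self.callback_calls += 1
        # Guard: write back through the reflected property only on a real change.
        # Without this check the setter and the callback would call each other forever.
        if old != new:
            self.my_attr = new

    @property
    def my_attr(self):
        return self._attrs.get("my-attr")

    @my_attr.setter
    def my_attr(self, value):
        self.set_attribute("my-attr", value)


el = ToyElement()
el.my_attr = "a"
print(el.my_attr, el.callback_calls)
```

With the guard in place the callback runs twice and then stops; removing the `old != new` check makes the same assignment recurse until Python raises `RecursionError`.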

As a summary, I hope this proposal/idea will be implemented, considering all the possible side effects it might bring to the table (and not because the proposal is bad, simply because we have legacy around the WWW :-( )

Simon Pieters · Answer 30 · Fri Jan 20 2017 07:30:29 GMT+0800 (China Standard Time)

@Marat-Tanalin it's not a custom attribute, so I don't understand the relevance to what I said. I did not say that standard boolean attributes are "data". I said that data-* attributes are not just for "data", but for any custom use, like you want to use _foo attributes for. So again, what do you want to use _foo for that you do not consider "data"?

Simon Pieters · Answer 31 · Fri Jan 20 2017 18:00:16 GMT+0800 (China Standard Time)

@LeaVerou

These stats go against common knowledge: Angular and Vue are very popular, it seems weird that they'd be collectively used by only 0.4% of websites.

SELECT page, COUNT(url) AS num
FROM [httparchive:har.2017_01_01_chrome_requests] WHERE
REGEXP_MATCH(JSON_EXTRACT(payload, '$.request.url'), '/(angular|vue).+js')
AND JSON_EXTRACT(payload, '$.response.content.mimeType') CONTAINS 'javascript' 
GROUP BY page
ORDER BY num DESC

5841 pages, so ~1.2%. Per https://trends.builtwith.com/javascript/Angular-JS it should be something like 1.4% in top 1m sites using Angular, so this seems in the right ballpark.

Marat Tanalin · Answer 32 · Fri Jan 20 2017 23:54:46 GMT+0800 (China Standard Time)

@zcorpan Simon, thank you for your substantive comments and questions.

it's not a custom attribute, so I don't understand the relevance to what I said.

My point is that all custom attributes are data attributes to the same extent as all standard attributes are data attributes. The latter are obviously not, so the former are not too.

what do you want to use _foo for that you do not consider "data"?

Given that _-prefixed attributes are currently formally invalid and I care about validity, I add them only via JS for now, so they are not discoverable by a validator. This primarily includes adding attributes to the root HTML element dynamically based on feature detection for the purpose of applying different styles depending on what features are available. Using classes for the HTML element for this purpose is undesirable since such classes could conflict with other elements’ classes given that it’s a good CSS practice, in stylesheets, not to prepend a class selector with a specific-element selector when styles are for a generic DIV container (e. g. just .foo should be used instead of DIV.foo, while the HTML element could have the foo class too, so styles could be unintentionally applied to the HTML element too). To prevent collisions, a prefix should be added to the class name (so that HTML-element’s class is named _foo instead of foo), but then there is no point in using a class (<html class="_foo">), and it’s easier just to add a same-name prefixed attribute (<html _foo>). Compared with data- attributes, _ is shorter and makes selectors in CSS easier to read and use ([_foo] instead of [data-foo]) regardless of whether the attributes are added dynamically (and not hard-coded statically) on HTML level.

Also, using attributes in terms of feature detection allows to have more than just two boolean states (available/unavailable) that, in case of a class, would need to have a bunch of different similar classes, while an attribute, unlike a class, has not just a name, but also a value (this specific benefit applies to data- attributes too, but they are just too long as we already know).

Another purpose I would use _-prefixed attributes for once they are legitimized is e. g. navigation menus where design needs to apply styles to previous element of a specific element (e. g. corresponding to current section of the site). Given that there is still no way to select the previous sibling (unlike next sibling), it’s solved by marking the corresponding previous element explicitly on HTML level either with a class or with an attribute. With a class, to prevent collisions with global classes, it would make sense to use a local class prefixed with e. g. _, but if I’m forced to use a prefix anyway, it’d be unreasonable to use a class if a same-name attribute could be used instead:

<li class="_prev"> → <li _prev>

As a bonus, compared with classes, attributes allow more possibilities in terms of selectors, e. g. it’s possible to select elements by a part of an attribute. And compared with the one-character _ prefix, the data- prefix is too long and obtrusive as already said by me and others.

Simon Pieters · Answer 33 · Mon Jan 23 2017 17:46:52 GMT+0800 (China Standard Time)

My point is that all custom attributes are data attributes to the same extent as all standard attributes are data attributes. The latter are obviously not, so the former are not too.

OK, so it seems we agree on this.

Thanks for the examples. I'll try to tweak the spec text and add new examples in the spec for data-* attributes.

Chao · Answer 34 · Mon Jan 23 2017 22:51:36 GMT+0800 (China Standard Time)

If both proposals are being looked at perhaps it would be worth adding a note discouraging use of underscore prefix in libraries but to reserve it for individual sites and page level javascript. If this is done at the level of SHOULD/SHOULD NOT it won't affect conformance but will hopefully encourage good practice (and avoid the worst scenario where the first library using underscore to get popular will end up owning it).

This actually then has a nice benefit of making it easy to see attributes that are part of a library vs ones aimed at more local use.

Simon Pieters · Answer 35 · Tue Jan 24 2017 00:53:19 GMT+0800 (China Standard Time)

That sounds reasonable; the spec already has this text for data-* and JS libraries:

JavaScript libraries may use the custom data attributes, as they are considered to be part of the page on which they are used. Authors of libraries that are reused by many authors are encouraged to include their name in the attribute names, to reduce the risk of clashes. Where it makes sense, library authors are also encouraged to make the exact name used in the attribute names customizable, so that libraries whose authors unknowingly picked the same name can be used on the same page, and so that multiple versions of a particular library can be used on the same page even when those versions are not mutually compatible.

For example, a library called "DoQuery" could use attribute names like data-doquery-range, and a library called "jJo" could use attributes names like data-jjo-range. The jJo library could also provide an API to set which prefix to use (e.g. J.setDataPrefix('j2'), making the attributes have names like data-j2-range).

https://html.spec.whatwg.org/multipage/dom.html#embedding-custom-non-visible-data-with-the-data-*-attributes
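
A minimal sketch of the customizable-prefix idea from the quoted spec text. Only the setDataPrefix name and the jjo / j2 prefixes come from the spec's example; the factory function and the attributeName helper are hypothetical.

```javascript
// Hypothetical factory for a library object whose data-* prefix can be changed.
function createAttributeNamer(defaultPrefix) {
  let prefix = defaultPrefix;
  return {
    // Lets the page author pick a different prefix to avoid clashes.
    setDataPrefix(newPrefix) {
      prefix = newPrefix;
    },
    // Builds the full attribute name the library would read or write.
    attributeName(suffix) {
      return `data-${prefix}-${suffix}`;
    },
  };
}

const J = createAttributeNamer("jjo");
const nameBefore = J.attributeName("range");
J.setDataPrefix("j2");
const nameAfter = J.attributeName("range");
```

Here nameBefore is "data-jjo-range" and nameAfter is "data-j2-range", so two libraries (or two versions of one library) can coexist on a page by choosing different prefixes.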

digeomel · Answer 36 · Wed Feb 08 2017 18:30:52 GMT+0800 (China Standard Time)

Don't want to weigh in on the discussion, but from a practical Web developer's point of view, we have a disagreement with our colleagues at the office related to your subject. We are creating our own custom Angular components (e.g. <my-dropdown>) and we are wondering how to name the custom attributes for these. One colleague is putting data- all over them, because he's concerned about the validation issues, and I suggested that this is pointless, since strict, old-fashioned validators will complain about the custom elements anyway, so what's the point of prefixing the attributes with data-?

I would appreciate some advice on this.
Thanks :)

Simon Pieters · Answer 37 · Wed Feb 08 2017 20:32:11 GMT+0800 (China Standard Time)

You could use https://checker.html5.org/ which supports custom elements.

The reason arbitrary attributes are not allowed on custom elements is that standard global attributes apply to custom elements as well, and they are not a frozen set; we want to keep the possibility to add new attributes to the standard without conflicting with existing web content that had already started using that name for something else. So data- for custom elements is correct.

Simon Pieters · Answer 38 · Wed Feb 08 2017 22:32:10 GMT+0800 (China Standard Time)

Sorry, my above comment is wrong for autonomous custom elements, they allow any attribute per the HTML standard (and the checker allows them as well). See https://html.spec.whatwg.org/#autonomous-custom-element . The possible conflict with any new global attributes is still there, but is also there for the embed element...

Chao · Answer 39 · Thu Feb 09 2017 00:06:05 GMT+0800 (China Standard Time)

Based on the current discussions the recommendations would be:

_attribute - if this is custom just for your specific site
ng-attribute - namespaced under the prefix angular uses
data-attribute - this is still fine
data-my-attribute - existing data prefix but also good practice of including a namespace for your attributes
my-attribute - if you're developing this as a library to be used in other places, my should be a prefix common to your library to minimise collisions and must be 1 or 2 characters
ng-my-attribute - it's an angular component so namespacing under ng makes sense, the extra my helps namespace yours together to further prevent collisions with other angular attributes.
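
The recommendations above could be expressed as a small classifier. This is a sketch of the proposal as summarized in this comment, not of anything in the HTML Standard; the function name and the category labels are made up for illustration.

```javascript
// Hypothetical classifier for the naming recommendations listed above.
function describeAttributeName(name) {
  // Leading underscore: custom attribute for one specific site.
  if (name.startsWith("_")) return "site-specific";
  // Existing data-* attributes remain fine.
  if (name.startsWith("data-")) return "data attribute";
  // One or two lowercase letters followed by a hyphen, e.g. ng-, v-, my-.
  if (/^[a-z]{1,2}-./.test(name)) return "library prefix";
  // Anything else is outside the proposal.
  return "not covered";
}

const described = [
  "_attribute",
  "ng-attribute",
  "data-my-attribute",
  "my-attribute",
  "angular-attribute",
].map((name) => [name, describeAttributeName(name)]);
```

Note that a longer prefix such as angular- falls outside the one-or-two-character rule and is reported as "not covered".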

Marat Tanalin · Answer 40 · Thu Feb 09 2017 00:56:25 GMT+0800 (China Standard Time)

data-my-attribute - existing data prefix but also good practice of including a namespace for your attributes

This one looks redundant.

Also, it would probably make sense to generally recommend using the _ prefix for any (incl. library-specific) nonstandard attributes, and additionally using a library-specific prefix for libraries. So if a New Cool Framework has appeared and it needs HTML attributes, its prefix would be e. g. _ncf-. The two-character prefix space would be exhausted very quickly, while with _, we would have an unlimited number of possible framework-specific prefixes.

Tobias Buschor · Answer 41 · Fri Apr 28 2017 22:22:07 GMT+0800 (China Standard Time)

Any news on this?
There is no good reason why tags can be custom but attributes not.

Marat Tanalin · Answer 42 · Sat Apr 29 2017 00:28:06 GMT+0800 (China Standard Time)

Fwiw, I’ve already started using _-prefixed custom attributes instead of data- attributes and local classes.

Update: I mean using now in static HTML — besides adding dynamically via JS that I did long ago before public proposal.

Lea Verou · Answer 43 · Sat Apr 29 2017 00:46:37 GMT+0800 (China Standard Time)

Fwiw, I’ve already started using _-prefixed custom attributes instead of data- attributes and local classes.

I've been using mv- attributes in a library I'm about to release since before I started this post, doing my part in paving the cowpaths even more :)

Victor Csiky · Answer 44 · Sun Apr 30 2017 17:04:43 GMT+0800 (China Standard Time)

I’d be fine with this as long as the pre-hyphen part could be empty, so attributes could have names like -foo, -bar, etc.

I'd rather support your proposal, as it is generic and not biased (towards angular or whatever).
I also have the personal preference of the dash over the underscore (the latter just looks uglier to me).
But please remember that many X(HT)ML parsers follow the XML syntax that mandates that XML names may start with a letter, an underscore, or a colon (!) only.
Should this change be implemented as is, that might break a lot of things.
(In an era when one has to fight for people to close tags like <img /> properly, it does matter to me.)
On the other hand, attributes starting with a colon shall work with XML based stuff, and according to my test they also work in Firefox properly, however there may be quirks with other user agents / DOM implementations.
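
The first-character rule mentioned above can be checked with a one-line test. This is a simplified sketch: it covers only the ASCII part of XML's name-start characters (letters, underscore, colon) and ignores the non-ASCII ranges that XML also permits; the function name is hypothetical.

```javascript
// Simplified check: does the name begin with an ASCII character that XML
// allows at the start of a Name (letter, underscore, or colon)?
function startsLikeXmlName(name) {
  return /^[A-Za-z_:]/.test(name);
}

const xmlStartChecks = {
  "-foo": startsLikeXmlName("-foo"),
  "_foo": startsLikeXmlName("_foo"),
  ":foo": startsLikeXmlName(":foo"),
  "data-foo": startsLikeXmlName("data-foo"),
};
```

Under this check, -foo is the only one of the four names that an XML parser would reject on its first character.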

Federico Brigante · Answer 45 · Fri May 26 2017 00:43:03 GMT+0800 (China Standard Time)

Dumb question: does XML compatibility matter at all? HTML5 is not XML already.

Also I'd add that today we can have data-* and aria-* attributes only because nobody squatted on those before. If people start using random attributes we'll reach a point where you can't introduce attributes with "nice" prefixes because they'd conflict with existing sites.

I'd love -myattr, I'd be ok with _myattr, but I see the suggestion to open any dashed attribute as shortsighted.

What I really liked was the good ol' real namespaces of XML, where you'd define your namespace:* at the top and avoid any conflicts.

Simon Pieters · Answer 46 · Tue May 30 2017 19:35:11 GMT+0800 (China Standard Time)

Re XML compatibility, see #1356 (comment) and earlier comments.

In this case, attribute names are important to preserve and be able to work with without having to go through infoset coercion. So in my opinion the general rule to be XML compatible should apply.

Tobias Buschor · Answer 47 · Tue Jul 04 2017 21:45:12 GMT+0800 (China Standard Time)

Can someone set it in stone?

Domenic Denicola · Answer 48 · Tue Jul 04 2017 21:48:02 GMT+0800 (China Standard Time)

So far none of the arguments here have sufficed to convince the editors or address their objections in a satisfactory way, so no. See #2271 (comment) for my latest thinking, at least. In fact it may be time to close the issue without action.

Marat Tanalin · Answer 49 · Tue Jul 04 2017 22:34:04 GMT+0800 (China Standard Time)

Probably the best option is just to use the feature since it just works in all browsers regardless of what the spec says.

Federico Brigante · Answer 50 · Tue Jul 04 2017 23:20:10 GMT+0800 (China Standard Time)

Sure, why not

Willingly risk things breaking in the future
Cause headaches for spec authors who will have to work around stubborn library authors' decisions?

Marat Tanalin · Answer 51 · Wed Jul 05 2017 00:34:54 GMT+0800 (China Standard Time)

No risk at least with the underscore prefix. Standard attributes will never start with underscore.
If the spec authors would be forced to standardize the feature they otherwise wouldn’t, that’d be a nice intended outcome.

Lea Verou · Answer 52 · Wed Jul 05 2017 07:55:22 GMT+0800 (China Standard Time)

See #2271 (comment) for my latest thinking, at least. In fact it may be time to close the issue without action.

I can't help but wonder if a similar analysis was done for custom element names. Did you find that nonstandard elements with hyphens were so widespread already that you had to standardize them? I’m guessing not. In fact, I'd wager that the number of invalid attribute names in the wild is much bigger than the number of invalid element names in the wild, even now. You just saw the need for a reasonable naming scheme that is not unreasonably verbose. Given that element names are only used once per element, and attributes multiple times, the need is greater here. This is a rather long discussion, where many people have expressed opinions, especially if you count the silent votes. It's disrespectful to all these people to close this with "no action" just because the feature we're proposing is not sufficiently in use already (even though the absolute numbers are in the thousands, unlike most proposed features!). That's a level of scrutiny that I have not seen applied to any other proposed feature in the Web platform.

Domenic Denicola · Answer 53 · Wed Jul 05 2017 08:00:23 GMT+0800 (China Standard Time)

The analogy with custom elements is a good one, especially from a cost vs. benefit perspective. Custom elements bring whole new capabilities to the platform, allowing you to hook into the parser and react to element lifecycles in a way that was impossible before. That's worth the downside, in our eyes.

On the contrary, the benefit of the change proposed here is that some people will be able to get both shorter custom attribute names and still have their pages validate. That's not worth the downsides that have been enumerated, especially given that people can do either of those things alone as long as they don't want to do them together.

Finally this lets me reiterate the point I've made several times now. Which is that if we had a true custom attributes feature, exposing low-level platform capabilities in the same way custom elements did, then it would be worthwhile. In the absence of that, it's not.

Lea Verou · Answer 54 · Wed Jul 05 2017 08:05:03 GMT+0800 (China Standard Time)

Finally this lets me reiterate the point I've made several times now. Which is that if we had a true custom attributes feature, exposing low-level platform capabilities in the same way custom elements did, then it would be worthwhile. In the absence of that, it's not.

Then let's do that! That's what we really need, only making them valid is a compromise, as it seems easier to implement.

Tobias Buschor · Answer 55 · Wed Jul 05 2017 16:57:56 GMT+0800 (China Standard Time)

I think html-syntax should not be influenced by DOM APIs

The advantages I see:

Same rules as custom-elements
Shorter (yes, there can be a lot of custom-attributes)
Standardize what is already in use and cannot be overturned
Armed for a future API like "document.registerAttribute('my-title')"

Bede Overend · Answer 56 · Mon Jul 31 2017 13:58:41 GMT+0800 (China Standard Time)

FWIW re: Custom Attributes similar to Custom Elements, there's a custom-attributes library which follows the Custom Elements API very closely (almost identical), to allow a mixin style of functionality. Might be a good spot to test the waters / API.

As @annevk mentioned, it was discussed (briefly) at TPAC 2016 - see this tweet, and personally I would love to see it discussed in more depth at this year's TPAC 2017. I feel Custom Attributes could solve a lot of problems as well as this one, namely in the Custom Elements domain, e.g. the inheritance debate

Lea Verou · Answer 57 · Mon Jul 31 2017 14:27:53 GMT+0800 (China Standard Time)

I was at TPAC 2016, shame I missed this discussion. I will be at TPAC 2017 too, I'll make sure not to miss it again :)

Custom elements and custom attributes are inextricably linked. If custom elements are going to act like normal HTML elements, their functionality could require annotating existing HTML elements too. For example, let's assume <datalist> did not exist and had to be implemented with a custom element (e.g. <foo-datalist>). We would need a custom attribute on <input> to go with it (foo-list). It seems odd to spec the element part but not the attribute part.

digeomel · Answer 58 · Mon Jul 31 2017 15:40:24 GMT+0800 (China Standard Time)

Coming back to my practical Web developer's point of view, has anybody noticed what's going on with attributes in Angular (2+)? They have all sorts of (I would assume non-standard) symbols, stars, parentheses, brackets, square brackets, parentheses in square brackets... 😝
I guess the Angular team has not been reading this discussion or they just couldn't care less about W3C standards because... Google?

Simon Pieters · Answer 59 · Wed Aug 16 2017 17:07:22 GMT+0800 (China Standard Time)

@digeomel interesting, thanks for pointing that out (see https://angular.io/guide/template-syntax for examples). If whatwg/dom#449 is successful, maybe we should allow such attribute names (despite the XML incompatibility).

Victor Csiky · Answer 60 · Thu Aug 17 2017 02:56:28 GMT+0800 (China Standard Time)

I guess the Angular team has not been reading this discussion or they just couldn't care less about W3C standards because... Google?

Seems like that, yep. Like the "don't be evil" times are way over.
(Too bad "vue" followed suit though.)

Anne van Kesteren · Answer 61 · Thu Aug 17 2017 15:25:55 GMT+0800 (China Standard Time)

Please stop the attacks against Google. They're not welcome here as per https://whatwg.org/code-of-conduct.

Ahmid-Ra · Answer 62 · Mon Sep 11 2017 00:13:15 GMT+0800 (China Standard Time)

@LeaVerou @bedeoverend @Marat-Tanalin @rniwa @annevk @chaoaretasty @strongholdmedia please note that custom element names cannot have an empty prefix section, i.e. -foo. Just a heads up, as I have recently had this discussion ad nauseam. I wanted to be clear, as this is the only naming caveat when comparing custom element naming conventions/restrictions with custom attributes being named similarly to custom elements

Just thought I'd mention it, as @bfred-it stated he'd love the -foo syntax. For custom element definitions, that wouldn't satisfy the restrictions of the PotentialCustomElementName production

which enforces [a-z] (PCENChar)* '-' (PCENChar)*
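
The production quoted above can be approximated with a regular expression. This is a simplified sketch: it treats PCENChar as ASCII only (hyphen, dot, digits, underscore, lowercase letters), whereas the spec also allows a number of non-ASCII ranges, and it ignores the spec's short list of reserved hyphenated names. The function name is hypothetical.

```javascript
// ASCII-only approximation of: [a-z] (PCENChar)* '-' (PCENChar)*
const ASCII_CUSTOM_ELEMENT_NAME = /^[a-z][-.0-9_a-z]*-[-.0-9_a-z]*$/;

function looksLikeCustomElementName(name) {
  return ASCII_CUSTOM_ELEMENT_NAME.test(name);
}

const elementNameChecks = {
  "my-dropdown": looksLikeCustomElementName("my-dropdown"),
  "-foo": looksLikeCustomElementName("-foo"), // empty prefix: first char must be [a-z]
  "foo": looksLikeCustomElementName("foo"), // no hyphen at all
  "My-Dropdown": looksLikeCustomElementName("My-Dropdown"), // uppercase not allowed
};
```

If attribute names followed the same production, -foo would be rejected for the same reason: the first character has to be a lowercase ASCII letter.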

Victor Csiky · Answer 63 · Mon Sep 11 2017 01:10:57 GMT+0800 (China Standard Time)

Thanks for the follow up.
Of course, XML standard specifically prohibits "Name" (as per a subset of NMTOKEN) to start with anything non-alpha.
But it allows "Name" to start with underscore, no matter it be an element name or attribute name.
This is consistent with that part of the PotentialCustomElementName specification too.
(As a side note, I personally think the "blacklisting the already-used names" exclusion method is kind of horrible and prone to future collisions. But then again, maybe it is better to have collisions than a beautiful garden nobody visits.)

Troy · Answer 64 · Mon Oct 09 2017 17:51:22 GMT+0800 (China Standard Time)

I know I'm a little late but it would be great to not have to do the data-. Every developer I know thinks it's very annoying lol.

Kumar Harsh · Answer 65 · Mon Nov 13 2017 15:05:39 GMT+0800 (China Standard Time)

I found @strongholdmedia's suggestion of using a colon for denoting custom attributes to be a good solution, but seems like using colon would be invalid in HTML5 parsed as XML? (https://www.w3.org/TR/html5-diff/#syntax and https://www.w3.org/TR/html5/the-xhtml-syntax.html#the-xhtml-syntax). Which leaves us with @Marat-Tanalin's _ style attributes.

I came here from @LeaVerou's mavo website and initially felt that it made sense to drop data- attributes. (I used them extensively in angular 1.4, but then angular itself moved away from it towards a very puzzling syntax). But after reading all the comments here, and in the linked issues above, I feel conflicted. As a developer, adopting the mv- style 2-letter attributes feels like an arbitrary restriction, even though it does save me time from typing out data- every time. The _ syntax feels ok but looks ugly in a sea of hyphens.

From the spec:

Attributes have a name and a value. Attribute names must consist of one or more characters other than the space characters, U+0000 NULL, U+0022 QUOTATION MARK ("), U+0027 APOSTROPHE ('), ">" (U+003E), "/" (U+002F), and "=" (U+003D) characters, the control characters, and any characters that are not defined by Unicode. In the HTML syntax, attribute names, even those for foreign elements, may be written with any mix of lower- and uppercase letters that are an ASCII case-insensitive match for the attribute's name.

Then, you could also have -my-* style attributes, correct? I don't think there's any possible clash with SVG or other standard HTML attributes here? It'd serve the same purpose as the _-based proposition above, but would be more in line with what is already there.
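
The syntax rule quoted from the spec can be sketched as a character check. This covers only the characters the quoted text names explicitly (space characters, NULL, quotes, ">", "/", "=", and control characters); it does not attempt the "not defined by Unicode" part, and passing it says nothing about whether a name is conforming for authors. The function name is hypothetical.

```javascript
// Rejects names containing U+0000..U+0020 (NULL, controls, space characters),
// U+007F..U+009F (controls), or any of: " ' > / =
const ATTRIBUTE_NAME_SYNTAX = /^[^\u0000-\u0020\u007F-\u009F"'>\/=]+$/;

function isSyntacticallyValidAttributeName(name) {
  return ATTRIBUTE_NAME_SYNTAX.test(name);
}

const syntaxChecks = {
  "-my-attr": isSyntacticallyValidAttributeName("-my-attr"),
  "_foo": isSyntacticallyValidAttributeName("_foo"),
  "(click)": isSyntacticallyValidAttributeName("(click)"),
  "a=b": isSyntacticallyValidAttributeName("a=b"),
  "a b": isSyntacticallyValidAttributeName("a b"),
};
```

So -my-attr, _foo, and even Angular-style names like (click) pass the syntax rule; the question in this thread is which of them should also count as conforming.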

But for a committee which has to decide on the spec and "set in stone" the validity of the attributes, these solutions feel like only an iterative improvement over the existing data- attributes, and would probably just fragment developers' usage even more.

I still feel there's a strong advantage to sticking to a single sanctioned convention (data-) for custom data attributes, at least until we have a processing model for the "custom attributes".

I don't know what this means, and couldn't find a link describing the 'processing model', but it sounds promising.

Victor Csiky · Answer 66 · Tue Nov 14 2017 04:29:35 GMT+0800 (China Standard Time)

I found @strongholdmedia's suggestion of using a colon for denoting custom attributes to be a good solution, but seems like using colon would be invalid in HTML5 parsed as XML?

I am unsure if I was suggesting anything like it, IIRC, it is rather the XML spec itself that allows such namings/NMTOKENs.

seems like using colon would be invalid in HTML5 parsed as XML?

From what I read,

A node with a local name containing a ":" (U+003A).

it comes to my understanding that this is due to these being forbidden for they are used to reference other namespaces. So your suspicion seems correct.

It'd serve the same purpose as the _-based proposition above, but would be more in line with what is already there.

Please note that standardising "what is already there", as opposed to premature optimization, is in reality the root of all evil.
Of course, there may be tried and true solutions and methods for different things, but just that something is widespread does not mean it is any good (just that, at a specific time, it was better than anything else widely known).

As for standardization, IMHO constructing a spec having a fixed - and lengthy - list of exceptions to it (on different bases mostly aggregating around things like "they was there first") pretty much nukes the purpose to begin with.

Lea Verou · Answer 67 · Wed Dec 06 2017 11:39:22 GMT+0800 (China Standard Time)

Another use case: Web Components that degrade gracefully.

For example, take a look at this carousel component

It’s used like this:

<skeleton-carousel dots nav loop>
  <iron-image placeholder="https://source.unsplash.com/category/nature/10x10"
              data-src="https://source.unsplash.com/category/nature/500x300"
              sizing="cover"
              preload
              fade
              ></iron-image>
  <iron-image placeholder="https://source.unsplash.com/category/food/10x10"
              data-src="https://source.unsplash.com/category/food/500x300"
              sizing="cover"
              preload
              fade
              ></iron-image>
  <iron-image placeholder="https://source.unsplash.com/category/buildings/10x10"
              data-src="https://source.unsplash.com/category/buildings/500x300"
              sizing="cover"
              preload
              fade
              ></iron-image>
</skeleton-carousel>

Wouldn't it be great if its content was proper <img> tags, so that something reasonable is visible in older browsers?
But if you do that, then the attributes would have to be data- prefixed with no indication of which attributes belong to the component and which ones don't.

chaals · Answer 68 · Thu Apr 26 2018 05:13:20 GMT+0800 (China Standard Time)

@LeaVerou

Wouldn't it be great if its content was proper <img> tags, so that something reasonable is visible in older browsers?
But if you do that, then the attributes would have to be data- prefixed with no indication of which attributes belong to the component and which ones don't.

What am I missing? If you customised the img element, e.g.

<img is="iron-image" src="some.img" alt="what?"...>

You would use normal attributes where they existed, no? Isn't it only an issue where you are making up something completely new anyway?

effulgentsia · Answer 69 · Thu Apr 26 2018 06:13:32 GMT+0800 (China Standard Time)

@chaals: I think the issue is that the current custom elements spec says:

Customized built-in elements follow the normal requirements for attributes, based on the elements they extend. To add custom attribute-based behavior, use data-* attributes.

So if you write a component as <img is="iron-image", then all attributes defined by the iron-image component need to be data- prefixed. But data-* attributes are also used by whatever other scripts (unrelated to the iron-image component) might be interacting with the page that contains an <img is="iron-image" element. Hence, @LeaVerou's observation that:

no indication of which [data-*] attributes belong to the component and which ones don't

Lea Verou · Answer 70 · Sun Apr 29 2018 07:42:30 GMT+0800 (China Standard Time)

@chaals They are not customizing the <img> element because they don't want data- prefixed attributes (and I don't blame them). They use custom elements just so they can use shorter attribute names, so there is no fallback.

Victor Csiky · Answer 71 · Mon Apr 30 2018 15:03:50 GMT+0800 (China Standard Time)

But if you do that, then the attributes would have to be data- prefixed with no indication of which attributes belong to the component and which ones don't.

Following your logic, those attributes that display "something reasonable" in "older" browsers -or affect the appearance - do belong to the "component", while the others don't.

In my opinion, one should not use the markup layer for state and unpredictable side effects at all, for that is violation of the single responsibility principle. But this is exactly what Angular or Vue does.

I believe that people should not at all use something like your skeleton-carousel in the markup as well. For these types of things, there is - was and will be - XML/XSLT always, should anybody find the need.

After all, what could the benefits be of "knowing" what "attributes" do, according to the designer's own logic, belong to the component, if one could not reasonably deduce what attributes will actually affect the rendering in any conformant and well-specified client?
Does anyone really want to reduce the concept of well-formedness to having an even number of quotation marks or inequality marks?

After following this discourse for a while, I think it may be better for both worlds if a specific layer, like XML/XSLT but perhaps distinct, were promoted to those people who are obsessed with this component-oriented approach, which is, in my opinion, somewhat distinct and distant from the concept of the DOM and what it was conceived for. Such a layer would, in and of itself, carry an "abandon hope all ye who enter here" type of note for others. HTML could then be left to those who actually prefer documentation over the conventions of people with random mindsets, conventions that they may, at times, consider counter-intuitive or even marginal.

James Browning · Answer 72 · Fri May 31 2019 17:01:36 GMT+0800 (China Standard Time)

I don't see it being likely that people will use data- attributes for custom elements. I don't think I've ever even seen an example of custom elements that uses them (even the HTML spec does not); there's no encouragement from any existing solutions and no push from custom element authors to use data--prefixed attributes.

I think non-conforming names is web reality already anyway, a decent number of sites are already using these non-conforming attributes (and even ones without hyphens). Regardless of whether the WHATWG agrees to change the spec new global attributes will still need to be checked for web compatibility.

Victor Csiky · Answer 73 · Sun Jun 02 2019 00:21:11 GMT+0800 (China Standard Time)

I think non-conforming names is web reality already anyway, a decent number of sites are already using these non-conforming attributes (and even ones without hyphens). Regardless of whether the WHATWG agrees to change the spec new global attributes will still need to be checked for web compatibility.

Please stop the attacks against Google. They're not welcome here as per https://whatwg.org/code-of-conduct.

Still, it may be important to find out that no matter how much effort you put into standardization, there were, are, and probably will be people that won't give a darn about these.
Of course, it wouldn't be a problem, were it not for such people having inexplicable influence.

Surely enough, it could be asked what this comment adds to the "mix" - if there is anything left to it.
But some of us do remember that we've seen this before (at IE5.5, to single out one) and it didn't turn out that well.

Simon Pieters · Answer 74 · Mon Jun 03 2019 15:42:07 GMT+0800 (China Standard Time)

Please don't add off-topic comments. This issue is about custom attributes.

To move this issue forward, a good step would be to ask implementers if there's interest in an API for observing changes to custom attributes as annevk suggested.

Tobias Buschor · Answer 75 · Fri Jun 12 2020 06:51:15 GMT+0800 (China Standard Time)

Since the css function attr() will be usable with all attributes, it might be a good time to think about a specification for custom attributes.
https://www.w3.org/TR/css3-values/#attr-notation

Robert Linder · Answer 76 · Wed Oct 21 2020 06:35:55 GMT+0800 (China Standard Time)

Since the css function attr() will be usable with all attributes

attr() may be limited to a subset of prefixed attributes, see w3c/csswg-drafts#5136.

Joshua Wise · Answer 77 · Mon Dec 14 2020 13:54:53 GMT+0800 (China Standard Time)

I'm currently in the process of writing a framework built on Web Components (custom elements), and I'm having a very hard time figuring out how to handle the name-spacing of attributes. The way I see it, there are 3 different agents who may want to define attributes on custom elements:

The consumer of the custom element. The intent may be to mark the element for querySelector() purposes or for CSS.
The author of the custom element. The intent may be to provide an interface with the consumer for receiving initial state or reflecting the element's state.
The user-agent (browser), which will inevitably define new global attributes in the future, for arbitrary purposes.

The first group of people (consumers of the element) can simply use data-* attributes, which are reserved by the spec for this purpose.

The third group of people (browsers) tend to define attributes that are single lowercased words, but I'm not confident that I can rely on that assumption.

The second group of people (authors of custom elements) seemingly have no good solution. They can't use data-* attributes because those are reserved for the consumers of the element. And without some guarantees about the naming of future global attributes, they have no way of protecting themselves against future name collisions.

As a software engineer, the obvious solution to me is namespaces. If we can't use colon (:) namespaces due to XML compatibility, then hyphen (-) namespaces seem perfectly fine. Each independent agent can define their own namespace to work in. The data- namespace is for the website author. The "empty" namespace (no hyphen) is for browsers. And every other namespace (except aria-, I guess) is for everybody in-between.

Victor Csiky · Answer 78 · Mon Dec 14 2020 18:45:33 GMT+0800 (China Standard Time)

As a software engineer, the obvious solution to me is namespaces. If we can't use colon (:) namespaces due to XML compatibility, then hyphen (-) namespaces seem perfectly fine. Each independent agent can define their own namespace to work in. The data- namespace is for the website author. The "empty" namespace (no hyphen) is for browsers. And every other namespace (except aria-, I guess) is for everybody in-between.

You, sir, as your name suggests, are indeed very wise.
I also insisted that something similar be made / kept, but ran into the "some actors doing what-when-ever they deem feasible" attitude, which turned out to be persistent, and thus gave up.

Markus Johansson · Answer 79 · Tue Jun 08 2021 16:41:45 GMT+0800 (China Standard Time)

This really needs attention. I don't understand why the standard is enforcing things that make cross-browser functional code invalid "HTML", so that we have to either stop caring about the standard or explain to customers over and over that the standard is behind reality. Leaving them worried for no real reason.

It's time to get up to speed with how things are actually used and update the standards - otherwise the relevance of the standard will decrease and become something that people see as "something from the past".

There are plenty of good ideas from 3-4 years ago - why is this stale?

Victor Csiky · Answer 80 · Tue Jun 08 2021 19:19:13 GMT+0800 (China Standard Time)

Leaving them worried for no real reason.

It all depends on what one calls a "real reason".

with how things are actually used

There are things like racism or oppression that are quite frequent, still, relatively few people usually argue that we "get up to speed" with them.

Heck, there are even tutorials that suggest that you embed your database server credentials into Android apps for the sake of purported simplicity.

Does it solve any problems? Surely. It is fast, easy, convenient. The inconsiderate may even call it logical.
For others, though, it may create more problems than it solves.

Simon Pieters · Answer 81 · Thu Jun 10 2021 06:01:16 GMT+0800 (China Standard Time)

@strongholdmedia Please do not derail the discussion with issues that have nothing to do with the topic at hand.

@enkelmedia my previous comment suggests a next step for this issue.

Alan Lansdowne · Answer 82 · Fri Jul 16 2021 20:16:18 GMT+0800 (China Standard Time)

Amongst others, @JoshuaWise's comments (2020-12-14) suggest a clear outline for a practical, useful and consistent approach moving forward:

non-hyphenated attributes: standard, attributes introduced by Spec Authors
hyphenated attributes: custom attributes introduced by Custom Element / WebComponent Authors
data-attributes: custom attributes applied by Consumers of standard & custom elements

This leaves Library / Framework / WebComponent Authors (the middle group) needing to take note of a couple of well-known, reserved hyphenated prefixes - eg. don't use the prefix http- (because it already exists in http-equiv) and don't use data- or aria- - but otherwise Library / Framework / WebComponent Authors retain a free hand to build their own hyphenated custom attribute names, constrained only by the same requirements which apply to custom element names.

This means both ng- and v- can be welcomed (at last) as valid custom attribute prefixes.
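As a rough illustration of the three-tier scheme described above, here is a minimal sketch of how a validator might classify an attribute name. The function name, the tier labels and the exact contents of the reserved list are assumptions made for this example; only data-, aria- and http- are named in the comment itself.

```javascript
// Hypothetical list of reserved hyphenated prefixes. Only these three are
// named in the discussion; a real list would also need to cover other
// existing hyphenated HTML attributes (accept-charset, for instance).
const RESERVED_PREFIXES = ['data-', 'aria-', 'http-'];

// Classify an HTML attribute name under the proposed scheme.
function classifyAttributeName(name) {
  const lower = name.toLowerCase();
  // data-* stays with the consumers of standard and custom elements.
  if (lower.startsWith('data-')) return 'consumer';
  // No hyphen: reserved for attributes introduced by spec authors.
  if (!lower.includes('-')) return 'standard';
  // Hyphenated, but using a prefix the platform already claims.
  if (RESERVED_PREFIXES.some((prefix) => lower.startsWith(prefix))) return 'reserved';
  // Any other hyphenated name: library / framework / component authors.
  return 'custom';
}
```

Under these assumptions, classifyAttributeName('ng-model') and classifyAttributeName('v-if') both fall into the 'custom' tier, while 'http-equiv' is 'reserved' and 'href' is 'standard'.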

Arguably, the most significant issue to resolve remains what to do about SVG (as @LeaVerou mentioned at the very beginning) since standard attribute names are frequently hyphenated in SVG. This threatens a worst case scenario of many name collisions between standard (hyphenated) SVG attribute names and custom (hyphenated) attribute names: not only in the present but (worse) in the future.

Perhaps here is where the leading underscore can come into play? A leading underscore which the SVG parser always takes note of but which remains optional in HTML, because the HTML parser always ignores it? (In the same way that the HTML5 parser ignores any XHTML-style trailing slash in self-closing elements).

Thus, in HTML:

enable-background
_enable-background

are functionally identical and in practice - or most of the time, at least - only the former will ever tend to be used.

Whereas, in SVG:

enable-background
_enable-background

the former is parsed as a specced standard attribute, while the latter may be immediately recognised (by developers and user-agents) as a custom attribute.

Is that too confusing? To have _enable-background mean the same thing as enable-background in HTML, but for the two names to mean two different things in SVG? There certainly is a precedent for syntax not always meaning the same thing in HTML and SVG - not least in that SVG is case-sensitive, while HTML is case-insensitive.

Advice to custom element authors would be:

if you wish to, you can, in every context, always prefix your hyphenated custom attributes with an underscore
though, for all practical purposes it makes no difference whether you do or not in HTML
however in SVG, it absolutely does make a difference, so when writing SVG, be sure to always prefix with an underscore

Simon Pieters · Answer 83 · Tue Aug 17 2021 00:50:29 GMT+0800 (China Standard Time)

We can't ignore a leading _ in attribute names in HTML, that would likely break content that uses it and expects the underscore to not be ignored.

Alan Lansdowne · Answer 84 · Tue Aug 17 2021 01:16:17 GMT+0800 (China Standard Time)

Three (genuine) questions in response:

1. Are there already standard attribute names in HTML which begin with a _ ?
2. Are there any frameworks / libraries / environments where a pair of distinct custom attributes exist which have identical names, save for the fact that one begins with a leading underscore and the other does not?
3. Are there any frameworks / libraries / environments which introduce (or allow for) a custom attribute which has an identical name to an already-existing standard attribute, save for the fact that it begins with a leading underscore?

Simon Pieters · Answer 85 · Tue Aug 17 2021 04:32:35 GMT+0800 (China Standard Time)

1. No
2. It seems unlikely, but I don't know.
3. There are such instances in https://gist.github.com/zcorpan/b54592e415a2f79f2ef7f79c0c37b2ed e.g. <img _src=...>

Last time I looked at non-standard attributes in HTTP Archive (see #2271 (comment) ), there were 531 instances with a leading _ excluding _moz_. Those pages might use those attributes from JS or CSS and therefore rely on the _ not being ignored (e.g. removed by the HTML parser).
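To make the compatibility concern concrete, here is a small sketch of what would happen if a parser dropped a leading underscore. Both function names are hypothetical, and the normalisation is only one guess at how "ignoring" the underscore could be implemented.

```javascript
// Hypothetical normalisation: drop a single leading underscore, as the
// proposal would have the HTML parser do.
function stripLeadingUnderscore(name) {
  return name.startsWith('_') ? name.slice(1) : name;
}

// Given the attribute names found on one element, return the pairs that
// would become indistinguishable once the underscore is ignored.
function findCollisions(attributeNames) {
  const seen = new Map(); // normalised name -> first original name
  const collisions = [];
  for (const name of attributeNames) {
    const key = stripLeadingUnderscore(name);
    if (seen.has(key)) {
      collisions.push([seen.get(key), name]);
    } else {
      seen.set(key, name);
    }
  }
  return collisions;
}
```

For an element carrying both src and _src, findCollisions(['src', '_src', 'alt']) reports the pair ['src', '_src'], so any script or stylesheet that reads _src separately from src would stop working.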

Alan Lansdowne · Answer 86 · Tue Aug 17 2021 04:44:55 GMT+0800 (China Standard Time)

Many thanks for that clarification, @zcorpan.

Yes, I concede: we can't make a leading _ character ostensibly superfluous in custom HTML attributes if attributes such as _src are already in use alongside src.

Not wishing to sound absurd, but if a single _ as an arbitrary prefix is out of the question, then what about a double __?

After all, in this suggestion, the HTML-optional / SVG-obligatory underscore(s) aren't being introduced as prefixes for the benefit of the HTML parser - the HTML parser is already capable of recognising that any attribute which includes two hyphenated words (of which the first isn't aria-, http- etc.) must be a Custom Attribute.

The purpose of introducing HTML-optional / SVG-obligatory underscores is so the SVG parser may immediately distinguish between regular attributes with hyphens and Custom Attributes which (also obligatorily) include hyphens.

That is:

a hyphen is enough of a distinguishing feature in HTML to indicate that the attribute is a Custom Attribute (subject to not using a small handful of reserved hyphenated prefixes)
a hyphen is an insufficiently distinguishing feature in SVG, so another feature - in this case a double underscore prefix - is utilised
the HTML parser knows to ignore the double underscore prefix when it sees it, since this is an SVG convention and instead will only look for whether the attribute name is hyphenated or not

mangelozzi · Answer 87 · Mon Apr 18 2022 05:11:49 GMT+0800 (China Standard Time)

@domenic

This argument (and I would appreciate if you avoided phrases like "abomination" in reasoned discussion) is based on anecdotes, whereas @zcorpan shows soundly with data that it does not hold in the real world. A small minority of developers using custom attributes are unhappy with data; 15x more are happy with data than are unhappy. They can be vocal, as you are, but saying that this is a widespread problem is just not supported.

I would like to state that the deduction that looking at the data reveals what people prefer is not 100% valid. People currently use data- because that is what is currently endorsed, and people have no other option (other than to rebel). E.g. I use data- because I want to conform, but I would love to use custom attributes that start with a prefix. So if you polled my website, you would say I am in favour of data- only. I think data- is great, and I use it all the time, but in addition to that there are many good use cases for custom prefixes. data- is more end-user centric; custom prefixes are very nice for frameworks (to avoid collisions with users' data- attributes).

It's the exact reason that native HTML components have their own attributes instead of using class names: so they don't step on users' class names. We need something that sits between the native spec and the end user for framework developers.

If browsers ate their own dog food (e.g. web components), it would make web development much better.

Andrea Giammarchi · Answer 88 · Tue Apr 19 2022 02:21:58 GMT+0800 (China Standard Time)

To whom it might concern, and also as a possible playground: there's a proxy-pants dsm export that, if tree-shaken or required as proxy-pants/dsm, allows lazy one-off creation of dataset-like namespaces, as long as the suffix is set.

const {ngset: ng} = dsm(element);

// set ng-test attribute
ng.test = 'value';

// remove ng-test attribute
delete ng.test;

It works for const {vset: v} = dsm(element); too, and a set retrieved multiple times is created only once and weakly referenced, to mimic what DOMStringMap does via dataset.
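For readers curious how such a dataset-like namespace could work, here is a minimal sketch built on Proxy and WeakMap. It is not the proxy-pants implementation: the attributeNamespace helper and its signature are hypothetical, and unlike dataset it does no camelCase-to-kebab-case conversion.

```javascript
// One cache entry per element, so repeated lookups return the same proxy.
// The WeakMap holds elements weakly, so they can still be garbage-collected.
const namespaceCache = new WeakMap();

// Hypothetical helper: returns a proxy that maps `ns.key` to the attribute
// `${prefix}-${key}` on `element`.
function attributeNamespace(element, prefix) {
  let perElement = namespaceCache.get(element);
  if (!perElement) {
    perElement = new Map();
    namespaceCache.set(element, perElement);
  }
  if (!perElement.has(prefix)) {
    const attrName = (key) => `${prefix}-${String(key)}`;
    perElement.set(prefix, new Proxy({}, {
      // Reading returns the attribute value, or null when it is absent.
      get: (_, key) => element.getAttribute(attrName(key)),
      set: (_, key, value) => {
        element.setAttribute(attrName(key), String(value));
        return true;
      },
      has: (_, key) => element.hasAttribute(attrName(key)),
      deleteProperty: (_, key) => {
        element.removeAttribute(attrName(key));
        return true;
      },
    }));
  }
  return perElement.get(prefix);
}
```

With this sketch, const ng = attributeNamespace(element, 'ng'); followed by ng.test = 'value' sets the ng-test attribute, and delete ng.test removes it again.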

Not sure this is the answer anyone is looking for, but as use case/utility to test/play with, maybe it's useful, and it's also extremely tiny in size and logic.

Alexander Petros · Answer 89 · Wed Dec 20 2023 00:57:10 GMT+0800 (China Standard Time)

This issue has been around for a while, but now is a great time to make some progress on it again. Declarative front-end libraries are having a small (maybe medium?) moment, and a lot of them make use of the *-attribute pattern:

x-attribute for AlpineJS
hx-attribute for htmx
data-turbo-attribute for Turbo (and I'm sure they would be very happy to omit the data)

In addition, of course, to established libraries like Angular and Vue.

This emergent behavior exists for exactly the reason that @JoshuaWise describes: library authors want to namespace their attributes, and they will do so with the most ergonomic mechanism available to them. Some, like Turbo, decide that not using data- isn't worth the risk, but I also agree with @Jamesernator's point that every official example of data- attributes shows them being used to store... data, not functionality. data- attributes are clearly not intended to be the basis for future-proof attribute extensions to HTML, and I don't see anyone suggesting that they should be.

So there are a lot of new people (myself included) who, after years of writing JSX that compiles down to HTML, are newly interested in HTML as an authorship language in its own right. And the first thing they'll discover is that the tools that brought them back from SPA-land are, according to the standards body, invalid HTML. They will draw one of two conclusions from this. Either:

a) the libraries that brought this issue to their attention in the first place made a mistake, or
b) WHATWG is too slow/out of touch and validation doesn't matter.

As @LeaVerou correctly pointed out six years ago: "The more commonplace invalid HTML becomes, the less authors care about authoring valid HTML. Validation becomes pointless in their eyes if they see tons of perfectly good use cases being invalid."

The last comment we got from WHATWG about moving this forward was from @zcorpan, who said that we should reach out to implementers to see if there's any interest in adding events to attribute changes to make them observable. This is an excellent idea, but I don't see why it should hold up standardization of custom attributes with hyphens. The existing solutions for observing attributes are clunky, but they work well enough to build successful libraries. If the only way to move this issue forward is to convince a company to devote engineering resources to it, I expect it will continue sitting for a long time.
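As an example of the kind of existing solution I mean, here is a minimal sketch that watches for changes to prefixed attributes with MutationObserver. The helper names are hypothetical and no particular library is claimed to work this way. Since attributeFilter takes exact attribute names rather than a prefix, the sketch observes every attribute change and filters afterwards.

```javascript
// Keep only attribute mutations whose attribute name starts with `prefix`.
// `records` are MutationRecord-shaped objects.
function prefixedAttributeChanges(records, prefix) {
  return records
    .filter((record) =>
      record.type === 'attributes' &&
      record.attributeName !== null &&
      record.attributeName.startsWith(prefix))
    .map((record) => ({
      element: record.target,
      name: record.attributeName,
      oldValue: record.oldValue,
      newValue: record.target.getAttribute(record.attributeName),
    }));
}

// Browser-only wiring: observe every attribute change under `root`
// and report the ones that match the prefix.
function observePrefixedAttributes(root, prefix, onChange) {
  const observer = new MutationObserver((records) => {
    const changes = prefixedAttributeChanges(records, prefix);
    if (changes.length > 0) onChange(changes);
  });
  observer.observe(root, { attributes: true, subtree: true, attributeOldValue: true });
  return observer;
}
```

In a browser, observePrefixedAttributes(document.body, 'hx-', console.log) would log every change to an hx-* attribute, at the cost of a callback for every attribute mutation in the subtree.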

Simply reserving non-extant hyphenated attributes in the standard, which requires no work from implementers, would demonstrate the demand for this feature. If people made increased use of it, then it would be easier to convince implementers that adding additional observability features is worth their time. I likewise suggest punting on the SVG question by just not reserving hyphenated attributes in SVGs (but if that doesn't work, at least we can move the discussion forward by saying "we can do this once we resolve the SVG question").

I think it's extremely encouraging that this issue has remained open, because it demonstrates interest from both the applicants and WHATWG in resolving this. I also have immense respect for the standards body being conservative with the standard, so that we can ensure its essential backwards and forwards compatibility. In light of that, let's take the smallest possible victory—reserving non-extant hyphenated attributes in HTML—and see if it generates some momentum for custom attributes more generally. I'm happy to open a new issue if you feel that's appropriate, and am also generally available to push this forward in any way I can.

Brian Kardell · Answer 90 · Wed Dec 20 2023 01:38:06 GMT+0800 (China Standard Time)

@alexpetros note the several links just above your comment from this year, including positive movement and discussion at TPAC this year in w3c/tpac2023-breakouts#44 (which links to several relevant issues). People are still interested, and I think we're much closer to hitting a moment in which focus and progress are more likely to be made.

Alexander Petros · Answer 91 · Wed Dec 20 2023 03:34:18 GMT+0800 (China Standard Time)

@bkardell, first of all, it is very heartening to see the positive movement from the breakout sessions, and to see some people who gave up on this discussion actively participating in them.

That having been said, focusing on those issues is precisely what I think has led to the decision paralysis here. Custom behavior raises the following questions:

1. Should users be able to extend HTML with custom behavior?
2. If so, are custom attributes a reasonable place to put that behavior?
3. If so, is reserving attributes with hyphens for custom behavior correct? <---- we are here
4. If so, what should the JavaScript interface be for specifying the behavior declared in those attributes?

Questions 1 and 2 are resolved by the existence of JavaScript, and data-* attributes, respectively. The issues from the breakout session that deal with custom attributes deal with Question 4.

Question 3 is not an easy question, but is much easier than Question 4, and is also a priori. Resolving "should WHATWG reserve future kebab-case attributes for user-specified behavior" will not only help clarify one of the many questions raised by this proposal, it will inject it with momentum by erecting fences around the cowpath, for future pavement.

While this issue has been a little contentious, it has also been extremely focused—almost all the comments are salient points in either direction. I'm not saying that we should rubber-stamp kebab-case attributes tomorrow, but that it can (and should) be resolved on its own merits; logrolling the declarative specification (Question 3, this issue) and imperative implementation (Question 4) will only serve to delay both.

Question 3 also has the advantage (and the urgency) of having been resolved by the library market, while Question 4 resolutely is not. To use a practical example, both htmx and AlpineJS allow for declaring arbitrary event listeners (e.g. x-on:click, but the click could be any event). Since querySelector doesn't support wildcards in attribute names, AlpineJS finds these by walking the entire tree (h/t @dz4k for this) while htmx uses a mildly outrageous XPath query. But they both use custom-prefixed kebab-case attributes. Future developments in the declarative HTML extension space will likely do the same.
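To show what "walking the entire tree" amounts to, here is a minimal sketch of a prefix search over element attributes. It is an illustration under my own assumptions, not the code AlpineJS or htmx actually ship, and findPrefixedAttributes is a hypothetical name. It relies only on each element exposing an iterable attributes collection of {name, value} entries and an iterable children collection, which DOM elements do.

```javascript
// Collect every attribute whose name starts with `prefix`, visiting the
// element tree in document order (pre-order, depth-first).
function findPrefixedAttributes(root, prefix) {
  const matches = [];
  const stack = [root];
  while (stack.length > 0) {
    const element = stack.pop();
    for (const attr of Array.from(element.attributes)) {
      if (attr.name.startsWith(prefix)) {
        matches.push({ element, name: attr.name, value: attr.value });
      }
    }
    // Push children in reverse so the first child is processed next.
    const children = Array.from(element.children);
    for (let i = children.length - 1; i >= 0; i--) {
      stack.push(children[i]);
    }
  }
  return matches;
}
```

Calling findPrefixedAttributes(document.body, 'x-on:') visits every element once, which is exactly the cost that a wildcard attribute selector or a dedicated JavaScript API could avoid.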

We can take the pressure off the committee doing the implementation work by sanctioning kebab-case attributes first. Then the libraries can kludge along successfully (and with valid HTML!) until a more efficient and streamlined JavaScript API is available.