Intl.DateTimeFormat does not support 'und' locale
sven-oly opened this issue
This seems wrong. Apparently 'und' falls back to 'en', which is different behavior than ICU4C.
Examples:
Welcome to Node.js v18.19.1.
Type ".help" for more information.
> dt = new Intl.DateTimeFormat('und', {"month":"short","weekday":"narrow","day":"numeric","calendar":"gregory","numberingSystem":"latn"})
DateTimeFormat [Intl.DateTimeFormat] {}
> dt.format()
'T, Apr 30'
> Intl.DateTimeFormat.supportedLocalesOf(["und"])
[]
> Intl.DateTimeFormat.supportedLocalesOf(["und", "en"])
[ 'en' ]
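To check which locale the formatter actually resolved to, rather than inferring it from the formatted string, one can inspect resolvedOptions(). A minimal sketch; `describeResolution` is a hypothetical helper name, and the values it reports depend on the engine and its default locale:

```javascript
// Hypothetical helper: reports how the engine resolved a requested tag.
function describeResolution(tag) {
  // Locale the formatter ended up using for the requested tag.
  const resolved = new Intl.DateTimeFormat(tag).resolvedOptions().locale;
  // Locale used when no locale is requested at all.
  const defaultLocale = new Intl.DateTimeFormat().resolvedOptions().locale;
  // Whether the engine lists the tag (or a prefix of it) as supported.
  const supported = Intl.DateTimeFormat.supportedLocalesOf([tag]).length > 0;
  return { tag, resolved, defaultLocale, supported };
}

console.log(describeResolution('und'));
console.log(describeResolution('en'));
```

If `supported` is false for 'und', `resolved` should match `defaultLocale`, which would explain the 'en' output above on a machine whose default locale is English.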
I thought that "und" was supported in engines, but I guess not?
"und" is not supported in browsers. Supporting it would probably fix some of the use cases of the Stable Formatting proposal, but not all.
The reason is very simple: there are no locale resources defined for "und".
See https://github.com/unicode-org/cldr/blob/main/common/main/und.xml, which is a 404.
The resources for "und" are stored in root.xml in CLDR.
In V8, internally we call
uloc_openAvailableByType(ULOC_AVAILABLE_WITH_LEGACY_ALIASES, &status);
to find out what locales are available. Neither "und" nor "root" is enumerated.
The documentation at https://unicode-org.github.io/icu-docs/apidoc/dev/icu4c/uloc_8h.html#a1d61e1cb6a0d2ad60dc3cd78c931e551 says:
"Gets a list of available locales according to the type argument, allowing the user to access different sets of supported locales in ICU."
If "und" and "root" are not reported by ICU as "available locales", then V8 will not treat them as supported.
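The effect of filtering against an available-locales list can be sketched with a simplified version of the ECMA-402 BestAvailableLocale lookup. This is an illustration only: the `available` set is a hypothetical stand-in for what an engine builds from ICU, not V8's actual data, and the Unicode-extension stripping that the spec performs first is omitted.

```javascript
// Hypothetical stand-in for the list an engine derives from ICU.
// Note that neither "und" nor "root" is present.
const available = new Set(['en', 'en-US', 'de', 'ja']);

// Drop trailing subtags until a candidate appears in the available set.
function bestAvailableLocale(availableLocales, locale) {
  let candidate = locale;
  while (true) {
    if (availableLocales.has(candidate)) return candidate;
    let pos = candidate.lastIndexOf('-');
    if (pos === -1) return undefined;
    // Also drop a singleton subtag (such as "u") along with what follows it.
    if (pos >= 2 && candidate[pos - 2] === '-') pos -= 2;
    candidate = candidate.slice(0, pos);
  }
}

// Keep only the requested tags that have some available match.
function lookupSupportedLocales(availableLocales, requested) {
  return requested.filter(
    (tag) => bestAvailableLocale(availableLocales, tag) !== undefined
  );
}

console.log(lookupSupportedLocales(available, ['und', 'en']));    // [ 'en' ]
console.log(lookupSupportedLocales(available, ['und', 'en-GB'])); // [ 'en-GB' ]
```

Because "und" has no subtags to truncate and is not in the set, it is dropped, which matches the supportedLocalesOf results shown earlier in the thread.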
Sorry, I hit close by accident.
I made an upstream issue: https://unicode-org.atlassian.net/browse/ICU-22766
Whether or not ICU decides to start including the root locale in the return value of uloc_openAvailableByType, I think Web engines could decide to include that locale in their own lists of supported locales.
TG2 discussion: https://github.com/tc39/ecma402/blob/main/meetings/notes-2024-08-22.md#intldatetimeformat-does-not-support-und-locale-885
An interesting but potentially unexpected outcome of the discussion was the realization that "und" is defined by BCP-47 as simply an absent locale, so it is not semantically incorrect for ECMA-402 to have the current web reality behavior of making "und" basically an alias for undefined.
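The "alias for undefined" reading can be checked directly by formatting the same instant with both arguments. A small sketch, assuming an engine that does not list "und" as supported; the date and options are arbitrary choices for illustration:

```javascript
// Fixed instant and time zone so the comparison does not depend on the clock.
const date = new Date(Date.UTC(2024, 3, 30, 12, 0, 0));
const options = { month: 'short', weekday: 'narrow', day: 'numeric', timeZone: 'UTC' };

const withUnd = new Intl.DateTimeFormat('und', options);
const withUndefined = new Intl.DateTimeFormat(undefined, options);

// Assumption: where "und" is not a supported locale, both formatters resolve
// to the default locale and therefore produce the same string.
console.log(withUnd.resolvedOptions().locale, withUndefined.resolvedOptions().locale);
console.log(withUnd.format(date) === withUndefined.format(date));
```

An engine that chose to support the root locale would be expected to report a different resolved locale for the first formatter.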
We want a way to actually get root behavior, but this might be better handled by the null locale proposal (Stable Formatting).
> In v8, internally we call
> uloc_openAvailableByType(ULOC_AVAILABLE_WITH_LEGACY_ALIASES, &status);
> to find out what locales are available. neither "und" nor "root" is enumerated
> [the ICU documentation] said
> "Gets a list of available locales according to the type argument, allowing the user to access different sets of supported locales in ICU."
> if "und" and "root" are not reported by ICU as "available locales", then v8 will not treat them as supported.
This is not correct. Root is structurally required. The available locales list is the list to show to users. If the ICU docs don't make that clear, an issue should be filed upstream.
V8 is wrong to filter on the available list and not include root. The better way would be to actually query ICU for the locale's actual status.
Internally, root is included in the manifest for the locales. I don't remember for sure; it's possible root is simply excluded here.
I don't think we should change the Web Reality behavior until TG2 has reached a consensus on this issue, so I don't want V8 or other engines to start doing something different with "und" in the meantime.