Standard prefix form of derivative operators in Leibniz notation and D-notation should be in category L to match ISO 80000-2
SmashManiac opened this issue · comments
In section B.1 "Operator Dictionary", figure 25 "Mapping from operator (Content, Form) to a category" does not map U+0044 LATIN CAPITAL LETTER D nor U+0064 LATIN SMALL LETTER D in prefix form to category L, even though it does for U+2202 ∂ PARTIAL DIFFERENTIAL, U+2145 ⅅ DOUBLE-STRUCK ITALIC CAPITAL D and U+2146 ⅆ DOUBLE-STRUCK ITALIC SMALL D.
This is a problem because ISO 80000-2 specifies that the basic Latin version, not the double-struck italic version, should be used when representing a non-partial derivative operation in Leibniz notation. The non-standard double-struck italic notation appears to have been inherited from the Wolfram Language back when MathML 1.0 was created.
Because of this issue, if a user wants to write in the standard way a derivative or integral in Leibniz notation or D-notation in MathML Core, they are required to manually set the lspace and rspace attributes on the corresponding operators to match the values of category L from figure 26 "Operators values for each category" as a workaround, and hope that these values do not change in a future version.
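As a rough illustration of that workaround, here is a minimal Python sketch that adds explicit spacing to every `<mo>d</mo>` and `<mo>D</mo>` in a MathML fragment. The function name `apply_prefix_spacing`, the set of letters, and the spacing values are all assumptions made for this example; in particular, the values in `CATEGORY_L_SPACING` are placeholders that should be checked against figure 26 rather than taken from here.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"

# Assumed category L spacing; confirm against figure 26 before relying on it.
CATEGORY_L_SPACING = {"lspace": "0.1666em", "rspace": "0"}

# Hypothetical set of operator contents to treat as differential operators.
DIFFERENTIAL_LETTERS = {"d", "D"}


def apply_prefix_spacing(mathml: str) -> str:
    """Return the MathML with explicit spacing on <mo>d</mo> and <mo>D</mo>."""
    # Serialize MathML elements without a namespace prefix.
    ET.register_namespace("", MATHML_NS)
    root = ET.fromstring(mathml)
    for mo in root.iter(f"{{{MATHML_NS}}}mo"):
        if (mo.text or "").strip() in DIFFERENTIAL_LETTERS:
            for name, value in CATEGORY_L_SPACING.items():
                # Keep any spacing the author already set by hand.
                if name not in mo.attrib:
                    mo.set(name, value)
    return ET.tostring(root, encoding="unicode")


source = (
    '<math xmlns="http://www.w3.org/1998/Math/MathML">'
    "<mfrac><mrow><mo>d</mo><mi>y</mi></mrow>"
    "<mrow><mo>d</mo><mi>x</mi></mrow></mfrac>"
    "</math>"
)
result = apply_prefix_spacing(source)
print(result)
```

Only `<mo>` elements are touched, so a `d` marked up as an identifier with `<mi>` is left alone.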
Would you be able to show us a version of ISO-80000-2?
It has been difficult to take this family of standards into account as there is no public version (that I have found so far).
All ISO standards are public, but they aren't freely available. I would strongly recommend for the W3C to purchase a copy of the latest edition of ISO 80000-2 to better support international needs. A copy can be purchased directly from ISO's web store here:
https://www.iso.org/standard/64973.html
In the meantime, for this specific issue, the free sample of ISO 80000-2:2019 on that store actually showcases an example of Leibniz's notation using its own recommendations in the Foreword section as part of the errata of the previous edition:
https://www.iso.org/obp/ui/#iso:std:iso:80000:-2:ed-2:v2:en
There are a bunch more examples of Leibniz's notation in the free sample of ISO-80000-3:2019 as well, specifically in section 3 table 1 in the Remarks column for rows 3‑1.7, 3‑3, 3‑4, 3‑8, 3‑10.1, 3‑11, 3‑12, 3‑13, 3‑17.2 and 3‑23.2:
https://www.iso.org/obp/ui/#iso:std:iso:80000:-3:ed-2:v1:en
I am sorry @SmashManiac , it appears to me really difficult to try to follow such a standard and build, on it, a standard that aims to be built on widely available knowledge.
There are a whole lot of problems that come with changing such a behaviour, as it might affect notations that neither the working group nor the authors of the ISO-80000-2 document have thought of. I am thinking at least of the distance function in geometry or topology, which is rather common and should not be in category L. It is likely that other usages exist, such as the ones in the ISO-80000-3 document you refer to (diameter, ...).
I have a hard time seeing a way to approach a solution to the issue (and I suppose it is because of the paywall).
My understanding is that distance function notation would not be affected since the d in that case should be used as an identifier by a user and not as an operator?
That said, I completely understand the concern about backwards compatibility and how it would affect more obscure use cases. However, in MathML 3, the provided operator dictionary was simply a non-normative suggestion, whereas the current MathML Core draft makes it normative. As such, it didn't seem to me that backwards compatibility was a concern up until now, but it will definitely become one if this change is published as-is. This would make it the last opportunity to align the default operator dictionary with ISO standards.
As for the paywall issue, I understand that the W3C is a non-profit and that money isn't always easy to come by, but considering that both ISO standards and W3C standards should be built upon the same widely-available knowledge as you put it @polx, and that ISO probably has done way more research into this subject, I fail to see how such an investment would not be justified.
In the worst case scenario, if the status quo must be preserved despite the reasons I just mentioned forcing this issue to be closed without changes, it may be worth considering some alternative, like a mechanism that would allow users to pick which category they want their operators to be in without having to manually set lspace, rspace and the other relevant properties. It was a pain point for me when I converted all math formulas on my blog into MathML Core, at the very least.
We don't work for W3C. There are just a few people working directly for W3C. Many of the people in the W3C Math working group work as developers or researchers on software that uses mathematical display. So having W3C access won't solve it.
But access is not the problem. I have already been able to see ISO-80000-2 through friends and other unofficial means. And I dare say that I do not feel at all that it is better documented or instructed than the knowledge of the WG, and it has a French-speakers' orientation, but such a critique is for another day.
The problem of a standard behind a paywall is about traceability and shareability of our decisions: our mandate is to try to respect the cultures everywhere and to do so in an open fashion. Mentioning an ISO standard behind a paywall works no better than mentioning any scientific publication that is not open-access. Note that the data of several ISO standards are made openly accessible. E.g. the input for the currency iso-639 family of standards is publicly accessible on stix' page.
The need to convert a whole blog is definitely a pain I can share. But advancing standards (for example, having CSS sheets apply to MathML as is wished in interop's #861) might be a far better way for the future. You could also, temporarily, use JS to avoid injecting all these attributes.
Apart from the problems with ISO-80000-2 not being freely available, it would not in general be a good basis for mathml defaults (although obviously it could have been used as one of many possible sources of reference had it been available).
80000-2 is an update to ISO 31-11:1992, which had a more descriptive name, "mathematical signs and symbols for use in physical sciences and technology". That is, it explicitly documents mathematical notations used in science and engineering that differ from traditional publishing standards in mathematics. Use of an upright differential d rather than math italic d is one of these, actually. In any case, even if an upright d were the preferred form, there is no possibility of having an operator dictionary entry for any of the basic Latin letters: you need consistent handling for the alphabet, so you can't have individual letters getting different default spacing. The punctuation symbols in the ASCII range have operator dictionary entries, but no letters do. Conversely, U+2202 ∂ PARTIAL DIFFERENTIAL, U+2145 ⅅ DOUBLE-STRUCK ITALIC CAPITAL D and U+2146 ⅆ DOUBLE-STRUCK ITALIC SMALL D all have specific mathematical default interpretations, so they can have operator dictionary entries. But that does not mean that you cannot use standard Latin characters for these purposes.
Makes sense, and thank you for the insights! The only opinion I disagree with is with the interpretation of ISO 80000's title and how it relates to traditional publishing, but I don't have evidence either way.
Putting aside the ISO stuff, I'm starting to think that the real issue is the inability for a web developer to define how operators should render with a different category than the default without adding a bunch of advanced overriding properties (either manually or through JS), and that's if they would even understand what each category represents in the first place.
Should a separate issue be created to tackle this specific problem? I don't think there would be any other reason to leave this issue open otherwise, unless the current apparent consensus against ISO standards alignment changes in the future.
Actually, the main issue isn't conformance to 80000-2 or anything else, or even just fixing any "bad choices" about default operator spacing: changing the spacing now is really difficult. The operator dictionary spacing has been available in W3C recommendations since 1998 or so, and the spacing for the ASCII character range hasn't changed in all that time, so changing it now would potentially change a lot of documents. Also, for MathML Core implementations it is not specified in some declarative table at run time; it is baked into the compiled layout code of the browsers. So changing it isn't impossible, but it would mean getting agreement from all the major web browsers.
I'm starting to think that the real issue is the inability for a web developer to define how operators should render with a different category than the default without adding a bunch of advanced overriding properties (either manually or through JS),
Hopefully the interaction with CSS will get better and better specified. Currently it's not always clear where you can change the spacing with CSS and where the math layout positions elements without CSS applying. But global document layout properties in a web context really need either CSS or JavaScript; there is not a lot MathML can specify at a document level.
However, I wouldn't really call the spacing attributes "advanced". It's just lspace and rspace, so no harder than adding a class or some other attribute; you would need some markup. Certainly some TeX to MathML converters just routinely add lspace and rspace in all cases to force the same spacing as in the source, which more or less means that the operator dictionary is not used at all. That makes the MathML a bit verbose, but I wouldn't call it "advanced", and as almost all MathML is generated rather than hand-written, it's no harder to produce than MathML without those attributes.
To clarify, the only reason I've been calling these attributes "advanced" is because it's not easy for the average front-end web developer to determine exactly which element/attribute needs to be modified, and which value to set it to.
It would definitely be great if a CSS stylesheet alone could easily resolve that problem globally for an entire document, but as long as there won't be a CSS selector that checks for specific text contents in an element in some way, it won't be possible to target specific MathML operators without adding attributes to them in the DOM, and that introduces compatibility issues with full MathML...
Ah OK, yes: if you don't want to accept the defaults, currently the only viable options are to add explicit attributes while generating the MathML or to add them via JavaScript. It's just the way it is. Probably you can just add a class attribute and then adjust things by CSS class, but as you say, the natural CSS selector here without adding markup would be on the element content, and that's not available.
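The class-based route could look something like the following Python sketch, run at generation time. The mapping `OPERATOR_CLASSES`, the class name `differential`, and the function name `tag_operators` are hypothetical choices for this example. Whether a stylesheet rule on that class can actually change the operator spacing depends on how CSS interacts with math layout, which this thread leaves open.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"

# Hypothetical mapping from operator text to a class name of our choosing.
OPERATOR_CLASSES = {"d": "differential", "D": "differential"}


def tag_operators(mathml: str) -> str:
    """Add a class to <mo> elements whose text appears in OPERATOR_CLASSES."""
    # Serialize MathML elements without a namespace prefix.
    ET.register_namespace("", MATHML_NS)
    root = ET.fromstring(mathml)
    for mo in root.iter(f"{{{MATHML_NS}}}mo"):
        css_class = OPERATOR_CLASSES.get((mo.text or "").strip())
        if css_class:
            # Preserve classes that are already present on the element.
            existing = mo.get("class", "").split()
            if css_class not in existing:
                mo.set("class", " ".join(existing + [css_class]))
    return ET.tostring(root, encoding="unicode")


tagged = tag_operators(
    '<math xmlns="http://www.w3.org/1998/Math/MathML">'
    '<mrow><mo class="x">d</mo><mi>x</mi><mo>+</mo><mi>y</mi></mrow>'
    "</math>"
)
print(tagged)
```

This stands in for the missing content-based selector: the matching on element text happens once in the generator, and the stylesheet only needs to know the class.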
We did consider this issue as a group. While there are differing levels of agreement or disagreement as to whether the suggested spacing would be good if we were starting now, the bar for changing the operator dictionary is very high, especially for ASCII letters: it would change the default spacing for any documents that may have used these in the 25+ years since MathML 1 was published. We therefore resolved to close this with no change.