googlefonts / fontc

Where in we pursue oxidizing (context: https://github.com/googlefonts/oxidize) fontmake.

Honor lookups during layout normalization for at least marks and kerning

rsheeter opened this issue · comments

Per f2f and #641 (comment) we are not correctly handling lookups. We should, at least for the layout types we generate for marks and kerning. Multiple lookups can be active and have overlapping rules. Resolution is:

  • Within a lookup the first matching subtable wins
    • For example (potential test?), in a lookup with two subtables that kern AW the first value is taken
  • For marks, the value from the last lookup wins.
    • That is, marks accumulate by assignment. Each new value overwrites any prior value.
  • For pairpos, the value accumulates by addition
    • That is, if many lookups kern AW the final value is the sum of the adjustments
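The three rules above can be sketched roughly as follows. This is a minimal illustration, not fontc's or layout-normalizer's actual code: `GlyphId`, `PairSubtable`, and the function names are hypothetical, and each subtable is assumed to be already flattened to explicit glyph pairs.

```rust
use std::collections::HashMap;

/// Hypothetical glyph id type for this sketch.
type GlyphId = u16;

/// One pair-kerning subtable, reduced to (first, second) -> advance adjustment.
type PairSubtable = HashMap<(GlyphId, GlyphId), i32>;

/// Within a lookup, the first subtable that has a rule for the pair wins.
fn lookup_value(subtables: &[PairSubtable], pair: (GlyphId, GlyphId)) -> Option<i32> {
    subtables.iter().find_map(|st| st.get(&pair).copied())
}

/// Across pairpos lookups, values accumulate by addition.
fn resolve_kern(lookups: &[Vec<PairSubtable>], pair: (GlyphId, GlyphId)) -> Option<i32> {
    let mut total = None;
    for lookup in lookups {
        if let Some(v) = lookup_value(lookup, pair) {
            total = Some(total.unwrap_or(0) + v);
        }
    }
    total
}

/// Across mark lookups, values accumulate by assignment: the last lookup
/// that provides an anchor wins. Each entry is one lookup's anchor, if any.
fn resolve_mark_anchor(lookups: &[Option<(i16, i16)>]) -> Option<(i16, i16)> {
    lookups.iter().rev().find_map(|a| *a)
}
```

For example, if one lookup has two subtables kerning AW by -50 and -99, and a second lookup kerns AW by -10, `resolve_kern` takes -50 from the first lookup and adds -10 from the second.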

A fun detail of this: we can't just combine the lookups as we process them, because different language systems may reference different sets of lookups. This means that it is possible for one lang/script to (for instance) include two different pairpos lookups with adjustments for the same pair (in which case they are summed) but another lang/script to include only one of those lookups.

Basically: for interactions across lookup boundaries we can't do the work ahead of time, and instead need to do it for each language system.
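A rough sketch of what "do it for each language system" could look like for kerning. The types here are assumptions made for illustration: each lookup is taken to be already flattened to pair -> adjustment, and a language system is identified by a plain string key mapped to the lookup indices it references.

```rust
use std::collections::{BTreeMap, HashMap};

type GlyphId = u16;
/// Index into the lookup list (hypothetical simplification).
type LookupIndex = usize;
/// A pairpos lookup already flattened to pair -> adjustment.
type FlatLookup = HashMap<(GlyphId, GlyphId), i32>;

/// For each language system, sum pair adjustments over only the lookups
/// that this language system references.
fn kerning_per_langsys(
    lookups: &[FlatLookup],
    langsys_lookups: &BTreeMap<String, Vec<LookupIndex>>,
) -> BTreeMap<String, BTreeMap<(GlyphId, GlyphId), i32>> {
    let mut out = BTreeMap::new();
    for (langsys, indices) in langsys_lookups {
        let mut pairs: BTreeMap<(GlyphId, GlyphId), i32> = BTreeMap::new();
        for &idx in indices {
            for (&pair, &value) in &lookups[idx] {
                // Values from different lookups add up.
                *pairs.entry(pair).or_insert(0) += value;
            }
        }
        out.insert(langsys.clone(), pairs);
    }
    out
}
```

With two lookups that both kern the same pair, a language system referencing both gets the sum, while one referencing only the first gets just that lookup's value.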

Within a lookup the first matching subtable wins
For example (potential test?), in a lookup with two subtables that kern AW the first value is taken

I think it's more complicated than that, at least for PairPos. The first glyph "A" could be in the coverage of multiple subtables: say one PairPosFormat1 subtable contains the pair "AW" and then another PairPosFormat1 subtable has the pair "AV"; if we stop at the first occurrence of "A" we will never evaluate the "AV" pair in the subsequent subtable.
Only when we have a match for both glyphs in the pair, not just the first, can we skip any duplicate pairs in following subtables.
I noticed this issue while attempting to use the updated layout-normalizer (following the #650 merge which was supposed to handle subtables) on the ufo2ft's compile-variable-features PR branch googlefonts/ufo2ft#635 (comment)

As @simoncozens noted in chat, "you should stop at the first matching rule, not the first matching subtable. The coverage tables are just there to help you filter out subtables you don't need to process at all."

Quoting https://learn.microsoft.com/en-us/typography/opentype/spec/chapter2#coverage-table (emphasis mine)

If a glyph does not appear in a Coverage table, the client can skip that subtable and move immediately to the next subtable.

so doing this is actually incorrect:

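for (gid1, pairset) in coverage.iter().zip(pairsets.iter()) {
    // Incorrect: this skips gid1 in every later subtable,
    // even for pairs that have not been seen yet.
    if !seen.insert(gid1) {
        continue;
    }
    // ...
}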

... and in fact we are reporting as missing pairs that are valid and that hb-shape can apply just fine; see googlefonts/ufo2ft#635 (comment)
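A possible shape for the fix, keyed on the whole pair rather than on the first glyph. This is only a sketch: `PairPos1` is a hypothetical, simplified stand-in for a PairPosFormat1 subtable, not the type layout-normalizer actually uses.

```rust
use std::collections::{BTreeMap, HashSet};

type GlyphId = u16;

/// Hypothetical, simplified PairPosFormat1 subtable: coverage glyphs zipped
/// with their pair sets of (second glyph, adjustment).
struct PairPos1 {
    coverage: Vec<GlyphId>,
    pairsets: Vec<Vec<(GlyphId, i32)>>,
}

/// Collect kerning pairs from a lookup's subtables. The first rule for a
/// given (gid1, gid2) pair wins; later duplicates of that pair are skipped.
fn collect_pairs(subtables: &[PairPos1]) -> BTreeMap<(GlyphId, GlyphId), i32> {
    let mut seen = HashSet::new();
    let mut out = BTreeMap::new();
    for subtable in subtables {
        for (&gid1, pairset) in subtable.coverage.iter().zip(subtable.pairsets.iter()) {
            for &(gid2, value) in pairset {
                // Key on the whole pair, not on gid1 alone.
                if seen.insert((gid1, gid2)) {
                    out.insert((gid1, gid2), value);
                }
            }
        }
    }
    out
}
```

With a first subtable containing only AW and a second containing AV and a duplicate AW, this keeps AW from the first subtable and still picks up AV from the second.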

I attach below two fonts and the respective "markkern.txt" dumps from layout-normalizer. One font was built with regular fontmake, the other using the ufo2ft/compile-variable-features branch linked above. The reported differences all seem to be false positives.

Oswald-VF-main-vs-vfea-markkern.zip

I think it's more complicated than that, at least for PairPos. The first glyph "A" could be in the coverage of multiple subtables: say one PairPosFormat1 subtable contains the pair "AW" and then another PairPosFormat1 subtable has the pair "AV"; if we stop at the first occurrence of "A" we will never evaluate the "AV" pair in the subsequent subtable.
Only when we have a match for both glyphs in the pair, not just the first, can we skip any duplicate pairs in following subtables.

Correct. PairPosFormat1 is a bit special in that matching coverage is not enough reason to stop: the search stops only if the pair is found. With PairPosFormat2 you stop if coverage matched.

With PairPosFormat2 you stop if coverage matched.

This used to be the case in HarfBuzz at least. I think the other engines still do this. Because each glyph2 is assigned a class, even if 0. So the search always "succeeds".

However, a while back, mostly as an optimization I think, I made HB not apply PairPos2 kerning if glyph2's class is 0. This has the effect now that it will cascade to the next subtables.

I haven't tested other implementations.
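To make the difference concrete, here is a small sketch of the two behaviours described above. Everything in it is an assumption made for illustration: `PairPos2` is a simplified stand-in for a PairPosFormat2 subtable, and the `skip_class2_zero` flag only models the HarfBuzz behaviour as described in this comment; it is not HarfBuzz's actual code.

```rust
use std::collections::{HashMap, HashSet};

type GlyphId = u16;

/// Hypothetical, simplified PairPosFormat2 subtable.
struct PairPos2 {
    coverage: HashSet<GlyphId>,
    class1: HashMap<GlyphId, usize>, // glyphs absent here are class 0
    class2: HashMap<GlyphId, usize>, // glyphs absent here are class 0
    values: Vec<Vec<i32>>,           // values[class1][class2]
}

/// Returns the adjustment from the first subtable that "matches".
/// With `skip_class2_zero` false, a coverage hit always ends the search,
/// because every second glyph has a class, even if it is 0.
/// With it true, a class-0 second glyph falls through to later subtables.
fn kern(subtables: &[PairPos2], g1: GlyphId, g2: GlyphId, skip_class2_zero: bool) -> Option<i32> {
    for st in subtables {
        if !st.coverage.contains(&g1) {
            continue;
        }
        let c1 = st.class1.get(&g1).copied().unwrap_or(0);
        let c2 = st.class2.get(&g2).copied().unwrap_or(0);
        if skip_class2_zero && c2 == 0 {
            continue;
        }
        return Some(st.values[c1][c2]);
    }
    None
}
```

If the first subtable covers A but only classifies W, and the second covers A and classifies V, then AV yields the first subtable's class-0 value without the flag, and the second subtable's value with it.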

With PairPosFormat2 you stop if coverage matched.

This used to be the case in HarfBuzz at least. I think the other engines still do this. Because each glyph2 is assigned a class, even if 0. So the search always "succeeds".

However, a while back, mostly as an optimization I think, I made HB not apply PairPos2 kerning if glyph2's class is 0. This has the effect now that it will cascade to the next subtables.

@behdad are you correcting yourself here, or are you elaborating? Can you confirm that, for pairpos2, I can stop if there's a glyph in the coverage table that was in a preceding subtable?

edit: a better question would be, "is it okay if, for pairpos f. 2, I use the same logic as for f. 1, and only skip a pair if that pair existed in a preceding subtable?"

The problem here is that what you're doing in terms of dumping out all the rules - for all glyphs that are involved in those rules - is different from what a shaping engine would do to apply the rules, in that the shaping engine has a particular glyph that it's interested in at each position in the buffer. The algorithm for applying a lookup, as a shaping engine, is:

  • Iterate through all subtables.
    • You may skip a subtable if the current glyph does not appear in the coverage table.
    • If you find a rule in the current subtable which applies (that is, all glyphs in your current view of the buffer match the glyphs involved in the rule - e.g. for single pos, single subst, multiple subst, if the current glyph matches the glyph in the rule; for pair pos, if the current and next glyphs match the glyphs in the rule or their classes both match; for chaining rules, if the pre- and post-context both match, etc.) then apply the rule and finish processing this lookup.
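The algorithm above might be sketched like this. The `Subtable` trait, the `GlyphPairs` type, and the buffer representation (glyph id plus an accumulated advance adjustment) are all hypothetical simplifications for illustration, not any shaping engine's real API.

```rust
use std::collections::HashMap;

type GlyphId = u16;

/// Hypothetical subtable abstraction for this sketch.
trait Subtable {
    /// Cheap filter: does the coverage table contain this glyph?
    fn covers(&self, glyph: GlyphId) -> bool;
    /// Try to apply a rule at `pos`; returns true if a rule matched.
    fn apply(&self, buffer: &mut [(GlyphId, i32)], pos: usize) -> bool;
}

/// Apply one lookup at one buffer position, as a shaping engine would.
fn apply_lookup(subtables: &[Box<dyn Subtable>], buffer: &mut [(GlyphId, i32)], pos: usize) -> bool {
    let glyph = buffer[pos].0;
    for st in subtables {
        if !st.covers(glyph) {
            continue; // coverage miss: skip this subtable entirely
        }
        if st.apply(buffer, pos) {
            return true; // first matching rule wins; stop processing this lookup
        }
        // Coverage hit but no matching rule: fall through to the next subtable.
    }
    false
}

/// A glyph-pair kerning subtable, for demonstration.
struct GlyphPairs(HashMap<(GlyphId, GlyphId), i32>);

impl Subtable for GlyphPairs {
    fn covers(&self, glyph: GlyphId) -> bool {
        self.0.keys().any(|&(first, _)| first == glyph)
    }
    fn apply(&self, buffer: &mut [(GlyphId, i32)], pos: usize) -> bool {
        let next = match buffer.get(pos + 1) {
            Some(&(g, _)) => g,
            None => return false,
        };
        let key = (buffer[pos].0, next);
        if let Some(&adj) = self.0.get(&key) {
            buffer[pos].1 += adj;
            true
        } else {
            false
        }
    }
}
```

Note that the engine only ever asks about the glyphs actually in the buffer, which is why it can stop at the first matching rule without enumerating every pair.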

But what you're doing is just dumping out all the rules; so I suggest you just dump out all the contexts (single glyph for single pos, single subst and multiple subst; pair of glyph for both pairpos1 and pairpos2; several glyphs for ligature subst, and so on), and de-duplicate the contexts, first one wins.

@behdad are you correcting yourself here, or are you elaborating? Can you confirm that, for pairpos2, I can stop if there's a glyph in the coverage table that was in a preceding subtable?

Yes, you can assume that you can stop there. HarfBuzz works slightly differently but you can ignore that.

I found the clearest expression of the logic in FontForge docs:

Within a lookup, the subtables will be applied in order until one of them actually does something. Then no further subtables will be executed. Note that this is different from the way lookups behave – all active lookups will always be applied, but only one subtable in a lookup will be.

(my emphasis)