Use full decompositions in decomposition map

Question

harendra-kumar opened this issue 4 years ago

Currently we have a recursive decompose loop that keeps decomposing characters until they are no longer decomposable. This requires multiple lookups in the decomposable bitmap, and the loop itself adds to the cost. Instead, we can statically generate the fully decomposed sequence for each character, so that the run-time logic no longer needs a recursive loop. This could potentially speed up NFD/NFC normalization for several languages that involve composed forms (e.g. Devanagari and Japanese).
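To make the proposal concrete, here is a minimal sketch of the two approaches. It is not the library's actual code: `decomposeOnce`, `decomposeRecursive`, `fullDecompositions` and `decomposeFull` are hypothetical names, and the table holds only two real canonical mappings (U+01D5 and U+00DC) for illustration. A real implementation would generate the table from the Unicode data files and use a faster lookup structure than an association list.

```haskell
import Data.Maybe (fromMaybe)

-- Hypothetical one-step canonical decomposition table (tiny excerpt).
-- U+01D5 decomposes to U+00DC U+0304, and U+00DC in turn to U+0055 U+0308.
decomposeOnce :: Char -> Maybe [Char]
decomposeOnce c = case c of
  '\x01D5' -> Just ['\x00DC', '\x0304']
  '\x00DC' -> Just ['U', '\x0308']
  _        -> Nothing

-- Current approach: keep decomposing at run time until nothing changes.
decomposeRecursive :: Char -> [Char]
decomposeRecursive c =
  maybe [c] (concatMap decomposeRecursive) (decomposeOnce c)

-- Proposed approach: a table of full decompositions, computed once.
-- Here it is derived from the one-step table; in practice it would be
-- emitted by the code generator.
fullDecompositions :: [(Char, [Char])]
fullDecompositions =
  [ (c, decomposeRecursive c) | c <- ['\x00DC', '\x01D5'] ]

-- Run-time lookup is a single step with no recursion.
decomposeFull :: Char -> [Char]
decomposeFull c = fromMaybe [c] (lookup c fullDecompositions)
```

Both functions return the same sequence for every character; the only difference is whether the expansion happens during generation or at each call.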

ˌbodʲɪˈɡrʲim · Answer 1 · Fri May 08 2020 05:32:43 GMT+0800 (China Standard Time)

I tried this idea last week but did not see any performance improvement. Full decompositions can get fairly long, so maybe it is better to return them as arrays rather than as [Char]?

Harendra Kumar · Answer 2 · Fri May 08 2020 08:44:24 GMT+0800 (China Standard Time)

I removed the recursive decompose altogether as an experiment, and it does not seem to help at all.