nls: Auto-generate 11692 compose sequences
Most accented and other characters can be decomposed programmatically,
and yet the Compose file - thousands of lines long - is entirely manually
curated at present. This commit rectifies the situation by generating
a large number of Compose sequences with a Python script, generate_compose.py.
The comments inside that file describe its workings in more detail.
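
As a rough illustration of the decomposition idea (a sketch only, not the contents of generate_compose.py), Python's standard unicodedata module exposes the canonical decomposition of a character. The function name sketch_compose_sequence and the small COMBINING_TO_KEYSYM table are hypothetical, introduced here for illustration; the real script may work differently.

```python
import unicodedata

# Hypothetical mapping from combining marks to keysym names; an assumption
# for this sketch, not the table used by generate_compose.py.
COMBINING_TO_KEYSYM = {
    "\u0300": "grave",
    "\u0301": "acute",
    "\u0308": "diaeresis",
}


def sketch_compose_sequence(char):
    """Return a Compose-style line for char, or None if this sketch cannot derive one."""
    decomp = unicodedata.decomposition(char)
    # Skip characters with no decomposition, or a compatibility one ("<tag> ...").
    if not decomp or decomp.startswith("<"):
        return None
    parts = [chr(int(cp, 16)) for cp in decomp.split()]
    if len(parts) != 2:
        return None
    base, mark = parts
    # Only handle ASCII letters, whose keysym name equals the letter itself.
    if not (base.isascii() and base.isalpha()) or mark not in COMBINING_TO_KEYSYM:
        return None
    return f'<Multi_key> <{COMBINING_TO_KEYSYM[mark]}> <{base}>: "{char}"'


if __name__ == "__main__":
    for ch in "éàüa":
        print(repr(ch), sketch_compose_sequence(ch))
```

Characters without a canonical two-part decomposition simply return None here, which is where a hand-maintained list of exclusions or special cases would come in.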
Existing sequences do not have their meaning changed by this PR; generate_compose.py
contains a list of exclusions where necessary to ensure this. The only exceptions
are:
- The existing sequences
    <Multi_key> <U223C> <slash>: "≁"
  and
    <Multi_key> <approximate> <slash>: "≇"
  clashed; U223C and approximate are synonyms, but the repository tests did
  not detect this. The second sequence has been corrected to
    <Multi_key> <U2245> <slash>: "≇"
  in order to resolve the conflict.
- A few sequences that produced non-NFC strings now produce the
  NFC-normalized form of those strings.
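
Both exceptions can be checked mechanically. The following is a minimal sketch, assuming sequences are available as (keysym tuple, result string) pairs. The names KEYSYM_ALIASES, canonical, find_clashes and to_nfc are hypothetical and are not taken from generate_compose.py or the repository tests. The alias table lists only the synonym pair mentioned above.

```python
import unicodedata

# Hypothetical alias table: keysym names that denote the same key.
# Only the pair named in this description is listed.
KEYSYM_ALIASES = {"approximate": "U223C"}


def canonical(keys):
    """Rewrite each keysym name to its canonical spelling."""
    return tuple(KEYSYM_ALIASES.get(k, k) for k in keys)


def find_clashes(sequences):
    """Return (canonical keys, first result, later result) for conflicting entries."""
    seen = {}
    clashes = []
    for keys, result in sequences:
        key = canonical(keys)
        if key in seen and seen[key] != result:
            clashes.append((key, seen[key], result))
        seen.setdefault(key, result)
    return clashes


def to_nfc(result):
    """Return the NFC-normalized form of a Compose result string."""
    return unicodedata.normalize("NFC", result)


if __name__ == "__main__":
    before = [
        (("Multi_key", "U223C", "slash"), "≁"),
        (("Multi_key", "approximate", "slash"), "≇"),
    ]
    after = [
        (("Multi_key", "U223C", "slash"), "≁"),
        (("Multi_key", "U2245", "slash"), "≇"),
    ]
    print(find_clashes(before))  # one clash reported
    print(find_clashes(after))   # no clashes
```

Comparing sequences only after canonicalizing keysym names is what lets a check like this catch two spellings of the same key that a plain textual comparison would miss.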