Overproductive theonym lookups in Noto Nastaliq Urdu #4

Closed
dscorbett opened this issue Feb 3, 2022 · 3 comments

@dscorbett

Font

NotoNastaliqUrdu-Regular.otf

Where the font came from, and when

Site: https://github.com/googlefonts/noto-fonts/blob/f7556efcaf940cf6c2f8ae0c3386d651e930b5cc/unhinted/otf/NotoNastaliqUrdu/NotoNastaliqUrdu-Regular.otf
Date: 2022-02-03

Font version

Version 3.002

Issue

Noto Nastaliq Urdu version 3.002 (not to be confused with yesterday’s build, also called 3.002) ligates some <lam, lam, heh> sequences which are clearly unrelated to the word “اللّٰه” and should not be ligated. If the heh is not really a heh but another letter that the font converts into a heh glyph, the ligature should not be formed. If the diacritics are incompatible with the pronunciation “llah”, the ligature should not be formed.

My rationale is that other <lam, lam, heh> sequences are not ligated; e.g. inserting a kasra, or ijam on a lam, blocks the ligature. I conclude that this ligature depends on the meaning of the word and not just the shapes of the glyphs. That’s why I include the second example below, though ae and heh are otherwise indistinguishable in final position.

Character data

للّٰة
U+0644 ARABIC LETTER LAM
U+0644 ARABIC LETTER LAM
U+0651 ARABIC SHADDA
U+0670 ARABIC LETTER SUPERSCRIPT ALEF
U+0629 ARABIC LETTER TEH MARBUTA

للّٰە
U+0644 ARABIC LETTER LAM
U+0644 ARABIC LETTER LAM
U+0651 ARABIC SHADDA
U+0670 ARABIC LETTER SUPERSCRIPT ALEF
U+06D5 ARABIC LETTER AE

لَله
U+0644 ARABIC LETTER LAM
U+064E ARABIC FATHA
U+0644 ARABIC LETTER LAM
U+0647 ARABIC LETTER HEH

Screenshots

للّٰة
للّٰە
لَله

@simoncozens
Contributor

Okay. It seems like the answer is to move this lookup earlier, before all the dot decomposition turns other characters into heh+marks. I can do that, but what I don’t have is a good set of rules for which marks should allow and which marks should block the lookup. Do you mind spelling that out for me?

@dscorbett
Author

dscorbett commented Feb 4, 2022

According to https://github.com/googlefonts/noto-fonts/issues/384#issuecomment-110829445 and https://github.com/googlefonts/noto-fonts/issues/384#issuecomment-110836010, it should be something like:

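# Ligature lookup, intended to be invoked only through DivNmCheck below.
# IgnoreMarks lets it match the three base glyphs across the intervening marks.
lookup FormDivineName {
    lookupflag IgnoreMarks;
    sub LamIni LamMed HehFin by Divine_nm_p1;
} FormDivineName;
# Contextual gate: apply the ligature only when LamMed is followed by a shadda,
# then a superscript alef or fatha, then HehFin.
lookup DivNmCheck {
    sub LamIni' lookup FormDivineName LamMed' ShaddaNS' [AlefSuperiorNS FathaNS]' HehFin';
} DivNmCheck;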

The following rule might be okay in DivNmCheck too, but then again I don’t know if it is necessary. Do people ever write exactly one of the two diacritics?

    sub LamIni' lookup FormDivineName LamMed' [ShaddaNS AlefSuperiorNS FathaNS]' HehFin';
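
To see what the stricter DivNmCheck rule would accept, here is a rough Python sketch that models it on code points rather than glyphs. It is only an approximation under stated assumptions: the real lookup runs on shaped glyphs (LamIni, LamMed, HehFin) after any mark reordering by the shaping engine, whereas this sketch takes the characters in the order given and ignores joining context. The names `should_ligate` and `REPORTED` are hypothetical and do not come from the font sources.

```python
# Code points involved in the proposed rule.
LAM = "\u0644"
HEH = "\u0647"
SHADDA = "\u0651"
SUPERSCRIPT_ALEF = "\u0670"
FATHA = "\u064E"


def should_ligate(text: str) -> bool:
    """Approximate the strict rule: lam, lam, shadda, (superscript alef | fatha), heh.

    Assumption: the input is exactly the candidate sequence, in the order the
    font would see it. Joining forms and surrounding letters are not modelled.
    """
    if len(text) != 5:
        return False
    lam1, lam2, mark1, mark2, last = text
    return (
        lam1 == LAM
        and lam2 == LAM
        and mark1 == SHADDA
        and mark2 in (SUPERSCRIPT_ALEF, FATHA)
        and last == HEH  # a real heh, not teh marbuta or ae
    )


# The three sequences from the "Character data" section of this issue.
REPORTED = {
    "teh marbuta": "\u0644\u0644\u0651\u0670\u0629",
    "ae": "\u0644\u0644\u0651\u0670\u06D5",
    "fatha on first lam": "\u0644\u064E\u0644\u0647",
}

if __name__ == "__main__":
    for label, seq in REPORTED.items():
        print(label, should_ligate(seq))
```

Under this model none of the three reported sequences qualifies, while lam, lam, shadda, superscript alef, heh (the tail of “اللّٰه”) does. Modelling the looser one-diacritic rule would need a separate four-character branch.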

@simoncozens simoncozens transferred this issue from notofonts/noto-fonts Jun 20, 2022
simoncozens added a commit that referenced this issue Jul 17, 2023
@simoncozens
Contributor

I've gone with your suggestion.
