Update de_Latn.textproto #152

Merged: 1 commit, Jul 8, 2024

nathan-williams
Contributor

Move `ẞ` to the auxiliary character set. The lowercase `ß` is still part of the base character set.

Based on [Wikipedia](https://en.wikipedia.org/wiki/%C3%9F#:~:text=Additionally%2C%20as%20of%202017%2C%20when,with%20%E2%9F%A8SS%E2%9F%A9%20in%20allcaps.), `ẞ` seems to be more of an auxiliary than base character. It is relatively new and being written as `SS` is considered valid.
@nathan-williams nathan-williams merged commit 0f47e09 into main Jul 8, 2024
11 checks passed
@nathan-williams nathan-williams deleted the de_Latn_base_charset_update branch July 8, 2024 22:54
@moyogo
Contributor

moyogo commented Jul 9, 2024

@nathan-williams google/fonts#7926 should have been based on this PR.

The Wikipedia page says:

Additionally, as of 2017, when capitalized, either capital ⟨ẞ⟩ (STRAẞE) or ⟨SS⟩ (STRASSE) are considered equally valid in all situations (not just when the character is unavailable)

More importantly, the reference it cites, the RdR Regeln und Wörterverzeichnis 2016 (the current official spelling rules), says:

E3: Bei Schreibung mit Großbuchstaben schreibt man SS. Daneben ist auch die Verwendung des Großbuchstabens ẞ möglich. Beispiel: Straße – STRASSE – STRAẞE. [When writing in all caps, one writes SS. It is also permitted to write ẞ. Example: Straße – STRASSE – STRAẞE.]
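The two all-caps spellings in rule E3 can be seen in Unicode's default case mappings. The following is a minimal Python sketch, not part of the original comment; `upper_with_capital_eszett` is a hypothetical helper named here for illustration only.

```python
# Python's built-in str methods apply Unicode's default case mappings.
word = "Straße"

# Default uppercasing expands ß to SS; it does not produce ẞ (U+1E9E).
upper_default = word.upper()


def upper_with_capital_eszett(text: str) -> str:
    """Hypothetical helper: uppercase text, writing ß as capital ẞ instead of SS."""
    return text.replace("ß", "ẞ").upper()


upper_eszett = upper_with_capital_eszett(word)

# Lowercasing ẞ gives back ß, so the capital-ẞ form round-trips to the
# original lowercase spelling, while the SS form does not.
print(upper_default, upper_eszett, upper_eszett.lower(), upper_default.lower())
```

Both uppercase forms are valid under E3; the sketch only shows that software has to opt in to ẞ explicitly, since the default mapping yields SS.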

While auxiliary is not formally defined here, CLDR’s LDML auxiliary is defined as "Additional characters for common foreign words, technical usage". ẞ is not foreign and not technical, it’s a character that can be used as a capital of ß instead of SS.

@dscorbett

“Auxiliary” also seems to cover characters used in native words but only rarely: en_Latn.textproto puts “æ” and “œ” in auxiliary.

@vv-monsalve
Contributor

AFAIK, the official German orthography includes both ẞ and ß.

cc @mekkablue

@moyogo
Contributor

moyogo commented Jul 9, 2024

Just to be clear, I’m arguing ẞ should remain in de_Latn base exemplar. This PR breaks https://github.com/googlefonts/glyphsets and its GF Latin Core set as it doesn’t include auxiliary exemplar for many languages.
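To illustrate why moving a character out of the base exemplar can affect a derived glyphset, here is a rough Python sketch. It assumes, purely for illustration, that a core set is built as the union of each language's base exemplars and ignores auxiliary ones; the data, `move_to_auxiliary`, and `build_core_set` are hypothetical and do not reflect the actual glyphsets tooling or the real textproto contents.

```python
from copy import deepcopy

# Hypothetical, heavily abbreviated exemplar data; NOT the real file contents.
EXEMPLARS = {
    "de_Latn": {"base": {"ä", "ö", "ü", "ß", "ẞ"}, "auxiliary": {"é"}},
    "en_Latn": {"base": {"a", "b", "c"}, "auxiliary": {"æ", "œ"}},
}


def move_to_auxiliary(exemplars, lang, char):
    """Return a copy of the data with `char` moved from base to auxiliary."""
    updated = deepcopy(exemplars)
    updated[lang]["base"].discard(char)
    updated[lang]["auxiliary"].add(char)
    return updated


def build_core_set(exemplars):
    """Assumed rule: the core set is the union of all base exemplars."""
    core = set()
    for sets in exemplars.values():
        core |= sets["base"]
    return core


core_before = build_core_set(EXEMPLARS)
core_after = build_core_set(move_to_auxiliary(EXEMPLARS, "de_Latn", "ẞ"))

# Characters that drop out of the derived set after the move.
print(sorted(core_before - core_after))
```

Under that assumed rule, a set that never reads auxiliary exemplars silently loses ẞ once it is moved, while ß stays in place.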

@davelab6
Member

While the official German orthography now (recently, on a decades timescale ;) includes both ẞ and ß, this isn't helpful when filtering a type library by language rather than by Unicode range subsets: if you search for fonts that support German and fonts missing the ẞ character aren't shown, then Roboto, Poppins, Lato, etc. won't come up. This is undesirable: German support is better with ẞ, but it is not expected as a hard requirement.
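The filtering concern can be sketched as follows. This is an illustrative Python example under stated assumptions: the requirement sets are simplified to lowercase letters plus ẞ, the font names and their coverage are hypothetical placeholders, and `fonts_supporting` is not an existing API.

```python
# Hypothetical, simplified requirement sets (lowercase letters only, plus ẞ).
DE_LETTERS = set("abcdefghijklmnopqrstuvwxyzäöüß")
REQUIRED_WITH_CAP_ESZETT = DE_LETTERS | {"ẞ"}  # ẞ kept in the base set
REQUIRED_WITHOUT_CAP_ESZETT = set(DE_LETTERS)  # ẞ moved to auxiliary

# Hypothetical fonts, described only by the characters they cover.
FONTS = {
    "FontWithCapEszett": DE_LETTERS | {"ẞ"},
    "FontWithoutCapEszett": set(DE_LETTERS),
}


def fonts_supporting(fonts, required):
    """Return names of fonts whose coverage includes every required character."""
    return sorted(name for name, chars in fonts.items() if required <= chars)


strict = fonts_supporting(FONTS, REQUIRED_WITH_CAP_ESZETT)
lenient = fonts_supporting(FONTS, REQUIRED_WITHOUT_CAP_ESZETT)
print(strict)
print(lenient)
```

With ẞ in the required set, only the font that has it passes the filter; with ẞ treated as auxiliary, both do. Which result is desirable is exactly what the thread is debating.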

@twardoch
Collaborator

twardoch commented Jul 12, 2024

I believe ẞ should be a mandatory character for de-Latn (but not for de-CH-Latn). The DIN 5008:2020 standard for German computerized typesetting states that ẞ should be preferred over SS or SZ when setting all-caps. This is now similar to the notion that Ä should be preferred over AE, or Ö over OE. The rationale is: modern German digital texts DO include ẞ. Not all of them, but increasingly so.

So claiming a font is "good for German" if it doesn't include ẞ is as accurate as claiming that a font that doesn't have Ö is "good for German".

The normative status of ẞ in Standard German orthography is DONE. It's accepted by Germany, Austria, Belgium & Italy, with a "no comment" from Switzerland where not even ß is used.

@twardoch
Collaborator

twardoch commented Jul 12, 2024

ẞ is already part of the Apple iOS German keyboard layout, accessible the same way as ß. It's not (yet) part of the German layout of Google-made Gboard keyboard for iOS, but I'm sure it's coming.

ẞ has been an equal citizen of the German-Austrian Standard German orthography since 2017 (7 years ago), and the preferred character (over the old SS) in the DIN 5008 standard since 2020 (4 years ago).

Digital texts that contain the ẞ codepoint exist, and more of them are coming. In my view any font that doesn't contain ẞ these days can claim support of Swiss German (de-CH-Latn) but not of [Standard] German (de-Latn).

[Attached images: IMG_0995, IMG_0994]

@twardoch
Collaborator

I feel that if I specifically filter a list of fonts to show those that are supposed to "support a given language", then I would expect that the font includes all characters used to write that language's current standard orthography, and possibly older widespread texts.

I think it should be 100% coverage. If it were 80%, then a font would "support English" even if it doesn't contain "a", "b", "c", "d", "e" and "f" :)

I agree that an 80% or so rule is sensible for glyphsets, as they are a different matter, they're technical info.

But language support is the most important human-centric info about a font. More important than sans/serif etc.

Pretty much every typographic project contains text that's typically in a specific language. A font either supports that language or it doesn't — and if it doesn't, it's completely disqualified from consideration for typesetting that text. :)
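The 80% versus 100% coverage point above can be checked with a small calculation. This is a minimal Python sketch, assuming for illustration that the required English set is the 52 upper- and lowercase ASCII letters; `coverage` and `supports_language` are hypothetical helpers, not part of any existing tool.

```python
import string


def coverage(font_chars, required):
    """Fraction of the required characters that the font covers."""
    return len(required & font_chars) / len(required)


def supports_language(font_chars, required, threshold=1.0):
    """True if the font's coverage meets the threshold (default: 100%)."""
    return coverage(font_chars, required) >= threshold


# Assumed requirement: the 52 ASCII letters, upper and lower case.
ENGLISH = set(string.ascii_letters)

# A hypothetical font missing lowercase a through f.
font = ENGLISH - set("abcdef")

ratio = coverage(font, ENGLISH)  # 46 of 52 characters
print(supports_language(font, ENGLISH, threshold=0.8))
print(supports_language(font, ENGLISH))
```

Under this assumption the font covers 46 of 52 letters, so it clears an 80% threshold despite missing six letters, and only the 100% rule rejects it.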

@vv-monsalve
Contributor

If the German orthography has recently been updated we should pay attention to these changes instead of going backward. We must understand how language speakers are evolving and shaping the latest definitions.

The character "ẞ" is part of the Latin Core requirements, precisely considering the above, so we have required it for all the newest fonts. As for the old fonts, ideally, they should catch up with the latest standards. There's an open issue from a user regarding the absence of this character in Roboto. After all, the future is longer than the past. ;)

@twardoch
Collaborator

ẞ became part of standard German orthography around the same time as variable fonts were introduced. 7+ years already constitutes a decent past, but you're absolutely right that the future is much longer still!

@davelab6
Member

Alright, I am convinced that if a user searches for fonts that support German, then Roboto, Poppins, Lato, etc. should not come up.

@twardoch
Collaborator

> Alright, I am convinced that if a user searches for fonts that support German, then Roboto, Poppins, Lato, etc. should not come up.

Yes, and the right way to fix that will be to add the glyphs on the fonts. :) (Upcoming Lato v3 has ẞ)

@davelab6
Member

davelab6 commented Jul 12, 2024

> Upcoming Lato v3

You tease us :) Don't make me come over there!

@nathan-williams
Contributor Author

I'm on board with reverting this PR. There were internal pressures that led to this change, which @davelab6 touched on briefly above. Thank you to all who took the time to present arguments and share references that poked holes in the justification given in the PR summary.

@nathan-williams
Contributor Author

Reverted here: #153.

@twardoch
Collaborator

Thank you for that! I agree that this issue is "complicated". The uppercase letter got into Unicode in 2007, and for some 10 years many voices were against it. But gradually, designers added the glyph to some fonts, then users started adopting them, and then the German standards bodies decided "well, in the end, it's not a bad idea". I’m pretty sure that in 10 years or so, it’ll be considered fully normal by most users.

And now it’s "there for you to peruse" — so FONTS should HAVE it, if they’re "inviting" you to be the right font for German.

This process is a fascinating one from the point of view of written language. 25 years ago, we had the Euro symbol, then a few other currency symbols (like Russian ruble and Turkish lira), and now the uppercase ß. I’m glad that we’ve decided to be more future-proof. :)

6 participants