Editorial comments on character definitions #28
Comments
Here are definitions in the Unicode glossary. I usually find that these are pretty clear and reliable, and so worth relying on for our own needs.
Character https://www.unicode.org/glossary/#character
Character encoding form https://www.unicode.org/glossary/#character_encoding_form
Character set https://www.unicode.org/glossary/#character_set
Code point https://www.unicode.org/glossary/#code_point
Code unit https://www.unicode.org/glossary/#code_unit
Extended grapheme cluster https://www.unicode.org/glossary/#extended_grapheme_cluster
Glyph https://www.unicode.org/glossary/#glyph
Glyph image https://www.unicode.org/glossary/#glyph_image
Grapheme https://www.unicode.org/glossary/#grapheme
Grapheme cluster https://www.unicode.org/glossary/#grapheme_cluster
User-perceived character https://www.unicode.org/glossary/#user_perceived_character
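To make the difference between a few of these glossary terms concrete, here is a minimal sketch that counts code points and code units for the same text. The helper name `counts` is made up for illustration, and it relies on Python's `len()` counting code points for `str`.

```python
def counts(s: str) -> dict:
    """Count code points and code units for one string (illustrative helper)."""
    return {
        # Python 3 strings are sequences of code points.
        "code_points": len(s),
        # UTF-8 code units are bytes.
        "utf8_code_units": len(s.encode("utf-8")),
        # UTF-16 code units are 16 bits (2 bytes) each; "-le" avoids a BOM.
        "utf16_code_units": len(s.encode("utf-16-le")) // 2,
    }

# "e" followed by U+0301 COMBINING ACUTE ACCENT: one user-perceived
# character, but two code points.
decomposed = counts("e\u0301")

# U+1D11E MUSICAL SYMBOL G CLEF: one code point outside the BMP, so it
# needs a surrogate pair in UTF-16.
clef = counts("\U0001D11E")

print(decomposed)
print(clef)
```

Neither count tells you how many user-perceived characters there are; that needs grapheme cluster segmentation, which the Python standard library does not provide.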
CSS terms.
Typographic character unit https://drafts.csswg.org/css-text-3/#typographic-character-unit
Typographic letter unit (letter) https://drafts.csswg.org/css-text-3/#typographic-letter-unit
I will review this in detail.
I suspect this is done?
https://w3c.github.io/bp-i18n-specdev/#characters
These are comments on the text recently added to the start of section 4.
[1] The 1st occurrence of 'character' is not highlighted in the same way as the other definitions.
[2] The first para, and probably all the rest, should be under "Choosing a definition of character" subsection.
[3] There's a conflation of 'glyph', 'grapheme', and 'user-perceived character' (UPC), which I think is incorrect. A given UPC can be represented by different glyphs, e.g. regular, italic, bold, alternative font, etc. Also, a single UPC can be represented by multiple glyphs.
[4] UPC is actually coterminous with the linguistic term 'grapheme', but graphemes are NOT 'visual units found in fonts and rendering software' - those are 'grapheme clusters' (an approximation to the concept of a grapheme expressed using rules defined by TUS).
[5] What's an 'individual rendering unit'?
[6] We should also mention the CSS term 'typographic character unit', see https://drafts.csswg.org/css-text-3/#characters.
[7] This is incorrect.
It is standard to backwards delete codepoints, but to forward delete grapheme clusters.
[8]
Please don't munge those two terms. A grapheme cluster is a mechanical approximation to a grapheme. (And note that they are defined separately in the Unicode glossary.)
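The deletion behaviour described in point [7] can be sketched roughly as follows. This is only an illustration under loud assumptions: `cluster_end`, `backspace`, and `forward_delete` are hypothetical helpers, and the cluster boundary rule used here (a base code point plus any following code points whose general category starts with "M") is a crude stand-in for the real grapheme cluster rules in UAX #29, not an implementation of them.

```python
import unicodedata

def cluster_end(text: str, start: int) -> int:
    """Index just past the approximate cluster beginning at `start`.

    Assumption: a cluster is one code point plus any following marks
    (general category Mn, Mc, or Me). UAX #29 has many more rules.
    """
    end = start + 1
    while end < len(text) and unicodedata.category(text[end]).startswith("M"):
        end += 1
    return end

def backspace(text: str, cursor: int) -> tuple[str, int]:
    """Delete the single code point before the cursor."""
    if cursor == 0:
        return text, 0
    return text[:cursor - 1] + text[cursor:], cursor - 1

def forward_delete(text: str, cursor: int) -> tuple[str, int]:
    """Delete the whole (approximate) cluster after the cursor."""
    if cursor >= len(text):
        return text, cursor
    return text[:cursor] + text[cluster_end(text, cursor):], cursor

# "cafe" + U+0301 COMBINING ACUTE ACCENT + "s"
text = "cafe\u0301s"

# Cursor just after the accented e: backspace removes only the accent.
after_backspace, _ = backspace(text, 5)

# Cursor just before the accented e: forward delete removes e and accent.
after_delete, _ = forward_delete(text, 3)

print(after_backspace)
print(after_delete)
```

The point of the asymmetry is that backspace lets a user peel off the last combining mark they typed, while forward delete removes the whole unit they see in front of the cursor.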
Looking at the section "Choosing a definition of character", I think it could do with some reordering. I'll submit a PR for that, because I think it will make it easier to integrate the text above.