WG21 P1859: Modern terminology updates #7
Comments
Does it really work for character sets? The ones described in [lex.charset] appear to be literally sets (of code points): "The basic source character set consists of 96 characters..." If anything, the encoding would come up when the footnote under that says "an implementation is required to document how the basic source characters are represented in source files".
Many of the existing uses of "character set" are appropriate. In some places, "character encoding" would be more correct or clearer. In a few cases, the terms are used inconsistently. For example, http://eel.is/c++draft/locale.codecvt#3.
P1859 now tracks this issue. R0 was discussed in Belfast. Now awaiting an updated paper.
This issue is now tracked by cplusplus/papers#613.
The readme should be edited: the paper should be moved to the "Inactive papers" section with a note that it was replaced. This issue should be closed.
Agreed, thank you, @dimztimz! Done!
The C++ standard currently uses the term "character" to refer to both code units and code points and "character set" where character encoding is intended. The lack of proper terminology leads to defective interfaces (e.g., std::ctype<>::toupper). Future library additions for Unicode will depend on modern terminology for clear and correct specification.