-
Notifications
You must be signed in to change notification settings - Fork 167
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement Quick Check #2769
Implement Quick Check #2769
Conversation
gcc/rust/util/make-rust-unicode.py
Outdated
@@ -16,7 +16,15 @@ | |||
# along with GCC; see the file COPYING3. If not see | |||
# <http://www.gnu.org/licenses/>. | |||
|
|||
# Run this program as | |||
# First, download the folloqing files from unicode.org |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
# First, download the folloqing files from unicode.org | |
# First, download the following files from unicode.org |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh thanks😆
10a9f78
to
4a9fc52
Compare
4a9fc52
to
2a72592
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is great, thank you :) is it possible to maybe add some unit testing? otherwise this looks good to me
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so these files are not present in any other GCC frontend? this is surprising. I guess it'll be a nice thing to bring higher in the GCC file hierarchy, so that other languages can benefit from this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Omg, I completely missed the fact that contrib/unicode contains these files 😮😮
https://github.com/tamaroning/gccrs/tree/2a72592cf111f8f3dcc024d1f34ccd7fe6d36878/contrib/unicode
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah, no problem! better late than never :D well done finding them
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed them and pushed just now
37125bc
to
b6c3c3d
Compare
void | ||
rust_nfc_qc_test () | ||
{ | ||
ASSERT_EQ (Rust::nfc_quick_check ({0x1e0a /* NFC_QC_YES */}), | ||
Rust::QuickCheckResult::YES); | ||
ASSERT_EQ (Rust::nfc_quick_check ( | ||
{0x1e0a /* NFC_QC_YES */, 0x0323 /* NFC_QC_MAYBE */}), | ||
Rust::QuickCheckResult::MAYBE); | ||
ASSERT_EQ (Rust::nfc_quick_check ({0x0340 /* NFC_QC_NO */}), | ||
Rust::QuickCheckResult::NO); | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@CohenArthur
Here are tests. (but very few...)
Sorry, I failed to commit gcc/rust/rust-lang.cc for some reason.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this looks great :) thanks @tamaroning, good work!
gcc/rust/util/rust-unicode.cc
Outdated
} | ||
|
||
template <std::size_t SIZE> | ||
int64_t | ||
bool |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we replace this function with a direct call to std::binary_search instead ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh yes.
I will fix it later
gcc/rust/ChangeLog: * rust-lang.cc (run_rust_tests): Add test. * rust-system.h: Add <algorithm>. * util/make-rust-unicode.py: Output NFC_Quick_Check table. * util/rust-codepoint.h (struct Codepoint): Add is_supplementary method. * util/rust-unicode-data.h: Generated. * util/rust-unicode.cc (binary_search_sorted_array): Removed. (lookup_cc): Remove namespace. (is_alphabetic): Use std::binary_search (nfc_quick_check): New function. (nfc_normalize): Use nfc_quick_check. (is_nfc_qc_maybe): New function. (is_nfc_qc_no): New function. (rust_nfc_qc_test): New test. * util/rust-unicode.h (is_nfc_qc_no): New function. (is_nfc_qc_maybe): New function. (enum class): New enum class. (nfc_quick_check): New function. (rust_nfc_qc_test): New test. Signed-off-by: Raiki Tamura <[email protected]>
b6c3c3d
to
39d668c
Compare
@P-E-P |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thank you!
Addresses #2379
I implemented Unicode Quick Check.
This checks if strings (usually identifiers) should be normalized or not before performing Unicode normalization.
QuickCheckResult::Yes shows the given string is already normalized, saving time and memory.
Strings should be normalized if the result is QuickCheckResult::No or QuickCheckResult::Maybe.
Also, I added Unicode data files which are compatible with GPL.
they are used to generate rust-unicode-data.h
See https://www.gnu.org/licenses/license-list.en.html#Unicode