Regex::find_from_utf16/ucs2
The find functions for UTF-16 and UCS-2 strings incorrectly match against individual bytes instead of whole u16 code units in some cases.
use regress::Regex;

fn main() {
    let input = "赔".encode_utf16().collect::<Vec<_>>(); // U+8D54
    let re = Regex::new(r"[A-Z]").unwrap(); // 0x41 - 0x5A

    let matched = re.find_from_utf16(&input, 0).collect::<Vec<_>>();
    dbg!(matched.is_empty()); // false

    let matched = re.find_from_ucs2(&input, 0).collect::<Vec<_>>();
    dbg!(matched.is_empty()); // false
}
In this case the regex [A-Z] is interpreting "赔" (the single code unit 0x8D54) as the two bytes [0x54, 0x8D], and the byte 0x54 matches as the "T" character.
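To make the arithmetic concrete, here is a minimal standalone sketch that does not use regress at all. It only illustrates why a byte-wise comparison finds a match where a comparison on whole u16 code units does not. The helpers `any_in_range` and `le_bytes` are hypothetical names made up for this illustration, and the little-endian byte order is an assumption taken from the [0x54, 0x8D] sequence quoted above. It is not a description of how regress scans its input internally.

```rust
/// Hypothetical helper: true if any unit lies in the inclusive range [lo, hi].
fn any_in_range<T: Copy + PartialOrd>(units: &[T], lo: T, hi: T) -> bool {
    units.iter().any(|&u| lo <= u && u <= hi)
}

/// Hypothetical helper: split u16 code units into little-endian bytes.
fn le_bytes(units: &[u16]) -> Vec<u8> {
    units.iter().flat_map(|u| u.to_le_bytes()).collect()
}

fn main() {
    // "赔" is U+8D54, a single UTF-16 code unit.
    let units: Vec<u16> = "赔".encode_utf16().collect();
    assert_eq!(units, [0x8D54]);

    // Viewed as little-endian bytes, the low byte 0x54 comes first.
    let bytes = le_bytes(&units);
    assert_eq!(bytes, [0x54, 0x8D]);

    // Byte-wise comparison: 0x54 ('T') falls inside 0x41..=0x5A.
    assert!(any_in_range(&bytes, b'A', b'Z'));

    // Code-unit comparison: 0x8D54 is far outside 'A'..='Z'.
    assert!(!any_in_range(&units, 'A' as u16, 'Z' as u16));
}
```

The expected behaviour for the reproducer corresponds to the last assertion: compared as a whole code unit, 0x8D54 is outside the class, so both `matched.is_empty()` calls should report true.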
Yikes, great find.
ff467d2
find_from_ucs2
ridiculousfish