Different behavior on x64 and aarch64 #99
Comments
There were some errors caught with ASAN and there is a pending PR; these mostly affect small sizes. The fixes will be merged soon and a new release will be made. However, the truth is that x86 (32-bit) is not tested as thoroughly as 64-bit. In fact, 32-bit x86 is not tested in our current Jenkins setup at all. |
Thank you for the quick answer! I'll rebuild vectorscan from the PR branch on Monday and will let you know if it fixes the problem. |
I have locally merged the PR to |
Thank you for testing that. It seems I will have to build a minimal test case around that and pinpoint what exactly triggers the problem for arm. |
Thank you! Please let me know if you manage to reproduce it on your side. |
The problem is that movemask128 is incorrectly implemented on ARM, i.e. it does not match the x86 behavior:
static really_inline u32 movemask128(m128 a) {
    static const uint8x16_t powers = { 1, 2, 4, 8, 16, 32, 64, 128, 1, 2, 4, 8, 16, 32, 64, 128 };
    // AND assumes that bytes are 0xFF, but that's not true!
    // Compute the mask from the input
    uint8x16_t mask = (uint8x16_t) vpaddlq_u32(vpaddlq_u16(vpaddlq_u8(vandq_u8((uint8x16_t)a, powers))));
    uint8x16_t mask1 = vextq_u8(mask, (uint8x16_t)zeroes128(), 7);
    mask = vorrq_u8(mask, mask1);
    // Get the resulting bytes
    uint16_t output;
    vst1q_lane_u16((uint16_t*)&output, (uint16x8_t)mask, 0);
    return output;
}
After #101, the issue goes away. |
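To make the comment in that snippet concrete, here is a minimal scalar sketch in portable C. It is not the vectorscan code and not the fix from #101. It assumes the intended semantics are those of x86 `_mm_movemask_epi8` (bit i of the result is the most significant bit of byte i) and emulates the AND-with-powers reduction without NEON. The names `movemask_ref`, `movemask_and_powers`, and `movemask_demo` are hypothetical, introduced only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed reference semantics: bit i of the result is the most
 * significant bit of byte i. */
static uint32_t movemask_ref(const uint8_t v[16]) {
    uint32_t out = 0;
    for (int i = 0; i < 16; i++) {
        out |= (uint32_t)(v[i] >> 7) << i;
    }
    return out;
}

/* Scalar emulation of the AND-with-powers approach: AND each byte with
 * its power of two, then sum each 8-byte half into one mask byte. */
static uint32_t movemask_and_powers(const uint8_t v[16]) {
    static const uint8_t powers[16] = { 1, 2, 4, 8, 16, 32, 64, 128,
                                        1, 2, 4, 8, 16, 32, 64, 128 };
    uint32_t lo = 0, hi = 0;
    for (int i = 0; i < 8; i++) {
        lo += (uint32_t)(v[i] & powers[i]);
        hi += (uint32_t)(v[i + 8] & powers[i + 8]);
    }
    return lo | (hi << 8);
}

/* Hypothetical demo: the two agree on 0x00/0xFF bytes and disagree
 * as soon as a byte has its top bit set without being 0xFF. */
static void movemask_demo(void) {
    uint8_t canonical[16] = { 0 };
    canonical[0] = 0xFF;
    canonical[9] = 0xFF;
    assert(movemask_ref(canonical) == 0x0201u);
    assert(movemask_and_powers(canonical) == 0x0201u);

    uint8_t msb_only[16] = { 0 };
    msb_only[0] = 0x80; /* top bit set, low bits clear */
    assert(movemask_ref(msb_only) == 0x0001u);
    assert(movemask_and_powers(msb_only) == 0u);
}
```

Calling `movemask_demo()` from a `main` exercises both cases. One way to make the two agree, offered here only as an assumption, would be to normalize each byte to 0x00 or 0xFF based on its top bit before the AND.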
I've tested the patch and it appears that the problem is solved. Thank you very much for it! |
Closed with #102 |
I noticed inconsistent matching behavior between x64 and aarch64 for a certain regex and input.
Setup
Machine 1 (x64):
Machine 2 (aarch64) - QEMU Cortex-A72:
Version
Tested with vectorscan builds from master and from release 5.4.6; both exhibit the same behavior.
Test program
I compile it using g++.
The problem
When I run the program on x64, the input is successfully matched; on the ARM machine it isn't. I observed that the issue occurs only when both flags, HS_FLAG_UTF8 | HS_FLAG_CASELESS, are provided. Sometimes small tweaks to the regex make the problem go away, but I'm unable to pinpoint what exactly triggers this inconsistency.