Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Different behavior on x64 and aarch64 #99

Closed
Palkovsky opened this issue Apr 8, 2022 · 8 comments
Closed

Different behavior on x64 and aarch64 #99

Palkovsky opened this issue Apr 8, 2022 · 8 comments
Labels
bug Something isn't working

Comments

@Palkovsky
Copy link

Palkovsky commented Apr 8, 2022

I noticed inconsitent matching behavior between x64 and aarch64 for certain regex and input.

Setup

Machine 1 (x64):

user@ubuntu:~$ gcc --version
gcc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
Copyright (C) 2019 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

user@ubuntu:~$ ragel -version
Ragel State Machine Compiler version 6.10 March 2017
Copyright (c) 2001-2009 by Adrian Thurston
user@ubuntu:~$ uname -a
Linux ubuntu 5.13.0-39-generic #44~20.04.1-Ubuntu SMP Thu Mar 24 16:43:35 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

Machine 2 (aarch64) - QEMU Cortex-A72:

user@aarch64-vm:~$ gcc --version
gcc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
Copyright (C) 2019 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

user@aarch64-vm:~$ ragel -version
Ragel State Machine Compiler version 6.10 March 2017
Copyright (c) 2001-2009 by Adrian Thurston
user@aarch64-vm:~$ uname -a
Linux aarch64-vm 5.4.0-107-generic #121-Ubuntu SMP Thu Mar 24 16:07:22 UTC 2022 aarch64 aarch64 aarch64 GNU/Linux

Version

Tested with vectorscan build from master and from release 5.4.6 - both exhibit same behavior.

Test program

I compile it using g++.

#include <iostream>
#include <cstring>
#include "hs/hs.h"

const auto match_cb = [](auto, auto, auto, auto, void* ctx) {
    *((bool*) ctx) = true;
    return 0;
};

bool vectorscan_scan(char* regex, const char* input, unsigned int id) {
    hs_database_t* db = nullptr;
    hs_scratch_t* scratch = nullptr;
    hs_compile_error_t* error;
    bool is_match = false;
    if (hs_compile(regex, HS_FLAG_DOTALL | HS_FLAG_SINGLEMATCH | HS_FLAG_UTF8 | HS_FLAG_CASELESS, HS_MODE_BLOCK, nullptr, &db, &error) != HS_SUCCESS) {
        printf("ERROR hs_compile(): %s\n", error->message);
        return false;
    }
    if (hs_alloc_scratch(db, &scratch) != HS_SUCCESS) {
        printf("ERROR hs_alloc_scratch()\n");
        return false;
    }
    if (hs_scan(db, input, strlen(input), id, scratch, match_cb, &is_match) != HS_SUCCESS) {
        printf("ERROR hs_compile(): %s\n", error->message);
        return false;
    }
    hs_free_database(db);
    hs_free_scratch(scratch);
    return is_match;
}

int main() {
    char regex[] = R"(schtasks(\.exe)?\s.*\/create.*cscript.*)";
    char input[] = R"(schtasks.exe /create "cscript abcd")";
    if (!vectorscan_scan(regex, input, 1)) {
        printf("FAIL - SHOULD MATCH\n");
    } else {
        printf("SUCCESS - OK\n");
    }
    return 0;
}

The problem

When I run the program on x64 the input is successfully matched, however on ARM machine it isn't. I observed that the issue occurs only when both HS_FLAG_UTF8 | HS_FLAG_CASELESS flags are provided. Sometimes small tweaks in regex make the problem go away, but I'm unable to pinpoint what exactly triggers this inconsistency.

@markos
Copy link

markos commented Apr 8, 2022

There were some errors caught with ASAN and there is a pending PR, these mostly affect small sizes. These will soon be merged and new release will be made, however the truth is that x86 (32-bit) is not really tested as thoroughly as 64-bit. In fact, 32-bit x86 is not tested in our current jenkins setup at all.

@Palkovsky
Copy link
Author

Thank you for the quick answer! I'll rebuild vectorscan from the PR branch on Monday and will let you know if it fixes the problem.

@Palkovsky
Copy link
Author

I have locally merged the PR to develop branch and rebuilt vectorscan off it. It seems that the issue persists.

@markos
Copy link

markos commented Apr 12, 2022

Thank you for testing that. It seems I will have to build a minimal test case around that and pinpoint what exactly triggers the problem for arm.

@Palkovsky
Copy link
Author

Thank you! Please let me know if you manage to reproduce it on your side.

@danlark1
Copy link

danlark1 commented Apr 15, 2022

The problem is movemask128 is incorrectly implemented on ARM, i.e. does not match _mm_shuffle_epi8 semantics but vectorscan/src/rose/program_runtime.c:1470 does not assume bytes are 0xff, so the code below fails

static really_inline u32 movemask128(m128 a) {
    static const uint8x16_t powers = { 1, 2, 4, 8, 16, 32, 64, 128, 1, 2, 4, 8, 16, 32, 64, 128 };
    // AND assumes that bytes are 0xFF, but that's not true!
    // Compute the mask from the input
    uint8x16_t mask  = (uint8x16_t) vpaddlq_u32(vpaddlq_u16(vpaddlq_u8(vandq_u8((uint8x16_t)a, powers))));
    uint8x16_t mask1 = vextq_u8(mask, (uint8x16_t)zeroes128(), 7);
    mask = vorrq_u8(mask, mask1);

    // Get the resulting bytes
    uint16_t output;
    vst1q_lane_u16((uint16_t*)&output, (uint16x8_t)mask, 0);
    return output;
}

After #101, the issue goes away

@Palkovsky
Copy link
Author

I've tested the patch and it apears that the problem is solved. Thank you very much for it!

@markos
Copy link

markos commented Apr 18, 2022

Closed with #102

@markos markos closed this as completed Apr 18, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants