-
Notifications
You must be signed in to change notification settings - Fork 3.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[microTVM] Replace arm_nnsupportfunctions.h with arm_acle.h #13363
Conversation
Thanks for contributing to TVM! Please refer to the contributing guidelines https://tvm.apache.org/docs/contribute/ for useful information and tips. Please request code reviews from Reviewers by @-ing them in a comment. Generated by tvm-bot |
The include is removed in #13242, so we can wait for that to land 😸 |
540681a
to
aaa7786
Compare
183ee05
to
f98eb78
Compare
f98eb78
to
1aa5d09
Compare
This attempts to replace the CMSIS-NN header with a more portable alternative and avoid dependence on CMSIS
1aa5d09
to
85cfa55
Compare
The packing definitions aren't implemented as ACLE intrinsics nor is there a simple way to convince a C compiler to generate them.
85cfa55
to
ab667a5
Compare
@Mousius any update on this? |
We have started looking at it. I am able to reproduce the hang locally: |
Introduce `memcpy` to explain to the compiler that we're changing the alignment of `int16_t` to `int32_t`. What this appears to actually do is encourage the compiler to use three loads rather than one double load plus a regular load. The padded array is aligned as an `int16_t`, it isn't guaranteed to behave like an `int32_t` aligned array. One of the side effects of the type punning from `int16_t*` to `int32_t*` is that we're effectively lying to the compiler that this is correctly aligned and it can use instructions which load multiple `int32_t`s at the same time - this does not work 😿 Co-authored-by: Ashutosh Parkhi <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me @Mousius! That makes way for executing the generated code independent of CMSIS-NN.
…3363) * [microTVM] Replace arm_nnsupportfunctions.h with arm_acle.h This attempts to replace the CMSIS-NN header with a more portable alternative and avoid dependence on CMSIS * Remove CMSIS __STATIC_FORCEINLINE macro * Replace more intrinsics with ACLE variants * Use builtins for intrinsics missing in older GCC * Re-use common_includes to propagate shared functions The packing definitions aren't implemented as ACLE intrinsics nor is there a simple way to convince a C compiler to generate them. * Properly align memory access for Introduce `memcpy` to explain to the compiler that we're changing the alignment of `int16_t` to `int32_t`. What this appears to actually do is encourage the compiler to use three loads rather than one double load plus a regular load. The padded array is aligned as an `int16_t`, it isn't guaranteed to behave like an `int32_t` aligned array. One of the side effects of the type punning from `int16_t*` to `int32_t*` is that we're effectively lying to the compiler that this is correctly aligned and it can use instructions which load multiple `int32_t`s at the same time - this does not work 😿 Co-authored-by: Ashutosh Parkhi <[email protected]>
This attempts to replace the CMSIS-NN header with a more portable alternative and avoid dependence on CMSIS