[TIR] VNNI and ARM dot product intrinsic for tensorization #10925
@T.prim_func
def dot_product_4x4_i8i8i32_neon(
This is equivalent to the TE one in tvm/python/tvm/topi/arm_cpu/tensor_intrin.py, line 536 in 912993f:
def dot_int8_int8_int32_neon():
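For readers who have not seen this intrinsic before, its semantics can be sketched in plain Python. This is only an illustration under assumptions read off the name `4x4_i8i8i32`: `a` holds 4 int8 values, `b` is a 4x4 int8 tile, `c` holds 4 int32 accumulators, and each output lane computes `c[i] += sum_k a[k] * b[i][k]`. The function name `dot_4x4_i8i8i32_reference` is hypothetical and not part of TVM, and int32 overflow is not modeled.

```python
def dot_4x4_i8i8i32_reference(a, b, c):
    """Hypothetical reference: c[i] += sum_k a[k] * b[i][k] on a 4x4 int8 tile."""
    assert len(a) == 4 and len(c) == 4
    assert len(b) == 4 and all(len(row) == 4 for row in b)
    # Inputs are assumed to be int8 values.
    assert all(-128 <= x <= 127 for x in a)
    assert all(-128 <= x <= 127 for row in b for x in row)

    out = list(c)  # do not mutate the caller's accumulator
    for i in range(4):      # one accumulator per output lane
        for k in range(4):  # reduction over 4 int8 elements
            out[i] += a[k] * b[i][k]
    return out
```

A tensorized schedule would replace a loop nest of this shape with a single hardware dot-product instruction; the sketch is only useful for checking results against.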
@T.prim_func
def dot_product_4x4_i8i8i32_sdot(
This is equivalent to the TE one in tvm/python/tvm/topi/arm_cpu/tensor_intrin.py, line 431 in 912993f:
def dot_int8_int8_int32_neon_82(int32_lanes, dtype="uint"):
@T.prim_func
def dot_product_16x4_u8i8i32_vnni(
Equivalent to the TE one in tvm/python/tvm/topi/x86/tensor_intrin.py, line 244 in 912993f:
def dot_16x1x16_uint8_int8_int32_cascadelake():
# TODO(masahi): Parametrize the TVMScript description of dot product by
# shape and dtype, and share the common description with x86.
cc @junrushao1994 @yelite, this is one of the common needs for metaprogramming support in TVMScript. I think shape parameterization is possible via specialize, but I'm not sure whether I can use that with the T.Buffer syntax sugar.
A similar need arises for tensorcore (different mma shape x data type).
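The kind of sharing the TODO asks for can be illustrated outside TVMScript with an ordinary Python closure. This is a sketch of the idea only: it is not TVMScript, it does not use `specialize`, and the names `INT_RANGES` and `make_dot_product_reference` are hypothetical. It also assumes, from the intrinsic names, that the first operand is the reduction vector and the second is a `lanes x reduce_len` tile.

```python
# Value ranges for the integer dtypes used by the intrinsics discussed here.
INT_RANGES = {"int8": (-128, 127), "uint8": (0, 255)}


def make_dot_product_reference(lanes, reduce_len, a_dtype, b_dtype):
    """Build a plain-Python reference for c[i] += sum_k a[k] * b[i][k]."""
    a_lo, a_hi = INT_RANGES[a_dtype]
    b_lo, b_hi = INT_RANGES[b_dtype]

    def dot_product(a, b, c):
        # Shape checks derived from the factory parameters.
        assert len(a) == reduce_len and len(c) == lanes
        assert len(b) == lanes and all(len(row) == reduce_len for row in b)
        # Dtype checks: values must fit the declared input types.
        assert all(a_lo <= x <= a_hi for x in a)
        assert all(b_lo <= x <= b_hi for row in b for x in row)
        return [
            c[i] + sum(a[k] * b[i][k] for k in range(reduce_len))
            for i in range(lanes)
        ]

    return dot_product


# One shared description, instantiated for the two shapes named in this PR.
dot_4x4_i8i8 = make_dot_product_reference(4, 4, "int8", "int8")
dot_16x4_u8i8 = make_dot_product_reference(16, 4, "uint8", "int8")
```

The open question in the comment is whether TVMScript can express the same factoring when the shapes and dtypes appear inside `T.Buffer` annotations, which a plain closure like this sidesteps.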
vec_b = B.vload([0, 0], dtype="int8x16")
# TODO(masahi): Remove duplication when inlined function call is supported
cc @junrushao1994 @yelite, I want to define and call a convenience function like the one in tvm/python/tvm/topi/arm_cpu/tensor_intrin.py, line 625 in 912993f:
def pairwise_add_mul(extract_half):
Yes, it is likely doable and @Hzfengsy probably already has something ready
82e152a to 07bbb38
CC @vinx13 would you like to review? Thanks a lot!
Introduces a new directory, python/tvm/tir/tensor_intrin, where we put intrinsic descriptions written in TVMScript for various HW targets. They can be used by manually tensorized TIR schedules or auto-tensorized ones. More intrinsics, such as tensorcore ones, DP4A, etc., will be added later.
@junrushao1994 @vinx13 @shingjan @Hzfengsy @spectrometerHBH