[Performance] Performance regression with int64 indices INDEX_DEFAULT_I64=ON (PR #6143) #6691
Comments
Sure. I will look into that. Thanks @trevor-m
ping @hzfan, can you please follow up a bit further? It is a great chance for us to improve the simplification and i64 flow.
@tqchen Yeah, sure. Perhaps I can start by figuring out why the i32 cast is inserted.
@tqchen The cast originates from https://github.com/apache/tvm/blob/main/src/te/schedule/operation_inline.cc#L67
after inline,
Perhaps I can do something with the SumExpr in CanonicalSimplifier to simplify this case.
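As a toy illustration of the idea in the comment above (this is not TVM's CanonicalSimplifier, and the tuple-based expression encoding and the `to_sum` helper are hypothetical names made up for this sketch), the following shows why an inserted cast can block a sum-based simplification: a simplifier that flattens an expression into "atom -> coefficient" form cancels `x` in `(x + y) - x`, but if it treats a cast as an opaque atom, the `x` inside the cast no longer cancels with the `x` outside it.

```python
def to_sum(expr):
    """Flatten an expression into a {atom: coefficient} dict.

    Hypothetical encoding for this sketch only:
      "x"                      -> a variable
      ("add", a, b)            -> a + b
      ("sub", a, b)            -> a - b
      ("cast", dtype, a)       -> cast of a to dtype
    """
    if isinstance(expr, str):
        return {expr: 1}
    op = expr[0]
    if op in ("add", "sub"):
        lhs = to_sum(expr[1])
        rhs = to_sum(expr[2])
        sign = 1 if op == "add" else -1
        out = dict(lhs)
        for atom, coeff in rhs.items():
            out[atom] = out.get(atom, 0) + sign * coeff
        # Drop terms whose coefficients cancelled to zero.
        return {atom: coeff for atom, coeff in out.items() if coeff != 0}
    if op == "cast":
        # Assumption of this sketch: the simplifier does not look through
        # casts, so the whole cast expression becomes one opaque atom.
        return {expr: 1}
    raise ValueError(f"unknown op: {op!r}")


# (x + y) - x: the two x terms cancel.
plain = ("sub", ("add", "x", "y"), "x")

# int64(int32(x + y)) - x: the inner x is hidden behind the casts.
with_cast = ("sub", ("cast", "int64", ("cast", "int32", ("add", "x", "y"))), "x")

print(to_sum(plain))
print(len(to_sum(with_cast)))
```

In this sketch the first expression reduces to a single term while the second keeps two uncancelled atoms, which is the kind of case a cast-aware rule in the sum simplification would need to handle.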
Thanks @hzfan for looking into this. A fix would be great.
I've started noticing a large performance regression affecting Keras MobileNetV2 caused by INDEX_DEFAULT_I64=ON (PR #6143). This is on an AWS m5.12xlarge instance.

Profile with INDEX_DEFAULT_I64=OFF (fast):

Profile with INDEX_DEFAULT_I64=ON (slow):

The slowdown comes from these ops:
Here is a script to reproduce:
Thanks!