[ONNX] Add MatMulInteger16 contrib op #9186
Conversation
Thanks for helping with the review @cconvey! CC-ing @mbrookhart @anwang2009 @AndrewZhaoLuo
@abhikran-quic do you still plan on working on this?
Hi @AndrewZhaoLuo: I've been out of office for the last two weeks, and hence I haven't been able to reply here. Sorry about this!
No problem, have a good vacation!
@abhikran-quic friendly ping to see if updating this PR is currently blocked, thanks!
Thank you @AndrewZhaoLuo for your help on this. I've incorporated the changes you mentioned in the latest patch. However, there is one more pending issue. After removing test_matmulinteger from unsupported_onnx_tests, I saw an error from onnx.py (here is the link to the CI run where the failure is seen). IMHO the error is expected when test_matmulinteger is removed, because onnx.py has no converter for an op named MatMulInteger. Onnxruntime supports MatMulInteger16 and MatMulIntegerToFloat, so we can add ops under the names given in the onnxruntime documentation; removing test_matmulinteger therefore leads to the error. In my latest patch I've retained test_matmulinteger in unsupported_onnx_tests. Please share your thoughts on this.
Ah yes, MatMulInteger is an ONNX op, but it is separate from MatMulInteger16.
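For context, here is a minimal sketch of how a MatMulInteger16 converter might be wired into TVM's ONNX frontend. The names `OnnxOpConverter`, `infer_type`, and `matmul_out_dtype` are assumed from python/tvm/relay/frontend/onnx.py; the body is illustrative, not the merged code.

```python
# Illustrative sketch, not the merged code: a contrib-op converter in the
# style of python/tvm/relay/frontend/onnx.py. `OnnxOpConverter`,
# `infer_type`, and `matmul_out_dtype` are assumed from that module.
class MatMulInteger16(OnnxOpConverter):
    """Converter for the com.microsoft.MatMulInteger16 contrib op."""

    @classmethod
    def _impl_v10(cls, inputs, attr, params):
        a_dtype = infer_type(inputs[0]).checked_type.dtype
        b_dtype = infer_type(inputs[1]).checked_type.dtype
        # int16/uint16 inputs accumulate into a 32-bit output;
        # only two unsigned inputs yield an unsigned result.
        out_dtype = "int32"
        if a_dtype == "uint16" and b_dtype == "uint16":
            out_dtype = "uint32"
        return matmul_out_dtype(inputs, out_dtype)
```

This is also why removing test_matmulinteger fails: the frontend's convert map would need a separate entry for MatMulInteger itself.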
LGTM
Thank you @AndrewZhaoLuo for your review. I would request the reviewers to please review this PR.
Hello reviewers, could you please review this PR for any further changes needed?
LGTM
Thank you @abhikran-quic, @AndrewZhaoLuo, @cconvey. The PR has been merged.
* [ONNX] Add MatMulInteger16 contrib op
* Fix formatting errors
* Remove a code comment and do not set default value of nd
* Move flatten_to_nd function outside matmul to be used across multiple functions
* Add function docstring and describe the tests
* Use max/min value of int16 as high/low while generating input vectors (see the sketch after this list)
* Converge MatMul and MatMulInteger16 ops into a single op using output dtype
* Fix indentation issues
* Formatting changes
* Fix CUDA batchmatmul strategy to allow mixed precision
* Add test_matmulinteger to unsupported_onnx_tests
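As a rough illustration of the input-generation bullet above (a sketch, not the PR's actual test code, with hypothetical shapes), the full int16 range serves as the low/high bounds:

```python
import numpy as np

# Hypothetical shapes; the PR's tests define their own cases.
i16 = np.iinfo(np.int16)
# Note: np.random.randint's `high` bound is exclusive.
a = np.random.randint(i16.min, i16.max, size=(4, 8), dtype=np.int16)
b = np.random.randint(i16.min, i16.max, size=(8, 3), dtype=np.int16)
# The reference result accumulates in int32 so int16 products do not overflow.
ref = a.astype(np.int32) @ b.astype(np.int32)
```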
@abhikran-quic
Hi @Icemist, this change was done to optimize the code path, and since all the automated tests passed, we went ahead and committed the change.
This PR adds support for the contrib op com.microsoft.MatMulInteger16.
It converges the MatMul and MatMulInteger16 ops to use a single function parameterized by output dtype.
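A minimal sketch of the convergence idea, assuming the helper names from the commit list (`flatten_to_nd`, `matmul_out_dtype`); the real converter also handles broadcasting and higher-rank inputs, which are omitted here.

```python
from tvm import relay

def matmul_out_dtype(inputs, out_dtype):
    """One lowering shared by MatMul (out_dtype="float32") and
    MatMulInteger16 (out_dtype="int32"/"uint32"): the matmul itself is
    identical, only the accumulation dtype differs."""
    a, b = inputs
    # 2-D case only; higher-rank inputs would first go through
    # flatten_to_nd and have their batch dims restored afterwards.
    return relay.nn.matmul(a, b, out_dtype=out_dtype)
```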