[Topi][UnitTests] Parametrized tests in test_topi_dense.py, split out gpu-independent implementations #8336

Conversation
Force-pushed 0dde14d to 98a145b (Compare)
    return (a_np, b_np, c_np, d_np)

...

def test_dense(
I'm wondering whether this can test different targets as you expected.
Shouldn't this function be decorated with "@tvm.testing.parametrize_targets"?
As of #8010, the @tvm.testing.parametrize_targets decorator is only needed if the test should override the default targets. Otherwise, having the target and/or dev fixture arguments is sufficient to have the test run on all enabled targets.
Oh, my bad. That's a really helpful feature.
Now, tests run for multiple data types and can be extended with additional data types.
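As a sketch of what running "for multiple data types" means here (illustrative NumPy code, not the actual test file), a reference dense computation can be checked at each dtype in a parametrized list:

```python
import numpy as np

# Illustrative NumPy reference for nn.dense (C = A @ B^T); this is not the
# TVM test code, and the dtype tuple is just an example parametrization axis.
def dense_ref(a_np, b_np):
    return a_np @ b_np.T

for dtype in ("float32", "float64"):
    a_np = np.ones((4, 8), dtype=dtype)
    b_np = np.ones((16, 8), dtype=dtype)
    c_np = dense_ref(a_np, b_np)
    assert c_np.shape == (4, 16)
    assert c_np.dtype == np.dtype(dtype)
    assert float(c_np[0, 0]) == 8.0  # each entry sums 8 ones
```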
Force-pushed 98a145b to 4786a21 (Compare)
Added a few fixes in the codegen needed for the Vulkan tests to run correctly on newer NVIDIA GPUs. (@masahi)
…ense

As a follow-up to the renaming of "gpu" to "cuda", separating implementations that require CUDA (e.g. dense_cublas.cuda) from implementations that require any GPU, but not necessarily a CUDA GPU (e.g. dense_small_batch.gpu). My intent is to pair this migration with the extension of unit tests to cover additional GPU runtimes, migrating only implementations that run correctly on non-CUDA GPU devices.
…lts on some GPUs

- In ThreadAllreduceBuilder, separate out load/store so that they can have a memory barrier in-between.
- In Vulkan codegen, added Workgroup memory sync for subgroup thread sync, since the different subgroup threads can still access workgroup memory.

Longer-term, may need tir enhancements to separate out sync of control/memory.
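The load/store separation above can be illustrated with an analogy in plain Python threading (this is a hedged sketch, not TIR or Vulkan code): in a tree reduction over shared memory, every thread must finish reading before any thread overwrites the values it read, which is exactly what a barrier between the load phase and the store phase guarantees.

```python
import threading

N = 8
shared = list(range(N))            # stand-in for GPU workgroup memory
barrier = threading.Barrier(N)     # stand-in for the storage sync

def reduce_worker(tid):
    stride = N // 2
    while stride > 0:
        if tid < stride:
            val = shared[tid] + shared[tid + stride]   # load phase
        # Barrier between load and store: no thread may overwrite shared
        # memory until every thread has finished its reads.
        barrier.wait()
        if tid < stride:
            shared[tid] = val                          # store phase
        # All stores must be visible before the next round's loads.
        barrier.wait()
        stride //= 2

threads = [threading.Thread(target=reduce_worker, args=(t,)) for t in range(N)]
for t in threads:
    t.start()
for t in threads:
    t.join()

assert shared[0] == sum(range(N))  # tree reduction matches the serial sum
```

Without the first barrier, a fast thread could store into a slot another thread has not yet loaded, which is the kind of incorrect matmul result the commit describes.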
Force-pushed 4786a21 to adb2431 (Compare)
Thanks! Looks good to me.
I have PR #8234, which also has some modifications to the topi & cuda strategy, so it will conflict with this.
I can rebase it after this has been merged.
…ut gpu-independent implementations (apache#8336)

* [Topi][UnitTests] Parametrized tests in test_topi_dense.py

  Now, tests run for multiple data types, can be extended with additional datatypes.

* [Topi] Separated generic-gpu nn.dense implementations into topi.gpu.dense

  As a follow-up to the renaming of "gpu" to "cuda", separating implementations that require CUDA (e.g. dense_cublas.cuda) from implementations that require any GPU, but not necessarily a CUDA GPU (e.g. dense_small_batch.gpu). My intent is to pair this migration with the extension of unit tests to cover additional GPU runtimes, migrating only implementations that run correctly on non-CUDA GPU devices.

* [Vulkan][Codegen] Updated storage sync to avoid incorrect matmul results on some GPUs

  - In ThreadAllreduceBuilder, separate out load/store so that they can have a memory barrier in-between.
  - In Vulkan codegen, added Workgroup memory sync for subgroup thread sync, since the different subgroup threads can still access workgroup memory.

  Longer-term, may need tir enhancements to separate out sync of control/memory.

Co-authored-by: Eric Lunderberg <[email protected]>