[CUBLAS] Add cuBLAS as a Relay partitioning target (BYOC) #10820

mbaret · 2022-03-29T19:54:38Z

This PR adds a partitioning pass for cuBLAS so that supported Relay patterns can be offloaded to cuBLAS. This initial commit only adds offloading support for nn.matmul.

Although cuBLAS is already enabled in TVM by using strategy selection in TE, by exposing it explicitly as a Relay partitioning target we can more precisely describe how to execute a model in Relay. This is desirable particularly in the Collage effort to improve multi-backend graph partitioning.
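To make the idea of a partitioning target concrete, here is a minimal pure-Python sketch of what partitioning decides. It is a toy model, not TVM's API and not the code in this PR: `Region`, `partition` and `CUBLAS_SUPPORTED` are hypothetical names, and the model is treated as a flat list of op names, whereas Relay works on a dataflow graph and, in the usual BYOC flow, matches patterns with passes such as MergeComposite, AnnotateTarget and PartitionGraph.

```python
from dataclasses import dataclass
from typing import List, Set

# Assumption for illustration: the set of ops the external backend accepts.
# The PR description says the initial commit only offloads nn.matmul.
CUBLAS_SUPPORTED: Set[str] = {"nn.matmul"}


@dataclass
class Region:
    """A run of consecutive ops assigned to one target (hypothetical type)."""
    target: str      # "cublas" or "default"
    ops: List[str]


def partition(ops: List[str], supported: Set[str] = CUBLAS_SUPPORTED) -> List[Region]:
    """Group a linear op sequence into maximal runs that share a target."""
    regions: List[Region] = []
    for op in ops:
        target = "cublas" if op in supported else "default"
        if regions and regions[-1].target == target:
            # Same target as the previous op: extend the current region.
            regions[-1].ops.append(op)
        else:
            # Target changed: start a new region.
            regions.append(Region(target, [op]))
    return regions


model = ["nn.relu", "nn.matmul", "nn.matmul", "add", "nn.matmul"]
regions = partition(model)
for region in regions:
    print(region.target, region.ops)
```

The point of the sketch is that the offload decision becomes an explicit, inspectable property of the partitioned program rather than something chosen later during strategy selection, which is what a search over backends needs.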

mbaret · 2022-03-29T19:55:59Z

cc @mbs-octoml @mikepapadim

mbs-octoml

Quite nice I think, thanks. It will be fun to see this used for complex patterns.

When we add the second+ examples I think we can consider:

* removing per-op boilerplate, since placeholder construction, create_schedule and build will always be the same
* avoiding the tvm.build entry point, since I believe the support for schedules & tensors is now considered anachronistic. Perhaps a tvm.build_te(op, placeholders, target, name).

mbaret · 2022-03-30T13:37:44Z

Thanks @mbs-octoml, I've refactored both the 'lower funcs' and the tests to extract all the boilerplate into reusable code. I've also switched to calling create_prim_func first and then calling tvm.build on the resulting TIR.

mikepapadim · 2022-03-30T13:55:52Z

thanks @mbaret LGTM

mbs-octoml

LGTM

mbrookhart · 2022-03-30T18:24:35Z

Thanks @mbaret @mbs-octoml @mikepapadim

* [CUBLAS] Add cuBLAS as a Relay partitioning target (BYOC)
* Refactor to remove boilerplate

mbs-octoml reviewed Mar 29, 2022

Refactor to remove boilerplate

eb4c71f

mbaret mentioned this pull request Mar 30, 2022

[CUBLAS] Add support for nn.dense and nn.batch_matmul #10826

Merged

mbs-octoml approved these changes Mar 30, 2022

mbrookhart approved these changes Mar 30, 2022

mbrookhart merged commit b2a0e1d into apache:main Mar 30, 2022

driazati mentioned this pull request Jul 14, 2022

TVM v0.9.0.rc0 Release Candidate Notes #12102

Closed
