[OpenCL] Implement save/load pre-compiled programs #13868
Thanks for contributing to TVM! Please refer to the contributing guidelines https://tvm.apache.org/docs/contribute/ for useful information and tips. Please request code reviews from Reviewers by @-ing them in a comment. Generated by tvm-bot |
cc: @tqchen, @csullivan, @srkreddy1238 |
@tvm-bot rerun |
Using pre-compiled programs might significantly improve inference time of the first run.
- Added method `SupportPreCompiledPrograms`, which reports whether the module supports using pre-compiled programs.
- Method `GetPreCompiledPrograms` returns a string with the bytes of the pre-compiled programs.
- Method `SetPreCompiledPrograms` allows the user to pass pre-compiled programs to the module.
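A minimal sketch of how an application might call these three methods across two runs. `StubModule` and `warm_start` are hypothetical names written for illustration, not TVM's OpenCL module or API; only the three method names come from this PR, and the signatures shown are assumptions.

```python
class StubModule:
    """Hypothetical in-memory stand-in for a runtime module (not TVM's class)."""

    def __init__(self):
        self._programs = b""
        self.compile_count = 0  # counts simulated first-run compilations

    def SupportPreCompiledPrograms(self) -> bool:
        return True

    def GetPreCompiledPrograms(self) -> bytes:
        if not self._programs:
            # Simulate the expensive compilation that happens on a cold start.
            self.compile_count += 1
            self._programs = b"compiled-kernels"
        return self._programs

    def SetPreCompiledPrograms(self, blob: bytes) -> None:
        self._programs = blob


def warm_start(module, saved_blob):
    """Pass a saved blob to the module if there is one; return the blob to persist."""
    if not module.SupportPreCompiledPrograms():
        return None
    if saved_blob:
        module.SetPreCompiledPrograms(saved_blob)
    return module.GetPreCompiledPrograms()


first = StubModule()
blob = warm_start(first, None)   # first run: nothing saved yet, so it compiles

second = StubModule()
warm_start(second, blob)         # later run: reuses the saved bytes
```

In this sketch the application, not the module, decides where the returned bytes are kept between runs.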
@echuraev thanks for this feature. I had a similar problem with CLML tuning cache management at runtime: the cache needs to be generated once and reused later. In CLML, each TVM module has multiple subgraphs, and for now the tuning cache is indexed by subgraph symbol and serialized into a single file. I think we have a generalized problem here: the TVM runtime needs cache management. We could probably come up with a unified approach where
The flow would be like: 1. The runtime (OpenCL, CLML, etc.) forms a key by concatenating [tvm_module hash + runtime + purpose, etc.]. This would reduce the end user's effort to just specifying a cache folder. On the implementation side we have
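The key formation described above could be sketched as follows. The function name `make_cache_key`, the separator, and the final SHA-256 step are assumptions added for illustration; the comment itself only says the parts are concatenated.

```python
import hashlib


def make_cache_key(module_hash: str, runtime: str, purpose: str) -> str:
    """Build a cache key from the module hash, runtime name, and purpose.

    Hypothetical helper: the parts are joined with a NUL separator so that
    ("ab", "c") and ("a", "bc") cannot collide, then hashed to get a
    fixed-length string that is safe to use as a file name.
    """
    joined = "\x00".join([module_hash, runtime, purpose])
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


opencl_key = make_cache_key("deadbeef", "opencl", "programs")
clml_key = make_cache_key("deadbeef", "clml", "tuning")
```

With a key like this, one cache folder could hold entries for several runtimes and purposes without name clashes.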
I support the idea of caching data, but I am against an API that works with the file system. We can provide serialization interfaces that users implement and feed to the TVM API, or we can provide load/store functions that operate on binaries in memory, but the API must not depend on the file system.
You mean TVM will prepare an in-memory serialized blob for all the cache, the application is responsible for retrieving and storing it on its own, and later the user passes the binary blob back to TVM (similar to the binary input for load_params)?
@srkreddy1238 Thank you for your review! The idea of caching data looks promising, but I agree with @elvin-n that TVM should only provide an API for serializing objects to a binary format and should not work with the file system. The user application should be responsible for writing this serialized blob to disk and reading it back. Do you think the caching mechanism should be implemented in this PR? I would prefer to see it in a separate PR.
User-managed caching will pose additional challenges. Either the application retrieves and saves the blob every time, because we do not know whether the cache blob changed during the current module load, or the application has to query whether the current run changed it. I feel all of this complicates the app interface. @echuraev the cache implementation incurs more changes outside the CL runtime and can be handled outside the scope of this PR once there is agreement on the design.
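One way an application could avoid rewriting an unchanged blob on every run is to compare digests, sketched below. This is an illustration of the concern raised here, not something this PR implements; `blob_changed` is a hypothetical helper.

```python
import hashlib


def blob_changed(previous_digest, blob: bytes):
    """Return (changed, digest) for a cache blob.

    `previous_digest` is the SHA-256 hex digest recorded on the last save,
    or None if nothing has been saved yet.
    """
    digest = hashlib.sha256(blob).hexdigest()
    return digest != previous_digest, digest


# First run: nothing saved yet, so the blob counts as changed.
changed_first, saved_digest = blob_changed(None, b"programs-v1")

# Next run with identical bytes: no need to write the blob again.
changed_same, _ = blob_changed(saved_digest, b"programs-v1")

# A run that produced different bytes: the application should save again.
changed_new, _ = blob_changed(saved_digest, b"programs-v2")
```

The application still has to fetch the blob after each run to compute the digest, so this only saves the write, which is part of the interface cost the comment points out.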
A suggestion, otherwise good to go.
LGTM
@tqchen could you please review this PR once again? |