Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[OpenCL] Implement save/load pre-compiled programs #13868

Merged

Conversation

echuraev
Copy link
Contributor

Using pre-compiled programs might significantly improve inference time of the first run.

  • Added methods SupportPreCompiledPrograms which reports if the module supports using pre-compiled programs.
  • Method GetPreCompiledPrograms returns string with bytes of pre-compiled programs.
  • Method SetPreCompiledPrograms allows user to pass pre-compiled programs to the module.

@tvm-bot
Copy link
Collaborator

tvm-bot commented Jan 30, 2023

Thanks for contributing to TVM! Please refer to the contributing guidelines https://tvm.apache.org/docs/contribute/ for useful information and tips. Please request code reviews from Reviewers by @-ing them in a comment.

Generated by tvm-bot

@echuraev
Copy link
Contributor Author

cc: @tqchen, @csullivan, @srkreddy1238

@echuraev echuraev force-pushed the echuraev/save_compiled_kernels_to_bin branch from 511f3db to 308c8c1 Compare January 30, 2023 10:58
include/tvm/runtime/module.h Outdated Show resolved Hide resolved
@echuraev
Copy link
Contributor Author

@tvm-bot rerun

Using pre-compiled programs might significantly improve inference time
of the first run.

- Added methods `SupportPreCompiledPrograms` which reports if the module
  supports using pre-compiled programs.
- Method `GetPreCompiledPrograms` returns string with bytes of
  pre-compiled programs.
- Method `SetPreCompiledPrograms` allows user to pass pre-compiled
  programs to the module.
@srkreddy1238
Copy link
Contributor

@echuraev thanks for this feature.

I too had a similar problem statement with CLML tuning cache management at runtime which need to be generated once and reused later. In CLML, there existed multiple sub graphs related to each tvm module and for now the tuning cache is indexed by subgraph symbol and are stored under one file by serializing them through DMLC::Strm. This works by maintaining one file per tvm module. This approach adds additional overhead for the user to maintain and specify the tuning cache for each tvm module.

I think we have a generalized problem statement here where there is a need of cache management for the tvm runtime.

Probably we could come up with an unified approach where

  • There will be a tvm_runtime cache specified by environment variale or graph_runtime api interface.
  • Each tvm compiled module will have a unique hash key generated at compile time and is accessible from tvm module interface.
  • A new file utility interface to load/store the binary blobs generated by any runtime as a key & value pair.

The flow would be like

1: Runtime (OpenCL/CLML ..etc.) will form a key (concatenating [ tvm_module hash + runtime + purpose ...etc.] )
2: Try to fetch from runtime cache.
3: If not exist regenerate and save into cache.
4: Use the stored cache

This will simplify and minimize the end user hassle to a level of just specify a cache folder and relax.

In the implementation side we have

  • tvm Module passing a unique key from compilation to runtime probably via grpah_json.
  • Cache API to serialize and load/store the key, value paired binary blobs.
  • Runtime specific changes to use this cache interface.

@elvin-n
Copy link
Contributor

elvin-n commented Feb 1, 2023

Supporting the idea of caching data, I am against of API introducing work with file system. We can provide serialize interfaces that users can implement and feed to TVM API or we can provide load/store functions operating with binary in memory, but it must not be file system dependent API.

@srkreddy1238
Copy link
Contributor

Supporting the idea of caching data, I am against of API introducing work with file system. We can provide serialize interfaces that users can implement and feed to TVM API or we can provide load/store functions operating with binary in memory, but it must not be file system dependent API.

You mean, TVM will prepare in memory serialized blob for all the cache, application is responsible for retrieving followed by storing on their own and later user will input the the binary blob back to TVM (similar to binary input for load_params) ?

@echuraev echuraev force-pushed the echuraev/save_compiled_kernels_to_bin branch from 308c8c1 to a35988f Compare February 1, 2023 09:56
@echuraev
Copy link
Contributor Author

echuraev commented Feb 1, 2023

@srkreddy1238 Thank you for your review! The idea of caching data looks promising, but I agree with @elvin-n that the TVM should provide only API for serializing objects to binary format but shouldn't work with file system. So user application should be responsible for writing/reading this serialized blob to/from disk.

Do you think that caching mechanism should be implemented in this PR? I would prefer to see such implementation in separate PR.

@echuraev echuraev force-pushed the echuraev/save_compiled_kernels_to_bin branch from e7a6ab7 to 0040e3f Compare February 1, 2023 10:17
@echuraev echuraev force-pushed the echuraev/save_compiled_kernels_to_bin branch from 0040e3f to 611a320 Compare February 1, 2023 10:32
@echuraev echuraev force-pushed the echuraev/save_compiled_kernels_to_bin branch from e5e4d84 to 499e358 Compare February 1, 2023 13:26
@srkreddy1238
Copy link
Contributor

User management will impose additional challenge.

Every time the application should retrieve and save the same as we are unaware of change to the cache blob in the current module load. Or the application should query about a change due to the current run. I feel all these complicates the app interface.

@echuraev cache implementation incur more changes outside cl runtime and can be handled outside this PR scope once there is an agreement on the design.

Copy link
Contributor

@srkreddy1238 srkreddy1238 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A suggestion, otherwise good to go.

@echuraev echuraev force-pushed the echuraev/save_compiled_kernels_to_bin branch from e19925c to d41fb12 Compare February 2, 2023 07:59
Copy link
Contributor

@srkreddy1238 srkreddy1238 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@echuraev
Copy link
Contributor Author

echuraev commented Feb 2, 2023

@tqchen could you please review this PR once again?

@srkreddy1238 srkreddy1238 merged commit 099ed94 into apache:main Feb 3, 2023
@echuraev echuraev deleted the echuraev/save_compiled_kernels_to_bin branch February 3, 2023 04:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants