[OpenCL] Implement save/load pre-compiled programs #13868

echuraev · 2023-01-30T09:21:44Z

Using pre-compiled programs may significantly reduce the inference time of the first run.

Added the method `SupportPreCompiledPrograms`, which reports whether the module supports pre-compiled programs.
The method `GetPreCompiledPrograms` returns a string containing the bytes of the pre-compiled programs.
The method `SetPreCompiledPrograms` lets the user pass pre-compiled programs to the module.
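
To illustrate how the three methods are meant to fit together, here is a minimal Python sketch. It is not the TVM API: `ToyModule`, its snake_case method names, the fake "compile" step, and the length-prefixed byte layout are all assumptions made up for this example. It only shows the intended effect, namely that a second module fed the saved bytes does not need to compile anything on its first run.

```python
import struct


class ToyModule:
    """Hypothetical stand-in for a device module; NOT the real TVM class."""

    def __init__(self, kernel_sources):
        self.kernel_sources = dict(kernel_sources)  # kernel name -> source text
        self.binaries = {}                          # kernel name -> "compiled" bytes
        self.compile_count = 0                      # how often we had to compile

    def support_pre_compiled_programs(self):
        # A real backend would report whether it can export/import binaries.
        return True

    def _compile(self, name):
        # Placeholder for the expensive on-device compilation step.
        self.compile_count += 1
        return b"BIN:" + self.kernel_sources[name].encode()

    def run(self, name):
        # Compile lazily, only when no binary is available yet.
        if name not in self.binaries:
            self.binaries[name] = self._compile(name)
        return self.binaries[name]

    def get_pre_compiled_programs(self):
        # Pack every (name, binary) pair as: key length, blob length, key, blob.
        out = bytearray()
        for name in sorted(self.binaries):
            key = name.encode()
            blob = self.binaries[name]
            out += struct.pack("<II", len(key), len(blob)) + key + blob
        return bytes(out)

    def set_pre_compiled_programs(self, data):
        # Unpack the layout written by get_pre_compiled_programs.
        offset = 0
        while offset < len(data):
            key_len, blob_len = struct.unpack_from("<II", data, offset)
            offset += 8
            name = data[offset:offset + key_len].decode()
            offset += key_len
            self.binaries[name] = data[offset:offset + blob_len]
            offset += blob_len


sources = {"add": "kernel add", "mul": "kernel mul"}

# First run: everything is compiled, then exported as bytes.
first = ToyModule(sources)
first.run("add")
first.run("mul")
saved_bytes = first.get_pre_compiled_programs()

# Later run: the saved bytes are imported before the first inference.
second = ToyModule(sources)
if second.support_pre_compiled_programs():
    second.set_pre_compiled_programs(saved_bytes)
second.run("add")
second.run("mul")
```

In this sketch `first.compile_count` ends at 2 while `second.compile_count` stays at 0, which is the first-run saving the description refers to.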

tvm-bot · 2023-01-30T09:21:48Z

Thanks for contributing to TVM! Please refer to the contributing guidelines https://tvm.apache.org/docs/contribute/ for useful information and tips. Please request code reviews from Reviewers by @-ing them in a comment.

cc @elvin-n (See #10317 for details)

(Generated by tvm-bot)

echuraev · 2023-01-30T09:22:44Z

cc: @tqchen, @csullivan, @srkreddy1238

include/tvm/runtime/module.h

echuraev · 2023-01-30T15:10:01Z

@tvm-bot rerun


srkreddy1238 · 2023-02-01T06:33:12Z

@echuraev thanks for this feature.

I had a similar problem with CLML tuning cache management at runtime: the cache needs to be generated once and reused later. In CLML, there are multiple subgraphs related to each tvm module. For now, the tuning cache is indexed by subgraph symbol, and all entries are stored in one file by serializing them through DMLC::Strm. This works by maintaining one file per tvm module. The approach adds overhead for the user, who has to maintain and specify the tuning cache for each tvm module.

I think we have a generalized problem statement here: the tvm runtime needs cache management.

We could probably come up with a unified approach where

There will be a tvm_runtime cache specified by an environment variable or the graph_runtime API.
Each tvm compiled module will have a unique hash key that is generated at compile time and accessible from the tvm module interface.
A new file utility interface will load/store the binary blobs generated by any runtime as key/value pairs.

The flow would be:

1: The runtime (OpenCL, CLML, etc.) forms a key by concatenating [tvm_module hash + runtime + purpose, etc.].
2: Try to fetch the entry from the runtime cache.
3: If it does not exist, regenerate it and save it into the cache.
4: Use the stored cache entry.
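
The four steps can be sketched roughly as follows. This is only an illustration of the proposed flow, not existing TVM code: `make_key`, `fetch_or_generate`, the `|` separator, the use of SHA-256, and the plain dict standing in for the runtime cache are all assumptions.

```python
import hashlib


def make_key(module_hash, runtime, purpose):
    # Step 1: combine the module hash, the runtime name, and the purpose
    # into one stable key (hashed here so the key has a fixed length).
    joined = "|".join([module_hash, runtime, purpose])
    return hashlib.sha256(joined.encode()).hexdigest()


def fetch_or_generate(store, key, generate):
    # Step 2: try to fetch the entry from the cache.
    blob = store.get(key)
    if blob is None:
        # Step 3: on a miss, regenerate the blob and save it.
        blob = generate()
        store[key] = blob
    # Step 4: use the cached blob.
    return blob


calls = []


def generate():
    # Placeholder for an expensive step such as kernel compilation or tuning.
    calls.append(1)
    return b"tuned"


store = {}  # hypothetical stand-in for the runtime cache
key = make_key("abc123", "opencl", "programs")
first = fetch_or_generate(store, key, generate)   # miss: generates and stores
second = fetch_or_generate(store, key, generate)  # hit: reuses the stored blob
```

Because the runtime name and purpose are part of the key, OpenCL programs and CLML tuning data for the same module would land in separate entries.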

This reduces the end user's effort to just specifying a cache folder.

On the implementation side we need:

The tvm module passes a unique key from compilation to runtime, probably via graph_json.
A cache API to serialize and load/store the key/value paired binary blobs.
Runtime-specific changes to use this cache interface.

elvin-n · 2023-02-01T08:36:14Z

I support the idea of caching data, but I am against an API that works with the file system. We can provide serialization interfaces that users implement and feed to the TVM API, or we can provide load/store functions that operate on binary data in memory, but the API must not depend on the file system.

srkreddy1238 · 2023-02-01T09:50:32Z

> I support the idea of caching data, but I am against an API that works with the file system. We can provide serialization interfaces that users implement and feed to the TVM API, or we can provide load/store functions that operate on binary data in memory, but the API must not depend on the file system.

You mean TVM will prepare an in-memory serialized blob for the whole cache, the application is responsible for retrieving it and storing it on its own, and later the user passes the binary blob back to TVM (similar to the binary input for load_params)?

echuraev · 2023-02-01T10:11:55Z

@srkreddy1238 Thank you for your review! The idea of caching data looks promising, but I agree with @elvin-n that TVM should provide only an API for serializing objects to a binary format and shouldn't work with the file system. The user application should be responsible for writing this serialized blob to disk and reading it back.

Do you think that the caching mechanism should be implemented in this PR? I would prefer to see such an implementation in a separate PR.
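
A minimal sketch of the split of responsibilities discussed above: the library side only produces and consumes bytes, and the application decides where they live. Everything here is hypothetical (`ToyRuntime`, `export_cache`, `import_cache`, `load_blob`, `save_blob`, and the file name); none of it is taken from TVM.

```python
import os
import tempfile


class ToyRuntime:
    """Hypothetical library side: it never touches the file system."""

    def __init__(self):
        self._blob = b""

    def export_cache(self):
        # Hand the serialized cache to the caller as plain bytes.
        return self._blob

    def import_cache(self, data):
        # Accept bytes previously returned by export_cache.
        self._blob = bytes(data)

    def warm_up(self):
        # Placeholder for the expensive first run that fills the cache.
        if not self._blob:
            self._blob = b"compiled-kernels"


def load_blob(path):
    # Application side: return the saved bytes, or None if nothing was saved.
    try:
        with open(path, "rb") as f:
            return f.read()
    except FileNotFoundError:
        return None


def save_blob(path, data):
    # Application side: write to a temporary file, then rename it, so an
    # interrupted write does not leave a truncated cache file behind.
    tmp_path = path + ".tmp"
    with open(tmp_path, "wb") as f:
        f.write(data)
    os.replace(tmp_path, path)


cache_path = os.path.join(tempfile.mkdtemp(), "kernels.bin")

runtime = ToyRuntime()
saved = load_blob(cache_path)
if saved is not None:
    runtime.import_cache(saved)  # later runs: reuse the stored bytes
else:
    runtime.warm_up()            # first run: build the cache
    save_blob(cache_path, runtime.export_cache())
```

With this shape the storage location, file naming, and invalidation policy stay entirely in application code.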

srkreddy1238 · 2023-02-01T14:12:11Z

Leaving cache management to the user will impose an additional challenge.

The application would have to retrieve and save the blob on every run, because it cannot tell whether the current module load changed the cache blob. Alternatively, the application would have to query whether the current run changed anything. I feel all of this complicates the app interface.
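
The "query about a change" alternative could look roughly like the sketch below, where the cache tracks whether the current run added anything so the application saves only when needed. `TrackedCache` and its methods are hypothetical names for illustration and do not correspond to any TVM interface.

```python
class TrackedCache:
    """Hypothetical in-memory cache that remembers whether it was modified."""

    def __init__(self, initial=None):
        self._entries = dict(initial or {})
        self._dirty = False

    def get_or_add(self, key, generate):
        # Only a miss modifies the cache and marks it as changed.
        if key not in self._entries:
            self._entries[key] = generate()
            self._dirty = True
        return self._entries[key]

    def changed(self):
        # The application asks this after a run to decide whether to save.
        return self._dirty

    def snapshot(self):
        # Return a copy for the application to persist and reset the flag.
        self._dirty = False
        return dict(self._entries)


cache = TrackedCache({"a": b"1"})     # entries loaded from a previous run
cache.get_or_add("a", lambda: b"x")   # hit: nothing changes
unchanged_after_hit = not cache.changed()
cache.get_or_add("b", lambda: b"2")   # miss: a new entry is added
changed_after_miss = cache.changed()
saved = cache.snapshot() if changed_after_miss else None
```

This keeps disk writes to runs that actually produced new entries, at the cost of one more call in the application interface, which is the complication raised above.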

@echuraev the cache implementation incurs more changes outside the CL runtime and can be handled outside the scope of this PR once there is agreement on the design.

apps/cpp_rtvm/tvm_runner.cc

srkreddy1238

A suggestion, otherwise good to go.

srkreddy1238

LGTM

echuraev · 2023-02-02T11:52:51Z

@tqchen could you please review this PR once again?

echuraev force-pushed the echuraev/save_compiled_kernels_to_bin branch from 511f3db to 308c8c1 on January 30, 2023 10:58

echuraev mentioned this pull request Jan 30, 2023

Enable C++17 for cmake modules #13869

Merged

tqchen requested changes Jan 30, 2023

include/tvm/runtime/module.h (outdated)

echuraev added 2 commits January 31, 2023 10:59

Fix lint

bea1c17

Apply comment: PackedFunc is used

a35988f

echuraev force-pushed the echuraev/save_compiled_kernels_to_bin branch from 308c8c1 to a35988f on February 1, 2023 09:56

echuraev force-pushed the echuraev/save_compiled_kernels_to_bin branch from e7a6ab7 to 0040e3f on February 1, 2023 10:17

Fix build

611a320

echuraev force-pushed the echuraev/save_compiled_kernels_to_bin branch from 0040e3f to 611a320 on February 1, 2023 10:32

Fix CI and rename functions

499e358

echuraev force-pushed the echuraev/save_compiled_kernels_to_bin branch from e5e4d84 to 499e358 on February 1, 2023 13:26

srkreddy1238 reviewed Feb 2, 2023

apps/cpp_rtvm/tvm_runner.cc (outdated)

srkreddy1238 requested changes Feb 2, 2023

Apply comments

d41fb12

echuraev force-pushed the echuraev/save_compiled_kernels_to_bin branch from e19925c to d41fb12 on February 2, 2023 07:59

srkreddy1238 approved these changes Feb 2, 2023

tqchen approved these changes Feb 2, 2023

srkreddy1238 merged commit 099ed94 into apache:main Feb 3, 2023

echuraev deleted the echuraev/save_compiled_kernels_to_bin branch February 3, 2023 04:54

ysh329 mentioned this pull request Apr 17, 2023

[Release] v0.12.0 Release Candidate Notes #14645

Closed
