[OpenCLML] Refactor and introduce on-chip memory and memory planner #14922
Introduced thread context with CLMLWorkspace.
Organized the code into runtime, utils, and memory planners.
Introduced recording queue support and on-chip memory support.
The on-chip memory allocation planner accommodates multiple tensors at a time.
A DDR memory planner is introduced to reuse the underlying memory across multiple tensor descriptors.
Dense layer support is refactored to use GEMM.
CLML binary operators don't support broadcasting, so an explicit broadcast op is introduced as a workaround.
The CLML SDK codegen is enhanced accordingly.
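To illustrate the idea of reusing underlying memory across tensor descriptors, here is a minimal sketch of a lifetime-based first-fit planner. It is not the planner from this PR: `TensorLife`, `Buffer`, `Plan`, `PlanReuse`, and `TotalBytes` are hypothetical names, and the sketch assumes each tensor is described only by a byte size and an inclusive live range over graph node indices.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical description of one tensor: byte size plus the first and last
// graph node (inclusive) at which its contents must stay live.
struct TensorLife {
  std::size_t size;
  int first_use;
  int last_use;
};

// Hypothetical backing allocation shared by tensors with disjoint lifetimes.
struct Buffer {
  std::size_t size;  // capacity, grown to fit the largest tenant
  int busy_until;    // last_use of the most recent tenant
};

struct Plan {
  std::vector<int> buffer_of;   // buffer index assigned to each tensor
  std::vector<Buffer> buffers;  // backing allocations actually needed
};

// First-fit planning: a tensor may take over a buffer only when the buffer's
// previous tenant is dead before the tensor's first use.
Plan PlanReuse(const std::vector<TensorLife>& tensors) {
  // Visit tensors in order of first use so busy_until only moves forward.
  std::vector<int> order(tensors.size());
  for (std::size_t i = 0; i < order.size(); ++i) order[i] = static_cast<int>(i);
  std::stable_sort(order.begin(), order.end(), [&](int a, int b) {
    return tensors[a].first_use < tensors[b].first_use;
  });

  Plan plan;
  plan.buffer_of.assign(tensors.size(), -1);
  for (int t : order) {
    const TensorLife& life = tensors[t];
    assert(life.first_use <= life.last_use);
    int chosen = -1;
    for (std::size_t b = 0; b < plan.buffers.size(); ++b) {
      if (plan.buffers[b].busy_until < life.first_use) {
        chosen = static_cast<int>(b);
        break;
      }
    }
    if (chosen < 0) {  // no free buffer: open a new one
      plan.buffers.push_back({0, -1});
      chosen = static_cast<int>(plan.buffers.size()) - 1;
    }
    Buffer& buf = plan.buffers[chosen];
    buf.size = std::max(buf.size, life.size);
    buf.busy_until = life.last_use;
    plan.buffer_of[t] = chosen;
  }
  return plan;
}

// Total bytes of backing memory required by the plan.
std::size_t TotalBytes(const Plan& plan) {
  std::size_t total = 0;
  for (const Buffer& b : plan.buffers) total += b.size;
  return total;
}
```

For a chain of four tensors where each one is consumed by the next node, this sketch needs two backing buffers instead of four. A real planner would also have to account for alignment and the separate on-chip budget, which this sketch ignores.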
Is it necessary to add new tests for the memory planner?
We definitely need a few test cases. Let me find a way of exposing the plan so it can be verified externally.
Probably you can take a look at the OpenCL tests: https://github.com/apache/tvm/blob/main/tests/cpp-runtime/opencl/opencl_texture_pool_test.cc
@echuraev have you ever built gtests (opencl-cpptest bin) for Android?
Yes, I did it. You can build
using namespace tvm::runtime;
using namespace tvm::runtime::cl;

void InitMemoryPlan(tvm::runtime::contrib::CachedLayer& layer) {
I saw that class `CLMLRuntime` has almost the same methods. It would probably be better to test those. You can change `private` to `protected` and inherit from `CLMLRuntime` in your test class, so that it acts as a test wrapper around `CLMLRuntime`.
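The suggested wrapper pattern can be sketched as follows. This is only an illustration: `PlannerRuntime`, `PlannerRuntimeForTest`, `AlignUp`, and `PlanTotal` are hypothetical stand-ins, not members of the real `CLMLRuntime`, and the 64-byte alignment is an arbitrary assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a runtime class; NOT the real CLMLRuntime.
class PlannerRuntime {
 public:
  virtual ~PlannerRuntime() = default;

 protected:  // was private; protected lets a test subclass reach these helpers
  // Round bytes up to the next multiple of alignment.
  std::size_t AlignUp(std::size_t bytes, std::size_t alignment) const {
    assert(alignment > 0);
    return (bytes + alignment - 1) / alignment * alignment;
  }

  // Sum of all tensor sizes after aligning each one to 64 bytes (assumed value).
  std::size_t PlanTotal(const std::vector<std::size_t>& sizes) const {
    std::size_t total = 0;
    for (std::size_t s : sizes) total += AlignUp(s, 64);
    return total;
  }
};

// Test-only wrapper: using-declarations re-export the protected members as
// public, so a test exercises the production code rather than a copy of it.
class PlannerRuntimeForTest : public PlannerRuntime {
 public:
  using PlannerRuntime::AlignUp;
  using PlannerRuntime::PlanTotal;
};
```

A gtest case could then construct `PlannerRuntimeForTest` and call `PlanTotal` directly. The trade-off is that the production class has to loosen its access control for the sake of testing.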
Only `InitMemoryPlan` can be reused, because `PlanMemory` depends on `JSONRuntime` nodes. Initializing `CLMLRuntime` here requires the JSON graph and its dependencies. Hence I tried isolating the test environment within a `CachedLayer` object. Let me see how much I can reuse from `CLMLRuntime`.
LGTM. Thanks!
…pache#14922)
* Refactor and introduce on-chip memory and memory planner. Introduced thread context with CLMLWorkspace. Organized the code into runtime, utils, and memory planners. Introduced recording queue support and on-chip memory support. The on-chip memory allocation planner accommodates multiple tensors at a time. A DDR memory planner is introduced to reuse the underlying memory across multiple tensor descriptors. Dense layer support is refactored to use GEMM. CLML binary operators don't support broadcasting, so an explicit broadcast op is introduced as a workaround. The CLML SDK codegen is enhanced accordingly.
* review comments
* Memory planner cpp_runtime tests.
* gtest build rules while in android environments.
* review comments
Co-authored-by: Siva Rama Krishna Reddy B <[email protected]>