-
Notifications
You must be signed in to change notification settings - Fork 698
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add Host Manager #2284
Add Host Manager #2284
Conversation
void HostManager::removeNetwork(std::string networkID) { | ||
// Walk the tree for the given function, calling evict function for each node | ||
// before removing it from networks_ which frees the node. | ||
std::queue<std::string> nodes; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are you using std::string to represent nodes? Use pointers or something that can be uniqued.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Strings are being used because that's the key to the map that stores the nodes. I could use pointers instead and then pointer->name to get the iterator to remove. Using a string seemed cleaner.
Please make sure to squash the commits. |
@gcatron this is looking good, can't wait to have this interface to the runtime! |
6c4fcb8
to
fe0be49
Compare
961753d
to
16042c3
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CMake changes missing
Need to look into concurrency issues.
62d8b65
to
920a99e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We really need multithreaded tests for those methods that can be executed concurrently (potentially enable thread sanitizer to catch issues).
I'm going to unblock the progress here, but need better verification around concurrency here in general.
3a0c660
to
b3ab82e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking good! How close are we to this being in a working state atm? Asking so I can judge whether I should parallelize some more onnxifi work or just wait til this is in.
ResultCBTy callback) { | ||
|
||
auto currentRun = totalRequestCount_++; | ||
std::lock_guard<std::mutex> networkLock(networkLock_); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does this lock need to be held while callback() is being called?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the callback we are good since activeRequestCount_ is atomic. It's needed here since we are accessing roots_.
} | ||
|
||
void HostManager::clearHost() { | ||
std::lock_guard<std::mutex> networkLock(networkLock_); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does lock need to be held while stopping devices?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the device no, but this way we can't add a function to a device we are stopping.
7079bf3
to
62aba6b
Compare
I currently have the concurrent test disabled. This is because the compiled function is owned by the provisioner not the DeviceManager. Since there's no callback on evictNetwork The HostManager can free the compiledFunction out from under the DeviceManager. We either need to add a callback/make evict blocking, or move ownership of the CompiledFunction to the deviceManager. |
62aba6b
to
7ff090d
Compare
@gcatron I think it makes sense to add a callback to the evict, we can probably use the same one from addNetwork. |
a258b1b
to
b4c3e95
Compare
c6238d7
to
dabe6c3
Compare
…s to provisioner.
dabe6c3
to
b9d17b9
Compare
Description: This is a resurrected PR for the HostManager. It is almost ready for merging, once Executor, Partitioner, and Provisioner are merged this will be updated and be ready.
The HostManager will serve as the entry point for execution calls and handle orchestrating the rest of the runtime components.
Testing: Will write Unit Test before merge.
Documentation: N/A
Related to #2045