-
Notifications
You must be signed in to change notification settings - Fork 3.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[AutoTVM][RPCRunner] timeout is not passed correctly #6924
Conversation
relates to this 6924 issue. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
cc @merrymercy
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We cannot do this because it changes the meaning of the current timeout.
Currently, the timeout argument of RPCRunner
only controls the time spent on remote devices. But after your change, the timeout will take all time into consideration, which includes the waiting time to get the device.
For example, if the timeout is 10, we have one remote device and we launch two measurement tasks.
With the existing code, each of the two requests can get 10 seconds to do the measurement on the remote device. So the later request actually has a 20 seconds timeout in LocalExecutor
(10s for waiting for the first request to finish).
But after your change, the later request can only get 10 seconds in LocalExecutor
.
The correct way is
self.executor = LocalExecutor(timeout=(timeout * self.n_parallel + 1))
We use self.n_parallel+1
instead of self.n_parallel
to give more time for other overhead.
@merrymercy as you suggested I set the constructor argument to |
You should only change the timeout setting in |
ok, thank you! |
* [AutoTVM][RPCRunner] timeout is not passed correctly * like @merrymercy suggests, scale timeout with (n_parallel + 1) * Apply suggestions from code review * Apply suggestions from code review Co-authored-by: Lianmin Zheng <[email protected]>
* [AutoTVM][RPCRunner] timeout is not passed correctly * like @merrymercy suggests, scale timeout with (n_parallel + 1) * Apply suggestions from code review * Apply suggestions from code review Co-authored-by: Lianmin Zheng <[email protected]>
* [AutoTVM][RPCRunner] timeout is not passed correctly * like @merrymercy suggests, scale timeout with (n_parallel + 1) * Apply suggestions from code review * Apply suggestions from code review Co-authored-by: Lianmin Zheng <[email protected]>
If we hand-over the timeout to the constructor of the LocalExecutor, like done in the LocalBuilder class it seems to solve the issue and the timeout is now correctly passed.