-
-
Notifications
You must be signed in to change notification settings - Fork 727
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
TCP simplify receiving frames in comm
s
#3599
Conversation
1820f98
to
d69511e
Compare
Should make it easier to disambiguate things like `frame` and `frames` as they are now `each_frame` and `frames`.
As of Tornado 5.0+, this feature is included in Tornado. The `.read_bytes(...)` code path is a holdover from before Tornado got this feature. Not to mention it is more performant than the `.read_bytes(...)` code path. Given this, drop the `.read_bytes(...)` case and always use `.read_into(...)`.
ad5a99b
to
3ff1ce2
Compare
LGTM @jakirkham , thanks for the work! I did some testing (not very heavy though) and it doesn't seem to create any issues. On the other hand it also doesn't seem to improve performance, but I'm guessing we will need some larger workflows to be tested to see that, which I didn't do. |
Will this PR supersede #3453 ? If not, is it a step in that direction ? This looks good to be. This PR also makes me really appreciate CI due to the tcp changes I am not as familiar with. It would be good to get comments from @mrocklin or @TomAugspurger before merging in |
It's a preliminary step in that direction. Agreed that's why I figured we could try the same thing in Yeah if Matt and Tom have time to review, that would be welcome 🙂 FWIW I also tried to combining the |
cc @jrbourbeau (in case you have thoughts here as well 😉) |
3ff1ce2
to
cbe0bf1
Compare
FWIW the UCX tests pass for me locally except for issue ( rapidsai/ucx-py#461 ), which I run into with or without these changes. The latter may be a hardware configuration though. |
Yeah, this is actually something that we want to handle sequentially. We want to get one frame, and then only once that's done get the next one. Otherwise we have lots of coroutines all trying to pull bytes off a wire at the same time, which is unwise. |
for each_frame in recv_frames: | ||
each_length = len(each_frame) | ||
n = await stream.read_into(each_frame) | ||
assert n == each_length, (n, each_length) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like this change has a semantic difference in that previously we used to include empty frames but now we don't. Is this correct?
Is there a specific reason for this change? Does it affect performance in some way, or is it strictly cosmetic? If it's strictly cosmetic, and if there is a change to semantics (however minor) I'm tempted to avoid the change just because I wouldn't be surprised if it has some unforeseen effect.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's not true actually. We excluded them before as well. It's just now done outside of the for
-loop. 🙂
The idea was to more-or-less pass control to Tornado or UCX-Py completely while receiving frames without all of the boilerplate in Dask between each receive. Tried asyncio.gather
to push this even further, but Tornado doesn't seem to like that (seems to work with UCX-Py though 😉).
In any event I thought it would be a nice thing to do the same thing we are doing in UCX for TCP. At the end of the day, I don't have strong feelings about it (especially if there is push back). 🤷♂
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any other thoughts here, @mrocklin? 🙂
comm
scomm
s
We've broken out the UCX part and integrated it separately with PR ( #3651 ). This is now the change for TCP only. Would be good to discuss again when you have time Matt, but it's fine to sit in the interim. |
Pass from me. You'll have to get consensus from other maintainers.
…On Fri, Mar 27, 2020 at 3:20 PM jakirkham ***@***.***> wrote:
We've broken out the UCX part and integrated it separately with PR ( #3651
<#3651> ). This is now the change
for TCP only. Would be good to discuss again when you have time Matt, but
it's fine to sit in the interim.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#3599 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AACKZTG6V3HVLVEZRFUV7TDRJURCPANCNFSM4LP5ZERA>
.
|
Other changes have gone in that more-or-less handle what this was set out to do. Closing... |
Simplifies the logic for receiving frames in
tcp
anducx
. Drops some legacy code on thetcp
side. Also preallocates buffers to receive data into beforehand (and subselects the non-0-length buffers). Then hands off control to the underlying library receiving data to fill these buffers. Afterwards handles any sanity checks.cc @quasiben @pentschev