-
Notifications
You must be signed in to change notification settings - Fork 20
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Salvage mplex in the age of resource management #99
Conversation
multiplex.go
Outdated
|
||
return mp.getBuffer(length) | ||
} | ||
|
||
func (mp *Multiplex) getBuffer(length int) ([]byte, error) { | ||
if err := mp.memoryManager.ReserveMemory(length, 128); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we reserve all the memory mplex could possibly consume up front, and get rid of this call?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we could, but thats 16M a pop!!!
16 buffers x 1M max message size.
But you are right, we should.
I guess we can rsvp incrementally to find how much we can wiggle, up to 16 bufs (maybe less?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok done; I also cut it to 8 buffers total (4 each direction) so that it's only up to 8M for the reservation.
- memory is reserved up front - number of buffers in flight is limited - we remove that insane behaviour of resetting conns with slow readers; now we just block.
squashed some commits. |
Why are we reserving memory up-front instead of on-demand? That and assuming that every message is at the "max message size" will basically make mplex useless. |
We didn't crash and burn, we killed the stream so the entire system didn't deadlock. Please revert that! |
The fact that mplex was randomly resetting streams is what made mplex unusable, as every transfer larger than a few MB would run into this limit. |
So I'm now very confused. |
Sorry for the confusion, we didn't reset the stream, we killed the connection when a memory allocation failed: |
the reservation must be for max message size because we really send that long messages (and must be prepared to receive). we could theoretically try to precommit memory and track actual usage every time we get a buffer, but it gets complicated and the only reasonable way to deal with the condition and the timeouts/cancellations is to spawn a goroutine every time we block. |
note that mplex was effectively unusable with dynamic memory reervations coz we dont know who the buffer belongs to and thus must kill the conn when it fails. |
Ah, I thought you were referring to what mplex did before. That makes more sense.
I'm still confused. Why can't we call
This cannot be correct. Like, literally cannot.
|
Because it can fail and we have to kill the conn. If your concern is that we are not very effectively using the reserved memory, then we can do more detailed memory accounting at the cost of a goroutine for every blocked allocation. But that just pushes the problem a bit further, it doesn't solve it.
yes, and if we can't get memory for it we must die because we don't know what it is. |
Can we not block?
Why? I assume we'd just block the receive until we have some memory. That's partially my concern. My main concern is that mplex is going to sit on massive memory allocations without using them.
Yes, then why aren't we doing this? |
how?
If that's your concern, yes, I can fix that; it's just more complexity for marginal benefit imo.
That's what we do, we block until we can get a buffer! |
It looks like we need a way to block on resource allocation. This isn't the first time, and won't be the last.
Reserving multiple megabytes per connection seems like a pretty big problem, right?
No, we're reserving multiple megabytes of memory up-front. That's my concern. |
@vyzo : is there any followup work that needs to happen? I'm asking because saw comments/discussion after this was merged. |
yes, we decided to drastically reduce the memory precommit. |
Instead of blindly pushing packets till we run out of memory and die because the resource manager throttled us, we limit in-flight in and out buffers.
As a side effect, we no longer crash and burn when the reader is slow, we just block.