CP-24917: limit simultaneous connections #26
Conversation
Signed-off-by: Thomas Sanders <[email protected]>
@@ -21,3 +21,6 @@ let wait_for_xapi_timeout_seconds = 300.0

(** We sleep for this many seconds before the next login attempt. *)
let wait_for_xapi_retry_delay_seconds = 4.0

(** Allow no more than this many client connections. *)
let connection_limit = 16
The counter logic looks good, but where does the limit 16 come from?
Yep, whatever number we choose it's gonna be a bit arbitrary. @thomassa wanted to limit it to 127 originally. The important thing is that the number of VBDs attached to dom0 is limited to 256, and xapi also needs to attach VBDs to dom0 to perform some operations. So we have to leave enough VBDs for dom0 & xapi to do their work.
But I do think that we could increase it to a larger number.
I'd rather be conservative and increase it in a second stage if it blocks partners. Let's ask some architects: @robhoes
Hard to know. And we could make it settable through a command-line option or a config file or something. My initial suggestion of 127 had very little thought behind it.
But what's the maximum number of connections that would make sense for a client to use? With 16 they will almost certainly be slowing each other down, and it would probably be faster to read the blocks one VDI at a time... or maybe multiple VDIs but only one connection at a time using each PBD.
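For illustration, a minimal sketch of how the limit could be made settable from the command line, using OCaml's standard Arg module; the flag name, default value, and usage string here are assumptions for this example, not part of the PR:

```ocaml
(* Illustrative only: make the connection limit configurable via a flag.
   The flag name and default value are assumptions. *)
let connection_limit = ref 16

let () =
  Arg.parse
    [ ( "--connection-limit"
      , Arg.Set_int connection_limit
      , "Maximum number of simultaneous client connections" ) ]
    (fun anon -> raise (Arg.Bad ("unexpected argument: " ^ anon)))
    "usage: xapi-nbd [options]"
```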
But the host could have multiple NICs, and the VDIs may be in multiple different SRs, so that neither the network nor the storage would be a bottleneck?
I'd say let's make it 32 or 50?
Ok, let's limit it to 16, as we discussed.
The Travis build failure was
...which seems to be a problem with the environment rather than the change in this pull request, but we should see about fixing it.
Why is Travis broken only with the base repo?
Looks good. This is a very important fix that we should have in the next release. I had one minor comment, but that is subjective, so it's not necessary to change the code. I reckon it would be best to test this live before merging?
let conn_m = Lwt_mutex.create () in
let inc_conn ?(i=1) () = Lwt_mutex.with_lock conn_m (fun () ->
    conn_count := !conn_count + i;
    if !conn_count > Consts.connection_limit && i > 0
Personally I would find it clearer if we created a separate function for dec_conn, instead of reusing the same code. Actually, I think the best would be to create a new wrapper, with_limited_connections; that way all the related code would be kept together. But this is subjective.
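A rough sketch of what such a with_limited_connections wrapper could look like, assuming Lwt and the Consts.connection_limit constant from this PR; the exception name and exact structure are illustrative, not the code in the diff:

```ocaml
exception Too_many_connections

let conn_count = ref 0
let conn_m = Lwt_mutex.create ()

(* Take a connection slot, run [f], and always release the slot afterwards,
   even if [f] fails. *)
let with_limited_connections f =
  let open Lwt.Infix in
  Lwt_mutex.with_lock conn_m (fun () ->
      if !conn_count >= Consts.connection_limit then
        Lwt.fail Too_many_connections
      else (incr conn_count; Lwt.return_unit))
  >>= fun () ->
  Lwt.finalize f (fun () ->
      Lwt_mutex.with_lock conn_m (fun () ->
          decr conn_count; Lwt.return_unit))
```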
The BASE_REPO error is due to a still-open issue in ocaml-docker; we can ignore it for the moment.
Feel free to merge it once it's been tested.
I have tested it by creating connections using
So I'm going to merge it because this is an important fix, but it's not ideal that the client just hangs. I guess it would still hang if, instead of this method, we just did not accept the incoming client connection?
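As a hedged sketch of the alternative being discussed here: accept the connection but close it immediately when the limit is reached, so the client fails fast instead of hanging in the listen backlog. The names (accept_loop, handle_client) are made up for this example; this is not the code in the PR.

```ocaml
(* Sketch only: close excess connections right after accept so clients see
   EOF/reset instead of hanging. Assumes Consts.connection_limit as above. *)
let rec accept_loop listening_socket handle_client conn_count =
  let open Lwt.Infix in
  Lwt_unix.accept listening_socket >>= fun (fd, _addr) ->
  ( if !conn_count >= Consts.connection_limit then Lwt_unix.close fd
    else begin
      incr conn_count;
      Lwt.async (fun () ->
          Lwt.finalize
            (fun () -> handle_client fd)
            (fun () -> decr conn_count; Lwt_unix.close fd));
      Lwt.return_unit
    end )
  >>= fun () -> accept_loop listening_socket handle_client conn_count
```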
After this is merged, we still have to update
Is there any way to send a message to close the connection to the client? I am not really convinced about the behaviour
I'll check how the
Hmm, interesting, when I use the
🤔
Although the client failure is a bit unreadable, I find this behaviour much more acceptable!
Should we merge this then?
So apparently this hanging is client-dependent for some reason 🤔
The PR is fine, and it does cause the client to error out. The reason my Python NBD client hung is that I was using the