[ServiceBus] Keep connection alive #12937

Merged (5 commits) on Aug 12, 2020

@yunhaoling yunhaoling commented Aug 7, 2020

The service enforces a 240-second connection idle timeout. This PR turns on the keep_alive feature supported by uamqp to keep the connection active. See issue #11935.

With this feature enabled, connection.do_work() is called every keep_alive_interval seconds (the default is 30, the same value used in EH track1). uamqp sends an empty frame to the service if it finds that 0.5 * remote-idle-timeout has passed since the last connection activity.

I don't plan to expose it as a public, user-configurable setting in this preview because:

  1. In .NET and JS, keeping the connection alive is turned on by default and can't be turned off or adjusted by the user.
  2. This leaves us room to decide later how we want to align the feature.
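
A minimal sketch of the mechanism described above, using a simulated clock rather than uamqp itself. `FakeConnection` and `simulate_keep_alive` are hypothetical names for illustration only, and the `>=` comparison against half the idle timeout is an assumption about how the threshold is checked.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FakeConnection:
    """Hypothetical stand-in for an AMQP connection; this is not the uamqp class."""

    remote_idle_timeout: float = 240.0  # service-side idle timeout from the PR description
    last_activity: float = 0.0
    empty_frames_sent_at: List[float] = field(default_factory=list)

    def do_work(self, now: float) -> None:
        # Assumption: an empty frame goes out once at least half of the
        # remote idle timeout has elapsed since the last connection activity.
        if now - self.last_activity >= 0.5 * self.remote_idle_timeout:
            self.empty_frames_sent_at.append(now)
            self.last_activity = now


def simulate_keep_alive(
    conn: FakeConnection,
    keep_alive_interval: float = 30.0,
    duration: float = 600.0,
) -> List[float]:
    """Call do_work() every keep_alive_interval seconds on an otherwise idle connection."""
    now = 0.0
    while now + keep_alive_interval <= duration:
        now += keep_alive_interval
        conn.do_work(now)
    return conn.empty_frames_sent_at


frames = simulate_keep_alive(FakeConnection())
print(frames)  # [120.0, 240.0, 360.0, 480.0, 600.0]
```

On an idle connection the simulated frames are 120 seconds apart, which stays well inside the 240-second limit.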

@yunhaoling
Contributor Author

/azp run python - servicebus - tests

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@yunhaoling added the Client label Aug 10, 2020
@yunhaoling
Contributor Author

/azp run python - servicebus - tests

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@@ -23,3 +23,4 @@ def __init__(self, **kwargs):
         self.auth_timeout = kwargs.get("auth_timeout", 60)  # type: int
         self.encoding = kwargs.get("encoding", "UTF-8")
         self.auto_reconnect = kwargs.get("auto_reconnect", True)
+        self.keep_alive = kwargs.get("keep_alive", 30)

I assume 30 aligns with cross-language consistency? (just double-checking)

Contributor Author

@yunhaoling yunhaoling Aug 11, 2020

This 30-second default is the value used in eventhub track1.

The working mechanism in uamqp is: every 30 seconds, connection.do_work() is called. It checks whether 0.5 * the remote idle timeout (240s, so 120s) has passed since the last activity and, if so, sends an empty frame.

.NET sends an empty frame roughly every 50 seconds.
JS sends an empty frame every 0.5 * remote idle timeout.

I think 30 seconds is a reasonable interval in our case.
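
To make the arithmetic behind that claim concrete, here is a rough sketch. It assumes the empty frame is sent at the first do_work() call on or after the 0.5 * idle-timeout mark, so the longest possible silence is bounded by that threshold plus one polling interval. The function names are hypothetical.

```python
def worst_case_idle_gap(keep_alive_interval: float, remote_idle_timeout: float) -> float:
    """Upper bound on the silence before an empty frame is sent.

    Assumption: the threshold is 0.5 * remote_idle_timeout, and the check can
    run up to one keep_alive_interval after that threshold has been crossed.
    """
    return 0.5 * remote_idle_timeout + keep_alive_interval


def keeps_connection_alive(keep_alive_interval: float, remote_idle_timeout: float) -> bool:
    """True if the worst-case silence stays strictly below the idle timeout."""
    return worst_case_idle_gap(keep_alive_interval, remote_idle_timeout) < remote_idle_timeout


print(worst_case_idle_gap(30, 240))      # 150.0
print(keeps_connection_alive(30, 240))   # True
print(keeps_connection_alive(120, 240))  # False
```

Under these assumptions a 30-second interval leaves about 90 seconds of headroom, and any polling interval below 120 seconds would still beat the 240-second limit.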

Contributor

I'm wondering how we would tell users to configure this value. 240s is the hard expiry time, so is 220s always better than 200s for keeping the connection alive, since it does the job with less traffic? If so, a bool value is better than a number.

Contributor Author

@YijunXieMS, right now we don't expose this parameter to users, to stay consistent with other languages: JS and .NET don't allow users to set or tweak this interval, and it is turned on by default in their SDKs.

If customers need to configure the value or toggle the switch, this could be a post-GA feature.

Member

@KieranBrantnerMagee KieranBrantnerMagee Aug 12, 2020

This echoes a question I posed for autolockrenew, and I don't recall if we settled on an answer: Should we disallow setting a keepalive > 240s? Or caveat emptor? I might classify it as a "semantic error"; if someone wants to disable keepalive, they should show intent and pass None. (Context: we have had at least one user who adjusted a value such as this and unintentionally ran into lock expiry as a result.)

To be precise, this is likely a consideration for whenever we expose this setting and thus lock it in for backcompat, but it is worth mentioning and keeping in mind in case there are strong feelings. (I've added it to our discussion list.)
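
For illustration only, one way the "semantic error" idea could look if the setting were ever exposed. `validate_keep_alive` and `SERVICE_IDLE_TIMEOUT_SECONDS` are hypothetical names, not part of the SDK, and rejecting values above 240 is just the option raised in this comment, not a decision.

```python
from typing import Optional

# 240 seconds is the service-side idle timeout quoted in the PR description.
SERVICE_IDLE_TIMEOUT_SECONDS = 240


def validate_keep_alive(keep_alive: Optional[float]) -> Optional[float]:
    """Hypothetical validation: None disables keep-alive, larger-than-timeout values are errors."""
    if keep_alive is None:
        # Explicit opt-out: the caller showed intent by passing None.
        return None
    if isinstance(keep_alive, bool) or not isinstance(keep_alive, (int, float)):
        raise TypeError("keep_alive must be a number of seconds or None")
    if keep_alive <= 0:
        raise ValueError("keep_alive must be positive; pass None to disable it")
    if keep_alive > SERVICE_IDLE_TIMEOUT_SECONDS:
        raise ValueError(
            "keep_alive above %d seconds cannot prevent the idle timeout; "
            "pass None to disable it" % SERVICE_IDLE_TIMEOUT_SECONDS
        )
    return float(keep_alive)
```

With this shape, `validate_keep_alive(300)` raises a ValueError instead of silently producing a connection that will be dropped, while `validate_keep_alive(None)` stays a deliberate opt-out.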

Contributor Author

As per discussion OOB:

The flag proposal is preferred because it's simple and users don't need to care about the value.
But we will leave it untouched (turned on by default) until customers ask to turn it off.
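
A tiny sketch of what the flag variant might look like, assuming the flag simply selects between the fixed 30-second interval and no keep-alive at all. `resolve_keep_alive_interval` and `DEFAULT_KEEP_ALIVE_INTERVAL` are hypothetical names; nothing like this is exposed in this PR.

```python
from typing import Optional

# The 30-second default comes from the diff in this PR.
DEFAULT_KEEP_ALIVE_INTERVAL = 30


def resolve_keep_alive_interval(keep_alive: bool = True) -> Optional[int]:
    """Map a user-facing on/off flag to the internal polling interval (None means disabled)."""
    return DEFAULT_KEEP_ALIVE_INTERVAL if keep_alive else None
```

The design point is that users only choose on or off, and the interval itself stays an internal detail that can change later without breaking callers.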

@yunhaoling
Contributor Author

/azp run python - servicebus - tests

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

Labels: Client, Service Bus
3 participants