[BUG] - Microsoft.Azure.ServiceBus - System.InvalidOperationException: Can't create session when the connection is closing. #9416
Comments
Thanks for the feedback! We are routing this to the appropriate team for follow-up. cc @jfggdl |
@schwartzma1, This seems to be same as #6940. |
@nemakam - Yes, so in the release the project that is using this has: I am not wondering whether it should be ServiceBusCommunicationException or InvalidOperationException, but why this exception is raised to the caller at all if the connection is closing. Can't it just be ignored on the ServiceBus side, the way ObjectDisposedException and OperationCanceledException are handled in MessagePump.cs in this commit: 008bb2b#diff-a4508926a30ad8c3ce214614ed2fd446?
I agree. This can be ignored if the pump itself is closing. Is that the case? Connection closing and pump closing would be two different things. |
We are getting InvalidOperationException. As for whether the pump is closing or the connection is closing, I am not sure. Can you tell from the stack trace? Further down it has: at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task). The exception message itself is "Can't create session when the connection is closing." But I cannot tell from that alone whether it means the pump is closing or the connection is closing.
@schwartzma1 , pump can only be closed by the user, when you do |
Another occurrence - #11066 |
Hi @nemakam , to confirm, we are not explicitly closing anything. I can see from the Stacktrace the connection is indeed closing. |
Okay, connections can get closed once in a while in a distributed world. The only thing we could aim for here is to make sure we translate this exception into a communication exception.
Once we change this into a communication exception, the retry policy will kick in and retry if the client is not closed. So it will improve the situation drastically.
Does this mean it won't pop-up anymore in AI or the console output? |
It will. By default the client reports every communication exception. A few communication exceptions are expected, and you can ignore them. If a lot of such exceptions happen frequently, then it's something we should look at. The client will always report such an exception if an operation failed with it. The only time exceptions could be swallowed is when the client was explicitly closed and the exceptions are related to that.
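For anyone who wants to cut down the noise on their own side in the meantime, a small filter along these lines could be called from the exception handler to decide whether an error is worth logging at full severity. This is only a sketch: `ReceiveErrorFilter`, `IsExpectedNoise`, and the `clientIsClosing` flag are hypothetical names, not part of the SDK, and matching on the message text is an assumption based on the exception message quoted in this issue.

```csharp
using System;

public static class ReceiveErrorFilter
{
    // Hypothetical helper (not an SDK API): returns true for exceptions that
    // this thread describes as expected noise, so the caller can log them at
    // a lower severity instead of treating them as failures.
    public static bool IsExpectedNoise(Exception ex, bool clientIsClosing)
    {
        // Cancellation and disposal are only expected while the caller is
        // shutting the client down. TaskCanceledException derives from
        // OperationCanceledException, so it is covered here as well.
        if (clientIsClosing &&
            (ex is OperationCanceledException || ex is ObjectDisposedException))
        {
            return true;
        }

        // Assumption: the SDK surfaces the closing connection as a plain
        // InvalidOperationException, so the only way to recognise it is by
        // the message text quoted in this issue.
        return ex is InvalidOperationException &&
               ex.Message.IndexOf("connection is closing",
                                  StringComparison.OrdinalIgnoreCase) >= 0;
    }
}
```

In a handler you would pass in the `Exception` property of the `ExceptionReceivedEventArgs` and a flag that your own shutdown code sets. Everything the filter rejects should still be logged normally.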
Hmm understand. Well, both AI and the console is being spammed a lot with service bus exceptions. Timeouts, exceptions like above, taskcancelled,... It's really annoying as it makes it really difficult to find value in there. Would it be possible for you (or one of your colleagues) and me to have a short call where I can showcase the behavior perhaps? |
Small update: we went through AZ (paid) support channels, as this issue had existed for almost a year already and was consuming not only a lot of AI bandwidth but also taking up the available slots for snapshot debugs. I noticed that 7.0 supports this, correct? Any ETA for that?
@Mortana89, sorry, I think I didn't get a few things. What support exactly are you looking for in 7.0? Also, would you be able to provide numbers?
And you did mention "constant" in your description, but I wanted to confirm that again. Do you see these exceptions constantly at the same rate or do you see a burst of exceptions at sometime, and then remaining times you don't see any? |
Hi Nekam, constant, yes. See the following screenshot from AI before applying the change. We had two separate QueueClients per queue, one for sending and one for receiving. Some microservices consume multiple queues and thus have more QueueClients than others. We have roughly 60 microservice instances that were generating these exceptions. SessionHandlerOptions are the same; nothing changed there. We use peek-lock, a max timeout of 30 min, a max wait of 5 sec, and a max concurrency of 100.
Also, the screenshot is with 50% ingestion sampling enabled! |
Could you expand the first column "overall" and send a screenshot / maybe just copy-paste? |
It's very surprising that switching from two QueueClients to one reduces the exceptions. The pipelines are very independent. Do you think you could provide a sample snippet?
Hi Nekam, it looks as if the exceptions are thrown when receiving, as they all come from the message pump; Message: They always come in pairs. One of the above also triggers the TaskCanceled exceptions: Message: A task was canceled. A task was canceled. Do you have an email address I can send the previous source code of our queue interop client to?
This is still happening to us and I am fairly sure we did not call close. Can this be changed from InvalidOperationException to ServiceBusCommunicationException so that the retrying kicks in? |
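Until the exception type changes, a thin retry wrapper around the receive call is one way to get similar behaviour on the caller's side. The following is a minimal sketch, not the SDK's retry policy: `ConnectionClosingRetry` and `ExecuteAsync` are hypothetical names, and matching on the message text is an assumption based on the exception message quoted in this issue.

```csharp
using System;
using System.Threading.Tasks;

public static class ConnectionClosingRetry
{
    // Hypothetical helper (not an SDK API): runs the operation and retries it
    // when it fails with the "connection is closing" InvalidOperationException.
    public static async Task<T> ExecuteAsync<T>(
        Func<Task<T>> operation, int maxAttempts, TimeSpan delay)
    {
        if (operation == null) throw new ArgumentNullException(nameof(operation));
        if (maxAttempts < 1) throw new ArgumentOutOfRangeException(nameof(maxAttempts));

        for (int attempt = 1; ; attempt++)
        {
            try
            {
                return await operation().ConfigureAwait(false);
            }
            // The filter stops matching on the last attempt, so the final
            // failure propagates to the caller unchanged.
            catch (InvalidOperationException ex) when (
                attempt < maxAttempts &&
                ex.Message.IndexOf("connection is closing",
                                   StringComparison.OrdinalIgnoreCase) >= 0)
            {
                // Give the client time to re-establish the connection.
                await Task.Delay(delay).ConfigureAwait(false);
            }
        }
    }
}
```

Assuming `receiver` is an existing MessageReceiver, a call could look like `await ConnectionClosingRetry.ExecuteAsync(() => receiver.ReceiveAsync(TimeSpan.FromSeconds(5)), 3, TimeSpan.FromSeconds(1));`. Any other exception type, and any InvalidOperationException with a different message, is not retried.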
@mladedav @Mortana89 Can you share with us a snippet of your code when you run into this error and we will see if we can repro it? We could try to translate this exception into communication exception if we can repro. But you will still get the logs on AI. Would that be okay for you? |
In our case this processes one message and then we get this exception when trying to read the next message. I was told the load should be about 35 messages/hour. It happens for one job consistently in kubernetes. We are ok with the exceptions being shown, it's just an issue of retrying. Since there is built-in retrying in the SDK, that would seem like the preferred way, but if that wasn't reliable, we would have to implement it ourselves costing us more time and code.
with the receiver built like
|
Thanks for providing us with the source code. I will look into this and see if I can repro the error. Will update as soon as I can. |
Hi, were you able to reproduce? Would it be possible to change the type? |
@mladedav Sorry about the delay in response. I have checked our code in the _receiver.ReceiveAsync path and confirmed that you should not receive InvalidOperationException if the connection is closing. I cannot repro what you are seeing with our SDK. I am also a bit confused, since you have shared two stack traces. The first one: Another stack trace you have shared: Message: From this stack trace you seem to be using SessionClient to receive. Can you share a code snippet that causes this stack trace? In which path is the InvalidOperationException whose type you want changed?
@mladedav Also just a follow up to the first stack trace: I saw it is throwing from MessageReceivePump, are you by any chance using |
I haven't actually sent any stack traces, that must have been someone else. I must admit I cannot seem to find them anywhere and I can't reproduce the error with our older code. When I get home I will try to hack around a bit if I can break it again. At this time I only remember that this was the exception with this message (by which I found this thread). I thought since there are already two I wouldn't clutter this place more with stack traces but lesson learned for next time |
@mladedav Thanks for your reply. Are you by any chance using OnMessageHandler in your code? If so, can you share with me that piece of code, especially how you handle exception in the OnMessageHandler. |
No, we're only using the MessageReceiver directly to receive. The only other Service Bus-relevant information I can think of is that we were failing to renew messages before their locks expired when these issues happened, but we fixed that at the same time as we added some defensive retrying, so I can't say whether it's relevant.
Got it. The first log trace shows 'MessageReceivePump', which only appears when receiving through an OnMessageHandler, so I am not sure how that error trace was thrown. We will need a reliable error trace to continue investigating, since we cannot repro. Can you repro it and send us the log and timestamp? Thanks!
I too have been receiving this error intermittently for quite some time from our Azure Functions instances running in Kubernetes. We have a functions pod running .NET Core 2.2.8 with Functions v2 in our Production cluster and, after a recent upgrade, a separate functions pod running .NET Core 3.1.5 with Functions v3 in our Sandbox cluster. The exceptions are still being received from both pods. The production pod references Microsoft.Azure.ServiceBus v3.4.0 and the sandbox pod references Microsoft.Azure.ServiceBus v4.1.3. Exception message: Stack Trace: Another interesting piece of info is that I am also receiving this exception at essentially the same time: Let me know if you require any more information.
Hi @mr-davidc, thanks for reaching out. Are you from the same team as @mladedav? If not, can you create another thread and share with us a piece of code that has the exceptions thrown? This thread is already really long and hard to navigate. We need to tailor for each repro to see what's the problem there since these kinds of problems are hard to debug. |
@mr-davidc, @mladedav any updates on the logs? If not, we will close this incident. |
Hi @DorothySun216, mladedav and I are from separate teams so I can't speak for them but from my end I created a separate issue for the problems I am getting here: #13637 Thanks |
We have rolled out a fix (#17023) in the latest release, 5.1.0. Can you test with this new NuGet package and let us know whether you are still seeing the same issue? https://www.nuget.org/packages/Microsoft.Azure.ServiceBus/5.1.0
I'm still getting a similar issue with Microsoft.Azure.ServiceBus 5.1.2. If I close all the sessions and the SubscriptionClient with
Any suggestions? |
@ColeSiegelTR Thanks for reaching out. This is a very unpredictable issue; we have attempted to repro it many times but couldn't. Do you have retry mechanisms that can recover from this exception? If this exception happened within the last 30 days, can you open an Azure support ticket with us? https://azure.microsoft.com/en-us/support/create-ticket/ We will investigate further.
We upgraded to the latest Azure SDK and this still happens (Azure.Messaging.ServiceBus) |
Describe the bug
Getting an InvalidOperationException in our ExceptionReceivedHandler about being unable to create a session.
Exception or Stack Trace
System.InvalidOperationException: Can't create session when the connection is closing.
at Microsoft.Azure.ServiceBus.Core.MessageReceiver.d__86.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at Microsoft.Azure.ServiceBus.Core.MessageReceiver.<>c__DisplayClass64_0.<b__0>d.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at Microsoft.Azure.ServiceBus.RetryPolicy.d__19.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at Microsoft.Azure.ServiceBus.RetryPolicy.d__19.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at Microsoft.Azure.ServiceBus.Core.MessageReceiver.d__64.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at Microsoft.Azure.ServiceBus.Core.MessageReceiver.d__62.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at Microsoft.Azure.ServiceBus.MessageReceivePump.<b__11_0>d.MoveNext()
To Reproduce
Not sure exactly how to reproduce this - it occurs intermittently.
Expected behavior
This exception should perhaps be ignored by the message pump similar to other exceptions which are ignored.
Setup (please complete the following information):
Additional context
Wondering if InvalidOperationException should be handled in the same way ObjectDisposed and OperationCanceledException are being handled in PR 8449 #8449 and this commit:
008bb2b#diff-a4508926a30ad8c3ce214614ed2fd446
Information Checklist
Kindly make sure that you have added all of the following information above and checked off the required fields; otherwise we will treat the issue as an incomplete report