-
Notifications
You must be signed in to change notification settings - Fork 492
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Bug fixes and logging improvements #619
Conversation
Setting the twin's ETag in AddDeviceWithTwinAsync
Removing configure awaits; add 5s delay in MQTT WS close
Setting CompletedSynchronously everywhere
Add logs for amqp ws tpm
AMQP - handle sync completion
Add client side tracing to the TPM over AMQP/WS scenario
@CIPop is added to the review. #Closed |
@alextolp is added to the review. #Closed |
@rajeevmv is added to the review. #Closed |
@@ -9,6 +9,7 @@ | |||
<OutputPath>.\bin\$(Configuration)</OutputPath> | |||
<DocumentationFile>$(OutputPath)\$(TargetFramework)\$(AssemblyName).xml</DocumentationFile> | |||
<RootDir>$(MSBuildProjectDirectory)\..\..\..\..</RootDir> | |||
<NoWarn>CA2007</NoWarn> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove this change
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes please! Thank you @mounicaArroju for pointing it out. (This is why the build was probably not failing anymore with the current code.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We cannot have ConfigureAwait with DotNetty. See other comment.
In reply to: 217796093 [](ancestors = 217796093)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See above: there is always an implicit configuration of the await
ed operation. (Default is true
.)
Default FXCop rules are making the ConfigureAwait
usage mandatory: your semantic is identical to ConfigureAwait(true)
. I don't agree with that either since the control has already been returned to the dotNetty thread when the first await
is encountered.
catch (Exception) | ||
{ | ||
Abort(); | ||
catch (Exception) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Never catch Exception without rethrowing. This needs to be somehow captured and marshalled to the caller.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't rethrow b/c of the Task.Delay. Please let us know if there's a better way to do this
In reply to: 217763304 [](ancestors = 217763304)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above - I don't understand the reason but this is a bad practice in general.
@@ -140,7 +146,7 @@ protected override async void DoBeginRead() | |||
do | |||
{ | |||
byteBuffer = allocHandle.Allocate(allocator); | |||
allocHandle.LastBytesRead = await DoReadBytes(byteBuffer).ConfigureAwait(false); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This breaks the build - ConfigureAwait is mandatory on all await
statements.
Why are you changing the behavior?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We were seeing hangs in mqtt runners and the recommendation from Max was that we could never use ConfigureAwait with dot netty. After removing this, we never saw hangs again. We should avoid adding ConfigureAwaits in paths that use their own thread pool
In reply to: 217763495 [](ancestors = 217763495)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could never use ConfigureAwait with dot netty.
@alextolp ConfigureAwait
is mandatory by FXCop rules. Not setting it will break the build. You can either set it to true
(return on this context) or false
(any context).
@nayato I don't understand the reasoning of not using any thread to continue the execution. Up to the first await
statement the void callback will execute on the original thread. After that, the thread returns to dotNetty. Why spend the cycles to ensure that we're returning on that random thread at a future time?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@CIPop the whole point is that it is not a random thread. DotNetty overrides current task scheduler so if you don't opt-out with ConfigureAwait continuation will be posted to executor thread. Primary benefit of doing that is to avoid synchronization on data access and random issues associated with data races when we fail to do so.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @nayato - didn't know that! (Really cool!)
In that case the right code here would be .ConfigureAwait(true)
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want to have explicit .ConfigureAwait(true) as it is the default?
In reply to: 218609670 [](ancestors = 218609670)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, otherwise build will break due to the FXCop rule. Our code should be explicit instead of implicit. DotNetty is a special case as it has its own TaskScheduler. Most library code should always do .ConfigureAwait(false)
to allow completions on the current thread and avoid a thread switch.
|
||
using (var cancellationTokenSource = new CancellationTokenSource(TimeSpan.FromSeconds(30))) | ||
Task.Run(async () => |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Never start a task that is not awaited
by something. The result is not captured in this case and so the application will have no chance to get exceptions, etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Christian - we observed hangs in mqtt client and the recommendation from Max was to add a 5s delay before closing. Could you recommend another way where the client can close but we don't block it for 5s?
In reply to: 217779370 [](ancestors = 217779370)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can't find a connection - this handler is a void async
which means it's not awaited by the caller anyway. We may need to lock to avoid calling CloseAsync from multiple threads.
Are you by any chance calling non-static methods on ProvisioningClient from multiple threads?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@CIPop to give some context here: I don't know about 5 sec but the dispatch there is to avoid blocking executor thread. WS client tends to deadlock on simultaneous closure. Locking CloseAsync while running on executor thread will not buy us anything. That said, I don't see why we couldn't await on that task except that if CloseAsync deadlocks we would still want to make progress wherever we await it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@alextolp, are you using .NET Framework or .NET Core when you're seeing the deadlock?
@nayato, according to MSDN Task.Run uses the default task scheduler (I read that as the current task scheduler and might be wrong). Wouldn't that if written like this, it would still run in the dotNetty executor loop?
WebSocketCloseStatus.NormalClosure, | ||
string.Empty, | ||
cancellationTokenSource.Token).ConfigureAwait(false); | ||
using (var cancellationTokenSource = new CancellationTokenSource(TimeSpan.FromSeconds(30))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Propagate the cancellation token passed by the public API.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The two cancellation tokens have different roles. The one passed by the public API is about the caller timeout.
This one is about timeout on ws.Close()
If caller timed out, we should not pass a cancelled token to Close() and let it happen anyway
In reply to: 217779575 [](ancestors = 217779575)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think if the operation token expired we should not attempt to do graceful WS termination. Instead we should reset the TCP connection.
|
||
if (Logging.IsEnabled) Logging.Info(this, $"{nameof(AmqpClientConnection)}.{nameof(OpenAsync)}: Sent Protocol Header: {_sentHeader.ToString()} writeAsyncResult: {writeAsyncResult} completedSynchronously: {args.CompletedSynchronously}"); | ||
|
||
if (args.CompletedSynchronously) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe this fixes #552. Please remove all TODO and Ignore statements from the e2e tests.
/cc @abhipsaMisra @prmathur-microsoft we need to check this for IoT Hub client as well.
@@ -184,6 +193,8 @@ async Task<ClientWebSocket> CreateClientWebSocketAsync(Uri websocketUri, TimeSpa | |||
|
|||
private void OnWriteHeaderComplete(TransportAsyncCallbackArgs args) | |||
{ | |||
if (Logging.IsEnabled) Logging.Enter(this, $"{nameof(AmqpClientConnection)}.{nameof(OnWriteHeaderComplete)}"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Enter is missing Exit. Please ensure that all Enter statements have an equivalent Exit.
@@ -9,6 +9,7 @@ | |||
<OutputPath>.\bin\$(Configuration)</OutputPath> | |||
<DocumentationFile>$(OutputPath)\$(TargetFramework)\$(AssemblyName).xml</DocumentationFile> | |||
<RootDir>$(MSBuildProjectDirectory)\..\..\..\..</RootDir> | |||
<NoWarn>CA2007</NoWarn> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes please! Thank you @mounicaArroju for pointing it out. (This is why the build was probably not failing anymore with the current code.)
try | ||
{ | ||
if (message.Security is SecurityProviderX509) | ||
{ | ||
security = (SecurityProviderX509)message.Security; | ||
SecurityProviderX509 x509Security = (SecurityProviderX509)message.Security; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems to be incorrect.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @anushprabhu ! I left a few comments. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @alextolp.
There are still a few issues to address.
|
||
using (var cancellationTokenSource = new CancellationTokenSource(TimeSpan.FromSeconds(30))) | ||
Task.Run(async () => |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can't find a connection - this handler is a void async
which means it's not awaited by the caller anyway. We may need to lock to avoid calling CloseAsync from multiple threads.
Are you by any chance calling non-static methods on ProvisioningClient from multiple threads?
WebSocketCloseStatus.NormalClosure, | ||
string.Empty, | ||
cancellationTokenSource.Token).ConfigureAwait(false); | ||
using (var cancellationTokenSource = new CancellationTokenSource(TimeSpan.FromSeconds(30))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think if the operation token expired we should not attempt to do graceful WS termination. Instead we should reset the TCP connection.
catch (Exception) | ||
{ | ||
Abort(); | ||
catch (Exception) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above - I don't understand the reason but this is a bad practice in general.
@@ -140,7 +146,7 @@ protected override async void DoBeginRead() | |||
do | |||
{ | |||
byteBuffer = allocHandle.Allocate(allocator); | |||
allocHandle.LastBytesRead = await DoReadBytes(byteBuffer).ConfigureAwait(false); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could never use ConfigureAwait with dot netty.
@alextolp ConfigureAwait
is mandatory by FXCop rules. Not setting it will break the build. You can either set it to true
(return on this context) or false
(any context).
@nayato I don't understand the reasoning of not using any thread to continue the execution. Up to the first await
statement the void callback will execute on the original thread. After that, the thread returns to dotNetty. Why spend the cycles to ensure that we're returning on that random thread at a future time?
@@ -9,6 +9,7 @@ | |||
<OutputPath>.\bin\$(Configuration)</OutputPath> | |||
<DocumentationFile>$(OutputPath)\$(TargetFramework)\$(AssemblyName).xml</DocumentationFile> | |||
<RootDir>$(MSBuildProjectDirectory)\..\..\..\..</RootDir> | |||
<NoWarn>CA2007</NoWarn> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See above: there is always an implicit configuration of the await
ed operation. (Default is true
.)
Default FXCop rules are making the ConfigureAwait
usage mandatory: your semantic is identical to ConfigureAwait(true)
. I don't agree with that either since the control has already been returned to the dotNetty thread when the first await
is encountered.
Blocking merge: @prmathur-microsoft @tameraw @yzhong94 - we will not accept new PRs until we can enable all E2E tests.
|
@anushprabhu @alextolp I will cherry pick some of the fixes as well as do the ConfigureAwait into a separate PR. |
@anushprabhu @alextolp - I cherry picked some of the items from here. Please rebase and submit a new PR for the IoT enhancement (we cannot accept that w/o a test). |
Checklist
master
branch.Description of the changes
Reference/Link to the issue solved with this PR (if any)