Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

System.DirectoryServices.Protocols - Linux -> binding time #38610

Closed
GrowSharp opened this issue Jun 30, 2020 · 11 comments
Closed

System.DirectoryServices.Protocols - Linux -> binding time #38610

GrowSharp opened this issue Jun 30, 2020 · 11 comments
Assignees
Milestone

Comments

@GrowSharp
Copy link

Description

So problem is, that sometimes (really randomly) the bind takes 127 seconds to complete. It doesn't matter if it's anonymous bind or not. Just sometimes the .Bind(); takes roughly 127 seconds. Did some logging around it for you to see, see Data.

Configuration

Domain joined CentOS 7.8.2003
also Ubuntu 20.04
.NET Core 3.1.301, commit: 7feb845744
.Protocols version 5.0.0-preview.5.20278.1

Regression?

On Windows it does not happen.

Data

Time to .Bind(); the connection:

Elapsed time 6 ms
Elapsed time 7 ms
Elapsed time 6 ms
Elapsed time 8 ms
Elapsed time 5 ms
Elapsed time 127317 ms
Elapsed time 127183 ms
Elapsed time 5 ms
Elapsed time 127180 ms
Elapsed time 9 ms
Elapsed time 4 ms
Elapsed time 127299 ms
@GrowSharp GrowSharp added the tenet-performance Performance related issue label Jun 30, 2020
@Dotnet-GitSync-Bot Dotnet-GitSync-Bot added area-System.DirectoryServices untriaged New issue has not been triaged by the area owner labels Jun 30, 2020
@ericstj ericstj added this to the 5.0.0 milestone Jul 9, 2020
@ericstj ericstj removed the untriaged New issue has not been triaged by the area owner label Jul 9, 2020
@ericstj ericstj added the bug label Jul 9, 2020
@ericstj
Copy link
Member

ericstj commented Jul 9, 2020

If you can repro this, it might be good to get a dump to find out what's happening. It's possible this is a server side delay too. Maybe create an app that repros it and when you see the bind taking longer than 100 ms call dotnet-dump on your own proecess: https://docs.microsoft.com/en-us/dotnet/core/diagnostics/dotnet-dump

@GrowSharp
Copy link
Author

GrowSharp commented Jul 21, 2020

Finally found time to do this. @ericstj
It took some time but I manage to reproduce it (it seems to be happening pretty random). Now I have +- 250 MB dump file (Heap) and one smaller like 70 MB (Mini). I will now upload only the smaller file, let me know if you will need the bigger one.

File:
core_20200721_141538.zip

@danmoseley
Copy link
Member

dotnet trace collect might be more helpful - if you can get one for a good case, and for a bad case. https://docs.microsoft.com/en-us/dotnet/core/diagnostics/dotnet-trace#dotnet-trace-collect

@GrowSharp
Copy link
Author

GrowSharp commented Aug 10, 2020

All right. What I did while getting an example of a bad case scenario (bad.nettrace):
Right before my .Bind() statement I placed Process.GetCurrentProcess().Id; to obtain the process ID.
At .Bind(); statement I also placed breakpoint.
I started the program, called the API that went into the desired block of code, read the process ID and executed the dotnet-trace collect command. After that I went for F10 to continue. After .Bind() statement was completed, I pressed F5 and let it run to the end. After that I turned off the dotnet-trace tool and finally terminated my app.

In the good case scenario (good.nettrace) I did almost the same, except the fact that I called the API twice, so at some point the trace should repeat itself, sorry.

Files:
nettrace.zip
@danmosemsft

@danmoseley
Copy link
Member

danmoseley commented Aug 25, 2020

@GrowSharp could you please check good.nettrace? When I open in PerfView (https://docs.microsoft.com/en-us/dotnet/core/diagnostics/dotnet-trace#view-the-trace-captured-from-dotnet-trace) I get

Started: Opening good.nettrace
Creating ETLX file C:\Users\danmose\AppData\Local\Temp\PerfView\good_fe97c3c9.etlx from C:\temp\traces\good.nettrace
Collecting a maximum of 20,000,000 events.
[Opening a log file of size 0 MB.]
[ERROR: processing events ****]
***** Consider using /ContinueOnError to ignore the bad part of the trace.  *****
Error: Exception EventPipe conversion: System.Exception: Read past end of stream.
   at FastSerialization.IOStreamStreamReader.Fill(Int32 minimum)
   at FastSerialization.MemoryStreamReader.ReadByte()
   at FastSerialization.Deserializer.ReadObject()
   at Microsoft.Diagnostics.Tracing.EventPipeEventSource.Process()
   at Microsoft.Diagnostics.Tracing.Etlx.TraceLog.CopyRawEvents(TraceEventDispatcher rawEvents, IStreamWriter writer)
   at Microsoft.Diagnostics.Tracing.Etlx.TraceLog.<>c__DisplayClass118_0.<FastSerialization.IFastSerializable.ToStream>b__0()
   at FastSerialization.DeferedRegion.Write(Serializer serializer, Action toStream)
   at Microsoft.Diagnostics.Tracing.Etlx.TraceLog.FastSerialization.IFastSerializable.ToStream(Serializer serializer)
   at FastSerialization.Serializer.WriteObjectData(IFastSerializable obj, Tags beginTag)
   at FastSerialization.Serializer.WriteObjectRef(IFastSerializable obj, Boolean defered)
   at FastSerialization.Serializer..ctor(IStreamWriter writer, IFastSerializable entryObject)
   at Microsoft.Diagnostics.Tracing.Etlx.TraceLog.CreateFromEventPipeEventSources(TraceEventDispatcher source, String etlxFilePath, TraceLogOptions options)
   at Microsoft.Diagnostics.Tracing.Etlx.TraceLog.CreateFromEventPipeDataFile(String filePath, String etlxFilePath, TraceLogOptions options)
   at PerfView.EventPipePerfViewData.GetTraceLog(TextWriter log, Action`3 onLostEvents)
[Error: exception while opening EventPipe data.]
Completed: Opening good.nettrace   (Elapsed Time: 0.816 sec)

bad.nettrace opens OK and I"ll see if anything jumps out (but this is @joperezr area of expertise)

@danmoseley
Copy link
Member

From bad.nettrace, the expensive path just ends in native code unfortunately:

--
||     |                                        + module System.Private.CoreLib <<System.Private.CoreLib!AsyncTaskMethodBuilder.Start>>
||     |                                         + module Adler.Infrastructure.oAuth.WebApi <<Adler.Infrastructure.oAuth.WebApi!Adler.Infrastructure.oAuth.WebApi.Base.RouterMiddleware+<>c__DisplayClass1_0+<b__0>d.MoveNext()>>
||     |                                          + module Adler.Infrastructure.oAuth.BL <<Adler.Infrastructure.oAuth.BL!ActiveDirectoryRoleService.GetClaimsAsync>>
||     |                                          |+ module System.Private.CoreLib <<System.Private.CoreLib!System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1[System.__Canon].Start(!!0&)>>
||     |                                          | + module Adler.Infrastructure.oAuth.BL <<Adler.Infrastructure.oAuth.BL!Adler.Infrastructure.oAuth.BL.Services.ActiveDirectoryRoleService+d__8.MoveNext()>>
||     |                                          |  + module System.DirectoryServices.Protocols <<System.DirectoryServices.Protocols!LdapConnection.Bind>>
||     |                                          |  |+ Type: External
||     |                                          |  | + Event Microsoft-DotNETCore-SampleProfiler/Thread/Sample

@noahfalk what is the best way to gather a trace that includes native frames here? I guess not dotnet trace?

@noahfalk
Copy link
Member

correct dotnet trace doesn't have support for native code (not yet at least). On Linux you can use the perfcollect script here that runs a combination of perf (for native CPU sampled callstacks) + lttng (for runtime and EventSource events)

@danmoseley
Copy link
Member

@GrowSharp do you want to give that a try?

Although, as they say, when a perf issue is egregious enough, you don't need a profiler, you need a debugger. Break in once or twice during the delay and inspect where the CPU is. Is that something you could try? I believe you need to use lldb. The closest I could find to instructions are here which assume you're debugging the CLR itself, but it should be much the same. @joperezr do you have more info? I assume @GrowSharp will need some -dev packages?

@GrowSharp
Copy link
Author

I was trying to get it work for a few hours yesterday, but unfortunately without success..
On the other side, I started reading The Book of the Runtime and discovered that I'm finally confident enough to start learning in depth how it all goes together.
I always wanted to be able to investigate these kind of issues on my own and in the process maybe become contributor. But I'm having hard time finding any good resources.. The link @danmosemsft sent me that describes how to debug CLR in lldb is not very detailed (I'm having problems with first steps).
I also can't find any resources on how to run CoreFX locally so I can debug it, fix things ex cetera.
My current goal is to learn about CoreFX, interop etc. I'm not planning on diving into stuff like Roslyn, CoreCLR, CoreRT right now, that's probably still a distant future for me.
So if you guys could please point me into the right direction, or maybe provide some more resources I would be very, very grateful. Also, I haven't written any article / tutorial maybe since I left the MSP program, so I could maybe also write some detailed tutorial for future people like me afterwards.
Thanks in advance!

@joperezr
Copy link
Member

Pushing to 6.0.0 for now as this is not required for 5.0.0 but we should finish up the investigation here.

@joperezr joperezr modified the milestones: 5.0.0, 6.0.0 Aug 28, 2020
@GrowSharp
Copy link
Author

As I said here. I think that the problem will be on our side, in our messy 12 years old AD system. I think we can close this one also.

Thank you very much for trying to help me everyone!

@ghost ghost locked as resolved and limited conversation to collaborators Dec 8, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

6 participants