Halve the buffer counts when tracing to circular buffers
On some machines (including my awesome 24-core 48-thread Z840
workstation) tracing to circular memory buffers has, for a long time,
been virtually useless. If tracing has been running for a while then
saving a trace can take many minutes. It should not take longer than
30-60 seconds.

The problem has been reported to Microsoft. All I know is that
EtwpCopyLogHeader ends up calling ReadFile(1 MiB) hundreds of thousands
of times, even though the file being read is only about 600 MiB.
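
For scale, here is a rough sketch of that arithmetic. The helper names are hypothetical, and the call count of 200,000 is only an assumed stand-in for "hundreds of thousands", since the exact figure is not known. It also assumes every call returns a full 1 MiB chunk, which may not be the case.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers for illustration only; not part of UIforETW or ETW.

// Minimum number of fixed-size reads needed to cover a file once.
// Rounds up so that a partial final chunk still counts as one read.
inline std::uint64_t MinReadsToCover(std::uint64_t fileMiB, std::uint64_t readMiB)
{
    assert(readMiB > 0);
    return (fileMiB + readMiB - 1) / readMiB;
}

// How many full passes over the file a given number of read calls
// would amount to, assuming each call returns a full chunk.
inline std::uint64_t ReadAmplification(std::uint64_t readCalls,
                                       std::uint64_t fileMiB,
                                       std::uint64_t readMiB)
{
    const std::uint64_t onePass = MinReadsToCover(fileMiB, readMiB);
    assert(onePass > 0);
    return readCalls / onePass;
}
```

With a 600 MiB file and 1 MiB reads, `MinReadsToCover(600, 1)` is 600, so a single pass needs only 600 calls. Under the assumed 200,000 calls, `ReadAmplification(200000, 600, 1)` comes out at 333 passes over the same data.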

And I know that halving the number of buffers seems to help a lot.

And I know that you can run bin\metatrace.bat to trace the trace saving
process, which *never* ceases to amuse me. This is what allowed me to
give an awesomely detailed bug report (see the EtwpCopyLogHeader
paragraph) and to find that the trace saving process was CPU bound in
the kernel in memcpy. Yep, memcpy.

That deserves repeating. KernelBase.dll!ReadFile is CPU bound in memcpy
for ~99% of the trace saving time. How cool/horrible is that?

The maximum kernel/user buffer sizes for circular-buffer memory tracing
in UIforETW used to be 600/100 MiB, leading to a 700 MiB trace before
compression. That is actually larger than desired, in most cases, so
reducing this to 300/50 MiB could be good on multiple levels. We will
see.
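
The size arithmetic can be sketched as follows. This is a minimal illustration, assuming the per-buffer size stays fixed so that halving the buffer count halves the total size. The struct and function names are hypothetical and do not appear in UIforETW.

```cpp
#include <cassert>

// Hypothetical type for illustration; UIforETW does not necessarily
// store its buffer sizes this way.
struct BufferSizesMiB
{
    int kernel;
    int user;

    // Combined size of the trace before compression.
    int Total() const { return kernel + user; }
};

// Halve both sizes, which is the effect of halving the buffer counts
// when each buffer keeps the same size (an assumption of this sketch).
inline BufferSizesMiB Halve(BufferSizesMiB sizes)
{
    assert(sizes.kernel >= 0 && sizes.user >= 0);
    return BufferSizesMiB{ sizes.kernel / 2, sizes.user / 2 };
}
```

Starting from the old maximums, `Halve(BufferSizesMiB{600, 100})` gives 300/50 MiB, so the uncompressed trace drops from 700 MiB to 350 MiB.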

I should really add options to the settings dialog to scale the buffer
sizes for different scenarios, but not today. Pull requests welcome, as
always.
randomascii committed Jan 29, 2017
1 parent 9a7b897 commit 93d8435
Showing 1 changed file with 5 additions and 1 deletion.
6 changes: 5 additions & 1 deletion UIforETW/Support.cpp
@@ -163,8 +163,12 @@ void CUIforETWDlg::TransferSettings(bool saving)
 // a larger boost.
 int CUIforETWDlg::BufferCountBoost(int requestCount) const
 {
+    // Saving traces from circular buffers in memory seems to be really
+    // slow on some (dual socket?) machines and the 600 MB on medium to
+    // large memory machines is excessive anyway - who wants traces that
+    // big. So, this neatly halves the buffer sizes.
     if (tracingMode_ == kTracingToMemory)
-        return requestCount;
+        return requestCount / 2;

     int numerator = 1;
     int denominator = 1;