Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add logging buffering #5635

Draft
wants to merge 15 commits into
base: main
Choose a base branch
from

Conversation

evgenyfedorov2
Copy link
Contributor

@evgenyfedorov2 evgenyfedorov2 commented Nov 13, 2024

Related to the #5123 proposal, this PR is focused on the logging buffering only. See also the sampling part #5574.

Microsoft Reviewers: Open in CodeFlow

@evgenyfedorov2 evgenyfedorov2 force-pushed the evgenyfedorov2/log_buffering branch 2 times, most recently from e40b7b6 to 9d61d12 Compare December 12, 2024 16:30
@evgenyfedorov2 evgenyfedorov2 force-pushed the evgenyfedorov2/log_buffering branch from 9d61d12 to 2f1a335 Compare December 12, 2024 17:36
/// Defines a rule used to filter log messages for purposes of futher buffering.
/// </summary>
[Experimental(diagnosticId: DiagnosticIds.Experiments.Telemetry, UrlFormat = DiagnosticIds.UrlFormat)]
public class BufferFilterRule : ILoggerFilterRule
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need to also support filtering per name/value pair. So that you can control buffering based on things like cluster name or region.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would cluster name and region be key-value pairs that are varying within the same process? Based on the names those sound like properties that would be fixed at process startup and remain constant throughout. This means all the decision making could occur at startup time when defining the filter rules rather than a decision that needs to be made per-log message later on. For example could you satisfy the scenario with some startup code like this?

logging.AddBuffering(buffering =>
{
    string region = GetRegionFromConfig();
    // in the test region we want to buffer some log messages that won't be buffered
    // in other regions
    if(region == "TestRegion")
    {
        buffer.AddFilter("ExperimentalFeatures");
    }
    ...
});

I'm not opposed to other things, I'm just trying to better understand what the scenario is.

Address PR comments
Add .NET 8 support
Add ExceptionJsonConverter
@evgenyfedorov2 evgenyfedorov2 force-pushed the evgenyfedorov2/log_buffering branch from 883334d to 022f00c Compare December 16, 2024 14:23
@evgenyfedorov2 evgenyfedorov2 force-pushed the evgenyfedorov2/log_buffering branch from 022f00c to a371d9c Compare December 16, 2024 14:51
@dotnet-comment-bot
Copy link
Collaborator

‼️ Found issues ‼️

Project Coverage Type Expected Actual
Microsoft.AspNetCore.Diagnostics.Middleware Line 100 96.03 🔻
Microsoft.AspNetCore.Diagnostics.Middleware Branch 100 98.51 🔻
Microsoft.Extensions.Telemetry Line 93 86.2 🔻
Microsoft.Extensions.Telemetry Branch 93 84.59 🔻
Microsoft.Extensions.Diagnostics.Testing Line 99 85.78 🔻
Microsoft.Extensions.Diagnostics.Testing Branch 99 79.57 🔻

🎉 Good job! The coverage increased 🎉
Update MinCodeCoverage in the project files.

Project Expected Actual
Microsoft.Extensions.Diagnostics.Probes 70 76
Microsoft.Extensions.Caching.Hybrid 75 86
Microsoft.Extensions.AI.OpenAI 72 77

Full code coverage report: https://dev.azure.com/dnceng-public/public/_build/results?buildId=896358&view=codecoverage-tab

{
case ModernTagJoiner modernTagJoiner:
_buffer.Enqueue(new SerializedLogRecord(logLevel, eventId, _timeProvider.GetUtcNow(), modernTagJoiner, exception,
((Func<ModernTagJoiner, Exception?, string>)(object)formatter)(modernTagJoiner, exception)));

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this casting more performant than just formatter(attributes, exception)?

Comment on lines 63 to 77
public void Flush()
{
var result = _buffer.ToArray();

#if NETFRAMEWORK
lock (_netfxBufferLocker)
{
while (_buffer.TryDequeue(out _))
{
// Clear the buffer
}
}
#else
_buffer.Clear();
#endif

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This loses log entries if another thread enqueues more of them between _buffer.ToArray() and _buffer.Clear(). That is more likely to happen on .NET Framework, where clearing the queue takes more time.

Copy link
Contributor Author

@evgenyfedorov2 evgenyfedorov2 Jan 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved the _lastFlushTimestamp update to the top, so that when a flush operation is initiated at one thread, another thread will skip buffering for some period of time because of the non-zero value of SuspendAfterFlushDuration option. Don't expect someone to set it to zero because it does not make sense to emit some buffered logs and buffer other at almost the same time.

{
while (!cancellationToken.IsCancellationRequested)
{
await _timeProvider.Delay(_options.CurrentValue.Duration, cancellationToken).ConfigureAwait(false);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIRC, IOptionsMonitor<TOptions>.CurrentValue throws if the option is bound to configuration and the binding fails (parsing fails or a validator rejects the value); this could even happen when an operator modifies a configuration file after the application has already started. The BackgroundService would then fail and, depending on HostOptions.BackgroundServiceExceptionBehavior, either log buffer truncation would stop working or the whole application would terminate. Should this instead keep using the previous value of the options in that case?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes sense, but I will take another look at it later on, possibly after finalizing public API. Keeping this open until then

Comment on lines 72 to 77
logger.Log(
serializedRecord.LogLevel,
serializedRecord.EventId,
serializedRecord.Attributes,
exception,
(_, _) => serializedRecord.FormattedMessage ?? string.Empty);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This allocates a closure and a delegate for each ILogger.Log<TState> call. I can think of two ways to minimize the allocations:

  • Instead of serializedRecord.Attributes, pass serializedRecord itself as the state to ILogger.Log<TState>. Implement IReadOnlyList<KeyValuePair<string, object?>> in SerializedLogRecord by calling the corresponding methods of public IReadOnlyList<KeyValuePair<string, string>> Attributes.
  • Alternatively, keep using the same delegate across all iterations of both foreach loops; declare the serializedRecord variable and a delegate variable before the outer loop. That way, it would allocate only one closure and one delegate per BufferSink.LogRecords call, no matter how many log records are in IEnumerable<T> serializedRecords. This is safe because ILogger.Log<TState> is not allowed to save the formatter delegate for later calling.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

makes sense for optimizations later, thank you

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code moved from BufferSink to BufferedLoggerProxy in d7661a6. The BufferedLogRecord instances are not pooled, so each BufferedLogRecord call costs at least one allocation per BufferedLogRecord; that may make the delegate and closure allocations less important.

using Microsoft.Extensions.ObjectPool;

namespace Microsoft.Extensions.Diagnostics.Buffering;
internal sealed class PooledLogRecord : BufferedLogRecord, IResettable

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add DebuggerDisplayAttribute so that the list of log records in a buffer can be viewed more easily.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or add the DebuggerDisplayAttribute to SerializedLogRecord, instead.

@evgenyfedorov2 evgenyfedorov2 force-pushed the evgenyfedorov2/log_buffering branch from fa52316 to fe00658 Compare January 14, 2025 11:03
@dotnet-comment-bot
Copy link
Collaborator

‼️ Found issues ‼️

Project Coverage Type Expected Actual
Microsoft.Extensions.Caching.Hybrid Line 86 84.76 🔻
Microsoft.AspNetCore.Diagnostics.Middleware Line 100 96.44 🔻
Microsoft.Extensions.Telemetry Line 93 90.62 🔻
Microsoft.Extensions.Telemetry Branch 93 91.16 🔻

🎉 Good job! The coverage increased 🎉
Update MinCodeCoverage in the project files.

Project Expected Actual
Microsoft.Extensions.AI 88 89

Full code coverage report: https://dev.azure.com/dnceng-public/public/_build/results?buildId=916781&view=codecoverage-tab

{
foreach (KeyValuePair<string, object?> ruleAttribute in rule.Attributes)
{
if (!attributes.Contains(ruleAttribute))

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can the object? in the KeyValuePair<string, object?> be a list; should this have some kind of structural equality comparison then? That seems complex.

Microsoft.Extensions.Logging.LoggerFilterRule has Func<string?, string?, LogLevel, bool>? Filter, which lets the application developer implement the logic as desired. A similar callback could work here.

Copy link
Contributor Author

@evgenyfedorov2 evgenyfedorov2 Jan 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated with introducing the Filter delegate

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Microsoft.Extensions.Logging.LoggerFilterRule has Func<string?, string?, LogLevel, bool>? Filter, which lets the application developer implement the logic as desired. A similar callback could work here.

Actually, even though the Func<> delegate is more flexible, we need to stick to IReadOnlyList<KeyValuePair<string, object?>> because it can be configured and bound from appsettings.json, while the Func<> delegate cannot because it is not serializable (at least, not to a human-readable format). The ability to bind from JSON config and refresh the config in runtime is the key feature of this component. Potentially we can add Func<> later on if users ask for it.

Can the object? in the KeyValuePair<string, object?> be a list; should this have some kind of structural equality comparison then? That seems complex.

Let's stick to the default equality comparer for now and consider accepting custom comparers if users ask for it.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The rules configuration of the standard LoggerFactory does not use the standard configuration binder. Instead, it explicitly reads values from IConfiguration:

https://github.com/dotnet/runtime/blob/af25787dac9b2cba8a2d4a830f280e59782878a8/src/libraries/Microsoft.Extensions.Logging.Configuration/src/LoggerFilterConfigureOptions.cs

I think it would be possible to construct the filter delegate from configuration values in a similar fashion, while allowing the application developer to replace or override this conversion.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe you'll need some custom logic there anyway, if attribute values written to ILogger can be numbers but configuration provides only strings.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although possibly it can later be extended to:

   "Attributes": [
          {
            "Key": "Duration",
            "Type": "System.Double",
            "Comparison": ">",
            "Value": "2.1"
          }
        ]

where the library would map type names like "System.Double" to objects that know how to parse "2.1" to (double)2.1 and how to implement the ">" comparison. This would use neither Type.GetType(string) nor System.ComponentModel.TypeConverter.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

8319dc8 changed tests:

  • AddGlobalBuffer_WithConfiguration_RegistersInDI asserts that new KeyValuePair<string, object?>("priority", 1) is included in the attributes of the rule. I expect that this test will fail because values in IConfiguration are always strings and Microsoft.Extensions.Configuration.Binder cannot know that "1" should be converted to Int32.
  • CanWorkWithValueTypeAttributes asserts that BufferFilterRuleSelector correctly compares the Int32 value of an attribute that is already in a rule.

AFAICT, these tests do not verify that, if ILogger.Log is given a state object that includes an Int32 attribute value, then the value is forwarded to BufferFilterRuleSelector as Int32 rather than first converted to String.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I expect that this test will fail because values in IConfiguration are always strings and Microsoft.Extensions.Configuration.Binder cannot know that "1" should be converted to Int32.

It is not failing. Unless you are talking about some different situation?

Because this PR only supports equality comparisons, rather than < and >, I think floating point is out of scope.

Supporting float numbers and their comparison is not in scope of this feature, at least not in the initial release. For now, for comparison purposes we treat both config and log state as key/value pairs of strings

Copy link

@KalleOlaviNiemitalo KalleOlaviNiemitalo Jan 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see: GlobalBufferLoggerBuilderExtensionsTests.AddGlobalBuffer_WithConfiguration_RegistersInDI calls Assert.Equivalent(expectedData, options.CurrentValue.Rules), which tries to convert the actual value to the type of the expected value or vice versa. So the assertion succeeds and the test passes even though expectedData[0].Attributes[1].Value is the integer 1 and options.CurrentValue.Rules[0].Attributes[1].Value is the string "1". But BufferFilterRuleSelector is not so lenient.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a custom equality comparer

@dotnet-comment-bot
Copy link
Collaborator

‼️ Found issues ‼️

Project Coverage Type Expected Actual
Microsoft.Extensions.Telemetry Line 93 91.66 🔻
Microsoft.Extensions.Telemetry Branch 93 92.95 🔻
Microsoft.AspNetCore.Diagnostics.Middleware Line 100 96.04 🔻
Microsoft.AspNetCore.Diagnostics.Middleware Branch 100 98.51 🔻

🎉 Good job! The coverage increased 🎉
Update MinCodeCoverage in the project files.

Project Expected Actual
Microsoft.Extensions.AI 88 89
Microsoft.Extensions.AI.OpenAI 77 78

Full code coverage report: https://dev.azure.com/dnceng-public/public/_build/results?buildId=918352&view=codecoverage-tab

@evgenyfedorov2 evgenyfedorov2 force-pushed the evgenyfedorov2/log_buffering branch from 433ede2 to e5905b1 Compare January 16, 2025 09:58
@evgenyfedorov2 evgenyfedorov2 force-pushed the evgenyfedorov2/log_buffering branch from e5905b1 to 5fc421c Compare January 16, 2025 15:05
@dotnet-comment-bot
Copy link
Collaborator

‼️ Found issues ‼️

Project Coverage Type Expected Actual
Microsoft.Extensions.Caching.Hybrid Line 86 77.82 🔻
Microsoft.AspNetCore.Diagnostics.Middleware Line 100 96.04 🔻
Microsoft.AspNetCore.Diagnostics.Middleware Branch 100 98.51 🔻
Microsoft.Extensions.Telemetry Line 93 91.85 🔻

🎉 Good job! The coverage increased 🎉
Update MinCodeCoverage in the project files.

Project Expected Actual
Microsoft.Extensions.AI 88 89
Microsoft.Extensions.AI.OpenAI 77 78

Full code coverage report: https://dev.azure.com/dnceng-public/public/_build/results?buildId=919881&view=codecoverage-tab

Category = categoryName;
LogLevel = logLevel;
EventId = eventId;
Filter = filter ?? ((_, _, _, _) => true);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be more efficient to leave Filter as null and not call the filter in that case. I expect that most rules won't have a filter specified and the null check will be well predicted by processors. It would also make rules easier to understand in a debugger as actual filters would stand out better.

public int? EventId { get; set; }

/// <inheritdoc/>
public Func<string?, LogLevel?, EventId?, IReadOnlyList<KeyValuePair<string, object?>>, bool> Filter { get; }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are there any callsites where Filter is invoked with a null category string, null log level, or null EventId? If yes thats a little surprising but fine as long as we define what that would mean. If no then those parameter types don't need to be nullable.

{
await context.Request.Body.DrainAsync(default);

// normally, this would be a Middleware and IHttpRequestBufferManager would be injected via constructor
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

await next(context).ConfigureAwait(false); 
}

This is the spot in the middleware where I would expect the majority of logging would occur. Right now it looks like the example code would not buffer that common case logging.

@evgenyfedorov2 evgenyfedorov2 force-pushed the evgenyfedorov2/log_buffering branch from 8b8558a to e96277f Compare January 21, 2025 14:11
@dotnet-comment-bot
Copy link
Collaborator

‼️ Found issues ‼️

Project Coverage Type Expected Actual
Microsoft.AspNetCore.Diagnostics.Middleware Line 100 96.22 🔻
Microsoft.AspNetCore.Diagnostics.Middleware Branch 100 98.48 🔻
Microsoft.Extensions.Caching.Hybrid Line 86 77.82 🔻
Microsoft.Extensions.Telemetry Line 93 91.91 🔻

🎉 Good job! The coverage increased 🎉
Update MinCodeCoverage in the project files.

Project Expected Actual
Microsoft.Extensions.AI.OpenAI 77 78
Microsoft.Extensions.AI 88 89

Full code coverage report: https://dev.azure.com/dnceng-public/public/_build/results?buildId=923995&view=codecoverage-tab

@tarekgh
Copy link
Member

tarekgh commented Jan 21, 2025

I asked this in one of comments but didn't see answer. Are we going to take these APIs to the design review?

@evgenyfedorov2
Copy link
Contributor Author

I asked this in one of comments but didn't see answer. Are we going to take these APIs to the design review?

We are going to take it to API review https://apireview.net. I am not aware of design review, is it different than the API review in .NET?

@tarekgh
Copy link
Member

tarekgh commented Jan 21, 2025

I am not aware of design review, is it different than the API review in .NET?

No, should be same. You may consult the doc https://github.com/dotnet/runtime/blob/main/docs/project/api-review-process.md for the details. Thanks!

@dotnet-comment-bot
Copy link
Collaborator

‼️ Found issues ‼️

Project Coverage Type Expected Actual
Microsoft.Extensions.Telemetry Line 93 91.91 🔻
Microsoft.AspNetCore.Diagnostics.Middleware Line 100 96.36 🔻
Microsoft.AspNetCore.Diagnostics.Middleware Branch 100 98.48 🔻

🎉 Good job! The coverage increased 🎉
Update MinCodeCoverage in the project files.

Project Expected Actual
Microsoft.Extensions.AI.OpenAI 77 78
Microsoft.Extensions.AI 88 89

Full code coverage report: https://dev.azure.com/dnceng-public/public/_build/results?buildId=925013&view=codecoverage-tab

@evgenyfedorov2 evgenyfedorov2 force-pushed the evgenyfedorov2/log_buffering branch from 8319dc8 to 8a91c15 Compare January 22, 2025 14:28
@dotnet-comment-bot
Copy link
Collaborator

‼️ Found issues ‼️

Project Coverage Type Expected Actual
Microsoft.Extensions.Telemetry Line 93 92 🔻
Microsoft.AspNetCore.Diagnostics.Middleware Line 100 96.04 🔻
Microsoft.AspNetCore.Diagnostics.Middleware Branch 100 98.44 🔻

🎉 Good job! The coverage increased 🎉
Update MinCodeCoverage in the project files.

Project Expected Actual
Microsoft.Extensions.AI.OpenAI 77 78
Microsoft.Extensions.AI 88 89

Full code coverage report: https://dev.azure.com/dnceng-public/public/_build/results?buildId=925328&view=codecoverage-tab


public int GetHashCode(KeyValuePair<string, object?> obj)
{
return HashCode.Combine(obj.Key, obj.Value);
Copy link

@KalleOlaviNiemitalo KalleOlaviNiemitalo Jan 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would have to be obj.Value?.ToString(), in order to be consistent with Equals.
Alternatively just throw NotImplementedException, if this is designed to be used only with Enumerable.Contains, which doesn't call GetHashCode anyway.

@evgenyfedorov2 evgenyfedorov2 force-pushed the evgenyfedorov2/log_buffering branch from d387358 to 4f524eb Compare January 22, 2025 16:03
@dotnet-comment-bot
Copy link
Collaborator

‼️ Found issues ‼️

Project Coverage Type Expected Actual
Microsoft.Extensions.Telemetry Line 93 91.87 🔻
Microsoft.AspNetCore.Diagnostics.Middleware Line 100 96.04 🔻
Microsoft.AspNetCore.Diagnostics.Middleware Branch 100 98.44 🔻

🎉 Good job! The coverage increased 🎉
Update MinCodeCoverage in the project files.

Project Expected Actual
Microsoft.Extensions.AI.OpenAI 77 78
Microsoft.Extensions.AI 88 89

Full code coverage report: https://dev.azure.com/dnceng-public/public/_build/results?buildId=925376&view=codecoverage-tab

namespace Microsoft.Extensions.Diagnostics.Buffering;

/// <summary>
/// Defines a rule used to filter log messages for purposes of futher buffering.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"further"

Comment on lines +11 to +15
/// <summary>
/// Defines a rule used to filter log messages for purposes of futher buffering.
/// </summary>
[Experimental(diagnosticId: DiagnosticIds.Experiments.Telemetry, UrlFormat = DiagnosticIds.UrlFormat)]
public class BufferFilterRule

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should have docs about what happens if a log entry matches a rule, and what happens if the log entry does not match any rule.

  • Is the log entry discarded, buffered, or written directly to underlying loggers?
  • Does it cause other log entries to be flushed from the buffer to underlying loggers? If so, does that apply to all log categories or only to the same category?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternatively, if this type is used with different semantics in different contexts, then document the behaviour in HttpRequestBufferOptions.Rules and GlobalBufferOptions.Rules, and link to those docs from here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants