-
Notifications
You must be signed in to change notification settings - Fork 196
Scoped variables of different types produces issues #184
Comments
Clarifications:
So here, we probably have a Property named "2" which I can see has been detected as a @nblumhardt what does Seq do to deal with properties of different types? |
@LordMike Seq was designed to handle this case, so no issue over there 😎 |
Well blast. |
Ok. So just to get this clear: This is a problem that will affect all users of the ES sink, if they use properties of different types. It should also affect users who use Serilog directly, as their event properties can change between Scalar values (which could be of any type) and f.ex. sequences (which again could contain one uniform type or many different types). I would say this is a pretty critical thing for this sink. Wouldn't you think so, @nblumhardt ? |
I've tested using Serilog only. static void Main(string[] args)
{
SelfLog.Enable(Console.Out);
using (var logger = new LoggerConfiguration()
.WriteTo.Elasticsearch(new ElasticsearchSinkOptions(new Uri("http://localhost:9200"))
{
IndexFormat = "test-logstash-{0:yyyy.MM.dd}"
})
.CreateLogger())
{
logger.Information("Hello world {MyProperty}, {MyOtherProperty}", new object[] { 4, "string" }, 66);
}
} Error (selflog): And also: static void Main(string[] args)
{
SelfLog.Enable(Console.Out);
using (var logger = new LoggerConfiguration()
.WriteTo.Elasticsearch(new ElasticsearchSinkOptions(new Uri("http://localhost:9200"))
{
IndexFormat = "test2-logstash-{0:yyyy.MM.dd}"
})
.CreateLogger())
{
logger.Information("Hello world {MyProperty}", 4);
logger.Information("Hello world {MyProperty}", "string");
}
} Error: |
Relevant: #47 |
Thanks for the follow-up @LordMike. As I understand it, this is a limitation on the Elastic side rather than a Serilog issue - I don't think there's much that can be done about it at this end. |
True. One drastic approach would be to ensure that all properties always are of the same type (render them to strings f.ex.) .. As I see it right now. The ES sink is useless - it loses logs "randomly", and should not be used with structured logging. |
Elasticsearch has some sort of type system (mapping based on a template when a new index is created), so storing an integer and an object in the same field (in the index) will not work. I agree with you that it is not explicit when submitting the logs to a sink like ES and it will fail silently (unless you listen to the errors returned from the bulk post inside the sink). It is all a tradeoff between speed (the bulk insert) and convenience and inside your code, you need to be careful naming your properties. I know some people implemented their own serializer to be swapped in the sink, that would effectively add the type to the field name (e.g. string_field1 etc). My best advice; carefully design your log messages anyway so they make sense to query on. ES also has a default max of the number of fields inside an index. So just throwing in bunches of different fields might fail anyway at some point. |
We've taken a stab at this to get this rolling, where we won't have to redo our logging .. yet ... but while still keeping ES running with useful queries. This custom formatter will rename Properties given their data type. So "MyValue" which contains an integer, will become "IMyValue" - thereby removing conflicts in ES. I'm putting the code here, both to get @mivano & @nblumhardt 's feedback, but also so others can benefit from it. Since this issue will always be a problem in ES sinks, it could also make sense to embed this as a default offering in the ES sink. This code currently only operates on ScalarValues, and not sequences. Sequences would be great to support, but it's more code than I care to write right now, since we don't need it :P. internal class EsPropertyTypeNamedFormatter : ElasticsearchJsonFormatter
{
// Use a property writer that can change the property names and values before writing to ES
protected override void WritePropertiesValues(IReadOnlyDictionary<string, LogEventPropertyValue> properties, TextWriter output)
{
string precedingDelimiter = "";
foreach (KeyValuePair<string, LogEventPropertyValue> property in properties)
{
char type;
object value;
// Modify property name
if (property.Value is ScalarValue asScalar)
{
if (asScalar.Value is DateTime || asScalar.Value is DateTimeOffset)
{
type = 'D';
value = asScalar.Value;
}
else if (asScalar.Value is long || asScalar.Value is int || asScalar.Value is short || asScalar.Value is byte ||
asScalar.Value is ulong || asScalar.Value is uint || asScalar.Value is ushort || asScalar.Value is sbyte)
{
type = 'I';
value = asScalar.Value;
}
else if (asScalar.Value is float || asScalar.Value is double || asScalar.Value is decimal)
{
type = 'F';
value = asScalar.Value;
}
else
{
type = 'S';
value = asScalar.Value?.ToString();
}
}
// TODO: Support sequences, problem: Sequences can contain elements of different types
// OTOH: all properties in ES are sequences already, so it'd be great to utilize that -- some way of saying "if all elements are datetimes; store as datetimes"
//else if (property.Value is SequenceValue asSequence)
//{
// // Check types
// Type type = null;
// foreach (var item in asSequence.Elements)
// {
// }
//}
else
{
// Convert to string
type = 'S';
value = property.Value?.ToString();
}
string key = type + property.Key;
WriteJsonProperty(key, value, ref precedingDelimiter, output);
}
}
} Use by setting the new ElasticsearchSinkOptions(config.Uri)
{
CustomFormatter = new EsPropertyTypeNamedFormatter()
} |
Thanks for sharing the code, @LordMike! Guessing that anyone who wants to use this will be able to modify it to handle whatever data types they care about. |
Nice solution, thanks for sharing. I will create a link to this issue so it is clear what the behavior is. There might be more sinks having this issue where the mapping is not set per document/record, but rather on the container the document is in. |
I've solved it following way: if sequence contains values of the same type I store them as is (satisfies "if all elements are datetimes; store as datetimes"), otherwise store all elements as strings (i.e., keywords). |
Thanks for sharing the code @LordMike! |
Does this issue relate to a new feature or an existing bug?
What version of Serilog.Sinks.Elasticsearch is affected? Please list the related NuGet package.
6.3.0
What is the target framework and operating system? See target frameworks & net standard matrix.
Please describe the current behavior?
If I run code like this:
I will get an error. In ES, the series of scope values are simply added as an array, so ES will figure out some type to support it. In our case, we had a number of scopes that were
Dictionary<string, string>
, so theScope
property ended up beingobject
.Then later, we had the string like the one above, and we got an error like this (string cannot be converted to
Object
):We found the issue as we in one place used the
Dictionary<string, string>
and elsewhere used strings - so some logs would simply not be logged (and as these are scopes, it can be whole series of logs -- anything within the scope)..The bug also goes the other way around. If the Scope value is initially some simple type like a string, ES will auto-map to a
text
property, and then any log with a scope value that is a complex value will fail.Please describe the expected behavior?
So. I know that using
Dictionary<string, object>
will "work as expected", as it doesn't add toScope
but adds the individual properties instead.Could we either:
Dictionary<string, T>
specially, so theT
type is always handled (converted to string?)The text was updated successfully, but these errors were encountered: