This repository has been archived by the owner on Jun 29, 2023. It is now read-only.

Improve additional field type discovery #121

Closed
mp911de opened this issue Oct 27, 2017 · 1 comment
Labels
type: enhancement A general enhancement

Comments

@mp911de
Owner

mp911de commented Oct 27, 2017

Right now, type discovery for additional fields in discovery mode is expensive when the value turns out to be a string or a double, because each failed parse attempt is signalled with an exception.

A benchmark shows this cost (scores use a decimal comma):

Benchmark                                   Mode  Cnt     Score    Error  Units
GelfMessageBenchmark.configuredDoubleField  avgt    5    52,522 ±  1,848  ns/op
GelfMessageBenchmark.configuredLongField    avgt    5    36,982 ±  0,614  ns/op
GelfMessageBenchmark.configuredStringField  avgt    5    18,341 ±  0,324  ns/op
GelfMessageBenchmark.discoverDoubleField    avgt    5  1520,149 ± 88,000  ns/op
GelfMessageBenchmark.discoverLongField      avgt    5    25,537 ±  0,131  ns/op
GelfMessageBenchmark.discoverStringField    avgt    5  2714,702 ± 49,806  ns/op
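
To make the cost pattern concrete, here is a minimal sketch of exception-driven discovery. It is an illustration only, not the actual logstash-gelf code: the class `ExceptionDrivenDiscovery` and the method `discover` are hypothetical names, and the try-long-then-double order is an assumption. Under that assumption a long value parses on the first attempt, a double value pays for one thrown `NumberFormatException`, and a plain string pays for two, which would be consistent with the ordering of the `discover*` scores above.

```java
// Hypothetical illustration; not the library's implementation.
final class ExceptionDrivenDiscovery {

    // Tries long first, then double, and finally keeps the raw string.
    // Every failed attempt costs one NumberFormatException.
    static Object discover(String value) {
        try {
            return Long.parseLong(value);
        } catch (NumberFormatException notALong) {
            // Not a long: fall through to the double attempt.
        }
        try {
            return Double.parseDouble(value);
        } catch (NumberFormatException notADouble) {
            // Neither long nor double: treat the value as a string.
            return value;
        }
    }

    public static void main(String[] args) {
        System.out.println(discover("42").getClass().getSimpleName());
        System.out.println(discover("3.14").getClass().getSimpleName());
        System.out.println(discover("hello").getClass().getSimpleName());
    }
}
```

Creating an exception is costly mainly because the JVM has to capture a stack trace for it, so using exceptions for an outcome as common as "this field is a string" sits on the hot path of every log event.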
@mp911de mp911de added the type: enhancement A general enhancement label Oct 27, 2017
@mp911de mp911de added this to the logstash-gelf 1.11.2 milestone Oct 27, 2017
@mp911de
Owner Author

mp911de commented Oct 27, 2017

Scanning the characters of a value ourselves to determine the most appropriate type before parsing could reduce parsing time.

Benchmark                                   Mode  Cnt   Score   Error  Units
GelfMessageBenchmark.configuredDoubleField  avgt    5  53,310 ± 3,263  ns/op
GelfMessageBenchmark.configuredLongField    avgt    5  39,175 ± 1,899  ns/op
GelfMessageBenchmark.configuredStringField  avgt    5  19,136 ± 0,771  ns/op
GelfMessageBenchmark.discoverDoubleField    avgt    5  56,483 ± 4,365  ns/op
GelfMessageBenchmark.discoverLongField      avgt    5  29,863 ± 3,405  ns/op
GelfMessageBenchmark.discoverStringField    avgt    5   6,131 ± 0,481  ns/op

mp911de added a commit that referenced this issue Oct 27, 2017
Additional field values in discovery mode are now inspected before the actual parsing to determine whether the value qualifies for long/double parsing. Empty values, values exceeding 32 chars and those containing String-only chars fall back directly to string. Long values (containing only +, - and 0-9) are parsed as such directly, and double values don't require long parsing anymore.

Parsing still falls back through the layers if the discovery yielded a different result than the parser understands (applies especially for hex and scientific notation double values).
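
The approach described in this commit message can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the committed code: the names `ScanningDiscovery`, `Kind`, `scan` and `discover` are hypothetical, and the set of characters that still allows a double (`.`, `e`, `E`) is an assumption, so hex notation is not covered here. The 32-character limit and the `+`, `-`, `0-9` rule for longs come from the message above.

```java
// Hypothetical illustration; not the library's implementation.
final class ScanningDiscovery {

    enum Kind { STRING, LONG, DOUBLE }

    // Limit taken from the commit message: longer values go straight to string.
    static final int MAX_NUMERIC_LENGTH = 32;

    // Classifies a value by looking at its characters only, without parsing.
    static Kind scan(String value) {
        if (value.isEmpty() || value.length() > MAX_NUMERIC_LENGTH) {
            return Kind.STRING;
        }
        boolean longCharsOnly = true;
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            if ((c >= '0' && c <= '9') || c == '+' || c == '-') {
                continue; // valid in both long and double candidates
            }
            if (c == '.' || c == 'e' || c == 'E') {
                longCharsOnly = false; // assumed double-only characters
                continue;
            }
            return Kind.STRING; // any other character rules out a number
        }
        return longCharsOnly ? Kind.LONG : Kind.DOUBLE;
    }

    // Parses according to the scan result and falls back through the layers
    // (long -> double -> string) when the parser disagrees with the scan.
    static Object discover(String value) {
        Kind kind = scan(value);
        if (kind == Kind.LONG) {
            try {
                return Long.parseLong(value);
            } catch (NumberFormatException e) {
                kind = Kind.DOUBLE; // e.g. overflow or a misplaced sign
            }
        }
        if (kind == Kind.DOUBLE) {
            try {
                return Double.parseDouble(value);
            } catch (NumberFormatException e) {
                // Scan was too optimistic: keep the raw string.
            }
        }
        return value;
    }
}
```

With this shape, `ScanningDiscovery.discover("hello")` returns the string after a single character check and throws no exception, while an input such as `"1-2"` passes the scan as a long candidate and only then falls back through both parsers to a string.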
@mp911de mp911de closed this as completed Oct 27, 2017