Support for decimal in timestamps for logparser input #1912

Closed
njhartwell opened this issue Oct 17, 2016 · 9 comments · Fixed by #3358
Labels
area/tail, feature request, help wanted

Comments

@njhartwell

Feature Request

Proposal:

In trying to ingest Apache Traffic Server's squid-formatted log files (using logparser input or influx line protocol), there does not seem to be an easy way to get telegraf to accept millisecond-precision timestamps like 1476680409.042. I'd be happy to submit a PR if someone could verify that this is a reasonable request and suggest the best way to implement it (e.g. a new input plugin specific to ATS, some extension to the grok parser, etc.).

Current behavior:

1476680409.042 is not treated as a valid timestamp.

Desired behavior:

1476680409.042 could be parsed as a valid timestamp.

Use case:

We make extensive use of Apache Traffic Server (as do lots of people :) ), and it would be great to be able to parse its access logs natively.
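
For context, a quick Go snippet illustrating the failure mode: the ts-epoch path does a plain integer parse, so the decimal point makes it error out (a sketch of the presumed code path, not the parser's exact internals):

package main

import (
	"fmt"
	"strconv"
)

func main() {
	// ts-epoch treats the timestamp as a whole-second integer,
	// so the fractional part causes the parse to fail.
	_, err := strconv.ParseInt("1476680409.042", 10, 64)
	fmt.Println(err) // strconv.ParseInt: parsing "1476680409.042": invalid syntax
}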

@sparrc
Contributor

sparrc commented Oct 17, 2016

It should be added as a special timestamp format, similar to how ts-epoch and ts-epochnano are parsed: https://github.com/influxdata/telegraf/blob/master/plugins/inputs/logparser/grok/grok.go#L227

I would call it something like ts-epochdecimal: parse the number as a float, multiply by 1,000,000,000, convert to int, then parse it as an epochnano.

as a workaround, you could parse just the unix epoch part of the timestamp with something like this (throwing away the millisecond precision):

%{INTEGER:unixtime:ts-epoch}.%{INTEGER} ...
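
A minimal sketch of that suggestion in Go (the helper name parseEpochDecimal is hypothetical, not existing telegraf code):

package main

import (
	"fmt"
	"math"
	"strconv"
	"time"
)

// parseEpochDecimal (hypothetical) parses a decimal epoch timestamp
// such as "1476680409.042" the way suggested above: parse as a float,
// convert to integer nanoseconds, then build the time from the
// nanosecond epoch.
func parseEpochDecimal(s string) (time.Time, error) {
	f, err := strconv.ParseFloat(s, 64)
	if err != nil {
		return time.Time{}, err
	}
	nanos := int64(math.Round(f * 1e9)) // seconds (float) -> nanoseconds (int)
	return time.Unix(0, nanos).UTC(), nil
}

func main() {
	t, err := parseEpochDecimal("1476680409.042")
	if err != nil {
		panic(err)
	}
	fmt.Println(t) // 2016-10-17 05:00:09.042 +0000 UTC, give or take float64 rounding
}

Note that float64 carries only about 15-16 significant digits, so timestamps this large can end up a few hundred nanoseconds off; parsing the integer and fractional parts separately would avoid that, at the cost of a little more code.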

@sparrc sparrc added this to the 1.2.0 milestone Nov 3, 2016
@discoduck2x

@sparrc, why not make an extension to the grok parser so you can "wash" out unwanted characters, similar to logstash's mutate functionality? It doesn't feel right having to change an app's logging behaviour (often a third-party app's, where that's impossible), and right now I can't use telegraf (even though I want to), since I hit the exact same case in #1649.

@sparrc
Contributor

sparrc commented Dec 7, 2016

@discoduck2x isn't this what you're looking for? As I said above, it's simple to ignore the decimals if you'd like:

as a workaround, you could parse just the unix epoch part of the timestamp with something like this (throwing away the millisecond precision):

%{INTEGER:unixtime:ts-epoch}.%{INTEGER} ...

@discoduck2x

@sparrc that doesn't solve it at all: it gives me one-second time resolution, which is just a workaround for data that doesn't need high-resolution timestamps. That won't do, since a lot of things, especially packet-capture data and high-frequency transactional flows, can't get by with one second as the time boundary.

We need to be able to turn any arbitrary time format into epoch milliseconds/nanoseconds! There is so much out there that you don't have control over, and because of how telegraf handles this I have to use proprietary scripts to get data into InfluxDB, or use logstash, etc., which sucks! I want to use telegraf, I just can't, yet :)

@sparrc sparrc modified the milestones: Future Milestone, 1.2.0 Dec 7, 2016
@sparrc
Contributor

sparrc commented Dec 7, 2016

@discoduck2x it's a workaround. It really wouldn't be difficult at all for someone to fix this with a PR; I think you can understand that I don't have time to accommodate every single request.

If you want it prioritized, of course you could contact [email protected] ;-)

@discoduck2x

@sparrc, but it isn't a workaround. I look for serialization delays in operations that take X microseconds to complete, and if, say, 500 of these arrive on the wire within the same second, having all of them piled up with the same second timestamp does me no good. Not a workaround.

I wish I knew a developer who could PR this for me... I totally understand you can't jump on every request.

Unfortunately, back to logstash for now.

@discoduck2x

Oh @njhartwell, I didn't see at first that you suggested you could hack this up. Can you? I'm so in your debt if you do :)

@danielnelson
Contributor

@discoduck2x Can you add an example of how this would look in logstash configuration using their mutate functionality?

@danielnelson danielnelson added the feature request and help wanted labels Apr 12, 2017
@discoduck2x

@danielnelson sorry for the late reply. Here's how I'm currently getting around it with logstash (probably not the best way to do it, but it works for me):

input data:
0655050622.123000,brandon,1337
0655050629.456000,brenda,90210
1492839758.12345678,kelly,666
0655018483111,steve,911

logstash.conf:

filter {

        # extract the raw timestamp, host, and value fields
        grok    {
                match => { "message" => "%{DATA:time1},%{WORD:host1},%{WORD:value:int}" }
        }

        # strip the decimal point from the timestamp; copy host1 into host
        mutate  {
                gsub => [ "time1", "\.", "" ]
                replace => [ "host", "%{host1}" ]
        }

        # keep only the first 13 digits (epoch milliseconds)
        grok    {
                match => { "time1" => "(?<time2>^[0-9]{13})" }
        }

        # parse the 13-digit value as a UNIX_MS timestamp
        date    {
                match => [ "time2", "UNIX_MS" ]
                target => "@timestamp"
        }

        # drop the intermediate fields
        mutate  {
                remove_field => [ "message", "time1", "time2", "host1" ]
        }
}

output  {
        stdout { codec => rubydebug }
}

which produces the following output:

{
          "path" => "/opt/epoch.txt",
    "@timestamp" => 1990-10-04T14:30:22.123Z,
      "@version" => "1",
          "host" => "brandon",
         "value" => 1337
}
{
          "path" => "/opt/epoch.txt",
    "@timestamp" => 1990-10-04T14:30:29.456Z,
      "@version" => "1",
          "host" => "brenda",
         "value" => 90210
}
{
          "path" => "/opt/epoch.txt",
    "@timestamp" => 2017-04-22T05:42:38.123Z,
      "@version" => "1",
          "host" => "kelly",
         "value" => 666
}
{
          "path" => "/opt/epoch.txt",
    "@timestamp" => 1990-10-04T05:34:43.111Z,
      "@version" => "1",
          "host" => "steve",
         "value" => 911
}
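
For comparison, here is a minimal Go sketch of the same normalization that filter chain performs (strip the decimal point, require and keep the first 13 digits, parse as epoch milliseconds); normalizeEpoch is a hypothetical name for illustration:

package main

import (
	"fmt"
	"strconv"
	"strings"
	"time"
)

// normalizeEpoch (hypothetical) mimics the logstash pipeline above:
// remove the decimal point, keep the first 13 digits, and parse the
// result as epoch milliseconds.
func normalizeEpoch(raw string) (time.Time, error) {
	digits := strings.ReplaceAll(raw, ".", "") // gsub => [ "time1", "\.", "" ]
	if len(digits) < 13 {
		return time.Time{}, fmt.Errorf("need at least 13 digits, got %q", digits)
	}
	digits = digits[:13] // (?<time2>^[0-9]{13})
	ms, err := strconv.ParseInt(digits, 10, 64)
	if err != nil {
		return time.Time{}, err
	}
	return time.UnixMilli(ms).UTC(), nil // date { match => [ "time2", "UNIX_MS" ] }
}

func main() {
	for _, raw := range []string{
		"0655050622.123000",
		"0655050629.456000",
		"1492839758.12345678",
		"0655018483111",
	} {
		t, _ := normalizeEpoch(raw)
		fmt.Println(t) // matches the @timestamp values above
	}
}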
