Create a "logstreamer" plugin #102
👍 |
perhaps something simpler, tail a file.
This could be used in many ways. Say you want to know how many 404s nginx is returning per second, or perhaps send raw error.log messages. The log lines would be nice in Grafana once the table plugin is added. |
Where do we start? |
Tail code looks interesting, but it may even be overkill for this situation. A telegraf plugin being able to handle a constant stream of messages is something that I've implemented in the statsd plugin that has a PR open now, #237. So it's possible, but I think for this situation we might be able to just cache the position in the file, and then start reading from that position on the next call.

There is also a plugin in a PR that does exactly as @steverweber described (counting status codes of a webserver log), but I probably won't be merging it because it's very specific to that use case and the author has not written unit tests for it, see #176.

I think that, more ideally, this plugin should cover the general use case where a user can input any regex that will be counted when matched (or output a string, as @steverweber suggested). I'm thinking configuration would look something like this:
|
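As a rough sketch of the position-caching idea above (not code from any of the PRs mentioned): count regex matches only in the bytes added since the previous call. The names `MatchCounter`, `Gather` and `fromString` are hypothetical, and resetting to the start when the file shrinks is an assumption, not something settled in this thread.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"regexp"
	"strings"
)

// MatchCounter counts lines matching a regex and remembers how far it has read.
type MatchCounter struct {
	re     *regexp.Regexp
	offset int64 // position just past the last complete line already processed
}

func NewMatchCounter(pattern string) *MatchCounter {
	return &MatchCounter{re: regexp.MustCompile(pattern)}
}

// Gather scans src from the cached offset up to size and returns the number
// of newly seen lines that match. A trailing partial line is left for later.
func (c *MatchCounter) Gather(src io.ReaderAt, size int64) (int, error) {
	if size < c.offset {
		// Assumption: a shrinking file was truncated, so start over.
		c.offset = 0
	}
	r := bufio.NewReader(io.NewSectionReader(src, c.offset, size-c.offset))
	count := 0
	for {
		line, err := r.ReadString('\n')
		if err == io.EOF {
			return count, nil // incomplete line: do not advance the offset
		}
		if err != nil {
			return count, err
		}
		c.offset += int64(len(line))
		if c.re.MatchString(line) {
			count++
		}
	}
}

// fromString stands in for a log file so the sketch runs without touching disk.
func fromString(s string) (io.ReaderAt, int64) {
	return strings.NewReader(s), int64(len(s))
}

func main() {
	c := NewMatchCounter(` 404 `)
	log := "GET /a 404 10\nGET /b 200 10\n"
	// Two successive "snapshots" of a growing log.
	for _, snapshot := range []string{log, log + "GET /c 404 10\n"} {
		src, size := fromString(snapshot)
		n, err := c.Gather(src, size)
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		fmt.Println("new matches:", n)
	}
}
```

With a real file, an `*os.File` can be passed as the `io.ReaderAt`, with the size taken from `Stat()`.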
+1 |
👍 |
keep in mind the logstreamer should recover if a file is
perhaps make it so multiple logstreamers are not needed for each metric.
|
perhaps file could even be a network stream... this could open up support for syslog: |
Some of the code in heka might be helpful for UDP input. FYI: I feel telegraf's objectives would be further along if it forked or contributed to heka: https://github.com/mozilla-services/heka - http://hekad.readthedocs.org/en/v0.10.0b1/ |
How about (sample config):
The plugin recursively walks the specified directories and looks for all files that match the "mask". There are rules to parse and extract data, where regex named groups are used.
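To illustrate the named-group idea, here is a minimal Go sketch using the standard `regexp` package. The rule `accessRule`, the helper `parseLine`, and the group names `client`, `method`, `path` and `status` are made up for this example and are not taken from the sample config.

```go
package main

import (
	"fmt"
	"regexp"
)

// accessRule is a hypothetical rule; each named group becomes a tag or field.
var accessRule = regexp.MustCompile(
	`^(?P<client>\S+) .* "(?P<method>[A-Z]+) (?P<path>\S+)[^"]*" (?P<status>\d{3})`)

// parseLine returns the named groups of re found in line, keyed by group name.
// The boolean is false when the line does not match the rule at all.
func parseLine(re *regexp.Regexp, line string) (map[string]string, bool) {
	m := re.FindStringSubmatch(line)
	if m == nil {
		return nil, false
	}
	fields := make(map[string]string)
	for i, name := range re.SubexpNames() {
		if i != 0 && name != "" { // index 0 is the whole match; skip unnamed groups
			fields[name] = m[i]
		}
	}
	return fields, true
}

func main() {
	line := `10.0.0.1 - - [10/Oct/2015:13:55:36 +0000] "GET /index.html HTTP/1.1" 404 162`
	if fields, ok := parseLine(accessRule, line); ok {
		fmt.Println(fields)
	}
}
```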
|
I like the idea of reading the datetime from the log; however, I think it should be optional. Keep in mind that some time offsetting should be included to maintain the order of the log messages when not using the actual timestamps in the log. I also like the idea of including a tag or field name in the regex/rule. |
@ekini I'd like it if there was an option to specify a plain filename in addition to the "mask" |
Also, +1 to date parsing being optional, some people are only going to care about a count within an interval, not a point for every single instance of a regex match. So you should support that as well, as in my original example above |
Of course, date is optional, as well as date_format. Timestamps will be time.Now() then.
There is one more concern. If you want to cache the position in a file and parse it to the end at each Gather, what happens if the file is big? Also, what happens if telegraf gets restarted?
My test code constantly reads the files and sends parsed content to a buffered channel; each call to Gather then takes as much as possible from the channel within a specified timeout interval. |
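A minimal sketch of that buffered-channel pattern, assuming a background reader pushes lines and the gather side drains until a timeout. The names `drain` and `gatherTimeout` and the 50 ms value are illustrative only; this is not the test code referred to above.

```go
package main

import (
	"fmt"
	"time"
)

// gatherTimeout is an arbitrary value chosen for this sketch.
const gatherTimeout = 50 * time.Millisecond

// drain collects lines from ch until the timeout expires or ch is closed.
func drain(ch <-chan string, timeout time.Duration) []string {
	var out []string
	deadline := time.After(timeout)
	for {
		select {
		case line, ok := <-ch:
			if !ok {
				return out // the reader closed the channel
			}
			out = append(out, line)
		case <-deadline:
			return out // time is up; the rest waits for the next call
		}
	}
}

func main() {
	lines := make(chan string, 100)
	// Stand-in for the goroutine that constantly reads and parses files.
	go func() {
		for i := 0; i < 3; i++ {
			lines <- fmt.Sprintf("parsed line %d", i)
		}
	}()
	fmt.Println(drain(lines, gatherTimeout))
}
```

One consequence of this design is that a full channel blocks the reader, so the buffer size bounds how much parsed data can pile up between calls.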
Tailing/seeking to the end of a file is often not a problem when it's big...
It gets restarted and jumps to the end of the file... We don't care if we lose some data in between. Keeping state data is kind of overkill. |
There is still a question of what to do if the file is truncated. One option would be to make a ServicePlugin that has the Tail code that @steverweber pointed to running in the background. This probably wouldn't be possible until I merge the statsd code |
the https://github.com/hpcloud/tail code seems to handle this well.
Config.ReOpen is analogous to tail -F (capital F):
ref: http://stackoverflow.com/questions/10135738/reading-log-files-as-theyre-updated-in-go |
@ekini you mentioned you had some working code for this a couple weeks ago, do you happen to have anything I can take a look at? I'm interested in getting something working for this |
@sparrc yes, I've got something working at ekini/telegraf@04f4b72 |
A little trick I've been toying with.
Might need work, but thought it worth sharing. |
Maybe simpler with rsyslog? rsyslog.conf:
And listen on port 1514, for example. |
Would be great if this could make it into telegraf. 👍 |
👍 |
This will most likely start as a telegraf
Recently came across this log analyzer project that looks like it has a pretty solid format for creating templates and parsing arbitrary logfile formats: https://github.com/trustpath/sequence
Right now it's discontinued, but influxdata could probably fork and take over that project if it turns out to be useful. |
closes influxdata#102 closes influxdata#328
Inspired by issue #48, create a plugin for aggregating and pushing data from log files, allowing user-defined regex filters.
This would behave in a similar manner to heka's logstreamer plugin: https://hekad.readthedocs.org/en/v0.9.2/pluginconfig/logstreamer.html#logstreamerplugin
/cc @steverweber