multiline_grok doesn't capture unmatched multi-line events in grok_failure output #25
Comments
…pattern, we should still store the unmatched lines in the line buffer so that plugins such as fluent-plugin-grok-parser can report the log event as a grok parse error, see: fluent/fluent-plugin-grok-parser#25
Here's an attempt to fix the potential issue in the core code.
input:
output:
does not match
Hi @okkez, that line wasn't meant to match; I was hoping the plugin would capture it as a grok parse failure event. This is how Logstash's implementation currently behaves. Can this plugin reflect the original behaviour?
…pattern, we should still store the unmatched lines in the line buffer so that plugins such as fluent-plugin-grok-parser can report the log event as a grok parse error, see: fluent/fluent-plugin-grok-parser#25. To enable this, set enable_multiline_catch_all to true in the <source> section.
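Putting the pieces together, the proposed setup would look roughly like this. This is a sketch, not a confirmed configuration: the path, tag, and patterns are placeholders, and enable_multiline_catch_all is the option proposed in the fluent/fluentd#1416 patch, which may not be merged:

```
<source>
  @type tail
  path /var/log/filter_test.log        # placeholder path
  tag filter.test
  format multiline_grok
  grok_pattern %{COMBINEDAPACHELOG}    # placeholder pattern
  multiline_start_regexp /^\S/         # placeholder first-line regexp
  # Proposed option from fluent/fluentd#1416: keep lines that do not match
  # the first-line pattern in the line buffer instead of dropping them, so
  # the grok parser can emit them as grok parse failures.
  enable_multiline_catch_all true
</source>
```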
This plugin cannot process lines that don't match.
I have a fix for in_tail that passes those unmatched lines to your plugin (or anything that uses multiline): fluent/fluentd#1416
@jitran Can I close this issue?
Sure, closed now.
Thanks!
Hi okimoto,
We've been comparing Fluentd with Logstash for some time now, and both work well. We like Logstash's grok parsing and error-handling abilities, hence we're trying your plugin. We have a requirement to capture all logs, even when parsing fails.
Apache Test:
Input 1 (expected to work):
127.0.0.1 - - [19/Dec/2016:11:30:43 +1100] "GET / HTTP/1.0" 403 3985 "-" "ApacheBench/2.3"
Produces:
{"clientip":"127.0.0.1","ident":"-","auth":"-","timestamp":"19/Dec/2016:11:52:29 +1100","verb":"GET","request":"/","httpversion":"1.0","response":"403","bytes":"3985","referrer":"\"-\"","agent":"\"ApacheBench/2.3\"","server":"tranj-fluentd-s3b","stack":"my-app-dev-01","application":"my-app","log_type":"filter.test","time":"2017-01-10T20:56:49Z"}
Input 2 (expected to fail):
127.0.0.1 - - NOTIME "GET / HTTP/1.0" 403 3985 "-" "ApacheBench/2.3"
Produces:
{"message":"127.0.0.1 - - NOTIME \"GET / HTTP/1.0\" 403 3985 \"-\" \"ApacheBench/2.3\"","grokfailure":"No grok pattern matched","server":"tranj-fluentd-s3b","stack":"my-app-dev-01","application":"my-app","log_type":"filter.test","time":"2017-01-10T21:05:42Z"}
Exactly what we wanted.
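For reference, a filter configuration that could produce records like the two above might look roughly like this. This is a sketch inferred from the output shown, not the poster's actual config: the match tag, pattern, and key_name are assumptions, and grok_failure_key is the fluent-plugin-grok-parser option that stores the failure reason (the "grokfailure" field above) instead of dropping the record:

```
<filter filter.test>
  @type parser                         # assumes filter_parser with the grok format
  key_name message
  format grok
  grok_pattern %{COMBINEDAPACHELOG}    # placeholder; actual pattern not shown
  # Keep unmatched lines as records and record why they failed, rather
  # than discarding them.
  grok_failure_key grokfailure
</filter>
```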
Tomcat Test:
patterns file (/tmp/patterns):
Input 1 (expected to work):
Produces:
{"timestamp":"Dec 16, 2016 9:29:56 AM","class":"org.apache.jasper.servlet.TldScanner","method":"scanJars","tomcatmsg":"\nINFO: At least one JAR was scanned for TLDs yet contained no TLDs. Enable debug logging for this logger for a complete list of JARs that were scanned but no TLDs were found in them. Skipping unneeded JARs during scanning can improve startup time and JSP compilation time.\n09:29:56.861 [localhost-startStop-1] WARN com.something.common.ServiceContext - Found application_config, profile is: test\n\n  .   ____          _            __ _ _\n /\\\\ / ___'_ __ _ _(_)_ __  __ _ \\ \\ \\ \\\n( ( )\\___ | '_ | '_| | '_ \\/ _` | \\ \\ \\ \\\n \\\\/  ___)| |_)| | | | | || (_| |  ) ) ) )\n  '  |____| .__|_| |_|_| |_\\__, | / / / /\n =========|_|==============|___/=/_/_/_/","server":"tranj-fluentd-s3b","stack":"my-app-pdev-01","application":"my-app","log_type":"filter.test","time":"2017-01-11T01:14:20Z"}
Input 2:
127.0.0.1 - - NOTIME "GET / HTTP/1.0" 403 3985 "-" "ApacheBench/2.3"
Produces no output, but in the td-agent.log:
2017-01-11 12:14:51 +1100 [warn]: plugin/in_tail.rb:390:block in parse_multilines: got incomplete line before first line from /var/log/filter_test.log: "127.0.0.1 - - NOTIME \"GET / HTTP/1.0\" 403 3985 \"-\" \"ApacheBench/2.3\"\n"
With Logstash's grok filter, unmatched lines are still captured and emitted in the output with a _grokparsefailure tag, but this plugin drops the invalid log lines.
Is it possible to make the plugin capture unmatched multiline log events?
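For comparison, the Logstash behaviour being referenced comes from the grok filter's failure tagging. A rough sketch (the pattern is a placeholder; tag_on_failure shown here is the Logstash default, so listing it explicitly is only for illustration):

```
filter {
  grok {
    match => { "message" => "%{COMBINEDAPACHELOG}" }  # placeholder pattern
    # On a non-matching line, Logstash keeps the event and adds this tag
    # instead of dropping the event.
    tag_on_failure => ["_grokparsefailure"]
  }
}
```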