filebeat: Add failing tests for @timestamps in JSON with timezone specified #1831
Conversation
This is the default @timestamp format used by the Kibana JSON log, for example.
Jenkins standing by to test this. If you aren't a maintainer, you can ignore this comment. Someone with commit access, please review this and clear it for Jenkins to run; then say 'jenkins, test it'.
@morganchristiansson Thanks a lot for all the details. We will have a look and check what is going wrong here.
Filebeat currently doesn't support a configurable timestamp format, and the current format requires the Z at the end. What we could do to support this test case without adding configurable patterns is to switch to RFC3339 as the format, which accepts both forms: https://play.golang.org/p/boxSimIhq8 We could also just make it configurable. @morganchristiansson, do you need just this format, or configurable formats?
Hi. I have no idea why it's suggested that filebeat should enforce the format of @timestamp when the rest of the stack seemingly supports ISO8601/RFC3339. Why do you even attempt to parse @timestamp other than to pass it to the output as a string? When I ask filebeat to parse the JSON I do not want it to mangle our @timestamp field.

JSON parsing was only recently added and you immediately jump to enforcing the format of @timestamp?? WHY? This is already handled by both logstash and elasticsearch; it is not the log collector's job to try and parse it! We switched from nxlog because it was converting @timestamp to a format that logstash/elasticsearch wouldn't recognize and also stripping precision from the @timestamp value, and now filebeat is going down the same route!

Also, the format you are using as an example in the filebeat reference log output is RFC3339, which is not supported by filebeat. Much like Kibana's timestamp format is not supported either.

filebeat: 2006-01-02T15:04:05Z07:00 <-- 07:00 is not supported. Only UTC!
kibana: 2016-06-10T16:29:37+00:00 <-- Z missing and +00:00 not supported.

https://www.elastic.co/guide/en/beats/filebeat/master/configuration-logging.html#_logging_format

If you're going to enforce the format, you really should make sure you use your own format that all filebeat users should now be using. I shall do the same and reach out to the multiple teams that are now generating invalid timezone formats, as only a subset of ISO8601/RFC3339 is now supported. Or should we start using an alternative field and write logstash configuration to rename it back, just so that filebeat won't muck it up? All other apps, including Kibana, seem to generate documents with (invalid) @timestamp fields.
It's a very good point that we should be consistent across the stack, or at least support the other formats. I think the first step is to support both formats, which seems to be quite a simple change. The second question is whether we should allow any kind of structure in @timestamp. If JSON is used, should we try to parse the @timestamp value if it is a string, or just leave it as it is? All the Beats and elasticsearch rely on @timestamp being a valid timestamp. If someone does want to use their own @timestamp format, this is possible by not using KeysUnderRoot (never tested, but this should work).
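A sketch of that untested suggestion, using the 5.0-era filebeat JSON option names (the exact prospector layout here is assumed, not taken from this thread): with `json.keys_under_root` left at its default of false, the decoded object, including the original @timestamp string, stays under the `json.` prefix and filebeat's own @timestamp field is not touched.

```yaml
filebeat.prospectors:
  - paths:
      - /var/log/app/*.json     # hypothetical path, for illustration
    # keys_under_root left at its default (false): decoded fields,
    # including the original @timestamp string, are kept under `json.`
    # and filebeat's generated @timestamp is not overwritten.
    json.keys_under_root: false
    json.add_error_key: true
```

A downstream Logstash filter could then move the `json.*` fields back to the event root, as discussed below.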
Personally I need support for specifying timezones, UTC and non-UTC alike; the +00:00/+01:00 format of ISO8601/RFC3339 is required. The RFC3339 parsing suggested by @tsg sounds great, but I suspect other users will ask for other formats.

As for the keys_under_root setting, it changes existing data and is not what I want, although with a logstash filter to move the values under the json key back to the root it would probably work great. Will consider this.

I'm curious what the reason is for filebeat to even parse the @timestamp, and what it does with this value? Why not just pass the value on to the output as is?
@morganchristiansson The reason it is parsed is mainly historical. All the other implementations actually generate the timestamp, and internally we use a fixed type to handle this field. So if it is not converted, we have to change larger parts of the code base. That doesn't mean it should not be changed; at the same time, since this is almost the only field we require, I feel we should be strict about its type.
It would be useful to record when the log was processed by filebeat in a secondary field like beat.timestamp. Parsing dates is a can of worms, best avoided or delegated to libraries that specialise in it.
We first had that in |
@morganchristiansson Could you have a look at this PR? #1889 I took your changes / tests and built the necessary changes on top.
@morganchristiansson BTW: Could you sign the CLA so I can take your code in my PR? https://www.elastic.co/contributor-agreement/
Allow timestamps without timezone. See elastic#1831 for more details
Incorporated into #1889
* Update testcases to test timestamps with timezone specified. This is the default @timestamp format used by the Kibana JSON log, for example. * Make timestamp conversion based on RFC3339. Allow timestamps without timezone. See #1831 for more details
I've implemented this workaround/solution to fix the events in logstash after they've exited the filebeat pipeline.
and then set
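The workaround itself isn't reproduced in full above. A hedged sketch of a Logstash filter along the lines described (the source field name `timestamp` is an assumption; adapt it to wherever the original value ends up) would re-parse the ISO8601 value and write it back to @timestamp:

```
filter {
  date {
    # Logstash's ISO8601 pattern matches "Z", "+00:00", and also the
    # colon-less "+0000" offsets mentioned later in this thread.
    match  => ["timestamp", "ISO8601"]
    target => "@timestamp"
  }
}
```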
@morganchristiansson Thanks for sharing. I'm not 100% sure in which cases this is needed?
It's a workaround while waiting for the filebeat-5.0.0-alpha4 release, and it also handles any ISO8601 dates that aren't RFC3339 dates, such as timezones with a missing colon (+0000). I also get the behaviour I initially expected from filebeat with this fix/workaround.
@morganchristiansson Got it. Thanks for the details.
filebeat is currently generating errors for timestamps with the timezone specified as +00:00 (only the 'Z' suffix is accepted). Failing tests added.
This is the default @timestamp format used by kibana json log for example.
Seems to have been introduced here: https://github.com/elastic/beats/pull/1421/files#r60245935