Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add breaking change about host namespace causing mapping issues #7398

Merged
merged 4 commits into from
Jun 25, 2018

Conversation

dedemorton
Copy link
Contributor

@dedemorton dedemorton commented Jun 21, 2018

Fixes #7368

I found the original document hard to follow (had to read it through multiple times), so I tried to rewrite it my own words based on what I understood after reading the document. We need to connect more of the dots for users who aren't familiar with this problem.

We don't have enough section levels in asciidoc to properly format this, so I've fudged some sections (don't tell the other writers).

Not sure it's 100% there, but here you go.

UPDATE: Please make sure you respond to the questions I've raised for reviewers.

one. We don't want the plugin to add this field because it causes a mapping
conflict with the `host` object defined in the index template.)

*What does this mean to you?*
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm...the formatting here looks OK in the built docs, but not so good in GitHub, so I might change it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see what you mean about formatting being weird in GitHub. That threw me off at first. Interesting that it builds correctly in the book.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well the docs use our CSS to control the size of things. GitHub shouldn't make bolded text into a big section header, but it does.

Starting with version 6.3, Beats provides an `add_host_metadata` processor for
adding fields, such as `host.name` and `host.id`, to Beats events. These fields
are grouped under a `host` prefix and conform to the
https://github.com/elastic/ecs[Elastic Common Schema (ECS)]. The `host` object
Copy link
Contributor Author

@dedemorton dedemorton Jun 21, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hope it's OK to call this the host object because saying host field doesn't convey the idea that the type is different.

@dedemorton
Copy link
Contributor Author

dedemorton commented Jun 21, 2018

@karenzone If you have the time, can you take a look at this and offer suggestions for places where the language is confusing? Once you understand the problem, the details are easy to follow...but otherwise it is a confusing situation.

Copy link
Contributor

@ruflin ruflin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left a few minor comments.

In general LGTM.

[[breaking-changes-mapping-conflict]]
==== New `host` namespace may cause mapping conflicts for Logstash

// REVIEWERS: I describe these as mapping conflicts throughout, but the acutal
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like mapping conflict as it is a mapping conflict :-)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a mapping conflict.

`host` and `host.name` fields. Or you can adjust your queries to use
`beat.hostname`, instead of `host`, to work across all indices.

// REVIEWERS: Would it better just to tell users to uses to change their
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

beat.hostname is going to disappear in 7.0. So I would prefer to stear people in the direction of host.name if possible so they have to change it only once.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, then I'll remove mention of this field.

objects by using the Kibana API or UI, change any aggregation references to use
`beat.hostname` instead of `host`, and then re-import the objects into Kibana.

// REVIEWERS: Are the above details correct? Got them from a comment Pius made
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The above should work but if someone has only 1 search based on the host field I would rather recommend to directly edit it in Kibana instead of exporting / importing it. The above might be better if there are lots of queries dependening on it but I think it requires more knowledge as the files have to be edited manually.

I would propose both options.

@ppf2 WDYT?

+
With this configuration, Beats drops the `host` fields before sending the
event to Logstash, and Logstash adds a default `host` field, as it did with
previous Beats versions.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Long term I think the host field will disappear for the beats-input-plugin in Logstash. I wonder if we should make a recommendation here to start thinking about how to migrate long term to host.name.


To resolve the mapping issue, do *one* of the following:

// TOOD: Should add an example of the LS filter here, don't you think?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

++

@ph Could you provide one?

There is a bit a difference if host.* is dropped by Beats or LS. If it is dropped by Beats, LS will add host field. If it's dropped by LS, there will be no host field at all in the final event. I wonder if we should mention that difference.

Copy link
Contributor

@ph ph Jun 22, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ruflin @dedemorton Yes, the following example will bring it to pre 6.3.0.

mutate {
  remove_field => [ "[host]" ]
}
# or
mutate {
  add_field => {
    "host" => "%{[beat][hostname]}"
   }
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ruflin but taking the content of [beat][hostname] for host will result in the same behavior as the input had. WDYT?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will do a check, but the add_field should replace the original value.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ruflin mentioned earlier that beat.hostname will disappear in 7.0, so I think maybe we shouldn't show this in the example. Maybe it's worth mentioning that you can use a mutate filter rather than removing the field...and leave it up to the user to decide where to take the value from. I wonder if they can use the value in host.name instead. Would that work? Not sure if mutate allows you to take a value from a field that you are mutating.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ph See my comment ^^

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dedemorton this is perfectly valid, thanks for that :)

mutate {
  add_field => {
    "host" => "%{[host][name]}"
   }
}```

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this config might be valid, but I don't think it fixes the mapping conflict. If you have multiple versions of Beats writing to the index (Beats 6.2 and 6.3), I think you need to drop the field in LS, or you will still end up with mapping conflicts because the index template defines host as an object.

[[breaking-changes-mapping-conflict]]
==== New `host` namespace may cause mapping conflicts for Logstash

// REVIEWERS: I describe these as mapping conflicts throughout, but the acutal
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a mapping conflict.

// error message says mapper parsing exception. I'm wondering if I should use
// that instead?

This breaking change applies only to users who send Beats events to Logstash.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And users using their own template.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ph You're saying that this issue applies to Beats users who use a custom template and aren't using Logstash, correct? If they have host field defined in their custom template, they'll run into mapping errors, no?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dedemorton
It applies to the following use cases:

  • FB -> LS, using the default logstash-output-elasticsearch template.
  • FB -> LS, using their own template templates with the host defined.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ph The change might not be breaking for all Beats -> LS users, but they should probably read this topic anyhow so they know that they might need to update any searches/visualizations they've created. I'm concerned that if I make the statement too generic, they might skip reading it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dedemorton Yes this is true, so let's keep it more specific we can adjust if people are posting on discuss.

Copy link
Contributor

@karenzone karenzone left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense to me! I left a few comments.

(By default, the plugin adds a `host` field if the event doesn't already have
one. We don't want the plugin to add this field because it causes a mapping
conflict with the `host` object defined in the index template.)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The first sentence in the parentheses doesn't say anything new, so consider deleting it, and adding host to remaining sentence.

+(We don't want the plugin to add the host field because it causes a mapping
+conflict with the host object defined in the index template.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had it that way originally, but thought being explicit might be useful for users who don't understand that LS adds the field if there's not one. This behavior isn't consistent across input plugins, so it might be worth calling out. But I agree...it's a bit redundant--that's why I made it parenthetical even though I know people hate parens. If no one disagrees, I will remove the statement.

one. We don't want the plugin to add this field because it causes a mapping
conflict with the `host` object defined in the index template.)

*What does this mean to you?*
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see what you mean about formatting being weird in GitHub. That threw me off at first. Interesting that it builds correctly in the book.


* <<custom-template-non-versioned-indices>>
* <<beats-template-non-versioned-indices>>

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+What does this mean to you?
See the info for your particular use case:

  • <<link use case 1>>
  • <<link use case 2>>
  • <<link use case 3>>

You did a good job capturing the use case scenario in each title. The use cases are easier to read and parse without the surrounding text.


You cannot resolve the problem by dropping the `host.*` fields, as described
earlier, because Logstash will add a default `host` field, resulting in a
mapping conflict because the index template defines the `host` field as an
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"earlier" might be problematic. Linking to different use cases doesn't promote linear reading, but "earlier" seems to imply that you've read everything. Here, I think we can delete the reference to earlier and just say, "You cannot resolve the problem by dropping the host.* fields, because Logstash will add a default host field..."
Especially since what we're referencing isn't going to work in this situation anyway. :-)

* Or:
** Modify the Beats index template by removing the `host.*` fields, and
** Configure Beats to drop all `host.*` fields, as described earlier.
+
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WRT "earlier". Same concern applies (this document isn't designed to be read linearly).

If we're referencing this: "In the Beats config file, configure Beats to drop all host.* fields: ", just say that instead of referencing it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll add a link to the section because I want users to know that the topic describes how to do this.

@brandonmensing
Copy link

LGTM - nothing to add on top of the comments and todos already going on there. Thanks!

@dedemorton
Copy link
Contributor Author

dedemorton commented Jun 22, 2018

I've pushed some changes based on review feedback. There are a couple of sections that still need work pending a response from reviewers:

@tsg FYI...

Edit:

Copy link
Contributor

@ruflin ruflin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dedemorton LGTM. Ready to be merged?

objects into Kibana.


//TODO: Add links to the Kibana docs for exporting/importing objects.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Want to resolve it as part of this PR?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ruflin If you think it's good enough to merge, maybe we should merge and then iterate over it after it's merged. I think @tsg wanted to review this.

@dedemorton dedemorton merged commit ea5290a into elastic:master Jun 25, 2018
@@ -9,9 +9,155 @@ changes, but there are breaking changes between major versions (e.g. 5.x to

See the following topics for a description of breaking changes:

* <<breaking-changes-6.3>>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For completeness, I think we will want to include headings for 6.1 and 6.2 as well since there were also breaking changes in these releases that are currently only documented in the release notes today.

https://www.elastic.co/guide/en/beats/libbeat/6.2/release-notes-6.1.0.html#_breaking_changes_3
https://www.elastic.co/guide/en/beats/libbeat/6.2/release-notes-6.2.0.html#_breaking_changes_2
https://www.elastic.co/guide/en/beats/libbeat/6.2/release-notes-6.2.3.html#_breaking_changes

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ppf2 I think the difference with the aboves is that these we features which still were in beta.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll either add sections or make it clear that this section only covers the main breaking changes. We say that in a couple of places plus provide links to the relnotes, but maybe it's not clear enough.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, let's clarify. Are we saying that the breaking changes guide will only cover breaking changes against the GA product? Or do we mean that it only covers the "main" breaking changes of GA product? If "main" is the keyword here, let's also confirm that this is consistent with the rest of the Elastic stack. Afaik, Elasticsearch documents all breaking changes against the GA product, not just the main ones.

If we are only covering breaking changes of the GA product, we can state that for breaking changes against pre-GA product releases, these are documented only in the release notes, etc..

thanks!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good question, and I'm not sure about the answer. I'd like @tsg or @ruflin to weigh in here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moving forward we should document here all breaking changes which happen for any GA features. In general no breaking changes in minor should happen, but there are exceptions.

For beta or experimental features, the breaking changes should only be in the changelog.

If you searched for the `host` field previously, you now need to search for the
`host` and `host.name` fields.

If you have multiple visualizations in Kibana that reference the `host` field,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will send a separate note on this shortly to discuss. Somewhere in here, we will want to provide additional details on tackling the outcome of having multiple indices with different schemas for the Kibana use case given Kibana's inconsistent handling of conflicting field types today. Stay tuned. Just adding a comment here as something to refine.

@dedemorton
Copy link
Contributor Author

I've created an issue to track the remaining work for this PR (just in case things get busy...I do plan to get this resolved asap): #7408

@ppf2
Copy link
Member

ppf2 commented Jun 26, 2018

@dedemorton @ruflin Can we merge this change to both 6.3 and 6.x branches? I only see it in master right now :)

@dedemorton dedemorton added the needs_backport PR is waiting to be backported to other branches. label Jun 26, 2018
@dedemorton
Copy link
Contributor Author

@ppf2 I'll cherry-pick the changes today. Yesterday I was waiting for the doc build on master to be reflected in the published docs, but there was a problem with the site updating.

dedemorton added a commit to dedemorton/beats that referenced this pull request Jun 26, 2018
…tic#7398)

* Document breaking change about host namespace causing mapping issues

* Add changes from the review

* Minor edit

* Adds more changes for review comments
ph pushed a commit that referenced this pull request Jun 26, 2018
… (#7427)

* Document breaking change about host namespace causing mapping issues
@dedemorton dedemorton removed the needs_backport PR is waiting to be backported to other branches. label Jun 29, 2018
leweafan pushed a commit to leweafan/beats that referenced this pull request Apr 28, 2023
…tic#7398) (elastic#7427)

* Document breaking change about host namespace causing mapping issues
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants