Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix issue 6342: Automatically lowercase the index name #6640

Closed
wants to merge 2 commits into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion libbeat/outputs/elasticsearch/client.go
Original file line number Diff line number Diff line change
Expand Up @@ -390,7 +390,7 @@ func createEventBulkMeta(
}

meta := bulkEventMeta{
Index: index,
Index: TryLowercaseIndex(index),
DocType: eventType,
Pipeline: pipeline,
ID: id,
Expand Down
18 changes: 18 additions & 0 deletions libbeat/outputs/elasticsearch/utils.go
Original file line number Diff line number Diff line change
@@ -0,0 +1,18 @@
package elasticsearch

import (
"unicode"
"github.com/elastic/beats/libbeat/logp"
"strings"
)

func TryLowercaseIndex(index string) string {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

exported function TryLowercaseIndex should have comment or be unexported

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We probably don't need to export it :)

for _, u := range []rune(index) {
if unicode.IsUpper(u) {
newIndex := strings.ToLower(index)
logp.Warn("Index name %s is invalid, replace it with %s", index, newIndex)
return newIndex
}
}
return index
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a few things that bug me in this:

  • This could get noisy since it could be potentially be logged on every events.
  • Worse case when the index is OK we iterate on all the chars, so an happy path will be hit by this.
  • Due to the dynamic nature of index, we use string interpolation here, we have to do this check on every events.

I wonder if instead, we could provide a better errors message when this situation happen and either tell them to use a lowercase index or we could provide a lowercase processor.

Copy link
Author

@Hexilee Hexilee Mar 27, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we provide a better errors message and an additional flag like --lowercase-index to enable the processor?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personnaly I would not add a custom flag, they are easy to add, hard to remove and it make the CLI more complicated.

But the error message could mention how to fix the situation.

"Use the lowercase processor to normalize your index"

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ruflin would you mind give your opinion on my comment?

Copy link
Contributor

@ruflin ruflin Mar 28, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First of all I don't like magic. Sending a field with upper case and then converting it to lower case automatically for the index naming feels like magic to me. I see multiple options moving forward:

  • A config option in the elasticsearch output to normalies / lowercase the index which is off by default.
  • A lower case processor. The downside of the lower case processor is that if it's applied on a field, I expect it to be also changed in the final event it is sent.

I'm leaning towards the output config for the reason that this is an Elasticsearch specific issue so it should be an Elasticsearch config option. Having a config option off by default will also make it clear that when turning it on, it will have some processing overhead.

For the flag: As we have -E everything should be a config option and not a flag if possible (there are exceptions).

+1 on providing better error messages.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A lower case processor. The downside of the lower case processor is that if it's applied on a field, I expect it to be also changed in the final event it is sent.

Correct.

Glad we got a conclusion.

@Hexilee

I would be +1 to add a specific options to the elasticsearch output to lowercase the index name and make it off by default.

If you move forward with the changes can you sign the CLA?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ph Yeah, I'd like to, and I have signed the CLA.
However, as you say, current error message is generated by ElasticSearch, so should we improve ElasticSearch only? Whatever providing better error messages or lowercase processor.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Hexilee I think @ruflin had a pretty good summary:

A config option in the elasticsearch output to normalizes / lowercase the index which is off by default.

Concerning the error message, what is the current message your are getting?

18 changes: 18 additions & 0 deletions libbeat/outputs/elasticsearch/utils_test.go
Original file line number Diff line number Diff line change
@@ -0,0 +1,18 @@
package elasticsearch

import (
"testing"
"github.com/stretchr/testify/assert"
)

func TestTryLowercaseIndex(t *testing.T) {
testCases := map[string]string{
"ElasticSearch": "elasticsearch",
"ELASTIC": "elastic",
"beats": "beats",
}

for former, latter := range testCases {
assert.Equal(t, latter, TryLowercaseIndex(former))
}
}