Allow the use of underscores in place of periods in the field names for Elastic Common Schema #53
Comments
Hi @steveipkis, for dots vs. underscores, have a look at https://github.com/elastic/ecs#why-does-ecs-use-a-dot-notation-instead-of-an-underline-notation. I'm not sure how we could use them interchangeably, as it would mean two different documents in Elasticsearch. What about using an ingest processor that renames all the fields that don't fit the format? For the JSON format: the JSON document that is written to Elasticsearch does not contain dots. So if you see
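The rename approach suggested above could be sketched roughly as follows. This is only an illustration, not something taken from the thread: it builds an ingest pipeline body with one `rename` processor per field. The field names (`source_ip`, `source.ip`, and so on) and the helper name `build_rename_pipeline` are hypothetical, and the resulting body would still need to be registered with Elasticsearch as an ingest pipeline.

```python
import json


def build_rename_pipeline(renames):
    """Build an ingest pipeline body with one rename processor per field.

    `renames` maps the incoming (underscore) field name to the dotted
    target field name. Both sides are illustrative assumptions.
    """
    processors = [
        {
            "rename": {
                "field": source,
                "target_field": target,
                # Skip documents that do not carry this field.
                "ignore_missing": True,
            }
        }
        for source, target in renames.items()
    ]
    return {
        "description": "Rename underscore fields to dotted names (sketch)",
        "processors": processors,
    }


# Hypothetical field names, chosen only for the example.
pipeline = build_rename_pipeline(
    {"source_ip": "source.ip", "source_port": "source.port"}
)
print(json.dumps(pipeline, indent=2))
```

Using an explicit mapping rather than a blanket string replace keeps field names that legitimately contain underscores from being rewritten by accident.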
Hi @ruflin, I looked at the link you provided and I now have a better understanding of the decision behind using dots. With regard to the JSON format, we are currently generating the JSON objects ourselves to fit the Elastic schema standard. Most other data processes are unable to read nested JSON fields such as a: { b: c } (and some that do handle them do so quite differently from what we would hope). Thus we were hoping to use a one-to-one mapping between the entries' columns and the line keys. Seeing that in Elastic each one is treated as an entity with sub-fields, I don't know how using "_" would change any of the searches. We're trying to avoid using an ingest processor, since string manipulation is an expensive operation and we already have the data going through several ingestion pipelines. That being said, is there any other way to fit a common format with the fewest changes necessary to the system?
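As an illustration of the one-to-one mapping described above, here is a minimal sketch that flattens a nested JSON object such as a: { b: c } into underscore-joined keys that can serve as column names. The function name `flatten` and the sample document are assumptions made for this example, not code from the project.

```python
def flatten(doc, sep="_", prefix=""):
    """Flatten nested dicts into a single level, joining keys with `sep`."""
    flat = {}
    for key, value in doc.items():
        name = f"{prefix}{sep}{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into sub-objects, carrying the joined name as prefix.
            flat.update(flatten(value, sep=sep, prefix=name))
        else:
            flat[name] = value
    return flat


# Hypothetical nested event, shaped the way a dotted schema implies.
nested = {"source": {"ip": "10.0.0.1", "port": 80}, "message": "hello"}
columns = flatten(nested)
print(columns)
```

Note that this direction is lossy: once a field name that already contains an underscore is flattened, the result cannot be split back into its nested form unambiguously, which is one reason the two notations are hard to treat as interchangeable.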
I think a field alias could be a good solution for you here: elastic/elasticsearch#31372. It just got merged and should be available in the upcoming 6.4 release of Elasticsearch. An alternative is using
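A field alias setup along these lines might look like the sketch below. It assumes the documents are written with underscore field names and that the dotted names should resolve to them through `alias` fields with a `path`. The helper name `build_alias_properties` and the field list are hypothetical, the output is only the `properties` fragment of a mapping, and the exact mapping layout should be checked against the Elasticsearch version in use.

```python
import json


def build_alias_properties(fields):
    """Build mapping properties: concrete underscore fields plus dotted aliases.

    `fields` maps a dotted field name to its Elasticsearch type.
    """
    properties = {}
    for dotted, es_type in fields.items():
        flat = dotted.replace(".", "_")
        # The concrete field that documents are actually indexed under.
        properties[flat] = {"type": es_type}
        *parents, leaf = dotted.split(".")
        if not parents:
            # No dot in the name, so no alias is needed.
            continue
        # Walk or create the object hierarchy for the dotted name.
        node = properties
        for parent in parents:
            node = node.setdefault(parent, {"properties": {}})["properties"]
        node[leaf] = {"type": "alias", "path": flat}
    return properties


# Hypothetical fields, chosen only for the example.
props = build_alias_properties({"source.ip": "ip", "message": "keyword"})
print(json.dumps({"properties": props}, indent=2))
```

Aliases apply to queries and aggregations; the stored document source keeps the underscore names, so no per-document string manipulation is needed at ingest time.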
I like the alias solution, and it seems to align perfectly with our requirements. Thank you so much! I'll try to follow that issue and raise any future questions there. I will also see whether this can be a feature request in Apache Beam's ElasticIO, since we are using it as our data connector to Elastic. Once again, thank you so much for your help!
@steveipkis Should we close this issue?
Yes, I'm closing the issue. Once again, thanks!
I am currently using Elasticsearch without Logstash. The data parsing is done through Apache Beam, and the data is written to several other data sources, including Elastic and an SQL database.
However, the issue I face is that "." is a protected keyword in any SQL database. And since Beam is already writing data in JSON format, it is too expensive to output two different formats of the same data set. Since this data is being forwarded to multiple sources, is it possible to conform to a more common schema naming convention?
That being said, is it possible for the Elastic Common Schema to use "_" and "." interchangeably, if not replace "." with underscore completely?