Define valid index, type, field, id, routing values #6736
Comments
We should ensure that if any global decisions are made regarding naming, aggregation names are also included |
one more relevant point is that some of our endpoints mask what is a valid get doc by id REST request (according to the above spec). For example: |
@bleskes i think it is a problem as there is no workaround. I've added this sentence to the original issue: "IDs should not begin with an underscore." |
Adding field names to the specs (see #5972). Field names should not begin with an underscore, contain If a fieldname contains
Throwing an error seems a more transparent way of dealing with this. |
Having a dot in the field name is actually very useful. Would it be possible to use an escape for referencing a field name instead of path? |
@dcw-netflix an escape? do you mean |
Yes, that would be perfect. The reason is that there are a lot of use cases where property/config files get indexed, which results in many dot separated keys. |
Colons in index names are also invalid. See #7148 |
This is a great start towards having formal input validation rules in Elasticsearch. I believe we need to centralize all these rules in one place. I also think we should have validation rules for every input in ES (not just those listed above), for example field names, repository names, snapshot names, etc.: basically everything that in one way or another can compromise the consistent state of the cluster. We currently have a lot of this logic (probably incomplete) scattered in different places; it's definitely time to formalize it (both in docs and code) |
Note that for repository names, we also need to delegate the validation to plugins as there can be other rules with some cloud providers (azure for example). See also #7096 |
We have an issue with a routing value that contains a comma. Is there any workaround we should use? Thanks |
A common use case for ES (and my use case) is to index a DB table which may have column names that start with an underscore. Renaming the columns is not an option in my use case as well. Currently this requires storing a mapping between DB column names and ES field names which adds complexity. Is it possible to escape an underscore in a field name? Or more generally is it possible to escape any special character in a field name? A more general escaping solution would be optimal in my opinion because then a field name could have any arbitrary characters just like a quoted SQL identifier. |
Closing in favour of #9059 |
I've just come across this problem in the past week with ES 2.1 whilst trying to create documents with "."s in the field name. Am I correct in that even the field name escaping parts aren't included in ES 2.1? This is sadly a showstopper for our application as the field names we use are equipment serial codes, and we've recently added a supplier that includes "."s in their serial codes. |
@mcayland using serial numbers for field names is a bad design choice as you will end up with sparse fields, and much more disk usage than you actually need. |
Hi, I'm wondering whether any other wildcard characters are allowed in template names apart from the star symbol. We have several indexes named by the same pattern, i.e. ap-YYYY-MM, bg-YYYY-MM, cm-YYYY-MM, etc. They all have the same mapping; we just want to separate the data into different indexes. Is there any way to create a single template with an index name pattern like '??-*'? |
Currently we have no specification of allowed values for index names, type names, IDs, field names or routing values.
This issue is an attempt to document and improve the existing specs to prevent inconsistencies.
Index names
Index names are limited by the file system. They may only be lower case, and may not start with an underscore. While we don't prevent index names starting with a `.`, we reserve those for internal use. Clearly, `.` and `..` cannot be used.
These characters are already illegal: `\`, `/`, `*`, `?`, `"`, `<`, `>`, `|`, `,`. We should also add the null byte.
There are other filenames which are illegal in Windows, but we probably don't need to check for those.
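As a rough illustration of the index name rules above, here is a minimal Python sketch. The function name `index_name_problems` and the message strings are hypothetical and are not part of Elasticsearch. The character set is copied from the list in this issue, with the proposed null byte added, and may differ from what the server actually enforces.

```python
from typing import List

# Characters this issue lists as already illegal in index names,
# plus the null byte that it proposes adding.
ILLEGAL_INDEX_CHARS = set('\\/*?"<>|,') | {"\0"}


def index_name_problems(name: str) -> List[str]:
    """Return the rule violations for a proposed index name (empty if none).

    Hypothetical helper: it only mirrors the rules written in this issue.
    """
    problems = []
    if name in (".", ".."):
        problems.append("must not be '.' or '..'")
    if name != name.lower():
        problems.append("must be lower case")
    if name.startswith("_"):
        problems.append("must not start with an underscore")
    bad = sorted(set(name) & ILLEGAL_INDEX_CHARS)
    if bad:
        problems.append(f"contains illegal characters: {bad!r}")
    return problems


if __name__ == "__main__":
    for candidate in ["logs-2014", "_logs", "Logs/2014", ".."]:
        print(repr(candidate), index_name_problems(candidate))
```

A name with a leading `.` passes this check, matching the text: such names are reserved for internal use but not rejected.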
Type names
Type names can contain any character (except null bytes, which currently we don't check) but may not start with an underscore.
IDs
IDs can contain any character (except null bytes, which currently we don't check). IDs should not begin with an underscore.
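The proposed rule for type names and IDs is small enough to sketch directly. The helper name `check_type_or_id` is hypothetical, and the sketch assumes both proposed checks (null byte and leading underscore) are enforced, which is not the case in Elasticsearch at the time of this issue.

```python
from typing import Optional


def check_type_or_id(value: str) -> Optional[str]:
    """Return an error message if the value breaks the proposed rules, else None.

    Hypothetical helper mirroring the spec text: any character is allowed
    except the null byte, and the value may not start with an underscore.
    """
    if "\0" in value:
        return "must not contain a null byte"
    if value.startswith("_"):
        return "must not start with an underscore"
    return None


if __name__ == "__main__":
    for candidate in ["user_42", "_mapping"]:
        print(repr(candidate), check_type_or_id(candidate))
```

Only a leading underscore is rejected, so a value such as `user_42` remains valid.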
Currently IDs are not checked for leading underscores, and IDs with leading underscores may exist. These can clash with eg `_mapping` and so should be prevented. This is a backwards incompatible change.
Routing & Parent
Routing and parent values should be the same as IDs, ie any chars except for the null byte. The problem is that multiple routing values are passed in the query string as comma-separated values, eg `?routing=foo,bar`.
If a single routing value contains a comma, it will be misinterpreted as two routing values. One idea is to pass multiple routing values as eg `?routing=foo&routing=bar,baz`. Unfortunately, this is not backwards compatible and isn't supported by a number of client libraries.
The only solution I can think of is to support some escaping of commas, eg `foo\,bar`. This would mean that `\` would need to be escaped as well, ie: `foo\bar` -> `foo\\bar`. Support for this escaping would need to be added to Elasticsearch and to the client libraries.
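To make the escaping proposal concrete, here is a minimal Python sketch of how a client might encode several routing values and how a server might decode them. The function names are hypothetical, and this is not how Elasticsearch or any client library currently behaves. URL percent-encoding of the query string is a separate step and is left out. Treating a lone trailing backslash as a literal character is an assumption of this sketch.

```python
from typing import List


def escape_routing(value: str) -> str:
    """Escape backslashes first, then commas, so the two steps don't collide."""
    return value.replace("\\", "\\\\").replace(",", "\\,")


def join_routing(values: List[str]) -> str:
    """Build the comma-separated routing parameter from raw values."""
    return ",".join(escape_routing(v) for v in values)


def split_routing(param: str) -> List[str]:
    """Split on unescaped commas and undo the escaping."""
    values: List[str] = []
    current: List[str] = []
    i = 0
    while i < len(param):
        ch = param[i]
        if ch == "\\" and i + 1 < len(param):
            # Escaped character: keep it literally and skip the backslash.
            current.append(param[i + 1])
            i += 2
        elif ch == ",":
            # Unescaped comma: end of the current routing value.
            values.append("".join(current))
            current = []
            i += 1
        else:
            # Ordinary character (or a lone trailing backslash, by assumption).
            current.append(ch)
            i += 1
    values.append("".join(current))
    return values


if __name__ == "__main__":
    raw = ["foo", "bar,baz"]
    encoded = join_routing(raw)
    print(encoded)
    print(split_routing(encoded) == raw)
```

A plain split on commas would turn `bar,baz` into two routing values, which is the misinterpretation described above. With escaping, the round trip returns the original values.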