Expose preserve_original setting in edge ngram token filter #55767
Labels
:Search Relevance/Analysis
How text is split into tokens
Team:Search Relevance
Meta label for the Search Relevance team in Elasticsearch
The preserve_original setting is currently not supported in the EdgeNGramTokenFilter (https://www.elastic.co/guide/en/elasticsearch/reference/master/analysis-edgengram-tokenfilter.html#analysis-edgengram-tokenfilter), and there is even a TODO comment in the Elasticsearch master code (as of 25th Apr 2020) to expose preserve_original, as shown in this GitHub code link: https://github.com/elastic/elasticsearch/blob/master/modules/analysis-common/src/main/java/org/elasticsearch/analysis/common/EdgeNGramTokenFilterFactory.java#L66
Elasticsearch version (bin/elasticsearch --version):
8.0.0-SNAPSHOT
Plugins installed: []
N/A
JVM version (java -version):
openjdk 14.0.1 2020-04-14
OpenJDK Runtime Environment (build 14.0.1+7)
OpenJDK 64-Bit Server VM (build 14.0.1+7, mixed mode, sharing)
OS version (uname -a if on a Unix-like system):
Darwin LT6577 19.3.0 Darwin Kernel Version 19.3.0: Thu Jan 9 20:58:23 PST 2020; root:xnu-6153.81.5~1/RELEASE_X86_64 x86_64
Description of the problem including expected versus actual behavior:
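This is a feature request that is already mentioned in a TODO in the Elasticsearch master code. If the setting were exposed, the preserve_original functionality would be available in the edge n-gram token filter. Actual behavior: the original token is dropped from the output. Expected behavior: the original token can optionally be kept alongside the generated n-grams.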
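To illustrate the requested behavior, here is a minimal Python sketch that simulates an edge n-gram filter with and without a preserve-original flag. It is not the Elasticsearch or Lucene implementation. The function name edge_ngrams is hypothetical, the gram bounds (min_gram=1, max_gram=2) are assumptions chosen so that the token foo exceeds max_gram, and the position of the original token in the output is also an assumption. The rule used, keeping the original token when it is shorter than min_gram or longer than max_gram, follows the documented behavior of Lucene's preserveOriginal option.

```python
from typing import List


def edge_ngrams(token: str, min_gram: int, max_gram: int,
                preserve_original: bool = False) -> List[str]:
    """Return the leading-edge n-grams of ``token`` (illustrative only).

    With preserve_original=True, the unchanged token is also kept when its
    length falls outside the [min_gram, max_gram] range.
    """
    # Grams cannot be longer than the token itself.
    upper = min(max_gram, len(token))
    grams = [token[:n] for n in range(min_gram, upper + 1)]

    # Keep the original only when no gram already equals it.
    if preserve_original and not (min_gram <= len(token) <= max_gram):
        grams.append(token)
    return grams


# Assumed bounds: min_gram=1, max_gram=2, so "foo" is longer than max_gram.
without = edge_ngrams("foo", min_gram=1, max_gram=2)
with_original = edge_ngrams("foo", min_gram=1, max_gram=2,
                            preserve_original=True)

print(without)        # ['f', 'fo']
print(with_original)  # ['f', 'fo', 'foo']
```

Under these assumed bounds, the first call reproduces the problem reported here (foo is missing), and the second shows the output this request is asking for.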
Steps to reproduce:
1. Delete the existing index with the name preserveoriginal to test this feature.
curl --user elastic:password -XDELETE localhost:9200/preserveoriginal
2. Create a new index with a custom analyzer which uses the edge-ngram token filter.
4. The output of the above analyze API call.
Please note that the original token foo isn't present in the result.
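For reference, the reproduction could be sketched as the request bodies below. This is an illustration, not the exact payload from this report: the filter name edge_ngram_keep_original, the analyzer name edge_ngram_analyzer, the standard tokenizer, the lowercase filter, and the gram bounds are all assumptions. The preserve_original key is the setting this issue asks for, so a build that does not yet expose it would not be expected to honor it.

```python
import json

INDEX_NAME = "preserveoriginal"  # index name from step 1

# Index settings with a custom analyzer that uses an edge n-gram token filter.
index_body = {
    "settings": {
        "analysis": {
            "filter": {
                "edge_ngram_keep_original": {   # hypothetical filter name
                    "type": "edge_ngram",
                    "min_gram": 1,              # assumed value
                    "max_gram": 2,              # assumed value
                    "preserve_original": True,  # the requested setting
                }
            },
            "analyzer": {
                "edge_ngram_analyzer": {        # hypothetical analyzer name
                    "type": "custom",
                    "tokenizer": "standard",    # assumed tokenizer
                    "filter": ["lowercase", "edge_ngram_keep_original"],
                }
            },
        }
    }
}

# Body for the analyze API call that checks which tokens are produced.
analyze_body = {"analyzer": "edge_ngram_analyzer", "text": "foo"}

print(f"PUT /{INDEX_NAME}")
print(json.dumps(index_body, indent=2))
print(f"POST /{INDEX_NAME}/_analyze")
print(json.dumps(analyze_body, indent=2))
```

With the setting exposed and these assumed bounds, the analyze output would be expected to include foo in addition to f and fo.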