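What you can do is define two analyzers, one that matches the start of a string and one that matches the end. In the index settings below, I named the former prefix_edge_ngram_analyzer and the latter suffix_edge_ngram_analyzer. The suffix analyzer keeps the whole value as one token, reverses it, takes edge n-grams, and reverses each n-gram back, so it ends up indexing the trailing substrings. Both analyzers are applied to sub-fields of a multi-field string field: the first to text.prefix and the second to text.suffix.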
PUT test_index
{
  "settings": {
    "analysis": {
      "analyzer": {
        "prefix_edge_ngram_analyzer": {
          "tokenizer": "prefix_edge_ngram_tokenizer",
          "filter": ["lowercase"]
        },
        "suffix_edge_ngram_analyzer": {
          "tokenizer": "keyword",
          "filter": ["lowercase", "reverse", "suffix_edge_ngram_filter", "reverse"]
        }
      },
      "tokenizer": {
        "prefix_edge_ngram_tokenizer": {
          "type": "edgeNGram",
          "min_gram": 2,
          "max_gram": 25
        }
      },
      "filter": {
        "suffix_edge_ngram_filter": {
          "type": "edgeNGram",
          "min_gram": 2,
          "max_gram": 25
        }
      }
    }
  },
  "mappings": {
    "test_type": {
      "properties": {
        "text": {
          "type": "string",
          "fields": {
            "prefix": {
              "type": "string",
              "analyzer": "prefix_edge_ngram_analyzer"
            },
            "suffix": {
              "type": "string",
              "analyzer": "suffix_edge_ngram_analyzer"
            }
          }
        }
      }
    }
  }
}
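To see what the two analyzers put into the index, here is a rough Python sketch of the token streams they would produce for a single-word value. It is only an approximation of the analysis chain (the function names are mine, not part of Elasticsearch), and it assumes the min_gram/max_gram values of 2 and 25 from the settings above.

```python
def edge_ngrams(term, min_gram=2, max_gram=25):
    """Return the leading substrings of `term` with length min_gram..max_gram."""
    upper = min(len(term), max_gram)
    return [term[:n] for n in range(min_gram, upper + 1)]


def prefix_tokens(text):
    # Approximates prefix_edge_ngram_analyzer for a single-word value:
    # edge n-grams taken from the start, then lowercased.
    return [gram.lower() for gram in edge_ngrams(text)]


def suffix_tokens(text):
    # Approximates suffix_edge_ngram_analyzer:
    # keyword -> lowercase -> reverse -> edge n-grams -> reverse.
    reversed_text = text.lower()[::-1]
    return [gram[::-1] for gram in edge_ngrams(reversed_text)]


print(prefix_tokens("stackoverflow"))
print(suffix_tokens("stackoverflow"))
```

The first list starts with "st", "sta", "stac", "stack" and the second with "ow", "low", "flow", which is why both a leading and a trailing fragment can be found later on.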
Then let's say we index the following test document:
PUT test_index/test_type/1
{ "text": "stackoverflow" }
Then we can search by prefix or by suffix using the following queries:
# input is "stack" => 1 result
GET test_index/test_type/_search?q=text.prefix:stack OR text.suffix:stack
# input is "flow" => 1 result
GET test_index/test_type/_search?q=text.prefix:flow OR text.suffix:flow
# input is "ackov" => 0 results
GET test_index/test_type/_search?q=text.prefix:ackov OR text.suffix:ackov
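The three outcomes above can be reproduced with a small Python sketch. It is a simplification under one loudly stated assumption: the user input is looked up as a single lowercase term, and query-time analysis is ignored. The helper names are hypothetical.

```python
def indexed_terms(text, min_gram=2, max_gram=25):
    """Collect prefix and suffix n-grams of a lowercased single-word value."""
    value = text.lower()
    upper = min(len(value), max_gram)
    prefixes = {value[:n] for n in range(min_gram, upper + 1)}
    suffixes = {value[-n:] for n in range(min_gram, upper + 1)}
    return prefixes, suffixes


def matches(user_input, text):
    # Assumption: the input is looked up as one lowercase term;
    # query-time analysis is deliberately left out of this sketch.
    prefixes, suffixes = indexed_terms(text)
    term = user_input.lower()
    return term in prefixes or term in suffixes


for user_input in ("stack", "flow", "ackov"):
    print(user_input, matches(user_input, "stackoverflow"))
```

"ackov" is neither a leading nor a trailing substring of "stackoverflow", so it is in neither set of indexed terms.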
Another way of querying, using the query DSL:
POST test_index/test_type/_search
{
  "query": {
    "multi_match": {
      "query": "stack",
      "fields": [ "text.*" ]
    }
  }
}
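The "text.*" pattern is what makes this query hit both sub-fields at once. As an illustration only, the sketch below uses Python's fnmatch as a stand-in for Elasticsearch's own field-name wildcard matching (an assumption, not its actual implementation) to show which of the mapped fields the pattern selects.

```python
from fnmatch import fnmatchcase

# Field names defined by the mapping in this answer.
mapped_fields = ["text", "text.prefix", "text.suffix"]


def expand_field_pattern(pattern, fields):
    """Return the field names that a wildcard pattern such as "text.*" selects."""
    return [name for name in fields if fnmatchcase(name, pattern)]


print(expand_field_pattern("text.*", mapped_fields))
```

Under this model the main text field itself is not selected, only text.prefix and text.suffix.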
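UPDATE
If you already have an existing string field, you can "upgrade" it to a multi-field and create the two required sub-fields with their analyzers. The way to do this is, in order: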
- Close your index in order to create the analyzers
POST test_index/_close
- Update the index settings
PUT test_index/_settings
{
  "analysis": {
    "analyzer": {
      "prefix_edge_ngram_analyzer": {
        "tokenizer": "prefix_edge_ngram_tokenizer",
        "filter": ["lowercase"]
      },
      "suffix_edge_ngram_analyzer": {
        "tokenizer": "keyword",
        "filter": ["lowercase", "reverse", "suffix_edge_ngram_filter", "reverse"]
      }
    },
    "tokenizer": {
      "prefix_edge_ngram_tokenizer": {
        "type": "edgeNGram",
        "min_gram": 2,
        "max_gram": 25
      }
    },
    "filter": {
      "suffix_edge_ngram_filter": {
        "type": "edgeNGram",
        "min_gram": 2,
        "max_gram": 25
      }
    }
  }
}
- Re-open your index
POST test_index/_open
- Finally, update the mapping of your text field
PUT test_index/_mapping/test_type
{
  "properties": {
    "text": {
      "type": "string",
      "fields": {
        "prefix": {
          "type": "string",
          "analyzer": "prefix_edge_ngram_analyzer"
        },
        "suffix": {
          "type": "string",
          "analyzer": "suffix_edge_ngram_analyzer"
        }
      }
    }
  }
}
Note that you still need to re-index all your documents so that the new sub-fields text.prefix and text.suffix are populated and analyzed.
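One possible way to re-index is to read the existing documents back and send them through the _bulk endpoint again. The sketch below only builds the newline-delimited request body; how the documents are fetched and how the body is sent are left out. The function name and the (id, source) input shape are assumptions made for illustration.

```python
import json


def build_bulk_body(docs, index="test_index", doc_type="test_type"):
    """Serialize (doc_id, source) pairs into a newline-delimited _bulk payload."""
    lines = []
    for doc_id, source in docs:
        # Each document takes two lines: the action metadata, then the source.
        action = {"index": {"_index": index, "_type": doc_type, "_id": doc_id}}
        lines.append(json.dumps(action))
        lines.append(json.dumps(source))
    # The bulk API expects every line, including the last, to end with a newline.
    return "\n".join(lines) + "\n"


body = build_bulk_body([("1", {"text": "stackoverflow"})])
print(body, end="")
```

Re-sending a document under its existing id overwrites it, which runs the text value through the new sub-field analyzers.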