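1. Concepts
1. The suggester APIs fall into four main categories
- Term Suggester (spelling correction: when the input is misspelled, suggests the correctly spelled word)
- Phrase Suggester (phrase-level correction: builds on the term suggester to suggest a whole corrected phrase rather than individual words)
- Completion Suggester (word completion: given the first part of a word, completes the whole word)
- Context Suggester (context-aware completion)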
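To make the difference between "correction" and "completion" concrete, here is a toy sketch in plain Python. It is not how Elasticsearch implements its suggesters; the vocabulary `VOCAB` and the helpers `correct` and `complete` are hypothetical names used only for illustration. Correction picks indexed terms within a small edit distance of the input, while completion picks terms that start with the input.

```python
def edit_distance(a: str, b: str) -> int:
    """Plain Levenshtein distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete from a
                            curr[j - 1] + 1,      # insert into a
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


# Hypothetical vocabulary, standing in for the terms stored in an index.
VOCAB = ["lucene", "elasticsearch", "elastic", "elk", "rocks", "rock", "solid"]


def correct(word, vocab=VOCAB, max_edits=2):
    """Correction idea: vocabulary terms within max_edits of the input."""
    scored = sorted((edit_distance(word, term), term) for term in vocab)
    return [term for dist, term in scored if 0 < dist <= max_edits]


def complete(prefix, vocab=VOCAB):
    """Completion idea: vocabulary terms that start with the input."""
    return sorted(term for term in vocab if term.startswith(prefix))


print(correct("lucne"))   # ['lucene']
print(complete("ela"))    # ['elastic', 'elasticsearch']
```

The misspelled `lucne` is one insertion away from `lucene`, so correction finds it even though no indexed term starts with `lucn`; the prefix `ela` is not misspelled at all, so completion is the better fit there.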
The overall effect is similar to the search suggestions on Baidu, as shown in the figure:
2. Term Suggester (spelling correction)
2.1. API
1. Create the index
PUT /book4
{
  "mappings": {
    "english": {
      "properties": {
        "passage": { "type": "text" }
      }
    }
  }
}
2. Insert data
curl -H "Content-Type: application/json" -XPOST 'http://localhost:9200/_bulk' -d'
{ "index" : { "_index" : "book4", "_type" : "english" } }
{ "passage": "Lucene is cool"}
{ "index" : { "_index" : "book4", "_type" : "english" } }
{ "passage": "Elasticsearch builds on top of lucene"}
{ "index" : { "_index" : "book4", "_type" : "english" } }
{ "passage": "Elasticsearch rocks"}
{ "index" : { "_index" : "book4", "_type" : "english" } }
{ "passage": "Elastic is the company behind ELK stack"}
{ "index" : { "_index" : "book4", "_type" : "english" } }
{ "passage": "elk rocks"}
{ "index" : { "_index" : "book4", "_type" : "english" } }
{ "passage": "elasticsearch is rock solid"}
'
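The bulk API expects newline-delimited JSON: one action line, then one document line, with a newline after every line including the last. If you would rather generate that body than type it by hand, a minimal Python sketch could look like the following. The helper name `build_bulk_body` is hypothetical, and the sketch only builds the request body; it does not send it.

```python
import json

passages = [
    "Lucene is cool",
    "Elasticsearch builds on top of lucene",
    "Elasticsearch rocks",
    "Elastic is the company behind ELK stack",
    "elk rocks",
    "elasticsearch is rock solid",
]


def build_bulk_body(index, doc_type, docs):
    """Return an NDJSON bulk body: an action line followed by a source line per doc."""
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index, "_type": doc_type}}))
        lines.append(json.dumps(doc))
    # The bulk endpoint requires the body to end with a newline.
    return "\n".join(lines) + "\n"


body = build_bulk_body("book4", "english", [{"passage": p} for p in passages])
print(body, end="")
```

The printed text can be saved to a file and posted to the same `_bulk` endpoint used in the curl command above.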
3. Check which tokens get stored
POST /_analyze
{
  "text": [
    "Lucene is cool",
    "Elasticsearch builds on top of lucene",
    "Elasticsearch rocks",
    "Elastic is the company behind ELK stack",
    "elk rocks",
    "elasticsearch is rock solid"
  ]
}
Result:
{
  "tokens": [
    { "token": "lucene", "start_offset": 0, "end_offset": 6, "type": "<ALPHANUM>", "position": 0 },
    { "token": "is", "start_offset": 7, "end_offset": 9, "type": "<ALPHANUM>", "position": 1 },
    { "token": "cool", "start_offset": 10, "end_offset": 14, "type": "<ALPHANUM>", "position": 2 },
    { "token": "elasticsearch", "start_offset": 15, "end_offset": 28, "type": "<ALPHANUM>", "position": 103 },
    { "token": "builds", "start_offset": 29, "end_offset": 35, "type": "<ALPHANUM>", "position": 104 },
    { "token": "on", "start_offset": 36, "end_offset": 38, "type": "<ALPHANUM>", "position": 105 },
    { "token": "top", "start_offset": 39, "end_offset": 42, "type": "<ALPHANUM>", "position": 106 },
    { "token": "of", "start_offset": 43, "end_offset": 45, "type": "<ALPHANUM>", "position": 107 },
    { "token": "lucene", "start_offset": 46, "end_offset": 52, "type": "<ALPHANUM>", "position": 108 },
    { "token": "elasticsearch", "start_offset": 53, "end_offset": 66, "type": "<ALPHANUM>", "position": 209 },
    { "token": "rocks", "start_offset": 67, "end_offset": 72, "type": "<ALPHANUM>", "position": 210 },
    { "token": "elastic", "start_offset": 73, "end_offset": 80, "type": "<ALPHANUM>", "position": 311 },
    { "token": "is", "start_offset": 81, "end_offset": 83, "type": "<ALPHANUM>", "position": 312 },
    { "token": "the", "start_offset": 84, "end_offset": 87, "type": "<ALPHANUM>", "position": 313 },
    { "token": "company", "start_offset": 88, "end_offset": 95, "type": "<ALPHANUM>", "position": 314 },
    { "token": "behind", "start_offset": 96, "end_offset": 102, "type": "<ALPHANUM>", "position": 315 },
    { "token": "elk", "start_offset": 103, "end_offset": 106, "type": "<ALPHANUM>", "position": 316 },
    { "token": "stack", "start_offset": 107, "end_offset": 112, "type": "<ALPHANUM>", "position": 317 },
    { "token": "elk", "start_offset": 113, "end_offset": 116, "type": "<ALPHANUM>", "position": 418 },
    { "token": "rocks", "start_offset": 117, "end_offset": 122, "type": "<ALPHANUM>", "position": 419 },
    { "token": "elasticsearch", "start_offset": 123, "end_offset": 136, "type": "<ALPHANUM>", "position": 520 },
    { "token": "is", "start_offset": 137, "end_offset": 139, "type": "<ALPHANUM>", "position": 521 },
    { "token": "rock", "start_offset": 140, "end_offset": 144, "type": "<ALPHANUM>", "position": 522 },
    { "token": "solid", "start_offset": 145, "end_offset": 150, "type": "<ALPHANUM>", "position": 523 }
  ]
}