【Question title】: Fetch all documents if source contains the given search text in Elastic Search Server
【Posted】: 2013-05-30 01:23:56
【Question】:

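I am new to Elasticsearch. I have a field mapped as "string" in an Elasticsearch index, and I need to retrieve every document whose field value contains the given search text.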

JSON1 : "{\"id\":\"1\",\"message\":\"Welcome to elastic search\"}"
JSON2 : "{\"id\":\"2\",\"message\":\"elasticsearch\"}"

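If I search for "elastic", I need to get both records back, but I only get the first one.

Right now I am retrieving documents with full-text search (FTS). Please guide me on how to get the equivalent of psql's LIKE/ILIKE search in Elastic Search.

Thanks in advance.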

【Discussion】:

    Tags: java elasticsearch


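    【Solution 1】:

    This is a tokenizer issue. Take a look at the NGram tokenizer: http://www.elasticsearch.org/guide/reference/index-modules/analysis/ngram-tokenizer/

    You can test tokenization with the /_analyze endpoint.

    This is how Elasticsearch tokenizes text by default (the standard tokenizer):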

    curl -XGET 'localhost:9200/_analyze?tokenizer=standard' -d 'this is a test elasticsearch'

    {
    "tokens": [{
            "token": "this",
            "start_offset": 0,
            "end_offset": 4,
            "type": "<ALPHANUM>",
            "position": 1
        }, {
            "token": "is",
            "start_offset": 5,
            "end_offset": 7,
            "type": "<ALPHANUM>",
            "position": 2
        }, {
            "token": "a",
            "start_offset": 8,
            "end_offset": 9,
            "type": "<ALPHANUM>",
            "position": 3
        }, {
            "token": "test",
            "start_offset": 10,
            "end_offset": 14,
            "type": "<ALPHANUM>",
            "position": 4
        }, {
            "token": "elasticsearch",
            "start_offset": 15,
            "end_offset": 28,
            "type": "<ALPHANUM>",
            "position": 5
        }
    ]
    }
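
The tokens above explain the symptom in the question: a whole-word query for "elastic" can only match a document that produced the exact token "elastic". The following is a minimal Python sketch of that behaviour, not Elasticsearch itself. The helper names `standard_like_tokens` and `matching_ids` are made up for illustration, and the regex is only a rough stand-in for the standard tokenizer plus lowercasing.

```python
import re

def standard_like_tokens(text):
    """Rough stand-in for default analysis: split on non-alphanumerics, lowercase."""
    return [t.lower() for t in re.findall(r"[A-Za-z0-9]+", text)]

# The two documents from the question.
docs = {
    "1": "Welcome to elastic search",
    "2": "elasticsearch",
}

def matching_ids(term):
    """Return ids of documents that contain `term` as a whole token."""
    term = term.lower()
    return [doc_id for doc_id, message in docs.items()
            if term in standard_like_tokens(message)]

print(matching_ids("elastic"))        # ['1']
print(matching_ids("elasticsearch"))  # ['2']
```

Document 2 is indexed as the single token "elasticsearch", so the query token "elastic" never equals any of its tokens.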

    Here is an example using the nGram tokenizer with its default values (min_gram 1, max_gram 2):

    curl -XGET 'localhost:9200/_analyze?tokenizer=nGram' -d 'this is a test elasticsearch'

    {
        "tokens": [{
                "token": "t",
                "start_offset": 0,
                "end_offset": 1,
                "type": "word",
                "position": 1
            }, {
                "token": "h",
                "start_offset": 1,
                "end_offset": 2,
                "type": "word",
                "position": 2
            }, {
                "token": "i",
                "start_offset": 2,
                "end_offset": 3,
                "type": "word",
                "position": 3
            }, {
                "token": "s",
                "start_offset": 3,
                "end_offset": 4,
                "type": "word",
                "position": 4
            }, {
                "token": " ",
                "start_offset": 4,
                "end_offset": 5,
                "type": "word",
                "position": 5
            }, {
                "token": "i",
                "start_offset": 5,
                "end_offset": 6,
                "type": "word",
                "position": 6
            }, {
                "token": "s",
                "start_offset": 6,
                "end_offset": 7,
                "type": "word",
                "position": 7
            }, {
                "token": " ",
                "start_offset": 7,
                "end_offset": 8,
                "type": "word",
                "position": 8
            }, {
                "token": "a",
                "start_offset": 8,
                "end_offset": 9,
                "type": "word",
                "position": 9
            }, {
                "token": " ",
                "start_offset": 9,
                "end_offset": 10,
                "type": "word",
                "position": 10
            }, {
                "token": "t",
                "start_offset": 10,
                "end_offset": 11,
                "type": "word",
                "position": 11
            }, {
                "token": "e",
                "start_offset": 11,
                "end_offset": 12,
                "type": "word",
                "position": 12
            }, {
                "token": "s",
                "start_offset": 12,
                "end_offset": 13,
                "type": "word",
                "position": 13
            }, {
                "token": "t",
                "start_offset": 13,
                "end_offset": 14,
                "type": "word",
                "position": 14
            }, {
                "token": " ",
                "start_offset": 14,
                "end_offset": 15,
                "type": "word",
                "position": 15
            }, {
                "token": "e",
                "start_offset": 15,
                "end_offset": 16,
                "type": "word",
                "position": 16
            }, {
                "token": "l",
                "start_offset": 16,
                "end_offset": 17,
                "type": "word",
                "position": 17
            }, {
                "token": "a",
                "start_offset": 17,
                "end_offset": 18,
                "type": "word",
                "position": 18
            }, {
                "token": "s",
                "start_offset": 18,
                "end_offset": 19,
                "type": "word",
                "position": 19
            }, {
                "token": "t",
                "start_offset": 19,
                "end_offset": 20,
                "type": "word",
                "position": 20
            }, {
                "token": "i",
                "start_offset": 20,
                "end_offset": 21,
                "type": "word",
                "position": 21
            }, {
                "token": "c",
                "start_offset": 21,
                "end_offset": 22,
                "type": "word",
                "position": 22
            }, {
                "token": "s",
                "start_offset": 22,
                "end_offset": 23,
                "type": "word",
                "position": 23
            }, {
                "token": "e",
                "start_offset": 23,
                "end_offset": 24,
                "type": "word",
                "position": 24
            }, {
                "token": "a",
                "start_offset": 24,
                "end_offset": 25,
                "type": "word",
                "position": 25
            }, {
                "token": "r",
                "start_offset": 25,
                "end_offset": 26,
                "type": "word",
                "position": 26
            }, {
                "token": "c",
                "start_offset": 26,
                "end_offset": 27,
                "type": "word",
                "position": 27
            }, {
                "token": "h",
                "start_offset": 27,
                "end_offset": 28,
                "type": "word",
                "position": 28
            }, {
                "token": "th",
                "start_offset": 0,
                "end_offset": 2,
                "type": "word",
                "position": 29
            }, {
                "token": "hi",
                "start_offset": 1,
                "end_offset": 3,
                "type": "word",
                "position": 30
            }, {
                "token": "is",
                "start_offset": 2,
                "end_offset": 4,
                "type": "word",
                "position": 31
            }, {
                "token": "s ",
                "start_offset": 3,
                "end_offset": 5,
                "type": "word",
                "position": 32
            }, {
                "token": " i",
                "start_offset": 4,
                "end_offset": 6,
                "type": "word",
                "position": 33
            }, {
                "token": "is",
                "start_offset": 5,
                "end_offset": 7,
                "type": "word",
                "position": 34
            }, {
                "token": "s ",
                "start_offset": 6,
                "end_offset": 8,
                "type": "word",
                "position": 35
            }, {
                "token": " a",
                "start_offset": 7,
                "end_offset": 9,
                "type": "word",
                "position": 36
            }, {
                "token": "a ",
                "start_offset": 8,
                "end_offset": 10,
                "type": "word",
                "position": 37
            }, {
                "token": " t",
                "start_offset": 9,
                "end_offset": 11,
                "type": "word",
                "position": 38
            }, {
                "token": "te",
                "start_offset": 10,
                "end_offset": 12,
                "type": "word",
                "position": 39
            }, {
                "token": "es",
                "start_offset": 11,
                "end_offset": 13,
                "type": "word",
                "position": 40
            }, {
                "token": "st",
                "start_offset": 12,
                "end_offset": 14,
                "type": "word",
                "position": 41
            }, {
                "token": "t ",
                "start_offset": 13,
                "end_offset": 15,
                "type": "word",
                "position": 42
            }, {
                "token": " e",
                "start_offset": 14,
                "end_offset": 16,
                "type": "word",
                "position": 43
            }, {
                "token": "el",
                "start_offset": 15,
                "end_offset": 17,
                "type": "word",
                "position": 44
            }, {
                "token": "la",
                "start_offset": 16,
                "end_offset": 18,
                "type": "word",
                "position": 45
            }, {
                "token": "as",
                "start_offset": 17,
                "end_offset": 19,
                "type": "word",
                "position": 46
            }, {
                "token": "st",
                "start_offset": 18,
                "end_offset": 20,
                "type": "word",
                "position": 47
            }, {
                "token": "ti",
                "start_offset": 19,
                "end_offset": 21,
                "type": "word",
                "position": 48
            }, {
                "token": "ic",
                "start_offset": 20,
                "end_offset": 22,
                "type": "word",
                "position": 49
            }, {
                "token": "cs",
                "start_offset": 21,
                "end_offset": 23,
                "type": "word",
                "position": 50
            }, {
                "token": "se",
                "start_offset": 22,
                "end_offset": 24,
                "type": "word",
                "position": 51
            }, {
                "token": "ea",
                "start_offset": 23,
                "end_offset": 25,
                "type": "word",
                "position": 52
            }, {
                "token": "ar",
                "start_offset": 24,
                "end_offset": 26,
                "type": "word",
                "position": 53
            }, {
                "token": "rc",
                "start_offset": 25,
                "end_offset": 27,
                "type": "word",
                "position": 54
            }, {
                "token": "ch",
                "start_offset": 26,
                "end_offset": 28,
                "type": "word",
                "position": 55
            }
        ]
    }
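
The output above can be reproduced with a few lines of Python, which may make it easier to see what the n-gram tokenizer emits. This is only a sketch: `ngrams` and `contains` are hypothetical helpers, and the shortest-first ordering is copied from the response shown above rather than from any guarantee of the tokenizer.

```python
def ngrams(text, min_gram=1, max_gram=2):
    """Return every substring of length min_gram..max_gram, shortest first."""
    tokens = []
    for size in range(min_gram, max_gram + 1):
        for start in range(len(text) - size + 1):
            tokens.append(text[start:start + size])
    return tokens

tokens = ngrams("this is a test elasticsearch")
print(len(tokens))    # 55 (28 one-character grams + 27 two-character grams)
print(tokens[:5])     # ['t', 'h', 'i', 's', ' ']
print(tokens[28:31])  # ['th', 'hi', 'is']

def contains(text, query):
    """LIKE-style check: is `query` one of the n-grams of `text`?"""
    grams = set(ngrams(text.lower(), len(query), len(query)))
    return query.lower() in grams

print(contains("Welcome to elastic search", "elastic"))  # True
print(contains("elasticsearch", "elastic"))              # True
```

With the default maximum of 2, the longest indexed gram is two characters, so matching a seven-character term such as "elastic" as a single token would need a larger max_gram.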
    

    Here is a link to an example of setting the right analyzer/tokenizer on your index: How to setup a tokenizer in elasticsearch

    After that, your query should return the expected documents.
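
As a rough illustration of what such an index setup might look like, the sketch below builds a settings-and-mapping body in Python and prints it as JSON. Several things here are assumptions rather than part of the original answer: the names `substring_tokenizer` and `substring_analyzer` are invented, the gram sizes 3 to 10 are arbitrary, and the syntax targets a recent Elasticsearch release (`text` field type, `ngram` type name, `index.max_ngram_diff`). The release current when this question was asked used `string` fields and the `nGram` spelling, so adjust to your version.

```python
import json

MIN_GRAM, MAX_GRAM = 3, 10  # assumed sizes; tune for your data

index_body = {
    "settings": {
        # Recent versions reject a min/max gap larger than this setting allows.
        "index": {"max_ngram_diff": MAX_GRAM - MIN_GRAM},
        "analysis": {
            "tokenizer": {
                "substring_tokenizer": {  # hypothetical name
                    "type": "ngram",
                    "min_gram": MIN_GRAM,
                    "max_gram": MAX_GRAM,
                    "token_chars": ["letter", "digit"],
                }
            },
            "analyzer": {
                "substring_analyzer": {  # hypothetical name
                    "type": "custom",
                    "tokenizer": "substring_tokenizer",
                    "filter": ["lowercase"],
                }
            },
        },
    },
    "mappings": {
        "properties": {
            "message": {
                "type": "text",
                # n-grams at index time, whole words at search time
                "analyzer": "substring_analyzer",
                "search_analyzer": "standard",
            }
        }
    },
}

print(json.dumps(index_body, indent=2))
```

The printed JSON could be sent as the body of an index-creation request. Using a plain analyzer at search time keeps the query "elastic" as one token, which then matches the seven-character gram indexed for both documents.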

    【Discussion】:
