【Question title】: Elasticsearch sort settings on index giving strange results
【Posted】: 2020-02-03 05:14:11
【Question】:

I have an index set up like this:

PUT items
{
  "settings": {
    "index": {
      "sort.field": ["popularity", "title_keyword"],
      "sort.order": ["desc", "asc"]
    },
    "analysis": {
      "analyzer": {
        "autocomplete": {
          "tokenizer": "autocomplete",
          "filter": [
            "lowercase"
          ]
        },
        "autocomplete_search": {
          "tokenizer": "lowercase"
        }
      },
      "tokenizer": {
        "autocomplete": {
          "type": "edge_ngram",
          "min_gram": 2,
          "max_gram": 15,
          "token_chars": [
            "letter"
          ]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "title": {
        "type": "text",
        "analyzer": "autocomplete",
        "search_analyzer": "autocomplete_search"
      },
      "title_keyword": {
        "type": "keyword"
      },
      "popularity": {
        "type": "integer"
      },
      "visibility": {
        "type": "keyword"
      }
    }
  }
}

With the following data:

POST items/_doc/1
{
  "title": "The Arbor",
  "popularity": 5,
  "title_keyword": "The Arbor",
  "visibility": "public"
}

POST items/_doc/2
{
  "title": "The Canon",
  "popularity": 10,
  "title_keyword": "The Canon",
  "visibility": "public"
}

POST items/_doc/3
{
  "title": "The Brew",
  "popularity": 15,
  "title_keyword": "The Brew",
  "visibility": "public"
}

I run this query against the data:

GET items/_search
{
  "size": 3,
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "title": {
              "query": "the", 
              "operator": "and"
            }
          }
        },
        {
          "match": {
            "visibility": "public"
          }
        }
      ]
    }
  },
  "highlight": {
    "pre_tags": ["<mark>"],
    "post_tags": ["</mark>"], 
    "fields": {
      "title": {}
    }
  }
}

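It seems to match the records on the word "the" correctly, but the sorting doesn't seem to work. I would expect the results to be sorted by popularity as defined, giving The Arbor, The Brew, The Canon, but the result I get is as follows: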

{
  "took" : 11,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 3,
      "relation" : "eq"
    },
    "max_score" : 0.27381438,
    "hits" : [
      {
        "_index" : "items",
        "_type" : "_doc",
        "_id" : "3",
        "_score" : 0.27381438,
        "_source" : {
          "title" : "The Brew",
          "popularity" : 15,
          "title_keyword" : "The Brew",
          "visibility" : "public"
        },
        "highlight" : {
          "title" : [
            "<mark>The</mark> Brew"
          ]
        }
      },
      {
        "_index" : "items",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 0.26392496,
        "_source" : {
          "title" : "The Arbor",
          "popularity" : 5,
          "title_keyword" : "The Arbor",
          "visibility" : "public"
        },
        "highlight" : {
          "title" : [
            "<mark>The</mark> Arbor"
          ]
        }
      },
      {
        "_index" : "items",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 0.26392496,
        "_source" : {
          "title" : "The Canon",
          "popularity" : 10,
          "title_keyword" : "The Canon",
          "visibility" : "public"
        },
        "highlight" : {
          "title" : [
            "<mark>The</mark> Canon"
          ]
        }
      }
    ]
  }
}

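Does defining the sort field and order under settings when creating the index automatically sort the results? It seems to be sorting by score rather than popularity. If I include the sort option in the query, it gives me the correct results: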

GET items/_search
{
  "size": 3,
  "sort": [
    {
      "popularity": {
        "order": "desc"
      }
    },
    {
      "title_keyword": {
        "order": "asc"
      }
    }
  ], 
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "title": {
              "query": "the", 
              "operator": "and"
            }
          }
        },
        {
          "match": {
            "visibility": "public"
          }
        }
      ]
    }
  },
  "highlight": {
    "pre_tags": ["<mark>"],
    "post_tags": ["</mark>"], 
    "fields": {
      "title": {}
    }
  }
}

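I read that including the sort in the query like this is inefficient and that it should go in the settings instead. Did I miss something when creating the index that would make it sort by popularity by default? Does including the sort option in the query make the query inefficient, or do I really need to include it in every query?

Hope this makes sense! Thanks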

【Discussion】:

    Tags: sorting elasticsearch elk


    【Answer 1】:

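    Index sorting defines how documents are ordered within the segments of each shard. It is not related to the ordering of search results. A sorted index is useful when you often run searches that sort on the same criteria as the index: in that case index sorting speeds up the search.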
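
    As a rough sketch of the case where index sorting does help: the request here asks for exactly the sort defined in the `index.sort.*` settings of the `items` index from the question, and sets `track_total_hits` to false so that Elasticsearch may stop collecting documents in each segment once it has the top `size` hits. It assumes the total hit count is not needed; the sort still has to be stated in the request.

    ```
    GET items/_search
    {
      "size": 3,
      "sort": [
        { "popularity": "desc" },
        { "title_keyword": "asc" }
      ],
      "track_total_hits": false
    }
    ```

    With `track_total_hits` disabled, the total number of matching documents is not reported in the response, which is the trade-off for the early termination.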

    If your search uses a different sort than the index, or no sort at all, index sorting is not relevant.
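
    To make that concrete: a request without a `sort` clause is ordered by relevance, which should give the same hit order as sorting on `_score` explicitly. A minimal sketch against the `items` index from the question (the query is simplified to a single `match` for illustration):

    ```
    GET items/_search
    {
      "size": 3,
      "sort": [ "_score" ],
      "query": {
        "match": { "title": "the" }
      }
    }
    ```

    Sorting on `_score` defaults to descending order, so the index sort settings have no effect on the order of these hits.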

    See the index sorting documentation, especially the part that explains how index sorting is used.

    【Comments】:

    • Great, thanks! I misunderstood the documentation.