【Question Title】: Removing duplicates and sorting (aggs + sort)
【Posted】: 2016-01-29 00:33:24
【Question】:

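I'm trying to find the best way to have a query return a sorted set from which duplicates are removed. I use aggs to remove the duplicates, and that works fine, but when I add a sort on the query results, for example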

"query": {..},
"sort": {.. "body.make": "asc" ..}

I expect the aggs to return their results in that order as well, but they always seem to be ordered by query score.

  // Here I'm collecting all body.vin values to remove duplicates 
  // and then returning only the first in each result set.
  "aggs": {
    "dedup": {
      "terms": {
        "size": 8,
        "field": "body.vin"
      },
      "aggs": {
        "dedup_docs": {
          "top_hits": {
            "size": 1,
            "_source": false
          }
        }
      }
    }
  },
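
To make the goal concrete, here is a minimal client-side sketch of the intended result: documents ordered by make, with only the first document kept per VIN. It is plain Python rather than Elasticsearch code, and the function name `dedup_sorted`, the flat field names `make` and `vin` (standing in for `body.make` and `body.vin`), and the sample documents are all hypothetical.

```python
from typing import Dict, List


def dedup_sorted(docs: List[Dict[str, str]], sort_key: str, dedup_key: str) -> List[Dict[str, str]]:
    """Sort docs by sort_key, then keep the first doc seen for each dedup_key value."""
    seen = set()
    result = []
    # sorted() is stable, so documents with equal sort keys keep their input order.
    for doc in sorted(docs, key=lambda d: d[sort_key]):
        if doc[dedup_key] not in seen:
            seen.add(doc[dedup_key])
            result.append(doc)
    return result


# Hypothetical sample data: documents 1 and 3 share the same VIN.
docs = [
    {"id": "1", "make": "toyota", "vin": "123"},
    {"id": "2", "make": "nissan", "vin": "456"},
    {"id": "3", "make": "toyota", "vin": "123"},
    {"id": "4", "make": "audi", "vin": "789"},
]

unique = dedup_sorted(docs, sort_key="make", dedup_key="vin")
print([d["id"] for d in unique])
```

This is the ordering the aggregation is expected to reproduce on the server side.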

I tried placing a terms aggregation in between to see whether that would sort the results:

// Same thing again, but with an attempt to sort on body.make in the
// document. I now realize that each bucket is just a collection of
// duplicates, so this sorts within each set of duplicates and not
// across the final results.
  "aggs": {
    "dedup": {
      "terms": {
        "size": 8,
        "field": "body.vin"
      },
      "aggs": {
        "order": {
          "terms": {
            "field": "body.make",
            "order": {
              "_term": "asc"
            }
          },
          "aggs": {
            "dedup_docs": {
              "top_hits": {
                "size": 1,
                "_source": false
              }
            }
          }
        }
      }
    }
  },

But the aggregation results are always ordered by score.
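
One likely reason the nested attempt cannot order the final list: a terms aggregation orders its buckets by document count by default, and the inner terms aggregation on `body.make` only orders values inside each `body.vin` bucket. The following is a rough pure-Python simulation of that behavior, not Elasticsearch code. The helper `terms_buckets`, the flat field names, and the sample data are hypothetical, and breaking ties by key is an assumption made for the simulation.

```python
from collections import defaultdict


def terms_buckets(docs, field):
    """Group docs by field, ordered by doc count descending (ties by key: an assumption)."""
    groups = defaultdict(list)
    for doc in docs:
        groups[doc[field]].append(doc)
    return sorted(groups.items(), key=lambda kv: (-len(kv[1]), kv[0]))


# Hypothetical sample data: VIN 123 appears twice, so its bucket comes first.
docs = [
    {"make": "audi", "vin": "789"},
    {"make": "toyota", "vin": "123"},
    {"make": "toyota", "vin": "123"},
    {"make": "nissan", "vin": "456"},
]

# Outer buckets by vin; inside each one, makes are sorted ascending.
nested_order = []
for vin, vin_docs in terms_buckets(docs, "vin"):
    for make in sorted({d["make"] for d in vin_docs}):
        nested_order.append(make)

print(nested_order)
```

The inner sort never sees more than one VIN at a time, so the overall sequence follows the outer bucket order rather than the make.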

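I've also toyed with the idea of adjusting the score based on the query sort, so that the aggregation, which orders by score, would return the correct order. But there doesn't seem to be any way to do this in sort: {}.

If anyone has managed to sort results while removing duplicates, or has ideas or suggestions, please let me know.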

【Comments】:

    Tags: elasticsearch


    【Solution 1】:

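    This is not the ideal solution, because it only allows sorting on one field. The best approach would be to change the score/boost of the sorted results.

    Trying to explain the problem made me realize how to do this, once I grasped the concept of buckets, or rather how they are passed along. I'm still interested in a sort + score-adjustment solution, but with aggregations this works: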

    // Here we first aggregate on body.make, so the first bucket might be
    // {"toyota": {body.vin 123}, "toyota": {body.vin 123}...} and the
    // next bucket passed into the dedup aggregation would be, say,
    // {"nissan"...
      "aggs": {
        "sort": {
          "terms": {
            "size": 8,
            "field": "body.make",
            "order": {
              "_term": "desc"
            }
          },
          "aggs": {
            "dedup": {
              "terms": {
                "size": 8,
                "field": "body.vin"
              },
              "aggs": {
                "dedup_docs": {
                  "top_hits": {
                    "size": 1,
                    "_source": false
                  }
                }
              }
            }
          }
        }
      },
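
Since the outer terms aggregation now carries the ordering, the final list has to be read out of the nested buckets. Below is a minimal sketch of flattening such a response on the client, assuming the aggregation names used above (`sort`, `dedup`, `dedup_docs`) and the usual shape of terms and top_hits results. The function name `flatten_dedup_hits` and the sample response are hypothetical.

```python
from typing import Any, Dict, List


def flatten_dedup_hits(response: Dict[str, Any]) -> List[str]:
    """Walk make buckets, then vin buckets, collecting the single top hit's _id."""
    ids = []
    for make_bucket in response["aggregations"]["sort"]["buckets"]:
        for vin_bucket in make_bucket["dedup"]["buckets"]:
            hits = vin_bucket["dedup_docs"]["hits"]["hits"]
            if hits:
                ids.append(hits[0]["_id"])
    return ids


# Hypothetical, trimmed response; outer buckets are in descending term order.
sample_response = {
    "aggregations": {
        "sort": {
            "buckets": [
                {"key": "toyota", "doc_count": 2, "dedup": {"buckets": [
                    {"key": "123", "doc_count": 2,
                     "dedup_docs": {"hits": {"hits": [{"_id": "doc-1"}]}}},
                ]}},
                {"key": "nissan", "doc_count": 1, "dedup": {"buckets": [
                    {"key": "456", "doc_count": 1,
                     "dedup_docs": {"hits": {"hits": [{"_id": "doc-2"}]}}},
                ]}},
            ]
        }
    }
}

ordered_ids = flatten_dedup_hits(sample_response)
print(ordered_ids)
```

Two things to keep in mind with this layout: with `size: 8` on both terms levels, the request can return up to 64 documents rather than 8, and a VIN that appears under more than one make would show up once per make. Newer Elasticsearch versions also use `_key` in place of `_term` for bucket ordering.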
    

    【Comments】:
