【Question title】: solr: edismax tie parameter not working as described
【Posted】: 2021-01-17 14:39:06
【Question】:

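According to the solr docs, a tie parameter of 0 should mean that lower-scoring fields contribute nothing to the final score:

When a term from the user's input is tested against multiple fields, more than one field may match. If so, each field will generate a different score based on how common that word is in that field (for each document relative to all other documents). The tie parameter lets you control how much the final score of the query will be influenced by the scores of the lower scoring fields compared to the highest scoring field.

A value of "0.0" (the default) makes the query a pure "disjunction max query": that is, only the maximum scoring subquery contributes to the final score. A value of "1.0" makes the query a pure "disjunction sum query" where it doesn't matter what the maximum scoring subquery is, because the final score will be the sum of the subquery scores. Typically a low value, such as 0.1, is useful.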
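The rule quoted above can be written down in a few lines. This is a minimal Python sketch of the combination described in the docs (best subquery score plus tie times the remaining scores), not Solr's actual code; the function name `dismax_score` and the sample scores are made up for illustration.

```python
def dismax_score(subquery_scores, tie=0.0):
    """Combine per-field scores as described for a disjunction-max query.

    tie=0.0 keeps only the best score; tie=1.0 sums all of them.
    """
    if not subquery_scores:
        return 0.0
    best = max(subquery_scores)
    # Everything except the single best-scoring subquery.
    others = sum(subquery_scores) - best
    return best + tie * others


# Hypothetical scores for one term matched in two fields.
field_scores = [2.0, 1.5]
for tie in (0.0, 0.1, 1.0):
    print(tie, dismax_score(field_scores, tie=tie))
```

With tie=0.0 the lower score of 1.5 is ignored entirely, which is the behaviour the question expects from its query.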

However, this is not working for me. Here are my search parameters:

q: 
parent_and_self_description:2-canal
parent_and_self_description:rct
info:2-canal
info:rct

qf: parent_and_self_description info^0.000000001

defType: edismax

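Despite such a low boost on info, here is the score debug information for the top-ranked result, which was pushed to the top of the list by its score on the "info" field: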

9.731399 = sum of:
  2.0064516 = weight(parent_and_self_description:canal in 753) [SchemaSimilarity], result of:
    2.0064516 = score(freq=2.0), computed as boost * idf * tf from:
      3.5589366 = idf, computed as log(1 + (N - n + 0.5) / (n + 0.5)) from:
        75 = n, number of documents containing term
        2651 = N, total number of documents with field
      0.5637784 = tf, computed as freq / (freq + k1 * (1 - b + b * dl / avgdl)) from:
        2.0 = freq, occurrences of term within document
        1.2 = k1, term saturation parameter
        0.75 = b, length normalization parameter
        18.0 = dl, length of field
        12.986043 = avgdl, average length of field
  3.726406 = sum of:
    1.7199546 = weight(parent_and_self_description:root in 753) [SchemaSimilarity], result of:
      1.7199546 = score(freq=2.0), computed as boost * idf * tf from:
        3.0507636 = idf, computed as log(1 + (N - n + 0.5) / (n + 0.5)) from:
          125 = n, number of documents containing term
          2651 = N, total number of documents with field
        0.5637784 = tf, computed as freq / (freq + k1 * (1 - b + b * dl / avgdl)) from:
          2.0 = freq, occurrences of term within document
          1.2 = k1, term saturation parameter
          0.75 = b, length normalization parameter
          18.0 = dl, length of field
          12.986043 = avgdl, average length of field
    2.0064516 = weight(parent_and_self_description:canal in 753) [SchemaSimilarity], result of:
      2.0064516 = score(freq=2.0), computed as boost * idf * tf from:
        3.5589366 = idf, computed as log(1 + (N - n + 0.5) / (n + 0.5)) from:
          75 = n, number of documents containing term
          2651 = N, total number of documents with field
        0.5637784 = tf, computed as freq / (freq + k1 * (1 - b + b * dl / avgdl)) from:
          2.0 = freq, occurrences of term within document
          1.2 = k1, term saturation parameter
          0.75 = b, length normalization parameter
          18.0 = dl, length of field
          12.986043 = avgdl, average length of field
  1.6317743 = weight(info:canal in 753) [SchemaSimilarity], result of:
    1.6317743 = score(freq=3.0), computed as boost * idf * tf from:
      3.226844 = idf, computed as log(1 + (N - n + 0.5) / (n + 0.5)) from:
        2 = n, number of documents containing term
        62 = N, total number of documents with field
      0.50568736 = tf, computed as freq / (freq + k1 * (1 - b + b * dl / avgdl)) from:
        3.0 = freq, occurrences of term within document
        1.2 = k1, term saturation parameter
        0.75 = b, length normalization parameter
        56.0 = dl, length of field (approximate)
        19.14516 = avgdl, average length of field
  2.366767 = sum of:
    0.7349925 = weight(info:root in 753) [SchemaSimilarity], result of:
      0.7349925 = score(freq=1.0), computed as boost * idf * tf from:
        2.8903718 = idf, computed as log(1 + (N - n + 0.5) / (n + 0.5)) from:
          3 = n, number of documents containing term
          62 = N, total number of documents with field
        0.25428993 = tf, computed as freq / (freq + k1 * (1 - b + b * dl / avgdl)) from:
          1.0 = freq, occurrences of term within document
          1.2 = k1, term saturation parameter
          0.75 = b, length normalization parameter
          56.0 = dl, length of field (approximate)
          19.14516 = avgdl, average length of field
    1.6317743 = weight(info:canal in 753) [SchemaSimilarity], result of:
      1.6317743 = score(freq=3.0), computed as boost * idf * tf from:
        3.226844 = idf, computed as log(1 + (N - n + 0.5) / (n + 0.5)) from:
          2 = n, number of documents containing term
          62 = N, total number of documents with field
        0.50568736 = tf, computed as freq / (freq + k1 * (1 - b + b * dl / avgdl)) from:
          3.0 = freq, occurrences of term within document
          1.2 = k1, term saturation parameter
          0.75 = b, length normalization parameter
          56.0 = dl, length of field (approximate)
          19.14516 = avgdl, average length of field
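
The explain output above can be checked by hand. The sketch below, in Python, copies the four top-level clause scores and the idf/tf values for the info:canal clause from the output; the dictionary keys are just labels chosen here. It suggests two things: the total is a plain sum of all four clauses (no disjunction-max step is visible), and the info score equals idf * tf, so the tiny boost from qf does not appear to have been applied to that clause.

```python
# Top-level clause scores copied from the explain output for doc 753.
clause_scores = {
    "parent_and_self_description: canal": 2.0064516,
    "parent_and_self_description: root + canal": 3.726406,
    "info: canal": 1.6317743,
    "info: root + canal": 2.366767,
}
reported_total = 9.731399

# A plain sum of the clauses reproduces the reported document score.
total = sum(clause_scores.values())
print(total, reported_total)

# Share of the final score that comes from the info field.
info_total = sum(s for name, s in clause_scores.items() if name.startswith("info"))
print(info_total / total)

# For info:canal, idf * tf alone already gives the clause score,
# which implies an effective boost of about 1 rather than 0.000000001.
idf, tf = 3.226844, 0.50568736
implied_boost = clause_scores["info: canal"] / (idf * tf)
print(implied_boost)
```

The info clauses account for roughly 41% of the total here, which is consistent with the document being pushed to the top by its info matches.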

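I would like documents that match on both fields to be scored on the parent_and_self_description field only, and documents that match only on the info field to score so low that they end up at the bottom of the ranking. What am I doing wrong? Thank you.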

【Comments】:

    Tags: solr


    【Solution 1】:

    I just needed to not specify any fields in the q parameter. So instead of

    parent_and_self_description:2-canal
    parent_and_self_description:rct
    info:2-canal
    info:rct
    

    just use q=2-canal rct
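
    The corrected request could be put together roughly like this. It is a sketch in Python that only builds the query string: the helper name build_select_params is made up, tie and debugQuery are added here for illustration, and the host and collection in the usage note are placeholders. With no field prefixes in q, the fields and boosts come from qf, which is what tie operates on.

```python
from urllib.parse import urlencode, parse_qs


def build_select_params(user_query, tie=0.0):
    """Build edismax parameters with an unfielded q (hypothetical helper)."""
    return {
        "q": user_query,  # plain terms only, no "field:" prefixes
        "defType": "edismax",
        "qf": "parent_and_self_description info^0.000000001",
        "tie": str(tie),  # 0.0 is the documented default, set explicitly here
        "debugQuery": "true",  # asks Solr for the score explanation
    }


params = build_select_params("2-canal rct")
query_string = urlencode(params)
print(query_string)
```

    The resulting string would be appended to a select URL such as http://localhost:8983/solr/mycollection/select? (placeholder host and collection name).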

    【Comments】:
