【Question title】: Minimal Solr score for inclusion in results?
【Posted】: 2018-11-16 02:34:25
【Question】:

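I created a collection of medical terms using all of the default Solr (7.5) settings. The documents came from a CSV file, which I loaded with bin/post, also with its default settings.

When I submit a silly query, I may not get as many rows as I requested.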

http://host/solr/collection/select?fl=anyLabel,score&q=anyLabel:(astronaut%20%20football%20felafel)&rows=9999&wt=csv

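Is there a score threshold? In this case, the lowest score is about 8. I have run other, less silly queries that returned reasonable results with scores as low as 2 or 3.

Why is this result set cut off after a result with a score of 8? Do I have any control over that?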

anyLabel,score
football,16.0328
astronaut haemolytic anaemia,15.470738
astronaut hemolytic anemia,15.470738
canadian football,14.440538
american football,14.440538
football field,14.440538
astronaut-bone demineralization syndrome,14.188901
indoor football arena,13.135968
australian rules football,13.135968
canadian football - sport,13.135968
american football - sport,13.135968
aussie rules football,13.135968
indoor football court,13.135968
astronaut-bone demineralization syndrome (disorder),13.103226
australian rules football ground,12.04758
indoor football arena (environment),12.04758
indoor american football arena,12.04758
american or canadian football,12.04758
american or canadian football field,11.12575
accidentally kicked during football game,11.12575
australian rules football ground (environment),11.12575
canadian football - sport (qualifier value),11.12575
american or canadian football - sport,11.12575
american football - sport (qualifier value),11.12575
australian rules football (qualifier value),11.12575
"american or canadian football\, device",11.12575
accidentally stepped on during football game,10.334962
american or canadian football field (environment),10.334962
accidentally kicked during football game (event),10.334962
american or canadian football - sport (qualifier value),9.649129
"american or canadian football\, device (physical object)",9.649129
accidentally stepped on during football game (event),9.649129
"place of occurrence of accident or poisoning\, football field",8.518538
"place of occurrence of accident or poisoning\, football field (environment)",8.047099

【Question comments】:

    Tags: solr


    【Solution 1】:

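    There is no minimum score. Anything with a score above 0 is considered a match in some way, and it is included in the results as long as your rows and start parameters make sense compared to the numFound value in the response.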
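To make the relationship between rows, start, and numFound concrete, here is a minimal sketch of how many documents a single response can contain. The helper name `returned_row_count` is hypothetical (it is not part of Solr or any client library), and the value 34 is simply the number of data rows in the question's CSV output, which is assumed here to equal numFound for that query.

```python
def returned_row_count(num_found: int, start: int = 0, rows: int = 10) -> int:
    """Number of documents in one page of results.

    Hypothetical helper for illustration only, not a Solr API.
    The defaults mirror Solr's own defaults (start=0, rows=10).
    """
    if start < 0 or rows < 0:
        raise ValueError("start and rows must be non-negative")
    # Documents left after skipping the first `start` matches.
    remaining = max(0, num_found - start)
    # A page can never hold more than `rows` documents.
    return min(rows, remaining)


# Assumption: the question's query matched 34 documents in total.
print(returned_row_count(34, start=0, rows=9999))   # every match fits in one page
print(returned_row_count(34, start=30, rows=10))    # only the tail is left
print(returned_row_count(34, start=40, rows=10))    # start is past the last match
```

Under that assumption, asking for 9999 rows returns only the documents that matched at all; the result list ends because the matches ran out, not because a score cutoff was reached.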

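    In general, scores are not comparable between requests, and it does not make sense to extrapolate from them either, for example by claiming that "a document with half the score of another is only 50% as relevant".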

    The score also depends on the similarity algorithm in use, which may differ between Solr versions. For 7.5, this is the BM25 similarity.
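As a rough illustration of why such scores are relative rather than absolute, the sketch below computes a textbook BM25 score for a single query term. It is not Lucene's exact implementation (which, among other things, stores field lengths in a lossy encoding), and the corpus statistics in `stats` are made-up values, not figures from the question's collection. The parameter defaults k1=1.2 and b=0.75 are the commonly used BM25 defaults.

```python
import math


def bm25_term_score(freq, doc_len, avg_doc_len, num_docs, doc_freq,
                    k1=1.2, b=0.75):
    """Textbook BM25 contribution of one query term to one document."""
    # Rarer terms (small doc_freq relative to num_docs) get a larger idf.
    idf = math.log(1 + (num_docs - doc_freq + 0.5) / (doc_freq + 0.5))
    # Fields longer than average are penalised, shorter ones boosted.
    length_norm = 1 - b + b * doc_len / avg_doc_len
    return idf * freq * (k1 + 1) / (freq + k1 * length_norm)


# Hypothetical corpus statistics, chosen only for illustration.
stats = dict(avg_doc_len=4.0, num_docs=100_000, doc_freq=30)

# Same term, same frequency, increasingly long field values.
for doc_len in (1, 2, 3, 6):
    print(doc_len, round(bm25_term_score(1, doc_len, **stats), 3))
```

The score falls as the field gets longer, which matches the pattern in the question's output where "football" outranks "canadian football", which in turn outranks longer labels. Because the result also depends on collection-wide statistics such as `num_docs` and `doc_freq`, the same document can score very differently under another query or another index, so a fixed cutoff like 8 has no stable meaning.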

    【Comments】:
