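I created a collection of medical terms using all default Solr (7.5) settings. The documents came from a CSV file, and I loaded them with bin/post using its default settings.
When I submit a silly query, I may not get as many rows as I requested.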
http://host/solr/collection/select?fl=anyLabel,score&q=anyLabel:(astronaut%20%20football%20felafel)&rows=9999&wt=csv
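Is there some score threshold? The lowest score in this case is ~8. I have run other, less silly queries that return reasonable results down to scores of 2 or 3.
Why is this result truncated after a score of 8? Do I have any control over this?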
anyLabel,score
football,16.0328
astronaut haemolytic anaemia,15.470738
astronaut hemolytic anemia,15.470738
canadian football,14.440538
american football,14.440538
football field,14.440538
astronaut-bone demineralization syndrome,14.188901
indoor football arena,13.135968
australian rules football,13.135968
canadian football - sport,13.135968
american football - sport,13.135968
aussie rules football,13.135968
indoor football court,13.135968
astronaut-bone demineralization syndrome (disorder),13.103226
australian rules football ground,12.04758
indoor football arena (environment),12.04758
indoor american football arena,12.04758
american or canadian football,12.04758
american or canadian football field,11.12575
accidentally kicked during football game,11.12575
australian rules football ground (environment),11.12575
canadian football - sport (qualifier value),11.12575
american or canadian football - sport,11.12575
american football - sport (qualifier value),11.12575
australian rules football (qualifier value),11.12575
"american or canadian football\, device",11.12575
accidentally stepped on during football game,10.334962
american or canadian football field (environment),10.334962
accidentally kicked during football game (event),10.334962
american or canadian football - sport (qualifier value),9.649129
"american or canadian football\, device (physical object)",9.649129
accidentally stepped on during football game (event),9.649129
"place of occurrence of accident or poisoning\, football field",8.518538
"place of occurrence of accident or poisoning\, football field (environment)",8.047099
Answer 0: (score: 2)
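There is no minimum score: anything above 0 is considered a match to some degree, as long as the rows and start parameters are consistent with the {{ 1}} value.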
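One way to check whether a result list was cut off or simply exhausted is to compare the number of returned documents with the total match count. The sketch below assumes the request is made with wt=json instead of wt=csv, in which case Solr reports the total under response.numFound alongside start and docs. The helper name explain_row_count and the canned response body are hypothetical; the real numFound for the query in the question is not known.

```python
def explain_row_count(body, rows):
    """Compare what Solr returned with what rows/start/numFound allow.

    `body` is a parsed Solr JSON response (wt=json); `rows` is the value
    sent in the request. Hypothetical helper, for illustration only.
    """
    resp = body["response"]
    num_found = resp["numFound"]   # total number of matching documents
    start = resp["start"]          # offset of the first returned document
    returned = len(resp["docs"])   # documents actually present in this page
    # A page can never hold more than the matches remaining after `start`.
    expected = max(0, min(rows, num_found - start))
    return {
        "numFound": num_found,
        "returned": returned,
        "expected": expected,
        "exhausted": start + returned >= num_found,
    }


# Canned, made-up response: 34 matches in total, all of them returned.
sample = {
    "response": {
        "numFound": 34,
        "start": 0,
        "docs": [{"anyLabel": f"doc {i}"} for i in range(34)],
    }
}

report = explain_row_count(sample, rows=9999)
print(report)
```

If exhausted is true, asking for more rows cannot return more documents: every document that matched at least one query term is already in the list, and the lowest score is just the score of the weakest match.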
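In general, scores are not comparable between requests, and it makes no sense to extrapolate from them, for example by claiming that a document with half the score of another is 50% as relevant.
The score also depends on the similarity algorithm in use, and the default similarity can differ between Solr versions. For 7.5, this is BM25 similarity.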
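To give a feel for why the absolute numbers depend on the collection, here is a minimal sketch of the textbook BM25 score for a single term, using the commonly cited defaults k1 = 1.2 and b = 0.75. It is not Lucene's code: Lucene stores field lengths in a lossy encoded form, among other details, so real scores will not match exactly. The function name and all corpus statistics below are made up for illustration.

```python
import math


def bm25_term_score(tf, field_len, avg_field_len, doc_freq, doc_count,
                    k1=1.2, b=0.75):
    """Textbook BM25 contribution of one query term to one document.

    tf            -- occurrences of the term in the field
    field_len     -- number of terms in this document's field
    avg_field_len -- average field length across the collection
    doc_freq      -- number of documents containing the term
    doc_count     -- number of documents in the collection
    """
    # Rare terms get a larger inverse document frequency.
    idf = math.log(1 + (doc_count - doc_freq + 0.5) / (doc_freq + 0.5))
    # Longer-than-average fields are penalised through the b parameter.
    length_norm = k1 * (1 - b + b * field_len / avg_field_len)
    return idf * tf * (k1 + 1) / (tf + length_norm)


# Hypothetical statistics: 100,000 documents, 50 of which contain the term,
# with an average label length of 4 terms.
for field_len in (1, 2, 3, 5, 9):
    score = bm25_term_score(tf=1, field_len=field_len, avg_field_len=4.0,
                            doc_freq=50, doc_count=100_000)
    print(field_len, round(score, 3))
```

With everything else fixed, the score falls as the field gets longer, which is consistent with the one-word label "football" ranking above the longer labels in the output in the question. Because idf and the average field length are properties of the whole collection, the same score value means different things for different queries and indexes.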