[Posted]: 2020-11-22 15:26:06
[Question]:
I have the following query.
{'bool':
{'must': [
{"terms": {"state.keyword": ["Alaska", "Alabama"]}}
],
'should': [
{'match': {'abstract': 'Spill and Overfill Prevention 18 AAC 78.045'}},
{'match': {'title': 'Spill and Overfill Prevention 18 AAC 78.045'}},
{'constant_score': {
'filter': {
'match': {'title': 'Spill and Overfill Prevention 18 AAC 78.045'}
}
}}
]}
}
I need the score to be computed from the title match only.
To do that, I tried using constant_score.
https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-constant-score-query.html
However, it does not have the desired effect. It just increases the score of every hit by exactly 1.
Here is the search response:
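That result matches how `constant_score` is documented to work: the wrapped filter only decides whether a document matches, and every matching document receives a score equal to `boost`, which defaults to 1.0. Because `bool` adds up the scores of its matching clauses, each hit simply gains that constant. A minimal sketch of the clause with an explicit `boost`; the helper name `constant_score_clause` is hypothetical and not part of any library:

```python
def constant_score_clause(field, text, boost=1.0):
    """Build a constant_score clause for a match on `field`.

    Every document that matches the filter gets exactly `boost` as the
    clause score, no matter how well it matches.
    """
    return {
        "constant_score": {
            "filter": {"match": {field: text}},
            "boost": boost,
        }
    }


# Same clause as in the query above, but contributing 3.0 instead of 1.0.
clause = constant_score_clause(
    "title", "Spill and Overfill Prevention 18 AAC 78.045", boost=3.0
)
```

Changing `boost` only changes the size of the constant; it never makes the clause reflect how well the title matched.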
{'took': 21, 'timed_out': False, '_shards': {'total': 1, 'successful': 1, 'skipped': 0, 'failed': 0}, 'hits': {'total': {'value': 6, 'relation': 'eq'}, 'max_score'
: 4.754379, 'hits': [{'_index': 'articles', '_type': '_doc', '_id': '483703', '_score': 4.754379, '_source':
Here is the explain result:
{'_index': 'articles', '_type': '_doc', '_id': '483703', 'matched': True, 'explanation': {'value': 6.6602507, 'description': 'sum of:', 'details': [{'value': 0.150
05009, 'description': 'weight(legal_language:and in 2) [PerFieldSimilarity], result of:', 'details': [{'value': 0.15005009, 'description': 'score(freq=14.0), compu
ted as boost * idf * tf from:', 'details': [{'value': 2.2, 'description': 'boost', 'details': []}, {'value': 0.074107975, 'description': 'idf, computed as log(1 +
(N - n + 0.5) / (n + 0.5)) from:', 'details': [{'value': 6, 'description': 'n, number of documents containing term', 'details': []}, {'value': 6, 'description': 'N
, total number of documents with field', 'details': []}]}, {'value': 0.92034066, 'description': 'tf, computed as freq / (freq + k1 * (1 - b + b * dl / avgdl)) from
:', 'details': [{'value': 14.0, 'description': 'freq, occurrences of term within document', 'details': []}, {'value': 1.2, 'description': 'k1, term saturation para
meter', 'details': []}, {'value': 0.75, 'description': 'b, length normalization parameter', 'details': []}, {'value': 504.0, 'description': 'dl, length of field (a
pproximate)', 'details': []}, {'value': 497.5, 'description': 'avgdl, average length of field', 'details': []}]}]}]}, {'value': 0.3779109, 'description': 'weight(l
egal_language:18 in 2) [PerFieldSimilarity], result of:', 'details': [{'value': 0.3779109, 'description': 'score(freq=3.0), computed as boost * idf * tf from:', 'd
etails': [{'value': 2.2, 'description': 'boost', 'details': []}, {'value': 0.24116206, 'description': 'idf, computed as log(1 + (N - n + 0.5) / (n + 0.5)) from:',
'details': [{'value': 5, 'description': 'n, number of documents containing term', 'details': []}, {'value': 6, 'description': 'N, total number of documents with fi
eld', 'details': []}]}, {'value': 0.7122915, 'description': 'tf, computed as freq / (freq + k1 * (1 - b + b * dl / avgdl)) from:', 'details': [{'value': 3.0, 'desc
ription': 'freq, occurrences of term within document', 'details': []}, {'value': 1.2, 'description': 'k1, term saturation parameter', 'details': []}, {'value': 0.7
5, 'description': 'b, length normalization parameter', 'details': []}, {'value': 504.0, 'description': 'dl, length of field (approximate)', 'details': []}, {'value
': 497.5, 'description': 'avgdl, average length of field', 'details': []}]}]}]}, {'value': 0.3779109, 'description': 'weight(legal_language:aac in 2) [PerFieldSimi
larity], result of:', 'details': [{'value': 0.3779109, 'description': 'score(freq=3.0), computed as boost * idf * tf from:', 'details': [{'value': 2.2, 'descriptio
n': 'boost', 'details': []}, {'value': 0.24116206, 'description': 'idf, computed as log(1 + (N - n + 0.5) / (n + 0.5)) from:', 'details': [{'value': 5, 'descriptio
n': 'n, number of documents containing term', 'details': []}, {'value': 6, 'description': 'N, total number of documents with field', 'details': []}]}, {'value': 0.
7122915, 'description': 'tf, computed as freq / (freq + k1 * (1 - b + b * dl / avgdl)) from:', 'details': [{'value': 3.0, 'description': 'freq, occurrences of term
within document', 'details': []}, {'value': 1.2, 'description': 'k1, term saturation parameter', 'details': []}, {'value': 0.75, 'description': 'b, length normali
zation parameter', 'details': []}, {'value': 504.0, 'description': 'dl, length of field (approximate)', 'details': []}, {'value': 497.5, 'description': 'avgdl, ave
rage length of field', 'details': []}]}]}]}, {'value': 1.0089812, 'description': 'weight(title:spill in 2) [PerFieldSimilarity], result of:', 'details': [{'value':
1.0089812, 'description': 'score(freq=1.0), computed as boost * idf * tf from:', 'details': [{'value': 2.2, 'description': 'boost', 'details': []}, {'value': 1.02
96195, 'description': 'idf, computed as log(1 + (N - n + 0.5) / (n + 0.5)) from:', 'details': [{'value': 2, 'description': 'n, number of documents containing term'
, 'details': []}, {'value': 6, 'description': 'N, total number of documents with field', 'details': []}]}, {'value': 0.44543427, 'description': 'tf, computed as fr
eq / (freq + k1 * (1 - b + b * dl / avgdl)) from:', 'details': [{'value': 1.0, 'description': 'freq, occurrences of term within document', 'details': []}, {'value'
: 1.2, 'description': 'k1, term saturation parameter', 'details': []}, {'value': 0.75, 'description': 'b, length normalization parameter', 'details': []}, {'value'
: 7.0, 'description': 'dl, length of field', 'details': []}, {'value': 6.6666665, 'description': 'avgdl, average length of field', 'details': []}]}]}]}, {'value':
0.072622515, 'description': 'weight(title:and in 2) [PerFieldSimilarity], result of:', 'details': [{'value': 0.072622515, 'description': 'score(freq=1.0), computed
as boost * idf * tf from:', 'details': [{'value': 2.2, 'description': 'boost', 'details': []}, {'value': 0.074107975, 'description': 'idf, computed as log(1 + (N
- n + 0.5) / (n + 0.5)) from:', 'details': [{'value': 6, 'description': 'n, number of documents containing term', 'details': []}, {'value': 6, 'description': 'N, t
otal number of documents with field', 'details': []}]}, {'value': 0.44543427, 'description': 'tf, computed as freq / (freq + k1 * (1 - b + b * dl / avgdl)) from:',
'details': [{'value': 1.0, 'description': 'freq, occurrences of term within document', 'details': []}, {'value': 1.2, 'description': 'k1, term saturation paramete
r', 'details': []}, {'value': 0.75, 'description': 'b, length normalization parameter', 'details': []}, {'value': 7.0, 'description': 'dl, length of field', 'detai
ls': []}, {'value': 6.6666665, 'description': 'avgdl, average length of field', 'details': []}]}]}]}, {'value': 1.0089812, 'description': 'weight(title:overfill in
2) [PerFieldSimilarity], result of:', 'details': [{'value': 1.0089812, 'description': 'score(freq=1.0), computed as boost * idf * tf from:', 'details': [{'value':
2.2, 'description': 'boost', 'details': []}, {'value': 1.0296195, 'description': 'idf, computed as log(1 + (N - n + 0.5) / (n + 0.5)) from:', 'details': [{'value'
: 2, 'description': 'n, number of documents containing term', 'details': []}, {'value': 6, 'description': 'N, total number of documents with field', 'details': []}
]}, {'value': 0.44543427, 'description': 'tf, computed as freq / (freq + k1 * (1 - b + b * dl / avgdl)) from:', 'details': [{'value': 1.0, 'description': 'freq, oc
currences of term within document', 'details': []}, {'value': 1.2, 'description': 'k1, term saturation parameter', 'details': []}, {'value': 0.75, 'description': '
b, length normalization parameter', 'details': []}, {'value': 7.0, 'description': 'dl, length of field', 'details': []}, {'value': 6.6666665, 'description': 'avgdl
, average length of field', 'details': []}]}]}]}, {'value': 1.0089812, 'description': 'weight(title:prevention in 2) [PerFieldSimilarity], result of:', 'details':
[{'value': 1.0089812, 'description': 'score(freq=1.0), computed as boost * idf * tf from:', 'details': [{'value': 2.2, 'description': 'boost', 'details': []}, {'va
lue': 1.0296195, 'description': 'idf, computed as log(1 + (N - n + 0.5) / (n + 0.5)) from:', 'details': [{'value': 2, 'description': 'n, number of documents contai
ning term', 'details': []}, {'value': 6, 'description': 'N, total number of documents with field', 'details': []}]}, {'value': 0.44543427, 'description': 'tf, comp
uted as freq / (freq + k1 * (1 - b + b * dl / avgdl)) from:', 'details': [{'value': 1.0, 'description': 'freq, occurrences of term within document', 'details': []}
, {'value': 1.2, 'description': 'k1, term saturation parameter', 'details': []}, {'value': 0.75, 'description': 'b, length normalization parameter', 'details': []}
, {'value': 7.0, 'description': 'dl, length of field', 'details': []}, {'value': 6.6666665, 'description': 'avgdl, average length of field', 'details': []}]}]}]},
{'value': 0.072622515, 'description': 'weight(title:18 in 2) [PerFieldSimilarity], result of:', 'details': [{'value': 0.072622515, 'description': 'score(freq=1.0),
computed as boost * idf * tf from:', 'details': [{'value': 2.2, 'description': 'boost', 'details': []}, {'value': 0.074107975, 'description': 'idf, computed as lo
g(1 + (N - n + 0.5) / (n + 0.5)) from:', 'details': [{'value': 6, 'description': 'n, number of documents containing term', 'details': []}, {'value': 6, 'descriptio
n': 'N, total number of documents with field', 'details': []}]}, {'value': 0.44543427, 'description': 'tf, computed as freq / (freq + k1 * (1 - b + b * dl / avgdl)
) from:', 'details': [{'value': 1.0, 'description': 'freq, occurrences of term within document', 'details': []}, {'value': 1.2, 'description': 'k1, term saturation
parameter', 'details': []}, {'value': 0.75, 'description': 'b, length normalization parameter', 'details': []}, {'value': 7.0, 'description': 'dl, length of field
', 'details': []}, {'value': 6.6666665, 'description': 'avgdl, average length of field', 'details': []}]}]}]}, {'value': 0.072622515, 'description': 'weight(title:
aac in 2) [PerFieldSimilarity], result of:', 'details': [{'value': 0.072622515, 'description': 'score(freq=1.0), computed as boost * idf * tf from:', 'details': [{
'value': 2.2, 'description': 'boost', 'details': []}, {'value': 0.074107975, 'description': 'idf, computed as log(1 + (N - n + 0.5) / (n + 0.5)) from:', 'details':
[{'value': 6, 'description': 'n, number of documents containing term', 'details': []}, {'value': 6, 'description': 'N, total number of documents with field', 'det
ails': []}]}, {'value': 0.44543427, 'description': 'tf, computed as freq / (freq + k1 * (1 - b + b * dl / avgdl)) from:', 'details': [{'value': 1.0, 'description':
'freq, occurrences of term within document', 'details': []}, {'value': 1.2, 'description': 'k1, term saturation parameter', 'details': []}, {'value': 0.75, 'descr
iption': 'b, length normalization parameter', 'details': []}, {'value': 7.0, 'description': 'dl, length of field', 'details': []}, {'value': 6.6666665, 'descriptio
n': 'avgdl, average length of field', 'details': []}]}]}]}, {'value': 1.5095675, 'description': 'weight(title:78.045 in 2) [PerFieldSimilarity], result of:', 'deta
ils': [{'value': 1.5095675, 'description': 'score(freq=1.0), computed as boost * idf * tf from:', 'details': [{'value': 2.2, 'description': 'boost', 'details': []}
, {'value': 1.5404451, 'description': 'idf, computed as log(1 + (N - n + 0.5) / (n + 0.5)) from:', 'details': [{'value': 1, 'description': 'n, number of documents
containing term', 'details': []}, {'value': 6, 'description': 'N, total number of documents with field', 'details': []}]}, {'value': 0.44543427, 'description': 'tf
, computed as freq / (freq + k1 * (1 - b + b * dl / avgdl)) from:', 'details': [{'value': 1.0, 'description': 'freq, occurrences of term within document', 'details
': []}, {'value': 1.2, 'description': 'k1, term saturation parameter', 'details': []}, {'value': 0.75, 'description': 'b, length normalization parameter', 'details
': []}, {'value': 7.0, 'description': 'dl, length of field', 'details': []}, {'value': 6.6666665, 'description': 'avgdl, average length of field', 'details': []}]}
]}]}, {'value': 1.0, 'description': 'ConstantScore(title.keyword:Spill and Overfill Prevention 18 AAC 78.045)', 'details': []}]}}
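The explain tree above is easier to read once the leaf scores are grouped by field. Adding up the values shown there, the `title` terms contribute about 4.754, the `legal_language` terms about 0.906, and the `ConstantScore` clause exactly 1.0, which together give the reported 6.6602507. A minimal sketch of that grouping; the helper name `score_by_field` is hypothetical, and the sample tree is a trimmed subset of the output above:

```python
import re
from collections import defaultdict

# Leaf descriptions look like: "weight(title:spill in 2) [PerFieldSimilarity], result of:"
_WEIGHT = re.compile(r"^weight\(([^:]+):")


def score_by_field(explanation):
    """Sum the per-term scores of an explain tree, grouped by field name."""
    totals = defaultdict(float)

    def walk(node):
        description = node.get("description", "")
        match = _WEIGHT.match(description)
        if match:
            # A term weight: count it once and do not descend into idf/tf details.
            totals[match.group(1)] += node["value"]
        elif description.startswith("ConstantScore("):
            totals["constant_score"] += node["value"]
        else:
            for child in node.get("details", []):
                walk(child)

    walk(explanation)
    return dict(totals)


# Trimmed subset of the explain output above (three of the eleven leaves).
sample = {
    "value": 2.15903129,
    "description": "sum of:",
    "details": [
        {"value": 0.15005009,
         "description": "weight(legal_language:and in 2) [PerFieldSimilarity], result of:",
         "details": []},
        {"value": 1.0089812,
         "description": "weight(title:spill in 2) [PerFieldSimilarity], result of:",
         "details": []},
        {"value": 1.0,
         "description": "ConstantScore(title.keyword:Spill and Overfill Prevention 18 AAC 78.045)",
         "details": []},
    ],
}

per_field = score_by_field(sample)
```

Passing the `explanation` object of a real explain response to `score_by_field` gives the same breakdown for the full tree.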
With script_score:
{'query': {
'function_score': {
'query': {
'bool': {
'should': [
{'match': {'legal_language': 'inspections and testing 691'}},
{'match': {'title': 'inspections and testing 691'}}
]
}
},
'script_score': {
'script': {'source': "doc['title'].value"}
}
}
}}
Mapping:
{
"articles" : {
"mappings" : {
"properties" : {
"abstract" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"categories" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"cfr40_part280" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"citation" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"effective_date" : {
"type" : "date"
},
"id" : {
"type" : "long"
},
"legal_language" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"local_regulation" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"reference_images" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"state" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"tags" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"title" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"unique_id" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
}
}
}
}
}
Traceback:
> Traceback (most recent call last):
>   File "D:\work_projects\dewey_project\webapp\articles\services\elasticsearch_service.py", line 103, in retrieve_articles
>     result = current_app.elasticsearch.search(
>   File "d:\work_projects\dewey_project\venv\lib\site-packages\elasticsearch\client\utils.py", line 84, in _wrapped
>     return func(*args, params=params, **kwargs)
>   File "d:\work_projects\dewey_project\venv\lib\site-packages\elasticsearch\client\__init__.py", line 1547, in search
>     return self.transport.perform_request(
>   File "d:\work_projects\dewey_project\venv\lib\site-packages\elasticsearch\transport.py", line 351, in perform_request
>     status, headers_response, data = connection.perform_request(
>   File "d:\work_projects\dewey_project\venv\lib\site-packages\elasticsearch\connection\http_urllib3.py", line 261, in perform_request
>     self._raise_error(response.status, raw_data)
>   File "d:\work_projects\dewey_project\venv\lib\site-packages\elasticsearch\connection\base.py", line 181, in _raise_error
>     raise HTTP_EXCEPTIONS.get(status_code, TransportError)(
> elasticsearch.exceptions.RequestError: RequestError(400, 'search_phase_execution_exception', 'runtime error')
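The traceback only surfaces the generic 'runtime error'. The error body returned by Elasticsearch usually carries a `caused_by` chain with the concrete reason, so it helps to print that chain. One plausible cause here (an assumption, since the traceback does not show it) is that the script reads `doc['title']`, a `text` field, and returns a string where `script_score` expects a number. A minimal sketch, assuming elasticsearch-py 7.x, where the caught `RequestError` exposes the parsed body as `.info`; the helper name `error_reasons` and the sample payload are hypothetical:

```python
def error_reasons(info):
    """Collect (type, reason) pairs from an Elasticsearch error body.

    Follows every `caused_by` link under `root_cause` and `failed_shards`
    and keeps each distinct pair once, outermost cause first.
    """
    reasons = []
    error = info.get("error", {}) if isinstance(info, dict) else {}
    if not isinstance(error, dict):
        return reasons
    nodes = list(error.get("root_cause", []))
    nodes += [shard.get("reason", {}) for shard in error.get("failed_shards", [])]
    for node in nodes:
        while isinstance(node, dict):
            pair = (node.get("type"), node.get("reason"))
            if pair not in reasons:
                reasons.append(pair)
            node = node.get("caused_by")
    return reasons


# Hypothetical payload that only mimics the general shape of an error body.
sample_info = {
    "error": {
        "type": "search_phase_execution_exception",
        "root_cause": [{"type": "script_exception", "reason": "runtime error"}],
        "failed_shards": [
            {"reason": {
                "type": "script_exception",
                "reason": "runtime error",
                "caused_by": {
                    "type": "illegal_argument_exception",
                    "reason": "hypothetical detail message",
                },
            }}
        ],
    },
    "status": 400,
}

reasons = error_reasons(sample_info)
```

In the service code this would be called as `error_reasons(exc.info)` inside an `except RequestError as exc:` block around the `search` call.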
[Comments]:
-
Just to clarify: are you trying to give a document more relevance (a higher score) if its title matches?
-
@Sharmiko not..
-
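I don't understand what you are trying to achieve. If you want to increase the score by a different amount, you have to specify the boost parameter.
If that is not what you want to do, it would help if you explained it more specifically.
-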
@Sharmiko I only need the score to be computed from
"match": {"title": text}, not from the whole query (the default).
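One way to get a score that comes from the title match alone is to keep the other clauses for matching but stop them from contributing to `_score`. The sketch below moves the state restriction into `filter` context, which is never scored, and wraps the abstract match in `constant_score` with a boost of 0. It assumes the Elasticsearch version in use accepts a boost of 0; the helper name `title_only_scoring_query` is hypothetical, and this is a sketch rather than a confirmed fix for the question.

```python
def title_only_scoring_query(text, states):
    """Build a bool query whose _score comes only from the title match."""

    def unscored_match(field):
        # Assumption: boost 0 is accepted, so the clause adds nothing to _score.
        return {"constant_score": {"filter": {"match": {field: text}}, "boost": 0}}

    return {
        "bool": {
            # Filter context restricts the hits without affecting the score.
            "filter": [{"terms": {"state.keyword": states}}],
            # As in the original query, the should clauses stay optional.
            "should": [
                {"match": {"title": text}},  # the only scored clause
                unscored_match("abstract"),
            ],
        }
    }


query = title_only_scoring_query(
    "Spill and Overfill Prevention 18 AAC 78.045", ["Alaska", "Alabama"]
)
```

The resulting dictionary would go under the `query` key of the search body. Documents that match only the state filter still come back, with a score of 0.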
Tags: python python-3.x elasticsearch