【Question title】: Elasticsearch does not return all fields for each hit
【Posted】: 2019-08-14 15:45:48
【Question description】:

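I have a problem with my Elasticsearch index. I am trying to fetch a number of fields for each hit, but when I search, Elasticsearch does not return all of them. If I GET the document by id, it returns all fields.

I tried using the _source parameter in my query, but it does not help: the query returns only a few of the fields listed in _source.

Is there some limit? Is the number or size of _source fields restricted?

Elasticsearch version: 7.1

My mapping: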

"video": {
    "properties": {
        "title": {
            "type": "text",
            "fields": {
                "keyword": {
                    "type": "keyword",
                    "ignore_above": 128
                },
                "basic_edge_ngram_analyzer": {
                    "type": "text",
                    "analyzer": "basic_edge_ngram_analyzer"
                },
                "basic_edge_ngram_analyzer_no_digit": {
                    "type": "text",
                    "analyzer": "basic_edge_ngram_analyzer_no_digit"
                },
                "basic_ngram_analyzer": {
                    "type": "text",
                    "analyzer": "basic_ngram_analyzer"
                },
                "basic_ngram_analyzer_no_digit": {
                    "type": "text",
                    "analyzer": "basic_ngram_analyzer_no_digit"
                },
                "numeric_analyzer": {
                    "type": "text",
                    "analyzer": "numeric_analyzer"
                },
                "translit_analyzer": {
                    "type": "text",
                    "analyzer": "translit_analyzer"
                },
                "translit_double_metaphone_analyzer": {
                    "type": "text",
                    "analyzer": "translit_double_metaphone_analyzer"
                }
            }
        },
        "inverse_title": {
            "type": "text",
            "fields": {
                "keyword": {
                    "type": "keyword",
                    "ignore_above": 128
                },
                "basic_edge_ngram_analyzer": {
                    "type": "text",
                    "analyzer": "basic_edge_ngram_analyzer"
                },
                "basic_edge_ngram_analyzer_no_digit": {
                    "type": "text",
                    "analyzer": "basic_edge_ngram_analyzer_no_digit"
                },
                "basic_ngram_analyzer": {
                    "type": "text",
                    "analyzer": "basic_ngram_analyzer"
                },
                "basic_ngram_analyzer_no_digit": {
                    "type": "text",
                    "analyzer": "basic_ngram_analyzer_no_digit"
                },
                "numeric_analyzer": {
                    "type": "text",
                    "analyzer": "numeric_analyzer"
                },
                "translit_analyzer": {
                    "type": "text",
                    "analyzer": "translit_analyzer"
                },
                "translit_double_metaphone_analyzer": {
                    "type": "text",
                    "analyzer": "translit_double_metaphone_analyzer"
                }
            }
        },
        "thumbnail_url": {
            "type": "keyword",
            "store": "true"
        },
        "is_classic": {
            "type": "boolean",
            "store": "true"
        },
        "is_club": {
            "type": "boolean",
            "store": "true"
        },
        "product_id": {
            "type": "integer",
            "store": "true"
        },
        "duration": {
            "type": "integer",
            "store": "true"
        },
        "feed_name": {
            "type": "keyword",
            "store": "true"
        },
        "feed_url": {
            "type": "keyword",
            "store": "true"
        },
        "created_ts": {
            "type": "date",
            "store": "true"
        },
        "hot_until": {"type": "date", "format": "date_hour_minute_second_fraction"},
        "description": {
            "type": "keyword"
        },
        "mi_tv_id": {"type": "integer"},
        "total_views": {"type": "long"},
        "month_views": {"type": "long"},
        "week_views": {"type": "long"},
        "day_views": {"type": "long"},
        "blocked_countries": {"type": "keyword"},
        "linked_persons": {
            "type": "nested",
            "properties": {
                "id": {"type": "integer"},
                "name": {"type": "keyword"}
            }
        },
        "linked_tags": {
            "type": "nested",
            "properties": {
                "id": {"type": "integer"},
                "name": {"type": "keyword"}
            }
        },
        "linked_hashtags": {
            "type": "nested",
            "properties": {
                "id": {"type": "integer"},
                "name": {"type": "keyword"}
            }
        }
    }
}

My query:

GET /video_idx/_search
{
    "aggs": {
        "mi_tv_id": {
            "terms": {
                "field": "mi_tv_id",
                "size": 10
            }
        },
        "linked_hashtags_id": {
            "aggs": {
                "linked_hashtags_id": {
                    "terms": {
                        "field": "linked_hashtags.id",
                        "size": 10
                    }
                }
            },
            "nested": {
                "path": "linked_hashtags"
            }
        },
        "author_id": {
            "terms": {
                "field": "author_id",
                "size": 10
            }
        },
        "linked_tags_id": {
            "aggs": {
                "linked_tags_id": {
                    "terms": {
                        "field": "linked_tags.id",
                        "size": 10
                    }
                }
            },
            "nested": {
                "path": "linked_tags"
            }
        },
        "linked_persons_id": {
            "aggs": {
                "linked_persons_id": {
                    "terms": {
                        "field": "linked_persons.id",
                        "size": 10
                    }
                }
            },
            "nested": {
                "path": "linked_persons"
            }
        }
    },
    "highlight": {
        "fields": {
            "inverse_title": {
                "pre_tags": ["<b>"],
                "type": "plain",
                "post_tags": ["</b>"]
            },
            "title": {
                "pre_tags": ["<b>"],
                "type": "plain",
                "post_tags": ["</b>"]
            }
        }
    },
    "from": 0,
    "size": 20,
    "_source": {
        "includes": ["mi_tv_id", "author_id", "hot_until", "id", "linked_persons", "linked_hashtags", "linked_tags", "total_views", "thumbnail_url", "feed_name", "feed_url", "duration", "is_club", "is_classic", "product_id", "created_ts", "title", "inverse_title", "description"]
    },
    "query": {
        "function_score": {
            "script_score": {
                "script": "\n                    double total = _score;\n    \n                    if (doc['total_views'].size() > 0) {total = total * Math.log(10 + 0.000087 * doc['total_views'].value)}\n                    if (doc['month_views'].size() > 0) {total = total * Math.log(10 + 0.00025 * doc['month_views'].value)}\n                    if (doc['week_views'].size() > 0) {total = total * Math.log(10 + 0.00077 * doc['week_views'].value)}\n                    if (doc['day_views'].size() > 0) {total = total * Math.log(10 + 0.0025 * doc['day_views'].value)}\n                    if (doc['hot_until'].size() > 0) {total = 1.5 * total}\n    \n                    if (doc['mi_tv_id'].size() > 0) {total = total * 1.5}\n    \n                    return total   \n                    "
            },
            "query": {
                "bool": {
                    "minimum_should_match": "20%",
                    "should": [{
                        "multi_match": {
                            "fields": ["title.basic_ngram_analyzer", "inverse_title.basic_ngram_analyzer"],
                            "operator": "and",
                            "tie_breaker": 1.0,
                            "minimum_should_match": "65%",
                            "type": "cross_fields",
                            "boost": 5.5,
                            "query": "\u0434\u043e\u043c 2"
                        }
                    }, {
                        "multi_match": {
                            "fields": ["title.keyword", "inverse_title.keyword"],
                            "operator": "and",
                            "tie_breaker": 1.0,
                            "minimum_should_match": "100%",
                            "type": "cross_fields",
                            "boost": 12.5,
                            "query": "\u0434\u043e\u043c 2"
                        }
                    }, {
                        "multi_match": {
                            "fields": ["title.translit_analyzer", "inverse_title.translit_analyzer"],
                            "operator": "and",
                            "tie_breaker": 1.0,
                            "minimum_should_match": "65%",
                            "type": "cross_fields",
                            "boost": 3,
                            "query": "\u0434\u043e\u043c 2"
                        }
                    }, {
                        "multi_match": {
                            "fields": ["title.numeric_analyzer", "inverse_title.numeric_analyzer"],
                            "operator": "and",
                            "tie_breaker": 1.0,
                            "minimum_should_match": "100%",
                            "type": "cross_fields",
                            "boost": 6,
                            "query": "\u0434\u043e\u043c 2"
                        }
                    }, {
                        "multi_match": {
                            "fields": ["title.basic_ngram_analyzer_no_digit", "inverse_title.basic_ngram_analyzer_no_digit"],
                            "operator": "and",
                            "tie_breaker": 1.0,
                            "minimum_should_match": "65%",
                            "type": "cross_fields",
                            "boost": 5.5,
                            "query": "\u0434\u043e\u043c 2"
                        }
                    }, {
                        "multi_match": {
                            "fields": ["title.basic_edge_ngram_analyzer", "inverse_title.basic_edge_ngram_analyzer"],
                            "operator": "and",
                            "tie_breaker": 1.0,
                            "minimum_should_match": "65%",
                            "type": "cross_fields",
                            "boost": 5.5,
                            "query": "\u0434\u043e\u043c 2"
                        }
                    }, {
                        "multi_match": {
                            "fields": ["title.translit_double_metaphone_analyzer", "inverse_title.translit_double_metaphone_analyzer"],
                            "operator": "and",
                            "tie_breaker": 1.0,
                            "minimum_should_match": "65%",
                            "type": "cross_fields",
                            "boost": 1,
                            "query": "\u0434\u043e\u043c 2"
                        }
                    }, {
                        "multi_match": {
                            "fields": ["description"],
                            "operator": "and",
                            "tie_breaker": 1.0,
                            "minimum_should_match": "100%",
                            "type": "cross_fields",
                            "boost": 1.0,
                            "query": "\u0434\u043e\u043c 2"
                        }
                    }],
                    "must_not": [{
                        "terms": {
                            "blocked_countries": ["RU"]
                        }
                    }]
                }
            }
        }
    }
}

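【Comments】:

  • Could you show the query you are sending and a sample document you expect it to match?
  • Sure, I added the information to the question body. Obviously I expect to get all the fields listed in _source.
  • Have you tried removing the _source.includes field from the query?
  • Yes, of course. I just tried again, and it did not help.
  • Which fields are missing?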
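
One way to answer the last question is to compare the _source returned by GET-by-id with the _source of the same document in the search response. The following is a minimal Python sketch; the function name `missing_fields` and the sample documents are hypothetical and are not taken from the question.

```python
from typing import Any, Dict, List


def missing_fields(get_source: Dict[str, Any], hit_source: Dict[str, Any]) -> List[str]:
    """Return top-level field names present in the GET-by-id _source
    but absent from the search hit's _source, sorted alphabetically."""
    return sorted(set(get_source) - set(hit_source))


# Hypothetical sample data, for illustration only.
from_get = {"title": "dom 2", "feed_name": "main", "duration": 120, "mi_tv_id": 7}
from_search = {"title": "dom 2", "mi_tv_id": 7}

print(missing_fields(from_get, from_search))
```

The sketch only compares top-level keys; nested objects such as linked_tags would need a recursive comparison.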

Tags: elasticsearch elastic-stack elk


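【Answer 1】:

You need to add all stored fields to the stored_fields parameter in the query. Stored fields are returned in each hit's "fields" section, not inside _source.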

"_source": {
    "includes": ["mi_tv_id", "author_id", "hot_until", "id", "linked_persons", "linked_hashtags", "linked_tags", "total_views", "thumbnail_url", "feed_name", "feed_url", "duration", "is_club", "is_classic", "product_id", "created_ts", "title", "inverse_title", "description"]
},
"stored_fields": ["feed_name", "feed_url", "duration", "is_club", ...],
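
When both parameters are sent, a hit carries the filtered document under _source and the stored fields under a separate "fields" key, where every value is an array. The sketch below shows one possible way to build such a request body and merge the two sections on the client side. It is a minimal illustration in Python: `build_body`, `flatten_hit`, and the sample hit are hypothetical names and data, not part of the question or of any Elasticsearch client library.

```python
from typing import Any, Dict, List


def build_body(source_includes: List[str], stored: List[str]) -> Dict[str, Any]:
    """Build the part of a search body that asks for both
    filtered _source and explicitly stored fields."""
    return {
        "_source": {"includes": source_includes},
        "stored_fields": stored,
    }


def flatten_hit(hit: Dict[str, Any]) -> Dict[str, Any]:
    """Merge a hit's _source with its "fields" section.

    Stored field values arrive as arrays, so single-element arrays
    are unwrapped. A value already present in _source wins.
    """
    merged = dict(hit.get("_source", {}))
    for name, values in hit.get("fields", {}).items():
        if name in merged:
            continue
        merged[name] = values[0] if len(values) == 1 else values
    return merged


# Hypothetical hit, shaped like one entry of hits.hits in a search response.
sample_hit = {
    "_id": "1",
    "_source": {"title": "dom 2", "mi_tv_id": 7},
    "fields": {"feed_name": ["main"], "duration": [120]},
}

print(build_body(["title", "mi_tv_id"], ["feed_name", "duration"]))
print(flatten_hit(sample_hit))
```

Preferring the _source value on a name clash is an arbitrary choice made for this sketch; either section could take precedence depending on which one the application trusts.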

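【Comments】:

  • Unfortunately it does not work: I added "stored_fields" to the query and it had no effect on my results.
  • Could you provide a gist so that I can reproduce what you are seeing?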