【Question title】: Search specific fields in nested documents as one document
【Posted】: 2016-01-01 15:30:38
【Question】:

I have the following mapping:

{
    "mappings": {
        "document": {
            "properties": {
                "title": {
                    "type": "string"
                },
                "paragraphs": {
                    "type": "nested",
                    "properties": {
                        "paragraph": {
                            "type" : "object",
                            "properties" : {
                                "content": { "type": "string"},
                                "number":{"type":"integer"}
                            }
                        }
                    }
                }
            }
        }
    }
}

And these example documents:

{
    "title":"Dubai seeks cause of massive hotel fire at New Year",
    "paragraphs":[
    {"paragraph": {"number": "1", "content":"Firefighters managed to subdue the blaze, but part of the Address Downtown Hotel is still smouldering."}}, 
    {"paragraph": {"number": "2", "content":"A BBC reporter says a significant fire is still visible on the 20th floor, where the blaze apparently started."}}, 
    {"paragraph": {"number": "3", "content":"The tower was evacuated and 16 people were hurt. But a fireworks show went ahead at the Burj Khalifa tower nearby."}}, 
    {"paragraph": {"number": "4", "content":"The Burj Khalifa is the world's tallest building and an iconic symbol of the United Arab Emirates (UAE)."}}]
}

{
    "title":"Munich not under imminent IS threat",
    "paragraphs":[{"paragraph": {"number": "1", "content":"German officials say there is no sign of any imminent terror attack, after an alert that shut down two Munich railway stations on New Year's Eve."}}]
}

I can currently search within each individual paragraph using:

{ 
    "query": { 
        "nested": { 
            "path": "paragraphs", "query": { 
                "query_string": { 
                    "default_field": "paragraphs.paragraph.content", 
                    "query": "Firefighters AND still" 
                } 
            } 
        }
    }
}

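Question: how do I write a query that searches across multiple paragraphs, but only in the content field?

This works, but it searches all fields: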

{
  "query": {
    "query_string": {
      "query": "Firefighters AND apparently AND 1"
    }
  }
}

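It matches Firefighters in paragraph 1 and apparently in paragraph 2, which is what I want. However, I do not want 1 to match, because it does not come from the content field.

Clarification: the first search runs per paragraph, which is what I want. However, I sometimes also want to be able to search the whole document (all paragraphs at once).

Solution: I added "include_in_parent": true, as mentioned in https://www.elastic.co/guide/en/elasticsearch/reference/1.7/mapping-nested-type.html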
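
For readers who want to see what that change might look like, here is a minimal sketch of the mapping from the question with include_in_parent enabled on the nested field. It assumes the Elasticsearch 1.x mapping syntax described on the linked page and an index that is created with this mapping before any documents are indexed; nothing else in the mapping is changed.

```json
{
    "mappings": {
        "document": {
            "properties": {
                "title": { "type": "string" },
                "paragraphs": {
                    "type": "nested",
                    "include_in_parent": true,
                    "properties": {
                        "paragraph": {
                            "type": "object",
                            "properties": {
                                "content": { "type": "string" },
                                "number": { "type": "integer" }
                            }
                        }
                    }
                }
            }
        }
    }
}
```

With the nested fields also copied into the parent document, a plain query_string (no nested wrapper) restricted to the content field should be able to match terms that sit in different paragraphs:

```json
{
    "query": {
        "query_string": {
            "default_field": "paragraphs.paragraph.content",
            "query": "Firefighters AND apparently"
        }
    }
}
```

The per-paragraph nested query shown earlier keeps working alongside this one, so both search modes remain available on the same index.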

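【Comments】:

  • What is wrong with your first query? Doesn't it go through all the paragraphs?
  • It does. But I also want the option to search all paragraphs at once, so that "Firefighters AND apparently" returns the document even when the two terms are in different paragraphs.
  • You could try writing a script.

Tags: elasticsearch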


【Answer 1】:

The way you are querying is wrong, because nested documents are indexed separately. See the last paragraph of the doc.

Your query

{
  "query": {
    "nested": {
      "path": "paragraphs",
      "query": {
        "query_string": {
          "default_field": "paragraphs.paragraph.content",
          "query": "Firefighters AND apparently"
        }
      }
    }
  }
}

looks for both words in the same paragraph, which is why you get no results. You need to query for them separately, like this:

{
  "query": {
    "bool": {
      "must": [
        {
          "nested": {
            "path": "paragraphs",
            "query": {
              "match": {
                "paragraphs.paragraph.content": "firefighters"
              }
            }
          }
        },
        {
          "nested": {
            "path": "paragraphs",
            "query": {
              "match": {
                "paragraphs.paragraph.content": "apparently"
              }
            }
          }
        }
      ]
    }
  }
}

This will give you the correct results.

As a side note, I don't think you need the object datatype inside paragraphs. The following works just as well:

"paragraphs": {
      "type": "nested",
      "properties": {
          "content": {
              "type": "string"
          },
          "number": {
              "type": "integer"
          }
      }
  }
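
One consequence of flattening the mapping this way is that the field path changes: content is then addressed as paragraphs.content rather than paragraphs.paragraph.content, and each entry in the paragraphs array has to be indexed without the paragraph wrapper (for example {"number": 1, "content": "..."}). As a sketch under that assumption, the two-clause query above would become:

```json
{
  "query": {
    "bool": {
      "must": [
        {
          "nested": {
            "path": "paragraphs",
            "query": {
              "match": { "paragraphs.content": "firefighters" }
            }
          }
        },
        {
          "nested": {
            "path": "paragraphs",
            "query": {
              "match": { "paragraphs.content": "apparently" }
            }
          }
        }
      ]
    }
  }
}
```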

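Hope this helps!

【Comments】:

  • Thanks. I went through the page, and I needed to add "include_in_parent": true to make it work.
  • Glad I could help. I didn't go with "include_in_parent" because it can increase the index size considerably, but it will certainly work.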