【Question Title】: How to check for a missing key in a nested-type Elasticsearch object?
【Posted】: 2015-07-15 03:55:16
【Question】:

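I have a case where I collect some general information and database information (db2, oracle, sybase, informix) in documents stored as key-value JSON.

I also have a set of rules that check whether such a document satisfies a particular rule and, if it does, return that document for analysis.

Here is the document: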

PUT /twitter/tweet/1
{
    "name": "Athena",
    "version": 1,
    "db": {
        "@type": "Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 64bit",
        "oracle_props": [
            {
                "@name": "open_cursors",
                "@value": 4000
            },
            {
                "@name": "USER_ROLE_PRIVS_COUNT",
                "@value": 1
            }
        ]
    }
}

And here is its mapping:

PUT /twitter/tweet/_mapping
{
   "properties": {
      "db": {
         "type": "object",
         "properties": {
            "@type": {
               "type": "string"
            },
            "oracle_props": {
               "type": "nested",
               "properties": {
                  "@name": {
                     "type": "string"
                  },
                  "@value": {
                     "type": "long"
                  }
               }
            }
         }
      }
   }
}
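
The `nested` type on `oracle_props` matters for the rule that follows: with a plain `object` array, Elasticsearch indexes the values of each field as one pool per document, so a name from one array entry can pair up with a value from another. The sketch below models that difference in plain Python. It does not call Elasticsearch, and the function names `matches_flattened` and `matches_nested` are made up for illustration.

```python
# Illustrative model only: plain Python, no Elasticsearch client involved.
props = [
    {"@name": "open_cursors", "@value": 6500},
    {"@name": "USER_ROLE_PRIVS_COUNT", "@value": 1},
]


def matches_flattened(props, name, max_value):
    """Model of a plain "object" array: values are pooled per field,
    so the name and the value may come from different entries."""
    names = [p["@name"] for p in props]
    values = [p["@value"] for p in props]
    return name in names and any(v <= max_value for v in values)


def matches_nested(props, name, max_value):
    """Model of "nested": both conditions must hold on the same entry."""
    return any(p["@name"] == name and p["@value"] <= max_value for p in props)


flattened_hit = matches_flattened(props, "open_cursors", 4000)
nested_hit = matches_nested(props, "open_cursors", 4000)
print(flattened_hit, nested_hit)
```

In this model the flattened check wrongly reports a hit, because the value 1 of USER_ROLE_PRIVS_COUNT satisfies the range, while the nested check does not.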

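Rule criteria

List the documents where the tweet has the name Athena and an Oracle database, and where open_cursors is less than the recommended value of 4000 or open_cursors is not present.

So the document /twitter/tweet/1 above should be returned only when the following match: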

  1. If (name == "Athena") && (db.@type contains the "Oracle" keyword)
  2. AND (if (("open_cursors" @value < 4000) || ("open_cursors" is missing under "db.oracle_props.@name")))

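Below is the search query that matches the document above but misses the last criterion (returning document "/twitter/tweet/1" even when the "open_cursors" key is missing under "db.oracle_props.@name"):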

GET /twitter/tweet/_search
{
   "query": {
      "bool": {
         "must": [
            {
               "match": {
                  "tweet.name": "Athena"
               }
            },
            {
               "match": {
                  "tweet.db.@type": "Oracle"
               }
            }
         ],
         "should": [
            {
               "nested": {
                  "path": "db.oracle_props",
                  "query": {
                     "bool": {
                        "must": [
                           {
                              "term": {
                                 "db.oracle_props.@name": "open_cursors"
                              }
                           },
                           {
                              "range": {
                                 "db.oracle_props.@value": {
                                    "lt": 4001
                                 }
                              }
                           }
                        ]
                     }
                  }
               }
            }
         ],
         "minimum_should_match": 1
      }
   }
}

【Question Discussion】:

    Tags: json elasticsearch


    【Solution 1】:

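    I'll give this another try, with reference to your other question and my answer to it.

    If I understand your requirements correctly, these are some sample documents; the comments say whether each one should or should not match: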

    # All good, should match
    curl -XPUT 'http://localhost:9200/twitter/tweet/1' -d '{
        "name": "Athena",
        "version": 1,
        "db": {
            "@type": "Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 64bit",
            "oracle_props": [
                {
                    "@name": "open_cursors",
                    "@value": 4000
                },
                {
                    "@name": "USER_ROLE_PRIVS_COUNT",
                    "@value": 1
                },
                {
                    "@name": "CREATE_PERMISSION",
                    "@value": "Y"
                }
            ]
        }
    }'
    
    # open_cursors missing, should match
    curl -XPUT 'http://localhost:9200/twitter/tweet/2' -d '{
        "name": "Athena",
        "version": 1,
        "db": {
            "@type": "Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 64bit",
            "oracle_props": [
                {
                    "@name": "USER_ROLE_PRIVS_COUNT",
                    "@value": 2
                },
                {
                    "@name": "CREATE_PERMISSION",
                    "@value": "N"
                }
            ]
        }
    }'
    
    # open_cursors less than 4000, should match
    curl -XPUT 'http://localhost:9200/twitter/tweet/3' -d '{
        "name": "Athena",
        "version": 1,
        "db": {
            "@type": "Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 64bit",
            "oracle_props": [
                {
                    "@name": "open_cursors",
                    "@value": 2134
                },
                {
                    "@name": "USER_ROLE_PRIVS_COUNT",
                    "@value": 6
                },
                {
                    "@name": "CREATE_PERMISSION",
                    "@value": "N"
                }
            ]
        }
    }'
    
    # Different name, shouldn't match
    curl -XPUT 'http://localhost:9200/twitter/tweet/4' -d '{
        "name": "Alexandroupolis",
        "version": 1,
        "db": {
            "@type": "Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 64bit",
            "oracle_props": [
                {
                    "@name": "open_cursors",
                    "@value": 4000
                },
                {
                    "@name": "USER_ROLE_PRIVS_COUNT",
                    "@value": 1
                },
                {
                    "@name": "CREATE_PERMISSION",
                    "@value": "Y"
                }
            ]
        }
    }'
    
    # open_cursors more than 4000, shouldn't match
    curl -XPUT 'http://localhost:9200/twitter/tweet/5' -d '{
        "name": "Athena",
        "version": 1,
        "db": {
            "@type": "Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 64bit",
            "oracle_props": [
                {
                    "@name": "open_cursors",
                    "@value": 6500
                },
                {
                    "@name": "USER_ROLE_PRIVS_COUNT",
                    "@value": 1
                },
                {
                    "@name": "CREATE_PERMISSION",
                    "@value": "Y"
                }
            ]
        }
    }'
    

    So we have 3 documents (IDs 1, 2 and 3) that should be returned.
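
The expected result can be checked with a small plain-Python model of the rule. This is only a sketch of the intended logic, not an Elasticsearch call: the helpers `make_doc` and `rule_matches` are made up for illustration, exact equality on `name` stands in for the match query, and the limit is treated as inclusive because document 1 (value 4000) is labeled as a match.

```python
ORACLE_TYPE = "Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 64bit"


def make_doc(name, props):
    """Build a document shaped like the samples above (hypothetical helper)."""
    return {
        "name": name,
        "version": 1,
        "db": {
            "@type": ORACLE_TYPE,
            "oracle_props": [{"@name": k, "@value": v} for k, v in props.items()],
        },
    }


def rule_matches(doc, limit=4000):
    """True if name is Athena, db.@type mentions Oracle, and open_cursors
    is either absent or at most `limit`."""
    if doc["name"] != "Athena" or "Oracle" not in doc["db"]["@type"]:
        return False
    cursors = [
        p["@value"]
        for p in doc["db"]["oracle_props"]
        if p["@name"] == "open_cursors"
    ]
    # An empty list means the key is missing, which also satisfies the rule.
    return not cursors or any(v <= limit for v in cursors)


docs = {
    1: make_doc("Athena", {"open_cursors": 4000, "USER_ROLE_PRIVS_COUNT": 1, "CREATE_PERMISSION": "Y"}),
    2: make_doc("Athena", {"USER_ROLE_PRIVS_COUNT": 2, "CREATE_PERMISSION": "N"}),
    3: make_doc("Athena", {"open_cursors": 2134, "USER_ROLE_PRIVS_COUNT": 6, "CREATE_PERMISSION": "N"}),
    4: make_doc("Alexandroupolis", {"open_cursors": 4000, "USER_ROLE_PRIVS_COUNT": 1, "CREATE_PERMISSION": "Y"}),
    5: make_doc("Athena", {"open_cursors": 6500, "USER_ROLE_PRIVS_COUNT": 1, "CREATE_PERMISSION": "Y"}),
}

matching_ids = [doc_id for doc_id, doc in docs.items() if rule_matches(doc)]
print(matching_ids)  # [1, 2, 3]
```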

    The solution I found seems rather complicated; maybe someone else can offer a simpler way to solve this?

    I set up a filtered query so that I can use an OR filter:

    curl -XGET 'http://localhost:9200/twitter/tweet/_search?pretty=true' -d '
    {
        "query" : {
            "filtered" : {
                "filter" : {
                    /* Set up two conditions */
                    "or" : [
                        /* First */
                        {
                            /* Check for open_cursors AND value <= 4000 */
                            "bool" : {
                                "must" : [
                                    /* Same nested query as in other questions answer */
                                    {
                                        "nested" : {
                                            "path" : "db.oracle_props",
                                            "filter" : {
                                                "bool" : {
                                                    "must" : [
                                                        {
                                                            "term": {
                                                                "db.oracle_props.@name": "open_cursors"
                                                            }
                                                        },
                                                        {
                                                            "range": {
                                                                "db.oracle_props.@value": {
                                                                    "lte": 4000
                                                                }
                                                            }
                                                        }
                                                    ]
                                                }
                                            }
                                        }
                                    }
                                ]
                            }
                        },
                        /* OR */
                        {
                            "bool" : {
                                /* watch out: negation, this MUST NOT be found*/
                                "must_not" : [
                                    {
                                        "nested" : {
                                            "path" : "db.oracle_props",
                                            "filter" : {
                                                "bool" : {
                                                    /* We do not want open_cursors to be in the nested document */
                                                    "must" : [
                                                        {
                                                            "term": {
                                                                "db.oracle_props.@name": "open_cursors"
                                                            }
                                                        }
                                                    ]
                                                }
                                            }
                                        }
                                    }
                                ]
                            }
                        }
                    ]
                },
                /* the query for the non-nested things */
                "query" : {
                    "bool" : {
                        "must" : [
                            {
                                "match" : {"tweet.name" : "Athena"}
                            },
                            {
                                "match" : {"tweet.db.@type" : "Oracle"}
                            }
                        ]
                    }
                }
            }
        }
    }
    '
    

    This returns documents 1, 2 and 3.

    Update: here is a simpler solution that should also work. Thanks @TuanHuynh

    curl -XGET 'http://localhost:9200/twitter/tweet/_search?pretty=true' -d '
    {
        "query" : {
            "filtered" : {
                "filter" : {
                    /* Set up two conditions */
                    "or" : [
                        /* First */
                        {
                            "nested" : {
                                "path" : "db.oracle_props",
                                "filter" : {
                                    "bool" : {
                                        "must" : [
                                            {
                                                "term": {
                                                    "db.oracle_props.@name": "open_cursors"
                                                }
                                            },
                                            {
                                                "range": {
                                                    "db.oracle_props.@value": {
                                                        "lte": 4000
                                                    }
                                                }
                                            }
                                        ]
                                    }
                                }
                            }
                        },
                        /* OR */
                        {
                            "not" : {
                                "nested" : {
                                    "path" : "db.oracle_props",
                                    "filter" : {
                                        "bool" : {
                                            /* Matches an open_cursors entry; the enclosing "not" negates it */
                                            "must" : [
                                                {
                                                    "term": {
                                                        "db.oracle_props.@name": "open_cursors"
                                                    }
                                                }
                                            ]
                                        }
                                    }
                                }
                            }
                        }
                    ]
                },
                /* the query for the non-nested things */
                "query" : {
                    "bool" : {
                        "must" : [
                            {
                                "match" : {"tweet.name" : "Athena"}
                            },
                            {
                                "match" : {"tweet.db.@type" : "Oracle"}
                            }
                        ]
                    }
                }
            }
        }
    }
    '
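
The `filtered` query and the `or` filter used above were deprecated in Elasticsearch 2.0 and removed in 5.0. On newer versions the same logic can probably be written as a single `bool` query, sketched here as a Python dict that serializes to the request body. This is an untested sketch under assumptions: it has not been run against a cluster, the field names drop the `tweet.` type prefix, and the helper name `open_cursors_clause` is made up for illustration.

```python
import json


def open_cursors_clause(extra=()):
    """Nested query for an entry named open_cursors, plus optional extra
    conditions that must hold on that same entry (hypothetical helper)."""
    must = [{"term": {"db.oracle_props.@name": "open_cursors"}}, *extra]
    return {
        "nested": {
            "path": "db.oracle_props",
            "query": {"bool": {"must": must}},
        }
    }


query = {
    "query": {
        "bool": {
            # The non-nested conditions
            "must": [
                {"match": {"name": "Athena"}},
                {"match": {"db.@type": "Oracle"}},
            ],
            # At least one of the two alternatives has to match
            "should": [
                # open_cursors present with value <= 4000
                open_cursors_clause(
                    [{"range": {"db.oracle_props.@value": {"lte": 4000}}}]
                ),
                # OR no open_cursors entry at all: the negation sits outside
                # the nested query, not inside it
                {"bool": {"must_not": [open_cursors_clause()]}},
            ],
            "minimum_should_match": 1,
        }
    }
}

body = json.dumps(query, indent=2)
print(body)
```

The negation has to wrap the whole nested clause. Putting `must_not` inside the nested query would instead match any document that has at least one entry other than open_cursors.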
    

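    【Discussion】:

    • Thanks. Exactly what I needed.
    • @Thorsten, why do we need a bool.must before the nested filter? Can we use the nested filter directly after the OR filter?
    • Thanks, you're right, it should work without it. I've updated the answer.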