【Question title】: Elasticsearch: indexing date fields that have null values
【Posted】: 2018-10-29 06:39:53
【Question】:

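I am using the Elasticsearch client in Python to create an index for the fields below, but I am stuck on indexing a date field that contains null values. I am having a hard time understanding why the field is not indexed as a date rather than a string when the data contains nulls. From online research and the ES docs, it seems you cannot index null values. So I am following this documentation, https://www.elastic.co/guide/en/elasticsearch/reference/current/null-value.html, to work around it with "null_value": "NULL", but I have had no success.

I have tried changing the actual dates to formats such as "yyyy-MM-dd" and "MM/dd/yyyy", along with many other combinations. For the JSON mapping, I also tried {"type": "strict_date"} and {"type": "strict_date": "MM/dd/yyyy"}. Is there any way to solve this?

Data: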

  id_name,team_name,team_members,date_info,date_sub
  123,"Biology, Neurobiology ","Ali Smith, Jon Doe",5/1/2015,5/1/2015
  234,Mathematics,Jane Smith ,8/12/2016,
  345,"Statistics, Probability","Matt P, Albert Shaw",5/15/2015,5/15/2015
  456,Chemistry,"Andrew M, Matt Shaw, Ali Smith",4/12/2017,
  678,Physics,"Joe Doe, Jane Smith, Ali Smith ",5/12/2017,5/12/2017

JSON/Python mapping:

request_body = '''
        {
            "settings" : {
              "number_of_shards": 2,
              "number_of_replicas": 1
            },

            "mappings": {
                "team": {
                    "properties": {
                        "id_name": { "type": "text"},
                        "team_name": { "type": "text"},
                        "team_members": { "type": "text"},
                        "date_info": {"type": "date","null_value": "NULL"},
                        "date_sub": {"type": "date","null_value":"NULL"}
                        }
                    }
            }
        }
    '''

    res = self.es.indices.create(index=your_index_name, ignore = 400, body=request_body)

Error:

raise HTTP_EXCEPTIONS.get(status_code, TransportError)(status_code, error_message, additional_info)
elasticsearch.exceptions.RequestError: TransportError(400, 'mapper_parsing_exception', 'failed to parse [date_info]')

【Comments】:

  • Could you post your indexing request?

Tags: python json elasticsearch


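【Solution 1】:

Your mapping does not specify a format for the date fields, so Elasticsearch falls back to its built-in default, "strict_date_optional_time||epoch_millis". This means a value must be either a long number representing milliseconds since the epoch, or a strict_date_optional_time value, which is a strict format.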

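A strict format means that a date such as 5/12/2017 must be zero-padded to full width, so the padded form would be 05/12/2017. Note that the default strict_date_optional_time format additionally expects ISO 8601 order (for example 2017-05-12); a slash-separated value like 05/12/2017 only parses if the mapping declares a matching format such as "MM/dd/yyyy".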

More information on date formats: https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-date-format.html#built-in-date-formats
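
If the default date format is kept, one option is to normalize the CSV values before indexing. The following is a minimal sketch, not part of the original answer: the helper name `to_iso_date` is hypothetical, and it assumes the source dates are in month/day/year order, as the sample row `5/15/2015` suggests.

```python
from datetime import datetime
from typing import Optional


def to_iso_date(raw: str, source_format: str = "%m/%d/%Y") -> Optional[str]:
    """Convert a date such as '5/1/2015' to ISO 8601 ('2015-05-01').

    Empty or whitespace-only input becomes None, which serializes to JSON null.
    """
    text = raw.strip()
    if not text:
        return None
    # strptime accepts non-padded day and month numbers for %d and %m.
    return datetime.strptime(text, source_format).date().isoformat()


if __name__ == "__main__":
    for value in ["5/1/2015", "8/12/2016", ""]:
        print(repr(value), "->", to_iso_date(value))
```

The ISO 8601 output should be acceptable to the default strict_date_optional_time format, so no custom "format" would be needed in the mapping under this approach.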

【Comments】:

    【Solution 2】:

    First, the mapping for your date fields must not contain "null_value": "NULL".

    I tried this in Kibana:

    PUT stackoverflow
    {
      "settings": {
        "number_of_shards": 2,
        "number_of_replicas": 1
      },
      "mappings": {
        "team": {
          "properties": {
            "id_name": {
              "type": "text"
            },
            "team_name": {
              "type": "text"
            },
            "team_members": {
              "type": "text"
            },
            "date_info": {
              "type": "date"
            },
            "date_sub": {
              "type": "date"
            }
          }
        }
      }
    }
    

    Then I inserted a document with a null date_info:

    POST stackoverflow/team
    {
      "id_name": 341,
      "team_name": "Gogologi",
      "team_members": "Wayern",
      "date_info": null,
      "date_sub": "2014-02-01"
    }
    

    To verify, I ran GET stackoverflow/team/_search:

    {
      "_index": "stackoverflow",
      "_type": "team",
      "_id": "AWOCTEhoVu_LbUvfNt6J",
      "_score": 1,
      "_source": {
        "id_name": 341,
        "team_name": "Gogologi",
        "team_members": "Wayern",
        "date_info": null,
        "date_sub": "2014-02-01"
      }
    }
    

    Hope this helps!
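
On the Python side, the same idea amounts to sending `None` (JSON null) for empty CSV cells rather than an empty string. A minimal sketch, assuming a hypothetical sample whose dates are already in ISO 8601; `rows_to_documents` and `CSV_TEXT` are illustrative names, not from the original post:

```python
import csv
import io
import json

# Hypothetical sample: same columns as the question, dates already ISO 8601.
CSV_TEXT = """id_name,team_name,team_members,date_info,date_sub
123,Biology,Ali Smith,2015-05-01,2015-05-01
234,Mathematics,Jane Smith,2016-08-12,
"""

DATE_FIELDS = ("date_info", "date_sub")


def rows_to_documents(csv_text: str) -> list:
    """Parse CSV text into dicts, mapping empty date cells to None."""
    documents = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        for field in DATE_FIELDS:
            if not row[field].strip():
                row[field] = None  # serialized as JSON null
        documents.append(dict(row))
    return documents


if __name__ == "__main__":
    for doc in rows_to_documents(CSV_TEXT):
        print(json.dumps(doc))
```

Each resulting dict could then be passed as the document body to the client's index call.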

    【Comments】:

      【Solution 3】:

      null_value needs to be the same data type as the field (see null_value | Elastic).

      I set null_value to a value that can be parsed by the specified format.

      PUT my-index-000001
      {
        "mappings": {
          "properties": {
            "date": {
              "type":   "date",
              "null_value": "01/01/0001", 
              "format": "dd/MM/yyyy"
            }
          }
        }
      }
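
To see why "NULL" is rejected while "01/01/0001" is accepted, the check can be approximated in plain Python. This is only an analogy, not how Elasticsearch parses dates internally: `parses_with_format` is a hypothetical helper, and "%d/%m/%Y" is assumed to be the strptime counterpart of the mapping's "dd/MM/yyyy".

```python
from datetime import datetime


def parses_with_format(value: str, python_format: str) -> bool:
    """Return True if `value` parses with the given strptime format."""
    try:
        datetime.strptime(value, python_format)
        return True
    except ValueError:
        return False


# Assumed Python counterpart of the mapping's "dd/MM/yyyy".
PYTHON_FORMAT = "%d/%m/%Y"

if __name__ == "__main__":
    for candidate in ["NULL", "01/01/0001"]:
        print(candidate, parses_with_format(candidate, PYTHON_FORMAT))
```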
      

      Then we can insert some documents.

      POST my-index-000001/_doc
      { "date": null }
      POST my-index-000001/_doc
      { "date": "01/01/0001" }
      POST my-index-000001/_doc
      { "date": "31/10/2021" }
      

      Now we can search for the null_value:

      GET my-index-000001/_search
      {
        "query": {
          "match": {
            "date": "01/01/0001"
          }
        }
      }
      
      ### Response ###
      {
        "took" : 0,
        "timed_out" : false,
        "_shards" : {
          "total" : 1,
          "successful" : 1,
          "skipped" : 0,
          "failed" : 0
        },
        "hits" : {
          "total" : {
            "value" : 2,
            "relation" : "eq"
          },
          "max_score" : 1.0,
          "hits" : [
            {
              "_index" : "my-index-000001",
              "_type" : "_doc",
              "_id" : "rY203nwBSf_8E_MJ7pyJ",
              "_score" : 1.0,
              "_source" : {
                "date" : null
              }
            },
            {
              "_index" : "my-index-000001",
              "_type" : "_doc",
              "_id" : "ro203nwBSf_8E_MJ9Jzy",
              "_score" : 1.0,
              "_source" : {
                "date" : "01/01/0001"
              }
            }
          ]
        }
      }
      

      Note, however, that documents indexed with the null_value are also matched by range queries.

      GET my-index-000001/_search
      {
        "query": {
          "range": {
            "date": {
              "lt": "01/01/2021"
            }
          }
        }
      }
      
      ### Response ###
      {
        "took" : 0,
        "timed_out" : false,
        "_shards" : {
          "total" : 1,
          "successful" : 1,
          "skipped" : 0,
          "failed" : 0
        },
        "hits" : {
          "total" : {
            "value" : 2,
            "relation" : "eq"
          },
          "max_score" : 1.0,
          "hits" : [
            {
              "_index" : "my-index-000001",
              "_type" : "_doc",
              "_id" : "rY203nwBSf_8E_MJ7pyJ",
              "_score" : 1.0,
              "_source" : {
                "date" : null
              }
            },
            {
              "_index" : "my-index-000001",
              "_type" : "_doc",
              "_id" : "ro203nwBSf_8E_MJ9Jzy",
              "_score" : 1.0,
              "_source" : {
                "date" : "01/01/0001"
              }
            }
          ]
        }
      }
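
The range behaviour can be modelled in a few lines of Python. This is an in-memory sketch of the three documents indexed above, not a call to Elasticsearch; `indexed_date` and `range_hits` are hypothetical names.

```python
from datetime import date, datetime
from typing import Optional

FORMAT = "%d/%m/%Y"
NULL_VALUE = "01/01/0001"  # sentinel taken from the mapping above

# The _source values of the three documents indexed above.
SOURCES = [None, "01/01/0001", "31/10/2021"]


def indexed_date(source_value: Optional[str]) -> date:
    """Model of null_value: a null source is indexed as the sentinel date."""
    return datetime.strptime(source_value or NULL_VALUE, FORMAT).date()


def range_hits(lt: str, gt: Optional[str] = None) -> list:
    """Return the source values whose indexed date lies strictly between gt and lt."""
    upper = datetime.strptime(lt, FORMAT).date()
    lower = datetime.strptime(gt, FORMAT).date() if gt else None
    hits = []
    for source in SOURCES:
        value = indexed_date(source)
        if value < upper and (lower is None or value > lower):
            hits.append(source)
    return hits


if __name__ == "__main__":
    print(range_hits(lt="01/01/2021"))
    print(range_hits(lt="01/01/2022", gt=NULL_VALUE))
```

In this model, adding a lower bound just above the sentinel keeps null documents out of the results, at the cost of also excluding any document whose real date equals the sentinel.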
      

      【Comments】:
