【Question title】: My ES custom analyser is not used?
【Posted】: 2015-05-06 22:03:40
【Question】:

I'm using Elasticsearch and creating an index with the mappings and settings below. My problem is that the field geography.locality, which should use 'name_analyser', doesn't seem to use it.

{
  "index": "programs",
  "body": {
    "settings": {
      "number_of_shards": 5,
      "analysis": {
        "filter": {
          "elision": {
            "type": "elision",
            "articles": [
              "l",
              "m",
              "t",
              "qu",
              "n",
              "s",
              "j",
              "d"
            ]
          },
          "multi_words": {
            "type": "shingle",
            "min_shingle_size": 2,
            "max_shingle_size": 10
          },
          "name_filter": {
            "type": "edgeNGram",
            "max_gram": 100,
            "min_gram": 2
          }
        },
        "tokenizer": {
          "name_tokenizer": {
            "type": "edgeNGram",
            "max_gram": 100,
            "min_gram": 2
          }
        },
        "analyser": {
          "name_analyser": {          // <-- analyser I want to use on geography.locality
            "tokenizer": "whitespace",
            "type": "custom",
            "filter": [
              "lowercase",
              "multi_words",
              "name_filter",
              "asciifolding"
            ]
          },
          "french": {
            "tokenizer": "letter",
            "filter": [
              "asciifolding",
              "lowercase",
              "elision",
              "stop"
            ]
          },
          "city_name": {
            "type": "custom",
            "tokenizer": "letter",
            "filter": [
              "lowercase",
              "asciifolding"
            ]
          }
        }
      }
    },
    "mappings": {
      "program": {
        "properties": {
          "nid": {
            "type": "integer",
            "index": "not_analyzed"
          },
          "title": {
            "type": "string"
          },
          "language": {
            "type": "string",
            "index": "not_analyzed"
          },
          "regulation": {
            "type": "integer"
          },
          "sales_state": {
            "type": "integer"
          },
          "enabled_dwell": {
            "type": "boolean"
          },
          "enabled_invest": {
            "type": "boolean"
          },
          "delivery_date": {
            "type": "date"
          },
          "address": {
            "properties": {
              "country": {
                "type": "string",
                "index": "not_analyzed"
              },
              "locality": {
                "type": "string",
                "analyser": "name_analyser"
              },
              "postal_code": {
                "type": "integer"
              },
              "thoroughfare": {
                "type": "string",
                "index": "not_analyzed"
              },
              "premise": {
                "type": "string",
                "index": "not_analyzed"
              }
            }
          },
          "location": {
            "type": "geo_point"
          },
          "geography": {
            "properties": {
              "locality": {
                "type": "string",
                "analyser": "name_analyser"  // ... here :-/
              },
              "department": {
                "type": "string",
                "index": "not_analyzed"
              },
              "region": {
                "type": "string",
                "index": "not_analyzed"
              }
            }
          },
          "lots": {
            "type": "nested",
            "include_in_all": false,
            "properties": {
              "lot_type": {
                "type": "integer"
              },
              "rooms": {
                "type": "integer"
              },
              "price_vat_inc": {
                "type": "integer"
              },
              "price_reduced_vat_inc": {
                "type": "integer"
              },
              "price_vat_ex": {
                "type": "integer"
              }
            }
          }
        }
      }
    }
  }
}

This is the mapping ES reports as registered for that index.

{
  "program": {
    "properties": {
      "address": {
        "properties": {
          "country": {
            "index": "not_analyzed",
            "type": "string"
          },
          "premise": {
            "index": "not_analyzed",
            "type": "string"
          },
          "locality": {
            "type": "string"
          },
          "postal_code": {
            "type": "integer"
          },
          "thoroughfare": {
            "index": "not_analyzed",
            "type": "string"
          }
        }
      },
      "sales_state": {
        "type": "integer"
      },
      "nid": {
        "type": "integer"
      },
      "language": {
        "index": "not_analyzed",
        "type": "string"
      },
      "title": {
        "type": "string"
      },
      "enabled_invest": {
        "type": "boolean"
      },
      "geo_point": {
        "type": "string"
      },
      "lots": {
        "include_in_all": false,
        "type": "nested",
        "properties": {
          "rooms": {
            "include_in_all": false,
            "type": "integer"
          },
          "price_vat_inc": {
            "include_in_all": false,
            "type": "integer"
          },
          "price_vat_ex": {
            "include_in_all": false,
            "type": "integer"
          },
          "lot_type": {
            "include_in_all": false,
            "type": "integer"
          },
          "price_reduced_vat_inc": {
            "include_in_all": false,
            "type": "integer"
          }
        }
      },
      "enabled_dwell": {
        "type": "boolean"
      },
      "delivery_date": {
        "format": "dateOptionalTime",
        "type": "date"
      },
      "regulation": {
        "type": "integer"
      },
      "geography": {
        "properties": {
          "locality": {
            "type": "string"      // name_analyser should show up here right?????
          },
          "department": {
            "index": "not_analyzed",
            "type": "string"
          },
          "region": {
            "index": "not_analyzed",
            "type": "string"
          }
        }
      },
      "location": {
        "type": "geo_point"
      }
    }
  }
}

Does anyone know what I'm doing wrong? I'm a bit lost on this one.

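【Comments】:

  • The mapping returned by ES contains "geo_point": { "type": "string" }, which is not defined in your static mapping. Some other fields also don't match the settings in your static mapping, not only locality and geo_point. How did you create the index? It should be something like PUT /programs { "settings": { "analysis": { "analyzer": {.... Where does the { "index": "programs", "body": { in your output come from?
  • I'm using the php client library, so I just call something like: $client->indices()->create($params);
  • Then I'd say there is a problem with how you create the index. I suggest testing the index creation on its own, outside your php client. I'd say it should work. After that, take a closer look at the php code that creates it.
  • OK, I'll try that and post an update here.
  • OK, I just created a clean index from kopf using PUT and so on. The resulting mapping is exactly the same... so I guess the php client isn't involved.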

Tags: mapping elasticsearch


【Answer 1】:

You have a typo :-), actually two:

    "locality": {
      "type": "string",
      "analyser": "name_analyser"
    },

in both address and geography. The key should be analyzer, not analyser (with an s).

The same applies here, in the settings:

    "analyser": {
      "name_analyser": {
        "tokenizer": "whitespace",
    ...
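
As a minimal sketch of that fix, the snippet below renames every `analyser` key to `analyzer` in a trimmed-down copy of the request body from the question. The helper name `rename_key` and the shortened body are assumptions made for illustration; only the key names and the analyzer name come from the question. The analyzer's own name, `name_analyser`, is a user-chosen identifier, so it can keep its spelling.

```python
import json


def rename_key(node, wrong="analyser", right="analyzer"):
    """Return a copy of `node` with every dict key `wrong` renamed to `right`.

    Only keys are renamed; string values such as "name_analyser" are untouched.
    """
    if isinstance(node, dict):
        return {
            (right if key == wrong else key): rename_key(value, wrong, right)
            for key, value in node.items()
        }
    if isinstance(node, list):
        return [rename_key(item, wrong, right) for item in node]
    return node


# Trimmed-down version of the body from the question (hypothetical subset).
body = {
    "settings": {
        "analysis": {
            "analyser": {
                "name_analyser": {"type": "custom", "tokenizer": "whitespace"}
            }
        }
    },
    "mappings": {
        "program": {
            "properties": {
                "geography": {
                    "properties": {
                        "locality": {"type": "string", "analyser": "name_analyser"}
                    }
                }
            }
        }
    },
}

fixed = rename_key(body)
print(json.dumps(fixed, indent=2))
```

Both the settings section and each field mapping need the corrected key, since the mapping refers to the analyzer by name.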

【Comments】:

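【Answer 2】:

I'm guessing the index already exists and you are trying to update its settings with a new analyzer. That is not allowed on a live (open) index.

Did you get any errors when submitting the updated settings?

Take a look at this post: Change settings and mappings on existing index in Elasticsearch

And here: http://www.elastic.co/guide/en/elasticsearch/reference/1.x/indices-update-settings.html#update-settings-analysis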
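
For an index that does already exist, the documentation page linked above describes closing the index, sending the new analysis settings, and then reopening it. The sketch below only builds that sequence of requests as data and sends nothing; the function name `analysis_update_plan` is hypothetical, and the `city_name` analyzer is borrowed from the question purely as sample input.

```python
import json


def analysis_update_plan(index, analysis):
    """Return the (method, path, body) steps for changing analysis settings.

    The index is closed first, the settings are updated, then it is reopened.
    Nothing is sent over the network here.
    """
    return [
        ("POST", f"/{index}/_close", None),
        ("PUT", f"/{index}/_settings", json.dumps({"analysis": analysis})),
        ("POST", f"/{index}/_open", None),
    ]


steps = analysis_update_plan(
    "programs",
    {
        "analyzer": {
            "city_name": {
                "type": "custom",
                "tokenizer": "letter",
                "filter": ["lowercase", "asciifolding"],
            }
        }
    },
)

for method, path, body in steps:
    print(method, path, body or "")
```

Each tuple can be passed to whichever HTTP client or plugin is used to talk to the cluster.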

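【Comments】:

• I delete and recreate the index every time.
• In that case, I would test the index creation on its own (using sense or any other plugin), as @Andrei Smith suggested.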