【Question title】:Elasticsearch processor for shingles similar to split?
【Posted】:2021-03-19 03:13:15
【Question】:

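Is there an ingest processor that can produce shingles, or can I build a custom one somehow?

In the pipeline below I split the title on whitespace, but I would also like to combine the words the way the shingle analyzer does: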

PUT _ingest/pipeline/split
{
  "processors": [
    {
      "split": {
        "field": "title",
        "target_field": "title_suggest.input",
        "separator": "\\s+"
      }
    }
  ]
}
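
For reference, the ingest simulate API shows what this pipeline produces as written. A minimal sketch, assuming the pipeline has been stored under the name split and using a sample title:

```
POST _ingest/pipeline/split/_simulate
{
  "docs": [
    { "_source": { "title": "Senior Business Developer" } }
  ]
}
```

The simulated document should carry the three separate words Senior, Business and Developer under title_suggest.input, with no multi-word combinations.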

Example:

"Senior Business Developer" would need a suggestion field containing these terms:

  1. Senior Business Developer
  2. Business Developer
  3. Developer

Here are links to the article and the answer that prompted this question:

  1. https://blog.mimacom.com/autocomplete-elasticsearch-part3/
  2. How to combine completion, suggestion and match phrase across multiple text fields?

【Question comments】:

    Tags: elasticsearch search autocomplete n-gram shingles


    【Solution 1】:

    Here is one solution I came up with, using a custom script:

    PUT _ingest/pipeline/shingle
    {
      "description" : "Create basic shingles from the title field and store them in another field, title_suggest",
      "processors" : [
        {
          "script": {
            "lang": "painless",
            "source": """
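                  // Split s on every occurrence of the single character d.
                  // Consecutive delimiters yield empty strings in the result.
                  String[] split(String s, char d) {
                    // First pass: count the delimiters to size the result array.
                    int count = 0;

                    for (char c : s.toCharArray()) {
                        if (c == d) {
                            ++count;
                        }
                    }

                    if (count == 0) {
                        return new String[] {s};
                    }

                    // Second pass: cut out the piece in front of each delimiter.
                    // i0 is the start of the current piece, i1 the current index.
                    String[] r = new String[count + 1];
                    int i0 = 0, i1 = 0;
                    count = 0;

                    for (char c : s.toCharArray()) {
                        if (c == d) {
                            r[count++] = s.substring(i0, i1);
                            i0 = i1 + 1;
                        }

                        ++i1;
                    }

                    // The last piece runs to the end of the string.
                    r[count] = s.substring(i0, i1);

                    return r;
                  }

                  // Skip documents that have no title.
                  if (!ctx.containsKey('title') || ctx['title'] == null) { return; }
                  def title_words = split(ctx['title'], (char)' ');
                  def title_suggest = [];
                  // For each start word i, emit the word itself and every longer
                  // run of consecutive words that begins at i.
                  for (def i = 0; i < title_words.length; i++) {
                    def shingle = title_words[i];
                    title_suggest.add(shingle);
                    for (def j = i + 1; j < title_words.length; j++) {
                      shingle = shingle + ' ' + title_words[j];
                      title_suggest.add(shingle);
                    }
                  }
                  ctx['title_suggest'] = title_suggest;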
                """
          }
        }
      ]
    }
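
    To check what the pipeline produces without indexing anything, it can be run through the ingest simulate API. A minimal sketch, assuming the pipeline above has been stored under the name shingle and using a sample title:

    ```
    POST _ingest/pipeline/shingle/_simulate
    {
      "docs": [
        { "_source": { "title": "Senior Business Developer" } }
      ]
    }
    ```

    For this title the script should put six entries into title_suggest: "Senior", "Senior Business", "Senior Business Developer", "Business", "Business Developer" and "Developer". That is every contiguous run of words starting at any word, not only the runs that reach the end of the title.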
    

    【Comments】:
