【Question title】: ElasticSearch (Nest) - Ignoring Apostrophes
【Posted】: 2017-06-20 13:23:56
【Question】:

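I'm very new to Elasticsearch. I've been trying to implement a fairly basic search that ignores apostrophes. I found this part of the documentation. For my implementation I'm using the Nest library: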

[ElasticsearchType]
public class MappingTest
{
  [Text(Analyzer = "english")]
  public string Title { get; set; }
}

Indexing/searching:

var indexName = "testindex";
var connectionSettings =
    new ConnectionSettings(
        new Uri("https://url.com"))
        .DefaultIndex(indexName).EnableDebugMode();

var client = new ElasticClient(connectionSettings);

if (client.IndexExists(Indices.All, descriptor => descriptor.Index(indexName)).Exists)
    client.DeleteIndex(indexName);

var response = client.CreateIndex(indexName, i => new CreateIndexDescriptor(indexName)
    .Mappings(ms => ms
        .Map<MappingTest>(m => m.AutoMap())
    ).Settings(s => s
        .Analysis(a => a
            .Analyzers(aa => aa
                .Custom("english", ca => ca
                    .Tokenizer("standard")
                    .Filters("english_possessive_stemmer", "lowercase")
                )
            )
            .TokenFilters(
                t => t.Stemmer("english_possessive_stemmer", d => d.Language("possessive_english")))
            .TokenFilters(
                t => t.Stemmer("english_stemmer", d => d.Language("english")))
            .TokenFilters(
                t => t.Stop("english_stop", d => d.StopWords("_english_")))
        )
    ));


var obj = new MappingTest() { Title = "Example's" };


var indexResponse = client.Index(obj);

var term = "example";

QueryContainer commonQuery =
    Query<MappingTest>.QueryString(qs => qs.Query(term).DefaultField(f => f.Title));

var searchResponse = client.Search<MappingTest>(s => s.Query(x => commonQuery));
var debug = searchResponse.DebugInformation;    

I've tried several approaches, but every time I get no results. I would appreciate some help.

【Comments】:

    Tags: elasticsearch nest


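    【Solution 1】:

    A few things:

    1. You don't need to specify your own custom "english" analyzer, because it is already built into Elasticsearch. If you want to implement your own analyzer based on it, you need to specify it in full, with every component of the built-in one. In your case, I don't think you want to specify your own, because your custom analyzer only uses the english_possessive_stemmer and lowercase token filters.
    2. Calling .TokenFilters() multiple times means only the last call wins; calls in NEST generally perform assignment, so you would want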

      .TokenFilters(t => t
          .Stemmer("english_possessive_stemmer", d => d.Language("possessive_english"))
          .Stemmer("english_stemmer", d => d.Language("english"))
          .Stop("english_stop", d => d.StopWords("_english_")))
      
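    3. The Text attribute on the Title POCO property determines the analyzer used for that field

    4. An indexed document becomes available for search only after the refresh interval (1 second by default) has elapsed and the newly indexed document has been written to a new segment. In your example, I suspect the search runs before the document is available for search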
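
    Points 1 and 4 can be sketched together as follows. This is a minimal sketch, not a definitive fix: it assumes NEST 5.x or 6.x (the API generation used in the question), a node reachable at http://localhost:9200 (an assumed address), and it relies on the built-in "english" analyzer instead of custom analysis settings. Passing Refresh.WaitFor on the index request asks Elasticsearch to hold the response until the document is visible to search.

```csharp
using System;
using Elasticsearch.Net; // Refresh enum
using Nest;

[ElasticsearchType]
public class MappingTest
{
    // Built-in "english" analyzer; no custom analyzer is defined on the index.
    [Text(Analyzer = "english")]
    public string Title { get; set; }
}

public static class Program
{
    public static void Main()
    {
        var indexName = "testindex";

        // Assumption: a local single-node cluster; replace with your own URI.
        var settings = new ConnectionSettings(new Uri("http://localhost:9200"))
            .DefaultIndex(indexName);
        var client = new ElasticClient(settings);

        // Start from a clean index so the mapping below is always applied.
        if (client.IndexExists(indexName).Exists)
            client.DeleteIndex(indexName);

        // Only the mapping is needed; the analyzer comes from the Text attribute.
        client.CreateIndex(indexName, c => c
            .Mappings(ms => ms
                .Map<MappingTest>(m => m.AutoMap())));

        // Wait for a refresh so the document is searchable when the call returns.
        client.Index(
            new MappingTest { Title = "Example's" },
            i => i.Refresh(Refresh.WaitFor));

        var searchResponse = client.Search<MappingTest>(s => s
            .Query(q => q
                .QueryString(qs => qs
                    .Query("example")
                    .DefaultField(f => f.Title))));

        // Number of matching documents reported by Elasticsearch.
        Console.WriteLine(searchResponse.Total);
    }
}
```

    Waiting for a refresh on every index request is convenient in a test like this one, but it adds latency; for bulk loads, an explicit client.Refresh(indexName) call after indexing is the usual alternative.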

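    I'd recommend using the Analyze API to experiment with analyzers. You can create the index and mapping for your type, then analyze a piece of text using the mapping of a specific field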

    void Main()
    {
        var pool = new SingleNodeConnectionPool(new Uri("http://localhost:9200"));
        var defaultIndex = "default-index";
        var connectionSettings = new ConnectionSettings(pool)
            .DefaultIndex(defaultIndex)
            .PrettyJson()
            .EnableDebugMode(response =>
                {
                    if (response.RequestBodyInBytes != null)
                    {
                        Console.WriteLine(
                            $"{response.HttpMethod} {response.Uri} \n" +
                            $"{Encoding.UTF8.GetString(response.RequestBodyInBytes)}");
                    }
                    else
                    {
                        Console.WriteLine($"{response.HttpMethod} {response.Uri}");
                    }
    
                    Console.WriteLine();
    
                    if (response.ResponseBodyInBytes != null)
                    {
                        Console.WriteLine($"Status: {response.HttpStatusCode}\n" +
                                 $"{Encoding.UTF8.GetString(response.ResponseBodyInBytes)}\n" +
                                 $"{new string('-', 30)}\n");
                    }
                    else
                    {
                        Console.WriteLine($"Status: {response.HttpStatusCode}\n" +
                                 $"{new string('-', 30)}\n");
                    }
                });
    
        var client = new ElasticClient(connectionSettings);
    
        if (client.IndexExists(defaultIndex).Exists)
            client.DeleteIndex(defaultIndex);
    
        client.CreateIndex(defaultIndex, i => i
            .Mappings(ms => ms
                .Map<MappingTest>(m => m
                    .AutoMap()
                )
            )
        );
    
        client.Analyze(a => a
            .Index(defaultIndex)
            .Field<MappingTest>(f => f.Title)
            .Text("Example's")
        );
    }
    
    [ElasticsearchType]
    public class MappingTest
    {
        [Text(Analyzer = "english")]
        public string Title { get; set; }
    }
    

    You can see that the analyzer returns the following token for Example's

    {
      "tokens" : [
        {
          "token" : "exampl",
          "start_offset" : 0,
          "end_offset" : 9,
          "type" : "<ALPHANUM>",
          "position" : 0
        }
      ]
    }
    

    【Discussion】:
