[Question title]: Hibernate/Lucene search: Wildcard query
[Posted]: 2014-06-07 21:58:40
[Question]:

I'm trying to implement a wildcard query, but I'm stuck... Can anyone help me?

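If the user searches for AB, I want to return results that match the regular expression "A[/.-'+&, ]{0,1}B[/.-'+&, ]{0,1}". I know a regular expression can't be used here; I only want to illustrate the expected result.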

So a search for "AB" should return results such as: "ABC x", "abc x", "Abcdefg", "A.b.c.", "A-B-C", "A B C d", "Table".
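To make the target behaviour concrete, here is a minimal plain-Java sketch of the matching rule described above. It is only an illustration of the expected results, not something that can be plugged into the Lucene query. The class and method names are hypothetical, and the hyphen in the character class is escaped so that it is read literally.

```java
import java.util.regex.Pattern;

final class ExpectedMatchSketch {
    // Optional separator after each letter; "\\-" keeps the hyphen literal
    // instead of forming a range inside the character class.
    private static final String SEP = "[/.\\-'+&, ]?";

    // Hypothetical helper: builds the illustrative pattern for a search term.
    static Pattern patternFor(String term) {
        StringBuilder sb = new StringBuilder();
        for (char c : term.toCharArray()) {
            sb.append(Pattern.quote(String.valueOf(c))).append(SEP);
        }
        return Pattern.compile(sb.toString(), Pattern.CASE_INSENSITIVE);
    }

    // True if the candidate contains the term, allowing one separator between letters.
    static boolean matches(String term, String candidate) {
        return patternFor(term).matcher(candidate).find();
    }

    public static void main(String[] args) {
        String[] names = {"ABC x", "abc x", "Abcdefg", "A.b.c.", "A-B-C", "A B C d", "Table", "Axyz B"};
        for (String name : names) {
            System.out.println(name + " -> " + matches("AB", name));
        }
    }
}
```

The last entry, "Axyz B", is an extra counter-example added for the sketch; it should not match because more than one character separates the two letters.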

I created a model using this analyzer:

@Indexed
@AnalyzerDef(name = "abAnalyzer", tokenizer = @TokenizerDef(factory = StandardTokenizerFactory.class), filters =
{
    @TokenFilterDef(factory = StandardFilterFactory.class),
    @TokenFilterDef(factory = LowerCaseFilterFactory.class),
    @TokenFilterDef(factory = StopFilterFactory.class, params =
    {
        @Parameter(name = "ignoreCase", value = "true")
    })
})
public class Foo
{
    ...
    @Field(index = Index.YES, analyze = Analyze.YES, store = Store.NO)
    @Analyzer(definition = "abAnalyzer")
    private String name;

    ...
}

I implemented the query as shown below. I get all the expected results except ones like "A.b.c". What am I doing wrong? What have I misunderstood?

public List<Foo> getResults(final String searchName)
{
    Session session = this.sessionFactory.openSession();
    FullTextSession fullTextSession = Search.getFullTextSession(session);
    Transaction tx = fullTextSession.beginTransaction();

    BooleanQuery bQuery = new BooleanQuery();
    Analyzer analyzer = fullTextSession.getSearchFactory().getAnalyzer("abAnalyzer");

    QueryParser qp = new QueryParser(Version.LUCENE_36, "name", analyzer);
    String cleanedText = qp.parse(searchName).toString("name");
    String[] tokenized = cleanedText.split(""); // split on each character

    QueryBuilder qBuilder = fullTextSession.getSearchFactory().buildQueryBuilder().forEntity(Foo.class).get();

    org.apache.lucene.search.Query query = qBuilder.keyword().wildcard().onField("name").matching("*" + cleanedText + "*").createQuery();
    bQuery.add(query, BooleanClause.Occur.SHOULD);

    query = qBuilder.keyword().wildcard().onField("name").matching("*" + createSearchString(cleanedText) + "*").createQuery();
    bQuery.add(query, BooleanClause.Occur.SHOULD);

    // query the indexed entity defined above (Foo)
    org.hibernate.Query hibQuery = fullTextSession.createFullTextQuery(bQuery, Foo.class);
    return hibQuery.list();
}

....

private String createSearchString(final String name)
{
    StringBuilder searchName = new StringBuilder("");
    for (int i = 0; i < name.length(); i++)
    {
        if (searchName.length() > 0)
        {
            searchName.append("?");
        }
        searchName.append(name.charAt(i));
    }

    return searchName.toString();
}
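For reference, the two wildcard strings built above behave differently: in Lucene wildcard syntax `*` stands for any run of characters and `?` for exactly one character, and the pattern is compared against the indexed terms rather than the original text. The sketch below emulates that on plain strings. `wildcardMatches` is a hypothetical helper written for this illustration, not a Lucene API, and which terms actually end up in the index depends on the analyzer.

```java
import java.util.regex.Pattern;

final class WildcardSketch {
    // Same logic as createSearchString above: put '?' between the characters.
    static String createSearchString(String name) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < name.length(); i++) {
            if (sb.length() > 0) {
                sb.append('?');
            }
            sb.append(name.charAt(i));
        }
        return sb.toString();
    }

    // Hypothetical helper: emulates wildcard matching against one indexed term,
    // where '*' is any run of characters and '?' is exactly one character.
    static boolean wildcardMatches(String wildcard, String term) {
        StringBuilder regex = new StringBuilder();
        for (char c : wildcard.toCharArray()) {
            if (c == '*') {
                regex.append(".*");
            } else if (c == '?') {
                regex.append('.');
            } else {
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.matches(regex.toString(), term);
    }

    public static void main(String[] args) {
        String spaced = "*" + createSearchString("ab") + "*"; // "*a?b*"
        System.out.println(spaced + " vs a.b.c -> " + wildcardMatches(spaced, "a.b.c"));
        System.out.println(spaced + " vs abc   -> " + wildcardMatches(spaced, "abc"));
        System.out.println("*ab* vs abc   -> " + wildcardMatches("*ab*", "abc"));
    }
}
```

Because `?` requires a character to be present, "*a?b*" cannot match a term such as "abc", which is why the query above combines it with the plain "*ab*" clause using SHOULD.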

My code is based on these resources:

[Comments]:

    Tags: hibernate search lucene wildcard hibernate-search


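    [Answer 1]:

    I think I found the solution... The Lucene index is built from lowercased strings, and the "special" characters are removed from the index.

    So I changed the model to: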

    @Indexed
    @AnalyzerDef(name = "abAnalyzer", charFilters =
    {
        @CharFilterDef(factory = PatternReplaceCharFilterFactory.class, params =
        {
            @Parameter(name = "pattern", value = Foo.PATTERN),
            @Parameter(name = "replacement", value = Foo.REPLACEMENT_PATTERN)
        })
     }, tokenizer = @TokenizerDef(factory = StandardTokenizerFactory.class), filters =
    {
        @TokenFilterDef(factory = StandardFilterFactory.class),
        @TokenFilterDef(factory = LowerCaseFilterFactory.class),
    })
    public class Foo
    {
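        // Note: "(A-Z)" is a literal group, not the character class "[A-Z]". It normally matches
        // zero times, so in practice the pattern just strips the separator characters.
        public static final String PATTERN = "(A-Z)*[\\/\\.\\-'+&, ](A-Z)*";
        public static final String REPLACEMENT_PATTERN = "$1$2";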
    
        @Field(index = Index.YES, analyze = Analyze.YES, store = Store.NO)
        @Analyzer(definition = "abAnalyzer")
        private String name;
    
        ....
    }
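The idea behind the char filter can be tried out in plain Java: strip the separator characters and lowercase the text, then apply the same normalization to the search term. This is a minimal sketch under assumptions. `SEPARATORS` is a simplified stand-in for `PATTERN`, the class and method names are hypothetical, and `contains` only approximates what a `*term*` wildcard does against a single indexed term.

```java
import java.util.Locale;

final class NormalizeSketch {
    // Assumption: simplified stand-in for PATTERN that strips the same separators.
    static final String SEPARATORS = "[/.\\-'+&, ]";

    // Mimics the index-time steps: remove separators, then lowercase.
    static String normalize(String text) {
        return text.replaceAll(SEPARATORS, "").toLowerCase(Locale.ROOT);
    }

    // Rough equivalent of matching "*term*" against the normalized name.
    static boolean containsTerm(String indexedName, String searchTerm) {
        return normalize(indexedName).contains(normalize(searchTerm));
    }

    public static void main(String[] args) {
        String[] names = {"ABC x", "A.b.c.", "A-B-C", "A B C d", "Table", "Axyz B"};
        for (String name : names) {
            System.out.println(name + " -> " + normalize(name) + " -> " + containsTerm(name, "AB"));
        }
    }
}
```

Note that spaces are in the separator set too, so a name like "A B C d" collapses into a single run of letters before it is tokenized.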
    

    For the query, I implemented this:

    public List<Foo> getResults(final String name)
    {
        List<Foo> result = new ArrayList<>();
    
        // strip the "special" characters from the search term, mirroring the index-time char filter
        String searchName = name.replaceAll(Foo.PATTERN, Foo.REPLACEMENT_PATTERN);
    
        Session session = this.sessionFactory.openSession();
        try
        {
            FullTextSession fullTextSession = Search.getFullTextSession(session);
            Transaction tx = fullTextSession.beginTransaction();
            Analyzer analyzer = fullTextSession.getSearchFactory().getAnalyzer("abAnalyzer");
    
            QueryParser qp = new QueryParser(Version.LUCENE_36, "name", analyzer);
            String cleanedText = qp.parse(searchName).toString("name");
    
            BooleanQuery bQuery = new BooleanQuery();
            bQuery.add(new WildcardQuery(new Term("name", "*" + cleanedText + "*")), BooleanClause.Occur.SHOULD);
    
            org.hibernate.Query hibQuery = fullTextSession.createFullTextQuery(bQuery, Foo.class);
    
            result = hibQuery.list();
            tx.commit();
        }
        catch (Exception e)
        {
            // TODO Auto-generated catch block
            e.printStackTrace();
        }
        finally
        {
            session.close();
        }
    
        return result;
    }
    
