【Posted】: 2014-07-28 14:07:53
【Question】:
I am searching a Solr index using the text_general field type, configured as follows.
<fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.SnowballPorterFilterFactory"/>
    <filter class="org.apache.solr.analysis.WordDelimiterFilterFactory"
            generateWordParts="1"
            generateNumberParts="1"
            catenateWords="1"
            catenateNumbers="1"
            catenateAll="1"
            splitOnCaseChange="1"
            splitOnNumerics="1"
            preserveOriginal="1"
            stemEnglishPossessive="1"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
    <!-- in this example, we will only use synonyms at query time
    <filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
    -->
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.SnowballPorterFilterFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
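I have indexed a lot of content and am searching with three keywords: PLEASE, Please, and please.
A query for the keyword PLEASE returns a very small result set.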
q=%22PLEASE%22&q.op=OR&df=text&qt=%2Fselect&sort=content_name+desc&fq=content_source%3ASharepoint&AuthenticatedUserName=lalit
But Please and please each return a large result set.
q=%22please%22&q.op=OR&df=text&qt=%2Fselect&sort=content_name+desc&fq=content_source%3ASharepoint&AuthenticatedUserName=lalit
q=%22Please%22&q.op=OR&df=text&qt=%2Fselect&sort=content_name+desc&fq=content_source%3ASharepoint&AuthenticatedUserName=lalit
Even though I am using WordDelimiterFilterFactory, shouldn't it treat PLEASE, Please, and please as the same keyword?
Any ideas?
【Discussion】:
Tags: solr
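One thing that may be worth checking, offered as an assumption rather than a confirmed diagnosis: Solr applies filters in the order they are listed, and in the configuration above SnowballPorterFilterFactory runs before LowerCaseFilterFactory in both analyzers. The Snowball stemmers expect lowercase input, so an all-uppercase token such as PLEASE may not be stemmed the same way as please. The sketch below reuses only the filters from the question's query analyzer and moves lowercasing ahead of the stemmer. It has not been tested against this index.

```xml
<!-- Sketch only: the query analyzer from the question, with LowerCaseFilterFactory
     moved ahead of the stemmer so the stemmer always sees lowercase tokens. -->
<analyzer type="query">
  <tokenizer class="solr.StandardTokenizerFactory"/>
  <filter class="solr.LowerCaseFilterFactory"/>
  <filter class="solr.SnowballPorterFilterFactory"/>
  <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
  <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
</analyzer>
```

If this turns out to be the cause, the index analyzer would presumably need a matching change, followed by a full reindex. There, lowercasing would have to stay after WordDelimiterFilterFactory, since splitOnCaseChange depends on the original case, but still come before the stemmer. The Analysis screen in the Solr admin UI shows the tokens produced at each stage for all three spellings.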