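【Posted】: 2020-12-20 20:55:20
【Question】:
I'm trying to push crawled documents to RabbitMQ. I have followed all the available documentation.
However, I can't get indexer-rabbit to run. The logs don't mention indexer-rabbit at all. I just want to get it working before configuring anything further. I tried connecting to RabbitMQ with a small custom program, and that worked fine.
I have also included the indexer in nutch-site.xml: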
<property>
  <name>plugin.includes</name>
  <value>protocol-http|urlfilter-(regex|validator)|parse-(html|tika)|index-(basic|anchor)|indexer-rabbit|scoring-opic|urlnormalizer-(pass|regex|basic)</value>
</property>
<property>
  <name>rabbitmq.publisher.server.uri</name>
  <value>amqp://guest:guest@172.17.0.2:5672/</value>
</property>
<property>
  <name>publisher.queue.type</name>
  <value>RabbitMQ</value>
</property>
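One way to double-check the plugin.includes value is to test it against the plugin id directly. The following is a minimal Python sketch, not part of Nutch: it assumes Nutch treats plugin.includes as a regular expression that must match a whole plugin id, and it wraps the property in a `<configuration>` root as a regular nutch-site.xml would. The helper names `plugin_includes` and `is_included` are made up for illustration.

```python
import re
import xml.etree.ElementTree as ET

# Sample fragment copied from the question. The <configuration> root is an
# assumption about the enclosing nutch-site.xml, which is not shown in full.
NUTCH_SITE = """<configuration>
<property>
<name>plugin.includes</name>
<value>protocol-http|urlfilter-(regex|validator)|parse-(html|tika)|index-(basic|anchor)|indexer-rabbit|scoring-opic|urlnormalizer-(pass|regex|basic)</value>
</property>
</configuration>"""


def plugin_includes(xml_text):
    """Return the plugin.includes value from a nutch-site.xml string, or None."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if prop.findtext("name", "").strip() == "plugin.includes":
            return prop.findtext("value", "").strip()
    return None


def is_included(pattern, plugin_id):
    """True if the whole plugin id matches the plugin.includes regex."""
    return re.fullmatch(pattern, plugin_id) is not None


if __name__ == "__main__":
    pattern = plugin_includes(NUTCH_SITE)
    # Print each plugin id with whether the pattern selects it.
    for plugin_id in ("indexer-rabbit", "indexer-solr", "index-basic"):
        print(plugin_id, is_included(pattern, plugin_id))
```

To check a real installation, replace the `NUTCH_SITE` string with the contents of the actual nutch-site.xml.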
The mapping is also the default one, which seems perfectly fine for testing.
<writer id="indexer_solr_1" class="org.apache.nutch.indexwriter.solr.SolrIndexWriter">
  <parameters>
    <param name="type" value="http"/>
    <param name="url" value="http://localhost:8983/solr/nutch"/>
    <param name="collection" value=""/>
    <param name="weight.field" value=""/>
    <param name="commitSize" value="1000"/>
    <param name="auth" value="false"/>
    <param name="username" value="username"/>
    <param name="password" value="password"/>
  </parameters>
  <mapping>
    <copy>
      <!-- <field source="content" dest="search"/> -->
      <!-- <field source="title" dest="title,search"/> -->
    </copy>
    <rename>
      <field source="metatag.description" dest="description"/>
      <field source="metatag.keywords" dest="keywords"/>
    </rename>
    <remove>
      <field source="segment"/>
    </remove>
  </mapping>
</writer>
<writer id="indexer_rabbit_1" class="org.apache.nutch.indexwriter.rabbit.RabbitIndexWriter">
  <parameters>
    <param name="server.uri" value="amqp://guest:guest@172.17.0.2:5672/"/>
    <param name="binding" value="false"/>
    <param name="binding.arguments" value=""/>
    <param name="exchange.name" value=""/>
    <param name="exchange.options" value="type=direct,durable=true"/>
    <param name="queue.name" value="nutch.queue"/>
    <param name="queue.options" value="durable=true,exclusive=false,auto-delete=false"/>
    <param name="routingkey" value=""/>
    <param name="commit.mode" value="multiple"/>
    <param name="commit.size" value="250"/>
    <param name="headers.static" value=""/>
    <param name="headers.dynamic" value=""/>
  </parameters>
  <mapping>
    <copy>
      <field source="title" dest="title,search"/>
    </copy>
    <rename>
      <field source="metatag.description" dest="description"/>
      <field source="metatag.keywords" dest="keywords"/>
    </rename>
    <remove>
      <field source="content"/>
      <field source="segment"/>
      <field source="boost"/>
    </remove>
  </mapping>
</writer>
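To see at a glance which writers such a file declares, the entries can be listed with a few lines of Python. This is only a sketch: the `<writers>` root element is an assumption about the enclosing file (the question shows only the two `<writer>` entries), the sample is abbreviated to a few parameters, and `summarize_writers` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the two writers above, wrapped in an assumed <writers> root.
INDEX_WRITERS = """<writers>
<writer id="indexer_solr_1" class="org.apache.nutch.indexwriter.solr.SolrIndexWriter">
<parameters>
<param name="url" value="http://localhost:8983/solr/nutch"/>
</parameters>
</writer>
<writer id="indexer_rabbit_1" class="org.apache.nutch.indexwriter.rabbit.RabbitIndexWriter">
<parameters>
<param name="server.uri" value="amqp://guest:guest@172.17.0.2:5672/"/>
<param name="queue.name" value="nutch.queue"/>
</parameters>
</writer>
</writers>"""


def summarize_writers(xml_text):
    """Map each writer id to a (class name, {param name: value}) pair."""
    root = ET.fromstring(xml_text)
    summary = {}
    for writer in root.iter("writer"):
        params = {p.get("name"): p.get("value") for p in writer.iter("param")}
        summary[writer.get("id")] = (writer.get("class"), params)
    return summary


if __name__ == "__main__":
    for writer_id, (cls, params) in summarize_writers(INDEX_WRITERS).items():
        print(writer_id, cls, params)
```

Comparing the listed writer ids with the plugins enabled in plugin.includes shows whether each declared writer has a corresponding indexer plugin turned on.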
Does anyone know what I'm missing here?
【Discussion】:
Tags: nutch