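【Posted】: 2018-08-18 02:41:31
【Question】:
I'm very new to Solr. I'm currently running it in cloud mode with Docker Compose (my configuration is at the end of the question).
I created a collection called audittrail with the default configuration. The idea is to send event log information from another application to Solr. By default the collection has a schema full of dynamic fields, which looks convenient. (I know I shouldn't rely on the defaults in production; for now I'm only after a proof of concept.)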
Now I'm following this documentation to try to index some of my data: https://lucene.apache.org/solr/guide/7_2/transforming-and-indexing-custom-json.html#mapping-parameters
# NOTE: echo=true means we're running in debug mode; Solr returns the documents it should be creating
curl 'http://0.0.0.0:8983/api/collections/audittrail/update/json'\
'?split=/events&'\
'f=action_kind_s:/action_kind_s&'\
'f=time_dt:/events/time_dt'\
'&echo=true' \
-H 'Content-type:application/json' -d '{
  "action_kind_s": "task_exec",
  "events": [
    {
      "event_kind_s": "start",
      "in_transaction_b": false,
      "time_dt": "2018-03-09T12:57:07Z"
    },
    {
      "event_kind_s": "start_txn",
      "in_transaction_b": true,
      "time_dt": "2018-03-09T12:57:07Z"
    },
    {
      "event_kind_s": "diff",
      "in_transaction_b": true,
      "key_s": "('MerchantWorkerProcess', 5819715045818368L)",
      "property_s": "claim_time",
      "time_dt": "2018-03-09T12:57:07Z",
      "value_dt": "2018-03-09T12:57:07Z"
    },
  ],
  "final_status_s": "COMPLETE",
  "request_s": "1dfda9955dac6f3cfd76fbedee98b15f6edc0db",
  "task_name_s": "0p5k20100CcnMVxaxoWl32WlfPixjV1OFKgv0k1KZ0m_acc_work"
}'
# response:
{
"responseHeader":{
"status":0,
"QTime":1},
"docs":[{},
{},
{}]}
That's three empty documents...
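For comparison, the mapping rules in the linked guide suggest that each element under /events should become one document carrying the fields named by the f parameters. The sketch below imitates that behaviour locally in Python. It is not Solr code: map_events is a hypothetical helper, the payload is a trimmed copy of the one in the request, and only the two path shapes used in the question (a top-level field and a field directly under the split path) are handled.

```python
import json


def map_events(payload, split_key, field_map):
    """Imitate split=/<split_key> combined with f=<target>:<path> mappings.

    field_map maps a target field name to a path such as
    "/action_kind_s" (parent level) or "/events/time_dt" (child level).
    """
    docs = []
    for child in payload.get(split_key, []):
        doc = {}
        for target, path in field_map.items():
            parts = path.strip("/").split("/")
            if len(parts) == 2 and parts[0] == split_key:
                source, key = child, parts[1]    # field of the split element
            elif len(parts) == 1:
                source, key = payload, parts[0]  # field of the parent object
            else:
                continue                         # path shape not covered here
            if key in source:
                doc[target] = source[key]
        docs.append(doc)
    return docs


# Trimmed copy of the payload from the question.
payload = {
    "action_kind_s": "task_exec",
    "events": [
        {"event_kind_s": "start", "time_dt": "2018-03-09T12:57:07Z"},
        {"event_kind_s": "start_txn", "time_dt": "2018-03-09T12:57:07Z"},
        {"event_kind_s": "diff", "time_dt": "2018-03-09T12:57:07Z"},
    ],
}

expected_docs = map_events(
    payload,
    "events",
    {"action_kind_s": "/action_kind_s", "time_dt": "/events/time_dt"},
)
for doc in expected_docs:
    print(json.dumps(doc, sort_keys=True))
```

Under these assumptions every event yields a document with both action_kind_s and time_dt, which is what the echoed response would be expected to contain instead of three empty objects.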
So I thought it might be because I hadn't specified an id. I gave each event a unique id and tried again with &f=id:/events/id added. Same result.
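The question does not show how the ids were generated. As an illustration only, the sketch below adds a random UUID to each event before the payload is sent; the use of uuid4 and the helper name with_event_ids are assumptions, not something taken from the original request.

```python
import copy
import uuid


def with_event_ids(payload):
    """Return a copy of payload in which every event has a unique "id"."""
    result = copy.deepcopy(payload)
    for event in result.get("events", []):
        # setdefault keeps an id that is already present.
        event.setdefault("id", str(uuid.uuid4()))
    return result


# Trimmed copy of the payload from the question.
payload = {
    "action_kind_s": "task_exec",
    "events": [
        {"event_kind_s": "start", "time_dt": "2018-03-09T12:57:07Z"},
        {"event_kind_s": "start_txn", "time_dt": "2018-03-09T12:57:07Z"},
    ],
}

tagged = with_event_ids(payload)
# The extra mapping parameter from the question that refers to the new field.
id_mapping = "f=id:/events/id"
print([event["id"] for event in tagged["events"]], id_mapping)
```

The original payload is left untouched, so the same data can be posted with and without ids to compare the two responses.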
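Originally I tried a wildcard (&f=/**), with the same effect.
Clearly there's something missing in my understanding.
So my question is: what should I do to get my documents populated correctly?
Edit
Also, my Solr node logs don't show any errors. Here's a sample: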
2018-03-09 14:30:50.770 INFO (qtp257895351-21) [c:audittrail s:shard2 r:core_node4 x:audittrail_shard2_replica_n2] o.a.s.u.p.LogUpdateProcessorFactory [audittrail_shard2_replica_n2] webapp=null path=/update/json params={split=/events}{add=[78953602-6b02-4948-8443-fd1ebc340921 (1594470800573857792)]} 0 3
2018-03-09 14:31:05.770 INFO (commitScheduler-14-thread-1) [c:audittrail s:shard2 r:core_node4 x:audittrail_shard2_replica_n2] o.a.s.u.DirectUpdateHandler2 start commit{_version_=1594470816305643520,optimize=false,openSearcher=false,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false}
2018-03-09 14:31:05.770 INFO (commitScheduler-14-thread-1) [c:audittrail s:shard2 r:core_node4 x:audittrail_shard2_replica_n2] o.a.s.u.SolrIndexWriter Calling setCommitData with IW:org.apache.solr.update.SolrIndexWriter@13d117d6 commitCommandVersion:1594470816305643520
2018-03-09 14:31:05.918 INFO (commitScheduler-14-thread-1) [c:audittrail s:shard2 r:core_node4 x:audittrail_shard2_replica_n2] o.a.s.s.SolrIndexSearcher Opening [Searcher@4edc35b0[audittrail_shard2_replica_n2] realtime]
2018-03-09 14:31:05.921 INFO (commitScheduler-14-thread-1) [c:audittrail s:shard2 r:core_node4 x:audittrail_shard2_replica_n2] o.a.s.u.DirectUpdateHandler2 end_commit_flush
docker-compose.yml
version: '3'
services:
  zookeeper:
    image: zookeeper:3.4.11
    ports:
      - "2181:2181"
    hostname: "zookeeper"
    container_name: "zookeeper"
  solr1:
    image: solr:7.2.1
    ports:
      - "8983:8983"
    container_name: solr1
    links:
      - zookeeper:ZK
    command: /opt/solr/bin/solr start -f -z zookeeper:2181
  solr2:
    image: solr:7.2.1
    ports:
      - "8984:8983"
    container_name: solr2
    links:
      - zookeeper:ZK
    command: /opt/solr/bin/solr start -f -z zookeeper:2181
Here are the exact steps I performed to index some data.
This doesn't actually index anything, and I'd like to know why.
- docker-compose up
- Create the collection:
curl -X POST 'http://0.0.0.0:8983/solr/admin/collections?action=CREATE&name=audittrail&numShards=2'
{ "responseHeader":{ "status":0, "QTime":6178}, "success":{ "172.24.0.3:8983_solr":{ "responseHeader":{ "status":0, "QTime":3993}, "core":"audittrail_shard1_replica_n1"}, "172.24.0.4:8983_solr":{ "responseHeader":{ "status":0, "QTime":4399}, "core":"audittrail_shard2_replica_n2"}}, "warning":"Using _default configset. Data driven schema functionality is enabled by default, which is NOT RECOMMENDED for production use. To turn it off: curl http://{host:port}/solr/audittrail/config -d '{\"set-user-property\": {\"update.autoCreateFields\":\"false\"}}'"}
- Use curl to create some data (the same curl as in the main question, but not in debug mode):
curl 'http://0.0.0.0:8983/api/collections/audittrail/update/json?split=/events&f=action_kind_s:/action_kind_s&f=time_dt:/events/time_dt' -H 'Content-type:application/json' -d '{ "action_kind_s": "task_exec", "events": [{"event_kind_s": "start","in_transaction_b": false, "time_dt": "2018-03-09T12:57:07Z"},{"event_kind_s": "start_txn", "in_transaction_b": true,"time_dt": "2018-03-09T12:57:07Z"},{"event_kind_s": "diff", "in_transaction_b": true,"key_s": "('MerchantWorkerProcess', 5819715045818368L)","property_s": "claim_time","time_dt": "2018-03-09T12:57:07Z","value_dt": "2018-03-09T12:57:07Z"},], "final_status_s": "COMPLETE", "request_s": "xxx", "task_name_s": "xxx"}'
{ "responseHeader":{ "status":0, "QTime":126}}
- Run a query:
curl 'http://0.0.0.0:8983/solr/audittrail/select?q=*:*'
{ "responseHeader":{ "zkConnected":true, "status":0, "QTime":12, "params":{ "q":"*:*"}}, "response":{"numFound":0,"start":0,"maxScore":0.0,"docs":[] }}
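One thing the steps above do not rule out is commit visibility: the auto-commit in the log excerpt runs with openSearcher=false, so documents that were written may not yet be visible to a query. A minimal sketch, assuming the standard commit=true update parameter is acceptable for a proof of concept, builds the same update request with that parameter appended. It only constructs the request; sending it (for example with urllib.request.urlopen) is left out, and build_update_request is a hypothetical helper.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Update endpoint taken from the question.
BASE = "http://0.0.0.0:8983/api/collections/audittrail/update/json"


def build_update_request(body, commit=True):
    """Build (but do not send) the update request from the question."""
    params = [
        ("split", "/events"),
        ("f", "action_kind_s:/action_kind_s"),
        ("f", "time_dt:/events/time_dt"),
    ]
    if commit:
        # Assumption: an explicit commit makes the documents searchable
        # straight away, which helps separate mapping problems from
        # visibility problems.
        params.append(("commit", "true"))
    # safe="/:" keeps the JSON paths readable instead of percent-encoding them.
    url = BASE + "?" + urlencode(params, safe="/:")
    return Request(
        url,
        data=body,
        headers={"Content-type": "application/json"},
        method="POST",
    )


request = build_update_request(b'{"action_kind_s": "task_exec", "events": []}')
print(request.full_url)
```

If the query still returns numFound 0 after a request like this, visibility is unlikely to be the cause and the mapping parameters are the next thing to examine.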
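【Discussion】:
- What query is that response a response to? Are you using a schema or schemaless mode?
- @MatsLindh The curl directly above.
- But does the JSON handler return what it indexed (I'm not sure)? What does querying the index show (i.e. searching for *:*)?
- @MatsLindh echo=true, so we're in debug mode.
- When I don't include echo=true, the response looks more like: { "responseHeader":{ "status":0, "QTime":17}}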