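【Posted】: 2018-09-07 09:47:17
【Question】:
Problem: bulk indexing throws an error: object mapping for [classIds] tried to parse field [null] as object, but found a concrete value
Here is the JSON I am trying to post: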
[{'_type': 'nodelookup', '_id': '248', '_source': {'modifiedOn': datetime.datetime(2013, 8, 28, 2, 44, 5), 'name': u'Big Words for Little People', 'sourceId': '', 'createdOn': datetime.datetime(2011, 8, 26, 16, 0, 49), 'classIds': [463, 10597], 'source': '', 'wikiInfo': {'wikiText': None, 'wikiLink': None}, 'notableInfo': {'source': u'NOTABLE_FOR', 'value': u'Book'}, 'relevance': 113L, 'urlFriendlyName': u'big-words-for-little-people', 'properties': [{'classId': 463, 'properties': [{'name': u'First Published', 'value': u'2008-09-08', 'id': 1411L}, {'name': u'Author', 'value': u'Jamie Lee Curtis', 'id': 1415L}]}, {'classId': 10597}], 'ontologyId': '248'}, '_index': 'nodes_a0f37542-3d66-4c2c-ad8c-5e59d9cdfa97'}]
Ignore the Python-specific extras in the JSON, for example datetime.datetime(), None, and so on.
Error response when trying to post the document:
Traceback (most recent call last):
  File "node_bulk_import_es.py", line 73, in <module>
    helpers.bulk(es, data)
  File "/usr/local/lib/python2.7/dist-packages/elasticsearch/helpers/__init__.py", line 188, in bulk
    for ok, item in streaming_bulk(client, actions, **kwargs):
  File "/usr/local/lib/python2.7/dist-packages/elasticsearch/helpers/__init__.py", line 160, in streaming_bulk
    for result in _process_bulk_chunk(client, bulk_actions, raise_on_exception, raise_on_error, **kwargs):
  File "/usr/local/lib/python2.7/dist-packages/elasticsearch/helpers/__init__.py", line 132, in _process_bulk_chunk
    raise BulkIndexError('%i document(s) failed to index.' % len(errors), errors)
elasticsearch.helpers.BulkIndexError: (u'1 document(s) failed to index.', [{u'index': {u'status': 400, u'_type': u'nodelookup', u'_id': u'248', u'error': {u'reason': u'object mapping for [classIds] tried to parse field [null] as object, but found a concrete value', u'type': u'mapper_parsing_exception'}, u'_index': u'nodes_a0f37542-3d66-4c2c-ad8c-5e59d9cdfa97'}}])
I have predefined a mapping for this index. Here is my mapping:
{
  "nodes_e37a1e17-962d-40fb-bae2-ff20759ab1c6": {
    "mappings": {
      "nodelookup": {
        "properties": {
          "classIds": {
            "type": "nested"
          },
          "createdOn": {
            "type": "date",
            "index": "analyzed",
            "format": "strict_date_optional_time||epoch_millis"
          },
          "modifiedOn": {
            "type": "date",
            "index": "analyzed",
            "format": "strict_date_optional_time||epoch_millis"
          },
          "name": {
            "type": "string",
            "index": "not_analyzed",
            "fields": {
              "nameSimple": {
                "type": "string",
                "analyzer": "simple"
              },
              "nameStandard": {
                "type": "string",
                "analyzer": "standard"
              }
            }
          },
          "notableInfo": {
            "properties": {
              "source": {
                "type": "string"
              },
              "value": {
                "type": "string"
              }
            }
          },
          "ontologyId": {
            "type": "integer"
          },
          "properties": {
            "type": "nested"
          },
          "relevance": {
            "type": "integer",
            "index": "analyzed"
          },
          "source": {
            "type": "string"
          },
          "sourceId": {
            "type": "string"
          },
          "urlFriendlyName": {
            "type": "string"
          },
          "wikiInfo": {
            "properties": {
              "wikiLink": {
                "type": "string",
                "index": "not_analyzed"
              },
              "wikiText": {
                "type": "string",
                "index": "not_analyzed"
              }
            }
          }
        }
      }
    }
  }
}
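I don't know what is going wrong here. The code ran fine when I tried it last week, but now it fails. Please help me solve this.
Thanks in advance!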
【Comments】:
-
Are you sure classIds should be nested? In your bulk payload I see plain numeric values.
-
But arrays are treated as the nested type, right?
-
Yes, but only an array of objects, not an array of plain values. Your bulk payload should look like this:
'classIds': [{'someName': 463}, {'someName': 10597}]
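To make the suggestion above concrete, here is a minimal sketch of reshaping the bulk actions before they are passed to helpers.bulk. The helper name wrap_class_ids and the sub-field name classId are assumptions made up for illustration (the comment only uses the placeholder someName); use whatever field name the nested mapping is meant to hold.

```python
import copy


def wrap_class_ids(actions, key="classId"):
    """Return a copy of the bulk actions with bare classIds values wrapped in objects.

    `key` is a hypothetical sub-field name, not something defined in the
    original mapping; replace it with the name your nested mapping expects.
    """
    fixed = copy.deepcopy(actions)  # leave the caller's data untouched
    for action in fixed:
        source = action.get("_source", {})
        class_ids = source.get("classIds")
        if class_ids is None:
            continue
        # A nested field accepts objects only, so wrap each concrete value.
        source["classIds"] = [
            item if isinstance(item, dict) else {key: item}
            for item in class_ids
        ]
    return fixed


# Trimmed-down version of the action from the question.
data = [{
    "_index": "nodes_a0f37542-3d66-4c2c-ad8c-5e59d9cdfa97",
    "_type": "nodelookup",
    "_id": "248",
    "_source": {"name": "Big Words for Little People", "classIds": [463, 10597]},
}]

fixed = wrap_class_ids(data)
print(fixed[0]["_source"]["classIds"])
# [{'classId': 463}, {'classId': 10597}]
```

The result can then be sent with helpers.bulk(es, fixed). If classIds is only ever a list of numbers, the other option is probably simpler: map the field as "type": "integer" instead of "nested", since any Elasticsearch field can hold an array of values of its type. That requires recreating the index with the changed mapping.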
Tags: python json elasticsearch