【Posted】: 2016-03-11 17:11:15
【Question】:
If I have a JSON list/dictionary (in a Scrapy pipeline), how can I nest everything under a new key at the top and get rid of the square brackets?
[
    {
        "date":"2015-11-25",
        "threat_level_id":"1",
        "info":"TEST",
        "analysis":"0",
        "distribution":"0",
        "orgc":"Malware, Inc",
        "Attribute":[
            {
                "type":"md5",
                "category":"Payload delivery",
                "to_ids":true,
                "distribution":"3",
                "value":"35b759347aee663e36f5b91877749349"
            }
        ]
    }
]
I want to add a key at the top and drop the square brackets so that it looks like this:
{
    "Event":{
        "date":"2015-11-25",
        "threat_level_id":"1",
        "info":"TEST",
        "analysis":"0",
        "distribution":"0",
        "orgc":"Oxygen",
        "Attribute":[
            {
                "type":"md5",
                "category":"Payload delivery",
                "to_ids":true,
                "distribution":"3",
                "value":"35b759347aee663e36f5b91877749349"
            }
        ]
    }
}
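Outside of Scrapy, the transformation described above can be sketched in plain Python. This is only a minimal illustration, assuming the input is a JSON string holding a list with exactly one object; the helper name `wrap_event` and the shortened sample data are hypothetical.

```python
import json


def wrap_event(records, key="Event"):
    """Take a one-element list of dicts and nest that dict under `key`.

    `wrap_event` is a hypothetical helper; it assumes `records` is a list
    containing exactly one dictionary.
    """
    if len(records) != 1:
        raise ValueError("expected a list with exactly one record")
    return {key: records[0]}


# Shortened version of the sample data from the question.
raw = '[{"date": "2015-11-25", "info": "TEST", "Attribute": [{"type": "md5"}]}]'

records = json.loads(raw)      # a Python list holding one dict
wrapped = wrap_event(records)  # the outer list is gone, the dict sits under "Event"
print(json.dumps(wrapped, indent=4))
```

Indexing the single element removes the outer brackets, and the new dictionary supplies the `Event` key.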
Thanks natdempk!
I am getting an exception, TypeError: expected string or buffer, with this code:
class JsonPipeline(object):
    def process_item(self, item, spider):
        data = json.loads(item)
        new_data = {}
        new_data['Event'] = data
        item = json.dumps(data)
        return item
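A likely cause of the TypeError, offered as an assumption rather than a confirmed diagnosis: `json.loads` only accepts a string (or bytes), while the `item` handed to `process_item` is already a dict-like object. The sketch below reproduces this with a plain dictionary standing in for the Scrapy item and then wraps it without parsing. Note also that the pipeline above serializes `data` rather than `new_data`, so the `Event` key would be lost even if parsing succeeded.

```python
import json

# Stand-in for the Scrapy item (an assumption: a plain dict is used here).
item = {"date": "2015-11-25", "info": "TEST"}

# json.loads expects text, so passing a dict raises TypeError.
error = None
try:
    json.loads(item)
except TypeError as exc:
    error = exc

# The item is already a mapping, so it can be wrapped directly.
new_data = {"Event": dict(item)}

# Serialize the wrapped structure (new_data, not the original data).
serialized = json.dumps(new_data)
```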
I am running the Scrapy crawler like this: scrapy crawl spider -o items.json
This works, but I get an error:
  File "/usr/lib/pymodules/python2.7/scrapy/contrib/exporter/__init__.py", line 71, in _get_serialized_fields
    field = item.fields[field_name]
exceptions.AttributeError: 'dict' object has no attribute 'fields'
class JsonWithEncodingPipeline(object):
    def process_item(self, item, spider):
        data = {}
        data['Event'] = item
        return data
If I add the following to settings.py, it works, but I don't get any file output? :(
EXTENSIONS = {'scrapy.contrib.feedexport.FeedExporter': None}
Is there a way to do this without disabling the FeedExporter?
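One possible direction, sketched under assumptions and not tested against the Scrapy version in the question: let the pipeline write the wrapped objects to its own file and return the original item unchanged, so the feed exporter still receives an object it knows how to handle. The class name `EventWriterPipeline` and the output path `events.jl` are hypothetical; `open_spider`, `close_spider`, and `process_item` are the standard pipeline hooks. The sketch also assumes every field value is JSON serializable.

```python
import json


class EventWriterPipeline(object):
    """Hypothetical pipeline that writes {"Event": ...} objects to its own file."""

    def __init__(self, path="events.jl"):
        # Hypothetical output path; one JSON object is written per line.
        self.path = path
        self.file = None

    def open_spider(self, spider):
        # Called when the spider starts: open the output file.
        self.file = open(self.path, "w")

    def close_spider(self, spider):
        # Called when the spider finishes: flush and close the file.
        self.file.close()

    def process_item(self, item, spider):
        # Nest the item's fields under "Event" and write them out.
        wrapped = {"Event": dict(item)}
        self.file.write(json.dumps(wrapped) + "\n")
        # Return the item untouched so later pipelines and the feed
        # exporter see the same object type they were given.
        return item
```

With this approach the `-o items.json` output would still contain the unwrapped items, while the wrapped objects land in the separate file, one per line.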
【Comments】:
- Read it back in, make the modifications you need, then save the new data...
Tags: python arrays json dictionary scrapy