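【Posted】: 2021-01-14 18:49:26
【Question】:
I have data pulled from an API that returns JSON in the format below. Notice that there is a nested element named "user". When I export this nested element to another source system, it creates duplicate values. My goal is to pull the fields (id, first name, and so on) out of the "user" element while keeping the data that was inside it.
This is the raw JSON produced by the API: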
[{
    "enrollment_id": 12,
    "content_type": "sample",
    "user": {
        "id": 1,
        "first_name": "Sarah",
        "last_name": "Kis",
        "email": "s_kis@aol.com"
    },
    "campaign_name": "camp1",
    "policy_acknowledged": false
},
{
    "enrollment_id": 13,
    "content_type": "samplee",
    "user": {
        "id": 2,
        "first_name": "Sarahe",
        "last_name": "Kiss",
        "email": "s_kiss@aol.com"
    },
    "campaign_name": "camp2",
    "policy_acknowledged": false
}]
This is the output I want, or something similar:
[{
    "enrollment_id": 12,
    "content_type": "sample",
    "id": 1,
    "first_name": "Sarah",
    "last_name": "Kis",
    "email": "s_kis@aol.com",
    "campaign_name": "camp1",
    "policy_acknowledged": false
},
{
    "enrollment_id": 13,
    "content_type": "samplee",
    "id": 2,
    "first_name": "Sarahe",
    "last_name": "Kiss",
    "email": "s_kiss@aol.com",
    "campaign_name": "camp2",
    "policy_acknowledged": false
}]
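**Notice how the data inside the "user" element has now been lifted to the top level of each record. I know this is probably a quick and simple fix, but I have spent hours trying to solve it with no luck.**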
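For illustration, the transformation from the raw format to the desired format could be sketched in memory as follows. This is only a sketch under stated assumptions: `flatten_user` and its `nested_key` parameter are hypothetical names that do not appear in the question, and the sample records are copied from the JSON above. It also assumes the nested field names do not clash with top-level keys; if they do, whichever key is seen later wins.

```python
def flatten_user(record, nested_key="user"):
    """Return a copy of record with the nested dict's fields lifted to the top level.

    flatten_user and nested_key are hypothetical names used for illustration.
    """
    flat = {}
    for key, value in record.items():
        if key == nested_key and isinstance(value, dict):
            # Splice the nested fields in where the "user" key used to be.
            flat.update(value)
        else:
            flat[key] = value
    return flat


# Sample records copied from the raw API output shown in the question.
records = [
    {
        "enrollment_id": 12,
        "content_type": "sample",
        "user": {"id": 1, "first_name": "Sarah", "last_name": "Kis", "email": "s_kis@aol.com"},
        "campaign_name": "camp1",
        "policy_acknowledged": False,
    },
    {
        "enrollment_id": 13,
        "content_type": "samplee",
        "user": {"id": 2, "first_name": "Sarahe", "last_name": "Kiss", "email": "s_kiss@aol.com"},
        "campaign_name": "camp2",
        "policy_acknowledged": False,
    },
]

flattened = [flatten_user(record) for record in records]
```

Building a new dict rather than mutating the old one keeps the lifted fields in the same position the "user" key occupied, which matches the key order of the desired output.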
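This is the code I currently have (see below). Note that it removes the "user" element from the JSON file entirely. I want to keep the data that is inside the element, though.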
import json

path1 = '/Users/t1_{0}.json'
path2 = '/Users/t2_{0}.json'
with open(path1, 'r') as the_list:
    data = json.load(the_list)
for element in data:
    element.pop('user', None)  # Removes the whole "user" element, including its data
with open(path2, 'w') as the_list:
    json.dump(data, the_list)  # json.dump returns None, so there is nothing to assign
Here is my full code for reference:
import json

import requests

# requests_retry_session, my_proxy, api_header, rec_url and my_path are not shown in this post


def load_pst_rec_data(proxy=my_proxy, api_header=api_header,
                      url=rec_url, path=my_path):
    all_psts = ['2011676', '2345729']  # List of items I am filtering in the subsequent data
    the_list = []
    s = requests.Session()  # Create API session
    s.proxies = my_proxy
    for obj in all_psts:  # Loop through the items inside the all_psts variable
        for i in range(1, 10000000):  # Due to pagination of the API, we have to loop through each page to collect data
            try:
                response = requests_retry_session(session=s). \
                    get(url + '{0}/recipients?page={1}&per_page=500'.format(obj, i), headers=api_header,
                        verify=False)  # Connect to the API
                resp = response.json()
            except Exception as e:
                print('It failed :(', e.__class__.__name__)
            else:
                print('It eventually worked', response.status_code)
                if resp:  # Consider using while resp: ______
                    the_list.extend(resp)  # Add this page's results to the list
                elif not resp:
                    last_page = str(i)  # Get the last page
                    print("Should stop and go to next object")
                    break
            finally:
                print('process done!')
    # This section attempts to load the data collected to a json file
    try:
        print('Beginning Json process')
    except Exception as e:
        print(e)
    else:
        path1 = '/Users/t1_{0}.json'
        path2 = '/Users/t2_{0}.json'
        with open(path1, 'r') as the_list:
            data = json.load(the_list)
        for element in data:
            element.pop('user', None)
        with open(path2, 'w') as the_list:
            json.dump(data, the_list)  # json.dump returns None, so there is nothing to assign
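The `# Consider using while resp:` note in the paging loop above could be sketched roughly as shown here. This is an assumption-laden illustration rather than the asker's code: `collect_pages`, `fetch_page`, and `max_pages` are hypothetical names, and a dictionary of fake pages stands in for the real API call so the example runs without network access.

```python
def collect_pages(fetch_page, max_pages=10000):
    """Call fetch_page(1), fetch_page(2), ... until a page comes back empty.

    fetch_page is a hypothetical callable standing in for the real API request.
    """
    results = []
    page = 1
    while page <= max_pages:
        batch = fetch_page(page)
        if not batch:
            # An empty page means the previous page was the last one.
            break
        results.extend(batch)
        page += 1
    return results


# Stand-in for the API: three pages of data, then empty pages.
fake_pages = {
    1: [{"enrollment_id": 12}],
    2: [{"enrollment_id": 13}],
    3: [{"enrollment_id": 14}],
}
collected = collect_pages(lambda page: fake_pages.get(page, []))
```

Keeping the stop condition in one place, with an explicit upper bound, avoids relying on a very large `range` to end the loop.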
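【Comments】:
-
Rather than trying to edit the existing data structure, why not build a new (flattened) structure containing only the data you want to pass along?
-
What would that look like in code?
-
flat_dict = {k: old_dict[k] for k in list_of_keys_you_want}; result = {**flat_dict, **old_dict['user']}; return json.dumps(result) Actually, Kirk Strauser's answer does the same thing, but better.
-
Where in my script should I put the code?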
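As for where the code goes, one possible placement is the loop that currently calls `element.pop('user', None)`: instead of discarding the popped value, merge it back into the record. The sketch below assumes that approach. `flatten_file` and `nested_key` are hypothetical names, and the demo writes to a temporary directory rather than the real paths from the question. With this variant the lifted fields end up at the end of each record, and a nested field overwrites any top-level key with the same name.

```python
import json
import os
import tempfile


def flatten_file(src_path, dst_path, nested_key="user"):
    """Read a JSON list of records, lift each nested dict to the top level, write the result.

    flatten_file and nested_key are hypothetical names used for illustration.
    """
    with open(src_path, "r") as src:
        data = json.load(src)
    for element in data:
        # pop() hands back the nested dict (or None if missing); update() merges it in.
        element.update(element.pop(nested_key, None) or {})
    with open(dst_path, "w") as dst:
        json.dump(data, dst)
    return data


# Self-contained demo using a temporary directory instead of the real file paths.
with tempfile.TemporaryDirectory() as tmp:
    src_path = os.path.join(tmp, "t1.json")
    dst_path = os.path.join(tmp, "t2.json")
    sample = [
        {"enrollment_id": 12, "user": {"id": 1, "first_name": "Sarah"}, "campaign_name": "camp1"}
    ]
    with open(src_path, "w") as f:
        json.dump(sample, f)
    flatten_file(src_path, dst_path)
    with open(dst_path) as f:
        result = json.load(f)
```

In the full script, the same one-line change inside the `for element in data:` loop would apply, with the rest of the file handling left as it is.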
Tags: python json python-3.x python-requests