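【Posted】: 2021-03-13 13:39:38
【Question】:
My JSON consists of dictionaries and lists.
I want to write the dictionaries and lists to separate dataframes, as shown below.
Here is a sample JSON; I have thousands like it: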
{
  "zone_id" : "1001",
  "timezone" : "Eastern Time",
  "address" : {
    "city" : "Niagara Falls",
    "country_code" : "US"
  },
  "financial" : {
    "currency" : {
      "code" : "USA"
    }
  },
  "amenities" : {
    "self_park" : true,
    "paved" : true,
    "mobile_pass" : null,
    "handicap" : null
  },
  "description" : "",
  "html_description" : null,
  "reserve" : false,
  "access_type" : "mobile_pay",
  "product_types" : [ "ondemand" ],
  "rates" : [ {
    "id" : 50000.1,
    "rate_type" : "valid_for",
    "zone_id" : "1001",
    "description" : "1 Hour",
    "price" : "1.50"
  }, {
    "id" : 50001.1,
    "rate_type" : "valid_for",
    "zone_id" : "1001",
    "description" : "4 Hours",
    "price" : "3.00"
  }, {
    "id" : 50002.1,
    "rate_type" : "valid_for",
    "zone_id" : "1001",
    "description" : "8 Hours",
    "price" : "6.00"
  } ],
  "reservation_configuration" : null,
  "company" : {
    "proper_name" : "Niagara Falls",
    "logo_thumbnail" : null,
    "unique" : "niagarafalls"
  }
}
I want to flatten the JSON into the following dataframes, with these columns and the corresponding data:
df1:
zone_id,
timezone,
description,
html_description,
reserve,
access_type,
product_types,
rates,
reservation_configuration,
address.city,
address.country_code,
financial.currency.code,
amenities.self_park,
amenities.paved,
amenities.mobile_pass,
amenities.handicap,
company.proper_name,
company.logo_thumbnail,
company.unique
address:
zone_id
city
country_code
financial.currency:
zone_id
code
amenities:
zone_id
self_park
paved
mobile_pass
handicap
product_types:
zone_id
product_types
rates:
id
rate_type
zone_id
description
price
company:
zone_id
proper_name
logo_thumbnail
unique
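To make the target layout for the dictionary-valued keys concrete, here is a minimal sketch of one way those per-dictionary tables could be built. It is not a confirmed solution: `dict_table` is a hypothetical helper introduced only for this illustration, `record` is a trimmed copy of the sample JSON, and the sketch assumes every value reached by the path is either a dictionary or null.

```python
import pandas as pd

# Trimmed copy of the sample record from the question (dict-valued keys only).
record = {
    "zone_id": "1001",
    "address": {"city": "Niagara Falls", "country_code": "US"},
    "financial": {"currency": {"code": "USA"}},
    "amenities": {"self_park": True, "paved": True,
                  "mobile_pass": None, "handicap": None},
    "company": {"proper_name": "Niagara Falls", "logo_thumbnail": None,
                "unique": "niagarafalls"},
}

def dict_table(records, path, key="zone_id"):
    """Hypothetical helper: one row per record for the nested dict at `path`.

    `path` is a dotted key path such as "financial.currency". Each row
    starts with the record's `key` so the table can be joined back later.
    """
    rows = []
    for rec in records:
        node = rec
        for part in path.split("."):
            # Treat a missing or null intermediate level as an empty dict.
            node = (node or {}).get(part)
        row = {key: rec[key]}
        row.update(node or {})
        rows.append(row)
    return pd.DataFrame(rows)

address = dict_table([record], "address")
currency = dict_table([record], "financial.currency")
amenities = dict_table([record], "amenities")
company = dict_table([record], "company")
```

With thousands of files, the same helper could be called once per key on the full list of loaded records, so each resulting table has one row per zone.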
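This is what I have done so far. I can generate df1 with it, but I cannot split the lists and dictionaries in the JSON into separate dataframes, each carrying a key. zone_id is the unique identifier for each record (similar to a primary key in a database table) and is meant for joining the dataframes later. product_types and rates hold the information I am trying to extract here. I need help splitting each dictionary or list into its own dataframe, each with zone_id attached.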
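import json
import os

import pandas as pd

# Assumes path_to_json and json_files (a list of file names) are already defined.
dfs = []
for index, js in enumerate(json_files):
    print(index, js)
    with open(os.path.join(path_to_json, js)) as json_file:
        json_text = json.load(json_file)
    # Flatten nested dicts into dotted column names; lists stay as single cells.
    a = pd.json_normalize(json_text)
    dfs.append(a)
df1 = pd.concat(dfs, ignore_index=True)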
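For the list-valued keys, a possible direction is sketched below, not a verified answer. It assumes the loaded JSON objects are collected in a list (called `records` here, a name not in the original code) and that every record has a non-null `rates` list; records where `rates` is missing or null would likely need to be filtered out first.

```python
import pandas as pd

# Trimmed copy of the sample record, keeping only the list-valued keys.
records = [{
    "zone_id": "1001",
    "product_types": ["ondemand"],
    "rates": [
        {"id": 50000.1, "rate_type": "valid_for", "zone_id": "1001",
         "description": "1 Hour", "price": "1.50"},
        {"id": 50001.1, "rate_type": "valid_for", "zone_id": "1001",
         "description": "4 Hours", "price": "3.00"},
        {"id": 50002.1, "rate_type": "valid_for", "zone_id": "1001",
         "description": "8 Hours", "price": "6.00"},
    ],
}]

# rates: one row per list element. Each rate already carries its own
# zone_id, so no meta= argument is passed (a meta column with the same
# name would clash with the existing one).
rates = pd.json_normalize(records, record_path="rates")

# product_types: a list of plain strings, so explode it into one row
# per value while keeping zone_id alongside.
product_types = (
    pd.DataFrame(records)[["zone_id", "product_types"]]
    .explode("product_types")
    .reset_index(drop=True)
)
```

Both tables keep `zone_id`, so they can be joined back to df1 later.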
【Comments】:
Tags: python json pandas dataframe dictionary