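[Posted]: 2020-11-07 04:30:07
[Question]:
I am trying to build a DataFrame from the deeply nested AWS pricing API. After restricting it to the first-level key "terms" and the second-level key "OnDemand", I get the sku as the index and a single OnDemand column whose cells hold several levels of nested JSON/dicts. Here is the code and the output: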
import requests
import json
import os
import pandas as pd
from pandas import json_normalize  # the pandas.io.json import path has been deprecated since pandas 1.0
import flatten_json
ec2_url = requests.get("https://pricing.us-east-1.amazonaws.com/offers/v1.0/aws/AmazonEC2/current/us-east-1/index.json")
ec2_dict = json.loads(ec2_url.text)
df_init_terms = pd.DataFrame(ec2_dict['terms'])
df_init_terms
#print(df_init_terms.values)
df_init_terms = df_init_terms.drop(['Reserved'], axis = 1)
df_dropna = df_init_terms.dropna()
df_dropna1 = df_dropna[:1000]
df_init_terms.values
Output:
array([[{'QUMEF4UK3NPT4MN3.JRTCKXETXF': {'offerTermCode': 'JRTCKXETXF', 'sku': 'QUMEF4UK3NPT4MN3', 'effectiveDate': '2020-07-01T00:00:00Z', 'priceDimensions': {'QUMEF4UK3NPT4MN3.JRTCKXETXF.6YS6EN2CT7': {'rateCode': 'QUMEF4UK3NPT4MN3.JRTCKXETXF.6YS6EN2CT7', 'description': '$0.376 per Unused Reservation Windows c3.xlarge Instance Hour', 'beginRange': '0', 'endRange': 'Inf', 'unit': 'Hrs', 'pricePerUnit': {'USD': '0.3760000000'}, 'appliesTo': []}}, 'termAttributes': {}}}],
[{'DBCQPZ6Z853WRE98.JRTCKXETXF': {'offerTermCode': 'JRTCKXETXF', 'sku': 'DBCQPZ6Z853WRE98', 'effectiveDate': '2020-07-01T00:00:00Z', 'priceDimensions': {'DBCQPZ6Z853WRE98.JRTCKXETXF.6YS6EN2CT7': {'rateCode': 'DBCQPZ6Z853WRE98.JRTCKXETXF.6YS6EN2CT7', 'description': '$3.586 per Unused Reservation RHEL r5d.12xlarge Instance Hour', 'beginRange': '0', 'endRange': 'Inf', 'unit': 'Hrs', 'pricePerUnit': {'USD': '3.5860000000'}, 'appliesTo': []}}, 'termAttributes': {}}}],
[{'MK44K7QNJQCC2E98.JRTCKXETXF': {'offerTermCode': 'JRTCKXETXF', 'sku': 'MK44K7QNJQCC2E98', 'effectiveDate': '2020-07-01T00:00:00Z', 'priceDimensions': {'MK44K7QNJQCC2E98.JRTCKXETXF.6YS6EN2CT7': {'rateCode': 'MK44K7QNJQCC2E98.JRTCKXETXF.6YS6EN2CT7', 'description': '$1.40 per Dedicated Linux with SQL Std m4.2xlarge Instance Hour', 'beginRange': '0', 'endRange': 'Inf', 'unit': 'Hrs', 'pricePerUnit': {'USD': '1.4000000000'}, 'appliesTo': []}}, 'termAttributes': {}}}],
...,
[nan],
[nan],
[nan]], dtype=object)
Output using head():
OnDemand
QUMEF4UK3NPT4MN3 {'QUMEF4UK3NPT4MN3.JRTCKXETXF': {'offerTermCod...
DBCQPZ6Z853WRE98 {'DBCQPZ6Z853WRE98.JRTCKXETXF': {'offerTermCod...
MK44K7QNJQCC2E98 {'MK44K7QNJQCC2E98.JRTCKXETXF': {'offerTermCod...
86MNM35KQ46XCFDQ {'86MNM35KQ46XCFDQ.JRTCKXETXF': {'offerTermCod...
NCQF4R2S47SB2QE5 {'NCQF4R2S47SB2QE5.JRTCKXETXF': {'offerTermCod...
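How can I normalize the OnDemand column so that each sku becomes its own row, with separate columns for effectiveDate, description, and pricePerUnit? I am new to dictionaries and deep nesting. The desired output looks like this:
sku               effectiveDate         description          priceUnit
QUMEF4UK3NPT4MN3  2020-07-01T00:00:00Z  $0.376 per Unus...   $0.376
DBCQPZ6Z853WRE98  2020-07-01T00:00:00Z  $3.586 per Unuse...  $3.586
MK44K7QNJQCC2E98  ...and so on...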
Thanks in advance!
[Comments]:
Tags: json pandas dataframe nested
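One possible approach, offered as a sketch rather than a definitive answer: walk the three nesting levels visible in the output above (sku, then offer term, then price dimension) and collect one flat record per price dimension. The helper name `flatten_on_demand` and the variable `on_demand` are hypothetical, and the sample dict only reproduces two entries from the printed output; it assumes every offer has the same shape and that prices are quoted in USD.

```python
import pandas as pd

# Two entries copied from the printed output above, shaped like
# ec2_dict['terms']['OnDemand'] (sku -> offer term -> price dimension).
on_demand = {
    "QUMEF4UK3NPT4MN3": {
        "QUMEF4UK3NPT4MN3.JRTCKXETXF": {
            "offerTermCode": "JRTCKXETXF",
            "sku": "QUMEF4UK3NPT4MN3",
            "effectiveDate": "2020-07-01T00:00:00Z",
            "priceDimensions": {
                "QUMEF4UK3NPT4MN3.JRTCKXETXF.6YS6EN2CT7": {
                    "rateCode": "QUMEF4UK3NPT4MN3.JRTCKXETXF.6YS6EN2CT7",
                    "description": "$0.376 per Unused Reservation Windows c3.xlarge Instance Hour",
                    "beginRange": "0",
                    "endRange": "Inf",
                    "unit": "Hrs",
                    "pricePerUnit": {"USD": "0.3760000000"},
                    "appliesTo": [],
                }
            },
            "termAttributes": {},
        }
    },
    "DBCQPZ6Z853WRE98": {
        "DBCQPZ6Z853WRE98.JRTCKXETXF": {
            "offerTermCode": "JRTCKXETXF",
            "sku": "DBCQPZ6Z853WRE98",
            "effectiveDate": "2020-07-01T00:00:00Z",
            "priceDimensions": {
                "DBCQPZ6Z853WRE98.JRTCKXETXF.6YS6EN2CT7": {
                    "rateCode": "DBCQPZ6Z853WRE98.JRTCKXETXF.6YS6EN2CT7",
                    "description": "$3.586 per Unused Reservation RHEL r5d.12xlarge Instance Hour",
                    "beginRange": "0",
                    "endRange": "Inf",
                    "unit": "Hrs",
                    "pricePerUnit": {"USD": "3.5860000000"},
                    "appliesTo": [],
                }
            },
            "termAttributes": {},
        }
    },
}

COLUMNS = ["sku", "effectiveDate", "description", "pricePerUnit"]


def flatten_on_demand(on_demand):
    """Return one row per (sku, offer term, price dimension)."""
    rows = []
    for sku, offers in on_demand.items():
        for offer in offers.values():
            for dim in offer.get("priceDimensions", {}).values():
                rows.append({
                    "sku": sku,
                    "effectiveDate": offer.get("effectiveDate"),
                    "description": dim.get("description"),
                    # Assumption: only the USD price is of interest.
                    "pricePerUnit": dim.get("pricePerUnit", {}).get("USD"),
                })
    return pd.DataFrame(rows, columns=COLUMNS)


df_prices = flatten_on_demand(on_demand)
# Prices arrive as strings such as '0.3760000000'; convert them to floats.
df_prices["pricePerUnit"] = pd.to_numeric(df_prices["pricePerUnit"])
print(df_prices)
```

With the real data, the same helper could be called as `flatten_on_demand(ec2_dict['terms']['OnDemand'])`, which skips the intermediate DataFrame and its NaN rows entirely. Note that a sku with several price dimensions would produce several rows, so `sku` is kept as a regular column rather than as the index.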