【Question title】: Nested JSON unpack into PD dataframe
【Posted】: 2022-01-09 18:40:56
【Question】:

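I would like some help with a .json file I have. I have looked extensively at the pd.json_normalize() method, but I cannot get the output into the right shape.

The line of code I have been trying is: result_df = pd.json_normalize(cgcryptohistory_data)

I would like to turn my JSON into a df laid out like this: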

date bitcoin prices bitcoin market_caps bitcoin total_volumes ethereum prices ethereum market_caps ethereum total_volumes
1637920962758 55084.24409740329 1040185692035.8112 4096.986983019884 ... ...
1637924583096 ... ... ... ... ... ...

I have been working through the documentation below, but I cannot get it to work with the unnamed nested values. https://pandas.pydata.org/pandas-docs/version/1.2.0/reference/api/pandas.json_normalize.html https://www.kaggle.com/jboysen/quick-tutorial-flatten-nested-json-in-pandas/notebook

[
  [
    {
      "crypto": "bitcoin"
    }
  ],
  {
    "prices": [
      [
        1637920962758,
        55084.24409740329
      ],
      [
        1637924583096,
        54657.9826454445
      ],
      [
        1637928143387,
        54031.99796233907
      ],
      [
        1638524408000,
        56556.355173823926
      ]
    ],
    "market_caps": [
      [
        1637920962758,
        1040185692035.8112
      ],
      [
        1637924583096,
        1032137732028.0712
      ],
      [
        1637928143387,
        1020318960913.6139
      ],
      [
        1638524408000,
        1068341065780.2579
      ]
    ],
    "total_volumes": [
      [
        1637920962758,
        40002799175.46155
      ],
      [
        1637924583096,
        38579701553.8867
      ],
      [
        1637928143387,
        39373185822.85809
      ],
      [
        1638524408000,
        32567680716.236423
      ]
    ]
  },
  [
    {
      "crypto": "ethereum"
    }
  ],
  {
    "prices": [
      [
        1637920951704,
        4096.986983019884
      ],
      [
        1637924408082,
        4072.6963895955864
      ],
      [
        1637928090810,
        4021.2930336538925
      ],
      [
        1638524390000,
        4559.839444343959
      ]
    ],
    "market_caps": [
      [
        1637920951704,
        485474079335.9266
      ],
      [
        1637924408082,
        482758573953.61304
      ],
      [
        1637928090810,
        479260985689.3548
      ],
      [
        1638524390000,
        540740261905.95264
      ]
    ],
    "total_volumes": [
      [
        1637920951704,
        25972933719.35031
      ],
      [
        1637924408082,
        26468521371.13646
      ],
      [
        1637928090810,
        27042124946.11916
      ],
      [
        1638524390000,
        20268892519.524815
      ]
    ]
  }
]

【Question discussion】:

    Tags: python json pandas nested


    【Solution 1】:

    Assuming js is your parsed JSON, I would do it this way.

    import pandas as pd

    # js is the parsed JSON: a flat list alternating [{"crypto": name}] and {series...}
    frames = []
    for i in range(0, len(js), 2):
        crypto = js[i][0]["crypto"]  # label entry
        data = js[i + 1]             # the series that belong to that label
        frames.append(pd.DataFrame({
            "crypto": crypto,
            "prices": [k[1] for k in data["prices"]],
            "market_caps": [k[1] for k in data["market_caps"]],
            "total_volumes": [k[1] for k in data["total_volumes"]],
            # timestamps are read from one series; the three series share them in the sample
            "date": [k[0] for k in data["total_volumes"]],
        }))
    df = pd.concat(frames)
    

    Output:

         crypto        prices   market_caps  total_volumes           date
    0   bitcoin  55084.244097  1.040186e+12   4.000280e+10  1637920962758
    1   bitcoin  54657.982645  1.032138e+12   3.857970e+10  1637924583096
    2   bitcoin  54031.997962  1.020319e+12   3.937319e+10  1637928143387
    3   bitcoin  56556.355174  1.068341e+12   3.256768e+10  1638524408000
    0  ethereum   4096.986983  4.854741e+11   2.597293e+10  1637920951704
    1  ethereum   4072.696390  4.827586e+11   2.646852e+10  1637924408082
    2  ethereum   4021.293034  4.792610e+11   2.704212e+10  1637928090810
    3  ethereum   4559.839444  5.407403e+11   2.026889e+10  1638524390000
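
The date column in the output above is a raw integer. A minimal sketch of turning it into readable timestamps, assuming the values are Unix epoch milliseconds in UTC (the source does not state the unit) and using a hypothetical new column name, datetime:

```python
import pandas as pd

# Two sample rows in the same shape as the answer's df (values from the question).
df = pd.DataFrame({
    "crypto": ["bitcoin", "bitcoin"],
    "prices": [55084.24409740329, 54657.9826454445],
    "date": [1637920962758, 1637924583096],
})

# Assumption: "date" holds Unix epoch milliseconds (UTC).
# "datetime" is a hypothetical column name chosen for this sketch.
df["datetime"] = pd.to_datetime(df["date"], unit="ms")
print(df[["crypto", "datetime", "prices"]])
```

If the unit were seconds instead, unit="s" would be the matching argument.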
    

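    This long layout is more extensible, since another coin adds rows rather than columns, and you can filter for the crypto you want like this: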

    df[df.crypto == "bitcoin"]
    

    Output:

        crypto        prices   market_caps  total_volumes           date
    0  bitcoin  55084.244097  1.040186e+12   4.000280e+10  1637920962758
    1  bitcoin  54657.982645  1.032138e+12   3.857970e+10  1637924583096
    2  bitcoin  54031.997962  1.020319e+12   3.937319e+10  1637928143387
    3  bitcoin  56556.355174  1.068341e+12   3.256768e+10  1638524408000
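
The question asked for one row per sample with a column per coin and series. The sketch below is one possible way to reshape a long frame like the one above into that layout; it is not part of the original answer. It rests on an assumption: the bitcoin and ethereum timestamps in the sample differ by a few seconds, so rows are aligned by their position within each coin, and the bitcoin timestamps are used as the shared date column. The helper column row is hypothetical.

```python
import pandas as pd

# Small long-format frame in the same shape as the answer's df (values from the question).
df = pd.DataFrame({
    "crypto": ["bitcoin", "bitcoin", "ethereum", "ethereum"],
    "prices": [55084.24409740329, 54657.9826454445,
               4096.986983019884, 4072.6963895955864],
    "market_caps": [1040185692035.8112, 1032137732028.0712,
                    485474079335.9266, 482758573953.61304],
    "total_volumes": [40002799175.46155, 38579701553.8867,
                      25972933719.35031, 26468521371.13646],
    "date": [1637920962758, 1637924583096, 1637920951704, 1637924408082],
})

coins = ["bitcoin", "ethereum"]
series = ["prices", "market_caps", "total_volumes"]

# Assumption: the n-th sample of every coin belongs to the same time slot,
# so the position within each coin serves as the alignment key.
df["row"] = df.groupby("crypto").cumcount()

wide = df.pivot(index="row", columns="crypto", values=series)

# Flatten the (series, coin) column MultiIndex into names like "bitcoin prices".
wide.columns = [f"{coin} {name}" for name, coin in wide.columns]

# Group each coin's columns together, in the order requested in the question.
wide = wide[[f"{coin} {name}" for coin in coins for name in series]]

# Use the bitcoin timestamps as the shared date column (a simplification).
wide.insert(0, "date", df.loc[df["crypto"] == "bitcoin", "date"].to_numpy())
print(wide)
```

Aligning by position only makes sense when every coin has the same number of samples in the same order. If that does not hold, matching rows on the nearest timestamp, for example with pd.merge_asof and direction="nearest", may be a better fit.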
    

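    【Discussion】:

    • Hey Tbaki, thank you for your answer, this is the most beautiful thing I have seen today. I am checking how it works for me.
    • Yes, it works like a charm, thank you very much. This is exactly the help I needed. The coding style is nice too.
    • Glad I could help. :)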