【Question title】: How to get a string-formatted JSON into a table
【Posted】: 2020-02-08 04:43:34
【Question description】:

I have the following JSON data in string format. How can I convert data into a tabular format in R or Python?

I tried df = pd.DataFrame(data), but that does not work because data is a string.

data = '{"Id":"048f7de7-81a4-464d-bd6d-df3be3b1e7e8","RecordType":20, "CreationTime":"2019-10-08T12:12:32","Operation":"SetScheduledRefresh", "OrganizationId":"39b03722-b836-496a-85ec-850f0957ca6b","UserType":0, "UserAgent":"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/76.0.3809.132 Safari/537.36", "ItemName":"ASO Daily Statistics","Schedules":{"RefreshFrequency":"Daily", "TimeZone":"E. South America Standard Time","Days":["All"], "Time":["07:30:00","10:30:00","13:30:00","16:30:00","19:30:00","22:30:00"]}, "IsSuccess":true,"ActivityId":"4e8b4514-24be-4ba5-a7d3-a69e8cb8229e"}'

Expected output:

output = 
------------------------------------------------------------------
ID                                      | RecordType | CreationTime
048f7de7-81a4-464d-bd6d-df3be3b1e7e8    | 20         | 2019-10-08T12:12:32

Error:

ValueError                                Traceback (most recent call last)
<ipython-input-26-039b238b38ef> in <module>
----> 1 df = pd.DataFrame(data)

e:\Anaconda3\lib\site-packages\pandas\core\frame.py in __init__(self, data, index, columns, dtype, copy)
    483                 )
    484             else:
--> 485                 raise ValueError("DataFrame constructor not properly called!")
    486 
    487         NDFrame.__init__(self, mgr, fastpath=True)

ValueError: DataFrame constructor not properly called!

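【Comments】:

  • Welcome to SO. 1. Since this is not a coding service, please explain what you tried, what went wrong, and what you expected; in other words, please refer to How to Ask and minimal reproducible example. 2. Your string variable is not of type string but a dictionary. 3. Your variable definition raises a syntax error, because True in Python has a capital T...
  • First of all, there is no true in Python; we have True.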
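
The comments above turn on the difference between JSON text and Python literals. A minimal sketch of that difference, assuming only the standard-library json and ast modules and a shortened, hypothetical sample string (not the full data from the question): json.loads accepts JSON's lowercase true and returns a dict, while ast.literal_eval rejects the same text.

```python
import json
from ast import literal_eval

# Shortened, hypothetical sample in the same shape as the question's data.
sample = '{"RecordType": 20, "IsSuccess": true}'

# json.loads parses JSON text; JSON `true` is mapped to Python `True`.
parsed = json.loads(sample)
print(type(parsed).__name__, parsed["IsSuccess"])

# literal_eval only understands Python literals, so the bare name `true`
# is rejected instead of being converted.
try:
    literal_eval(sample)
    literal_eval_ok = True
except (ValueError, SyntaxError):
    literal_eval_ok = False
print(literal_eval_ok)
```

So the string in the question is valid JSON as written; it only becomes a problem when it is treated as Python source.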

Tags: python r json pandas


【Solution 1】:

In Python:

import pandas as pd
from ast import literal_eval
from pandas import json_normalize  # pandas >= 1.0; the older pandas.io.json import path is deprecated

data = '{"Id":"048f7de7-81a4-464d-bd6d-df3be3b1e7e8","RecordType":20, "CreationTime":"2019-10-08T12:12:32","Operation":"SetScheduledRefresh", "OrganizationId":"39b03722-b836-496a-85ec-850f0957ca6b","UserType":0, "UserAgent":"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/76.0.3809.132 Safari/537.36", "ItemName":"ASO Daily Statistics","Schedules":{"RefreshFrequency":"Daily", "TimeZone":"E. South America Standard Time","Days":["All"], "Time":["07:30:00","10:30:00","13:30:00","16:30:00","19:30:00","22:30:00"]}, "IsSuccess":true,"ActivityId":"4e8b4514-24be-4ba5-a7d3-a69e8cb8229e"}'

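# The string is JSON, so `true` is not a valid Python literal; patch it before literal_eval.
# Caveat: this also rewrites any 'true' that appears inside a string value.
data = data.replace('true', 'True')
data = literal_eval(data)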

{'ActivityId': '4e8b4514-24be-4ba5-a7d3-a69e8cb8229e',
 'CreationTime': '2019-10-08T12:12:32',
 'Id': '048f7de7-81a4-464d-bd6d-df3be3b1e7e8',
 'IsSuccess': True,
 'ItemName': 'ASO Daily Statistics',
 'Operation': 'SetScheduledRefresh',
 'OrganizationId': '39b03722-b836-496a-85ec-850f0957ca6b',
 'RecordType': 20,
 'Schedules': {'Days': ['All'],
               'RefreshFrequency': 'Daily',
               'Time': ['07:30:00',
                        '10:30:00',
                        '13:30:00',
                        '16:30:00',
                        '19:30:00',
                        '22:30:00'],
               'TimeZone': 'E. South America Standard Time'},
 'UserAgent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 '
              '(KHTML, like Gecko) Chrome/76.0.3809.132 Safari/537.36',
 'UserType': 0}

Create the DataFrame:

df = json_normalize(data)

                                   Id  RecordType         CreationTime            Operation                        OrganizationId  UserType                                                                                                            UserAgent              ItemName  IsSuccess                            ActivityId Schedules.RefreshFrequency              Schedules.TimeZone Schedules.Days                                                Schedules.Time
 048f7de7-81a4-464d-bd6d-df3be3b1e7e8          20  2019-10-08T12:12:32  SetScheduledRefresh  39b03722-b836-496a-85ec-850f0957ca6b         0  Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/76.0.3809.132 Safari/537.36  ASO Daily Statistics       True  4e8b4514-24be-4ba5-a7d3-a69e8cb8229e                      Daily  E. South America Standard Time          [All]  [07:30:00, 10:30:00, 13:30:00, 16:30:00, 19:30:00, 22:30:00]
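
As an alternative to patching the string for literal_eval, the data can be parsed as JSON directly. The following is a minimal sketch, not part of the original answer: it assumes pandas 1.0 or later (for pd.json_normalize) and uses a shortened copy of the question's data to keep the example readable.

```python
import json

import pandas as pd

# Shortened copy of the question's data, with the same nesting.
data = (
    '{"Id": "048f7de7-81a4-464d-bd6d-df3be3b1e7e8", "RecordType": 20, '
    '"CreationTime": "2019-10-08T12:12:32", '
    '"Schedules": {"RefreshFrequency": "Daily", "Days": ["All"]}, '
    '"IsSuccess": true}'
)

record = json.loads(data)       # str -> dict; JSON true -> Python True
df = pd.json_normalize(record)  # nested keys become dotted column names

# Keep only the columns shown in the question's expected output.
table = df[["Id", "RecordType", "CreationTime"]]
print(table.to_string(index=False))
```

Because json.loads understands true, false, and null natively, no string replacement is needed, and values that happen to contain the text "true" are left untouched.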

【Comments】:

  • Thanks @Trenton McKinney, this is what I wanted.

【Solution 2】:

You will need the reticulate library, and you need to change every true to True. See the code below:

a <-  'string = {"Id":"048f7de7-81a4-464d-bd6d-df3be3b1e7e8","RecordType":20,
                 "CreationTime":"2019-10-08T12:12:32","Operation":"SetScheduledRefresh",
                 "OrganizationId":"39b03722-b836-496a-85ec-850f0957ca6b","UserType":0,
                 "UserAgent":"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/76.0.3809.132 Safari/537.36",
                 "ItemName":"ASO Daily Statistics","Schedules":{"RefreshFrequency":"Daily",
                 "TimeZone":"E. South America Standard Time","Days":["All"],
                 "Time":["07:30:00","10:30:00","13:30:00","16:30:00","19:30:00","22:30:00"]},
                 "IsSuccess":true,"ActivityId":"4e8b4514-24be-4ba5-a7d3-a69e8cb8229e"}'

data.frame(reticulate::py_eval(gsub('true','True',sub('.*=\\s+','',a))))

【Comments】:

  • Thanks @Onyambu, but it gives me a "path not found" error.