【Question title】: Get data.frame from JSON (R language)
【Posted】: 2016-03-14 12:35:46
【Question】:

My JSON file has 400 elements. I want to put them into a data.frame (I need it for analysis).

For example, I have this JSON file:

"user_id":"severalHint","device_id": "dfg4644fsnthj"
"user_id":"berrrrka","session_time": "2314"

I want to turn it into a data table like this:

   user_id      device_id     session_time
1  severalHint  dfg4644fsnthj NA
2  berrrrka     NA            2314

I converted my JSON to a list; here is the dput:

list(structure(list(data = structure(list(ios_idfv = "57BE266B-CA71-4C53-9F28-4CE0F6FFDB55", 
    engine_version = "unity 4.6.9", category = "resource", os_version = "ios 9.2", 
    ios_idfa = "256A9626-DD07-40C2-BB3F-862F26D89DEF", amount = 100, 
    v = 2, sdk_version = "unity 2.4.3", user_id = "256A9626-DD07-40C2-BB3F-862F26D89DEF", 
    session_num = 2, platform = "ios", connection_type = "wifi", 
    manufacturer = "apple", client_ts = 1457359485, session_id = "a4671aaa-bd13-4655-a42f-c333a9709c58", 
    device = "iPad4,5", event_id = "Source:Credits:Reward:DailyReward", 
    build = "1.0"), .Names = c("ios_idfv", "engine_version", 
"category", "os_version", "ios_idfa", "amount", "v", "sdk_version", 
"user_id", "session_num", "platform", "connection_type", "manufacturer", 
"client_ts", "session_id", "device", "event_id", "build")), first_in_batch = TRUE, 
    country_code = "RU", arrival_ts = 1457359488, game_id = 24540, 
    ip = "95.107.103.0", user_meta = structure(list(install_ad = "80961452183", 
        install_ad = "80961452183", install_ad = "80961452183", 
        install_ad = "80961452183", install_campaign = "wizzo", 
        install_campaign = "wizzo", install_campaign = "wizzo", 
        install_campaign = "wizzo", install_publisher = "d", 
        install_publisher = "d", install_publisher = "d", install_publisher = "d", 
        install_site = "Heroes and Castles 2 Free", install_site = "Heroes and Castles 2 Free", 
        install_site = "Heroes and Castles 2 Free", install_site = "Heroes and Castles 2 Free", 
        install_ts = 1457287997, revenue = list(), cohort_week = 1456704000, 
        cohort_month = 1456790400), .Names = c("install_ad", 
    "install_ad", "install_ad", "install_ad", "install_campaign", 
    "install_campaign", "install_campaign", "install_campaign", 
    "install_publisher", "install_publisher", "install_publisher", 
    "install_publisher", "install_site", "install_site", "install_site", 
    "install_site", "install_ts", "revenue", "cohort_week", "cohort_month"
    ))), .Names = c("data", "first_in_batch", "country_code", 
"arrival_ts", "game_id", "ip", "user_meta")))

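【Comments】:

  • Please provide a dput of your data set (a small one, please) and the desired output.
  • @DavidArenburg I have edited my post.
  • @AnandaMahto Yes, but I don't know what "dput" is.
  • Is rjson the package you are using?
  • @DavidArenburg Yes.

Tags: r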


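【Solution 1】:

OK, let's start with your JSON example. Wrapped in brackets and braces so that it is valid JSON (an array of two objects), it looks like this: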

js = '[
          {"user_id":"severalHint","device_id": "dfg4644fsnthj"},
          {"user_id":"berrrrka","session_time": "2314"}
      ]'

This can be converted to a list with the rjson package:

require(rjson)
js.list = fromJSON(js)

There is also a good post that discusses converting a list to a data frame: Convert R list to dataframe with missing/NULL elements

For our example, the code is

library(plyr)

rbind.fill(lapply(js.list, function(f) {
    as.data.frame(Filter(Negate(is.null), f))
}))

which produces the output

      user_id     device_id session_time
1 severalHint dfg4644fsnthj         <NA>
2    berrrrka          <NA>         2314
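
If you would rather not depend on plyr, the same idea can be sketched in base R: collect the union of all field names, pad each record with NA for the fields it lacks, and bind the rows. This is only a minimal sketch. The list `records` is built by hand to mirror what `fromJSON` returns for the example, the helper name `to_row` is made up for illustration, and every value is coerced to character for simplicity.

```r
# Records as they might look after parsing the example JSON (built by hand
# here so the sketch does not depend on any JSON package).
records <- list(
  list(user_id = "severalHint", device_id = "dfg4644fsnthj"),
  list(user_id = "berrrrka", session_time = "2314")
)

# Union of all field names, in order of first appearance.
all_cols <- unique(unlist(lapply(records, names)))

# Hypothetical helper: turn one record into a one-row data frame,
# with NA for every field the record does not have.
to_row <- function(rec) {
  rec <- Filter(Negate(is.null), rec)   # drop JSON nulls, as in the answer
  row <- setNames(as.list(rep(NA_character_, length(all_cols))), all_cols)
  row[names(rec)] <- lapply(rec, as.character)
  as.data.frame(row, stringsAsFactors = FALSE)
}

# Every row now has the same columns, so plain rbind is enough.
df <- do.call(rbind, lapply(records, to_row))
df
```

Because every row is padded to the same set of columns first, `rbind` does not need the fill behaviour that `rbind.fill` provides.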

That should solve your problem.
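
The dput in the question shows records that are nested (`data`, `user_meta`), contain repeated keys (`install_ad`, `install_campaign`, ...) and an empty list (`revenue`). One possible way to handle that shape, sketched below, is to flatten each record with `unlist()` before binding. The record `rec` is a trimmed-down copy of the question's data, `flatten_record` is a hypothetical helper name, and note that `unlist()` coerces all values to character here.

```r
# A trimmed-down record with the same shape as the dput in the question
# (values copied from it; most fields omitted for brevity).
rec <- list(
  data = list(user_id = "256A9626-DD07-40C2-BB3F-862F26D89DEF",
              platform = "ios", amount = 100),
  country_code = "RU",
  user_meta = list(install_campaign = "wizzo",
                   install_campaign = "wizzo",   # repeated key, as in the dput
                   revenue = list())             # empty list, as in the dput
)

# Hypothetical helper: flatten one nested record into a one-row data frame.
flatten_record <- function(rec) {
  flat <- unlist(rec)                      # names become "data.user_id", etc.;
                                           # empty lists drop out
  flat <- flat[!duplicated(names(flat))]   # keep the first of repeated keys
  as.data.frame(as.list(flat), stringsAsFactors = FALSE)
}

row <- flatten_record(rec)
names(row)
```

One-row data frames produced this way for each record could then be combined with `rbind.fill`, as in the example above, so that fields missing from some records become NA.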
