【Question Title】: How to normalize a nested .json?
【Posted】: 2021-09-30 23:02:12
【Question】:

I'm working with the Mapbox Web API, which returns a .json. I'm having trouble parsing it, and one of the challenges is that the returned .json is nested. Here is the .json:

{
   "type":"FeatureCollection",
   "query":[
      -73.989,
      40.733
   ],
   "features":[
      {
         "id":"locality.12696928000137850",
         "type":"Feature",
         "place_type":[
            "locality"
         ],
         "relevance":1,
         "properties":{
            "wikidata":"Q11299"
         },
         "text":"Manhattan",
         "place_name":"Manhattan, New York, United States",
         "bbox":[
            -74.047313153061,
            40.679573,
            -73.907,
            40.8820749648427
         ],
         "center":[
            -73.9597,
            40.7903
         ],
         "geometry":{
            "type":"Point",
            "coordinates":[
               -73.9597,
               40.7903
            ]
         },
         "context":[
            {
               "id":"place.2618194975964500",
               "wikidata":"Q60",
               "text":"New York"
            },
            {
               "id":"district.12113562209855570",
               "wikidata":"Q500416",
               "text":"New York County"
            },
            {
               "id":"region.17349986251855570",
               "wikidata":"Q1384",
               "short_code":"US-NY",
               "text":"New York"
            },
            {
               "id":"country.19678805456372290",
               "wikidata":"Q30",
               "short_code":"us",
               "text":"United States"
            }
         ]
      },
      {
         "id":"region.17349986251855570",
         "type":"Feature",
         "place_type":[
            "region"
         ],
         "relevance":1,
         "properties":{
            "wikidata":"Q1384",
            "short_code":"US-NY"
         },
         "text":"New York",
         "place_name":"New York, United States",
         "bbox":[
            -79.8578350999901,
            40.4771391062446,
            -71.7564918092633,
            45.0239286969073
         ],
         "center":[
            -75.4652471468304,
            42.751210955
         ],
         "geometry":{
            "type":"Point",
            "coordinates":[
               -75.4652471468304,
               42.751210955
            ]
         },
         "context":[
            {
               "id":"country.19678805456372290",
               "wikidata":"Q30",
               "short_code":"us",
               "text":"United States"
            }
         ]
      },
      {
         "id":"country.19678805456372290",
         "type":"Feature",
         "place_type":[
            "country"
         ],
         "relevance":1,
         "properties":{
            "wikidata":"Q30",
            "short_code":"us"
         },
         "text":"United States",
         "place_name":"United States",
         "bbox":[
            -179.9,
            18.8163608007951,
            -66.8847646185949,
            71.4202919997506
         ],
         "center":[
            -97.9222112121185,
            39.3812661305678
         ],
         "geometry":{
            "type":"Point",
            "coordinates":[
               -97.9222112121185,
               39.3812661305678
            ]
         }
      }
   ],
   "attribution":"NOTICE: © 2021 Mapbox and its suppliers. All rights reserved. Use of this data is subject to the Mapbox Terms of Service (https://www.mapbox.com/about/maps/). This response and the information it contains may not be retained. POI(s) provided by Foursquare."
}

I was able to load it into a DataFrame with the following code snippet:

import json
import requests
from pandas import json_normalize

url = "https://api.mapbox.com/geocoding/v5/mapbox.places/-73.989,40.733.json?types=country,region,locality&access_token=MY_KEY_HERE"

data = json.loads(requests.get(url).text)

# one row per entry in data['features']
df = json_normalize(data, 'features')

return df
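
To see what that call produces without hitting the API, here is a minimal sketch that runs json_normalize on a hand-trimmed copy of the response above. The `sample` dict is an assumption: it keeps only two features and a few of their fields. Nested objects such as properties and geometry are flattened into dot-separated column names, while lists such as center stay as list values in a single cell. The top-level query field is not carried over.

```python
import pandas as pd

# Hand-trimmed stand-in for the Mapbox response (assumption: only a few
# fields are kept so the example runs offline).
sample = {
    "type": "FeatureCollection",
    "query": [-73.989, 40.733],
    "features": [
        {
            "id": "locality.12696928000137850",
            "text": "Manhattan",
            "properties": {"wikidata": "Q11299"},
            "center": [-73.9597, 40.7903],
            "geometry": {"type": "Point", "coordinates": [-73.9597, 40.7903]},
        },
        {
            "id": "region.17349986251855570",
            "text": "New York",
            "properties": {"wikidata": "Q1384", "short_code": "US-NY"},
            "center": [-75.4652471468304, 42.751210955],
            "geometry": {
                "type": "Point",
                "coordinates": [-75.4652471468304, 42.751210955],
            },
        },
    ],
}

# One row per element of sample["features"]; nested dicts become
# dotted columns such as "properties.wikidata" and "geometry.type".
df = pd.json_normalize(sample, "features")

print(len(df))
print(sorted(df.columns))
```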

However, I found that I need to add [query] to it, so I modified the relevant portion to look like this:

url = "https://api.mapbox.com/geocoding/v5/mapbox.places/-73.989,40.733.json?types=country,region,locality&access_token=MY_KEY_HERE"

data = json.loads(requests.get(url).text)

df = json_normalize(data, 'features', ['query'])

return df

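(The syntax I'm following comes from the documentation.)

The error I get states:

ValueError: Length of values does not match length of index.

The query field looks like this...

I'm not sure what the error is saying or how to resolve it.

This is my desired output DataFrame:

I can clean up and drop the fields I don't need, but I can't get the [query] field to show up.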

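【Comments】:

  • Doesn't requests.get().json solve your problem directly?
  • What is your expected output? Specifically, what do you want in the DataFrame?
  • Hi Xitiz, using requests.get(url).json gives me an error: "You are trying to iterate over an object of type method, but objects of that type are not iterable." But trying json_normalize worked for me, so I went with that.
  • Not_Speshal, this is my desired output. I inserted an image of the table I want at the end of the question. The .json has a lot of extra fields I don't need, but I can drop them later. I just can't get the [query] field to come out the way I want.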
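
The error quoted in the third comment is consistent with .json being used without parentheses. On a requests response, json is a method, so requests.get(url).json is the method object itself, while requests.get(url).json() is the parsed body. A minimal offline sketch of the difference follows; `FakeResponse` is a hypothetical stand-in for requests.Response and is not part of the requests library.

```python
import json


class FakeResponse:
    """Hypothetical stand-in for requests.Response, for offline illustration."""

    def __init__(self, text):
        self.text = text

    def json(self):
        # Same idea as Response.json(): parse the body text as JSON.
        return json.loads(self.text)


resp = FakeResponse('{"query": [-73.989, 40.733], "features": []}')

not_called = resp.json   # a bound method, not data
parsed = resp.json()     # a dict parsed from the body

print(callable(not_called))
print(parsed["query"])
```

With the real library, data = requests.get(url).json() should give the same dict as json.loads(requests.get(url).text) for a response like this one.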

Tags: python json pandas normalize


【Solution 1】:

Add the query column after json_normalize:

df.insert(0, 'query', [data['query']] * len(df))
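
A self-contained sketch of this approach, assuming a trimmed-down `data` dict in place of the live response (the feature contents are shortened for illustration). The query list is repeated once per row so that its length matches the index. Note that DataFrame.insert modifies df in place and returns None, so its result should not be assigned back to df.

```python
import pandas as pd

# Trimmed stand-in for the API response (assumption: most fields dropped).
data = {
    "query": [-73.989, 40.733],
    "features": [
        {"id": "locality.12696928000137850", "text": "Manhattan"},
        {"id": "region.17349986251855570", "text": "New York"},
        {"id": "country.19678805456372290", "text": "United States"},
    ],
}

df = pd.json_normalize(data, "features")

# One entry per row: a list of len(df) references to the query list.
result = df.insert(0, "query", [data["query"]] * len(df))

print(result)               # insert returns None; df itself was changed
print(df.columns.tolist())
```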

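【Comments】:

  • Corralien, I see the technique you're going for. I hadn't thought of adding it afterwards. I first wrote df = df.insert, but that wiped out everything; I left it as plain df.insert and everything was fine. Thank you very much! Problem solved! Out of curiosity, is there a way to squeeze [query] into the json_normalize line? Or is it too much trouble to be worth looking into?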
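
On that follow-up question, one possible way to keep everything inside the json_normalize call is to pass scalar meta fields instead of the list-valued query. The sketch below assumes it is acceptable to split the query into two columns. The names `query_lon` and `query_lat` are made up for this example, and the `data` dict is a trimmed stand-in for the real response. Whether meta=['query'] itself works may depend on the pandas version, which was not checked here.

```python
import pandas as pd

# Trimmed stand-in for the API response (assumption: most fields dropped).
data = {
    "query": [-73.989, 40.733],
    "features": [
        {"id": "locality.12696928000137850", "text": "Manhattan"},
        {"id": "region.17349986251855570", "text": "New York"},
        {"id": "country.19678805456372290", "text": "United States"},
    ],
}

# Copy the two coordinates to top-level scalar keys (hypothetical names).
data["query_lon"], data["query_lat"] = data["query"]

# Scalar meta values are repeated once per record in "features".
df = pd.json_normalize(data, "features", meta=["query_lon", "query_lat"])

print(df.columns.tolist())
```

The trade-off is that the query ends up in two scalar columns rather than one list-valued column, which may or may not match the desired output.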