【Question title】:Split JSON response item into multiple items in Python
【Posted】:2015-06-17 08:49:02
【Question】:

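I am currently trying to parse a JSON response from an API. At the moment I can read each item in the response and parse its data into variables, which I then use to build the SQL statements that import the data into a table. Here is a sample of the response: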

{
  "data": [
    {
      "id": 64731,
      "label": "Label Text goes here",
      "locations": [
        {
          "id": "55925",
          "label": "San Miguel (La Dorada)",
          "self": "http://url.com/api/locations/55925"
        }
      ],
      "other_location": "Other location text goes here",
      "subject": "Subject Text goes here",
      "url": "http://www.url.com"
    }
  ]
}

My Python script reads each entry in the JSON and imports it as one row in the table, with id, label, location, other_location, subject and url as the fields.

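However, some entries have more than one location. For those, I basically want to repeat the same entry once per location and change only the location information. So this: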

{
  "data": [
    {
      "id": 64731,
      "label": "Label Text goes here",
      "locations": [
        {
          "id": "55925",
          "label": "San Miguel (La Dorada)",
          "self": "http://url.com/api/locations/55925"
        },
        {
          "id": "55926",
          "label": "Istanbul",
          "self": "http://url.com/api/locations/55926"
        }
      ],
      "other_location": "Other location text goes here",
      "subject": "Subject Text goes here",
      "url": "http://www.url.com"
    }
  ]
}

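would become two rows in my table, each with the same data except for the location. How should I modify my working script so that it splits one entry into multiple entries? (Note that some of the variables come from the nested JSON behind each location's URL, but that should not affect what I am trying to do.)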

import json
import urllib2

# Note: `cl` (the SQL client) is not defined in this snippet.
def insert_into_table(sql_query):
    try:
        print cl.sql(sql_query)
    except Exception as e:
        print ("some error occurred", e)

def main():
  # define a variable to hold the source URL
  urlData = "http://www.url.com/api"

  # Open the URL and read the data
  webUrl = urllib2.urlopen(urlData)
  if (webUrl.getcode() == 200):
      data = webUrl.read()

      # Use the json module to load the string data into a dictionary
      api_url = json.loads(data)

      for i in api_url["data"]:
        id = i["id"]
        label = i["label"]

        #variables for nested locations JSON
        location_api = i["locations"][0]["self"]
        location_id = i["locations"][0]["id"]
        location_label = i["locations"][0]["label"]
        #checks connection and loads the json
        openlocations = urllib2.urlopen(location_api)
        if (openlocations.getcode() == 200):
            location_data = openlocations.read()
            load_locations = json.loads(location_data)

            #defining the variable to be inserted into table from the nested JSON
            geoid = load_locations["data"][0]["id"]
            geo_pcode = load_locations["data"][0]["pcode"]
            geo_iso_code = load_locations["data"][0]["iso3"]
            geo_admin_level = load_locations["data"][0]["admin_level"]
            lat = load_locations["data"][0]["geolocation"]["lat"]
            long = load_locations["data"][0]["geolocation"]["lon"]
            #dive into the nested locations to find admin level 0 which is the country name
            country = ""
            if geo_admin_level == "0":
                country = location_label
                #redeclares the location as null if the location is also the country
                location_label = "null"
            # finds, opens and loads the nested location url if necessary
            elif geo_admin_level == "1":
                geo_parent_url = load_locations["data"][0]["parent"][0]["self"]
                open_geoparent = urllib2.urlopen(geo_parent_url)
                if (open_geoparent.getcode() == 200):
                    geoparent_data = open_geoparent.read()
                    load_geoparent_data = json.loads(geoparent_data)
                    parent_geo_admin_level = load_geoparent_data["data"][0]["admin_level"]
                    if parent_geo_admin_level == "0":
                        country = load_geoparent_data["data"][0]["label"]
                    # finds, opens and loads the nested location url if necessary
                    elif parent_geo_admin_level == "1":
                        geo_grandparent_url = load_geoparent_data["data"][0]["parent"][0]["self"]
                        open_geograndparent = urllib2.urlopen(geo_grandparent_url)
                        if (open_geograndparent.getcode() == 200):
                            geo_grandparent_data = open_geograndparent.read()
                            load_geograndparent_data = json.loads(geo_grandparent_data)
                            grandparent_geo_admin_level = load_geograndparent_data["data"][0]["admin_level"]
                            if grandparent_geo_admin_level == "0":
                                country = load_geograndparent_data["data"][0]["label"]
                            # finds, opens and loads the nested location url if necessary
                            elif grandparent_geo_admin_level == "1":
                                geo_greatgrandparent_url = load_geograndparent_data["data"][0]["parent"][0]["self"]
                                open_geogreatgrandparent = urllib2.urlopen(geo_greatgrandparent_url)
                                if (open_geogreatgrandparent.getcode() == 200):
                                    geo_greatgrandparent_data = open_geogreatgrandparent.read()
                                    load_geogreatgrandparent_data = json.loads(geo_greatgrandparent_data)
                                    greatgrandparent_geo_admin_level = load_geogreatgrandparent_data["data"][0]["admin_level"]
                                    if greatgrandparent_geo_admin_level == "0":
                                        country = load_geogreatgrandparent_data["data"][0]["label"]
                                    else:
                                        #leaving as null for testing purposes if dive into 5th level
                                        country = "null"
                                else:
                                    print "GreatGrandparent location url does not exist or cannot be opened. Code: " + str(open_geogreatgrandparent.getcode())
                        else:
                             print "Grandparent location url does not exist or cannot be opened. Code: " + str(open_geograndparent.getcode())

                else:
                    print "Parent location url does not exist or cannot be opened. Code: " + str(open_geoparent.getcode())

            else:
                print "Primary Admin Level was not level 0 or 1"        
        #prints error message if connection fails
        else:
            print "Cannot open locations url. Code: " + str(openlocations.getcode())

        other_location = i["other_location"]
        subject = i["subject"]
        assessment_url = i["url"]

        try:
            sql_query = "INSERT INTO table_name (lat_lon, id, label, location_id, location_label, country, geoid, geo_pcode, geo_iso_code, geo_admin_level, other_location, subject, assessment_url) VALUES ("
            # one geometry value plus twelve '%s' placeholders, matching the thirteen columns above
            sql_query = sql_query + "'SRID=4326; POINT (%f %f)', '%s', '%s', '%s', '%s', '%s', '%s', '%s', '%s', '%s', '%s', '%s', '%s'" % (float(str(long)), float(str(lat)), id, label, location_id, location_label, country, geoid, geo_pcode, geo_iso_code, geo_admin_level, other_location, subject, assessment_url)
            sql_query = sql_query + ")"
            print str(sql_query)
        except ValueError, e:
            print ("some error occurred", e)

        #This is where you call insert_into_table()
        insert_into_table(sql_query)

Any help would be greatly appreciated.

【Question comments】:

    Tags: python sql json


    【Solution 1】:

    You have to iterate over the i["locations"] list.

    Change this:

    #variables for nested locations JSON
    location_api = i["locations"][0]["self"]
    location_id = i["locations"][0]["id"]
    location_label = i["locations"][0]["label"]
    

    to the following:

    #variables for nested locations JSON
    for loc in i["locations"]:
        location_api = loc["self"]
        location_id = loc["id"]
        location_label = loc["label"]
        <...rest of your code...>
    

    Python Standard Docs
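
As a rough illustration of the loop above, here is a minimal sketch that turns one item into one flat row per location. The helper name `expand_locations` and the row keys are hypothetical and not part of the original script; the sample item is copied from the question's JSON. The sketch avoids `print` and `urllib2`, so it should run unchanged on Python 2 and Python 3.

```python
def expand_locations(item):
    """Return one flat row (a dict) per entry in item["locations"]."""
    rows = []
    for loc in item["locations"]:
        rows.append({
            # fields shared by every row produced from this item
            "id": item["id"],
            "label": item["label"],
            "other_location": item["other_location"],
            "subject": item["subject"],
            "assessment_url": item["url"],
            # fields that differ per location
            "location_id": loc["id"],
            "location_label": loc["label"],
            "location_api": loc["self"],
        })
    return rows


# Sample item taken from the question's JSON.
item = {
    "id": 64731,
    "label": "Label Text goes here",
    "locations": [
        {
            "id": "55925",
            "label": "San Miguel (La Dorada)",
            "self": "http://url.com/api/locations/55925",
        },
        {
            "id": "55926",
            "label": "Istanbul",
            "self": "http://url.com/api/locations/55926",
        },
    ],
    "other_location": "Other location text goes here",
    "subject": "Subject Text goes here",
    "url": "http://www.url.com",
}

rows = expand_locations(item)
```

Each row repeats the item-level fields and differs only in the location fields, which is the two-row result the question asks for. An item with an empty `locations` list yields no rows under this sketch.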

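    【Discussion】:

    • Thanks for the reply. Yes, I can do that easily enough, but how would I modify my SQL query statement so that each iteration is inserted as a new row?
    • The rest of the code (including building the SQL query and running it against the database) should go inside the for loop I wrote. That code will then execute once per location, so multiple rows will be inserted.
    • Glad to help, and welcome to Stack Overflow. If this answer or any other one solved your issue, please mark it as accepted.
    • Thanks again for your help. It only partially works. For some reason, when I print to the console each location shows up as a separate entry, but when I check the table only one row, for the first location, has actually been inserted. Any suggestions?
    • Actually, never mind. I had to move my insert function call into the for loop as well, and now it works fine. Thanks again!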
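
The point settled in the comments, that the insert call must sit inside the location loop, can be sketched as follows. This is an illustration under stated assumptions rather than the asker's actual script: `insert_into_table` is replaced by a stand-in that appends to a list instead of calling `cl.sql(...)`, the helper `import_items` is hypothetical, and the column list is shortened to three fields.

```python
inserted = []  # stands in for the database table


def insert_into_table(sql_query):
    # Hypothetical stand-in for the question's cl.sql(sql_query) call.
    inserted.append(sql_query)


def import_items(items):
    for i in items:
        for loc in i["locations"]:
            sql_query = (
                "INSERT INTO table_name (id, location_id, location_label) "
                "VALUES ('%s', '%s', '%s')" % (i["id"], loc["id"], loc["label"])
            )
            # The insert runs inside the inner loop: once per location.
            # Placed after the loop, it would run only once per item.
            insert_into_table(sql_query)


# Shortened sample data based on the question's JSON.
items = [
    {
        "id": 64731,
        "locations": [
            {"id": "55925", "label": "San Miguel (La Dorada)"},
            {"id": "55926", "label": "Istanbul"},
        ],
    }
]

import_items(items)
```

After the call, `inserted` holds two statements, one per location. The string interpolation mirrors the question's style only for continuity; values containing quotes would break it, so a real script would likely want the client's own escaping or parameter mechanism if it offers one.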