【Question title】: Extracting nested list in column of tibble to other columns
【Posted】: 2020-10-24 06:14:16
【Question】:

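I am using the ggmap package and have created a tibble with two columns: an address and the output of geocode(address). The second column is a nested list. I would like to extract the information in this nested list into separate columns.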

The tibble looks like this:

# A tibble: 6 x 2
Origin                               geo         
<chr>                                <list>      
1 Major Arterial Road(South-East), De~ <named list~
2 Murari Pukur Govt Sponsored Higher ~ <named list~
3 Running Lab, RA Puram Trustpakkam M~ <named list~
4 Ravindra Bharathi, Saifabad Khairat~ <named list~
5 11/7, Adarsha Pally, Netaji Nagar, ~ <named list~
6 Hotel Eagle executive, Pimpri-Chinc~ <named list~

Each nested list contains the output of ggmap::geocode() and looks like this (dput output):

list(results = list(list(address_components = list(list(long_name = "417", 
    short_name = "417", types = list("street_number")), list(
    long_name = "North Rawhide", short_name = "N Rawhide", types = list(
        "route")), list(long_name = "Mid-america Industrial Park", 
    short_name = "Mid-america Industrial Park", types = list(
        "neighborhood", "political")), list(long_name = "Olathe", 
    short_name = "Olathe", types = list("locality", "political")), 
    list(long_name = "Johnson County", short_name = "Johnson County", 
        types = list("administrative_area_level_2", "political")), 
    list(long_name = "Kansas", short_name = "KS", types = list(
        "administrative_area_level_1", "political")), list(long_name = "United States", 
        short_name = "US", types = list("country", "political")), 
    list(long_name = "66061", short_name = "66061", types = list(
        "postal_code")), list(long_name = "3657", short_name = "3657", 
        types = list("postal_code_suffix"))), formatted_address = "417 N Rawhide, Olathe, KS 66061, USA", 
    geometry = list(location = list(lat = 38.8861111, lng = -94.7913889), 
        location_type = "ROOFTOP", viewport = list(northeast = list(
            lat = 38.8874600802915, lng = -94.7900399197085), 
            southwest = list(lat = 38.8847621197085, lng = -94.7927378802915))), 
    place_id = "ChIJRVUJdB--wIcRHkCUc1X_XBk", plus_code = list(
        compound_code = "V6P5+CC Olathe, KS, USA", global_code = "86C7V6P5+CC"), 
    types = list("clothing_store", "establishment", "point_of_interest", 
        "shoe_store", "store"))), status = "OK")

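My goal is to add extra columns to this tibble, such as latitude, longitude, county, country, and so on.

I tried writing my own functions, but they do not seem to work.

This is what I tried: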

get_lat <- function(geo) {
    return(geo$geometry$location$lat) 
}

MWE <- tibble(geo = lst) #(where lst is my dput output)
mutate(MWE, lat = map(geo, get_lat))

dput of the tibble:

structure(list(Origin = c("Major Arterial Road(South-East), Deshbandhu Nagar, New Town, West Bengal, India", 
"Murari Pukur Govt Sponsored Higher Secondary School, Ultadanga Main Road Ultadanga Kolkata West Bengal India", 
"Running Lab, RA Puram Trustpakkam Mandaveli Chennai Tamil Nadu India", 
"Ravindra Bharathi, Saifabad Khairatabad Hyderabad Telangana India", 
"11/7, Adarsha Pally, Netaji Nagar, Kolkata, West Bengal 700092, India", 
"Hotel Eagle executive, Pimpri-Chinchwad Maharashtra India"), 
    geo = list(list(results = list(list(address_components = list(
        list(long_name = "Major Arterial Road(South-East)", short_name = "Major Arterial Road(South-East)", 
            types = list("route")), list(long_name = "Deshbandhu Nagar", 
            short_name = "Deshbandhu Nagar", types = list("political", 
                "sublocality", "sublocality_level_2")), list(
            long_name = "New Town", short_name = "New Town", 
            types = list("locality", "political")), list(long_name = "North 24 Parganas", 
            short_name = "North 24 Parganas", types = list("administrative_area_level_2", 
                "political")), list(long_name = "West Bengal", 
            short_name = "WB", types = list("administrative_area_level_1", 
                "political")), list(long_name = "India", short_name = "IN", 
            types = list("country", "political"))), formatted_address = "Major Arterial Road(South-East), Deshbandhu Nagar, New Town, West Bengal, India", 
        geometry = list(bounds = list(northeast = list(lat = 22.6121767, 
            lng = 88.4750972), southwest = list(lat = 22.5835264, 
            lng = 88.4662183)), location = list(lat = 22.5977493, 
            lng = 88.4718048), location_type = "GEOMETRIC_CENTER", 
            viewport = list(northeast = list(lat = 22.6121767, 
                lng = 88.4750972), southwest = list(lat = 22.5835264, 
                lng = 88.4662183))), place_id = "Ek9NYWpvciBBcnRlcmlhbCBSb2FkKFNvdXRoLUVhc3QpLCBEZXNoYmFuZGh1IE5hZ2FyLCBOZXcgVG93biwgV2VzdCBCZW5nYWwsIEluZGlhIi4qLAoUChIJRdaq9U91AjoR7gvrdkF1FOASFAoSCb05qolEdQI6ETkXaZ56UBJJ", 
        types = list("route"))), status = "OK"), list(results = list(
        list(address_components = list(list(long_name = "Ultadanga Main Road", 
            short_name = "Ultadanga Main Rd", types = list("route")), 
            list(long_name = "Block-9", short_name = "Block-9", 
                types = list("neighborhood", "political")), list(
                long_name = "Murari Pukur", short_name = "Murari Pukur", 
                types = list("political", "sublocality", "sublocality_level_2")), 
            list(long_name = "Ultadanga", short_name = "Ultadanga", 
                types = list("political", "sublocality", "sublocality_level_1")), 
            list(long_name = "Kolkata", short_name = "Kolkata", 
                types = list("locality", "political")), list(
                long_name = "Kolkata", short_name = "Kolkata", 
                types = list("administrative_area_level_2", "political")), 
            list(long_name = "West Bengal", short_name = "WB", 
                types = list("administrative_area_level_1", "political")), 
            list(long_name = "India", short_name = "IN", types = list(
                "country", "political")), list(long_name = "700067", 
                short_name = "700067", types = list("postal_code"))), 
            formatted_address = "107 & 108/4, Ultadanga Main Rd, Block-9, Murari Pukur, Ultadanga, Kolkata, West Bengal 700067, India", 
            geometry = list(location = list(lat = 22.5923114, 
                lng = 88.3879988), location_type = "GEOMETRIC_CENTER", 
                viewport = list(northeast = list(lat = 22.5936603802915, 
                  lng = 88.3893477802915), southwest = list(lat = 22.5909624197085, 
                  lng = 88.3866498197085))), place_id = "ChIJRTqclBN2AjoRvFKeihU8Y6U", 
            plus_code = list(compound_code = "H9RQ+W5 Kolkata, West Bengal, India", 
                global_code = "7MJCH9RQ+W5"), types = list("establishment", 
                "point_of_interest", "school"))), status = "OK"), 
        list(results = list(list(address_components = list(list(
            long_name = "Chennai", short_name = "Chennai", types = list(
                "locality", "political")), list(long_name = "RA Puram", 
            short_name = "RA Puram", types = list("political", 
                "sublocality", "sublocality_level_3")), list(
            long_name = "Trustpakkam", short_name = "Trustpakkam", 
            types = list("political", "sublocality", "sublocality_level_2")), 
            list(long_name = "Mandaveli", short_name = "Mandaveli", 
                types = list("political", "sublocality", "sublocality_level_1")), 
            list(long_name = "Chennai", short_name = "Chennai", 
                types = list("administrative_area_level_2", "political")), 
            list(long_name = "Tamil Nadu", short_name = "TN", 
                types = list("administrative_area_level_1", "political")), 
            list(long_name = "India", short_name = "IN", types = list(
                "country", "political")), list(long_name = "600028", 
                short_name = "600028", types = list("postal_code"))), 
            formatted_address = "No 1, GF, Sai Durbar, 44/45, 2nd Main Road, Near Billroth Hospital, RA Puram, Trustpakkam, Mandaveli, RA Puram, Trustpakkam, Mandaveli, Chennai, Tamil Nadu 600028, India", 
            geometry = list(location = list(lat = 13.0272203, 
                lng = 80.2568369), location_type = "GEOMETRIC_CENTER", 
                viewport = list(northeast = list(lat = 13.0285692802915, 
                  lng = 80.2581858802915), southwest = list(lat = 13.0258713197085, 
                  lng = 80.2554879197085))), place_id = "ChIJv0BOUshnUjoRZ4HdVwXArxA", 
            plus_code = list(compound_code = "27G4+VP Chennai, Tamil Nadu, India", 
                global_code = "7M5227G4+VP"), types = list("clothing_store", 
                "establishment", "point_of_interest", "store"))), 
            status = "OK"), list(results = list(list(address_components = list(
            list(long_name = "State Assembly", short_name = "State Assembly", 
                types = list("landmark")), list(long_name = "Lakdikapul Road", 
                short_name = "Lakdikapul Rd", types = list("route")), 
            list(long_name = "Saifabad", short_name = "Saifabad", 
                types = list("political", "sublocality", "sublocality_level_2")), 
            list(long_name = "Lakdikapul", short_name = "Lakdikapul", 
                types = list("political", "sublocality", "sublocality_level_1")), 
            list(long_name = "Hyderabad", short_name = "Hyderabad", 
                types = list("locality", "political")), list(
                long_name = "Hyderabad", short_name = "Hyderabad", 
                types = list("administrative_area_level_2", "political")), 
            list(long_name = "Telangana", short_name = "Telangana", 
                types = list("administrative_area_level_1", "political")), 
            list(long_name = "India", short_name = "IN", types = list(
                "country", "political")), list(long_name = "500004", 
                short_name = "500004", types = list("postal_code"))), 
            formatted_address = "Lakdikapul Rd, near State Assembly, Saifabad, Lakdikapul, Hyderabad, Telangana 500004, India", 
            geometry = list(location = list(lat = 17.4033074, 
                lng = 78.467095), location_type = "GEOMETRIC_CENTER", 
                viewport = list(northeast = list(lat = 17.4046563802915, 
                  lng = 78.4684439802915), southwest = list(lat = 17.4019584197085, 
                  lng = 78.4657460197085))), place_id = "ChIJ_33ft7mRMjoRMDstgG7X4N8", 
            plus_code = list(compound_code = "CF38+8R Hyderabad, Telangana, India", 
                global_code = "7J9WCF38+8R"), types = list("establishment", 
                "point_of_interest"))), status = "OK"), list(
            results = list(list(address_components = list(list(
                long_name = "11", short_name = "11", types = list(
                  "subpremise")), list(long_name = "7", short_name = "7", 
                types = list("premise")), list(long_name = "Adarsha Pally", 
                short_name = "Adarsha Pally", types = list("political", 
                  "sublocality", "sublocality_level_2")), list(
                long_name = "Netaji Nagar", short_name = "Netaji Nagar", 
                types = list("political", "sublocality", "sublocality_level_1")), 
                list(long_name = "Kolkata", short_name = "Kolkata", 
                  types = list("locality", "political")), list(
                  long_name = "Kolkata", short_name = "Kolkata", 
                  types = list("administrative_area_level_2", 
                    "political")), list(long_name = "West Bengal", 
                  short_name = "WB", types = list("administrative_area_level_1", 
                    "political")), list(long_name = "India", 
                  short_name = "IN", types = list("country", 
                    "political")), list(long_name = "700092", 
                  short_name = "700092", types = list("postal_code"))), 
                formatted_address = "11, 7, Adarsha Pally, Netaji Nagar, Kolkata, West Bengal 700092, India", 
                geometry = list(location = list(lat = 22.4814396, 
                  lng = 88.3604102), location_type = "ROOFTOP", 
                  viewport = list(northeast = list(lat = 22.4827885802915, 
                    lng = 88.3617591802915), southwest = list(
                    lat = 22.4800906197085, lng = 88.3590612197085))), 
                place_id = "EkYxMSwgNywgQWRhcnNoYSBQYWxseSwgTmV0YWppIE5hZ2FyLCBLb2xrYXRhLCBXZXN0IEJlbmdhbCA3MDAwOTIsIEluZGlhIh4aHAoWChQKEgkPR5DY4nACOhG1kq12opRwUxICMTE", 
                types = list("subpremise")), list(address_components = list(
                list(long_name = "11/7", short_name = "11/7", 
                  types = list("premise")), list(long_name = "Adarsha Pally", 
                  short_name = "Adarsha Pally", types = list(
                    "political", "sublocality", "sublocality_level_2")), 
                list(long_name = "Netaji Nagar", short_name = "Netaji Nagar", 
                  types = list("political", "sublocality", "sublocality_level_1")), 
                list(long_name = "Kolkata", short_name = "Kolkata", 
                  types = list("locality", "political")), list(
                  long_name = "Kolkata", short_name = "Kolkata", 
                  types = list("administrative_area_level_2", 
                    "political")), list(long_name = "West Bengal", 
                  short_name = "WB", types = list("administrative_area_level_1", 
                    "political")), list(long_name = "India", 
                  short_name = "IN", types = list("country", 
                    "political")), list(long_name = "700092", 
                  short_name = "700092", types = list("postal_code"))), 
                formatted_address = "11/7, Adarsha Pally, Netaji Nagar, Kolkata, West Bengal 700092, India", 
                geometry = list(location = list(lat = 22.480674, 
                  lng = 88.363836), location_type = "ROOFTOP", 
                  viewport = list(northeast = list(lat = 22.4820229802915, 
                    lng = 88.3651849802915), southwest = list(
                    lat = 22.4793250197085, lng = 88.3624870197085))), 
                place_id = "ChIJ5fMASx1xAjoRymUCIwGCqYw", plus_code = list(
                  compound_code = "F9J7+7G Kolkata, West Bengal, India", 
                  global_code = "7MJCF9J7+7G"), types = list(
                  "street_address"))), status = "OK"), list(results = list(
            list(address_components = list(list(long_name = "251", 
                short_name = "251", types = list("street_number")), 
                list(long_name = "Pimpri-Chinchwad Link Road", 
                  short_name = "Pimpri-Chinchwad Link Rd", types = list(
                    "route")), list(long_name = "Gawade Nagar", 
                  short_name = "Gawade Nagar", types = list("political", 
                    "sublocality", "sublocality_level_2")), list(
                  long_name = "Chinchwad", short_name = "Chinchwad", 
                  types = list("political", "sublocality", "sublocality_level_1")), 
                list(long_name = "Pimpri-Chinchwad", short_name = "Pimpri-Chinchwad", 
                  types = list("locality", "political")), list(
                  long_name = "Pune", short_name = "Pune", types = list(
                    "administrative_area_level_2", "political")), 
                list(long_name = "Maharashtra", short_name = "MH", 
                  types = list("administrative_area_level_1", 
                    "political")), list(long_name = "India", 
                  short_name = "IN", types = list("country", 
                    "political")), list(long_name = "411033", 
                  short_name = "411033", types = list("postal_code"))), 
                formatted_address = "Survey No, 251, Pimpri-Chinchwad Link Rd, Gawade Nagar, Chinchwad, Pimpri-Chinchwad, Maharashtra 411033, India", 
                geometry = list(location = list(lat = 18.6301787, 
                  lng = 73.7938112), location_type = "ROOFTOP", 
                  viewport = list(northeast = list(lat = 18.6315276802915, 
                    lng = 73.7951601802915), southwest = list(
                    lat = 18.6288297197085, lng = 73.7924622197085))), 
                place_id = "ChIJbRla9GO5wjsR5gGzf6KGoJk", plus_code = list(
                  compound_code = "JQJV+3G Pimpri-Chinchwad, Maharashtra, India", 
                  global_code = "7JCMJQJV+3G"), types = list(
                  "establishment", "lodging", "point_of_interest"))), 
            status = "OK"))), row.names = c(NA, -6L), class = c("tbl_df", 
"tbl", "data.frame"))

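【Comments】:

  • Hi @RonakShah, I have updated the post with the output of dput(inner_list).
  • Is inner_list one element of the geo column, or something else?
  • The geo column contains many nested lists.
  • @mharinga's answer works with the previous definition of get_lat. I edited his answer to show the output and changed map to map_dbl. The return value of get_lat is geo$results[[1]]$geometry$location$lat.
  • Also, I assume you need both longitude and latitude, so change the return value of get_lat (which should then be renamed get_pos) to geo$results[[1]]$geometry$location and use tidyr::unnest_wider to add both lat and lng as columns, i.e. tbl %>% mutate(pos = map(geo, get_pos)) %>% unnest_wider(col = pos). That way you get latitude and longitude in one go.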
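
The suggestion in the last comment can be sketched as follows. This is a minimal illustration, not the commenter's exact code: the two-row `tbl` is a hypothetical, trimmed-down stand-in for the real tibble (only the `geometry$location` part of each response is kept, with coordinates copied from the dput above), and `get_pos` is the renamed helper the comment proposes.

```r
library(tibble)
library(dplyr)
library(purrr)
library(tidyr)

# Hypothetical, trimmed-down stand-in for the tibble in the question:
# each geo element mimics one geocode() response (results + status).
tbl <- tibble(
  Origin = c("Address A", "Address B"),
  geo = list(
    list(results = list(list(geometry = list(
      location = list(lat = 22.5977493, lng = 88.4718048)))), status = "OK"),
    list(results = list(list(geometry = list(
      location = list(lat = 13.0272203, lng = 80.2568369)))), status = "OK")
  )
)

# Return the whole location list (lat and lng) of the first result.
get_pos <- function(geo) geo$results[[1]]$geometry$location

# map() keeps each location as a list; unnest_wider() then spreads
# its named elements into one column per name.
out <- tbl %>%
  mutate(pos = map(geo, get_pos)) %>%
  unnest_wider(pos)
```

Because `unnest_wider()` names the new columns after the list elements, the longitude column is called `lng`, as in the geocoding response, rather than `long`.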

Tags: r dplyr


【Answer 1】:

You can use, for example, the purrr package. Something like this should work:

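library(tidyverse)

# get_lat as given in the comments on the question: latitude of the first result
get_lat <- function(geo) geo$results[[1]]$geometry$location$lat

tbl %>% mutate(lat = map_dbl(geo, get_lat))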

Output

# A tibble: 6 x 3
  Origin                                                                                                       geo                lat
  <chr>                                                                                                        <list>           <dbl>
1 Major Arterial Road(South-East), Deshbandhu Nagar, New Town, West Bengal, India                              <named list [2]>  22.6
2 Murari Pukur Govt Sponsored Higher Secondary School, Ultadanga Main Road Ultadanga Kolkata West Bengal India <named list [2]>  22.6
3 Running Lab, RA Puram Trustpakkam Mandaveli Chennai Tamil Nadu India                                         <named list [2]>  13.0
4 Ravindra Bharathi, Saifabad Khairatabad Hyderabad Telangana India                                            <named list [2]>  17.4
5 11/7, Adarsha Pally, Netaji Nagar, Kolkata, West Bengal 700092, India                                        <named list [2]>  22.5
6 Hotel Eagle executive, Pimpri-Chinchwad Maharashtra India                                                    <named list [2]>  18.6
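
The question also asks for county and country, which sit inside `address_components` rather than `geometry`. One way to get them, sketched here under assumptions, is to look up the component whose `types` contain the wanted label. `get_component` is a hypothetical helper name, the one-row `tbl` is a trimmed copy of the first record in the dput, and treating `administrative_area_level_2` as "county" is an assumption about how the labels map to the asker's needs.

```r
library(tibble)
library(dplyr)
library(purrr)

# Hypothetical helper: long_name of the first address component of the
# first result whose types include `type`, or NA if there is none.
get_component <- function(geo, type) {
  comps <- geo$results[[1]]$address_components
  hit <- keep(comps, function(comp) type %in% unlist(comp$types))
  if (length(hit) == 0) NA_character_ else hit[[1]]$long_name
}

# Trimmed copy of the first record from the question's dput.
tbl <- tibble(
  Origin = "Major Arterial Road(South-East), Deshbandhu Nagar, New Town, West Bengal, India",
  geo = list(list(
    results = list(list(address_components = list(
      list(long_name = "North 24 Parganas", short_name = "North 24 Parganas",
           types = list("administrative_area_level_2", "political")),
      list(long_name = "India", short_name = "IN",
           types = list("country", "political"))
    ))),
    status = "OK"
  ))
)

out <- tbl %>%
  mutate(
    county  = map_chr(geo, get_component, type = "administrative_area_level_2"),
    country = map_chr(geo, get_component, type = "country"),
    postal  = map_chr(geo, get_component, type = "postal_code")  # absent in this record
  )
```

Returning `NA_character_` for a missing component keeps `map_chr()` from failing on rows such as the first one in the dput, which has no `postal_code` entry.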

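【Comments】:

  • If I do this with my current get_lat function, I get the error: Error: Problem with `mutate()` input `lat`. x $ operator is invalid for atomic vectors i Input `lat` is `map(geo, get_lat)`. And the column is filled with the value from the first row.
  • Please add an MWE. I cannot reproduce your error. Your tibble is a list, not a data.frame.
  • The problem is in the get_lat function, because it takes the first element of the list, which does not contain the desired value. He should remove $results[[1]] and it will work.
  • Fundamentally, my goal is just to take the list under "dput output" and be able to extract its components into different columns of my tibble. I tried removing $results[[1]], but that does not work either; I still get the same error as before. I have updated the original post with what I tried.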
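
One plausible explanation for the error reported in these comments, offered as an assumption rather than a confirmed diagnosis: in the question's MWE, `tibble(geo = lst)` treats the two top-level elements of `lst` (`results` and `status`) as two separate rows, so `get_lat` is eventually called on the string "OK" and `$` fails on an atomic vector. Wrapping the response in `list()` keeps it as a single row. The `lst` below is a hypothetical, trimmed-down version of the dput output.

```r
library(tibble)
library(dplyr)
library(purrr)

# Hypothetical, trimmed-down version of one geocode() response.
lst <- list(
  results = list(list(geometry = list(
    location = list(lat = 38.8861111, lng = -94.7913889)))),
  status = "OK"
)

get_lat <- function(geo) geo$results[[1]]$geometry$location$lat

# As written in the question: one row per top-level element of lst,
# i.e. one row for `results` and one for `status`.
split_rows <- tibble(geo = lst)

# Wrapped in list(): the whole response becomes a single list-column cell.
MWE <- tibble(geo = list(lst))
MWE <- mutate(MWE, lat = map_dbl(geo, get_lat))
```

With the full six-row tibble from the dput, each `geo` cell already holds one complete response, so no extra wrapping is needed there.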