【Question title】: Appending GeoDataFrames does not return expected dataframe
【Posted】: 2021-06-10 09:22:15
【Question description】:

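I'm running into the following problem when trying to append dataframes that contain geometry types. The pandas dataframe I'm looking at looks like this: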

name     x_zone     y_zone
0  A1  65.422080  48.147850
1  A1  46.635708  51.165745
2  A1  46.597984  47.657444
3  A1  68.477700  44.073700
4  A3  46.635708  54.108190
5  A3  46.635708  51.844770
6  A3  63.309560  48.826878
7  A3  62.215572  54.108190

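As you can see, there are four rows for each name, because they represent the corners of a polygon. I need this in the form of a polygon as defined in geopandas, i.e. I need a GeoDataFrame. To get there, I used the following code for just one of the names (only to check that it works):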

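import geopandas as gpd
from shapely.geometry import Polygon

name = 'A1'
df_a1 = df[df['name'] == name]  # keep df itself intact for later use

x = df_a1['x_zone'].to_list()
y = df_a1['y_zone'].to_list()
polygon_geom = Polygon(zip(x, y))
crs = "EPSG:4326"  # the {'init': ...} dict form is deprecated
polygon = gpd.GeoDataFrame(index=[name], crs=crs, geometry=[polygon_geom])
print(polygon)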

This returns:

                                             geometry
A1  POLYGON ((65.42208 48.14785, 46.63571 51.16575...

polygon.info()

<class 'geopandas.geodataframe.GeoDataFrame'>
Index: 1 entries, A1 to A1
Data columns (total 1 columns):
 #   Column    Non-Null Count  Dtype   
---  ------    --------------  -----   
 0   geometry  1 non-null      geometry
dtypes: geometry(1)
memory usage: 16.0+ bytes

Great, that works. So for more names, I thought the following would work:

unique_place = list(df['name'].unique())

GE = []
for name in unique_place:
    f = df[df['name'] == name]
    x = f['x_zone'].to_list()
    y = f['y_zone'].to_list()
    polygon_geom = Polygon(zip(x, y))
    crs = "EPSG:4326"
    polygon = gpd.GeoDataFrame(index=[name], crs=crs, geometry=[polygon_geom])
    print(polygon.info())
    GE.append(polygon)  # GE is a plain Python list, so this is list.append

But what I get back is a list, not a dataframe.

[                                             geometry
 A1  POLYGON ((65.42208 48.14785, 46.63571 51.16575...,
                                              geometry
 A3  POLYGON ((46.63571 54.10819, 46.63571 51.84477...]

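This is strange, because *.append(**) works fine when the thing being appended is a pandas dataframe.

What am I missing? Also, even in the first case I am left with only the geometry column, but that is not a problem, since I can write the file to shp and read it back in to get the second column (name).

Thanks for any solution that gets me moving forward!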

【Comments】:

    Tags: pandas geopandas


    【Solution 1】:

    I guess you need sample code that uses groupby on your data. If that is not the case, let me know.

    from io import StringIO
    import geopandas as gpd
    import pandas as pd
    from shapely.geometry import Polygon
    import numpy as np
    
    dats_str = """index  id     x_zone     y_zone
    0  A1  65.422080  48.147850
    1  A1  46.635708  51.165745
    2  A1  46.597984  47.657444
    3  A1  68.477700  44.073700
    4  A3  46.635708  54.108190
    5  A3  46.635708  51.844770
    6  A3  63.309560  48.826878
    7  A3  62.215572  54.108190"""
    
    # read the string, convert to dataframe
    df1 = pd.read_csv(StringIO(dats_str), sep=r'\s+', index_col='index')
    
    # Use groupby as an iterator to:
    # - collect the items of interest
    # - compute per-group values: mean, create Polygon, maybe others
    # - collect everything into lists
    ids = []
    counts = []
    meanx = []
    meany = []
    list_x = []
    list_y = []
    polygon = []
    for label, group in df1.groupby('id'):
        # label: 'A1', then 'A3'
        # group: the sub-dataframe for 'A1', then for 'A3'
        ids.append(label)   
        counts.append(len(group))         #number of rows
        meanx.append(group.x_zone.mean())
        meany.append(group.y_zone.mean())
        # process x,y data of this group -> for polygon
        xs = group.x_zone.values
        ys = group.y_zone.values
        list_x.append(xs)
        list_y.append(ys)
        polygon.append(Polygon(zip(xs, ys))) # make/collect polygon
    
    # items above are used to create a dataframe here
    df_from_groupby = pd.DataFrame({'id': ids, 'counts': counts,
                                    'meanx': meanx, 'meany': meany,
                                    'list_x': list_x, 'list_y': list_y,
                                    'polygon': polygon
                                   })
    

    If you print the dataframe df_from_groupby, you get:

       id  counts      meanx      meany  \
    0  A1       4  56.783368  47.761185   
    1  A3       4  54.699137  52.222007   
    
                                            list_x  \
    0    [65.42208, 46.635708, 46.597984, 68.4777]   
    1  [46.635708, 46.635708, 63.30956, 62.215572]   
    
                                          list_y  \
    0  [48.14785, 51.165745, 47.657444, 44.0737]   
    1  [54.10819, 51.84477, 48.826878, 54.10819]   
    
                                                 polygon  
    0  POLYGON ((65.42207999999999 48.14785, 46.63570...  
    1  POLYGON ((46.635708 54.10819, 46.635708 51.844... 
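
    Note that the result above is still a plain pandas DataFrame whose polygon column holds shapely objects. As a minimal sketch of getting an actual GeoDataFrame, assuming the column names from the question (name, x_zone, y_zone) and EPSG:4326 as the CRS: the first option builds one GeoDataFrame from the collected polygons, and the second keeps the question's loop but combines the per-name pieces with pd.concat, since calling append on a Python list only ever grows the list. The variable names gdf, parts and gdf_concat are illustrative, not from the original post.

```python
import geopandas as gpd
import pandas as pd
from shapely.geometry import Polygon

# Same corner points as in the question, keyed by 'name'.
df = pd.DataFrame({
    "name": ["A1"] * 4 + ["A3"] * 4,
    "x_zone": [65.422080, 46.635708, 46.597984, 68.477700,
               46.635708, 46.635708, 63.309560, 62.215572],
    "y_zone": [48.147850, 51.165745, 47.657444, 44.073700,
               54.108190, 51.844770, 48.826878, 54.108190],
})

# Option 1: collect one polygon per name, then build a single GeoDataFrame.
names, geoms = [], []
for name, group in df.groupby("name"):
    names.append(name)
    geoms.append(Polygon(list(zip(group["x_zone"], group["y_zone"]))))

gdf = gpd.GeoDataFrame({"name": names}, geometry=geoms, crs="EPSG:4326")

# Option 2: keep one small GeoDataFrame per name (as in the question's loop),
# then combine the list of pieces with pd.concat.
parts = []
for name, group in df.groupby("name"):
    geom = Polygon(list(zip(group["x_zone"], group["y_zone"])))
    parts.append(
        gpd.GeoDataFrame({"name": [name]}, geometry=[geom], crs="EPSG:4326")
    )

gdf_concat = pd.concat(parts, ignore_index=True)

print(gdf)
print(type(gdf_concat))
```

    Either way the name column is kept next to the geometry, so the round trip through a shapefile mentioned in the question should not be needed just to recover it.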
    
