【Question title】: Combining two dataframes using buffer in geopandas
【Posted】: 2021-09-08 02:41:05
【Question description】:

I have two dataframes (not the exact data, but similar). df1:

Lon Lat Timestamp
4.44 61.41 2021-04-28 00:00:00
4.48 62.45 2021-04-28 00:02:00
4.51 61.48 2021-04-28 00:06:00
4.47 62.46 2021-04-28 00:08:00
4.44 61.41 2021-04-28 00:10:00
4.40 62.48 2021-04-28 00:12:00
4.51 61.44 2021-04-28 00:16:00
4.47 62.49 2021-04-28 00:18:00

df2:

Lon Lat Timestamp
4.34 61.41 2021-04-28 00:00:00
4.38 62.45 2021-04-28 00:02:00
4.31 61.48 2021-04-28 00:06:00
4.17 62.46 2021-04-28 00:08:00
4.34 61.41 2021-04-28 00:10:00
4.30 62.48 2021-04-28 00:12:00
4.21 61.44 2021-04-28 00:16:00
4.47 62.49 2021-04-28 00:18:00

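There are other columns, but my question only concerns these. For each observation in df1, I want to combine the two dataframes per minute, matching the df2 observations that lie within a 100 m radius.

I have done something similar with a single dataframe: for each observation in the dataframe, I joined all observations within a 100 m radius.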

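import geopandas as gpd

# assumes df is a GeoDataFrame in a projected CRS, so buffer(100) means 100 m
for name, group in df.groupby("timestamp"):
    buf = group.copy()
    buf["geometry"] = buf.geometry.buffer(100)
    # geopandas 0.10+ uses predicate= here; older versions used op=
    points_within = gpd.sjoin(group, buf, predicate="within")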

I need to do something similar with two dataframes.

【Comments】:

    Tags: python pandas geopandas


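    【Solution 1】:
    • Not much in the sample dataset lies within 100 m. Increasing the distance means sjoin() returns more matches.
    • This uses GeoPandas CRS handling together with buffer(). Distances must be measured in a projected (UTM) geometry, so project to UTM, buffer, then project back to EPSG:4326.
    • The output dataframe is shown, and a plotly mapbox figure draws the points as markers and the buffers as a geojson layer.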
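As a quick check of the projection point above, before the full code, the sketch below buffers a single point in an estimated UTM CRS and compares the polygon's area with that of an ideal 100 m circle. The coordinate comes from the question's sample data. The variable names are illustrative only, and comparing by relative error is an assumption made to allow for the polygonal approximation of a circle.

```python
import math

import geopandas as gpd
from shapely.geometry import Point

RADIUS_M = 100  # buffer radius in metres (illustrative name)

# One point from the question's sample data, in lon/lat (EPSG:4326).
pts = gpd.GeoSeries([Point(4.47, 62.49)], crs="EPSG:4326")

# Project to an estimated UTM CRS so that buffer() works in metres.
pts_utm = pts.to_crs(pts.estimate_utm_crs())
buf_utm = pts_utm.buffer(RADIUS_M)

# buffer() returns a polygon approximating a circle, so its area should be
# close to (and slightly below) pi * r^2.
area_m2 = buf_utm.area.iloc[0]
ideal_m2 = math.pi * RADIUS_M**2
rel_error = abs(area_m2 - ideal_m2) / ideal_m2

# Project back to EPSG:4326 for joining or plotting against lon/lat data.
buf_wgs84 = buf_utm.to_crs("EPSG:4326")
```

Calling buffer(100) directly on EPSG:4326 geometries would treat 100 as degrees rather than metres, which is why the round trip through UTM is needed.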
    import geopandas as gpd
    import shapely, json
    import pandas as pd
    import plotly.express as px
    
    df1 = pd.DataFrame(
        {
            "Lon": [4.44, 4.48, 4.51, 4.47, 4.44, 4.4, 4.51, 4.47],
            "Lat": [61.41, 62.45, 61.48, 62.46, 61.41, 62.48, 61.44, 62.49],
            "Timestamp": [
                "2021-04-28 00:00:00",
                "2021-04-28 00:02:00",
                "2021-04-28 00:06:00",
                "2021-04-28 00:08:00",
                "2021-04-28 00:10:00",
                "2021-04-28 00:12:00",
                "2021-04-28 00:16:00",
                "2021-04-28 00:18:00",
            ],
        }
    )
    
    df2 = pd.DataFrame(
        {
            "Lon": [4.34, 4.38, 4.31, 4.17, 4.34, 4.3, 4.21, 4.47],
            "Lat": [61.41, 62.45, 61.48, 62.46, 61.41, 62.48, 61.44, 62.49],
            "Timestamp": [
                "2021-04-28 00:00:00",
                "2021-04-28 00:02:00",
                "2021-04-28 00:06:00",
                "2021-04-28 00:08:00",
                "2021-04-28 00:10:00",
                "2021-04-28 00:12:00",
                "2021-04-28 00:16:00",
                "2021-04-28 00:18:00",
            ],
        }
    )
    
    MIN_DIST = 10**2  # buffer radius in metres (100 m)
    
    # build point geometries from the Lon/Lat columns (WGS 84)
    gdf1 = gpd.GeoDataFrame(
        geometry=gpd.points_from_xy(df1["Lon"], df1["Lat"]), crs="EPSG:4326"
    )
    
    gdf2 = gpd.GeoDataFrame(
        geometry=gpd.points_from_xy(df2["Lon"], df2["Lat"]), crs="EPSG:4326"
    )
    
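    # Buffer df1's points by MIN_DIST metres. Distances are only meaningful in a
    # projected CRS, so project to UTM, buffer, then project back to EPSG:4326.
    # NB: from here on gdf1 is a GeoSeries of buffer polygons.
    gdf1 = (
        gdf1.to_crs(gdf1.estimate_utm_crs()).geometry.buffer(MIN_DIST).to_crs("EPSG:4326")
    )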
    
    # spatial join, then merge back: keep the df2 rows whose point falls inside a df1 buffer
    df2_in_df1 = df2.reset_index().merge(
        gpd.sjoin(gpd.GeoDataFrame(geometry=gdf1), gdf2, how="inner"),
        left_on="index",
        right_on="index_right",
    )
    
    
    # plot it to see what's been found
    fig = (
        px.scatter_mapbox(df1, lat="Lat", lon="Lon")
        .update_traces(marker={"color": "red", "opacity":.3})
        .add_traces(px.scatter_mapbox(df2, lat="Lat", lon="Lon").update_traces(marker={"color":"red", "opacity":.3}).data)
        .add_traces(px.scatter_mapbox(df2_in_df1, lat="Lat", lon="Lon").update_traces(marker={"color":"green", "size":10}).data)
    
        )
    
    fig.update_layout(
        mapbox={
            "style": "open-street-map",
            "layers": [
                {
                    "source": json.loads(gdf1.geometry.to_json()),
                    "below": "traces",
                    "type": "line",
                    "color": "purple",
                    "line": {"width": 1.5},
                }
            ],
        },
        margin={"l": 0, "r": 0, "t": 0, "b": 0},
    )
    
    index Lon Lat Timestamp index_right
    0 7 4.47 62.49 2021-04-28 00:18:00 7
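
The join above matches on location only. The question also asks for matches per minute, so the following is a minimal sketch of one way to add that constraint: run the same buffer and sjoin, then keep only the pairs whose timestamps fall in the same minute. The names `join_within_radius_per_minute`, `_to_points` and the `minute` column are hypothetical, and flooring timestamps to the minute is an assumed reading of "per minute". It is not part of the original answer.

```python
import geopandas as gpd
import pandas as pd


def _to_points(df):
    """Hypothetical helper: lon/lat rows -> point GeoDataFrame with a minute key."""
    gdf = gpd.GeoDataFrame(
        df.copy(),
        geometry=gpd.points_from_xy(df["Lon"], df["Lat"]),
        crs="EPSG:4326",
    )
    # Assumption: "per minute" means timestamps floored to the same minute.
    gdf["minute"] = pd.to_datetime(gdf["Timestamp"]).dt.floor("min")
    return gdf


def join_within_radius_per_minute(df1, df2, radius_m=100):
    """Pair df1 rows with df2 rows in the same minute and within radius_m metres."""
    g1, g2 = _to_points(df1), _to_points(df2)
    utm = g1.estimate_utm_crs()
    g1, g2 = g1.to_crs(utm), g2.to_crs(utm)
    # Replace df1's points with buffers measured in metres.
    g1["geometry"] = g1.geometry.buffer(radius_m)
    # The default predicate is "intersects"; columns present in both frames
    # get _left/_right suffixes.
    pairs = gpd.sjoin(g1, g2, how="inner")
    same_minute = pairs["minute_left"] == pairs["minute_right"]
    return pairs[same_minute]


# A subset of the question's sample rows.
df1 = pd.DataFrame({"Lon": [4.44, 4.47], "Lat": [61.41, 62.49],
                    "Timestamp": ["2021-04-28 00:00:00", "2021-04-28 00:18:00"]})
df2 = pd.DataFrame({"Lon": [4.34, 4.47], "Lat": [61.41, 62.49],
                    "Timestamp": ["2021-04-28 00:00:00", "2021-04-28 00:18:00"]})

result = join_within_radius_per_minute(df1, df2)
print(result[["Timestamp_left", "Timestamp_right", "Lon_right", "Lat_right"]])
```

Filtering after the join keeps the sketch short. For large inputs it may be cheaper to group both frames by minute first and run sjoin per group, as in the question's own loop.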

    【Comments】:
