【Question title】: How to correctly calculate the haversine distance between 4 columns
【Posted】: 2022-10-21 04:24:47
【Question description】:

I am trying to calculate the haversine distance between two points whose coordinates are stored in 4 columns.

   cgi                longitude_bts  latitude_bts  longitude_poi  latitude_poi
0  510-11-32111-7131  95.335142      5.565253      95.337588      5.563713
1  510-11-32111-7135  95.335142      5.565253      95.337588      5.563713

Here is my code:

def haversine(lon1, lat1, lon2, lat2):
    """
    Calculate the great circle distance between two points 
    on the earth (specified in decimal degrees)
    """
    # convert decimal degrees to radians
    import numpy as np
    import math
    lon1, lat1, lon2, lat2 = map(np.radians, [lon1, lat1, lon2, lat2])
    # haversine formula 
    dlon = lon2 - lon1 
    dlat = lat2 - lat1 
    a = math.sin(dlat/2)**2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon/2)**2
    c = 2 * math.asin(math.sqrt(a)) 
    # Radius of earth in kilometers is 6371
    km = 6371008.799485213* c
    return km

ref_location_airport_hospital['radius'] = ref_location_airport_hospital.apply(lambda x: haversine(x['latitude_bts'], x['longitude_bts'], x['latitude_poi'], x['longitude_poi']), axis=1)

Here is the result:

   cgi                longitude_bts  latitude_bts  longitude_poi  latitude_poi  radius
0  510-11-32111-7131  95.335142      5.565253      95.337588      5.563713      272.441676
1  510-11-32111-7135  95.335142      5.565253      95.337588      5.563713      272.441676

The result doesn't make sense: the two points differ by less than 0.004 degrees, so the radius should be less than 1 km.

Note: 1 degree of latitude/longitude is about 111 km.
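
The 111 km rule of thumb can be turned into a quick sanity check. The sketch below is a flat-earth (equirectangular) approximation, not the haversine formula itself, and the helper name `rough_distance_km` is made up for illustration. It only gives a ballpark figure to compare a computed result against.

```python
import math

KM_PER_DEGREE = 111.0  # rule-of-thumb value from the note in the question

def rough_distance_km(lon1, lat1, lon2, lat2):
    """Ballpark distance in km between two nearby points (flat-earth approximation)."""
    mean_lat = math.radians((lat1 + lat2) / 2)
    # A degree of longitude shrinks by cos(latitude) away from the equator.
    dx = (lon2 - lon1) * KM_PER_DEGREE * math.cos(mean_lat)
    dy = (lat2 - lat1) * KM_PER_DEGREE
    return math.hypot(dx, dy)

# Sample row from the question (longitude first, then latitude).
estimate = rough_distance_km(95.335142, 5.565253, 95.337588, 5.563713)
print(round(estimate, 2))
```

For the sample row this lands at roughly a third of a kilometre, so a correct haversine result in km should be of that order.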

【Question comments】:

  • I think you are actually returning meters, not kilometers. Also, I usually see c = 2 * atan2(sqrt(a), sqrt(1-a)) rather than asin.
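
On the asin versus atan2 point: as far as I can tell, the two forms give the same central angle whenever the haversine term `a` lies in [0, 1], so that choice alone would not explain an implausible distance. A minimal sketch to check this (the function names are hypothetical):

```python
import math

def central_angle_asin(a):
    """Central angle in radians from the haversine term a, using asin."""
    return 2 * math.asin(math.sqrt(a))

def central_angle_atan2(a):
    """Central angle in radians from the haversine term a, using atan2."""
    return 2 * math.atan2(math.sqrt(a), math.sqrt(1 - a))

# Compare the two forms over a spread of valid haversine terms.
samples = (0.0, 1e-12, 0.25, 0.5, 0.999, 1.0)
agree = all(
    math.isclose(central_angle_asin(a), central_angle_atan2(a),
                 rel_tol=1e-9, abs_tol=1e-12)
    for a in samples
)
print(agree)
```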

Tags: python pandas haversine


【Solution 1】:

You are getting meters rather than kilometers. Try this:

import pandas as pd
import numpy as np

def haversine(lon1, lat1, lon2, lat2):
    lon1, lat1, lon2, lat2 = np.radians([lon1, lat1, lon2, lat2])
    dlon = lon2 - lon1
    dlat = lat2 - lat1

    haver_formula = np.sin(dlat/2)**2 + np.cos(lat1) * np.cos(lat2) * np.sin(dlon/2)**2

    r = 6371  # Earth radius in km; use 3958.756 for miles
    dist = 2 * r * np.arcsin(np.sqrt(haver_formula))
    return dist  # NumPy array; assigned positionally, so it also works when df has a non-default index

#provided data
df = pd.DataFrame({'cgi': {0: '510-11-32111-7131', 1: '510-11-32111-7135'}, 'longitude_bts': {0: 95.335142, 1: 95.335142}, 'latitude_bts': {0: 5.565253, 1: 5.565253}, 'longitude_poi': {0: 95.337588, 1: 95.337588}, 'latitude_poi': {0: 5.563713, 1: 5.563713}})

df['km'] = haversine(df['longitude_bts'], df['latitude_bts'], df['longitude_poi'], df['latitude_poi'])

#output
   cgi                longitude_bts  latitude_bts  longitude_poi  latitude_poi  km
0  510-11-32111-7131  95.335142      5.565253      95.337588      5.563713      0.320316
1  510-11-32111-7135  95.335142      5.565253      95.337588      5.563713      0.320316
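
One detail worth noting: the call in this answer passes longitude before latitude, matching the `(lon1, lat1, lon2, lat2)` signature, whereas the call in the question passed latitude first. The sketch below compares the two orders using a scalar helper. The name `haversine_m` and the metre radius 6371008.8 are assumptions for illustration, not part of the original answer.

```python
import math

EARTH_RADIUS_M = 6371008.8  # mean Earth radius in metres (assumed value)

def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in metres; arguments in decimal degrees."""
    lon1, lat1, lon2, lat2 = map(math.radians, (lon1, lat1, lon2, lat2))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

lon_bts, lat_bts = 95.335142, 5.565253
lon_poi, lat_poi = 95.337588, 5.563713

correct = haversine_m(lon_bts, lat_bts, lon_poi, lat_poi)  # longitude first
swapped = haversine_m(lat_bts, lon_bts, lat_poi, lon_poi)  # latitude first, as in the question

print(round(correct, 1), round(swapped, 1))
```

With the sample row, the longitude-first call gives about 320.3 m, in line with the 0.320316 km above. The latitude-first call gives about 272.4 m, which matches the 272.441676 reported in the question. So the question's figure appears to differ in both unit and argument order.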

【Comments】:
