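【Posted】: 2016-03-24 18:31:03
【Question】:
I wrote some code that finds the distance between GPS coordinates for machines that share the same serial number,
but I believe it would be more efficient if it could be simplified to use iterrows or df.apply; however, I can't seem to figure out how.
Since I only need to run the function when ser_no[i] == ser_no[i+1], and to insert a NaN value where ser_no changes, I can't see how to apply pandas methods to make the code more efficient. I have looked at: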
- Vectorised Haversine formula with a pandas dataframe
- Python function to calculate distance using haversine formula in pandas
- Vectorizing a function in pandas
Unfortunately, even after reading these posts, I can't see the leap I need to make.
What I have:
import numpy as np

def haversine(lat1, long1, lat2, long2):
    r = 6371  # radius of Earth in km
    # convert decimal degrees to radians
    lat1, long1, lat2, long2 = map(np.radians, [lat1, long1, lat2, long2])
    # haversine formula
    lat = lat2 - lat1
    lon = long2 - long1
    a = np.sin(lat/2)**2 + np.cos(lat1)*np.cos(lat2)*np.sin(lon/2)**2
    c = 2*np.arcsin(np.sqrt(a))
    d = r*c
    return d
# pre-allocate vector
hdist = np.zeros(len(mttt_pings.index), dtype=float)
# haversine loop calculation
for i in range(0, len(mttt_pings.index) - 1):
    # when the ser_no from i and i + 1 are the same, calculate the distance
    # between them using the haversine formula and put the distance in the
    # i + 1 location
    if mttt_pings.ser_no.loc[i] == mttt_pings.ser_no.loc[i + 1]:
        hdist[i + 1] = haversine(mttt_pings.EQP_GPS_SPEC_LAT_CORD.loc[i],
                                 mttt_pings.EQP_GPS_SPEC_LONG_CORD.loc[i],
                                 mttt_pings.EQP_GPS_SPEC_LAT_CORD.loc[i + 1],
                                 mttt_pings.EQP_GPS_SPEC_LONG_CORD.loc[i + 1])
    else:
        # when ser_no i and i + 1 are not the same, put NaN in the i + 1
        # location; assigning in place keeps hdist the same length as the
        # frame, whereas np.insert would grow the array and shift later values
        hdist[i + 1] = np.nan
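For reference, here is a rough sketch of how the rule described above (compute a distance only when ser_no[i] == ser_no[i+1], NaN where ser_no changes) might be written without an explicit loop. It relies on `Series.shift` to line each row up with the one before it and on `Series.where` to blank out rows where the serial number changed. The sample frame is made up purely for illustration; only the column names come from the code above, and it assumes the rows are already ordered so that pings from the same machine are adjacent.

```python
import numpy as np
import pandas as pd


def haversine(lat1, long1, lat2, long2):
    r = 6371  # radius of Earth in km
    # convert decimal degrees to radians (works on scalars, arrays and Series)
    lat1, long1, lat2, long2 = map(np.radians, [lat1, long1, lat2, long2])
    dlat = lat2 - lat1
    dlon = long2 - long1
    a = np.sin(dlat / 2) ** 2 + np.cos(lat1) * np.cos(lat2) * np.sin(dlon / 2) ** 2
    return 2 * r * np.arcsin(np.sqrt(a))


# Hypothetical sample data: the values are invented, only the column
# names are taken from the question.
mttt_pings = pd.DataFrame({
    "ser_no": ["A", "A", "A", "B", "B"],
    "EQP_GPS_SPEC_LAT_CORD": [0.0, 0.0, 1.0, 10.0, 10.0],
    "EQP_GPS_SPEC_LONG_CORD": [0.0, 1.0, 1.0, 20.0, 20.0],
})

lat = mttt_pings["EQP_GPS_SPEC_LAT_CORD"]
lon = mttt_pings["EQP_GPS_SPEC_LONG_CORD"]

# True where a row has the same ser_no as the row directly before it;
# the first row compares against NaN and is therefore False.
same_machine = mttt_pings["ser_no"] == mttt_pings["ser_no"].shift()

# Distance from the previous row to the current row, for every row at once,
# then keep it only where the serial number did not change (NaN elsewhere).
dist = haversine(lat.shift(), lon.shift(), lat, lon)
mttt_pings["hdist"] = dist.where(same_machine)

print(mttt_pings)
```

One difference from the loop: the first row ends up as NaN here rather than 0, since it has no previous ping to measure from. If the rows are not sorted by machine, sorting by ser_no (and by a timestamp column, if there is one) beforehand would probably be needed.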
【Discussion】:
-
Can you post a sample of your data?