【Question Title】: Compute unique axis labels according to gps data in pandas DataFrame
【Posted】: 2014-09-07 13:49:52
【Question】:

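I have a file containing gps coordinates and a scalar value measured from a bus following some route. I want to plot distance travelled on the x-axis against the scalar value on the y-axis. I want the x-axis, measured in kilometres, to be labelled with the bus stops I am interested in.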

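To illustrate the problem, here is some code that generates data for an MWE of a similar problem, where a bus travels on a triangular route with corners ABC at xy coordinates [0,0] [0,1] [1,0]. The bus travels in a loop A-->B-->C-->A.. etc. The scalar value is x+y, i.e. the sum of the coordinate positions.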

import pandas as pd
import numpy as np
import datetime as dt
import matplotlib.pyplot as plt 

n = 30
L = 1.0

def generate_route(num_loops):
    """Generates x,y coordinates and scalar value x+y for bus travelling A->B->C num_loop times."""
    def get_perturb():
        return float(np.random.rand(1)[0] * 0.5*(L/n))       
    x = []
    y = []
    for l in range(num_loops):
        # A to B
        x += [0.0] * n
        y += [i*L/float(n) for i in range(0,n)]
        # B to C
        x += [i*L/float(n) for i in range(0,n)]
        y += [1.0-i*L/float(n) for i in range(0,n)]
        # C to A
        x += [1.0 - i*L/float(n) for i in range(0,n)]
        y += [0.0] * n
    # list comprehensions (not map) so the results are real lists on Python 3
    x = [i + get_perturb() for i in x]
    y = [i + get_perturb() for i in y]
    val = [xi + yi for xi, yi in zip(x, y)]
    return [x, y, val]

x, y, val = generate_route(3)

# put data into DataFrame
d = {'x':x,'y':y,'val':val}
df = pd.DataFrame(d, index = pd.date_range(dt.datetime.today().replace(microsecond=0),periods=len(x),freq='1s'))

# plot route
plt.figure()
df.plot(x='x', y='y',xlim=[-0.1,1.1],ylim=[-0.1,1.1])
ax = plt.gca()
ax.set_title('Route: x vs y')

# plot bus stops
stops = {'A':[0.,0.], 'B':[0.,1.], 'C':[1.,0.]}
ax.plot(stops['A'][0], stops['A'][1], 'r.', markersize=20)
ax.plot(stops['B'][0], stops['B'][1], 'g.', markersize=20)
ax.plot(stops['C'][0], stops['C'][1], 'y.', markersize=20)
# plt.savefig('route.png')

# compute distance travelled as sum of line segments connecting adjacent readings
df = pd.concat([df,df[['x','y']].rename(columns={'x':'x_prev','y':'y_prev'},copy=True).shift(1)],axis=1).dropna()
df['Dist'] = np.sqrt((df['x']-df['x_prev'])**2 + (df['y']-df['y_prev'])**2)
df['TotalDist'] = df['Dist'].cumsum(0)

# plot value with distance
plt.figure()
df.plot(x='TotalDist', y='val')
ax = plt.gca()
ax.set_title('TotalDist vs val')
# plt.savefig('totaldistvsval.png')
plt.show()
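
As a side note, the distance step above could probably also be written without the shifted `x_prev`/`y_prev` columns, using `diff` and `np.hypot`. This is only a sketch: `df_demo` is a hypothetical stand-in for the x/y readings, not the bus data itself. Unlike the `dropna()` version, it keeps the first reading and gives it a distance of 0.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the x/y readings: a 3-4-5 triangle path.
df_demo = pd.DataFrame({'x': [0.0, 3.0, 3.0, 0.0],
                        'y': [0.0, 0.0, 4.0, 0.0]})

# Length of each segment between adjacent readings. The first row has no
# predecessor, so its NaN is replaced by 0.
seg = np.hypot(df_demo['x'].diff(), df_demo['y'].diff()).fillna(0.0)
df_demo['Dist'] = seg
df_demo['TotalDist'] = seg.cumsum()
```

For real gps data the straight-line formula would presumably need to be replaced by a great-circle distance, which is outside what this MWE covers.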

Output figures:

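Now I am stuck. I want to add bus stop labels along the x-axis of the TotalDist vs val plot, e.g. labelled vertical lines or similar. I have the following code to label each row of the DataFrame with the stop it is close to. One problem is that many rows will match each stop, so I need to pick just one row from each group, and then use it to add a label to the x-axis.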

def label_stops(row):
    def close(p,q):
        return np.sqrt((p[0]-q[0])**2+(p[1]-q[1])**2) < 3.0*(L/n)
    res = 'None'
    for name, loc in stops.items():
        if close([row['x'], row['y']], loc):
            res = name
    return res
df['label'] = df.apply(label_stops, axis = 1)
df
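
The row-by-row `apply` above could also be sketched in vectorized form by computing the distance from every reading to every stop at once. The `pts` frame is hypothetical illustration data. The sketch assigns the nearest stop within the same `3.0*(L/n)` threshold, which should match `label_stops` as long as no reading is within the threshold of two stops at once.

```python
import numpy as np
import pandas as pd

L, n = 1.0, 30
stops = {'A': [0., 0.], 'B': [0., 1.], 'C': [1., 0.]}

# Hypothetical readings: near A, between stops, near B, near C.
pts = pd.DataFrame({'x': [0.01, 0.0, 0.02, 0.98],
                    'y': [0.02, 0.5, 0.97, 0.01]})

names = list(stops)
locs = np.array([stops[k] for k in names])  # shape (n_stops, 2)

# Distance from every reading to every stop, shape (n_rows, n_stops).
d = np.hypot(pts['x'].to_numpy()[:, None] - locs[:, 0],
             pts['y'].to_numpy()[:, None] - locs[:, 1])

nearest = d.argmin(axis=1)               # index of the closest stop per row
is_close = d.min(axis=1) < 3.0 * (L / n)  # same threshold as close()
pts['label'] = np.where(is_close, np.array(names)[nearest], 'None')
```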

This gives the following, which is some progress:

                          val         x         y    x_prev    y_prev      Dist  TotalDist label
2014-09-07 14:57:17  0.046516  0.008194  0.038322  0.014114  0.001992  0.036809   0.036809     A
2014-09-07 14:57:18  0.084732  0.014400  0.070333  0.008194  0.038322  0.032607   0.069416     A
2014-09-07 14:57:19  0.122984  0.013296  0.109688  0.014400  0.070333  0.039370   0.108786  None
2014-09-07 14:57:20  0.154545  0.005306  0.149240  0.013296  0.109688  0.040351   0.149137  None
...                       ...       ...       ...       ...       ...       ...        ...   ...
2014-09-07 14:57:42  0.882114  0.007021  0.875094  0.009029  0.839339  0.035811   0.888190  None
2014-09-07 14:57:43  0.923723  0.015505  0.908218  0.007021  0.875094  0.034194   0.922383     B
2014-09-07 14:57:44  0.952783  0.014462  0.938320  0.015505  0.908218  0.030121   0.952504     B
2014-09-07 14:57:45  0.985179  0.009943  0.975237  0.014462  0.938320  0.037192   0.989696     B
2014-09-07 14:57:46  1.010307  0.007226  1.003080  0.009943  0.975237  0.027976   1.017672     B
...                       ...       ...       ...       ...       ...       ...        ...   ...
2014-09-07 15:01:16  1.011478  1.001009  0.010469  0.970733  0.042690  0.044214   9.412063     C
2014-09-07 15:01:17  0.968017  0.967922  0.000095  1.001009  0.010469  0.034676   9.446738     C
2014-09-07 15:01:19  0.921621  0.907178  0.014444  0.934321  0.008302  0.027829   9.509157     C
2014-09-07 15:01:20  0.876492  0.875172  0.001320  0.907178  0.014444  0.034592   9.543749  None
2014-09-07 15:01:21  0.862456  0.846593  0.015863  0.875172  0.001320  0.032066   9.575814  None
...                       ...       ...       ...       ...       ...       ...        ...   ...

【Comments】:

    Tags: python matplotlib pandas


    【Solution 1】:

    I came up with the following, which works fine but is probably not idiomatic.

    # loop over rows with label column not equal to 'None',
    # collect consecutive rows with same location into a
    # list (grp) and return list of those lists (groups)
    groups = []
    grp = []
    df1 = df[df['label'] != 'None']
    prev = df1.iloc[0]['label']
    for r in df1.iterrows():
        curr = r[1]['label']
        if curr == prev:
            grp.append(r[1])
        else:
            groups.append(grp)
            grp = [r[1]]  # start the new group with the current row
            prev = curr
    groups.append(grp)  # keep the final group as well
    
    # for each group, take the middle row as the stop's position along the route
    loclines = []
    for g in groups:
        mids = g[len(g)//2]  # integer division so the index is an int on Python 3
        loclines.append([mids['TotalDist'], mids['label']])
    
    # mark stops on plot as coloured vertical lines
    plt.figure()
    df.plot(x='TotalDist', y='val')
    ax = plt.gca()
    ax.set_title('TotalDist vs val')
    colors = {'A': 'r', 'B': 'g', 'C': 'y'}
    for dist, name in loclines:
        plt.axvline(x=dist, color=colors[name])
    
    plt.show()
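
A more pandas-flavoured version of the grouping loop might look like the sketch below. It filters out the 'None' rows first, as the loop does, then numbers each run of consecutive identical labels with `shift` and `cumsum`, and takes the middle row of each run. The `demo` frame is hypothetical data standing in for `df` after `label_stops` has been applied.

```python
import pandas as pd

# Hypothetical labelled frame standing in for df after label_stops.
demo = pd.DataFrame({
    'TotalDist': [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8],
    'label': ['A', 'A', 'None', 'B', 'B', 'B', 'None', 'C', 'A'],
})

# Keep only readings that are close to some stop.
near = demo[demo['label'] != 'None']

# A new run starts wherever the label differs from the previous kept row.
run_id = (near['label'] != near['label'].shift()).cumsum()

# Middle row of each run gives one (distance, stop name) pair per visit.
mids = [g.iloc[len(g) // 2] for _, g in near.groupby(run_id)]
loclines = [(m['TotalDist'], m['label']) for m in mids]
```

Each pair in `loclines` can then be passed to `plt.axvline` as above, or the distances and names can be given to `ax.set_xticks` and `ax.set_xticklabels` to put the stop names directly on the x-axis.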
    

    Resulting plot:
