【Question title】: 2 x-axes, 1 of index and 1 using timestamp points, same y
【Posted】: 2017-07-27 12:33:19
【Question】:

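I have a time series and I want 2 x-axes: one showing the data index and one showing the corresponding timestamps. One way seems to be to use plt.plot_date and then plot again against the index. However:

  1. There must be a better way to do this.
  2. The plot against dates does not seem to pick up all the data (you can see "gaps" below).

Red is the plot against dates, blue is the plot against the index.

  • Is there a way I can initially set the "date" xaxis equal to the "index" xaxis, and then change the tick labels to the corresponding dates?

  • Do you also know why I might be getting the gap behaviour when using plt.plot_date?

The timestamps are originally in str format: array(['2017-02-14 05:48:00', '2017-02-14 05:49:00', '2017-02-14 05:50:00', '2017-02-14 05:51:00', '2017-02-14 05:52:00', '2017-02-14 05:53:00',...]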

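import datetime as dt
import matplotlib.dates as mdates
import matplotlib.pyplot as plt

# dates_test: timestamp strings, errors: y values, xaxis: sample indices
x = [dt.datetime.strptime(d,'%Y-%m-%d %H:%M:%S') for d in dates_test]
x = [mdates.date2num(i) for i in x]
fig, ax = plt.subplots(nrows=3, figsize=(8, 6), sharex = True)

ax[0].plot(xaxis, errors)
ax4 = ax[0].twiny()
ax4.plot_date(x, errors, fmt="r-")
ax4.xaxis.set_major_formatter(mdates.DateFormatter('%d/%m \n%H:%M'))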

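【Comments】:

  • You could try to edit the ticks to the values you want; this example might be useful.
  • Also, the plot against dates does seem to pick up all the data; it just doesn't spread it as smoothly as the plot against the index, but bunches it around a few dates. Are you sure the timestamps are all equally spaced and correctly converted?
  • @berna1111 Thanks for the reply. i) I don't think the example mentioned helps in this case. ii) That's a good point. The timestamps are not equally spaced, but that is just how they are. That's why I want to use them as tick labels rather than as the actual tick positions.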
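
The spacing question raised in the comments can be checked directly before plotting. The following is a minimal sketch, not the asker's code: the sample `dates_test` list is made up, and the "1.5 times the median step" threshold is an assumption chosen only for illustration.

```python
from datetime import datetime

import numpy as np

# Hypothetical sample in the question's string format, with one large gap.
dates_test = [
    "2017-02-14 05:48:00",
    "2017-02-14 05:49:00",
    "2017-02-14 05:50:00",
    "2017-02-15 09:10:00",
    "2017-02-15 09:11:00",
]

parsed = [datetime.strptime(d, "%Y-%m-%d %H:%M:%S") for d in dates_test]

# Seconds between each pair of consecutive samples.
steps = np.array([(b - a).total_seconds() for a, b in zip(parsed, parsed[1:])])

# Treat the median step as the nominal sampling interval (an assumption).
nominal = float(np.median(steps))

# Index i in gap_after means there is a gap between sample i and sample i+1.
gap_after = np.flatnonzero(steps > 1.5 * nominal)

print("nominal step (s):", nominal)
print("gaps after indices:", gap_after.tolist())
```

If `gap_after` is non-empty, a plot against real dates will necessarily show empty stretches, which is the "gap" behaviour described in the question.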

Tags: python matplotlib plot jupyter-notebook


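【Solution 1】:

Because of the gaps between the timestamps, this was harder than it first seemed. Rather than converting the timestamps to floats with date2num, it is probably easier to treat them as str:

dates_test is the array of timestamps in str format.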

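import matplotlib.pyplot as plt
from matplotlib.ticker import MultipleLocator

# The post does not show how the locators were defined. A major step of 250
# is assumed here to match the label step below; the minor step is a guess.
majorLocator = MultipleLocator(250)
minorLocator = MultipleLocator(50)

fig, ax = plt.subplots(nrows=3, figsize=(8, 6), sharex = True)

ax[0].plot(xaxis, errors)
ax4 = ax[0].twiny()
ax4.plot(xaxis, errors, "r-")
ax4.xaxis.set_major_locator(majorLocator)
ax4.xaxis.set_minor_locator(minorLocator)

# Labels are assigned to the major ticks in order. The first label is
# duplicated, presumably because the first tick lies left of index 0.
majorDates = [dates_test[i] for i in range(0,len(dates_test),250)]
majorDates = [dates_test[0]] + majorDates
ax4.set_xticklabels(majorDates, rotation = 'vertical')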
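
A variation on the same idea, sketched here as an alternative rather than as the answerer's method: instead of assigning labels to ticks by position in a list, a `FuncFormatter` can look up the timestamp string for whatever index a tick lands on. The data below is made up, and `index_to_date` is a hypothetical helper name.

```python
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.ticker import FuncFormatter, MultipleLocator

# Hypothetical stand-ins for the question's data: 600 one-minute samples.
rng = np.random.default_rng(0)
dates_test = [f"2017-02-14 {5 + i // 60:02d}:{i % 60:02d}:00" for i in range(600)]
errors = rng.normal(0.025, 0.01, len(dates_test))
xaxis = np.arange(len(errors))


def index_to_date(value, pos=None):
    """Return the timestamp string for an index tick, or '' if out of range."""
    i = int(round(value))
    return dates_test[i] if 0 <= i < len(dates_test) else ""


fig, ax = plt.subplots(figsize=(8, 4))
ax.plot(xaxis, errors)

top = ax.twiny()                 # second x-axis sharing the same y-axis
top.set_xlim(ax.get_xlim())      # same limits, so index i lines up on both axes
top.xaxis.set_major_locator(MultipleLocator(250))
top.xaxis.set_major_formatter(FuncFormatter(index_to_date))
top.tick_params(axis="x", labelrotation=90)

fig.tight_layout()
```

Because the label is looked up from the tick position, ticks that fall outside the data simply get an empty label and no padding of the label list is needed. The limits are copied once, so they would need to be copied again if the bottom axis is rescaled later.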

【Comments】:

    【Solution 2】:

    After generating data similar to yours:

    import numpy as np
    import matplotlib.pyplot as plt
    plt.ioff()
    import matplotlib.dates as mdates
    import datetime as dt
    import time
    
    # --- Build date blocks ---
    now = time.time() # first time
    delta_time = 60 # distance between points in seconds
    
    n_jumps = 10 # number of pieces with no data
    avg_jump_size = 60*60*24 # 86400 s = 1 day
    jumps = abs(np.random.normal(avg_jump_size/2.,
                                 avg_jump_size/2,
                                 n_jumps+1)) + avg_jump_size/2.
    # `abs` just to make sure the jump is positive, and ` + avg_jump_size/2.` to
    # make sure it's larger than the time step.
    
    avg_n_points_per_block = 2*60*60/delta_time # 2 hours of acquisition per block
    blocks_sizes = abs(np.random.normal(avg_n_points_per_block/2.,
                                        avg_n_points_per_block/2.,
                                        n_jumps+1)) + avg_n_points_per_block/2.
    
    times = np.array([]) # array to place all dates
    for i in range(n_jumps):
        block = np.arange(now, now+delta_time*blocks_sizes[i], delta_time)
        times = np.concatenate((times, block))
        now += jumps[i]
    # last block
    block = np.arange(now, now+delta_time*blocks_sizes[-1], delta_time)
    times = np.concatenate((times, block))
    
    def time2mdate_str(number=None):
        """
        Convert a time given by `time.time()` (seconds since the Unix
        epoch, 1970-01-01 00:00:00) to a `datetime` instance.
        """
        if number is None:
            number = time.time()
        # `datetime.fromtimestamp` converts directly (to local time) and
        # does not depend on the epoch Matplotlib uses for date numbers.
        return dt.datetime.fromtimestamp(number)
    
    # list of time strings:
    dates_test = [time2mdate_str(t).strftime("%Y-%m-%d %H:%M:%S") for t in times]
    # some random data:
    errors = np.random.normal(0.025, 0.01, len(times))
    xaxis = np.arange(len(errors)) # index of each sample
    
    # Original code:
    x = [dt.datetime.strptime(d,'%Y-%m-%d %H:%M:%S') for d in dates_test]
    x = [mdates.date2num(i) for i in x]
    fig, ax = plt.subplots(nrows=3, figsize=(8, 6), sharex = True)
    
    ax[0].plot(xaxis, errors)
    ax4 = ax[0].twiny()
    ax4.plot_date(x, errors, fmt="r-")
    ax4.xaxis.set_major_formatter(mdates.DateFormatter('%d/%m \n%H:%M'))
    
    fig.tight_layout()
    fig.show()
    

    My first thought was to separate each block of data (you could still subtract the initial hour so that the first x value is 00:00):

    # break into blocks:
    x = np.array(x)
    deltas = x[1:] - x[:-1]
    # assume there's no break right after the first value and
    # find where the difference between consecutive times is larger
    # than 1.1*deltas[0] (margin of 1.1* for float comparison)
    break_indexes = np.where(deltas > deltas[0]*1.1)[0]+1
    # add borders (will be useful for iterating over the list):
    break_indexes = np.concatenate(([0],break_indexes,[-1]))
    n_jumps = len(break_indexes) - 1
    # offset to make sure each line does not overlap with another:
    offset = 1.5*np.max(errors)
    
    fig2, ax2 = plt.subplots(figsize=(8, 6))
    for i in range(n_jumps):
        i_0 = break_indexes[i]
        i_1 = break_indexes[i+1]
        if i_1 == -1:
            i_1 = None  # -1 marks the end; None keeps the last sample in the slice
        slice_ = slice(i_0, i_1)
        ax2.plot(x[slice_]-x[i_0]+x[0], errors[slice_]+offset*i, label=dates_test[i_0])
    ax2.xaxis.set_major_formatter(mdates.DateFormatter('%H:%M'))
    ax2.set_xlim(x[0])
    ax2.legend()
    
    fig2.tight_layout()
    fig2.show()
    

    Then I thought you might find it useful to use the start of each time block as a tick, although this leads to overlapping labels:

    # use blocks as ticks:
    fig3, ax3 = plt.subplots(nrows=3, figsize=(8, 6), sharex = True)
    ax3[0].plot(xaxis, errors)
    ax3 = ax3[0].twiny()
    ax3.plot_date(x, errors, fmt="r-")
    ax3.set_xticks(x[break_indexes][:-1])
    ax3.xaxis.set_major_formatter(mdates.DateFormatter('%d/%m \n%H:%M'))
    
    fig3.tight_layout()
    fig3.show()
    

    Finally, I thought you might want to remove the gaps but keep the labels in the right places:

    # use blocks as ticks and eliminate the gaps:
    # (could have used the inverse of this to create those gaps)
    x_without_gaps = x.copy()
    delta = x[1] - x[0]
    for i in range(1,n_jumps):
        i0 = break_indexes[i]
        i1 = break_indexes[i+1]
        if i1 == -1:
            i1 = None
        x_without_gaps[i0:i1] -= x_without_gaps[i0] - x_without_gaps[i0-1] - delta
    
    #x_without_gaps += x[0]
    fig4, ax4 = plt.subplots(nrows=3, figsize=(8, 6), sharex = True)
    ax4[0].plot(xaxis, errors)
    ax4[0].set_xlim(0,len(errors)-1)
    ax5 = ax4[0].twiny()
    ax5.plot_date(x_without_gaps, errors, fmt="r-")
    ax5.set_xticks(x_without_gaps[break_indexes][:-1])
    ax5.set_xticklabels([date.strftime('%d/%m \n%H:%M') for date in 
                     mdates.num2date(x[break_indexes][:-1])])
    # the following line would clear the values placed!
    #ax5.xaxis.set_major_formatter(mdates.DateFormatter('%d/%m \n%H:%M'))
    fig4.tight_layout()
    fig4.show()
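
The gap-removal loop above can also be written without an explicit loop. This is a sketch under the same assumptions as the answer (sorted times, no gap right after the first value, a tolerance of 1.1 on the first step); `close_gaps` is a hypothetical helper name, not part of the original answer.

```python
import numpy as np


def close_gaps(x, tolerance=1.1):
    """Return (x_without_gaps, block_starts) for a sorted 1-D array x.

    A gap is any step larger than tolerance * (first step). Each gap is
    shrunk to exactly one nominal step, as in the loop version.
    """
    x = np.asarray(x, dtype=float)
    steps = np.diff(x)
    nominal = steps[0]                       # assumes no gap right after x[0]
    is_gap = steps > tolerance * nominal
    # Excess time at each gap, accumulated over every later sample.
    excess = np.where(is_gap, steps - nominal, 0.0)
    shift = np.concatenate(([0.0], np.cumsum(excess)))
    # Index of the first sample of each block (the first block starts at 0).
    block_starts = np.concatenate(([0], np.flatnonzero(is_gap) + 1))
    return x - shift, block_starts


# Small made-up example: three blocks separated by two gaps.
x = np.array([0.0, 1.0, 2.0, 10.0, 11.0, 50.0, 51.0])
compact, starts = close_gaps(x)
print(compact)
print(starts)
```

With these two arrays, the tick positions would be `compact[starts]` and the tick labels would be formatted from the original times `x[starts]`, which mirrors what the last example does with `x_without_gaps` and `break_indexes`.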
    

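    Old result, with the wrong tick labels (the data is random on every run, so the labels in the new example will obviously differ, but I didn't want to re-upload every image; trust me, or try it yourself and verify =op): Note that you can't really see the blue line behind the red one, so the match seems good, as long as you set the xlim to ensure it.

    【Comments】:

    • There is an error in how the tick labels are placed in the last example; I will edit as soon as it is fixed (or, if anyone wants to, be my guest).
    • Thanks for all of this. If going down the date2num route, this does look like the right way. But I was just thinking that, because of the gaps, it is easier to treat the dates as strings.
    • However, if you try to do any spectral or time-dependent analysis, you will probably need to use numbers; it is up to you whether to use the data as one set or each block separately. The last example seems to be "a way that I can initially set the "date" xaxis equal to the "index" xaxis, and then change the tick labels to the corresponding dates?", although perhaps a bit overcomplicated.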