【Question title】: Time series analysis in Python using conditions
【Posted】: 2020-07-21 23:53:09
【Question】:

I have the following data (sample):

Symbol Sections      iBid     Bid                Date
0    O.U20       O1  99.73167  99.730 2020-06-29 16:32:25
1    O.Z20       O1  99.70250  99.700 2020-06-29 16:32:25
2    O.H21       O1       NaN  99.795 2020-06-29 16:32:25
3    O.M21       O1  99.81167  99.810 2020-06-29 16:32:25
4    O.U21       O2  99.81667  99.815 2020-06-29 16:32:25
5    O.Z21       O2       NaN  99.795 2020-06-29 16:32:25
6    O.H22       O2  99.81000  99.810 2020-06-29 16:32:25
7    O.M22       O2  99.79500  99.795 2020-06-29 16:32:25
16  F3.U26       F3       NaN   1.000 2020-06-29 16:32:25
17  F3.Z26       F3       NaN  -3.000 2020-06-29 16:32:25
18  F3.H27       F3       NaN  -1.000 2020-06-29 16:32:25
19  F6.H26       F6  -1.75000     NaN 2020-06-29 16:32:25
20  F6.M26       F6  -4.50000     NaN 2020-06-29 16:32:25
21  F6.U26       F6  -5.50000     NaN 2020-06-29 16:32:25
22  F9.U20       F9  -8.50000  -9.000 2020-06-29 16:32:25
23   O.U20       O3  99.73167  99.730 2020-06-29 16:32:26
24   O.Z20       O3  99.70250  99.700 2020-06-29 16:32:26
25   O.H21       O3       NaN  99.795 2020-06-29 16:32:26
26   O.M21       O3  99.81167  99.810 2020-06-29 16:32:26
27   O.U21       O4  99.81667  99.815 2020-06-29 16:32:26
28   O.Z21       O4       NaN  99.795 2020-06-29 16:32:26
29   O.H22       O4  99.81000  99.810 2020-06-29 16:32:26
30   O.M22       O4  99.79500  99.795 2020-06-29 16:32:26

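What I want to do is draw a scatter plot, a line chart, or whatever chart suits this kind of analysis, so that I can examine the trend over time when a condition is met. For example, I would like to see, for each symbol (O, S, F) and each section (O1, F3, etc.), how many times iBid is higher than Bid over time.

I know I need to show some work, but I am not sure whether such a chart is even possible. So far I have only been able to split the data by symbol: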

df_O = df[df['Contract'].str.contains('O')]

and to filter the rows where the condition holds:

IbidgreaterBid = big_frame[(big_frame.iBid > big_frame.Bid)]

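Is it possible to get a chart that shows when iBid > Bid, with the Date column as the x-axis? (The Date column has thousands of rows that differ only by seconds.)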

【Question comments】:

    Tags: python python-3.x matplotlib seaborn


    【Solution 1】:

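    It is not clear what you mean by "a chart that shows when iBid > Bid". However, I can suggest one way to plot it: draw one scatter panel per section, color the data points red where iBid > Bid, and color them blue otherwise. Also, since the timestamps differ only by seconds, I used the mdates date formatter so that the x-ticks show only H:M:S.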

    import math
    from datetime import timedelta

    import matplotlib.dates as mdates
    import matplotlib.pyplot as plt
    import numpy as np
    import pandas as pd
    from matplotlib.offsetbox import AnchoredText

    # Named 'seaborn-whitegrid' before matplotlib 3.6; the old name was removed in 3.8.
    plt.style.use('seaborn-v0_8-whitegrid')

    # Assumption: df is the question's frame; build 'Datetime' from its 'Date' column.
    df['Datetime'] = pd.to_datetime(df['Date'])

    n_sections = df['Sections'].nunique()
    cols = 2
    rows = math.ceil(n_sections / cols)  # round up so every section gets a panel
    # squeeze=False keeps ax two-dimensional even when rows == 1
    fig, ax = plt.subplots(rows, cols, figsize=(16, 8), sharex=False, sharey=False, squeeze=False)
    row = 0  # current grid row
    col = 0  # current grid column

    for name, section in df.groupby('Sections'):
        axis = ax[row][col]
        above = section['iBid'] > section['Bid']  # comparisons with NaN are False
        # All points in blue, then the points with iBid > Bid drawn over them in red.
        axis.scatter(np.array(section['Datetime']), section['iBid'], color='blue')
        axis.scatter(np.array(section['Datetime'][above]), section['iBid'][above], color='red')
        axis.set_xlim([section['Datetime'].min() - timedelta(seconds=5), section['Datetime'].max() + timedelta(seconds=5)])
        axis.set_xlabel('Date Time', fontsize=20)
        axis.set_ylabel('iBid', fontsize=20)
        axis.add_artist(AnchoredText(str(name), loc=4, prop=dict(size=20)))  # section label, lower right

        axis.xaxis.set_major_formatter(mdates.DateFormatter('%H:%M:%S'))  # show only H:M:S
        axis.tick_params(axis='both', direction='in', which='major', length=5, width=2, labelsize=16)

        # Fill the grid column by column.
        row = row + 1
        if row == rows:
            row = 0
            col = col + 1

    plt.show()
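
The question also asks how many times iBid is higher than Bid over time, per symbol and per section. The plot only marks those points, so here is a minimal sketch of how the counts could be tabulated with pandas. It is not part of the original answer. The small frame below is a hypothetical subset of the question's sample, and the `Root` and `iBid_gt_Bid` columns, as well as the assumption that the leading letters of `Symbol` identify the symbol family, are introduced only for illustration.

```python
import pandas as pd

# Hypothetical subset of the question's sample data.
df = pd.DataFrame({
    "Symbol":   ["O.U20", "O.Z20", "O.H21", "F9.U20", "O.U20", "O.Z20"],
    "Sections": ["O1", "O1", "O1", "F9", "O3", "O3"],
    "iBid":     [99.73167, 99.70250, None, -8.5, 99.73167, 99.70250],
    "Bid":      [99.730, 99.700, 99.795, -9.0, 99.730, 99.700],
    "Date":     ["2020-06-29 16:32:25"] * 4 + ["2020-06-29 16:32:26"] * 2,
})
df["Date"] = pd.to_datetime(df["Date"])

# Assumed convention: the leading letters name the symbol family,
# e.g. 'O.U20' -> 'O' and 'F9.U20' -> 'F'.
df["Root"] = df["Symbol"].str.extract(r"^([A-Za-z]+)", expand=False)

# True where iBid exceeds Bid; rows with a NaN on either side count as False.
df["iBid_gt_Bid"] = df["iBid"] > df["Bid"]

# Rows with iBid > Bid per section, bucketed into one-second intervals.
counts = (
    df.groupby(["Sections", pd.Grouper(key="Date", freq="1s")])["iBid_gt_Bid"]
      .sum()
      .unstack("Sections", fill_value=0)
)

# Running total over time, one column per section.
cumulative = counts.cumsum()

# Overall count per symbol family.
per_root = df.groupby("Root")["iBid_gt_Bid"].sum()

print(counts)
print(cumulative)
print(per_root)
```

With matplotlib installed, `cumulative.plot()` should draw one line per section with the time on the x-axis. A wider `freq` in `pd.Grouper` (for example `"1min"`) would probably be more readable when there are thousands of rows.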
    

    【Comments】:
