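【Question title】: Modify code to capture values greater than - instead of exact match
【Posted】: 2016-08-20 08:31:04
【Question】:

The following code works well for identifying whether the values in subsequent rows hit or miss a target, and it produces output columns showing the time at which the condition was met.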

import datetime,numpy as np,pandas as pd;
nan = np.nan;

a = pd.DataFrame(  {'price': {datetime.time(9, 0): 1,   datetime.time(10, 0): 0,   datetime.time(11, 0): 3,   datetime.time(12, 0): 4,   datetime.time(13, 0): 7,   datetime.time(14, 0): 6,   datetime.time(15, 0): 5,   datetime.time(16, 0): 4,   datetime.time(17, 0): 0,   datetime.time(18, 0): 2,   datetime.time(19, 0): 4,   datetime.time(20, 0): 7},  'reversal': {datetime.time(9, 0): nan,   datetime.time(10, 0): nan,   datetime.time(11, 0): nan,   datetime.time(12, 0): nan,   datetime.time(13, 0): nan,
  datetime.time(14, 0): 6.0,   datetime.time(15, 0): nan,   datetime.time(16, 0): nan,   datetime.time(17, 0): nan,   datetime.time(18, 0): nan,   datetime.time(19, 0): nan,   datetime.time(20, 0): nan}});


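a['target_hit_time']=a['target_miss_time']=nan;
a['target1']=a['reversal']+1;
a['target2']=a['reversal']-a['reversal'];
a.sort_index(axis=1,inplace=True);  # sort the columns by name
# Object dtype so the two time columns can later hold datetime.time values
a = a.astype({'target_hit_time': object, 'target_miss_time': object});

hits = a.iloc[:,:-2].dropna();  # .ix was removed from pandas; .iloc selects by position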

for row,hit in hits.iterrows():

        forwardRows = [row]<a['price'].index.values

        targetHit = a.index.values[(hit['target1']==a['price'].values) & forwardRows][0];
        targetMiss = a.index.values[(hit['target2']==a['price'].values) & forwardRows][0];

        if targetHit>targetMiss:
            a.loc[row,"target_miss_time"] = targetMiss;
        else:
            a.loc[row,"target_hit_time"] = targetHit;


a

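This image shows the output of the code above, which can easily be reproduced by running it:

The problem I have is that when this code is used on real data, the price may not match exactly and/or may gap past a value. So if we look at the following image:

we see that the target1 condition would be met if we were looking for values >= 7.5 rather than only for the value 7.5. How can I modify the code to achieve this?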
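
To make the difference concrete, here is a minimal sketch of the two comparisons on a small array. The prices and the name `first_hit` are made up for illustration and are not part of the original script; the point is only that an exact-match mask never fires when the price gaps past the target, while a threshold mask does.

```python
import numpy as np

# Hypothetical prices after the reversal row; 7.5 never appears exactly.
prices = np.array([5.0, 4.0, 2.0, 2.0, 4.0, 8.0])
target1 = 7.5

exact = prices == target1        # exact match: the price gaps past 7.5, so never True
at_or_above = prices >= target1  # threshold match: True once the price reaches or passes 7.5

# Position of the first True in the threshold mask, or None if there is none.
first_hit = int(np.argmax(at_or_above)) if at_or_above.any() else None

print(exact.any())  # False
print(first_hit)    # 5
```

The explicit `any()` check matters because `np.argmax` returns 0 for an all-False mask, which would otherwise look like a hit on the first row.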

【Comments】:

    Tags: python pandas


    【Solution 1】:

    Just a few ifs, that's all :D...

    import datetime,numpy as np,pandas as pd;
    nan = np.nan;
    
    a = pd.DataFrame(  {'price': {datetime.time(9, 0): 1,   datetime.time(10, 0): 0,   datetime.time(11, 0): 3,   datetime.time(12, 0): 4,   datetime.time(13, 0): 7,   datetime.time(14, 0): 6,   datetime.time(15, 0): 5,   datetime.time(16, 0): 4,   datetime.time(17, 0): 2,   datetime.time(18, 0): 2,   datetime.time(19, 0): 4,   datetime.time(20, 0): 8},  'reversal': {datetime.time(9, 0): nan,   datetime.time(10, 0): nan,   datetime.time(11, 0): nan,   datetime.time(12, 0): nan,   datetime.time(13, 0): nan,
      datetime.time(14, 0): 6.0,   datetime.time(15, 0): nan,   datetime.time(16, 0): nan,   datetime.time(17, 0): nan,   datetime.time(18, 0): nan,   datetime.time(19, 0): nan,   datetime.time(20, 0): nan}});
    
    
    a['target_hit_time'] = a['target_miss_time'] = nan
    a['target1'] = a['reversal'] + 1
    a['target2'] = a['reversal'] - a['reversal']
    a.sort_index(axis=1, inplace=True)  # sort the columns by name
    # Object dtype so the two time columns can hold datetime.time values.
    a = a.astype({'target_hit_time': object, 'target_miss_time': object})
    
    # Rows that have a reversal: ignore the last two (time) columns, drop NaN rows.
    hits = a.iloc[:, :-2].dropna()
    
    for row, hit in hits.iterrows():
        # Only rows after the reversal row are candidates.
        forwardRows = a[a.index.values > row]
        # Hit: price at or above target1 (>= instead of an exact match).
        hitTimes = forwardRows.index[forwardRows['price'].values >= hit['target1']]
        # Miss: price exactly equal to target2, as in the original code.
        missTimes = forwardRows.index[forwardRows['price'].values == hit['target2']]
    
        # First matching time, or None when the target is never reached.
        targetHit = hitTimes[0] if len(hitTimes) else None
        targetMiss = missTimes[0] if len(missTimes) else None
    
        # Record whichever target is reached first; an unreached target never wins.
        if targetHit is not None and (targetMiss is None or targetHit <= targetMiss):
            a.loc[row, "target_hit_time"] = targetHit
        elif targetMiss is not None:
            a.loc[row, "target_miss_time"] = targetMiss
    
    
    print('#' * 50)
    print(a)
    '''
    ##################################################
              price  reversal  target1  target2 target_hit_time  target_miss_time
    09:00:00      1       NaN      NaN      NaN             NaN               NaN
    10:00:00      0       NaN      NaN      NaN             NaN               NaN
    11:00:00      3       NaN      NaN      NaN             NaN               NaN
    12:00:00      4       NaN      NaN      NaN             NaN               NaN
    13:00:00      7       NaN      NaN      NaN             NaN               NaN
    14:00:00      6       6.0      7.0      0.0        20:00:00               NaN
    15:00:00      5       NaN      NaN      NaN             NaN               NaN
    16:00:00      4       NaN      NaN      NaN             NaN               NaN
    17:00:00      2       NaN      NaN      NaN             NaN               NaN
    18:00:00      2       NaN      NaN      NaN             NaN               NaN
    19:00:00      4       NaN      NaN      NaN             NaN               NaN
    20:00:00      8       NaN      NaN      NaN             NaN               NaN
    '''
    

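    【Comments】:

    • I spent some time today going through your code and understanding it. Thank you for the effort, because it has been very useful for my learning. Please help me understand why we need a.sort_index(1,inplace=True);. I can see that we are sorting by columns, but why?
    • Before sorting, the columns are: "price reversal target_hit_time target_miss_time target1 target2"; afterwards they become: "price reversal target1 target2 target_hit_time target_miss_time". That's all you need to know... just take a try/except attitude, and only bother when something does not work or when you need more efficiency. Usually I guess or check what works like this: "a.sort_index(0,inplace=True); a.sort_index(1,inplace=True);" and then I keep one of them :D.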
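
The column reordering described in the reply above can be checked with a small sketch. The two-row frame below is a made-up stand-in, not the thread's data; it only assumes the columns are created in the same order as in the script. Sorting along axis 1 puts the two time columns last, which is what lets the later positional slice drop them before `dropna()`.

```python
import numpy as np
import pandas as pd

# Small stand-in frame with the columns in the order the script creates them.
df = pd.DataFrame({
    "price": [6, 8],
    "reversal": [6.0, np.nan],
    "target_hit_time": [np.nan, np.nan],
    "target_miss_time": [np.nan, np.nan],
    "target1": [7.0, np.nan],
    "target2": [0.0, np.nan],
})

before = list(df.columns)
df = df.sort_index(axis=1)  # axis=1 sorts the column labels, not the rows
after = list(df.columns)

print(before)
print(after)

# With the two time columns last, dropping them by position leaves only the
# columns that must be non-NaN for a row to count as a reversal.
hits = df.iloc[:, :-2].dropna()
print(list(hits.columns))
print(len(hits))
```

Without the sort, the slice would cut off target1 and target2 instead, and the still-empty time columns would make `dropna()` discard every row.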
    【Solution 2】:

    Without modifying the code much, this is what I came up with:

    import numpy as np
    
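    for row,hit in hits.iterrows():
            print ("row", row)
            print ("hit",hit)
    
            # Only rows after the reversal row are candidates.
            forwardRows = a[a.index.values > row]
    
            # First later row whose price is at or above target1.
            targetHit = forwardRows[(hit['target1'] <= forwardRows['price'].values)].head(1).index.values
    
            # First later row whose price is at or below target2.
            targetMiss = forwardRows[(hit['target2'] >= forwardRows['price'].values)].head(1).index.values
    
            # Both results are arrays (possibly empty), so this comparison is fragile.
            if targetHit>targetMiss:
                a.loc[row,"target_miss_time"] = targetMiss
            else:
                a.loc[row,"target_hit_time"] = targetHit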
    
              price  reversal  target1  target2 target_hit_time  target_miss_time
    09:00:00      1       NaN      NaN      NaN             NaN               NaN
    10:00:00      0       NaN      NaN      NaN             NaN               NaN
    11:00:00      3       NaN      NaN      NaN             NaN               NaN
    12:00:00      4       NaN      NaN      NaN             NaN               NaN
    13:00:00      7       NaN      NaN      NaN             NaN               NaN
    14:00:00      6       6.5      7.5      0.0      [20:00:00]               NaN
    15:00:00      5       NaN      NaN      NaN             NaN               NaN
    16:00:00      4       NaN      NaN      NaN             NaN               NaN
    17:00:00      2       NaN      NaN      NaN             NaN               NaN
    18:00:00      2       NaN      NaN      NaN             NaN               NaN
    19:00:00      4       NaN      NaN      NaN             NaN               NaN
    20:00:00      8       NaN      NaN      NaN             NaN               NaN
    

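    This still needs improvement: targetHit and targetMiss each come back as an array, so you need to check whether each array has any elements and, if both do, compare their first elements. As written, it only works when one of the arrays is empty.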
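
One possible way to close that gap is sketched below. It is an illustration under assumptions, not the answerer's code: the helper names `first_time` and `resolve` are hypothetical, the sample prices are the rows after 14:00 from the data above, and a tie is resolved as a hit to mirror the `else` branch of the loop.

```python
import datetime

import pandas as pd


def first_time(index, mask):
    """Return the first index label where mask is True, or None if there is none."""
    matches = index[mask]
    return matches[0] if len(matches) else None


def resolve(forward_prices, target1, target2):
    """Return ("hit" | "miss" | None, time) for one reversal row.

    forward_prices: Series of prices after the reversal row, indexed by time.
    """
    values = forward_prices.to_numpy()
    hit_time = first_time(forward_prices.index, values >= target1)
    miss_time = first_time(forward_prices.index, values <= target2)
    if hit_time is None and miss_time is None:
        return None, None
    # A hit wins when there is no miss, or when it happens no later than the miss.
    if miss_time is None or (hit_time is not None and hit_time <= miss_time):
        return "hit", hit_time
    return "miss", miss_time


times = [datetime.time(h, 0) for h in range(15, 21)]
forward = pd.Series([5, 4, 2, 2, 4, 8], index=times)

print(resolve(forward, 7.5, 0.0))   # only the hit target is reached
print(resolve(forward, 7.5, 2.0))   # the miss at 17:00 comes before the hit at 20:00
print(resolve(forward, 9.0, -1.0))  # neither target is reached
```

Returning scalars (or None) instead of arrays also means the value written with `a.loc[...]` is a plain time rather than a one-element array such as `[20:00:00]`.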

    【Comments】:
