【Question title】: How do I incrementally add rows to a Pandas DataFrame?
【Posted】: 2017-03-07 01:31:00
【Question description】:

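I am computing the open-high-low-close (OHLC) values for every 15-minute interval of data from 9:15 to 15:30, and I want to store each interval's OHLC values as a new row in a DataFrame.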

ohlc = pd.DataFrame(columns=('Open','High','Low','Close'))
for row in ohlc:
    ohlc.loc[10] = pd.DataFrame([[candle_open_price,candle_high_price,candle_low_price,candle_close_price]])

But it does not work; I get the following error:

ValueError: cannot set a row with mismatched columns

I simply want to incrementally store the OHLC data I compute for each 15-minute interval as a row in the new ohlc DataFrame.
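A minimal sketch of appending one row at a time with `.loc`, assuming each candle's four prices are plain numbers (the sample values below are made up for illustration). Two things in the snippet above stand out: `for row in ohlc` iterates over the column names rather than rows, and the right-hand side is a one-row DataFrame whose length (1) does not match the four columns, which appears to be what triggers the error. Assigning a plain list of four values to the next integer label avoids both problems:

```python
import pandas as pd

ohlc = pd.DataFrame(columns=['Open', 'High', 'Low', 'Close'])

# Hypothetical per-candle results: (open, high, low, close)
candles = [
    (100.0, 101.5, 99.5, 101.0),
    (101.0, 102.0, 100.5, 100.8),
]

for o, h, l, c in candles:
    # len(ohlc) is the next unused integer label, so this enlarges the frame by one row
    ohlc.loc[len(ohlc)] = [o, h, l, c]

print(ohlc)
```

This is fine for a few dozen candles per day, but every enlargement copies data, so it does not scale well to large loops.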


Edit

import numpy as np
import pandas as pd
import datetime as dt
import matplotlib as plt
import dateutil.parser

tradedata = pd.read_csv('ICICIBANK_TradeData.csv', index_col=False, 
              names=['Datetime','Price'], 
            header=0)
tradedata['Datetime'] =  pd.to_datetime(tradedata['Datetime'])

first_trd_time = tradedata['Datetime'][0]
last_time = dateutil.parser.parse('2016-01-01 15:30:00.000000')

candle_time = 15;
candle_number = 0

while(first_trd_time < last_time):
    candledata = tradedata[(tradedata['Datetime']>first_trd_time) & (tradedata['Datetime']<first_trd_time+dt.timedelta(minutes=candle_time))]
    first_trd_time = first_trd_time+dt.timedelta(minutes=candle_time)

    candle_open_price = candledata.iloc[0]['Price']
    candle_open_time = candledata.iloc[0]['Datetime']
    candle_close_price = candledata.iloc[-1]['Price']
    candle_close_time = candledata.iloc[-1]['Datetime']
    candle_high_price = candledata.loc[candledata['Price'].idxmax()]['Price']
    candle_high_time = candledata.loc[candledata['Price'].idxmax()]['Datetime']
    candle_low_price = candledata.loc[candledata['Price'].idxmin()]['Price']
    candle_low_time = candledata.loc[candledata['Price'].idxmin()]['Datetime']

    ohlc = pd.DataFrame(columns=('Open','High','Low','Close'))
    ohlc_data = pd.DataFrame()

    if(candle_number == 0):
        ohlc = pd.DataFrame(np.array([[0, 0, 0, 0]]), columns=['Open', 'High', 'Low', 'Close']).append(ohlc, ignore_index=True)
        candle_number = candle_number + 1
        print "Zeroth Candle"
    else:
        ohlc.ix[candle_number] = (candle_open_price,candle_open_price,candle_open_price,candle_open_price)
        print "else part with incermenting candle_number"
        candle_number = candle_number + 1

    print "first_trd_time"
    print first_trd_time
    print candle_number

print "Success!"

This is my code, and the error is:

ValueError: cannot set by positional indexing with enlargement
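For reference, pandas can also build 15-minute candles without an explicit loop. The sketch below is an alternative to the code in the question, not a repair of it: it uses `resample('15min').ohlc()` on a small made-up tick series (the timestamps and prices are assumptions for illustration, not the contents of ICICIBANK_TradeData.csv). Note that the bins are closed on the left by default, so a tick at exactly 09:30:00 lands in the 09:30 candle, and the resulting columns are lowercase.

```python
import pandas as pd

# Hypothetical tick data standing in for the CSV from the question
ticks = pd.DataFrame({
    'Datetime': pd.to_datetime([
        '2016-01-01 09:15:05',
        '2016-01-01 09:20:00',
        '2016-01-01 09:29:59',
        '2016-01-01 09:30:00',
        '2016-01-01 09:40:00',
    ]),
    'Price': [100.0, 102.0, 99.0, 101.0, 103.0],
})

# Index by time, then aggregate each 15-minute bin into open/high/low/close
ohlc = ticks.set_index('Datetime')['Price'].resample('15min').ohlc()
print(ohlc)
```

If the capitalised names from the question are needed, the columns can be renamed afterwards, for example with `ohlc.columns = ['Open', 'High', 'Low', 'Close']`.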

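【Comments】:

  • What is the output of df = pd.DataFrame([[candle_open_price,candle_high_price,candle_low_price,candle_close_price]]) followed by print (df)? Is it a one-row DataFrame? What is df.columns?
  • Note that adding rows to a DataFrame like this is inefficient, because a completely new DataFrame is created for each new size.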
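One common way to avoid that repeated copying is to collect the rows in an ordinary Python list and construct the DataFrame once at the end. A minimal sketch, assuming each candle yields four scalar prices (the values are made up for illustration):

```python
import pandas as pd

# Hypothetical per-candle results: (open, high, low, close)
candles = [
    (100.0, 101.5, 99.5, 101.0),
    (101.0, 102.0, 100.5, 100.8),
]

rows = []
for o, h, l, c in candles:
    # Appending to a list is cheap; no DataFrame is rebuilt inside the loop
    rows.append({'Open': o, 'High': h, 'Low': l, 'Close': c})

# Build the DataFrame in one step; `columns` fixes the column order
ohlc = pd.DataFrame(rows, columns=['Open', 'High', 'Low', 'Close'])
print(ohlc)
```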

Tags: python pandas dataframe append


【Solution 1】:

IIUC, you can append a DataFrame for each row to a list of DataFrames, dfs, and then concat them into df1:

ohlc = pd.DataFrame(columns=('Open','High','Low','Close'))

dfs = []
for row in ohlc.iterrows():
    df = pd.DataFrame([candle_open_price,candle_high_price,
                        candle_low_price,candle_close_price]).T
    dfs.append(df)

df1 = pd.concat(dfs, ignore_index=True)
print (df1)

Then concat df1 onto the original DataFrame ohlc:

df2 = pd.concat([ohlc,df1])
print (df2)
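When the four prices are scalars rather than Series, the same list-then-concat pattern might look like the sketch below. The candle values and the starting contents of ohlc are assumptions for illustration. Each list element is a one-row DataFrame with explicit column names; `pd.concat` expects DataFrame or Series objects, so plain lists or tuples have to be wrapped first.

```python
import pandas as pd

cols = ['Open', 'High', 'Low', 'Close']

# Existing frame to extend (hypothetical starting row)
ohlc = pd.DataFrame([[1.0, 2.0, 0.5, 1.5]], columns=cols)

# Hypothetical per-candle results: (open, high, low, close)
candles = [
    (100.0, 101.5, 99.5, 101.0),
    (101.0, 102.0, 100.5, 100.8),
]

dfs = []
for o, h, l, c in candles:
    # One-row DataFrame per candle, with column names matching ohlc
    dfs.append(pd.DataFrame([[o, h, l, c]], columns=cols))

df1 = pd.concat(dfs, ignore_index=True)          # all new candles
df2 = pd.concat([ohlc, df1], ignore_index=True)  # existing rows + new candles
print(df2)
```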

Sample (for testing, the same data is added in each iteration of the loop):

#sample data
candle_open_price = pd.Series([1.5,10], 
                              name='Open', 
                              index=pd.DatetimeIndex(['2016-01-02','2016-01-03']) )
candle_high_price =  pd.Series([8,9], 
                               name='High', 
                               index=pd.DatetimeIndex(['2016-01-02','2016-01-03']))
candle_low_price =  pd.Series([0,12], 
                              name='Low', 
                              index=pd.DatetimeIndex(['2016-01-02','2016-01-03']))
candle_close_price =  pd.Series([4,5], 
                                name='Close', 
                                index=pd.DatetimeIndex(['2016-01-02','2016-01-03']))

data = np.array([[1,2,3,5],[7,7,8,9],[10,8,9,3]])
idx = pd.DatetimeIndex(['2016-01-08','2016-01-09','2016-01-10'])
ohlc = pd.DataFrame(data=data, 
                    columns=('Open','High','Low','Close'),
                    index=idx)
print (ohlc)
            Open  High  Low  Close
2016-01-08     1     2    3      5
2016-01-09     7     7    8      9
2016-01-10    10     8    9      3
dfs = []
for row in ohlc.iterrows():
    df = pd.DataFrame([candle_open_price,candle_high_price,
                       candle_low_price,candle_close_price]).T
    #print (df)
    dfs.append(df)

df1 = pd.concat(dfs)
print (df1)
            Open  High   Low  Close
2016-01-02   1.5   8.0   0.0    4.0
2016-01-03  10.0   9.0  12.0    5.0
2016-01-02   1.5   8.0   0.0    4.0
2016-01-03  10.0   9.0  12.0    5.0
2016-01-02   1.5   8.0   0.0    4.0
2016-01-03  10.0   9.0  12.0    5.0

df2 = pd.concat([ohlc,df1])
print (df2)
            Open  High   Low  Close
2016-01-08   1.0   2.0   3.0    5.0
2016-01-09   7.0   7.0   8.0    9.0
2016-01-10  10.0   8.0   9.0    3.0
2016-01-02   1.5   8.0   0.0    4.0
2016-01-03  10.0   9.0  12.0    5.0
2016-01-02   1.5   8.0   0.0    4.0
2016-01-03  10.0   9.0  12.0    5.0
2016-01-02   1.5   8.0   0.0    4.0
2016-01-03  10.0   9.0  12.0    5.0

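【Comments】:

  • Can you add some sample data? 5-6 rows.
  • I also tried using concat, but got the error cannot concatenate a non-NDFrame object.
  • I have added sample data too.
  • Sorry, I don't understand while(first_trd_time < last_time): maybe it should be an if?
  • I am looping from the first trade time to the last trade time.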