【Posted】: 2021-07-30 19:16:57
【Question】:
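I have an interesting problem that I tried to solve, without success. I have a time-series DataFrame with 4 columns: Source, Target, timestamp, and values.
Each timestamp has multiple Source, Target, and values entries, as in the following code: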
import pandas as pd
data = [
    ['a','None','01.01.2020',20], ['a','None','02.01.2020',15], ['a','None','03.01.2020',11],
    ['a','b','01.01.2020',100], ['a','b','02.01.2020',105], ['a','b','03.01.2020',101],
    ['c','d','01.01.2020',0], ['c','d','02.01.2020',0], ['c','d','03.01.2020',1],
    ['b','c','01.01.2020',50.45], ['b','c','02.01.2020',10.5], ['b','c','03.01.2020',500],
    ['a','d','01.01.2020',5000], ['a','d','02.01.2020',1500], ['a','d','03.01.2020',25],
    ['c','a','01.01.2020',2.2538], ['c','a','02.01.2020',105], ['c','a','03.01.2020',110],
]
df = pd.DataFrame(data, columns=['Source', 'Target', 'timestamp', 'values'])
I want to return the data in a new format, as defined by this DataFrame:
resultdata = [['01.01.2020',20,100,0,50.45,5000,2.2538], ['02.01.2020',15,105,0,10.5, 1500,105],
['03.01.2020',11,101,1,500,25,110]]
result = pd.DataFrame(resultdata, columns = ['timestamp', 'aNone', 'ab', 'cd', 'bc', 'ad', 'ca'])
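To do this, I tried concatenating the string columns, dropping the duplicate timestamps, and then iterating, but I only get the data from the last iteration, in dictionary form.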
df['Source Target'] = df['Source'] + ' ' + df['Target']
st = df['Source Target'].drop_duplicates(keep='first').reset_index(drop=True)
timestamp = df['timestamp'].drop_duplicates(keep='first').reset_index(drop=True)
d = {}
for j in range(len(timestamp)):
    # timestamp is already a Series, so it is indexed directly
    Time = timestamp[j]
    for k in range(len(st)):
        Column = st[k]
        for i in range(len(df)):
            time = df['timestamp'][i]
            columnname = df['Source Target'][i]
            if time == Time and columnname == Column:
                # d is keyed by Column only, so each later timestamp
                # overwrites the entry stored for the earlier one
                d[Column] = (time, df['values'][i])
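For comparison, the desired layout looks like a long-to-wide reshape, which can probably be done without loops. The following is only a sketch, not a definitive answer: it assumes every (timestamp, Source, Target) combination occurs exactly once (otherwise `DataFrame.pivot` raises a `ValueError`), and the names `pair`, `pair_order`, and `wide` are made up for illustration.

```python
import pandas as pd

data = [
    ['a','None','01.01.2020',20], ['a','None','02.01.2020',15], ['a','None','03.01.2020',11],
    ['a','b','01.01.2020',100], ['a','b','02.01.2020',105], ['a','b','03.01.2020',101],
    ['c','d','01.01.2020',0], ['c','d','02.01.2020',0], ['c','d','03.01.2020',1],
    ['b','c','01.01.2020',50.45], ['b','c','02.01.2020',10.5], ['b','c','03.01.2020',500],
    ['a','d','01.01.2020',5000], ['a','d','02.01.2020',1500], ['a','d','03.01.2020',25],
    ['c','a','01.01.2020',2.2538], ['c','a','02.01.2020',105], ['c','a','03.01.2020',110],
]
df = pd.DataFrame(data, columns=['Source', 'Target', 'timestamp', 'values'])

# Hypothetical helper column: 'a' + 'None' -> 'aNone', 'a' + 'b' -> 'ab', ...
df['pair'] = df['Source'] + df['Target']

# pivot sorts the new columns alphabetically, so remember first-appearance order
pair_order = df['pair'].drop_duplicates().tolist()

wide = (
    df.pivot(index='timestamp', columns='pair', values='values')
      .reindex(columns=pair_order)   # restore the original pair order
      .reset_index()                 # turn the timestamp index back into a column
)
wide.columns.name = None             # drop the leftover 'pair' axis label

print(wide)
```

The timestamps here are plain strings, so the rows are ordered lexically; for data spanning several months it may be safer to convert them first with `pd.to_datetime(df['timestamp'], format='%d.%m.%Y')`. If duplicate combinations can occur, `pivot_table` with an explicit aggregation function would be the usual alternative.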
【Comments】:
Tags: python pandas dataframe for-loop time-series