【Posted】: 2021-07-22 15:55:19
【Problem description】:
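I need to insert rows based on the week column, grouped by the type column. In some cases weeks are missing at various positions in the middle of the DataFrame, and I want to fill each gap with copies of the last existing row before it. In this case, copies of week 7 should fill weeks 8 and 9, and copies of week 11 should fill weeks 12, 13 and 14. In this table you can see the jumps from week 7 to week 10 and from week 11 to week 15:
The ideal output is as follows: a final table in which the week column increments correctly:
Below is my code. It inserts only one row, and I can't work out why: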
import pandas as pd
from pandas import DataFrame

def middle_values(final : DataFrame) -> DataFrame:
    finaltemp= pd.DataFrame()
    out= pd.DataFrame()
    for i in range(0, len(final)):
        for f in range(1, 52 , 1):
            if final.iat[i,8]== f and final.iat[i-1,8] != f-1 :
                if final.iat[i,8] > final.iat[i-1,8] and final.iat[i,8] != (final.iat[i-1,8] - 1):
                    line = final.iloc[i-1]
                    c1 = final[0:i]
                    c2 = final[i:]
                    c1.loc[i]=line
                    concatinated = pd.concat([c1, c2])
                    concatinated.reset_index(inplace=True)
                    concatinated.iat[i,11] = concatinated.iat[i-1,11]
                    concatinated.iat[i,9]= f-1
                    finaltemp = finaltemp.append(concatinated)
    if 'type' in finaltemp.columns:
        for name, groups in finaltemp.groupby(["type"]):
            weeks = range(groups['week'].min(), groups['week'].max()+1)
            out = out.append(pd.merge(finaltemp, pd.Series(weeks, name='week'), how='right').ffill())
        out.drop_duplicates(subset=['project', 'week'], keep = 'first', inplace=True)
        out.drop_duplicates(inplace = True)
        out.sort_values(["Budget: Budget Name", "Budget Week"], ascending = (False, True), inplace=True)
        out.drop(['level_0'], axis = 1, inplace=True)
        out.reset_index(inplace=True)
        out.drop(['level_0'], axis = 1, inplace=True)
        return out
    else :
        return final
【Discussion】:
Tags: python pandas dataframe missing-data
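The behaviour described in the question (fill every missing week inside each type group with a copy of the last existing row) can probably be sketched without the positional `iat` loops. The following is a minimal sketch, not the asker's method: the function name `fill_missing_weeks`, the `value` column and the sample data are hypothetical, since the original tables are not shown. It also assumes the week column holds integers and that each (type, week) pair appears at most once.

```python
import pandas as pd


def fill_missing_weeks(df: pd.DataFrame,
                       group_col: str = "type",
                       week_col: str = "week") -> pd.DataFrame:
    """Insert a row for every missing week inside each group,
    copying the last existing row before the gap (hypothetical helper)."""
    filled = []
    for _, group in df.groupby(group_col, sort=False):
        group = group.sort_values(week_col).set_index(week_col)
        # Every week between the group's first and last week, inclusive.
        full_weeks = pd.Index(
            range(int(group.index.min()), int(group.index.max()) + 1),
            name=week_col,
        )
        # reindex adds the missing weeks; method="ffill" copies the
        # nearest earlier existing row into each new week.
        filled.append(group.reindex(full_weeks, method="ffill").reset_index())
    if not filled:
        return df.copy()
    # Restore the original column order.
    return pd.concat(filled, ignore_index=True)[df.columns]


# Hypothetical sample data mirroring the gaps described in the question.
example = pd.DataFrame({
    "type": ["A", "A", "A", "A", "B", "B"],
    "week": [7, 10, 11, 15, 1, 3],
    "value": [70, 100, 110, 150, 10, 30],
})
result = fill_missing_weeks(example)
print(result)
```

With this sample, weeks 8 and 9 of type A are copies of week 7, and weeks 12 to 14 are copies of week 11. The sketch collects the per-group frames with `pd.concat` rather than `DataFrame.append`, which was removed in pandas 2.0.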