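【Posted】: 2022-09-27 16:09:44
【Question】:
I have a DataFrame df with four columns: Date, Location, Category and Value. For each combination of Date and Location, I am trying to replace the Value of every row whose Category is 5 or higher with the Value from the row where Category == 5.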
df:
Date Location Category Value
20220101 FE 1 0.23
20220101 FE 2 0.24
20220101 FE 3 0.26
20220101 FE 4 0.27
20220101 FE 5 0.28
20220101 FE 6 0.30
20220101 RP 5 0.32
20220101 RP 6 0.35
20220102 FE 1 0.20
20220102 FE 2 0.23
20220102 FE 3 0.25
20220102 FE 4 0.26
20220102 FE 5 0.28
20220102 FE 6 0.32
df_new:
Date Location Category Value
20220101 FE 1 0.23
20220101 FE 2 0.24
20220101 FE 3 0.26
20220101 FE 4 0.27
20220101 FE 5 0.28
20220101 FE 6 0.28 <-- changed with value from row with Category == 5
20220101 RP 5 0.32
20220101 RP 6 0.32 <-- changed with value from row with Category == 5
20220102 FE 1 0.20
20220102 FE 2 0.23
20220102 FE 3 0.25
20220102 FE 4 0.26
20220102 FE 5 0.28
20220102 FE 6 0.28 <-- changed with value from row with Category == 5
So far, I have only managed to extract the Value for one specific Date and Location where Category == 5:
df.loc[(df['Date'] == 20220101) & (df['Location'] == 'FE') & (df['Category'] == 5), 'Value'].iloc[0]
Is there a simple and efficient way to change the values in the Value column? Thanks a lot!
For reproducibility:
import pandas as pd

df = pd.DataFrame({
'Date':[20220101, 20220101, 20220101, 20220101, 20220101, 20220101, 20220101, 20220101, 20220102, 20220102, 20220102, 20220102, 20220102, 20220102, 20220102, 20220102],
'Location':['FE', 'FE', 'FE', 'FE', 'FE', 'FE', 'RP', 'RP', 'FE', 'FE', 'FE', 'FE', 'FE', 'FE', 'RP', 'RP'],
'Category':[1, 2, 3, 4, 5, 6, 5, 6, 1, 2, 3, 4, 5, 6, 5, 6],
'Value':[0.23, 0.24, 0.26, 0.27, 0.28, 0.3, 0.32, 0.35, 0.2, 0.23, 0.25, 0.26, 0.28, 0.32, 0.34, 0.36]
})
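One possible way to do the replacement for all groups at once is sketched below. It is not taken from the original post; it assumes that each (Date, Location) group has at most one row with Category == 5, and the names cat5_only, cat5_value and replace are illustrative only. The idea is to blank out every Value except the Category == 5 ones, broadcast the remaining value across its group with groupby(...).transform('first'), and then overwrite only the rows with Category >= 5.

```python
import pandas as pd

# Same data as in the question, written more compactly.
df = pd.DataFrame({
    'Date': [20220101] * 8 + [20220102] * 8,
    'Location': ['FE'] * 6 + ['RP'] * 2 + ['FE'] * 6 + ['RP'] * 2,
    'Category': [1, 2, 3, 4, 5, 6, 5, 6, 1, 2, 3, 4, 5, 6, 5, 6],
    'Value': [0.23, 0.24, 0.26, 0.27, 0.28, 0.3, 0.32, 0.35,
              0.2, 0.23, 0.25, 0.26, 0.28, 0.32, 0.34, 0.36],
})

# Keep Value only on the Category == 5 rows; all other rows become NaN.
cat5_only = df['Value'].where(df['Category'] == 5)

# Broadcast each group's Category == 5 value to every row of its
# (Date, Location) group. 'first' returns the first non-missing value.
cat5_value = cat5_only.groupby([df['Date'], df['Location']]).transform('first')

# Overwrite Value only where Category >= 5 and the group actually
# contains a Category == 5 row (assumption: otherwise leave it unchanged).
df_new = df.copy()
replace = (df['Category'] >= 5) & cat5_value.notna()
df_new.loc[replace, 'Value'] = cat5_value[replace]

print(df_new)
```

Working on a copy keeps df intact for comparison. If a group could contain several Category == 5 rows, 'first' would silently pick the first one, so that case would need an explicit rule.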