【Question title】: How to sum two columns together based on another column's value using Pandas
【Posted】: 2023-03-25 04:08:01
【Question description】:

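I want to reproduce the "desired_outcome" column using Pandas. Basically, whenever "Acc Type" equals O, I need to take the sum of Balance and Amount; judging by the table, every other row just keeps Balance, and a NaN Amount counts as 0.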

+--------+----------+-------+---------+--------+----------+-----------------+
| MainID |   Date   | SubID | Balance | Amount | Acc Type | desired_outcome |
+--------+----------+-------+---------+--------+----------+-----------------+
|      1 | 1/1/2020 |     1 |      10 | 5      | O        |              15 |
|      1 | 1/1/2020 |     1 |      10 | 4      | R        |              10 |
|      1 | 1/1/2020 |     2 |      20 | 5      | O        |              25 |
|      1 | 1/1/2020 |     2 |      20 | 4      | R        |              20 |
|      1 | 1/1/2020 |     3 |      30 | 5      | O        |              35 |
|      1 | 1/1/2020 |     3 |      30 | 4      | R        |              30 |
|      1 | 2/1/2020 |     1 |      40 | 5      | O        |              45 |
|      1 | 2/1/2020 |     1 |      40 | 4      | R        |              40 |
|      1 | 2/1/2020 |     2 |      50 | 5      | O        |              55 |
|      1 | 2/1/2020 |     2 |      50 | 4      | R        |              50 |
|      1 | 2/1/2020 |     3 |      60 | 5      | O        |              65 |
|      1 | 2/1/2020 |     3 |      60 | 4      | R        |              60 |
|      2 | 1/1/2020 |     7 |     100 | NaN    | O        |             100 |
|      2 | 1/1/2020 |     7 |     100 | NaN    | R        |             100 |
+--------+----------+-------+---------+--------+----------+-----------------+

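Also, I know this isn't an ideal dataframe, and the better approach would probably be to have two dataframes. How would I set that up so that I have a second dataframe like the one shown below, and still get the desired_outcome column shown above (without the extra rows, since Acc Type would no longer be in the first dataframe)?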

+--------+----------+------------+----------+
| MainID |   Date   | Acc Amount | Acc Type |
+--------+----------+------------+----------+
|      1 | 1/1/2020 | 5          | O        |
|      1 | 1/1/2020 | 4          | R        |
|      1 | 2/1/2020 | 5          | O        |
|      1 | 2/1/2020 | 4          | R        |
|      2 | 1/1/2020 | NaN        | O        |
|      2 | 1/1/2020 | NaN        | R        |
+--------+----------+------------+----------+

Thanks!

【Question discussion】:

    Tags: python pandas dataframe conditional-operator


    【Solution 1】:

    Your dataframe is fine as it is. Here is what I would do:

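    import numpy as np

    # Where Acc Type is 'O', add Amount (NaN treated as 0) to Balance;
    # every other row keeps Balance unchanged.
    df['desired_outcome'] = np.where(df['Acc Type']=='O',
                                     df['Balance'] + df['Amount'].fillna(0),
                                     df['Balance'])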
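
For anyone who wants to run this end to end, here is a minimal self-contained sketch. It rebuilds the question's table by hand (the values are copied from the question; the construction itself is my assumption about how `df` was created) and applies the same `np.where` logic. The `alt` series is an illustrative pandas-only variant using `Series.where`, included only to show that the two formulations agree.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question's table.
df = pd.DataFrame({
    "MainID":   [1] * 12 + [2] * 2,
    "Date":     ["1/1/2020"] * 6 + ["2/1/2020"] * 6 + ["1/1/2020"] * 2,
    "SubID":    [1, 1, 2, 2, 3, 3, 1, 1, 2, 2, 3, 3, 7, 7],
    "Balance":  [10, 10, 20, 20, 30, 30, 40, 40, 50, 50, 60, 60, 100, 100],
    "Amount":   [5, 4] * 6 + [np.nan, np.nan],
    "Acc Type": ["O", "R"] * 7,
})

is_o = df["Acc Type"] == "O"

# Same logic as the answer: Balance + Amount on 'O' rows, Balance elsewhere.
df["desired_outcome"] = np.where(
    is_o,
    df["Balance"] + df["Amount"].fillna(0),
    df["Balance"],
)

# Illustrative pandas-only variant: zero out Amount on non-'O' rows and
# on missing values, then add it to Balance.
alt = df["Balance"] + df["Amount"].where(is_o, 0).fillna(0)
```

The `fillna(0)` call is what keeps the MainID 2 rows at 100 instead of NaN.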
    

    Output:

        MainID      Date  SubID  Balance  Amount Acc Type  desired_outcome
    0        1  1/1/2020      1       10     5.0        O             15.0
    1        1  1/1/2020      1       10     4.0        R             10.0
    2        1  1/1/2020      2       20     5.0        O             25.0
    3        1  1/1/2020      2       20     4.0        R             20.0
    4        1  1/1/2020      3       30     5.0        O             35.0
    5        1  1/1/2020      3       30     4.0        R             30.0
    6        1  2/1/2020      1       40     5.0        O             45.0
    7        1  2/1/2020      1       40     4.0        R             40.0
    8        1  2/1/2020      2       50     5.0        O             55.0
    9        1  2/1/2020      2       50     4.0        R             50.0
    10       1  2/1/2020      3       60     5.0        O             65.0
    11       1  2/1/2020      3       60     4.0        R             60.0
    12       2  1/1/2020      7      100     NaN        O            100.0
    13       2  1/1/2020      7      100     NaN        R            100.0
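
The question also asks about splitting the data into two dataframes, which the code above does not cover. The following is only a sketch of one way that might look, under my own assumptions: the first frame (here called `balances`, a hypothetical name) holds one row per MainID/Date/SubID, the second frame (`acc`, also hypothetical) matches the layout shown in the question, and each MainID/Date pair has at most one "O" row. With one row per SubID, desired_outcome is taken to mean the "O" value, that is, Balance plus the "O" amount.

```python
import numpy as np
import pandas as pd

# Hypothetical first frame: one row per (MainID, Date, SubID), no Acc Type.
balances = pd.DataFrame({
    "MainID":  [1, 1, 1, 1, 1, 1, 2],
    "Date":    ["1/1/2020"] * 3 + ["2/1/2020"] * 3 + ["1/1/2020"],
    "SubID":   [1, 2, 3, 1, 2, 3, 7],
    "Balance": [10, 20, 30, 40, 50, 60, 100],
})

# Second frame, laid out as in the question.
acc = pd.DataFrame({
    "MainID":     [1, 1, 1, 1, 2, 2],
    "Date":       ["1/1/2020", "1/1/2020", "2/1/2020", "2/1/2020",
                   "1/1/2020", "1/1/2020"],
    "Acc Amount": [5, 4, 5, 4, np.nan, np.nan],
    "Acc Type":   ["O", "R", "O", "R", "O", "R"],
})

# Keep only the 'O' rows so the merge cannot duplicate balance rows.
o_amounts = acc.loc[acc["Acc Type"] == "O", ["MainID", "Date", "Acc Amount"]]

# Left merge keeps every balance row; validate raises if a
# (MainID, Date) key appears more than once on the right-hand side.
merged = balances.merge(
    o_amounts, on=["MainID", "Date"], how="left", validate="many_to_one"
)

# A missing amount counts as 0, as in the single-frame version.
merged["desired_outcome"] = merged["Balance"] + merged["Acc Amount"].fillna(0)
```

Filtering `acc` down to the "O" rows before merging is what avoids the extra rows. Merging the full `acc` frame would bring back two rows per SubID.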
    

    【Discussion】:
