【Question title】: New column in df using multiple conditions [duplicate]
【Posted】: 2021-01-14 14:24:06
【Question】:

I have looked at a bunch of similar posts here, but none of them really answers my question:

Df:

pc_cogs = pd.DataFrame({'Product': ['Product 1','Product 95','Product 10','Product 44','Product 100','Product 69','Product 78','Product 3'], 
              'Pack_type':['Case - 4x6 - 12oz - Can', 'Case - 4x6 - 12oz - Can', 'Case - 6x4 - 16oz - Can','Case - 6x4 - 12oz - Can',
                           'Case - 6x4 - 16oz - Can','Cask - Pin', 'Case - 12x - 22oz - Bottle','Case - 6x4 - 12oz - Can'],
             'Keg Category': ['REGULAR', 'SEASONAL', 'WINTER', 'SEASONAL', 'SEASONAL', 'FALL', 'SEASONAL', 'WINTER'],
             'Unit_Sale_Price': [5, 8, 10, 22, 45, 56, 33, 23],
             'New Price': [0,0,0,0,0,0,0,0]})

Desired result:

new_pc_cogs = pd.DataFrame({'Product': ['Product 1','Product 95','Product 10','Product 44','Product 100','Product 69','Product 78','Product 3'], 
              'Pack_type':['Case - 4x6 - 12oz - Can', 'Case - 4x6 - 12oz - Can', 'Case - 6x4 - 16oz - Can','Case - 6x4 - 12oz - Can',
                           'Case - 6x4 - 16oz - Can','Cask - Pin', 'Case - 12x - 22oz - Bottle','Case - 6x4 - 12oz - Can'],
             'Keg Category': ['REGULAR', 'SEASONAL', 'WINTER', 'SEASONAL', 'SEASONAL', 'FALL', 'SEASONAL', 'WINTER'],
             'Unit_Sale_Price': [5, 8, 10, 22, 45, 56, 33, 23],
             'New Price': [7.36,7.54,13.47,21.87,44.87,56,33,23]})

Trying to add the column. Lists and variables used in the conditions:

## Pack type
four_pack = 'Case - 6x4 - 16oz - Can'
six_pack = 'Case - 4x6 - 12oz - Can'
four_pack2 = 'Case - 6x4 - 12oz - Can'

## Core Brands

core_brands = ['Prod 1','Prod 2', 'Prod 3',
               'Prod 4', 'Prod 5', 'Prod 6', 'Prod 7','Prod 8']

cali_brands = ['Prod 9', 'Prod 10']

Attempt 1:

for product in pc_cogs['Product']:
    package = pc_cogs['Pack_type']
    category = pc_cogs['Keg Category']
    price = pc_cogs['Unit_sale_price']
    if product in core_brands & package == six_pack: 
        pc_cogs['Price Change'] = price + 2.36
    elif category == 'SEASONAL' & package == six_pack: 
        pc_cogs['Price Change'] = price - .46
    elif product in cali_brands & package == four_pack: 
        pc_cogs['Price Change'] = price + 3.47
    elif (category == 'SEASONAL') & (package == four_pack | package == four_pack2):
        pc_cogs['Price Change'] = price - .13      
    else: 
        pc_cogs['Price Change'] = 0

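Error: operands could not be broadcast together with shapes (8,) (611,). The last elif contains two conditions. I also tried with other conditions, without success.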

Attempt 2:

pc_cogs['Price Change'][(pc_cogs['Product'] in core_brands) & (pc_cogs['Package'] == six_pack)] = pc_cogs['Unit_sale_price'] + 2.36

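Error: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all(). I saw a link to a popular post about this error. It said to change the words and/or to & and |, which I did.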

I also saw this: Creating Column in Dataframe Using Multiple Conditions

But that did not help.

Any help would be greatly appreciated.

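【Comments】:

  • Please show a sample dataframe of your starting input, and then your desired output. Please see: stackoverflow.com/questions/20109391/…
  • Got it. Give me 20 minutes.
  • The best part of dataframes is that you don't need to loop through conditions to update a column. You can just give the condition, and the column is updated wherever it is true.
  • OK. I have included two df templates, one for the start and one for the end. I checked them in Jupyter to make sure they look right.
  • I figured as much, Joe. I have only been at this for 2 months, and I often get stuck in plain-Python ways of doing things and forget the flexibility pandas gives a df.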
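As a rough illustration of the comment about not needing a loop, the sketch below uses a small toy frame (hypothetical values, not the question's data; the short pack labels 'six' and 'four' are made up). It also shows one way the "truth value of a Series is ambiguous" error can arise: `in` asks for a single True/False, while `.isin()` compares element by element.

```python
import pandas as pd

# Toy frame with hypothetical values (not the question's data)
df = pd.DataFrame({
    'Product': ['Product 1', 'Product 95', 'Product 3'],
    'Pack_type': ['six', 'six', 'four'],
    'Unit_Sale_Price': [5.0, 8.0, 23.0],
})
core_brands = ['Product 1', 'Product 3']

# `Series in list` needs a single True/False and raises ValueError
try:
    df['Product'] in core_brands
    raised = False
except ValueError:
    raised = True

# .isin() and == work element by element; the == comparison is wrapped
# in parentheses because & binds more tightly than ==
mask = df['Product'].isin(core_brands) & (df['Pack_type'] == 'six')

# Start from the old price, then update only the rows where mask is True
df['New Price'] = df['Unit_Sale_Price']
df.loc[mask, 'New Price'] = df['Unit_Sale_Price'] + 2.36
```

Only the first row satisfies both conditions here, so it is the only one whose price changes.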

Tags: python pandas


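【Solution 1】:

I see Quang has already posted an np.select() solution, but here is the full code. Your core_brands and cali_brands lists have a typo (Prod instead of Product), so I changed the spelling to match the Product column so that those rows are picked up:

You can simply create conditions and results and use np.select()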

import pandas as pd
import numpy as np
pc_cogs = pd.DataFrame({'Product': ['Product 1','Product 95','Product 10','Product 44','Product 100','Product 69','Product 78','Product 3'], 
              'Pack_type':['Case - 4x6 - 12oz - Can', 'Case - 4x6 - 12oz - Can', 'Case - 6x4 - 16oz - Can','Case - 6x4 - 12oz - Can',
                           'Case - 6x4 - 16oz - Can','Cask - Pin', 'Case - 12x - 22oz - Bottle','Case - 6x4 - 12oz - Can'],
             'Keg Category': ['REGULAR', 'SEASONAL', 'WINTER', 'SEASONAL', 'SEASONAL', 'FALL', 'SEASONAL', 'WINTER'],
             'Unit_Sale_Price': [5, 8, 10, 22, 45, 56, 33, 23],
             'New Price': [0,0,0,0,0,0,0,0]})

four_pack = 'Case - 6x4 - 16oz - Can'
six_pack = 'Case - 4x6 - 12oz - Can'
four_pack2 = 'Case - 6x4 - 12oz - Can'
core_brands = ['Product 1','Product 2', 'Product 3',
               'Product 4', 'Product 5', 'Product 6', 'Product 7','Product 8']
cali_brands = ['Product 9', 'Product 10']
price = pc_cogs['Unit_Sale_Price']

c1 = (pc_cogs['Product'].isin(core_brands)) & (pc_cogs['Pack_type'] == six_pack)
r1 = price + 2.36

c2 = (pc_cogs['Keg Category'] == 'SEASONAL') & (pc_cogs['Pack_type'] == six_pack)
r2 = price - .46

c3 = (pc_cogs['Product'].isin(cali_brands)) & (pc_cogs['Pack_type'] == four_pack)
r3 = price + 3.47

c4 = (pc_cogs['Keg Category'] == 'SEASONAL') & (pc_cogs['Pack_type'].isin([four_pack, four_pack2]))
r4 = price - .13

conditions = [c1,c2,c3,c4]
results = [r1,r2,r3,r4]
pc_cogs['New Price'] = np.select(conditions, results, pc_cogs['Unit_Sale_Price'])                                              
pc_cogs
Out[1]: 
       Product                   Pack_type Keg Category  Unit_Sale_Price  \
0    Product 1     Case - 4x6 - 12oz - Can      REGULAR                5   
1   Product 95     Case - 4x6 - 12oz - Can     SEASONAL                8   
2   Product 10     Case - 6x4 - 16oz - Can       WINTER               10   
3   Product 44     Case - 6x4 - 12oz - Can     SEASONAL               22   
4  Product 100     Case - 6x4 - 16oz - Can     SEASONAL               45   
5   Product 69                  Cask - Pin         FALL               56   
6   Product 78  Case - 12x - 22oz - Bottle     SEASONAL               33   
7    Product 3     Case - 6x4 - 12oz - Can       WINTER               23   

   New Price  
0       7.36  
1       7.54  
2      13.47  
3      21.87  
4      44.87  
5      56.00  
6      33.00  
7      23.00  
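One detail worth keeping in mind with np.select() is that, for each row, it takes the choice belonging to the first condition that is True, so the order of the lists matters when rules overlap. A minimal sketch with a made-up two-row frame (not the question's data) in which the first row satisfies both rules:

```python
import numpy as np
import pandas as pd

# Hypothetical two-row frame; row 0 is both a core brand and SEASONAL
df = pd.DataFrame({
    'Product': ['Product 1', 'Product 95'],
    'Keg Category': ['SEASONAL', 'SEASONAL'],
    'Unit_Sale_Price': [5, 8],
})
core_brands = ['Product 1']

is_core = df['Product'].isin(core_brands)
is_seasonal = df['Keg Category'] == 'SEASONAL'
price = df['Unit_Sale_Price']

# Core rule listed first: row 0 gets price + 2.36
core_first = np.select([is_core, is_seasonal],
                       [price + 2.36, price - 0.46], default=price)

# Seasonal rule listed first: row 0 now gets price - 0.46 instead
seasonal_first = np.select([is_seasonal, is_core],
                           [price - 0.46, price + 2.36], default=price)
```

In the sample data from the question no row satisfies more than one of c1 to c4, so the ordering does not change the output there.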

【Comments】:

【Solution 2】:

Here is an application of np.select:

    pc_cogs['New Price'] = pc_cogs['Unit_Sale_Price'] + np.select([
            pc_cogs['Product'].isin(core_brands) & pc_cogs['Pack_type'].eq(six_pack),
            pc_cogs['Keg Category'].eq('SEASONAL') & pc_cogs['Pack_type'].eq(six_pack),
            pc_cogs['Product'].isin(cali_brands) & pc_cogs['Pack_type'].eq(four_pack),
            pc_cogs['Keg Category'].eq('SEASONAL') & pc_cogs['Pack_type'].isin([four_pack,four_pack2])
        ],
        [2.36,-.46,3.47,-.13],0
    )
    

Output:

        Product      Pack_type                   Keg Category      Unit_Sale_Price    New Price
    --  -----------  --------------------------  --------------  -----------------  -----------
     0  Product 1    Case - 4x6 - 12oz - Can     REGULAR                         5         7.36
     1  Product 95   Case - 4x6 - 12oz - Can     SEASONAL                        8         7.54
     2  Product 10   Case - 6x4 - 16oz - Can     WINTER                         10        13.47
     3  Product 44   Case - 6x4 - 12oz - Can     SEASONAL                       22        21.87
     4  Product 100  Case - 6x4 - 16oz - Can     SEASONAL                       45        44.87
     5  Product 69   Cask - Pin                  FALL                           56        56
     6  Product 78   Case - 12x - 22oz - Bottle  SEASONAL                       33        33
     7  Product 3    Case - 6x4 - 12oz - Can     WINTER                         23        23
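The answer above uses the method form `.eq()` rather than `==`. As far as the result goes the two are interchangeable; the practical difference is that the operator form needs parentheses around each comparison, because `&` binds more tightly than `==`. A small sketch with toy Series (the values are illustrative only):

```python
import pandas as pd

pack = pd.Series(['Case - 4x6 - 12oz - Can', 'Cask - Pin'])
category = pd.Series(['SEASONAL', 'SEASONAL'])
six_pack = 'Case - 4x6 - 12oz - Can'

# Method form: no extra parentheses needed around each comparison
with_eq = category.eq('SEASONAL') & pack.eq(six_pack)

# Operator form: parentheses are required around each == comparison
with_operator = (category == 'SEASONAL') & (pack == six_pack)

# Both forms produce the same boolean mask
same = with_eq.equals(with_operator)
```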
    

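【Comments】:

  • The value for Product 10 should be 13.47
  • You need to change six_pack to four_pack in this line: pc_cogs['Product'].isin(cali_brands) & pc_cogs['Pack_type'].eq(six_pack)
  • @JoeFerndz Exactly right. Updated the answer. Thanks.

【Solution 3】:

You can replace your for loop with the 5 lines below, and you will get your result set.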

    pc_cogs['New Price'] = pc_cogs['Unit_Sale_Price']
    
    pc_cogs.loc[(pc_cogs['Product'].isin(core_brands)) & (pc_cogs['Pack_type'] == six_pack), 'New Price'] += 2.36
    pc_cogs.loc[(pc_cogs['Keg Category']=='SEASONAL') & (pc_cogs['Pack_type'] == six_pack), 'New Price'] -=  0.46
    pc_cogs.loc[(pc_cogs['Product'].isin(cali_brands)) & (pc_cogs['Pack_type'] == four_pack), 'New Price'] += 3.47
    pc_cogs.loc[(pc_cogs['Keg Category']=='SEASONAL') & (pc_cogs['Pack_type'].isin([four_pack,four_pack2])), 'New Price'] -= 0.13
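A caveat on the sequential `.loc` approach: each line is applied independently, so a row that satisfies several rules receives every matching adjustment, whereas np.select() stops at the first match. In the question's sample no row satisfies more than one rule, so both approaches agree there. The sketch below uses a made-up two-row frame to show the difference; the column names 'Stacked' and 'First match' are hypothetical. Casting to float first is an assumption on my part to keep the column dtype stable, since newer pandas versions may warn when floats are written into an integer column.

```python
import numpy as np
import pandas as pd

# Hypothetical frame; row 0 is both a core brand and SEASONAL
df = pd.DataFrame({
    'Product': ['Product 1', 'Product 95'],
    'Keg Category': ['SEASONAL', 'SEASONAL'],
    'Unit_Sale_Price': [5, 8],
})
core_brands = ['Product 1']
is_core = df['Product'].isin(core_brands)
is_seasonal = df['Keg Category'] == 'SEASONAL'

# Cast to float first so the in-place updates keep a float column
df['Stacked'] = df['Unit_Sale_Price'].astype(float)
df.loc[is_core, 'Stacked'] += 2.36      # applies to row 0 only
df.loc[is_seasonal, 'Stacked'] -= 0.46  # applies to rows 0 and 1

# np.select, for comparison, uses only the first matching rule per row
df['First match'] = df['Unit_Sale_Price'] + np.select(
    [is_core, is_seasonal], [2.36, -0.46], 0)
```

Row 0 ends up with both adjustments in 'Stacked' but only the first one in 'First match'.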
    

【Comments】:
