【Question title】:Creating new column by merging on column name and other column value
【Posted】:2018-05-02 13:10:02
【Question】:

I'm trying to create a new column in DF1 that lists the number of All-Stars the home team had in that season.

DF1

                     Date             Visitor  V_PTS                  Home  H_PTS  \
0 2012-10-30 19:00:00  Washington Wizards     84   Cleveland Cavaliers     94   
1 2012-10-30 19:30:00    Dallas Mavericks     99    Los Angeles Lakers     91   
2 2012-10-30 20:00:00      Boston Celtics    107            Miami Heat    120   
3 2012-10-31 19:00:00    Dallas Mavericks     94             Utah Jazz    113   
4 2012-10-31 19:00:00   San Antonio Spurs     99  New Orleans Pelicans     95   

   Attendance                    Arena                 Location  Capacity  \
0       20562      Quicken Loans Arena          Cleveland, Ohio     20562   
1       18997           Staples Center  Los Angeles, California     18997   
2       20296  American Airlines Arena           Miami, Florida     19600   
3       17634  Vivint Smart Home Arena     Salt Lake City, Utah     18303   
4       15358     Smoothie King Center   New Orleans, Louisiana     16867   

  Yr Arena Opened   Season  
0            1994  2012-13  
1            1992  2012-13  
2            1999  2012-13  
3            1991  2012-13  
4            1999  2012-13 

DF2

                           2012-13  2013-14  2014-15  2015-16  2016-17
Cleveland Cavaliers           1        1        2        1        3
Los Angeles Lakers            2        1        1        1        0
Miami Heat                    3        3        2        2        1
Chicago Bulls                 2        1        2        2        1
Detroit Pistons               0        0        0        1        1
Los Angeles Clippers          2        2        2        1        1
New Orleans Pelicans          0        1        1        1        1
Philadelphia 76ers            1        0        0        0        0
Phoenix Suns                  0        0        0        0        0
Portland Trail Blazers        1        2        2        0        0
Toronto Raptors               0        1        1        2        2

DF1['H_Allstars']=DF2[DF1['Season'],DF1['Home']]

This results in TypeError: 'Series' objects are mutable, thus they cannot be hashed

I understand the error; I just don't know how to get around it.

【Comments】:

    Tags: python pandas


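    【Solution 1】:

    I dropped the extra columns and kept only the ones needed for the demonstration. Note that the names are swapped relative to the question: df1 here is the All-Star table (DF2 in the question) and df2 is the games table (DF1 in the question).

    Input: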

    df1

                          Home  2012-13  2013-14  2014-15  2015-16  2016-17
    0      Cleveland Cavaliers        1        1        2        1        3
    1       Los Angeles Lakers        2        1        1        1        0
    2               Miami Heat        3        3        2        2        1
    3            Chicago Bulls        2        1        2        2        1
    4          Detroit Pistons        0        0        0        1        1
    5     Los Angeles Clippers        2        2        2        1        1
    6     New Orleans Pelicans        0        1        1        1        1
    7       Philadelphia 76ers        1        0        0        0        0
    8             Phoenix Suns        0        0        0        0        0
    9   Portland Trail Blazers        1        2        2        0        0
    10         Toronto Raptors        0        1        1        2        2
    

    df2

                  Visitor                  Home   Season
    0  Washington Wizards   Cleveland Cavaliers  2012-13
    1    Dallas Mavericks    Los Angeles Lakers  2012-13
    2      Boston Celtics            Miami Heat  2012-13
    3    Dallas Mavericks             Utah Jazz  2012-13
    4   San Antonio Spurs  New Orleans Pelicans  2012-13
    

    Step 1: Melt df1 to get the All-Stars column

    df3 = pd.melt(df1, id_vars='Home', value_vars=df1.columns[df1.columns.str.contains('20')], var_name='Season', value_name='H_Allstars')
    

    Output:

                          Home   Season   H_Allstars
    0      Cleveland Cavaliers  2012-13           1
    1       Los Angeles Lakers  2012-13           2
    2               Miami Heat  2012-13           3
    3            Chicago Bulls  2012-13           2
    4          Detroit Pistons  2012-13           0
    5     Los Angeles Clippers  2012-13           2
    6     New Orleans Pelicans  2012-13           0
    7       Philadelphia 76ers  2012-13           1
    8             Phoenix Suns  2012-13           0
    ...
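
A minimal sketch of the reshaping in Step 1, run on a toy two-team frame with values taken from the table above. The names `wide`, `season_cols`, and `long_df` are illustrative and not part of the answer. Instead of matching column names against '20', this version treats every column other than Home as a season, which assumes the frame holds no other columns.

```python
import pandas as pd

# Toy wide table: one row per team, one column per season.
wide = pd.DataFrame({
    "Home": ["Cleveland Cavaliers", "Miami Heat"],
    "2012-13": [1, 3],
    "2013-14": [1, 3],
})

# Assumption: everything except the team name is a season column.
season_cols = [c for c in wide.columns if c != "Home"]

# Wide -> long: one row per (team, season) pair.
long_df = pd.melt(
    wide,
    id_vars="Home",
    value_vars=season_cols,
    var_name="Season",
    value_name="H_Allstars",
)
print(long_df)
```

The long frame has one row per team and season, which is the shape a merge on ['Home', 'Season'] needs.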
    

    Step 2: Merge this new dataframe with df2 to get the H_Allstars column

    df4 = pd.merge(df2, df3, how='left', on=['Home', 'Season'])
    

    Output:

                  Visitor                  Home   Season  H_Allstars
    0  Washington Wizards   Cleveland Cavaliers  2012-13         1.0
    1    Dallas Mavericks    Los Angeles Lakers  2012-13         2.0
    2      Boston Celtics            Miami Heat  2012-13         3.0
    3    Dallas Mavericks             Utah Jazz  2012-13         NaN
    4   San Antonio Spurs  New Orleans Pelicans  2012-13         0.0
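
The NaN for Utah Jazz appears because that team has no row in the All-Star table, and the missing value is also why the column is shown as float. One way to surface such rows is the `indicator=True` option of `merge`. The sketch below uses toy frames named `games` and `allstars` (hypothetical names, not from the answer), and the conversion to the nullable `Int64` dtype is an optional extra rather than part of the original solution.

```python
import pandas as pd

# Toy games table: Utah Jazz is absent from the lookup table below.
games = pd.DataFrame({
    "Home": ["Cleveland Cavaliers", "Utah Jazz"],
    "Season": ["2012-13", "2012-13"],
})
allstars = pd.DataFrame({
    "Home": ["Cleveland Cavaliers"],
    "Season": ["2012-13"],
    "H_Allstars": [1],
})

# indicator=True adds a '_merge' column saying where each row came from.
merged = games.merge(allstars, how="left", on=["Home", "Season"], indicator=True)

# Rows tagged 'left_only' had no (Home, Season) match in the lookup table.
unmatched = merged.loc[merged["_merge"] == "left_only", "Home"].tolist()
print(unmatched)

# Optional: nullable integers keep whole-number counts and show gaps as <NA>.
merged["H_Allstars"] = merged["H_Allstars"].astype("Int64")
print(merged)
```

Whether to leave the gaps as missing or fill them with 0 depends on whether a missing team really means zero All-Stars, which the thread does not settle.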
    

    Step 3: Add the V_Allstars column

    # renaming column as required
    df3.rename(columns={'Home': 'Visitor', 'H_Allstars': 'V_Allstars'}, inplace=True)
    
    df5 = pd.merge(df4, df3, how='left', on=['Visitor', 'Season'])
    

    Output:

                  Visitor                  Home   Season  H_Allstars  V_Allstars
    0  Washington Wizards   Cleveland Cavaliers  2012-13         1.0         NaN
    1    Dallas Mavericks    Los Angeles Lakers  2012-13         2.0         NaN
    2      Boston Celtics            Miami Heat  2012-13         3.0         NaN
    3    Dallas Mavericks             Utah Jazz  2012-13         NaN         NaN
    4   San Antonio Spurs  New Orleans Pelicans  2012-13         0.0         NaN
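
V_Allstars is NaN in every row here because none of the visiting teams in the sample appears in the All-Star table. The sketch below is a variant of Steps 2 and 3 that avoids renaming the lookup frame in place, so the same table can be reused for both sides. The helper `add_allstars`, the frame names, the neutral `Team` and `Allstars` columns, and the second game row are all hypothetical and only serve the illustration.

```python
import pandas as pd

games = pd.DataFrame({
    "Visitor": ["Washington Wizards", "Miami Heat"],  # second row is made up
    "Home": ["Cleveland Cavaliers", "Los Angeles Lakers"],
    "Season": ["2012-13", "2012-13"],
})

# Long-format counts with neutral column names (assumed layout).
counts = pd.DataFrame({
    "Team": ["Cleveland Cavaliers", "Los Angeles Lakers", "Miami Heat"],
    "Season": ["2012-13"] * 3,
    "Allstars": [1, 2, 3],
})


def add_allstars(df, side, prefix):
    """Left-merge counts onto df, matching on `side` ('Home' or 'Visitor') and 'Season'."""
    # rename() returns a copy, so `counts` itself is left untouched.
    lookup = counts.rename(columns={"Team": side, "Allstars": f"{prefix}_Allstars"})
    return df.merge(lookup, how="left", on=[side, "Season"])


result = add_allstars(add_allstars(games, "Home", "H"), "Visitor", "V")
print(result)
```

Teams missing from `counts`, such as Washington Wizards, still come back as NaN, which matches the behaviour shown in the output above.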
    

    【Comments】:

      【Solution 2】:

      You can use pandas.melt. Convert df2 to long format, with Home and Season as columns and the All-Star count as the value, then merge it into df1 on 'Home' and 'Season'.

      import pandas as pd
      df2['Home'] = df2.index
      df2 = pd.melt(df2, id_vars = 'Home', value_vars = ['2012-13',  '2013-14', '2014-15', '2015-16', '2016-17'], var_name = 'Season', value_name='H_Allstars')
      df = df1.merge(df2, on=['Home','Season'], how='left') 
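
As an alternative to melt plus merge, and not something either answer proposes, the wide table can be turned into a Series keyed by (team, season) with `stack()` and then queried with one key per game. This is a sketch on toy data: the third game row is hypothetical, and the frames reuse the question's names DF1 and DF2 with the teams as DF2's index.

```python
import pandas as pd

# Toy version of the question's DF2: teams as index, seasons as columns.
DF2 = pd.DataFrame(
    {"2012-13": [1, 2], "2013-14": [1, 1]},
    index=["Cleveland Cavaliers", "Los Angeles Lakers"],
)
DF1 = pd.DataFrame({
    "Home": ["Los Angeles Lakers", "Utah Jazz", "Cleveland Cavaliers"],
    "Season": ["2012-13", "2012-13", "2013-14"],  # last row is made up
})

# stack() turns the wide table into a Series indexed by (team, season).
lookup = DF2.stack()

# One (Home, Season) key per game; keys missing from the table give NaN.
keys = pd.MultiIndex.from_frame(DF1[["Home", "Season"]])
DF1["H_Allstars"] = lookup.reindex(keys).to_numpy()
print(DF1)
```

The `to_numpy()` call drops the MultiIndex so the values are assigned by position rather than aligned on DF1's index. Some pandas 2.x releases emit a FutureWarning from `stack()`; the result for this data is unaffected.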
      

      【Comments】:

      • Thanks! What if I also want to add a V_Allstars column?
      • Can you explain what that column is?
      • All set, Ashish explained it.