【Question title】: Problem writing a function that adds columns for the clubs a player transfers from and to
【Posted】: 2021-08-05 22:21:24
【Question】:

I have a problem with one of my projects. I am trying to build a clear overview of football transfers, and I currently have this table:

ClubID PlayerID FromDate ToDate TeamName c_Person
1 1 2010-01-01 2012-01-01 Club A Player 1
2 1 2012-02-01 2015-02-01 Club B Player 1
3 1 2015-05-01 2018-02-01 Club C Player 1
1 2 2010-01-01 2018-02-02 Club A Player 2
1 2 2018-03-02 2020-02-01 Club A Player 2

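However, I want to add FromClub and ToClub columns. If Player 1 first played for Club A from 2010-01-01 to 2012-01-01 and then transferred to Club B, playing there from 2012-02-01 to 2015-02-01, I want "FromClub" and "ToClub" to describe that transfer.

I want the table to look like this: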

ClubID PlayerID FromDate ToDate TeamName c_Person FromClub ToClub
1 1 2010-01-01 2012-01-01 Club A Player 1 NaN NaN
2 1 2012-02-01 2015-02-01 Club B Player 1 Club A Club B
3 1 2015-05-01 2018-02-01 Club C Player 1 Club B Club C
1 2 2010-01-01 2018-02-02 Club A Player 2 NaN NaN
1 2 2018-03-02 2020-02-01 Club A Player 2 NaN NaN

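I have been trying to write a function for this, but I cannot work it out. I hope someone can help me with it.

Here is the code that creates the first table: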

import pandas as pd
from datetime import datetime

df = pd.DataFrame({'ClubID':[1, 2, 3, 1, 1],
                  'PlayerID':[1, 1, 1, 2, 2],
                  'FromDate':["2010-01-01", "2012-02-01", "2015-05-01", "2010-01-01", "2018-03-02"],
                  'ToDate':["2012-01-01", "2015-02-01", "2018-02-01", "2018-02-02", "2020-02-01"],
                  'TeamName':["Club A", "Club B", "Club C",  "Club A", "Club A"],
                  'c_Person':["Player 1", "Player 1", "Player 1", "Player 2", "Player 2"]})

# convert the 'Date' columns to datetime format
df['FromDate']= pd.to_datetime(df['FromDate'])
df['ToDate']= pd.to_datetime(df['ToDate'])

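Thanks in advance!

【Question comments】:

  • You should also explain the criteria for creating these two new columns.
  • Thanks for the tip! I have updated the description. Hope this helps.

Tags: python pandas dataframe function columnsorting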


【Solution 1】:

First, for every row in the dataframe, add the team each player belonged to before the transfer:

df['PreviousTeam'] = df.groupby('PlayerID')['TeamName'].shift()

>>> df
   ClubID    FromDate  PlayerID TeamName      ToDate  c_Person PreviousTeam
0       1  2010-01-01         1   Club A  2012-01-01  Player 1          NaN
1       2  2012-02-01         1   Club B  2015-02-01  Player 1       Club A
2       3  2015-05-01         1   Club C  2018-02-01  Player 1       Club B
3       1  2010-01-01         2   Club A  2018-02-02  Player 2          NaN
4       1  2018-03-02         2   Club A  2020-02-01  Player 2       Club A
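Note that `shift()` simply takes the value from the preceding row within each player's group, so the step above relies on the rows already being in chronological order per player, as they are in the question's data. If that is not guaranteed, a minimal sketch like the following sorts first. The out-of-order sample frame here is made up for illustration and is not part of the original question:

```python
import pandas as pd

# Hypothetical sample: Player 1's spells from the question, deliberately out of order.
df = pd.DataFrame({
    'PlayerID': [1, 1, 1],
    'FromDate': pd.to_datetime(["2015-05-01", "2010-01-01", "2012-02-01"]),
    'TeamName': ["Club C", "Club A", "Club B"],
})

# Sort chronologically within each player so that shift() returns the earlier spell.
df = df.sort_values(['PlayerID', 'FromDate']).reset_index(drop=True)

# Previous team of the same player; missing for each player's first spell.
df['PreviousTeam'] = df.groupby('PlayerID')['TeamName'].shift()
print(df)
```

Resetting the index is optional; it only keeps the row labels in the same order as the sorted rows.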

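However, if a player's next spell is at the same team, the previous team is identical to the current team (row 4). Apply the following operation to handle that case: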

df['FromClub'] = df[df['PreviousTeam'] != df['TeamName']]['PreviousTeam']

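Finally, the ToClub column can be derived from FromClub, since a non-null FromClub marks the rows where a player was actually transferred: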

df['ToClub'] = df[~df['FromClub'].isna()]['TeamName']

>>> df.drop('PreviousTeam', axis=1)
   ClubID    FromDate  PlayerID TeamName      ToDate  c_Person FromClub  ToClub
0       1  2010-01-01         1   Club A  2012-01-01  Player 1      NaN     NaN
1       2  2012-02-01         1   Club B  2015-02-01  Player 1   Club A  Club B
2       3  2015-05-01         1   Club C  2018-02-01  Player 1   Club B  Club C
3       1  2010-01-01         2   Club A  2018-02-02  Player 2      NaN     NaN
4       1  2018-03-02         2   Club A  2020-02-01  Player 2      NaN     NaN
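As an alternative way to write the two filtering steps, the same result can probably be reached with one boolean mask and `Series.where`, which keeps values where the mask is true and leaves the rest missing. This is only a sketch on a reduced version of the question's data; the names `previous_team` and `is_transfer` are introduced here for illustration:

```python
import pandas as pd

# Reduced version of the question's table: only the columns the logic needs.
df = pd.DataFrame({
    'PlayerID': [1, 1, 1, 2, 2],
    'TeamName': ["Club A", "Club B", "Club C", "Club A", "Club A"],
})

# Team from the same player's preceding row (missing for the first row per player).
previous_team = df.groupby('PlayerID')['TeamName'].shift()

# True only where a previous spell exists and it was at a different club.
is_transfer = previous_team.notna() & previous_team.ne(df['TeamName'])

# Keep values on transfer rows; all other rows become missing.
df['FromClub'] = previous_team.where(is_transfer)
df['ToClub'] = df['TeamName'].where(is_transfer)
print(df)
```

Because the previous team is held in a separate Series, no helper column has to be dropped afterwards.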

Putting it all together in a function, you only need to call the following with your dataframe to get the desired output:

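def fill_club_details(df):
    # Work on a copy so the caller's dataframe is left unchanged
    df = df.copy()
    # Team from the same player's previous row (NaN for each player's first row)
    df['PreviousTeam'] = df.groupby('PlayerID')['TeamName'].shift()
    # Keep the previous team only where it differs from the current team
    df['FromClub'] = df[df['PreviousTeam'] != df['TeamName']]['PreviousTeam']
    # A non-null FromClub marks a transfer, so the current team is the destination
    df['ToClub'] = df[~df['FromClub'].isna()]['TeamName']
    return df.drop('PreviousTeam', axis=1)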

【Comments】:

  • Thank you very much!