【Question title】: Calculate head-to-head statistics using Pandas
【Posted】: 2020-06-15 12:15:13
【Question】:

I have a DataFrame that looks like this:

  home_team  away_team  score  home_goals  away_goals  winner
1  Arsenal    Chelsea    3-0        3          0       Arsenal
2  ManCity    Arsenal    1-1        1          1       draw
3  Chelsea    Arsenal    2-1        2          1       Chelsea
4  Arsenal    Chelsea    5-5        5          5       draw
5  Arsenal    ManCity    1-2        1          2       ManCity

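My question is: how can I calculate Arsenal's head-to-head record (wins, draws, losses) against each other team?

A possible desired result could look like this: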

   team      opponent  games_played  wins  draws  losses  goals_scored  goals_conceded
1  Arsenal   Chelsea        3          1     1      1          9              7
2  Arsenal   ManCity        2          0     1      1          2              3

Any help is much appreciated. Note that the DataFrame is not real data (in case any Premier League experts are lurking)!

【Discussion】:

    Tags: python python-3.x pandas dataframe pandas-groupby


    【Solution 1】:

    Try this code:

    import pandas as pd
    
    df_in = pd.read_csv('data.csv')
    df_out = pd.DataFrame(columns = ['team', 'opponent', 'games_played', 'wins', 'draws', 'losses', 'goals_scored', 'goals_conceded'])
    
    team = 'Arsenal'
    
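    for index, row in df_in.iterrows():
        # Skip matches that do not involve the chosen team; without this
        # guard they would be counted as away games in the else branch.
        if team not in (row['home_team'], row['away_team']):
            continue
        if row['home_team'] == team: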
            opponent = row['away_team']
            if row['home_goals'] > row['away_goals']:
                win = 1
                draw = 0
                loss = 0
            elif row['home_goals'] < row['away_goals']:
                win = 0
                draw = 0
                loss = 1
            else:
                win = 0
                draw = 1
                loss = 0
            goals_scored = row['home_goals']
            goals_conceded = row['away_goals']
        else:
            opponent = row['home_team']
            if row['home_goals'] > row['away_goals']:
                win = 0
                draw = 0
                loss = 1
            elif row['home_goals'] < row['away_goals']:
                win = 1
                draw = 0
                loss = 0
            else:
                win = 0
                draw = 1
                loss = 0
            goals_scored = row['away_goals']
            goals_conceded = row['home_goals']
    
        # Each row represents exactly one match
        games_played = 1
    
        if opponent not in df_out['opponent'].unique():
            match = pd.DataFrame({'team': team,
                                  'opponent': opponent,
                                  'games_played': games_played,
                                  'wins': win,
                                  'draws': draw,
                                  'losses': loss,
                                  'goals_scored': goals_scored,
                                  'goals_conceded': goals_conceded},
                                 index = [0])
            df_out = pd.concat([df_out, match], ignore_index = True)
        else:
            df_out.loc[df_out['opponent'] == opponent, 'games_played'] += games_played
            df_out.loc[df_out['opponent'] == opponent, 'wins'] += win
            df_out.loc[df_out['opponent'] == opponent, 'draws'] += draw
            df_out.loc[df_out['opponent'] == opponent, 'losses'] += loss
            df_out.loc[df_out['opponent'] == opponent, 'goals_scored'] += goals_scored
            df_out.loc[df_out['opponent'] == opponent, 'goals_conceded'] += goals_conceded
    

    This code loads your data as df_in and builds df_out with the statistics you asked for.
    Output:

          team opponent games_played wins draws losses goals_scored goals_conceded
    0  Arsenal  Chelsea            3    1     1      1            9              7
    1  Arsenal  ManCity            2    0     1      1            2              3
    

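    【Comments】:

    • Awesome, thanks Andrea. Do you know whether this could be broken down further (e.g. home_games_played, away_games_played, games_played ...., home_losses, away_losses)?
    • I think it would be better to ask that as a separate question.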
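    【Solution 2】:

    First, duplicate the data and swap the home/away teams to get statistics in the team/opponent layout you want.

    This is needed because every match has to be counted twice, once for each side (the winner and the loser). Copy the df, swap the home and away fields in the copy, then stack the two DataFrames with pd.concat.

    Now you can aggregate.

    Group by home team, away team, and winner, and compute the row count and the goal totals in this step. Use df.groupby(dimensions).agg(metrics).

    Next, reset the index so that winner is a regular column again. Use df.reset_index(inplace=True) for this.

    Once you have that, create the new columns win, loss, and draw by comparing the winner column against the home_team column or against the static string "draw".

    Finally, aggregate the df once more and sum the win/loss/draw columns.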
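
    The steps above can be sketched roughly as follows. This is a variation on the description, not the answerer's own code: it derives the win/draw/loss flags from the goal columns instead of the winner column, so a single groupby is enough. The DataFrame is rebuilt from the sample in the question, and the names home, away, long, stats and arsenal are chosen here purely for illustration.

```python
import pandas as pd

# Sample data copied from the question (the score and winner columns are
# omitted because this sketch derives the result from the goal columns).
df = pd.DataFrame({
    "home_team":  ["Arsenal", "ManCity", "Chelsea", "Arsenal", "Arsenal"],
    "away_team":  ["Chelsea", "Arsenal", "Arsenal", "Chelsea", "ManCity"],
    "home_goals": [3, 1, 2, 5, 1],
    "away_goals": [0, 1, 1, 5, 2],
})

# View every match from the home side's perspective ...
home = df.rename(columns={
    "home_team": "team", "away_team": "opponent",
    "home_goals": "goals_scored", "away_goals": "goals_conceded",
})
# ... and again from the away side's perspective, with the fields flipped.
away = df.rename(columns={
    "away_team": "team", "home_team": "opponent",
    "away_goals": "goals_scored", "home_goals": "goals_conceded",
})
# Stack both views: each match now appears once per participating team.
long = pd.concat([home, away], ignore_index=True)

# Per-row result flags, always from the point of view of "team".
long["wins"] = (long["goals_scored"] > long["goals_conceded"]).astype(int)
long["draws"] = (long["goals_scored"] == long["goals_conceded"]).astype(int)
long["losses"] = (long["goals_scored"] < long["goals_conceded"]).astype(int)

# One aggregation per (team, opponent) pair.
stats = (
    long.groupby(["team", "opponent"])
        .agg(
            games_played=("wins", "count"),
            wins=("wins", "sum"),
            draws=("draws", "sum"),
            losses=("losses", "sum"),
            goals_scored=("goals_scored", "sum"),
            goals_conceded=("goals_conceded", "sum"),
        )
        .reset_index()
)

# Keep only the rows for the team of interest.
arsenal = stats[stats["team"] == "Arsenal"].reset_index(drop=True)
print(arsenal)
```

    Because stats holds every (team, opponent) pair, the same table can be filtered for any other team without rerunning the aggregation.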

    【Comments】:
