【Question title】:How can I find the sum of certain columns and the average of other columns in Python?
【Posted】:2020-06-23 22:27:27
【Question】:

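I have combined 10 Excel files, each holding 1 year of NFL passing stats, and I have summed certain columns (games played, completions, attempts, etc.), but for others (passer rating and QBR) I want to see the average instead.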

df3 = df3.groupby(['Player'], as_index=False).agg({'GS':'sum','Cmp':'sum','Att':'sum','Cmp%':'sum','Yds':'sum','TD':'sum','TD%':'sum','Int':'sum','Int%':'sum','Y/A':'sum','AY/A':'sum','Y/C':'sum','Y/G':'sum','Rate':'sum','QBR':'sum','Sk':'sum','Yds.1':'sum','NY/A':'sum','ANY/A':'sum','Sk%':'sum','4QC':'sum','GWD':'sum'})

【Question comments】:

    Tags: python-3.x sum average


    【Solution 1】:

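    Quick note: don't attach photos of your code, dataset, errors, etc. Provide the actual code and the actual dataset (or a sample of it) so that others can reproduce the error or problem. Nobody is really going to take the time to build a dataset from a photo (or I should say very few will, since I did... because I like working with sports data, and I could grab it quickly).

    But to get the average instead of the sum, you can use 'mean'. Also, in your code, why are you summing the percentages?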

    import pandas as pd
    
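    # Read one passing table per season (2010 through 2019), then stack them.
    # DataFrame.append was removed in pandas 2.0, so collect and concat instead.
    frames = []
    for season in range(2010, 2020):
        url = 'https://www.pro-football-reference.com/years/{season}/passing.htm'.format(season=season)
        frames.append(pd.read_html(url)[0])
    df = pd.concat(frames, sort=False)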
    
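    # Drop the header rows that are repeated inside the table body.
    df = df[df['Rk'] != 'Rk']
    df = df.reset_index(drop=True)
    # Keep only letters, spaces and periods in player names (drops markers such as '*' and '+').
    df['Player'] = df['Player'].str.replace('[^a-zA-Z .]', '', regex=True)
    df['Player'] = df['Player'].str.strip()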
    
    
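    # Every column that is not text gets converted to a numeric dtype.
    strCols = ['Player', 'Tm', 'Pos', 'QBrec']
    numCols = [x for x in df.columns if x not in strCols]

    # Split the quarterback record (wins-losses-ties) into separate columns.
    df[['QB_wins', 'QB_loss', 'QB_ties']] = df['QBrec'].str.split('-', expand=True)

    df[numCols] = df[numCols].apply(pd.to_numeric)

    # Sum the counting stats, average the rating.
    df3 = df.groupby(['Player'], as_index=False).agg({'GS': 'sum', 'TD': 'sum', 'QBR': 'mean'})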
    

    Output:

    print(df3)
                     Player   GS   TD         QBR
    0           A.J. Feeley    3    1   27.300000
    1         A.J. McCarron    4    6         NaN
    2         Aaron Rodgers  142  305   68.522222
    3           Ace Sanders    4    1  100.000000
    4          Adam Podlesh    0    0    0.000000
    5         Albert Wilson    7    1   99.700000
    6         Alex Erickson    6    0         NaN
    7            Alex Smith  121  156   55.122222
    8           Alex Tanney    0    1   42.900000
    9          Alvin Kamara    9    0         NaN
    10          Andrew Beck    6    0         NaN
    11          Andrew Luck   86  171   62.766667
    12          Andy Dalton  133  204   53.375000
    13             Andy Lee    0    0    0.000000
    14        Anquan Boldin   32    0   11.600000
    15       Anthony Miller    4    0   81.200000
    16      Antonio Andrews   10    1  100.000000
    17        Antonio Brown   55    1   29.300000
    18     Antonio Morrison    4    0         NaN
    19    Antwaan Randle El    0    2  100.000000
    20         Arian Foster   13    1  100.000000
    21      Armanti Edwards    0    0   41.466667
    22         Austin Davis   10   13   38.150000
    23         B.J. Daniels    0    0         NaN
    24       Baker Mayfield   29   49   53.200000
    25   Ben Roethlisberger  130  236   66.833333
    26        Bernard Scott    1    0    5.600000
    27         Bilal Powell   12    0   17.700000
    28          Billy Volek    0    0   89.400000
    29       Blaine Gabbert   48   48   37.687500
    ..                  ...  ...  ...         ...
    329           Tim Boyle    0    0         NaN
    330         Tim Masthay    0    1    5.700000
    331           Tim Tebow   16   17   42.733333
    332         Todd Bouman    1    2   57.400000
    333        Todd Collins    1    0    0.800000
    334           Tom Brady  156  316   72.755556
    335     Tom Brandstater    0    0    0.000000
    336          Tom Savage    9    5   38.733333
    337           Tony Pike    0    0    2.500000
    338           Tony Romo   72  141   71.185714
    339      Travaris Cadet    1    0         NaN
    340     Travis Benjamin    8    0    1.700000
    341        Travis Kelce   15    0    1.800000
    342       Trent Edwards    3    2   98.100000
    343           Tress Way    0    0         NaN
    344      Trevone Boykin    0    1   66.400000
    345      Trevor Siemian   25   30   40.750000
    346          Troy Smith    6    5   38.500000
    347          Tyler Boyd   14    0    2.800000
    348          Tyler Bray    0    0    0.000000
    349         Tyler Palko    4    2   56.600000
    350       Tyler Thigpen    1    2   30.233333
    351         Tyreek Hill   13    0    0.000000
    352        Tyrod Taylor   46   54   51.242857
    353         Vince Young   11   14   50.850000
    354          Will Grier    2    0         NaN
    355        Willie Snead    4    1  100.000000
    356   Zach Mettenberger   10   12   24.600000
    357         Zach Pascal   13    0         NaN
    358           Zay Jones   15    0    0.000000
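
For a table with as many columns as the one in the question, it may be easier to build the aggregation mapping from two lists than to type every key by hand. The following is only a rough sketch on made-up toy data: the names `sum_cols`, `mean_cols`, `agg_map` and `totals` are illustrative, and the column choices are assumptions about which stats are counts and which are ratings. It also shows one way to handle a percentage column: recompute it from the summed counts instead of summing or averaging it.

```python
import pandas as pd

# Toy data standing in for the combined season tables (values are made up).
df = pd.DataFrame({
    'Player': ['A', 'A', 'B', 'B'],
    'GS':     [16, 14, 10, 12],
    'Cmp':    [300, 280, 200, 250],
    'Att':    [450, 400, 350, 400],
    'Rate':   [95.0, 101.0, 80.0, 88.0],
    'QBR':    [60.0, 70.0, 45.0, 55.0],
})

sum_cols = ['GS', 'Cmp', 'Att']   # counting stats: add them up
mean_cols = ['Rate', 'QBR']       # rating stats: average them

# Build the aggregation mapping instead of typing every column by hand.
agg_map = {col: 'sum' for col in sum_cols}
agg_map.update({col: 'mean' for col in mean_cols})

totals = df.groupby('Player', as_index=False).agg(agg_map)

# Recompute the percentage from the summed counts rather than summing it.
totals['Cmp%'] = (totals['Cmp'] / totals['Att'] * 100).round(1)

print(totals)
```

Note that a plain 'mean' gives every season the same weight, regardless of how many attempts the player had that year.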
    

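    【Comments】:

    • Awesome. Thanks for your help!! This is my first attempt at a project and I was just lost, but I love the challenge. Yes, I hadn't noticed that the percentages need adjusting too. Good catch. How did you get to that URL? I tried copying it, editing it, and searching for it in Google, but I still couldn't reach the page... Python can access it, but I'd like to see it on the web too. I can only assume it's because I don't have a login... but could it really be that simple? I had just saved it to an Excel file.
    • There is a variable in the URL so that I can iterate over multiple pages/seasons. So you need to fill that in to reach a specific page in your browser: pro-football-reference.com/years/2018/passing.htm, then pro-football-reference.com/years/2019/passing.htm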
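
To make the point about the URL variable concrete, here is a small sketch that fills the `{season}` placeholder for a range of years and prints addresses that can be pasted into a browser. The helper name `season_urls` and the constant `URL_TEMPLATE` are made up for illustration; only the URL pattern itself comes from the answer.

```python
# URL pattern from the answer; {season} is the placeholder to fill in.
URL_TEMPLATE = 'https://www.pro-football-reference.com/years/{season}/passing.htm'


def season_urls(first, last):
    """Return one passing-stats URL per season from first to last, inclusive."""
    return [URL_TEMPLATE.format(season=season) for season in range(first, last + 1)]


urls = season_urls(2018, 2019)
for url in urls:
    print(url)
```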