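【Question title】: Pandas Groupby Season and Year Average Column
【Posted】: 2020-07-30 17:37:55
【Question】:

I have a df, "ncData", shown below. I am trying to group the data by season (winter, spring, summer, autumn) and take the mean of the wind_speed and power columns over the months of each season, per year, for each windfarm_name. Here are the first rows of ncData: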

ncData.head(2)
Out[432]: 
     site_name windfarm_name region_name                      time  \
4055     REDCK    Red Creek   Northeast 2019-12-28 20:00:00+00:00   
4056     REDCK    Red Creek   Northeast 2019-12-28 19:00:00+00:00   

      wind_speed    power       Dates     Hours  year month day  Season  
4055     5.89692  23.9702  2019-12-28  20:00:00  2019    12  28  Winter  
4056     4.75525  13.8225  2019-03-28  19:00:00  2019     3  28  Spring 

I tried something like this:

ncData.groupby([pd.Grouper(key='Season', freq='1Y'),pd.Grouper(key='windfarm_name')]).mean()

which raises this error:

TypeError: Only valid with DatetimeIndex, TimedeltaIndex or PeriodIndex, but got an instance of 'Index'

I also tried this:

ncData.groupby(['Season','windfarm_name'],freq='1Y')['wind_speed'].mean()

I need the output to look like this:

         time       windfarm_name  season         wind_speed power
0    1991          Red Creek      winter         3.917762   8.276560
1    1991          Red Creek      spring         3.046854   0.132271
2    1991          Red Creek      summer         3.737426   6.799836
3    1991          Red Creek      autumn         3.870350   4.010200
4    1991         Oasis Wind      winter         2.955412   2.898962
5    1991         Oasis Wind      spring         2.707168   0.076643

Thanks!

【Question comments】:

    Tags: pandas pandas-groupby average


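    【Solution 1】:

    You almost had it. Group on the existing year, windfarm_name and Season columns, then select the value columns with a list (double brackets):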

    ncData.groupby(['year', 'windfarm_name', 'Season'])[['wind_speed', 'power']].mean()
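
As a minimal sketch of the call above, the snippet below builds a small stand-in for ncData (the rows and values are invented for illustration, not taken from the question) and passes `as_index=False` so the group keys stay as ordinary columns, which gives a flat table shaped like the requested output. The name `seasonal` is hypothetical.

```python
import pandas as pd

# Toy stand-in for ncData; the values are made up for illustration only.
ncData = pd.DataFrame({
    "windfarm_name": ["Red Creek", "Red Creek", "Red Creek", "Oasis Wind"],
    "year": [2019, 2019, 2019, 2019],
    "Season": ["Winter", "Winter", "Spring", "Winter"],
    "wind_speed": [4.0, 6.0, 3.0, 2.0],
    "power": [10.0, 20.0, 5.0, 1.0],
})

# as_index=False keeps year, windfarm_name and Season as regular columns
# instead of moving them into a MultiIndex.
seasonal = ncData.groupby(
    ["year", "windfarm_name", "Season"], as_index=False
)[["wind_speed", "power"]].mean()

print(seasonal)
```

Calling `.reset_index()` on the result of the original one-liner should give the same flat layout.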
    

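    Note that you do not need to split the time column into year, month and day columns. Just make sure it has a datetime dtype and use the .dt accessor: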

    ncData.groupby([ncData['time'].dt.year, 'windfarm_name', 'Season'])[['wind_speed', 'power']].mean()
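
A hedged sketch of grouping on the time column directly: it assumes time may arrive as strings, so it converts with `pd.to_datetime` first, then groups on the year taken from the `.dt` accessor. The sample rows and the name `by_year` are illustrative assumptions.

```python
import pandas as pd

# Toy stand-in for ncData with time stored as strings (an assumption).
ncData = pd.DataFrame({
    "windfarm_name": ["Red Creek", "Red Creek", "Red Creek"],
    "time": [
        "2019-12-28 20:00:00+00:00",
        "2019-12-28 19:00:00+00:00",
        "2020-01-05 10:00:00+00:00",
    ],
    "Season": ["Winter", "Winter", "Winter"],
    "wind_speed": [4.0, 6.0, 8.0],
    "power": [10.0, 20.0, 30.0],
})

# Ensure a datetime dtype; the .dt accessor only works on datetime-like data.
ncData["time"] = pd.to_datetime(ncData["time"], utc=True)

# A Series and column names can be mixed in the list of group keys.
# rename("year") gives the resulting index level a readable name.
by_year = ncData.groupby(
    [ncData["time"].dt.year.rename("year"), "windfarm_name", "Season"]
)[["wind_speed", "power"]].mean()

print(by_year)
```

Grouping by calendar year puts December 2019 and January 2020 into different Winter groups, as the toy data shows. Whether that matches the intended definition of a season is something to check against your own requirements.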
    

    【Comments】:

    • You may want to convert Season to an ordered Categorical type so that the results sort in season order (Winter, Spring, ...) rather than alphabetically.
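
The ordering suggestion in the comment could be sketched as follows. The category labels in `season_order` are assumed to match the spelling used in the Season column, and the data values are invented for illustration.

```python
import pandas as pd

# Assumed season labels, in the order they should appear in the result.
season_order = ["Winter", "Spring", "Summer", "Autumn"]

# Toy stand-in for ncData; rows are deliberately in alphabetical season order.
ncData = pd.DataFrame({
    "windfarm_name": ["Red Creek"] * 4,
    "year": [2019] * 4,
    "Season": ["Autumn", "Spring", "Summer", "Winter"],
    "wind_speed": [3.9, 3.0, 3.7, 3.9],
    "power": [4.0, 0.1, 6.8, 8.3],
})

# An ordered Categorical sorts by category position, not alphabetically.
ncData["Season"] = pd.Categorical(
    ncData["Season"], categories=season_order, ordered=True
)

# observed=True limits the result to category combinations present in the data.
seasonal = (
    ncData.groupby(["year", "windfarm_name", "Season"], observed=True)[
        ["wind_speed", "power"]
    ]
    .mean()
    .reset_index()
)

print(seasonal)
```

Without the conversion, the same groupby would list the seasons as Autumn, Spring, Summer, Winter.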