【Question title】:Loop function through dataframe, differentiate, and then combine per column in original dataframe
【Posted】:2020-02-18 20:05:41
【Question】:

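I am trying to scrape some data from Basketball Reference using the ballr package. I want to use the NBASeasonTeamByYear function to collect season results for every team across multiple seasons. That is, I want each team's data for 2017 through 2020, and then I want to combine the resulting data frames into 2 larger data frames, one per conference.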

I started by making a data frame with each team's code and conference:

league_teams <- data.frame("team" = c("ATL", "BOS", "NJN", "CHA", "CHI", "CLE", "DAL", "DEN", 
                                  "DET", "GSW", "HOU", "IND", "LAC", "LAL", "MEM", "MIA",
                                  "MIL", "MIN", "NOH", "NYK", "OKC", "ORL", "PHI", "PHO",
                                  "POR", "SAC", "SAS", "TOR", "UTA", "WAS"), 
                       "conference" = c("East", "East", "East", "East", "East", "East", "West",
                                        "West", "East", "West", "West", "East", "West", "West",
                                        "West", "East", "East", "West", "West", "East", "West",
                                        "East", "East", "West", "West", "West", "West", "East",
                                        "West", "East"))
league_teams$team <- as.character(league_teams$team)
league_teams$conference <- as.factor(league_teams$conference)

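Now I am having trouble writing a loop that first calls the function for each unique team, using its code and the years I want, and then combines the results regardless of year but within each conference.

I started with this: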

   for (team in league_teams) {

  team_2017 <- NBASeasonTeamByYear(team = team, 2017)
  team_2017$season <- as.factor(2017)
  team_2017$team <- as.factor(team)

}

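The last few lines show that I want to add two columns, one for the respective year and one for the respective team code, and not only for 2017 but all the way through 2020. I am having trouble writing the loop itself. I assume I would later combine the results with rbind, but I am not sure how to do that while separating them by the conference recorded in the original data frame I made.

【Comments】:

    Tags: r function loops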


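    【Solution 1】:

    Consider generalizing your process in a user-defined function and passing the arguments with expand.grid (all combinations) and Map (elementwise loop):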

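    library(ballr)  # provides NBASeasonTeamByYear

    nba_df_build <- function(yr, team, conf) {
      # base::transform or dplyr::mutate
      transform(NBASeasonTeamByYear(team = team, season = yr),
                season = as.factor(yr),
                team = as.factor(team),
                conference = as.factor(conf))
    }

    # Cross only year and team: also passing conference to expand.grid would
    # pair every team with every conference entry (4 x 30 x 30 rows)
    params_df <- expand.grid(year = 2017:2020,
                             team = league_teams$team,
                             stringsAsFactors = FALSE)

    # Attach each team's own conference from the lookup table
    params_df <- merge(params_df, league_teams, by = "team")

    df_list <- Map(nba_df_build, params_df$year, params_df$team, params_df$conference)

    final_df <- do.call(rbind, df_list)
    #final_df <- dplyr::bind_rows(df_list)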
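
The pattern can be tried offline with a small sketch. It assumes a hypothetical stub, fake_season, in place of NBASeasonTeamByYear, plus a three-team lookup table with made-up contents; none of these names come from ballr. The sketch crosses year with team and then merges in the conference, so each team stays paired with its own conference.

```r
# Hypothetical stand-in for ballr::NBASeasonTeamByYear so the sketch runs
# without network access; it returns two fake rows per team-season.
fake_season <- function(team, season) {
  data.frame(game = 1:2, pts = c(100, 110))
}

# Toy lookup table with the same shape as league_teams
teams <- data.frame(team = c("ATL", "DAL", "BOS"),
                    conference = c("East", "West", "East"),
                    stringsAsFactors = FALSE)

# Build one team-season data frame and tag it with its parameters
build_one <- function(yr, team, conf) {
  df <- fake_season(team = team, season = yr)
  df$season <- yr
  df$team <- team
  df$conference <- conf
  df
}

# All year/team combinations, then each team's own conference
params <- expand.grid(year = 2017:2018, team = teams$team,
                      stringsAsFactors = FALSE)
params <- merge(params, teams, by = "team")

# Map walks the three vectors in parallel, one call per parameter row
pieces <- Map(build_one, params$year, params$team, params$conference)
combined <- do.call(rbind, pieces)

nrow(params)    # 6 parameter rows: 3 teams x 2 years
nrow(combined)  # 12 rows: 2 fake games per team-season
```

Swapping fake_season for the real function should leave the rest of the pattern unchanged, provided the real call returns one data frame per team-season.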
    

    To split the result into any set of data frames:

    # LIST OF TWO CONFERENCE DATA FRAMES
    conference_dfs <- split(final_df, final_df$conference)
    
    # LIST OF FOUR SEASON DATA FRAMES
    season_dfs <- split(final_df, final_df$season)
    
    # LIST OF THIRTY TEAM DATA FRAMES
    team_dfs <- split(final_df, final_df$team)
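
As a minimal illustration of what split returns, the sketch below uses a made-up toy data frame in place of final_df (the values are invented for the example). The result is a named list with one element per factor level, so each conference data frame can be pulled out by name.

```r
# Toy data standing in for final_df; the numbers are made up
toy <- data.frame(team = c("ATL", "DAL", "BOS", "DAL"),
                  season = factor(c(2017, 2017, 2018, 2018)),
                  conference = factor(c("East", "West", "East", "West")),
                  pts = c(101, 99, 110, 95))

# One list element per conference level
by_conf <- split(toy, toy$conference)

names(by_conf)                 # "East" "West"
east_df <- by_conf$East        # extract by name with $
west_df <- by_conf[["West"]]   # or with [[ ]]
nrow(east_df)                  # 2
```

Keeping the pieces in a list, rather than assigning each to its own variable, makes it easy to apply the same step to every conference with lapply.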
    

    【Comments】:
