【Question title】: How to create a win/loss record from a data set in R
【Posted】: 2022-01-17 10:51:11
【Question】:

I have a set of fantasy football data that I'm trying to work out win/loss records from. The data is structured as follows:

team <- c("Mary", "John", "Matt","Paul","Mary", "John", "Matt","Paul")
week <- c(1,1,1,1,2,2,2,2)
opponent <- c("John", "Mary" , "Paul" , "Matt" , "Paul" , "Matt" , "John" , "Mary")
team.score <- c(10,15,6,7,8,12,2,3)
df <- data.frame(team,week,opponent,team.score)
head(df)

team week opponent team.score
Mary    1     John         10
John    1     Mary         15
Matt    1     Paul          6
Paul    1     Matt          7
Mary    2     Paul          8
John    2     Matt         12

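What I'd like to be able to do is say that Mary's record is 1 - 1. I'm not sure how to make the comparison given the way the data is structured: each row holds only one team's score, and the opponent's score sits in a different row.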

【Comments】:

  • Can you explain, based on your sample data, how Mary ends up with a 1-1 record? I don't follow the logic.

Tags: r dplyr


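【Solution 1】:

The first step I took was to add the opponent's score as its own column. The current data structure makes this a bit tricky, but here's what I did: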

library(dplyr)

# Separate lookup table: week, opponent and score from every row
opponent_scores <- df %>%
  select(week, opponent, team.score)

# Self-join: match each team to the row where it is listed as the opponent
# in the same week, which brings in the other side's score
new_df <- df %>%
  inner_join(opponent_scores, by = c("team" = "opponent", "week" = "week")) %>%
  rename(team_score = team.score.x,
         opponent_score = team.score.y)

  team week opponent team_score opponent_score
1 Mary    1     John         10             15
2 John    1     Mary         15             10
3 Matt    1     Paul          6              7
4 Paul    1     Matt          7              6
5 Mary    2     Paul          8              3
6 John    2     Matt         12              2
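
As a variation on the join above, here is a minimal sketch that renames the score columns before joining, so no `.x`/`.y` suffixes need cleaning up afterwards. It matches each row's `opponent` to that opponent's own `team` row in the same week. The names `score_lookup` and `joined` are made up for this illustration, and the sketch assumes every team has exactly one game per week.

```r
library(dplyr)

# Sample data from the question
df <- data.frame(
  team = c("Mary", "John", "Matt", "Paul", "Mary", "John", "Matt", "Paul"),
  week = c(1, 1, 1, 1, 2, 2, 2, 2),
  opponent = c("John", "Mary", "Paul", "Matt", "Paul", "Matt", "John", "Mary"),
  team.score = c(10, 15, 6, 7, 8, 12, 2, 3)
)

# Lookup table: each team's score per week, already named opponent_score
# (score_lookup is a hypothetical name, not from the original answer)
score_lookup <- df %>%
  select(week, team, opponent_score = team.score)

# Match each row's opponent to that opponent's own row in the same week
joined <- df %>%
  rename(team_score = team.score) %>%
  inner_join(score_lookup, by = c("week" = "week", "opponent" = "team"))

# Sanity check: every original row found exactly one match
stopifnot(nrow(joined) == nrow(df))
```

If a team could play more than once in a week, the join would need an extra key (a game id, for example) to avoid duplicated rows.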

Next, add the win/loss logic:

#add in W/L logic
new_df <- new_df %>%
  mutate(win = if_else(team_score > opponent_score, 1, 0),
         loss = if_else(team_score < opponent_score, 1, 0))

  team week opponent team_score opponent_score win loss
1 Mary    1     John         10             15   0    1
2 John    1     Mary         15             10   1    0
3 Matt    1     Paul          6              7   0    1
4 Paul    1     Matt          7              6   1    0
5 Mary    2     Paul          8              3   1    0
6 John    2     Matt         12              2   1    0
7 Matt    2     John          2             12   0    1
8 Paul    2     Mary          3              8   0    1
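
Note that with the two `if_else()` calls above, a tied game counts as neither a win nor a loss. If ties matter in your league, one option is a single result column built with `case_when()`. This is only a sketch: the `scores` data below is hypothetical (the question's data contains no ties), and `result` is an illustrative column name.

```r
library(dplyr)

# Hypothetical scores, including one tie (not from the question's data)
scores <- data.frame(
  team = c("Mary", "John", "Matt"),
  team_score = c(10, 15, 7),
  opponent_score = c(15, 10, 7)
)

# Label each game as a win, loss or tie
results <- scores %>%
  mutate(result = case_when(
    team_score > opponent_score ~ "W",
    team_score < opponent_score ~ "L",
    TRUE ~ "T"  # everything else; this would also catch missing scores
  ))
```

Counting `result` per team (for example with `count(team, result)`) then gives wins, losses and ties separately.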

Finally, a quick group-by on the team name gives you the "official" W/L record:

# Total wins and losses per team, best record first
aggregated_df <- new_df %>%
  group_by(team) %>%
  dplyr::summarise(win = sum(win), loss = sum(loss)) %>%
  arrange(desc(win))

  team    win  loss
  <chr> <dbl> <dbl>
1 John      2     0
2 Mary      1     1
3 Paul      1     1
4 Matt      0     2
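
Since the question asks for a record written as "1 - 1", here is a small sketch that pastes the two totals into one string. The data frame is rebuilt by hand from the summary table above so the snippet runs on its own; `record` and `with_record` are illustrative names.

```r
library(dplyr)

# Win/loss totals, copied from the summary table above
aggregated_df <- data.frame(
  team = c("John", "Mary", "Paul", "Matt"),
  win = c(2, 1, 1, 0),
  loss = c(0, 1, 1, 2)
)

# Combine wins and losses into a single "W - L" string per team
with_record <- aggregated_df %>%
  mutate(record = paste(win, loss, sep = " - "))
```

Looking up `with_record$record[with_record$team == "Mary"]` then returns the string "1 - 1".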

Hope you do well in your fantasy season, haha!

【Comments】:

  • This is fantastic, thank you!
【Solution 2】:

It's messy, but it works.

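library(dplyr)
library(tidyr)

df %>%
  rowwise() %>%
  # Matchup key that is identical for both teams, e.g. "John-Mary"
  mutate(key = paste0(sort(c(team, opponent)), collapse = "-")) %>%
  # Within each matchup, rank the two scores: 0 = loser, 1 = winner
  group_by(week, key) %>%
  mutate(a = rank(team.score) - 1) %>%
  # Count how often each team finished with each rank
  group_by(team, a) %>%
  summarize(n = n(), .groups = "drop") %>%
  pivot_wider(id_cols = team, names_from = a, values_from = n, names_glue = "score_{a}") %>%
  replace(is.na(.), 0)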

  team  score_1 score_0
  <chr>   <dbl>   <dbl>
1 John        2       0
2 Mary        1       1
3 Matt        0       2
4 Paul        1       1
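
Here `score_1` is the number of wins and `score_0` the number of losses. To see why, the base-R sketch below reproduces the two building blocks of the pipeline on a single matchup. The variable names are made up for illustration, and the tied scores are hypothetical.

```r
# Sorting the two names gives both teams in a matchup the same key
key <- paste0(sort(c("Mary", "John")), collapse = "-")  # "John-Mary"

# Two scores from one matchup: the lower score ranks 1, the higher ranks 2,
# so subtracting 1 leaves 0 for the loser and 1 for the winner
decided <- rank(c(10, 15)) - 1

# A tie: rank() averages tied ranks by default (ties.method = "average"),
# so both teams would get 0.5
tied <- rank(c(7, 7)) - 1
```

One consequence is that a tied game would show up in the pivoted result as a separate `score_0.5` column rather than as a win or a loss.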

【Comments】:
