【Posted】: 2019-01-31 17:35:59
【Question】:
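I'm looking to build a function that ranks variables in either ascending or descending order, based on variable names defined within the function.
I am able to do the ranks manually, but I want to be able to call a function in order to streamline the code for my df. I'm looking for someone to show me how to apply such a function to both a wide and a long df. My example code is below. I want tov and minutes to be ranked in ascending order and all other columns in descending order. Ideally the function would support two options: one where I define the variable names for both the ascending and the descending variables, and another where I only define the variables to be ranked in descending order and all other columns default to ascending.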
library(tidyverse)
df <- tibble::tribble(
~Name, ~Team, ~minutes, ~ftm, ~fta, ~oreb, ~dreb, ~treb, ~ast, ~stl, ~blk, ~tov, ~pts, ~eff,
"Russell Westbrook", "OKC", 34.6, 8.8, 10.4, 1.7, 9, 10.7, 10.4, 1.6, 0.4, 5.4, 31.6, 33.8,
"James Harden", "HOU", 36.4, 9.2, 10.9, 1.2, 7, 8.1, 11.2, 1.5, 0.5, 5.7, 29.1, 32.4,
"Isaiah Thomas", "BOS", 33.8, 7.8, 8.5, 0.6, 2.1, 2.7, 5.9, 0.9, 0.2, 2.8, 28.9, 24.7,
"Anthony Davis", "NOP", 36.1, 6.9, 8.6, 2.3, 9.5, 11.8, 2.1, 1.3, 2.2, 2.4, 28, 31.1,
"DeMar DeRozan", "TOR", 35.4, 7.4, 8.7, 0.9, 4.3, 5.2, 3.9, 1.1, 0.2, 2.4, 27.3, 22.7,
"Damian Lillard", "POR", 35.9, 6.5, 7.3, 0.6, 4.3, 4.9, 5.9, 0.9, 0.3, 2.6, 27, 24.5,
"DeMarcus Cousins", "NOP", 34.2, 7.2, 9.3, 2.1, 8.9, 11, 4.6, 1.4, 1.3, 3.7, 27, 28.5,
"LeBron James", "CLE", 37.8, 4.8, 7.2, 1.3, 7.3, 8.6, 8.7, 1.2, 0.6, 4.1, 26.4, 31,
"Kawhi Leonard", "SAS", 33.4, 6.3, 7.2, 1.1, 4.7, 5.8, 3.5, 1.8, 0.7, 2.1, 25.5, 25.3,
"Stephen Curry", "GSW", 33.4, 4.1, 4.6, 0.8, 3.7, 4.5, 6.6, 1.8, 0.2, 3, 25.3, 25.2
)
df_wide <- df %>%
  mutate_at(vars(ftm, ast), funs(rank = rank(desc(.)))) %>%
  mutate_at(vars(tov, minutes), funs(rank = rank(.)))
df_wide
#> # A tibble: 10 x 18
#> Name Team minutes ftm fta oreb dreb treb ast stl blk
#> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 Russ~ OKC 34.6 8.8 10.4 1.7 9 10.7 10.4 1.6 0.4
#> 2 Jame~ HOU 36.4 9.2 10.9 1.2 7 8.1 11.2 1.5 0.5
#> 3 Isai~ BOS 33.8 7.8 8.5 0.6 2.1 2.7 5.9 0.9 0.2
#> 4 Anth~ NOP 36.1 6.9 8.6 2.3 9.5 11.8 2.1 1.3 2.2
#> 5 DeMa~ TOR 35.4 7.4 8.7 0.9 4.3 5.2 3.9 1.1 0.2
#> 6 Dami~ POR 35.9 6.5 7.3 0.6 4.3 4.9 5.9 0.9 0.3
#> 7 DeMa~ NOP 34.2 7.2 9.3 2.1 8.9 11 4.6 1.4 1.3
#> 8 LeBr~ CLE 37.8 4.8 7.2 1.3 7.3 8.6 8.7 1.2 0.6
#> 9 Kawh~ SAS 33.4 6.3 7.2 1.1 4.7 5.8 3.5 1.8 0.7
#> 10 Step~ GSW 33.4 4.1 4.6 0.8 3.7 4.5 6.6 1.8 0.2
#> # ... with 7 more variables: tov <dbl>, pts <dbl>, eff <dbl>,
#> # ftm_rank <dbl>, ast_rank <dbl>, tov_rank <dbl>, minutes_rank <dbl>
df_long <- df %>%
  gather(key = data_col, value = "stat_value", 3:14) %>%
  group_by(data_col) %>%
  mutate(rank = if_else(
    data_col %in% c("tov", "minutes"),
    rank(stat_value, ties.method = "first"),
    rank(-stat_value, ties.method = "first")
  ))
df_long
#> # A tibble: 120 x 5
#> # Groups: data_col [12]
#> Name Team data_col stat_value rank
#> <chr> <chr> <chr> <dbl> <int>
#> 1 Russell Westbrook OKC minutes 34.6 5
#> 2 James Harden HOU minutes 36.4 9
#> 3 Isaiah Thomas BOS minutes 33.8 3
#> 4 Anthony Davis NOP minutes 36.1 8
#> 5 DeMar DeRozan TOR minutes 35.4 6
#> 6 Damian Lillard POR minutes 35.9 7
#> 7 DeMarcus Cousins NOP minutes 34.2 4
#> 8 LeBron James CLE minutes 37.8 10
#> 9 Kawhi Leonard SAS minutes 33.4 1
#> 10 Stephen Curry GSW minutes 33.4 2
#> # ... with 110 more rows
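My desired output is the same as the dfs listed above. I am looking for a function that cleans up the manual if_else and the 2 lines of mutate_at code above. Let's say the function is called stat_rank. I would want the code to work as follows: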
df_wide <- df %>%
  mutate_at(vars(ftm, ast, tov, minutes), funs(rank = stat_rank(.)))
df_long <- df %>%
  gather(key = data_col, value = "stat_value", 3:14) %>%
  group_by(data_col) %>%
  mutate(rank = stat_rank(stat_value))
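One way the stat_rank helper described above might be sketched is shown here. It is an illustration under stated assumptions, not a confirmed solution. As envisioned, stat_rank(.) receives only the values, so it cannot tell which column it is ranking; this sketch therefore assumes the column name is passed as a second argument. The argument names col_name, ascending and descending are hypothetical. The three rows of values are taken from the df above (Westbrook, Thomas, Davis).

```r
# Hypothetical helper: rank a numeric vector, picking the direction from the
# column name.
# - If `descending` is supplied, those columns are ranked descending and every
#   other column ascending.
# - Otherwise, columns in `ascending` are ranked ascending and every other
#   column descending.
stat_rank <- function(x, col_name, ascending = NULL, descending = NULL,
                      ties.method = "first") {
  col_name <- col_name[1]  # inside a group the name is constant; use the first
  if (!is.null(descending)) {
    is_desc <- col_name %in% descending
  } else {
    is_desc <- !(col_name %in% ascending)
  }
  # Negating the values turns an ascending rank into a descending one
  rank(if (is_desc) -x else x, ties.method = ties.method)
}

# Three rows from the example data, just to exercise the helper
small <- data.frame(
  Name = c("Russell Westbrook", "Isaiah Thomas", "Anthony Davis"),
  tov  = c(5.4, 2.8, 2.4),
  pts  = c(31.6, 28.9, 28)
)

asc_cols <- c("tov", "minutes")
small$tov_rank <- stat_rank(small$tov, "tov", ascending = asc_cols)  # fewest turnovers ranks 1
small$pts_rank <- stat_rank(small$pts, "pts", ascending = asc_cols)  # most points ranks 1
small
```

In the wide pipeline, assuming dplyr 1.0 or later, the column name could be supplied with `cur_column()`, for example `mutate(across(c(ftm, ast, tov, minutes), ~ stat_rank(.x, cur_column(), ascending = c("tov", "minutes")), .names = "{.col}_rank"))`. In the long pipeline, after `group_by(data_col)`, the call would be `mutate(rank = stat_rank(stat_value, data_col, ascending = c("tov", "minutes")))`. Note that the sketch defaults to `ties.method = "first"`, as in the long example, whereas the wide example uses `rank()`'s default of averaging ties.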
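【Comments】:
- What is your expected output?
- "I am able to do the ranks manually, but I want to be able to call on the function in order to streamline the code for my df". You showed two code snippets. Show us the problem in the code.
- I just edited my question above to show example results, as well as how I envision the function working.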