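We can reproduce the problem with the mtcars data frame. In the code below, the third mutate() statement sets wt to High in every row, because after the first mutate() the wt column is a character vector.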
library(dplyr)
data(mtcars)
mtcars <- mutate(mtcars,wt = ifelse(wt < 2.6,"Low", wt))
# at this point, wt is character
str(mtcars$wt)
chr [1:32] "2.62" "2.875" "Low" "3.215" "3.44" "3.46" "3.57" "3.19" "3.15" ...
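By the time the third mutate() runs, the comparison wt > 3.61 is a string comparison. The strings Low and Medium both compare greater than the number 3.61 (which is coerced to the string "3.61"), so every row meets the TRUE condition of ifelse().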
mtcars <- mutate(mtcars, wt = ifelse( 2.6 <= wt & wt <= 3.61,"Medium",wt))
mtcars <- mutate(mtcars, wt = ifelse( wt > 3.61,"High",wt))
...and the output:
> head(mtcars)
                   mpg cyl disp  hp drat   wt  qsec vs am gear carb
Mazda RX4         21.0   6  160 110 3.90 High 16.46  0  1    4    4
Mazda RX4 Wag     21.0   6  160 110 3.90 High 17.02  0  1    4    4
Datsun 710        22.8   4  108  93 3.85 High 18.61  1  1    4    1
Hornet 4 Drive    21.4   6  258 110 3.08 High 19.44  1  0    3    1
Hornet Sportabout 18.7   8  360 175 3.15 High 17.02  0  0    3    2
Valiant           18.1   6  225 105 2.76 High 20.22  1  0    3    1
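To see why the last comparison is TRUE for every row, here is a minimal base R sketch of the coercion involved. The sample vector wt_chr is made up for illustration; the result assumes a common locale in which digits sort before letters.

```r
# Comparing a character vector with a number coerces the number to a
# string, so the comparison is lexicographic rather than numeric.
wt_chr <- c("2.62", "Low", "Medium", "5.25")  # hypothetical sample values

res <- wt_chr > 3.61  # evaluated as wt_chr > "3.61"
print(res)

# The same result as an explicit string comparison
print(identical(res, wt_chr > "3.61"))

# Converting back to numeric shows what was lost: the labels become NA
wt_num <- suppressWarnings(as.numeric(wt_chr))
print(wt_num)
```

Only "2.62" fails the test here. In the mtcars example no such value is left by the third mutate(), because everything in the 2.6 to 3.61 range has already been replaced by Medium.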
We can prevent this behavior by using case_when(), which makes all of the comparisons against the numeric version of wt in a single pass over the data.
# use case_when()
data(mtcars)
mtcars %>%
  mutate(wt = case_when(
    wt < 2.6 ~ "Low",
    wt >= 2.6 & wt <= 3.61 ~ "Medium",
    wt > 3.61 ~ "High"
  )) %>%
  head()
...and the output:
                   mpg cyl disp  hp drat     wt  qsec vs am gear carb
Mazda RX4         21.0   6  160 110 3.90 Medium 16.46  0  1    4    4
Mazda RX4 Wag     21.0   6  160 110 3.90 Medium 17.02  0  1    4    4
Datsun 710        22.8   4  108  93 3.85    Low 18.61  1  1    4    1
Hornet 4 Drive    21.4   6  258 110 3.08 Medium 19.44  1  0    3    1
Hornet Sportabout 18.7   8  360 175 3.15 Medium 17.02  0  0    3    2
Valiant           18.1   6  225 105 2.76 Medium 20.22  1  0    3    1
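As a quick sanity check on the case_when() approach, the categories can be tallied over all 32 rows rather than just the first six. This is a sketch using dplyr's count(); the intermediate name wt_counts is only for illustration.

```r
library(dplyr)

data(mtcars)

# Recode wt in one pass, then count the rows in each category
wt_counts <- mtcars %>%
  mutate(wt = case_when(
    wt < 2.6 ~ "Low",
    wt >= 2.6 & wt <= 3.61 ~ "Medium",
    wt > 3.61 ~ "High"
  )) %>%
  count(wt)

print(wt_counts)
```

With the built-in mtcars data this gives 8 High, 8 Low and 16 Medium rows, so all three categories survive, unlike with the chained ifelse() version.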
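From the comments on this answer, it was not clear to the OP how to save the changed column back to the existing data frame. The code snippet below addresses this.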
data(mtcars)
# assign the result back to mtcars to keep the recoded column
mtcars <- mtcars %>%
  mutate(wt = case_when(
    wt < 2.6 ~ "Low",
    wt >= 2.6 & wt <= 3.61 ~ "Medium",
    wt > 3.61 ~ "High"
  ))
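If the numeric weights are still needed later, one option is to write the categories to a separate column instead of overwriting wt. The sketch below assumes that is acceptable for the OP's use case; the column name wt_class is hypothetical.

```r
library(dplyr)

data(mtcars)

# wt_class is a hypothetical column name; wt itself stays numeric
mtcars <- mtcars %>%
  mutate(wt_class = case_when(
    wt < 2.6 ~ "Low",
    wt >= 2.6 & wt <= 3.61 ~ "Medium",
    wt > 3.61 ~ "High"
  ))

# Confirm the types of both columns
str(mtcars[c("wt", "wt_class")])
```

This keeps later numeric comparisons on wt safe, since the character labels never replace the original values.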