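【Posted】: 2020-04-14 05:31:55
【Problem description】:
I am looking for a way to extract the mode of a column ("meteo2") by multiple groups ("season", "meteo"). All of these are factors in my data frame "mydf". My test code is below, but it fails with an error message. With a single group, "season", it works. All three columns contain "NA" values. I am not sure which part of my code is the problem. Any help is very welcome!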
str(mydf$season)
Factor w/ 4 levels "Spring","Summer",...:
str(mydf$meteo)
Factor w/ 7 levels "<40","<50","<60",..:
str(mydf$meteo2)
Factor w/ 4 levels "E","N","S","W":
# mode function
Mode = function(x) {
  ta = table(x)
  tam = max(ta)
  if (all(ta == tam))
    mod = NA
  else if (is.numeric(x))
    mod = as.numeric(names(ta)[ta == tam])
  else
    mod = names(ta)[ta == tam]
  return(mod)
}
# extracting mode
dataSummary <- mydf %>%
  select(season, meteo, meteo2) %>%
  mutate(meteo = forcats::fct_explicit_na(meteo)) %>%
  group_by(meteo, season) %>%
  summarise(m = Mode(meteo2))
dataSummary
error : Column `m` can't promote group 30 to character
Here is my sample data.
dput(head(mydf_sample))
structure(list(season = structure(c(3L, 3L, 3L, 3L, 3L, 3L), .Label = c("Spring",
"Summer", "Fall", "Winter"), class = "factor"), meteo2 = structure(c(2L,
2L, 2L, 1L, 2L, 2L), .Label = c("E", "N", "S", "W"), class = "factor"),
meteo = structure(c(6L, 6L, 6L, 6L, 7L, 7L), .Label = c("<40",
"<50", "<60", "<70", "<75", "<80", "80+"), class = "factor")), .Names = c("season",
"meteo2", "meteo"), row.names = c(NA, 6L), class = "data.frame")
【Discussion】:
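-
Can you add the data using dput? dput(mydf)?
-
Hi Ronak, I posted my sample data rather than the full data, because I tried posting the full data and it was too long.
-
But this seems to work on your sample data:
mydf_sample %>% group_by(meteo, season) %>% summarise(m=Mode(meteo2))
-
> mydf_sample %>% group_by(meteo, season) %>% summarise(m=Mode(meteo2))
Error: Column m can't promote group 28 to character
Warning: Factor @987654327@ contains implicit NA, consider using forcats::fct_explicit_na
## This is the message I see. ##
-
Can you try this?
mydf_sample %>% group_by(meteo, season) %>% summarise(m= as.character(Mode(meteo2)))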
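The error looks consistent with Mode() returning a logical NA for some groups and a character value for others, so the summary column cannot settle on one type. Below is a minimal sketch of a type-stable alternative, not a confirmed fix for the full data. The name mode_chr is hypothetical, and treating a tie as NA is an assumption (the original Mode() returns every tied value, which would also give more than one value per group). The sample data is rebuilt from the dput output in the question, and base R is used so the block runs without extra packages.

```r
# Hypothetical helper (not from the original post): always returns a
# single character value, so every group yields the same type.
mode_chr <- function(x) {
  ta <- table(x)  # table() ignores NA values by default
  # No non-missing observations in this group
  if (length(ta) == 0L || sum(ta) == 0L) return(NA_character_)
  top <- names(ta)[ta == max(ta)]
  # Assumption: a tie between several values is reported as NA
  if (length(top) > 1L) return(NA_character_)
  top
}

# Sample data rebuilt from the dput() output in the question
mydf_sample <- data.frame(
  season = factor(rep("Fall", 6),
                  levels = c("Spring", "Summer", "Fall", "Winter")),
  meteo2 = factor(c("N", "N", "N", "E", "N", "N"),
                  levels = c("E", "N", "S", "W")),
  meteo  = factor(c("<80", "<80", "<80", "<80", "80+", "80+"),
                  levels = c("<40", "<50", "<60", "<70", "<75", "<80", "80+"))
)

# Split meteo2 by the (meteo, season) combinations that actually occur
groups <- split(mydf_sample$meteo2,
                list(mydf_sample$meteo, mydf_sample$season),
                drop = TRUE)

# vapply() enforces one character value per group
modes <- vapply(groups, mode_chr, character(1))
print(modes)
```

In the dplyr pipeline from the question, the same helper could presumably be dropped in as summarise(m = mode_chr(meteo2)). Because it returns NA_character_ instead of a bare NA, every group produces a character value of length one.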