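【Posted】: 2016-06-20 22:26:47
【Question】:
I use the following code to map month names to month numbers. Compared with other data frame computations that do not use a for loop, I find it much slower.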
Sys.time()
head(df[,4])
for (i in 1:nrow(df)){
df$monthnum[i]<-match(tolower(as.character(df[i,4])), tolower(month.name))
}
Sys.time()
I get output like this:
> Sys.time()
[1] "2016-03-07 19:20:53 CST"
> dim(df)
[1] 229464 6
> head(df[,4])
[1] January January January January January January
Levels: April August December February January July June March May November October September
> for (i in 1:nrow(df)){
+ df$monthnum[i]<-match(tolower(as.character(df[i,4])), tolower(month.name))
+ }
> Sys.time()
[1] "2016-03-07 19:23:23 CST"
Can anyone explain how a for loop behaves on a data frame, and why it is so slow here? Any information would be appreciated.
【Comments】:
-
Maybe this helps explain why looping over a data frame is so inefficient. Your code is just
df$monthnum <- match(tolower(as.character(df[,4])), tolower(month.name))
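To make the comment's point concrete, here is a minimal sketch of the vectorized form next to a variant that only touches the factor levels. The data frame below is a hypothetical stand-in for the asker's data (the real one has 229464 rows and 6 columns, with month names stored as a factor in column 4); the column names and the names `lvl_to_num` and `monthnum_by_level` are made up for illustration. The likely reason the loop is slow is that every iteration goes through the data frame's subsetting and replacement methods, which carry a lot of per-call overhead and may copy data, whereas `match()` handles the whole column in a single call.

```r
# Hypothetical stand-in for the asker's data: month names as a factor in column 4.
set.seed(1)
df <- data.frame(
  a = 1:1000,
  b = 1:1000,
  c = 1:1000,
  month = factor(sample(month.name, 1000, replace = TRUE))
)

# Vectorized version: a single call to match() over the whole column.
df$monthnum <- match(tolower(as.character(df[, 4])), tolower(month.name))

# Variant for a factor column: match the (at most 12) levels once,
# then expand the result using the factor's integer codes.
lvl_to_num <- match(tolower(levels(df[, 4])), tolower(month.name))
monthnum_by_level <- lvl_to_num[as.integer(df[, 4])]
```

Both forms should give the same integer vector. To compare timings, wrapping each version in `system.time()` is usually more convenient than printing `Sys.time()` before and after.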