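This isn't exactly what you're asking for, but I've written time_file (https://gist.github.com/4183595), which source()s an R file, runs the code, and then rewrites the file, inserting comments that show how long each top-level statement took to run.
That is, time_file() turns this: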
{
load_all("~/documents/plyr/plyr")
load_all("~/documents/plyr/dplyr")
library(data.table)
data("baseball", package = "plyr")
vars <- list(n = quote(length(id)), m = quote(n + 1))
}
# Baseline case: use ddply
a <- ddply(baseball, "id", summarise, n = length(id))
# New summary method: ~20x faster
b <- summarise_by(baseball, group("id"), vars)
# But still not as fast as specialised count, which is basically id + tabulate
# so maybe able to eke out a little more with a C loop ?
count(baseball, "id")
into this:
{
load_all("~/documents/plyr/plyr")
load_all("~/documents/plyr/dplyr")
library(data.table)
data("baseball", package = "plyr")
vars <- list(n = quote(length(id)), m = quote(n + 1))
}
# Baseline case: use ddply
a <- ddply(baseball, "id", summarise, n = length(id))
#: user system elapsed
#: 0.451 0.003 0.453
# New summary method: ~20x faster
b <- summarise_by(baseball, group("id"), vars)
#: user system elapsed
#: 0.029 0.000 0.029
# But still not as fast as specialised count, which is basically id + tabulate
# so maybe able to eke out a little more with a C loop ?
count(baseball, "id")
#: user system elapsed
#: 0.008 0.000 0.008
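It doesn't time code inside a top-level { block, so you can opt out of timing anything you aren't interested in.
I don't think there's any way to add timing automatically as a top-level effect without somehow changing the way you run the code, i.e. using something like time_file instead of source.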
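To make the idea concrete, here is a minimal sketch of the same approach using only base R. It is not the code from the gist: time_top_level is a hypothetical name, and it returns the timings as a list instead of rewriting the file with comments.

```r
# Hypothetical helper (NOT the gist's time_file()): runs each top-level
# expression of an R file and records how long it took.
time_top_level <- function(path, envir = new.env(parent = globalenv())) {
  exprs <- parse(file = path, keep.source = TRUE)
  timings <- vector("list", length(exprs))

  for (i in seq_along(exprs)) {
    expr <- exprs[[i]]

    # A top-level { block is evaluated but deliberately not timed
    is_block <- is.call(expr) && identical(expr[[1]], as.name("{"))
    if (is_block) {
      eval(expr, envir)
      next
    }

    timings[[i]] <- system.time(eval(expr, envir))
  }

  # Label each entry with the first line of the deparsed expression
  names(timings) <- vapply(
    seq_along(exprs),
    function(i) deparse(exprs[[i]])[1],
    character(1)
  )
  timings
}

# Usage: the setup block is skipped, the second statement is timed
tmp <- tempfile(fileext = ".R")
writeLines(c("{", "  x <- runif(1e4)", "}", "y <- sum(x)"), tmp)
res <- time_top_level(tmp)
```

Untimed entries are left as NULL in the result, so you can tell at a glance which statements sat inside a { block.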
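You might wonder what effect timing every top-level operation has on the overall speed of your code. Well, that's easy to answer with a microbenchmark ;)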
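library(microbenchmark)
microbenchmark(
  # Baseline: the work itself
  runif(1e4),
  # Timed, with the default garbage collection before timing
  system.time(runif(1e4)),
  # Timed, without the up-front garbage collection
  system.time(runif(1e4), gcFirst = FALSE)
)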
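So the timing itself adds relatively little overhead (20 µs on my computer), but the default gc adds about 27 ms per call. Unless you have thousands of top-level calls, you're unlikely to see much impact.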
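As a rough back-of-the-envelope check, you can scale those two per-call figures by the number of top-level statements. This is only a sketch: it assumes the 20 µs and 27 ms costs from my machine simply add up per call, and the variable names are made up for illustration.

```r
# Per-call costs quoted above (measured on one machine; yours will differ)
timing_cost <- 20e-6  # seconds added by system.time() itself
gc_cost     <- 27e-3  # seconds added by the default garbage collection

# Hypothetical numbers of top-level statements in a script
n_calls <- c(10, 100, 1000, 10000)

overhead <- data.frame(
  n_calls    = n_calls,
  with_gc    = n_calls * (timing_cost + gc_cost),  # total seconds, gc on
  without_gc = n_calls * timing_cost               # total seconds, gc off
)
print(overhead)
```

With these numbers the total only reaches tens of seconds once a script has around a thousand top-level calls, and nearly all of it comes from the gc.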