My contribution is to compute the lagged difference of the cumulative sum of the condition
cumdiff = function(x) diff(c(0, cumsum( x > .2)), 20)
versus
filt = function(x) filter(x > 0.2, rep(1, 20), sides=1)
library(TTR); ttr = function(x) runSum(x > .2, 20)
cumsub = function(x) { z <- cumsum(c(0, x>0.2)); tail(z,-20) - head(z,-20) }
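To see why a lagged difference of the cumulative sum gives a rolling count, here is a minimal sketch on a toy vector with a window of 3 instead of 20. The names `x`, `k`, `roll_cumdiff` and `roll_naive` are made up for illustration and are not part of the functions above.

```r
# Toy data and window size, chosen only for illustration
x <- c(0.5, 0.1, 0.3, 0.9, 0.0, 0.1, 0.25)
k <- 3

cond <- x > 0.2               # TRUE FALSE TRUE TRUE FALSE FALSE TRUE
cs   <- c(0, cumsum(cond))    # 0 1 1 2 3 3 3 4

# cs[i + k] - cs[i] is the number of TRUEs in cond[i:(i + k - 1)]
roll_cumdiff <- diff(cs, lag = k)

# Slow reference: sum every window explicitly
roll_naive <- sapply(seq_len(length(x) - k + 1),
                     function(i) sum(cond[i:(i + k - 1)]))

roll_cumdiff                  # 2 2 2 1 1
all(roll_cumdiff == roll_naive)
```

Each window costs one subtraction regardless of its width, which is probably why the cumsum-based versions hold up well in the timings.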
Timings:
> library(microbenchmark)
> set.seed(123); xx = rnorm(100000)
> microbenchmark(cumdiff(xx), filt(xx), ttr(xx), cumsub(xx))
Unit: milliseconds
expr min lq median uq max neval
cumdiff(xx) 11.192005 12.387862 12.469253 12.77588 13.72404 100
filt(xx) 20.979503 22.058045 22.442765 23.02612 62.91730 100
ttr(xx) 8.390923 10.023934 10.119772 10.46309 11.04173 100
cumsub(xx) 7.015654 8.483432 8.538171 8.73596 9.65421 100
They differ in the details of how the result is represented (e.g., filt and ttr have leading NAs), and only filter copes with embedded NAs
> xx[22] = NA
> head(cumdiff(xx)) # NA's propagate, silently
[1] 9 9 NA NA NA NA
> ttr(xx)
Error in runSum(x > 0.2, 20) : Series contains non-leading NAs
> tail(filt(xx), -19)
[1] 9 9 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 8 8 9
...
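The NA behaviour is easier to follow on a small input. The following is only a sketch with a made-up vector and a window of 3; `by_cumsum` and `by_filter` are illustrative names, and the condition is converted with `as.numeric` before filtering purely to keep the example explicit.

```r
# Toy data with one embedded NA; window of 3 chosen for illustration
x <- c(0.5, 0.1, 0.3, NA, 0.0, 0.1, 0.25, 0.4)
k <- 3
cond <- x > 0.2   # TRUE FALSE TRUE NA FALSE FALSE TRUE TRUE

# cumsum-based: once the NA is reached, every later cumulative sum is NA,
# so every later window is NA too
by_cumsum <- diff(c(0, cumsum(cond)), lag = k)

# stats::filter: only windows that actually contain the NA are NA
by_filter <- as.numeric(stats::filter(as.numeric(cond), rep(1, k), sides = 1))
by_filter <- by_filter[-seq_len(k - 1)]   # drop the k - 1 leading NAs

by_cumsum   # 2 NA NA NA NA NA
by_filter   # 2 NA NA NA  1  2
```

With filter, the counts recover as soon as the window has moved past the missing value, whereas the cumsum-based result stays NA to the end of the series.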