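【Posted】: 2018-11-15 02:35:32
【Question】:
I have working code that computes a running drawdown.duration, where drawdown.duration is defined as the number of months between the current month and the previous peak. However, I implemented it as a for loop, and it runs slowly.
Is there a more efficient/faster way to do this in R?
The code takes a data.frame named returnsWithValues (specifically a tibble, since I have been using dplyr).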
> structure(list(date = structure(c(789, 820, 850, 881, 911, 942
), class = "Date"), value = c(0.94031052, 0.930751624153046,
0.926756311376762, 0.874209664097166, 0.843026010916249, 2.1),
peak = c(1, 1, 1, 1, 1, 2.1), drawdown = c(-0.05968948, -0.0692483758469535,
-0.0732436886232377, -0.125790335902834, -0.156973989083751,
0)), class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA,
-6L))
# A tibble: 6 x 4
  date       value  peak drawdown
  <date>     <dbl> <dbl>    <dbl>
1 1972-02-29 0.940   1    -0.0597
2 1972-03-31 0.931   1    -0.0692
3 1972-04-30 0.927   1    -0.0732
4 1972-05-31 0.874   1    -0.126
5 1972-06-30 0.843   1    -0.157
6 1972-07-31 2.1     2.1   0
I have implemented drawdown.duration with a for loop:
returnsWithValues <- returnsWithValues %>% mutate(drawdown.duration = NA)
# add drawdown.duration col
for (row in 1:nrow(returnsWithValues)) {
  if (returnsWithValues[row, "value"] == returnsWithValues[row, "peak"]) {
    # at a peak: the duration resets to 0
    returnsWithValues[row, "drawdown.duration"] <- 0
  } else {
    if (row == 1) {
      # first row below the peak: start counting at 1
      returnsWithValues[row, "drawdown.duration"] <- 1
    } else {
      # otherwise extend the previous row's duration by one month
      returnsWithValues[row, "drawdown.duration"] <- returnsWithValues[row - 1, "drawdown.duration"] + 1
    }
  }
}
This gives the correct result:
> returnsWithValues
# A tibble: 6 x 5
  date       value  peak drawdown drawdown.duration
  <date>     <dbl> <dbl>    <dbl>             <dbl>
1 1972-02-29 0.940   1    -0.0597                 1
2 1972-03-31 0.931   1    -0.0692                 2
3 1972-04-30 0.927   1    -0.0732                 3
4 1972-05-31 0.874   1    -0.126                  4
5 1972-06-30 0.843   1    -0.157                  5
6 1972-07-31 2.1     2.1   0                      0
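For reference, one possible vectorized alternative to the loop is sketched below. It is not from the original post: the helper name drawdown_duration is hypothetical, and the sketch assumes one row per month with no gaps, and that "at a peak" means value == peak exactly, as in the loop. The idea is to record the row index of the most recent peak with cummax() and subtract it from the current row index.

```r
# Hypothetical helper (not part of the original post): months since the last peak.
# Assumes one row per month and that a peak is a row where value == peak.
drawdown_duration <- function(value, peak) {
  at_peak <- value == peak
  idx <- seq_along(value)
  # Row index of the most recent peak so far; 0 if no peak has occurred yet,
  # which makes the count start at 1 on the first row, like the loop does.
  last_peak <- cummax(ifelse(at_peak, idx, 0L))
  as.numeric(idx - last_peak)
}

# Rebuild only the two columns the calculation needs, using the values from the question.
returnsWithValues <- data.frame(
  value = c(0.94031052, 0.930751624153046, 0.926756311376762,
            0.874209664097166, 0.843026010916249, 2.1),
  peak = c(1, 1, 1, 1, 1, 2.1)
)

returnsWithValues$drawdown.duration <-
  drawdown_duration(returnsWithValues$value, returnsWithValues$peak)

print(returnsWithValues$drawdown.duration)
# [1] 1 2 3 4 5 0
```

Because the helper only takes two vectors, it should also work inside a dplyr pipeline, for example mutate(drawdown.duration = drawdown_duration(value, peak)). It avoids the row-by-row assignment into the tibble, which is likely where most of the loop's time goes.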
【Discussion】:
Tags: r performance dplyr