
【Question title】: R row iteration
【Posted】: 2017-12-30 20:58:40
【Question】:

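I'm new to R and hoping to get help with a small problem.

I have an XTS dataset of stock prices with OHLC and some other information. See the image for the structure of my dataset. The "Is.Previous.Up" column has a value of 1 if the previous row's "Close" value was higher than its "Open" value. I want to go through all rows and record each time the previous 3 records have the value 1 in "Is.Previous.Up". Basically, I just want to record runs.

Here is what I'm trying. It never produces a result, but it doesn't produce an error either. I think there is a cleaner way to do this.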

nrowstotal <- nrow(nq1m_rth_withruns)
counter = 0
for ( i in 1:nrowstotal)
{
  if (isTRUE(nq1m_rth_withruns$Is.Previous.Up) & isTRUE(lag(nq1m_rth_withruns$Is.Previous.Up, 1)) & isTRUE(lag(nq1m_rth_withruns$Is.Previous.Up, 2)))
  {
  counter = counter + 1
  }
}
counter

Any help would be greatly appreciated.

[Image: dataset]

Here is a sample of the data. There should be 3 instances where 1 appears 3 times in a row in the "PrevUp" column.

structure(c(6267.75, 6262.75, 6260.25, 6263, 6258.5, 6259, 6255.75, 
6241.25, 6243.5, 6244.75, 6235.25, 6233.75, 6235.75, 6240.75, 
6239, 6237.25, 6240.5, 6244.25, 6249.25, 6246.25, 6242.25, 6238.75, 
6239.75, 6246.5, 6240.75, 6240.5, 6240.25, 6242, 6239.25, 6238.25, 
6239.75, 6241.5, 6242.25, 6250.75, 6247.5, 6251, 6251, 6253.75, 
6255, 6254.25, 6254, 6255.75, 6257.5, 6256.25, 6253.25, 6252.5, 
6254.5, 6258.5, 6256.75, 6258.5, 6259, 6256.25, 6254.5, 6257.5, 
6258.75, 6259.75, 6260.25, 6261, 6268.5, 6264, 6264.75, 6264, 
6262.5, 6260.25, 6256, 6246.25, 6246, 6244.75, 6235.75, 6239, 
6241.25, 6241.25, 6241.25, 6240.75, 6244.75, 6249.75, 6249.25, 
6246.25, 6242.75, 6240, 6247.5, 6247.5, 6242.5, 6242, 6243.5, 
6242.5, 6240.5, 6241.5, 6243.75, 6243.25, 6250.75, 6251, 6251.5, 
6253.5, 6254, 6255.75, 6257, 6254.5, 6258, 6258.75, 6258.25, 
6257, 6254, 6254.75, 6258.75, 6259.25, 6259.25, 6261.25, 6260.75, 
6257.75, 6258.75, 6260, 6261.75, 6260.5, 6263, 6262, 6262.75, 
6259.25, 6259.25, 6257.5, 6258.25, 6254.5, 6238.25, 6241, 6242.5, 
6231.25, 6230.75, 6233.75, 6235.5, 6237.75, 6235, 6236, 6239.25, 
6243.25, 6245.5, 6241.25, 6236.75, 6236.25, 6239.5, 6238.75, 
6238.75, 6237.25, 6239, 6239.5, 6237, 6238.25, 6237.75, 6240.5, 
6242.25, 6247.25, 6247.5, 6249.5, 6250, 6253, 6253, 6251.75, 
6254, 6255.25, 6255.25, 6252.25, 6250.75, 6251, 6253.75, 6255.75, 
6255.5, 6257.25, 6254.75, 6254.5, 6253.75, 6257.25, 6258.25, 
6258.75, 6260.25, 6259.25, 6262.75, 6260.5, 6263.25, 6258.75, 
6259, 6255.75, 6241, 6243.25, 6245, 6235, 6234, 6235.5, 6241, 
6238.75, 6237.25, 6240.25, 6244.25, 6249.5, 6246.25, 6242.5, 
6238.5, 6240, 6246.25, 6240.75, 6240.75, 6240.25, 6241.75, 6239.5, 
6238.25, 6239.75, 6241.25, 6242.25, 6250.75, 6247.75, 6251.25, 
6251, 6254, 6255, 6254.25, 6253.75, 6255.75, 6257.5, 6256, 6253.25, 
6252.5, 6254.5, 6258.5, 6256.75, 6258.25, 6259, 6256.25, 6254.5, 
6257.75, 6259, 6259.75, 6260.25, 6260.75, 6260.5, 3815, 3606, 
2650, 2513, 1621, 4364, 9874, 3553, 1886, 5396, 3196, 2982, 2803, 
1993, 2453, 1646, 3815, 2376, 1890, 1534, 2122, 1584, 2229, 2159, 
1474, 1448, 1460, 892, 1287, 782, 1413, 1458, 2513, 1392, 1097, 
2488, 3091, 1653, 2331, 1162, 1441, 2007, 905, 1102, 1568, 1122, 
1219, 805, 1417, 3126, 1828, 833, 1574, 903, 941, 575, 1256, 
998, 2777, 2521, 1939, 1838, 1194, 2964, 6090, 2399, 1354, 3852, 
2245, 2041, 1962, 1458, 1779, 1323, 2602, 1788, 1455, 1181, 1651, 
1207, 1789, 1579, 1201, 1035, 1157, 756, 1065, 644, 1087, 875, 
1841, 1076, 855, 1806, 1646, 1114, 1445, 844, 1031, 1234, 658, 
840, 996, 673, 913, 633, 958, 1653, 1086, 615, 1003, 688, 692, 
422, 931, 648, 2347, 2185, 1223, 1582, 817, 2400, 6234, 1614, 
923, 3301, 1569, 1158, 1236, 1132, 1237, 695, 1627, 833, 1062, 
1001, 1300, 838, 930, 1376, 656, 698, 651, 443, 759, 320, 621, 
634, 782, 813, 383, 1214, 1479, 786, 1190, 712, 592, 855, 536, 
736, 913, 416, 441, 442, 520, 1430, 968, 507, 608, 378, 462, 
334, 502, 470, 1468, 1421, 1427, 931, 804, 1964, 3640, 1939, 
963, 2095, 1627, 1824, 1567, 861, 1216, 951, 2188, 1543, 828, 
533, 822, 746, 1299, 783, 818, 750, 809, 449, 528, 462, 792, 
824, 1731, 579, 714, 1274, 1612, 867, 1141, 450, 849, 1152, 369, 
366, 655, 706, 778, 363, 897, 1696, 860, 326, 966, 525, 479, 
241, 754, 528, 1, 0, 0, 1, 0, 1, 0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 
1, 1, 1, 0, 0, 0, 1, 1, 0, 0, 0, 1, 0, 0, 1, 1, 1, 1, 0, 1, 0, 
1, 1, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 1, 1, 0, 0, 1, 1, 1, 1, 1
), class = c("xts", "zoo"), .indexCLASS = c("POSIXct", "POSIXt"
), tclass = c("POSIXct", "POSIXt"), .indexTZ = "", tzone = "", index = structure(c(1510756200, 
1510756260, 1510756320, 1510756380, 1510756440, 1510756500, 1510756560, 
1510756620, 1510756680, 1510756740, 1510756800, 1510756860, 1510756920, 
1510756980, 1510757040, 1510757100, 1510757160, 1510757220, 1510757280, 
1510757340, 1510757400, 1510757460, 1510757520, 1510757580, 1510757640, 
1510757700, 1510757760, 1510757820, 1510757880, 1510757940, 1510758000, 
1510758060, 1510758120, 1510758180, 1510758240, 1510758300, 1510758360, 
1510758420, 1510758480, 1510758540, 1510758600, 1510758660, 1510758720, 
1510758780, 1510758840, 1510758900, 1510758960, 1510759020, 1510759080, 
1510759140, 1510759200, 1510759260, 1510759320, 1510759380, 1510759440, 
1510759500, 1510759560, 1510759620), tzone = "", tclass = c("POSIXct", 
"POSIXt")), .Dim = c(58L, 9L), .Dimnames = list(NULL, c("Open", 
"High", "Low", "Close", "Volume", "NumberOfTrades", "BidVolume", 
"AskVolume", "PrevUp")))

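【Comments】:

Please provide the data by running dput(dataset) and copy-pasting the output into your question. Also try to avoid images of datasets, since they are not reproducible.
Hi Sathish. I've added a sample of the dataset. Thanks.

Tags: r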

【Solution 1】:

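I believe the problem is in your if statement. isTRUE() is being applied to the entire Is.Previous.Up column (and to its lagged copies) rather than to a single row, and isTRUE() returns TRUE only for a single logical TRUE, so the condition is never met and the counter never increases. The if statement is never told which row to act on; the loop variable i is not used anywhere in the body.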
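To make the point above concrete, here is a minimal sketch on a plain numeric vector. The vector `is_prev_up` is a hypothetical stand-in for the Is.Previous.Up column, not the asker's data.

```r
# Hypothetical stand-in for the Is.Previous.Up column (plain numeric vector)
is_prev_up <- c(1, 1, 1, 0, 1)

# isTRUE() is TRUE only for a single logical TRUE, so passing a whole
# numeric column always gives FALSE and the counter never moves
whole_column <- isTRUE(is_prev_up)

# Indexing one row at a time gives length-one comparisons that if() can use
i <- 3
row_check <- is_prev_up[i] == 1 &&
  is_prev_up[i - 1] == 1 &&
  is_prev_up[i - 2] == 1

print(whole_column)  # FALSE
print(row_check)     # TRUE
```

The row-wise form only makes sense for i >= 3, since rows i - 1 and i - 2 must exist.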

I would write it like this:

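# Extract the indicator as a plain numeric vector (drops the xts attributes)
prev.up <- as.numeric(dataset[, "PrevUp"])
runs.counter <- 0  # runs of three or more consecutive 1s found so far
stop.counter <- 0  # set to 1 once the current run has been counted
# Start at row 3 so that i - 1 and i - 2 always point at real rows
for (i in seq_along(prev.up)[-(1:2)]) {
    if (prev.up[i] == 1 &&
        prev.up[i - 1] == 1 &&
        prev.up[i - 2] == 1 &&
        stop.counter == 0) {
        runs.counter <- runs.counter + 1
        stop.counter <- 1
    } else if (prev.up[i] == 0) {
        stop.counter <- 0
    }
}
runs.counter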

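【Comments】:

Hi Cameron. I'm not quite sure what this is returning, but with the data sample above it returns a counter of 20, whereas there are only 3 instances with three 1s in a row.
Sorry about that! It's fixed now.
Works great. Thank you very much for your help. I see what you're saying about why this works and mine didn't.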

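【Solution 2】:

The find_OC_diff() function finds the rows where, in the previous row, the close value was greater than the open value.

It does not use the PrevUp column. Instead, it uses the values in the Close and Open columns of the xts object.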

find_OC_diff <- function(obj, val, col1, col2)
{
  # difference between close and open
  xts2 <- lag(obj[, col1] - obj[, col2])
  # lag() leaves NA in the first row; set it to 1 so it counts as "up"
  xts2[1, 1] <- 1
  # get run length encoding for xts2$Close value greater than zero
  xts2_rle <- rle(as.vector(xts2 > 0))
  # check if run lengths greater than val: example val = 3
  # get its position
  xts2_rle3 <- which(xts2_rle$lengths >= val)
  # get the xts2_rle index where TRUE is greater than or equal to 3
  hits <- xts2_rle3[xts2_rle$values[xts2_rle3]]
  # create a dummy list based on xts2_rle structure. It will help
  # isolate the hits with TRUE and the rest with FALSE values
  close_hits <- Map(rep, x = FALSE, times = xts2_rle$lengths)
  # assign TRUE for close_hits with hits indices
  for( i in hits ){
    close_hits[[i]] <- rep(TRUE, times = length(close_hits[[i]]))
  }
  # return a list of logical vectors, one per run; after unlist(), TRUE marks
  # the rows of obj that sit in a qualifying run of length >= val
  return(close_hits)
}

library(xts)
which(unlist(find_OC_diff(obj = xts1, val = 3, col1 = "Close", col2 = "Open")))
# [1] 17 18 19 31 32 33 34 54 55 56 57 58

xts1[which(unlist(find_OC_diff(obj = xts1, val = 3, col1 = "Close", col2 = "Open")))]
# Open    High     Low   Close Volume NumberOfTrades BidVolume AskVolume PrevUp
# 2017-11-15 09:46:00 6240.50 6244.75 6239.25 6244.25   3815           2602      1627      2188      1
# 2017-11-15 09:47:00 6244.25 6249.75 6243.25 6249.50   2376           1788       833      1543      1
# 2017-11-15 09:48:00 6249.25 6249.25 6245.50 6246.25   1890           1455      1062       828      1
# 2017-11-15 10:00:00 6239.75 6243.75 6237.75 6241.25   1413           1087       621       792      1
# 2017-11-15 10:01:00 6241.50 6243.25 6240.50 6242.25   1458            875       634       824      1
# 2017-11-15 10:02:00 6242.25 6250.75 6242.25 6250.75   2513           1841       782      1731      1
# 2017-11-15 10:03:00 6250.75 6251.00 6247.25 6247.75   1392           1076       813       579      1
# 2017-11-15 10:23:00 6257.50 6260.00 6257.25 6259.00    903            688       378       525      1
# 2017-11-15 10:24:00 6258.75 6261.75 6258.25 6259.75    941            692       462       479      1
# 2017-11-15 10:25:00 6259.75 6260.50 6258.75 6260.25    575            422       334       241      1
# 2017-11-15 10:26:00 6260.25 6263.00 6260.25 6260.75   1256            931       502       754      1
# 2017-11-15 10:27:00 6261.00 6262.00 6259.25 6260.50    998            648       470       528      1
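As a cross-check on the indices above, here is a shorter sketch of the same run-length idea. It is not the answer author's function: it assumes the indicator is already available as a plain vector (here the PrevUp values copied from the question's sample), and `in_long_run` is a hypothetical helper name.

```r
# PrevUp values copied from the question's sample data (58 rows)
prev_up <- c(1, 0, 0, 1, 0, 1, 0, 0, 1, 1, 0, 0, 1, 1, 0, 0,
             1, 1, 1, 0, 0, 0, 1, 1, 0, 0, 0, 1, 0, 0, 1, 1, 1, 1, 0, 1, 0,
             1, 1, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 1, 1, 0, 0, 1, 1, 1, 1, 1)

# Hypothetical helper: TRUE for every position that lies inside a run of 1s
# at least min_len long (assumes x contains no NA values)
in_long_run <- function(x, min_len = 3) {
  r <- rle(x == 1)
  # expand the per-run flag back to one value per element
  rep(r$values & r$lengths >= min_len, times = r$lengths)
}

hits <- which(in_long_run(prev_up))
print(hits)
# [1] 17 18 19 31 32 33 34 54 55 56 57 58

# Number of separate runs, as opposed to the number of rows inside them
r <- rle(prev_up == 1)
n_runs <- sum(r$values & r$lengths >= 3)
print(n_runs)  # 3
```

Using rep() on the run-length result avoids building and filling a list by hand; the row positions agree with the output shown above, and the run count matches the 3 instances the asker expected.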

Data:

xts1 <- structure(c(6267.75, 6262.75, 6260.25, 6263, 6258.5, 6259, 6255.75, 
                    6241.25, 6243.5, 6244.75, 6235.25, 6233.75, 6235.75, 6240.75, 
                    6239, 6237.25, 6240.5, 6244.25, 6249.25, 6246.25, 6242.25, 6238.75, 
                    6239.75, 6246.5, 6240.75, 6240.5, 6240.25, 6242, 6239.25, 6238.25, 
                    6239.75, 6241.5, 6242.25, 6250.75, 6247.5, 6251, 6251, 6253.75, 
                    6255, 6254.25, 6254, 6255.75, 6257.5, 6256.25, 6253.25, 6252.5, 
                    6254.5, 6258.5, 6256.75, 6258.5, 6259, 6256.25, 6254.5, 6257.5, 
                    6258.75, 6259.75, 6260.25, 6261, 6268.5, 6264, 6264.75, 6264, 
                    6262.5, 6260.25, 6256, 6246.25, 6246, 6244.75, 6235.75, 6239, 
                    6241.25, 6241.25, 6241.25, 6240.75, 6244.75, 6249.75, 6249.25, 
                    6246.25, 6242.75, 6240, 6247.5, 6247.5, 6242.5, 6242, 6243.5, 
                    6242.5, 6240.5, 6241.5, 6243.75, 6243.25, 6250.75, 6251, 6251.5, 
                    6253.5, 6254, 6255.75, 6257, 6254.5, 6258, 6258.75, 6258.25, 
                    6257, 6254, 6254.75, 6258.75, 6259.25, 6259.25, 6261.25, 6260.75, 
                    6257.75, 6258.75, 6260, 6261.75, 6260.5, 6263, 6262, 6262.75, 
                    6259.25, 6259.25, 6257.5, 6258.25, 6254.5, 6238.25, 6241, 6242.5, 
                    6231.25, 6230.75, 6233.75, 6235.5, 6237.75, 6235, 6236, 6239.25, 
                    6243.25, 6245.5, 6241.25, 6236.75, 6236.25, 6239.5, 6238.75, 
                    6238.75, 6237.25, 6239, 6239.5, 6237, 6238.25, 6237.75, 6240.5, 
                    6242.25, 6247.25, 6247.5, 6249.5, 6250, 6253, 6253, 6251.75, 
                    6254, 6255.25, 6255.25, 6252.25, 6250.75, 6251, 6253.75, 6255.75, 
                    6255.5, 6257.25, 6254.75, 6254.5, 6253.75, 6257.25, 6258.25, 
                    6258.75, 6260.25, 6259.25, 6262.75, 6260.5, 6263.25, 6258.75, 
                    6259, 6255.75, 6241, 6243.25, 6245, 6235, 6234, 6235.5, 6241, 
                    6238.75, 6237.25, 6240.25, 6244.25, 6249.5, 6246.25, 6242.5, 
                    6238.5, 6240, 6246.25, 6240.75, 6240.75, 6240.25, 6241.75, 6239.5, 
                    6238.25, 6239.75, 6241.25, 6242.25, 6250.75, 6247.75, 6251.25, 
                    6251, 6254, 6255, 6254.25, 6253.75, 6255.75, 6257.5, 6256, 6253.25, 
                    6252.5, 6254.5, 6258.5, 6256.75, 6258.25, 6259, 6256.25, 6254.5, 
                    6257.75, 6259, 6259.75, 6260.25, 6260.75, 6260.5, 3815, 3606, 
                    2650, 2513, 1621, 4364, 9874, 3553, 1886, 5396, 3196, 2982, 2803, 
                    1993, 2453, 1646, 3815, 2376, 1890, 1534, 2122, 1584, 2229, 2159, 
                    1474, 1448, 1460, 892, 1287, 782, 1413, 1458, 2513, 1392, 1097, 
                    2488, 3091, 1653, 2331, 1162, 1441, 2007, 905, 1102, 1568, 1122, 
                    1219, 805, 1417, 3126, 1828, 833, 1574, 903, 941, 575, 1256, 
                    998, 2777, 2521, 1939, 1838, 1194, 2964, 6090, 2399, 1354, 3852, 
                    2245, 2041, 1962, 1458, 1779, 1323, 2602, 1788, 1455, 1181, 1651, 
                    1207, 1789, 1579, 1201, 1035, 1157, 756, 1065, 644, 1087, 875, 
                    1841, 1076, 855, 1806, 1646, 1114, 1445, 844, 1031, 1234, 658, 
                    840, 996, 673, 913, 633, 958, 1653, 1086, 615, 1003, 688, 692, 
                    422, 931, 648, 2347, 2185, 1223, 1582, 817, 2400, 6234, 1614, 
                    923, 3301, 1569, 1158, 1236, 1132, 1237, 695, 1627, 833, 1062, 
                    1001, 1300, 838, 930, 1376, 656, 698, 651, 443, 759, 320, 621, 
                    634, 782, 813, 383, 1214, 1479, 786, 1190, 712, 592, 855, 536, 
                    736, 913, 416, 441, 442, 520, 1430, 968, 507, 608, 378, 462, 
                    334, 502, 470, 1468, 1421, 1427, 931, 804, 1964, 3640, 1939, 
                    963, 2095, 1627, 1824, 1567, 861, 1216, 951, 2188, 1543, 828, 
                    533, 822, 746, 1299, 783, 818, 750, 809, 449, 528, 462, 792, 
                    824, 1731, 579, 714, 1274, 1612, 867, 1141, 450, 849, 1152, 369, 
                    366, 655, 706, 778, 363, 897, 1696, 860, 326, 966, 525, 479, 
                    241, 754, 528, 1, 0, 0, 1, 0, 1, 0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 
                    1, 1, 1, 0, 0, 0, 1, 1, 0, 0, 0, 1, 0, 0, 1, 1, 1, 1, 0, 1, 0, 
                    1, 1, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 1, 1, 0, 0, 1, 1, 1, 1, 1
), class = c("xts", "zoo"), .indexCLASS = c("POSIXct", "POSIXt"
), tclass = c("POSIXct", "POSIXt"), 
.indexTZ = "", tzone = "", 
index = structure(c(1510756200, 
                    1510756260, 1510756320, 1510756380, 1510756440, 1510756500, 1510756560, 
                    1510756620, 1510756680, 1510756740, 1510756800, 1510756860, 1510756920, 
                    1510756980, 1510757040, 1510757100, 1510757160, 1510757220, 1510757280, 
                    1510757340, 1510757400, 1510757460, 1510757520, 1510757580, 1510757640, 
                    1510757700, 1510757760, 1510757820, 1510757880, 1510757940, 1510758000, 
                    1510758060, 1510758120, 1510758180, 1510758240, 1510758300, 1510758360, 
                    1510758420, 1510758480, 1510758540, 1510758600, 1510758660, 1510758720, 
                    1510758780, 1510758840, 1510758900, 1510758960, 1510759020, 1510759080, 
                    1510759140, 1510759200, 1510759260, 1510759320, 1510759380, 1510759440, 
                    1510759500, 1510759560, 1510759620),
tzone = "", tclass = c("POSIXct", "POSIXt")), 
.Dim = c(58L, 9L), 
.Dimnames = list(NULL, c("Open", "High", "Low", "Close", "Volume", "NumberOfTrades", 
                         "BidVolume", "AskVolume", "PrevUp")))

【Comments】:

Thank you very much, Sathish. I really appreciate the answer and the comments explaining the code.

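【Solution 3】:

rle (run length encoding) computes the lengths of runs of equal values along a vector. Here I create a short example with only one column.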

df<-data.frame(x=c(1,0,1,1,0,1,1,1,1,0,1,0,1,1,1,0,0))
rledf<-rle(df$x)
#Run Length Encoding
# lengths: int [1:10] 1 1 2 1 4 1 1 1 3 2
# values : num [1:10] 1 0 1 0 1 0 1 0 1 0

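Using cumsum on the run lengths, you get the position in the vector where each run ends.

position<-cumsum(rledf$lengths)

Find the starting indices by keeping only the runs with length >= 3 and value = 1, then stepping back from each end position by that run's length.

hit<-rledf$values==1&rledf$lengths>=3
starting_indice<-position[hit]-rledf$lengths[hit]+1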
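Putting these steps together, here is a minimal sketch on the short example vector. The helper name `find_runs` and its `min_len` parameter are hypothetical, and the as.numeric() call is there on the assumption that the input may be an xts column rather than a plain vector.

```r
# Hypothetical helper: start, end and length of every run of 1s that is
# at least min_len long (assumes x contains no NA values)
find_runs <- function(x, min_len = 3) {
  x <- as.numeric(x)               # plain vector, so rle() accepts it
  r <- rle(x)
  ends <- cumsum(r$lengths)        # last position of each run
  starts <- ends - r$lengths + 1   # first position of each run
  keep <- r$values == 1 & r$lengths >= min_len
  data.frame(start = starts[keep], end = ends[keep], length = r$lengths[keep])
}

x <- c(1, 0, 1, 1, 0, 1, 1, 1, 1, 0, 1, 0, 1, 1, 1, 0, 0)
runs <- find_runs(x)
print(runs)        # two runs: positions 6 to 9 and 13 to 15
print(nrow(runs))  # 2
```

The number of rows in the result is the number of qualifying runs, which is the count the question asks for.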

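【Comments】:

When I run the command rledf
Yes, that is probably the problem. Now that you have posted a sample, you could convert it with as.vector(). I didn't have your data sample at the time. It looks like you have an answer now.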