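【Posted】: 2016-01-17 10:28:09
【Question】:
I am working on a project in R that, at least compared with my previous R projects, involves a fairly large amount of code. The code applies several ifelse statements to data in existing columns and uses the result to create a new column. Because my data is in 5-minute bars, I have to write a new line of code for every 5-minute slice. The data runs from 09:30 to 16:00, so that is a lot of lines, about 75 by my count. A sample of my data: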
                 Date     Open     High      Low    Close doy
1 2015-09-21 09:30:00 164.6700 164.7100 164.3700 164.5300 264
2 2015-09-21 09:35:00 164.5300 164.9000 164.5300 164.6400 264
3 2015-09-21 09:40:00 164.6600 164.8900 164.6000 164.8900 264
4 2015-09-21 09:45:00 164.9100 165.0900 164.9100 164.9736 264
5 2015-09-21 09:50:00 164.9399 165.0980 164.8200 164.8200 264
This data is then filtered into a table like this:
data <- structure(list(doy = c(264, 265, 266, 267, 268, 271, 272, 11,12, 13), Date = structure(c(1442824200, 1442910600, 1442997000,1443083400, 1443169800, 1443429000, 1443515400, 1452504600, 1452591000,1452677400), class = c("POSIXct", "POSIXt"), tzone = ""), Or_High = c(164.71,162.96, 163.38, 161.37, 163.91, 162.06, 160.22, 164.5, 165.23,165.84), OR_Low = c(164.37, 162.62, 162.98, 161.06, 163.57, 161.66,159.7, 164.06, 164.84, 165.4), HOD = c(165.56, 163.36, 163.38,162.24, 164.43, 162.06, 160.96, 164.5, 165.78, 165.84), LOD = c(165.22,163.1, 162.98, 161.95, 164.24, 161.66, 160.75, 164.06, 165.56,165.4), Close = c(164.92, 163.02, 162.58, 161.85, 162.94, 159.84,160.19, 163.83, 165.02, 161.38), Range = c(0.340000000000003,0.260000000000019, 0.400000000000006, 0.29000000000002, 0.189999999999998,0.400000000000006, 0.210000000000008, 0.439999999999998, 0.219999999999999,0.439999999999998), `A-val` = c(NA, NA, NA, NA, NA, NA, NA, 0.0673439999999994,0.0659639999999996, 0.0729499999999996), `A-up` = c(NA, NA, NA,NA, NA, NA, NA, 164.567344, 165.295964, 165.91295), `A-down` = c(NA,NA, NA, NA, NA, NA, NA, 163.992656, 164.774036, 165.32705), `09:35` = structure(c(NA,NA, NA, NA, NA, NA, NA, 0, 0, 0), .Dim = c(10L, 1L), .Dimnames = list(NULL, "Low")), `09:40` = structure(c(NA, NA, NA, NA, NA,NA, NA, -1, 1, 0), .Dim = c(10L, 1L), .Dimnames = list(NULL,"Low")), `09:45` = structure(c(NA, NA, NA, NA, NA, NA, NA,0, 1, 0), .Dim = c(10L, 1L), .Dimnames = list(NULL, "Low")),`09:50` = structure(c(NA, NA, NA, NA, NA, NA, NA, -1, 1,0), .Dim = c(10L, 1L), .Dimnames = list(NULL, "Low")), `09:55` = structure(c(NA,NA, NA, NA, NA, NA, NA, -1, 0, 0), .Dim = c(10L, 1L), .Dimnames = list(NULL, "Low")), `10:00` = structure(c(NA, NA, NA, NA,NA, NA, NA, -1, 0, 0), .Dim = c(10L, 1L), .Dimnames = list(NULL, "Low")), `10:05` = structure(c(NA, NA, NA, NA,NA, NA, NA, -1, 0, 0), .Dim = c(10L, 1L), .Dimnames = list(NULL, "Low")), `10:10` = structure(c(NA, NA, NA, NA,NA, NA, NA, -1, 0, 0), .Dim = c(10L, 
1L), .Dimnames = list(NULL, "Low")), `10:15` = structure(c(NA, NA, NA, NA,NA, NA, NA, -2, 0, -1), .Dim = c(10L, 1L), .Dimnames = list(NULL, "Low")), `10:20` = structure(c(NA, NA, NA, NA,NA, NA, NA, 0, 0, -1), .Dim = c(10L, 1L), .Dimnames = list(NULL, "Low")), `10:25` = structure(c(NA, NA, NA, NA,NA, NA, NA, -2, -1, -1), .Dim = c(10L, 1L), .Dimnames = list(NULL, "Low")), `10:30` = structure(c(NA, NA, NA, NA,NA, NA, NA, 0, 0, -1), .Dim = c(10L, 1L), .Dimnames = list(NULL, "Low")), `10:35` = structure(c(NA, NA, NA, NA,NA, NA, NA, 0, 0, -1), .Dim = c(10L, 1L), .Dimnames = list(NULL, "Low")), `10:40` = structure(c(NA, NA, NA, NA,NA, NA, NA, 0, -1, -2), .Dim = c(10L, 1L), .Dimnames = list(NULL, "Low")), `10:45` = structure(c(NA, NA, NA, NA,NA, NA, NA, 0, -1, 0), .Dim = c(10L, 1L), .Dimnames = list(NULL, "Low")), `10:50` = structure(c(NA, NA, NA, NA,NA, NA, NA, -1, -1, -2), .Dim = c(10L, 1L), .Dimnames = list(NULL, "Low")), `10:55` = structure(c(NA, NA, NA, NA,NA, NA, NA, -1, -1, 0), .Dim = c(10L, 1L), .Dimnames = list(NULL, "Low"))), .Names = c("doy", "Date", "Or_High","OR_Low", "HOD", "LOD", "Close", "Range", "A-val", "A-up", "A-down","09:35", "09:40", "09:45", "09:50", "09:55", "10:00", "10:05","10:10", "10:15", "10:20", "10:25", "10:30", "10:35", "10:40","10:45", "10:50", "10:55"), row.names = c(1L, 2L, 3L, 4L, 5L,6L, 7L, 78L, 79L, 80L), class = "data.frame")
This is what the lines of code look like:
data[,14] <- ifelse(df %>% filter(hour(Date) == 09 & minute(Date) == 45) %>% select(Low) > data[,10], 1, ifelse(df %>% filter(hour(Date) == 09 & minute(Date) == 45) %>% select(High) < data[,11], -1, 0))
The next line of code then looks like this:
data[,15] <- ifelse(df %>% filter(hour(Date) == 09 & minute(Date) == 50) %>% select(Low) > data[,10], 1, ifelse(df %>% filter(hour(Date) == 09 & minute(Date) == 50) %>% select(High) < data[,11], -1, 0))
And the one after that like this, and so on:
data[,16] <- ifelse(df %>% filter(hour(Date) == 09 & minute(Date) == 55) %>% select(Low) > data[,10], 1, ifelse(df %>% filter(hour(Date) == 09 & minute(Date) == 55) %>% select(High) < data[,11], -1, 0))
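As you can see, only certain parts change from one line to the next: the hour, the minute, and the column references used in the calculation. Perhaps the example below makes this clearer.
Example: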
colnames(data)[14] <- "09:45"
colnames(data)[15] <- "09:50"
colnames(data)[16] <- "09:55"
colnames(data)[17] <- "10:00"
colnames(data)[18] <- "10:05"
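In this code, is it possible to change the [#col ref#] and the time without manually editing each line individually? I realise that copy and paste in Notepad would work, but that still means writing each change separately. My main concern is not the time it takes to write this, but the risk of errors from manual input.
If anyone has tips or tricks for doing this, or another way to achieve the same result without the multiple if statements in my existing code, I would really appreciate the help. This question is related to an earlier question I posted here, which may make what I am trying to achieve clearer.
Thanks.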
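For illustration, the column numbers and "HH:MM" labels in the colnames example follow a regular pattern, so they could probably be generated rather than typed. The following is only a sketch under assumptions: the names `slot_labels`, `first_col` and `cols` are hypothetical, the date in `start` and `end` is arbitrary, and `first_col <- 12` assumes that "09:35" is the 12th column, as in the sample table.

```r
# The date is arbitrary; only the time of day ends up in the labels.
start <- as.POSIXct("2015-09-21 09:35:00", tz = "UTC")
end   <- as.POSIXct("2015-09-21 16:00:00", tz = "UTC")

# One "HH:MM" label per 5-minute slot.
slot_labels <- format(seq(start, end, by = "5 min"), "%H:%M")

# Assumption: "09:35" sits in column 12, as in the sample table.
first_col <- 12
cols <- first_col + seq_along(slot_labels) - 1

length(slot_labels)  # 78 slots from 09:35 to 16:00
data.frame(col = cols, label = slot_labels)[1:5, ]
```

If `data` already contains those columns, a single `colnames(data)[cols] <- slot_labels` would stand in for the individual colnames lines.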
【Comments】:
-
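I don't fully understand your question, but if you find yourself doing very similar things over and over, I suggest writing a function that takes parameters.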
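A minimal sketch of that suggestion, assuming the rule in the question is "1 if the bar's Low is above the upper level, -1 if its High is below the lower level, otherwise 0". Everything here is hypothetical: the toy values are invented, `bars` stands in for the 5-minute data, `daily` for the per-day table, and `a_up` / `a_down` for the `A-up` / `A-down` columns. Unlike the original lines, it matches bars to days explicitly instead of relying on row order.

```r
# Toy 5-minute bars for two days (values invented for illustration).
bars <- data.frame(
  Date = as.POSIXct(c("2016-01-11 09:45:00", "2016-01-11 09:50:00",
                      "2016-01-12 09:45:00", "2016-01-12 09:50:00"), tz = "UTC"),
  High = c(164.40, 163.90, 165.50, 165.60),
  Low  = c(164.10, 163.70, 165.35, 165.40)
)

# One row per day holding the two thresholds (hypothetical column names).
daily <- data.frame(
  day    = as.Date(c("2016-01-11", "2016-01-12")),
  a_up   = c(164.567344, 165.295964),
  a_down = c(163.992656, 164.774036)
)

slot_signal <- function(bars, daily, slot) {
  # Keep only the bars stamped with the requested "HH:MM" slot.
  b <- bars[format(bars$Date, "%H:%M") == slot, ]
  # Align bars to days by date; a day without this slot yields NA.
  idx <- match(daily$day, as.Date(b$Date, tz = "UTC"))
  ifelse(b$Low[idx] > daily$a_up, 1,
         ifelse(b$High[idx] < daily$a_down, -1, 0))
}

# One loop replaces one hand-written line per slot.
slots <- c("09:45", "09:50")
for (s in slots) daily[[s]] <- slot_signal(bars, daily, s)
daily
```

The slot label doubles as the column name, so no column number has to be typed at all.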
-
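In your ifelse code, you are using column numbers that don't exist in the data. Please fix your example. It would help a lot if you provided the following in your question: the input, the desired output, and what you have tried so far.
-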
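You mention using Notepad to write code; personally I really like Notepad++. It is a very powerful code editor with good copy-paste and find-and-replace features. There is a third-party tool called "NppToR" that I use together with Notepad++.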
-
It is still unclear what you are trying to do. Please read How to Ask and how to provide a reproducible example.
-
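The real answer to your question is "don't". What you are doing is very error-prone and hard to change and maintain. Just don't do it. If you do, future you will hate past you.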
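One possible alternative, sketched under assumptions rather than offered as the definitive fix: keep the bars in long format, join the per-day thresholds once, compute every signal in one vectorised step, and only then spread the slots into columns. The names `bars`, `daily`, `a_up`, `a_down`, `long` and `wide` are hypothetical and the values are invented. The arithmetic shortcut matches the nested ifelse only if each bar's High is at least its Low and the upper level is at least the lower level.

```r
# Toy inputs; all values are invented for illustration.
bars <- data.frame(
  Date = as.POSIXct(c("2016-01-11 09:45:00", "2016-01-11 09:50:00",
                      "2016-01-12 09:45:00", "2016-01-12 09:50:00"), tz = "UTC"),
  High = c(164.40, 163.90, 165.50, 165.60),
  Low  = c(164.10, 163.70, 165.35, 165.40)
)
daily <- data.frame(
  day    = c("2016-01-11", "2016-01-12"),
  a_up   = c(164.567344, 165.295964),
  a_down = c(163.992656, 164.774036),
  stringsAsFactors = FALSE
)

# Derive the join key and the slot label from the timestamp.
bars$day  <- format(bars$Date, "%Y-%m-%d")
bars$slot <- format(bars$Date, "%H:%M")

# Attach each day's thresholds to every bar of that day.
long <- merge(bars, daily, by = "day")

# TRUE/FALSE coerce to 1/0, so this yields 1, -1 or 0 without ifelse.
long$signal <- (long$Low > long$a_up) - (long$High < long$a_down)

# Spread to one row per day and one column per slot.
wide <- tapply(long$signal, list(day = long$day, slot = long$slot), sum)
wide
```

With this layout, adding more time slots means adding rows of data, not lines of code.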
Tags: r