【Posted】: 2018-12-28 17:45:28
【Problem description】:
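After downloading stock data with the quantmod package, I want to subset the data and compare the last row of each xts object with the row before it, using last() and lag().
First, I created a function that classifies volume into quartiles.
Second, I created a new dataset that filters which stocks in the list had a volume rank of 3 (3rd quartile) yesterday = "stocks_with3".
Now I want to subset the newly created "stocks_with3" dataset again.
Specifically, I want a TRUE/FALSE result from comparing yesterday's "Open" (using last) with the "Close" of the day before yesterday (using lag).
In other words, for the stocks whose volume was in the third quartile yesterday, I want to know whether yesterday's Open is less than or equal to the Close of the day before.
But when I run the subset, I get the error message: "incorrect number of dimensions".
My approach to the subset is to use last (to get the last available row in the xts) and lag (to compare it with the previous row).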
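To make the intended last/lag comparison concrete, here is a minimal sketch on a made-up three-day series. The object `px` and its values are invented for illustration and are not part of the downloaded data; the sketch assumes the xts package is installed.

```r
library(xts)

# Hypothetical three-day price series (values invented for illustration)
dates <- as.Date("2018-12-24") + 0:2
px <- xts(cbind(Open = c(10, 11, 9), Close = c(10.5, 10, 9.5)), order.by = dates)

# lag() on an xts object shifts values forward by one row,
# so each date now carries the previous row's Close (the first row becomes NA)
prev_close <- lag(px$Close, 1)

# last() keeps only the most recent row
last_open <- last(px$Open)

# xts aligns both operands on their shared dates before comparing,
# so only the last date survives in the result
result <- last_open <= prev_close
print(result)
```

Because the comparison is aligned by date, the one-row object on the left is matched against the lagged Close of the same date, which is the Close of the previous trading day.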
#Get stock list data
library(quantmod)
library(xts)
Symbols <- c("XOM","MSFT","JNJ","IBM","MRK","BAC","DIS","ORCL","LW","NYT","YELP")
start_date=as.Date("2018-06-01")
getSymbols(Symbols,from=start_date)
stock_data = sapply(.GlobalEnv, is.xts)
all_stocks <- do.call(list, mget(names(stock_data)[stock_data]))
#Function to split volume data into quartile ranks (1-4)
Volume_q_rank <- function(x) {
  #Extract the ticker from the first column name, e.g. "MSFT" from "MSFT.Open"
  stock_name <- stringi::stri_extract(names(x)[1], regex = "^[A-Z]+")
  stock_name <- paste0(stock_name, ".Volqrank")
  column_names <- c(names(x), stock_name)
  x$volqrank <- as.integer(cut(quantmod::Vo(x),
                               quantile(quantmod::Vo(x), probs = 0:4/4), include.lowest = TRUE))
  x <- setNames(x, column_names)
  return(x)
}
all_stocks <- lapply(all_stocks, Volume_q_rank)
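The quartile ranking inside Volume_q_rank can be tried on a plain numeric vector. The sketch below uses only base R; the `volume` values are invented and stand in for a stock's volume column.

```r
# Hypothetical daily volumes (invented values)
volume <- c(100, 250, 400, 550, 700, 850, 1000, 1150)

# Quartile break points: minimum, 25%, 50%, 75%, maximum
breaks <- quantile(volume, probs = 0:4 / 4)

# cut() assigns each value to one of the four intervals between the breaks.
# include.lowest = TRUE keeps the minimum in the first interval instead of NA.
# as.integer() turns the resulting factor into its level codes.
vol_rank <- as.integer(cut(volume, breaks, include.lowest = TRUE))
print(vol_rank)
```

With five break points there are four intervals, so the ranks run from 1 to 4, and a rank of 3 marks a value between the median and the 75th percentile.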
#Create a new dataset, using names() and which(), with the stocks whose latest volume rank is 3 (3rd quartile).
stock3 <- sapply(all_stocks, function(x) {last(x[, grep("\\.Volqrank",names(x))]) == 3})
stocks_with3 <- names(which(stock3 == TRUE))
#Here is where I get the error.
stock3_check <- sapply(stocks_with3, function(x) {last(x[, grep("\\.Open",names(x))]) <= lag(x[, grep("\\.Close", 1), names(x)])})
#The expected result is the same as running this for a single stock, but applied to all the stocks in the list:
last(all_stocks$MSFT$MSFT.Open) <= lag(all_stocks$MSFT$MSFT.Close, 1)
#But I get the error when trying to apply it to the whole list using "sapply", "last" and "lag"
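One possible reading of the error, offered as a sketch rather than a confirmed diagnosis: stocks_with3 holds ticker names, so sapply passes each function call a plain string, and two-dimensional indexing on a string fails. The example below reproduces that with base R only. The list `all_stocks_demo` and the vector `picked` are hypothetical stand-ins for all_stocks and stocks_with3, built from small matrices instead of xts objects.

```r
# A character string has no dimensions, so x[, 1] raises an error
x <- "MSFT"
msg <- tryCatch(x[, 1], error = function(e) conditionMessage(e))
print(msg)

# Hypothetical stand-in for all_stocks: a named list of small matrices
all_stocks_demo <- list(
  MSFT = cbind(MSFT.Open = c(100, 101), MSFT.Close = c(102, 103)),
  IBM  = cbind(IBM.Open = c(110, 111), IBM.Close = c(112, 113))
)
picked <- c("MSFT")  # stands in for stocks_with3

# Indexing the list by name hands each function call the object itself,
# so column lookups by name work again
opens <- sapply(all_stocks_demo[picked], function(s) {
  s[nrow(s), grep("\\.Open", colnames(s))]
})
print(opens)
```

Under that reading, iterating over all_stocks[stocks_with3] instead of stocks_with3 would give the function the xts objects it expects. Separately, in the failing line, grep("\\.Close", 1) searches the pattern in the value 1 and names(x) ends up as a third index, so the parentheses there may also need another look.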
Any suggestion will be appreciated.
Thank you very much.
【Discussion】: