【Question title】: Applying function to a subset of xts quantmod
【Posted】: 2020-10-06 16:10:50
【Question】:

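I am trying to get the standard deviation of a stock price by year, but I get the same value for every year.

I tried dplyr (group_by, summarise) and also a function, but had no luck with either; both return the same value, 67.0.

Presumably the whole data frame is being passed in without being subset. How can I fix this?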

library(quantmod)
library(tidyr)
library(dplyr)

#initial parameters
initialDate = as.Date('2010-01-01')
finalDate = Sys.Date()

ybeg = format(initialDate,"%Y")
yend = format(finalDate,"%Y")

ticker = "AAPL"

#getting stock prices
stock = getSymbols.yahoo(ticker, from=initialDate, auto.assign = FALSE)
stock = stock[,4] #working only with closing prices

Using dplyr:

#Attempt 1 with dplyr - not working, all values by year return the same

stock = stock %>% zoo::fortify.zoo()
stock$Date = stock$Index
separate(stock, Date, c("year","month","day"), sep="-") %>% 
   group_by(year) %>%
   summarise(stdev= sd(stock[,2]))

# A tibble: 11 x 2
#   year  stdev
#   <chr> <dbl>
# 1 2010   67.0
# 2 2011   67.0
#....
#10 2019   67.0
#11 2020   67.0

Using a function:

#Attempt 2 with function - not working - returns only one value instead of multiple

#getting stock prices
stock = getSymbols.yahoo(ticker, from=initialDate, auto.assign = FALSE)
stock = stock[,4] #working only with closing prices

#subsetting
years = as.character(seq(ybeg,yend,by=1))
years

calculate_stdev = function(series,years) {
  series[years] #subsetting by years, to be equivalent as stock["2010"], stock["2011"] e.g.
  sd(series[years][,1]) #calculate stdev on closing prices of the current subset
}

yearly.stdev = calculate_stdev(stock,years)

> yearly.stdev
[1] 67.04185

【Comments】:

    Tags: r dplyr subset quantmod


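    【Solution 1】:

    Use apply.yearly() (a convenience wrapper around the more general period.apply()) to call a function on yearly subsets of the xts object returned by getSymbols().

    You can use the Cl() function to extract the close column from the object returned by getSymbols().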

    stock = getSymbols("AAPL", from = "2010-01-01", auto.assign = FALSE)
    apply.yearly(Cl(stock), sd)
    ##            AAPL.Close
    ## 2010-12-31   5.365208
    ## 2011-12-30   3.703407
    ## 2012-12-31   9.568127
    ## 2013-12-31   6.412542
    ## 2014-12-31  13.371293
    ## 2015-12-31   7.683550
    ## 2016-12-30   7.640743
    ## 2017-12-29  14.621191
    ## 2018-12-31  20.593861
    ## 2019-12-31  34.538978
    ## 2020-06-19  29.577157
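
    The relationship between apply.yearly() and period.apply() can be checked without a network connection. The following is only a minimal sketch: it assumes the xts package is installed and uses made-up prices in a column named FAKE.Close (a hypothetical name, not real AAPL data).

```r
library(xts)

# Synthetic daily "closing prices" covering two calendar years (made-up data)
set.seed(42)
dates  <- seq(as.Date("2019-01-01"), as.Date("2020-12-31"), by = "day")
prices <- xts(100 + cumsum(rnorm(length(dates))), order.by = dates)
colnames(prices) <- "FAKE.Close"

# Convenience wrapper: one sd() per calendar year
yearly_sd <- apply.yearly(prices, sd)

# The same computation spelled out: endpoints() gives the last row of each year,
# and period.apply() calls FUN on the rows between consecutive endpoints
ep <- endpoints(prices, on = "years")
yearly_sd_manual <- period.apply(prices, INDEX = ep, FUN = sd)

# Cross-check one year against a plain string subset, as in stock["2010"]
sd_2019 <- sd(as.numeric(prices["2019"]))
```

    Both calls should return one row per year, stamped with the last observation date of that year, which matches the shape of the AAPL output above.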
    

    【Comments】:

      【Solution 2】:

      I don't know dplyr, but here is a data.table approach:

      library(data.table)
      
      # convert data.frame to data.table
      setDT(stock)
      
      # convert your Date column with content like "2020-06-17" from character to Date type
      stock[,Date:=as.Date(Date)]
      
      # calculate sd(price) grouped by year, assuming here your price column is named "price"
      stock[,sd(price),year(Date)]
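
      A self-contained variant of the same idea, as a sketch only: the data frame stock_dt and its values are made up, and the column names Date and price are the same assumptions as in the snippet above. With the data from the question, the price column produced by fortify.zoo() would presumably be AAPL.Close instead.

```r
library(data.table)

# Made-up prices: two years, three observations each
stock_dt <- data.frame(
  Date  = c("2019-01-02", "2019-06-03", "2019-12-30",
            "2020-01-02", "2020-06-01", "2020-12-30"),
  price = c(10, 12, 14, 100, 130, 160)
)

setDT(stock_dt)                      # convert to data.table by reference
stock_dt[, Date := as.Date(Date)]    # character -> Date

# One row per year; naming the result columns makes the output easier to read
yearly <- stock_dt[, .(stdev = sd(price)), by = .(year = year(Date))]
```

      Note that setDT() expects a data.frame or list, so an xts object has to be converted first, for example with zoo::fortify.zoo() as in the question.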
      

      【Comments】:

        【Solution 3】:

        Don't pass the name of the data frame again inside summarise(). Use the column name instead.

        separate(stock, Date, c("year","month","day"), sep="-") %>% 
          group_by(year) %>% 
          summarise(stdev = sd(AAPL.Close)) # <-- here
        # A tibble: 11 x 2
        #   year  stdev
        #   <chr> <dbl>
        # 1 2010   5.37
        # 2 2011   3.70
        # 3 2012   9.57
        # 4 2013   6.41
        # 5 2014  13.4 
        # 6 2015   7.68
        # 7 2016   7.64
        # 8 2017  14.6 
        # 9 2018  20.6 
        #10 2019  34.5 
        #11 2020  28.7 
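
        Why the original call returns the same number for every year can be reproduced on a tiny example. This is a sketch with made-up data; stock_df and FAKE.Close are hypothetical names. Inside summarise(), stock_df is not a column, so it resolves to the whole data frame in the calling environment and the grouping is ignored.

```r
library(dplyr)

# Made-up prices: two years, three observations each
stock_df <- data.frame(
  Date       = as.Date(c("2019-01-02", "2019-06-03", "2019-12-30",
                         "2020-01-02", "2020-06-01", "2020-12-30")),
  FAKE.Close = c(10, 12, 14, 100, 130, 160)
)
stock_df$year <- format(stock_df$Date, "%Y")

# Wrong: stock_df[, 2] is the full, ungrouped column, so every group
# gets the standard deviation of all six values
wrong <- stock_df %>%
  group_by(year) %>%
  summarise(stdev = sd(stock_df[, 2]))

# Right: a bare column name is evaluated separately within each group
right <- stock_df %>%
  group_by(year) %>%
  summarise(stdev = sd(FAKE.Close))
```

        Here wrong has the same stdev in both rows, while right gives 2 for 2019 and 30 for 2020.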
        

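        【Comments】:

        • Thanks @Edward. I was passing the data frame as stock[,2] so that this could be automated when working with several tickers in a for loop. I might rename the AAPL.Close column to a generic name so that it can be referenced the same way for every ticker.
        • @PedroCampos You can use Cl() to get the close column for any ticker returned by getSymbols(). You can also use apply.yearly() to call a function on yearly subsets.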
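
        If the dplyr route is kept for several tickers, one option that avoids both stock[,2] and renaming columns is to pass the column name as a string and look it up with the .data pronoun. This is a sketch under assumptions: yearly_sd_by_name() is a hypothetical helper, the data are made up, and the data frame is assumed to have a Date column of class Date.

```r
library(dplyr)

# Hypothetical helper: yearly sd of whichever price column is named in price_col
yearly_sd_by_name <- function(df, price_col) {
  df %>%
    mutate(year = format(Date, "%Y")) %>%
    group_by(year) %>%
    # .data[[...]] looks the column up inside the current group,
    # unlike df[, 2], which would refer to the whole data frame
    summarise(stdev = sd(.data[[price_col]]), .groups = "drop")
}

# Made-up example input with a ticker-specific column name
fake <- data.frame(
  Date       = as.Date(c("2019-01-02", "2019-06-03", "2019-12-30",
                         "2020-01-02", "2020-06-01", "2020-12-30")),
  FAKE.Close = c(10, 12, 14, 100, 130, 160)
)

res <- yearly_sd_by_name(fake, "FAKE.Close")
```

        In a loop over tickers, the column name could then be built per ticker, for example with paste0(ticker, ".Close").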