【问题标题】:R: How to change annual Data to monthly Data with specific assumptions?R:如何在特定假设下将年度数据更改为月度数据?
【发布时间】:2021-10-14 16:32:01
【问题描述】:

我有一个庞大的公司年度账面市值比数据集。我需要通过以下逻辑将这些转换为月度数据: 公司 i (stock_id) 从 6 月 y + 1 日到 5 月 y + 2 日的 BtoM 等于 y 年的 BtoM。我如何应用此逻辑并在庞大的数据集中捕获输出(我有 1968-2018 年的 n 个股票数据)?非常感谢您的每一次帮助! 可重现的代码:

library(tidyverse)

Date <- as.Date(c('1994-12-01', '1995-12-01', '1996-12-01', '1994-12-01', '1995-12-01', '1996- 
12-01'))
stock_id <- c('80482', '80482', '80482', '80483', '80483', '80483')
BtoM <- as.numeric(c('0.0111', '0.0079', '0.0293', '0.671', '0.721', '0.219'))

Book_to_Market <- data.frame(Date, stock_id, BtoM)
Book_to_Market <- Book_to_Market %>% 
mutate(stock_id = as.integer(stock_id))

这给出了:

              Date stock_id   BtoM
      1 1994-12-01    80482 0.0111
      2 1995-12-01    80482 0.0079
      3 1996-12-01    80482 0.0293
      4 1994-12-01    80483 0.6710
      5 1995-12-01    80483 0.7210
      6 1996-12-01    80483 0.2190

我想要的输出如下所示:

          Date_2 stock_id_2 BtoM_2
   1  1995-06-01      80482 0.0111
   2  1995-07-01      80482 0.0111
   3  1995-08-01      80482 0.0111
   4  1995-09-01      80482 0.0111
   5  1995-10-01      80482 0.0111
   6  1995-11-01      80482 0.0111
   7  1995-12-01      80482 0.0111
   8  1996-01-01      80482 0.0111
   9  1996-02-01      80482 0.0111
   10 1996-03-01      80482 0.0111
   11 1996-04-01      80482 0.0111
   12 1996-05-01      80482 0.0111
   13 1995-06-01      80483 0.6710
   14 1995-07-01      80483 0.6710
   15 1995-08-01      80483 0.6710
   16 1995-09-01      80483 0.6710
   17 1995-10-01      80483 0.6710
   18 1995-11-01      80483 0.6710
   19 1995-12-01      80483 0.6710
   20 1996-01-01      80483 0.6710
   21 1996-02-01      80483 0.6710
   22 1996-03-01      80483 0.6710
   23 1996-04-01      80483 0.6710
   24 1996-05-01      80483 0.6710

【问题讨论】:

    标签: r dataframe date tidyr


    【解决方案1】:
    library(dplyr) 
    library(lubridate)  # for `ymd()` and `month()`
    
    # generate list of dates for each stock_id:
    data.frame(Date = rep(seq.Date(as.Date("1995-06-01"),
                                   as.Date("1996-05-01"), 
                                   by = "month"), by = 2),
               stock_id = rep(c(80482, 80483), each = 12)) %>%
    
    # link back to a Dec 1st with particular year, based on current month
    mutate(rounded_date = ymd(
        paste(year(Date) - if_else(month(Date) >= 6, 1, 2), 1201))) %>%
      
    # join to source data 
    left_join(src %>% mutate(Date = ymd(Date)), 
                by = c("rounded_date" = "Date", "stock_id"))
    

    结果

             Date stock_id rounded_date   BtoM
    1  1995-06-01    80482   1994-12-01 0.0111
    2  1995-07-01    80482   1994-12-01 0.0111
    3  1995-08-01    80482   1994-12-01 0.0111
    4  1995-09-01    80482   1994-12-01 0.0111
    5  1995-10-01    80482   1994-12-01 0.0111
    6  1995-11-01    80482   1994-12-01 0.0111
    7  1995-12-01    80482   1994-12-01 0.0111
    8  1996-01-01    80482   1994-12-01 0.0111
    9  1996-02-01    80482   1994-12-01 0.0111
    10 1996-03-01    80482   1994-12-01 0.0111
    11 1996-04-01    80482   1994-12-01 0.0111
    12 1996-05-01    80482   1994-12-01 0.0111
    13 1995-06-01    80483   1994-12-01 0.6710
    14 1995-07-01    80483   1994-12-01 0.6710
    15 1995-08-01    80483   1994-12-01 0.6710
    16 1995-09-01    80483   1994-12-01 0.6710
    17 1995-10-01    80483   1994-12-01 0.6710
    18 1995-11-01    80483   1994-12-01 0.6710
    19 1995-12-01    80483   1994-12-01 0.6710
    20 1996-01-01    80483   1994-12-01 0.6710
    21 1996-02-01    80483   1994-12-01 0.6710
    22 1996-03-01    80483   1994-12-01 0.6710
    23 1996-04-01    80483   1994-12-01 0.6710
    24 1996-05-01    80483   1994-12-01 0.6710
    

    【讨论】:

    • 也有效,但不如我的特定数据集的crestor解决方案好。不过还是谢谢你!
    【解决方案2】:
    library(tidyverse)
    
    Book_to_Market <- data.frame(
      date =
        as.Date(
          c(
            '1994-12-01',
            '1995-12-01',
            '1996-12-01',
            '1994-12-01',
            '1995-12-01',
            '1996-12-01'
          )
        ),
      stock_id = c('80482', '80482', '80482', '80483', '80483', '80483'),
      BtoM = c('0.0111', '0.0079', '0.0293', '0.671', '0.721', '0.219')
    ) %>%
      mutate(stock_id = as.integer(stock_id),
             BtoM = as.numeric(BtoM))
    
    Book_to_Market %>%
      rowwise() %>%
      mutate(month = list(seq.Date(date - months(6), date + months(5), by = 'month'))) %>%
      unnest(cols = c(month)) %>%
      select(-date)
    #> # A tibble: 72 x 3
    #>    stock_id   BtoM month     
    #>       <int>  <dbl> <date>    
    #>  1    80482 0.0111 1994-06-01
    #>  2    80482 0.0111 1994-07-01
    #>  3    80482 0.0111 1994-08-01
    #>  4    80482 0.0111 1994-09-01
    #>  5    80482 0.0111 1994-10-01
    #>  6    80482 0.0111 1994-11-01
    #>  7    80482 0.0111 1994-12-01
    #>  8    80482 0.0111 1995-01-01
    #>  9    80482 0.0111 1995-02-01
    #> 10    80482 0.0111 1995-03-01
    #> # ... with 62 more rows
    

    【讨论】:

    • 工作得像个魅力!谢谢!
    猜你喜欢
    • 2015-11-01
    • 2021-03-08
    • 1970-01-01
    • 1970-01-01
    • 2023-03-31
    • 2020-01-07
    • 1970-01-01
    • 1970-01-01
    • 2014-07-03
    相关资源
    最近更新 更多