【Question title】: How to compute a time series of pairwise correlations
【Posted】: 2017-10-02 04:02:53
【Question】:

I have time series for multiple factors:

df = read.table(text="
    date        factor     stock    value
    30-Jun-17   DivYield    AAPL    0.05
    30-Jun-17   DivYield    GOOG    0.055
    30-Jun-17   DivYield    MSFT    0.02
    31-Jul-17   DivYield    AAPL    0.055
    31-Jul-17   DivYield    GOOG    0.05
    31-Jul-17   DivYield    MSFT    0.025
    30-Jun-17   PB          AAPL    12
    30-Jun-17   PB          GOOG    11
    30-Jun-17   PB          MSFT    16
    31-Jul-17   PB          AAPL    11
    31-Jul-17   PB          GOOG    12
    31-Jul-17   PB          MSFT    14
    30-Jun-17   ROE         AAPL    0.1
    30-Jun-17   ROE         GOOG    0.12
    30-Jun-17   ROE         MSFT    0.12
    31-Jul-17   ROE         AAPL    0.1
    31-Jul-17   ROE         GOOG    0.1
    31-Jul-17   ROE         MSFT    0.12
            ", header = TRUE)
df$date = lubridate::dmy(df$date)

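I need to compute the pairwise correlations between the factors, and I need to do this separately for each date. With Pearson correlation, the result would look something like: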

Date        Factor1  Factor2 Correlation.Time.Series
30-Jun-17   DivYield    PB      -0.998337488
30-Jun-17   DivYield    ROE     -0.381246426
30-Jun-17   PB          ROE     0.327326835
31-Jul-17   DivYield    PB      -0.984324138
31-Jul-17   DivYield    ROE     -0.987829161
31-Jul-17   PB          ROE     0.944911183

Any ideas on how to tackle this?

Here is my first attempt:

library(tidyverse)
df.spread = spread(df, key = factor, value = value)
first.attempt = df.spread %>%
    select(-stock) %>%
    group_by(date) %>%
    do(as.data.frame(cor(.[,-1])))

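This seems to work. The problem is that the output does not show which factor each row belongs to, so the correlation pairs are unlabeled: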

        date   DivYield        PB         ROE
1 2017-06-30  1.0000000 -0.9983375 -0.3812464
2 2017-06-30 -0.9983375  1.0000000  0.3273268
3 2017-06-30 -0.3812464  0.3273268  1.0000000
4 2017-07-31  1.0000000 -0.9843241 -0.9878292
5 2017-07-31 -0.9843241  1.0000000  0.9449112
6 2017-07-31 -0.9878292  0.9449112  1.0000000

【Comments】:

  • Turn your data frame into a 3-dimensional array.
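
One possible reading of the comment above, as a rough sketch rather than what the commenter necessarily had in mind: build a stock x factor x date array with `tapply()`, then run `cor()` on each date slice. The names `arr` and `cor_arr` are made up for illustration, and the dates are written as ISO strings so no date parsing is needed.

```r
# The question's data in long form (dates as ISO strings for simplicity)
df <- data.frame(
  date   = rep(rep(c("2017-06-30", "2017-07-31"), each = 3), times = 3),
  factor = rep(c("DivYield", "PB", "ROE"), each = 6),
  stock  = rep(c("AAPL", "GOOG", "MSFT"), times = 6),
  value  = c(0.05, 0.055, 0.02, 0.055, 0.05, 0.025,
             12, 11, 16, 11, 12, 14,
             0.1, 0.12, 0.12, 0.1, 0.1, 0.12),
  stringsAsFactors = FALSE
)

# 3-D array: stock x factor x date (each cell holds exactly one value,
# so sum() just returns that value)
arr <- tapply(df$value, df[c("stock", "factor", "date")], sum)

# One factor-by-factor correlation matrix per date, stacked along a
# third dimension: factor x factor x date
cor_arr <- sapply(dimnames(arr)$date,
                  function(d) cor(arr[, , d]),
                  simplify = "array")

# Look up a single pair on a single date by name
cor_arr["DivYield", "PB", "2017-06-30"]
```

Because every dimension is named, each correlation can be pulled out by factor pair and date, which takes care of the missing labels.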

Tags: r correlation


【Solution 1】:

Check out the corrr package. Combined with mutate + map, it gives you a column of row names, so you can match up the correlation pairs.

df.spread %>%
  select(-stock) %>%
  group_by(date) %>%
  nest() %>%
  mutate(cor_tbls = map(data, ~corrr::correlate(.x))) %>%
  unnest(cor_tbls)

This gives you:

# A tibble: 6 x 5
        date  rowname   DivYield         PB        ROE
      <date>    <chr>      <dbl>      <dbl>      <dbl>
1 2017-06-30 DivYield         NA -0.9983375 -0.3812464
2 2017-06-30       PB -0.9983375         NA  0.3273268
3 2017-06-30      ROE -0.3812464  0.3273268         NA
4 2017-07-31 DivYield         NA -0.9843241 -0.9878292
5 2017-07-31       PB -0.9843241         NA  0.9449112
6 2017-07-31      ROE -0.9878292  0.9449112         NA
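
If you want the long Date / Factor1 / Factor2 layout from the question, one option is to keep only the upper triangle of each per-date correlation matrix. Below is a minimal sketch that uses base R `cor()` with `dplyr::group_modify()` instead of corrr. The helper name `pairwise_cor` is made up for illustration, and the data frame is simply the question's data already spread to wide form, with dates as ISO strings.

```r
library(dplyr)

# The question's data in wide form (one column per factor)
df.spread <- data.frame(
  date     = rep(c("2017-06-30", "2017-07-31"), each = 3),
  stock    = rep(c("AAPL", "GOOG", "MSFT"), times = 2),
  DivYield = c(0.05, 0.055, 0.02, 0.055, 0.05, 0.025),
  PB       = c(12, 11, 16, 11, 12, 14),
  ROE      = c(0.1, 0.12, 0.12, 0.1, 0.1, 0.12),
  stringsAsFactors = FALSE
)

# Hypothetical helper: correlation matrix -> one row per unordered pair
pairwise_cor <- function(d) {
  m   <- cor(d)                                # full symmetric matrix
  idx <- which(upper.tri(m), arr.ind = TRUE)   # each pair once, no diagonal
  data.frame(
    Factor1     = rownames(m)[idx[, "row"]],
    Factor2     = colnames(m)[idx[, "col"]],
    Correlation = m[idx],
    stringsAsFactors = FALSE
  )
}

result <- df.spread %>%
  select(-stock) %>%
  group_by(date) %>%
  group_modify(~ pairwise_cor(.x)) %>%   # .x holds only the factor columns
  ungroup()

result
```

Using `upper.tri()` drops both the diagonal and the mirrored duplicates, so each date contributes exactly one row per factor pair.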

【Comments】:
