【问题标题】:Component reconstruction for multivariate lagged time series多元滞后时间序列的分量重建
【发布时间】:2021-01-09 14:55:07
【问题描述】:

我正在尝试使用蒙特卡罗测试编写多元奇异谱分析。就此而言,我正在研究一个代码片段,该代码片段可以使用输入序列的 pca/ssa 分解产生的滞后轨迹矩阵和投影基 (ST-PC) 来重建输入序列。附加的代码片段适用于滞后的单变量(即单一)时间序列,但我正在努力为滞后的多变量时间序列进行这种重建。我在数学上不太了解这个过程,而且——毫不奇怪——我也没有设法对其进行编程。有用的链接附在随附代码的功能描述中。输入数据的格式应为(时间 * 系列数),例如 288x3 表示 288 个时间级别的 3 个时间系列。

希望你能帮帮我!

import numpy as np

def lagged_covariance_matrix(data, M):
    """ Computes the lagged covariance matrix using the Broomhead & King method 
    
    Background: Plaut, G., & Vautard, R. (1994). Spells of low-frequency oscillations and 
    weather regimes in the Northern Hemisphere. Journal of the atmospheric sciences, 51(2), 210-236.
    
    Arguments:
        data : pxn time series, where p denotes the length of the time series and n the number of channels 
        M : window length """

    # explicitely 'add' spatial dimension if input is a single time series    
    if np.ndim(data) == 1:
        data = np.reshape(data,(len(data),1))
    
    T = data.shape[0]    
    L = data.shape[1]    
    N = T - M + 1         
    
    X = np.zeros((T, L, M))
    
    for i in range(M):
        X[:,:,i] = np.roll(data, -i, axis = 0)
    
    X = X[:N]
    
    # X constitutes the trajectory matrix and is a stacked hankel matrix
    X = np.reshape(X, (N, M*L), order = 'C') # https://www.jstatsoft.org/article/viewFile/v067i02/v67i02.pdf
    
    # choose the smallest projection basis for computation of the covariance matrix    
    if M*L >= N:        
        return 1/(M*L) * X.dot(X.T), X
    
    else:
        return 1/N * X.T.dot(X), X
    
def sort_by_eigenvalues(eigenvalues, PCs): 
    """ Sorts the PCs and eigenvalues by descending size of the eigenvalues """
    
    desc = np.argsort(-eigenvalues)
    
    return eigenvalues[desc], PCs[:,desc]

def Reconstruction(M, E, X):
    """ Reconstructs the series as the sum of M subseries.
    
    See: https://en.wikipedia.org/wiki/Singular_spectrum_analysis, 'Basic SSA' &
    the work of Vivien Sainte Fare Garnot on univariate time series (https://github.com/VSainteuf/mcssa)
    
    Arguments: 
        M : window length 
        E : eigenvector basis 
        X : trajectory matrix """
       
    time = len(X) + M - 1
    RC = np.zeros((time, M))
       
    # step 3: grouping
    for i in range(M):
        d = np.zeros(M)
        d[i] = 1
        I = np.diag(d)
        
        Q = np.flipud(X @ E @ I @ E.T)
        
        # step 4: diagonal averaging        
        for k in range(time):
            RC[k, i] = np.diagonal(Q, offset = -(time - M - k)).mean()

    return RC 

#=====================================================================================================
#=====================================================================================================
#=====================================================================================================

# input data
data = None

# number of lags a.k.a. window length
M = 45 # M = 1 means no lag  

covmat, X = lagged_covariance_matrix(data, M)        

# get the eigenvalues and vectors of the covariance matrix
vals, vecs = np.linalg.eig(covmat)
eig_data, eigvec_data = sort_by_eigenvalues(vals, vecs)

# component reconstruction
recons_data = Reconstruction(M, eigvec_data, X)

【问题讨论】:

    标签: pca


    【解决方案1】:

    以下工作但不直接使用投影底座 (ST-PC)。因此,最初的问题仍然存在,但这已经有很大帮助并为我解决了问题。该代码片段利用了 ST-PCs 投影基与从滞后轨迹矩阵的单值分解获得的 u & vt 矩阵之间的相似性。我认为它给出的答案与使用 ST-PC 投影底座获得的答案相同?

    def lag_reconstruction(data, X, M, pairs = None):
        """ Reconstructs the series as the sum of M subseries using the lagged trajectory matrix. 
            Based on equation 2.9 of Plaut, G., & Vautard, R. (1994). Spells of low-frequency oscillations and weather regimes in the Northern Hemisphere. Journal of Atmospheric Sciences, 51(2), 210-236.
            Inspired by work of R. van Westen and C. Wieners """    
    
        time = data.shape[0]    # number of time levels of the original series
        L = data.shape[1]       # number of input series
        N = time - M + 1
        
        u, s, vt = np.linalg.svd(X, full_matrices = False)
        rc = np.zeros((time, L, M))
        
        for t in range(time): 
            
            counter = 0
            
            for i in range(M): 
                    
                if t-i >= 0 and t-i < N:            
                    counter += 1
                    
                    if pairs:
                        for k in pairs:
                            rc[t,:,i] += u[t-i, k] * s[k] * vt[k, i*L : i*L + L]
                    else: 
                        for k in range(len(s)):  
                            rc[t,:,i] += u[t-i, k] * s[k] * vt[k, i*L : i*L + L]
        
            rc[t] = rc[t]/counter
                   
        return rc
    

    【讨论】:

      猜你喜欢
      • 2012-08-25
      • 1970-01-01
      • 1970-01-01
      • 2021-07-06
      • 2021-10-04
      • 1970-01-01
      • 1970-01-01
      • 2017-12-08
      • 2014-10-08
      相关资源
      最近更新 更多