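【Question title】:How to overcome the "'numpy.ndarray' object is not callable" error?
【Posted】:2021-11-19 14:28:39
【Question description】:

I have been looking at anomaly detection with PCA and autoencoders, using the code from the article "Machine learning for anomaly detection and condition monitoring". I tried to run the part that uses PCA with the Mahalanobis distance, but whenever I run it I get the exception message. The problem turns out to be in the covariance matrix function part, where the error 'numpy.ndarray' object is not callable occurs. I have tried creating new variables and converting my dataframe to a NumPy array, but nothing worked. What causes this error?

Code: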

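import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

def cov_matrix(data, verbose=False):
    # data = pd.DataFrame(data).to_numpy()
    print('Calculating the covariance matrix')
    covariance_matrix = np.cov(data, rowvar=False)
    print('Done calculating the covariance matrix')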
    if is_pos_def(covariance_matrix):
        inv_covariance_matrix = np.linalg.inv(covariance_matrix)
        if is_pos_def(inv_covariance_matrix):
            return covariance_matrix, inv_covariance_matrix
        else:
            print("Error: Inverse of Covariance Matrix is not positive definite!")
    else:
        print("Error: Covariance Matrix is not positive definite!")
        
def MahalanobisDist(inv_cov_matrix, mean_distr, data, verbose=False):
    inv_covariance_matrix = inv_cov_matrix
    vars_mean = mean_distr
    diff = data - vars_mean
    md = []
    for i in range(len(diff)):
        md.append(np.sqrt(diff[i].dot(inv_covariance_matrix).dot(diff[i])))
    return md

def MD_detectOutliers(dist, extreme=False, verbose=False):
    k = 3. if extreme else 2.
    threshold = np.mean(dist) * k
    outliers = []
    for i in range(len(dist)):
        if dist[i] >= threshold:
            outliers.append(i)  # index of the outlier
    return np.array(outliers)

def MD_threshold(dist, extreme=False, verbose=False):
    k = 3. if extreme else 2.
    threshold = np.mean(dist) * k
    return threshold

try:
    #### Main code:
    # Inputting the training and test dataframes:
    data_train = np.array(principalDf_C0.values)
    data_test_C1 = np.array(principalDf_C1.values)
    data_test_C2 = np.array(principalDf_C2.values)
    data_test_C3 = np.array(principalDf_C4.values)
    data_test_C4 = np.array(principalDf_C5.values)
    
    print('Training Dataframe: ', data_train[:,])
    print('Test1 Dataframe: ', data_test_C1)
    print('Test2 Dataframe: ', data_test_C2)
    print('Test3 Dataframe: ', data_test_C3)
    print('Test4 Dataframe: ', data_test_C4)
    
    data_train_df = pd.DataFrame(principalDf_C0.values)
    data_test_df_C1 =  pd.DataFrame(principalDf_C1.values)
    data_test_df_C2 =  pd.DataFrame(principalDf_C2.values)
    data_test_df_C3 =  pd.DataFrame(principalDf_C4.values)
    data_test_df_C4 =  pd.DataFrame(principalDf_C5.values)
    
    # Calculating the covariance matrix:
    cov_matrix, inv_cov_matrix = cov_matrix(data=data_train)
    
    # Calculating the mean value for the input variables:
    mean_distr = data_train_df.mean(axis=0)
    
    # Calculating the Mahalanobis distance and threshold value to flag datapoints as an anomaly:
    dist_test_C1 = MahalanobisDist(inv_cov_matrix, mean_distr, data_test_df_C1, verbose=True)
    dist_test_C2 = MahalanobisDist(inv_cov_matrix, mean_distr, data_test_df_C2, verbose=True)
    dist_test_C3 = MahalanobisDist(inv_cov_matrix, mean_distr, data_test_df_C3, verbose=True)
    dist_test_C4 = MahalanobisDist(inv_cov_matrix, mean_distr, data_test_df_C4, verbose=True)
    dist_train = MahalanobisDist(inv_cov_matrix, mean_distr, data_train_df, verbose=True)
    threshold = MD_threshold(dist_train, extreme = True)

    # Distribution of Threshold value for flagging an anomaly:
    plt.figure()
    sns.distplot(np.square(dist_train),bins = 10, kde= False)
    # plt.xlim([0.0,15])
    plt.show()
    
    plt.figure()
    sns.distplot(dist_train, bins = 10, kde= True, color = 'green');
    # plt.xlim([0.0,5])
    plt.xlabel('Mahalanobis dist')
    plt.show()
    
    anomaly_train = pd.DataFrame(index=data_train_df.index)
    anomaly_train['Mob dist']= dist_train
    anomaly_train['Thresh'] = threshold
    # If Mob dist above threshold: Flag as anomaly
    anomaly_train['Anomaly'] = anomaly_train['Mob dist'] > anomaly_train['Thresh']
    anomaly_train.index = X_train_PCA.index
    
    anomaly_C1 = pd.DataFrame(index=data_test_df_C1.index)
    anomaly_C1['Mob dist']= dist_test_C1
    anomaly_C1['Thresh'] = threshold
    # If Mob dist above threshold: Flag as anomaly
    anomaly_C1['Anomaly'] = anomaly_C1['Mob dist'] > anomaly_C1['Thresh']
    anomaly_C1.index = data_test_df_C1.index
    anomaly_C1.head()
    
    anomaly_C2 = pd.DataFrame(index=data_test_df_C2.index)
    anomaly_C2['Mob dist']= dist_test_C2
    anomaly_C2['Thresh'] = threshold
    # If Mob dist above threshold: Flag as anomaly
    anomaly_C2['Anomaly'] = anomaly_C2['Mob dist'] > anomaly_C2['Thresh']
    anomaly_C2.index = data_test_df_C2.index
    anomaly_C2.head()
    
    anomaly_C3 = pd.DataFrame(index=data_test_df_C3.index)
    anomaly_C3['Mob dist']= dist_test_C3
    anomaly_C3['Thresh'] = threshold
    # If Mob dist above threshold: Flag as anomaly
    anomaly_C3['Anomaly'] = anomaly_C3['Mob dist'] > anomaly_C3['Thresh']
    anomaly_C3.index = data_test_df_C3.index
    anomaly_C3.head()
    
    anomaly_C4 = pd.DataFrame(index=data_test_df_C4.index)
    anomaly_C4['Mob dist']= dist_test_C4
    anomaly_C4['Thresh'] = threshold
    # If Mob dist above threshold: Flag as anomaly
    anomaly_C4['Anomaly'] = anomaly_C4['Mob dist'] > anomaly_C4['Thresh']
    anomaly_C4.index = data_test_df_C4.index
    anomaly_C4.head()

    final_scored = pd.concat([anomaly_train, anomaly_C1, anomaly_C2, anomaly_C3, anomaly_C4])
    print(final_scored)
except Exception:
    print('Cannot implement Anomaly detection using Mahalanobis distance metric')
    pass

【Comments】:

  • Could you include the traceback, or narrow down where the error occurs?
  • @ogdenkev On this line: cov_matrix, inv_cov_matrix = cov_matrix(data=data_train)
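The question's `except Exception:` block prints a fixed message, which hides the traceback asked for above. A minimal sketch of one way to surface it, assuming the standard-library `traceback` module is acceptable; the `np.eye(2)` array is only a hypothetical stand-in for the array that replaced the function:

```python
import traceback

import numpy as np

# Hypothetical stand-in: the name cov_matrix already refers to an array.
cov_matrix = np.eye(2)

try:
    cov_matrix(data=np.zeros((3, 2)))
except Exception:
    # Capture the full traceback instead of printing only a fixed message.
    tb_text = traceback.format_exc()
    print('Cannot implement Anomaly detection using Mahalanobis distance metric')
    print(tb_text)
```

The captured text names the exception type and the failing call, which is usually enough to locate the offending line.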

Tags: python arrays dataframe numpy


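【Solution 1】:

Going by your comment, you have a name collision between the variable cov_matrix and the function cov_matrix(): the assignment rebinds the name to the returned array.

Change that line to, for example,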

matrix, inv_matrix = cov_matrix(data=data_train)

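and update the rest of your code accordingly, or rename cov_matrix(). A good convention is that functions which return things should have a verb in their name, e.g. generate_cov_matrix() or calculate_cov_matrix().*

(Yes, the code should run once, since as far as I can see you never call cov_matrix() again afterwards. My guess is that you are using a persistent interpreter session and re-evaluating the code after cov_matrix() has already been overwritten.)

*This convention assumes that functions exist to have side effects and only exceptionally return things. Of course, if you write in a functional style, where side effects are the exception rather than the rule, you may want to invert it or follow another convention altogether.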
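The collision and the re-evaluation scenario described above can be reproduced in a few lines. This is only a sketch: the simplified `cov_matrix()` body, the random `data_train`, and the name `calculate_cov_matrix()` are illustrative assumptions, not the asker's actual code or data.

```python
import numpy as np

def cov_matrix(data):
    """Simplified stand-in: return the covariance matrix and its inverse."""
    cov = np.cov(data, rowvar=False)
    return cov, np.linalg.inv(cov)

rng = np.random.default_rng(0)
data_train = rng.normal(size=(50, 2))  # hypothetical stand-in for the PCA scores

# First evaluation works, but rebinds the name cov_matrix to an ndarray.
cov_matrix, inv_cov_matrix = cov_matrix(data=data_train)

# Second evaluation (e.g. re-running the cell): the name is now an array.
try:
    cov_matrix, inv_cov_matrix = cov_matrix(data=data_train)
    error_message = None
except TypeError as exc:
    error_message = str(exc)
print(error_message)

# Possible fix: give the function a verb name so results never shadow it.
def calculate_cov_matrix(data):
    cov = np.cov(data, rowvar=False)
    return cov, np.linalg.inv(cov)

cov, inv_cov = calculate_cov_matrix(data_train)
cov, inv_cov = calculate_cov_matrix(data_train)  # safe to repeat
```

With distinct names for the function and its results, the same lines can be re-evaluated any number of times in one session.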

【Comments】:

  • Thanks! That worked.
【Solution 2】:

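My guess is that the problem comes from having both a variable named cov_matrix and a function named cov_matrix. At some point the function was overwritten by the variable, which is a numpy.ndarray. Later you try to call the function cov_matrix(), but the name now refers to the variable, i.e. the NumPy array.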
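One way to confirm this diagnosis in a live session is to check what the name currently refers to. A minimal sketch using the built-in `callable()` and `type()`; the small input array is made up for illustration:

```python
import numpy as np

def cov_matrix(data):
    return np.cov(data, rowvar=False)

is_callable_before = callable(cov_matrix)  # the name refers to the function

# Assigning the result to the same name replaces the function with an array.
cov_matrix = cov_matrix(np.array([[1.0, 2.0], [3.0, 5.0], [4.0, 4.0]]))

is_callable_after = callable(cov_matrix)   # the name now refers to the array
type_after = type(cov_matrix).__name__

print(is_callable_before, is_callable_after, type_after)
```

If `callable(cov_matrix)` is false before the call, the function definition has to be re-run, or the names separated, before calling it again.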

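【Comments】:

  • Oops. I didn't mean to steal your thunder. If you update your answer with the extra points from mine (i.e. the offending line and the persistent session) and @mention me so that I know, I'll delete mine.
  • Thanks! That worked.
  • OK, the OP has accepted my answer, so it's a bit late to delete it. Sorry though, you said exactly the same thing.
  • Thanks, @2e0byo! In any case, I'd rather your answer stayed, because you provided extra context and advice that I would be reluctant to add to my own answer even if I @mentioned you.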