【Question title】: Finding Optimal Value of K
【Posted】: 2018-11-28 08:31:38
【Question】:

How do I compute mean_distances, the mean distance from the centroid to each point in the cluster, for k clusters?

Formula:

My code:

def mean_distances(k, X):
"""
Arguments:

k -- int, number of clusters
X -- np.array, matrix of input features

Returns:

Array of shape (k, ), containing mean of sum distances 
    from centroid to each point in the cluster for k clusters
"""

### START CODE HERE ###
mod = KMeans(X, k)
clusters, final_centrs = mod.final_centroids()
dist = []
for i in range(k):
    d =  np.sum(np.linalg.norm((clusters[i] - final_centrs[i, :])**2)).mean()
    dist.append(d)
return dist
### END CODE HERE ###

But it does not work correctly. (PS: no sklearn, only numpy.)

【Comments】:

  • Where does KMeans() come from? Also: the indentation is broken.

Tags: python numpy k-means


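【Answer 1】:

You are taking the mean of each element of the outer sum (i.e. of each inner sum, which is already a single number) rather than the mean of the outer sum: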

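import numpy as np
# KMeans below is assumed to be the asker's own class: it is called as
# KMeans(X, k) and exposes final_centroids(), which sklearn.cluster.KMeans does not.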

def mean_distances(k, X):
    """
    Arguments:

        k -- int, number of clusters
        X -- np.array, matrix of input features

    Returns:

        Array of shape (k, ), containing mean of sum distances 
        from centroid to each point in the cluster for k clusters
    """

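    mod = KMeans(X, k)
    clusters, final_centrs = mod.final_centroids()
    dist = []
    for i in range(k):
        # Sum of distances for cluster i (expression kept from the question)
        d = np.sum(np.linalg.norm((clusters[i] - final_centrs[i, :])**2))
        dist.append(d)
    # dist is a plain list, which has no .mean(); use np.mean instead
    return np.mean(dist)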
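
Because the question's KMeans class is not shown, the following is only a sketch of the same idea using numpy alone. It assumes the clustering result is available as an array of cluster labels plus a centroid matrix; the function name mean_distances_per_cluster and the labels/centroids arguments are hypothetical and not part of the original code. It also assumes "distance" means plain Euclidean distance, computed per point with axis=1, since np.linalg.norm without an axis returns one number for the whole matrix.

```python
import numpy as np

def mean_distances_per_cluster(X, labels, centroids):
    """Mean Euclidean distance from each centroid to the points assigned to it.

    X         -- (n_samples, n_features) array of input features
    labels    -- (n_samples,) int array, cluster index of each point (assumed input)
    centroids -- (k, n_features) array of cluster centres (assumed input)
    Returns an array of shape (k,).
    """
    k = centroids.shape[0]
    dist = np.full(k, np.nan)  # NaN marks a cluster with no points
    for i in range(k):
        points = X[labels == i]
        if len(points) == 0:
            continue
        # axis=1 yields one distance per point instead of a single matrix norm
        per_point = np.linalg.norm(points - centroids[i], axis=1)
        dist[i] = per_point.mean()
    return dist

# Toy data: two clusters of two points each
X = np.array([[0.0, 0.0], [0.0, 2.0], [10.0, 0.0], [10.0, 4.0]])
labels = np.array([0, 0, 1, 1])
centroids = np.array([[0.0, 1.0], [10.0, 2.0]])

per_cluster = mean_distances_per_cluster(X, labels, centroids)  # one value per cluster
overall = per_cluster.mean()                                    # single summary value
```

Returning the per-cluster array matches the docstring's shape (k, ); averaging it afterwards gives one number per choice of k, which is what you would compare when looking for the best k.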

【Comments】:
