【Posted】: 2020-09-24 16:05:50
【Question】:
I am computing my own distance matrix, as shown below, and I want to use it for clustering.
import numpy as np
from math import pi
#points containing time value in minutes
points = [100, 200, 600, 659, 700]
def convert_to_radian(x):
    # map minutes-of-day onto an angle in [0, 2*pi)
    return (x / (24 * 60)) * 2 * pi
rad_function = np.vectorize(convert_to_radian)
points_rad = rad_function(points)
#generate distance matrix from each point
dist = points_rad[None,:] - points_rad[:, None]
#Assign shortest distances from each point
dist[((dist > pi) & (dist <= (2*pi)))] = dist[((dist > pi) & (dist <= (2*pi)))] -(2*pi)
dist[((dist > (-2*pi)) & (dist <= (-1*pi)))] = dist[((dist > (-2*pi)) & (dist <= (-1*pi)))] + (2*pi)
dist = abs(dist)
#check dist
print(dist)
My distance matrix looks like this.
[[0.         0.43633231 2.18166156 2.43909763 2.61799388]
 [0.43633231 0.         1.74532925 2.00276532 2.18166156]
 [2.18166156 1.74532925 0.         0.25743606 0.43633231]
 [2.43909763 2.00276532 0.25743606 0.         0.17889625]
 [2.61799388 2.18166156 0.43633231 0.17889625 0.        ]]
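For reference, the wrap-around logic can be written more compactly. The following is a minimal sketch, not the code from the question: it assumes the shortest distance on the circle is the smaller of the absolute angular difference and its complement 2*pi minus that difference. The names `diff` and `dist` are chosen here for illustration.

```python
import numpy as np

# Times of day in minutes, taken from the question
points = np.array([100, 200, 600, 659, 700])

# Convert minutes to an angle on a 24-hour circle
points_rad = points / (24 * 60) * 2 * np.pi

# Absolute angular difference for every pair, in [0, 2*pi)
diff = np.abs(points_rad[None, :] - points_rad[:, None])

# The shorter arc is either diff or its complement 2*pi - diff
dist = np.minimum(diff, 2 * np.pi - diff)

print(np.round(dist, 8))
```

For the five points above this should agree with the matrix printed in the question, since none of the pairwise differences exceeds pi.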
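I would like to get 2 clusters (for example, cluster 1: 0, 1 and cluster 2: 2, 3, 4) by running kmeans on the precomputed distance matrix above.
When I checked the KMeans documentation, precomputed distances appear to be deprecated -> precompute_distances='deprecated'.
Documentation link: https://scikit-learn.org/stable/modules/generated/sklearn.cluster.KMeans.html
I would like to know what other options there are for running kmeans with my precomputed distance matrix.
I am happy to provide more details if needed.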
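One possible option, sketched here under assumptions rather than as a definitive answer: hierarchical clustering works directly on a precomputed distance matrix, whereas k-means needs the raw feature vectors to compute centroids. The example below uses SciPy's `linkage` and `fcluster`, with average linkage as an arbitrary choice, and rebuilds the distance matrix in a compact form so that it is self-contained.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform

# Rebuild the circular distance matrix from the question's data
points = np.array([100, 200, 600, 659, 700])
points_rad = points / (24 * 60) * 2 * np.pi
diff = np.abs(points_rad[None, :] - points_rad[:, None])
dist = np.minimum(diff, 2 * np.pi - diff)

# linkage expects the condensed (upper-triangle) form of the matrix
condensed = squareform(dist, checks=False)

# Average linkage is an assumption made here; other methods are available
Z = linkage(condensed, method="average")

# Cut the tree so that at most 2 flat clusters remain
labels = fcluster(Z, t=2, criterion="maxclust")
print(labels)
```

With this data, points 0 and 1 should end up in one cluster and points 2, 3 and 4 in the other. The numeric label given to each cluster is arbitrary.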
【Comments】:
Tags: python scikit-learn k-means