【Question Title】: Equal Group Clustering Algorithm
【Posted】: 2020-12-21 10:49:09
【Question】:

I have 300 collection points that I need to cluster by geographic coordinates. Every cluster must contain at most 8 and at least 5 points. How can I do this in Python?

Refer to the image for sample output.

【Question Comments】:

  • Please share the desired output and explain what you want.
  • I want output like this:

      Latitude     Longitude    Route code
      18.2521536   76.4982399   Cluster_01
      18.2526484   76.4976308   Cluster_01
      18.2526006   76.4972857   Cluster_01
      18.2533365   76.4975484   Cluster_01
      18.2535941   76.4987773   Cluster_01
      18.2535462   76.4986933   Cluster_01
      18.2503783   76.5116291   Cluster_02
      18.2512383   76.5085317   Cluster_02
      18.2506268   76.5082113   Cluster_02
      18.2516204   76.5064285   Cluster_02

    I have 300 such coordinates, and they must be clustered with a maximum cluster size of 8 and a minimum of 6.

Tags: python cluster-computing sklearn-pandas


【Solution 1】:

My question answers your question. You need to replace position with your geo-coordinate data and change x, y to Latitude, Longitude.

from pandas import DataFrame
from sklearn.cluster import KMeans

# position: your list of (Latitude, Longitude) pairs
dfcluster = DataFrame(position, columns=['x', 'y'])
kmeans = KMeans(n_clusters=4).fit(dfcluster)
centroids = kmeans.cluster_centers_
# for plot (requires: import matplotlib.pyplot as plt)
# plt.scatter(dfcluster['x'], dfcluster['y'], c=kmeans.labels_.astype(float), s=50, alpha=0.5)
# plt.scatter(centroids[:, 0], centroids[:, 1], c='red', s=50)
# plt.show()
dfcluster['cluster'] = kmeans.labels_
dfcluster = dfcluster.drop_duplicates(['x', 'y'], keep='last')
dfcluster = dfcluster.sort_values(['cluster', 'x', 'y'], ascending=True)

# head/tail only slice the sorted rows; they do not limit cluster sizes
n = 8
dfcluster1 = dfcluster.head(n)  # first 8 rows
n = 5
dfcluster2 = dfcluster.tail(n)  # last 5 rows
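
Plain KMeans does not bound cluster sizes, so the number of clusters at least has to be chosen such that sizes between 5 and 8 are possible at all. A minimal sketch of that arithmetic follows; the helper name feasible_cluster_counts is made up for illustration and is not part of scikit-learn:

```python
import math

def feasible_cluster_counts(n_points, size_min, size_max):
    """Return (k_low, k_high), the range of cluster counts k for which
    size_min * k <= n_points <= size_max * k can hold."""
    k_low = math.ceil(n_points / size_max)  # fewest clusters: every cluster as full as allowed
    k_high = n_points // size_min           # most clusters: every cluster as small as allowed
    if k_low > k_high:
        raise ValueError("no cluster count satisfies the size bounds")
    return k_low, k_high

k_low, k_high = feasible_cluster_counts(300, size_min=5, size_max=8)
print(k_low, k_high)  # 38 60
```

For 300 points, any n_clusters from 38 to 60 is compatible with the bounds. Picking a value in that range only makes the bounds reachable; KMeans itself may still produce clusters outside them.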

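Alternatively, for equal-size groups, use the Size Constrained Clustering solver.

Install it with one of the following commands:

pip install size-constrained-clustering
pip install git+https://github.com/jingw2/size_constrained_clustering.git

To start, you can use its minmax, flow, or Heuristics methods.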

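import numpy as np
# import path as shown in the package README
from size_constrained_clustering import equal

n_samples = 2000
n_clusters = 3
X = np.random.rand(n_samples, 2)

# equal-size k-means solved as a min-cost flow problem
model = equal.SameSizeKMeansMinCostFlow(n_clusters)
# alternative: heuristic solver
# model = equal.SameSizeKMeansHeuristics(n_clusters)
model.fit(X)
centers = model.cluster_centers_
labels = model.labels_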
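
If installing the package is not an option, the size bounds from the question can also be met without any dependency. The sketch below is not the package's min-cost-flow method; it swaps in a simpler technique, recursive bisection (k-d tree style), which cuts the points along their wider axis and only accepts cuts that leave both sides splittable into groups of 5 to 8. The names is_feasible and bounded_bisect are hypothetical, the sample points are random, and latitude/longitude are treated as planar coordinates, which is an approximation:

```python
import random

def is_feasible(n, size_min, size_max):
    """True if n points can be cut into groups of size_min..size_max."""
    k = -(-n // size_max)  # ceil(n / size_max): fewest groups that can hold n points
    return k >= 1 and k * size_min <= n

def bounded_bisect(points, size_min=5, size_max=8):
    """Recursively split points along their wider axis until every
    group has between size_min and size_max members."""
    n = len(points)
    if not is_feasible(n, size_min, size_max):
        raise ValueError(f"{n} points cannot form groups of {size_min}-{size_max}")
    if n <= size_max:
        return [points]
    # Sort along the axis (0 = latitude, 1 = longitude) with the larger spread
    spans = [max(p[a] for p in points) - min(p[a] for p in points) for a in (0, 1)]
    axis = 0 if spans[0] >= spans[1] else 1
    ordered = sorted(points, key=lambda p: p[axis])
    # Take the cut closest to the middle that leaves both sides feasible
    half = n // 2
    for offset in range(n):
        for m in (half - offset, half + offset):
            if (0 < m < n and is_feasible(m, size_min, size_max)
                    and is_feasible(n - m, size_min, size_max)):
                return (bounded_bisect(ordered[:m], size_min, size_max)
                        + bounded_bisect(ordered[m:], size_min, size_max))
    # Not expected to be reached: a feasible n larger than size_max has a feasible cut
    raise ValueError("no feasible cut found")

# Made-up sample data: 300 random points near the coordinates in the question
rng = random.Random(0)
points = [(18.25 + rng.uniform(-0.01, 0.01), 76.50 + rng.uniform(-0.01, 0.01))
          for _ in range(300)]
groups = bounded_bisect(points, size_min=5, size_max=8)

# Print the first two groups in the Latitude / Longitude / Route code layout
for i, group in enumerate(groups[:2], start=1):
    for lat, lon in group:
        print(f"{lat:.7f} {lon:.7f} Cluster_{i:02d}")
```

The groups come out as axis-aligned rectangles rather than k-means style cells, so they can be less compact than what a min-cost-flow solver finds. In exchange, the size bounds hold by construction.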

【Comments】:
