【Question title】: Clustering near lines using coordinates in Python
【Posted】: 2022-11-25 18:55:15
【Question description】:

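I have a list containing the x and y coordinates of the start and end points of some lines. The lines as CSV (one row per line: x1, y1, x2, y2):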

331,178,486,232
185,215,386,308
172,343,334,419
406,128,570,165
306,106,569,166
159,210,379,299
236,143,526,248
303,83,516,178
409,62,572,106
26,287,372,427
31,288,271,381
193,228,432,330
120,196,432,329
136,200,374,297
111,189,336,289
284,186,560,249
333,202,577,254
229,194,522,219
349,111,553,165
121,322,342,416
78,303,285,391
103,315,340,415

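Plotted on my example image, the lines look like this. (Image: lines plotted)

I want to group lines that are close to each other into clusters and create a single line for each cluster. For this example I want 5 clusters. Afterwards I want to compute the distance from each cluster line to the next cluster line.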

import csv
import math

# Read each CSV row as one line segment: x1, y1, x2, y2
lines = []
with open("lines.csv", newline="") as file:
    for data in csv.reader(file):
        x1, y1, x2, y2 = (int(value) for value in data)
        lines.append({'x1': x1, 'y1': y1, 'x2': x2, 'y2': y2})

# Compare the first two lines against every line in the list
for line in lines[:2]:
    for line_rev in lines:
        # Euclidean distance between the two start points
        start_distance = math.hypot(line['x1'] - line_rev['x1'], line['y1'] - line_rev['y1'])
        # Euclidean distance between the two end points
        end_distance = math.hypot(line['x2'] - line_rev['x2'], line['y2'] - line_rev['y2'])
        avg_distance = (start_distance + end_distance) / 2
        if avg_distance < 100:
            print(f"distance: {avg_distance}")

    print("############## next line ##############")

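I have written some code that calculates the distance between each pair of lines, but I cannot find a way to save the lines that are close to each other in separate lists.

Does anyone know how to do this, or is there another way to create the clusters? I am also thinking about using the midpoints instead of the start and end points.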

【Comments】:

    Tags: python list line cluster-analysis distance


    【Solution 1】:

    You can run a clustering algorithm on it, but it has trouble with the isolated line at the end.


    data = [[331,178,486,232],
    [185,215,386,308],
    [172,343,334,419],
    [406,128,570,165],
    [306,106,569,166],
    [159,210,379,299],
    [236,143,526,248],
    [303,83,516,178],
    [409,62,572,106],
    [26,287,372,427],
    [31,288,271,381],
    [193,228,432,330],
    [120,196,432,329],
    [136,200,374,297],
    [111,189,336,289],
    [284,186,560,249],
    [333,202,577,254],
    [229,194,522,219],
    [349,111,553,165],
    [121,322,342,416],
    [78,303,285,391],
    [103,315,340,415]]
    
    import pandas as pd
    import matplotlib.pyplot as plt
    from matplotlib import collections as mc
    from sklearn.cluster import MiniBatchKMeans
    
    lines = pd.DataFrame(data)
    
    CLUSTERS = 5
    
    # Each row (x1, y1, x2, y2) is treated as one 4-dimensional sample
    X = lines.values
    
    kmeans = MiniBatchKMeans(n_clusters=CLUSTERS, max_no_improvement=100).fit(X)
    
    # Convert rows into [(x1, y1), (x2, y2)] segments for plotting
    lines_segments = [[(l[0], l[1]), (l[2], l[3])] for l in lines.values]
    center_segments = [[(l[0], l[1]), (l[2], l[3])] for l in kmeans.cluster_centers_]
    
    line_collection = mc.LineCollection(lines_segments, linewidths=2)
    centers = mc.LineCollection(center_segments, colors='red', linewidths=4, alpha=1)
    
    fig, ax = plt.subplots()
    
    ax.add_collection(line_collection)
    ax.add_collection(centers)
    ax.autoscale()
    ax.margins(0.1)
    plt.show()
    

    You can inspect the cluster centers, each of which is one averaged line (x1, y1, x2, y2):

    kmeans.cluster_centers_
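
    The question also asks for the lines of each cluster in separate lists and for the distance from one cluster line to the next. Below is a minimal sketch of that follow-up, under assumptions that are not part of the answer above: the helper names `group_by_label`, `midpoint` and `gaps_between_centers` are hypothetical, "next" is taken to mean the neighbouring center line after sorting by the y coordinate of the midpoint, and the distance is measured between midpoints. The labels in the demo are hand-written placeholders; with the fitted model you would pass `kmeans.labels_` and `kmeans.cluster_centers_` instead.

```python
import math
from collections import defaultdict


def group_by_label(segments, labels):
    """Collect the segments into one list per cluster label."""
    groups = defaultdict(list)
    for segment, label in zip(segments, labels):
        groups[int(label)].append(list(segment))
    return dict(groups)


def midpoint(segment):
    """Return the midpoint of a segment given as (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = segment
    return ((x1 + x2) / 2, (y1 + y2) / 2)


def gaps_between_centers(centers):
    """Sort center lines by midpoint y and measure neighbouring midpoint distances."""
    mids = sorted((midpoint(c) for c in centers), key=lambda m: m[1])
    return [math.hypot(b[0] - a[0], b[1] - a[1]) for a, b in zip(mids, mids[1:])]


# Demo on four lines taken from the question. The labels are placeholders
# (an assumption), standing in for kmeans.labels_.
segments = [
    [185, 215, 386, 308],
    [159, 210, 379, 299],
    [172, 343, 334, 419],
    [121, 322, 342, 416],
]
labels = [0, 0, 1, 1]

groups = group_by_label(segments, labels)

# Per-group mean of each coordinate, standing in for kmeans.cluster_centers_
centers = [
    [sum(column) / len(members) for column in zip(*members)]
    for members in groups.values()
]
gaps = gaps_between_centers(centers)

for label, members in groups.items():
    print(label, members)
print(gaps)
```

    Midpoint distance is only one possible definition of the gap between two cluster lines. If the lines are roughly parallel, a perpendicular distance may match the intent better.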
    

    【Comments】:
