【Question title】: KMeans usage in ELKI, comprehensive example
【Posted】: 2019-11-16 05:37:37
【Question】:

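I want to use a KMeans clustering algorithm from ELKI, such as Elkan's k-means, in Scala to obtain the cluster centroids. A comprehensive example would be helpful, because the ELKI documentation is somewhat hard to navigate.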

Tags: java scala cluster-analysis k-means elki


    【Answer 1】:

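    Assuming your data is already available in memory, you can use the Pure Java API to create a Database and pass that database to the clustering algorithm as input. Complete Scala code: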

    import scala.collection.JavaConverters._
    
    import de.lmu.ifi.dbs.elki.algorithm.clustering.kmeans.KMeansElkan
    import de.lmu.ifi.dbs.elki.algorithm.clustering.kmeans.initialization.RandomUniformGeneratedInitialMeans
    import de.lmu.ifi.dbs.elki.data.model.KMeansModel
    import de.lmu.ifi.dbs.elki.data.{Clustering, NumberVector}
    import de.lmu.ifi.dbs.elki.database.{Database, StaticArrayDatabase}
    import de.lmu.ifi.dbs.elki.datasource.ArrayAdapterDatabaseConnection
    import de.lmu.ifi.dbs.elki.distance.distancefunction.minkowski.SquaredEuclideanDistanceFunction
    import de.lmu.ifi.dbs.elki.utilities.random.RandomFactory
    
    
    def createDatabase(coords: Seq[(Double, Double)]): Database = {
      // Allocate 2-d array (matrix).
      val data = Array.ofDim[Double](coords.length, 2)
      // Fill the matrix
      coords.zipWithIndex.foreach {
        case ((x, y), idx) =>
          data.update(idx, Array(x, y))
      }
      // Create a database
      val db = new StaticArrayDatabase(new ArrayAdapterDatabaseConnection(data), null)
      // Load the data into the database
      db.initialize()
      db
    }
    
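    // `numberOfClustersForDemand` and `activities` come from the answerer's own code base and are not defined here
    val nClusters = numberOfClustersForDemand // Set your number of clusters
    val nIters = 1000 // Maximum number of k-means iterations
    // Convert the answerer's own type to a Seq of (Double, Double)
    val coords = activities.map(act => (act.getCoord.getX, act.getCoord.getY))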
    val db: Database = createDatabase(coords)
    // Create an instance of KMeansElkan clustering Algorithm
    val kmeans: KMeansElkan[NumberVector] = new KMeansElkan[NumberVector](SquaredEuclideanDistanceFunction.STATIC, nClusters, nIters,
      new RandomUniformGeneratedInitialMeans(RandomFactory.DEFAULT),true)
    // Run the algorithm
    val result: Clustering[KMeansModel] = kmeans.run(db)
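    // Show the results and collect them; ClusterInfo and Coord are the answerer's own types, not part of ELKI
    val clustersInfo: Array[ClusterInfo] = result.getAllClusters.asScala.zipWithIndex.map { case (cluster, idx) =>
      println(s"# $idx: ${cluster.getNameAutomatic}")
      println(s"Size: ${cluster.size()}")
      println(s"Model: ${cluster.getModel}")
      // getMean is the centroid; toVector makes the array print its values
      println(s"Center: ${cluster.getModel.getMean.toVector}")
      // Assumes getPrototype returns an array like getMean; toString on an array would only print its identity hash
      println(s"getPrototype: ${cluster.getModel.getPrototype.toVector}")
      ClusterInfo(cluster.size, new Coord(cluster.getModel.getMean))
    }.toArray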
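
The code above uses `ClusterInfo` and `Coord` without defining them. They appear to be the answerer's own helper types. A minimal sketch of stand-ins that would let the last lines compile could look like the following. The field names `x`, `y`, `size`, and `center` are assumptions for illustration, as is the 2-element mean array (matching the 2-d data built in `createDatabase`).

```scala
// Hypothetical stand-ins for the helper types the answer uses but does not define.
final case class Coord(x: Double, y: Double) {
  // Auxiliary constructor so that `new Coord(mean)` accepts a 2-element centroid array.
  // Throws ArrayIndexOutOfBoundsException if the array has fewer than two elements.
  def this(mean: Array[Double]) = this(mean(0), mean(1))
}

// Pairs a cluster's size with its centroid.
final case class ClusterInfo(size: Int, center: Coord)

object ClusterInfoDemo {
  def main(args: Array[String]): Unit = {
    // Stand-in for cluster.getModel.getMean from the clustering result.
    val mean = Array(1.5, -2.0)
    val info = ClusterInfo(3, new Coord(mean))
    println(info)
  }
}
```

With types like these, `clustersInfo.map(_.center)` would give the centroids as plain Scala values, independent of ELKI classes.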
    
