【Question title】: How to predict cluster membership with cmeans?
【Posted】: 2013-12-13 03:40:54
【Question】:

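I am using cmeans from the e1071 R package to cluster my data. I would like to predict cluster membership for new data, but I do not know how to write the predict function. Predicting hard cluster membership is simple (just assign each point to the nearest cluster center), but I do not know how to compute the membership values as they are given in cl$membership: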

cl <- cmeans( train, centers= 10, m= 1.08 )
# cl$membership contains the "soft" cluster membership
# the following line does not work, unfortunately
cl.new <- predict( cl, test )

# getting the hard cluster assignments is easy
predict.fclust <- function( cl, x ) { 
  which.cl <- function( xx ) 
    which.min( apply( cl$centers, 1, function( y ) sum( ( y - xx )^2 ) ) ) 
  ret <- apply( x, 1, which.cl )
  names( ret ) <- rownames( x )
  ret
}
# this works, but only predicts hard clustering
cl.new <- predict( cl, test )

【Comments】:

    Tags: r cluster-analysis


    【Answer 1】:

    The membership values are defined by the standard fuzzy c-means membership formula (see Wikipedia).
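
For reference, the fuzzy c-means membership of sample x_i in cluster j, as it is commonly stated, can be written as below. The symbols (w_ij for the membership, d_ij for the distance between sample i and center c_j, c for the number of clusters) are notation chosen here and do not come from the original post. The second form is the same quantity rewritten as normalized inverse distances, which is the shape the R code in this answer computes.

```latex
w_{ij}
  = \frac{1}{\displaystyle\sum_{k=1}^{c}
      \left( \frac{\lVert x_i - c_j \rVert}{\lVert x_i - c_k \rVert} \right)^{\frac{2}{m-1}}}
  = \frac{d_{ij}^{-\frac{2}{m-1}}}{\displaystyle\sum_{k=1}^{c} d_{ik}^{-\frac{2}{m-1}}},
\qquad d_{ij} = \lVert x_i - c_j \rVert,\quad m > 1 .
```

Any common positive factor applied to all distances of one sample (such as dividing by their sum) cancels in the ratio, so it does not change the result.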

    Consider this example from the cmeans help page:

    library("e1071")
    set.seed(1)
    x <- rbind(matrix(rnorm(100,sd=0.3), ncol=2),
               matrix(rnorm(100,mean=1,sd=0.3), ncol=2))
    cl <- cmeans(x, 2, 20, verbose=TRUE, method="cmeans", m=2)
    

    The membership values can then be computed as follows:

    ## compute distances between samples and cluster centers for default setting
    ## dist="euclidean"; use absolute values for dist="manhattan"
    cc <- cl$centers
    dm <- sapply(seq_len(nrow(x)),
                 function(i) apply(cc, 1, function(v) sqrt(sum((x[i, ]-v)^2))))
    
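    m <- 2  # fuzzifier; must match the m passed to cmeans() above
    ## compute cluster membership values
    ms <- t(apply(dm, 2,
                  function(d) {
                    tmp <- 1/((d/sum(d))^(2/(m-1)))  # inverse scaled distance to the power 2/(m-1)
                    tmp/sum(tmp)  # normalize so each sample's memberships sum to 1
                  }))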
    

    Compare:

    R> head(cl$membership)
               1      2
    [1,] 0.02669 0.9733
    [2,] 0.01786 0.9821
    [3,] 0.03622 0.9638
    [4,] 0.13481 0.8652
    [5,] 0.13708 0.8629
    [6,] 0.20024 0.7998
    
    R> head(ms)
               1      2
    [1,] 0.02669 0.9733
    [2,] 0.01786 0.9821
    [3,] 0.03622 0.9638
    [4,] 0.13481 0.8652
    [5,] 0.13708 0.8629
    [6,] 0.20024 0.7998
    
    R> all.equal(ms, cl$membership, tolerance=1e-15)
    [1] TRUE
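
Since the computation only needs the cluster centers and the fuzzifier m, the same idea can be applied to data that was not used for fitting. The following is a minimal sketch, not part of e1071: `predict_membership` is a hypothetical helper name, it assumes Euclidean distance (the default `dist="euclidean"`), and it assumes that m is the same value that was passed to `cmeans`. Distances are divided by each row's minimum before exponentiation, which leaves the result unchanged but should be less prone to overflow when m is close to 1; a point lying exactly on a center is given full membership in that center.

```r
# Hypothetical helper (not an e1071 function): soft memberships of new
# points relative to fixed cluster centers, Euclidean distance only.
predict_membership <- function(centers, newdata, m = 2) {
  stopifnot(m > 1)
  centers <- as.matrix(centers)
  newdata <- as.matrix(newdata)
  stopifnot(ncol(centers) == ncol(newdata))

  # d[i, j] = Euclidean distance between new point i and center j
  d <- apply(centers, 1, function(v) sqrt(colSums((t(newdata) - v)^2)))
  d <- matrix(d, nrow = nrow(newdata), ncol = nrow(centers))

  # Scale each row by its smallest distance, then apply the exponent
  p <- 2 / (m - 1)
  dmin <- apply(d, 1, min)
  u <- (d / dmin)^(-p)

  # A point that coincides with a center belongs fully to that center
  exact <- dmin == 0
  if (any(exact)) {
    u[exact, ] <- (d[exact, , drop = FALSE] == 0) * 1
  }

  # Normalize so that each row sums to 1
  u <- u / rowSums(u)
  dimnames(u) <- list(rownames(newdata), rownames(centers))
  u
}

# Toy data, made up for illustration
centers <- rbind(c(0, 0), c(1, 1))
newdata <- rbind(c(0, 0), c(0.5, 0.5), c(0.25, 0.25))
u <- predict_membership(centers, newdata, m = 2)
hard <- apply(u, 1, which.max)  # hard assignment = column with largest membership
```

With a fitted object as in the question, the call would presumably look like `predict_membership(cl$centers, test, m = 1.08)`. Calling it on the training data with the fitted centers gives a quick check against `cl$membership`.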
    

    【Comments】:

    • Thank you very much. This is exactly what I needed.