【Posted】: 2022-03-27 03:43:14
【Question】:
I am trying to compute the optimal number of clusters in R, using the following code:
library(factoextra)
library(tidyverse)
library(cluster)
# URLencode() escapes the spaces in the file name so the download URL is valid
wsCustomer <- read.csv(url(URLencode("https://archive.ics.uci.edu/ml/machine-learning-databases/00292/Wholesale customers data.csv")))
# Recode the Channel and Region columns, replacing numeric codes with names
wsCustomer <- wsCustomer %>%
  mutate(Channel = ifelse(Channel == 1, "HoReCa", "Retail"),
         Region = case_when(Region == 1 ~ "Lisbon",
                            Region == 2 ~ "Oporto",
                            Region == 3 ~ "Others"))
head(wsCustomer)
# Standardize the six numeric spending columns (columns 3 to 8)
df <- as_tibble(scale(wsCustomer[3:8]))
# Compute the gap statistic
set.seed(123)
gap_stat <- clusGap(df, FUN = kmeans, nstart = 25,
                    K.max = 10, B = 50)
It gives me the following warning:
Warning message: did not converge in 10 iterations
How can I get rid of this warning?
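One possible way to address the warning, sketched under assumptions: the message comes from kmeans(), whose iter.max argument defaults to 10, and clusGap() forwards any extra arguments to the clustering function through `...`. Raising iter.max in the clusGap() call should therefore give the algorithm more room to converge. The sketch below uses the built-in USArrests data as a stand-in (an assumption, so that it runs without a download), and the value 50 is an arbitrary illustrative choice, not a recommended setting.

```r
library(cluster)

# Stand-in data (assumption): the built-in USArrests set replaces the scaled
# wholesale-customer columns so this sketch runs without a download.
df_demo <- scale(USArrests)

set.seed(123)
# nstart and iter.max are not clusGap() arguments; they are forwarded to
# kmeans() through `...`, raising its iteration cap from the default of 10.
gap_demo <- clusGap(df_demo, FUN = kmeans, nstart = 25, iter.max = 50,
                    K.max = 10, B = 50)
print(gap_demo)
```

In the call from the question, the equivalent change would be adding iter.max = 50 next to nstart = 25. If the warning still appears, a larger value may be needed.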
【Discussion】:
Tags: r cluster-analysis