【问题标题】:How to produce conditional plots from within a function如何从函数内生成条件图
【发布时间】:2020-10-03 13:17:18
【问题描述】:

我的数据结构如下:

set.seed(123)
dat1 <- data.frame(State = rep(c("NY","MA","FL","GA"), each = 10),
                   Loc = rep(c("a","b","c","d","e","f","g","h"),each = 5),
                   ID = rep(c(1:10), each = 2),
                   var1 = rnorm(200),
                   var2 = rnorm(200),
                   var3 = rnorm(200),
                   var4 = rnorm(200),
                   var5 = rnorm(200))

我正在为 PCA 使用 FactoMineR 和 factoextra 包。我正在编写以下函数来为 PCA 生成摘要输出和绘图:

pfun <- function(dat, cols, ncp){
  res <- PCA(dat[,cols], scale.unit = T, ncp = ncp, graph = F)
  eigs<-round(res$eig, 2)
  scree <- fviz_eig(res, addlabels = T)
  contribplot<-corrplot(get_pca_var(res)$contrib, is.corr = F)#variable contributions to each pc
  cos2plot<-corrplot(pca.vars$cos2, is.corr=F)#quality of var representation in each pc
  output<- list(eigs, scree, contribplot, cos2plot)
  return(output)
}
pfun(dat = cdatsq, cols = 7:13, ncp = 7)

到目前为止,该函数工作正常,但我还希望它为函数确定特征值小于或等于 1 的每个主成分的数量/组合生成双图和变量贡献图。例如,我尝试在函数中使用 num &lt;- sum(eigs[,1]&gt;=1, na.rm = TRUE)#for the number of pcs to keep and plot 和 for 循环:

for(i in 1:sum(eigs[,1]>=1, na.rm = TRUE)){
  fviz_contrib(res, choice = "var", axes = i, top = 10)
}

这不起作用,我怎样才能使这些与其余输出一起打印?此外,我想使用fviz_pca_biplot()sum(eigs[,1]&gt;=1, na.rm = TRUE) 范围内的每个主成分组合生成双图。在函数之外,一个绘图调用如下所示:

#example shown for PC2:PC3 with points labeled by `Loc` 
fviz_pca_biplot(res, axes = c(2,3), geom.ind = "point", pointsize=0, repel = T)+ 
  ggtitle("plot for PC2:PC3")+
  geom_text(aes(label = paste0(dat1$Loc)), alpha = 0.5, size = 3, nudge_y = 0.1, show.legend = FALSE)

但是在函数中,我如何在sum(eigs[,1]&gt;=1, na.rm = TRUE) 的范围内指定主成分的“所有组合”(即,PC1:PC2、PC2:PC3 等会有一个图)? 理想情况下,我想为每个分组变量将双标图分成单独的网格(例如,双标点由State 着色的页面和它们由Loc 着色的页面)。

【问题讨论】:

    标签: r ggplot2 functional-programming data-visualization pca


    【解决方案1】:

    您需要print for 循环中的输出才能导出它们。要获取所选 PC 的所有组合,您可以使用combn

    编辑:

    要获取网格,您可以使用cowplot 中的plot_grid

    library(factoextra)
    library(FactoMineR)
    library(corrplot)
    library(cowplot)
    
    set.seed(123)
    dat1 <- data.frame(State = rep(c("NY","MA","FL","GA"), each = 10),
                       Loc = rep(c("a","b","c","d","e","f","g","h"),each = 5),
                       ID = rep(c(1:10), each = 2),
                       var1 = rnorm(200),
                       var2 = rnorm(200),
                       var3 = rnorm(200),
                       var4 = rnorm(200),
                       var5 = rnorm(200))
    
    
    
    pfun <- function(dat, cols, ncp){
        res <- PCA(dat[,cols], scale.unit = T, ncp = ncp, graph = F)
        eigs <- round(res$eig, 2)
        scree <- fviz_eig(res, addlabels = T)
        pca.vars <- get_pca_var(res)
        contribplot <- corrplot(pca.vars$contrib, is.corr = F)#variable contributions to each pc
        cos2plot <- corrplot(pca.vars$cos2, is.corr=F)#quality of var representation in each pc
        keep.eigs <- sum(eigs[,1]>=1, na.rm = TRUE)
        contribs <- lapply(seq_len(keep.eigs), function(i) fviz_contrib(res, choice = "var", axes = i, top = 10))
        cowplot::plot_grid(plotlist=contribs, ncol=3)
        eig.comb <- combn(keep.eigs, 2, simplify = FALSE)
        biplots <- lapply(eig.comb, function(x){
            fviz_pca_biplot(res, axes = x, geom.ind = "point", pointsize=0, repel = T)+ 
                ggtitle(paste0("plot for PC", x[1], ":PC", x[2]))+
                geom_text(aes(label = paste0(dat$Loc), colour=dat$Loc), 
                          alpha = 0.5, size = 3, 
                          nudge_y = 0.1, show.legend = FALSE)
        })
        print(cowplot::plot_grid(plotlist=biplots, ncol=3))
        biplots2 <- lapply(eig.comb, function(x){
            fviz_pca_biplot(res, axes = x, geom.ind = "point", pointsize=0, repel = T)+ 
                ggtitle(paste0("plot for PC", x[1], ":PC", x[2]))+
                geom_text(aes(label = paste0(dat$State), colour=dat$State), 
                          alpha = 0.5, size = 3, 
                          nudge_y = 0.1, show.legend = FALSE)
        })
        print(cowplot::plot_grid(plotlist=biplots2, ncol=3))
        output <- list(eigs, scree, contribplot, cos2plot)
        return(output)
    }
    
    pfun(dat = dat1, cols = 4:8, ncp = 7)
    

    #> [[1]]
    #>        eigenvalue percentage of variance cumulative percentage of variance
    #> comp 1       1.14                  22.88                             22.88
    #> comp 2       1.08                  21.68                             44.57
    #> comp 3       1.02                  20.30                             64.87
    #> comp 4       0.93                  18.66                             83.53
    #> comp 5       0.82                  16.47                            100.00
    #> 
    #> [[2]]
    

    #> 
    #> [[3]]
    #>           Dim.1       Dim.2     Dim.3       Dim.4      Dim.5
    #> var1 0.20414881  0.24443766 0.5704115  0.80144254 0.02769182
    #> var2 0.89612168 -0.03274609 0.1541064  0.16242237 0.66822795
    #> var3 0.07326261  0.42569819 0.5364510  0.81272052 0.00000000
    #> var4 0.03185269  1.00000000 0.3135185 -0.04406605 0.54682715
    #> var5 0.64274654  0.21074258 0.2736449  0.11561294 0.60538540
    #> 
    #> [[4]]
    #>           Dim.1       Dim.2     Dim.3       Dim.4      Dim.5
    #> var1 0.22611471  0.25238130 0.5362197  0.68682597 0.02081676
    #> var2 0.94869940 -0.02188827 0.1505096  0.14271101 0.50232677
    #> var3 0.08943830  0.43173613 0.5047551  0.69642899 0.00000000
    #> var4 0.04619648  1.00000000 0.2982062 -0.03311043 0.41106619
    #> var5 0.68411533  0.21904048 0.2612629  0.10285356 0.45508617
    

    reprex package (v0.3.0) 于 2020 年 6 月 13 日创建

    【讨论】:

      猜你喜欢
      • 2021-11-22
      • 1970-01-01
      • 1970-01-01
      • 1970-01-01
      • 2021-11-05
      • 1970-01-01
      • 1970-01-01
      • 2022-08-09
      • 1970-01-01
      相关资源
      最近更新 更多