【Question title】: R ggplot: how to iteratively plot all character variables in a df and arrange them dynamically in a grid
【Posted】: 2017-09-19 07:27:01
【Question】:
library(gridExtra)
library(grid)
library(tidyquant) # theme_tq()
library(ggplot2)

I have the following data:

head(training,20)
   gender_dom marital     race edu_level rental inc_level cluster_Kproto
1           F  single   others jrcollege rented      high              1
2           M married hispanic  postgrad leased    medium              3
3           M   other hispanic  highschl rented      high              1
6           M   other hispanic  postgrad rented    medium              3
7           M married    black doctorate leased    medium              3
8           M married hispanic jrcollege  owned      high              2
10          F  single   others  graduate rented       low              3
12          F married    asian  highschl  owned    medium              3
14          M  single hispanic  graduate leased      high              1
16          F married    white  postgrad rented    medium              3
18          M   other   others  postgrad leased    medium              3
22          F  single   others  graduate leased      high              2
23          M  single    asian doctorate leased    medium              3
25          F   other    white  highschl rented    medium              3
26          M   other    asian jrcollege leased       low              3
27          F  single    white jrcollege leased    medium              3
28          M married    asian doctorate rented       low              3
29          F   other    white  highschl rented      high              1
30          F  single hispanic jrcollege leased      high              2
31          F   other    asian jrcollege  owned       low              3

# Make variables into factors
factor_vars <- c('gender_dom','marital','race','edu_level','rental','inc_level','cluster_Kproto')
training[factor_vars] <- lapply(training[factor_vars], function(x) as.factor(x))
str(training)

library(tidyquant) # theme_tq()
par(ask=F) # to remove hit enter for each plot

(p1 <- training %>% ggplot(aes(x = race, fill = cluster_Kproto)) + geom_bar(position = position_dodge(width = 0.8), 
    width = 0.7, alpha = 0.8) + scale_fill_manual(values = palette_light()) + 
    theme_tq() + theme(legend.position = "right") + guides(fill = guide_legend("Cluster")) + 
    ggtitle("Cluster Distribution by Race") + theme(plot.title = element_text(hjust = 0.5)) + 
    theme(axis.text.x = element_text(angle = 45, vjust = 1, hjust = 1)) + 
    labs(x = "", fill = ""))

(p2 <- training %>% ggplot(aes(x = edu_level, fill = cluster_Kproto)) + geom_bar(position = position_dodge(width = 0.8), 
    width = 0.7, alpha = 0.8) + scale_fill_manual(values = palette_light()) + 
    theme_tq() + theme(legend.position = "right") + guides(fill = guide_legend("Cluster")) + 
    ggtitle("Cluster Distribution by Education level") + theme(plot.title = element_text(hjust = 0.5)) + 
    theme(axis.text.x = element_text(angle = 45, vjust = 1, hjust = 1)) + 
    labs(x = "", fill = ""))

(p3 <- training %>% ggplot(aes(x = marital, fill = cluster_Kproto)) + geom_bar(position = position_dodge(width = 0.8), 
    width = 0.7, alpha = 0.8) + scale_fill_manual(values = palette_light()) + 
    theme_tq() + theme(legend.position = "right") + guides(fill = guide_legend("Cluster")) + 
    ggtitle("Cluster Distribution by Marital") + theme(plot.title = element_text(hjust = 0.5)) + 
    theme(axis.text.x = element_text(angle = 45, vjust = 1, hjust = 1)) + 
    labs(x = "", fill = ""))

grid.arrange(p1, p2,p3,ncol = 2,nrow=2)

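What I want is to plot every categorical variable against cluster_Kproto, as in the few plots above, but iteratively, with each plot title taken from the x variable passed to ggplot and all plots arranged in a grid. That way, if the set of categorical variables changes, I still get the grid of plots without plotting each variable by hand as I do now.

Need some help here!!!!!!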

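【Comments】:

  • Wow, your last sentence sounds urgent!!!!!! :-) It would be much easier to help you and test our suggestions if you posted your data in a more reproducible format that we can import into R easily (see dput). Otherwise, as far as I can tell, you are doing fine: you only need to extract your plotting code into a function. Then pull all factor column names from the data and apply the plotting function to them. Remember that the x variable name you pass to the plotting function has to be quoted. After that you can choose how to display the plots (grid, export to PDF, ...).
  • I think you can use facets for this instead of creating 3 separate plots. Then wrap the code in a function that takes the names of the columns to plot and updates the facets and the plot title accordingly.
  • By the way, in future it is better to illustrate your point on a standard dataset (e.g. mtcars, iris, diamonds) and to provide a minimal working example (MWE). That makes it much easier to reproduce what you tried!
  • @R Kiselev I will repost with the mtcars dataset as soon as possible...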
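
The facet suggestion in the comments above could look roughly like the following. This is only a sketch under assumptions: the toy data frame copies the first six rows of the sample shown in the question, the names `cat_vars` and `long` are made up for illustration, and the tidyquant theme and palette are left out so that only ggplot2 is needed.

```r
library(ggplot2)

# Toy data: the first six rows of the `training` sample from the question.
training <- data.frame(
  marital        = c("single", "married", "other", "other", "married", "married"),
  race           = c("others", "hispanic", "hispanic", "hispanic", "black", "hispanic"),
  edu_level      = c("jrcollege", "postgrad", "highschl", "postgrad", "doctorate", "jrcollege"),
  cluster_Kproto = factor(c(1, 3, 1, 3, 3, 2))
)

# Every column except the cluster label is treated as categorical (assumption).
cat_vars <- setdiff(names(training), "cluster_Kproto")

# Stack the categorical columns into long format:
# one row per (variable name, observed level, cluster).
long <- do.call(rbind, lapply(cat_vars, function(v) {
  data.frame(
    variable = v,
    level    = as.character(training[[v]]),
    cluster  = training$cluster_Kproto
  )
}))

# One panel per variable; free x scales so each panel shows only its own levels.
p <- ggplot(long, aes(x = level, fill = cluster)) +
  geom_bar(position = position_dodge(width = 0.8), width = 0.7, alpha = 0.8) +
  facet_wrap(~ variable, scales = "free_x", ncol = 2) +
  labs(x = "", fill = "Cluster") +
  theme(axis.text.x = element_text(angle = 45, vjust = 1, hjust = 1))

print(p)
```

With facets the panel strip carries the variable name, so no per-plot title or grid.arrange call is needed; the trade-off is that all panels share one legend and one y axis unless `scales` is changed.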

Tags: r loops ggplot2


【Solution 1】:

Create a custom function plotMyData, then loop over the variables you want to plot:

# varName: column name as a string; inputData: data frame containing that
# column and the cluster_Kproto factor
plotMyData <- function(varName, inputData) {
    # get(varName) looks up the column by its name inside the data
    ggplot(inputData, aes(x = get(varName), fill = cluster_Kproto)) + 
        geom_bar(position = position_dodge(width = 0.8), 
                 width = 0.7, alpha = 0.8) + 
        scale_fill_manual(values = palette_light()) + 
        # the title is built from the variable name
        ggtitle(paste("Cluster Distribution by", varName)) + 
        labs(x = "", fill = "") +
        guides(fill = guide_legend("Cluster")) + 
        theme_tq() + 
        theme(legend.position = "right") + 
        theme(plot.title = element_text(hjust = 0.5)) + 
        theme(axis.text.x = element_text(angle = 45, vjust = 1, hjust = 1))
}

plotList <- list()
varToPlot <- c("race", "edu_level", "marital")

for(i in varToPlot) {
    plotList[[i]] <- plotMyData(i, training)
}
do.call("grid.arrange", c(plotList, ncol = 2, nrow = 2))
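
As a possible variation on the function above, the column can be looked up with ggplot2's `.data` pronoun instead of `get()`, and the list of plots can be built with `lapply` and handed to `grid.arrange` through its `grobs` argument. The sketch below rests on assumptions: it needs ggplot2 3.0.0 or later, the function name `plot_by_cluster` and the toy data (the first six rows from the question) are illustrative, and the tidyquant theme and palette are omitted.

```r
library(ggplot2)
library(gridExtra)

# Toy data: the first six rows of the `training` sample from the question.
training <- data.frame(
  marital        = factor(c("single", "married", "other", "other", "married", "married")),
  race           = factor(c("others", "hispanic", "hispanic", "hispanic", "black", "hispanic")),
  edu_level      = factor(c("jrcollege", "postgrad", "highschl", "postgrad", "doctorate", "jrcollege")),
  cluster_Kproto = factor(c(1, 3, 1, 3, 3, 2))
)

# Hypothetical variant of plotMyData: .data[[var_name]] selects the column
# named by the string var_name inside aes().
plot_by_cluster <- function(var_name, input_data) {
  ggplot(input_data, aes(x = .data[[var_name]], fill = cluster_Kproto)) +
    geom_bar(position = position_dodge(width = 0.8), width = 0.7, alpha = 0.8) +
    ggtitle(paste("Cluster Distribution by", var_name)) +
    labs(x = "", fill = "Cluster") +
    theme(plot.title = element_text(hjust = 0.5),
          axis.text.x = element_text(angle = 45, vjust = 1, hjust = 1))
}

vars_to_plot <- c("race", "edu_level", "marital")

# setNames() keeps the variable names as list names, like plotList[[i]] does.
plot_list <- lapply(setNames(vars_to_plot, vars_to_plot),
                    plot_by_cluster, input_data = training)

# Only ncol is fixed; grid.arrange works out the number of rows itself.
grid.arrange(grobs = plot_list, ncol = 2)
```

Leaving `nrow` unset means the layout keeps working when `vars_to_plot` grows or shrinks.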

【Comments】:

  • @Nishant Glad to help! :-)
  • To make it more dynamic:
      plotList <- list()
      varToPlot <- names(training[-7])
      for(i in varToPlot) { plotList[[i]] <- plotMyData(i, training) }
      do.call("grid.arrange", c(plotList, ncol = 2, nrow = ceiling(length(varToPlot)/2)))
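
The comment above drops the cluster column by position with `names(training[-7])`, which breaks if the column order changes. A hedged alternative is to pick the categorical columns by type and exclude the cluster column by name. In this sketch the toy data frame and its numeric `age` column are hypothetical, and `is_cat`, `n_col`, and `n_row` are illustrative names; only base R is used.

```r
# Toy stand-in for `training`; `age` is a hypothetical numeric column added
# only to show that non-categorical columns are skipped.
training <- data.frame(
  gender_dom     = factor(c("F", "M", "M")),
  marital        = factor(c("single", "married", "other")),
  race           = factor(c("others", "hispanic", "hispanic")),
  age            = c(31, 45, 52),
  cluster_Kproto = factor(c(1, 3, 1))
)

# Flag factor or character columns, then drop the fill variable by name.
is_cat    <- vapply(training, function(x) is.factor(x) || is.character(x), logical(1))
varToPlot <- setdiff(names(training)[is_cat], "cluster_Kproto")

# Grid dimensions derived from the number of variables, as in the comment.
n_col <- 2
n_row <- ceiling(length(varToPlot) / n_col)

print(varToPlot)
print(n_row)
```

`varToPlot`, `n_col`, and `n_row` can then be passed to the loop and the `grid.arrange` call shown in the comment.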