【Question title】: Replicating a Data Visualization with R/ggplot
【Posted】: 2018-06-15 13:23:18
【Question description】:

Replicating a visualization I saw in print media, using ggplot2

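Context
I have been trying to make my data visualizations more engaging and attractive, especially for non-data people, who are most of the people I work with (stakeholders such as marketers and managers). I have noticed that when a visualization looks academic or publication-quality (the standard ggplot2 aesthetic), they tend to assume they cannot understand it and do not bother trying, which defeats the whole purpose of the visualization. When it looks more graphic (like something you might see on a website or in marketing material), they pay attention and try to understand it, usually successfully. These kinds of visualizations often lead to our most interesting discussions, so that is my end goal.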

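The visualization

Here is something I saw in a marketing brochure about device share of web traffic by geography. Although it is actually a bit busy and unclear, it resonated more than a similar stacked bar chart I created in the standard style. I have no idea how to replicate something like this in ggplot2, and any attempt would be greatly appreciated! Here is some sample tidy data, as a data.table: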

structure(list(country = c("Argentina", "Argentina", "Argentina", 
                       "Brazil", "Brazil", "Brazil", "Canada",
                       "Canada", "Canada", "China", "China",
                       "China", "Japan", "Japan", "Japan", "Spain",
                       "Spain", "Spain", "UK", "UK", "UK", "USA",
                       "USA", "USA"), 
           device_type = structure(c(1L, 2L, 3L, 1L, 2L, 3L, 1L, 
                                     2L, 3L, 1L, 2L, 3L, 1L, 2L, 
                                     3L, 1L, 2L, 3L, 1L, 2L, 3L, 
                                     1L, 2L, 3L), 
                                   class = "factor", 
                                   .Label = c("desktop", 
                                              "mobile", 
                                              "multi")), 
           proportion = c(0.37, 0.22, 0.41, 0.3, 0.31, 0.39, 
                          0.35, 0.06, 0.59, 0.19, 0.2, 0.61, 
                          0.4, 0.18, 0.42, 0.16, 0.28, 0.56, 
                          0.27, 0.06, 0.67, 0.37, 0.08, 0.55)),
      .Names = c("country", "device_type", "proportion"), 
      row.names = c(NA, -24L), 
      class = c("data.table", "data.frame"))

【Question comments】:

    Tags: r ggplot2


    【Solution 1】:

    You could also consider googleVis

    library(googleVis)
    
    dat <- structure(list(country = c("Argentina", "Argentina", "Argentina", 
                               "Brazil", "Brazil", "Brazil", "Canada",
                               "Canada", "Canada", "China", "China",
                               "China", "Japan", "Japan", "Japan", "Spain",
                               "Spain", "Spain", "UK", "UK", "UK", "USA",
                               "USA", "USA"), 
                   device_type = structure(c(1L, 2L, 3L, 1L, 2L, 3L, 1L, 
                                             2L, 3L, 1L, 2L, 3L, 1L, 2L, 
                                             3L, 1L, 2L, 3L, 1L, 2L, 3L, 
                                             1L, 2L, 3L), 
                                           class = "factor", 
                                           .Label = c("desktop", 
                                                      "mobile", 
                                                      "multi")), 
                   proportion = c(0.37, 0.22, 0.41, 0.3, 0.31, 0.39, 
                                  0.35, 0.06, 0.59, 0.19, 0.2, 0.61, 
                                  0.4, 0.18, 0.42, 0.16, 0.28, 0.56, 
                                  0.27, 0.06, 0.67, 0.37, 0.08, 0.55)),
              .Names = c("country", "device_type", "proportion"), 
              row.names = c(NA, -24L), 
              class = c("data.table", "data.frame"))
    
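    # Source nodes (countries) in the order they first appear in the data
    link_order <- unique(dat$country)
    # All nodes in order of first appearance, reading each row as country, then device
    node_order <- unique(as.vector(rbind(dat$country, as.character(dat$device_type))))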
    
    link_cols <- data.frame(color = c('#ffd1ab', '#ff8d14', '#ff717e', '#dd2c40', '#d6b0ea', 
                            '#8c4fab','#00addb','#297cbe'), 
                            country = c("UK", "Canada", "USA", "China", "Spain", "Japan", "Argentina", "Brazil"),
                            stringsAsFactors = FALSE)
    
    # stringsAsFactors = FALSE keeps the colours as character strings on R < 4.0 as well
    node_cols <- data.frame(color = c("#ffc796", "#ff7100", "#ff485b", "#d20000", 
                                      "#cc98e6", "#6f2296", "#009bd2", "#005daf", 
                                      "grey", "grey", "grey"),
                            type = c("UK", "Canada", "USA", "China", "Spain", "Japan", 
                                     "Argentina", "Brazil", "multi", "desktop", "mobile"),
                            stringsAsFactors = FALSE)
    
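    # Look up each colour in the order the links and nodes appear in the data
    link_cols2 <- sapply(link_order, function(x) link_cols[x == link_cols$country, "color"])
    node_cols2 <- sapply(node_order, function(x) node_cols[x == node_cols$type, "color"])
    
    # Serialise the colours as JavaScript array literals, e.g. ['#00addb','#297cbe',...]
    actual_link_cols <- paste0("[", paste0("'", link_cols2,"'", collapse = ','), "]")
    actual_node_cols <- paste0("[", paste0("'", node_cols2,"'", collapse = ','), "]")
    
    # Value for the chart's `sankey` option: each link takes the colour of its source node
    opts <- paste0("{
            link: { colorMode: 'source',
                   colors: ", actual_link_cols ," },
            node: {colors: ", actual_node_cols ,"}}")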
    
    Sankey <- gvisSankey(dat, 
                         from = "country", 
                         to = "device_type", 
                         weight = "proportion",
                         options = list(height = 500, width = 1000, sankey = opts))
    
    
    plot(Sankey) 
    

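    【Comments】:

    • This is absolutely gorgeous! How would the code be modified to handle a multi-level Sankey?
    • @tangerine7199 You basically have to define more links (in this case, mobile to ?, desktop to ?, and so on)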
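
The suggestion in the last comment could be sketched roughly as follows. This is only an illustration, not code from the answer: the second level (`device_type` to "converted" / "not converted") and all of its weights are hypothetical placeholders, and the names `level1`, `level2`, `links` and `sankey` are made up for this example. Only a two-country subset of the question's data is used to keep it short.

```r
library(googleVis)

# First level: country -> device_type links (a subset of the question's data)
level1 <- data.frame(
  from   = rep(c("Canada", "UK"), each = 3),
  to     = rep(c("desktop", "mobile", "multi"), times = 2),
  weight = c(0.35, 0.06, 0.59, 0.27, 0.06, 0.67),
  stringsAsFactors = FALSE
)

# Second level: device_type -> outcome links.
# HYPOTHETICAL values, invented purely to show the shape of the data.
level2 <- data.frame(
  from   = rep(c("desktop", "mobile", "multi"), each = 2),
  to     = rep(c("converted", "not converted"), times = 3),
  weight = c(0.40, 0.22, 0.05, 0.07, 0.80, 0.46),
  stringsAsFactors = FALSE
)

# A multi-level Sankey is one long table of links: a node that is a
# target in one row (e.g. "mobile") is a source in another row.
links <- rbind(level1, level2)

sankey <- gvisSankey(links,
                     from = "from",
                     to = "to",
                     weight = "weight",
                     options = list(height = 400, width = 800))

# Opens the chart in a browser, so only do it in an interactive session
if (interactive()) plot(sankey)
```

Node names must not form a cycle (a target may not lead back to one of its own sources), so the country, device and outcome labels need to be distinct across levels.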
    【Solution 2】:

    You could try the "ggalluvial" package and its geoms.

    Check this out
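
As a rough idea of what this might look like, here is a minimal sketch using ggalluvial on the question's data. It is not taken from the answer or the linked page: the data frame is rebuilt compactly with `rep()` (same values and row order as the question's `structure()` call), and the widths, colours, theme and the object name `p` are assumptions chosen for illustration.

```r
library(ggplot2)
library(ggalluvial)

# Same values and row order as the sample data in the question
dat <- data.frame(
  country = rep(c("Argentina", "Brazil", "Canada", "China",
                  "Japan", "Spain", "UK", "USA"), each = 3),
  device_type = factor(rep(c("desktop", "mobile", "multi"), times = 8)),
  proportion = c(0.37, 0.22, 0.41, 0.30, 0.31, 0.39,
                 0.35, 0.06, 0.59, 0.19, 0.20, 0.61,
                 0.40, 0.18, 0.42, 0.16, 0.28, 0.56,
                 0.27, 0.06, 0.67, 0.37, 0.08, 0.55),
  stringsAsFactors = FALSE
)

# One axis per column; each row becomes a flow whose thickness is `proportion`
p <- ggplot(dat, aes(axis1 = country, axis2 = device_type, y = proportion)) +
  geom_alluvium(aes(fill = country), width = 1/12, alpha = 0.7) +
  geom_stratum(width = 1/12, fill = "grey80", color = "white") +
  geom_text(stat = "stratum", aes(label = after_stat(stratum)), size = 3) +
  scale_x_discrete(limits = c("country", "device_type"),
                   expand = c(0.05, 0.05)) +
  theme_minimal() +
  theme(legend.position = "none")

print(p)
```

`after_stat()` needs ggplot2 3.3.0 or later. Because each country's proportions sum to 1, every country block has the same height on the left axis, while the device blocks on the right show the combined share across countries.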

    【Comments】:

    • Ah, that's it: alluvial and Sankey diagrams. Following your link above, I found the details I was looking for (including ggalluvial) here