【Question title】: Create a graph showing the range over which the dates are spread for each group, with category selection, in R
【Posted】: 2023-04-09 18:04:01
【Question】:

I'm new to R, and I have a data.table in R that looks like this:

library(data.table)

dt <- data.table(category = rep(c("0001", "0002", "0003"), c(10, 3, 4)),
                 grp = c("01", "03", "04", "06", "81", "82", "83", "84", "85", "86",
                         "01", "02", "03",
                         "01", "02", "03", "04"),
                 min_date = c("2012-04-18", "2012-04-18", "2012-04-23", "2012-04-10", "2012-04-05",
                              "2012-04-13", "2012-04-10", "2012-04-07", "2012-04-19", "2012-04-05",
                              "2012-04-04", "2012-04-06", "2012-04-25", "2012-04-19", "2012-04-05",
                              "2012-04-26", "2012-04-27"),
                 max_date = c("2012-05-23", "2012-05-19", "2012-05-19", "2012-04-24", "2012-05-23",
                              "2012-05-09", "2012-05-19", "2012-05-24", "2012-05-22", "2012-05-23",
                              "2012-05-12", "2012-05-11", "2012-05-17", "2012-05-22", "2012-05-22",
                              "2012-05-19", "2012-05-17"),
                 hours_played = c(426, 381, 318, 168, 583, 314, 477, 568, 398, 582, 458, 429, 268,
                                  395, 568, 276, 238))

> dt
    category grp   min_date   max_date hours_played
 1:     0001  01 2012-04-18 2012-05-23          426
 2:     0001  03 2012-04-18 2012-05-19          381
 3:     0001  04 2012-04-23 2012-05-19          318
 4:     0001  06 2012-04-10 2012-04-24          168
 5:     0001  81 2012-04-05 2012-05-23          583
 6:     0001  82 2012-04-13 2012-05-09          314
 7:     0001  83 2012-04-10 2012-05-19          477
 8:     0001  84 2012-04-07 2012-05-24          568
 9:     0001  85 2012-04-19 2012-05-22          398
10:     0001  86 2012-04-05 2012-05-23          582
11:     0002  01 2012-04-04 2012-05-12          458
12:     0002  02 2012-04-06 2012-05-11          429
13:     0002  03 2012-04-25 2012-05-17          268
14:     0003  01 2012-04-19 2012-05-22          395
15:     0003  02 2012-04-05 2012-05-22          568
16:     0003  03 2012-04-26 2012-05-19          276
17:     0003  04 2012-04-27 2012-05-17          238

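I want to create a visualization that shows the range over which each group (grp) is active, together with the corresponding hours_played. There should be a way to select the category from a drop-down list.

When a category is selected from the drop-down list of available categories, the graph should show the date ranges over which all the groups belonging to that category are active, with the hours played shown beside or inside each range. The timeline must be the x-axis, and the time interval can be 10 days.

Something like this: my drawing skills are poor, but it should give an idea of what I want.

How can I do this in R?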

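【Comments】:

  • Can anyone help? I'm new to R, so I don't know much about graphics.
  • Standard plots can't provide a "select a category" feature (they are just images). That can be done with Shiny. For a static plot I would use facets. I'll post an answer shortly.
  • Thanks, a Gantt chart would also help.
  • I need the category selection feature because the original data has more than 1000 categories.

Tags: r plot ggplot2 plotly ggvis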


【Solution 1】:

I suggest this plot:

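library(ggplot2)
library(scales)

ggplot(dt) +
    aes(y = grp, x = as.Date(min_date)) +
    # One thick horizontal bar per group, from min_date to max_date.
    # Note: ggplot2 >= 3.4.0 prefers `linewidth` over `size` for line geoms.
    geom_segment(aes(yend = grp,
                     xend = as.Date(max_date),
                     color = grp),
                 size = 5,
                 show.legend = FALSE) +
    # Group label, nudged 3 days to the right of the bar's start
    geom_text(aes(label = paste0('grp', grp)),
              nudge_x = 3,
              size = 3) +
    # Hours played, placed just past the bar's end
    geom_text(aes(label = paste0(hours_played, ' h'),
                  x = as.Date(max_date)),
              nudge_x = 1.5,
              size = 2) +
    # One panel per category, each with its own set of groups on the y-axis
    facet_grid(category ~ ., scales = 'free_y', labeller = label_both) +
    scale_x_date('Date', date_breaks = '10 days', expand = c(0, 2)) +
    scale_color_brewer(palette = 'Set3') +
    theme_bw() +
    # The group labels are drawn on the bars, so hide the y-axis decorations
    theme(axis.line.y = element_blank(),
          axis.text.y = element_blank(),
          axis.ticks.y = element_blank())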

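The important bits are the three geom_* layers (one for the segments, two for the text) and the faceting, which splits the plot into three sub-plots, one per category.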
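The plot converts the character columns with as.Date() inside every aes() call. One possible tidy-up, sketched here on two rows copied from the question's data, is to convert the columns once before plotting. The `active_days` column is a hypothetical name added only for illustration; it is not part of the original data or answer.

```r
# Two rows copied from the question's data, kept small for illustration
dt <- data.frame(
  category     = c("0001", "0001"),
  grp          = c("01", "06"),
  min_date     = c("2012-04-18", "2012-04-10"),
  max_date     = c("2012-05-23", "2012-04-24"),
  hours_played = c(426, 168),
  stringsAsFactors = FALSE
)

# Convert the character columns to Date once, instead of inside each aes()
dt$min_date <- as.Date(dt$min_date)
dt$max_date <- as.Date(dt$max_date)

# Hypothetical helper column: length of each group's active range, in days
dt$active_days <- as.numeric(dt$max_date - dt$min_date)
```

With the columns stored as Date, the aesthetics could be written as `x = min_date` and `xend = max_date` directly.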

Update:

To make the plot interactive we need a reactive environment. The simplest option is an .Rmd document.

Paste this into a new .Rmd file and "Run" it:

---
output: html_document
runtime: shiny

---

```{r data, echo = FALSE}
dt <- data.frame(category = rep(c("0001", "0002", "0003"), c(10,3,4)), 
                  grp = c("01", "03", "04", "06", "81", "82", "83", "84", "85", "86", 
                          "01", "02", "03",
                          "01", "02", "03", "04"),
                  min_date = c("2012-04-18", "2012-04-18", "2012-04-23", "2012-04-10", "2012-04-05", 
                               "2012-04-13", "2012-04-10", "2012-04-07", "2012-04-19", "2012-04-05",
                               "2012-04-04", "2012-04-06", "2012-04-25", "2012-04-19", "2012-04-05",
                               "2012-04-26", "2012-04-27"),
                  max_date = c("2012-05-23", "2012-05-19", "2012-05-19", "2012-04-24", "2012-05-23", 
                               "2012-05-09", "2012-05-19", "2012-05-24", "2012-05-22", "2012-05-23",
                               "2012-05-12", "2012-05-11", "2012-05-17", "2012-05-22", "2012-05-22",
                               "2012-05-19", "2012-05-17"),
                  hours_played = c(426, 381, 318, 168, 583, 314, 477, 568, 398, 582, 458, 429, 268, 
                                   395, 568, 276, 238))
```

```{r graph, echo = FALSE}
library(ggplot2)
library(scales)

# Drop-down listing every category present in the data
selectInput('category', 'Choose the category:', choices = unique(dt$category))

# Rows for the currently selected category; re-evaluated when the selection changes
dt_filtered <- reactive({
  dt[dt$category == input$category, ]
})

# Same layers as the static plot, minus the faceting, redrawn for the selected category
renderPlot({
  ggplot(dt_filtered()) +
    aes(y = grp, x = as.Date(min_date)) +
    geom_segment(aes(yend = grp,
                     xend = as.Date(max_date),
                     color = grp),
                 size = 5,
                 show.legend = FALSE) +
    geom_text(aes(label = paste0('grp', grp)),
              nudge_x = 3,
              size = 3) +
    geom_text(aes(label = paste0(hours_played, ' h'),
                  x = as.Date(max_date)),
              nudge_x = 1.5,
              size = 2) +
    scale_x_date('Date', date_breaks = '10 days', expand = c(0, 2)) +
    scale_color_brewer(palette = 'Set3') +
    theme_bw() +
    theme(axis.line.y = element_blank(),
          axis.text.y = element_blank(),
          axis.ticks.y = element_blank())
})
```
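The same idea can also be written as a standalone Shiny app rather than an .Rmd document. The following is only a sketch under a few assumptions: it uses a five-row subset of the question's data with the dates already converted, the output id `ranges` is a made-up name, and it swaps selectInput for selectizeInput with server-side choices, which may suit a list of 1000+ categories better because the choices are searched on the server instead of being sent to the browser all at once.

```r
library(shiny)
library(ggplot2)

# A small subset of the question's data, so the sketch is self-contained
dt <- data.frame(
  category     = c("0001", "0001", "0002", "0002", "0003"),
  grp          = c("01", "03", "01", "02", "01"),
  min_date     = as.Date(c("2012-04-18", "2012-04-18", "2012-04-04",
                           "2012-04-06", "2012-04-19")),
  max_date     = as.Date(c("2012-05-23", "2012-05-19", "2012-05-12",
                           "2012-05-11", "2012-05-22")),
  hours_played = c(426, 381, 458, 429, 395)
)

ui <- fluidPage(
  # Choices are filled in from the server, so the page itself stays small
  selectizeInput("category", "Choose the category:", choices = NULL),
  plotOutput("ranges")  # "ranges" is a hypothetical output id
)

server <- function(input, output, session) {
  # server = TRUE keeps the full choice list on the server side
  updateSelectizeInput(session, "category",
                       choices = unique(dt$category), server = TRUE)

  output$ranges <- renderPlot({
    req(input$category)  # do nothing until a category has been chosen
    d <- dt[dt$category == input$category, ]
    ggplot(d, aes(y = grp, x = min_date)) +
      geom_segment(aes(yend = grp, xend = max_date, color = grp),
                   size = 5, show.legend = FALSE) +
      geom_text(aes(label = paste0(hours_played, " h"), x = max_date),
                nudge_x = 1.5, size = 3) +
      scale_x_date("Date", date_breaks = "10 days", expand = c(0, 2)) +
      theme_bw()
  })
}

app <- shinyApp(ui, server)
# runApp(app)  # uncomment to launch the app
```

Saving this as app.R and calling `runApp(app)` (or printing `app` in an interactive session) should start the app; typing in the drop-down narrows the category list.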

【Comments】:

  • Actually, my data has more than 1000 categories, which is why I want a selection option.
  • Is there any way to filter the categories?