Sort incomplete string of numeric values on Y axis in ggplot

【Question title】: Sort incomplete string of numeric values on Y axis in ggplot
【Posted】: 2021-10-19 13:45:00
【Question】:

I am trying to use participant record IDs on the y axis of my ggplot. The record IDs skip values (e.g. 1, 3, 10, 100). My question is threefold:

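I want to display every ID on the y axis, but when I convert with as.numeric(as.character(record_id)), the axis is ordered but does not take into account that the record IDs skip values.
If I convert with as.character, that is the right idea, but I can't figure out how to order the IDs so that they don't show up as 1, 10, 100, 3, even when using str_order.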

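So far, using ggplot(sincevax_reshape, aes(x=value, y=as.character(sort(as.numeric(record_id))))) gets me the look I want for the y axis, but the ordering is incorrect.
Once I have the record IDs ordered correctly on the y axis, is there a way to increase the vertical spacing between record IDs so that the y axis is not so crowded?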

     record_id  variable value
6           10    Sample  -182
7           11    Sample  -233
14          21    Sample  -189
16          23    Sample  -232
17          24    Sample  -214
21          30    Sample  -197
23          32    Sample  -133
24          33    Sample  -203
28          39    Sample  -165
29          41    Sample  -226
1105         3     Today   106
1106         4     Today   163
1107         6     Today    79
1108         7     Today   113
1109         9     Today   133
1110        10     Today   177
1111        11     Today   118

The end goal is something like this, without all the empty space at the top:

【Comments on the question】:

Tags: r sorting ggplot2

【Solution 1】:

You can try converting the numbers to a factor:

library(ggplot2)

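# Use the IDs, in their current row order, as the factor levels.
# This assumes each record_id appears only once; factor() rejects duplicated levels.
df$record_id <- factor(df$record_id, levels = df$record_id)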

ggplot(df, aes(x = value, y = record_id)) + 
  geom_col()

Created on 2021-08-17 by the reprex package (v2.0.0)

Data used

df <- structure(list(record_id = c(10L, 11L, 21L, 23L, 24L, 30L, 32L, 
33L, 39L, 41L), variable = c("Sample", "Sample", "Sample", "Sample", 
"Sample", "Sample", "Sample", "Sample", "Sample", "Sample"), 
    value = c(-182L, -233L, -189L, -232L, -214L, -197L, -133L, 
    -203L, -165L, -226L)), class = "data.frame", row.names = c("6", 
"7", "14", "16", "17", "21", "23", "24", "28", "29"))

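【Discussion】:

Thank you so much! My example above was a poor one because it didn't show that each record ID has multiple rows, since the data frame is in long format, so I couldn't get the factor conversion to work. I will update the example above to include this.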
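For long-format data like the comment describes, one possible adjustment is to build the factor levels from the unique IDs instead of from every row. The sketch below is only an illustration: `long_df` and its values are made up for the example and are not the asker's data, and `geom_point()` is used simply because it tolerates several rows per ID.

```r
library(ggplot2)

# Hypothetical long-format data: each record_id appears on more than one row.
long_df <- data.frame(
  record_id = c(10L, 3L, 100L, 1L, 10L, 3L, 100L, 1L),
  variable  = rep(c("Sample", "Today"), each = 4),
  value     = c(-182L, -233L, -189L, -232L, 177L, 106L, 163L, 79L)
)

# Take the levels from the unique IDs, sorted numerically, so repeated IDs
# are allowed and the axis order is 1, 3, 10, 100 rather than 1, 10, 100, 3.
id_levels <- sort(unique(long_df$record_id))
long_df$record_id <- factor(long_df$record_id, levels = id_levels)

# A discrete y axis shows every level once, evenly spaced, with no gaps
# for the IDs that are skipped.
p <- ggplot(long_df, aes(x = value, y = record_id)) +
  geom_point(aes(colour = variable))
```

Because the axis is discrete, how crowded the labels look depends mostly on the height of the rendered plot, so saving with a larger `height` in `ggsave()` is one way to give each ID more vertical room.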

【Solution 2】:

Is this what you are looking for?

library(dplyr)    # provides the %>% pipe
library(ggplot2)

df %>% 
    ggplot(aes(x = factor(record_id), y = value)) +
    geom_col() +
    coord_flip()
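The reason `factor(record_id)` gives a sensible order here is that `factor()` takes its default levels from the sorted unique values of its input, so a numeric column is sorted numerically while a character column is sorted as text. A small base R illustration, using a made-up `record_id` vector rather than the asker's data:

```r
# Hypothetical IDs with gaps and one repeat.
record_id <- c(10L, 3L, 100L, 1L, 10L)

# Levels taken from the numeric values: sorted as numbers.
numeric_levels <- levels(factor(record_id))

# Levels taken after converting to character: sorted as text.
character_levels <- levels(factor(as.character(record_id)))

print(numeric_levels)
print(character_levels)
```

This is why converting to character before building the factor produces the 1, 10, 100, 3 order described in the question.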

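【Discussion】:

I added an image above to show what I have in mind! Thanks!

【Solution 3】:

In the spirit of your first proposal, one option may be to assign ranks to the non-consecutive numeric IDs based on their values, e.g.: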

df$record_id_rank <- rank(df$record_id)

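Note that rank will produce floating-point values when IDs are duplicated (ties), which it sounds like you may have in your long data. In that case, you could reduce the tied ranks to their integer part: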

df$record_id_rank <- floor(df$record_id_rank)

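You can then plot df$record_id_rank on the y axis as the others did above. (If you want the axis labels to be the real IDs rather than the sequential numbers, I believe you can map that in ggplot.)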
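One way the label mapping might look is sketched below. It swaps in a different ranking step: with repeated IDs, `floor(rank(x))` still leaves gaps (for example 1, 3, 5, 7 when every ID appears twice), so this sketch uses `match()` against the sorted unique IDs to get consecutive positions. The data frame `long_df` and its values are invented for the example.

```r
library(ggplot2)

# Hypothetical long-format data with repeated, non-consecutive IDs.
long_df <- data.frame(
  record_id = c(10L, 3L, 100L, 1L, 10L, 3L, 100L, 1L),
  value     = c(-182L, -233L, -189L, -232L, 177L, 106L, 163L, 79L)
)

# Position of each ID among the sorted unique IDs: 1, 2, 3, ... with no
# gaps, even when an ID appears on several rows.
ids <- sort(unique(long_df$record_id))
long_df$record_id_rank <- match(long_df$record_id, ids)

# Plot the position, but label each axis break with the real record ID.
p <- ggplot(long_df, aes(x = value, y = record_id_rank)) +
  geom_point() +
  scale_y_continuous(breaks = seq_along(ids), labels = as.character(ids))
```

Keeping the axis continuous like this leaves room to adjust limits or expansion with the usual `scale_y_continuous()` arguments, while the factor approach from the other answers is shorter if that control is not needed.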

【Discussion】: