【Question title】: fill value not mapping to correct factor ggplot
【Posted】: 2020-04-28 16:56:51
【Description】:

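I am trying to draw a scatter plot coloured by one variable, with bands along the x axis filled using geom_rect. However, I cannot figure out how to make the factor map in the correct order.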

Here is a sample of my data:

head(prod_cons_diff, n = 10)
# A tibble: 10 x 10
   country             year cons.e iso3c terr.e diff.prod.cons.e prod.cons                 continent xstart  xend
   <chr>              <int>  <dbl> <chr>  <dbl>            <dbl> <chr>                     <chr>      <dbl> <dbl>
 1 China               2017  2333. CHN    2685.           352.   Territorial > Consumption Asia         0.5   1.5
 2 USA                 2017  1552. USA    1439.          -113.   Consumption > Territorial Americas     1.5   2.5
 3 India               2017   617. IND     671.            53.8  Territorial > Consumption Asia         2.5   3.5
 4 Japan               2017   380. JPN     324.           -55.9  Consumption > Territorial Asia         3.5   4.5
 5 Russian Federation  2017   375. RUS     450.            74.9  Territorial > Consumption Europe       4.5   5.5
 6 Germany             2017   244. DEU     218.           -26.4  Consumption > Territorial Europe       5.5   6.5
 7 South Korea         2017   183. KOR     175.            -7.79 Consumption > Territorial Asia         6.5   7.5
 8 Saudi Arabia        2017   169. SAU     173.             3.62 Territorial > Consumption Asia         7.5   8.5
 9 Iran                2017   166. IRN     187.            20.8  Territorial > Consumption Asia         8.5   9.5
10 Indonesia           2017   164. IDN     159.            -4.62 Consumption > Territorial Asia         9.5  10.5

When I run the following ggplot script:

ggplot(prod_cons_diff, aes(x = fct_reorder(country, diff.prod.cons.e), y = diff.prod.cons.e * 3.664)) + 
  geom_point(aes(col = prod.cons)) + # add geom_point otherwise i can't map geom_rect (continuous) to country (discrete)
  geom_rect(aes(ymin = -1500, ymax = 1500, 
                xmin = xstart, xmax = xend, 
                fill = continent), alpha = 0.3, col = NA) + 
  geom_point(aes(col = prod.cons)) + # re-add geom_point so that it appears on top of the fill
  geom_hline(yintercept = 0, linetype = 'dashed') +
  coord_flip() +
  scale_color_manual(values = c('red', 'blue')) + 
  theme_minimal()

However, the fill is clearly wrong: China is not in Europe, the USA is not in Asia, and so on.

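I tried setting country and continent as factors with specific levels, but could not get it right. I also tried as_factor() from forcats, following answer 2 here (mapping (ordered) factors to colors in ggplot), but could not find the function. as_factor() appears to be in sjlabelled (https://www.rdocumentation.org/packages/sjlabelled/versions/1.1.3/topics/as_factor), but that did not work either.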

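I tried to make a simple reproducible example, but in that example the factors map correctly. Essentially, I cannot work out exactly how the factor levels are being mapped across continent and country.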

I imagine there is a simple solution, but I have been banging my head against the wall.

In response to @Matt's comment below:

> dput(head(prod_cons_diff, n = 10))
structure(list(country = c("China", "USA", "India", "Japan", 
"Russian Federation", "Germany", "South Korea", "Saudi Arabia", 
"Iran", "Indonesia"), year = c(2017L, 2017L, 2017L, 2017L, 2017L, 
2017L, 2017L, 2017L, 2017L, 2017L), cons.e = c(2333.11521896672, 
1552.00682401808, 616.7239620176, 380.216883675894, 374.633869915012, 
244.223647570196, 182.62081469552, 169.164508003068, 166.402218417086, 
164.032430920609), iso3c = c("CHN", "USA", "IND", "JPN", "RUS", 
"DEU", "KOR", "SAU", "IRN", "IDN"), terr.e = c(2685.24946186172, 
1438.52306916917, 670.566180528622, 324.269234030281, 449.519945642447, 
217.785589557643, 174.832142238684, 172.780926461956, 187.211971723987, 
159.409240780077), diff.prod.cons.e = c(352.134242894999, -113.483754848911, 
53.8422185110221, -55.9476496456134, 74.8860757274351, -26.4380580125526, 
-7.78867245683526, 3.61641845888749, 20.8097533069009, -4.62319014053256
), prod.cons = c("Territorial > Consumption", "Consumption > Territorial", 
"Territorial > Consumption", "Consumption > Territorial", "Territorial > Consumption", 
"Consumption > Territorial", "Consumption > Territorial", "Territorial > Consumption", 
"Territorial > Consumption", "Consumption > Territorial"), continent = c("Asia", 
"Americas", "Asia", "Asia", "Europe", "Europe", "Asia", "Asia", 
"Asia", "Asia"), xstart = c(0.5, 1.5, 2.5, 3.5, 4.5, 5.5, 6.5, 
7.5, 8.5, 9.5), xend = c(1.5, 2.5, 3.5, 4.5, 5.5, 6.5, 7.5, 8.5, 
9.5, 10.5)), class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA, 
-10L))

【Discussion】:

  • Can you provide the output of dput(head(prod_cons_diff, n = 10))?
  • @Matt, thanks for looking at this. I have added the requested output to the question.

Tags: r ggplot2


【Solution 1】:

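The x values for your geom_rect were defined before the dataset was reordered, so they no longer match the new ordering: fct_reorder() changes where each country sits on the axis, while xstart and xend still refer to the original row order.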

So you need to recompute the xstart and xend positions for your geom_rect to match the new order of the dataset.
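The mismatch can be checked directly. The sketch below is an illustration only, not part of the original answer: it rebuilds the ten rows from the question with rounded values in a hypothetical data frame called toy, uses base R's reorder() as a stand-in for fct_reorder(), and assumes that a discrete ggplot2 axis places the levels at 1, 2, 3, … in level order.

```r
# Toy copy of the ten rows shown in the question (values rounded).
toy <- data.frame(
  country = c("China", "USA", "India", "Japan", "Russian Federation",
              "Germany", "South Korea", "Saudi Arabia", "Iran", "Indonesia"),
  diff.prod.cons.e = c(352, -113, 53.8, -55.9, 74.9,
                       -26.4, -7.79, 3.62, 20.8, -4.62),
  xstart = seq(0.5, 9.5, by = 1)
)

# Slot each country occupies once the levels are ordered by the difference
# (assumption: the discrete axis puts level k at position k).
toy$axis_pos <- as.integer(reorder(toy$country, toy$diff.prod.cons.e))

# Slot the precomputed rectangle is centred on.
toy$rect_pos <- toy$xstart + 0.5

# TRUE where the rectangle would sit behind its own country.
toy$matches <- toy$axis_pos == toy$rect_pos

print(toy[, c("country", "axis_pos", "rect_pos", "matches")])
```

For these ten rows no rectangle lines up with its own country, which is why the continent bands appear behind the wrong points.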

Here is a possible solution using a dplyr pipe sequence:

library(dplyr)

# Assumption: df holds the data from the question, e.g. df <- prod_cons_diff
df %>% arrange(diff.prod.cons.e) %>% 
  mutate(country = factor(country, unique(country)),
         continent = factor(continent, unique(continent))) %>%
  mutate(xstart2 = row_number() - 0.5, xend2 = row_number()+0.5)

              country year cons.e iso3c terr.e diff.prod.cons.e               prod.cons continent xstart xend xstart2 xend2
1                 USA 2017   1552   USA   1439          -113.00 Consumption>Territorial  Americas    1.5  2.5     0.5   1.5
2               Japan 2017    380   JPN    324           -55.90 Consumption>Territorial      Asia    3.5  4.5     1.5   2.5
3             Germany 2017    244   DEU    218           -26.40 Consumption>Territorial    Europe    5.5  6.5     2.5   3.5
4         South_Korea 2017    183   KOR    175            -7.79 Consumption>Territorial      Asia    6.5  7.5     3.5   4.5
5           Indonesia 2017    164   IDN    159            -4.62 Consumption>Territorial      Asia    9.5 10.5     4.5   5.5
6        Saudi_Arabia 2017    169   SAU    173             3.62 Territorial>Consumption      Asia    7.5  8.5     5.5   6.5
7                Iran 2017    166   IRN    187            20.80 Territorial>Consumption      Asia    8.5  9.5     6.5   7.5
8               India 2017    617   IND    671            53.80 Territorial>Consumption      Asia    2.5  3.5     7.5   8.5
9  Russian_Federation 2017    375   RUS    450            74.90 Territorial>Consumption    Europe    4.5  5.5     8.5   9.5
10              China 2017   2333   CHN   2685           352.00 Territorial>Consumption      Asia    0.5  1.5     9.5  10.5

Now, if you pass these new positions to geom_rect, you get the correct colouring pattern for the continents:

library(dplyr)
library(ggplot2)

df %>% arrange(diff.prod.cons.e) %>% 
  mutate(country = factor(country, unique(country)),
         continent = factor(continent, unique(continent))) %>%
  mutate(xstart2 = row_number() - 0.5, xend2 = row_number()+0.5) %>%
  ggplot(aes(x = country, y = diff.prod.cons.e * 3.664)) + 
  geom_point(aes(col = prod.cons)) + # add geom_point otherwise i can't map geom_rect (continuous) to country (discrete)
  geom_rect(aes(ymin = -1500, ymax = 1500, 
                xmin = xstart2, xmax = xend2, 
                fill = continent), alpha = 0.3, col = NA) + 
  geom_point(aes(col = prod.cons)) + # re-add geom_point so that it appears on top of the fill
  geom_hline(yintercept = 0, linetype = 'dashed') +
  coord_flip() +
  scale_color_manual(values = c('red', 'blue')) + 
  theme_minimal()
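As a variation on the pipeline above (a sketch under assumptions, not the answerer's method), the rectangle bounds can also be derived from the integer codes of the reordered factor, so the rows do not have to be sorted first. The data frame toy and the column names xmin and xmax are hypothetical, and the code again assumes that level k of the factor is drawn at position k with no unused levels.

```r
# Toy copy of the ten rows shown in the question (values rounded).
toy <- data.frame(
  country = c("China", "USA", "India", "Japan", "Russian Federation",
              "Germany", "South Korea", "Saudi Arabia", "Iran", "Indonesia"),
  continent = c("Asia", "Americas", "Asia", "Asia", "Europe",
                "Europe", "Asia", "Asia", "Asia", "Asia"),
  diff.prod.cons.e = c(352, -113, 53.8, -55.9, 74.9,
                       -26.4, -7.79, 3.62, 20.8, -4.62)
)

# Order the country levels by the difference; the rows stay where they are.
toy$country <- reorder(toy$country, toy$diff.prod.cons.e)

# The integer code of each level is taken as its slot on the discrete axis.
toy$xmin <- as.integer(toy$country) - 0.5
toy$xmax <- as.integer(toy$country) + 0.5

# Show the bounds in axis order for a quick visual check.
print(toy[order(toy$xmin), c("country", "continent", "xmin", "xmax")])
```

For this sample the resulting bounds coincide with xstart2 and xend2 from the arrange() approach, and they could be passed to geom_rect() in the same way.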

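【Discussion】:

  • Great, thanks. That works. Mutating to factor() with unique() was helpful. I had not thought of that.
  • You're welcome ;). Yes, combining arrange with unique is a very handy way to set the proper level order of a factor.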
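The arrange-then-unique idiom mentioned in the comments can be seen in isolation in the small base R sketch below. The data frame d is a hypothetical three-row example, and order() stands in for dplyr's arrange().

```r
# Three rows taken from the question's data (values rounded).
d <- data.frame(
  country = c("China", "USA", "India"),
  diff.prod.cons.e = c(352, -113, 53.8)
)

# Default factor(): levels are sorted alphabetically.
alphabetical <- factor(d$country)

# Sort the rows first, then use unique() so the levels follow the row order.
d <- d[order(d$diff.prod.cons.e), ]
by_value <- factor(d$country, levels = unique(d$country))

print(levels(alphabetical))
print(levels(by_value))
```

Because unique() keeps values in order of first appearance, the levels of by_value follow the sorted rows instead of the alphabet.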