【Question title】: How to split columns and plot a graph using facet_wrap?
【Posted】: 2020-07-25 14:40:38
【Question】:

The dataset (2.13 MB) is available at https://www.kaggle.com/shivamb/netflix-shows-and-movies-exploratory-analysis.

I am trying to split the country column of the Netflix dataset and plot a faceted bar chart representing movies from three countries.

The reproducible code is as follows:

library(tidyverse)
library(scales)
library(lubridate)

netflix_tbl <- read.csv("netflix_titles_nov_2019.csv")

netflix_wrangled_tbl <- netflix_tbl%>%
    mutate(date_added = dmy(date_added), 
           date = day(date_added), month = month(date_added), year = year(date_added),
           count = readr::parse_number(as.character(duration)),
           show_type = stringr::str_remove(duration, as.character(count)))

netflix_wrangled_tbl %>%
    filter(type == "Movie") %>% 
    separate_rows(country, sep = ",")%>% 
    filter(country == "India" | country == "United States"| country == "United Kingdom")%>%
  separate_rows(cast, sep = ",")%>%
  # Count by country and cast
  count(country, cast)%>%
  slice_max(n, n = 24)%>%
  ggplot(aes(y = tidytext::reorder_within(cast, n, country), x = n))+
  geom_col() +
  tidytext::scale_y_reordered() +
  facet_wrap(~country, scales = "free")

The resulting output is:

The expected output is:

Where am I going wrong, and how can I achieve the expected output? Thank you.

【Question comments】:

    Tags: r ggplot2 facet


    【Solution 1】:

    Try replacing the last snippet of your code with the following:

    netflix_wrangled_tbl %>%
      filter(type == "Movie") %>% 
      separate_rows(country, sep = ",")%>% 
      filter(country == "India" | country == "United States"| country == "United Kingdom")%>%
      separate_rows(cast, sep = ",")%>%
      # Drop rows with an empty cast entry
      filter(cast != "") %>%
      # Count by country and cast
      count(country, cast) %>%
      # Sort by count, then keep the first 24 rows (the top 24) for each country
      arrange(desc(n)) %>%
      group_by(country) %>%
      slice(seq_len(24)) %>%
      ggplot(aes(y = tidytext::reorder_within(cast, n, country), x = n))+
      geom_col() +
      tidytext::scale_y_reordered() +
      facet_wrap(~country, scales = "free")
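
The key difference from the question's code is that the top rows are taken per country rather than over the whole table. A minimal sketch of that difference on made-up toy data follows; the names `counts`, `top_overall`, and `top_per_country` are illustrative and not part of the original answer, and a grouped `slice_max()` is used here as one possible alternative to `arrange()` plus `slice()`.

```r
library(dplyr)

# Toy data standing in for the output of count(country, cast); values are made up
counts <- tibble::tribble(
  ~country,        ~cast, ~n,
  "India",         "A",   5,
  "India",         "B",   3,
  "India",         "C",   1,
  "United States", "D",   9,
  "United States", "E",   8,
  "United States", "F",   7
)

# Ungrouped: the top 2 rows overall, so one country can crowd out the others
top_overall <- counts %>%
  slice_max(n, n = 2)

# Grouped: the top 2 rows within each country
top_per_country <- counts %>%
  group_by(country) %>%
  slice_max(n, n = 2, with_ties = FALSE) %>%
  ungroup()
```

With `with_ties = FALSE`, each country contributes at most the requested number of rows even when counts are tied, which keeps the facets the same length.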
    

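    【Discussion】:

    • The code works fine. I just want to clarify: is your output the correct one, or is the output shown as the expected output correct?
    • @SriSreshtan There are a lot of blanks in cast, which is why your code did not work. As for the purple plot, it depends on the number of rows you want to show. That plot also comes from Python, not R.
    • Agreed. Then why do the outputs differ?
    • It is probably the number of top rows you selected. The data is also quite old, so make sure you have the same data as in those Python notebooks.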
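
Related to the blanks mentioned above: if the values are separated by a comma followed by a space (an assumption about the file, not something the thread confirms), splitting on a bare `","` leaves a leading space on every value after the first, so `" India"` would not match `"India"` in the filter. A minimal sketch on made-up toy rows follows; `toy`, `raw_split`, and `clean_split` are hypothetical names.

```r
library(dplyr)
library(tidyr)
library(stringr)

# Toy rows in an assumed "comma plus space" style; names are made up
toy <- tibble::tibble(
  country = c("India, United States", "United States"),
  cast    = c("Actor A, Actor B", "Actor B, Actor C")
)

# Splitting on "," alone keeps the leading space, so "Actor B" and " Actor B" differ
raw_split <- toy %>%
  separate_rows(cast, sep = ",")

# Splitting on a comma plus optional whitespace, then trimming, normalises the values
clean_split <- toy %>%
  separate_rows(country, sep = ",\\s*") %>%
  separate_rows(cast, sep = ",\\s*") %>%
  mutate(across(c(country, cast), str_trim)) %>%
  filter(cast != "")
```

After this cleanup, `count(country, cast)` treats every occurrence of the same name as one value, which may change which names land in the top rows.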