【Question Title】: How to spread my dataset under different categories
【Posted】: 2020-03-14 14:40:07
【Question Description】:

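I am still new to R. Can anyone help me figure out how to spread my dataset under different categories?

This is what I have in R.

The values that have an "Item Code" belong to the "broad category"; the other values belong to the subcategories (the actual items). So now I would like to separate out the subcategory values (i.e. the ones with an NA code) and put them into a third column (so that column #1 is "Item Code", column #2 is "Broad Category", and column #3 is "Specific Item").

More specifically, I would like the final result to look like this:

I was thinking of using the spread() command, but it does not seem to work. Could someone give me some advice on my next steps?

(I was thinking of assigning "Broad Category" as one variable and "Subcategory" as another, and then maybe I could spread the table? Not sure.)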

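【Comments on the question】:

  • Welcome to Stack Overflow! You may want to take a look at How to make a great R reproducible example so that others can help you. In particular, pasting images of your data is not very helpful; if your data is provided as text, we can reproduce your problem and help you solve it faster. One way to do that is to edit your question to include the output of the R command dput(head(df)) (replacing df with the actual name of your data frame).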
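
As a minimal sketch of the dput() suggestion in the comment above: the data frame below is a hypothetical stand-in for the asker's data (the column names Item.Code and Item are borrowed from the answers in this thread). It prints a text representation that can be pasted into a question, then checks that the text rebuilds an equivalent data frame.

```r
# Hypothetical stand-in for the asker's data frame
df <- data.frame(
  Item.Code = c(800, NA, NA, 221, NA),
  Item = c("Agave fibres nes", "Agave", "Agave fourcroydes (Henequen)",
           "Almonds, with shell", "Prunus amygdalus"),
  stringsAsFactors = FALSE
)

# Print a copy-pasteable text representation of the first rows
dput(head(df))

# Round trip: capture that text and evaluate it to rebuild the data frame
txt <- capture.output(dput(head(df)))
df_copy <- eval(parse(text = txt))

# all.equal() compares values and attributes of the two data frames
print(isTRUE(all.equal(df_copy, head(df))))
```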

Tags: r dplyr tidyverse


【Solution 1】:

Here is a tidyverse solution you could consider. I will add sample data so that others can offer alternatives.

library(tidyverse)

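df %>%
  # Carry each item code down into the NA rows below it
  fill(Item.Code) %>%
  group_by(Item.Code) %>%
  # The first row of each group holds the broad category name
  mutate(Category = first(Item)) %>%
  # Drop that first (category) row, keeping only the specific items
  slice(2:n())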

Output

# A tibble: 12 x 3
# Groups:   Item.Code [3]
   Item.Code Item                                        Category                        
       <dbl> <fct>                                       <fct>                           
 1       221 Prunus amygdalus                            Almonds, with shell             
 2       221 Almond (Prunus dulcis or Amygdalus communis Almonds, with shell             
 3       711 Pimpinella anisum (aniseed)                 Anise, badian, fennel, coriander
 4       711 Illicium verum (star anise)                 Anise, badian, fennel, coriander
 5       711 Carum carvi                                 Anise, badian, fennel, coriander
 6       711 Coriandrum sativum (coriander               Anise, badian, fennel, coriander
 7       711 Cuminum cyminum (cumin)                     Anise, badian, fennel, coriander
 8       711 Foeniculum vulgare (fennel)                 Anise, badian, fennel, coriander
 9       711 Juniperus communis (common juniper)         Anise, badian, fennel, coriander
10       800 Agave                                       Agave fibres nes                
11       800 Agave fourcroydes (Henequen)                Agave fibres nes                
12       800 Agave americana (century plant)             Agave fibres nes

Data

df <- data.frame(
  Item.Code = c(800, NA, NA, NA, 221, NA, NA, 711, NA, NA, NA, NA, NA, NA, NA),
  Item = c("Agave fibres nes", "Agave", "Agave fourcroydes (Henequen)", "Agave americana (century plant)", "Almonds, with shell",
           "Prunus amygdalus", "Almond (Prunus dulcis or Amygdalus communis", "Anise, badian, fennel, coriander",
           "Pimpinella anisum (aniseed)", "Illicium verum (star anise)", "Carum carvi", "Coriandrum sativum (coriander",
           "Cuminum cyminum (cumin)", "Foeniculum vulgare (fennel)", "Juniperus communis (common juniper)")
)
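
For comparison, the same reshaping can be sketched in base R without any packages. This is not the answer's method but an alternative under one assumption: the first row of the data has a non-NA Item.Code (a category header), so that cumsum() numbers the groups starting from 1. The names is_header, grp and result are illustrative.

```r
df <- data.frame(
  Item.Code = c(800, NA, NA, NA, 221, NA, NA, 711, NA, NA, NA, NA, NA, NA, NA),
  Item = c("Agave fibres nes", "Agave", "Agave fourcroydes (Henequen)", "Agave americana (century plant)", "Almonds, with shell",
           "Prunus amygdalus", "Almond (Prunus dulcis or Amygdalus communis", "Anise, badian, fennel, coriander",
           "Pimpinella anisum (aniseed)", "Illicium verum (star anise)", "Carum carvi", "Coriandrum sativum (coriander",
           "Cuminum cyminum (cumin)", "Foeniculum vulgare (fennel)", "Juniperus communis (common juniper)"),
  stringsAsFactors = FALSE
)

# Rows that carry a code are the broad-category headers
is_header <- !is.na(df$Item.Code)

# Running count of headers: every row gets the index of its header
# (assumes row 1 is a header, otherwise the index would start at 0)
grp <- cumsum(is_header)

# Look up each row's code and category name from its header row,
# then keep only the non-header rows (the specific items)
result <- data.frame(
  Item.Code = df$Item.Code[is_header][grp],
  Category  = df$Item[is_header][grp],
  Item      = df$Item,
  stringsAsFactors = FALSE
)[!is_header, ]
rownames(result) <- NULL

print(result)
```

Unlike the grouped slice() above, this sketch keeps the rows in their original order rather than ordering them by item code.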

【Comments】:

  • It works! Perfect! Thank you so much! Based on your code, I also found another way to make it look better. Here is what I ended up with: crop.sorted %>% mutate(Category = ifelse(!is.na(Code), Item, Code)) %>% fill(Code,Category) %>% group_by(Code)%>% slice(2:n())%>% select(Code,Category,Item)->crop.sorted
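
The one-liner in the comment above may be easier to follow when laid out step by step. This is a sketch under assumptions: the column names Code and Item are taken from the comment, the rows of crop.sorted are an illustrative subset of the sample data, and the result name crop.tidy is made up here. Two substitutions are deliberate: NA_character_ replaces Code in the ifelse() so the column type is explicit, and slice(-1) replaces slice(2:n()).

```r
library(dplyr)
library(tidyr)

# Illustrative stand-in for crop.sorted (column names from the comment)
crop.sorted <- data.frame(
  Code = c(800, NA, NA, 221, NA),
  Item = c("Agave fibres nes", "Agave", "Agave fourcroydes (Henequen)",
           "Almonds, with shell", "Prunus amygdalus"),
  stringsAsFactors = FALSE
)

crop.tidy <- crop.sorted %>%
  # Rows with a code carry the broad category name; other rows get NA
  mutate(Category = ifelse(!is.na(Code), Item, NA_character_)) %>%
  # Carry both the code and the category name down to the rows below
  fill(Code, Category) %>%
  group_by(Code) %>%
  # Drop the first row of each group (the category header itself)
  slice(-1) %>%
  ungroup() %>%
  # Put the columns in the order the question asked for
  select(Code, Category, Item)

print(crop.tidy)
```

The reason for slice(-1): in a group that has only a header row, 2:n() counts down to c(2, 1), so the header row could be kept instead of dropped, whereas slice(-1) simply leaves that group empty.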
【Solution 2】:

We can also use data.table:

library(data.table)
library(zoo)
setDT(df)[,  c(.SD[-1], .(Category = first(Item))),.(Item.Code = na.locf0(Item.Code))]
#    Item.Code                                        Item                         Category
# 1:       800                                       Agave                 Agave fibres nes
# 2:       800                Agave fourcroydes (Henequen)                 Agave fibres nes
# 3:       800             Agave americana (century plant)                 Agave fibres nes
# 4:       221                            Prunus amygdalus              Almonds, with shell
# 5:       221 Almond (Prunus dulcis or Amygdalus communis              Almonds, with shell
# 6:       711                 Pimpinella anisum (aniseed) Anise, badian, fennel, coriander
# 7:       711                 Illicium verum (star anise) Anise, badian, fennel, coriander
# 8:       711                                 Carum carvi Anise, badian, fennel, coriander
# 9:       711               Coriandrum sativum (coriander Anise, badian, fennel, coriander
#10:       711                     Cuminum cyminum (cumin) Anise, badian, fennel, coriander
#11:       711                 Foeniculum vulgare (fennel) Anise, badian, fennel, coriander
#12:       711         Juniperus communis (common juniper) Anise, badian, fennel, coriander

Data

df <- data.frame(
  Item.Code = c(800, NA, NA, NA, 221, NA, NA, 711, NA, NA, NA, NA, NA, NA, NA),
  Item = c("Agave fibres nes", "Agave", "Agave fourcroydes (Henequen)", "Agave americana (century plant)", "Almonds, with shell",
           "Prunus amygdalus", "Almond (Prunus dulcis or Amygdalus communis", "Anise, badian, fennel, coriander",
           "Pimpinella anisum (aniseed)", "Illicium verum (star anise)", "Carum carvi", "Coriandrum sativum (coriander",
           "Cuminum cyminum (cumin)", "Foeniculum vulgare (fennel)", "Juniperus communis (common juniper)")
)
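
The data.table one-liner can also be unpacked into separate steps. The sketch below is a variant, not the answer's exact code: it assumes data.table 1.12.4 or later so that nafill() can stand in for zoo::na.locf0, and the names dt and res are illustrative.

```r
library(data.table)

df <- data.frame(
  Item.Code = c(800, NA, NA, NA, 221, NA, NA, 711, NA, NA, NA, NA, NA, NA, NA),
  Item = c("Agave fibres nes", "Agave", "Agave fourcroydes (Henequen)", "Agave americana (century plant)", "Almonds, with shell",
           "Prunus amygdalus", "Almond (Prunus dulcis or Amygdalus communis", "Anise, badian, fennel, coriander",
           "Pimpinella anisum (aniseed)", "Illicium verum (star anise)", "Carum carvi", "Coriandrum sativum (coriander",
           "Cuminum cyminum (cumin)", "Foeniculum vulgare (fennel)", "Juniperus communis (common juniper)")
)

# Work on a copy so the original data frame is left untouched
dt <- as.data.table(df)

# Step 1: carry the last observed item code forward into the NA rows
dt[, Item.Code := nafill(Item.Code, type = "locf")]

# Step 2: the first Item in each code group is the broad category name
dt[, Category := first(Item), by = Item.Code]

# Step 3: drop the first row of each group (the category header)
res <- dt[, .SD[-1], by = Item.Code]

print(res)
```

Splitting the steps makes it possible to inspect dt after each one, at the cost of modifying dt by reference along the way.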
