【Question title】: Error with summing columns in R (invalid 'type' (character) of argument)?
【Posted】: 2022-09-29 19:56:43
【Question description】:

I have the following dataset:

structure(list(Patient_ID = c("1234", "1234", "1234", "1234", 
"1234", "1234", "1234", "1234", "1234"), Unit_Type = c("ABC", 
"ABC", "ABC", "ABC", "ABC", "DEF", "DEF", "DEF", "GHI"), Status = c("Returned", 
"R", "R", "R", "T", "T", 
"T", "T", "T")), class = "data.frame", row.names = c(NA, 
-9L))

and I ran the following transformation on it:

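library(dplyr)
library(tidyr)

# Column names must match the data: Patient_ID and Unit_Type
df <- df %>%
  count(Patient_ID, Unit_Type, Status) %>%
  pivot_wider(names_from = c(Unit_Type, Status), values_from = n)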

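I want to sum 'ABC_R' and 'ABC_T' per ID (I know the example dataset has only one unique patient ID, but my real dataset has more), but I keep getting the following error message: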

> aggregate(df, by=list(df$ABC_T, df$ABC_R), FUN=sum, na.rm = TRUE)
Error in FUN(X[[i]], ...) : invalid 'type' (character) of argument
  • Sure, it's structure(list(Patient_ID = "1234", ABC_Returned = 4L, ABC_Transfused = 1L, DEF_Transfused = 3L, GHI_Transfused = 1L, ABC_Ordered = 5), row.names = c(NA, -1L), class = c("tbl_df", "tbl", "data.frame"))

Tags: r


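【Solution 1】:

I think this is what you are looking for:

You get the error because aggregate() applies sum to every column of df, including the first one (Patient_ID), which is a character column. Dropping that column with df[,-1] should work: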

aggregate(df[,-1], by=list(df$ABC_Transfused, df$ABC_Returned), FUN=sum, na.rm = TRUE)
  Group.1 Group.2 ABC_Returned ABC_Transfused DEF_Transfused GHI_Transfused
1       1       4            4              1              3              1
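
To make the cause concrete, here is a minimal sketch. It assumes a hypothetical two-patient version of the pivoted table (the second patient and its counts are invented for illustration) and groups by Patient_ID rather than by the count columns, which may be closer to the per-patient sum the question describes:

```r
# Hypothetical pivoted table; patient "1235" is made up for illustration
df <- data.frame(
  Patient_ID     = c("1234", "1235"),
  ABC_Returned   = c(4L, 2L),
  ABC_Transfused = c(1L, 3L),
  stringsAsFactors = FALSE
)

# sum() on a character vector raises the error seen in the question
res <- tryCatch(sum(df$Patient_ID), error = function(e) conditionMessage(e))
print(res)

# Keep only the numeric columns, then sum them within each patient
num_cols <- vapply(df, is.numeric, logical(1))
totals <- aggregate(df[, num_cols, drop = FALSE],
                    by = list(Patient_ID = df$Patient_ID),
                    FUN = sum, na.rm = TRUE)
print(totals)
```

Selecting columns with is.numeric avoids relying on the character column being in position 1.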


    【Solution 2】:

    We can use

    library(dplyr)
    df %>% 
        mutate(ABC_ordered = ABC_Returned + ABC_Transfused)
    

    -output

    # A tibble: 1 × 6
      Patient_ID ABC_Returned ABC_Transfused DEF_Transfused GHI_Transfused ABC_ordered
      <chr>             <int>          <int>          <int>          <int>       <int>
    1 1234                  4              1              3              1           5
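
One caveat with `+`: if a patient has no rows for one of the combinations, pivot_wider() leaves an NA in that cell and the sum becomes NA as well. A minimal sketch of one way around this, assuming a hypothetical second patient with a missing ABC_Returned count (the `ABC_plus` column name is also only illustrative):

```r
library(dplyr)

# Hypothetical data: patient "1235" has no ABC_Returned rows, hence NA
df <- tibble(
  Patient_ID     = c("1234", "1235"),
  ABC_Returned   = c(4L, NA),
  ABC_Transfused = c(1L, 3L)
)

out <- df %>%
  mutate(
    # plain addition propagates the NA
    ABC_plus    = ABC_Returned + ABC_Transfused,
    # coalesce() replaces NA with 0 before adding
    ABC_ordered = coalesce(ABC_Returned, 0L) + coalesce(ABC_Transfused, 0L)
  )
print(out)
```

Whether a missing combination should count as zero depends on the data; here it is assumed that it should.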
    


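      【Solution 3】:

      Maybe this is what you are looking for?

      I added a second data frame with another ID to make things clearer, but I think what you want is a row-wise operation.

      Since the data were already summarised in the count step, I don't think you need any additional grouping: after pivoting, each row is one patient.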

      library(dplyr)  # count(), mutate(), rowwise()
      library(tidyr)  # pivot_wider()

      data <- structure(list(Patient_ID = c("1234", "1234", "1234", "1234", 
      "1234", "1234", "1234", "1234", "1234"), Unit_Type = c("ABC", 
      "ABC", "ABC", "ABC", "ABC", "DEF", "DEF", "DEF", "GHI"), Status = c("Returned", 
      "Returned", "Returned", "Returned", "Transfused", "Transfused", 
      "Transfused", "Transfused", "Transfused")), class = "data.frame", row.names = c(NA, 
      -9L))
      
      data2 <- structure(list(Patient_ID = c("1235", "1235", "1235", "1235", 
      "1235", "1235", "1235", "1235", "1235"), Unit_Type = c("ABC", 
      "ABC", "ABC", "ABC", "ABC", "DEF", "DEF", "DEF", "GHI"), Status = c("Returned", 
      "Returned", "Returned", "Returned", "Transfused", "Transfused", 
      "Transfused", "Transfused", "Transfused")), class = "data.frame", row.names = c(NA, 
      -9L))
      
      data%>%
        rbind(data2) -> data_full
      
      
      data_full2 <- data_full %>%
        count(Patient_ID, Unit_Type, Status) %>%
        pivot_wider(names_from = c(Unit_Type, Status), values_from = n)
      
      
      data_full2%>%
        rowwise()%>%
        mutate(ABC_Ordered = sum(c(ABC_Returned,
                                   ABC_Transfused),
                                 na.rm = TRUE))%>%
        ungroup() -> data_full3
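
As a possible alternative to rowwise(), the missing combinations can be filled with zero at the pivot step, after which ordinary vectorised addition is enough. This is only a sketch on a small hypothetical dataset (the rows below are invented, not the data above), and it assumes that a missing combination should count as zero:

```r
library(dplyr)
library(tidyr)

# Hypothetical long-format data; patient "1235" has no "Returned" ABC units
data_full <- data.frame(
  Patient_ID = c("1234", "1234", "1234", "1235", "1235"),
  Unit_Type  = c("ABC", "ABC", "DEF", "ABC", "DEF"),
  Status     = c("Returned", "Transfused", "Transfused",
                 "Transfused", "Transfused")
)

wide <- data_full %>%
  count(Patient_ID, Unit_Type, Status) %>%
  # values_fill puts 0 where a patient has no rows for a combination
  pivot_wider(names_from = c(Unit_Type, Status),
              values_from = n,
              values_fill = 0L) %>%
  mutate(ABC_Ordered = ABC_Returned + ABC_Transfused)
print(wide)
```

On large tables a vectorised sum like this is typically faster than rowwise(), although both give one total per patient.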
      
