Error with summing columns in R (invalid 'type' (character) of argument)?

【Posted】: 2022-09-29 19:56:43
【Question】:

I have the following dataset:

df <- structure(list(Patient_ID = c("1234", "1234", "1234", "1234", 
"1234", "1234", "1234", "1234", "1234"), Unit_Type = c("ABC", 
"ABC", "ABC", "ABC", "ABC", "DEF", "DEF", "DEF", "GHI"), Status = c("Returned", 
"Returned", "Returned", "Returned", "Transfused", "Transfused", 
"Transfused", "Transfused", "Transfused")), class = "data.frame", row.names = c(NA, 
-9L))

and applied the following computation to it:

library(dplyr)
library(tidyr)
df <- df %>%
  count(Patient_ID, Unit_Type, Status) %>%
  pivot_wider(names_from = c(Unit_Type, Status), values_from = n)

I want to sum 'ABC_Returned' and 'ABC_Transfused' by ID (I know the example dataset has only one unique patient ID, but my real dataset has many more), but I keep getting the following error message:

> aggregate(df, by=list(df$ABC_Transfused, df$ABC_Returned), FUN=sum, na.rm = TRUE)
Error in FUN(X[[i]], ...) : invalid 'type' (character) of argument

Sure, here is its structure:

structure(list(Patient_ID = "1234", ABC_Returned = 4L, ABC_Transfused = 1L, DEF_Transfused = 3L, GHI_Transfused = 1L, ABC_Ordered = 5), row.names = c(NA, -1L), class = c("tbl_df", "tbl", "data.frame"))

Tags: r

【Solution 1】:

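I think this is what you are looking for:

You get the error because your code tries to sum the first column, which is a character column. Subsetting with df[,-1] should work: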

aggregate(df[,-1], by=list(df$ABC_Transfused, df$ABC_Returned), FUN=sum, na.rm = TRUE)

  Group.1 Group.2 ABC_Returned ABC_Transfused DEF_Transfused GHI_Transfused
1       1       4            4              1              3              1
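
The cause of the error can be reproduced in isolation. Below is a minimal sketch in base R, assuming a one-row data frame shaped like the pivoted result in the question. The name `wide` and the choice to group by `Patient_ID` (rather than by the count columns) are assumptions for illustration, not part of the original answer.

```r
# Hypothetical one-row frame mirroring the pivoted data from the question
wide <- data.frame(
  Patient_ID     = "1234",
  ABC_Returned   = 4L,
  ABC_Transfused = 1L,
  DEF_Transfused = 3L,
  GHI_Transfused = 1L,
  stringsAsFactors = FALSE
)

# sum() refuses character input; capture the error message instead of stopping
msg <- tryCatch(sum(wide$Patient_ID), error = function(e) conditionMessage(e))
print(msg)

# Keep only the numeric columns, then sum them per patient
num_cols <- sapply(wide, is.numeric)
per_patient <- aggregate(wide[, num_cols, drop = FALSE],
                         by = list(Patient_ID = wide$Patient_ID),
                         FUN = sum, na.rm = TRUE)
print(per_patient)
```

Selecting columns with `is.numeric` instead of `df[,-1]` is one way to stay safe if the character column is not in the first position.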

【Solution 2】:

We can use

library(dplyr)
df %>% 
    mutate(ABC_ordered = ABC_Returned + ABC_Transfused)

-output

# A tibble: 1 × 6
  Patient_ID ABC_Returned ABC_Transfused DEF_Transfused GHI_Transfused ABC_ordered
  <chr>             <int>          <int>          <int>          <int>       <int>
1 1234                  4              1              3              1           5

【Solution 3】:

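Maybe this is what you are looking for?

I added a second data frame with another ID, hoping that makes things clearer, but I think what you want is a row-wise operation.

Since the data are already summarised in the count step, I don't think you need any additional grouping, because each row is one patient.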

data <- structure(list(Patient_ID = c("1234", "1234", "1234", "1234", 
"1234", "1234", "1234", "1234", "1234"), Unit_Type = c("ABC", 
"ABC", "ABC", "ABC", "ABC", "DEF", "DEF", "DEF", "GHI"), Status = c("Returned", 
"Returned", "Returned", "Returned", "Transfused", "Transfused", 
"Transfused", "Transfused", "Transfused")), class = "data.frame", row.names = c(NA, 
-9L))

data2 <- structure(list(Patient_ID = c("1235", "1235", "1235", "1235", 
"1235", "1235", "1235", "1235", "1235"), Unit_Type = c("ABC", 
"ABC", "ABC", "ABC", "ABC", "DEF", "DEF", "DEF", "GHI"), Status = c("Returned", 
"Returned", "Returned", "Returned", "Transfused", "Transfused", 
"Transfused", "Transfused", "Transfused")), class = "data.frame", row.names = c(NA, 
-9L))

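library(dplyr)
library(tidyr)

# Stack the two patients into a single data frame
data_full <- rbind(data, data2)

# Count units per patient, unit type and status, then spread to one row per patient
data_full2 <- data_full %>%
  count(Patient_ID, Unit_Type, Status) %>%
  pivot_wider(names_from = c(Unit_Type, Status), values_from = n)

# Row-wise sum of the two ABC columns; na.rm = TRUE ignores missing counts
data_full3 <- data_full2 %>%
  rowwise() %>%
  mutate(ABC_Ordered = sum(c(ABC_Returned, ABC_Transfused), na.rm = TRUE)) %>%
  ungroup()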
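
As a possible alternative to `rowwise()`, the same row-wise total can be computed with the vectorised `rowSums()`. This is a sketch only, not the answerer's method: the tibble `wide` and the third patient "1236" with a missing count are made up to show how `na.rm = TRUE` behaves.

```r
library(dplyr)

# Hypothetical wide table; patient 1236 has no ABC_Transfused units (NA)
wide <- tibble(
  Patient_ID     = c("1234", "1235", "1236"),
  ABC_Returned   = c(4L, 4L, 2L),
  ABC_Transfused = c(1L, 1L, NA)
)

# rowSums() already works across rows, so no rowwise()/ungroup() is needed
result <- wide %>%
  mutate(ABC_Ordered = rowSums(across(c(ABC_Returned, ABC_Transfused)),
                               na.rm = TRUE))
print(result)
```

Note that `rowSums()` returns a double column, whereas `sum()` on integer columns keeps the integer type.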
