【Question Title】: Count number of elements for each row in a matrix [duplicate]
【Posted】: 2021-01-01 06:35:31
【Question】:

Hello, I have a matrix such as:

  COL1 COL2 COL3
A "A"  "B"  NA
B "B"  "B"  "C"
C NA   NA   NA
D "B"  "B"  "B"
E NA   NA   "C"
F "A"  "A"  "C"

For each row (A, B, C, D, etc.), I would like to get the number of letters that are A or B.

Example:

  Nb
A 2
B 2
C 0
D 3
E 0
F 2

Does anyone have an idea?

【Comments】:

  • You can try rowSums(df == 'A' | df == 'B', na.rm = TRUE)
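
The rowSums() suggestion above can be sketched on an actual character matrix, since the question mentions a matrix while the answers below work on a data frame. The object name `m` and the way it is built here are assumptions made for illustration only.

```r
# Rebuild the example as a character matrix (the name `m` is illustrative).
m <- matrix(
  c("A", "B", NA,
    "B", "B", "C",
    NA,  NA,  NA,
    "B", "B", "B",
    NA,  NA,  "C",
    "A", "A", "C"),
  ncol = 3, byrow = TRUE,
  dimnames = list(LETTERS[1:6], c("COL1", "COL2", "COL3"))
)

# Comparisons with NA give NA; na.rm = TRUE drops them from the row sums.
nb <- rowSums(m == "A" | m == "B", na.rm = TRUE)

# One-column result in the shape requested in the question.
res <- data.frame(Nb = nb)
res
```

Because the comparison is vectorised over the whole matrix, no explicit loop over rows is needed.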

Tags: r matrix


【Solution 1】:

You can try a base R solution using apply():

#Base R
df$Var <- apply(df,1,function(x) length(which(!is.na(x) & x %in% c('A','B'))))

Output:

  COL1 COL2 COL3 Var
A    A    B <NA>   2
B    B    B    C   2
C <NA> <NA> <NA>   0
D    B    B    B   3
E <NA> <NA>    C   0
F    A    A    C   2

The data used:

#Data
df <- structure(list(COL1 = c("A", "B", NA, "B", NA, "A"), COL2 = c("B", 
"B", NA, "B", NA, "A"), COL3 = c(NA, "C", NA, "B", "C", "C")), row.names = c("A", 
"B", "C", "D", "E", "F"), class = "data.frame")

Or, if you are curious about the tidyverse:

library(tidyverse)
#Code
df %>% mutate(id=1:n()) %>%
  left_join(df %>% mutate(id=1:n()) %>%
  pivot_longer(cols = -id) %>%
  filter(value %in% c('A','B')) %>%
  group_by(id) %>%
  summarise(Var=n()), by = 'id') %>%
  #Fill only Var with 0 so that missing letters in COL1:COL3 stay NA
  mutate(Var = replace_na(Var, 0L)) %>% select(-id)

Output:

  COL1 COL2 COL3 Var
1    A    B <NA>   2
2    B    B    C   2
3 <NA> <NA> <NA>   0
4    B    B    B   3
5 <NA> <NA>    C   0
6    A    A    C   2

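【Comments】:

  • Thanks. How could I adapt this to a threshold, for example if I want to count the number of values between (0-4) and (6-10)?
  • @chippycentra You can set up a filter(); the tidyverse option should be useful in this case. If I understand correctly, you want to filter the final variable so that it falls within (0-4) or (6-10), so I would add a new pipe step such as filter(Var > 0 & Var < 4 | Var > 6 & Var < 10). Let me know if this is what you are looking for!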
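
If the follow-up question is instead read as counting, for each row, how many numeric cells fall in the ranges 0-4 or 6-10, the rowSums() pattern may carry over. This is only a sketch: the numeric matrix `num` is invented data, and treating both range ends as inclusive is an assumption, since the thread does not say.

```r
# Hypothetical numeric matrix; the values are invented for illustration.
num <- matrix(
  c(1,  5,  7,
    NA, 3,  12,
    5,  5,  5,
    0,  10, 4),
  ncol = 3, byrow = TRUE,
  dimnames = list(c("A", "B", "C", "D"), c("COL1", "COL2", "COL3"))
)

# TRUE where a cell lies in [0, 4] or [6, 10] (inclusive bounds assumed).
in_range <- (num >= 0 & num <= 4) | (num >= 6 & num <= 10)

# NA cells yield NA in the comparison and are dropped by na.rm = TRUE.
n_in_range <- rowSums(in_range, na.rm = TRUE)
n_in_range
```

Switching to strict inequalities, as in the filter() expression above, only requires replacing `>=` and `<=` with `>` and `<`.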
【Solution 2】:

Another approach is to use sapply:

df$n <- sapply(1:nrow(df), function(i) sum((df[i,] %in% c('A', 'B'))))

#   COL1 COL2 COL3 n
# A    A    B <NA> 2
# B    B    B    C 2
# C <NA> <NA> <NA> 0
# D    B    B    B 3
# E <NA> <NA>    C 0
# F    A    A    C 2

You can also get the same output with purrr::map_dbl; just replace sapply with map_dbl.
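
A minimal sketch of the map_dbl variant, assuming the purrr package is installed. The data frame is rebuilt with data.frame() for self-containment, and unlist() is added here to flatten each row into a plain character vector before matching; both choices are illustrative rather than part of the original answer.

```r
library(purrr)

# Same example data as in the thread, rebuilt so the block runs on its own.
df <- data.frame(
  COL1 = c("A", "B", NA, "B", NA, "A"),
  COL2 = c("B", "B", NA, "B", NA, "A"),
  COL3 = c(NA, "C", NA, "B", "C", "C"),
  row.names = c("A", "B", "C", "D", "E", "F")
)

# For each row index, flatten the row to a vector and count the A/B cells.
# %in% returns FALSE for NA, so missing cells are not counted.
df$n <- map_dbl(
  seq_len(nrow(df)),
  function(i) sum(unlist(df[i, ]) %in% c("A", "B"))
)
df
```

Unlike sapply, map_dbl always returns a double vector or fails, which makes the result type predictable.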

【Comments】:

【Solution 3】:
library(dplyr)
df <- structure(list(COL1 = c("A", "B", NA, "B", NA, "A"), COL2 = c("B", 
"B", NA, "B", NA, "A"), COL3 = c(NA, "C", NA, "B", "C", "C")), row.names = c("A", 
"B", "C", "D", "E", "F"), class = "data.frame")
df %>% 
  rowwise() %>% 
  mutate(sumVar = across(c(COL1:COL3),~ifelse(. %in% c("A", "B"),1,0)) %>% sum)
# A tibble: 6 x 4
# Rowwise: 
#   COL1  COL2  COL3  sumVar
#   <chr> <chr> <chr>  <dbl>
# 1 A     B     NA         2
# 2 B     B     C          2
# 3 NA    NA    NA         0
# 4 B     B     B          3
# 5 NA    NA    C          0
# 6 A     A     C          2
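
As a possible variation on the rowwise approach, dplyr's c_across() collects the selected columns of the current row into one vector, so the count can be written as a single sum(). This is a sketch assuming dplyr 1.0.0 or later; the result name `res` and the trailing ungroup() are additions for illustration.

```r
library(dplyr)

# Same example data as in the thread, rebuilt so the block runs on its own.
df <- data.frame(
  COL1 = c("A", "B", NA, "B", NA, "A"),
  COL2 = c("B", "B", NA, "B", NA, "A"),
  COL3 = c(NA, "C", NA, "B", "C", "C"),
  row.names = c("A", "B", "C", "D", "E", "F")
)

res <- df %>%
  rowwise() %>%
  # c_across() returns the row's COL1:COL3 values as one character vector;
  # %in% is FALSE for NA, so sum() counts only the A/B cells.
  mutate(sumVar = sum(c_across(COL1:COL3) %in% c("A", "B"))) %>%
  # Drop the rowwise grouping so later verbs work column-wise again.
  ungroup()
res
```

Here sumVar comes out as an integer column rather than a double, because sum() over a logical vector returns an integer.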
    

【Comments】:
