【Question title】: How to count content of desired column of the grouped values in a dataframe [duplicate]
【Posted】: 2017-08-03 03:42:08
【Question】:

I have the following data frame:


testdf <- structure(list(gene = structure(c(2L, 2L, 2L, 2L, 2L, 1L, 1L, 
1L, 1L, 1L), .Label = c("Actc1", "Cbx1"), class = "factor"), 
    p1 = structure(c(5L, 1L, 2L, 3L, 4L, 1L, 1L, 1L, 1L, 1L), .Label = c("BoneMarrow", 
    "Liver", "Pulmonary", "Umbilical", "Vertebral"), class = "factor"), 
    p2 = structure(c(1L, 1L, 1L, 1L, 1L, 5L, 2L, 3L, 4L, 1L), .Label = c("Adipose", 
    "Liver", "Pulmonary", "Umbilical", "Vertebral"), class = "factor")), .Names = c("gene", 
"p1", "p2"), class = "data.frame", row.names = c(NA, -10L))

testdf
#>     gene         p1        p2
#> 1   Cbx1  Vertebral   Adipose
#> 2   Cbx1 BoneMarrow   Adipose
#> 3   Cbx1      Liver   Adipose
#> 4   Cbx1  Pulmonary   Adipose
#> 5   Cbx1  Umbilical   Adipose
#> 6  Actc1 BoneMarrow Vertebral
#> 7  Actc1 BoneMarrow     Liver
#> 8  Actc1 BoneMarrow Pulmonary
#> 9  Actc1 BoneMarrow Umbilical
#> 10 Actc1 BoneMarrow   Adipose

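What I want to do is group by gene and count the number of distinct p1 values in each group. The expected result looks like this: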

Cbx1  5 # Vertebral, BoneMarrow, Liver, Pulmonary, Umbilical
Actc1 1 # BoneMarrow

I tried this, but it does not give what I want:

testdf %>% group_by(gene) %>% mutate(n=n())

    Tags: r dplyr


    【Solution 1】:

    An alternative using aggregate:

    aggregate(p1 ~ gene, testdf, function(x) length(unique(x)))
    
    #   gene p1
    #1 Actc1  1
    #2  Cbx1  5
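
The expected output in the question also lists the distinct p1 values next to each count. As a minimal sketch, the same aggregate call can be run a second time with paste to collect those values. The data frame df below is a small stand-in for testdf with the same gene/p1 pairs, and the column names n_p1 and p1_values are assumptions made for illustration:

```r
# Stand-in for testdf from the question (same gene/p1 pairs, p2 omitted).
df <- data.frame(
  gene = rep(c("Cbx1", "Actc1"), each = 5),
  p1 = c("Vertebral", "BoneMarrow", "Liver", "Pulmonary", "Umbilical",
         rep("BoneMarrow", 5))
)

# Number of distinct p1 values per gene.
counts <- aggregate(p1 ~ gene, df, function(x) length(unique(x)))
names(counts)[2] <- "n_p1"          # illustrative column name

# The distinct p1 values themselves, joined into one string per gene.
values <- aggregate(p1 ~ gene, df, function(x) paste(unique(x), collapse = ", "))
names(values)[2] <- "p1_values"     # illustrative column name

# Combine the count and the value list into one table keyed by gene.
result <- merge(counts, values, by = "gene")
print(result)
```

Running aggregate twice and merging keeps each step simple; whether that is preferable to a single custom function depends on the size of the data.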
    


      【Solution 2】:

      You can use n_distinct to count unique values:

      testdf %>% group_by(gene) %>% summarise(n = n_distinct(p1))
      
      # A tibble: 2 x 2
      #    gene     n
      #  <fctr> <int>
      #1  Actc1     1
      #2   Cbx1     5
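
The attempt in the question fell short for two reasons: mutate keeps every input row, and n() counts rows per group rather than distinct values. The sketch below puts the two counts side by side. It assumes dplyr is installed, and df is a small stand-in for testdf with the same gene/p1 pairs:

```r
library(dplyr)

# Stand-in for testdf from the question (same gene/p1 pairs, p2 omitted).
df <- data.frame(
  gene = rep(c("Cbx1", "Actc1"), each = 5),
  p1 = c("Vertebral", "BoneMarrow", "Liver", "Pulmonary", "Umbilical",
         rep("BoneMarrow", 5))
)

# n() is the number of rows in each group, so both genes get 5.
rows_per_gene <- df %>% group_by(gene) %>% summarise(n = n())

# n_distinct() counts the unique p1 values within each group.
distinct_per_gene <- df %>% group_by(gene) %>% summarise(n = n_distinct(p1))

print(rows_per_gene)
print(distinct_per_gene)
```

summarise collapses each group to a single row, which matches the one-line-per-gene shape the question asks for.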
      


        【Solution 3】:

        You can also use tapply:

        with(testdf, tapply(p1, gene, function(x) length(unique(x))))

        # Actc1  Cbx1 
        #     1     5 
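
tapply returns a named array rather than a data frame. If a table with one row per gene is needed, the names and values can be pulled apart, as in this rough sketch. Here df is a small stand-in for testdf, and the cross-check with table is an added sanity test, not part of the original answer:

```r
# Stand-in for testdf from the question (same gene/p1 pairs, p2 omitted).
df <- data.frame(
  gene = rep(c("Cbx1", "Actc1"), each = 5),
  p1 = c("Vertebral", "BoneMarrow", "Liver", "Pulmonary", "Umbilical",
         rep("BoneMarrow", 5))
)

# Named array: one distinct-value count per gene.
n_p1 <- with(df, tapply(p1, gene, function(x) length(unique(x))))

# Convert the named array into a two-column data frame.
result <- data.frame(gene = names(n_p1), n = as.vector(n_p1))
print(result)

# Cross-check: count the non-empty cells in each row of the gene-by-p1 table.
check <- rowSums(table(df$gene, df$p1) > 0)
stopifnot(all(check[result$gene] == result$n))
```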
        
