【Question title】: How to mean center and z-score data using R?
【Posted】: 2022-12-19 13:24:34
【Question description】:

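I was reading a paper and found that the authors computed a score for a gene signature (made up of many genes) for each sample using the following steps: 1) mean centering, 2) averaging over genes, 3) z-scoring. The description is a bit confusing to me, and I would like a concrete example.

Here is a data frame. How can I compute the signature score for each of the two samples (one score per column) using the method above?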

        Sample1   Sample2
Gene1 0.3019117 0.3649211
Gene2 0.2861431 0.3072168
Gene3 0.3794475 0.6505417
Gene4 0.2794465 0.3906110
Gene5 0.3334156 0.5845917
Gene6 0.3513268 0.6560779

Data

df <- structure(list(
  Sample1 = c(0.301911734515308, 0.286143128965312, 0.379447523688471,
              0.279446490938859, 0.333415615105398, 0.351326812590339),
  Sample2 = c(0.36492108146509, 0.307216787356549, 0.650541715557005,
              0.390610992781682, 0.584591653411763, 0.656077880562312)),
  row.names = c("Gene1", "Gene2", "Gene3", "Gene4", "Gene5", "Gene6"),
  class = "data.frame")

【Question comments】:

  • Does the following work for you, @林财金?

Tags: r


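【Solution 1】:

This is not very clear. I am assuming that:

  1. is just the mean of each sample
  2. is the row-wise mean for each gene
  3. is the z-score within each sample

If that is right, do the following: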

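    library(dplyr)

    # Assumption: "mean center" is the mean of each sample, z-score is per sample.
    # scale() returns a one-column matrix, so as.numeric() keeps plain numeric columns.
    df <- df %>%
      mutate(across(everything(),
                    list(mean = mean,
                         z_score = ~ as.numeric(scale(.x))),
                    .names = "{.fn}_{.col}"))
    # Gene-level mean: row-wise average of the two original sample columns.
    # Note: rowwise() returns a tibble, so the gene row names are not kept.
    df <- df %>%
      rowwise() %>%
      mutate(gene_mean = mean(c_across(1:2))) %>%
      ungroup()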
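
For reference, mean centering and z-scoring differ only in the division by the standard deviation. A minimal base R sketch on the Sample1 values from the question (the names `x`, `centered`, and `z` are illustrative and not part of the original answer):

```r
# Sample1 values from the question, as displayed in the table there
x <- c(0.3019117, 0.2861431, 0.3794475, 0.2794465, 0.3334156, 0.3513268)

# Mean centering: subtract the mean, so the result averages to zero
centered <- x - mean(x)

# Z-score: mean center, then divide by the sample standard deviation
z <- (x - mean(x)) / sd(x)

# scale() does the same but returns a one-column matrix,
# so as.numeric() is used to get a plain vector back
all.equal(z, as.numeric(scale(x)))
all.equal(centered, as.numeric(scale(x, scale = FALSE)))
```

Both comparisons return `TRUE`, which is why `scale` can stand in for the manual formula as long as its matrix return value is handled.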
    


【Solution 2】:

Here is an example of how to compute a gene signature score using the steps described:

    # Load the required library
    library(dplyr)

    # Example expression data: rows are samples, columns are genes
    gene_expression <- data.frame(
      gene1 = c(2, 3, 5, 1),
      gene2 = c(4, 5, 2, 3),
      gene3 = c(3, 1, 4, 2)
    )

    # Mean center each gene (column) across samples;
    # across() replaces the deprecated mutate_all(funs(...))
    mean_centered <- gene_expression %>%
      mutate(across(everything(), ~ .x - mean(.x)))

    # Average over genes to get a single value for each sample
    gene_signature_scores <- rowMeans(mean_centered)

    # Z-score the gene signature scores across samples
    z_scored <- (gene_signature_scores - mean(gene_signature_scores)) / sd(gene_signature_scores)
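
The same three steps can also be applied to the data frame from the question, where genes are rows and samples are columns. This is only a sketch under one reading of the paper's description (center each gene across samples, average the centered genes within each sample, then z-score the scores across samples); the names `expr`, `centered`, `signature`, and `z_scores` are illustrative.

```r
# Data from the question (values as displayed in its table);
# genes are rows, samples are columns
expr <- data.frame(
  Sample1 = c(0.3019117, 0.2861431, 0.3794475, 0.2794465, 0.3334156, 0.3513268),
  Sample2 = c(0.3649211, 0.3072168, 0.6505417, 0.3906110, 0.5845917, 0.6560779),
  row.names = paste0("Gene", 1:6)
)

# 1) Mean center: subtract each gene's mean across samples (row-wise);
#    sweep() subtracts by default
centered <- sweep(as.matrix(expr), 1, rowMeans(expr))

# 2) Average over genes: one value per sample (column-wise)
signature <- colMeans(centered)

# 3) Z-score the per-sample scores across samples
z_scores <- (signature - mean(signature)) / sd(signature)
z_scores
```

With only two samples, the z-scores can only be -1/sqrt(2) and 1/sqrt(2) (about -0.707 for Sample1 and 0.707 for Sample2), so the last step becomes informative only when more samples are available.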
    
