【问题标题】:Make 0/1 character matrix from random phylogenetic tree in R?从R中的随机系统发育树制作0/1字符矩阵?
【发布时间】:2019-01-27 15:52:50
【问题描述】:

是否有可能从左侧的分叉系统发育树生成像右侧所示的0/1 字符矩阵。矩阵中的1 表示存在一个联合进化枝的共享字符。

这段代码生成了不错的随机树,但我不知道从哪里开始将结果转换为字符矩阵。

library(ape) # Other package solutions are acceptable

forest <- rmtree(N = 2, n = 10, br = NULL)
plot(forest)

为了清楚起见,我可以使用以下代码生成随机矩阵,然后绘制树。

library(ape)
library(phangorn)

ntaxa <- 10
nchar <- ntaxa - 1

char_mat <- array(0, dim = c(ntaxa, ntaxa - 1))

for (i in 1:nchar) {
  char_mat[,i] <- replace(char_mat[,i], seq(1, (ntaxa+1)-i), 1)
}

char_mat <- char_mat[sample.int(nrow(char_mat)), # Shuffle rows 
                     sample.int(ncol(char_mat))] # and cols

# Ensure all branch lengths > 0
dist_mat <- dist.gene(char_mat) + 0.5
upgma_tree <- upgma(dist_mat)
plot.phylo(upgma_tree, "phylo")

我想要的是生成随机树,然后从这些树中制作矩阵。 This solution 没有生成正确的矩阵类型。

为清晰起见编辑:我正在生成二进制字符矩阵,学生可以使用这些矩阵通过简单的简约法绘制系统发育树。 1 字符表示将分类群联合成进化枝的同源性。因此,所有行必须共享一个字符(一列中所有行的1)并且某些字符必须仅由两个分类单元共享。 (我不考虑自体变形。)

例子:

【问题讨论】:

    标签: r phylogeny ape-phylo


    【解决方案1】:

    您可以查看ape 中的rTraitDisc 函数,该函数非常简单:

    library(ape)
    ## You'll need to simulate branch length!
    forest <- rmtree(N = 2, n = 10)
    
    ## Generate on equal rate model character
    (one_character <- rTraitDisc(forest[[1]], type = "ER", states = c(0,1)))
    # t10  t7  t5  t9  t1  t4  t2  t8  t3  t6 
    #   0   0   0   1   0   0   0   0   0   0 
    # Levels: 0 1
    
    ## Generate a matrix of ten characters
    (replicate(10, rTraitDisc(forest[[1]], type = "ER", states = c(0,1))))
    
    #     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
    # t10 "0"  "0"  "0"  "0"  "1"  "0"  "0"  "0"  "0"  "0"  
    # t7  "0"  "0"  "0"  "0"  "1"  "0"  "0"  "0"  "0"  "0"  
    # t5  "0"  "0"  "0"  "0"  "0"  "0"  "0"  "0"  "0"  "0"  
    # t9  "0"  "0"  "1"  "0"  "0"  "0"  "0"  "0"  "0"  "0"  
    # t1  "0"  "0"  "1"  "0"  "0"  "0"  "0"  "0"  "0"  "0"  
    # t4  "0"  "0"  "1"  "0"  "0"  "0"  "0"  "0"  "0"  "0"  
    # t2  "0"  "0"  "1"  "0"  "0"  "0"  "0"  "0"  "0"  "0"  
    # t8  "0"  "0"  "0"  "0"  "0"  "0"  "0"  "0"  "0"  "0"  
    # t3  "0"  "0"  "0"  "0"  "0"  "0"  "0"  "0"  "0"  "0"  
    # t6  "0"  "0"  "0"  "0"  "0"  "0"  "0"  "0"  "0"  "0"
    

    要将其应用于多棵树,最好的方法是创建一个 lapply 函数,如下所示:

    ## Lapply wrapper function
    generate.characters <- function(tree) {
        return(replicate(10, rTraitDisc(tree, type = "ER", states = c(0,1))))
    }
    
    ## Generate 10 character matrices for each tree
    lapply(forest, generate.characters)
    
    # [[1]]
    #     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
    # t10 "0"  "0"  "0"  "1"  "0"  "0"  "0"  "0"  "0"  "0"  
    # t7  "0"  "0"  "0"  "1"  "0"  "0"  "0"  "0"  "0"  "0"  
    # t5  "0"  "0"  "0"  "1"  "0"  "0"  "0"  "0"  "0"  "0"  
    # t9  "0"  "0"  "0"  "0"  "0"  "0"  "0"  "0"  "0"  "0"  
    # t1  "0"  "0"  "0"  "0"  "0"  "0"  "0"  "0"  "0"  "0"  
    # t4  "0"  "0"  "0"  "0"  "0"  "0"  "0"  "0"  "0"  "0"  
    # t2  "0"  "0"  "0"  "0"  "0"  "0"  "0"  "0"  "0"  "0"  
    # t8  "0"  "0"  "0"  "1"  "0"  "1"  "0"  "0"  "0"  "1"  
    # t3  "0"  "0"  "0"  "0"  "0"  "1"  "0"  "0"  "0"  "0"  
    # t6  "0"  "0"  "0"  "0"  "0"  "1"  "0"  "0"  "0"  "0"  
    
    # [[2]]
    #     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
    # t7  "0"  "0"  "0"  "0"  "0"  "0"  "0"  "0"  "0"  "0"  
    # t9  "1"  "0"  "0"  "0"  "0"  "0"  "0"  "0"  "0"  "0"  
    # t5  "0"  "0"  "0"  "0"  "0"  "0"  "0"  "0"  "0"  "0"  
    # t2  "0"  "0"  "0"  "0"  "0"  "0"  "0"  "0"  "0"  "0"  
    # t4  "0"  "1"  "0"  "0"  "1"  "0"  "0"  "0"  "0"  "0"  
    # t6  "0"  "1"  "0"  "0"  "1"  "0"  "0"  "0"  "0"  "0"  
    # t10 "0"  "1"  "1"  "0"  "1"  "1"  "0"  "0"  "0"  "1"  
    # t8  "0"  "1"  "1"  "0"  "1"  "0"  "0"  "0"  "0"  "0"  
    # t3  "0"  "1"  "0"  "0"  "0"  "0"  "0"  "0"  "0"  "0"  
    # t1  "0"  "1"  "0"  "0"  "0"  "0"  "0"  "0"  "0"  "0" 
    

    另一种选择是使用dispRity 包中的sim.morpho。此函数重用rTraitDisc 函数,但实现了更多模型,并允许将速率作为抽样分布提供。它还允许角色看起来更“逼真”,而无需太多不变的数据,并确保生成的角色“看起来”像真正的形态学角色(比如具有适量的同质性等......)。

    library(dispRity)
    ## You're first tree
    tree <- forest[[1]]
    ## Setting up the parameters
    my_rates = c(rgamma, rate = 10, shape = 5)
    my_substitutions = c(runif, 2, 2)
    
    ## HKY binary (15*50)
    matrixHKY <- sim.morpho(tree, characters = 50, model = "HKY",
         rates = my_rates, substitution = my_substitutions)
    
    ## Mk matrix (15*50) (for Mkv models)
    matrixMk <- sim.morpho(tree, characters = 50, model = "ER", rates = my_rates) 
    
    ## Mk invariant matrix (15*50) (for Mk models)
    matrixMk <- sim.morpho(tree, characters = 50, model = "ER", rates = my_rates,
         invariant = FALSE)
    
    ## MIXED model invariant matrix (15*50)
    matrixMixed <- sim.morpho(tree, characters = 50, model = "MIXED",
         rates = my_rates, substitution = my_substitutions,  invariant = FALSE,
         verbose = TRUE)
    

    我建议您阅读 sim.morpho 函数以获取有关模型如何工作的正确参考资料,或阅读 dispRity package manual 中的相关部分。

    【讨论】:

    • 谢谢,但这不能满足我的需要。 rTraitDisc 正在生成随机特征,这些特征在树的各个点上进化,但它们不能用于组装树,这正是我所需要的。对不起,我不清楚;并试图澄清。不过,你的回答确实给了我未来方向的想法!
    【解决方案2】:

    我想出了如何使用phangorn 包中的Descendants 制作矩阵。我仍然需要使用合适的节点标签对其进行调整,以匹配原始问题中的示例矩阵,但框架就在那里。

    library(ape)
    library(phangorn)
    
    ntaxa <- 8
    nchar <- ntaxa - 1
    
    tree <- rtree(ntaxa, br = NULL)
    
    # Gets descendants, but removes the first ntaxa elements,
    # which are the individual tips
    desc <- phangorn::Descendants(tree)[-seq(1, ntaxa)]
    
    char_mat <- array(0, dim = c(ntaxa, nchar))
    
    for (i in 1:nchar) {
      char_mat[,i] <- replace(char_mat[,i], y <- desc[[i]], 1)
    }
    
    rownames(char_mat) <- tree$tip.label
    char_mat
    #>    [,1] [,2] [,3] [,4] [,5] [,6] [,7]
    #> t6    1    1    0    0    0    0    0
    #> t3    1    1    1    0    0    0    0
    #> t7    1    1    1    1    0    0    0
    #> t2    1    1    1    1    1    0    0
    #> t5    1    1    1    1    1    0    0
    #> t1    1    0    0    0    0    1    1
    #> t8    1    0    0    0    0    1    1
    #> t4    1    0    0    0    0    1    0
    
    plot(tree)
    

    reprex package (v0.2.1) 于 2019 年 1 月 28 日创建

    【讨论】:

      猜你喜欢
      • 1970-01-01
      • 1970-01-01
      • 1970-01-01
      • 1970-01-01
      • 2011-10-13
      • 1970-01-01
      • 2022-11-17
      • 1970-01-01
      • 1970-01-01
      相关资源
      最近更新 更多