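[Question title]: Trying to write data into Newick format in R
[Posted]: 2020-07-27 18:40:27
[Question]:

I have a dataset with nested levels of branching, from the top down: stock -> mbranch -> sbranch -> lsbranch. I would like to represent these levels in Newick format. Each stock contains different language groups, and I want to build a separate tree for each of these top-level groups.

For example, my data are formatted as follows: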

sample= data.frame("stock" = c("A", "A", "B", "B", "B"), "mbranch" = c("C", "D", "E", "F", "G"), "sbranch" = c("H", "O", NA, "K", "L"), "lsbranch" = c("I", "J", NA, "M", "N"), "name" = c("Andrea", "Kevin", "Charlie", "Naomi", "Sam"))

I am trying to output a tree in Newick format, something like:

tree = "(A(C(H(I(Andrea))),D(O(J(Kevin)))),B(E(Charlie),F(K(M(Naomi))),G(L(N(Sam)))));"
plot(read.dendrogram(tree))

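I am doing this so that I can later compute a distance matrix over the nodes of the output tree.

Can the function write.tree parse data like this and create a tree from it (bearing in mind that my actual dataset is much larger)? Or, more generally, is there a function that outputs a tree format? Thanks.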

[Discussion]:

    Tags: r tree dendrogram ape-phylo


    [Solution 1]:

    You can use the ape::read.tree() function to read your Newick-format tree:

    library(ape)
    tree = "(A(C(H(I(Andrea))),D(O(J(Kevin)))),B(E(Charlie),F(K(M(Naomi))),G(L(N(Sam)))));"
    my_tree <- read.tree(text = tree)
    plot(my_tree)
    

    You can then use ape::write.tree to save the tree to a Newick file:

    write.tree(my_tree, file = "my_file_name.tre")
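
Note that write.tree only serializes an object that is already of class "phylo"; it does not build a tree from a data frame. The following is a minimal sketch, assuming the ape package is installed and using a made-up three-tip tree (not the question's data). It shows that when the file argument is omitted, the Newick text comes back as a character string instead of being written to disk.

```r
library(ape)

## A small illustrative tree (hypothetical, not the question's data)
small_tree <- read.tree(text = "((A,B),C);")

## Without a file argument, write.tree returns the Newick text
newick_string <- write.tree(small_tree)
print(newick_string)
```

Getting the string back this way is convenient for inspecting the output before saving it.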
    

    To convert your table into a "phylo" object (the tree class used by ape), you can use this function (it may need some tweaking):

    ## The function
    ## Note: only the "stock" and "name" columns are used; the intermediate
    ## levels (mbranch, sbranch, lsbranch) are not added to the tree
    data.frame.to.phylo <- function(sample){
        ## Coerce to character in case the columns are factors
        stocks <- unique(as.character(sample$stock))
        tips   <- unique(as.character(sample$name))
    
        ## Making an edge table
        edge_table <- rbind(
            ## The root connecting to each stock
            cbind("root", stocks),
            ## All the nodes connecting to the tips
            cbind(as.character(sample$stock), as.character(sample$name))
            )
    
        ## Translating the values in the edge table into edge IDs
        ## The order must be tips, root, nodes
        element_names <- c(tips, "root", stocks)
    
        ## Replace each name by its position in element_names
        edge_table <- matrix(match(edge_table, element_names), ncol = 2)
    
        ## Build the phylo object
        phylo_object <- list()
        phylo_object$edge <- edge_table
        phylo_object$tip.label <- tips
        phylo_object$node.label <- c("root", stocks)
        phylo_object$Nnode <- length(phylo_object$node.label)
    
        ## Forcing the class to be "phylo"
        class(phylo_object) <- "phylo"
        return(phylo_object)
    }
    
    ## The data
    sample = data.frame("stock" = c("A", "A", "B", "B", "B"), "mbranch" = c("C", "D", "E", "F", "G"), "sbranch" = c("H", "O", NA, "K", "L"), "lsbranch" = c("I", "J", NA, "M", "N"), "name" = c("Andrea", "Kevin", "Charlie", "Naomi", "Sam"))
    
    ## Plotting the data.frame for testing the function
    plot(data.frame.to.phylo(sample))
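
The function above attaches each name directly to its stock. As a complement, here is a hedged base-R sketch that writes the full hierarchy (all four levels) as a Newick string. The helper name df_to_newick and its arguments are hypothetical, not part of ape. It assumes that an NA at a level means "skip this level" and that labels contain no characters needing Newick quoting (spaces, commas, parentheses). It places internal node labels after the closing parenthesis, which is the usual Newick convention.

```r
## Hypothetical helper: build a Newick string from nested grouping columns.
## `levels` lists the columns from the top level down; `tip` is the leaf column.
## Assumes labels need no Newick quoting (no spaces, commas or parentheses).
df_to_newick <- function(df, levels, tip = "name") {
    build <- function(d, depth) {
        ## Past the last level: the remaining rows are leaves
        if (depth > length(levels)) {
            return(as.character(d[[tip]]))
        }
        key <- as.character(d[[levels[depth]]])
        ## Rows with NA at this level skip it and continue one level down
        skipped <- d[is.na(key), , drop = FALSE]
        parts <- if (nrow(skipped) > 0) build(skipped, depth + 1) else character(0)
        ## Each distinct label becomes an internal node wrapping its children
        for (k in unique(key[!is.na(key)])) {
            rows <- !is.na(key) & key == k
            children <- build(d[rows, , drop = FALSE], depth + 1)
            parts <- c(parts, paste0("(", paste(children, collapse = ","), ")", k))
        }
        parts
    }
    paste0("(", paste(build(df, 1), collapse = ","), ");")
}

## The data from the question
sample <- data.frame(
    stock    = c("A", "A", "B", "B", "B"),
    mbranch  = c("C", "D", "E", "F", "G"),
    sbranch  = c("H", "O", NA, "K", "L"),
    lsbranch = c("I", "J", NA, "M", "N"),
    name     = c("Andrea", "Kevin", "Charlie", "Naomi", "Sam")
)
tree_levels <- c("stock", "mbranch", "sbranch", "lsbranch")

## One tree for the whole table
newick <- df_to_newick(sample, tree_levels)
print(newick)

## One tree per stock, as asked in the question
per_stock <- lapply(split(sample, sample$stock), df_to_newick, levels = tree_levels)
print(per_stock)
```

The resulting strings contain single-child nodes (for example, H has only the child I), so whether a given Newick reader accepts them without collapsing those nodes is worth checking before relying on this for distance calculations.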
    

    Cheers, Thomas

    [Comments]:
