【Question Title】: Separate an element of a data frame and split into two columns in alphabetical manner
【Posted】: 2022-12-02 00:47:44
【Question Description】:

I have this data frame:

> d
      gene_pair
1   ABHD4_ABHD5
2     ABL1_ABL2
3       ABR_BCR
4   ACAP2_ACAP3
5  ACTX_ACTR1B
6 ACVR2A_ACVR2B

This is the dput:

> dput(d)
structure(list(gene_pair = c("ABHD4_ABHD5", "ABL1_ABL2", "ABR_BCR", 
"ACAP2_ACAP3", "ACTX_ACTR1B", "ACVR2A_ACVR2B")), row.names = c(NA, 
6L), class = "data.frame")

I would like to create a new column called sorted_gene_pair, in which the two genes of each pair are in alphabetical order.

I have tried:

d %>%
  rowwise() %>% 
  mutate(paste(sort(strsplit(gene_pair, '_')), collapse = '_'))

But I get an error saying the input must be atomic.

Expected outcome of the sorted_gene_pair column:

> d
    sorted_gene_pair
1   ABHD4_ABHD5
2     ABL1_ABL2
3       ABR_BCR
4   ACAP2_ACAP3
5  ACTR1B_ACTX
6 ACVR2A_ACVR2B

【Question Comments】:

    Tags: r tidyverse


    【Solution 1】:

    strsplit() returns a list, and sort() needs an atomic vector, so you'll need to unlist() the result before sorting:

    library(dplyr)
    
    d |>
      rowwise() |>
      mutate(sorted_gene_pair = paste(sort(unlist(strsplit(gene_pair, '_'))), collapse = '_')) |>
      ungroup()
    

    Output:

    # A tibble: 6 × 2
      gene_pair     sorted_gene_pair
      <chr>         <chr>        
    1 ABHD4_ABHD5   ABHD4_ABHD5  
    2 ABL1_ABL2     ABL1_ABL2    
    3 ABR_BCR       ABR_BCR      
    4 ACAP2_ACAP3   ACAP2_ACAP3  
    5 ACTX_ACTR1B   ACTR1B_ACTX  
    6 ACVR2A_ACVR2B ACVR2A_ACVR2B
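
As a possible alternative, the same result can be sketched in base R without rowwise(). This is only an illustration, not part of the original answer: it assumes every gene_pair value contains "_" as the separator, and the intermediate name `parts` is made up for the example.

```r
# Rebuild the data frame from the question's dput() output
d <- data.frame(gene_pair = c("ABHD4_ABHD5", "ABL1_ABL2", "ABR_BCR",
                              "ACAP2_ACAP3", "ACTX_ACTR1B", "ACVR2A_ACVR2B"))

# strsplit() returns a list holding one character vector per row
# (`parts` is a hypothetical helper name used only in this sketch)
parts <- strsplit(d$gene_pair, "_", fixed = TRUE)

# Sort each character vector on its own, then glue it back together;
# vapply() guarantees exactly one string per row
d$sorted_gene_pair <- vapply(
  parts,
  function(genes) paste(sort(genes), collapse = "_"),
  character(1)
)

print(d)
```

Because each list element is already an atomic character vector, sort() works on it directly and no unlist() is needed. Note that sort() orders strings according to the current locale, which may matter for gene names that mix upper and lower case.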
    

    【Comments】:

    • Thanks!