【Posted】: 2020-08-21 06:58:22
【Question】:
I want to search a data frame for a list of motifs. A sample dataset and my code are included below.
library(dplyr)       # mutate() and %>%
library(Biostrings)  # DNAStringSet() and vcountPattern()

# Sample data: one promoter sequence per gene
Gene_and_Promoter <- tibble::tribble(
  ~Gene, ~Promoter,
  "Gene1", "AGTCACGTGCGTGCATACGTGCAAATTGGGCGTACGTGGCTATCTCAACTATCH",
  "Gene2", "AACGTGGCGTGGCAGTGCACGTGCCAGTTGTCCCGCAGTGTGCATACTACTCT",
  "Gene3", "ACTGGCTACGTGCTGCAATGCGTGCGTAGTGCGTACCAAAGTTAAACCGGCG",
  "Gene4", "GCAATACGTGCAAGTGCGTGTACGTGCGTGATGTCGTACGTAACCGGCCGGT",
  "Gene5", "ATACGTGCGTCGTACGTGCGTACTAATACATACATCATAATTTAAACCCG",
  "Gene6", "GGGGGAATCTCGTTCCTACGTCAAGGATAGATGCTGATAGTCGTA"
)
# Motifs to look for in each promoter
Motifs <- tibble::tribble(
  ~MOTIF,
  "CGTGC",
  "GGAATA",
  "CCAG",
  "CGTA"
)
# Current approach: one mutate() call per motif
Gene_and_Promoter %>%
  mutate(CGTGC = vcountPattern("CGTGC", DNAStringSet(Gene_and_Promoter$Promoter))) %>%
  mutate(GGAATA = vcountPattern("GGAATA", DNAStringSet(Gene_and_Promoter$Promoter))) %>%
  mutate(CCAG = vcountPattern("CCAG", DNAStringSet(Gene_and_Promoter$Promoter))) %>%
  mutate(CGTA = vcountPattern("CGTA", DNAStringSet(Gene_and_Promoter$Promoter)))
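The code above produces the desired output (the occurrence count of each motif in each promoter).
Can I optimize the code above by reducing the number of mutate calls, perhaps through iteration?
【Discussion】: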
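One possible way to iterate, offered as a minimal sketch rather than a definitive answer: build the DNAStringSet once, loop over the motif column with sapply(), and bind the resulting count matrix back onto the original table. The names promoters, counts, and result are made up for this illustration, and the sketch assumes the dplyr, tibble, and Biostrings packages are installed.

```r
library(dplyr)
library(Biostrings)

# Same sample data as in the question
Gene_and_Promoter <- tibble::tribble(
  ~Gene, ~Promoter,
  "Gene1", "AGTCACGTGCGTGCATACGTGCAAATTGGGCGTACGTGGCTATCTCAACTATCH",
  "Gene2", "AACGTGGCGTGGCAGTGCACGTGCCAGTTGTCCCGCAGTGTGCATACTACTCT",
  "Gene3", "ACTGGCTACGTGCTGCAATGCGTGCGTAGTGCGTACCAAAGTTAAACCGGCG",
  "Gene4", "GCAATACGTGCAAGTGCGTGTACGTGCGTGATGTCGTACGTAACCGGCCGGT",
  "Gene5", "ATACGTGCGTCGTACGTGCGTACTAATACATACATCATAATTTAAACCCG",
  "Gene6", "GGGGGAATCTCGTTCCTACGTCAAGGATAGATGCTGATAGTCGTA"
)
Motifs <- tibble::tribble(
  ~MOTIF,
  "CGTGC",
  "GGAATA",
  "CCAG",
  "CGTA"
)

# Convert the promoters once instead of once per motif
promoters <- DNAStringSet(Gene_and_Promoter$Promoter)

# One column of counts per motif; sapply() names the columns after the motifs
counts <- sapply(Motifs$MOTIF, function(m) vcountPattern(m, promoters))

# Attach the count columns to the original table
result <- dplyr::bind_cols(Gene_and_Promoter, tibble::as_tibble(counts))
result
```

With this shape, adding or removing a motif only means editing the Motifs table; no mutate() call has to be written per motif.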