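[Posted]: 2019-05-12 10:41:51
[Question]:
I am trying to write a program that shifts through a DNA sequence in windows of a defined length, but I cannot understand the output I get from my loop. For the first four iterations it seems to shift the frame correctly, then it seems to revert to earlier subsequences. I have tried hard to understand this behaviour, but I am too new to programming to work it out, and any help would be much appreciated.
Here is my code: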
import regex as re

seq = "ACTGCATTTTGCATTTT"
search = "TGCATTTTG"

def kmers(text, n):
    for a in text:
        b = text[text.index(a):text.index(a) + n]
        c = len(re.findall(b, text, overlapped=True))
        print("the count for " + b + " is " + str(c))

kmers(seq, 3)
And my output:
the count for ACT is 1
the count for CTG is 1
the count for TGC is 2
the count for GCA is 2
#I expected 'CAT' next, from here on I don't understand the behaviour
the count for CTG is 1
the count for ACT is 1
the count for TGC is 2
the count for TGC is 2
the count for TGC is 2
the count for TGC is 2
the count for GCA is 2
the count for CTG is 1
the count for ACT is 1
the count for TGC is 2
the count for TGC is 2
the count for TGC is 2
the count for TGC is 2
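Obviously I eventually want to remove duplicates and so on, but I am stuck on why my for loop does not work the way I expected, and that has stopped me from improving the rest of the program.
Thanks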
[Discussion]:
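A possible explanation, offered as a sketch rather than a definitive answer: `text.index(a)` returns the position of the first occurrence of the character `a`, not the position the loop has reached. Every later "C" therefore maps back to index 1, every "T" to index 2, and so on, which would produce the repeated windows in the output above. The example below first shows those indices, then walks over start positions instead of characters. The helper name `kmer_counts` is hypothetical, and it uses `collections.Counter` from the standard library in place of the `regex` call.

```python
from collections import Counter


def kmer_counts(text, n):
    """Count every overlapping window of length n in text (hypothetical helper)."""
    # Iterate over start positions, not characters, so each window is distinct.
    windows = (text[i:i + n] for i in range(len(text) - n + 1))
    return Counter(windows)


seq = "ACTGCATTTTGCATTTT"

# str.index always reports the FIRST match, so repeated characters map back
# to an earlier position.
print([seq.index(ch) for ch in seq[:6]])  # [0, 1, 2, 3, 1, 0]

# Each distinct 3-mer is reported once, in order of first appearance.
for kmer, count in kmer_counts(seq, 3).items():
    print("the count for " + kmer + " is " + str(count))
```

Because a `Counter` holds each k-mer only once, this also takes care of the duplicate lines without a separate step.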
Tags: python for-loop bioinformatics dna-sequence