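[Posted]: 2015-11-27 01:11:12
[Question]:
I am trying to use the csv module to write some data to an Excel spreadsheet. I am writing a motif finder that reads its input from a FASTA file and writes its output to Excel, but I am having trouble writing the data in the right format.
The result I want in Excel looks like this: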
SeqName  M1       Hits  M2           Hits
Seq1     MN[A-Z]  3     V[A-Z]R[ML]  2
Seq2     MN[A-Z]  0     V[A-Z]R[ML]  5
Seq3     MN[A-Z]  1     V[A-Z]R[ML]  0
I am already producing the correct results; I just cannot work out how to lay them out in the format shown above.
Here is my code so far.
import re
from Bio import SeqIO
import csv
import collections

def SearchMotif(f1, motif, f2="motifs.xls"):
    with open(f1, 'r') as fin, open(f2, 'wb') as fout:
        # This makes SeqName static and everything else mutable, so that when more than one motif is searched,
        # they can be correctly placed into Excel.
        writer = csv.writer(fout, delimiter='\t')
        motif_fieldnames = ['SeqName']
        writer_dict = csv.DictWriter(fout, delimiter='\t', fieldnames=motif_fieldnames)
        for i in range(0, len(motif), 1):
            motif_fieldnames.append('M%d' % (i + 1))
            motif_fieldnames.append('Hits')
        writer_dict.writeheader()

        # Reading input fasta file for processing.
        fasta_name = []
        for seq_record in SeqIO.parse(f1, 'fasta'):
            sequence = repr(seq_record.seq)  # re module only takes strings
            fasta_name.append(seq_record.name)
            print sequence  # **********
            for j in motif:
                motif_name = j
                print motif_name  # **********
                number_count = len(re.findall(j, sequence))
                print number_count  # **********
                writer.writerow([motif_name])
        for i in fasta_name:
            writer.writerow([i])  # [] makes it fit into one column instead of each character taking its own column
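One detail in the code above is worth checking. The comment on the `repr(seq_record.seq)` line is right that `re` needs a string, but `repr()` returns the display form (`Seq('...', SingleLetterAlphabet())`), which Biopython abbreviates with `...` for long sequences, so a motif in the elided middle would never be searched. `str(seq_record.seq)` returns the full sequence instead. Below is a minimal sketch of the counting step on plain strings, written in Python 3 rather than the Python 2 of the question. `count_motif_hits` is a hypothetical helper name, and the sample sequence is the short third sequence from the printed output.

```python
import re


def count_motif_hits(sequence, motifs):
    """Return one (motif, hit_count) pair per motif, in the order given."""
    # re.findall returns non-overlapping matches, so len() counts
    # non-overlapping occurrences of each motif pattern.
    return [(motif, len(re.findall(motif, sequence))) for motif in motifs]


# With Biopython, the plain string would come from str(seq_record.seq).
sequence = "PAESIYFKIEKTYNLT"
hits = count_motif_hits(sequence, ["PA[A-Z]", "Y[A-Z]L[A-Z]"])
print(hits)
```

Returning the pairs instead of printing them keeps the counts available for the step that writes the rows.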
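The print statements marked with asterisks (**********) produce the output below, where each number is the hit count and the different sequences are seq1, seq2, and so on.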
Seq('QIKDLLVSSSTDLDTTLVLVNAIYFKGMWKTAFNAEDTREMPFHVTKQESKPVQ...LTS', SingleLetterAlphabet())
PA[A-Z]
0
Y[A-Z]L[A-Z]
0
Seq('SFNVATLPAESSSTDLDTTVLLPDEPAEVSDLERIETEWTNMKILELPFAPQMK...VSS', SingleLetterAlphabet())
PA[A-Z]
2
Y[A-Z]L[A-Z]
0
Seq('PAESIYFKIEKTYNLT', SingleLetterAlphabet())
PA[A-Z]
1
Y[A-Z]L[A-Z]
1
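The layout asked for at the top of the question (one row per sequence, with a motif column and a hit-count column for each motif) could be produced by building each row as a single list before calling `writerow`. The following is only a sketch under stated assumptions: it is Python 3, `write_motif_table` and `hits_by_seq` are hypothetical names, and the counts are the example values from the desired table, not real results.

```python
import csv
import io


def write_motif_table(out, motifs, hits_by_seq):
    """Write one tab-separated row per sequence: name, then motif/count pairs."""
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")

    # Header: SeqName followed by an (M1, Hits), (M2, Hits), ... pair per motif.
    header = ["SeqName"]
    for i in range(len(motifs)):
        header += ["M%d" % (i + 1), "Hits"]
    writer.writerow(header)

    # Each sequence becomes exactly one row, so all of its motifs share a line.
    for name, hits in hits_by_seq:
        row = [name]
        for motif, count in zip(motifs, hits):
            row += [motif, count]
        writer.writerow(row)


motifs = ["MN[A-Z]", "V[A-Z]R[ML]"]
# Example counts copied from the desired table in the question.
hits_by_seq = [("Seq1", [3, 2]), ("Seq2", [0, 5]), ("Seq3", [1, 0])]

buffer = io.StringIO()  # stands in for an open output file
write_motif_table(buffer, motifs, hits_by_seq)
print(buffer.getvalue())
```

To write a real file in Python 3, the buffer could be replaced with `open(path, "w", newline="")`. The result is tab-separated text that Excel can open, not a native Excel workbook.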
[Discussion]: