【Question title】: How can I modify my code to save the output it prints in the Python terminal to a FASTA file?
【Posted】: 2021-02-26 06:28:39
【Question】:
I am stuck on this part.
I wrote some code that uses Biopython to produce the sequences I want.
The code and its output are below.
How can I modify it so that the output is saved as a FASTA file in my current working directory?
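from Bio import SeqIO
s = "my_protein"  # header label; not defined in the original snippet, inferred from the output below
for record in SeqIO.parse("4N8C.pdb", "pdb-seqres"):
    print('>'+s+'_chain_'+record.annotations['chain']+'\n'+record.seq)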
>my_protein_chain_H
QVQLQQSGPEVVRPGVSVRISCKGSGYTFTDYAMHWVKQSHAKSLDWIGVIGTDNGNTNYNQKFKGKATMTVDKSSNTAYMELGRLTSEDSAIYYCARRDRDDVWFAYWGQGTLVTVSAAKTTAPSVYPLAPVCGDTTGSSVTLGCLVKGYFPEPVTLTWNSGSLSSGVHTFPAVLQSDLYTLSSSVTVTSSTWPSQSITCNVAHPASSTKVDKKIEPRGP
>my_protein_chain_I
QVQLQQSGPEVVRPGVSVRISCKGSGYTFTDYAMHWVKQSHAKSLDWIGVIGTDNGNTNYNQKFKGKATMTVDKSSNTAYMELGRLTSEDSAIYYCARRDRDDVWFAYWGQGTLVTVSAAKTTAPSVYPLAPVCGDTTGSSVTLGCLVKGYFPEPVTLTWNSGSLSSGVHTFPAVLQSDLYTLSSSVTVTSSTWPSQSITCNVAHPASSTKVDKKIEPRGP
>my_protein_chain_L
DIVMSQSPSSLAVSVGEKVSMSCKSSQSLFYSSYQKDLLAWYQQKPGQSPKLLIYWASTRESGVPDRFTGSGSGTDFTLTISSVKAEDLAVYFCQQYYTYPLTFGAGTKLELKRADAAPTVSIFPPSSEQLTSGGASVVCFLNNFYPKDINVKWKIDGSERQNGVLNSWTDQDSKDSTYSMSSTLTLTKDEYERHNSYTCEATHKTSTSPIVKSFNRNEC
>my_protein_chain_M
DIVMSQSPSSLAVSVGEKVSMSCKSSQSLFYSSYQKDLLAWYQQKPGQSPKLLIYWASTRESGVPDRFTGSGSGTDFTLTISSVKAEDLAVYFCQQYYTYPLTFGAGTKLELKRADAAPTVSIFPPSSEQLTSGGASVVCFLNNFYPKDINVKWKIDGSERQNGVLNSWTDQDSKDSTYSMSSTLTLTKDEYERHNSYTCEATHKTSTSPIVKSFNRNEC
>my_protein_chain_X
MSLLTEVETPIRNEWGCRCNDSS
>my_protein_chain_Y
MSLLTEVETPIRNEWGCRCNDSS
【Comments】:
Tags:
python
command-line
terminal
biopython
pdb-files
【Solution 1】:
Single FASTA file:
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Created on Thu Dec 31 11:11:53 2020
@author: Pietro
"""
from Bio import SeqIO
pdb_file = "2ms2.pdb"
with open(pdb_file+'_fasta', "w+") as write_fasta:
    for record in SeqIO.parse(pdb_file, "pdb-seqres"):
        print('>'+record.name+'_chain_'+record.annotations['chain']+'\n'+record.seq)
        SeqIO.write(record, write_fasta, "fasta")
        # write_fasta.write('>'+record.name+'_chain_'+record.annotations['chain']+'\n'+str(record.seq)+'\n')
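The smallest change to the loop in the question is to send the same text to an open file instead of the terminal, using the `file` argument of `print`. The sketch below does not use Biopython: the `chains` list is a stand-in for the records that `SeqIO.parse` would yield, and `save_chains`, `prefix` and the output file name are hypothetical names chosen for illustration.

```python
import os
import tempfile

# Stand-in for the records yielded by SeqIO.parse (assumption for this sketch):
# (chain ID, sequence) pairs copied from the output shown in the question.
chains = [
    ("X", "MSLLTEVETPIRNEWGCRCNDSS"),
    ("Y", "MSLLTEVETPIRNEWGCRCNDSS"),
]


def save_chains(chains, path, prefix="my_protein"):
    """Write a '>prefix_chain_ID' header and a sequence line for each chain."""
    with open(path, "w") as handle:
        for chain_id, sequence in chains:
            # Same text the question prints, redirected to the file.
            print(f">{prefix}_chain_{chain_id}", file=handle)
            print(sequence, file=handle)


# A temporary directory keeps the demo from cluttering the working directory.
out_path = os.path.join(tempfile.gettempdir(), "my_protein_demo.fasta")
save_chains(chains, out_path)

with open(out_path) as handle:
    fasta_text = handle.read()
```

Passing a bare file name such as "my_protein.fasta" as `path` would create the file in the current working directory, which is what the question asks for.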
Multiple FASTA files (one per chain):
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Created on Thu Dec 31 11:11:53 2020
@author: Pietro
"""
from Bio import SeqIO
pdb_file = "4N8C.pdb"
for record in SeqIO.parse(pdb_file, "pdb-seqres"):
    print('>'+record.name+'_chain_'+record.annotations['chain']+'\n'+record.seq)
    with open(pdb_file+'_chain_'+record.annotations['chain']+'_fasta', "w+") as write_fasta:
        SeqIO.write(record, write_fasta, "fasta")
        # write_fasta.write('>'+record.name+'_chain_'+record.annotations['chain']+'\n'+str(record.seq))
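The SeqIO.write and write_fasta.write lines are roughly equivalent, but the latter writes each sequence on a single line, whereas the former wraps the sequence into lines of 60 residues.
The write_fasta.write version also leaves out the record description, so it is probably not strictly FASTA-compliant
('pdb:chainID_chain' instead of 'pdb:chainID _chain ...')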
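The layout difference between the two approaches can be reproduced without Biopython. This is a minimal sketch, not Biopython's own writer: `format_fasta` is a hypothetical helper, and the 130-residue sequence and the `demo_chain_A` header are made up for the demonstration.

```python
def format_fasta(header, sequence, width=60):
    """Return one FASTA entry; width=None keeps the sequence on a single line."""
    if width is None:
        body = sequence
    else:
        # Cut the sequence into consecutive chunks of at most `width` residues.
        chunks = [sequence[i:i + width] for i in range(0, len(sequence), width)]
        body = "\n".join(chunks)
    return f">{header}\n{body}\n"


sequence = "ACDEFGHIKL" * 13  # 130 residues, invented for this demo

wrapped = format_fasta("demo_chain_A", sequence)             # wrapped layout
single = format_fasta("demo_chain_A", sequence, width=None)  # one-line layout

# Line lengths of the wrapped sequence body: two full lines and a remainder.
wrapped_lengths = [len(line) for line in wrapped.splitlines()[1:]]
```

Both layouts carry the same residues. The wrapped form is easier to read in a text editor, while the single-line form is easier to process with line-based tools.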
【Solution 2】:
Single FASTA file:
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Created on Thu Dec 31 11:11:53 2020
@author: Pietro
"""
from Bio import SeqIO
from Bio.PDB import PDBParser
pdb_file = "4N8C.pdb"
pdb = pdb_file.split('.')[0]  # structure ID, here "4N8C"
parser = PDBParser()
with open(pdb_file+'_fasta', "w+") as write_fasta:
    # AtomIterator builds one record per chain from the parsed structure
    for record in SeqIO.PdbIO.AtomIterator(pdb, parser.get_structure(pdb, pdb_file)):
        SeqIO.write(record, write_fasta, "fasta")