【Question title】:How can I modify my code to save the output printed in the Python terminal to a FASTA file?
【Posted】:2021-02-26 06:28:39
【Question】:

I am stuck on this part. I wrote some code that uses Biopython to produce the sequences I want (shown below, followed by its output). How can I modify this code so that the output is saved as a FASTA file in my current working directory?

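from Bio import SeqIO
s = "my_protein"  # header prefix; value inferred from the output below
for record in SeqIO.parse("4N8C.pdb", "pdb-seqres"):
    print('>' + s + '_chain_' + record.annotations['chain'] + '\n' + str(record.seq))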

    >my_protein_chain_H
    QVQLQQSGPEVVRPGVSVRISCKGSGYTFTDYAMHWVKQSHAKSLDWIGVIGTDNGNTNYNQKFKGKATMTVDKSSNTAYMELGRLTSEDSAIYYCARRDRDDVWFAYWGQGTLVTVSAAKTTAPSVYPLAPVCGDTTGSSVTLGCLVKGYFPEPVTLTWNSGSLSSGVHTFPAVLQSDLYTLSSSVTVTSSTWPSQSITCNVAHPASSTKVDKKIEPRGP
    >my_protein_chain_I
    QVQLQQSGPEVVRPGVSVRISCKGSGYTFTDYAMHWVKQSHAKSLDWIGVIGTDNGNTNYNQKFKGKATMTVDKSSNTAYMELGRLTSEDSAIYYCARRDRDDVWFAYWGQGTLVTVSAAKTTAPSVYPLAPVCGDTTGSSVTLGCLVKGYFPEPVTLTWNSGSLSSGVHTFPAVLQSDLYTLSSSVTVTSSTWPSQSITCNVAHPASSTKVDKKIEPRGP
    >my_protein_chain_L
    DIVMSQSPSSLAVSVGEKVSMSCKSSQSLFYSSYQKDLLAWYQQKPGQSPKLLIYWASTRESGVPDRFTGSGSGTDFTLTISSVKAEDLAVYFCQQYYTYPLTFGAGTKLELKRADAAPTVSIFPPSSEQLTSGGASVVCFLNNFYPKDINVKWKIDGSERQNGVLNSWTDQDSKDSTYSMSSTLTLTKDEYERHNSYTCEATHKTSTSPIVKSFNRNEC
    >my_protein_chain_M
    DIVMSQSPSSLAVSVGEKVSMSCKSSQSLFYSSYQKDLLAWYQQKPGQSPKLLIYWASTRESGVPDRFTGSGSGTDFTLTISSVKAEDLAVYFCQQYYTYPLTFGAGTKLELKRADAAPTVSIFPPSSEQLTSGGASVVCFLNNFYPKDINVKWKIDGSERQNGVLNSWTDQDSKDSTYSMSSTLTLTKDEYERHNSYTCEATHKTSTSPIVKSFNRNEC
    >my_protein_chain_X
    MSLLTEVETPIRNEWGCRCNDSS
    >my_protein_chain_Y
    MSLLTEVETPIRNEWGCRCNDSS

【Comments】:

Tags: python command-line terminal biopython pdb-files


【Solution 1】:

A single FASTA file:

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Created on Thu Dec 31 11:11:53 2020

@author: Pietro
"""


from Bio import SeqIO

pdb_file = "2ms2.pdb"

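with open(pdb_file + '_fasta', "w") as write_fasta:
    for record in SeqIO.parse(pdb_file, "pdb-seqres"):
        # Echo the record to the terminal, as in the question
        print('>' + record.name + '_chain_' + record.annotations['chain'] + '\n' + str(record.seq))
        # Append the record in FASTA format (sequence wrapped at 60 characters per line)
        SeqIO.write(record, write_fasta, "fasta")
        # Alternative: custom header, sequence on a single line
        # write_fasta.write('>' + record.name + '_chain_' + record.annotations['chain'] + '\n' + str(record.seq) + '\n')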

Multiple FASTA files (one per chain):


#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Created on Thu Dec 31 11:11:53 2020

@author: Pietro
"""


from Bio import SeqIO

pdb_file = "4N8C.pdb"


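for record in SeqIO.parse(pdb_file, "pdb-seqres"):
    # Echo the record to the terminal, as in the question
    print('>' + record.name + '_chain_' + record.annotations['chain'] + '\n' + str(record.seq))
    # One output file per chain, named after the PDB file and the chain ID
    with open(pdb_file + '_chain_' + record.annotations['chain'] + '_fasta', "w") as write_fasta:
        SeqIO.write(record, write_fasta, "fasta")
        # Alternative: custom header, sequence on a single line
        # write_fasta.write('>' + record.name + '_chain_' + record.annotations['chain'] + '\n' + str(record.seq) + '\n')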

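The SeqIO.write line and the commented-out write_fasta.write line are interchangeable ways of writing each record, but they format it differently: write_fasta.write puts the whole sequence on a single line, whereas SeqIO.write wraps the sequence into lines of 60 residues.

The write_fasta.write version also omits the record description, and its header may not be strictly FASTA-compliant ('pdb:chainID_chain' instead of 'pdb:chainID _chain ...').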
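
To make the formatting difference concrete, here is a minimal pure-Python sketch of a helper that builds one FASTA record with a custom header and optional line wrapping. The name `format_fasta` and its `width` parameter are hypothetical (they are not part of Biopython); the default of 60 is chosen only to mirror the wrapping described above.

```python
def format_fasta(header, sequence, width=60):
    """Return one FASTA record as a string.

    width=None keeps the whole sequence on a single line;
    otherwise the sequence is split into lines of `width` characters.
    """
    sequence = str(sequence)  # also accepts a Biopython Seq object
    if width is None:
        lines = [sequence]
    else:
        lines = [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return ">" + header + "\n" + "\n".join(lines) + "\n"


# Usage with the short chain X shown in the question
record_text = format_fasta("my_protein_chain_X", "MSLLTEVETPIRNEWGCRCNDSS")
print(record_text, end="")
```

Inside the loops above, the returned string could be passed to write_fasta.write(...) in place of the hand-built header and sequence, which keeps the custom header from the question while still wrapping long sequences.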

【Comments】:

    【Solution 2】:

    A single FASTA file:

    
    #!/usr/bin/env python3
    # -*- coding: utf-8 -*-
    """
    Created on Thu Dec 31 11:11:53 2020
    
    @author: Pietro
    """
    
    
    from Bio import SeqIO
    
    
    from Bio.PDB import PDBParser
    
    
    pdb_file = "4N8C.pdb"
    
    pdb = pdb_file.split('.')[0]
    
    parser = PDBParser()
    
    
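    with open(pdb_file + '_fasta', "w") as write_fasta:
        # AtomIterator builds each chain's sequence from the ATOM records
        # (the residues actually present in the structure), not from SEQRES
        for record in SeqIO.PdbIO.AtomIterator(pdb, parser.get_structure(pdb, pdb_file)):
            SeqIO.write(record, write_fasta, "fasta")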
    
    

    【Comments】:
