【Question title】: How can I obtain the amino acid sequence from this URL?
【Posted】: 2019-11-17 12:47:21
【Question】:

I want to use Python and Selenium to get the amino acid sequence from the URL below, but I can't get it to work. http://flybase.org/download/sequence/FBgn0003719/FBpp

I have tried Beautiful Soup and Selenium.

from selenium import webdriver

driver = webdriver.Chrome()

driver.get('http://flybase.org/download/sequence/FBgn0003719/FBpp')

iframe = driver.find_element_by_class_name('scroller')

notification_element = driver.find_element_by_class_name('fastaSeq')

print(notification_element)

Message: no such element: Unable to locate element

【Comments】:

    Tags: python web-scraping


    【Solution 1】:

    You can load the page with selenium and use BeautifulSoup to access the sequence:

    from selenium import webdriver
    from bs4 import BeautifulSoup as soup
    d = webdriver.Chrome('/path/to/chromedriver')
    d.get('http://flybase.org/download/sequence/FBgn0003719/FBpp')
    sequence = soup(d.page_source, 'html.parser').find('div', {'class':'fastaSeq'}).text
    

    Output:

    'MKGMRLMPMK MKAKLVVLSV GALWMMMFFL VDYAEGRRLS QLPESECDFD FKEQPEDFFG ILDSSLVPPK EPKDDIYQLK TTRQHSGRRR KQSHKSQNKA ALRLPPPFLW TDDAVDVLQH SHSPTLNGQP IQRRRRAVTV RKERTWDYGV IPYEIDTIFS GAHKALFKQA MRHWENFTCI KFVERDPNLH ANYIYFTVKN CGCCSFLGKN GNGRQPISIG RNCEKFGIII HELGHTIGFH HEHARGDRDK HIVINKGNIM RGQEYNFDVL SPEEVDLPLL PYDLNSIMHY AKNSFSKSPY LDTITPIGIP PGTHLELGQR KRLSRGDIVQ ANLLYKCASC GRTYQQNSGH IVSPHFIYSG NGVLSEFEGS GDAGEDPSAE SEFDASLTNC EWRITATNGE KVILHLQQLH LMSSDDCTQD YLEIRDGYWH KSPLVRRICG NVSGEVITTQ TSRMLLNYVN RNAAKGYRGF KARFEVVCGG DLKLTKDQSI DSPNYPMDYM PDKECVWRIT APDNHQVALK FQSFELEKHD GCAYDFVEIR DGNHSDSRLI GRFCGDKLPP NIKTRSNQMY IRFVSDSSVQ KLGFSAALML DVDECKFTDH GCQHLCINTL GSYQCGCRAG YELQANGKTC EDACGGVVDA TKSNGSLYSP SYPDVYPNSK QCVWEVVAPP NHAVFLNFSH FDLEGTRFHY TKCNYDYLII YSKMRDNRLK KIGIYCGHEL PPVVNSEQSI LRLEFYSDRT VQRSGFVAKF VIDVDECSMN NGGCQHRCRN TFGSYQCSCR NGYTLAENGH NCTETRCKFE ITTSYGVLQS PNYPEDYPRN IYCYWHFQTV LGHRIQLTFH DFEVESHQEC IYDYVAIYDG RSENSSTLGI YCGGREPYAV IASTNEMFMV LATDAGLQRK GFKATFVSEC GGYLRATNHS QTFYSHPRYG SRPYKRNMYC DWRIQADPES SVKIRFLHFE IEYSERCDYD YLEITEEGYS MNTIHGRFCG KHKPPIIISN SDTLLLRFQT DESNSLRGFA ISFMAVDPPE DSVGEDFDAV TPFPGYLKSM YSSETGSDHL LPPSRLI'
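The output above comes back in space-separated blocks of ten residues. If a single contiguous string is needed for downstream tools, the whitespace can be stripped afterwards. The following is a minimal sketch using only the standard library; the names `clean_sequence` and `to_fasta`, the `header` argument, and the default line width of 60 are illustrative assumptions, not part of the original answer.

```python
def clean_sequence(raw: str) -> str:
    """Remove all whitespace (spaces, newlines, tabs) and uppercase the result."""
    return "".join(raw.split()).upper()


def to_fasta(header: str, raw: str, width: int = 60) -> str:
    """Format a sequence as a FASTA record.

    `header` is a hypothetical label chosen by the caller; `width` is the
    number of residues per line (60 is an assumed default).
    """
    seq = clean_sequence(raw)
    # Slice the cleaned sequence into fixed-width lines.
    lines = [seq[i:i + width] for i in range(0, len(seq), width)]
    return "\n".join([f">{header}"] + lines)
```

With the `sequence` variable from the answer, `clean_sequence(sequence)` would give the residues as one unbroken string, and `to_fasta("my_protein", sequence)` would wrap it for saving to a file.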
    

    【Comments】:

      【Solution 2】:

      Use the dedicated API that shows up in the browser's network tab; then you only need requests:

      import requests
      
      r = requests.get('http://flybase.org/api/sequence/id/FBgn0003719/FBpp').json()
      print(r['resultset']['result'][0]['sequence'])
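The chained lookup above raises an exception if the API returns no result or a differently shaped body. A slightly more defensive variant could look like the sketch below. It assumes only the JSON layout used in the answer (`resultset` → `result` → list of records with a `sequence` key); the function name `extract_sequence` and the `index` parameter are hypothetical.

```python
from typing import Any, Optional


def extract_sequence(payload: Any, index: int = 0) -> Optional[str]:
    """Return the sequence of the record at `index`, or None if it is absent.

    `payload` is the decoded JSON body; the key path mirrors the one used
    in the answer and is an assumption about the API's response shape.
    """
    try:
        return payload["resultset"]["result"][index]["sequence"]
    except (KeyError, IndexError, TypeError):
        # Missing key, empty result list, or an unexpected payload type.
        return None
```

With requests, it could be called as `extract_sequence(requests.get(url, timeout=10).json())`, checking for `None` before printing.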
      

      【Comments】:
