【Question title】: How can I scrape tables that seem to be hidden by jQuery?
【Posted】: 2021-07-18 16:34:08
【Question】:

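I am trying to scrape the words and their meanings from a website. I managed to scrape the first table, but bs4 cannot find the table for Word List 2 (or any of the other hidden tables), even after I click to reveal it in the browser. Is there something I should do differently for toggled/hidden elements like these?

Here is what I used to access the first table: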

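import requests
from bs4 import BeautifulSoup

root = "https://www.graduateshotline.com/gre-word-list.html#x2"

# Download the page and parse the HTML returned by the server
content = requests.get(root).text
soup = BeautifulSoup(content, 'html.parser')
# Take the first table carrying both CSS classes
table = soup.find_all('table', attrs={'class': 'tablex border1'})[0]
print(table)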
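
A side note on the URL used above: the `#x2` part is a fragment. Browsers handle fragments on the client side and do not send them to the server, so `requests` receives the same HTML with or without it. A minimal sketch using only the standard library shows the split (the variable names `url` and `fragment` are just for illustration):

```python
from urllib.parse import urldefrag

root = "https://www.graduateshotline.com/gre-word-list.html#x2"

# Split the URL into the part that is sent to the server and the fragment
url, fragment = urldefrag(root)

print(url)       # https://www.graduateshotline.com/gre-word-list.html
print(fragment)  # x2
```

So changing the fragment alone is unlikely to change what bs4 sees; the content of the other lists has to come from somewhere else.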

【Comments on the question】:

    Tags: python web-scraping beautifulsoup


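    【Solution 1】:
    import pandas as pd
    
    # The other word lists are not in the main page's HTML. This URL, which the
    # page appears to request when a list is clicked, serves list 2 directly.
    df = pd.read_html('https://www.graduateshotline.com/gre/load.php?file=list2.html',
                      attrs={'class': 'tablex border1'})[0]
    
    print(df)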
    

    Output:

                        0                                                  1
    0        multifarious                varied; motley; greatly diversified
    1      substantiation                giving facts to support (statement)
    2                feud          bitter quarrel over a long period of time
    3    indefatigability               not easily exhaustible; tirelessness
    4          convoluted                        complicated;coiled; twisted
    ..                ...                                                ...
    257        insensible              unconscious; unresponsive; unaffected
    258          gourmand  a person who is devoted to eating and drinking...
    259             plead              address a court of law as an advocate
    260            morbid            diseased; unhealthy (e.g.. about ideas)
    261            enmity                              hatred being an enemy
    
    [262 rows x 2 columns]
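
The same idea might extend to the other word lists. The sketch below assumes that each list is served under the same `load.php?file=list{n}.html` pattern; only `list2.html` is confirmed by the snippet above, so treat the pattern as an untested assumption. The names `BASE`, `list_url`, and `fetch_word_list` are hypothetical helpers, not part of the site or of pandas.

```python
BASE = "https://www.graduateshotline.com/gre/load.php"


def list_url(n: int) -> str:
    """Build the URL for word list n (pattern assumed from list2.html)."""
    return f"{BASE}?file=list{n}.html"


def fetch_word_list(n: int):
    """Download word list n as a DataFrame (needs pandas and network access)."""
    import pandas as pd  # imported here so list_url works without pandas

    # Same call as in the answer, with the list number made a parameter
    return pd.read_html(list_url(n), attrs={"class": "tablex border1"})[0]
```

Calling `fetch_word_list(2)` should request the same URL as the answer's snippet. Whether other list numbers exist, and how many there are, would need to be checked against the site, for example by watching the browser's network tab while clicking each list.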
    

    【Comments】:

    • Thanks! I didn't know you could access it that way.