Extract content of <Script> in Python with BeautifulSoup (answers)

【Question title】: Extract content of <Script> in Python with BeautifulSoup
【Posted】: 2018-09-15 07:16:52
【Question】:

I want to extract the value of window.FEED__INITIAL__STATE.

[Image: piece of code]

How can I do that?

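【Question comments】:

It is always recommended to paste the code here rather than show a picture of it. That way people can copy the code and change it in case it is incomplete or wrong.

Tags: python python-3.x beautifulsoup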

【Solution 1】:

Maybe you should try it like this:

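import requests
from bs4 import BeautifulSoup


def check_script_tag(url):
    # Download the page and parse it with the built-in HTML parser.
    r = requests.get(url)
    parsed_html = BeautifulSoup(r.content, features="html.parser")

    try:
        # find() returns only the first <script> inside <body>.
        text = parsed_html.body.find('script').text
        print(text)  # the text inside that script tag
    except AttributeError:
        # Raised when <body> is missing or contains no <script>.
        print("There is no script tag !!")


check_script_tag("https://stackoverflow.com")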
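Since `find('script')` stops at the first script tag, a page with several scripts needs a way to pick the right one. The following is a minimal sketch, not part of the original answer, that selects a script by its content using the `string` argument of `find`. The sample HTML and the helper name `find_script_text` are made up for illustration, and it assumes the variable is assigned in an inline script rather than in an external file.

```python
import re

from bs4 import BeautifulSoup

# Hypothetical page content, used only for illustration.
SAMPLE_HTML = """
<html>
  <head><script>var unrelated = 1;</script></head>
  <body>
    <script>window.FEED__INITIAL__STATE = {"items": [1, 2, 3]};</script>
  </body>
</html>
"""


def find_script_text(html, marker):
    """Return the text of the first <script> that mentions marker, or None."""
    soup = BeautifulSoup(html, "html.parser")
    # string= filters tags by their text content; re.escape keeps the
    # dots in the marker literal instead of treating them as wildcards.
    tag = soup.find("script", string=re.compile(re.escape(marker)))
    return tag.string if tag is not None else None


script_text = find_script_text(SAMPLE_HTML, "window.FEED__INITIAL__STATE")
print(script_text)
```

Searching from the parsed document rather than from `body` also covers scripts placed in `<head>`, and a missing match is returned as `None` instead of raising `AttributeError`.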

【Comments】:

【Solution 2】:

First, find all the script tags, then check each one for the variable name.

P.S. This is an update to RasitAydin's code.

import requests
from bs4 import BeautifulSoup


def check_script_tag(url):
    r = requests.get(url)
    parsed_html = BeautifulSoup(r.content, features="html.parser")

    # Collect every <script> inside <body>, not just the first one.
    script_tags = parsed_html.body.find_all('script')
    for script_tag in script_tags:
        text = script_tag.text
        # Case-insensitive check for the variable name.
        if 'window.FEED__INITIAL__STATE'.lower() in text.lower():
            print(text)


check_script_tag("YOUR WEB URL")  # replace with the URL of the page to inspect
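Printing the matching script gives the whole JavaScript source, while the question asks for the value itself. One possible way to get it, sketched here under the assumption that the assigned value is a valid JSON literal (pages often embed state this way, but arbitrary JavaScript will not parse), is to locate the assignment with a regular expression and let `json.JSONDecoder.raw_decode` read the value. The helper name `extract_js_value` and the sample string are hypothetical.

```python
import json
import re


def extract_js_value(script_text, var_name):
    """Parse the JSON value assigned to var_name inside script_text.

    Returns None when no assignment is found. Raises json.JSONDecodeError
    when the assigned value is not valid JSON.
    """
    # Match "var_name =" plus any whitespace after the equals sign.
    match = re.search(re.escape(var_name) + r"\s*=\s*", script_text)
    if match is None:
        return None
    # raw_decode parses one JSON value starting at the given index and
    # ignores whatever follows it, such as a trailing semicolon.
    value, _end = json.JSONDecoder().raw_decode(script_text, match.end())
    return value


# Hypothetical script content, used only for illustration.
sample = 'window.FEED__INITIAL__STATE = {"feed": {"items": [1, 2, 3]}};'
state = extract_js_value(sample, "window.FEED__INITIAL__STATE")
print(state["feed"]["items"])
```

In `check_script_tag`, the `text` of the matching script tag could be passed to such a helper in place of `print(text)`, which yields a regular Python dict or list to work with.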

【Comments】: