以编程方式保存 Internet Explorer 中的 HTML答案

【问题标题】：Programmatically save the HTML from internet explorer以编程方式保存 Internet Explorer 中的 HTML
【发布时间】：2014-07-29 12:14:10
【问题描述】：

是否有一种编程方式（最好是在 Python 中）从 Windows 的 Internet Explorer 网页中保存 HTML 源代码？我用 Python 的 urllib2.urlopen 试过这个，但我得到了 404 错误。但是我可以在没有 404 的情况下使用 Internet Explorer 打开链接。我想我可以使用 python 的 Webbrowser 模块在 IE 中打开链接，但是 Webbrowser 没有办法从 IE 中保存 HTML。

【问题讨论】：

发布您尝试保存的网址

标签： python html

【解决方案1】：

import urllib
from lxml import html

url = "http://yourWebsite.com/index.html"
page = html.fromstring(urllib.urlopen(url).read())

你试过了吗？

【讨论】：

【解决方案2】：

这可行，但是我不知道是哪个网站，是否需要身份验证，这就是原因。您没有提供有关该网站的详细信息以及您在问题中尝试的内容。这是如何从网页保存 html 的示例：

import urllib

url = 'http://www.google.com'
lines = urllib.urlopen(url).readlines()

html = open('google.html', 'w')
for line in lines:
    html.write(line)

【讨论】：