【Posted】:2018-04-11 14:57:43
【Question】:
I am using BeautifulSoup to read a website that is written in Hebrew and encoded in windows-1255, according to this line in the page:
<meta http-EQUIV="Content-Type" Content="text/html; charset=windows-1255">
When I try to encode the parsed output, I get the following error:
> UnicodeEncodeError: 'charmap' codec can't encode characters in position 6949-6950: character maps to <undefined>
Code:
from bs4 import BeautifulSoup
import requests
r = requests.get('http://www.plonter.co.il')
soup = BeautifulSoup(r.text)
print soup.prettify().encode('windows-1255')
【Comments】:
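One possible cause, offered as an assumption rather than a confirmed diagnosis of this site: if the HTTP response header carries no charset, requests may fall back to ISO-8859-1 for `r.text`, so the Hebrew bytes are decoded into Latin-1 characters that windows-1255 cannot represent. The sketch below reproduces that effect offline using only the standard library. The sample string and all variable names are hypothetical stand-ins for the page body, and the snippet is written for Python 3, not the Python 2 used in the question.

```python
# Hypothetical stand-in for the page body: "shalom" written in Hebrew letters.
original = "\u05e9\u05dc\u05d5\u05dd"

# The bytes a server would send for a windows-1255 page.
raw = original.encode("windows-1255")

# Assumed failure mode: the bytes get decoded with the wrong codec.
wrong = raw.decode("iso-8859-1")

try:
    wrong.encode("windows-1255")
    failed = False
except UnicodeEncodeError:
    # The Latin-1 characters have no mapping in windows-1255.
    failed = True

# Decoding with the codec named in the meta tag round-trips cleanly.
right = raw.decode("windows-1255")
roundtrip = right.encode("windows-1255")

print(failed)
```

If this is what is happening, two things that might be worth trying are setting `r.encoding = 'windows-1255'` before reading `r.text`, or passing the raw bytes to the parser with `BeautifulSoup(r.content, from_encoding='windows-1255')`. Neither has been tested against the site in the question.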
Tags: python web-scraping character-encoding beautifulsoup