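【Posted】: 2018-06-19 05:25:27
【Question】:
I am trying to use the NLTK library to train on some data, following a step-by-step tutorial. The first step worked, but on the second step I get the following error:
TypeError: a bytes-like object is required, not 'list'
I have tried my best to fix it, but I keep getting the same error.
Here is my code: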
from bs4 import BeautifulSoup
import urllib.request
response = urllib.request.urlopen('http://php.net/')
html = response.read()
soup = BeautifulSoup(html, "html5lib")
text = soup.get_text(strip=True)
print(text)
Here is the error:
C:\python\lib\site-packages\bs4\__init__.py:181: UserWarning: No parser was explicitly specified, so I'm using the best available HTML parser for this system ("html5lib"). This usually isn't a problem, but if you run this code on another system, or in a different virtual environment, it may use a different parser and behave differently.
The code that caused this warning is on line 8 of the file E:/secure secure/chatbot-master/nltk.py. To get rid of this warning, change code that looks like this:
BeautifulSoup(YOUR_MARKUP})
to this:
BeautifulSoup(YOUR_MARKUP, "html5lib")
markup_type=markup_type))
Traceback (most recent call last):
File "E:/secure secure/chatbot-master/nltk.py", line 8, in <module>
soup = BeautifulSoup(html)
File "C:\python\lib\site-packages\bs4\__init__.py", line 228, in __init__
self._feed()
File "C:\python\lib\site-packages\bs4\__init__.py", line 289, in _feed
self.builder.feed(self.markup)
File "C:\python\lib\site-packages\bs4\builder\_html5lib.py", line 72, in feed
doc = parser.parse(markup, **extra_kwargs)
File "C:\python\lib\site-packages\html5lib\html5parser.py", line 236, in parse
parseMeta=parseMeta, useChardet=useChardet)
File "C:\python\lib\site-packages\html5lib\html5parser.py", line 89, in _parse
parser=self, **kwargs)
File "C:\python\lib\site-packages\html5lib\tokenizer.py", line 40, in __init__
self.stream = HTMLInputStream(stream, encoding, parseMeta, useChardet)
File "C:\python\lib\site-packages\html5lib\inputstream.py", line 148, in HTMLInputStream
return HTMLBinaryInputStream(source, encoding, parseMeta, chardet)
File "C:\python\lib\site-packages\html5lib\inputstream.py", line 416, in __init__
self.rawStream = self.openStream(source)
File "C:\python\lib\site-packages\html5lib\inputstream.py", line 453, in openStream
stream = BytesIO(source)
TypeError: a bytes-like object is required, not 'list'
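The last frame of the traceback (`stream = BytesIO(source)`) can be reproduced in isolation with the standard library alone. The snippet below is a minimal sketch, not the asker's actual code: `open_stream` is a hypothetical helper that only mimics that one step, and the idea that `html` was a list in the script that was really executed is an inference from the error message. Note also that the traceback shows `soup = BeautifulSoup(html)`, which differs from the posted line `BeautifulSoup(html,"html5lib")`, so the file that ran may not match the code shown above.

```python
from io import BytesIO


def open_stream(source):
    """Hypothetical helper: wrap markup in an in-memory byte stream,
    as the last traceback frame does with BytesIO(source)."""
    return BytesIO(source)


# bytes, which is what response.read() returns, are accepted
stream = open_stream(b"<html><body>hello</body></html>")
first_bytes = stream.read(6)

# a list (assumed here: a list of byte chunks) is rejected
try:
    open_stream([b"<html>", b"</html>"])
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(first_bytes)
print(error_message)
```

If this is what happened, checking `type(html)` right before the `BeautifulSoup(...)` call in the executed file would show whether it is `bytes` or a `list`.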
【Comments】:
-
Have you looked at this thread: stackoverflow.com/questions/16206380/… ? You could try get_text: crummy.com/software/BeautifulSoup/bs4/doc/#get-text
-
I tried running your script and it returns the text fine. Can you post the detailed error message?
-
I get this error when I run it:
-
TypeError: a bytes-like object is required, not 'list'
-
The script works fine; please edit the question and add the error message.
Tags: python python-3.x beautifulsoup