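【Posted】: 2019-08-08 17:13:29
【Question】:
I am trying to collect all the internal links for the Python Requests library and filter out all the external ones.
I am using a regular expression to do this, but it raises the error below, which I have not been able to resolve.
My code: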
import requests
from bs4 import BeautifulSoup
import re
r = requests.get('https://2.python-requests.org/en/master/')
content = BeautifulSoup(r.text)
[i['href'] for i in content.find_all('a') if not re.match("http", i)]
Error:
TypeError Traceback (most recent call last)
<ipython-input-10-b7d82067fe9c> in <module>
----> 1 [i['href'] for i in content.find_all('a') if not re.match("http", i)]
<ipython-input-10-b7d82067fe9c> in <listcomp>(.0)
----> 1 [i['href'] for i in content.find_all('a') if not re.match("http", i)]
~\Anaconda3\lib\re.py in match(pattern, string, flags)
171 """Try to apply the pattern at the start of the string, returning
172 a Match object, or None if no match was found."""
--> 173 return _compile(pattern, flags).match(string)
174
175 def fullmatch(pattern, string, flags=0):
TypeError: expected string or bytes-like object
【Discussion】:
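The traceback points at `re.match("http", i)`: each `i` yielded by `content.find_all('a')` is a BeautifulSoup `Tag`, not a string, and `re.match` only accepts `str` or bytes. Matching against the `href` value instead should avoid the `TypeError`. Below is a minimal sketch of that idea, not a definitive fix. The helper names `is_internal` and `internal_hrefs` and the sample data are hypothetical. Plain dicts stand in for the tags so the snippet runs without a network call; it relies only on `.get('href')`, which both dicts and `Tag` objects provide.

```python
import re


def is_internal(href):
    """True when href is a string that does not start with http:// or https://."""
    # Anchors without an href yield None, which re.match would reject,
    # so non-strings are treated as "not a link" instead of raising.
    return isinstance(href, str) and re.match(r"https?://", href) is None


def internal_hrefs(anchors):
    """Collect non-http hrefs from objects that support .get('href')."""
    hrefs = (a.get("href") for a in anchors)
    return [h for h in hrefs if is_internal(h)]


# Hypothetical stand-in data; with BeautifulSoup this would be
# content.find_all('a') from the question.
sample = [
    {"href": "user/install/"},
    {"href": "https://example.com/"},
    {"name": "top"},  # an anchor with no href attribute
    {"href": "#requests"},
]

print(internal_hrefs(sample))
```

With the question's own variables, the call would be `internal_hrefs(content.find_all('a'))`. This keeps the question's rule that anything starting with `http` is external, so values such as `mailto:` addresses or protocol-relative `//host/path` URLs would still be reported as internal; parsing each `href` with `urllib.parse.urlparse` and comparing hosts is a stricter alternative.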
Tags: python regex python-3.x list-comprehension