You can use list.extend to add the items to a single list, then sort the final list before returning it.
For example:
import re
import requests
from bs4 import BeautifulSoup

links = ["https://bitcointalk.org/index.php?board=159.0",
         "https://bitcointalk.org/index.php?board=159.40",
         "https://bitcointalk.org/index.php?board=159.80"]

def get_span(links):
    rv = []
    # topic ID (7 or more digits), a dot, then the page offset
    r = re.compile(r'\d{7,}\.\d+')
    for url in links:
        soup = BeautifulSoup(requests.get(url).content, "html.parser")
        # extend() adds each matching href to rv instead of nesting a list per page
        rv.extend(a['href'] for a in soup.select('span[id^="msg_"] > a') if r.search(a['href']))
    # sort by the numeric topic ID, highest first
    return sorted(rv, key=lambda k: float(r.search(k).group(0)), reverse=True)

all_links = get_span(links)

# print links on screen:
for link in all_links:
    print(link)
Prints:
https://bitcointalk.org/index.php?topic=5255494.0
https://bitcointalk.org/index.php?topic=5255416.0
https://bitcointalk.org/index.php?topic=5255389.0
https://bitcointalk.org/index.php?topic=5255376.0
https://bitcointalk.org/index.php?topic=5255316.0
https://bitcointalk.org/index.php?topic=5254720.0
https://bitcointalk.org/index.php?topic=5254480.0
https://bitcointalk.org/index.php?topic=5254448.0
https://bitcointalk.org/index.php?topic=5254287.0
https://bitcointalk.org/index.php?topic=5252504.0
https://bitcointalk.org/index.php?topic=5251621.0
https://bitcointalk.org/index.php?topic=5250998.0
https://bitcointalk.org/index.php?topic=5250388.0
https://bitcointalk.org/index.php?topic=5250185.0
https://bitcointalk.org/index.php?topic=5248406.0
https://bitcointalk.org/index.php?topic=5247112.0
... and so on.
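To see the extend-then-sort step on its own, here is a minimal sketch that runs without any network access. The helper name sort_topic_links and the sample_pages data are hypothetical, made up for illustration; only the regular expression and the sort key are taken from the code above.

```python
import re

# Same pattern as in get_span: a topic ID of 7+ digits, a dot, then an offset.
TOPIC_RE = re.compile(r'\d{7,}\.\d+')

def sort_topic_links(pages):
    """Flatten per-page lists of hrefs and sort them by topic ID, highest first.

    Hypothetical helper: it stands in for the scraping loop, with each inner
    list playing the role of the hrefs found on one board page.
    """
    collected = []
    for hrefs in pages:
        # extend() appends each matching href individually instead of nesting lists
        collected.extend(h for h in hrefs if TOPIC_RE.search(h))
    return sorted(collected,
                  key=lambda h: float(TOPIC_RE.search(h).group(0)),
                  reverse=True)

# Made-up sample input; the URLs reuse topic IDs from the output above.
sample_pages = [
    ["https://bitcointalk.org/index.php?topic=5254720.0",
     "https://bitcointalk.org/index.php?topic=5255494.0"],
    ["https://bitcointalk.org/index.php?board=159.40",  # no 7-digit ID, filtered out
     "https://bitcointalk.org/index.php?topic=5255416.0"],
]

result = sort_topic_links(sample_pages)
for link in result:
    print(link)
```

The board link is dropped because its number has fewer than seven digits, and the remaining three topic links come out in descending ID order regardless of which page they were found on.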
Edit: If you want to display the link text next to the URL, you can use this example:
import re
import requests
from bs4 import BeautifulSoup

links = ["https://bitcointalk.org/index.php?board=159.0",
         "https://bitcointalk.org/index.php?board=159.40",
         "https://bitcointalk.org/index.php?board=159.80"]

def get_span(links):
    rv = []
    r = re.compile(r'\d{7,}\.\d+')
    for url in links:
        soup = BeautifulSoup(requests.get(url).content, "html.parser")
        # store (href, link text) tuples so the title travels with its URL
        rv.extend((a['href'], a.text) for a in soup.select('span[id^="msg_"] > a') if r.search(a['href']))
    # k[0] is the href; sort by its numeric topic ID, highest first
    return sorted(rv, key=lambda k: float(r.search(k[0]).group(0)), reverse=True)

all_links = get_span(links)

# print links on screen:
for link, text in all_links:
    print('{} {}'.format(link, text))
Prints:
https://bitcointalk.org/index.php?topic=5255494.0 NUL Token - A new hyper-deflationary experiment! Airdrop!
https://bitcointalk.org/index.php?topic=5255416.0 KEEP NETWORK - A privacy layer for Ethereum
https://bitcointalk.org/index.php?topic=5255389.0 [ANN] ICO - OBLICHAIN | Blockchain technology at the service of creative genius
https://bitcointalk.org/index.php?topic=5255376.0 UniChain - The 4th Generation Blockchain Made For The Smart Society 5.0
https://bitcointalk.org/index.php?topic=5255316.0 INFINITE RICKS ! First Multiverse Cryptocurrency ! PoS 307%
https://bitcointalk.org/index.php?topic=5254720.0 [GMC] GameCredits - Unofficial & Unmoderated for Censored Posts.
https://bitcointalk.org/index.php?topic=5254480.0 [ANN] [BTCV] Bitcoin VaultA higher standard in security
https://bitcointalk.org/index.php?topic=5254448.0 [ANN] Silvering (SLVG) token - New Silver Asset Backed Cryptocurrency
... and so on.