【发布时间】:2021-09-20 08:30:29
【问题描述】:
直到前几天,下面的 sn-p 才能正常工作。有什么方法可以轻松提取此 div class="row mb-4" 中的所有数据。我的想法是,如果对页面进行额外的更改,脚本仍然不会受到影响。
import requests
from bs4 import BeautifulSoup
header = {
'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:90.0) Gecko/20100101 Firefox/90.0'
}
url = "https://bscscan.com/token/"
token = "0x4ce1a5cb12151423ea479cfd0c52ec5021d108d8"
tokenurl = str(url)+str(token)
contractpage = requests.get(tokenurl,header)
ca = BeautifulSoup(contractpage.content, 'html.parser')
tokenholders = ca.find(id='ContentPlaceHolder1_tr_tokenHolders').get_text()
tokenholdersa = (((tokenholders.strip().strip("Holders:")).strip()).strip(" a ")).strip()
tholders = ((((tokenholders.strip()).strip("Holders:")).strip()).strip(" a ")).strip()
tokenaname = ca.find('span', class_='text-secondary small').get_text().strip()
def get_transfer_count(str:token)->str:
with requests.Session() as s:
s.headers = {'User-Agent':'Mozilla/5.0'}
r = s.get(f'https://bscscan.com/token/{token}')
try:
sid = re.search(r"var sid = '(.*?)'", r.text).group(1)
r = s.get(f'https://bscscan.com/token/generic-tokentxns2?m=normal&contractAddress={token}&a=&sid={sid}&p=1')
return re.search(r"var totaltxns = '(.*?)'", r.text).group(1)
except:
pass
transcount = get_transfer_count(token)
print ("Token: ", tokenaname)
print ("Holders: ", tholders)
print ("Transfers: ", transcount)
上一个输出:
Token: Binemon
Holders: 27,099
Transfers: 439,636
想要改进的输出:
Token: Binemon
PRICE: $0.01 @ 0.000037 BNB (-22.41%)
Fully Diluted Market Cap: $14,011,783.50
Total Supply: 975,000,000 BIN
Holders: 27,099 addresses
Transfers: 439,636
Contract: 0xe56842ed550ff2794f010738554db45e60730371
Decimals: 18
Official Site: https://binemon.io/
Social Profiles:
https://twitter.com/binemonnft
https://t.me/binemonchat
https://docs.binemon.io/
https://coinmarketcap.com/currencies/binemon/
https://www.coingecko.com/en/coins/binemon/
【问题讨论】:
-
我的示例代码为 403。 :(
标签: python python-3.x web-scraping beautifulsoup