【Question title】: How to separate a line of scraped data from a webpage for better readability
【Posted】: 2021-09-12 10:48:41
【Question description】:

The snippet currently runs, but the extracted data is not very readable. I would like to split this line of data in two.

import requests
from bs4 import BeautifulSoup

header = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:90.0) Gecko/20100101 Firefox/90.0'
}

url = "https://bscscan.com/address/0xe1fd7b4c9debac3c490d8a553c455da4979482e4"
# Pass the dict via headers=; as a bare second positional argument it is sent as query params.
req = requests.get(url, headers=header, timeout=10)
soup = BeautifulSoup(req.content, 'html.parser')
creator = soup.find(id='ContentPlaceHolder1_trContract').get_text()
tokentracker = soup.find(id='ContentPlaceHolder1_tr_tokeninfo').get_text()

print(creator)
print(tokentracker)

Current output:

ContractCreator:
0xab3a68876925ecc5f361cefe78b3dae78b971436 at txn 0xc78e35353426d2851be008bf4de269652a4ce1746d025fae5aabd72454a31715


TokenTracker:

 StackDoge (STACKDOGE)

Desired output:

Contract Owner: 0xab3a68876925ecc5f361cefe78b3dae78b971436
Transaction ID: 0xc78e35353426d2851be008bf4de269652a4ce1746d025fae5aabd72454a31715

Token Name: StackDoge (STACKDOGE)

【Question discussion】:

    Tags: python python-3.x beautifulsoup


    【Solution 1】:

    You can split the string using " at txn " as the separator:

    txt = "0xab3a68876925ecc5f361cefe78b3dae78b971436 at txn 0xc78e35353426d2851be008bf4de269652a4ce1746d025fae5aabd72454a31715"
    
    x = txt.split(" at txn ")
    
    print(f'Contract Owner: {x[0]}')
    print(f'Transaction ID: {x[1]}')
    

    This prints:

    Contract Owner: 0xab3a68876925ecc5f361cefe78b3dae78b971436
    Transaction ID: 0xc78e35353426d2851be008bf4de269652a4ce1746d025fae5aabd72454a31715
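
For the follow-up about the token name, the same string-only approach could plausibly be extended to both values. The sketch below is only an illustration under stated assumptions: `creator_text` and `tokentracker_text` are hard-coded copies of the output shown in the question (not a live request), and `split_creator` and `clean_token_name` are hypothetical helper names that do not appear in the original code.

```python
# Hard-coded copies of the text printed in the question (assumed shape,
# including the label lines and stray whitespace).
creator_text = (
    "\nContractCreator:\n"
    "0xab3a68876925ecc5f361cefe78b3dae78b971436 at txn "
    "0xc78e35353426d2851be008bf4de269652a4ce1746d025fae5aabd72454a31715\n"
)
tokentracker_text = "\nTokenTracker:\n\n StackDoge (STACKDOGE)\n"


def split_creator(text):
    """Return (owner, txn_hash) from the creator row text (hypothetical helper)."""
    before, sep, after = text.partition(" at txn ")
    if not sep:
        raise ValueError("separator ' at txn ' not found")
    # The part before the separator still carries the "ContractCreator:" label,
    # so keep only its last whitespace-separated token.
    return before.split()[-1], after.strip()


def clean_token_name(text):
    """Drop the "TokenTracker:" label, if present, and trim whitespace (hypothetical helper)."""
    _, sep, name = text.partition("TokenTracker:")
    return (name if sep else text).strip()


owner, txn_hash = split_creator(creator_text)
token_name = clean_token_name(tokentracker_text)

print(f"Contract Owner: {owner}")
print(f"Transaction ID: {txn_hash}")
print()
print(f"Token Name: {token_name}")
```

`str.partition` is used here instead of `split` so that a missing separator can be detected explicitly rather than surfacing later as an `IndexError`.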
    

    【Discussion】:

    • What about the last part: Token Name: StackDoge (STACKDOGE)
    【Solution 2】:

    Try the following:

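    # creator still carries the "ContractCreator:" label from get_text(),
    # so keep only the last whitespace-separated token of each part.
    newlist = [x.split()[-1] for x in creator.split("at txn")]
    print("Contract Owner: " + newlist[0])
    print("Transaction ID: " + newlist[1])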
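
As an alternative to splitting on a literal separator, a regular expression can pull out both hex values in one step. This is a different technique from the answer above, shown only as a rough sketch: it assumes a 40-hex-digit address and a 64-hex-digit transaction hash (which matches the sample output in the question), and `sample` and `CREATOR_RE` are illustrative names, not part of the original code.

```python
import re

# Sample text copied from the question's output; in the real script this
# would be the `creator` string returned by get_text().
sample = (
    "\nContractCreator:\n"
    "0xab3a68876925ecc5f361cefe78b3dae78b971436 at txn "
    "0xc78e35353426d2851be008bf4de269652a4ce1746d025fae5aabd72454a31715\n"
)

# Assumed format: 0x + 40 hex digits, "at txn", then 0x + 64 hex digits.
CREATOR_RE = re.compile(r"(0x[0-9a-fA-F]{40})\s+at txn\s+(0x[0-9a-fA-F]{64})")

match = CREATOR_RE.search(sample)
if match:
    owner, txn_hash = match.groups()
    print("Contract Owner:", owner)
    print("Transaction ID:", txn_hash)
else:
    print("Creator row did not match the expected format")
```

Because the pattern anchors on the hex values themselves, the leading "ContractCreator:" label and any surrounding whitespace are ignored without extra stripping.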
    

    【Discussion】:

    • What about the last part of the script: Token Name: StackDoge (STACKDOGE)
    【Solution 3】:

    Here is another solution you can try:

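    # creator and tokentracker must be bs4 Tag objects here, so drop the
    # .get_text() calls from the question's code before running this.
    print("Contract Owner:", creator.find('a', attrs={"title": "Creator Address"}).text)
    print("Transaction ID:", creator.find('a', attrs={"title": "Creator Txn Hash"}).text)

    print("Token Name:", tokentracker.find("a").text)

    Output:
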
    Contract Owner: 0xab3a68876925ecc5f361cefe78b3dae78b971436
    Transaction ID: 0xc78e35353426d2851be008bf4de269652a4ce1746d025fae5aabd72454a31715
    Token Name: StackDoge (STACKDOGE)
    

    【Discussion】:

    • It works after removing the .get_text() part.