【Question title】: Python - BeautifulSoup - For loop outputting data in wrong order
【Posted】: 2020-07-18 17:33:42
【Question description】:

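I am trying to scrape match data from the following site:

https://sport-tv-guide.live/live/darts

The data is scraped without errors, but the output is not displayed in the order I expect. I believe this is because I have structured the for loops incorrectly (see the code and output below).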

import requests
from bs4 import BeautifulSoup

def makesoup(url):
    cookies = {'mycountries' : '101,28,3,102,42,10,18,4,2'}
    r = requests.post(url,  cookies=cookies)
    return BeautifulSoup(r.text,"lxml")
   
    
def matchscrape(g_data):

    for match in g_data:

        scheduled = match.findAll('div', class_='main time col-sm-2 hidden-xs')
        details = match.findAll('div', class_='col-xs-6 mobile-normal')
        
        for schedule in scheduled:
           
            print("DateTimes; ", schedule.text.strip())
        for detail in details:
                print("Details:",  detail.text.strip())
            
            
def matches():
    soup=makesoup(url = "https://sport-tv-guide.live/live/darts")
    matchscrape(g_data = soup.findAll("div", {"class": "listData"}))

The code above produces the following output:

I then tried moving the second for loop inside the first, as in the code below:

import requests
from bs4 import BeautifulSoup

def matchscrape(g_data):

    for match in g_data:

        scheduled = match.findAll('div', class_='main time col-sm-2 hidden-xs')
        details = match.findAll('div', class_='col-xs-6 mobile-normal')
        
        for schedule in scheduled:
           
            print("DateTimes; ", schedule.text.strip())
            for detail in details:
                print("Details:",  detail.text.strip())

But I get the following output:

The output I am trying to get is:

Thanks to anyone who can advise or provide a solution.

【Discussion】:

    Tags: python beautifulsoup


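    【Solution 1】:

    Use the built-in zip() function to "tie" the two lists together, so that each date/time is printed with its own match details: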

    import requests
    from bs4 import BeautifulSoup
    
    def makesoup(url):
        cookies = {'mycountries' : '101,28,3,102,42,10,18,4,2'}
        r = requests.post(url,  cookies=cookies)
        return BeautifulSoup(r.text,"lxml")
    
    
    def matchscrape(g_data):
    
        for match in g_data:
            scheduled = match.findAll('div', class_='main time col-sm-2 hidden-xs')
            details = match.findAll('div', class_='col-xs-6 mobile-normal')
    
            for s, d in zip(scheduled, details):  # <-- using zip() here!
                print(s.text.strip())
                print(d.text.strip())
    
    
    def matches():
        soup=makesoup(url = "https://sport-tv-guide.live/live/darts")
        matchscrape(g_data = soup.findAll("div", {"class": "listData"}))
    
    matches()
    

    Prints:

    Darts 17:00
    Simon Whitlock vs. Joyce Ryan
    World Matchplay
    Darts 18:00
    Ratajski Krzysztof vs. Wattimena Jermaine
    World Matchplay
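
To make the difference between the three loop shapes concrete, here is a minimal sketch that uses plain lists instead of the live page. The list contents and the helper names `sequential`, `nested`, and `paired` are hypothetical stand-ins chosen for illustration; they are not part of the original code, and the strings only imitate the text of the scraped `div` elements.

```python
# Hypothetical stand-ins for the text of the scraped <div> elements.
scheduled = ["Darts 17:00", "Darts 18:00"]
details = [
    "Simon Whitlock vs. Joyce Ryan",
    "Ratajski Krzysztof vs. Wattimena Jermaine",
]


def sequential(times, infos):
    """Two loops one after the other: every time first, then every detail."""
    out = []
    for t in times:
        out.append(t)
    for i in infos:
        out.append(i)
    return out


def nested(times, infos):
    """Inner loop inside the outer one: every detail repeats under each time."""
    out = []
    for t in times:
        out.append(t)
        for i in infos:
            out.append(i)
    return out


def paired(times, infos):
    """zip() walks both lists in step, yielding one (time, detail) pair per match."""
    out = []
    for t, i in zip(times, infos):
        out.append(t)
        out.append(i)
    return out


if __name__ == "__main__":
    for line in paired(scheduled, details):
        print(line)
```

With these inputs, `sequential` groups the two times ahead of the two details, `nested` repeats both details under each time, and only `paired` alternates a time with its own detail. Note that `zip()` stops at the shorter input, so if one of the two lists has fewer elements than the other, the unmatched trailing items are silently dropped.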
    

    【Discussion】:

    • Thank you. I had not used this function before, so it is good to learn something new.