【Question title】: Python - how to write multiple lines to a file? (list, not string)
【Posted】: 2021-03-15 01:28:47
【Question】:

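I can't for the life of me figure out how to get the elements of this list (which are themselves lists) to print on multiple lines when I write them to a file. I scrape the titles from a website, then scrape the links. The end goal (for your insight) is to pair each title with its link in a format like this: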

<a href='www.mywebsite.com/curry-recipe'>Curry Recipe</a>

But for now, the problem is that although my final desfinalList looks fine, for example:

[['Curry Recipe', 'www.originalwebsite.com/curry-recipe'], ['Pancake Recipe', 'www.originalwebsite.com/pancake-recipe']]

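I can't seem to write it to the file without it all ending up on one long line. With text wrapping it is visually manageable, but I would prefer it on multiple lines.

The code in question is the last block.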

def OFDdesserts():

    urlA = 'https://olivesfordinner.com/category/dessert/page/{}'

    for i in range(2,5): 
        url = urlA.format(i)
    
        response = requests.get(url)
        htmlText = response.text
    
        soup = BeautifulSoup(htmlText, 'lxml')
        links = soup.find_all('article')

        for title in links[0:12]:
            titleActual = title.get('aria-label')
            if 'Giveaway' not in titleActual:
                hyperL = title.find('header', class_ = 'entry-header').a['href']
                if titleActual not in desTitleList:
                    desTitleList.append(titleActual)
                    desLinkList.append(hyperL)


    desList3.append([[x,y] for x,y in zip(desTitleList, desLinkList)])

    #erase duplicates
    for item in desList3:
        if item not in desfinalList:
            desfinalList.append(item) 

    #write the file
    for elem in desfinalList:
        with open('recipes/desserts.txt', 'w') as f:
            f.write('\n \n'.join(map(str, desfinalList)))
            print('just added something yummy to desserts!')

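【Comments】:

  • Your code works fine for me. The sample list of lists produces a file with a blank line between the two lists. I'm on a Windows PC, if that makes any difference. Are you sure the example you provided is what you are actually working with?
  • Yes, I copy-pasted it.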
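
The comment above can be checked with a short sketch. It assumes, as the posted function suggests, that the lists start out empty; the names `nested` and `flat` are hypothetical, and the sample pairs are the ones from the question. One possible explanation for the single long line is that `append` stores the whole list of pairs as a single element, so `join` has only one item to work with, whereas `extend` adds each pair separately.

```python
pairs = [
    ['Curry Recipe', 'www.originalwebsite.com/curry-recipe'],
    ['Pancake Recipe', 'www.originalwebsite.com/pancake-recipe'],
]

# What the posted code does: append() stores the whole list as ONE element.
nested = []
nested.append([[title, link] for title, link in pairs])

# Alternative: extend() adds each [title, link] pair as its own element.
flat = []
flat.extend([title, link] for title, link in pairs)

# Same join call as in the question, applied to both shapes.
nested_text = '\n \n'.join(map(str, nested))
flat_text = '\n \n'.join(map(str, flat))

print(len(nested), len(flat))
print(flat_text)
```

With the nested shape there is nothing for `join` to separate, so the text contains no newline at all. With the flat shape each pair lands on its own line.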

Tags: python-3.x list web-scraping beautifulsoup file-writing


【Solution 1】:

It is better to use the .select() method, which takes selectors in the style of jQuery or CSS, and to use str() to get the link's HTML: <a href="....">anchor</a>

import requests
from bs4 import BeautifulSoup

def OFDdesserts():
    urlA = 'https://...../page/{}'
    linkTags = []

    # pages 2 through 4 of the category listing
    for i in range(2,5):
        url = urlA.format(i)
        response = requests.get(url)
        soup = BeautifulSoup(response.text, 'html.parser')
        links = soup.select('.content article h2 a')

        for link in links:
            # skip giveaway posts
            if 'Giveaway' in link.text:
                continue
            # clean tag (i) from anchor text
            for icon in link.find_all('i'):
                icon.extract()
            # clean link attributes, keeping only href
            link.attrs = {'href': link.attrs['href']}
            linkStr = str(link)
            # skip duplicates
            if linkStr not in linkTags:
                linkTags.append(linkStr)

    # write the file once, with a blank line between entries
    with open('desserts.txt', 'w') as f:
        f.write('\n\n'.join(linkTags))
        print('just added something yummy to desserts!')
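
If the titles and links are already collected as pairs, as in the question's `desfinalList`, the anchor tags can also be built with plain string formatting and written one per line. This is a minimal sketch, not the answer author's method: `to_anchor` and `write_anchors` are hypothetical helper names, and the output path is a temporary file chosen only for the demonstration.

```python
import html
import os
import tempfile


def to_anchor(title, link):
    """Return an <a> tag for one title/link pair, escaping special characters."""
    return '<a href="{}">{}</a>'.format(html.escape(link, quote=True),
                                        html.escape(title))


def write_anchors(pairs, path):
    """Write one anchor tag per line, skipping pairs already seen."""
    seen = set()
    lines = []
    for title, link in pairs:
        if (title, link) not in seen:
            seen.add((title, link))
            lines.append(to_anchor(title, link))
    # Open the file once, outside any loop, and join the lines with newlines.
    with open(path, 'w', encoding='utf-8') as f:
        f.write('\n'.join(lines) + '\n')
    return lines


pairs = [
    ['Curry Recipe', 'www.originalwebsite.com/curry-recipe'],
    ['Pancake Recipe', 'www.originalwebsite.com/pancake-recipe'],
    ['Curry Recipe', 'www.originalwebsite.com/curry-recipe'],  # duplicate
]

# Hypothetical output location, used only for this demonstration.
out_path = os.path.join(tempfile.mkdtemp(), 'desserts.txt')
written = write_anchors(pairs, out_path)

with open(out_path, encoding='utf-8') as f:
    print(f.read())
```

Opening the file a single time and joining the finished lines avoids rewriting the same content on every loop iteration, which the question's final block does.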
