【Question title】: Python code writes csv file, only prints two text entries
【Posted】: 2020-04-17 17:43:38
【Question】:
#Stuff needed to run
import requests
import urllib.request
import io
from bs4 import BeautifulSoup as soup

#Pick url, request it, save response, read response, soup it into variable
my_url = 'https://old.reddit.com/r/all/'
request = urllib.request.Request(my_url,headers={'User-Agent': 'your bot 0.1'})
response = urllib.request.urlopen(request)
page_html = response.read()
page_soup = soup(page_html, "html.parser")

#get all the posts, get one post, get all the authors, get one author
posts = page_soup.findAll("div", {"class": "top-matter"})
post = posts[0]
authors = page_soup.findAll("p", {"class":"tagline"})
author = authors[0]

#make filename, open to write, set the headers, write the headers,
filename = "redditAll.csv"
f = open(filename, "w")
headers = "Title of the post, Author of the post\n"
f.write(headers)

#for the post and author in posts and authors, get one of each, open the file & write it, repeat
for post, author in zip(posts, authors):
    post_text = post.p.a.text.replace(",", " -")
    username = author.a.text
    with open(filename, "w", encoding="utf-8") as f:
        f.write(post_text + "," + username + "\n")

#close the file
f.close()

After running this code and opening the csv file, only two cells contain text.

There should be more than two, because reddit.com/r/all has more than two posts.

I changed

for post, author in zip(posts, authors):
    post_text = post.p.a.text.replace(",", " -")
    username = author.a.text
    with open(filename, "w", encoding="utf-8") as f:
        f.write(post_text + "," + username + "\n")

to this

with open(filename, "w", encoding="utf-8") as f:
    for post, author in zip(posts, authors):
        post_text = post.p.a.text.replace(",", " -")
        username = author.a.text
        f.write(post_text + "," + username + "\n")

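【Comments】:

  • Python has a built-in library named csv for writing csv files.
  • The context manager with open.... should be on the outside, with the for loop inside it. f.write(post_text + "," + username + "\n") must be inside the for loop.

Tags: python csv writing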


【Solution 1】:

Try this:

# for the post and author in posts and authors, get one of each, open the file & write it, repeat
def writer():
    with open(filename, "w", encoding="utf-8") as f:
        for post_, author_ in zip(posts, authors):
            post_text = post_.p.a.text.replace(",", " -")
            username = author_.a.text
            # with open(filename, "w", encoding="utf-8") as f:
            f.write(post_text + "," + username + "\n")

writer()

【Comments】:

  • This works, but may I ask why you put it in a def and then called it, rather than just running from with open( etc. to the last line?

【Solution 2】:

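You can open the file in append mode with the a argument when you open it for writing the second time; see the SO thread for how to do this. Alternatively, move with open(filename, "w", encoding="utf-8") as f: outside the loop.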

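The w argument overwrites any data already in the file, so on every loop iteration the previous record is replaced by the new one, and only the final record is left in the file.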
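The difference between the two modes can be seen in a small sketch. The sample rows and the helper name write_rows are made up for illustration; they are not part of the original script.

```python
import os
import tempfile

# Made-up sample rows standing in for the scraped "title,author" lines.
rows = ["first,alice", "second,bob", "third,carol"]


def write_rows(path, mode_in_loop):
    """Reopen the file on every iteration with the given mode, then read it back."""
    for row in rows:
        with open(path, mode_in_loop, encoding="utf-8") as f:
            f.write(row + "\n")
    with open(path, encoding="utf-8") as f:
        return f.read().splitlines()


with tempfile.TemporaryDirectory() as tmp:
    overwritten = write_rows(os.path.join(tmp, "w.csv"), "w")
    appended = write_rows(os.path.join(tmp, "a.csv"), "a")

print(overwritten)  # "w" truncates on every open, so only the last row survives
print(appended)     # "a" keeps what is already there, so all rows are kept
```

One caveat with append mode: if the output file already exists from an earlier run, the new rows are added after the old ones, so opening the file once with w outside the loop is usually the simpler fix.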

As one of the comments mentioned, I would also use the built-in csv library to read and write csv files. Here is its documentation.
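A minimal sketch of the csv approach, assuming the titles and usernames have already been extracted into pairs. The sample data and the helper name write_posts are hypothetical; an in-memory buffer is used here so the example runs without touching the disk.

```python
import csv
import io

# Hypothetical sample data standing in for the scraped titles and authors.
rows = [("Hello, world", "alice"), ('He said "hi"', "bob")]


def write_posts(f, rows):
    """Write a header line and one line per (title, author) pair."""
    writer = csv.writer(f)
    writer.writerow(["Title of the post", "Author of the post"])
    writer.writerows(rows)


buffer = io.StringIO(newline="")
write_posts(buffer, rows)
print(buffer.getvalue())

# With a real file, open it once, outside any loop:
# with open("redditAll.csv", "w", newline="", encoding="utf-8") as f:
#     write_posts(f, rows)
```

Because csv.writer quotes fields that contain commas or quotation marks, the .replace(",", " -") workaround on the post title should no longer be needed.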

【Comments】:

【Solution 3】:

I changed

for post, author in zip(posts, authors):
    post_text = post.p.a.text.replace(",", " -")
    username = author.a.text
    with open(filename, "w", encoding="utf-8") as f:
        f.write(post_text + "," + username + "\n")

to this

with open(filename, "w", encoding="utf-8") as f:
    for post, author in zip(posts, authors):
        post_text = post.p.a.text.replace(",", " -")
        username = author.a.text
        f.write(post_text + "," + username + "\n")

【Comments】:
