【Question title】: How do I write this data into a csv file
【Posted】: 2021-02-05 05:57:13
【Question description】:

I'm working on a Python exercise and I want to put the author, tags, and text into a csv file, but I don't know how to go about it.

import requests
import bs4
import pandas as pd
import csv
res = requests.get('https://quotes.toscrape.com')

soup = bs4.BeautifulSoup (res.text,'lxml')
#soup

author = soup.select('.col-md-8')
example = author[1]

example.select('.text')
for item1 in example.select(".text"):
    print(item1.text)

example.select('.tag')
for item2 in example.select(".tag"):
    print(item2.text)

example.select ('div span')
for item3 in example.select(".author"):
   print(item3.text)

file_to_output = open('QuotesToScrape.csv','w',newline='')
csv_writer = csv.writer(file_to_output,delimiter=',')
csv_writer.writerow(['Text','Tag','Author'])
csv_writer.writerows([[item3.text,item2.text,item1.text],['4','5','6']])
file_to_output.close()


【Question comments】:

    Tags: python csv web-scraping


    【Solution 1】:
    import pandas as pd
    
    data = {
        'text': ['text 1', 'text 2', 'text 3'],
        'tag': [1, 2, 3],
        'author': ['author 1', 'author 2', 'author 3']
    }
    
    df = pd.DataFrame(data=data)
    

    Output:

    | text   | tag | author   |
    | ------ | --- | -------- |
    | text 1 | 1   | author 1 |
    | text 2 | 2   | author 2 |
    | text 3 | 3   | author 3 |
    

    Then you can use the to_csv() method:

    df.to_csv(
        'QuotesToScrape.csv',
        sep = ',',
        index = False
    )
    

    【Comments】:

      【Solution 2】:

      You need to store the text, tag, and author values in lists, and then write them to the csv. You can use zip to group them into rows: (text1, tag1, author1), (text2, tag2, author2), and so on.
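As a small illustration of the grouping described above, zip pairs up the items at the same position in each list and produces one tuple per row. The lists below are placeholder values, not scraped data. Note that zip stops at the shortest input, so the three lists need the same length and order for the rows to line up.

```python
# Placeholder lists standing in for the scraped values (hypothetical data).
texts = ["text 1", "text 2", "text 3"]
tags = ["tag 1", "tag 2", "tag 3"]
authors = ["author 1", "author 2", "author 3"]

# zip yields one (text, tag, author) tuple per position.
rows = list(zip(texts, tags, authors))

for row in rows:
    print(row)
```

A list of tuples like this can be passed directly to csv_writer.writerows().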

      import requests
      import bs4
      import pandas as pd
      import csv
      res = requests.get('https://quotes.toscrape.com')
      
      soup = bs4.BeautifulSoup (res.text,'lxml')
      #soup
      
      author = soup.select('.col-md-8')
      example = author[1]
      
      
      # for item1 in example.select(".text"):
      #     print(item1.text)
      
      # for item2 in example.select(".tag"):
      #     print(item2.text)
      
      # for item3 in example.select(".author"):
      #     print(item3.text)
      
      text_tag_author = zip([i.text.replace(';', '') for i in example.select(".text")], 
                            [i.text.replace(';', '') for i in example.select(".tag")], 
                            [i.text.replace(';', '') for i in example.select(".author")])
      
      
      file_to_output = open('QuotesToScrape.csv','w',newline='')
      csv_writer = csv.writer(file_to_output,delimiter=',')
      csv_writer.writerow(['Text','Tag','Author'])
      # for each_row in text_tag_author:
      #     print(each_row)
      csv_writer.writerows(text_tag_author)
      # csv_writer.writerows([[item3.text,item2.text,item1.text],['4','5','6']])
      file_to_output.close()
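
One thing to watch for with the zip approach: on this page a quote may carry several tags, so a flat list of every .tag element can be longer than the list of quotes, and zip would then pair tags with the wrong quotes. A possible alternative, sketched below, is to build one row per quote and join that quote's tags into a single cell. The records are made-up sample data, and the build_rows helper and the "; " separator are assumptions for illustration; in the real script each record would come from one quote block on the page.

```python
import csv
import io

# Made-up sample records: one dict per quote (hypothetical data).
records = [
    {"text": "Quote one, with a comma", "tags": ["life", "love"], "author": "Author A"},
    {"text": "Quote two", "tags": ["humor"], "author": "Author B"},
]


def build_rows(records, tag_separator="; "):
    """Return one [text, tags, author] row per record, joining the tags into one cell."""
    return [
        [rec["text"], tag_separator.join(rec["tags"]), rec["author"]]
        for rec in records
    ]


rows = build_rows(records)

# Write to an in-memory buffer so the sketch runs without touching the disk;
# swap in open('QuotesToScrape.csv', 'w', newline='') to write a real file.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["Text", "Tag", "Author"])
writer.writerows(rows)

print(buffer.getvalue())
```

Keeping all of a quote's fields together in one record means the columns cannot drift apart, whatever the number of tags.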
      

      【Comments】:

      • @mrsir217 Hey, great! If it solved your problem, please don't forget to accept the answer so that it doesn't pop up again.
      【Solution 3】:

      Working code example:

      import csv

      import requests
      from bs4 import BeautifulSoup
      
      ## Get the HTML from the page at your URL
      request = requests.get('https://quotes.toscrape.com')
      
      ## Create a BS object from the request text
      soup = BeautifulSoup(request.text, features="html.parser")
      
      ## Every block of text is a div with class="quote", so
      ## let's start by pulling all of those
      quotes = soup.find_all("div", {"class": "quote"})
      
      ## Before we iterate through all the quotes, let's make a new csv.
      ## newline="" is what the csv module expects for files it writes to
      with open("test.csv", "w", newline="", encoding="utf-8") as f:
          writer = csv.writer(f)
      
          ## Let's write the headers into the csv
          writer.writerow(["quote", "author"])
      
          ## Now let's iterate through all of those quotes
          for each in quotes:
      
              ## Now that we have each broken out, you can find the individual tags you need
              quote = each.find("span", {"class": "text"}).text
      
              ## Each author is in a small tag, so this one is easy
              author = each.find("small").text
      
              ## Write each quote/author pair; csv.writer quotes fields that contain commas
              writer.writerow([quote, author])
      
      ## The with block closes the csv file for us
      

      I added comments for clarity; if you have any questions, let me know.
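
One detail worth checking when writing quotes to a csv: the quote text itself can contain commas. The sketch below uses a made-up quote and author to compare a line built by string formatting with one produced by csv.writer, then reads both back with csv.reader to count the fields.

```python
import csv
import io

# Made-up sample values (hypothetical data); note the comma inside the quote.
quote = "To be, or not to be"
author = "Some Author"

# Line built by plain string formatting: the inner comma is not escaped.
naive_line = f"{quote},{author}\n"

# Same values written through csv.writer, which quotes the field for us.
buffer = io.StringIO()
csv.writer(buffer).writerow([quote, author])
csv_line = buffer.getvalue()

# Parse both lines back and compare how many fields come out.
naive_fields = next(csv.reader([naive_line]))
csv_fields = next(csv.reader([csv_line]))

print(len(naive_fields))  # 3 fields: the quote was split at its comma
print(len(csv_fields))    # 2 fields: quote and author
```

Going through the csv module keeps each quote in a single column regardless of the punctuation it contains.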

      【Comments】:
