Saving Pokémon from GitHub to CSV using Python

[Question title]: Saving pokemons to CSV from GitHub using python
[Posted]: 2021-04-29 17:14:22
[Question]:

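I'm new to the Python world and would like to know how to scrape data from GitHub into a CSV file, for example https://gist.github.com/simsketch/1a029a8d7fca1e4c142cbfd043a68f19#file-pokemon-csv

I'm trying to use the code below, but it isn't working very well. There must surely be a simpler way to do this.

Thank you in advance!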

from bs4 import BeautifulSoup
import requests
import csv

url = 'https://gist.github.com/simsketch/1a029a8d7fca1e4c142cbfd043a68f19'

r = requests.get(url)
soup = BeautifulSoup(r.text, 'html.parser')

pokemon_table = soup.find('table', class_= 'highlight tab-size js-file-line-container')


for pokemon in pokemon_table.find_all('tr'):
        name = [pokemon.find('td', class_= 'blob-code blob-code-inner js-file-line').text]


with open('output.csv', 'w') as csvfile:
    writer = csv.writer(csvfile)
    writer.writerows(name)

[Question comments]:

Tags: python csv web-scraping beautifulsoup

[Solution 1]:

If you don't mind using the "raw" version of the CSV file, the following code works:

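import requests

# Raw URL of the gist's CSV file (not the HTML page that renders it).
url = "https://gist.githubusercontent.com/simsketch/1a029a8d7fca1e4c142cbfd043a68f19/raw/bd584ee6c307cc9fab5ba38916e98a85de9c2ba7/pokemon.csv"

response = requests.get(url)
response.raise_for_status()  # fail on an HTTP error instead of saving an error page

# newline="" writes the downloaded line endings unchanged on every platform.
with open("output.csv", "w", newline="", encoding="utf-8") as file:
    file.write(response.text)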

The URL you used does not link to the raw version of the CSV, but the following one does:

https://gist.githubusercontent.com/simsketch/1a029a8d7fca1e4c142cbfd043a68f19/raw/bd584ee6c307cc9fab5ba38916e98a85de9c2ba7/pokemon.csv
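If the goal is to work with rows rather than copy the file as-is, the downloaded text can also be passed through the standard `csv` module before saving. The sketch below is only one way to do that; `save_csv_text` is a hypothetical helper name, and the sample text in the usage note is made up rather than taken from the gist.

```python
import csv
import io


def save_csv_text(text, path):
    """Parse CSV text, write the rows to `path`, and return the row count.

    Hypothetical helper for illustration; not part of the original answer.
    """
    # newline="" lets the csv module handle line endings itself.
    rows = list(csv.reader(io.StringIO(text, newline="")))
    with open(path, "w", newline="", encoding="utf-8") as f:
        csv.writer(f).writerows(rows)
    return len(rows)
```

With `requests`, a call might look like `save_csv_text(requests.get(raw_url).text, "output.csv")`, where `raw_url` stands for the raw link shown in this answer.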

Edit 1: To clarify, you can reach the raw version of the CSV file by clicking the "Raw" button at the top right of the CSV file shown at the link you provided.

Edit 2: Also, it looks like the following URL works too, and it is shorter and easier to build from the original URL: https://gist.githubusercontent.com/simsketch/1a029a8d7fca1e4c142cbfd043a68f19/raw/pokemon.csv
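The shorter URL in Edit 2 can be derived from the gist page URL mechanically. Here is a minimal sketch of that rewrite, assuming the pattern seen in this answer (host `gist.githubusercontent.com`, then `/raw/<filename>` appended to the gist path) holds for other gists too; that is an observation from this one example, not a documented guarantee. `gist_raw_url` is a hypothetical helper name.

```python
from urllib.parse import urlsplit, urlunsplit


def gist_raw_url(gist_url, filename):
    """Build a raw-file URL from a gist page URL (hypothetical helper).

    Assumes the URL pattern shown in Edit 2 of this answer.
    """
    parts = urlsplit(gist_url)
    # Drop any query string or "#file-..." fragment and append /raw/<filename>.
    path = parts.path.rstrip("/") + "/raw/" + filename
    return urlunsplit((parts.scheme, "gist.githubusercontent.com", path, "", ""))
```

For example, passing the URL from the question together with `"pokemon.csv"` yields the short raw URL quoted in Edit 2.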

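[Comments]:

Hello Sherlock, it all turned out to be easier than I thought :D As for the raw CSV, yes, I knew about that, but I'm learning how to scrape, so I went the hard way :) Thank you very much! :)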
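For anyone who also wants to practise the scraping route, two things go wrong in the question's loop: `name` is reassigned on every iteration, so only the last row survives, and `writerows` then receives a list holding a single string, which it writes as one character per field. A minimal sketch of a fix collects every line first and lets `csv.reader` split each line into fields. Assumptions, stated loudly: each file line sits in a `td` whose class list includes `js-file-line` (the question's selectors imply this, but GitHub's markup may differ or change); `SAMPLE_HTML` and its rows are made up for illustration; the standard library's `html.parser` is used in place of BeautifulSoup so the sketch runs without extra installs; `LineCellParser` and `html_lines_to_rows` are hypothetical names.

```python
import csv
from html.parser import HTMLParser

# Made-up sample that mimics the table structure assumed in the question.
SAMPLE_HTML = """
<table class="highlight tab-size js-file-line-container">
<tr><td class="blob-num js-line-number"></td><td class="blob-code blob-code-inner js-file-line">Number,Name,Type</td></tr>
<tr><td class="blob-num js-line-number"></td><td class="blob-code blob-code-inner js-file-line">1,Bulbasaur,Grass</td></tr>
</table>
"""


class LineCellParser(HTMLParser):
    """Collect the text of every <td> whose classes include 'js-file-line'."""

    def __init__(self):
        super().__init__()
        self.lines = []
        self._in_cell = False
        self._buf = []

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == "td" and "js-file-line" in classes:
            self._in_cell = True
            self._buf = []

    def handle_data(self, data):
        if self._in_cell:
            self._buf.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._in_cell:
            self.lines.append("".join(self._buf))
            self._in_cell = False


def html_lines_to_rows(html):
    """Return the file lines found in `html` as lists of CSV fields."""
    parser = LineCellParser()
    parser.feed(html)
    parser.close()
    # Each collected line is one CSV record; csv.reader splits it into fields.
    return list(csv.reader(parser.lines))


rows = html_lines_to_rows(SAMPLE_HTML)
```

The resulting `rows` can then be saved with `csv.writer(f).writerows(rows)` on a file opened with `newline=""`. The same collect-then-write structure applies when the cells come from BeautifulSoup's `find_all` instead.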