Get CSV file from URL and convert it to array - Python 2.7 - Answers

【Question title】: Get CSV file from URL and convert it to array - Python 2.7
【Posted】: 2017-02-22 17:09:58
【Question description】:

I'm trying to get earthquake data and convert it to an array so that I can use the data to visualize earthquakes on a map. I'm writing this script:

import requests
import csv


def csv_to_array(a):
    b = requests.get(a)
    my_file = open(b, "rb")
    for line in my_file:
        el = [i.strip() for i in line.split(',')]
        return el

I import it into another module, and:

import csvToArray
data = csvToArray.csv_to_array(
"http://earthquake.usgs.gov/earthquakes/feed/v1.0/summary/all_hour.csv")
i = 1
while i < len(data):
    stuff = data[i].split(',')
    print stuff[1], stuff[2]
    lat = float(stuff[1])
    lon = float(stuff[2])
    x = webMercX(lon, zoom) - cx
    y = webMercY(lat, zoom) - cy
    i += 1

The other functions in the script above are not relevant here, but when I run it I get the following error.

while i < len(data):
TypeError: object of type 'NoneType' has no len()

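【Comments】:

Because print doesn't return anything :) You aren't returning anything from csv_to_array
Wow, that was a stupid mistake (I just fixed it), but now I get the following error: my_file = open(b, "rb") TypeError: coercing to Unicode: need string or buffer, Response found. It doesn't recognize the URL as a string.
Now you're only returning the first line of the file. The function ends immediately at the first return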
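
The last comment can be reproduced without a network connection. This is a minimal sketch, assuming a hypothetical CSV string `SAMPLE` in place of the downloaded text (with requests, that text would come from `response.text`, not from `open()`). The names `first_row_only` and `all_rows` are illustrative and do not appear in the original post.

```python
# Hypothetical sample standing in for the text of the downloaded CSV.
SAMPLE = (
    "time,latitude,longitude\n"
    "2017-02-22T16:00:00Z,33.5,-116.4\n"
    "2017-02-22T16:05:00Z,61.2,-150.1\n"
)


def first_row_only(text):
    # Mirrors the question's loop: return inside the loop ends the
    # function on the first iteration, so only the header row comes back.
    for line in text.splitlines():
        return [field.strip() for field in line.split(',')]


def all_rows(text):
    # Collect every row in a list and return once, after the loop.
    rows = []
    for line in text.splitlines():
        rows.append([field.strip() for field in line.split(',')])
    return rows


print(first_row_only(SAMPLE))
print(len(all_rows(SAMPLE)))
```

Returning a list is the smallest change from the question's code. A generator with yield avoids building the whole list in memory.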

Tags: python arrays csv

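【Solution 1】:

Most of the suggestions are in the code comments, but there are a couple of general ones:

Use better names
return exits the function immediately; if you use yield you can produce the rows one at a time

New code with the lessons applied: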

import requests


def csv_to_array(url): # use descriptive variable names
    response = requests.get(url)
    lines = response.text.splitlines() # you don't need an open...the data is already loaded
    for line in lines[1:]: # skip first line (has headers)
        el = [i.strip() for i in line.split(',')]
        yield el # don't return, that immediately ends the function

data = csv_to_array("http://earthquake.usgs.gov/earthquakes/feed/v1.0/summary/all_hour.csv")

for row in data: # don't use indexes, just iterate over the data
    # you already split on commas.
    print(row[1], row[2]) # again, better names
    lat = float(row[1])
    lon = float(row[2])
#     x = webMercX(lon, zoom) - cx
#     y = webMercY(lat, zoom) - cy
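
Splitting on ',' is enough for the latitude and longitude columns, but it would split any quoted field that itself contains a comma. The sketch below uses the standard library csv module instead. `SAMPLE`, its `place` column, and `parse_csv_text` are hypothetical names for illustration; in practice the text would come from `response.text`.

```python
import csv

# Hypothetical sample; the quoted last field contains a comma.
SAMPLE = (
    'time,latitude,longitude,place\n'
    '2017-02-22T16:00:00Z,33.5,-116.4,"10km NE of Anza, CA"\n'
)


def parse_csv_text(text):
    """Yield one list of fields per data row, skipping the header."""
    reader = csv.reader(text.splitlines())
    next(reader, None)  # skip the header row if there is one
    for row in reader:
        yield row


for row in parse_csv_text(SAMPLE):
    lat = float(row[1])
    lon = float(row[2])
    print(lat, lon, row[3])
```

On Python 2.7 the csv module does not support Unicode input, so passing `response.text` may fail on non-ASCII characters. This sketch does not handle that case.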

Lazy code:

import pandas as pd
pd.read_csv('http://earthquake.usgs.gov/earthquakes/feed/v1.0/summary/all_hour.csv')

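【Comments】:

For plotting on Google Maps, I would take a look at gmplot.

【Solution 2】:

You can replace your first function with a generator that iterates over the response data and yields an array for each line of the file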

import requests


def csv_to_array(a):
    response = requests.get(a)
    # you can access response's body via text attribute
    for line in response.text.split('\n'):
        yield [i.strip() for i in line.split(',')]


list(csv_to_array(some_url))
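
One detail to watch when splitting on '\n': if the text ends with a newline, the split produces a final empty string, which becomes a row containing a single empty field. The header row is also yielded like any other row. A small sketch, assuming hypothetical sample text `SAMPLE` and an illustrative helper `rows_from_text`:

```python
# Hypothetical sample text ending with a newline, as CSV files usually do.
SAMPLE = "latitude,longitude\n33.5,-116.4\n"

print(SAMPLE.split('\n'))   # ['latitude,longitude', '33.5,-116.4', '']
print(SAMPLE.splitlines())  # ['latitude,longitude', '33.5,-116.4']


def rows_from_text(text):
    # splitlines() drops the trailing empty entry; blank lines are skipped too.
    for line in text.splitlines():
        if line.strip():
            yield [field.strip() for field in line.split(',')]


print(list(rows_from_text(SAMPLE)))
```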

【Comments】: