Python CSV reader to skip 9 headers: answers

【Question title】: Python CSV reader to skip 9 headers
【Posted】: 2015-02-19 09:42:25
【Question】:

import os
import csv

def get_file_path(filename):
    # Build an absolute path relative to the current working directory
    file_path = os.path.join(os.getcwd(), filename)
    print(file_path)
    return file_path

path = get_file_path('Invoice-Item.csv')

def read_csv(filepath):
    with open(filepath, 'r') as csvfile:
        reader = csv.reader(csvfile)
        for i in range(0, 9):
            next(reader, None)
        for row in reader:
            print(row[0])

read_csv(path)

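I'm looking for a technique to skip the 9 header lines other than the range function. Any help would be appreciated. Below is a sample of the CSV file: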

Summary Journal Entry,JE-00000060
Journal Entry Date,28/02/2015
Accounting Period,Feb-15
Accounting Period Start,1/02/2015
Accounting Period End,28/02/2015
Included Transaction Types,Invoice Item
Included Time Period,01/02/2015-09/02/2015
Journal Run,JR-00000046
Segments,
,
Customer Account Number,Transaction Amount
210274174,545.45
210274174,909.09
210274174,909.09
210274174,909.09
210274174,909.09

【Question comments】:

Tags: python csv python-3.x

【Solution 1】:

You can use itertools.islice() to skip a fixed number of rows:

from itertools import islice

next(islice(reader, 9, 9), None)  # consume and discard the first 9 rows
for row in reader:
    print(row[0])

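The islice() object is told to skip 9 rows and then stop immediately without yielding any further results. It is itself an iterator, so you still need to call next() on it.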
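To see the islice() idiom in isolation, here is a minimal sketch that can be run as-is. The helper name skip_rows and the three-line sample are made up for illustration; they are not part of the question's file.

```python
import csv
import io
from itertools import islice


def skip_rows(iterator, n):
    """Advance the iterator by n items (fewer if it runs out first)."""
    # islice(iterator, n, n) yields nothing, but reading it consumes n items.
    next(islice(iterator, n, n), None)


# Hypothetical sample: three preamble lines, then a header and two data rows.
sample = "h1,a\nh2,b\nh3,c\nid,amount\n1,10.5\n2,20.0\n"

reader = csv.reader(io.StringIO(sample))
skip_rows(reader, 3)
rows = list(reader)
print(rows)
```

Because the reader is advanced in place, any later loop over it starts at the fourth line.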

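If you want to skip rows until an "empty" row, that requires a different approach. You have to inspect every row and stop when you encounter a row that contains only empty cells: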

for row in reader:
    if not any(row):  # only empty cells or no cells at all
        break

for row in reader:
    print(row[0])

A demonstration of the latter approach:

>>> import csv
>>> import io
>>> sample = '''\
... Summary Journal Entry,JE-00000060
... Journal Entry Date,28/02/2015
... Accounting Period,Feb-15
... Accounting Period Start,1/02/2015
... Accounting Period End,28/02/2015
... Included Transaction Types,Invoice Item
... Included Time Period,01/02/2015-09/02/2015
... Journal Run,JR-00000046
... Segments,
... ,
... Customer Account Number,Transaction Amount
... 210274174,545.45
... 210274174,909.09
... 210274174,909.09
... 210274174,909.09
... 210274174,909.09
... '''
>>> with io.StringIO(sample) as csvfile:
...     reader = csv.reader(csvfile)
...     for row in reader:
...         if not [c for c in row if c]:
...             break
...     for row in reader:
...         print(row[0])
... 
Customer Account Number
210274174
210274174
210274174
210274174
210274174
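The same "skip until the blank row" idea can probably also be written with itertools.dropwhile(), which is close to the WHILE-style condition asked about in the comments. This is a swapped-in alternative, not the code from the answer; the function name rows_after_blank and the shortened sample are assumptions for illustration.

```python
import csv
import io
from itertools import dropwhile

# Shortened, hypothetical version of the sample file from the question.
sample = """\
Journal Run,JR-00000046
Segments,
,
Customer Account Number,Transaction Amount
210274174,545.45
210274174,909.09
"""


def rows_after_blank(reader):
    """Return an iterator over the rows that follow the first all-empty row."""
    remaining = dropwhile(any, reader)  # drop rows while they contain any data
    next(remaining, None)               # discard the all-empty separator row
    return remaining


reader = csv.reader(io.StringIO(sample))
data = list(rows_after_blank(reader))
for row in data:
    print(row[0])
```

If the file contains no all-empty row, nothing is returned, which matches the behaviour of the explicit loop: the reader is simply exhausted.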

Note that you want to leave newline handling to csv.reader; set newline='' when opening your file:

with open(filepath, 'r', newline='') as csvfile:
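A small sketch of why newline='' matters: it writes one row whose first field contains an embedded line break, then reads it back with and without newline=''. The file name demo.csv and the field contents are invented for this illustration.

```python
import csv
import os
import tempfile

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, 'demo.csv')  # hypothetical file name

    # Write a single row whose first field contains an embedded \r\n.
    with open(path, 'w', newline='') as f:
        csv.writer(f).writerow(['first line\r\nsecond line', '909.09'])

    # newline='' hands the raw line endings to csv.reader unchanged.
    with open(path, 'r', newline='') as f:
        untranslated = next(csv.reader(f))

    # Default text mode translates \r\n to \n before csv.reader sees it.
    with open(path, 'r') as f:
        translated = next(csv.reader(f))

print(repr(untranslated[0]))
print(repr(translated[0]))
```

For the file in the question, which has no quoted multi-line fields, both modes should give the same rows; newline='' mainly protects data that contains line breaks inside quoted cells.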

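【Comments】:

Is it possible to use a condition like WHILE, since we only want to read part of the file? That is, we want to keep reading header lines until we reach the blank line.
@RicardLe You don't have a blank line; you have a row with empty cells; the comma still matters. However, you didn't ask for skipping an arbitrary number of rows; that is a different question.
I have tried the suggestion above, but it doesn't skip the 9 headers. Am I missing something in between?
@RicardLe: Ah, my mistake, you are using Python 3. I'll correct it; filter(None, ...) produces an iterable there, not a list.
For some reason it didn't work. It doesn't skip the headers.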

【Solution 2】:

If you are using numpy, take a look at the skip_header parameter of genfromtxt (http://docs.scipy.org/doc/numpy/user/basics.io.genfromtxt.html):

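import numpy as np
# For the sample file above, skip_header=11 would also skip the "," row and the column-header row
r = np.genfromtxt(filepath, skip_header=9, names=['account', 'amount'], delimiter=',')
# genfromtxt returns a structured array, so fields are accessed by name, not as attributes
print(r['account'][0], r['amount'][0])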

【Comments】:

【Solution 3】:

If you are considering pandas, read_csv makes reading the file very simple:

import pandas as pd

data = pd.read_csv(filename, skiprows=9)

【Comments】: