【Question title】:No field names CSV Reader Python UTF-8
【Posted】:2013-04-17 09:55:15
【Question】:

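I am trying to open and read a CSV file that has no field names. From the research I have done, I am fairly sure it is encoded in UTF-8. My CSV is formatted like this: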

1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, 
2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, 

I open and read it with the following:

def parseCSVCounter(csv_file):
    with codecs.open(csv_file, "r", "utf-8-sig", "strict", -1) as f:
        f = str(f)
        relayreader = csv.reader(f, delimiter=',')
        for row in relayreader:
            print(row)

            try:
                #row[0] = unicode(row[0], 'latin-1')
                counter(row)
                print('starting row..')

            except UnicodeDecodeError, e:
                print('something went wrong1')
                print e

            except Exception, e:
                print('something went wrong')
                print e

This produces:

Starting Command..
['<']
something went wrong
invalid literal for int() with base 10: '<'
['o']
something went wrong
invalid literal for int() with base 10: 'o'
........
starting row..
['9']
starting row..
['3']
starting row..
['8']
starting row..
['2']
starting row..
['8']
starting row..
['>']
something went wrong
invalid literal for int() with base 10: '>'

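I trimmed the output to make my point. It looks as if field names are being generated for me automatically. With csv.DictReader(fieldnames = 'foo') I can specify the field names in order. How do I get csv.reader() to ignore the missing field names?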

【Comments】:

  • >>> x = open('aaa', 'w')
    >>> str(x)
    "<open file 'aaa', mode 'w' at 0x01FC9860>"

Tags: python csv utf-8 field


【Solution 1】:

You do not need to call str(f); use the file object directly:

with codecs.open(csv_file, "r", "utf-8-sig", "strict") as f:
    relayreader = csv.reader(f, delimiter=',')
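For readers on Python 3, the same idea can be sketched as follows: hand the open file object (or any iterable of text lines) straight to csv.reader, which never treats the first row as a header. The helper name parse_rows, the in-memory sample, and the decision to drop the blank trailing field are illustrative assumptions, not part of the original code.

```python
import csv
import io


def parse_rows(f):
    """Yield each CSV row as a list of ints (hypothetical helper).

    `f` is any iterable of text lines, such as an open file object.
    """
    for row in csv.reader(f, delimiter=','):
        # The sample data ends each line with ", ", so skip blank fields
        # (an assumption about how the trailing field should be treated).
        yield [int(cell) for cell in row if cell.strip()]


# With a real file one would use something like:
#   with open(csv_file, newline='', encoding='utf-8-sig') as f: ...
# An in-memory file keeps this sketch self-contained.
sample = io.StringIO("1,0,0,0, \n2,0,0,0, \n")
rows = list(parse_rows(sample))
print(rows)
```

Every line of the input comes back as a data row, so no field names are needed.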

You are trying to read the output of str(f) as if it were the CSV file, but that output is just a string of this form:

<open file '/path/to/file', mode 'rb' at 0x105f10d20>

You can see this in your error output: it spells out <, o, and so on, all the way through the digits of the memory address to the closing >.
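The one-character rows follow from how csv.reader consumes its argument: it iterates over it, and iterating over a string yields single characters, each of which is then parsed as a separate "line". A small sketch in Python 3, where fake_repr is a made-up stand-in for what str(f) returns:

```python
import csv

# Stand-in for the result of str(f); the path and address are invented.
fake_repr = "<open file 'x.csv', mode 'rb' at 0x0>"

# csv.reader iterates over its argument. Iterating a str yields characters,
# so every character becomes its own row.
rows = list(csv.reader(fake_repr))
first_rows = rows[:2]
last_row = rows[-1]
print(first_rows, last_row)
```

The first rows are ['<'] and ['o'] and the last is ['>'], matching the pattern in the output shown in the question.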

Note that the utf-8-sig codec handles a UTF-8 encoded BOM at the start of the file, but unless you expect a BOM to be present, the regular utf-8 codec will do just fine.
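The difference between the two codecs can be checked on a few bytes. This is a minimal Python 3 sketch; the sample bytes are made up for illustration.

```python
import codecs

with_bom = codecs.BOM_UTF8 + b"1,0,0"   # b'\xef\xbb\xbf1,0,0'
without_bom = b"1,0,0"

# Plain utf-8 keeps the BOM as a U+FEFF character in front of the first field.
plain = with_bom.decode("utf-8")

# utf-8-sig strips a leading BOM if there is one ...
stripped = with_bom.decode("utf-8-sig")

# ... and behaves like plain utf-8 when there is none.
unchanged = without_bom.decode("utf-8-sig")

print(repr(plain), repr(stripped), repr(unchanged))
```

So utf-8-sig is the safer choice when the presence of a BOM is uncertain, since it decodes both variants to the same text.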
