Error when selecting columns from a CSV file

【Question title】: selecting columns from csv file error
【Posted】: 2018-03-15 08:32:42
【Question】:

I have a CSV file with 20501 rows and 26 columns. I want to select the data in columns 5 and 9. This is what I have:

import csv 
filename = 'feed_data.csv'
f = open(filename)
readCSV = csv.reader(f, delimiter=',')
names = []
confidence_score = []
for row in readCSV:
    names.append(row[8])
    confidence_score.append(row[4])

This is the error:

Traceback (most recent call last):
  File "C:/Users/raady/PycharmProjects/feeder_Classification/test.py", line 10, in <module>
    for row in readCSV:
  File "C:\Users\raady\AppData\Local\Programs\Python\Python36\lib\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d in position 1009: character maps to <undefined>

How can I fix this error? I don't want to use pandas.

Is there a way to copy both columns into a single variable, instead of storing names and confidence_score separately?

Edit: I have Python 3.6 and the PyCharm environment installed. I have installed all the packages in the PyCharm environment.

Edit 2: I tried the suggested link by changing the call to f=open(filename,encoding='utf8'), but I still get the error UnicodeDecodeError: 'utf-8' codec can't decode byte 0x89 in position 934: invalid start byte. The CSV file is already saved as utf8.

Edit 3: I modified the code slightly, like this:

import csv
filename = 'feed_data.csv'
# filename = 'test.csv'

with open(filename) as csvfile:
   readCSV = csv.reader(csvfile, delimiter=',')
   data2 = []
   for row in readCSV:
       data = []
       data.append(row[14]) # appending names
       data.append(row[5])  # appending confidence
       data2.append(data)

   print(data2)

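I am attaching the two files, test.csv and feed_data.csv (downloaded directly from Kaggle). With test.csv the code works fine and I can select the column data I need, but with feed_data.csv it fails with the error mentioned above.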

【Question comments】:

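Do you know the encoding of the file in question?
I specified utf8 as the encoding, and it gave me this error: UnicodeDecodeError: 'utf-8' codec can't decode byte 0x89 in position 934: invalid start byte
Possible duplicate of UnicodeDecodeError: 'charmap' codec can't decode byte X in position Y: character maps to <undefined>
I tried specifying the encoding with f=open(filename,encoding = 'utf8'), and then I got the error mentioned in the comment above.
I am using Python 3.6. Does that information help?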

Tags: python csv multiple-columns

【Solution 1】:

Answer moved out of an edit to the question:

A small modification helped:
with open(filename, encoding='utf8', errors='ignore') as csvfile:
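The problem was in the data file itself: it was not consistently encoded in the encoding it claimed. I tried the available encodings, checking the encoding format with the help of Visual Studio Code. Some rows of data are corrupted, and with errors='ignore' the bytes that cannot be decoded are silently dropped instead of raising an error.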
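
The fix above can be sketched a little more fully. This is only an illustration, assuming the file is mostly valid UTF-8 with a few damaged bytes; the helper names find_undecodable_lines and read_column_pairs are hypothetical and do not come from the original post. The first helper reports which physical lines fail to decode, and the second collects two columns into a single list of tuples, which also covers the question about keeping both columns in one variable. It defaults to errors='replace', which marks damaged bytes with U+FFFD, whereas errors='ignore' removes them without a trace.

```python
import csv


def find_undecodable_lines(path, encoding="utf-8"):
    """Return (line_number, error) pairs for physical lines that fail to decode.

    Hypothetical helper: reads the file in binary mode so that no decoding
    happens until each line is checked explicitly.
    """
    bad = []
    with open(path, "rb") as f:
        for lineno, raw in enumerate(f, start=1):
            try:
                raw.decode(encoding)
            except UnicodeDecodeError as exc:
                bad.append((lineno, exc))
    return bad


def read_column_pairs(path, name_col, score_col,
                      encoding="utf-8", errors="replace"):
    """Collect two columns into one list of (name, score) tuples.

    Hypothetical helper: column indexes are zero-based, as in the question.
    errors="replace" substitutes U+FFFD for each undecodable byte;
    pass errors="ignore" to drop those bytes as the answer does.
    """
    pairs = []
    # newline="" is what the csv module documentation recommends for files
    # passed to csv.reader.
    with open(path, encoding=encoding, errors=errors, newline="") as f:
        for row in csv.reader(f):
            # Skip blank or short rows instead of raising IndexError.
            if len(row) > max(name_col, score_col):
                pairs.append((row[name_col], row[score_col]))
    return pairs
```

For the file in the question, a call such as read_column_pairs('feed_data.csv', 8, 4) would return the names and confidence scores together, and find_undecodable_lines('feed_data.csv') would show which lines to inspect before deciding whether dropping or replacing the bad bytes is acceptable.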

【Comments】: