UnicodeDecodeError: 'utf-8' codec can't decode byte 0xea in position 23: invalid continuation byte

【Question title】: UnicodeDecodeError: 'utf-8' codec can't decode byte 0xea in position 23: invalid continuation byte
【Posted】: 2019-05-02 22:36:47
【Question description】:

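I cannot get rid of this error. When reading a CSV with pandas, I keep getting "UnicodeDecodeError: 'utf-8' codec can't decode byte 0xea in position 23: invalid continuation byte".

I have tried everything I have seen online. I have converted the CSV file to several encodings, but I still cannot make this error go away. I have converted the file to UTF-8 using Sublime Text and Notepad.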

import tensorflow as tf
import pandas as pd

csv_path="C:\\Users\\diogo\\Transferências\\E0.csv"
dataset=pd.read_csv(csv_path,encoding="utf-8")

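I expect the dataset to be read correctly, but I always get this error. Also, when I change the encoding passed to the pandas reader, I still get the "'utf-8' codec can't decode" error. Is that supposed to happen? Shouldn't the error change to a different one when I change the 'utf-8' encoding? If you know of any alternative way to read a CSV into TensorFlow, that information would also be greatly appreciated. Thanks.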

【Question discussion】:

Tags: python-3.x pandas tensorflow

【Solution 1】:

In the end I found that the encoding was "cp1252", using the following code:

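with open('food.csv') as f:
    # The repr of the file object includes the encoding Python chose by default,
    # e.g. encoding='cp1252' on many Windows setups; it is not read from the file itself.
    print(f)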

I still don't know why the encoding did not change to "utf-8" when I saved the file with Sublime Text and Notepad.
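A more direct check is to test which encodings can actually decode the file's bytes. The following is only a minimal sketch: the helper name `first_working_encoding`, the candidate list, and the demo file are assumptions made for illustration. A successful decode does not prove that an encoding is the right one, only that it is possible.

```python
import os
import tempfile


def first_working_encoding(path, candidates=("utf-8", "cp1252")):
    """Return the first candidate that decodes the whole file, or None."""
    with open(path, "rb") as f:  # binary mode: no decoding happens here
        raw = f.read()
    for enc in candidates:
        try:
            raw.decode(enc)
        except UnicodeDecodeError:
            continue  # this encoding cannot represent the bytes; try the next
        return enc
    return None


# Demo on a throwaway file written as cp1252 (hypothetical sample data).
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.csv")
    with open(demo_path, "w", encoding="cp1252") as f:
        f.write("team,city\nAcadêmica,Coimbra\n")
    detected = first_working_encoding(demo_path)
    print(detected)
```

The returned name could then be passed on, for example as `pd.read_csv(csv_path, encoding=detected)`.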

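【Discussion】:

【Solution 2】:

This does not require importing any modules; you can simply reopen the file, following the steps described in the question.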

with open('some_file.csv', 'rb') as file:  # binary mode, so read() returns bytes
    data = file.read()
    print(data)  # should print a (probably long) bytes literal, b'...'
    print(data.decode('utf-8'))  # decode to str, which removes the b'' prefix
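To illustrate what the error message in the question means, here is a minimal sketch. It assumes the offending byte is an "ê" stored as Windows-1252 (cp1252), where that character is the single byte 0xEA. The sample string is only a stand-in and is not taken from the asker's file.

```python
# In cp1252, "ê" is the single byte 0xEA. In UTF-8, 0xEA announces a
# three-byte sequence, so the bytes after it must be continuation bytes.
raw = "Transferências".encode("cp1252")

err = None
try:
    raw.decode("utf-8")
except UnicodeDecodeError as exc:
    err = exc  # keep a reference; `exc` is cleared when the block ends
    print(exc)

# Decoding with the encoding the bytes were written in succeeds.
text = raw.decode("cp1252")
print(text)
```

The same bytes decode cleanly once the matching encoding is named, which is why changing the text editor's save settings only helps if the file is actually rewritten in the encoding that the reader expects.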

【Discussion】:

【Solution 3】:

Try using

open(filepath_, 'rb')

instead of

open(filepath_)

This worked for me on Python 3.8.5.
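Opening in binary mode avoids the error because no decoding takes place, but the bytes still have to be decoded before they can be treated as text. A minimal sketch of that follow-up step, assuming the data is cp1252 and using an in-memory stand-in for the bytes that `open(filepath_, 'rb').read()` would return:

```python
import csv
import io

# Hypothetical stand-in for: raw = open(filepath_, 'rb').read()
raw = "team,city\nAcadêmica,Coimbra\n".encode("cp1252")

# Decode explicitly with the assumed encoding, then parse the text as CSV.
text = raw.decode("cp1252")
rows = list(csv.reader(io.StringIO(text)))
print(rows)
```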

【Discussion】: