[Posted]: 2018-12-19 20:15:30
[Question]:
This is what I currently use to open the various files that users have:
# check the encoding quickly
with open(file, 'rb') as fp:
    start_data = fp.read(4)
    if start_data.startswith(b'\x00\x00\xfe\xff'):
        encoding = 'utf-32'
    elif start_data.startswith(b'\xff\xfe\x00\x00'):
        encoding = 'utf-32'
    elif start_data.startswith(b'\xfe\xff'):
        encoding = 'utf-16'
    elif start_data.startswith(b'\xff\xfe'):
        encoding = 'utf-16'
    else:
        encoding = 'utf-8'

# open the file with that encoding
with open(file, 'r', encoding=encoding) as fp:
    do_something()
Is there a better way than the above to correctly open a UTF file whose exact encoding is unknown?
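
For comparison, here is one possible variant of the same check, written as a rough sketch rather than a definitive answer. It uses the BOM constants from the standard `codecs` module instead of hand-written byte strings, and it also recognizes the UTF-8 BOM, which the plain 'utf-8' codec would otherwise leave in the decoded text as a leading U+FEFF character. The helper name `detect_bom_encoding` and its `default` parameter are made up for illustration. Like the code above, it assumes that a file without a BOM is UTF-8, which may not hold for every file.

```python
import codecs

# Checked in this order on purpose: the UTF-32 LE BOM (ff fe 00 00)
# begins with the UTF-16 LE BOM (ff fe), so UTF-32 must be tested first.
_BOMS = (
    (codecs.BOM_UTF32_BE, 'utf-32'),
    (codecs.BOM_UTF32_LE, 'utf-32'),
    (codecs.BOM_UTF8, 'utf-8-sig'),
    (codecs.BOM_UTF16_BE, 'utf-16'),
    (codecs.BOM_UTF16_LE, 'utf-16'),
)


def detect_bom_encoding(path, default='utf-8'):
    """Return a codec name based on the file's BOM (hypothetical helper).

    Falls back to `default` when no BOM is found; that fallback is an
    assumption, not something the file contents can confirm.
    """
    with open(path, 'rb') as fp:
        start_data = fp.read(4)
    for bom, encoding in _BOMS:
        if start_data.startswith(bom):
            return encoding
    return default
```

It would be used as `open(file, 'r', encoding=detect_bom_encoding(file))`. The 'utf-16', 'utf-32' and 'utf-8-sig' codecs all consume the BOM while decoding, so the text read back does not start with it.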
[Comments]:
Tags: python csv unicode byte-order-mark