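【Posted】:2021-10-15 18:53:29
【Question】:
I have the following code in Python 3.9. It works, but I get a junk character at the start of my UTF-8 encoded text file, which causes the first character of the first line to be read incorrectly. How can I strip any junk characters from the start of the UTF-8 file I am reading?
The code is as follows: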
actions = {'#': 'comment', 'A': 'action', 'T': 'text for polly', 'F': 'filename'}
action = "#"
poly_text_received = False
script_line = "none"
line_cnt = 0
with open(input("Enter the script filename: "), 'r') as script_file:
    for line in script_file:
        line_cnt = line_cnt + 1
        line = line.strip()
        action = actions.get(line[0])
        if action == 'comment':  # Action is a comment
            line = line[1:].lstrip(':')
            print(f'Ignoring comment: \n'
                  f' {line}')
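Here is a sample of the input file. There is more to the code; it always looks at the first character of each line and performs a specific action based on that character: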
#Preceed each comment with "#"
#
A:Start of video (show design with component explorer open)
T:Once you identify sets of identical components, you can create your physical reuse source circuit.
F:Start.mp3
#
A: Circle the IO_Port Groups in Component Explorer
T:This design shows four groups of identical components.
F: Circle_IO_Port_Groups.mp3
#
【Comments】:
- Maybe open(input("Enter the script filename: "), 'r', encoding='utf-8-sig')?
- That worked, thanks!
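
The junk character is most likely a UTF-8 byte order mark (BOM) that some editors write at the start of a file. The following is a minimal sketch of why the suggested encoding='utf-8-sig' helps, assuming that is the cause. The helper first_actions and the temporary sample file are hypothetical and exist only for this illustration; the actions dictionary is copied from the question.

```python
import codecs
import os
import tempfile

actions = {'#': 'comment', 'A': 'action', 'T': 'text for polly', 'F': 'filename'}

def first_actions(path, encoding):
    """Hypothetical helper: look up the action for the first character of each non-empty line."""
    result = []
    with open(path, 'r', encoding=encoding) as script_file:
        for line in script_file:
            line = line.strip()
            if not line:
                continue  # skip blank lines so line[0] cannot raise IndexError
            result.append(actions.get(line[0]))
    return result

# Write a small sample file that begins with a UTF-8 BOM (assumed cause of the junk character).
with tempfile.NamedTemporaryFile('wb', suffix='.txt', delete=False) as tmp:
    tmp.write(codecs.BOM_UTF8 + '#Comment\nA:Start of video\n'.encode('utf-8'))
    sample_path = tmp.name

# 'utf-8' keeps the BOM as the character U+FEFF; 'utf-8-sig' drops it if present.
with_bom = first_actions(sample_path, 'utf-8')
without_bom = first_actions(sample_path, 'utf-8-sig')
os.remove(sample_path)

print(with_bom)
print(without_bom)
```

With plain 'utf-8', the first line starts with U+FEFF, which strip() does not remove, so the lookup for the first line fails. With 'utf-8-sig', the BOM is skipped when present, and files without a BOM are read unchanged.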