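[Posted]: 2015-07-23 15:52:47
[Question]:
I am using Python 2.7.5 to read a CSV file (input.csv), skip certain lines, and write the result to a new CSV file (output.csv). I have tried many different approaches, but every one of them results in the operating system (Red Hat and Mac OS X) identifying the output file as "data" rather than "ASCII text".
input.csv: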
cat -v input.csv (truncated)
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Windows\Spooler,yes,1^M
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Windows\AppInit_DLLs,no,A^M
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Session Manager,seed,0x714b3c99^M
file input.csv
input.csv: data
script.py (latest attempt):
import io

input_file = '/Users/spork_user/Desktop/input.csv'
output_file = '/Users/spork_user/Desktop/output.csv'

with io.open(input_file, 'r', newline='\r\n') as infile, io.open(output_file, 'w', newline='\n') as outfile:
    for line in infile:
        # Filter out lines I don't want, for example:
        if "Does not exist" in line:
            continue
        # Verify how the line appears to Python when it is read in
        print repr(line)
        # Without the rstrip, I get a blank line between each line in my output, and it's still seen as 'data'
        outfile.write(unicode(line.rstrip('\r\n') + '\n'))
Running it:
python script.py (truncated)
u'HKEY_LOCAL_MACHINE\\SOFTWARE\\Microsoft\\Windows NT\\CurrentVersion\\Windows\\Spooler,yes,1\r\n'
u'HKEY_LOCAL_MACHINE\\SOFTWARE\\Microsoft\\Windows NT\\CurrentVersion\\Windows\\AppInit_DLLs,no,A\r\n'
u'HKEY_LOCAL_MACHINE\\SYSTEM\\CurrentControlSet\\Control\\Session Manager,seed,0x714b3c99\r\n'
output.csv:
cat -v output.csv (truncated)
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Windows\Spooler,yes,1
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Windows\AppInit_DLLs,no,A
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Session Manager,seed,0x714b3c99
file output.csv
output.csv: data
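No matter which combination of open read/write flags or newline stripping I try, the output.csv file always ends up being identified as "data".
However, if I write a simplified script with hard-coded output, it gives me the ASCII file type I am looking for:
simplified.py: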
import io

output_file = '/Users/spork_user/Desktop/simple_output.csv'

with io.open(output_file, 'w', newline='\n') as outfile:
    outfile.write(unicode('hello\n'))
    outfile.write(unicode('this\n'))
    outfile.write(unicode('works\n'))
Running it:
python simplified.py
<no output>
simple_output.csv:
cat -v simple_output.csv
hello
this
works
file simple_output.csv
simple_output.csv: ASCII text
How can I get the operating system to see output.csv as ASCII text, like simple_output.csv?
Thanks
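Since file already reports input.csv itself as "data", one thing that may be worth checking is whether the truncated part of the file contains bytes that are not plain ASCII text, such as NUL or other control bytes, which the script would copy through unchanged. This is only a guess; nothing shown above confirms it. The sketch below uses a hypothetical helper, find_suspect_bytes (not part of the original script), to list the offsets of such bytes:

```python
import io


def find_suspect_bytes(path, limit=10):
    """Return up to `limit` (offset, byte value) pairs for bytes that are
    not printable ASCII, tab, LF, or CR. Hypothetical helper for diagnosis."""
    # Assumption: "plain ASCII text" means 0x20-0x7E plus tab, LF and CR.
    allowed = set(range(0x20, 0x7F)) | {0x09, 0x0A, 0x0D}
    with io.open(path, 'rb') as f:
        # bytearray yields integers on both Python 2 and Python 3
        data = bytearray(f.read())
    hits = []
    for offset, value in enumerate(data):
        if value not in allowed:
            hits.append((offset, value))
            if len(hits) >= limit:
                break
    return hits
```

Calling print(find_suspect_bytes(input_file)) on the real input would show whether any such bytes are present; an empty list would suggest the cause lies elsewhere.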
[Discussion]:
Tags: python file unix csv ascii