[Posted]: 2018-04-21 14:49:07
[Question]:
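I wrote the following code to take a large csv file and split it into multiple csv files based on the value in one particular column. Some fields in the original csv file are strings, and they have quotes around them.
For example: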
Field1,Field2,Field3,Field4
1,2,"red",3
1,4,"red",4
3,4,"blue",4
and so on.
My code splits the file into separate csv files based on Field4.
My output looks like this:
3.csv
Field1,Field2,Field3,Field4
1,2,red,3
4.csv
Field1,Field2,Field3,Field4
1,4,red,4
3,4,blue,4
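I want my output to keep the quotes around the strings in Field3. The files are fed into a piece of software that only works when strings have quotes around them, which is annoying.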
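As a minimal sketch of why the quotes go missing (the sample string below just mirrors the example rows above and is not the real file): `csv.reader` consumes the surrounding quotes while parsing, so every field comes back as a plain string, and joining those strings with commas cannot restore them.

```python
import csv
import io

# Sample data mirroring the example rows in the question (illustrative only).
sample = 'Field1,Field2,Field3,Field4\n1,2,"red",3\n'

rows = list(csv.reader(io.StringIO(sample)))

# The reader returns plain strings; the quotes around "red" are consumed during parsing.
data_row = rows[1]
print(data_row)   # ['1', '2', 'red', '3']

# Joining the parsed fields with commas therefore yields an unquoted Field3.
rejoined = ','.join(data_row)
print(rejoined)   # 1,2,red,3
```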
My current code looks like this:
import csv

# Creates an empty set - this stores the Field4 values that already have an output file
newfilelist = set()

# Opens the large csv file in "read" mode
with open('File.csv', 'r', newline='') as csvfile:
    read_rows = csv.reader(csvfile)
    # Read the first row of the large file and store the whole row as a string (headerstring)
    headerrow = next(read_rows)
    headerstring = ','.join(headerrow)
    for row in read_rows:
        # Store the whole row as a string (rowstring); the reader has already stripped the quotes
        rowstring = ','.join(row)
        # Takes Field4
        newfilename = row[3]
        # This basically makes sure it is not looking at the header row.
        if newfilename != "field4":
            with open('//output/' + str(newfilename) + '.csv', 'a') as f:
                # If newfilename is not in the set yet, add it and start the new csv file with the header row.
                if newfilename not in newfilelist:
                    newfilelist.add(newfilename)
                    f.write(headerstring)
                    f.write("\n")
                # Append the current row to the csv file for this Field4 value.
                f.write(rowstring)
                f.write("\n")
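Can anyone tell me how to get the quotes around the strings? Unfortunately, the software that consumes my files requires them in this format!
[Discussion]:
Tags: python csv parsing text quotes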
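One possible approach, sketched here under several assumptions, is to copy each input line verbatim and use the `csv` module only to find the value that names the output file, so the original quoting is never touched. The function name `split_by_column` and its parameters are hypothetical. The sketch assumes that no quoted field contains an embedded line break, that the output directory already exists, and that the key values are safe to use as file names.

```python
import csv
import os


def split_by_column(src_path, out_dir, key_index=3):
    """Write one csv per distinct value in column key_index.

    Data lines are copied as raw text, so quotes in the source are preserved.
    Returns the set of key values that were seen.
    """
    seen = set()
    # newline='' keeps the original line endings intact when reading and writing.
    with open(src_path, 'r', newline='') as src:
        header = src.readline()
        if not header.endswith('\n'):
            header += '\n'
        for line in src:
            if not line.strip():
                continue  # skip blank lines
            if not line.endswith('\n'):
                line += '\n'  # last line of the file may lack a terminator
            # Parse this single line only to read the key; the raw line is what gets written.
            key = next(csv.reader([line]))[key_index]
            out_path = os.path.join(out_dir, key + '.csv')
            # Start a fresh file the first time a key appears, append afterwards.
            mode = 'a' if key in seen else 'w'
            with open(out_path, mode, newline='') as out:
                if key not in seen:
                    seen.add(key)
                    out.write(header)
                out.write(line)
    return seen
```

With the example data, calling `split_by_column('File.csv', 'output')` would be expected to produce `3.csv` and `4.csv` with `"red"` and `"blue"` still quoted. Reopening the output file for every row mirrors the original code and keeps the sketch short; for a very large input, caching open file handles per key would likely be faster.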