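【Posted】: 2016-07-03 16:05:11
【Question】:
Hello, I need to process a large CSV file (20M rows), adding double quotes around every comma-separated field. The file has 8 comma-separated fields, as follows: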
'2016-03-12','12393659','134',,'35533605',189348,9798,gmail.com;live_com.com
'2016-03-12','12390103','138',,'35438006',5133,1897,google.com
'2016-03-12','45616164','139',,'01318800',10945593,596633,facebook.com;tumblr.com;t.co
'2016-03-12','45673436','38',,'86441702',4350985,150327,serving-sys.com;chartboost.com;admarvel.com;mydas.mobi;adap.tv;cloudfront.net
As you can see, the first 3 fields are in single quotes, the 4th is empty, the 5th is in single quotes, and the 6th through 8th are separated by commas only, with no quotes. I would like to get the following result (the 4th field needs double quotes too, even though it is empty):
"2016-03-12","12393659","134","","35533605","189348","9798","gmail.com;live_com.com"
"2016-03-12","12390103","138","","35438006","5133","1897","google.com"
"2016-03-12","45616164","139","","01318800","10945593","596633","facebook.com;tumblr.com;t.co"
"2016-03-12","45673436","38","","86441702","4350985","150327","serving-sys.com;chartboost.com;admarvel.com;mydas.mobi;adap.tv;cloudfront.net"
I got a partial result by combining sed and awk:
sed -e s/\'//g inpu.csv > output.csv   # eliminate quotes
awk '{gsub(/[^,]+/,"\"&\"")}1' output.csv > output1.csv   # add double quotes
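However, the 4th field gets no double quotes, and I need to keep processing time as low as possible. Any help with doing all of this in a single step, with better performance and with the 4th field double-quoted, would be appreciated. Thank you very much for your help. M.Tave
【Discussion】: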
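The empty 4th field is skipped because the pattern `[^,]+` needs at least one non-comma character, so it never matches between two adjacent commas. One possible single-pass approach is to let awk split on commas and rewrite each field, which also removes the intermediate file. The following is only a sketch: `quote_fields` is a hypothetical helper name, and it assumes no field contains an embedded comma or double quote, which holds for the sample rows shown but has not been checked against the full file.

```shell
# quote_fields is a hypothetical helper: reads CSV on stdin, writes to stdout.
# Assumption: no field contains an embedded comma or double quote.
quote_fields() {
    awk -v sq="'" 'BEGIN { FS = OFS = "," }   # split and rejoin on commas
    {
        for (i = 1; i <= NF; i++) {
            gsub(sq, "", $i)        # strip any single quotes from the field
            $i = "\"" $i "\""       # wrap in double quotes, empty fields included
        }
        print
    }'
}

# Demo on one sample row from the question:
printf '%s\n' "'2016-03-12','12390103','138',,'35438006',5133,1897,google.com" | quote_fields

# On the real file it would be used as:
#   quote_fields < inpu.csv > output1.csv
```

Because the loop touches every field rather than relying on a regex match, empty fields are quoted like any other. Whether this is faster than the two-step version on 20M rows has not been measured here. Running the command under `LC_ALL=C` is a common way to reduce locale overhead in text tools and may be worth trying.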