[Posted]: 2019-03-27 21:31:29
[Question]:
This is my input.csv file:
dealerid,address,city,state,zip,vin,stocknumber,type,color,year,make,model,trim,bodystyle,fueltype,mileage,transmission,interiorcolor,interiorfabric,price,titlestatus,warranty,options_text,cylinders,engine,engineaspiration,enginetext,drivetrain,transmissiontext,mpgcity,mpghighway,features_text,vdc_url,images
TS06095298,999 wanna Road,Windsor,CT,06095,22HDT13S922218113,298,Used,Red,2002,OLDSMOBILE,BRAVADA,,,,136000,AUTOMATIC,,,2200,Clear,Available,"This vehicle is offered for sale by a verified private seller and features: FREE vehicle history & title report. Original window sticker available. Seller`s identity, email and phone verified. Secure cashless transactions. No cash needed. Pay securely by debit card or ACH. Bill of Sale and receipt issued for completed transactions. Vehicle financing options may be available.",,,,,,,,,,https://www.example.com/listings/298,"https://www.example.com/rails/00008.jpg,https://www.example.com/rails/AM00010.jpg"
TS06095298,999 wanna Road,Windsor,CT,06095,22HDT13S922123453,307,Used,Brown,2008,HONDA,599,,,,217538,AUTOMATIC,,,3500,Clear,Available,"This vehicle is offered for sale by a verified private seller and features: FREE vehicle history & title report. Original window sticker available. Seller`s identity, email and phone verified. Secure cashless transactions. No cash needed. Pay securely by debit card or ACH. Bill of Sale and receipt issued for completed transactions. Vehicle financing options may be available.",,,,,,,,,,https://www.example.com/listings/211,"https://www.example.com/rails/00008.jpg,https://www.example.com/rails/AM00010.jpg"
I need to wrap every column in double quotes, so that I end up with a file like this:
"dealerid","address","city","state","zip","vin","stocknumber","type","color","year","make","model","trim","bodystyle","fueltype","mileage","transmission","interiorcolor","interiorfabric","price","titlestatus","warranty","options_text","cylinders","engine","engineaspiration","enginetext","drivetrain","transmissiontext","mpgcity","mpghighway","features_text","vdc_url","images"
"TS06095298","999 wanna Road,Windsor","CT","06095","22HDT13S922218113","298","Used","Red","2002","OLDSMOBILE","BRAVADA","","","","136000,AUTOMATIC","","","2200","Clear","Available","This vehicle is offered for sale by a verified private seller and features: FREE vehicle history & title report. Original window sticker available. Seller`s identity, email and phone verified. Secure cashless transactions. No cash needed. Pay securely by debit card or ACH. Bill of Sale and receipt issued for completed transactions. Vehicle financing options may be available.","","","","","","","","","","https://www.example.com/listings/298","https://www.example.com/rails/00008.jpg,https://www.example.com/rails/AM00010.jpg"
"TS06095298","999 wanna Road,Windsor","CT","06095","22HDT13S922123453","307","Used","Brown","2008","HONDA","599","","","","217538","AUTOMATIC","","","3500","Clear","Available","This vehicle is offered for sale by a verified private seller and features: FREE vehicle history & title report. Original window sticker available. Seller`s identity, email and phone verified. Secure cashless transactions. No cash needed. Pay securely by debit card or ACH. Bill of Sale and receipt issued for completed transactions. Vehicle financing options may be available.","","","","","","","","","","https://www.example.com/listings/211","https://www.example.com/rails/00008.jpg,https://www.example.com/rails/AM00010.jpg"
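The file layout never changes, and the same columns are always missing data.
The images column and the features text column are already wrapped in double quotes.
Since the same information is always missing, I decided to add a double quote at the start of each line and then replace the commas with quoted commas, but I started running into problems.
This is what I have so far. I know the code is not efficient, but it is a start.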
#!/bin/bash
#- Temp Directories
tmp_dir="$(mktemp -d -t 'csv.XXXXX' || mktemp -d 2>/dev/null)"
tmp_input1="${tmp_dir}/temp_input1.csv"
tmp_input2="${tmp_dir}/temp_input2.csv"
tmp_input3="${tmp_dir}/temp_input3.csv"
#- Variables
client="00000"
wDir="$(pwd)"
ftpDir="${wDir}/.clientftp"
clientDir="${ftpDir}/${client}"
csvFile="${clientDir}/final.csv"
inputCsv="${wDir}/input.csv"
# Let's begin
cd "$wDir" || exit
cp "$inputCsv" "$tmp_input1"
dos2unix "$tmp_input1"
# Write the header line to a temp file: turn each comma into "," and wrap the whole line in double quotes
head -1 "$tmp_input1" | sed -e 's/,/","/g;s/.*/"&"/' > "$tmp_input2"
# Write the remaining lines to a temp file, adding a double quote at the start of each line
sed 1,1d "$tmp_input1" | sed "s/^/\"/" > "$tmp_input3"
sed -i 's/",,,,,,,,,,https/","","","","","","","","","","https/g' "$tmp_input3"
sed -i 's/,Clear,Available,"/","Clear","Available","/g' "$tmp_input3"
sed -i 's/,,,,/","","","","/g' "$tmp_input3"
sed -i 's/,,,/","","","/g' "$tmp_input3"
# Create final file
cat "$tmp_input2" > "$csvFile"
cat "$tmp_input3" >> "$csvFile"
rm -rf "$tmp_dir"
{ clear; echo ""; echo ""; echo "nano $csvFile"; echo ""; }
nano "$csvFile"
This script produces:
"dealerid","address","city","state","zip","vin","stocknumber","type","color","year","make","model","trim","bodystyle","fueltype","mileage","transmission","interiorcolor","interiorfabric","price","titlestatus","warranty","options_text","cylinders","engine","engineaspiration","enginetext","drivetrain","transmissiontext","mpgcity","mpghighway","features_text","vdc_url","images"
"TS06095298,999 wanna Road,Windsor,CT,06095,22HDT13S922218113,298,Used,Red,2002,OLDSMOBILE,BRAVADA","","","","136000,AUTOMATIC","","","2200","Clear","Available","This vehicle is offered for sale by a verified private seller and features: FREE vehicle history & title report. Original window sticker available. Seller`s identity, email and phone verified. Secure cashless transactions. No cash needed. Pay securely by debit card or ACH. Bill of Sale and receipt issued for completed transactions. Vehicle financing options may be available.","","","","","","","","","","https://www.example.com/listings/298,"https://www.example.com/rails/00008.jpg,https://www.example.com/rails/AM00010.jpg"
"TS06095298,999 wanna Road,Windsor,CT,06095,22HDT13S922123453,307,Used,Brown,2008,HONDA,599","","","","217538,AUTOMATIC","","","3500","Clear","Available","This vehicle is offered for sale by a verified private seller and features: FREE vehicle history & title report. Original window sticker available. Seller`s identity, email and phone verified. Secure cashless transactions. No cash needed. Pay securely by debit card or ACH. Bill of Sale and receipt issued for completed transactions. Vehicle financing options may be available.","","","","","","","","","","https://www.example.com/listings/211,"https://www.example.com/rails/00008.jpg,https://www.example.com/rails/AM00010.jpg"
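One quick way to sanity-check output like the above is to count the double quotes on each line: an odd count means some quote is never opened or never closed. The following is a minimal sketch, not part of the original script; the function name `report_unbalanced_quotes` is hypothetical, and it assumes that no field contains an embedded newline.

```shell
#!/bin/sh
# report_unbalanced_quotes (hypothetical helper): read CSV on stdin and
# print the line number of every line with an odd number of double quotes.
# Assumption: no field spans more than one line.
report_unbalanced_quotes() {
  # Splitting a line on the quote character yields (quotes + 1) fields,
  # so an even, non-zero NF means the line has an odd number of quotes.
  awk -F'"' 'NF > 0 && NF % 2 == 0 { print NR }'
}
```

Running `report_unbalanced_quotes < "$csvFile"` on a file like the one above would list the line numbers of the rows that need attention.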
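So now I have a few issues:
1- The vdc_url column has no closing double quote
2- The first 10 commas need to be surrounded by double quotes
The last column can contain more than 3 images
Any help would be greatly appreciated.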
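Because some fields are already quoted and contain commas, replacing commas with fixed sed patterns is fragile. One possible alternative, shown here only as a sketch, is to walk each line character by character, track whether the current position is inside quotes, and re-emit every field wrapped in double quotes. The function name `quote_all_fields` is hypothetical; the sketch assumes Unix line endings (for example after `dos2unix`) and no embedded newlines inside fields.

```shell
#!/bin/sh
# quote_all_fields (hypothetical helper): read CSV on stdin and write the
# same rows to stdout with every field wrapped in double quotes.
# Assumptions: Unix line endings, no field contains a newline.
quote_all_fields() {
  awk '
  {
    out = ""        # quoted fields emitted so far for this line
    field = ""      # contents of the field being collected
    in_quotes = 0   # 1 while inside a quoted field
    n = length($0)
    for (i = 1; i <= n; i++) {
      c = substr($0, i, 1)
      if (c == "\"") {
        if (in_quotes && substr($0, i + 1, 1) == "\"") {
          # A doubled quote inside a quoted field is an escaped quote: keep it.
          field = field "\"\""
          i++
        } else {
          # Opening or closing quote: toggle state and drop the character.
          in_quotes = !in_quotes
        }
      } else if (c == "," && !in_quotes) {
        # Field separator outside quotes: emit the finished field.
        out = out "\"" field "\","
        field = ""
      } else {
        field = field c
      }
    }
    # Emit the last field of the line.
    print out "\"" field "\""
  }'
}
```

It could be used as `quote_all_fields < input.csv > final.csv`. The header line goes through the same path as the data rows, so no separate handling of the first line is needed, and the number of images in the last column does not matter because commas inside quotes are never treated as separators.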
[Discussion]:
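-
Your expected output has "136000,AUTOMATIC" in the second line, but it should be "136000","AUTOMATIC", right?
-
@BenjaminW. Yes sir, all columns should be wrapped in double quotes. Good catch!
-
Also, "999 wanna Road,Windsor" should be "999 wanna Road","Windsor".
-
@BenjaminW. You are right. Take a look at Glenn's answer. It seems to do the trick and it is very efficient. Thanks for your help anyway, I appreciate it.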