【Posted】: 2020-02-15 23:43:03
【Question】:
I have a Bash script that prompts for input in order to edit a CSV column:
echo "X, as recorded in TIFF header, to be removed:"
read -r X
sed -i "" "/^[^,]*_f_[^,_]*,/s/,$X /,f. /
s/,$X /,/" "$pathToCSV.csv"
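Now I would like to avoid the manual, one-at-a-time input and instead use another column of the CSV (column C), which contains the value X, to edit the first column. That way I can batch-generate these CSVs from the TIFF headers without having to type X each time (it differs from one CSV to the next).
I looked around and tried a few things, including: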
X=$(awk 'NR == 2 {print $3}' $pathToCSV".csv")
sed -i "" "/^[^,]*_f_[^,_]*,/s/,$X /,f. /
s/,$X /,/" $pathToCSV".csv"
This does not work: the output still includes X. (Every row of column C contains the same value; I arbitrarily used C2.)
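One possible reason for the failure, offered as a guess rather than a confirmed diagnosis: awk splits on whitespace by default, so `$3` is the third whitespace-separated word of the line, not the third comma-separated column. The sketch below compares the two splitting modes on a made-up two-line sample (the shortened file names and the temporary file are assumptions for illustration, and the columns are assumed to be comma-separated):

```shell
# Hypothetical sample data: path, description, X (comma-separated).
csv=$(mktemp)
printf '%s\n' \
  'sld_arb0695_0001_a_1.tif,Si Ar 695 Front Board Outside,Si Ar 695' \
  'sld_arb0695_0010_f_001r.tif,Si Ar 695 001r,Si Ar 695' > "$csv"

# Default field splitting (runs of blanks): $3 is the third "word" of line 2.
X_ws=$(awk 'NR == 2 {print $3}' "$csv")

# Comma as the field separator: $3 is the third CSV column of line 2.
X_csv=$(awk -F, 'NR == 2 {print $3}' "$csv")

printf 'whitespace split: [%s]\n' "$X_ws"
printf 'comma split:      [%s]\n' "$X_csv"
rm -f "$csv"
```

With whitespace splitting the variable ends up holding only `695`, so a pattern such as `,$X ` never matches; with `-F,` it holds the full `Si Ar 695`.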
Update 1:
As requested, I am pasting part of the CSV here:
/Volumes/Masters/DLTempSecure/EMEL_SLDP/201808/tiffs/arabic_0695/sld_arb0695_0001_a_1.tif Si Ar 695 Front Board Outside Si Ar 695
/Volumes/Masters/DLTempSecure/EMEL_SLDP/201808/tiffs/arabic_0695/sld_arb0695_0002_a_1a.tif Si Ar 695 Front Board Outside Si Ar 695
/Volumes/Masters/DLTempSecure/EMEL_SLDP/201808/tiffs/arabic_0695/sld_arb0695_0003_b_000.tif Si Ar 695 Front Board Inside Si Ar 695
/Volumes/Masters/DLTempSecure/EMEL_SLDP/201808/tiffs/arabic_0695/sld_arb0695_0009_b_003v.tif Si Ar 695 Flyleaf 003v Si Ar 695
/Volumes/Masters/DLTempSecure/EMEL_SLDP/201808/tiffs/arabic_0695/sld_arb0695_0010_f_001r.tif Si Ar 695 001r Si Ar 695
/Volumes/Masters/DLTempSecure/EMEL_SLDP/201808/tiffs/arabic_0695/sld_arb0695_0060_y_001r.tif Si Ar 695 Flyleaf 001r Si Ar 695
/Volumes/Masters/DLTempSecure/EMEL_SLDP/201808/tiffs/arabic_0695/sld_arb0695_0070_y_999.tif Si Ar 695 Back Board Inside Si Ar 695
/Volumes/Masters/DLTempSecure/EMEL_SLDP/201808/tiffs/arabic_0695/sld_arb0695_0071_z_1.tif Si Ar 695 Back Board Outside Si Ar 695
/Volumes/Masters/DLTempSecure/EMEL_SLDP/201808/tiffs/arabic_0695/sld_arb0695_0072_z_1a.tif Si Ar 695 Back Board Outside Si Ar 695
/Volumes/Masters/DLTempSecure/EMEL_SLDP/201808/tiffs/arabic_0695/sld_arb0695_0073_z_2.tif Si Ar 695 Spine Si Ar 695
/Volumes/Masters/DLTempSecure/EMEL_SLDP/201808/tiffs/arabic_0695/sld_arb0695_0074_z_3.tif Si Ar 695 Fore edge Si Ar 695
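"Si Ar 695" (and the other values used for different CSVs) should not have to be typed in by hand; instead, the third column should be used to remove the duplicated value from the front of the second column.
Update 2:
The awk command below, suggested by @anubhava, works perfectly: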
awk 'BEGIN{FS=OFS=","} NF>2 && NR==1{s=$3} {sub("^" s "[[:blank:]]+", "", $2)} 1' $pathToCSV".csv"
However, I have only just remembered that the earlier sed command, namely
sed -i "" "/^[^,]*_f_[^,_]*,/s/,$X /,f. /
s/,$X /,/" $pathToCSV".csv"
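also prefixed my second column with f. whenever the file name in the first column contained _f_. I assume this only needs a regex tweak, but I am struggling to implement it in the awk command above.
Here is the input: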
/Volumes/Masters/DLTempSecure/EMEL_SLDP/201808/tiffs/arabic_0695/sld_arb0695_0010_f_001r.tif Si Ar 695 001r Si Ar 695
/Volumes/Masters/DLTempSecure/EMEL_SLDP/201808/tiffs/arabic_0695/sld_arb0695_0011_f_001v.tif Si Ar 695 001v Si Ar 695
/Volumes/Masters/DLTempSecure/EMEL_SLDP/201808/tiffs/arabic_0695/sld_arb0695_0012_f_002r.tif Si Ar 695 002r Si Ar 695
/Volumes/Masters/DLTempSecure/EMEL_SLDP/201808/tiffs/arabic_0695/sld_arb0695_0013_f_002v.tif Si Ar 695 002v Si Ar 695
/Volumes/Masters/DLTempSecure/EMEL_SLDP/201808/tiffs/arabic_0695/sld_arb0695_0014_f_003ar.tif Si Ar 695 003r Si Ar 695
/Volumes/Masters/DLTempSecure/EMEL_SLDP/201808/tiffs/arabic_0695/sld_arb0695_0015_f_003av.tif Si Ar 695 003v Si Ar 695
/Volumes/Masters/DLTempSecure/EMEL_SLDP/201808/tiffs/arabic_0695/sld_arb0695_0016_f_004br.tif Si Ar 695 004r Si Ar 695
/Volumes/Masters/DLTempSecure/EMEL_SLDP/201808/tiffs/arabic_0695/sld_arb0695_0017_f_004bv.tif Si Ar 695 004v Si Ar 695
/Volumes/Masters/DLTempSecure/EMEL_SLDP/201808/tiffs/arabic_0695/sld_arb0695_0018_f_005r.tif Si Ar 695 005r Si Ar 695
Here is the desired output:
/Volumes/Masters/DLTempSecure/EMEL_SLDP/201808/tiffs/arabic_0695/sld_arb0695_0010_f_001r.tif f. 001r Sinai Arabic 695
/Volumes/Masters/DLTempSecure/EMEL_SLDP/201808/tiffs/arabic_0695/sld_arb0695_0011_f_001v.tif f. 001v Sinai Arabic 695
/Volumes/Masters/DLTempSecure/EMEL_SLDP/201808/tiffs/arabic_0695/sld_arb0695_0012_f_002r.tif f. 002r Sinai Arabic 695
/Volumes/Masters/DLTempSecure/EMEL_SLDP/201808/tiffs/arabic_0695/sld_arb0695_0013_f_002v.tif f. 002v Sinai Arabic 695
/Volumes/Masters/DLTempSecure/EMEL_SLDP/201808/tiffs/arabic_0695/sld_arb0695_0014_f_003ar.tif f. 003r Sinai Arabic 695
/Volumes/Masters/DLTempSecure/EMEL_SLDP/201808/tiffs/arabic_0695/sld_arb0695_0015_f_003av.tif f. 003v Sinai Arabic 695
/Volumes/Masters/DLTempSecure/EMEL_SLDP/201808/tiffs/arabic_0695/sld_arb0695_0016_f_004br.tif f. 004r Sinai Arabic 695
/Volumes/Masters/DLTempSecure/EMEL_SLDP/201808/tiffs/arabic_0695/sld_arb0695_0017_f_004bv.tif f. 004v Sinai Arabic 695
/Volumes/Masters/DLTempSecure/EMEL_SLDP/201808/tiffs/arabic_0695/sld_arb0695_0018_f_005r.tif f. 005r Sinai Arabic 695
Any help with Update 2 would be much appreciated!
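A minimal sketch of how the `_f_` rule from the sed command might be folded into an awk script, assuming the columns are comma-separated (path, description, X). The function name `strip_and_prefix` is hypothetical. Unlike the awk command quoted above, this version reads X from each row's own third column and compares it as a literal string rather than a regex, and it expects exactly one space after X:

```shell
# Hypothetical helper: print the CSV given as $1 with X stripped from the
# front of column 2 and "f. " prepended for files whose name contains _f_.
strip_and_prefix() {
  awk 'BEGIN { FS = OFS = "," }
       NF > 2 {
         x = $3                      # value to remove, taken from column 3
         n = length(x)
         # Literal prefix test, so regex metacharacters in X are harmless.
         if (substr($2, 1, n + 1) == x " ")
           $2 = substr($2, n + 2)
         # Same test as the sed address: _f_ followed by no further underscore.
         if ($1 ~ /_f_[^_]*$/)
           $2 = "f. " $2
       }
       1' "$1"
}

# Usage, following the question's naming: strip_and_prefix "$pathToCSV.csv"
```

For a row such as `sld_arb0695_0010_f_001r.tif,Si Ar 695 001r,Si Ar 695` this yields `sld_arb0695_0010_f_001r.tif,f. 001r,Si Ar 695`. The output goes to standard output, so overwriting the original file would need a temporary file and a rename.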
【Comments】:
-
Consider posting the pathToCSV file (or at least a few lines of it, if the file is large).
-
If your CSV is comma-separated, then Si Ar 695 is the second column in the sample data, not the third.
-
I did not paste the first column. The first column is the file path of the TIFF.
-
It is better to show the correct input data; otherwise you will get wrong answers.