Option 1
There is a solution that works in some versions of awk:
awk '{ $(NF+1)=$1;$1="";$0=$0;} NF=NF ' infile.txt
Explanation:
$(NF+1)=$1 # add a new field equal to field 1.
$1="" # erase the contents of field 1.
$0=$0;} # force the record to be re-split into fields.
NF=NF # rebuild $0 from the fields; the non-zero value also triggers a print.
Result:
United Arab Emirates AE
Antigua & Barbuda AG
Netherlands Antilles AN
American Samoa AS
Bosnia and Herzegovina BA
Burkina Faso BF
Brunei Darussalam BN
However, this may fail with older versions of awk.
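To see what the $0=$0 step does in Option 1, here is a minimal sketch that prints NF before and after the re-split. The sample line is taken from the data above; the variable names out and before are only for illustration.

```shell
# Count the fields before and after forcing a re-split of the record.
out=$(printf 'AE United Arab Emirates\n' | awk '{
    $(NF+1) = $1      # append a copy of field 1, so NF grows by one
    $1 = ""           # blank field 1; the rebuilt record now starts with OFS
    before = NF       # the empty first field is still counted here
    $0 = $0           # re-split the record; the leading blank is skipped
    print before, NF
}')
printf '%s\n' "$out"   # prints: 5 4
```

The empty first field only disappears once the record is split again, which is why the re-split is needed before printing.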
Option 2
awk '{ $(NF+1)=$1;$1="";sub(OFS,"");}1' infile.txt
That is:
awk '{ # call awk.
$(NF+1)=$1; # Add one trailing field.
$1=""; # Erase first field.
sub(OFS,""); # remove leading OFS.
}1' # print the line.
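Note that it is the OFS that needs to be erased, not the FS. When field $1 is assigned, the line is rebuilt from its fields, and that changes every run of FS into a single OFS.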
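A small sketch of that effect, assuming a made-up input line with several spaces between fields and OFS set to ';' so that the rebuilt separators are visible:

```shell
# Blanking $1 rebuilds the record: each run of spaces becomes one OFS,
# and the empty first field leaves a leading OFS behind.
out=$(printf 'AE   United    Arab\n' | awk -v OFS=';' '{ $1 = ""; print }')
printf '%s\n' "$out"   # prints: ;United;Arab
```

The leading character is the OFS (';'), not a space, which is why sub(OFS,"") is the call that removes it.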
But even this option still fails to preserve repeated delimiters, as changing the OFS makes clear:
awk -v OFS=';' '{ $(NF+1)=$1;$1="";sub(OFS,"");}1' infile.txt
That command will output:
United;Arab;Emirates;AE
Antigua;&;Barbuda;AG
Netherlands;Antilles;AN
American;Samoa;AS
Bosnia;and;Herzegovina;BA
Burkina;Faso;BF
Brunei;Darussalam;BN
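This shows that each run of FS is being changed into a single OFS.
The only way to avoid that is to avoid rebuilding the record from its fields.
One function that edits the line without rebuilding it is sub.
The first field can be captured, then removed from $0 with sub, and then printed after the rest of the line.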
Option 3
awk '{ a=$1;sub("[^"FS"]+["FS"]+",""); print $0, a;}' infile.txt
a=$1 # capture first field.
sub( " # replace:
[^"FS"]+ # A run of non-FS
["FS"]+ # followed by a run of FS.
" , "" # for nothing.
) # Defaults to $0 (the whole line).
print $0, a # Print in reverse order, with OFS.
United Arab Emirates AE
Antigua & Barbuda AG
Netherlands Antilles AN
American Samoa AS
Bosnia and Herzegovina BA
Burkina Faso BF
Brunei Darussalam BN
This works even if we change the FS, the OFS, and/or add more delimiters.
If the input file is changed to:
AE..United....Arab....Emirates
AG..Antigua....&...Barbuda
AN..Netherlands...Antilles
AS..American...Samoa
BA..Bosnia...and...Herzegovina
BF..Burkina...Faso
BN..Brunei...Darussalam
The command becomes:
awk -vFS='.' -vOFS=';' '{a=$1;sub("[^"FS"]+["FS"]+",""); print $0,a;}' infile.txt
The output will be (with the original delimiters still preserved):
United....Arab....Emirates;AE
Antigua....&...Barbuda;AG
Netherlands...Antilles;AN
American...Samoa;AS
Bosnia...and...Herzegovina;BA
Burkina...Faso;BF
Brunei...Darussalam;BN
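One detail worth spelling out: the pattern is built by string concatenation, so FS ends up inside a bracket expression, where a dot is a literal character rather than "any character". The sketch below only prints the pattern that awk would build. It assumes FS is a single character; a multi-character or regex FS would probably need a different pattern.

```shell
# Show the regular expression that Option 3 assembles when FS is a dot.
out=$(awk -v FS='.' 'BEGIN { print "[^" FS "]+[" FS "]+" }')
printf '%s\n' "$out"   # prints: [^.]+[.]+
```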
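The command can be extended to several fields, but only in awk implementations that support regex interval expressions ({n}); older versions of GNU awk need the --re-interval option to enable them. Running this command on the original file: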
awk -vn=2 '{a=$1;b=$2;sub("([^"FS"]+["FS"]+){"n"}","");print $0,a,b;}' infile.txt
will output this:
Arab Emirates AE United
& Barbuda AG Antigua
Antilles AN Netherlands
Samoa AS American
and Herzegovina BA Bosnia
Faso BF Burkina
Darussalam BN Brunei
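For awk versions without interval expressions, a loop is one possible workaround. This is only a sketch under the same assumptions as Option 3 (a single-character FS, no leading separators). The variable name moved is made up for illustration, and two lines of the original data are used as input.

```shell
# Move the first n fields to the end without using {n} in the regex.
out=$(printf '%s\n' 'AE United Arab Emirates' 'BF Burkina Faso' | awk -v n=2 '{
    moved = ""
    for (i = 1; i <= n; i++)
        moved = moved OFS $i               # save fields 1..n before editing $0
    for (i = 1; i <= n; i++)
        sub("[^" FS "]+[" FS "]+", "")     # strip one leading field plus its separators
    print $0 moved
}')
printf '%s\n' "$out"
```

The fields are captured before the first sub call because editing $0 causes awk to re-split the record, which would shift the field numbers.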