【Question title】: How to print all columns after matching on key field
【Posted】: 2017-02-21 20:06:22
【Question description】:

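After matching on a key field, how can I join all the fields of each line from two files? And how can this one-liner be generalized if the number of fields in f2 is unknown?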

f2:    
a 1 2    
b 3 4    
c 5 6   

f3:    
10 a x y z    
11 g x y z    
12 j x y z    

observed:    
a 10 x y z
a1 10 x y z

Desired:    
a 1 2 10 x y z

These are my best attempts, but they are not correct:

awk 'FNR==NR{a[$1]=$2;next} ($2 in a) {print a[$2],$0}' f2.txt f3.txt > f4.txt

awk 'FNR==NR{a[$1]=$2$3;next} ($2 in a) {print a[$2],$0}' f2.txt f3.txt > f4.txt

【Question comments】:

    Tags: awk


    【Solution 1】:
      awk  'NR==FNR{a[$1]=$0;next} ($2 in a){print a[$2],$1,$3,$4,$5}' f2.txt f3.txt > f4.txt
    

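    While reading the first file, store each whole line as the value, with column 1 as the key. While reading the second file, check whether column 2 is a key in array a; if it is, print a[$2] followed by the remaining columns.

    A shorter way (the drawback of this command is an extra space between 10 and x):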

    awk  'NR==FNR{a[$1]=$0;next} ($2 in a){second=$2; $2="";print a[second],$0}' f2.txt f3.txt > f4.txt
    

    This sets $2 of the second file to an empty string and then prints the whole line $0.

    【Comments】:
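
    To handle an unknown number of fields without the doubled space, one option is to loop over the fields of the second file and skip the key. This is a minimal sketch, not part of the original answer; it assumes whitespace-separated fields and recreates the sample data from the question in a scratch directory (`tmpdir` and `out` are illustrative names):

```shell
# Recreate the sample files from the question in a scratch directory.
tmpdir=$(mktemp -d)
printf '%s\n' 'a 1 2' 'b 3 4' 'c 5 6' > "$tmpdir/f2.txt"
printf '%s\n' '10 a x y z' '11 g x y z' '12 j x y z' > "$tmpdir/f3.txt"

out=$(awk '
    NR==FNR { a[$1] = $0; next }          # first file: store whole line, keyed by column 1
    ($2 in a) {
        line = a[$2]                      # start with the matching line from the first file
        for (i = 1; i <= NF; i++)         # append every field of the second file except the key
            if (i != 2) line = line OFS $i
        print line
    }
' "$tmpdir/f2.txt" "$tmpdir/f3.txt")

echo "$out"
rm -rf "$tmpdir"
```

    Because the loop runs up to NF, it does not matter how many columns either file has.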

      【Solution 2】:

      If your files are sorted on the key, as in the example, join is the tool for this task

      join -1 1 -2 2 f2.txt f3.txt
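
      join expects both inputs to be sorted on the join field. If that is not guaranteed, a possible approach is to sort each file on its key first. The sketch below is an illustration only: the scratch directory, the `.sorted` file names and the shuffled sample data are assumptions, not part of the original answer.

```shell
tmpdir=$(mktemp -d)
# Sample data from the question, deliberately out of key order.
printf '%s\n' 'c 5 6' 'a 1 2' 'b 3 4' > "$tmpdir/f2.txt"
printf '%s\n' '12 j x y z' '10 a x y z' '11 g x y z' > "$tmpdir/f3.txt"

# Sort each file on the field that join will match on.
sort -k1,1 "$tmpdir/f2.txt" > "$tmpdir/f2.sorted"
sort -k2,2 "$tmpdir/f3.txt" > "$tmpdir/f3.sorted"

# -1 1: the key is field 1 of the first file; -2 2: the key is field 2 of the second file.
joined=$(join -1 1 -2 2 "$tmpdir/f2.sorted" "$tmpdir/f3.sorted")

echo "$joined"
rm -rf "$tmpdir"
```

      join prints the key first, then the remaining fields of the first file, then the remaining fields of the second, so the number of columns in either file does not need to be known in advance.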
      

      【Comments】:

        【Solution 3】:

        @mxttgen31: Try:

        awk 'FNR==NR{Q=$2;$2="";A[Q]=$0;next} ($1 in A){print $0,A[$1]}'  f3  f2
        

        The command above is explained below:

        awk 'FNR==NR{      ##### FNR and NR both hold the current line number. FNR is reset at the start
                           ##### of each input file, while NR keeps increasing until all files are read,
                           ##### so this condition is true only while the first input file (f3 here) is being read.
        Q=$2;              ##### Save the second field (the key) in variable Q.
        $2="";             ##### Set the second field to an empty string; this also rebuilds $0.
        A[Q]=$0;           ##### Store the rebuilt line in array A under index Q (the saved key).
        next}              ##### next (an awk built-in statement) skips the remaining rules and starts over with the next line.
        ($1 in A){         ##### Reached only while the second input file is read: check whether $1 (first field) is an index of A.
        print $0,A[$1]}    ##### Print the current line ($0) of f2, followed by the value of A stored under $1.
        ' f3  f2           ##### The input files, in the order they are read.
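
        To see how FNR and NR behave across two files, a small trace can help. This is only an illustrative sketch: the scratch directory, the shortened sample files and the `first-file` / `later-file` labels are made up for the demonstration.

```shell
tmpdir=$(mktemp -d)
printf '%s\n' '10 a x y z' '11 g x y z' > "$tmpdir/f3"
printf '%s\n' 'a 1 2' 'b 3 4' > "$tmpdir/f2"

# For every input line, print NR, FNR, and whether FNR==NR holds.
trace=$(awk '{ print NR, FNR, (FNR==NR ? "first-file" : "later-file") }' "$tmpdir/f3" "$tmpdir/f2")

echo "$trace"
rm -rf "$tmpdir"
```

        FNR restarts at 1 when awk opens f2 while NR continues from 3, so the FNR==NR test stops matching there. Note that the idiom misfires if the first file is empty, because FNR==NR then also holds while the second file is read.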
        

        【Comments】:

        • Your output is a 1 2 a 10 x y z, but the output the OP wants is a 1 2 10 x y z