【Question title】: Compare two files and output a file with marks
【Posted】: 2018-09-04 14:57:32
【Question description】:

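I've been working on this all day and just can't get it to work. I've tried so much code with diff, awk and sed that I can't even remember everything I tried.

Here is my problem: I have 2 files (file1 and file2).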

File1 :

#4 and a row (2)
+1 hello post (5)
10 Years After (6)
21 & Over (8)
50_50 (1)
Almost Christmas (3)

File2:

#4 and a row (2) http://example.com/post1
+1 hello post (5) http://example.com/post2
Not over yet (3) http://example.com/post12
10 Years After (6) http://example.com/post3
Can get it done (2) http://example.com/post24
21 & Over (8) http://example.com/post9
50_50 (1) http://example.com/post7
hear me loud (5) http://example.com/post258
Almost Christmas (3) http://example.com/post5

My question is how to compare these two files and produce a File3 output like this:

#4 and a row (2) http://example.com/post1
+1 hello post (5) http://example.com/post2
----> Not over yet (3) http://example.com/post12
10 Years After (6) http://example.com/post3
----> Can get it done (2) http://example.com/post24
21 & Over (8) http://example.com/post9
50_50 (1) http://example.com/post7
----> hear me loud (5) http://example.com/post258
Almost Christmas (3) http://example.com/post5

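----> means that this line of text is not in file1.

I hope I've explained this well enough. Please help if you can, as my Linux skills are lacking. Thank you! I hope someone can help me solve this.

~Cheers~

Solution from @RavinderSingh13: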

awk -v s1="---->" 'FNR==NR{a[$0]=$0;next} {val=$0;sub(/ http.*/,"",val);printf("%s\n",val in a?$0:s1 OFS $0)}' file1 file2

It works great.

【Comments】:

  • You could use WinMerge to compare the files visually, but you won't get the desired output.

Tags: bash awk diff


【Solution 1】:

Could you please try the following awk and let me know if it helps.

awk -v s1="---->" 'FNR==NR{a[$0]=$0;next} {val=$0;sub(/ http.*/,"",val);printf("%s\n",val in a?$0:s1 OFS $0)}' Input_file1  Input_file2

Adding a non-one-liner form of the solution as well.

awk -v s1="---->" '
FNR==NR{ a[$0]=$0;next }
{
  val=$0;
  sub(/ http.*/,"",val);
  printf("%s\n",val in a?$0:s1 OFS $0)
}
'  Input_file1   Input_file2
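
Both forms rely on the common awk two-file idiom: `NR` counts records across all inputs while `FNR` restarts at 1 for each file, so `FNR==NR` holds only while the first file is being read. A minimal sketch to illustrate this, using throwaway files `f1` and `f2` whose names and contents are made up and are not part of the question:

```shell
# Demo of the FNR==NR idiom; f1 and f2 are made-up sample files.
tmp=$(mktemp -d)
printf 'a\nb\n' > "$tmp/f1"
printf 'x\ny\nz\n' > "$tmp/f2"

# Label each record by whether FNR==NR holds, then show NR, FNR and the line.
idiom_out=$(awk '{
    which = (FNR == NR ? "first" : "second")
    print which, NR, FNR, $0
}' "$tmp/f1" "$tmp/f2")
printf '%s\n' "$idiom_out"
rm -rf "$tmp"
```

The records of `f1` are labelled `first` and those of `f2` are labelled `second`. One caveat: if the first file is empty, `FNR==NR` is also true for every record of the second file, so the idiom assumes a non-empty first file.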

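【Comments】:

  • It worked! Thank you, you're awesome! I tried your first awk and it works.
  • @pdku, glad it helped you. Please see stackoverflow.com/help/someone-answers for what to do when someone answers your question on SO, and keep learning and sharing :)
  • @pdku, once you reach 15 reputation you will be able to upvote helpful posts. Cheers and enjoy learning :)
【Solution 2】:

Awk solution: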

awk 'NR==FNR{ a[$0]; next }
     { 
         r = $0; m = "";
         sub(/ http:.*/, ""); 
         if ($0 in a) delete a[$0]; else m = "----> ";
         print m r 
     }' file1 file2

  • r = $0 - a variable holding the record currently being processed
  • m - a variable intended to hold the marker
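
One consequence of `delete a[$0]` is worth spelling out: once a title from file1 has been matched, it is removed from the array, so a second file2 line with the same title is marked as missing. The sketch below runs the same program on made-up data (the `post8` URL and the two tiny files are assumptions for illustration only):

```shell
# Sketch: what `delete a[$0]` does when file2 repeats a title.
# The file contents below are made up for illustration.
tmp=$(mktemp -d)
printf '50_50 (1)\n' > "$tmp/file1"
printf '50_50 (1) http://example.com/post7\n50_50 (1) http://example.com/post8\n' > "$tmp/file2"

dup_out=$(awk 'NR==FNR{ a[$0]; next }
     {
         r = $0; m = "";
         sub(/ http:.*/, "");
         if ($0 in a) delete a[$0]; else m = "----> ";
         print m r
     }' "$tmp/file1" "$tmp/file2")
printf '%s\n' "$dup_out"
rm -rf "$tmp"
```

The first `50_50 (1)` line is printed unmarked and the second one gets the `----> ` prefix. If repeated titles in file2 should all count as present, dropping the `delete` would be the likely change.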

Output:

#4 and a row (2) http://example.com/post1
+1 hello post (5) http://example.com/post2
----> Not over yet (3) http://example.com/post12
10 Years After (6) http://example.com/post3
----> Can get it done (2) http://example.com/post24
21 & Over (8) http://example.com/post9
50_50 (1) http://example.com/post7
----> hear me loud (5) http://example.com/post258
Almost Christmas (3) http://example.com/post5

【Comments】:

【Solution 3】:
    $ cat tst.awk
    NR==FNR {
        keys[$0]
        next
    }
    {
        key = $0
        sub(/ [^ ]+$/,"",key)
        print (key in keys ? "" : "----> ") $0
    }
    
    $ awk -f tst.awk file1 file2
    #4 and a row (2) http://example.com/post1
    +1 hello post (5) http://example.com/post2
    ----> Not over yet (3) http://example.com/post12
    10 Years After (6) http://example.com/post3
    ----> Can get it done (2) http://example.com/post24
    21 & Over (8) http://example.com/post9
    50_50 (1) http://example.com/post7
    ----> hear me loud (5) http://example.com/post258
    Almost Christmas (3) http://example.com/post5
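
This script strips the last space-separated field rather than matching a URL scheme, which matters if file2 ever contains `https` links. A small sketch comparing this pattern with the `/ http:.*/` pattern from solution 2, on a made-up `https` line (the question itself only shows `http` URLs):

```shell
# Sketch: compare the two URL-stripping patterns on an https line.
# The https URL is a made-up example; the question only shows http URLs.
line='21 & Over (8) https://example.com/post9'

# Pattern from solution 2: only matches " http:", so the https URL stays.
by_scheme=$(printf '%s\n' "$line" | awk '{ sub(/ http:.*/, ""); print }')
# Pattern from this answer: drops the last space-separated field.
by_field=$(printf '%s\n' "$line" | awk '{ sub(/ [^ ]+$/, ""); print }')

printf '%s\n%s\n' "$by_scheme" "$by_field"
```

The trade-off is that the last-field approach assumes every line of file2 ends in exactly one URL with no trailing whitespace. A line without a URL would lose its final word instead.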
    

【Comments】:

【Solution 4】:

How about a unified diff? For example:

      diff -u file1 <(awk 'NF--' file2)
      

Output:

      --- file1   2018-03-26 14:59:49.569347677 +0200
      +++ /proc/self/fd/11    2018-03-26 15:01:34.117800718 +0200
      @@ -1,6 +1,9 @@
       #4 and a row (2)
       +1 hello post (5)
      +Not over yet (3)
       10 Years After (6)
      +Can get it done (2)
       21 & Over (8)
       50_50 (1)
      +hear me loud (5)
       Almost Christmas (3)
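
Note that this output drops the URLs, and that `diff -u` shows only three lines of context around each change by default, so a longer file would not be reproduced in full. If only the titles missing from file1 are wanted, one possible variation (an assumption, not part of the answer above) is to use plain `diff` output and keep the lines it marks with `> `. The sketch uses a shortened version of the question's data and a temporary file instead of process substitution, so it does not depend on bash:

```shell
# Sketch: list titles present in file2 but not in file1.
# Sample data is a shortened version of the question's files.
tmp=$(mktemp -d)
printf '%s\n' '#4 and a row (2)' '10 Years After (6)' > "$tmp/file1"
printf '%s\n' '#4 and a row (2) http://example.com/post1' \
              'Not over yet (3) http://example.com/post12' \
              '10 Years After (6) http://example.com/post3' > "$tmp/file2"

# awk 'NF--' drops the last field (the URL); this relies on awk rebuilding
# the record when NF changes, as gawk does.
awk 'NF--' "$tmp/file2" > "$tmp/titles"

# diff exits with status 1 when the inputs differ, so ignore that status.
diff "$tmp/file1" "$tmp/titles" > "$tmp/diff.out" || true

# In the default diff format, lines only in the second input start with "> ".
missing=$(sed -n 's/^> //p' "$tmp/diff.out")
printf '%s\n' "$missing"
rm -rf "$tmp"
```

With the sample data above, the only title reported is the one that appears in file2 but not in file1.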
      

【Comments】:

  • Yes, this solution works well too. I wish I could upvote you.