How to compare the content of one file to another file line by line in a shell script? Answers

【Question title】: How to compare the content of one file to another file in a shell script, line by line?
【Posted】: 2022-01-11 11:30:53
【Question description】:

Content of file 1 (TOp.txt):

/
/boot
/home
/ptd
/ptd/tcd
/ptd/splunkforwarderdd
/ptd/sdt/pqr
/ptd/smr
/ptd/apps/ddas

Content of file 2 (POp.txt):

/
/boot
/home
/ptd
/ptd/tcd
/ptd/apps/ddas
/ptc/ddd

Output file 1 (Op1.txt):

/ptd/splunkforwarderdd
/ptd/sdt/pqr
/ptd/smr

Output file 2 (Op2.txt):

/ptc/ddd

Shell script:

while read linesT; do
    TOp=$linesT
      while read linesP; do
         POp=$linesp
         if [[ "$TOp" == "$POp"  ]]; then        
         a=cool
         else 
         echo $TOp
         fi
     done < POp.txt
  done < TOp.txt

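I did try the code above, but it does not work as expected.

I want every line of TOp.txt to be compared against every line of POp.txt, and the lines of TOp.txt that do not appear in POp.txt to be shown as output.

The same applies in the other direction for POp.txt.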

【Question discussion】:

diff file1 file2? sort+comm.
Perhaps this is an XY problem .... have you considered using sdiff?
Yes, I have tried both ways.

Tags: bash shell script

【Solution 1】:

comm compares two sorted files line by line.

Options of interest:

   -1     suppress column 1 (lines unique to FILE1)

   -2     suppress column 2 (lines unique to FILE2)

   -3     suppress column 3 (lines that appear in both files)

   --nocheck-order
          do not check that the input is correctly sorted

Answer:

comm --nocheck-order -23 TOp.txt POp.txt > Op1.txt
comm --nocheck-order -13 TOp.txt POp.txt > Op2.txt

Unsorted files may produce wrong results, so sort the input first:

$ comm -23 <(sort TOp.txt) <(sort POp.txt)
/ptd/sdt/pqr
/ptd/smr
/ptd/splunkforwarderdd

$ comm -13 <(sort TOp.txt) <(sort POp.txt)
/ptc/ddd
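
The sorted comm output above comes back in sorted order, while the expected Op1.txt in the question keeps the order of TOp.txt. If that order matters, one possible alternative (not part of the original answer) is grep with a pattern file. The sketch below is self-contained: it assumes a scratch directory created with mktemp and recreates the sample data from the question there.

```shell
#!/usr/bin/env bash
# Sketch: list the lines unique to each file while keeping their original order.
set -euo pipefail

workdir=$(mktemp -d)   # scratch directory so the sample does not touch real files
cd "$workdir"

# Sample data copied from the question.
printf '%s\n' / /boot /home /ptd /ptd/tcd /ptd/splunkforwarderdd \
    /ptd/sdt/pqr /ptd/smr /ptd/apps/ddas > TOp.txt
printf '%s\n' / /boot /home /ptd /ptd/tcd /ptd/apps/ddas /ptc/ddd > POp.txt

# -F: treat patterns as fixed strings, -x: match whole lines only,
# -v: print non-matching lines, -f FILE: read the patterns from FILE.
# grep exits with status 1 when it selects no lines; "|| true" keeps
# "set -e" from aborting in that case (it also hides real grep errors).
grep -Fxvf POp.txt TOp.txt > Op1.txt || true   # lines only in TOp.txt
grep -Fxvf TOp.txt POp.txt > Op2.txt || true   # lines only in POp.txt

cat Op1.txt
cat Op2.txt
```

With the sample data, Op1.txt contains /ptd/splunkforwarderdd, /ptd/sdt/pqr and /ptd/smr in that order, and Op2.txt contains /ptc/ddd. Unlike comm, this needs no sorting, but grep holds every line of the pattern file in memory as a pattern, so it is probably best suited to small and medium files.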

【Discussion】:

I have edited the data. It does not work with the data above.
It works now. The files had to be in unsorted form.

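【Solution 2】:

You should read each file through its own file descriptor. That said, sort/diff would be much better.

while IFS= read -r -u 3 linesT && IFS= read -r -u 4 linesP; do
  TOp=$linesT
  POp=$linesP        # was $linesp; shell variable names are case-sensitive
  if [[ "$TOp" == "$POp" ]]; then
     a=cool
  else
     echo "$TOp"
  fi
done 3< TOp.txt 4< POp.txt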
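
The loop above pairs line N of one file with line N of the other and stops at the end of the shorter file, so it only reports differences when the two files line up exactly. A minimal pure-bash sketch of a membership test instead, assuming bash 4 or later for associative arrays; `only_in` is a hypothetical helper name, not something from the original answer.

```shell
#!/usr/bin/env bash
# Sketch (assumes bash 4+ for associative arrays): print the lines of one file
# that do not occur anywhere in another file, keeping their original order.

# only_in FILE_A FILE_B -> lines of FILE_A that are absent from FILE_B.
# (only_in is a hypothetical helper name.)
only_in() {
    local -A seen=()
    local line
    # Remember every line of FILE_B as an array key.
    while IFS= read -r line || [[ -n $line ]]; do
        seen["x$line"]=1      # "x" prefix so that an empty line is a valid key
    done < "$2"
    # Print each line of FILE_A whose key was never stored.
    while IFS= read -r line || [[ -n $line ]]; do
        [[ -n ${seen["x$line"]+set} ]] || printf '%s\n' "$line"
    done < "$1"
}
```

It could be used as `only_in TOp.txt POp.txt > Op1.txt` and `only_in POp.txt TOp.txt > Op2.txt`. Each file is read once per call, which avoids the nested loop from the question.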

【Discussion】:

It shows the wrong o/p.

【Solution 3】:

Assuming the two files have the same number of lines, try the following:

#!/bin/bash

while IFS=$'\t' read -r top pop; do
    if [[ $top != "$pop" ]]; then       # quote the right-hand side so it is compared literally, not as a glob pattern
        echo "$top" >> Op1.txt
        echo "$pop" >> Op2.txt
    fi
done < <(paste <(sort TOp.txt) <(sort POp.txt))

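The paste ... command merges the two sorted files side by side, and each merged line is then split into the variables $top and $pop.

[Edit]
If the two files do not have the same number of lines, the awk solution is the better choice: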

awk 'NR==FNR {t[$0]++; next}                                    # memorize lines in "TOp.txt"
    {p[$0]++}                                                   # memorize lines in "POp.txt"
    END {
        for (i in t) if (!(i in p)) print i > "Op1.txt"         # lines only in "TOp.txt"
        for (i in p) if (!(i in t)) print i > "Op2.txt"         # lines only in "POp.txt"
    }
' TOp.txt POp.txt

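【Discussion】:

It does not work as expected.
Thanks for the feedback. I think the second version, with awk, will produce your expected result with the updated input files, won't it?
It does not give the same o/p.
It does. In what sense do you mean "not the same"?
I am trying it on different data. It works for all the other data, but not for this particular data :(. I cannot share the content here.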
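
The data in question was never shared, so the cause remains unknown. One frequent reason for lines that look identical but compare as different is invisible trailing whitespace or Windows (CRLF) line endings; this is an assumption, not something confirmed in the thread. A minimal sketch that strips trailing whitespace before comparing, with `normalize` and `only_in_first` as hypothetical helper names:

```shell
#!/usr/bin/env bash
# Sketch: compare two files after removing trailing whitespace, which also
# covers the carriage return left behind by Windows (CRLF) line endings.
# normalize and only_in_first are hypothetical helper names.

# Print FILE with trailing whitespace stripped from every line, sorted in the
# C locale so that sort and comm agree on the ordering.
normalize() {
    sed 's/[[:space:]]*$//' "$1" | LC_ALL=C sort
}

# Print the lines that occur only in the first of two files.
only_in_first() {
    LC_ALL=C comm -23 <(normalize "$1") <(normalize "$2")
}
```

For example, `only_in_first TOp.txt POp.txt > Op1.txt`. To check whether such characters are present at all, `od -c TOp.txt` shows carriage returns as `\r`.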

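【Solution 4】:

Sorry, I have not fully understood your algorithm. It seems to behave like nearest-neighbor sampling, but the use of the variable diff looks obscure to me.

A common way to process an image is to load all of its pixels into memory first. You can then access any pixel at random in the subsequent processing, and the code becomes more straightforward.

By the way, the input image you posted is too small to identify the problem, because very thin lines and small dots easily collapse during downscaling even when the algorithm is appropriate. I suggest evaluating with a larger image.

Here is my rewrite based on your code: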

#include <stdio.h>
#include <stdlib.h>

#include "bmp.h"

int main(int argc, char *argv[])
{
    // ensure proper usage
    if (argc != 4) {
        fprintf(stderr, "Usage: resize n infile outfile\n");
        return 1;
    }

    // read the scaling factor
    float f = atof(argv[1]);
    if(f <= 0 || f > 1) {
        fprintf(stderr, "f, the resize factor, must be between 0 and 1.\n");
        return 1;
    }
    char *infile = argv[2];
    char *outfile = argv[3];

    // open input file
    FILE *inptr = fopen(infile, "r");
    if (inptr == NULL) {
        fprintf(stderr, "Could not open %s.\n", infile);
        return 2;
    }

    // open output file
    FILE *outptr = fopen(outfile, "w");
    if (outptr == NULL) {
        fclose(inptr);
        fprintf(stderr, "Could not create %s.\n", outfile);
        return 3;
    }

    // read infile's BITMAPFILEHEADER
    BITMAPFILEHEADER bf;
    fread(&bf, sizeof(BITMAPFILEHEADER), 1, inptr);

    // read infile's BITMAPINFOHEADER
    BITMAPINFOHEADER bi;
    fread(&bi, sizeof(BITMAPINFOHEADER), 1, inptr);

    // ensure infile is (likely) a 24-bit uncompressed BMP 4.0
    if (bf.bfType != 0x4d42 || bf.bfOffBits != 54 || bi.biSize != 40 ||
        bi.biBitCount != 24 || bi.biCompression != 0) {
        fclose(outptr);
        fclose(inptr);
        fprintf(stderr, "Unsupported file format.\n");
        return 4;
    }

    BITMAPFILEHEADER bf_resize = bf;
    BITMAPINFOHEADER bi_resize = bi;
    bi_resize.biWidth = bi.biWidth * f;
    bi_resize.biHeight = bi.biHeight * f;
    int padding = bi.biWidth % 4;       // you can simplify the calculation
    int padding_resize = bi_resize.biWidth % 4;
    bi_resize.biSizeImage = (bi_resize.biWidth * sizeof(RGBTRIPLE) + padding_resize) * bi_resize.biHeight;
    bf_resize.bfSize = bi_resize.biSizeImage + sizeof(BITMAPFILEHEADER) + sizeof(BITMAPINFOHEADER);

    // allocate memory for the rgb triplets of the original (input) image
    RGBTRIPLE *pix = malloc(sizeof(RGBTRIPLE) * bi.biWidth * bi.biHeight);
    if (pix == NULL) {
        fprintf(stderr, "malloc failed.\n");
        return 5;
    }
    // temporary storage
    RGBTRIPLE triple;

    // read the entire pixels of the original image and store into the memory
    for (int i = 0; i < bi.biHeight; i++) {
        for (int j = 0; j < bi.biWidth; j++) {
            fread(&triple, sizeof(RGBTRIPLE), 1, inptr);
            pix[i * bi.biWidth + j] = triple;
        }
        // skip over padding, if any
        fseek(inptr, padding, SEEK_CUR);
    }

    // write outfile's header
    fwrite(&bf_resize, sizeof(BITMAPFILEHEADER), 1, outptr);
    fwrite(&bi_resize, sizeof(BITMAPINFOHEADER), 1, outptr);

    // write the pixels of destination (resized) image
    for (int i = 0; i < bi_resize.biHeight; i++) {
        for (int j = 0; j < bi_resize.biWidth; j++) {
            // calculate the corresponding coordinates in the original image
            int m = (i / f + 0.5);              // +0.5 for rounding
            if (m > bi.biHeight - 1) {          // limit the value
                m = bi.biHeight - 1;
            }
            int n = (j / f + 0.5);
            if (n > bi.biWidth - 1) {
                n = bi.biWidth - 1;
            }
            // pick the pixel value at the coordinate
            triple = pix[m * bi.biWidth + n];
            // write RGB triplet to outfile
            fwrite(&triple, sizeof(RGBTRIPLE), 1, outptr);
        }
        // padding for the output image, if any
        for (int j = 0; j < padding_resize; j++) {
            fputc(0x00, outptr);
        }
    }
    free(pix);
    fclose(inptr);
    fclose(outptr);

    return 0;
}

Input image: (not shown). Output image with f=0.5: (not shown).
