Comparing the data from one file to another and printing the output (answers)

【Question title】: Comparing the data from one file to another and print the output
【Posted】: 2021-07-07 05:39:01
【Question description】:

I have three files named File1, File2 and File3. Their contents are shown below.

File1:

    // Class of "A2" of type "ONE".
    // Class of "A3" of type "ONE".
    // Class of "D1" of type "TWO".
    // Class of "D2" of type "TWO".
    // Class of "D3" of type "FOUR".
    // Class of "D6" of type "FIVE".

File2:

@CLASS_NAMES = ("one",
    "two",
    "three",);
@CLASS_LIST_NAMES = ("ONE.A1",
    "ONE.A2",
    "ONE.A3",
    "TWO.D1",
    "TWO.D2");

File3:

D3
D4
D5

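I need to check whether the class "D3" from File1 is present in @CLASS_LIST_NAMES in File2. If it is not present in @CLASS_LIST_NAMES in File2, I then need to check whether D3 is present in File3. If D3 is present in File3, the output should be PASS; if it is present in neither File2 nor File3, the output should be FAIL.

In the same way, I need to check every class listed in File1 (A2, A3, D1, D2, ...) against @CLASS_LIST_NAMES. For any class that is not present in File2, I need to verify it against File3 and print the result as PASS or FAIL.

I tried the following code: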

#!/bin/bash

sed -n '/@CLASS_LIST_NAMES =/,/)/p' File2

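I am stuck here. Can anyone tell me what I need to do next?

Desired output: D6 from File1 is found in neither File2 nor File3, so it should be reported as FAIL. The output should look like this: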

Fail: D6 is not found

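【Comments】:

You can do this with a combination of if/else and grep...
Could you add an example of the desired output?
@AhmetSaidAkbulut I have edited the question; please see the desired output. Since D6 from File1 is present in neither File2 nor File3, the output should be Fail for D6.
@ADAPANIKHIL: Step 1: put all class names from File1 into an array. Step 2: put all class names from File2 into an array. Step 3: put all class names from File3 into an array. Step 4: process the arrays to implement your algorithm. Since you have to handle several classes, keeping all the data in memory is probably more efficient than scanning the same file again and again.
@user1934428 But how do I get the File1 and File2 data into arrays? The data changes randomly, so...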
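
The four steps suggested in the comment above could look roughly like this in bash. This is only a sketch under stated assumptions: `check_classes` is a hypothetical helper name, bash 4 or later is assumed for associative arrays, File2 is assumed to hold one quoted `TYPE.NAME` entry per line, and the `Pass:` line format is an assumption (the question only specifies the `Fail:` line).

```bash
#!/usr/bin/env bash
# Sketch only: check_classes is a hypothetical helper, not from the thread.
# Usage: check_classes File1 File2 File3
check_classes() {
    local file1=$1 file2=$2 file3=$3
    local -A known=()        # names found in @CLASS_LIST_NAMES (File2) or in File3
    local line name in_list=0
    local re_entry='"[^".]*\.([^"]+)"'                        # "ONE.A2" -> A2
    local re_class='Class[[:space:]]+of[[:space:]]+"([^"]+)"' # Class of "A2" -> A2

    # File2: only look at lines from @CLASS_LIST_NAMES up to the closing ");".
    while IFS= read -r line || [[ -n $line ]]; do
        [[ $line == *@CLASS_LIST_NAMES* ]] && in_list=1
        if (( in_list )); then
            [[ $line =~ $re_entry ]] && known["${BASH_REMATCH[1]}"]=1
            [[ $line == *\)\;* ]] && in_list=0
        fi
    done < "$file2"

    # File3: one class name per line.
    while IFS= read -r name || [[ -n $name ]]; do
        [[ -n $name ]] && known["$name"]=1
    done < "$file3"

    # File1: look up each class name in the combined set.
    while IFS= read -r line || [[ -n $line ]]; do
        [[ $line =~ $re_class ]] || continue
        name=${BASH_REMATCH[1]}
        if [[ -n ${known[$name]:-} ]]; then
            echo "Pass: $name"
        else
            echo "Fail: $name is not found"
        fi
    done < "$file1"
}
```

Because File2 and File3 are merged into one lookup table, a class passes if it appears in either file, which matches the rule in the question without scanning any file more than once.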

Tags: bash awk sed

【Solution 1】:

You can achieve this with grep and awk.

Using GNU grep, which supports the -P option:

awk 'NR==FNR{a[$0]; next} !($0 in a){print "Fail: "$0 " is not found"}' <(cat file3 <(grep -Po '(?<=\.)[^"]+' file2)) <(grep -Po '(?<=of ")\w+' file1)
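
For readers unfamiliar with the `NR==FNR` idiom in the command above, the awk part can be spelled out on its own. The following is a sketch with the same two rules, written as a hypothetical helper `report_missing` that takes two plain files with one name per line (the known names first, the names to check second).

```bash
# Sketch only: report_missing is a hypothetical helper name.
# Usage: report_missing KNOWN_NAMES_FILE NAMES_TO_CHECK_FILE
report_missing() {
    awk '
        # NR counts lines across all inputs; FNR restarts at 1 for each input.
        # NR == FNR is therefore true only while the first input is being read.
        NR == FNR { known[$0]; next }   # remember the name, skip the rule below

        # Second input: report every name that was never stored.
        !($0 in known) { print "Fail: " $0 " is not found" }
    ' "$1" "$2"
}
```

One caveat with this idiom: if the first input is empty, `NR == FNR` also holds while the second input is read, so nothing is reported.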

If you want to extract only the class names that appear inside the @CLASS_LIST_NAMES statement, use the following instead.

awk 'NR==FNR{a[$0]; next} !($0 in a){print "Fail: "$0 " is not found"}' <(cat file3 <(sed -n '/@CLASS_LIST_NAMES/,/;$/p' file2 | grep -Po '(?<=\.)[^"]+')) <(grep -Po '(?<=of ")\w+' file1)

If the amount of whitespace in file1 is inconsistent, you can let awk handle the field splitting:

# assumes the class name is the 4th whitespace-separated field; the input format must not change

awk 'NR==FNR{a[$0]; next} {gsub("\"","",$4)} !($4 in a){print "Fail: "$4" is not found"}' <(cat file3 <(sed -n '/@CLASS_LIST_NAMES/,/;$/p' file2 | grep -Po '(?<=\.)[^"]+')) file1

# alternative using FPAT (GNU awk only), for when the field position can change but the class name is still the first double-quoted token

awk 'NR==FNR{a[$0]; next} {gsub("\"","",$1)} !($1 in a){print "Fail: "$1" is not found"}' <(cat file3 <(sed -n '/@CLASS_LIST_NAMES/,/;$/p' file2 | grep -Po '(?<=\.)[^"]+')) FPAT="\"[^ \"]+" file1
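
The commands above depend on GNU extensions (`grep -P`, and `FPAT` in gawk). Where those are not available, the same check can probably be done in a single portable awk program. This is a sketch, not the answer's method: `report_failures` is a hypothetical helper name, the three files are assumed to have distinct paths, and File2 is assumed to hold one quoted `TYPE.NAME` entry per line.

```bash
# Sketch only: report_failures is a hypothetical helper name.
# Usage: report_failures File1 File2 File3
report_failures() {
    awk '
        # File3 (read first): one class name per line.
        FILENAME == ARGV[1] { if ($0 != "") known[$0]; next }

        # File2 (read second): only lines from @CLASS_LIST_NAMES up to the closing ");".
        FILENAME == ARGV[2] {
            if ($0 ~ /@CLASS_LIST_NAMES/) in_list = 1
            # match() finds the span from the dot to the closing quote, e.g. .A2"
            if (in_list && match($0, /\.[^"]+"/))
                known[substr($0, RSTART + 1, RLENGTH - 2)]
            if ($0 ~ /\);/) in_list = 0
            next
        }

        # File1 (read last): the class name is the first double-quoted token.
        split($0, part, "\"") >= 3 && !(part[2] in known) {
            print "Fail: " part[2] " is not found"
        }
    ' "$3" "$2" "$1"
}
```

Selecting rules by `FILENAME` rather than `NR==FNR` keeps the three inputs apart even when one of them is empty.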

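【Comments】:

Could you explain the code? What is it doing?
The code works fine for the given sample. Could you explain how you process only the relevant data in File2, since in File2 we only need to check CLASS_LIST_NAMES?
@ADAPANIKHIL I am extracting the part after the . in file2, which is the class name. grep turns "ONE.A2" into A2.
Is there a way to extract the part after the . only for CLASS_LIST_NAMES? The extraction should be performed only between the starting string @CLASS_LIST_NAMES = ( and the closing ).
sed -n '/@CLASS_LIST_NAMES/,/;$/p' file2 gives you the lines that belong to @CLASS_LIST_NAMES.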
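
To see what each stage contributes, the extraction can be tried on its own. The sketch below keeps the sed range from the last comment and, as a deliberate swap, replaces the `grep -Po` stage with a plain sed substitution so that it does not depend on GNU grep. `list_names` is a hypothetical helper name, and one quoted `TYPE.NAME` entry per line is assumed.

```bash
# Sketch only: list_names is a hypothetical helper name.
# Usage: list_names File2
list_names() {
    # Stage 1: keep only the lines from @CLASS_LIST_NAMES to the first line ending in ";".
    # Stage 2: reduce each line such as  "ONE.A2",  to the text between the dot
    #          and the closing quote; lines without that shape are dropped.
    sed -n '/@CLASS_LIST_NAMES/,/;$/p' "$1" |
        sed -n 's/.*\.\([^"]*\)".*/\1/p'
}
```

With the File2 sample from the question this should print A1, A2, A3, D1 and D2, one per line; the entries of @CLASS_NAMES are left out because they fall outside the range.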