Replace every two occurrences of a string in separate files using line ranges from another file

【Question title】: Replace each 2 nth occurs from a string in separate files using line range from another file
【Posted】: 2021-12-20 14:53:46
【Question description】:

I have three files:

0.txt and 0-1.txt, which both have the following content:

"#sun\t",
"car_snif = house.group_tree(home_cool)\t",
"machine(shoes_shirt.shop)\t",
"car_snif = house.group_tree(home_cool)\t",
"machine(shoes_shirt.shop)\t",
"car_snif = house.group_tree(home_cool)\t",
"machine(shoes_shirt.shop)\t",
"car_snif = house.group_tree(home_cool)\t",
"machine(shoes_shirt.shop)\t",
"#sun\t",
"car_snif = house.group_tree(home_cool)\t",
"machine(shoes_shirt.shop)\t",
"car_snif = house.group_tree(home_cool)\t",
"machine(shoes_shirt.shop)\t",
"car_snif = house.group_tree(home_cool)\t",
"machine(shoes_shirt.shop)\t",
"car_snif = house.group_tree(home_cool)\t",

and the source file 1.txt below:

(food, apple,)(bag, tortoise,)
(sky, cat,)(sun, sea,)
(car, shape)(milk, market,)
(man, shirt)(hair, life)
(dog, big)(bal, pink)

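For 0.txt, I want every 2 consecutive occurrences of home_cool to be replaced by 1 line of 1.txt, using only the lines of 1.txt up to the second one (that is, sed -n '1,2p') and cycling through them, so that my output 2.txt looks like this: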

"#sun\t",
"car_snif = house.group_tree((food, apple,)(bag, tortoise,))\t",
"machine(shoes_shirt.shop)\t",
"car_snif = house.group_tree((food, apple,)(bag, tortoise,))\t",
"machine(shoes_shirt.shop)\t",
"car_snif = house.group_tree((sky, cat,)(sun, sea,))\t",
"machine(shoes_shirt.shop)\t",
"car_snif = house.group_tree((sky, cat,)(sun, sea,))\t",
"machine(shoes_shirt.shop)\t",
"#sun\t",
"car_snif = house.group_tree((food, apple,)(bag, tortoise,))\t",
"machine(shoes_shirt.shop)\t",
"car_snif = house.group_tree((food, apple,)(bag, tortoise,))\t",
"machine(shoes_shirt.shop)\t",
"car_snif = house.group_tree((sky, cat,)(sun, sea,))\t",
"machine(shoes_shirt.shop)\t",
"car_snif = house.group_tree((sky, cat,)(sun, sea,))\t",

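Once 2.txt is done, I want to do the same for 0-1.txt: replace every 2 consecutive occurrences of home_cool with 1 line of 1.txt, this time using the lines of 1.txt from the third one onward (that is, sed -n '3,5p'), so that my output 3.txt looks like this: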

"#sun\t",
"car_snif = house.group_tree((car, shape)(milk, market,))\t",
"machine(shoes_shirt.shop)\t",
"car_snif = house.group_tree((car, shape)(milk, market,))\t",
"machine(shoes_shirt.shop)\t",
"car_snif = house.group_tree((man, shirt)(hair, life))\t",
"machine(shoes_shirt.shop)\t",
"car_snif = house.group_tree((man, shirt)(hair, life))\t",
"machine(shoes_shirt.shop)\t",
"#sun\t",
"car_snif = house.group_tree((dog, big)(bal, pink))\t",
"machine(shoes_shirt.shop)\t",
"car_snif = house.group_tree((dog, big)(bal, pink))\t",
"machine(shoes_shirt.shop)\t",
"car_snif = house.group_tree((car, shape)(milk, market,))\t",
"machine(shoes_shirt.shop)\t",
"car_snif = house.group_tree((car, shape)(milk, market,))\t",

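With the line below I can replace home_cool in 0.txt in two steps (the first with sed -n '1,2p' and the second with sed -n '3,5p'). But I want the result of the first step saved in 2.txt and the result of the second step saved in 3.txt: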

awk 'NR==FNR {a[NR]=$0; n=NR; next}/home_cool/ { gsub("home_cool", a[int((++i-1)%(n*2)/2)+1])}1' <(cat 1.txt | tee >(sed -n '1,2p') >(sed -n '3,5p')) 0.txt >> 2.txt

So what I really want is (pseudocode below):

awk 'NR==FNR {a[NR]=$0; n=NR; next}/home_cool/ { gsub("home_cool", a[int((++i-1)%(n*2)/2)+1])}1' <(cat 1.txt | tee >(sed -n '1,2p') >(sed -n '3,5p')) | "to sed -n '1,2p' make" 0.txt >> 2.txt | "to sed -n '3,5p' make" 0-1.txt >> 3.txt

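How can I do this while keeping a single command line, without breaking it up into several isolated awk snippets?

Note: the title of the question should probably be "multiple inputs, same process, different outputs".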

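【Question discussion】:

Related, I assume: stackoverflow.com/q/69867423/4162356 Do "2 nth" and "1 nth" mean 2nd and 1st?
@JamesBrown Yes, it is related, but I believe it is not enough for this new question.
@Cyrus I think that may be unfair. Did they really delete questions that had answers? I only saw them delete one question, an earlier version of this one, because the problem was hard to express clearly, all the more so for someone for whom English is (apparently) a second language. This version describes the problem more clearly. The OP should have edited their previous question rather than deleting and reposting it, and perhaps had it reviewed by a colleague or friend first.
@dan I agree with Cyrus. What happened is that I had already deleted the old question; the right thing now is to keep this new one and, if necessary, edit it thoroughly.
@dan: Someone deleted my comment. Example of a question that had answers.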

Tags: awk stdout stdin gsub tee

【Solution 1】:

This works:

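awk \
'# f counts the input files: 1 = 1.txt, 2 = 0.txt, 3 = 0-1.txt
FNR==1 {++f}
# 1.txt: store its lines in a[0]..a[4]
f==1 {a[i++]=$0}
# 0.txt: each pair of matches takes the next of a[0], a[1], cycling
f==2 {if ($0~/home_cool/) {gsub(/home_cool/, a[int(j++/2)%2]) }; print > "2.txt"}
# 0-1.txt: each pair of matches takes the next of a[2], a[3], a[4], cycling
f==3 {if ($0~/home_cool/) {gsub(/home_cool/, a[int(k++/2)%3 + 2]) }; print > "3.txt"}' \
    1.txt 0.txt 0-1.txt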
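
To see why those two index expressions pick the intended lines, it can help to print the indices on their own. The following is only an illustrative sketch, not part of the answer: it assumes eight matches per file, as in the sample data, and the shell variable names first and second are made up for the demonstration.

```shell
# Indices used for 0.txt: int(j/2) groups the matches in pairs,
# and %2 cycles over a[0] and a[1].
first=$(awk 'BEGIN { for (j = 0; j < 8; j++) printf "%s%d", (j ? " " : ""), int(j/2) % 2 }')

# Indices used for 0-1.txt: %3 cycles over three lines,
# and +2 skips a[0] and a[1], which belong to the first range.
second=$(awk 'BEGIN { for (k = 0; k < 8; k++) printf "%s%d", (k ? " " : ""), int(k/2) % 3 + 2 }')

echo "$first"    # 0 0 1 1 0 0 1 1
echo "$second"   # 2 2 3 3 4 4 2 2
```

Since the array is filled with a[i++] starting at 0, index 0 is the first line of 1.txt and index 4 is the fifth.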

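Alternatives to hard-coding "2.txt" and "3.txt" include:

Using variables assigned with -v outfile1=2.txt -v outfile2=3.txt
Replacing them with outfile, and using this argument list: 1.txt outfile=2.txt 0.txt outfile=3.txt 0-1.txt
Replacing them with ARGV[4] and ARGV[5], adding an f==4 {exit} line, and using this argument list: 1.txt 0.txt 0-1.txt 2.txt 3.txt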

Caveats:

If a given file is empty, it does not cause f to be incremented, and the script breaks accordingly. In gawk, ENDFILE can be used instead. See this answer: How to get the filenumber that is being processing by an awk script?

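【Discussion】:

It works exactly as expected; I will read the documentation referenced in your reply. Many thanks to user markp-fuso. I really will avoid deleting questions; I often think they have no chance of being answered, and I end up deleting them to buy time and rethink my ideas. Thanks also to user Cyrus and to the other users on the board.
Maybe this belongs in chat, but I tried to use system() to run a shell command before the f==3 block prints 3.txt; the only problem is that the shell command runs in a loop and gets repeated. Should I open a new question?
@7beggars_nnnnm You can put if (FNR==1) {system("#")} as the first command inside the f==3 block.
I tested it and it works fine, even when I add a second shell command before a third replacement pass that writes a new output 4.txt. See onlinegdb.com/brgArNpwF. If you want, you can answer here: stackoverflow.com/questions/69877886/….
See 3.sh at onlinegdb.com/brgArNpwF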

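【Solution 2】:

Referring to the OP's previous Q&A ...

While we could certainly modify the previous Q&A (the accepted answer) to perform these split operations across the various inputs, I would opt for simplicity over complexity by breaking this new question into two separate operations, e.g.: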

awk '... from previous Q&A ...' <(head -n 2 1.txt) 0.txt   > 2.txt
awk '... from previous Q&A ...' <(tail -n +3 1.txt) 0-1.txt > 3.txt

Removing the unnecessary /house_cool01/{...} line of code, this becomes:

awk 'NR==FNR {a[NR]=$0; n=NR; next} /home_cool/ {gsub("home_cool", a[int((i++)%(n*2)/2)+1] )} 1' <(head -n 2 1.txt) 0.txt   > 2.txt
awk 'NR==FNR {a[NR]=$0; n=NR; next} /home_cool/ {gsub("home_cool", a[int((i++)%(n*2)/2)+1] )} 1' <(tail -n +3 1.txt) 0-1.txt > 3.txt
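
Because both commands run the same awk program, one possible way to avoid repeating it is to wrap it in a shell function. The sketch below is an assumption-laden variation, not the answer's own code: fill_pairs is a hypothetical helper name, the replacement lines are read from standard input (the - argument) instead of a process substitution so that it also runs in a plain POSIX shell, and 0.txt is a made-up reduced sample with five matching lines. Run it in a scratch directory.

```shell
# 1.txt as in the question; 0.txt and 0-1.txt are a reduced, made-up sample.
printf '%s\n' '(food, apple,)(bag, tortoise,)' '(sky, cat,)(sun, sea,)' \
  '(car, shape)(milk, market,)' '(man, shirt)(hair, life)' '(dog, big)(bal, pink)' > 1.txt
for n in 1 2 3 4 5; do printf '%s\n' 'house.group_tree(home_cool)'; done > 0.txt
cp 0.txt 0-1.txt

# fill_pairs (hypothetical helper): replacement lines on stdin, template file as $1.
# n is the number of replacement lines; each line is used for two matches in a row,
# then the sequence wraps around after n*2 matches.
fill_pairs() {
  awk 'NR==FNR {a[NR]=$0; n=NR; next}
       /home_cool/ {gsub("home_cool", a[int((i++)%(n*2)/2)+1])}
       1' - "$1"
}

head -n 2 1.txt  | fill_pairs 0.txt   > 2.txt
tail -n +3 1.txt | fill_pairs 0-1.txt > 3.txt
```

This is still two awk invocations, so it keeps the simplicity argued for here while removing the duplicated program text.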

These generate:

$ cat 2.txt
"#sun\t",
"car_snif = house.group_tree((food, apple,)(bag, tortoise,))\t",
"machine(shoes_shirt.shop)\t",
"car_snif = house.group_tree((food, apple,)(bag, tortoise,))\t",
"machine(shoes_shirt.shop)\t",
"car_snif = house.group_tree((sky, cat,)(sun, sea,))\t",
"machine(shoes_shirt.shop)\t",
"car_snif = house.group_tree((sky, cat,)(sun, sea,))\t",
"machine(shoes_shirt.shop)\t",
"#sun\t",
"car_snif = house.group_tree((food, apple,)(bag, tortoise,))\t",
"machine(shoes_shirt.shop)\t",
"car_snif = house.group_tree((food, apple,)(bag, tortoise,))\t",
"machine(shoes_shirt.shop)\t",
"car_snif = house.group_tree((sky, cat,)(sun, sea,))\t",
"machine(shoes_shirt.shop)\t",
"car_snif = house.group_tree((sky, cat,)(sun, sea,))\t",

$ cat 3.txt
"#sun\t",
"car_snif = house.group_tree((car, shape)(milk, market,))\t",
"machine(shoes_shirt.shop)\t",
"car_snif = house.group_tree((car, shape)(milk, market,))\t",
"machine(shoes_shirt.shop)\t",
"car_snif = house.group_tree((man, shirt)(hair, life))\t",
"machine(shoes_shirt.shop)\t",
"car_snif = house.group_tree((man, shirt)(hair, life))\t",
"machine(shoes_shirt.shop)\t",
"#sun\t",
"car_snif = house.group_tree((dog, big)(bal, pink))\t",
"machine(shoes_shirt.shop)\t",
"car_snif = house.group_tree((dog, big)(bal, pink))\t",
"machine(shoes_shirt.shop)\t",
"car_snif = house.group_tree((car, shape)(milk, market,))\t",
"machine(shoes_shirt.shop)\t",
"car_snif = house.group_tree((car, shape)(milk, market,))\t",

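【Discussion】:

But this splits the work into several awk blocks, which is exactly what I am not looking for.
Relatively clean/simple implementation; easier to understand/maintain than an alternative, possibly convoluted solution; KISS principle; can you update the question to explain why you need a single awk call?
I am also trying to come up with a solution, and at the same time trying to make the question clearer. Thanks.
Thanks! I am turning all the questions I will need to ask in the future into a pattern, to simplify the work as much as possible, rather than asking very fragmented questions from the real context.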