【Posted】:2022-10-24 02:29:02
【Problem description】:
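I have a large ASCII file with 6 columns. The number of lines in the file is a multiple of 24: the fourth column is a timestamp in %Y%m%d%H%M format, so 24 lines make up 1 day. Each block of 24 lines belongs to a single measurement station, meaning that columns 1, 2, 5 and 6 have the same values across those 24 lines.
Here is a cut-down example of 2x24 lines, i.e. 2 different stations: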
1_200061208 0 0.000000 202202150000 36.680573 15.094369
1_200061208 0 0.000000 202202150100 36.680573 15.094369
1_200061208 0 -99999 202202150200 36.680573 15.094369
1_200061208 0 0.000000 202202150300 36.680573 15.094369
1_200061208 0 0.000000 202202150400 36.680573 15.094369
1_200061208 0 0.000000 202202150500 36.680573 15.094369
1_200061208 0 0.000000 202202150600 36.680573 15.094369
1_200061208 0 0.000000 202202150700 36.680573 15.094369
1_200061208 0 -99999 202202150800 36.680573 15.094369
1_200061208 0 0.000000 202202150900 36.680573 15.094369
1_200061208 0 0.000000 202202151000 36.680573 15.094369
1_200061208 0 0.000000 202202151100 36.680573 15.094369
1_200061208 0 0.000000 202202151200 36.680573 15.094369
1_200061208 0 0.000000 202202151300 36.680573 15.094369
1_200061208 0 0.000000 202202151400 36.680573 15.094369
1_200061208 0 0.000000 202202151500 36.680573 15.094369
1_200061208 0 0.000000 202202151600 36.680573 15.094369
1_200061208 0 0.000000 202202151700 36.680573 15.094369
1_200061208 0 0.000000 202202151800 36.680573 15.094369
1_200061208 0 0.000000 202202151900 36.680573 15.094369
1_200061208 0 0.000000 202202152000 36.680573 15.094369
1_200061208 0 0.000000 202202152100 36.680573 15.094369
1_200061208 0 0.000000 202202152200 36.680573 15.094369
1_200061208 0 0.000000 202202152300 36.680573 15.094369
1_200061190 0 0.000000 202202150000 36.728195 14.993018
1_200061190 0 0.000000 202202150100 36.728195 14.993018
1_200061190 0 0.000000 202202150200 36.728195 14.993018
1_200061190 0 0.000000 202202150300 36.728195 14.993018
1_200061190 0 0.000000 202202150400 36.728195 14.993018
1_200061190 0 0.000000 202202150500 36.728195 14.993018
1_200061190 0 0.000000 202202150600 36.728195 14.993018
1_200061190 0 0.000000 202202150700 36.728195 14.993018
1_200061190 0 0.000000 202202150800 36.728195 14.993018
1_200061190 0 0.000000 202202150900 36.728195 14.993018
1_200061190 0 0.000000 202202151000 36.728195 14.993018
1_200061190 0 0.000000 202202151100 36.728195 14.993018
1_200061190 0 0.000000 202202151200 36.728195 14.993018
1_200061190 0 0.000000 202202151300 36.728195 14.993018
1_200061190 0 0.000000 202202151400 36.728195 14.993018
1_200061190 0 -99999 202202151500 36.728195 14.993018
1_200061190 0 0.000000 202202151600 36.728195 14.993018
1_200061190 0 0.000000 202202151700 36.728195 14.993018
1_200061190 0 0.000000 202202151800 36.728195 14.993018
1_200061190 0 0.000000 202202151900 36.728195 14.993018
1_200061190 0 0.000000 202202152000 36.728195 14.993018
1_200061190 0 0.000000 202202152100 36.728195 14.993018
1_200061190 0 0.000000 202202152200 36.728195 14.993018
1_200061190 0 0.000000 202202152300 36.728195 14.993018
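My goal is to check whether -99999 occurs more than once in the third column for the same station (columns 1, 2, 5, 6) within one day (24 lines). If it does, I want to delete all 24 lines (in other words, I want to remove that station's entire measurement day).
The expected output is the same file without the 24xn lines that satisfy my check.
For the example given, the expected output is: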
1_200061190 0 0.000000 202202150000 36.728195 14.993018
1_200061190 0 0.000000 202202150100 36.728195 14.993018
1_200061190 0 0.000000 202202150200 36.728195 14.993018
1_200061190 0 0.000000 202202150300 36.728195 14.993018
1_200061190 0 0.000000 202202150400 36.728195 14.993018
1_200061190 0 0.000000 202202150500 36.728195 14.993018
1_200061190 0 0.000000 202202150600 36.728195 14.993018
1_200061190 0 0.000000 202202150700 36.728195 14.993018
1_200061190 0 0.000000 202202150800 36.728195 14.993018
1_200061190 0 0.000000 202202150900 36.728195 14.993018
1_200061190 0 0.000000 202202151000 36.728195 14.993018
1_200061190 0 0.000000 202202151100 36.728195 14.993018
1_200061190 0 0.000000 202202151200 36.728195 14.993018
1_200061190 0 0.000000 202202151300 36.728195 14.993018
1_200061190 0 0.000000 202202151400 36.728195 14.993018
1_200061190 0 -99999 202202151500 36.728195 14.993018
1_200061190 0 0.000000 202202151600 36.728195 14.993018
1_200061190 0 0.000000 202202151700 36.728195 14.993018
1_200061190 0 0.000000 202202151800 36.728195 14.993018
1_200061190 0 0.000000 202202151900 36.728195 14.993018
1_200061190 0 0.000000 202202152000 36.728195 14.993018
1_200061190 0 0.000000 202202152100 36.728195 14.993018
1_200061190 0 0.000000 202202152200 36.728195 14.993018
1_200061190 0 0.000000 202202152300 36.728195 14.993018
Please give me the code.
【Comments】:
-
What have you tried? Where are you stuck? Please see How to Ask and the tour.
-
I used an associative array, but I can only print the stations where the number of occurrences is greater than 1.
awk '($3 =="-99999") {a[$1 FS $2 FS $5 FS $6]++} END {for (i in a) {if (a[i] >1) print i,a[i] }}' filename.txt
-
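If that big block of text is your sample input, then what is the expected output? Don't add information in comments, where it can't be formatted and may be missed - edit your question to include all relevant information.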
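-
One possible way to go from counting to actually dropping the lines is to read the file twice with awk: the first pass counts the -99999 values per station and day, the second pass prints only the lines whose station/day count is at most 1. This is only a sketch under a few assumptions that are not stated in the question: fields are whitespace-separated, the first 8 characters of column 4 (%Y%m%d) identify the day, and the input is a regular file that can be read twice (not a pipe). The function name `filter_days` is made up for illustration.

```shell
# filter_days FILE (hypothetical helper name)
# Prints FILE without the station/day groups in which column 3
# equals -99999 more than once.
# Assumptions: whitespace-separated fields; the first 8 characters
# of column 4 are the date; FILE can be read twice.
filter_days() {
    awk '
        # Group key: station (columns 1, 2, 5, 6) plus the day part of column 4.
        { key = $1 FS $2 FS $5 FS $6 FS substr($4, 1, 8) }

        # First pass (NR == FNR): count missing values per group, print nothing.
        NR == FNR { if ($3 == "-99999") bad[key]++; next }

        # Second pass: keep the line unless its group has more than one -99999.
        bad[key] <= 1
    ' "$1" "$1"
}

# Usage (file names are placeholders):
#   filter_days filename.txt > filtered.txt
```

Because the key includes the day, the lines of a group do not have to be contiguous, and the input order of the surviving lines is preserved.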