【发布时间】:2020-02-06 01:00:18
【问题描述】:
我在awk 脚本中遇到问题 - 我需要生成一份报告,其中包含数据文件中每个作业的最低、最高和平均分数。分配的名称位于column 3。
输入数据是:
Student,Catehory,Assignment,Score,Possible
Chelsey,Homework,H01,90,100
Chelsey,Homework,H02,89,100
Chelsey,Homework,H03,77,100
Chelsey,Homework,H04,80,100
Chelsey,Homework,H05,82,100
Chelsey,Homework,H06,84,100
Chelsey,Homework,H07,86,100
Chelsey,Lab,L01,91,100
Chelsey,Lab,L02,100,100
Chelsey,Lab,L03,100,100
Chelsey,Lab,L04,100,100
Chelsey,Lab,L05,96,100
Chelsey,Lab,L06,80,100
Chelsey,Lab,L07,81,100
Chelsey,Quiz,Q01,100,100
Chelsey,Quiz,Q02,100,100
Chelsey,Quiz,Q03,98,100
Chelsey,Quiz,Q04,93,100
Chelsey,Quiz,Q05,99,100
Chelsey,Quiz,Q06,88,100
Chelsey,Quiz,Q07,100,100
Chelsey,Final,FINAL,82,100
Chelsey,Survey,WS,5,5
Sam,Homework,H01,19,100
Sam,Homework,H02,82,100
Sam,Homework,H03,95,100
Sam,Homework,H04,46,100
Sam,Homework,H05,82,100
Sam,Homework,H06,97,100
Sam,Homework,H07,52,100
Sam,Lab,L01,41,100
Sam,Lab,L02,85,100
Sam,Lab,L03,99,100
Sam,Lab,L04,99,100
Sam,Lab,L05,0,100
Sam,Lab,L06,0,100
Sam,Lab,L07,0,100
Sam,Quiz,Q01,91,100
Sam,Quiz,Q02,85,100
Sam,Quiz,Q03,33,100
Sam,Quiz,Q04,64,100
Sam,Quiz,Q05,54,100
Sam,Quiz,Q06,95,100
Sam,Quiz,Q07,68,100
Sam,Final,FINAL,58,100
Sam,Survey,WS,5,5
Andrew,Homework,H01,25,100
Andrew,Homework,H02,47,100
Andrew,Homework,H03,85,100
Andrew,Homework,H04,65,100
Andrew,Homework,H05,54,100
Andrew,Homework,H06,58,100
Andrew,Homework,H07,52,100
Andrew,Lab,L01,87,100
Andrew,Lab,L02,45,100
Andrew,Lab,L03,92,100
Andrew,Lab,L04,48,100
Andrew,Lab,L05,42,100
Andrew,Lab,L06,99,100
Andrew,Lab,L07,86,100
Andrew,Quiz,Q01,25,100
Andrew,Quiz,Q02,84,100
Andrew,Quiz,Q03,59,100
Andrew,Quiz,Q04,93,100
Andrew,Quiz,Q05,85,100
Andrew,Quiz,Q06,94,100
Andrew,Quiz,Q07,58,100
Andrew,Final,FINAL,99,100
Andrew,Survey,WS,5,5
Ava,Homework,H01,55,100
Ava,Homework,H02,95,100
Ava,Homework,H03,84,100
Ava,Homework,H04,74,100
Ava,Homework,H05,95,100
Ava,Homework,H06,84,100
Ava,Homework,H07,55,100
Ava,Lab,L01,66,100
Ava,Lab,L02,77,100
Ava,Lab,L03,88,100
Ava,Lab,L04,99,100
Ava,Lab,L05,55,100
Ava,Lab,L06,66,100
Ava,Lab,L07,77,100
Ava,Quiz,Q01,88,100
Ava,Quiz,Q02,99,100
Ava,Quiz,Q03,44,100
Ava,Quiz,Q04,55,100
Ava,Quiz,Q05,66,100
Ava,Quiz,Q06,77,100
Ava,Quiz,Q07,88,100
Ava,Final,FINAL,99,100
Ava,Survey,WS,5,5
Shane,Homework,H01,50,100
Shane,Homework,H02,60,100
Shane,Homework,H03,70,100
Shane,Homework,H04,60,100
Shane,Homework,H05,70,100
Shane,Homework,H06,80,100
Shane,Homework,H07,90,100
Shane,Lab,L01,90,100
Shane,Lab,L02,0,100
Shane,Lab,L03,100,100
Shane,Lab,L04,50,100
Shane,Lab,L05,40,100
Shane,Lab,L06,60,100
Shane,Lab,L07,80,100
Shane,Quiz,Q01,70,100
Shane,Quiz,Q02,90,100
Shane,Quiz,Q03,100,100
Shane,Quiz,Q04,100,100
Shane,Quiz,Q05,80,100
Shane,Quiz,Q06,80,100
Shane,Quiz,Q07,80,100
Shane,Final,FINAL,90,100
Shane,Survey,WS,5,5
awk 脚本:
BEGIN {
FS=" *\\, *"
}
FNR>1 {
min[$3]=(!($3 in min) || min[$3]> $4 )? $4 : min[$3]
max[$3]=(max[$3]> $4)? max[$3] : $4
cnt[$3]++
sum[$3]+=$4
}
END {
print "Name\tLow\tHigh\tAverage"
for (i in cnt)
printf("%s\t%d\t%d\t%.1f\n", i, min[i], max[i], sum[i]/cnt[i])
}
预期样本输出:
Name Low High Average
Q06 77 95 86.80
L05 40 96 46.60
WS 5 5 5
Q07 58 100 78.80
L06 60 99 61
L07 77 86 64.80
当我运行脚本时,我得到的所有不正确分配的“低”为 0。我哪里错了?请指导。
【问题讨论】:
-
在此处显示您的示例数据和脚本。
-
请不要发布输入和预期输出示例的图像或链接,请您在问题中将它们作为带有代码标签的文本发布,然后让我们知道。
-
已被其他用户编辑。
-
使用 GNU awk,我从您发布的代码中得到的结果与从我的 datamash 答案中得到的结果相同,顺便说一句(在使其按排序顺序而不是随机打印分配之后)。你的工作正常。
-
@Shawn ,您的意思是我在上面发布的 awk 脚本按预期工作吗?当我运行它时,我看到所有分配的“低”为 0。您能否分享产生正确输出的 awk 代码,以便我纠正我的错误?