【Posted】: 2018-04-17 22:10:29
【Question】:
I have the following data:
primary,first,second
1,393440.09,354096.08
1,4410533.33,3969479.99
1,-4803973.41,-4323576.07
I need to aggregate the data and sum the first and second columns. This is the script I am running:
data_load = LOAD '<filelocation>' USING org.apache.pig.piggybank.storage.CSVExcelStorage(',', 'NO_MULTILINE', 'NOCHANGE', 'SKIP_INPUT_HEADER') AS (primary:double, first:double, second:double);
dataAgrr = GROUP data_load BY primary;
sumData = FOREACH dataAgrr GENERATE
    group AS data,
    SUM(data_load.first) AS first,
    SUM(data_load.second) AS second,
    SUM(data_load.primary) AS primary;
Running it produces the following output:
(1.0,0.009999999951105565,-5.820766091346741E-11,3.0)
However, when we add up the second column by hand (354096.08, 3969479.99, -4323576.07), we get 0.
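The gap between Pig's result and the manual sum is consistent with binary floating-point rounding: a Pig double is an IEEE 754 double, which cannot represent most two-decimal values exactly, so tiny errors survive the addition. The following is a minimal Python sketch, not Pig, that sums both columns once as doubles and once as exact decimals. The helper names sum_as_double and sum_as_decimal are illustrative assumptions, and the exact double result can differ from Pig's depending on the order in which values are added.

```python
from decimal import Decimal

# Column values from the question, kept as strings so Decimal sees the exact text.
first = ["393440.09", "4410533.33", "-4803973.41"]
second = ["354096.08", "3969479.99", "-4323576.07"]


def sum_as_double(values):
    """Sum the values as IEEE 754 doubles (the representation of a Pig double)."""
    total = 0.0
    for v in values:
        total += float(v)
    return total


def sum_as_decimal(values):
    """Sum the values as exact base-10 decimals, with no binary rounding."""
    return sum((Decimal(v) for v in values), Decimal("0"))


for name, column in (("first", first), ("second", second)):
    print(name, "as double:", sum_as_double(column))
    print(name, "as decimal:", sum_as_decimal(column))
    # Rounding the double total to two places recovers the expected value.
    print(name, "double rounded:", round(sum_as_double(column), 2))
```

If exact totals matter, one option is to round the final sums to two decimal places. Pig 0.12 and later also provide a bigdecimal type; whether the loader used in this script handles it has not been verified here.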
【Discussion】:
Tags: hadoop apache-pig