Hive DML Queries - 爱码网

Aggregate Functions

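Common aggregate functions include max, min, sum, avg, and count. Queries that use these functions are executed as MapReduce jobs (when Hive runs on the MapReduce engine).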

-- Maximum, minimum, average, and total salary across the employee table
select max(sal),min(sal),avg(sal),sum(sal) from ruoze_emp;

Grouped Queries

Every column that appears in the select list must either appear in the group by clause or be wrapped in an aggregate function.

-- Average salary per department
select deptno,avg(sal) from ruoze_emp group by deptno;
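
As a rough sketch of the rule above: the commented-out query selects ename, which is neither grouped nor aggregated, so Hive is expected to reject it at compile time. The second query shows one possible fix. Using ename and count() here is only an illustration, assuming ruoze_emp has an ename column as the later queries suggest.

```sql
-- Expected to fail: ename is neither in group by nor inside an aggregate
-- select deptno, ename, avg(sal) from ruoze_emp group by deptno;

-- One possible fix: aggregate the extra column (or add it to group by)
select deptno, count(ename), avg(sal) from ruoze_emp group by deptno;
```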

-- Highest salary for each combination of department and job
select deptno,job,max(sal) from ruoze_emp group by deptno,job;

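-- Departments whose average salary is greater than 2000
select deptno,avg(sal) from ruoze_emp group by deptno having avg(sal) > 2000;
Note: a condition on the grouped result must be written with having. where cannot be used for this, because where filters rows before they are grouped.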
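
The two clauses can also appear in the same query. The sketch below is only an illustration: the row-level condition on job and the value 'MANAGER' are hypothetical, and avg_sal is an alias introduced here.

```sql
-- where filters individual rows before grouping;
-- having filters the groups produced by group by
select deptno, avg(sal) as avg_sal
from ruoze_emp
where job <> 'MANAGER'      -- hypothetical row-level condition
group by deptno
having avg(sal) > 2000;     -- group-level condition
```

Here the excluded rows never reach the aggregation, so the average is computed only over the remaining employees in each department.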

Conditional Queries

case when then works like if/else.

-- Assign a different rating to employees according to their salary
select ename, sal,
case 
when sal>1 and sal<=1000 then 'LOWER'
when sal>1000 and sal<=2000 then 'MIDDLE'
when sal>2000 and sal<=4000 then 'HIGH'
else 'HIGHEST' end
from ruoze_emp;
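
A case expression can also feed a grouped query. The following is a minimal sketch that counts employees per salary band by wrapping the same expression in a subquery. The names grade, emp_count, and t are introduced here purely for illustration.

```sql
-- Count how many employees fall into each salary band
select grade, count(1) as emp_count
from (
  select case
           when sal > 1 and sal <= 1000 then 'LOWER'
           when sal > 1000 and sal <= 2000 then 'MIDDLE'
           when sal > 2000 and sal <= 4000 then 'HIGH'
           else 'HIGHEST'
         end as grade
  from ruoze_emp
) t
group by grade;
```

Naming the expression in the subquery avoids repeating the whole case block in the group by clause.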

Union and Union All

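union all is commonly used to deal with data skew.
The usual approach is to split the skewed table into smaller tables, process each one separately, and then combine the partial results with union all to obtain the final data.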

select count(1) from ruoze_emp where deptno=10
union all
select count(1) from ruoze_emp where deptno=20;
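
For comparison, here is a sketch of how union differs from union all: union all keeps every row from both branches, while union removes duplicate rows from the combined result. This assumes Hive 1.2.0 or later, since earlier versions support only union all. Selecting the job column is just an illustrative choice.

```sql
-- union all: every job value from both departments, duplicates kept
select job from ruoze_emp where deptno = 10
union all
select job from ruoze_emp where deptno = 20;

-- union: the same rows with duplicates removed
select job from ruoze_emp where deptno = 10
union
select job from ruoze_emp where deptno = 20;
```

Because union has to deduplicate the combined rows, union all is usually the cheaper choice when duplicates are acceptable or impossible.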