Hive Advanced Aggregation: GROUPING SETS, CUBE, and ROLLUP

Date: 2019-07-31 10:56:43

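-- GROUPING SETS is a clause of GROUP BY that lets you list several grouping combinations in a single statement.
-- It can be thought of as several GROUP BY queries whose results are combined with UNION ALL.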
select 
     device_id
    ,os_id
    ,app_id
    ,count(user_id) 
from test_xinyan_reg 
group by device_id,os_id,app_id 
grouping sets((device_id),(os_id),(device_id,os_id),())
;
-- Equivalent to:
SELECT device_id,null AS os_id,null AS app_id,count(user_id) FROM test_xinyan_reg group by device_id UNION ALL 
SELECT null AS device_id,os_id,null AS app_id,count(user_id) FROM test_xinyan_reg group by os_id UNION ALL 
SELECT device_id,os_id,null AS app_id,count(user_id) FROM test_xinyan_reg group by device_id,os_id UNION ALL 
SELECT null AS device_id,null AS os_id,null AS app_id,count(user_id) FROM test_xinyan_reg
;

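-- CUBE (the "data cube") aggregates over every combination of the listed dimensions. cube(a,b,c) first groups by (a,b,c), then by (a,b), (a,c), (a), (b,c), (b), (c), and finally aggregates over the whole table, so it produces an aggregate for every subset of the selected columns (2^n grouping sets for n columns).
-- CUBE is therefore shorthand for the corresponding GROUPING SETS.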
select device_id,os_id,app_id,client_version,from_id,count(user_id)
from test_xinyan_reg
group by device_id,os_id,app_id,client_version,from_id with cube;
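
-- To make the "shorthand" concrete, here is a sketch of the expansion for a smaller case.
-- It assumes only three of the columns above (device_id, os_id, app_id) are cubed; the
-- five-column query expands the same way into 2^5 = 32 grouping sets.
-- "group by device_id,os_id,app_id with cube" should return the same rows as:
select device_id,os_id,app_id,count(user_id)
from test_xinyan_reg
group by device_id,os_id,app_id
grouping sets(
     (device_id,os_id,app_id)   -- all three dimensions
    ,(device_id,os_id)
    ,(device_id,app_id)
    ,(os_id,app_id)
    ,(device_id)
    ,(os_id)
    ,(app_id)
    ,()                         -- grand total over the whole table
);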


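-- ROLLUP produces hierarchical subtotals: it drops the grouping columns one at a time from right to left, giving an aggregate for each level of the hierarchy.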
select device_id,os_id,app_id,client_version,from_id,count(user_id)
from test_xinyan_reg
group by device_id,os_id,app_id,client_version,from_id with rollup;
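
-- As a sketch of what the rollup above expands to, the same result should be obtainable
-- with explicit grouping sets. Each set drops the rightmost remaining column, so five
-- columns yield 5 + 1 = 6 grouping sets:
select device_id,os_id,app_id,client_version,from_id,count(user_id)
from test_xinyan_reg
group by device_id,os_id,app_id,client_version,from_id
grouping sets(
     (device_id,os_id,app_id,client_version,from_id)   -- most detailed level
    ,(device_id,os_id,app_id,client_version)
    ,(device_id,os_id,app_id)
    ,(device_id,os_id)
    ,(device_id)
    ,()                                                -- grand total
);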

ref: https://blog.csdn.net/qq_31573519/article/details/89054136

Original: https://www.cnblogs.com/chenzechao/p/11273980.html
