首页 > 其他 > 详细

[HIVE] rank() dense_rank() row_number()的学习

时间:2020-07-19 00:26:02      阅读:51      评论:0      收藏:0      [点我收藏+]

在对数据进行汇总和分析的时候,经常需要用到排名相关的操作,下面是hive中经常用到的3个排名函数:
rank()
dense_rank()
row_number()

函数说明:
rank():在一组数据内按顺序显示排名顺序,值相同的情况下,排序数会重复,下个排序数会根据记录数接着排名。
dense_rank():在一组数据内按顺序显示排名顺序,值相同的情况下,排序数会重复,下个排序数会根据排名数接着排名。
row_number():在一组数据内按顺序显示排名顺序,值相同的情况下,排序数不会重复。

实际操作:
booksales表中有分类(catalog)字段,价格(price)字段,现在想查询一下每个分类(catalog)下价格(price)有高到低的排名情况

排名前数据查询:
hive> select * from booksales;
OK
bookid catalog price amount dt
B00000001 novel 54.0 5 20200718
B00000002 novel 33.0 9 20200718
B00000003 novel 54.0 11 20200718
B00000004 novel 59.0 10 20200718
B00000005 novel 53.0 4 20200718
B00000006 literature 24.0 10 20200718
B00000007 literature 35.0 8 20200718
B00000008 literature 35.0 4 20200718
B00000009 literature 35.0 7 20200718
B00000010 literature 35.0 9 20200718
B00000011 management 84.0 6 20200718
B00000012 management 73.0 7 20200718
B00000013 management 84.0 5 20200718
B00000014 management 52.0 11 20200718
B00000015 management 87.0 11 20200718
B00000016 management 45.0 7 20200718
B00000017 social 39.0 1 20200718
B00000018 social 16.0 9 20200718
B00000019 social 85.0 9 20200718
B00000020 social 92.0 4 20200718

排名后数据查询:
hive> select
> *
> ,rank() over (partition by catalog order by price desc) price_rank
> ,dense_rank() over (partition by catalog order by price desc) price_dense_rank
> ,row_number() over (partition by catalog order by price desc) price_row_number
> from booksales;
OK
bookid catalog price amount dt price_rank price_dense_rank price_row_number
B00000008 literature 35.0 4 20200718 1 1 1
B00000007 literature 35.0 8 20200718 1 1 2
B00000010 literature 35.0 9 20200718 1 1 3
B00000009 literature 35.0 7 20200718 1 1 4
B00000006 literature 24.0 10 20200718 5 2 5
B00000015 management 87.0 11 20200718 1 1 1
B00000013 management 84.0 5 20200718 2 2 2
B00000011 management 84.0 6 20200718 2 2 3
B00000012 management 73.0 7 20200718 4 3 4
B00000014 management 52.0 11 20200718 5 4 5
B00000016 management 45.0 7 20200718 6 5 6
B00000004 novel 59.0 10 20200718 1 1 1
B00000001 novel 54.0 5 20200718 2 2 2
B00000003 novel 54.0 11 20200718 2 2 3
B00000005 novel 53.0 4 20200718 4 3 4
B00000002 novel 33.0 9 20200718 5 4 5
B00000020 social 92.0 4 20200718 1 1 1
B00000019 social 85.0 9 20200718 2 2 2
B00000017 social 39.0 1 20200718 3 3 3
B00000018 social 16.0 9 20200718 4 4 4

[HIVE] rank() dense_rank() row_number()的学习

原文:https://www.cnblogs.com/larry66/p/13337935.html

(0)
(0)
   
举报
评论 一句话评论(0
关于我们 - 联系我们 - 留言反馈 - 联系我们:wmxa8@hotmail.com
© 2014 bubuko.com 版权所有
打开技术之扣,分享程序人生!