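Each partition of a partitioned table corresponds to a separate directory on HDFS, and the partition's data files are stored in that directory.
In Hive, a partition is simply a subdirectory.
Partitioning splits the data by the value of a partition column. This column is declared in the partitioned by clause and must not also appear in the table's regular column list.
A query that filters on the partition column reads only the matching directories and avoids a full table scan, so partitioning plays a role similar to an index.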
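The idea that a partition is a directory, and that filtering on the partition column skips the other directories, can be illustrated with a small Python sketch. This is only a simulation and not Hive's implementation: the file listing and the helper names partition_values and prune are hypothetical, and the paths assume Hive's usual "column=value" naming for partition directories under the default warehouse location.

```python
from pathlib import PurePosixPath

# Hypothetical listing of data files under a partitioned table's directory.
# Assumption: each partition directory is named "<column>=<value>".
FILES = [
    "/user/hive/warehouse/dept_partition/month=20200702/dept.txt",
    "/user/hive/warehouse/dept_partition/month=20200703/dept.txt",
    "/user/hive/warehouse/dept_partition/month=20200704/dept.txt",
]


def partition_values(path):
    """Return the partition columns encoded in a file path as a dict."""
    values = {}
    for part in PurePosixPath(path).parent.parts:
        if "=" in part:
            column, value = part.split("=", 1)
            values[column] = value
    return values


def prune(files, column, wanted):
    """Keep only the files whose partition directory matches column = wanted."""
    return [f for f in files if partition_values(f).get(column) == wanted]


# Roughly what "where month = '20200702'" achieves: only one directory is read.
selected = prune(FILES, "month", "20200702")
print(selected)
```

The partition value is recovered from the directory name rather than from the file contents, which is why the partition column is not one of the columns stored in the data files.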
(1) Why partitioned tables are introduced
Log data is often organized as one directory per day, for example:
    /user/hive/warehouse/log_partition/20200702
    /user/hive/warehouse/log_partition/20200703
    /user/hive/warehouse/log_partition/20200704

(2) Create a partitioned table
    create external table dept_partition(
        username string,
        fullname string)
    partitioned by (month string)
    row format delimited
    fields terminated by ','
    lines terminated by '\n';

(3) Load data into partitions
    load data local inpath "/opt/module/datas/dept.txt" into table dept_partition partition(month="20200702");
    load data local inpath "/opt/module/datas/dept.txt" into table dept_partition partition(month="20200703");
    load data local inpath "/opt/module/datas/dept.txt" into table dept_partition partition(month="20200704");

(4) Result
Each load places the file in the directory of its partition (paths shown for the default database and warehouse location):
    /user/hive/warehouse/dept_partition/month=20200702/dept.txt
    /user/hive/warehouse/dept_partition/month=20200703/dept.txt
    /user/hive/warehouse/dept_partition/month=20200704/dept.txt

(5) Query a partition
    select * from dept_partition where month="20200702";

(6) Add partitions (separate multiple partitions with spaces)
    alter table dept_partition add partition(month="20200702");
    alter table dept_partition add partition(month="20200702") partition(month="20200703");

(7) Drop partitions (separate multiple partitions with commas)
    alter table dept_partition drop partition(month="20200702");
    alter table dept_partition drop partition(month="20200702"), partition(month="20200703");

(8) List the partitions
    show partitions dept_partition;

(9) Show the structure of the partitioned table
    desc formatted dept_partition;
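The separator rule in steps (6) and (7) is easy to get wrong when statements are generated by a script, so here is a minimal Python sketch that builds both kinds of statement. The helper names are hypothetical. The sketch also adds Hive's optional "if not exists" and "if exists" clauses, which are not in the steps above, so that re-running a statement for a partition that already exists (or is already gone) does not raise an error.

```python
def partition_spec(column, value):
    """Format one partition spec, e.g. partition(month="20200702")."""
    return f'partition({column}="{value}")'


def add_partitions(table, column, values):
    """Build an ADD statement; multiple specs are separated by spaces."""
    specs = " ".join(partition_spec(column, v) for v in values)
    return f"alter table {table} add if not exists {specs};"


def drop_partitions(table, column, values):
    """Build a DROP statement; multiple specs are separated by commas."""
    specs = ", ".join(partition_spec(column, v) for v in values)
    return f"alter table {table} drop if exists {specs};"


months = ["20200702", "20200703"]
add_sql = add_partitions("dept_partition", "month", months)
drop_sql = drop_partitions("dept_partition", "month", months)
print(add_sql)
print(drop_sql)
```

The generated strings can then be passed to whichever Hive client is in use. The sketch only formats text and does not connect to Hive.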
Original: https://www.cnblogs.com/hapyygril/p/14089284.html