首页 > 其他 > 详细

hive多分隔符支持

时间:2020-09-29 23:16:07      阅读:43      评论:0      收藏:0      [点我收藏+]

1.问题描述

如何将多个字符作为字段分割符的数据文件加载到Hive表中,事例数据如下:

  字段分隔符为“@#$”

test1@#$test1name@#$test2value
test2@#$test2name@#$test2value
test3@#$test3name@#$test4value

如何将上述事例数据加载到Hive表(multi_delimiter_test)中,表结构如下:

字段名

字段类型

s1 string
s2 string
s3 string

 

2.Hive多分隔符支持

Hive在0.14及以后版本支持字段的多分隔符,参考https://cwiki.apache.org/confluence/display/Hive/MultiDelimitSerDe

 

3.实现步骤

1.准备多分隔符文件并装载到HDFS对应目录

[ec2-user@ip-172-31-8-141  ~]$ cat multi_delimiter_test.dat

test1@#$test1name@#$test2value

test2@#$test2name@#$test2value

test3@#$test3name@#$test4value  

[ec2-user@ip-172-31-8-141  ~]$ hadoop dfs -put multi_delimiter_test.dat /fayson/multi_delimiter_test

[ec2-user@ip-172-31-8-141  ~]$ hadoop dfs -ls /fayson/multi_delimiter_test

DEPRECATED: Use of this  script to execute hdfs command is deprecated.

Instead use the hdfs  command for it.


Found 1 items

-rw-r--r--   3 user_r supergroup         93 2017-08-23 03:24  /fayson/multi_delimiter_test/multi_delimiter_test.dat

[ec2-user@ip-172-31-8-141  ~]$

 

2.基于准备好的多分隔符文件建表

create  external table multi_delimiter_test(

s1 string,

s2 string,

s3 string)

ROW FORMAT  SERDE org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe WITH  SERDEPROPERTIES ("field.delim"="@#$")

stored as  textfile location /fayson/multi_delimiter_test;

3.测试

>  select * from multi_delimiter_test;
+--------------------------+--------------------------+--------------------------+--+
|  multi_delimiter_test.s1  |  multi_delimiter_test.s2  |  multi_delimiter_test.s3  |
+--------------------------+--------------------------+--------------------------+--+
| test1                    | test1name                | test2value               |
| test2                    | test2name                | test2value               |
| test3                    | test3name                | test4value               |
+--------------------------+--------------------------+--------------------------+--+

 

 

 

 

 

 

 

 

 

 

 

 

字段名

字段类型

s1

String

s2

String

 

hive多分隔符支持

原文:https://www.cnblogs.com/LIAOBO/p/13752039.html

(0)
(0)
   
举报
评论 一句话评论(0
关于我们 - 联系我们 - 留言反馈 - 联系我们:wmxa8@hotmail.com
© 2014 bubuko.com 版权所有
打开技术之扣,分享程序人生!