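This article is part of the "Linux Operations: Enterprise Architecture in Practice" series.
Preface: this post collects my personal notes from building the cluster step by step, after hitting many pitfalls and repeatedly consulting references. I hope it is useful to you.
Hadoop is an open-source Apache framework written in Java. It enables distributed processing of large data sets across clusters of computers using a simple programming model. Applications built on the framework run in an environment that provides distributed storage and computation across the cluster. Hadoop is designed to scale from a single server to thousands of machines, each offering local computation and storage.
The Hadoop framework consists of four modules: Hadoop Common, the Hadoop Distributed File System (HDFS), Hadoop YARN, and Hadoop MapReduce.
[Figure: the four components of the Hadoop framework]
Since 2012, the term "Hadoop" often refers not only to the base modules above but also to the additional software packages that can be installed on top of or alongside Hadoop, such as Apache Pig, Apache Hive, Apache HBase, and Apache Spark.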
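(1) Stage 1
A user or application submits a job to Hadoop (through the Hadoop job client) by specifying the following items:
(2) Stage 2
The Hadoop job client then submits the job (JAR, executable, etc.) and its configuration to the JobTracker. The JobTracker distributes the software and configuration to the slave nodes, schedules and monitors the tasks, and reports status and diagnostic information back to the job client.
(3) Stage 3
The TaskTrackers on the different nodes execute the tasks according to the MapReduce implementation, and the output of the reduce function is written to output files on the file system.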
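To make the map and reduce phases concrete, here is a small local analogy built from ordinary shell tools. It is not Hadoop code and only a sketch: the function name `wordcount` is made up for illustration. Splitting the input into one word per line plays the role of the map step, `sort` stands in for the shuffle that groups identical keys, and `uniq -c` acts as the reduce step.

```shell
# Local analogy of a MapReduce word count (not Hadoop code).
# wordcount is a hypothetical helper name used only for illustration.
# Pipeline stages: tr = map (one word per line), sed = drop blank lines,
# sort = shuffle (group identical keys), uniq -c = reduce (count per key),
# awk = print "word count".
wordcount() {
    tr -s '[:space:]' '\n' |
        sed '/^$/d' |
        sort |
        uniq -c |
        awk '{ print $2, $1 }'
}
```

For example, `printf 'a b a\nb a\n' | wordcount` prints one line per distinct word followed by its count.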
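HBase is short for Hadoop Database: it is the database of Hadoop, a distributed storage system. HBase uses Hadoop's HDFS as its file storage layer, uses Hadoop's MapReduce to process the massive data held in HBase, and uses ZooKeeper as its coordination service.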
Client
Zookeeper
Master
RegionServer
HLog(WAL log)
Region
MemStore and StoreFile
The cluster consists of three machines, as follows:
Hostname | IP | Roles |
hadoop01 | 192.168.10.101 | DataNode, NodeManager, ResourceManager, NameNode |
hadoop02 | 192.168.10.102 | DataNode, NodeManager, SecondaryNameNode |
hadoop03 | 192.168.10.103 | DataNode, NodeManager |
$ cat /etc/redhat-release
CentOS Linux release 7.3.1611 (Core)
$ uname -r
3.10.0-514.el7.x86_64
Note: every process in this cluster is started by the along user, and the steps below must be carried out on all servers in the cluster.
[along@hadoop01 ~]$ sestatus
SELinux status:                 disabled
[root@hadoop01 ~]# iptables -F
[along@hadoop01 ~]$ systemctl status firewalld.service
● firewalld.service - firewalld - dynamic firewall daemon
   Loaded: loaded (/usr/lib/systemd/system/firewalld.service; disabled; vendor preset: enabled)
   Active: inactive (dead)
     Docs: man:firewalld(1)
$ id along
uid=1000(along) gid=1000(along) groups=1000(along)
$ cat /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.10.101 hadoop01
192.168.10.102 hadoop02
192.168.10.103 hadoop03
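Because every later step addresses the nodes by name, it can help to confirm that each machine's hosts file really contains all three entries. The following is a minimal sketch; `missing_hosts` is a hypothetical helper name, not part of any package, and it only inspects the file it is given.

```shell
# missing_hosts is a hypothetical helper: it prints every host name
# passed as an argument that has no entry in the given hosts file.
missing_hosts() {
    hosts_file=$1
    shift
    for h in "$@"; do
        # Look for the name as a whole field (after the IP column)
        # on any line that is not a comment.
        if ! awk -v h="$h" '
                /^[ \t]*#/ { next }
                { for (i = 2; i <= NF; i++) if ($i == h) found = 1 }
                END { exit found ? 0 : 1 }' "$hosts_file"; then
            echo "$h"
        fi
    done
}
```

Running `missing_hosts /etc/hosts hadoop01 hadoop02 hadoop03` on each node should print nothing when the file is complete.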
$ yum -y install ntpdate
$ sudo ntpdate cn.pool.ntp.org
(1) Generate a key pair; press Enter at every prompt
[along@hadoop01 ~]$ ssh-keygen
(2) Make sure every server holds the public keys of all the others
--- as the along user
[along@hadoop01 ~]$ ssh-copy-id -i ~/.ssh/id_rsa.pub 127.0.0.1
[along@hadoop01 ~]$ ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop01
[along@hadoop01 ~]$ ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop02
[along@hadoop01 ~]$ ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop03
--- as the root user
[root@hadoop01 ~]# ssh-copy-id -i ~/.ssh/id_rsa.pub 127.0.0.1
[root@hadoop01 ~]# ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop01
[root@hadoop01 ~]# ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop02
[root@hadoop01 ~]# ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop03
Note: repeat this on every server in the cluster.
(3) Verify passwordless login
[along@hadoop02 ~]$ ssh along@hadoop01
[along@hadoop02 ~]$ ssh along@hadoop02
[along@hadoop02 ~]$ ssh along@hadoop03
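The manual logins above can also be checked in one pass. This is a sketch under a few assumptions: `check_ssh` is a hypothetical helper name, and the `SSH_BIN` override is something added here purely so the function can be dry-run without a network. It relies on the OpenSSH options `BatchMode=yes` (fail instead of prompting for a password) and `ConnectTimeout`.

```shell
# check_ssh is a hypothetical helper. With BatchMode=yes, ssh fails
# instead of asking for a password, so a host whose key was not
# copied is reported as FAILED rather than hanging at a prompt.
# SSH_BIN is an assumption introduced for dry runs; it defaults to ssh.
check_ssh() {
    for h in "$@"; do
        if "${SSH_BIN:-ssh}" -o BatchMode=yes -o ConnectTimeout=5 "$h" true 2>/dev/null; then
            echo "$h ok"
        else
            echo "$h FAILED"
        fi
    done
}
```

On each node, `check_ssh hadoop01 hadoop02 hadoop03` should report `ok` for all three hosts once the keys are in place.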
This must be done on all three machines.
[root@hadoop01 ~]# tar -xvf jdk-8u201-linux-x64.tar.gz -C /usr/local
[root@hadoop01 ~]# chown along.along -R /usr/local/jdk1.8.0_201/
[root@hadoop01 ~]# ln -s /usr/local/jdk1.8.0_201/ /usr/local/jdk
[root@hadoop01 ~]# cat /etc/profile.d/jdk.sh
export JAVA_HOME=/usr/local/jdk
PATH=$JAVA_HOME/bin:$JAVA_HOME/jre/bin:$PATH
[root@hadoop01 ~]# source /etc/profile.d/jdk.sh
[along@hadoop01 ~]$ java -version
java version "1.8.0_201"
Java(TM) SE Runtime Environment (build 1.8.0_201-b09)
Java HotSpot(TM) 64-Bit Server VM (build 25.201-b09, mixed mode)
[root@hadoop01 ~]# wget https://mirrors.tuna.tsinghua.edu.cn/apache/hadoop/common/hadoop-3.2.0/hadoop-3.2.0.tar.gz
[root@hadoop01 ~]# tar -xvf hadoop-3.2.0.tar.gz -C /usr/local/
[root@hadoop01 ~]# chown along.along -R /usr/local/hadoop-3.2.0/
[root@hadoop01 ~]# ln -s /usr/local/hadoop-3.2.0/ /usr/local/hadoop
[along@hadoop01 ~]$ cd /usr/local/hadoop/etc/hadoop/
[along@hadoop01 hadoop]$ vim hadoop-env.sh
export JAVA_HOME=/usr/local/jdk
export HADOOP_HOME=/usr/local/hadoop
export HADOOP_CONF_DIR=${HADOOP_HOME}/etc/hadoop
[along@hadoop01 hadoop]$ vim core-site.xml
<configuration>
    <!-- Default HDFS (NameNode) address -->
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://hadoop01:9000</value>
    </property>
    <!-- Directory for the files Hadoop generates at runtime -->
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/data/hadoop/tmp</value>
    </property>
</configuration>
[root@hadoop01 ~]# mkdir -p /data/hadoop
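After editing a site file by hand, it is easy to leave a stray space or a typo in a value. The sketch below reads one property back out of a Hadoop-style XML file so it can be eyeballed or compared across nodes. `get_prop` is a hypothetical helper, not a Hadoop command, and it assumes each `<name>` and `<value>` element sits on a single line, as in the files shown in this post.

```shell
# get_prop is a hypothetical helper: print the value of one property
# from a Hadoop-style XML configuration file.
# Assumption: every <name>...</name> and <value>...</value> element
# is written on one line.
get_prop() {
    awk -v want="$2" '
        /<name>/ {
            n = $0
            sub(/.*<name>[ \t]*/, "", n)
            sub(/[ \t]*<\/name>.*/, "", n)
        }
        /<value>/ {
            v = $0
            sub(/.*<value>[ \t]*/, "", v)
            sub(/[ \t]*<\/value>.*/, "", v)
            if (n == want) { print v; exit }
        }
    ' "$1"
}
```

For instance, `get_prop /usr/local/hadoop/etc/hadoop/core-site.xml fs.defaultFS` should print the NameNode address configured above. On a node with Hadoop installed, `hdfs getconf -confKey fs.defaultFS` reports the value Hadoop itself resolves.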
[along@hadoop01 hadoop]$ vim hdfs-site.xml
<configuration>
    <!-- HTTP address of the NameNode -->
    <property>
        <name>dfs.namenode.http-address</name>
        <value>hadoop01:50070</value>
    </property>
    <!-- HTTP address of the SecondaryNameNode -->
    <property>
        <name>dfs.namenode.secondary.http-address</name>
        <value>hadoop02:50090</value>
    </property>
    <!-- Directory where the NameNode stores its data -->
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>/data/hadoop/name</value>
    </property>
    <!-- Number of HDFS replicas -->
    <property>
        <name>dfs.replication</name>
        <value>2</value>
    </property>
    <!-- Directory where the DataNode stores its data -->
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>/data/hadoop/datanode</value>
    </property>
    <property>
        <name>dfs.permissions</name>
        <value>false</value>
    </property>
</configuration>
[root@hadoop01 ~]# mkdir /data/hadoop/name -p
[root@hadoop01 ~]# mkdir /data/hadoop/datanode -p
[along@hadoop01 hadoop]$ vim mapred-site.xml
<configuration>
    <!-- Tell the MapReduce framework to run on YARN -->
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
    <property>
        <name>mapreduce.application.classpath</name>
        <value>
            /usr/local/hadoop/etc/hadoop,
            /usr/local/hadoop/share/hadoop/common/*,
            /usr/local/hadoop/share/hadoop/common/lib/*,
            /usr/local/hadoop/share/hadoop/hdfs/*,
            /usr/local/hadoop/share/hadoop/hdfs/lib/*,
            /usr/local/hadoop/share/hadoop/mapreduce/*,
            /usr/local/hadoop/share/hadoop/mapreduce/lib/*,
            /usr/local/hadoop/share/hadoop/yarn/*,
            /usr/local/hadoop/share/hadoop/yarn/lib/*
        </value>
    </property>
</configuration>
[along@hadoop01 hadoop]$ vim yarn-site.xml
<configuration>
    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>hadoop01</value>
    </property>
    <property>
        <description>The http address of the RM web application.</description>
        <name>yarn.resourcemanager.webapp.address</name>
        <value>${yarn.resourcemanager.hostname}:8088</value>
    </property>
    <property>
        <description>The address of the applications manager interface in the RM.</description>
        <name>yarn.resourcemanager.address</name>
        <value>${yarn.resourcemanager.hostname}:8032</value>
    </property>
    <property>
        <description>The address of the scheduler interface.</description>
        <name>yarn.resourcemanager.scheduler.address</name>
        <value>${yarn.resourcemanager.hostname}:8030</value>
    </property>
    <property>
        <name>yarn.resourcemanager.resource-tracker.address</name>
        <value>${yarn.resourcemanager.hostname}:8031</value>
    </property>
    <property>
        <description>The address of the RM admin interface.</description>
        <name>yarn.resourcemanager.admin.address</name>
        <value>${yarn.resourcemanager.hostname}:8033</value>
    </property>
</configuration>
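[along@hadoop01 hadoop]$ echo 'hadoop02' >> /usr/local/hadoop/etc/hadoop/masters
[along@hadoop01 hadoop]$ echo 'hadoop03 hadoop01' >> /usr/local/hadoop/etc/hadoop/slaves
Note: slaves is the file name used by Hadoop 2.x; Hadoop 3.x reads its worker list from etc/hadoop/workers.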
The startup scripts are all located in /usr/local/hadoop/sbin:
(1) Add the following lines to start-dfs.sh and stop-dfs.sh:
[along@hadoop01 ~]$ vim /usr/local/hadoop/sbin/start-dfs.sh
[along@hadoop01 ~]$ vim /usr/local/hadoop/sbin/stop-dfs.sh
HDFS_DATANODE_USER=along
HADOOP_SECURE_DN_USER=hdfs
HDFS_NAMENODE_USER=along
HDFS_SECONDARYNAMENODE_USER=along
(2) Add the following lines to start-yarn.sh and stop-yarn.sh:
[along@hadoop01 ~]$ vim /usr/local/hadoop/sbin/start-yarn.sh
[along@hadoop01 ~]$ vim /usr/local/hadoop/sbin/stop-yarn.sh
YARN_RESOURCEMANAGER_USER=along
HADOOP_SECURE_DN_USER=yarn
YARN_NODEMANAGER_USER=along
[root@hadoop01 ~]# chown -R along.along /usr/local/hadoop-3.2.0/
[root@hadoop01 ~]# chown -R along.along /data/hadoop/
[root@hadoop01 ~]# vim /etc/profile.d/hadoop.sh
[root@hadoop01 ~]# cat /etc/profile.d/hadoop.sh
export HADOOP_HOME=/usr/local/hadoop
PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
[root@hadoop01 ~]# vim /data/hadoop/rsync.sh
# Create the required directories on every machine in the cluster
for i in hadoop02 hadoop03
do
    sudo rsync -a /data/hadoop $i:/data/
done
# Copy the Hadoop configuration to the other machines
for i in hadoop02 hadoop03
do
    sudo rsync -a /usr/local/hadoop-3.2.0/etc/hadoop $i:/usr/local/hadoop-3.2.0/etc/
done
[root@hadoop01 ~]# /data/hadoop/rsync.sh
[along@hadoop01 ~]$ hdfs namenode -format
... ...
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at hadoop01/192.168.10.101
************************************************************/
[along@hadoop02 ~]$ hdfs namenode -format
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at hadoop02/192.168.10.102
************************************************************/
[along@hadoop03 ~]$ hdfs namenode -format
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at hadoop03/192.168.10.103
************************************************************/
(1) Start the NameNode and DataNodes
[along@hadoop01 ~]$ start-dfs.sh
[along@hadoop02 ~]$ start-dfs.sh
[along@hadoop03 ~]$ start-dfs.sh
[along@hadoop01 ~]$ jps
4480 DataNode
4727 Jps
4367 NameNode
[along@hadoop02 ~]$ jps
4082 Jps
3958 SecondaryNameNode
3789 DataNode
[along@hadoop03 ~]$ jps
2689 Jps
2475 DataNode
(2) Start YARN
[along@hadoop01 ~]$ start-yarn.sh
[along@hadoop02 ~]$ start-yarn.sh
[along@hadoop03 ~]$ start-yarn.sh
[along@hadoop01 ~]$ jps
4480 DataNode
4950 NodeManager
5447 NameNode
5561 Jps
4842 ResourceManager
[along@hadoop02 ~]$ jps
3958 SecondaryNameNode
4503 Jps
3789 DataNode
4367 NodeManager
[along@hadoop03 ~]$ jps
12353 Jps
12226 NodeManager
2475 DataNode
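Reading three jps listings by eye gets tedious. As a rough sketch, the check can be scripted by comparing the jps output against the daemons each node is supposed to run (see the role table at the top). `missing_daemons` is a hypothetical helper name; it takes the jps text as its first argument so that it can be tried without a running cluster.

```shell
# missing_daemons is a hypothetical helper: the first argument is the
# text printed by jps, the remaining arguments are the daemon names
# expected on that node. Each expected name that is absent is printed.
missing_daemons() {
    jps_output=$1
    shift
    for d in "$@"; do
        # Compare against the second column exactly, so that
        # SecondaryNameNode is not mistaken for NameNode.
        if ! printf '%s\n' "$jps_output" |
                awk -v d="$d" '$2 == d { ok = 1 } END { exit ok ? 0 : 1 }'; then
            echo "$d"
        fi
    done
}
```

On hadoop01, for example, `missing_daemons "$(jps)" NameNode DataNode ResourceManager NodeManager` should print nothing when everything is up.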
(1) Open http://hadoop01:8088 in a browser
This is the ResourceManager web UI; it lists the three active nodes of the cluster.
(2) Open http://hadoop01:50070/dfshealth.html#tab-datanode in a browser
This is the NameNode web UI.
The Hadoop cluster is now fully set up.
[root@hadoop01 ~]# wget https://mirrors.tuna.tsinghua.edu.cn/apache/hbase/1.4.9/hbase-1.4.9-bin.tar.gz
[root@hadoop01 ~]# tar -xvf hbase-1.4.9-bin.tar.gz -C /usr/local/
[root@hadoop01 ~]# chown -R along.along /usr/local/hbase-1.4.9/
[root@hadoop01 ~]# ln -s /usr/local/hbase-1.4.9/ /usr/local/hbase
Note: as of 2019-03-08, HBase 2.1 failed to start in this setup. The cause may be the release or my own configuration; either way, I downgraded to hbase-1.4.9.
[root@hadoop01 ~]# cd /usr/local/hbase/conf/
[root@hadoop01 conf]# vim hbase-env.sh
export JAVA_HOME=/usr/local/jdk
export HBASE_CLASSPATH=/usr/local/hbase/conf
[root@hadoop01 conf]# vim hbase-site.xml
<configuration>
    <property>
        <name>hbase.rootdir</name>
        <!-- Directory where HBase stores its data -->
        <value>hdfs://hadoop01:9000/hbase/hbase_db</value>
        <!-- The port must match the port in Hadoop's fs.defaultFS -->
    </property>
    <property>
        <name>hbase.cluster.distributed</name>
        <!-- Whether this is a distributed deployment -->
        <value>true</value>
    </property>
    <property>
        <name>hbase.zookeeper.quorum</name>
        <!-- Nodes that run the ZooKeeper service; the count must be odd -->
        <value>hadoop01,hadoop02,hadoop03</value>
    </property>
    <property>
        <!-- Where ZooKeeper keeps its configuration, logs, etc.; the directory must already exist -->
        <name>hbase.zookeeper.property.dataDir</name>
        <value>/data/hbase/zookeeper</value>
    </property>
    <property>
        <!-- HBase master -->
        <name>hbase.master</name>
        <value>hadoop01</value>
    </property>
    <property>
        <!-- HBase web UI port -->
        <name>hbase.master.info.port</name>
        <value>16666</value>
    </property>
</configuration>
Note: ZooKeeper has the following characteristic:
[root@hadoop01 conf]# vim regionservers
hadoop01
hadoop02
hadoop03
[root@hadoop01 ~]# vim /etc/profile.d/hbase.sh
export HBASE_HOME=/usr/local/hbase
PATH=$HBASE_HOME/bin:$PATH
[root@hadoop01 ~]# mkdir -p /data/hbase/zookeeper
[root@hadoop01 ~]# vim /data/hbase/rsync.sh
# Create the required directories on every machine in the cluster
for i in hadoop02 hadoop03
do
    sudo rsync -a /data/hbase $i:/data/
    sudo scp -p /etc/profile.d/hbase.sh $i:/etc/profile.d/
done
# Copy the HBase installation and configuration to the other machines
for i in hadoop02 hadoop03
do
    sudo rsync -a /usr/local/hbase-1.4.9 $i:/usr/local/
done
[root@hadoop01 conf]# chown -R along.along /data/hbase
[root@hadoop01 ~]# /data/hbase/rsync.sh
hbase.sh                 100%   62     0.1KB/s   00:00
hbase.sh                 100%   62     0.1KB/s   00:00
Note: the following only needs to be run on hadoop01.
(1) Start
[along@hadoop01 ~]$ start-hbase.sh
hadoop03: running zookeeper, logging to /usr/local/hbase/logs/hbase-along-zookeeper-hadoop03.out
hadoop01: running zookeeper, logging to /usr/local/hbase/logs/hbase-along-zookeeper-hadoop01.out
hadoop02: running zookeeper, logging to /usr/local/hbase/logs/hbase-along-zookeeper-hadoop02.out
... ...
(2) Verify
--- HBase master node
[along@hadoop01 ~]$ jps
4480 DataNode
23411 HQuorumPeer        # ZooKeeper process
4950 NodeManager
24102 Jps
5447 NameNode
23544 HMaster            # HBase master process
4842 ResourceManager
23711 HRegionServer
--- the two slave nodes
[along@hadoop02 ~]$ jps
12948 HRegionServer      # HBase slave process
3958 SecondaryNameNode
13209 Jps
12794 HQuorumPeer        # ZooKeeper process
3789 DataNode
4367 NodeManager
[along@hadoop03 ~]$ jps
12226 NodeManager
19559 Jps
19336 HRegionServer      # HBase slave process
19178 HQuorumPeer        # ZooKeeper process
2475 DataNode
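Operation | Command |
Create a table | create 'table', 'family1', 'family2', ... |
Add a record | put 'table', 'row', 'family:column', 'value' |
Get a record | get 'table', 'row' |
Count the records in a table | count 'table' |
Delete a record | delete 'table', 'row', 'family:column' |
Drop a table | (1) disable 'table' (2) drop 'table' |
Scan all records | scan 'table' |
Scan all data in one column of a table | scan 'table', {COLUMNS => ['family:column']} |
Update a record | Write the cell again with put; the new value overrides the old one |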
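The commands in the table can also be generated from a script and piped into the HBase shell, which is handy for loading a few test rows. The sketch below only builds the command text and never contacts HBase; `hbase_put` is a hypothetical helper name, and values containing a single quote are deliberately not handled.

```shell
# hbase_put is a hypothetical helper: it prints one HBase shell put
# command for the given table, row key, column and value.
# It does not talk to HBase and does not escape single quotes.
hbase_put() {
    printf "put '%s', '%s', '%s', '%s'\n" "$1" "$2" "$3" "$4"
}
```

The generated line can be reviewed first and then fed to the shell, for example `hbase_put demo example id:name along | hbase shell -n`. The `-n` (non-interactive) flag is assumed to be available in the installed HBase release.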
(1) Start the HBase client
[along@hadoop01 ~]$ hbase shell    # this takes a while to start
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/hbase-1.4.9/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/hadoop-3.2.0/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
HBase Shell
Use "help" to get list of supported commands.
Use "exit" to quit this interactive shell.
Version 1.4.9, rd625b212e46d01cb17db9ac2e9e927fdb201afa1, Wed Dec 5 11:54:10 PST 2018

hbase(main):001:0>
(2) Check the cluster status
hbase(main):001:0> status
1 active master, 0 backup masters, 3 servers, 0 dead, 0.6667 average load
(3) Check the HBase version
hbase(main):002:0> version
1.4.9, rd625b212e46d01cb17db9ac2e9e927fdb201afa1, Wed Dec 5 11:54:10 PST 2018
(1) Create a demo table with two column families, id and info
hbase(main):001:0> create 'demo', 'id', 'info'
0 row(s) in 23.2010 seconds

=> Hbase::Table - demo
(2) Get the table description
hbase(main):002:0> list
TABLE
demo
1 row(s) in 0.6380 seconds

=> ["demo"]
--- get the detailed description
hbase(main):003:0> describe 'demo'
Table demo is ENABLED
demo
COLUMN FAMILIES DESCRIPTION
{NAME => 'id', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
{NAME => 'info', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
2 row(s) in 0.3500 seconds
(3) Delete a column family
Note: disable the table before deleting a column family or dropping the table.
hbase(main):004:0> disable 'demo'
0 row(s) in 2.5930 seconds

hbase(main):006:0> alter 'demo', {NAME => 'info', METHOD => 'delete'}
Updating all regions with the new schema...
1/1 regions updated.
Done.
0 row(s) in 4.3410 seconds

hbase(main):007:0> describe 'demo'
Table demo is DISABLED
demo
COLUMN FAMILIES DESCRIPTION
{NAME => 'id', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
1 row(s) in 0.1510 seconds
(4) Drop a table
Disable the table first, then drop it.
hbase(main):008:0> list
TABLE
demo
1 row(s) in 0.1010 seconds

=> ["demo"]
hbase(main):009:0> disable 'demo'
0 row(s) in 0.0480 seconds

hbase(main):010:0> is_disabled 'demo'    # check whether the table is disabled
true
0 row(s) in 0.0210 seconds

hbase(main):013:0> drop 'demo'
0 row(s) in 2.3270 seconds

hbase(main):014:0> list    # the table has been dropped
TABLE
0 row(s) in 0.0250 seconds

=> []
hbase(main):015:0> is_enabled 'demo'    # check whether the demo table still exists

ERROR: Unknown table demo!
(1) Insert data
hbase(main):024:0> create 'demo', 'id', 'info'
0 row(s) in 10.0720 seconds

=> Hbase::Table - demo
hbase(main):025:0> is_enabled 'demo'
true
0 row(s) in 0.1930 seconds

hbase(main):030:0> put 'demo', 'example', 'id:name', 'along'
0 row(s) in 0.0180 seconds

hbase(main):039:0> put 'demo', 'example', 'id:sex', 'male'
0 row(s) in 0.0860 seconds

hbase(main):040:0> put 'demo', 'example', 'id:age', '24'
0 row(s) in 0.0120 seconds

hbase(main):041:0> put 'demo', 'example', 'id:company', 'taobao'
0 row(s) in 0.3840 seconds

hbase(main):042:0> put 'demo', 'taobao', 'info:addres', 'china'
0 row(s) in 0.1910 seconds

hbase(main):043:0> put 'demo', 'taobao', 'info:company', 'alibaba'
0 row(s) in 0.0300 seconds

hbase(main):044:0> put 'demo', 'taobao', 'info:boss', 'mayun'
0 row(s) in 0.1260 seconds
(2) Read data from the demo table
hbase(main):045:0> get 'demo', 'example'
COLUMN                CELL
 id:age               timestamp=1552030411620, value=24
 id:company           timestamp=1552030467196, value=taobao
 id:name              timestamp=1552030380723, value=along
 id:sex               timestamp=1552030392249, value=male
1 row(s) in 0.8850 seconds

hbase(main):046:0> get 'demo', 'taobao'
COLUMN                CELL
 info:addres          timestamp=1552030496973, value=china
 info:boss            timestamp=1552030532254, value=mayun
 info:company         timestamp=1552030520028, value=alibaba
1 row(s) in 0.2500 seconds

hbase(main):047:0> get 'demo', 'example', 'id'
COLUMN                CELL
 id:age               timestamp=1552030411620, value=24
 id:company           timestamp=1552030467196, value=taobao
 id:name              timestamp=1552030380723, value=along
 id:sex               timestamp=1552030392249, value=male
1 row(s) in 0.3150 seconds

hbase(main):048:0> get 'demo', 'example', 'info'
COLUMN                CELL
0 row(s) in 0.0200 seconds

hbase(main):049:0> get 'demo', 'taobao', 'id'
COLUMN                CELL
0 row(s) in 0.0410 seconds

hbase(main):053:0> get 'demo', 'taobao', 'info'
COLUMN                CELL
 info:addres          timestamp=1552030496973, value=china
 info:boss            timestamp=1552030532254, value=mayun
 info:company         timestamp=1552030520028, value=alibaba
1 row(s) in 0.0240 seconds

hbase(main):055:0> get 'demo', 'taobao', 'info:boss'
COLUMN                CELL
 info:boss            timestamp=1552030532254, value=mayun
1 row(s) in 0.1810 seconds
(3) Update a record
hbase(main):056:0> put 'demo', 'example', 'id:age', '88'
0 row(s) in 0.1730 seconds

hbase(main):057:0> get 'demo', 'example', 'id:age'
COLUMN                CELL
 id:age               timestamp=1552030841823, value=88
1 row(s) in 0.1430 seconds
(4) Read data by timestamp
You have probably noticed the timestamp field in the output above; a specific version of a cell can be fetched by passing its timestamp.
hbase(main):059:0> get 'demo', 'example', {COLUMN => 'id:age', TIMESTAMP => 1552030841823}
COLUMN                CELL
 id:age               timestamp=1552030841823, value=88
1 row(s) in 0.0200 seconds

hbase(main):060:0> get 'demo', 'example', {COLUMN => 'id:age', TIMESTAMP => 1552030411620}
COLUMN                CELL
 id:age               timestamp=1552030411620, value=24
1 row(s) in 0.0930 seconds
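By default an HBase cell timestamp is the write time in milliseconds since the Unix epoch, so the numbers above can be turned into readable dates. A minimal sketch follows; `hbase_ts_to_utc` is a hypothetical helper name, and the `-d @SECONDS` form assumes GNU date, as shipped with CentOS.

```shell
# hbase_ts_to_utc is a hypothetical helper: convert an HBase cell
# timestamp (milliseconds since the epoch) to a UTC date string.
# Assumes GNU date for the -d @SECONDS syntax.
hbase_ts_to_utc() {
    date -u -d "@$(( $1 / 1000 ))" '+%Y-%m-%d %H:%M:%S'
}
```

For the updated cell above, `hbase_ts_to_utc 1552030841823` prints `2019-03-08 07:40:41`.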
(5) Scan the whole table
hbase(main):061:0> scan 'demo'
ROW                   COLUMN+CELL
 example              column=id:age, timestamp=1552030841823, value=88
 example              column=id:company, timestamp=1552030467196, value=taobao
 example              column=id:name, timestamp=1552030380723, value=along
 example              column=id:sex, timestamp=1552030392249, value=male
 taobao               column=info:addres, timestamp=1552030496973, value=china
 taobao               column=info:boss, timestamp=1552030532254, value=mayun
 taobao               column=info:company, timestamp=1552030520028, value=alibaba
2 row(s) in 0.3880 seconds
(6) Delete the 'id:age' column of the row whose key is example
hbase(main):062:0> delete 'demo', 'example', 'id:age'
0 row(s) in 1.1360 seconds

hbase(main):063:0> get 'demo', 'example'
COLUMN                CELL
 id:company           timestamp=1552030467196, value=taobao
 id:name              timestamp=1552030380723, value=along
 id:sex               timestamp=1552030392249, value=male
(7) Delete an entire row
hbase(main):070:0> deleteall 'demo', 'taobao'
0 row(s) in 1.8140 seconds

hbase(main):071:0> get 'demo', 'taobao'
COLUMN                CELL
0 row(s) in 0.2200 seconds
(8) Add an 'id:age' column to the example row again and increment it as a counter
hbase(main):072:0> incr 'demo', 'example', 'id:age'
COUNTER VALUE = 1
0 row(s) in 3.2200 seconds

hbase(main):073:0> get 'demo', 'example', 'id:age'
COLUMN                CELL
 id:age               timestamp=1552031388997, value=\x00\x00\x00\x00\x00\x00\x00\x01
1 row(s) in 0.0280 seconds

hbase(main):074:0> incr 'demo', 'example', 'id:age'
COUNTER VALUE = 2
0 row(s) in 0.0340 seconds

hbase(main):075:0> incr 'demo', 'example', 'id:age'
COUNTER VALUE = 3
0 row(s) in 0.0420 seconds

hbase(main):076:0> get 'demo', 'example', 'id:age'
COLUMN                CELL
 id:age               timestamp=1552031429912, value=\x00\x00\x00\x00\x00\x00\x00\x03
1 row(s) in 0.0690 seconds

hbase(main):077:0> get_counter 'demo', 'example', 'id:age'    # read the current counter value
COUNTER VALUE = 3
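The `\x00...\x03` strings returned by get are the counter stored as an 8-byte big-endian integer. The sketch below decodes such a string back into a number; `decode_counter` is a hypothetical helper name. It assumes every byte is shown in `\xNN` form, as in the output above. The shell prints printable bytes as plain characters, which this helper does not handle, so `get_counter` remains the reliable way to read the value.

```shell
# decode_counter is a hypothetical helper: turn a counter value as
# printed by the HBase shell (eight \xNN escapes, big-endian) into a
# decimal number. Input with unescaped printable bytes is not handled.
decode_counter() {
    # Strip every "\x" prefix, leaving the 16 hex digits.
    hex=$(printf '%s' "$1" | sed 's/\\x//g')
    # Let printf interpret the digits as a hexadecimal literal.
    printf '%d\n' "0x$hex"
}
```

For example, `decode_counter '\x00\x00\x00\x00\x00\x00\x00\x03'` prints `3`, matching the COUNTER VALUE reported by the shell.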
(9) Truncate the whole table
hbase(main):078:0> truncate 'demo'
Truncating 'demo' table (it may take a while):
 - Disabling table...
 - Truncating table...
0 row(s) in 33.0820 seconds
As the output shows, HBase empties a table by first disabling it, then dropping it, and finally recreating it.
Source: https://www.cnblogs.com/skyhu365/p/10636714.html