
Counting HBase Table Rows with Coprocessors

Published: 2015-10-28 12:07:41

1. Java Code

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.MasterNotRunningException;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.ZooKeeperConnectionException;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.coprocessor.AggregationClient;
import org.apache.hadoop.hbase.client.coprocessor.LongColumnInterpreter;
import org.apache.hadoop.hbase.coprocessor.AggregateImplementation;
import org.apache.hadoop.hbase.util.Bytes;

/**
 * <p>
 * Count HBase table rows with the aggregation coprocessor.
 * </p>
 */
public class HBaseRecordsCounter {

    /**
     * Attach the coprocessor to a table through the HBase API.
     *
     * @author zhanglei11 2015-10-28 10:38:40
     * @param conf
     * @param tableName
     */
    public static void addCoprocessor(Configuration conf, String tableName) {
        try {
            byte[] tableNameBytes = Bytes.toBytes(tableName);
            HBaseAdmin hbaseAdmin = new HBaseAdmin(conf);
            // The table must be disabled before its descriptor can be modified
            hbaseAdmin.disableTable(tableNameBytes);
            HTableDescriptor htd = hbaseAdmin.getTableDescriptor(tableNameBytes);
            htd.addCoprocessor(AggregateImplementation.class.getName());
            hbaseAdmin.modifyTable(tableNameBytes, htd);
            hbaseAdmin.enableTable(tableNameBytes);
            hbaseAdmin.close();
        } catch (MasterNotRunningException e) {
            e.printStackTrace();
        } catch (ZooKeeperConnectionException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    /**
     * Count the rows in a table.
     */
    public static void exeCount(Configuration conf, String tableName, String family) {
        try {
            // Use the aggregation coprocessor shipped with HBase
            AggregationClient aggregationClient = new AggregationClient(conf);
            Scan scan = new Scan();
            // Restrict the scan to a single column family so each row is
            // counted exactly once
            scan.addFamily(Bytes.toBytes(family));
            long start = System.currentTimeMillis();
            long rowCount = aggregationClient.rowCount(TableName.valueOf(tableName),
                    new LongColumnInterpreter(), scan);
            System.out.println("Row count: " + rowCount + "; time cost: "
                    + (System.currentTimeMillis() - start) + "ms");
        } catch (Throwable e) {
            e.printStackTrace();
        }
    }

    public static void main(String[] args) {

        String tableName = "HIK_METADATA";

        // HBaseConfiguration.create() loads hbase-default.xml/hbase-site.xml;
        // a bare new Configuration() would miss the HBase defaults
        Configuration conf = HBaseConfiguration.create();
        conf.set("hbase.zookeeper.quorum", "hadoop17,hadoop19,hadoop20");
        conf.set("hbase.rootdir", "hdfs://hadoop17:8020/hbase");
        // Raise the RPC timeout; a full-table count can run for a long time
        conf.setLong("hbase.rpc.timeout", 600000);
        // Scanner caching: rows fetched per RPC
        conf.setLong("hbase.client.scanner.caching", 1000);

        // addCoprocessor(conf, tableName);

        exeCount(conf, tableName, "info");
    }
}

 

2. Enabling the Coprocessor via Configuration or Shell

Method 1: enable aggregation globally.

This enables aggregation for every table on the cluster. Edit hbase-site.xml on the region servers and add the following (a restart is required for the change to take effect):

<property>
  <name>hbase.coprocessor.user.region.classes</name>
  <value>org.apache.hadoop.hbase.coprocessor.AggregateImplementation</value>
</property>

Method 2: attach the coprocessor per table.

Adding a coprocessor from the hbase shell:

disable 'member'
alter 'member', METHOD => 'table_att', 'coprocessor' => 'hdfs://master24:9000/user/hadoop/jars/test.jar|mycoprocessor.SampleCoprocessor|1001|'
enable 'member'
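The value assigned to 'coprocessor' above packs four |-separated fields: the jar path on HDFS, the fully qualified coprocessor class, a load priority, and optional key=value arguments (empty here). As a sketch of that format only (the CoprocessorSpec helper below is hypothetical, not part of the HBase API):

```java
/**
 * Illustrative helper for the 4-field coprocessor table attribute used by
 * the shell: jar-path | class-name | priority | args (args may be empty).
 */
public class CoprocessorSpec {

    public static String build(String jarPath, String className,
                               int priority, String args) {
        return jarPath + "|" + className + "|" + priority + "|" + args;
    }

    public static void main(String[] args) {
        // Reproduces the attribute value from the alter command above
        System.out.println(build("hdfs://master24:9000/user/hadoop/jars/test.jar",
                "mycoprocessor.SampleCoprocessor", 1001, ""));
    }
}
```

Coprocessors built into the server classpath (such as AggregateImplementation) can omit the jar path, leaving the first field empty.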

Removing a coprocessor from the hbase shell:

disable 'member'
alter 'member', METHOD => 'table_att_unset', NAME => 'coprocessor$1'
enable 'member'
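The same unset can be done programmatically, mirroring the addCoprocessor method in section 1. A minimal sketch, assuming the same 0.98-era client API (HTableDescriptor exposes removeCoprocessor by class name); it needs a running cluster to execute:

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.coprocessor.AggregateImplementation;
import org.apache.hadoop.hbase.util.Bytes;

public class RemoveCoprocessor {

    /** Detach a coprocessor from a table (the API analogue of table_att_unset). */
    public static void removeCoprocessor(Configuration conf, String tableName)
            throws IOException {
        byte[] tableNameBytes = Bytes.toBytes(tableName);
        HBaseAdmin admin = new HBaseAdmin(conf);
        try {
            // As with adding, the table must be disabled while its
            // descriptor is modified
            admin.disableTable(tableNameBytes);
            HTableDescriptor htd = admin.getTableDescriptor(tableNameBytes);
            htd.removeCoprocessor(AggregateImplementation.class.getName());
            admin.modifyTable(tableNameBytes, htd);
            admin.enableTable(tableNameBytes);
        } finally {
            admin.close();
        }
    }
}
```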

 


Original: http://www.cnblogs.com/warmingsun/p/4916606.html
