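As a monitoring tool that keeps growing in popularity, Zabbix differs from tools such as Nagios and Cacti chiefly in that it stores its data in a relational database, which makes later querying and processing much easier. For example, to find out what fraction of a day a machine's ioutil was above 80, a single SQL query against the Zabbix database is enough, whereas in Cacti this is far less convenient. There is also no need to worry about the data being thinned out as it ages.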
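The ioutil example can be sketched roughly as follows. This is a minimal illustration, not a query taken from a production setup: it uses an in-memory SQLite table with the same three columns as Zabbix's history table (itemid, clock, value), and the item id 1001, the sample values, and the function names are all made up. It also assumes the item is sampled at a regular interval, so that the share of samples above the threshold approximates the share of time. Against MySQL the query shape stays the same, but MySQLdb uses %s rather than ? as the placeholder.

```python
import sqlite3

# Share of samples above a threshold for one item in a time window.
# Column names follow Zabbix's history table (itemid, clock, value).
RATIO_SQL = """
select avg(case when value > ? then 1.0 else 0.0 end)
from history
where itemid = ? and clock >= ? and clock < ?
"""


def ratio_above(conn, itemid, start, end, threshold=80):
    """Return the share of samples in [start, end) whose value exceeds threshold."""
    row = conn.execute(RATIO_SQL, (threshold, itemid, start, end)).fetchone()
    # avg() is NULL when the window holds no samples.
    return row[0] if row[0] is not None else 0.0


def demo():
    # In-memory stand-in for the Zabbix database; itemid 1001 is hypothetical.
    conn = sqlite3.connect(":memory:")
    conn.execute("create table history (itemid integer, clock integer, value real)")
    samples = [10, 85, 90, 20, 30, 95, 40, 50, 60, 70]  # one sample per minute
    conn.executemany(
        "insert into history values (?, ?, ?)",
        [(1001, 60 * i, v) for i, v in enumerate(samples)],
    )
    ratio = ratio_above(conn, 1001, 0, 600)
    conn.close()
    return ratio


if __name__ == "__main__":
    print(demo())
```

For a full day, start and end would be the Unix timestamps of the day's boundaries, since clock is stored as epoch seconds.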
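When analyzing Zabbix data, the tables used most often are hosts, items, interface, and the history* and trend* tables. For example, to monitor mapred usage across an entire Hadoop cluster through Zabbix, you only need to aggregate the lastvalue of each machine.
A simple way to do this is as follows: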
#!/usr/bin/python
# -*- coding: utf8 -*-
# edit by ericni
# Get Hadoop cluster-wide totals from the Zabbix database.
import sys

import MySQLdb

# Command-line argument -> Zabbix item key
ITEM_KEYS = {
    "all_mapTaskSlots": "hadoop_metrics[mrmetrics.log,mapred.tasktracker,mapTaskSlots]",
    "all_maps_running": "hadoop_metrics[mrmetrics.log,mapred.tasktracker,maps_running]",
    "all_reduceTaskSlots": "hadoop_metrics[mrmetrics.log,mapred.tasktracker,reduceTaskSlots]",
    "all_reduces_running": "hadoop_metrics[mrmetrics.log,mapred.tasktracker,reduces_running]",
    "all_ThreadsBlocked": "hadoop_stats[datanode,ThreadsBlocked]",
    "all_ThreadsRunnable": "hadoop_stats[datanode,ThreadsRunnable]",
    "all_ThreadsWaiting": "hadoop_stats[datanode,ThreadsWaiting]",
}

# Sum the latest value of one item across all Hadoop datanodes.
# Both the item key and the LIKE pattern are passed as parameters,
# so the % signs in the pattern need no escaping.
SQL = (
    "select sum(b.lastvalue) from hosts a, items b "
    "where b.key_ = %s and lower(a.host) like %s and a.hostid = b.hostid"
)


def get_total_value(key):
    db = MySQLdb.connect(host='xxx', user='xxxx', passwd='xxx', db='xxx')
    try:
        cursor = db.cursor()
        cursor.execute(SQL, (key, '%-hadoop-datanode%'))
        row = cursor.fetchone()
        cursor.close()
    finally:
        db.close()
    # sum() yields NULL when no rows match; report 0 in that case.
    return row[0] if row and row[0] is not None else 0


if __name__ == '__main__':
    # Exit quietly on a missing or unknown argument, as the original script did.
    if len(sys.argv) < 2 or sys.argv[1] not in ITEM_KEYS:
        sys.exit(0)
    print(get_total_value(ITEM_KEYS[sys.argv[1]]))
Then plot the total available map slots and the total running maps on the same graph, and the map utilization becomes easy to see.
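If a single number is preferred over reading two lines off a graph, the utilization can also be computed from the two totals. A minimal sketch, assuming the two inputs are the values the script prints for all_maps_running and all_mapTaskSlots; the function name utilization is made up for illustration:

```python
def utilization(running, slots):
    """Return running/slots as a percentage; 0.0 when no slots are reported."""
    if not slots:
        # Avoid dividing by zero when no tasktracker has reported its slots.
        return 0.0
    return 100.0 * float(running) / float(slots)


if __name__ == "__main__":
    # Example numbers only: 30 running maps out of 120 available slots.
    print(utilization(30, 120))
```

The same function works for reduces by passing the all_reduces_running and all_reduceTaskSlots totals.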
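Of course, Zabbix also has its own front-end aggregation feature, but querying the database from a script like this is comparatively more flexible.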
This article is from the blog “菜光光的博客”; please keep this attribution when reposting: http://caiguangguang.blog.51cto.com/1652935/1369808