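Elasticsearch Curator is a tool for cleaning up ES indices (or snapshots) that makes it easy to manage the indices and snapshots in an ES cluster. The overall process is as follows:
Fetch the indices or snapshots from the ES cluster to build the actionable list
Apply user-defined filters to remove indices or snapshots from the actionable list
Perform the configured actions on the items that remain in the actionable list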
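The three steps above can be sketched in a few lines of Python. This is only an illustration of the list, filter, act flow, not Curator's own code; the names `Index`, `prefix_filter`, and `run_pipeline` are hypothetical and chosen for this example.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Index:
    """Stand-in for an index (or snapshot) fetched from the cluster."""
    name: str


# A filter takes the working list and returns the items that stay actionable.
Filter = Callable[[List[Index]], List[Index]]


def prefix_filter(prefix: str) -> Filter:
    """Build a filter that keeps only indices whose name starts with `prefix`."""
    return lambda items: [item for item in items if item.name.startswith(prefix)]


def run_pipeline(all_indices: Iterable[Index],
                 filters: List[Filter],
                 action: Callable[[Index], None]) -> List[Index]:
    """Run the list -> filter -> act flow and return the items acted on."""
    working = list(all_indices)      # step 1: build the actionable list
    for apply_filter in filters:     # step 2: each filter narrows the list
        working = apply_filter(working)
    for item in working:             # step 3: perform the action on what is left
        action(item)
    return working
```

In a real run, step 1 is a call to the cluster and step 3 is an API call such as deleting the index; here the action is just a callable so the flow can be tried without an ES cluster.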
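Project site: https://www.elastic.co/guide/en/elasticsearch/client/curator/current/index.html
Our company's project stores server-side logs in ES so they can be queried and analyzed. Because the log volume is large and the ES cluster is fairly expensive to run, we needed special handling for the log indices (keeping only the last five days of logs). After looking into the available tools together with the developers, we found ES Curator.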
1. Install pip
2. Install elasticsearch-curator
sudo pip3 install elasticsearch-curator==5.8.1
Run: curator --help
Usage: curator [OPTIONS] ACTION_FILE
Curator for Elasticsearch indices.
See http://elastic.co/guide/en/elasticsearch/client/curator/current
Options:
--config PATH Path to configuration file. Default: ~/.curator/curator.yml
--dry-run Do not perform any changes.
--version Show the version and exit.
--help Show this message and exit.
Curator needs two configuration files to run:
1) Example config.yml
client:
  hosts:
    - 192.168.21.63
    - 192.168.14.146
    - 192.168.21.174
  port: 9200
  url_prefix:
  use_ssl: False
  certificate:
  client_cert:
  client_key:
  aws_key:
  aws_secret_key:
  aws_region:
  ssl_no_validate: False
  http_auth:
  timeout: 30
  master_only: False
logging:
  loglevel: INFO
  logfile: /usr/local/elasticsearch/logs/curator.log
  logformat: json
  blacklist: ['elasticsearch', 'urllib3']
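hosts: the IP addresses of the machines in the ES cluster. List several IPs if the cluster has several machines. If the machines do not all use the same port, write the entries as ip:port. Machines from different ES clusters must not be placed under the same hosts key (each cluster needs its own configuration file).
port: the default ES port, used for any host that does not specify its own port.
timeout: 30 seconds by default.
logfile: by default, results are written to STDOUT (the console). A local file can be specified instead.
blacklist: defaults to 'elasticsearch', 'urllib3'. Log output from these Python modules is suppressed.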
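To make the relationship between hosts and port concrete, here is a small sketch of how a mixed list of `ip` and `ip:port` entries could resolve against a default port. The function `resolve_hosts` is hypothetical and written for this example; it is not how Curator parses its configuration internally, and it does not handle IPv6 addresses.

```python
from typing import List, Tuple


def resolve_hosts(hosts: List[str], default_port: int = 9200) -> List[Tuple[str, int]]:
    """Pair every host with a port, falling back to `default_port`.

    Entries may be plain "ip" or "ip:port" (IPv4 or hostnames only).
    """
    resolved = []
    for entry in hosts:
        host, separator, port = entry.partition(":")
        # An explicit port on the entry wins over the shared default.
        resolved.append((host, int(port) if separator else default_port))
    return resolved
```

For example, `resolve_hosts(["192.168.21.63", "192.168.14.146:9201"])` pairs the first address with the default port 9200 and the second with 9201.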
2) action.yml
actions:
  1:
    action: delete_indices
    description: "Delete indices older than 5 days (based on index name), for tlog- prefixed indices."
    options:
      ignore_empty_list: True
    filters:
    - filtertype: pattern
      kind: prefix
      value: tlog-
    - filtertype: age
      source: name
      direction: older
      unit: days
      unit_count: 5
      timestring: '%Y.%m.%d'
      exclude:
  2:
    action: delete_indices
    description: "Delete indices older than 5 days (based on index name), for internal- prefixed indices."
    options:
      ignore_empty_list: True
    filters:
    - filtertype: pattern
      kind: prefix
      value: internal-
    - filtertype: age
      source: name
      direction: older
      unit: days
      unit_count: 5
      timestring: '%Y.%m.%d'
      exclude:
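1, 2, ... n are the sequence numbers of the tasks to run. If there are several tasks, number them consecutively.
delete_indices means the action to perform is deleting indices. Action reference: https://www.elastic.co/guide/en/elasticsearch/client/curator/current/actions.html
ignore_empty_list: whether to ignore the error raised when the filters leave an empty list, so that the run continues instead of aborting. Option reference: https://www.elastic.co/guide/en/elasticsearch/client/curator/current/option_ignore_empty.html
filters: the filters. In action 1 above, for example, they select indices with the tlog- prefix, and any such index older than five days is deleted. Filter reference: https://www.elastic.co/guide/en/elasticsearch/client/curator/current/filters.html
Once action.yml and config.yml are in place, run: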
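The combination of a prefix pattern filter and a name-based age filter can be approximated in plain Python, as a rough sketch of what the two filters in action 1 select. The helper names `index_date` and `select_for_deletion` are hypothetical, and the boundary handling (exactly five days old, time zones, names with no date) is an assumption of this sketch; Curator's own behavior may differ in those details.

```python
import re
from datetime import datetime, timedelta
from typing import Iterable, List, Optional

# Matches a date written in the '%Y.%m.%d' timestring form, e.g. 2020.11.10.
DATE_RE = re.compile(r"\d{4}\.\d{2}\.\d{2}")


def index_date(name: str) -> Optional[datetime]:
    """Extract the date embedded in an index name, or None if there is none."""
    match = DATE_RE.search(name)
    if match is None:
        return None
    try:
        return datetime.strptime(match.group(0), "%Y.%m.%d")
    except ValueError:  # digits that do not form a real date, e.g. 2020.13.40
        return None


def select_for_deletion(names: Iterable[str], prefix: str,
                        days: int, now: datetime) -> List[str]:
    """Return the names that pass both the prefix filter and the age filter."""
    cutoff = now - timedelta(days=days)
    selected = []
    for name in names:
        if not name.startswith(prefix):
            continue  # pattern filter: wrong prefix
        stamp = index_date(name)
        # Age filter: in this sketch, names without a date are never selected.
        if stamp is not None and stamp < cutoff:
            selected.append(name)
    return selected
```

Passing `now` explicitly keeps the function deterministic, which makes it easy to check a filter definition against a list of real index names before letting Curator delete anything.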
/usr/bin/curator --config /usr/local/elasticsearch/config/config.yml /usr/local/elasticsearch/config/action.yml
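Check for errors, and check whether the affected indices have changed (indices that match the filters are deleted; indices that do not match are left untouched).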
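Since deletion cannot be undone, it can help to try the same command with the `--dry-run` option listed in the help output first. Below is a minimal sketch of a wrapper that builds the command line and runs it; the function names are hypothetical, and the sketch assumes that `curator` is on the PATH (the article itself calls /usr/bin/curator).

```python
import subprocess
from typing import List


def curator_command(config_path: str, action_path: str,
                    dry_run: bool = True) -> List[str]:
    """Build the curator argument list; dry-run is the default for safety."""
    command = ["curator", "--config", config_path]
    if dry_run:
        command.append("--dry-run")  # report what would happen, change nothing
    command.append(action_path)
    return command


def run_curator(config_path: str, action_path: str, dry_run: bool = True) -> int:
    """Run curator and return its exit code (non-zero indicates a failure)."""
    completed = subprocess.run(
        curator_command(config_path, action_path, dry_run), check=False
    )
    return completed.returncode
```

Calling `run_curator(config, action)` first and `run_curator(config, action, dry_run=False)` only after reviewing the log output gives a simple two-step rollout.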
Run crontab -e and add the following line:
0 23 * * * /usr/bin/curator --config /usr/local/elasticsearch/config/config.yml /usr/local/elasticsearch/config/action.yml >> /usr/local/elasticsearch/logs/curator.log 2>&1
This cron job runs once every night at 11 p.m.
Author: 测试生财
CSDN: https://blog.csdn.net/ccgshigao
Elasticsearch index and snapshot cleanup: ES Curator
Original: https://www.cnblogs.com/qa-freeroad/p/14007936.html