A Brief Introduction to How Redis Sentinel Works

Date: 2020-09-09 16:43:14

These are my notes on how I understand Redis Sentinel.

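Any middleware deployed as a single instance carries the risk of a single point of failure, so the first architecture that comes to mind is master-slave replication.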

Redis master-slave architecture

How Redis master-slave replication works

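Replication happens in one of two ways, full synchronization or partial synchronization:

  1. Full synchronization: when a slave connects to the master, it sends a SYNC command (PSYNC since Redis 2.8). The master runs BGSAVE to produce an RDB file and, at the same time, buffers every write it receives. Once the RDB file is ready, the master sends it to the slave, which saves it to local disk and then loads it into memory. The master then sends the buffered write commands to the slave as a stream of commands in the ordinary Redis protocol format.

  2. Partial synchronization: a slave can send PSYNC <replication ID> <offset> to request a partial resync (the replication ID is the master's master_replid; before Redis 4.0 it was the master's run ID). Both the master and its slaves track the replication offset. If the data after the requested offset is still in the master's replication backlog, the master sends just that data; otherwise it falls back to a full synchronization.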
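
The decision between the two cases can be sketched in a few lines of Python. This is only an illustration of the rule described above, not Redis's actual code: the names `MasterState` and `handle_psync` are made up for this example, and the sketch ignores the secondary replication ID (`master_replid2`) that real masters also accept after a failover.

```python
from dataclasses import dataclass


@dataclass
class MasterState:
    """Hypothetical, simplified view of the master's replication state."""
    replid: str                # current replication ID of the master
    backlog_first_offset: int  # offset of the oldest byte still in the backlog
    backlog_histlen: int       # number of bytes currently held in the backlog


def handle_psync(master: MasterState, replid: str, offset: int) -> str:
    """Return "CONTINUE" for a partial resync, "FULLRESYNC" otherwise."""
    # A slave with no cached master sends "PSYNC ? -1", which never matches.
    if replid != master.replid:
        return "FULLRESYNC"
    backlog_end = master.backlog_first_offset + master.backlog_histlen
    # The requested offset must still be covered by the backlog.
    if master.backlog_first_offset <= offset <= backlog_end:
        return "CONTINUE"
    return "FULLRESYNC"


if __name__ == "__main__":
    master = MasterState("d349582dc829f56d1da32e2d2f1434c6f2c44802", 631, 78924)
    print(handle_psync(master, "?", -1))
    print(handle_psync(master, master.replid, 70000))
    print(handle_psync(master, master.replid, 100))
```

The backlog numbers in the usage lines are taken from the `info Replication` output shown later in this post; the first and third calls fall back to a full resync, the second can continue from the backlog.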

A slave asking the master for a full synchronization.
Master log:

38233:M 01 Sep 2020 16:55:02.260 * Replica 127.0.0.1:6380 asks for synchronization
38233:M 01 Sep 2020 16:55:02.260 * Full resync requested by replica 127.0.0.1:6380
38233:M 01 Sep 2020 16:55:02.260 * Starting BGSAVE for SYNC with target: disk
38233:M 01 Sep 2020 16:55:02.261 * Background saving started by pid 38299
38299:C 01 Sep 2020 16:55:02.323 * DB saved on disk
38299:C 01 Sep 2020 16:55:02.323 * RDB: 4 MB of memory used by copy-on-write
38233:M 01 Sep 2020 16:55:02.378 * Background saving terminated with success
38233:M 01 Sep 2020 16:55:02.378 * Synchronization with replica 127.0.0.1:6380 succeeded

Log of the slave on port 6380:

$ tail -f 6380.log
38295:S 01 Sep 2020 16:55:02.259 * Connecting to MASTER 127.0.0.1:6379
38295:S 01 Sep 2020 16:55:02.259 * MASTER <-> REPLICA sync started
38295:S 01 Sep 2020 16:55:02.259 * Non blocking connect for SYNC fired the event.
38295:S 01 Sep 2020 16:55:02.260 * Master replied to PING, replication can continue...
38295:S 01 Sep 2020 16:55:02.260 * Partial resynchronization not possible (no cached master)
38295:S 01 Sep 2020 16:55:02.262 * Full resync from master: 46ef90de89e6771b67bc2b43371da2f97a03b4d1:0
38295:S 01 Sep 2020 16:55:02.378 * MASTER <-> REPLICA sync: receiving 175 bytes from master
38295:S 01 Sep 2020 16:55:02.378 * MASTER <-> REPLICA sync: Flushing old data
38295:S 01 Sep 2020 16:55:02.378 * MASTER <-> REPLICA sync: Loading DB in memory
38295:S 01 Sep 2020 16:55:02.378 * MASTER <-> REPLICA sync: Finished with success

Sentinel architecture

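Replication removes the single point of failure, but switching to a new master still has to be done by hand. Can the switch be automated? Yes, and that is what Sentinel does.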

Master: 6379
Slaves: 6380, 6381
Sentinels: 26379, 26380, 26381

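How Sentinel works

1. Automatic discovery between sentinels

  1. Every 2 seconds, each sentinel publishes a message to the Pub/Sub channel __sentinel__:hello on the master and slaves it monitors.
  2. Each sentinel also subscribes to the __sentinel__:hello channel on the master and slaves, which is how it discovers the other sentinels automatically.

Messages published by the sentinels on the __sentinel__:hello channel: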

127.0.0.1:6381> PSUBSCRIBE *
Reading messages... (press Ctrl-C to quit)
1) "psubscribe"
2) "*"
3) (integer) 1
1) "pmessage"
2) "*"
3) "__sentinel__:hello"
4) "127.0.0.1,26380,fc976b271914f43a4a318dfe8c1f41a2e747f8d8,1,mymaster,127.0.0.1,6381,1"
1) "pmessage"
2) "*"
3) "__sentinel__:hello"
4) "127.0.0.1,26379,b60bd3e15db23a9862d213e7703001c72d48dc73,1,mymaster,127.0.0.1,6381,1"
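
Each hello payload is a comma-separated record with eight fields: the sentinel's IP, port and run ID, its current epoch, then the master's name, IP, port and configuration epoch. The following Python sketch splits one payload into named fields. The `Hello` type and `parse_hello` function are names invented for this example; only the field order comes from Sentinel itself.

```python
from typing import NamedTuple


class Hello(NamedTuple):
    """One __sentinel__:hello payload, split into named fields."""
    sentinel_ip: str
    sentinel_port: int
    sentinel_run_id: str
    current_epoch: int
    master_name: str
    master_ip: str
    master_port: int
    master_config_epoch: int


def parse_hello(payload: str) -> Hello:
    """Parse a hello payload; raise ValueError if it does not have 8 fields."""
    parts = payload.split(",")
    if len(parts) != 8:
        raise ValueError(f"expected 8 comma-separated fields, got {len(parts)}")
    ip, port, run_id, epoch, name, m_ip, m_port, m_epoch = parts
    return Hello(ip, int(port), run_id, int(epoch),
                 name, m_ip, int(m_port), int(m_epoch))


if __name__ == "__main__":
    msg = ("127.0.0.1,26380,fc976b271914f43a4a318dfe8c1f41a2e747f8d8,"
           "1,mymaster,127.0.0.1,6381,1")
    print(parse_hello(msg))
```

A sentinel that receives such a message from an unknown run ID adds that sentinel to its list of peers for the named master.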

Pub/Sub events on a sentinel node, showing that it has discovered another sentinel automatically:

$ src/redis-cli -p 26379
127.0.0.1:26379> PSUBSCRIBE *
Reading messages... (press Ctrl-C to quit)
1) "psubscribe"
2) "*"
3) (integer) 1
1) "pmessage"
2) "*"
3) "+sentinel"
4) "sentinel fc976b271914f43a4a318dfe8c1f41a2e747f8d8 127.0.0.1 26380 @ mymaster 127.0.0.1 6379"

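2. How the slaves are discovered

A sentinel learns which slaves exist from the master: it sends the INFO command to the master and reads the list of attached slaves from the reply.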
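
As a rough illustration of that step, the Python sketch below pulls the `slaveN:` lines out of an INFO replication reply. `parse_replicas` is a hypothetical helper written for this post, not part of any Redis client library, and it works on the reply text only, so no Redis connection is needed.

```python
from typing import Dict, List


def parse_replicas(info_text: str) -> List[Dict[str, str]]:
    """Extract the slaveN entries from the text of an INFO replication reply."""
    replicas = []
    for line in info_text.splitlines():
        key, sep, value = line.strip().partition(":")
        # Keep only keys such as slave0, slave1, ... and skip everything else
        # (for example slave_repl_offset, which also starts with "slave").
        if not sep or not key.startswith("slave") or not key[5:].isdigit():
            continue
        # The value looks like ip=...,port=...,state=...,offset=...,lag=...
        replicas.append(dict(item.split("=", 1) for item in value.split(",")))
    return replicas


if __name__ == "__main__":
    reply = (
        "# Replication\n"
        "role:master\n"
        "connected_slaves:1\n"
        "slave0:ip=127.0.0.1,port=6380,state=online,offset=79554,lag=0\n"
    )
    for replica in parse_replicas(reply):
        print(replica["ip"], replica["port"], replica["state"])
```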

3. Walking through an automatic failover

3.1. The master goes down

Kill the master process by hand.

3.2. The sentinels detect that the master is down

1. Check the sentinel log:

$ tail -f 26379.log
(master 6379 was shut down manually)
38502:X 01 Sep 2020 17:26:38.311 # +sdown master mymaster 127.0.0.1 6379

38502:X 01 Sep 2020 17:26:38.395 # +odown master mymaster 127.0.0.1 6379 #quorum 2/2

2. Check the sentinel's Pub/Sub channels:

1) "pmessage"
2) "*"
3) "+sdown"
4) "master mymaster 127.0.0.1 6379"
1) "pmessage"
2) "*"
3) "+odown"
4) "master mymaster 127.0.0.1 6379 #quorum 2/2"
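
The two events correspond to two separate checks. `+sdown` (subjectively down) is one sentinel's own opinion, reached when the master has not answered for longer than the `down-after-milliseconds` setting. `+odown` (objectively down) is reached once at least `quorum` sentinels hold that opinion, which is the `#quorum 2/2` in the log. A minimal Python sketch of both checks, with function and parameter names invented for this example:

```python
def is_subjectively_down(now_ms: int, last_valid_reply_ms: int,
                         down_after_ms: int) -> bool:
    """One sentinel's local view: no valid reply for longer than the limit."""
    return now_ms - last_valid_reply_ms > down_after_ms


def is_objectively_down(locally_sdown: bool, agreeing_peers: int,
                        quorum: int) -> bool:
    """Objectively down once this sentinel plus agreeing peers reach quorum."""
    if not locally_sdown:
        return False
    # Count this sentinel's own opinion together with the peers that agree.
    return 1 + agreeing_peers >= quorum


if __name__ == "__main__":
    sdown = is_subjectively_down(now_ms=40_000, last_valid_reply_ms=5_000,
                                 down_after_ms=30_000)
    # One peer agrees and the quorum is 2, as in the "#quorum 2/2" log line.
    print(sdown, is_objectively_down(sdown, agreeing_peers=1, quorum=2))
```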

3.3. Sentinel leader election

One of the three sentinels has to be chosen to carry out this failover, so the sentinels first hold a vote.
1. Check the sentinel log:

$ tail -f 26379.log
38502:X 01 Sep 2020 17:26:38.395 # +new-epoch 1
38502:X 01 Sep 2020 17:26:38.395 # +try-failover master mymaster 127.0.0.1 6379
38502:X 01 Sep 2020 17:26:38.396 # +vote-for-leader b60bd3e15db23a9862d213e7703001c72d48dc73 1 (starts the vote for sentinel b60bd, which is this sentinel)
38502:X 01 Sep 2020 17:26:38.397 # fc976b271914f43a4a318dfe8c1f41a2e747f8d8 voted for b60bd3e15db23a9862d213e7703001c72d48dc73 1 (sentinel fc976b casts its vote for sentinel b60bd)
38502:X 01 Sep 2020 17:26:38.454 # +elected-leader master mymaster 127.0.0.1 6379
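
The rule behind `+elected-leader` can be sketched as follows: a candidate wins the epoch only if its vote count reaches both a majority of all known sentinels and the configured quorum. This Python version is a simplified illustration; `elect_leader` and its arguments are hypothetical names, and real Sentinel also handles epochs and timeouts that are left out here.

```python
from collections import Counter
from typing import Dict, Optional


def elect_leader(votes: Dict[str, str], total_sentinels: int,
                 quorum: int) -> Optional[str]:
    """votes maps each voter's ID to the candidate ID it voted for."""
    if not votes:
        return None
    candidate, count = Counter(votes.values()).most_common(1)[0]
    majority = total_sentinels // 2 + 1
    # The winner needs a majority of all sentinels AND at least quorum votes.
    if count >= majority and count >= quorum:
        return candidate
    return None


if __name__ == "__main__":
    # The two votes visible in the log above: b60bd votes for itself and
    # fc976b votes for b60bd. There are 3 sentinels and the quorum is 2.
    votes = {"b60bd": "b60bd", "fc976b": "b60bd"}
    print(elect_leader(votes, total_sentinels=3, quorum=2))
```

With only one vote out of three sentinels, the same call returns `None` and no failover is started in that epoch.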

3.4. Choosing a suitable slave as the new master
  1. Check the sentinel log:

$ tail -f 26379.log
38502:X 01 Sep 2020 17:26:38.454 # +failover-state-select-slave master mymaster 127.0.0.1 6379

38502:X 01 Sep 2020 17:26:38.545 # +selected-slave slave 127.0.0.1:6381 127.0.0.1 6381 @ mymaster 127.0.0.1 6379 (6381 is selected to become the new master)

38502:X 01 Sep 2020 17:26:38.545 * +failover-state-send-slaveof-noone slave 127.0.0.1:6381 127.0.0.1 6381 @ mymaster 127.0.0.1 6379 (SLAVEOF NO ONE is sent so that 6381 becomes the new master)

38502:X 01 Sep 2020 17:26:38.646 * +failover-state-wait-promotion slave 127.0.0.1:6381 127.0.0.1 6381 @ mymaster 127.0.0.1 6379

38502:X 01 Sep 2020 17:26:39.282 # +promoted-slave slave 127.0.0.1:6381 127.0.0.1 6381 @ mymaster 127.0.0.1 6379

38502:X 01 Sep 2020 17:26:39.282 # +failover-state-reconf-slaves master mymaster 127.0.0.1 6379

38502:X 01 Sep 2020 17:26:39.346 * +slave-reconf-sent slave 127.0.0.1:6380 127.0.0.1 6380 @ mymaster 127.0.0.1 6379

38502:X 01 Sep 2020 17:26:39.480 # -odown master mymaster 127.0.0.1 6379

38502:X 01 Sep 2020 17:26:40.131 * +slave-reconf-inprog slave 127.0.0.1:6380 127.0.0.1 6380 @ mymaster 127.0.0.1 6379

38502:X 01 Sep 2020 17:26:40.131 * +slave-reconf-done slave 127.0.0.1:6380 127.0.0.1 6380 @ mymaster 127.0.0.1 6379

38502:X 01 Sep 2020 17:26:40.186 # +failover-end master mymaster 127.0.0.1 6379

38502:X 01 Sep 2020 17:26:40.186 # +switch-master mymaster 127.0.0.1 6379 127.0.0.1 6381

38502:X 01 Sep 2020 17:26:40.186 * +slave slave 127.0.0.1:6380 127.0.0.1 6380 @ mymaster 127.0.0.1 6381

38502:X 01 Sep 2020 17:26:40.186 * +slave slave 127.0.0.1:6379 127.0.0.1 6379 @ mymaster 127.0.0.1 6381

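  2. How the slave is chosen:
     1. It must be alive (slaves that are down or disconnected are skipped).
     2. Lowest replica-priority first (a priority of 0 means never promote).
     3. Then the largest replication offset.
     4. Then the smallest run ID.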
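
The selection order can be expressed as a filter followed by a sort key. The Python sketch below is a simplified illustration: the `Replica` class and `select_replica` function are invented for this example, and the run IDs in the usage lines are placeholders. It also includes Redis's `replica-priority` setting, which Sentinel consults before the replication offset.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Replica:
    """Hypothetical, simplified view of one slave as seen by a sentinel."""
    name: str
    run_id: str
    online: bool
    priority: int     # replica-priority; 0 means "never promote"
    repl_offset: int  # how much of the master's stream it has processed


def select_replica(replicas: List[Replica]) -> Optional[Replica]:
    """Pick the promotion candidate, or None if no slave qualifies."""
    candidates = [r for r in replicas if r.online and r.priority != 0]
    if not candidates:
        return None
    # Lower priority wins, then larger offset, then smaller run ID.
    return min(candidates, key=lambda r: (r.priority, -r.repl_offset, r.run_id))


if __name__ == "__main__":
    replicas = [
        Replica("127.0.0.1:6380", "bbbb", True, 100, 65000),
        Replica("127.0.0.1:6381", "aaaa", True, 100, 65161),
    ]
    chosen = select_replica(replicas)
    print(chosen.name if chosen else None)
```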
  3. 6381 has been promoted to master:

$ src/redis-cli -p 6381

127.0.0.1:6381> info Replication

# Replication

role:master

connected_slaves:1

slave0:ip=127.0.0.1,port=6380,state=online,offset=79554,lag=0

master_replid:d349582dc829f56d1da32e2d2f1434c6f2c44802

master_replid2:46ef90de89e6771b67bc2b43371da2f97a03b4d1

master_repl_offset:79554

second_repl_offset:65161

repl_backlog_active:1

repl_backlog_size:1048576

repl_backlog_first_byte_offset:631

repl_backlog_histlen:78924

127.0.0.1:6381>

3.5. Complete logs for the steps above
  1. Pub/Sub messages on the sentinel:

$ src/redis-cli -p 26379
127.0.0.1:26379> PSUBSCRIBE *
Reading messages... (press Ctrl-C to quit)
1) "psubscribe"
2) "*"
3) (integer) 1
1) "pmessage"
2) "*"
3) "+sdown"
4) "master mymaster 127.0.0.1 6379"
1) "pmessage"
2) "*"
3) "+odown"
4) "master mymaster 127.0.0.1 6379 #quorum 2/2"
1) "pmessage"
2) "*"
3) "+new-epoch"
4) "1"
1) "pmessage"
2) "*"
3) "+try-failover"
4) "master mymaster 127.0.0.1 6379"
1) "pmessage"
2) "*"
3) "+vote-for-leader"
4) "b60bd3e15db23a9862d213e7703001c72d48dc73 1"
1) "pmessage"
2) "*"
3) "+elected-leader"
4) "master mymaster 127.0.0.1 6379"
1) "pmessage"
2) "*"
3) "+failover-state-select-slave"
4) "master mymaster 127.0.0.1 6379"
1) "pmessage"
2) "*"
3) "+selected-slave"
4) "slave 127.0.0.1:6381 127.0.0.1 6381 @ mymaster 127.0.0.1 6379"
1) "pmessage"
2) "*"
3) "+failover-state-send-slaveof-noone"
4) "slave 127.0.0.1:6381 127.0.0.1 6381 @ mymaster 127.0.0.1 6379"
1) "pmessage"
2) "*"
3) "+failover-state-wait-promotion"
4) "slave 127.0.0.1:6381 127.0.0.1 6381 @ mymaster 127.0.0.1 6379"
1) "pmessage"
2) "*"
3) "-role-change"
4) "slave 127.0.0.1:6381 127.0.0.1 6381 @ mymaster 127.0.0.1 6379 new reported role is master"
1) "pmessage"
2) "*"
3) "+promoted-slave"
4) "slave 127.0.0.1:6381 127.0.0.1 6381 @ mymaster 127.0.0.1 6379"
1) "pmessage"
2) "*"
3) "+failover-state-reconf-slaves"
4) "master mymaster 127.0.0.1 6379"
1) "pmessage"
2) "*"
3) "+slave-reconf-sent"
4) "slave 127.0.0.1:6380 127.0.0.1 6380 @ mymaster 127.0.0.1 6379"
1) "pmessage"
2) "*"
3) "-odown"
4) "master mymaster 127.0.0.1 6379"
1) "pmessage"
2) "*"
3) "+slave-reconf-inprog"
4) "slave 127.0.0.1:6380 127.0.0.1 6380 @ mymaster 127.0.0.1 6379"
1) "pmessage"
2) "*"
3) "+slave-reconf-done"
4) "slave 127.0.0.1:6380 127.0.0.1 6380 @ mymaster 127.0.0.1 6379"
1) "pmessage"
2) "*"
3) "+failover-end"
4) "master mymaster 127.0.0.1 6379"
1) "pmessage"
2) "*"
3) "+switch-master"
4) "mymaster 127.0.0.1 6379 127.0.0.1 6381"
1) "pmessage"
2) "*"
3) "+slave"
4) "slave 127.0.0.1:6380 127.0.0.1 6380 @ mymaster 127.0.0.1 6381"
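
Of all these events, `+switch-master` is the one a client usually cares about, because its payload carries the master name followed by the old and the new address. A small Python sketch of reading that payload, where `parse_switch_master` is a hypothetical helper written for this post:

```python
from typing import Tuple

Address = Tuple[str, int]


def parse_switch_master(payload: str) -> Tuple[str, Address, Address]:
    """Split '<name> <old ip> <old port> <new ip> <new port>' into parts."""
    fields = payload.split()
    if len(fields) != 5:
        raise ValueError(f"expected 5 fields, got {len(fields)}")
    name, old_ip, old_port, new_ip, new_port = fields
    return name, (old_ip, int(old_port)), (new_ip, int(new_port))


if __name__ == "__main__":
    name, old, new = parse_switch_master("mymaster 127.0.0.1 6379 127.0.0.1 6381")
    print(f"{name}: {old} -> {new}")
```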

  2. The complete sentinel log:

$ tail -f 26379.log

38501:X 01 Sep 2020 17:15:48.851 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo

38501:X 01 Sep 2020 17:15:48.851 # Redis version=5.0.9, bits=64, commit=00000000, modified=0, pid=38501, just started

38501:X 01 Sep 2020 17:15:48.851 # Configuration loaded

38502:X 01 Sep 2020 17:15:48.854 * Running mode=sentinel, port=26379.

38502:X 01 Sep 2020 17:15:48.855 # Sentinel ID is b60bd3e15db23a9862d213e7703001c72d48dc73

38502:X 01 Sep 2020 17:15:48.855 # +monitor master mymaster 127.0.0.1 6379 quorum 2

38502:X 01 Sep 2020 17:15:48.855 * +slave slave 127.0.0.1:6381 127.0.0.1 6381 @ mymaster 127.0.0.1 6379 (discovered a slave)

38502:X 01 Sep 2020 17:17:59.296 * +slave slave 127.0.0.1:6380 127.0.0.1 6380 @ mymaster 127.0.0.1 6379 (discovered a slave)

38502:X 01 Sep 2020 17:20:31.119 * +sentinel sentinel fc976b271914f43a4a318dfe8c1f41a2e747f8d8 127.0.0.1 26380 @ mymaster 127.0.0.1 6379 (discovered another sentinel)

38502:X 01 Sep 2020 17:22:42.715 # +sdown slave 127.0.0.1:6380 127.0.0.1 6380 @ mymaster 127.0.0.1 6379

38502:X 01 Sep 2020 17:23:42.558 * +reboot slave 127.0.0.1:6380 127.0.0.1 6380 @ mymaster 127.0.0.1 6379

38502:X 01 Sep 2020 17:23:42.659 # -sdown slave 127.0.0.1:6380 127.0.0.1 6380 @ mymaster 127.0.0.1 6379

.............
............

(master 6379 was shut down manually)
38502:X 01 Sep 2020 17:26:38.311 # +sdown master mymaster 127.0.0.1 6379

38502:X 01 Sep 2020 17:26:38.395 # +odown master mymaster 127.0.0.1 6379 #quorum 2/2

38502:X 01 Sep 2020 17:26:38.395 # +new-epoch 1

38502:X 01 Sep 2020 17:26:38.395 # +try-failover master mymaster 127.0.0.1 6379

38502:X 01 Sep 2020 17:26:38.396 # +vote-for-leader b60bd3e15db23a9862d213e7703001c72d48dc73 1 (starts the vote for sentinel b60bd, which is this sentinel)

38502:X 01 Sep 2020 17:26:38.397 # fc976b271914f43a4a318dfe8c1f41a2e747f8d8 voted for b60bd3e15db23a9862d213e7703001c72d48dc73 1 (sentinel fc976b casts its vote for sentinel b60bd)

38502:X 01 Sep 2020 17:26:38.454 # +elected-leader master mymaster 127.0.0.1 6379

38502:X 01 Sep 2020 17:26:38.454 # +failover-state-select-slave master mymaster 127.0.0.1 6379

38502:X 01 Sep 2020 17:26:38.545 # +selected-slave slave 127.0.0.1:6381 127.0.0.1 6381 @ mymaster 127.0.0.1 6379 (6381 is selected to become the new master)

38502:X 01 Sep 2020 17:26:38.545 * +failover-state-send-slaveof-noone slave 127.0.0.1:6381 127.0.0.1 6381 @ mymaster 127.0.0.1 6379 (SLAVEOF NO ONE is sent so that 6381 becomes the new master)

38502:X 01 Sep 2020 17:26:38.646 * +failover-state-wait-promotion slave 127.0.0.1:6381 127.0.0.1 6381 @ mymaster 127.0.0.1 6379

38502:X 01 Sep 2020 17:26:39.282 # +promoted-slave slave 127.0.0.1:6381 127.0.0.1 6381 @ mymaster 127.0.0.1 6379

38502:X 01 Sep 2020 17:26:39.282 # +failover-state-reconf-slaves master mymaster 127.0.0.1 6379

38502:X 01 Sep 2020 17:26:39.346 * +slave-reconf-sent slave 127.0.0.1:6380 127.0.0.1 6380 @ mymaster 127.0.0.1 6379

38502:X 01 Sep 2020 17:26:39.480 # -odown master mymaster 127.0.0.1 6379

38502:X 01 Sep 2020 17:26:40.131 * +slave-reconf-inprog slave 127.0.0.1:6380 127.0.0.1 6380 @ mymaster 127.0.0.1 6379

38502:X 01 Sep 2020 17:26:40.131 * +slave-reconf-done slave 127.0.0.1:6380 127.0.0.1 6380 @ mymaster 127.0.0.1 6379

38502:X 01 Sep 2020 17:26:40.186 # +failover-end master mymaster 127.0.0.1 6379

38502:X 01 Sep 2020 17:26:40.186 # +switch-master mymaster 127.0.0.1 6379 127.0.0.1 6381

38502:X 01 Sep 2020 17:26:40.186 * +slave slave 127.0.0.1:6380 127.0.0.1 6380 @ mymaster 127.0.0.1 6381

38502:X 01 Sep 2020 17:26:40.186 * +slave slave 127.0.0.1:6379 127.0.0.1 6379 @ mymaster 127.0.0.1 6381

38502:X 01 Sep 2020 17:27:10.258 # +sdown slave 127.0.0.1:6379 127.0.0.1 6379 @ mymaster 127.0.0.1 6381
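
To see how long the failover took, the timestamps of the master's `+sdown` line and the `+switch-master` line can be compared. The Python sketch below does that on log text in the format shown above. `parse_line` and `failover_seconds` are hypothetical helpers, and the month abbreviation is parsed with `%b`, which assumes an English locale.

```python
from datetime import datetime
from typing import Iterable, List, Optional, Tuple


def parse_line(line: str) -> Optional[Tuple[datetime, str, List[str]]]:
    """Return (timestamp, event, args) or None for lines that do not parse."""
    tokens = line.split()
    if len(tokens) < 7:
        return None
    try:
        # tokens[1:5] looks like: 01 Sep 2020 17:26:38.311
        ts = datetime.strptime(" ".join(tokens[1:5]), "%d %b %Y %H:%M:%S.%f")
    except ValueError:
        return None
    return ts, tokens[6], tokens[7:]


def failover_seconds(lines: Iterable[str]) -> Optional[float]:
    """Seconds from the first '+sdown master' line to '+switch-master'."""
    start = end = None
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue
        ts, event, args = parsed
        if event == "+sdown" and args[:1] == ["master"] and start is None:
            start = ts
        elif event == "+switch-master":
            end = ts
    if start is None or end is None:
        return None
    return (end - start).total_seconds()


if __name__ == "__main__":
    log = [
        "38502:X 01 Sep 2020 17:26:38.311 # +sdown master mymaster 127.0.0.1 6379",
        "38502:X 01 Sep 2020 17:26:40.186 # +switch-master mymaster 127.0.0.1 6379 127.0.0.1 6381",
    ]
    print(failover_seconds(log))
```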

Original article: https://www.cnblogs.com/yangweiqiang/p/13627041.html