MongoDB: Replica Set Setup and Troubleshooting

Date: 2019-07-19 10:07:14
I. Setting up the MongoDB replica set

1. Network topology

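[Topology image not available. Per the configuration files below: primary 10.3.152.78:27017, secondary 10.3.151.34:27017, arbiter 10.3.151.34:27020.]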

2. Install MongoDB on the three nodes

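See the installation guide on this blog: extract the archive to the target location and set up the environment variables. Do not start the service yet.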

3. On each node, create the data directory under /app/mongodb

mkdir -p /app/mongodb/master     # use slave or arbiter instead of master on the respective node

Primary node configuration file (newly created; step 6 starts it from /etc/mongodb_master.conf):
#master.conf
dbpath=/app/mongodb/master
logpath=/app/mongodb/master.log
pidfilepath=/app/mongodb/master.pid
#keyFile=/app/mongodb/mongodb.key     (enable after the replica set has been configured)
directoryperdb=true
logappend=true
replSet=testdb
bind_ip=10.3.152.78
port=27017
#auth=true                     (enable after the replica set has been configured)
oplogSize=100
fork=true
noprealloc=true
maxConns=4000

Secondary node configuration file (step 6 starts it from /etc/mongodb_slave.conf):
#slave.conf
dbpath=/app/mongodb/slave
logpath=/app/mongodb/slave.log
pidfilepath=/app/mongodb/slave.pid
#keyFile=/app/mongodb/mongodb.key
directoryperdb=true
logappend=true
replSet=testdb
bind_ip=10.3.151.34
port=27017
#auth=true
oplogSize=100
fork=true
noprealloc=true
maxConns=4000

Arbiter node configuration file (step 6 starts it from /etc/mongodb_arbiter.conf):
#arbiter.conf
dbpath=/app/mongodb/arbiter
logpath=/app/mongodb/arbiter.log
pidfilepath=/app/mongodb/arbiter.pid
#keyFile=/app/mongodb/mongodb.key
directoryperdb=true
logappend=true
replSet=testdb
bind_ip=10.3.151.34
port=27020
#auth=true
oplogSize=100
fork=true
noprealloc=true
maxConns=4000

4. On the primary node, install openssl and generate the keyfile that the three members use to authenticate to each other

yum -y install openssl
openssl rand -base64 756 > /app/mongodb/mongodb.key
chmod 400 /app/mongodb/mongodb.key
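
A quick sanity check on the 756 in the command above: base64 turns every 3 input bytes into 4 characters, and MongoDB's documentation gives a keyfile length limit of 6 to 1024 characters. The sketch below only does that arithmetic. The function name base64Length is made up for illustration and is not part of MongoDB or openssl.

```javascript
// Hypothetical helper: number of base64 characters produced for nBytes of
// input (padding included, line breaks not counted).
function base64Length(nBytes) {
  return Math.ceil(nBytes / 3) * 4;
}

// 756 random bytes, as in `openssl rand -base64 756`
var keyChars = base64Length(756);                    // 1008

// Assumed limit from the MongoDB documentation: 6 to 1024 characters
var fitsLimit = keyChars >= 6 && keyChars <= 1024;   // true
```

So 756 bytes is close to the largest input that still fits the limit, since it encodes to 1008 characters.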

5. Copy mongodb.key to the /app/mongodb/ directory on the secondary and arbiter servers, and set the same permissions (chmod 400)

6. Start MongoDB on all three nodes

Primary: mongod -f /etc/mongodb_master.conf
Secondary: mongod -f /etc/mongodb_slave.conf
Arbiter: mongod -f /etc/mongodb_arbiter.conf

7. On the primary node, add the replica set configuration

[Screenshot of the replica set configuration commands not available.]
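
Since the original screenshot for this step is missing, here is a minimal sketch of what the configuration document might look like, assuming the member addresses from the configuration files in step 3. The helper name buildReplSetConfig is hypothetical, and the exact commands the author ran are not known. In the mongo shell the resulting object would be passed to rs.initiate().

```javascript
// Hypothetical helper: builds a replica set configuration document.
// The member addresses used below are taken from the config files in step 3.
function buildReplSetConfig(setName, dataHosts, arbiterHosts) {
  var members = [];
  dataHosts.forEach(function (host) {
    members.push({ _id: members.length, host: host });
  });
  arbiterHosts.forEach(function (host) {
    // arbiterOnly marks a member that votes in elections but holds no data
    members.push({ _id: members.length, host: host, arbiterOnly: true });
  });
  return { _id: setName, members: members };
}

var config = buildReplSetConfig(
  "testdb",
  ["10.3.152.78:27017", "10.3.151.34:27017"],
  ["10.3.151.34:27020"]
);

// Inside the mongo shell `rs` is defined and the config can be applied.
// Outside the shell (for example in Node.js) this branch is skipped.
if (typeof rs !== "undefined") {
  rs.initiate(config);
}
```

The set name must match the replSet value in the configuration files (testdb here).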

8. Check the current replica set status

switched to db admin
testdb:PRIMARY> rs.status()
{
    "set" : "testdb",
    "date" : ISODate("2019-07-18T08:53:54.483Z"),
    "myState" : 1,
    "members" : [
        {
            "_id" : 0,
            "name" : "10.3.152.78:27017",
            "health" : 1,
            "state" : 1,
            "stateStr" : "PRIMARY",      # primary node
            "uptime" : 1302,
            "optime" : Timestamp(1563439850, 1),
            "optimeDate" : ISODate("2019-07-18T08:50:50Z"),
            "electionTime" : Timestamp(1563439854, 1),
            "electionDate" : ISODate("2019-07-18T08:50:54Z"),
            "configVersion" : 1,
            "self" : true
        },
        {
            "_id" : 1,
            "name" : "10.3.151.34:27017",
            "health" : 1,
            "state" : 2,
            "stateStr" : "SECONDARY",        # secondary node
            "uptime" : 184,
            "optime" : Timestamp(1563439850, 1),
            "optimeDate" : ISODate("2019-07-18T08:50:50Z"),
            "lastHeartbeat" : ISODate("2019-07-18T08:53:54.360Z"),
            "lastHeartbeatRecv" : ISODate("2019-07-18T08:53:54.360Z"),
            "pingMs" : 0,
            "lastHeartbeatMessage" : "could not find member to sync from",
            "configVersion" : 1
        },
        {
            "_id" : 2,
            "name" : "10.3.151.34:27020",
            "health" : 1,
            "state" : 7,
            "stateStr" : "ARBITER",       # arbiter node
            "uptime" : 184,
            "lastHeartbeat" : ISODate("2019-07-18T08:53:54.360Z"),
            "lastHeartbeatRecv" : ISODate("2019-07-18T08:53:54.359Z"),
            "pingMs" : 0,
            "configVersion" : 1
        }
    ],
    "ok" : 1
}
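
The rs.status() output is long, and usually only each member's state and health matter. A small sketch of how it could be condensed; summarizeStatus is a hypothetical helper, not a built-in, and the sample object keeps only the fields it reads, copied from the output above.

```javascript
// Hypothetical helper: reduces an rs.status() document to the current
// primary, a host -> state map, and a list of unhealthy members.
function summarizeStatus(status) {
  var summary = { primary: null, states: {}, unhealthy: [] };
  status.members.forEach(function (m) {
    summary.states[m.name] = m.stateStr;
    if (m.stateStr === "PRIMARY") summary.primary = m.name;
    if (m.health !== 1) summary.unhealthy.push(m.name);
  });
  return summary;
}

// Trimmed sample with only the fields used above
var sample = {
  set: "testdb",
  members: [
    { name: "10.3.152.78:27017", health: 1, stateStr: "PRIMARY" },
    { name: "10.3.151.34:27017", health: 1, stateStr: "SECONDARY" },
    { name: "10.3.151.34:27020", health: 1, stateStr: "ARBITER" }
  ]
};

var result = summarizeStatus(sample);
```

In the mongo shell, after defining the function, printjson(summarizeStatus(rs.status())) would print the same summary for the live replica set.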

9. On the primary, create an administrator (superuser) account

testdb:PRIMARY> use admin
switched to db admin
testdb:PRIMARY> db.createUser(
... {
... user:"admin",
... pwd:"abc123",
... roles:["readWriteAnyDatabase","dbAdminAnyDatabase","userAdminAnyDatabase","clusterAdmin"]
... }
... )
Successfully added user: {
    "user" : "admin",
    "roles" : [
        "readWriteAnyDatabase",
        "dbAdminAnyDatabase",
        "userAdminAnyDatabase",
        "clusterAdmin"
    ]
}

10. Now enable authentication for MongoDB

Uncomment the following two lines in all three configuration files:
#keyFile=/app/mongodb/mongodb.key
#auth=true

11. Restart all MongoDB instances, log in to the primary node, and check the replica set status

[root@ops-site master]# mongo 10.3.152.78
MongoDB shell version: 3.0.15
connecting to: 10.3.152.78/test
testdb:PRIMARY> use admin
switched to db admin
testdb:PRIMARY> db.auth("admin","abc123")        # authentication succeeded
1
testdb:PRIMARY> rs.status()              # status is normal
{
    "set" : "testdb",
    "date" : ISODate("2019-07-18T09:05:31.654Z"),
    "myState" : 1,
    "members" : [
        {
            "_id" : 0,
            "name" : "10.3.152.78:27017",
            "health" : 1,
            "state" : 1,
            "stateStr" : "PRIMARY",
            "uptime" : 152,
            "optime" : Timestamp(1563440306, 4),
            "optimeDate" : ISODate("2019-07-18T08:58:26Z"),
            "electionTime" : Timestamp(1563440628, 1),
            "electionDate" : ISODate("2019-07-18T09:03:48Z"),
            "configVersion" : 1,
            "self" : true
        },
        {
            "_id" : 1,
            "name" : "10.3.151.34:27017",
            "health" : 1,
            "state" : 2,
            "stateStr" : "SECONDARY",
            "uptime" : 103,
            "optime" : Timestamp(1563440306, 4),
            "optimeDate" : ISODate("2019-07-18T08:58:26Z"),
            "lastHeartbeat" : ISODate("2019-07-18T09:05:30.551Z"),
            "lastHeartbeatRecv" : ISODate("2019-07-18T09:05:30.550Z"),
            "pingMs" : 0,
            "configVersion" : 1
        },
        {
            "_id" : 2,
            "name" : "10.3.151.34:27020",
            "health" : 1,
            "state" : 7,
            "stateStr" : "ARBITER",
            "uptime" : 59,
            "lastHeartbeat" : ISODate("2019-07-18T09:05:30.540Z"),
            "lastHeartbeatRecv" : ISODate("2019-07-18T09:05:31.002Z"),
            "pingMs" : 0,
            "configVersion" : 1
        }
    ],
    "ok" : 1
}

12. On the primary, create a test database, add a python collection with a few documents, and check that the secondary replicates them

Create on the primary:
[root@ops-site master]# mongo 10.3.152.78
MongoDB shell version: 3.0.15
connecting to: 10.3.152.78/test
testdb:PRIMARY> use admin
switched to db admin
testdb:PRIMARY> db.auth("admin","abc123")
1
testdb:PRIMARY> use test
switched to db test
testdb:PRIMARY> db.createCollection("python")
{ "ok" : 1 }
testdb:PRIMARY> db.python.insert({x:1000})
WriteResult({ "nInserted" : 1 })
testdb:PRIMARY> db.python.insert({x1:1000})
WriteResult({ "nInserted" : 1 })
Check on the secondary (replication succeeded):
[root@nbutest arbiter]# mongo 10.3.151.34
MongoDB shell version: 3.0.15
connecting to: 10.3.151.34/test
testdb:SECONDARY> use admin
switched to db admin
testdb:SECONDARY> db.auth("admin","abc123")
1
testdb:SECONDARY> db.getMongo().setSlaveOk();       # allow reads on the secondary (not allowed by default)
testdb:SECONDARY> use test
switched to db test
testdb:SECONDARY> db.python.find()
{ "_id" : ObjectId("5d303765712c279b0cae08b9"), "x" : 1000 }
{ "_id" : ObjectId("5d303769712c279b0cae08ba"), "x1" : 1000 }
testdb:SECONDARY> 

13. Stop the primary and check whether the secondary becomes the primary

Primary:
[root@ops-site master]# mongo 10.3.152.78
MongoDB shell version: 3.0.15
connecting to: 10.3.152.78/test
testdb:PRIMARY> use admin
switched to db admin
testdb:PRIMARY> db.auth("admin","abc123")
1
testdb:PRIMARY> db.shutdownServer()
2019-07-18T17:15:07.766+0800 I NETWORK  DBClientCursor::init call() failed
server should be down...
2019-07-18T17:15:07.768+0800 I NETWORK  trying reconnect to 10.3.152.78:27017 (10.3.152.78) failed
2019-07-18T17:15:07.768+0800 I NETWORK  reconnect 10.3.152.78:27017 (10.3.152.78) ok
2019-07-18T17:15:07.770+0800 I NETWORK  DBClientCursor::init call() failed
2019-07-18T17:15:07.814+0800 I NETWORK  trying reconnect to 10.3.152.78:27017 (10.3.152.78) failed
2019-07-18T17:15:07.814+0800 I NETWORK  reconnect 10.3.152.78:27017 (10.3.152.78) ok
2019-07-18T17:15:08.475+0800 I NETWORK  Socket recv() errno:104 Connection reset by peer 10.3.152.78:27017
2019-07-18T17:15:08.475+0800 I NETWORK  SocketException: remote: 10.3.152.78:27017 error: 9001 socket exception [RECV_ERROR] server [10.3.152.78:27017] 

Secondary:
[root@nbutest arbiter]# mongo 10.3.151.34
MongoDB shell version: 3.0.15
connecting to: 10.3.151.34/test
testdb:PRIMARY>             # the prompt now shows PRIMARY
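
The failover in this step works because the secondary and the arbiter together still form a majority of the three voting members. A small sketch of that rule follows; the function names are hypothetical, and it assumes every member has exactly one vote, as in this setup.

```javascript
// Votes needed to elect a primary: a strict majority of the voting members.
function majorityNeeded(votingMembers) {
  return Math.floor(votingMembers / 2) + 1;
}

// True if the members that are still reachable can elect a primary.
function canElectPrimary(votingMembers, reachableMembers) {
  return reachableMembers >= majorityNeeded(votingMembers);
}

var afterPrimaryStops = canElectPrimary(3, 2);  // secondary + arbiter remain: true
var onlyOneLeft = canElectPrimary(3, 1);        // a single survivor: false
```

This is also why the arbiter is there: with only a primary and a secondary, losing either node would leave one vote out of two, which is not a majority.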

14. Start the original primary again and check whether the primary role switches back to 10.3.152.78

[root@ops-site master]# mongod -f /etc/mongodb_master.conf 
note: noprealloc may hurt performance in many applications
about to fork child process, waiting until server is ready for connections.
forked process: 28829
child process started successfully, parent exiting
You have mail in /var/spool/mail/root
[root@ops-site master]# mongo 10.3.152.78
MongoDB shell version: 3.0.15
connecting to: 10.3.152.78/test
testdb:PRIMARY>              # switched back successfully

II. Troubleshooting the replica set

1. Handling a secondary stuck in the RECOVERING state

1. On the secondary, change to the /app/mongodb/slave directory
[root@nbutest slave]# pwd
/app/mongodb/slave
[root@nbutest slave]# ll
total 24
drwxr-xr-x 3 root root 4096 Jul 18 17:03 admin
drwxr-xr-x 2 root root 4096 Jul 18 17:09 journal
drwxr-xr-x 3 root root 4096 Jul 18 17:03 local
-rw-r--r-- 1 root root    6 Jul 18 17:03 mongod.lock
-rw-r--r-- 1 root root   69 Jul 18 16:34 storage.bson
drwxr-xr-x 3 root root 4096 Jul 18 17:09 test

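2. With mongod on the secondary stopped, delete all data in this directory
To be safer, you can instead point dbpath in the configuration file at a new, empty location and leave the old data in place.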

3. Start MongoDB on the secondary and let it resync from the primary
testdb:PRIMARY> rs.status()
{
    "set" : "testdb",
    "date" : ISODate("2019-07-18T09:30:03.958Z"),
    "myState" : 1,
    "members" : [
        {
            "_id" : 0,
            "name" : "10.3.152.78:27017",
            "health" : 1,
            "state" : 1,
            "stateStr" : "PRIMARY",
            "uptime" : 743,
            "optime" : Timestamp(1563441001, 1),
            "optimeDate" : ISODate("2019-07-18T09:10:01Z"),
            "electionTime" : Timestamp(1563441462, 1),
            "electionDate" : ISODate("2019-07-18T09:17:42Z"),
            "configVersion" : 1,
            "self" : true
        },
        {
            "_id" : 1,
            "name" : "10.3.151.34:27017",
            "health" : 1,
            "state" : 2,
            "stateStr" : "SECONDARY",          # recovered
            "uptime" : 130,
            "optime" : Timestamp(1563441001, 1),
            "optimeDate" : ISODate("2019-07-18T09:10:01Z"),
            "lastHeartbeat" : ISODate("2019-07-18T09:30:03.747Z"),
            "lastHeartbeatRecv" : ISODate("2019-07-18T09:30:02.743Z"),
            "pingMs" : 0,
            "configVersion" : 1
        },
        {
            "_id" : 2,
            "name" : "10.3.151.34:27020",
            "health" : 1,
            "state" : 7,
            "stateStr" : "ARBITER",
            "uptime" : 743,
            "lastHeartbeat" : ISODate("2019-07-18T09:30:03.730Z"),
            "lastHeartbeatRecv" : ISODate("2019-07-18T09:30:03.846Z"),
            "pingMs" : 0,
            "configVersion" : 1
        }
    ],
    "ok" : 1
}

2. Changing the secondary's port to 27018

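1. Stop the mongod service on the secondary
2. Change the port in its configuration file to 27018
3. Log in to the primary, remove the current secondary, and add it back with the new port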
[root@ops-site master]# mongo 10.3.152.78
MongoDB shell version: 3.0.15
connecting to: 10.3.152.78/test
testdb:PRIMARY> use admin
switched to db admin
testdb:PRIMARY> db.auth("admin","abc123")
1
testdb:PRIMARY> rs.remove("10.3.151.34:27017")
{ "ok" : 1 }
testdb:PRIMARY> rs.add("10.3.151.34:27018")
{ "ok" : 1 }
testdb:PRIMARY> 
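
Removing and re-adding the member gives it a new _id. A different technique, swapped in here only for comparison and not what the author did, is to edit the host in the existing configuration and apply it with rs.reconfig(). A hedged sketch, where withMemberHost is a hypothetical helper and the sample configuration is trimmed to the fields it touches:

```javascript
// Hypothetical helper: returns a copy of a replica set config in which the
// member listening on oldHost is moved to newHost. The input is not modified.
function withMemberHost(cfg, oldHost, newHost) {
  var updated = {};
  Object.keys(cfg).forEach(function (k) { updated[k] = cfg[k]; });
  var found = false;
  updated.members = cfg.members.map(function (m) {
    if (m.host !== oldHost) return m;
    found = true;
    var copy = {};
    Object.keys(m).forEach(function (k) { copy[k] = m[k]; });
    copy.host = newHost;
    return copy;
  });
  if (!found) throw new Error("no member with host " + oldHost);
  return updated;
}

// Trimmed sample of what rs.conf() returns for this replica set
var current = {
  _id: "testdb",
  version: 1,
  members: [
    { _id: 0, host: "10.3.152.78:27017" },
    { _id: 1, host: "10.3.151.34:27017" },
    { _id: 2, host: "10.3.151.34:27020", arbiterOnly: true }
  ]
};

var moved = withMemberHost(current, "10.3.151.34:27017", "10.3.151.34:27018");
```

In the mongo shell on the primary this would be applied with rs.reconfig(withMemberHost(rs.conf(), "10.3.151.34:27017", "10.3.151.34:27018")), and the member keeps its _id of 1.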

4. Start the secondary
[root@nbutest slave]# mongod -f /etc/mongodb_slave.conf 
note: noprealloc may hurt performance in many applications
about to fork child process, waiting until server is ready for connections.
forked process: 25209
child process started successfully, parent exiting
[root@nbutest slave]# netstat -tlunp | grep mongod
tcp        0      0 10.3.151.34:27018           0.0.0.0:*                   LISTEN      25209/mongod        
tcp        0      0 10.3.151.34:27020           0.0.0.0:*                   LISTEN      24735/mongod 

5. Check the replica set status
testdb:PRIMARY> rs.status()
{
    "set" : "testdb",
    "date" : ISODate("2019-07-18T09:41:33.604Z"),
    "myState" : 1,
    "members" : [
        {
            "_id" : 0,
            "name" : "10.3.152.78:27017",
            "health" : 1,
            "state" : 1,
            "stateStr" : "PRIMARY",
            "uptime" : 1433,
            "optime" : Timestamp(1563442691, 1),
            "optimeDate" : ISODate("2019-07-18T09:38:11Z"),
            "electionTime" : Timestamp(1563441462, 1),
            "electionDate" : ISODate("2019-07-18T09:17:42Z"),
            "configVersion" : 3,
            "self" : true
        },
        {
            "_id" : 2,
            "name" : "10.3.151.34:27020",
            "health" : 1,
            "state" : 7,
            "stateStr" : "ARBITER",
            "uptime" : 1432,
            "lastHeartbeat" : ISODate("2019-07-18T09:41:31.679Z"),
            "lastHeartbeatRecv" : ISODate("2019-07-18T09:41:32.973Z"),
            "pingMs" : 0,
            "configVersion" : 3
        },
        {
            "_id" : 3,
            "name" : "10.3.151.34:27018",           # success
            "health" : 1,
            "state" : 2,
            "stateStr" : "SECONDARY",
            "uptime" : 102,
            "optime" : Timestamp(1563442691, 1),
            "optimeDate" : ISODate("2019-07-18T09:38:11Z"),
            "lastHeartbeat" : ISODate("2019-07-18T09:41:31.690Z"),
            "lastHeartbeatRecv" : ISODate("2019-07-18T09:41:33.004Z"),
            "pingMs" : 0,
            "configVersion" : 3
        }
    ],
    "ok" : 1
}
testdb:PRIMARY> 

3. Adding another secondary

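1. Install MongoDB on the new server, set the environment variables, and prepare its configuration file (similar to the other nodes, but with its own bind_ip, auth=true, and keyFile enabled).
2. Copy mongodb.key from one of the existing servers to the corresponding location and set the same permissions (chmod 400); otherwise the members cannot communicate with each other.
3. Start the mongod service on the new node.
4. On the primary, run rs.add("10.3.151.34:27017"), substituting the new node's address. To add an arbiter instead, use rs.addArb("10.3.151.150:27017").
5. Check the replica set status; once the new member shows as healthy, you are done.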

Original: https://blog.51cto.com/12965094/2421453
