Posted: 2021-07-01 10:21:17
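Before setting up MongoDB, there are a few small questions to think through:
1. What is the cluster for? Multiple copies of the data for fault tolerance and availability, horizontal scaling to store large volumes of data, or both?
If the goal is multiple copies, a replication cluster is enough. If the goal is handling large data, you need a sharding cluster. If you need both, put a replica set behind every shardsvr.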
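The choice above can be written down as a small decision function. This is only a sketch of the reasoning in this section: the function name `chooseDeployment` and the "standalone mongod" fallback are assumptions added for illustration, not something MongoDB provides.

```javascript
// Hypothetical helper that mirrors the decision described above.
// needsRedundancy: multiple copies of the data are required
// needsScaleOut:   the data must be spread over several machines
function chooseDeployment(needsRedundancy, needsScaleOut) {
  if (needsRedundancy && needsScaleOut) {
    return "sharding cluster with a replica set behind every shardsvr";
  }
  if (needsScaleOut) {
    return "sharding cluster";
  }
  if (needsRedundancy) {
    return "replication cluster";
  }
  return "standalone mongod"; // assumed fallback when neither goal applies
}

// Large data, no spare disk for copies: scale-out only.
console.log(chooseDeployment(false, true));
```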
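2. What is sharding, and how does it differ from replication?
Put simply, a replica set is the basic unit through which MongoDB provides service. To the user, a standalone server and a replication cluster look the same: each acts as a single service node. The replication cluster additionally keeps multiple copies of the data and elects a new primary on the server side, so it is more reliable. A sharding cluster has three roles: mongos, configsvr and shardsvr. mongos is the router: it is the entry point that serves clients and forwards each request to the right shards. Each shardsvr stores one slice (shard) of the data. The configsvr records the sharding metadata, that is, which data lives on which shard. The configsvr and each shardsvr run as replica sets; mongos holds no data and is not a replica set. For details, see the official documentation on sharding and replication:
replication: https://docs.mongodb.com/manual/replication/
sharding: https://docs.mongodb.com/manual/sharding/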
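3. What should the cluster architecture look like?
Once questions 1 and 2 are clear, the shape of the cluster follows from the hardware you have. Each shardsvr replica set is one shard, so create as many shardsvr as you need shards. The configsvr holds the sharding metadata, and all machines can be joined into a single configsvr replica set. As for mongos, one instance is enough for internal use; for stable operation, run several mongos instances (they are independent processes and do not form a replica set).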
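Part 2
Now for the hands-on part:
I have five servers available for MongoDB, plus a batch of data. These five machines also run other frameworks such as Spark and Hadoop. Since my Spark and Hadoop are both single points of failure (What? Multiple masters? A secondary? No such thing, every cluster I deploy has a single point of failure), mongos also runs on a single node. The data is spread over all five machines, and because the data set is large and the disks are small (someone else set up RAID 5, and each server ends up with only a bit over 1 TB), backups and stability are off the table (with two copies there would be little disk left, and HDFS has other data to store). The architecture is therefore: one mongos on 192.168.12.161, one configsvr replica set (configsvr0) spanning all five machines, and one shardsvr on each machine.
Each shardsvr below is its own single-member replica set. Let's deploy: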
1. Create the configuration files:
a) configsvr
    systemLog:
      destination: file
      path: "/home/cloud/platform/logs/mongodb/configsvr.log"
      logAppend: true
    storage:
      dbPath: "/home/cloud/platform/data/configData"
      journal:
        enabled: true
    setParameter:
      enableLocalhostAuthBypass: false
    processManagement:
      fork: true
    replication:
      replSetName: "configsvr0"
    sharding:
      clusterRole: "configsvr"
b) shardsvr
    systemLog:
      destination: file
      path: "/home/cloud/platform/logs/mongodb/shardsvr.log"
      logAppend: true
    storage:
      dbPath: "/home/cloud/platform/data/shardData"
      journal:
        enabled: true
    setParameter:
      enableLocalhostAuthBypass: false
    processManagement:
      fork: true
    replication:
      replSetName: "shardsvr1"
    sharding:
      clusterRole: "shardsvr"
c) mongos
    systemLog:
      destination: file
      path: "/home/cloud/platform/logs/mongodb/mongos.log"
      logAppend: true
    net:
      bindIp: 192.168.12.161
      port: 27017
    setParameter:
      enableLocalhostAuthBypass: false
    processManagement:
      fork: true
    sharding:
      configDB: "configsvr0/192.168.12.161:27019,192.168.12.162:27019,192.168.12.163:27019,192.168.12.164:27019,192.168.12.169:27019"
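Note: the configuration differs slightly on each machine, and I recommend adjusting it against the official documentation. replSetName is the replica set setting: all members of the same replica set must use the same value, and different replica sets must use different values.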
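The per-machine differences in the note can be listed mechanically. The sketch below is plain JavaScript for Node.js. The helper name `shardSettings` and the names shardsvr2 to shardsvr5 for the other four machines are assumptions: the article itself only shows shardsvr1 on 192.168.12.161.

```javascript
// Hosts taken from the configDB line of the mongos configuration above.
const hosts = ["192.168.12.161", "192.168.12.162", "192.168.12.163",
               "192.168.12.164", "192.168.12.169"];

// Hypothetical helper: one entry per machine. Every shardsvr is its own
// single-member replica set, so every machine gets a distinct replSetName.
function shardSettings(hostList) {
  return hostList.map(function (host, i) {
    const name = "shardsvr" + (i + 1); // assumed naming scheme
    return {
      host: host,
      replSetName: name, // value to put into that machine's shardsvr config
      // document to pass to rs.initiate() on that machine (default port 27018)
      initiate: { _id: name, members: [{ _id: 0, host: host + ":27018" }] }
    };
  });
}

const settings = shardSettings(hosts);
settings.forEach(function (s) {
  console.log(s.host + " -> replSetName: " + s.replSetName);
});
```

The configsvr side needs no such table: all five machines share the one replSetName configsvr0.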
Next come the startup scripts:
a)shardsvr
    #!/bin/bash
    # use this to initiate: rs.initiate({_id:"shardsvr1",members:[{_id:0,host:"192.168.12.161:27018"}]})
    /home/cloud/platform/mongodb-3.4.5/bin/mongod --config /home/cloud/platform/mongodb-3.4.5/shardserver.conf
b)configsvr
    #!/bin/bash
    # use this to initiate: rs.initiate({_id:"configsvr0",configsvr:true,members:[{_id:0,host:"192.168.12.161:27019"},{_id:1,host:"192.168.12.162:27019"},{_id:2,host:"192.168.12.163:27019"},{_id:3,host:"192.168.12.164:27019"},{_id:4,host:"192.168.12.169:27019"}]})
    MONGO_HOME=/home/cloud/platform/mongodb-3.4.5
    ${MONGO_HOME}/bin/mongod --config ${MONGO_HOME}/configserver.conf
c)mongos
    #!/bin/bash
    # mongos does not need to be initiated
    # sh.enableSharding("dbname") to create the database
    # sh.shardCollection("dbname.tablename", {id: "hashed"}) to create a sharded table split by id
    /home/cloud/platform/mongodb-3.4.5/bin/mongos --config /home/cloud/platform/mongodb-3.4.5/mongosserver.conf
2. Startup procedure
a. Copy the scripts and configuration files to every machine.
b. Start each shardsvr, then log in to it and run the initialization:
    1. Run start-shardsvr.sh
    2. Run bin/mongo --host ${hostIP} --port ${hostport}
       The default port of shardsvr is 27018
       The default port of configsvr is 27019
       The default port of mongos is 27017
       The shardsvr and configsvr configuration files above do not set a port, so the defaults apply
    3. Run rs.initiate({_id:"shardsvr1",members:[{_id:0,host:"192.168.12.161:27018"}]})
       to initialize the replica set (on each machine, use that machine's own replSetName and address)
    4. Run rs.status() to check the shardsvr state; on success the single member is reported as PRIMARY
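Checking rs.status() by eye mostly comes down to finding a healthy PRIMARY member. A minimal sketch of that check follows. The function name `hasHealthyPrimary` is an assumption, and the `sample` object is a hand-written, heavily trimmed stand-in for the real output that keeps only the `ok`, `members`, `stateStr` and `health` fields; it is not captured from an actual server.

```javascript
// Hypothetical helper: true when the status document reports ok: 1 and
// at least one member that is PRIMARY with health 1.
function hasHealthyPrimary(status) {
  if (!status || status.ok !== 1 || !Array.isArray(status.members)) {
    return false;
  }
  return status.members.some(function (m) {
    return m.stateStr === "PRIMARY" && m.health === 1;
  });
}

// Hand-written, trimmed example; real rs.status() output has many more fields.
const sample = {
  set: "shardsvr1",
  ok: 1,
  members: [
    { _id: 0, name: "192.168.12.161:27018", health: 1, state: 1, stateStr: "PRIMARY" }
  ]
};

console.log(hasHealthyPrimary(sample)); // true
```

In the mongo shell the same function could be called as hasHealthyPrimary(rs.status()) after pasting its definition.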
c. Start all configsvr instances, then use mongo --host --port to log in to the configsvr port (default: 27019) of any one of them and run the initialization:
    rs.initiate({_id:"configsvr0",configsvr:true,members:[{_id:0,host:"192.168.12.161:27019"},{_id:1,host:"192.168.12.162:27019"},{_id:2,host:"192.168.12.163:27019"},{_id:3,host:"192.168.12.164:27019"},{_id:4,host:"192.168.12.169:27019"}]})
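The long rs.initiate document and the configDB string in the mongos configuration carry the same host list, so both can be derived from one array to keep them in sync. This is a sketch for Node.js; the helper names `configReplSetConfig` and `configDBString` are assumptions made for illustration.

```javascript
// Hypothetical helper: build the document passed to rs.initiate()
// for a config server replica set.
function configReplSetConfig(name, hostList, port) {
  return {
    _id: name,
    configsvr: true,
    members: hostList.map(function (host, i) {
      return { _id: i, host: host + ":" + port };
    })
  };
}

// Hypothetical helper: build the matching sharding.configDB value for mongos,
// in the form "<replSetName>/<host1>:<port>,<host2>:<port>,...".
function configDBString(config) {
  const members = config.members.map(function (m) { return m.host; });
  return config._id + "/" + members.join(",");
}

const configHosts = ["192.168.12.161", "192.168.12.162", "192.168.12.163",
                     "192.168.12.164", "192.168.12.169"];
const configCfg = configReplSetConfig("configsvr0", configHosts, 27019);

console.log(JSON.stringify(configCfg));
console.log(configDBString(configCfg));
```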
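d. Start mongos and connect to it with the mongo shell. Each shardsvr replica set first has to be registered with sh.addShard("<replSetName>/<host>:<port>"), for example sh.addShard("shardsvr1/192.168.12.161:27018"). After that, operations can be run on mongos, and the sharding status can be checked with:
    printShardingStatus()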
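With five shards, the sh.addShard arguments are easy to mistype, so they can be generated. The sketch below runs in Node.js and only builds the strings. The helper name `shardConnectionStrings` and the names shardsvr2 to shardsvr5 are assumptions, as the article only names shardsvr1.

```javascript
// Hypothetical helper: build "<replSetName>/<host>:<port>" strings,
// the argument format sh.addShard() expects for a replica set shard.
function shardConnectionStrings(hostList, port) {
  return hostList.map(function (host, i) {
    return "shardsvr" + (i + 1) + "/" + host + ":" + port; // assumed names
  });
}

const shardHosts = ["192.168.12.161", "192.168.12.162", "192.168.12.163",
                    "192.168.12.164", "192.168.12.169"];
const shardStrings = shardConnectionStrings(shardHosts, 27018);

// Print the commands to paste into a mongo shell connected to mongos.
shardStrings.forEach(function (s) {
  console.log('sh.addShard("' + s + '")');
});
```

After the commands have been run on mongos, printShardingStatus() should list every registered shard.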
From here on it is ordinary mongo shell work. mongos can be treated like a plain standalone MongoDB and the operations are basically the same, except for creating sharded tables.
Create a table as follows:
    sh.enableSharding("dbname")                                  // enable sharding for the database
    sh.shardCollection("dbname.tablename", {"_id": "hashed"})    // create a sharded table hashed by _id
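Note that "_id" is the shard key here, and its values must be unique. To hash on a different field, replace "_id" in the command with that field's name; MongoDB will still create an _id field automatically and index it.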
Add an index (createIndex replaces the older ensureIndex, which has been a deprecated alias since MongoDB 3.0):
    db.collectionname.createIndex({"indexColumn": 1})
Part 3
Java API tips (the snippets are Scala calling the Java driver)
Get a connection:
    // connect through mongos (192.168.12.161:27017)
    import com.mongodb.MongoClient
    lazy val mongo = new MongoClient("192.168.12.161", 27017)
    lazy val db = mongo.getDatabase("testdb")
    lazy val dbColl = db.getCollection("origin2")
Insert data:
    // x is presumably a java.io.File; filterHtml and detector (which supplies the
    // file encoding) are the author's own helpers and are not shown here
    import java.util.ArrayList
    import scala.io.Source
    import org.bson.Document
    import com.mongodb.client.model.InsertManyOptions
    val resList = new ArrayList[Document]
    val d = new Document
    d.append("path", x.getPath)
    d.append("name", x.getName)
    d.append("content", filterHtml(Source.fromFile(x, detector(x)).getLines().toArray).mkString("\n"))
    resList.add(d)
    dbColl.insertMany(resList, new InsertManyOptions().ordered(false))
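During insertion, if a duplicate "_id" appears, the default behavior is to abort the operation and throw an exception: the documents before the duplicate have been inserted, and the ones after it have not. Passing new InsertManyOptions().ordered(false) makes the driver insert every non-duplicate document first and throw the exception afterwards.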
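The difference between the two modes can be seen in a small in-memory model. This is a deliberately simplified simulation in plain JavaScript, not the driver's implementation: the function `simulateInsertMany` is hypothetical and only models a unique "_id" constraint.

```javascript
// Simplified model of insertMany against a collection with a unique _id.
// existingIds: _id values already stored; docs: documents to insert;
// ordered: true models the default, false models ordered(false).
function simulateInsertMany(existingIds, docs, ordered) {
  const seen = new Set(existingIds);
  const inserted = [];
  const duplicates = [];
  for (const doc of docs) {
    if (seen.has(doc._id)) {
      duplicates.push(doc._id);
      if (ordered) {
        break;   // ordered: stop at the first duplicate
      }
      continue;  // unordered: skip the duplicate and keep going
    }
    seen.add(doc._id);
    inserted.push(doc._id);
  }
  // In both modes an exception is reported if any duplicate was hit.
  return { inserted: inserted, duplicates: duplicates, throws: duplicates.length > 0 };
}

const batch = [{ _id: 1 }, { _id: 2 }, { _id: 2 }, { _id: 3 }];
console.log(simulateInsertMany([], batch, true).inserted);  // [ 1, 2 ]
console.log(simulateInsertMany([], batch, false).inserted); // [ 1, 2, 3 ]
```

In the ordered run the document with _id 3 is never attempted, which matches the "later data is not inserted" behavior described above.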
Setting up a MongoDB 3.4 sharding cluster and basic use of the Java API