Pages

Saturday, February 14, 2015

mongoDB Sharded Cluster



MongoDB production cluster must have three Config Servers, one or more Query Routers, and two or more Shards. Shards are either replica sets or a standalone mongod instances. The mongos instances are routers for the cluster, typically one mongos instance on each application server.

Config Servers are used to store metadata that links requested data with the shard. The application connects to Query Routers which communicates to Config Servers to determine the data location and returns the data from the appropriate shards. Shards are responsible for the actual data storage operations.

3 Config Servers (mongod metadata) : cfg0.example.com, cfg1.example.com, cfg2.example.com
2 Query Routers (mongos) : mongos0.example.com, mongos1.example.com
Shard 1 (mongod) : mongodb0.example.com, mongodb1.example.com, mongodb2.exmple.com
Shard 2 (mongod) : mongodb3.example.com, mongodb4.example.com, mongodb5.exmple.com

Disable SELINUX

# vi /etc/selinux/config
SELINUX=disabled

Add the following lines to /etc/security/limits.conf to increase limit.
mongod        soft    nproc           64000
mongod        hard    nproc           64000
mongod        soft    nofile          64000
mongod        hard    nofile          64000

Disable the usage of transparent hugepages

Add the following script lines to /etc/rc.local to improve performance.
if test -f /sys/kernel/mm/transparent_hugepage/khugepaged/defrag; then
  echo 0 > /sys/kernel/mm/transparent_hugepage/khugepaged/defrag
fi
if test -f /sys/kernel/mm/transparent_hugepage/defrag; then
  echo never > /sys/kernel/mm/transparent_hugepage/defrag
fi
if test -f /sys/kernel/mm/transparent_hugepage/enabled; then
  echo never > /sys/kernel/mm/transparent_hugepage/enabled
fi

# chmod +x /etc/rc.local

Enable NTP on all cluster machines.
# yum install ntp
# service ntpd start
# chkconfig ntpd on

Configure mongoDB repository.

$ sudo vi /etc/yum.repos.d/mongodb-enterprise.repo

[mongodb-enterprise]
name=MongoDB Enterprise Repository
baseurl=https://repo.mongodb.com/yum/redhat/$releasever/mongodb-enterprise/stable/$basearch/
gpgcheck=0
enabled=1

Disk Configuration

# fdisk /dev/sdb

Command (m for help): n
Command action
e extended
p primary partition (1-4)
p
Partition number (1-4): 1
First cylinder (1-1044, default 1):
Using default value 1
Last cylinder, +cylinders or +size{K,M,G} (1-1044, default 1044):
Using default value 1044
Command (m for help): w

# mkfs.ext4 /dev/sdb1
# mkdir /data      
Add the following line to /etc/fstab
/dev/sdb1 /data ext4 noatime,nodiratime 0 2
# mount /dev/sdb1

Configure Config Servers

On each of the config servers,
Create a data directory to store metadata.
$ sudo mkdir /data/configdb      --> all config servers for journal
$ sudo chown mongod:mongod /data/configdb

Install mongoDB server.
$ sudo yum --disablerepo=* --enablerepo=mongodb-org-3.0 install -y mongodb-enterprise-server

Modify parameters in /etc/mongod.conf
configsvr=true
dbpath=/data/configdb
Comment out to listen on all interfaces.
# bind_ip=127.0.0.1


$ sudo firewall-cmd --zone=public --add-port=27019/tcp
$ sudo firewall-cmd --permanent --add-port=27019/tcp


Start mongod instance on each of the three config servers.
$ sudo service mongod start


Verify Mongod proccess by checking the contents of the log file at /var/log/mongodb/mongod.log.
[initandlisten] waiting for connections on port <port>
Where <port>  is 27019 by default.


Configure Query Routers

Install mongos, mongo shell and mongo tools on each of query routers.
$ sudo yum --disablerepo=* --enablerepo=mongodb-org-3.0 install -y mongodb-enterprise-mongos mongodb-enterprise-shell mongodb-enterprise-tools


$ sudo firewall-cmd --zone=public --add-port=27017/tcp
$ sudo firewall-cmd --permanent --add-port=27017/tcp


Start the mongos instance specifying the config servers on both Query Routers. The mongos runs on the default port 27017.

$ sudo mongos --configdb cfg0.example.com:27019,cfg1.example.com:27019,cfg2.example.com:27019

Query Routers should begin to communicate to the three configuration servers.

Create the Replica Sets

Create a data directory on each of the three shard Servers.
$ sudo yum --disablerepo=* --enablerepo=mongodb-org-3.0 install -y mongodb-enterprise-server

$ sudo mkdir /data/db
$ chown mongod:mongod /data/db

Modify parameters in /etc/mongod.conf
shardsvr=true
dbpath=/data/configdb
Comment out to listen on all interfaces.
# bind_ip=127.0.0.1
replSet=rs0

$ sudo firewall-cmd --zone=public --add-port=27018/tcp
$ sudo firewall-cmd --permanent --add-port=27018/tcp

Start mongod instance on each of the three shard servers that will deploy each member of the replica set to its own machine. 
$ sudo service mongod start

Verify Mongod proccess by checking the contents of the log file at /var/log/mongodb/mongod.log.
[initandlisten] waiting for connections on port <port>
Where <port> is 27018 by default.

From query router mongos0.example.com, connect to the replica set member on mongodb0.example.com
$ mongo --host mongodb0.example.com --port 27018 admin

Initiate the replica set.
> rs.initiate()

Verify the initial replica set configuration.
rs0:SECONDARY> rs.conf()

Add the members to the replica set.
rs0:PRIMARY> rs.add("mongodb1.example.com:27018")
{ "ok" : 1 }
rs0:PRIMARY> rs.add("mongodb2.example.com:27018")
{ "ok" : 1 }

Check the status of the replica set.
rs0:PRIMARY> rs.status()
Repeat the above steps to create another replica set rs1 on mongodb3.example.com, mongodb4.example.com, and mongodb5.example.com.

Add Shards to the Cluster

Connect to any of the Query Routers.   
$ mongo mongos0.example.com:27017/admin
mongos> show dbs
admin   (empty)
config  0.016GB
mongos> sh.status()

Add a Shard for a replica set rs0 with replica set name and a replica set member mongodb0.example.com to the cluster.
mongos> sh.addShard("rs0/mongodb0.example.com:27018")
{ "shardAdded" : "rs0", "ok" : 1 }
mongos> sh.addShard("rs1/mongodb3.example.com:27018")
{ "shardAdded" : "rs1", "ok" : 1}
mongos> sh.status()

Enable sharding for a Database.

mongos> sh.enableSharding("testdb")
{ "ok" : 1 }
OR
mongos> db.runCommand( { enablesharding: "testdb" } )
{ "ok" : 0, "errmsg" : "already enabled" }

mongos> sh.shardCollection("testdb.testCollection",{"name":1})

{ "collectionsharded" : "testdb.testCollection", "ok" : 1 }

mongos> sh.status()

--- Sharding Status ---

  sharding version: {

        "_id" : 1,

        "minCompatibleVersion" : 5,

        "currentVersion" : 6,

        "clusterId" : ObjectId("553c531d71621ab16006d19c")

}
  shards:
        {  "_id" : "rs0",  "host" : "rs0/192.168.1.21:27018,192.168.1.22:27018,mongodb0.example.com:27018" }
  balancer:
        Currently enabled:  yes
        Currently running:  no
        Failed balancer rounds in last 5 attempts:  2
        Last reported error:  ReplicaSetMonitor no master found for set: rs0
        Time of Reported error:  Sun Apr 26 2015 01:51:27 GMT-0400 (EDT)
        Migration Results for the last 24 hours:
                No recent migrations
  databases:
        {  "_id" : "admin",  "partitioned" : false,  "primary" : "config" }
        {  "_id" : "testdb",  "partitioned" : true,  "primary" : "rs0" }
                testdb.testCollection
                        shard key: { "name" : 1 }
                        chunks:
                                rs0     1
                        { "name" : { "$minKey" : 1 } } -->> { "name" : { "$maxKey" : 1 } } on : rs0 Timestamp(1, 0)

mongos> sh.getBalancerState()
true

On query router
$ mongostat --discover
$ mongostat --host mongodb0.example.com:27018, mongodb1.example.com:27018, mongodb2.example.com:27018
If you don't see new database
mongos> use config
switched to db config
mongos> db.databases.find()
{ "_id" : "admin", "partitioned" : false, "primary" : "config" }
{ "_id" : "testdb", "partitioned" : true, "primary" : "rs0" }

Create "hashed" shard key on id field on collection.
mongos> use testdb
mongos> db.test_collection.ensureIndex( { _id : "hashed" } )
{
        "raw" : {
                "rs0/192.168.1.21:27018,192.168.1.22:27018,mongodb0.example.com:27018" : {
                        "createdCollectionAutomatically" : true,
                        "numIndexesBefore" : 1,
                        "errmsg" : "exception: bad index key pattern { _id: \"hasted\" }: Unknown index plugin 'hasted'",
                        "code" : 67,
                        "ok" : 0,
                        "$gleStats" : {
                                "lastOpTime" : Timestamp(1430087893, 1),
                                "electionId" : ObjectId("553c6a2f97bbb1cbf830be1c")
                        }
                }
        },
        "code" : 67,
        "ok" : 0,
        "errmsg" : "{ rs0/192.168.1.21:27018,192.168.1.22:27018,mongodb0.example.com:27018: \"exception: bad index key pattern { _id: \"hasted\" }: Unknown index plugin 'hasted'\" }"
}

Enable sharding for a Collection

mongos> sh.shardCollection("testdb.testCollection", { "_id": "hashed" } )
{ "ok" : 0, "errmsg" : "already sharded" }

Insert data into the collection.

mongos> use testdb
mongos> for (var i = 1; i <= 500; i++) db.testCollection.insert( { x : i } )
WriteResult({ "nInserted" : 1 })

Query Data from the Collection
mongos> db.testCollection.find()
{ "_id" : ObjectId("553d69f83dc4c576940dc59a"), "x" : 1 }
{ "_id" : ObjectId("553d69f83dc4c576940dc59b"), "x" : 2 }
{ "_id" : ObjectId("553d69f83dc4c576940dc59c"), "x" : 3 }
{ "_id" : ObjectId("553d69f83dc4c576940dc59d"), "x" : 4 }
{ "_id" : ObjectId("553d69f83dc4c576940dc59e"), "x" : 5 }
{ "_id" : ObjectId("553d69f83dc4c576940dc59f"), "x" : 6 }
{ "_id" : ObjectId("553d69f83dc4c576940dc5a0"), "x" : 7 }
{ "_id" : ObjectId("553d69f83dc4c576940dc5a1"), "x" : 8 }
{ "_id" : ObjectId("553d69f83dc4c576940dc5a2"), "x" : 9 }
{ "_id" : ObjectId("553d69f83dc4c576940dc5a3"), "x" : 10 }
{ "_id" : ObjectId("553d69f83dc4c576940dc5a4"), "x" : 11 }
{ "_id" : ObjectId("553d69f83dc4c576940dc5a5"), "x" : 12 }
{ "_id" : ObjectId("553d69f83dc4c576940dc5a6"), "x" : 13 }
{ "_id" : ObjectId("553d69f83dc4c576940dc5a7"), "x" : 14 }
{ "_id" : ObjectId("553d69f83dc4c576940dc5a8"), "x" : 15 }
{ "_id" : ObjectId("553d69f83dc4c576940dc5a9"), "x" : 16 }
{ "_id" : ObjectId("553d69f83dc4c576940dc5aa"), "x" : 17 }
{ "_id" : ObjectId("553d69f83dc4c576940dc5ab"), "x" : 18 }
{ "_id" : ObjectId("553d69f83dc4c576940dc5ac"), "x" : 19 }
{ "_id" : ObjectId("553d69f83dc4c576940dc5ad"), "x" : 20 }
Type "it" for more

Get info about specific shards.
mongos> sh.status()
--- Sharding Status ---
  sharding version: {
        "_id" : 1,
        "minCompatibleVersion" : 5,
        "currentVersion" : 6,
        "clusterId" : ObjectId("553c531d71621ab16006d19c")
}
  shards:
        {  "_id" : "rs0",  "host" : "rs0/192.168.1.21:27018,192.168.1.22:27018,mongodb0.example.com:27018" }
  balancer:
        Currently enabled:  yes
        Currently running:  no
        Failed balancer rounds in last 5 attempts:  2
        Last reported error:  ReplicaSetMonitor no master found for set: rs0
        Time of Reported error:  Sun Apr 26 2015 18:45:42 GMT-0400 (EDT)
        Migration Results for the last 24 hours:
                No recent migrations
  databases:
        {  "_id" : "admin",  "partitioned" : false,  "primary" : "config" }
        {  "_id" : "testdb",  "partitioned" : true,  "primary" : "rs0" }
                testdb.testCollection
                        shard key: { "name" : 1 }
                        chunks:
                                rs0     1
                        { "name" : { "$minKey" : 1 } } -->> { "name" : { "$maxKey" : 1 } } on : rs0 Timestamp(1, 0)

Create the Database
> use SoccerLeague
switched to db SoccerLeague

Create a table(Collection)
> db
SoccerLeague
> db.createCollection("Teams")
{
        "ok" : 1,
        "$gleStats" : {
                "lastOpTime" : Timestamp(1430088542, 1),
                "electionId" : ObjectId("553c6a2f97bbb1cbf830be1c")
        }
}

Use the Collections (tables) with a DB
> show collections
Teams
system.indexes

Drop a Collection (table) within a DB
> db.Teams.drop();
true

Insert Data into the Collection
> var a = {"name":"New York Giants", "conference":"American"}
> db.Teams.save(a)
WriteResult({ "nInserted" : 1 })

Query Data from the Collection
> db.Teams.find()
{ "_id" : ObjectId("553d6ccd3dc4c576940dc78e"), "name" : "New York Giants", "conference" : "American" }
> var b = {"name":"New York Giants", "conference":"American"}
> db.Teams.save(b)
WriteResult({ "nInserted" : 1 })
> db.Teams.insert({ name: 'Apple', product: 'iPhone', emp_no: '0'})
WriteResult({ "nInserted" : 1 })
> db.Teams.find()
{ "_id" : ObjectId("553d6ccd3dc4c576940dc78e"), "name" : "New York Giants", "conference" : "American" }
{ "_id" : ObjectId("553d6d4e3dc4c576940dc78f"), "name" : "New York Giants", "conference" : "American" }
{ "_id" : ObjectId("553d6d9e3dc4c576940dc790"), "name" : "Apple", "product" : "iPhone", "emp_no" : "0" }
> db.Teams.find().forEach(printjson)
{
        "_id" : ObjectId("553d6ccd3dc4c576940dc78e"),
        "name" : "New York Giants",
        "conference" : "American"
}
{
        "_id" : ObjectId("553d6d4e3dc4c576940dc78f"),
        "name" : "New York Giants",
        "conference" : "American"
}
{
        "_id" : ObjectId("553d6d9e3dc4c576940dc790"),
        "name" : "Apple",
        "product" : "iPhone",
        "emp_no" : "0"
}
> db.Teams.find({"name":"New York Giants"});
{ "_id" : ObjectId("553d6ccd3dc4c576940dc78e"), "name" : "New York Giants", "conference" : "American" }
{ "_id" : ObjectId("553d6d4e3dc4c576940dc78f"), "name" : "New York Giants", "conference" : "American" }
> db.Teams.find({"name":"New York Yankees"});
Multiple Update a document (row) within a Collection
mongos> db.Teams.update({"name":"New York Giants"},{"name":"New York Jets", "conference":"National"}, {multi:true});
WriteResult({
        "nMatched" : 0,
        "nUpserted" : 0,
        "nModified" : 0,
        "writeError" : {
                "code" : 9,
                "errmsg" : "multi update only works with $ operators"
        }
})

Single Update a document (row) within a Collection
> db.Teams.update({"name":"New York Giants"},{"name":"New York Jets", "conference":"National"});
WriteResult({ "nMatched" : 1, "nUpserted" : 0, "nModified" : 1 })

updates first row
> db.Teams.find()
{ "_id" : ObjectId("553d6ccd3dc4c576940dc78e"), "name" : "New York Jets", "conference" : "National" }
{ "_id" : ObjectId("553d6d4e3dc4c576940dc78f"), "name" : "New York Jets", "conference" : "National" }
{ "_id" : ObjectId("553d6d9e3dc4c576940dc790"), "name" : "Apple", "product" : "iPhone", "emp_no" : "0" }

Remove a document (row) within a Collection
> db.Teams.remove({"name":"New York Giants"});
WriteResult({ "nRemoved" : 0 })
> db.Teams.remove({"_id":ObjectId("553d6ccd3dc4c576940dc78e")});
WriteResult({ "nRemoved" : 1 })
> db.Teams.count()
2
> exit

Bulk Load or Script Data into the Collection
> load("LoadData.js");
> show collections

 
Copy Database
> db.copyDatabase("SoccerLeague","FootballLeague","localhost")
{
        "ok" : 0,
        "errmsg" : "couldn't connect to server localhost:27017 (127.0.0.1), conn
ection attempt failed",
        "$gleStats" : {
                "lastOpTime" : Timestamp(1430088584, 1),
                "electionId" : ObjectId("553c6a2f97bbb1cbf830be1c")
        }
}

Create User

mongos> db.createUser(
...     {
...       user: "testUser",
...       pwd: "testpw",
...       roles: [
...          { role: "readWrite", db: "testdb" }
...       ]
...     }
... )
Successfully added user: {
        "user" : "testUser",
        "roles" : [
                {
                        "role" : "readWrite",
                        "db" : "testdb"
                }
        ]
}