heim / mongodb-shard-example

A simple sharding example with two shards and the enron-mail dataset.

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

A simple minimalistic mongodb cluster for local testing. The setup.sh-scripts initializes one config server, two shards and one mongos, adds the shards to the configdb and imports an excerpt of the enron-mail dataset.

The clean.sh-script kills all the mongo processes and removes the data and log directories.

To setup the database for sharding execute these commands at the mongo shell:

sh.enableSharding("enron") db.messages.ensureIndex({"headers.From": 1}) sh.shardCollection("enron.messages", {"headers.From": 1}, false)

Some fun commands to run in the shell.

db.messages.find({"headers.To": "jeff.skilling@enron.com"}).count() //explain() db.messages.ensureIndex({"headers.To": 1})

db.messages.find({"headers.From": "jeff.skilling@enron.com"}).count() //explain() db.messages.ensureIndex({"headers.From": 1}) sh.enableSharding("enron") sh.shardCollection("enron.messages", {"headers.From":1}, false)

db.messages.aggregate({$unwind: "$headers.To"}, {$group: {_id: "$headers.To", sum: {$sum: 1}}}, {$sort: { "sum": -1}}, {$limit: 5})

db.messages.ensureIndex({"headers.To": 1})

db.messages.aggregate({$unwind: "$headers.To"}, {$group: {_id: {"to": "$headers.To", "from" : "$headers.From"}, sum: {$sum: 1}}}, {$sort: { "sum": -1}}, {$limit: 5})

About

A simple sharding example with two shards and the enron-mail dataset.


Languages

Language:Shell 89.1%Language:JavaScript 10.9%