MongoDB, What it is?

August 27th, 2010 by Narshlob

Put simply, MongoDB is a document store database. Documents are written to the database in BSON (Binary JSON) and displayed to the user as JSON. The power of MongoDB is that it can handle tons of data. We ran a benchmark between MySQL and MongoDB on a huge dataset: 50 million records. The test was a search by email address to find everyone with an email domain of yahoo.com.

The query in MySQL looked like this:
SELECT count(*) FROM user_table WHERE email_address LIKE '%yahoo.com';

The results looked like this:

+----------+
| count(*) |
+----------+
|     8354 |
+----------+
1 row in set (11 min 41.79 sec)

The same kind of query run in Mongo (shown here for the domain neo.rr.com) looked like this:

> db.user_collection.find({email_address : /neo\.rr\.com$/}).explain();
      ....
          "n" : 123904,
          "millis" : 126008,
      ....

As you can see, the query in MySQL took just under 12 minutes, while the one in Mongo took barely over 2 minutes (126,008 ms, or about 2 min 6 sec). That’s a ton of time saved.
Note that the MySQL table is MyISAM and indexed on email_address. The MongoDB collection is also indexed on email_address.
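
To put the two timings side by side, and to show what the suffix regex actually matches, here is a small sketch in plain JavaScript (the language the Mongo shell speaks). It does not talk to a database. The sample addresses and the hasDomain helper are made up for illustration; only the timings and the regex style come from the runs above.

```javascript
// Timings copied from the two runs above.
const mysqlSeconds = 11 * 60 + 41.79; // "11 min 41.79 sec"
const mongoSeconds = 126008 / 1000;   // "millis" : 126008

// How many times longer the MySQL run took (roughly 5.6x).
const ratio = mysqlSeconds / mongoSeconds;

// A suffix match like the one in the Mongo query: the trailing $ anchors the
// pattern to the end of the string, and \. matches a literal dot.
// Hypothetical helper, not part of any MongoDB API.
function hasDomain(email, domainPattern) {
  return domainPattern.test(email);
}

// Made-up sample data.
const emails = ["a@yahoo.com", "b@neo.rr.com", "c@yahoo.com.au", "d@notyahoo.com"];
const yahoo = emails.filter((e) => hasDomain(e, /yahoo\.com$/));

console.log(mongoSeconds.toFixed(1) + "s vs " + mysqlSeconds.toFixed(1) + "s");
console.log(yahoo);
```

Note that d@notyahoo.com matches too, just as it would with LIKE '%yahoo.com'. Putting the @ in the pattern (/@yahoo\.com$/ or LIKE '%@yahoo.com') would restrict the match to that exact domain.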

———————————————————————————————————————————–

MongoDB is written in C++. The MongoDB website (http://www.mongodb.org/) gives this synopsis:

“MongoDB bridges the gap between key-value stores (which are fast and highly scalable) and traditional RDBMS systems (which provide rich queries and deep functionality).
MongoDB (from “humongous”) is a scalable, high-performance, open source, document-oriented database.”

MongoDB is a document store database featuring full index support, replication and high availability, auto sharding, querying, fast in-place updates, map/reduce functionality, GridFS, and commercial support.

When searching a relational database for something that has a foreign key to a separate table, you need a join (or a second query) to pull all the related data from the two tables. In MongoDB, there are no server-side joins. You will generally want one collection for each top-level object, so rather than storing related data in two separate tables, you just embed it in the document.

Let’s see this with an example:
Say you have a Peeps table and a Favs table. Favs is a collection of different things such as “Pepsi”, “Mt. Dew”, “Dr Pepper” and Peeps is a collection of different people we’ve interviewed.
In MySQL, Peeps might be built like so:

Peeps
  :id
  :name
  :email_address
  :phone
  ........
  :favs_id

And Favs would look like this:

Favs
  :id
  :what

In MongoDB, we wouldn’t worry about trying to link two collections together using ids. We would simply embed the favs into the Peeps collection. It would look something like this:

{
  peeps: [
    {
      name: "yourmom",
      email_address: "blah@arhar.com",
      favs: [
        {what: "Pepsi"}
      ]
    }
  ]
}

Thus when we query for “yourmom”, we can easily find yourmom’s favs as well, without an additional query. You might be saying to yourself, “But that adds a lot of unnecessary data! Using a foreign key takes up a lot less space! Thou Fool!!” I’d say, “Space is cheap.” 100 million records might take up roughly 100 gigs in MongoDB (about 1 KB per record), which is nothing. How many people out there really have that much data anyway?
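
To make the difference concrete, here is a rough sketch in plain JavaScript that mimics both layouts with in-memory arrays. Nothing here touches a real database, and the variable and function names (peepsTable, favsTable, peepsCollection and the two lookup helpers) are invented for the example.

```javascript
// Relational-style layout: two separate "tables" linked by favs_id.
const peepsTable = [
  { id: 1, name: "yourmom", email_address: "blah@arhar.com", favs_id: 10 },
];
const favsTable = [{ id: 10, what: "Pepsi" }];

// Two lookups: first the person, then the row the foreign key points at.
function favViaForeignKey(name) {
  const peep = peepsTable.find((p) => p.name === name);
  if (!peep) return null;
  const fav = favsTable.find((f) => f.id === peep.favs_id);
  return fav ? fav.what : null;
}

// Document-style layout: favs are embedded in the person document.
const peepsCollection = [
  { name: "yourmom", email_address: "blah@arhar.com", favs: [{ what: "Pepsi" }] },
];

// One lookup: the favs come back with the document itself.
function favsViaEmbedding(name) {
  const peep = peepsCollection.find((p) => p.name === name);
  return peep ? peep.favs.map((f) => f.what) : [];
}

console.log(favViaForeignKey("yourmom"));
console.log(favsViaEmbedding("yourmom"));
```

In the Mongo shell, the embedded lookup would be something along the lines of db.peeps.find({name: "yourmom"}), assuming the collection is called peeps.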

We’re contemplating using MongoDB to store our server logs. The advantage would be that we can query the logs much more easily than with grep or something like that. All we’d have to do is run

db.logs.find({error: "RuntimeError"}).limit(20)

to find the first twenty instances of RuntimeError in our logs.
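
For anyone who wants to see what that filter-then-limit amounts to, here is a plain JavaScript sketch that does the same thing over an in-memory array. It is only an analogue, not the MongoDB driver: findWithLimit is a made-up helper, and every field except error is hypothetical.

```javascript
// Made-up log documents; only the `error` field comes from the query above.
const logs = [];
for (let i = 0; i < 100; i++) {
  logs.push({
    seq: i, // hypothetical sequence number
    error: i % 3 === 0 ? "RuntimeError" : "TimeoutError",
    message: "request " + i + " failed", // hypothetical field
  });
}

// In-memory analogue of find({...}).limit(n): walk the documents in order,
// keep the ones whose fields equal the query's fields, stop once n are found.
function findWithLimit(docs, query, n) {
  const out = [];
  for (const doc of docs) {
    if (out.length >= n) break;
    const matches = Object.keys(query).every((k) => doc[k] === query[k]);
    if (matches) out.push(doc);
  }
  return out;
}

const firstTwenty = findWithLimit(logs, { error: "RuntimeError" }, 20);
console.log(firstTwenty.length, firstTwenty[0].seq, firstTwenty[19].seq);
```

The point of the sketch is that the query is just a small document describing the fields you want, which is a lot easier to extend (add a second field, say) than a grep pattern.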

As you can see, there’s a lot of benefit to using MongoDB, and a lot of different ways it can be used. My advice is to check it out for yourself (http://www.mongodb.org/). Set up a server and start messing around with it. It even supports JavaScript in the client console. Simply... amazing.

Filed under: Uncategorized — Narshlob @ 10:25 am on August 27, 2010

Copyright © 2005-2014 PMA Media Group. All Rights Reserved