MongoDB Query Fundamentals


MongoDB is a prominent member of the new breed of NoSQL databases, which have captured the attention of many institutions. While the merits of NoSQL’s approach to data management are still being hotly debated within the software community, it’s worth spending some time exploring solutions such as MongoDB.

In an earlier
Developer.com article
I introduced MongoDB,
a prominent member of the new breed of NoSQL databases,
which have captured the attention of many institutions who just a short time
ago would have considered any relational database alternative to be heresy. While
the merits of NoSQL’s approach to data management are still being hotly debated
within the software community, it’s worth spending some time exploring solutions
such as MongoDB because in doing so you’ll be forced to look at your data from
an entirely new perspective.

In this article I’ll introduce one feature of MongoDB, which prompts such reconsideration.
As the categorical NoSQL moniker implies, solutions such as MongoDB do not
rely upon the traditional SQL query-based approach to retrieving and manipulating
data. But lacking SQL, how do you even access the data, let alone filter and
aggregate it, tasks traditionally accomplished using SQL’s WHERE and GROUP
BY
clauses, respectively? Read on to explore how these tasks are accomplished
in MongoDB.

Finding and Filtering Data

MongoDB’s find() method is useful for retrieving all of the elements in a collection. For instance if you create a database named library and add a few books to the collection, you can subsequently access all of those elements using find(). I’ll demonstrate this task using the Mongo shell:

%>mongo library
MongoDB shell version: 1.6.1
connecting to: library
> db.books.save({title: 'Farewell to Arms', author: 'Ernest Hemingway'})
> db.books.save({title: 'The Maltese Falcon', author: 'Dashiell Hammett'})
> db.books.save({title: 'The Sun Also Rises', author: 'Ernest Hemingway'})
> db.books.find() 
{ "_id" : ObjectId("4c83b9b84e3f3ef212be6e76"), "title" : "Farewell to Arms", "author" : "Ernest Hemingway" }
{ "_id" : ObjectId("4c83b9eb4e3f3ef212be6e77"), "title" : "The Maltese Falcon", "author" : "Dashiell Hammett" }
{ "_id" : ObjectId("4c83b9f94e3f3ef212be6e78"), "title" : "The Sun Also Rises", "author" : "Ernest Hemingway" }

What if you only wanted to retrieve the books authored by Hemingway? You’ll need to pass the desired key and associated value to the find() method:

>db.books.find({author:"Ernest Hemingway"})
{ "_id" : ObjectId("4c83b9b84e3f3ef212be6e76"), "title" : "Farewell to Arms", "author" : "Ernest Hemingway" }
{ "_id" : ObjectId("4c83b9f94e3f3ef212be6e78"), "title" : "The Sun Also Rises", "author" : "Ernest Hemingway" }

Sorting Data

Sorting retrieved data is a common task, accomplished easily enough in MongoDB using the sort() method. For instance, to sort data in ascending order according to the author’s name, you’ll pass the author key into the sort() method, associating the name with an integer value of 1 (I’ve added a few more books to render the sorting outcome more evident):

>db.books.find().sort({author: 1})
{ "_id" : ObjectId("4c83b9eb4e3f3ef212be6e77"), "title" : "The Maltese Falcon", "author" : "Dashiell Hammett" }
{ "_id" : ObjectId("4c83b9b84e3f3ef212be6e76"), "title" : "Farewell to Arms", "author" : "Ernest Hemingway" }
{ "_id" : ObjectId("4c83b9f94e3f3ef212be6e78"), "title" : "The Sun Also Rises", "author" : "Ernest Hemingway" }
{ "_id" : ObjectId("4c83c3e44e3f3ef212be6e79"), "title" : "1984", "author" : "George Orwell" }
{ "_id" : ObjectId("4c83c3f64e3f3ef212be6e7a"), "title" : "Animal Farm", "author" : "George Orwell" }
{ "_id" : ObjectId("4c83c4134e3f3ef212be6e7b"), "title" : "Catcher in the Rye", "author" : "J.D. Salinger" }

To sort a collection in descending order, you’ll associate -1 with the desired key name.

Limiting Results

Limiting the collection size is useful particularly when implementing pagination features. To limit the number of results returned by the find() method, use the limit() method, as demonstrated here:

>db.books.find({author:"Ernest Hemingway"}).limit(1)
{ "_id" : ObjectId("4c83b9b84e3f3ef212be6e76"), "title" : "Farewell to Arms", "author" : "Ernest Hemingway" }

Aggregating Results

The value of the data stored in your database often goes far beyond the ability
to merely retrieve it from the database, as demonstrated in the above examples.
You’ll often want to analyze the data in useful ways, for instance determining
the number of books written by each author in the database. To do this you’ll aggregate the
data, counting the number of books associated with each author. Using MongoDB’s group() method,
aggregation is easily accomplished, although the syntax can be a bit confusing
at first. The following example will produce a list of authors along with the
count of their books stored in the database

db.books.group({
 key: {'author': true}, 
 initial: {books: 0}, 
 reduce: 
   function(doc, prev) { 
     prev.books += 1
   }
});

Executing this command produces the following result:

[
	{
		"author" : "Ernest Hemingway",
		"books" : 2
	},
	{
		"author" : "Dashiell Hammett",
		"books" : 1
	},
	{
		"author" : "George Orwell",
		"books" : 2
	},
	{
		"author" : "J.D. Salinger",
		"books" : 1
	}
]

As mentioned, the group() method can be a bit confusing to newcomers, so let’s review the syntax:

  • The key key identifies the data we’d like to group (this would be akin to the parameter passed to the SQL GROUP BY clause)
  • The initial key identifies the initial count of the data we’re counting in association with the key value associated with the key, in this case the number of books written by each author as determined by the database
  • The reduce key defines the function, which will determine what
    to do each time a book associated with an author is encountered. In our case,
    we merely want to determine the number of books associated with each author,
    and so will increment this value by one.

Conclusion

As you learned in this article, MongoDB’s approach to data storage and management requires the developer to consider the process of data retrieval from a different perspective, although the aptly named MongoDB methods make the crossover fairly painless.

About the Author

Jason Gilmore is founder of the publishing and consulting firm WJGilmore.com. He is the author
of several popular books “Easy PHP Websites with the Zend Framework”,
“Easy PayPal with PHP”, and
“Beginning PHP and MySQL, Third Edition”. Follow him on Twitter at @wjgilmore.

Get the Free Newsletter!

Subscribe to Cloud Insider for top news, trends & analysis

Latest Articles