Things I wish I knew before I started working with MongoDB

Aditya Agarwal
Published in codeburst
Sep 18, 2017

MEAN Stack was all the rage and I was trying to wrap my head around it, so I started a pet project: a simple blogging website that performed some CRUD operations. All it does is send and receive data via REST APIs and save the data in MongoDB using Mongoose. From now on, wherever I say mongo, I mean Mongoose, the library which lets us use MongoDB in Node.js.

In the beginning you will find that mongo has a beautiful command called

Collection.find(query)

It returns a list of all the documents in that collection which match the given query.

Beginners often tend to use the find method to get these documents and from then on handle all sorts of operations on them in JavaScript alone, e.g. looping over each one and executing some function, extracting a limited number of documents, and many other things.

What this leads to is:

  1. Slow performance of the backend.
  2. Never-ending lines of code, making everything a mess.
  3. Embarrassment for your future self.

As I kept digging into mongo I found a great many functions which you can use to develop an efficient backend.

Note: I’m only providing small snippets of code which will give you an idea of how they are used. I will be sharing a gist later which will have fully working code.

Let’s look at the most awesome ones —

> Query only one document -

Many a time you know that the query will match only one document, so if you do this:

User.find({email:"foo@bar.com"}) // => [user-1]

then it will return a single user document wrapped in an array, which you then have to unwrap for no reason.

You can use the findOne function to avoid this:

User.findOne({email:"foo@bar.com"}) // => user-1

> Query document by Id -

Mongo provides a unique id for each document, which is commonly used to find documents:

User.find({_id:"595873aaf6e9bcd415b44fc5"}) // => [user-1]

but this looks cumbersome and invites mistakes with the type of the id value.

You should use the findById function instead, like this:

User.findById("595873aaf6e9bcd415b44fc5") // => user-1
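
One place where the type of the id bites is when it arrives from a URL or a request body and is not a valid id string at all. Below is a minimal sketch of a pre-check, assuming your documents use the default ids, whose text form is a 24-character hexadecimal string. The helper name looksLikeObjectId is hypothetical and is not part of Mongoose.

```javascript
// Hypothetical helper (not a Mongoose API): checks that a value is a
// 24-character hexadecimal string, the usual text form of a default _id.
function looksLikeObjectId(value) {
  return typeof value === 'string' && /^[0-9a-fA-F]{24}$/.test(value);
}

console.log(looksLikeObjectId('595873aaf6e9bcd415b44fc5')); // true
console.log(looksLikeObjectId('not-an-id'));                // false
```

Running a check like this before calling User.findById(id) lets a route reject a bad id early instead of dealing with a cast error from the query.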

> Looping over a list of documents and executing commands on each -

This has happened to everyone: “Loop over all the users who are unverified and send them an email to verify their account”. Many of us beginners do this:

User.find({
  verified: false
}).exec(function (err, users) {
  var n = users.length;
  for (var i = 0; i < n; i++) {
    var user = users[i];
    sendMail(user.email);
  }
})

But using cursors is better because you are not loading a lot of data into memory at once.

const cursor = User.find({ verified: false }).cursor();
cursor.on('data', function (user) {
  sendMail(user.email);
});
cursor.on('close', function () {
  console.log("sent mail to all users");
});
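
The same idea can also be written as a for await loop, which waits for each mail to finish before pulling the next document. This is only a sketch under assumptions: it relies on a Node.js and Mongoose version in which query cursors are async iterable, sendMail is assumed to return a promise, and mailUnverified is a hypothetical helper name. The demo at the bottom feeds it a fake cursor so that it runs without a database.

```javascript
// Hypothetical helper: accepts anything async-iterable, for example
// User.find({ verified: false }).cursor() where cursors support for await.
async function mailUnverified(cursor, sendMail) {
  let sent = 0;
  for await (const user of cursor) {
    await sendMail(user.email); // only one document is handled at a time
    sent += 1;
  }
  return sent;
}

// Stand-in for a real cursor so the sketch is runnable on its own.
async function* fakeCursor() {
  yield { email: 'a@example.com' };
  yield { email: 'b@example.com' };
}

const delivered = [];
const demo = mailUnverified(fakeCursor(), async function (email) {
  delivered.push(email); // a real sendMail would send the message here
});
demo.then(function (count) {
  console.log('sent mail to %d users', count);
});
```

Because the helper takes the cursor as an argument, it can be tried out with fake data first and handed a real cursor later.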

> Fetching only certain keys of documents -

Let’s just say you have to get only the email and name of a user. As beginners we would simply do this:

User.findById(id).exec(function (err, user) {
  var item = {
    name: user.name,
    email: user.email
  }
  addToList(item);
})

What you may not realise is that behind the scenes mongo still had to fetch the complete document. We can improve on that:

User.findById(id).select("name email").exec(function (err, user) {
  var item = {
    name: user.name,
    email: user.email
  }
  addToList(item);
})

The select method tells mongo to fetch only the name and email from the document, which makes for more efficient code.

Note: if you want to get all the keys except some, you can prefix those with “-”, e.g.

select("-password -token")

> Maintaining an array in a document.

Many a time we have to store an array of items in a document key.

E.g. suppose you want to store the usernames of all the people who follow you.

For this you have a schema like:

{
  username: {
    type: String,
    default: "Anonymous"
  },
  email: {
    type: String,
    default: ""
  },
  followers: {
    type: Array
  }
}

Now a naive approach would be to query the document, manipulate the array with JS functions like arr.push() and arr.splice(), and save it back. However, deleting a particular item this way is tedious: first you have to find the index of the item, then use the splice method to remove it, and finally update the document in mongo.

Mongo provides two wonderful update operators to handle this, $push and $pull:

User.update({
  _id: userId
}, {
  $push: {
    followers: "foo_bar"
  }
}).exec(function (err, result) {
  // result is the write result, not the updated user document
  if (err) return handleError(err);
  console.log("foo_bar is added to the list of your followers");
})

The update function first takes the query, which lets it find the document to be updated, and then the operations to perform on that document. Here we use $push to make mongo add the value “foo_bar” to the followers key of a user.

Similarly, we can easily remove an element from the followers list -

User.update({
  _id: userId
}, {
  $pull: {
    followers: "foo_bar"
  }
}).exec(function (err, result) {
  // result is the write result, not the updated user document
  if (err) return handleError(err);
  console.log("foo_bar is removed from the list of your followers");
})
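
To make the semantics concrete, here is a small in-memory imitation of what the two operators do to the array. It only illustrates the behaviour and is not how MongoDB implements it; applyUpdate is a hypothetical function written for this example. Note that $pull removes every element equal to the given value, not just the first one.

```javascript
// Hypothetical in-memory imitation of $push and $pull on a plain object.
// It only handles plain values, not the query conditions $pull also accepts.
function applyUpdate(doc, update) {
  const result = Object.assign({}, doc);
  for (const [key, value] of Object.entries(update.$push || {})) {
    // $push appends the value to the end of the array
    result[key] = (result[key] || []).concat([value]);
  }
  for (const [key, value] of Object.entries(update.$pull || {})) {
    // $pull drops every element equal to the value
    if (Array.isArray(result[key])) {
      result[key] = result[key].filter(function (item) { return item !== value; });
    }
  }
  return result;
}

const me = { username: 'me', followers: ['alice', 'foo_bar'] };
const added = applyUpdate(me, { $push: { followers: 'bob' } });
const removed = applyUpdate(added, { $pull: { followers: 'foo_bar' } });
console.log(added.followers);   // [ 'alice', 'foo_bar', 'bob' ]
console.log(removed.followers); // [ 'alice', 'bob' ]
```

Compare this with the naive version: there is no index lookup and no splice, just a description of the change.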

> Linking documents from different collections.

Consider the previous example. There we were just storing the usernames of all the followers. But what can you do if you need to access more data about the followers?

The naive approach would be to store the ids of all users who follow you in an array, loop through them, and use the User.find() method to fetch the data of each user.

However, mongo has another trick up its sleeve, and that’s the populate() method.

You can read in detail here — http://mongoosejs.com/docs/populate.html

The main concept is that we use the ref key to tell mongo which model it should fetch the data from for a given id.

Assume that we have two schemas, Person and Story:

var mongoose = require('mongoose');
var Schema = mongoose.Schema;

var personSchema = Schema({
  _id: Number,
  name: String,
  age: Number,
  stories: [{
    type: Schema.Types.ObjectId,
    ref: 'Story'
  }]
});
var storySchema = Schema({
  _creator: {
    type: Number,
    ref: 'Person'
  },
  title: String,
  fans: [{
    type: Number,
    ref: 'Person'
  }]
});
var Story = mongoose.model('Story', storySchema);
var Person = mongoose.model('Person', personSchema);

Now in the Story schema you can see that _creator is declared as an object whose type is Number; this is because the _id in the Person schema is of type Number.

By using the ref above and the populate function shown below, you can easily get the creator’s data while querying the Story collection.

Story.findOne({
  title: 'Once upon a timex.'
})
  .populate('_creator')
  .exec(function (err, story) {
    if (err) return handleError(err);
    console.log('The creator is %s', story._creator.name);
    // prints "The creator is Aaron"
  });

You can see that we now don’t need to query the Person collection to get the creator’s details. Mongo does all this behind the scenes for you.
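
Conceptually, populate looks up the documents that the stored ids point to and puts them in place of the ids. The sketch below imitates that on in-memory arrays just to show the idea. It is not Mongoose’s implementation, and populateField and the sample data are made up for this example.

```javascript
// Hypothetical illustration of the idea behind populate(), on plain arrays.
function populateField(docs, field, referenced) {
  // Index the referenced documents by _id for quick lookup.
  const byId = new Map(referenced.map(function (ref) { return [ref._id, ref]; }));
  return docs.map(function (doc) {
    // Replace the stored id with the matching document, or null if none exists.
    return Object.assign({}, doc, { [field]: byId.get(doc[field]) || null });
  });
}

const people = [{ _id: 0, name: 'Aaron', age: 100 }];
const stories = [{ title: 'Once upon a timex.', _creator: 0 }];

const populated = populateField(stories, '_creator', people);
console.log('The creator is %s', populated[0]._creator.name);
// prints "The creator is Aaron"
```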

But as you know, most of the time we don’t specify the _id in the schema, and that’s fine; the Story schema above doesn’t either. In that case the _id which mongo generates is of type Schema.Types.ObjectId. Hence, to populate the stories in the Person schema, we specify the type of stories as Schema.Types.ObjectId.

Lastly, I just wanted to give a little tip. After some digging of your own you will find out about pagination and how skip() and limit() are used for it. However, that method is only good for small datasets, because skip() still has to walk past every skipped document. A faster way exists, please read here: https://scalegrid.io/blog/fast-paging-with-mongodb/
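
One common way to page faster is to remember the _id of the last document on the previous page and ask only for documents after it, instead of skipping a growing number of documents. Here is a minimal sketch, assuming the results are sorted by _id in ascending order. pageFilter is a hypothetical helper, and the Post model and published field in the comment are made up for illustration.

```javascript
// Hypothetical helper: builds the query filter for the next page.
// lastSeenId is the _id of the last document on the previous page,
// or null/undefined when asking for the first page.
function pageFilter(baseFilter, lastSeenId) {
  if (lastSeenId === undefined || lastSeenId === null) {
    return Object.assign({}, baseFilter);
  }
  return Object.assign({}, baseFilter, { _id: { $gt: lastSeenId } });
}

// Possible usage with a Mongoose model (assumed, not run here):
//   Post.find(pageFilter({ published: true }, lastSeenId))
//     .sort({ _id: 1 })
//     .limit(10)

console.log(JSON.stringify(pageFilter({ published: true }, null)));
// {"published":true}
console.log(JSON.stringify(pageFilter({ published: true }, '595873aaf6e9bcd415b44fc5')));
// {"published":true,"_id":{"$gt":"595873aaf6e9bcd415b44fc5"}}
```

The trade-off is that you can only move to the next page from the previous one; jumping straight to page 50 is not possible this way.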

To read more articles like these you can follow me on Twitter and Medium or subscribe to my newsletter!

https://buttondown.email/itaditya
