MongoDB - Update And Query Operators for Arrays
Iâve worked with mongoDB for 3 years now. Built several monoliths, microservices, serverless applications with different use cases using mongoDB. It is fast, scalable, dynamic and developer friendly. One of my favourite mongoDB feature is that it stores data as documents in BSON (JSON in binary) because of which we can store array fields inside a document unlike a SQL based database where we would have to create another table for it.
Letâs suppose we have the following data in our MongoDB posts collection stored in a very naive way (for simplicity).
Each post document has a title
, author
and a comments
array field. Each element of the comments
array
represents a user with some basic information like country
, name
, isGuest
, and country
.
[
{
_id: ObjectId('1232'),
title: 'Hello World',
author: 'John',
comments: [
{
id: '2001',
name: 'Adam',
text: 'Nice!',
isGuest: true,
country: 'usa',
},
{
id: '2002',
name: 'Marshall',
text: 'Good!',
isGuest: false,
country: 'usa',
},
{
id: '2003',
name: 'Joe',
text: 'I like it!',
isGuest: true,
country: 'india',
},
{
id: '2003',
name: 'Joe',
text: 'Learned a lot!',
isGuest: false,
country: 'france',
},
{
id: '2004',
name: 'Taylor',
text: 'Awesome Post!',
isGuest: true,
country: 'germany',
},
],
},
{
_id: ObjectId('1233'),
title: 'Secrets of Life',
author: 'John',
comments: [
{
id: '2005',
name: 'John',
text: 'Nice!',
isGuest: false,
country: 'usa',
},
{
id: '2006',
name: 'Peter',
text: 'Good!',
isGuest: false,
country: 'usa',
},
{
id: '2007',
name: 'Tushar',
text: 'I like it!',
isGuest: true,
country: 'india',
},
{
id: '2008',
name: 'Sunil',
text: 'Learned a lot!',
isGuest: false,
country: 'india',
},
{
id: '2004',
name: 'Taylor',
text: 'Awesome Post!',
isGuest: true,
country: 'germany',
},
],
},
{
_id: ObjectId('1234'),
title: 'Infinity Wars Endgame Review',
author: 'Peter',
comments: [
{
id: '2001',
name: 'Adam',
text: 'Nice!',
isGuest: true,
country: 'usa',
},
{
id: '2005',
name: 'John',
text: 'Good!',
isGuest: false,
country: 'usa',
},
{
id: '2007',
name: 'Tushar',
text: 'I like it!',
isGuest: true,
country: 'india',
},
{
id: '2008',
name: 'Sunil',
text: 'Learned a lot!',
isGuest: false,
country: 'india',
},
{
id: '2004',
name: 'Taylor',
text: 'Awesome Post!',
isGuest: true,
country: 'germany',
},
],
},
{
_id: ObjectId('1235'),
title: 'Infinity Wars Endgame Review',
author: 'Peter',
comments: [
{
id: '2005',
name: 'John',
text: 'Nice!',
isGuest: false,
country: 'usa',
},
{
id: '2005',
name: 'John',
text: 'Good!',
isGuest: false,
country: 'usa',
},
],
},
];
The most important concept to remember when dealing with arrays in MongoDB is to know that whenever we target a particular array element (a comment in this case), MongoDB is not just going to return the targetted array element(s), instead it will return all the array element(s) inside the matched documents.
It will make more sense after we go through a few examples.
Querying Arrays
db.posts.find({
'comments.name' : 'John'
});
If we translate this MongoDB query statement to English. It will be âRetrieve all posts where at least one comment is made by a user with name Johnâ
This will be the returned result.
[
{
_id: ObjectId('1233'),
title: 'Secrets of Life',
author: 'John',
comments: [
{
id: '2005',
name: 'John',
text: 'Nice!',
isGuest: false,
country: 'usa',
},
{
id: '2006',
name: 'Peter',
text: 'Good!',
isGuest: false,
country: 'usa',
},
{
id: '2007',
name: 'Tushar',
text: 'I like it!',
isGuest: true,
country: 'india',
},
{
id: '2008',
name: 'Sunil',
text: 'Learned a lot!',
isGuest: false,
country: 'india',
},
{
id: '2004',
name: 'Taylor',
text: 'Awesome Post!',
isGuest: true,
country: 'germany',
},
],
},
{
_id: ObjectId('1234'),
title: 'Infinity Wars Endgame Review',
author: 'Peter',
comments: [
{
id: '2001',
name: 'Adam',
text: 'Nice!',
isGuest: true,
country: 'usa',
},
{
id: '2005',
name: 'John',
text: 'Good!',
isGuest: false,
country: 'usa',
},
{
id: '2007',
name: 'Tushar',
text: 'I like it!',
isGuest: true,
country: 'india',
},
{
id: '2008',
name: 'Sunil',
text: 'Learned a lot!',
isGuest: false,
country: 'india',
},
{
id: '2004',
name: 'Taylor',
text: 'Awesome Post!',
isGuest: true,
country: 'germany',
},
],
},
{
_id: ObjectId('1235'),
title: 'Infinity Wars Endgame Review',
author: 'Peter',
comments: [
{
id: '2005',
name: 'John',
text: 'Nice!',
isGuest: false,
country: 'usa',
},
{
id: '2005',
name: 'John',
text: 'Good!',
isGuest: false,
country: 'usa',
},
],
},
];
Now if you look at the result you will notice that although we targetted comments where userâs name is John, MongoDB returned us all of the comments in the returned documents. The posts array it returned to us has at least one comment made by the user John and it also contains other comments which are not made by John.
This has been my greatest gotcha with MongoDB. It was a big Aha moment for me when I first understood this behaviour. It has helped me a lot in working with arrays and most importantly writing big database migration and modification scripts.
Now letâs suppose we want to get all the posts where at least one comment is from a guest user from usa.
db.posts.find({
'comments.country': 'usa',
'comments.isGuest': true
});
Looks good but this will return us all of the posts where at least one comment in the comments array is from a guest and at least one comment in the comments array is from country usa.
This is the result of the above query.
[
{
_id: ObjectId('1232'),
title: 'Hello World',
author: 'John',
comments: [
{
id: '2001',
name: 'Adam',
text: 'Nice!',
isGuest: true,
country: 'usa',
},
{
id: '2002',
name: 'Marshall',
text: 'Good!',
isGuest: false,
country: 'usa',
},
{
id: '2003',
name: 'Joe',
text: 'I like it!',
isGuest: true,
country: 'india',
},
{
id: '2003',
name: 'Joe',
text: 'Learned a lot!',
isGuest: false,
country: 'france',
},
{
id: '2004',
name: 'Taylor',
text: 'Awesome Post!',
isGuest: true,
country: 'germany',
},
],
},
{
_id: ObjectId('1233'),
title: 'Secrets of Life',
author: 'John',
comments: [
{
id: '2005',
name: 'John',
text: 'Nice!',
isGuest: false,
country: 'usa',
},
{
id: '2006',
name: 'Peter',
text: 'Good!',
isGuest: false,
country: 'usa',
},
{
id: '2007',
name: 'Tushar',
text: 'I like it!',
isGuest: true,
country: 'india',
},
{
id: '2008',
name: 'Sunil',
text: 'Learned a lot!',
isGuest: false,
country: 'india',
},
{
id: '2004',
name: 'Taylor',
text: 'Awesome Post!',
isGuest: true,
country: 'germany',
},
],
},
{
_id: ObjectId('1234'),
title: 'Infinity Wars Endgame Review',
author: 'Peter',
comments: [
{
id: '2001',
name: 'Adam',
text: 'Nice!',
isGuest: true,
country: 'usa',
},
{
id: '2005',
name: 'John',
text: 'Good!',
isGuest: false,
country: 'usa',
},
{
id: '2007',
name: 'Tushar',
text: 'I like it!',
isGuest: true,
country: 'india',
},
{
id: '2008',
name: 'Sunil',
text: 'Learned a lot!',
isGuest: false,
country: 'india',
},
{
id: '2004',
name: 'Taylor',
text: 'Awesome Post!',
isGuest: true,
country: 'germany',
},
],
},
];
The second document looks odd. It does not have a user who is a guest and who is also from usa. So it turns out that if a post has a comment of a user from usa and it also has another comment from a guest. This post will also match our above query which we clearly donât want. We wanted to match the posts where at least one comment is from a guest and the same comment should be also from the country usa.
The query searches the comments array for at least one element who is a guest and at least one element who is from usa. A single element does not need to meet both criteria.
$elemMatch
Letâs rewrite our query using $elemMatch
db.posts.find({
comments: {
$elemMatch: {
country: 'usa',
isGuest: true
}
}
});
The result from the above query is.
[
{
_id: ObjectId('1232'),
title: 'Hello World',
author: 'John',
comments: [
{
id: '2001',
name: 'Adam',
text: 'Nice!',
isGuest: true,
country: 'usa',
},
{
id: '2002',
name: 'Marshall',
text: 'Good!',
isGuest: false,
country: 'usa',
},
{
id: '2003',
name: 'Joe',
text: 'I like it!',
isGuest: true,
country: 'india',
},
{
id: '2003',
name: 'Joe',
text: 'Learned a lot!',
isGuest: false,
country: 'france',
},
{
id: '2004',
name: 'Taylor',
text: 'Awesome Post!',
isGuest: true,
country: 'germany',
},
],
},
{
_id: ObjectId('1234'),
title: 'Infinity Wars Endgame Review',
author: 'Peter',
comments: [
{
id: '2001',
name: 'Adam',
text: 'Nice!',
isGuest: true,
country: 'usa',
},
{
id: '2005',
name: 'John',
text: 'Good!',
isGuest: false,
country: 'usa',
},
{
id: '2007',
name: 'Tushar',
text: 'I like it!',
isGuest: true,
country: 'india',
},
{
id: '2008',
name: 'Sunil',
text: 'Learned a lot!',
isGuest: false,
country: 'india',
},
{
id: '2004',
name: 'Taylor',
text: 'Awesome Post!',
isGuest: true,
country: 'germany',
},
],
},
];
This is the correct result. Both the returned documents have a user named Adam who is a guest and is also from usa.
So the question becomes when to use $elemMatch?
I recommend to use it almost every time when you want to query a nested array element.
$elemMatch matches documents that contain an array field with at least one element that matches all the specified query criteria. It behaves like a normal âarray.fieldâ when there is just one criterion.
db.posts.find({
comments: {
$elemMatch: {
country: 'usa',
}
}
});
db.posts.find({
'comments.country': 'usa'
});
The above two snippets works exactly the same because there is just one criterion i.e country is âusaâ. If there are more than one conditions then use $elemMatch.
Updating Arrays
$ operator
$ operator acts as a placeholder for the index of the first element that matches the query document
So letâs say out of all the comments on a particular post you want to update the name of the user of a particular comment.
db.posts.update(
{
_id: ObjectId('1232'),
'comments.id': '2001'
},
{
$set: { 'comments.$.name': 'New Name' }
}
);
MongoDB searches the post collection to find a post document with id 1234 and which also has at least one comment with id 2001.
Now we aim to target the comment with an id 2001 but the search query will return us all of the comments in the matched post and in
order to update the specific 2001 id comment, we need its index. If we know the index we can do comments.<index>.name: 'New Name'
. So for
instance if the index is 2. We can do comments.2.name: 'New Name'
. MongoDB doesnât provide us with an index but it does provide us
with a placeholder or a pointer for the index as $. So we can do comments.$.name: 'New Name'
and MongoDB will place the correct value
of the index in $ and update the comment.
Important: The array field must appear as part of the query document for it to work.
$[] operator
Now sometimes we want to update all of the elements in an array. For example, if you want to set a version
v1 on all of the existing comments.
Or if you want to set a new property to all of the comments.
$[] is like a placeholder index for all of the elements of an array field.
Letâs add a new version property to all of our comments.
db.posts.updateMany(
{},
{
$set: { 'comments.$[].version': 'v1' }
}
);
Now all the comments on all of the posts will have a version: 'v1'
property.
If we want to add the version property to a particular post. weâd do.
db.posts.updateMany(
{
_id: ObjectId('1233')
},
{
$set: { 'comments.$[].version': 'v1' }
}
);
Now all the comments on the post with id â1234â will have a version: 'v1'
property.
$[] works correctly only with updateMany method or
multi: true
option as it tends to update more than one element.
Things get a little trickier from here.
Letâs suppose that John is no longer a guest user. John has registered an account with our blog.
Now we need to update Johnâs isGuest
field to false
in all of the postâs comments.
Letâs write the command for that.
db.posts.update(
{ 'comments.name': 'John' },
{ 'comments.$[].isGuest': false },
{ multi: true }
);
multi: true
applies the action on all of the matched documents. You can also use updateMany method instead.
This command looks pretty good. Update the isGuest field to false in all the comments where name is John.
Did you notice the problem with the above command?
The isGuest
field of all the highlighted lines in the document below was set to false by the above command.
[
{
_id: ObjectId('1232'),
title: 'Hello World',
author: 'John',
comments: [
{
id: '2001',
name: 'Adam',
text: 'Nice!',
isGuest: true,
country: 'usa',
},
{
id: '2002',
name: 'Marshall',
text: 'Good!',
isGuest: false,
country: 'usa',
},
{
id: '2003',
name: 'Joe',
text: 'I like it!',
isGuest: true,
country: 'india',
},
{
id: '2003',
name: 'Joe',
text: 'Learned a lot!',
isGuest: false,
country: 'france',
},
{
id: '2004',
name: 'Taylor',
text: 'Awesome Post!',
isGuest: true,
country: 'germany',
},
],
},
{
_id: ObjectId('1233'),
title: 'Secrets of Life',
author: 'John',
comments: [
{
id: '2005',
name: 'John',
text: 'Nice!',
isGuest: false,
country: 'usa',
},
{
id: '2006',
name: 'Peter',
text: 'Good!',
isGuest: false,
country: 'usa',
},
{
id: '2007',
name: 'Tushar',
text: 'I like it!',
isGuest: false,
country: 'india',
},
{
id: '2008',
name: 'Sunil',
text: 'Learned a lot!',
isGuest: false,
country: 'india',
},
{
id: '2004',
name: 'Taylor',
text: 'Awesome Post!',
isGuest: false,
country: 'germany',
},
],
},
{
_id: ObjectId('1234'),
title: 'Infinity Wars Endgame Review',
author: 'Peter',
comments: [
{
id: '2001',
name: 'Adam',
text: 'Nice!',
isGuest: false,
country: 'usa',
},
{
id: '2005',
name: 'John',
text: 'Good!',
isGuest: false,
country: 'usa',
},
{
id: '2007',
name: 'Tushar',
text: 'I like it!',
isGuest: false,
country: 'india',
},
{
id: '2008',
name: 'Sunil',
text: 'Learned a lot!',
isGuest: false,
country: 'india',
},
{
id: '2004',
name: 'Taylor',
text: 'Awesome Post!',
isGuest: false,
country: 'germany',
},
],
},
{
_id: ObjectId('1235'),
title: 'Infinity Wars Endgame Review',
author: 'Peter',
comments: [
{
id: '2005',
name: 'John',
text: 'Nice!',
isGuest: false,
country: 'usa',
},
{
id: '2005',
name: 'John',
text: 'Good!',
isGuest: false,
country: 'usa',
},
],
},
];
This command has gone terribly wrong. We wanted to update comments from a specific user but it updated
all of the comments by setting isGuest to false
.
Letâs try to understand why it went wrong.
db.posts.update(
{ 'comments.name': 'John' },
{ 'comments.$[].isGuest': false },
{ multi: true }
);
What we targetted?
We wanted to target all posts where there is at least one comment from John.
What was targetted?
This line targets all posts where there is at least one comment from John. MongoDB returned us all posts where there is at least one comment from John.
db.posts.update(
{ 'comments.name': 'John' },
{ 'comments.$[].isGuest': false },
{ multi: true }
);
What we targetted?
We wanted to update the isGuest
field to false
in comments only from John.
What was targetted?
Because MongoDB returned us all posts where there was at least one comment from John and
all other comments as well. The $[]
operator updated the isGuest
field to false
in
all the comments of all the returned posts and not specifically on the comments only from John.
Thatâs what $[] is supposed to do.
We need a way through which we can update individual array elements based on some condition. Like here we only want to update those comments where name is John and not all of the returned comments.
It is similar to performing result.map(e => updateElement(e))
where we are updating all the elements from
the query result.
What we need to do is update based on some condition i.e
result.map(e => { if(e.name === 'John') { return updateElement(e); } else { return e; } })
$[indentifier] operator to the rescue
db.posts.update(
{ 'comments.name': 'John' },
{ 'comments.$[e].isGuest': false },
{ arrayFilters: [ { 'e.name': 'John' } ], multi: true }
);
When updating array elements, MongoDB will check the { "e.name": "John" }
condition for each array element and only update the element if it is true.
db.posts.update(
{ 'comments.name': 'John' },
{ 'comments.$[e].isGuest': false },
{ arrayFilters: [ { 'e.name': 'John' } ], multi: true }
);
This line tells MongoDB that we want to target posts where there is at least one comment from John.
db.posts.update(
{ 'comments.name': 'John' },
{ 'comments.$[e].isGuest': false },
{ arrayFilters: [ { 'e.name': 'John' } ], multi: true }
);
This letter e is a placeholder index for all the array elements. Presence of it also means that it is a conditional update which means only update this element if the criteria mentioned in arrayFilters is matched.
db.posts.update(
{ 'comments.name': 'John' },
{ 'comments.$[e].isGuest': false },
{ arrayFilters: [ { 'e.name': 'John' } ], multi: true }
);
Here we are specifying the criteria to tell MongoDB when to update an element whose name
field
is equal to John
where e represents the element.
$all operator
Letâs take an example of a customers collection where we can tag each customer. In a typical dashboard we could expect the filter criterias to be like âShow only importantâ, âShow only businessâ, âShow only repeat and importantâ, etc.
[
{
_id: '123456abcd',
tags: ['important', 'business', 'new'],
},
{
_id: '123457abce',
tags: ['spam', 'new'],
},
{
_id: '123458abcf',
tags: ['important', 'repeat', 'normal'],
},
{
_id: '123459abcg',
tags: ['spam', 'repeat'],
},
];
# Find customers where each customer has ATLEAST one of the tag as important
db.customers.find({ tags: 'important' })
# Result - ['123456abcd', '123458abcf']
# Find customers where each customer has ATLEAST one of the tag as business
db.customers.find({ tags: 'business' })
# Result - ['123456abcd]
# Find customers where each customer has EXACTLY these two tags i,e business and important
db.customers.find({ tags: ['business', 'important'] })
# Result - []
# Find customers where each customer has EXACTLY these two tags i,e spam and new
db.customers.find({ tags: [ "spam", "new" ] })
# Result - ['123457abce']
# Find customers where each customer has either a business tag OR an important tag
db.customers.find({ tags: { $in: ['business', 'important'] } })
# Result - ['123456abcd', '123458abcf']
# Find customers where each customer has(includes) an important tag AND a repeat tag
db.customers.find({ tags: { $all: ['important', 'repeat'] } })
# Result - ['123458abcf']
$all tag is useful where we want to only find documents which includes all the tags specified in the filter. Iâll give another example just to make it more clear. Letâs say we are building a dashboard where we could search users based on the the singers they listen to and we could connect with the users who have similar song interests as we do. We could do a search with the filters applied as âEminem, Owl Cityâ. Now if we use the $in operator we would get a list of users which either listens to âEminemâ or âOwl Cityâ. They wouldnât listen to BOTH of the singers in the filter. But if we use the $all operator, we would get a list of all the users that listens to BOTH the singers in the filter.
$all operator is similar to $elemMatch operator but on array of strings or numbers, whereas, $elemMatch only works on array of sub documents.
# Find users where each user EITHER listens to eminem OR owl city
db.users.find({ tags: { $in: ['eminem', 'owlCity'] } })
# Find users where each user listens to BOTH eminem AND owl city
db.users.find({ tags: { $all: ['eminem', 'owlCity'] } })
Make sure your array field is indexed, otherwise array operations could take up a lot of time and put load on your Database.
Check this link for a more complex use of the $all operator.
Happy Coding đ