What's the $unwind operator in MongoDB?

This is my first day with MongoDB so please go easy with me :)

I can't understand the $unwind operator, maybe because English is not my native language.

db.article.aggregate(
    { $project : {
        author : 1 ,
        title : 1 ,
        tags : 1
    }},
    { $unwind : "$tags" }
);

The project operator is something I can understand, I suppose (it's like SELECT, isn't it?). But then, $unwind (citing) returns one document for every member of the unwound array within every source document.

Is this like a JOIN? If yes, how the result of $project (with _id, author, title and tags fields) can be compared with the tags array?

NOTE: I've taken the example from MongoDB website, I don't know the structure of tags array. I think it's a simple array of tag names.


First off, welcome to MongoDB!

The thing to remember is that MongoDB employs an "NoSQL" approach to data storage, so perish the thoughts of selects, joins, etc. from your mind. The way that it stores your data is in the form of documents and collections, which allows for a dynamic means of adding and obtaining the data from your storage locations.

That being said, in order to understand the concept behind the $unwind parameter, you first must understand what the use case that you are trying to quote is saying. The example document from mongodb.org is as follows:

{
 title : "this is my title" ,
 author : "bob" ,
 posted : new Date () ,
 pageViews : 5 ,
 tags : [ "fun" , "good" , "fun" ] ,
 comments : [
             { author :"joe" , text : "this is cool" } ,
             { author :"sam" , text : "this is bad" }
 ],
 other : { foo : 5 }
}

Notice how tags is actually an array of 3 items, in this case being "fun", "good" and "fun".

What $unwind does is allow you to peel off a document for each element and returns that resulting document. To think of this in a classical approach, it would be the equivilent of "for each item in the tags array, return a document with only that item".

Thus, the result of running the following:

db.article.aggregate(
    { $project : {
        author : 1 ,
        title : 1 ,
        tags : 1
    }},
    { $unwind : "$tags" }
);

would return the following documents:

{
     "result" : [
             {
                     "_id" : ObjectId("4e6e4ef557b77501a49233f6"),
                     "title" : "this is my title",
                     "author" : "bob",
                     "tags" : "fun"
             },
             {
                     "_id" : ObjectId("4e6e4ef557b77501a49233f6"),
                     "title" : "this is my title",
                     "author" : "bob",
                     "tags" : "good"
             },
             {
                     "_id" : ObjectId("4e6e4ef557b77501a49233f6"),
                     "title" : "this is my title",
                     "author" : "bob",
                     "tags" : "fun"
             }
     ],
     "OK" : 1
}

Notice that the only thing changing in the result array is what is being returned in the tags value. If you need an additional reference on how this works, I've included a link here. Hopefully this helps, and good luck with your foray into one of the best NoSQL systems that I have come across thus far.


$unwind duplicates each document in the pipeline, once per array element.

So if your input pipeline contains one article doc with two elements in tags, {$unwind: '$tags'} would transform the pipeline to be two article docs that are the same except for the tags field. In the first doc, tags would contain the first element from the original doc's array, and in the second doc, tags would contain the second element.


consider the below example to understand this Data in a collection

{
        "_id" : 1,
        "shirt" : "Half Sleeve",
        "sizes" : [
                "medium",
                "XL",
                "free"
        ]
}

Query -- db.test1.aggregate( [ { $unwind : "$sizes" } ] );

output

{ "_id" : 1, "shirt" : "Half Sleeve", "sizes" : "medium" }
{ "_id" : 1, "shirt" : "Half Sleeve", "sizes" : "XL" }
{ "_id" : 1, "shirt" : "Half Sleeve", "sizes" : "free" }

As per mongodb official documentation :

$unwind Deconstructs an array field from the input documents to output a document for each element. Each output document is the input document with the value of the array field replaced by the element.

Explanation through basic example :

A collection inventory has the following documents:

{ "_id" : 1, "item" : "ABC", "sizes": [ "S", "M", "L"] }
{ "_id" : 2, "item" : "EFG", "sizes" : [ ] }
{ "_id" : 3, "item" : "IJK", "sizes": "M" }
{ "_id" : 4, "item" : "LMN" }
{ "_id" : 5, "item" : "XYZ", "sizes" : null }

The following $unwind operations are equivalent and return a document for each element in the sizes field. If the sizes field does not resolve to an array but is not missing, null, or an empty array, $unwind treats the non-array operand as a single element array.

db.inventory.aggregate( [ { $unwind: "$sizes" } ] )

or

db.inventory.aggregate( [ { $unwind: { path: "$sizes" } } ] 

Above query output :

{ "_id" : 1, "item" : "ABC", "sizes" : "S" }
{ "_id" : 1, "item" : "ABC", "sizes" : "M" }
{ "_id" : 1, "item" : "ABC", "sizes" : "L" }
{ "_id" : 3, "item" : "IJK", "sizes" : "M" }

Why is it needed?

$unwind is very useful while performing aggregation. it breaks complex/nested document into simple document before performaing various operation like sorting, searcing etc.

To know more about $unwind :

https://docs.mongodb.com/manual/reference/operator/aggregation/unwind/

To know more about aggregation :

https://docs.mongodb.com/manual/reference/operator/aggregation-pipeline/