mongodb sort order on _id
I wonder how mongodb compare the "_id" field when doing query like the following:
db.data.find({"_id":{$gt:ObjectId("502aa46c0674d23e3cee6152")}}).sort({"_id":1}).limit(10);
Is it purely based on timestamp portion of the id?
To expand slightly on what Andre said:
Since the ObjectID timestamp is only to the second, two (or more) ObjectIDs could easily be created with the same value for the timestamp (the first 4 bytes). If these were created on the same machine (machine ID - the next 3 bytes), by the same process (PID - the next 2 bytes), then the only thing to differentiate them would be the "inc" field, the last 3 bytes at the end.
Update: Jan 2020
This answer continues to be popular so it is worth updating a little. The ObjectID spec has evolved since this answer was written 8 years ago and the 5 bytes after the timestamp are now simply random, which will greatly decrease the likelihood of any collisions. The last three bytes are still incremental, but initialised at a random value to start, again making collisions less likely. The ObjectID now contains less context (you can't easily tell where it was generated and by what process) but I would guess that the information was not being used in any meaningful way and has been deprecated in favor of better randomisation of the ID.
End Update
See here for the full spec:
https://docs.mongodb.com/manual/reference/method/ObjectId/#ObjectIDs-BSONObjectIDSpecification
That "inc" field is either an ever incrementing field (then you can reasonably expect the sort to be in the insert/create order) or a random value (then likely unique, but not ordered), assuming the spec is implemented correctly of course. Note that the ObjectIDs may be generated by the driver, or the application (or indeed manually) rather than by MongoDB itself, so unless you have full control over how they are generated, then any or all of the above may apply.
In a way you are correct, if you sort by the _id
you will sort by the insertion time. This does not mean that the only comparison is done on the timestamp portion. ObjectID's are a BSON object type in their own right, they can be directly compared with each other. As they start with a timestamp, it follows logically that those in the past will be less than those in the future.
You can find more detail in the documentation
copy paste from Mongo specs https://docs.mongodb.com/manual/reference/bson-types/#objectid
The relationship between the order of ObjectId values and generation time is not strict within a single second. If multiple systems, or multiple processes or threads on a single system generate values, within a single second; ObjectId values do not represent a strict insertion order. Clock skew between clients can also result in non-strict ordering even for values, because client drivers generate ObjectId values, not the mongod process.