How does object_id assignment work?
I'm playing around with Ruby's .object_id and noticed that, in several sequential sessions of irb, I get these identical results:
false.object_id # => 0
true.object_id # => 2
nil.object_id # => 4
100.object_id # => 201
In fact, every integer's object_id seems to be ((value * 2) + 1).
On the other hand, a given string's object_id is never the same after exiting and re-running irb.
This raises several questions for me:
- Is there a known scheme by which certain object_ids are determined? Are others basically random?
- The ids for true, false, and nil aren't sequential. Is there a way to ask what object is represented by a given id? (I'm curious what the other single-digit ids are tied to.)
- Could you (not that you should) write obfuscated Ruby where you use known object ids to refer to objects without naming them, like "object of id 201 + object of id 19" to mean "100 + 9"?
Update
Using Andrew Grimm's suggestion, I tried discovering other "low id" objects, but found that:
- There don't appear to be any more even-numbered ids in this sequence: ids 6, 8, 10, etc. don't point to anything.
- As implied by my earlier experiment, all the odd-numbered ids belong to numbers. Specifically, id 1 points to the number 0, 3 points to 1, 5 points to 2, and so forth.
Solution 1:
In MRI the object_id of an object is the same as the VALUE that represents the object on the C level. For most kinds of objects this VALUE is a pointer to a location in memory where the actual object data is stored. Obviously this will differ between runs, because it depends only on where the system decided to allocate the memory, not on any property of the object itself.
However, for performance reasons true, false, nil and Fixnums are handled specially. For these objects there isn't actually a struct with the object's data in memory. All of the object's data is encoded in the VALUE itself. As you already figured out, the values for false, true, nil and any Fixnum i are 0, 2, 4 and i*2+1 respectively.
The reason this works is that on every system MRI runs on, 0, 2, 4 and i*2+1 are never valid addresses for an object on the heap (heap objects are aligned, so their addresses are never odd or this close to zero), so there's no overlap with pointers to object data.
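The integer rule above can be sketched in plain Ruby. This is only an illustration: the helper names fixnum_id and fixnum_from_id are made up here and are not part of Ruby, and the rule only applies to integers small enough to be stored as immediate values. The ids of true and nil can also differ between MRI versions and builds, so the sketch sticks to integers.

```ruby
# Hypothetical helpers (not part of Ruby) mirroring the i*2+1 rule for small integers.

# Encode: shift left by one bit and set the lowest bit as the "this is an integer" tag.
def fixnum_id(i)
  (i << 1) | 1
end

# Decode: an odd id belongs to an integer; dropping the tag bit recovers the value.
def fixnum_from_id(id)
  raise ArgumentError, "even ids are not integer ids" if id.even?
  id >> 1
end

p fixnum_id(100)       # 201
p fixnum_from_id(19)   # 9

# Compare the hand-computed id with what MRI reports for a small integer.
p fixnum_id(7) == 7.object_id
```

Because the tag lives in the lowest bit, an odd VALUE can be told apart from a pointer with a single bit test, which is the "no overlap" property described above.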
Solution 2:
Assigning integers the id (value * 2) + 1 and non-integers an id of the form (x * 2) is analogous to Hilbert's paradox of the Grand Hotel, which describes how to fit infinitely many more guests into an infinite hotel that is already full.
With regard to finding objects by their ID, there's ObjectSpace._id2ref(object_id), unless your implementation doesn't have ObjectSpace.
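As a quick sketch of that lookup, here is the "object of id 201 + object of id 19" idea from the question. It assumes MRI and small integers. ObjectSpace._id2ref is an internal-flavoured method (note the leading underscore), so depending on the Ruby implementation and version it may be missing or may print a warning.

```ruby
# Look up small integers by id (assumes MRI, where a small integer n has id 2n + 1).
a = ObjectSpace._id2ref(201)   # the integer 100
b = ObjectSpace._id2ref(19)    # the integer 9

p a + b                        # 109

# Round trip: looking up an object's own id should give back that same object.
p ObjectSpace._id2ref(100.object_id).equal?(100)
```

So the obfuscation trick is possible for integers, but it is tied to one implementation's encoding and is not something to rely on.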