Take the address of a one-past-the-end array element via subscript: legal by the C++ Standard or not?
Solution 1:
Yes, it's legal. From the C99 draft standard:
§6.5.2.1, paragraph 2:
A postfix expression followed by an expression in square brackets
[]
is a subscripted designation of an element of an array object. The definition of the subscript operator[]
is thatE1[E2]
is identical to(*((E1)+(E2)))
. Because of the conversion rules that apply to the binary+
operator, ifE1
is an array object (equivalently, a pointer to the initial element of an array object) andE2
is an integer,E1[E2]
designates theE2
-th element ofE1
(counting from zero).
§6.5.3.2, paragraph 3 (emphasis mine):
The unary
&
operator yields the address of its operand. If the operand has type ‘‘type’’, the result has type ‘‘pointer to type’’. If the operand is the result of a unary*
operator, neither that operator nor the&
operator is evaluated and the result is as if both were omitted, except that the constraints on the operators still apply and the result is not an lvalue. Similarly, if the operand is the result of a[]
operator, neither the & operator nor the unary*
that is implied by the[]
is evaluated and the result is as if the&
operator were removed and the[]
operator were changed to a+
operator. Otherwise, the result is a pointer to the object or function designated by its operand.
§6.5.6, paragraph 8:
When an expression that has integer type is added to or subtracted from a pointer, the result has the type of the pointer operand. If the pointer operand points to an element of an array object, and the array is large enough, the result points to an element offset from the original element such that the difference of the subscripts of the resulting and original array elements equals the integer expression. In other words, if the expression
P
points to thei
-th element of an array object, the expressions(P)+N
(equivalently,N+(P)
) and(P)-N
(whereN
has the valuen
) point to, respectively, thei+n
-th andi−n
-th elements of the array object, provided they exist. Moreover, if the expressionP
points to the last element of an array object, the expression(P)+1
points one past the last element of the array object, and if the expressionQ
points one past the last element of an array object, the expression(Q)-1
points to the last element of the array object. If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined. If the result points one past the last element of the array object, it shall not be used as the operand of a unary*
operator that is evaluated.
Note that the standard explicitly allows pointers to point one element past the end of the array, provided that they are not dereferenced. By 6.5.2.1 and 6.5.3.2, the expression &array[5]
is equivalent to &*(array + 5)
, which is equivalent to (array+5)
, which points one past the end of the array. This does not result in a dereference (by 6.5.3.2), so it is legal.
Solution 2:
Your example is legal, but only because you're not actually using an out of bounds pointer.
Let's deal with out of bounds pointers first (because that's how I originally interpreted your question, before I noticed that the example uses a one-past-the-end pointer instead):
In general, you're not even allowed to create an out-of-bounds pointer. A pointer must point to an element within the array, or one past the end. Nowhere else.
The pointer is not even allowed to exist, which means you're obviously not allowed to dereference it either.
Here's what the standard has to say on the subject:
5.7:5:
When an expression that has integral type is added to or subtracted from a pointer, the result has the type of the pointer operand. If the pointer operand points to an element of an array object, and the array is large enough, the result points to an element offset from the original element such that the difference of the subscripts of the resulting and original array elements equals the integral expression. In other words, if the expression P points to the i-th element of an array object, the expressions (P)+N (equivalently, N+(P)) and (P)-N (where N has the value n) point to, respectively, the i+n-th and i−n-th elements of the array object, provided they exist. Moreover, if the expression P points to the last element of an array object, the expression (P)+1 points one past the last element of the array object, and if the expression Q points one past the last element of an array object, the expression (Q)-1 points to the last element of the array object. If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined.
(emphasis mine)
Of course, this is for operator+. So just to be sure, here's what the standard says about array subscripting:
5.2.1:1:
The expression
E1[E2]
is identical (by definition) to*((E1)+(E2))
Of course, there's an obvious caveat: Your example doesn't actually show an out-of-bounds pointer. it uses a "one past the end" pointer, which is different. The pointer is allowed to exist (as the above says), but the standard, as far as I can see, says nothing about dereferencing it. The closest I can find is 3.9.2:3:
[Note: for instance, the address one past the end of an array (5.7) would be considered to point to an unrelated object of the array’s element type that might be located at that address. —end note ]
Which seems to me to imply that yes, you can legally dereference it, but the result of reading or writing to the location is unspecified.
Thanks to ilproxyil for correcting the last bit here, answering the last part of your question:
-
array + 5
doesn't actually dereference anything, it simply creates a pointer to one past the end ofarray
. -
&array[4] + 1
dereferencesarray+4
(which is perfectly safe), takes the address of that lvalue, and adds one to that address, which results in a one-past-the-end pointer (but that pointer never gets dereferenced. -
&array[5]
dereferences array+5 (which as far as I can see is legal, and results in "an unrelated object of the array’s element type", as the above said), and then takes the address of that element, which also seems legal enough.
So they don't do quite the same thing, although in this case, the end result is the same.
Solution 3:
It is legal.
According to the gcc documentation for C++, &array[5]
is legal. In both C++ and in C you may safely address the element one past the end of an array - you will get a valid pointer. So &array[5]
as an expression is legal.
However, it is still undefined behavior to attempt to dereference pointers to unallocated memory, even if the pointer points to a valid address. So attempting to dereference the pointer generated by that expression is still undefined behavior (i.e. illegal) even though the pointer itself is valid.
In practice, I imagine it would usually not cause a crash, though.
Edit: By the way, this is generally how the end() iterator for STL containers is implemented (as a pointer to one-past-the-end), so that's a pretty good testament to the practice being legal.
Edit: Oh, now I see you're not really asking if holding a pointer to that address is legal, but if that exact way of obtaining the pointer is legal. I'll defer to the other answerers on that.