Arrays, heap and stack and value types

int[] myIntegers;
myIntegers = new int[100];

In the above code, is new int[100] generating the array on the heap? From what I've read on CLR via c#, the answer is yes. But what I can't understand, is what happens to the actual int's inside the array. As they are value types, I'd guess they'd have to be boxed, as I can, for example, pass myIntegers to other parts of the program and it'd clutter up the stack if they were left on it all the time. Or am I wrong? I'd guess they'd just be boxed and would live on the heap for as long the array existed.


Your array is allocated on the heap, and the ints are not boxed.

The source of your confusion is likely because people have said that reference types are allocated on the heap, and value types are allocated on the stack. This is not an entirely accurate representation.

All local variables and parameters are allocated on the stack. This includes both value types and reference types. The difference between the two is only what is stored in the variable. Unsurprisingly, for a value type, the value of the type is stored directly in the variable, and for a reference type, the value of the type is stored on the heap, and a reference to this value is what is stored in the variable.

The same holds for fields. When memory is allocated for an instance of an aggregate type (a class or a struct), it must include storage for each of its instance fields. For reference-type fields, this storage holds just a reference to the value, which would itself be allocated on the heap later. For value-type fields, this storage holds the actual value.

So, given the following types:

class RefType{
    public int    I;
    public string S;
    public long   L;
}

struct ValType{
    public int    I;
    public string S;
    public long   L;
}

The values of each of these types would require 16 bytes of memory (assuming a 32-bit word size). The field I in each case takes 4 bytes to store its value, the field S takes 4 bytes to store its reference, and the field L takes 8 bytes to store its value. So the memory for the value of both RefType and ValType looks like this:

 0 ┌───────────────────┐
   │        I          │
 4 ├───────────────────┤
   │        S          │
 8 ├───────────────────┤
   │        L          │
   │                   │
16 └───────────────────┘

Now if you had three local variables in a function, of types RefType, ValType, and int[], like this:

RefType refType;
ValType valType;
int[]   intArray;

then your stack might look like this:

 0 ┌───────────────────┐
   │     refType       │
 4 ├───────────────────┤
   │     valType       │
   │                   │
   │                   │
   │                   │
20 ├───────────────────┤
   │     intArray      │
24 └───────────────────┘

If you assigned values to these local variables, like so:

refType = new RefType();
refType.I = 100;
refType.S = "refType.S";
refType.L = 0x0123456789ABCDEF;

valType = new ValType();
valType.I = 200;
valType.S = "valType.S";
valType.L = 0x0011223344556677;

intArray = new int[4];
intArray[0] = 300;
intArray[1] = 301;
intArray[2] = 302;
intArray[3] = 303;

Then your stack might look something like this:

 0 ┌───────────────────┐
   │    0x4A963B68     │ -- heap address of `refType`
 4 ├───────────────────┤
   │       200         │ -- value of `valType.I`
   │    0x4A984C10     │ -- heap address of `valType.S`
   │    0x44556677     │ -- low 32-bits of `valType.L`
   │    0x00112233     │ -- high 32-bits of `valType.L`
20 ├───────────────────┤
   │    0x4AA4C288     │ -- heap address of `intArray`
24 └───────────────────┘

Memory at address 0x4A963B68 (value of refType) would be something like:

 0 ┌───────────────────┐
   │       100         │ -- value of `refType.I`
 4 ├───────────────────┤
   │    0x4A984D88     │ -- heap address of `refType.S`
 8 ├───────────────────┤
   │    0x89ABCDEF     │ -- low 32-bits of `refType.L`
   │    0x01234567     │ -- high 32-bits of `refType.L`
16 └───────────────────┘

Memory at address 0x4AA4C288 (value of intArray) would be something like:

 0 ┌───────────────────┐
   │        4          │ -- length of array
 4 ├───────────────────┤
   │       300         │ -- `intArray[0]`
 8 ├───────────────────┤
   │       301         │ -- `intArray[1]`
12 ├───────────────────┤
   │       302         │ -- `intArray[2]`
16 ├───────────────────┤
   │       303         │ -- `intArray[3]`
20 └───────────────────┘

Now, if you passed intArray to another function, the value pushed onto the stack would be 0x4AA4C288, the address of the array, not a copy of the array.


Yes the array will be located on the heap.

The ints inside the array will not be boxed. Just because a value type exists on the heap, does not necessarily mean it will be boxed. Boxing will only occur when a value type, such as int, is assigned to a reference of type object.

For example

Does not box:

int i = 42;
myIntegers[0] = 42;

Boxes:

object i = 42;
object[] arr = new object[10];  // no boxing here 
arr[0] = 42;

You may also want to check out Eric's post on this subject:

  • The Stack Is An Implementation Detail, Part Two

To understand what's happening, here are some facts:

  • Object are always allocated on the heap.
  • The heap only contains objects.
  • Value types are either allocated on the stack, or part of an object on the heap.
  • An array is an object.
  • An array can only contain value types.
  • An object reference is a value type.

So, if you have an array of integers, the array is allocated on the heap and the integers that it contains is part of the array object on the heap. The integers reside inside the array object on the heap, not as separate objects, so they are not boxed.

If you have an array of strings, it's really an array of string references. As references are value types they will be part of the array object on the heap. If you put a string object in the array, you actually put the reference to the string object in the array, and the string is a separate object on the heap.