Why do I need to use type** to point to type*?

I've been reading Learn C The Hard Way for a few days, but there's something I really want to understand. Zed, the author, wrote that char ** is for a "pointer to (a pointer to char)", and said that this is needed because I'm trying to point to something 2-dimensional.

Here is exactly what the webpage says:

A char * is already a "pointer to char", so that's just a string. You however need 2 levels, since names is 2-dimensional, that means you need char ** for a "pointer to (a pointer to char)" type.

Does this mean that I have to use a variable that can point to something 2-dimensional, which is why I need two **?

Just a little follow-up: does this also apply to n dimensions?

Here's the relevant code

char *names[] = { "Alan", "Frank", "Mary", "John", "Lisa" };
char **cur_name = names;

Solution 1:

No, that tutorial is of questionable quality. I wouldn't recommend continuing to read it.

A char** is a pointer-to-pointer. It is not a 2D array. It is not a pointer to an array. It is not a pointer to a 2D array.

The author of the tutorial is likely confused because there is a widespread but incorrect practice of allocating dynamic 2D arrays like this:

// BAD! Do not do it like this!
int** heap_fiasco;
heap_fiasco = malloc(X * sizeof(int*));
for(int x=0; x<X; x++)
{
  heap_fiasco[x] = malloc(Y * sizeof(int));
}

However, this is not a 2D array; it is a slow, fragmented lookup table allocated all over the heap. The syntax for accessing one item in the lookup table, heap_fiasco[x][y], looks just like array indexing syntax, which is why a lot of people believe this is how you allocate 2D arrays.

The correct way to allocate a 2D array dynamically is:

// correct
int (*array2d)[Y] = malloc(sizeof(int[X][Y]));

You can tell that the first is not an array because if you do memcpy(heap_fiasco, heap_fiasco2, sizeof(int[X][Y])) the code will crash and burn. The items are not allocated in adjacent memory.

Similarly, memcpy(heap_fiasco, heap_fiasco2, sizeof(*heap_fiasco)) will not do what you want either, but for a different reason: sizeof(*heap_fiasco) is the size of a single pointer, not of an array, so only the first row pointer gets copied.

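By contrast, memcpy(array2d, array2d_2, sizeof(int[X][Y])) works, because array2d points to a real 2D array stored in one contiguous block. Note that sizeof(*array2d) is the size of one row, int[Y], so memcpy(array2d, array2d_2, sizeof(*array2d)) copies exactly one complete row.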

Solution 2:

Pointers took me a while to understand. I strongly recommend drawing diagrams.

Please read and understand this part of the C++ tutorial (at least with respect to pointers, the diagrams really helped me).

Telling you that you need a pointer to a pointer to char for a two dimensional array is a lie. You don't need it but it is one way of doing it.

Memory is sequential. If you want to put 5 chars (letters) in a row like in the word hello you could define 5 variables and always remember in which order to use them, but what happens when you want to save a word with 6 letters? Do you define more variables? Wouldn't it be easier if you just stored them in memory in a sequence?

So what you do is you ask the operating system for 5 chars (and each char just happens to be one byte) and the system returns to you a memory address where your sequence of 5 chars begins. You take this address and store it in a variable which we call a pointer, because it points to your memory.

The problem with pointers is that they are just addresses. How do you know what is stored at that address? Is it 5 chars or is it a big binary number that is 8 bytes? Or is it a part of a file that you loaded? How do you know?

This is where a programming language like C tries to help by giving you types. A type tells you what a variable is storing. Pointers have types too, but their types tell you what the pointer is pointing to. Hence, char * is a pointer to a memory location that holds either a single char or the first of a sequence of chars. Sadly, you have to remember for yourself how many chars there are. Usually you store that count in a variable that you keep around.

So when you want to have a 2 dimensional data structure how do you represent that?

This is best explained with an example. Let's make a matrix:

1  2  3  4
5  6  7  8
9 10 11 12

It has 4 columns and 3 rows. How do we store that?

Well, we can make 3 sequences of 4 numbers each. The first sequence is 1 2 3 4, the second is 5 6 7 8 and the third and last sequence is 9 10 11 12. So to store 4 numbers we ask the system to reserve space for 4 numbers and give us a pointer to them. That is a pointer to numbers. However, since we need 3 such sequences, we also ask the system for space to hold 3 of those pointers, and what we get back is a pointer to pointers to numbers.

And that's how you end up with the proposed solution...

The other way to do it would be to realize that you need 4 times 3 numbers and just ask the system for 12 numbers to be stored in a sequence. But then how do you access the number in row 2 and column 3? This is where maths comes in but let's try it on our example:

1  2  3  4
5  6  7  8
9 10 11 12

If we store them next to each other they would look like this:

offset from start:  0  1  2  3    4  5  6  7    8   9  10  11   
numbers in memory: [1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]

So our mapping is like this:

row | column | offset | value
 1  |   1    |   0    |   1
 1  |   2    |   1    |   2
 1  |   3    |   2    |   3
 1  |   4    |   3    |   4
 2  |   1    |   4    |   5
 2  |   2    |   5    |   6
 2  |   3    |   6    |   7
 2  |   4    |   7    |   8
 3  |   1    |   8    |   9
 3  |   2    |   9    |  10
 3  |   3    |  10    |  11
 3  |   4    |  11    |  12

We now need a nice and easy formula for converting a row and column to an offset. To find the offset of each of the numbers from its row and column you can use the following formula:

offset = (row - 1) * 4 + (column - 1)

Notice the two -1's here. They are needed because our row and column numbering starts at 1, and this formula is one reason computer scientists prefer zero-based offsets: with zero-based numbering it simplifies to offset = row * 4 + column. In C, the language applies this formula for you when you index a multi-dimensional array such as int m[3][4]. And that is the other way of doing it.

Solution 3:

From your question, what I understand is that you are asking why you need char ** for a variable that is initialized from names, which is declared as char *names[]. The [] in names[] is array syntax, so names is an array, and each element of that array is a char *.

An array is not a pointer, but in most expressions an array is converted to a pointer to its first element. The first element of names is a char *, so a pointer to it is a pointer to a pointer to char, and that's why the compiler will not complain if you write

char ** cur_name = names ;

In the above line you are declaring a pointer to a character pointer and then initializing it with the address of the first element of the array, &names[0].