Dereferencing this pointer gives me -46, but I am not sure why
If you have something like

int x = 1234;
int *p = &x;

and you dereference the pointer p, it will correctly read all the bytes of the integer, because you declared p to be a pointer to int. The compiler knows how many bytes to read from the pointed-to type's size, as reported by the sizeof operator. Generally the size of int is 4 bytes (on 32/64-bit platforms), but it is machine dependent, which is exactly why the size comes from sizeof rather than from a fixed number.
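As a minimal sketch of that simple case (the size printed is machine dependent):

#include <stdio.h>

int main(void)
{
    int x = 1234;
    int *p = &x;

    /* p is declared as a pointer to int, so *p reads sizeof(int) bytes */
    printf("sizeof(int) = %zu\n", sizeof(int)); /* typically 4 */
    printf("*p = %d\n", *p);                    /* 1234 */
    return 0;
}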
For your code

int y = 1234;
char *p = &y;
int *j = &y;

pointer p points to y, but we declared it to be a pointer to char, so it will read only one byte (sizeof(char) is 1 by definition).
1234 in binary is represented as

00000000 00000000 00000100 11010010

If your machine is little endian, it stores the bytes in reverse order:

11010010 00000100 00000000 00000000

That is, 11010010 is at (hypothetical) address 00, 00000100 is at address 01, and so on:
BE:   00   01   02   03
    +----+----+----+----+
y:  | 00 | 00 | 04 | d2 |
    +----+----+----+----+

LE:   00   01   02   03
    +----+----+----+----+
y:  | d2 | 04 | 00 | 00 |
    +----+----+----+----+

(byte values in hexadecimal)
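If you want to check your own machine's byte order, here is a minimal sketch (output shown for a little-endian machine):

#include <stdio.h>

int main(void)
{
    int y = 1234;
    unsigned char *b = (unsigned char *)&y;

    /* print each byte of y, lowest address first */
    for (size_t i = 0; i < sizeof y; i++)
        printf("%02hhx ", b[i]);    /* little endian: d2 04 00 00 */
    printf("\n");
    return 0;
}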
So if you dereference pointer p, it will read only the first byte, 11010010. The output will be -46 if char is signed and 210 if it is unsigned; according to the C standard, the signedness of plain char is implementation-defined, and on your machine it is evidently signed.
On your PC negative numbers are represented in 2's complement, so the most-significant bit is the sign bit, and a leading 1 denotes a negative value:

11010010 = -128 + 64 + 16 + 2 = -46

If you dereference pointer j instead, it will read all the bytes of the int, since we declared it to be a pointer to int, and the output will be 1234.
In general, if you declare a pointer j as int *j, then *j will read sizeof(int) bytes (4 here, but machine dependent). The same goes for char or any other data type: a pointer reads as many bytes as the size of the type it points to, and char is 1 byte.
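Putting it all together, here is a sketch of the situation in the question; the -46 assumes plain char is signed on your platform, and the cast is added for the reason discussed just below:

#include <stdio.h>

int main(void)
{
    int y = 1234;
    char *p = (char *)&y;   /* *p reads 1 byte            */
    int  *j = &y;           /* *j reads sizeof(int) bytes */

    printf("%hhd\n", *p);   /* -46 on a little-endian machine with signed char */
    printf("%d\n", *j);     /* 1234 */
    return 0;
}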
As others have pointed out, you need to cast explicitly to char *, since char *p = &y; is a constraint violation: char * and int * are not compatible types. Write char *p = (char *)&y; instead.
There are a couple of issues with the code as written.
First of all, you are invoking undefined behavior by trying to print the numeric representation of a char object using the %d conversion specifier:
Online C 2011 draft, §7.21.6.1, subclause 9:
If a conversion specification is invalid, the behavior is undefined.282) If any argument is not the correct type for the corresponding conversion specification, the behavior is undefined.
Yes, objects of type char are promoted to int when passed to variadic functions; printf is special, and if you want the output to be well-defined, then the type of the argument and the conversion specifier must match up. To print the numeric value of a char with %d, or of an unsigned char argument with %u, %o, or %x, you must use the hh length modifier as part of the conversion spec:
printf( "%hhd ", *p );
The second issue is that the line

char *p = &y;

is a constraint violation: char * and int * are not compatible types, and may have different sizes and/or representations [1][2]. Thus, you must explicitly cast the source to the target type:

char *p = (char *) &y;

The one exception to this rule occurs when one of the operands is void *; then the cast isn't necessary.
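For instance, this sketch compiles cleanly without a cast because both conversions go through void *:

#include <stdio.h>

int main(void)
{
    int y = 1234;
    void *v = &y;   /* any object pointer converts to void * implicitly */
    char *p = v;    /* and void * converts to char * implicitly         */

    printf("%hhd\n", *p);   /* -46 again on this platform */
    return 0;
}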
Having said all that, I took your code and added a utility that dumps the address and contents of objects in the program. Here's what y, p, and j look like on my system (SLES-10, gcc 4.1.2):
Item  Address          00 01 02 03
----  -------          -- -- -- --
y     0x7fff1a7e99cc   d2 04 00 00   ....

p     0x7fff1a7e99c0   cc 99 7e 1a   ..~.
      0x7fff1a7e99c4   ff 7f 00 00   ....

j     0x7fff1a7e99b8   cc 99 7e 1a   ..~.
      0x7fff1a7e99bc   ff 7f 00 00   ....
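The dump utility itself isn't shown; a minimal sketch along the same lines (a hypothetical reconstruction, not the original code) might look like:

#include <stdio.h>

/* print an object's address and raw bytes, lowest address first */
static void dump(const char *name, const void *obj, size_t len)
{
    const unsigned char *b = obj;
    printf("%-4s %p ", name, (void *)obj);
    for (size_t i = 0; i < len; i++)
        printf("%02hhx ", b[i]);
    printf("\n");
}

int main(void)
{
    int y = 1234;
    char *p = (char *)&y;
    int *j = &y;

    dump("y", &y, sizeof y);
    dump("p", &p, sizeof p);
    dump("j", &j, sizeof j);
    return 0;
}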
I'm on an x86 system, which is little-endian, so it stores multi-byte objects starting with the least-significant byte at the lowest address:
BE:   A   A+1  A+2  A+3
    +----+----+----+----+
y:  | 00 | 00 | 04 | d2 |
    +----+----+----+----+
LE:  A+3  A+2  A+1   A
On a little-endian system, the addressed byte is the least-significant byte, which in this case is 0xd2 (210 unsigned, -46 signed). In a nutshell, you're printing the signed, decimal representation of that single byte.
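To see both interpretations of that byte side by side (a sketch; the values assume a little-endian machine where plain char is signed):

#include <stdio.h>

int main(void)
{
    int y = 1234;
    char *p = (char *)&y;

    printf("%hhd\n", *p);                 /* -46: signed reading of 0xd2 */
    printf("%hhu\n", (unsigned char)*p);  /* 210: unsigned reading       */
    return 0;
}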
As for the broader question, the type of the expression *p is char and the type of the expression *j is int; the compiler simply goes by the type of the expression. The compiler keeps track of all objects, expressions, and types as it translates your source to machine code. So when it sees the expression *j, it knows that it's dealing with an integer value and generates machine code appropriately. When it sees the expression *p, it knows it's dealing with a char value.
[1] Admittedly, almost all modern desktop systems that I know of use the same representation for all pointer types, but for more oddball embedded or special-purpose platforms, that may not be true.
[2] § 6.2.5, subclause 28.
(Please note this answer refers to the original form of the question, which asked how the program knew how many bytes to read, etc. I'm keeping it around on that basis, despite the rug having been pulled out from under it.)
A pointer refers to a location in memory that contains a particular object and must be incremented/decremented/indexed with a particular stride size, reflecting the sizeof of the pointed-to type.

The observable value of the pointer itself (e.g. through std::cout << ptr) need not reflect any recognisable physical address, nor does ++ptr need to increment said value by 1, sizeof(*ptr), or anything else. A pointer is just a handle to an object, with an implementation-defined bit representation. That representation doesn't and shouldn't matter to users. The only thing for which users should use the pointer is to... well, point to stuff. Talk of its address is nonportable and only useful in debugging.
Anyway, simply put, the compiler knows how many bytes to read/write because the pointer is typed, and that type has a defined sizeof, representation, and mapping to physical addresses. So, based on that type, operations on ptr will be compiled to appropriate instructions that calculate the real hardware address (which, again, need not correspond to the observable value of ptr), read the right sizeof number of memory 'bytes', add/subtract the right number of bytes so it points at the next object, and so on.
First, read the warning, which says:

warning: initialization from incompatible pointer type [enabled by default] char *p = &y;

It means you should use an explicit cast to avoid undefined behaviour per the standard, §7.21.6.1, subclause 9 (as pointed out by @John Bode):

char *p = (char *)&y;
And:

int y = 1234;

Here y is a local variable, so it is stored in the stack section of RAM. On x86 Linux machines, integers are stored in memory in little-endian format. Assume the 4 bytes of memory reserved for y run from 0x100 to 0x103:
-------------------------------------------------
| 0000 0000 | 0000 0000 | 0000 0100 | 1101 0010 |
-------------------------------------------------
    0x103       0x102       0x101       0x100
                                          ^
                                          |
                         p = j = &y = 0x100
As shown above, j and p both point to the same address 0x100, but when the compiler evaluates *p, since p is a pointer to char (signed by default on this platform), it checks the sign bit, and here the sign bit is 1, which guarantees that the number printed will be negative.
Since the sign bit is 1, the value is a negative number, and negative numbers are stored in memory as 2's complement, so:

actual           => 1101 0010 (1st byte)
1's complement   => 0010 1101
                           +1
                    ---------
                    0010 1110 => 46, and since the sign bit was 1, it prints -46
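The same computation can be checked in code (a sketch; the signed result assumes a 2's complement machine):

#include <stdio.h>

int main(void)
{
    unsigned char byte = 0xD2;  /* 1101 0010 */
    unsigned char magnitude = (unsigned char)(~byte + 1);  /* 2's complement */

    printf("%hhu\n", magnitude);          /* 46  */
    printf("%hhd\n", (signed char)byte);  /* -46 */
    return 0;
}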
If instead you print with the %u format specifier, which prints the unsigned equivalent, it will not check the sign bit; whatever data is in that 1 byte simply gets printed (210 here).
Finally:

printf("%d\n", *j);

In the above statement, dereferencing j, which is a pointer to (signed) int, reads all four bytes and checks bit 31 for the sign. That bit is 0, so the output is a positive number: 1234.