Dereferencing this pointer gives me -46, but I am not sure why
If you have something like

int x = 1234;
int *p = &x;

and you dereference the pointer p, it will correctly read all the bytes of the integer, because you declared p to be a pointer to int. The compiler knows how many bytes to read from the pointed-to type's size, as reported by the sizeof operator. Generally the size of int is 4 bytes (on 32/64-bit platforms), but it is machine dependent, which is exactly why the size comes from sizeof rather than from a fixed number.
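As a minimal sketch of that simple case (the size printed is machine dependent):

#include <stdio.h>

int main(void)
{
    int x = 1234;
    int *p = &x;

    /* p is declared as a pointer to int, so *p reads sizeof(int) bytes */
    printf("sizeof(int) = %zu\n", sizeof(int)); /* typically 4 */
    printf("*p = %d\n", *p);                    /* 1234 */
    return 0;
}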
For your code

int y = 1234;
char *p = &y;
int *j = &y;

pointer p points to y, but we declared it to be a pointer to char, so it will read only one byte (sizeof(char) is 1 by definition).
1234 in binary is represented as

00000000 00000000 00000100 11010010

If your machine is little endian, it stores the bytes in reverse order:

11010010 00000100 00000000 00000000

That is, 11010010 is at (hypothetical) address 00, 00000100 is at address 01, and so on:
BE:   00   01   02   03
    +----+----+----+----+
y:  | 00 | 00 | 04 | d2 |
    +----+----+----+----+

LE:   00   01   02   03
    +----+----+----+----+
y:  | d2 | 04 | 00 | 00 |
    +----+----+----+----+

(byte values in hexadecimal)
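If you want to check your own machine's byte order, here is a minimal sketch (output shown for a little-endian machine):

#include <stdio.h>

int main(void)
{
    int y = 1234;
    unsigned char *b = (unsigned char *)&y;

    /* print each byte of y, lowest address first */
    for (size_t i = 0; i < sizeof y; i++)
        printf("%02hhx ", b[i]);    /* little endian: d2 04 00 00 */
    printf("\n");
    return 0;
}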
So if you dereference pointer p, it will read only the first byte, 11010010. The output will be -46 if char is signed and 210 if it is unsigned; according to the C standard, the signedness of plain char is implementation-defined, and on your machine it is evidently signed.
On your PC negative numbers are represented in 2's complement, so the most-significant bit is the sign bit, and a leading 1 denotes a negative value:

11010010 = -128 + 64 + 16 + 2 = -46

If you dereference pointer j instead, it will read all the bytes of the int, since we declared it to be a pointer to int, and the output will be 1234.
In general, if you declare a pointer j as int *j, then *j will read sizeof(int) bytes (4 here, but machine dependent). The same goes for char or any other data type: a pointer reads as many bytes as the size of the type it points to, and char is 1 byte.
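Putting it all together, here is a sketch of the situation in the question; the -46 assumes plain char is signed on your platform, and the cast is added for the reason discussed just below:

#include <stdio.h>

int main(void)
{
    int y = 1234;
    char *p = (char *)&y;   /* *p reads 1 byte            */
    int  *j = &y;           /* *j reads sizeof(int) bytes */

    printf("%hhd\n", *p);   /* -46 on a little-endian machine with signed char */
    printf("%d\n", *j);     /* 1234 */
    return 0;
}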
As others have pointed out, you need to cast explicitly to char *, since char *p = &y; is a constraint violation: char * and int * are not compatible types. Write char *p = (char *)&y; instead.
There are a couple of issues with the code as written.
First of all, you are invoking undefined behavior by trying to print the numeric representation of a char object using the %d conversion specifier:
Online C 2011 draft, §7.21.6.1, subclause 9:
If a conversion specification is invalid, the behavior is undefined.282) If any argument is not the correct type for the corresponding conversion specification, the behavior is undefined.
Yes, objects of type char are promoted to int when passed to variadic functions; printf is special, and if you want the output to be well-defined, then the type of the argument and the conversion specifier must match up. To print the numeric value of a char with %d, or of an unsigned char argument with %u, %o, or %x, you must use the hh length modifier as part of the conversion spec:
printf( "%hhd ", *p );
The second issue is that the line

char *p = &y;

is a constraint violation: char * and int * are not compatible types, and may have different sizes and/or representations [1][2]. Thus, you must explicitly cast the source to the target type:

char *p = (char *) &y;

The one exception to this rule occurs when one of the operands is void *; then the cast isn't necessary.
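For instance, this sketch compiles cleanly without a cast because both conversions go through void *:

#include <stdio.h>

int main(void)
{
    int y = 1234;
    void *v = &y;   /* any object pointer converts to void * implicitly */
    char *p = v;    /* and void * converts to char * implicitly         */

    printf("%hhd\n", *p);   /* -46 again on this platform */
    return 0;
}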
Having said all that, I took your code and added a utility that dumps the address and contents of objects in the program. Here's what y, p, and j look like on my system (SLES-10, gcc 4.1.2):
Item  Address          00 01 02 03
----  -------          -- -- -- --
y     0x7fff1a7e99cc   d2 04 00 00   ....

p     0x7fff1a7e99c0   cc 99 7e 1a   ..~.
      0x7fff1a7e99c4   ff 7f 00 00   ....

j     0x7fff1a7e99b8   cc 99 7e 1a   ..~.
      0x7fff1a7e99bc   ff 7f 00 00   ....
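The dump utility itself isn't shown; a minimal sketch along the same lines (a hypothetical reconstruction, not the original code) might look like:

#include <stdio.h>

/* print an object's address and raw bytes, lowest address first */
static void dump(const char *name, const void *obj, size_t len)
{
    const unsigned char *b = obj;
    printf("%-4s %p ", name, (void *)obj);
    for (size_t i = 0; i < len; i++)
        printf("%02hhx ", b[i]);
    printf("\n");
}

int main(void)
{
    int y = 1234;
    char *p = (char *)&y;
    int *j = &y;

    dump("y", &y, sizeof y);
    dump("p", &p, sizeof p);
    dump("j", &j, sizeof j);
    return 0;
}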
I'm on an x86 system, which is little-endian, so it stores multi-byte objects starting with the least-significant byte at the lowest address:
BE:   A   A+1  A+2  A+3
    +----+----+----+----+
y:  | 00 | 00 | 04 | d2 |
    +----+----+----+----+
LE:  A+3  A+2  A+1   A
On a little-endian system, the addressed byte is the least-significant byte, which in this case is 0xd2 (210 unsigned, -46 signed). In a nutshell, you're printing the signed, decimal representation of that single byte.
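To see both interpretations of that byte side by side (a sketch; the values assume a little-endian machine where plain char is signed):

#include <stdio.h>

int main(void)
{
    int y = 1234;
    char *p = (char *)&y;

    printf("%hhd\n", *p);                 /* -46: signed reading of 0xd2 */
    printf("%hhu\n", (unsigned char)*p);  /* 210: unsigned reading       */
    return 0;
}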
As for the broader question, the type of the expression *p is char and the type of the expression *j is int; the compiler simply goes by the type of the expression. The compiler keeps track of all objects, expressions, and types as it translates your source to machine code. So when it sees the expression *j, it knows that it's dealing with an integer value and generates machine code appropriately. When it sees the expression *p, it knows it's dealing with a char value.
[1] Admittedly, almost all modern desktop systems that I know of use the same representation for all pointer types, but for more oddball embedded or special-purpose platforms, that may not be true.
[2] § 6.2.5, subclause 28.
(Please note this answer refers to the original form of the question, which asked how the program knew how many bytes to read, etc. I'm keeping it around on that basis, despite the rug having been pulled out from under it.)
A pointer refers to a location in memory that contains a particular object and must be incremented/decremented/indexed with a particular stride size, reflecting the sizeof of the pointed-to type.

The observable value of the pointer itself (e.g. through std::cout << ptr) need not reflect any recognisable physical address, nor does ++ptr need to increment said value by 1, sizeof(*ptr), or anything else. A pointer is just a handle to an object, with an implementation-defined bit representation. That representation doesn't and shouldn't matter to users. The only thing for which users should use the pointer is to... well, point to stuff. Talk of its address is nonportable and only useful in debugging.
Anyway, simply put, the compiler knows how many bytes to read/write because the pointer is typed, and that type has a defined sizeof, representation, and mapping to physical addresses. So, based on that type, operations on ptr will be compiled to appropriate instructions that calculate the real hardware address (which, again, need not correspond to the observable value of ptr), read the right sizeof number of memory 'bytes', add/subtract the right number of bytes so it points at the next object, and so on.
First, read the warning, which says:

warning: initialization from incompatible pointer type [enabled by default] char *p = &y;

It means you should use an explicit cast to avoid undefined behaviour per the standard, §7.21.6.1, subclause 9 (as pointed out by @John Bode):

char *p = (char *)&y;
And:

int y = 1234;

Here y is a local variable, so it is stored in the stack section of RAM. On x86 Linux machines, integers are stored in memory in little-endian format. Assume the 4 bytes of memory reserved for y run from 0x100 to 0x103:
-------------------------------------------------
| 0000 0000 | 0000 0000 | 0000 0100 | 1101 0010 |
-------------------------------------------------
    0x103       0x102       0x101       0x100
                                          ^
                                          |
                         p = j = &y = 0x100
As shown above, j and p both point to the same address 0x100, but when the compiler evaluates *p, since p is a pointer to char (signed by default on this platform), it checks the sign bit, and here the sign bit is 1, which guarantees that the number printed will be negative.
Since the sign bit is 1, the value is a negative number, and negative numbers are stored in memory as 2's complement, so:

actual           => 1101 0010 (1st byte)
1's complement   => 0010 1101
                           +1
                    ---------
                    0010 1110 => 46, and since the sign bit was 1, it prints -46
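The same computation can be checked in code (a sketch; the signed result assumes a 2's complement machine):

#include <stdio.h>

int main(void)
{
    unsigned char byte = 0xD2;  /* 1101 0010 */
    unsigned char magnitude = (unsigned char)(~byte + 1);  /* 2's complement */

    printf("%hhu\n", magnitude);          /* 46  */
    printf("%hhd\n", (signed char)byte);  /* -46 */
    return 0;
}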
If instead you print with the %u format specifier, which prints the unsigned equivalent, it will not check the sign bit; whatever data is in that 1 byte simply gets printed (210 here).
Finally:

printf("%d\n", *j);

In the above statement, dereferencing j, which is a pointer to (signed) int, reads all four bytes and checks bit 31 for the sign. That bit is 0, so the output is a positive number: 1234.