Why do we cast sockaddr_in to sockaddr when calling bind()?
The bind() function accepts a pointer to a sockaddr, but in all examples I've seen, a sockaddr_in structure is used instead, and is cast to sockaddr:
struct sockaddr_in name;
...
if (bind (sock, (struct sockaddr *) &name, sizeof (name)) < 0)
...
I can't wrap my head around why a sockaddr_in struct is used. Why not just prepare and pass a sockaddr?
Is it just convention?
Solution 1:
No, it's not just convention.
sockaddr is a generic descriptor for any kind of socket address, whereas sockaddr_in is a struct specific to IP-based communication (IIRC, "in" stands for "InterNet"). As far as I know, this is a kind of "polymorphism": the bind() function is declared to take a struct sockaddr *, but in fact it assumes that the appropriate type of structure is passed in, i.e. one that corresponds to the type of socket you give it as the first argument.
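To illustrate the "polymorphism" described above, here is a minimal sketch of how a generic function might inspect sa_family and cast back to the concrete type. The helper name describe_address is hypothetical and is not part of the sockets API; it only mimics, under that assumption, the kind of dispatch an implementation could perform.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>

/* Hypothetical helper (not a real sockets API call): formats an address
 * by checking the family field first, then casting the generic pointer
 * back to the concrete structure that matches that family. */
int describe_address(const struct sockaddr *sa, char *out, size_t outlen)
{
    assert(sa != NULL);

    switch (sa->sa_family) {
    case AF_INET: {
        const struct sockaddr_in *in = (const struct sockaddr_in *) sa;
        char ip[INET_ADDRSTRLEN];
        if (inet_ntop(AF_INET, &in->sin_addr, ip, sizeof ip) == NULL)
            return -1;
        /* sin_port is stored in network byte order */
        snprintf(out, outlen, "inet %s:%u", ip, (unsigned) ntohs(in->sin_port));
        return 0;
    }
    case AF_UNIX: {
        const struct sockaddr_un *un = (const struct sockaddr_un *) sa;
        snprintf(out, outlen, "unix %s", un->sun_path);
        return 0;
    }
    default:
        return -1; /* family this sketch does not handle */
    }
}
```

A caller would pass (struct sockaddr *) &name exactly as in the bind() call from the question; the family field is what tells the callee which concrete structure it is really looking at.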
Solution 2:
I don't know if it's very relevant to this question, but I would like to provide some extra information that may make the typecast more understandable, as many people who haven't spent much time with C get confused by such a typecast.
I use macOS, so I am taking examples based on header files from my system.
struct sockaddr is defined as follows:
struct sockaddr {
    __uint8_t   sa_len;       /* total length */
    sa_family_t sa_family;    /* [XSI] address family */
    char        sa_data[14];  /* [XSI] addr value (actually larger) */
};
struct sockaddr_in is defined as follows:
struct sockaddr_in {
    __uint8_t      sin_len;
    sa_family_t    sin_family;
    in_port_t      sin_port;
    struct in_addr sin_addr;
    char           sin_zero[8];
};
Starting from the very basics, a pointer just contains an address. So struct sockaddr * and struct sockaddr_in * are pretty much the same: they both just store an address. The only relevant difference is how the compiler treats the objects they point to.
So when you write (struct sockaddr *) &name, you are just telling the compiler to treat this address as pointing to a struct sockaddr.
So let's say the pointer is pointing to location 1000. If a struct sockaddr * stores this address, the compiler treats the memory from 1000 to 1000 + sizeof(struct sockaddr) - 1 as holding the members laid out in that structure definition. If a struct sockaddr_in * stores the same address, it treats the memory from 1000 to 1000 + sizeof(struct sockaddr_in) - 1 the same way.
When you typecast that pointer, the compiler interprets the same sequence of bytes, up to sizeof(struct sockaddr) bytes, through the struct sockaddr layout.
struct sockaddr *a = (struct sockaddr *) &name; // consider &name = 1000
Now if I access a->sa_len, the compiler reads sizeof(__uint8_t) bytes starting at location 1000, which is the same size and position as sin_len in sockaddr_in. So this accesses the same sequence of bytes.
The same pattern holds for sa_family.
After that there is a 14-byte character array in struct sockaddr, which overlays the data from in_port_t sin_port (typedef'd as a 16-bit unsigned integer = 2 bytes), struct in_addr sin_addr (simply a 32-bit IPv4 address = 4 bytes) and char sin_zero[8] (8 bytes). These three add up to 14 bytes.
These three are stored in this 14-byte character array, and we can access any of them by reading from the appropriate offsets and casting again.
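The overlay described above can be sketched in code. The two helpers below, port_from_generic and addr_from_generic, are hypothetical names introduced only for illustration. They assume the layout shown in this answer, where sin_port occupies the first two bytes of sa_data and sin_addr the next four; this holds on macOS and Linux as far as I know, but it is an assumption rather than a guarantee for every platform.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: recover the port from the generic view.
 * Assumes sin_port sits at the start of sa_data (see lead-in). */
uint16_t port_from_generic(const struct sockaddr *sa)
{
    in_port_t port_be;

    assert(sa->sa_family == AF_INET);
    /* memcpy sidesteps alignment concerns when reading from a char array */
    memcpy(&port_be, sa->sa_data, sizeof port_be);
    return ntohs(port_be); /* stored in network byte order */
}

/* Hypothetical helper: recover the IPv4 address (host byte order).
 * Assumes sin_addr directly follows sin_port inside sa_data. */
uint32_t addr_from_generic(const struct sockaddr *sa)
{
    in_addr_t addr_be;

    assert(sa->sa_family == AF_INET);
    memcpy(&addr_be, sa->sa_data + sizeof(in_port_t), sizeof addr_be);
    return ntohl(addr_be);
}
```

In real code you would normally cast back to struct sockaddr_in * and read sin_port and sin_addr directly; digging into sa_data like this is only meant to show that both views cover the same bytes.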
user529758's answer already explains the reason to do this.
Solution 3:
This is because bind() can bind other types of sockets than IP sockets, for instance Unix domain sockets, which use sockaddr_un as their address type. The address of an AF_INET socket consists of a host and a port, whereas the address of an AF_UNIX socket is a filesystem path.
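As a rough sketch of that difference, the two helpers below fill in one address of each kind. The names fill_inet and fill_unix are hypothetical and exist only for this example; the structure fields and constants are the standard ones from the system headers.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>

/* Hypothetical helper: IPv4 address meaning "any local interface, this port". */
void fill_inet(struct sockaddr_in *in, uint16_t port)
{
    memset(in, 0, sizeof *in);
    in->sin_family = AF_INET;
    in->sin_port = htons(port);            /* network byte order */
    in->sin_addr.s_addr = htonl(INADDR_ANY);
}

/* Hypothetical helper: Unix domain address naming a filesystem path.
 * Returns -1 if the path does not fit into sun_path, 0 otherwise. */
int fill_unix(struct sockaddr_un *un, const char *path)
{
    assert(path != NULL);
    if (strlen(path) >= sizeof un->sun_path)
        return -1;
    memset(un, 0, sizeof *un);
    un->sun_family = AF_UNIX;
    strcpy(un->sun_path, path);            /* length checked above */
    return 0;
}

/* Either structure is then handed to bind() through the same generic type:
 *   bind(inet_sock, (struct sockaddr *) &in, sizeof in);
 *   bind(unix_sock, (struct sockaddr *) &un, sizeof un);
 */
```

The two structures have different sizes and contents, which is why bind() also takes the length as a separate argument alongside the generic pointer.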