What is SOCK_DGRAM and SOCK_STREAM?
I just came across something strange: the application I was looking at uses SOCK_STREAM by default. Why is that? Is SOCK_STREAM just creating multiple streams? Or is it the standard SOCK_STREAM option for creating TCP stream(s)?
I thought tsunami was based on UDP while still having some TCP-like features, e.g. TCP fairness, friendliness, etc.
Could somebody please shed some light on this issue? I am totally confused over this.
TCP almost always uses SOCK_STREAM and UDP uses SOCK_DGRAM.
TCP (SOCK_STREAM) is a connection-based protocol. The connection is established and the two parties have a conversation until the connection is terminated by one of the parties or by a network error.
UDP (SOCK_DGRAM) is a datagram-based protocol. There is no connection: each datagram is sent on its own, and any reply comes back as another independent datagram.
If you send multiple packets, TCP promises to deliver them in order. UDP does not, so the receiver needs to check them, if the order matters.
If a TCP packet is lost, the sender can tell. Not so for UDP.
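UDP datagrams are limited in size: the length field caps a datagram at 65,535 bytes (65,507 bytes of payload over IPv4), and anything larger than the path MTU gets fragmented. The 512-byte figure that is often quoted is a DNS convention, not a UDP limit. TCP can send a stream of any length.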
TCP is a bit more robust and makes more checks. UDP is a shade lighter weight (less computer and network stress).
Choose the protocol appropriate for how you want to interact with the other computer.
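A minimal sketch of the byte-stream versus datagram difference, assuming a POSIX system. To avoid needing a network it uses socketpair() with AF_UNIX (a local address family, not TCP or UDP themselves), and first_read_size is a name made up for this illustration. It sends two 2-byte messages and reports how many bytes the first read returns:

```c
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: write two 2-byte messages into one end of a connected
 * local socket pair, do a single read on the other end, and return how many
 * bytes that read yielded (or -1 on error). */
ssize_t first_read_size(int type)
{
    int fds[2];
    char buf[16];
    ssize_t n = -1;

    /* AF_UNIX is an assumption made so the example runs without a network. */
    if (socketpair(AF_UNIX, type, 0, fds) != 0)
        return -1;

    if (send(fds[0], "ab", 2, 0) == 2 && send(fds[0], "cd", 2, 0) == 2)
        n = recv(fds[1], buf, sizeof buf, 0);

    close(fds[0]);
    close(fds[1]);
    return n;
}
```

With SOCK_DGRAM the first read should return 2, because each datagram is delivered as its own unit. With SOCK_STREAM the two writes are just bytes in a stream, so one read may well return all 4; a stream makes no promise about where reads split.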
One of the ideas behind the Berkeley Sockets API was that it could use different protocol families, not just the Internet Protocol (IP). Instead, you had one API that could handle all kinds of "address families", e.g.:
- Internet Protocol version 4 (IPv4): AF_INET
- IPX/SPX: AF_IPX
- AppleTalk: AF_APPLETALK
- NetBIOS: AF_NETBIOS
- Internet Protocol version 6 (IPv6): AF_INET6
- Infrared Data Association (IrDA): AF_IRDA
- Bluetooth: AF_BTH
Each protocol family generally has a few similar concepts of how data will be handled on a socket:
- sequenced, reliable, two-way, connection-based, byte-streams: SOCK_STREAM (what an IP person would call TCP)
- connectionless, unreliable, datagrams: SOCK_DGRAM (what an IP person would call UDP)
Different address families have different terms for these basic concepts:
╔═══════════╦══════════════════════════╗
║           ║       Socket Type        ║
║ Address   ╟────────────┬─────────────╢
║ Family    ║ SOCK_DGRAM │ SOCK_STREAM ║
╠═══════════╬════════════╪═════════════╣
║ IPX/SPX   ║ IPX        │ SPX         ║
║ NetBIOS   ║ NetBIOS    │ n/a         ║
║ IPv4      ║ UDP        │ TCP         ║
║ AppleTalk ║ DDP        │ ADSP        ║
║ IPv6      ║ UDP        │ TCP         ║
║ IrDA      ║ IrLMP      │ IrTTP       ║
║ Bluetooth ║ ?          │ RFCOMM      ║
╚═══════════╩════════════╧═════════════╝
The point is:
- If you want reliable, two-way, connection-based, sequenced, byte-streams
- you ask for it using "SOCK_STREAM"
- and the sockets API will worry about figuring out that you want TCP
Similarly, if I were creating a socket over Infrared (IrDA, AF_IRDA):
- I have no idea what protocol in IrDA is reliable, sequenced, and connection-based
- all I know is that I want something that is reliable, sequenced, and connection-based
So I say:
socket(AF_IRDA, SOCK_STREAM, 0);
And Sockets will figure it out for me.
Bonus
Originally there were only two protocol options:
- connectionless, unreliable, datagrams (SOCK_DGRAM)
- connection-based, reliable, sequenced, two-way (SOCK_STREAM)
Later other protocol choices were added:
- a reliable message datagram (SOCK_RDM, "reliably delivered messages"; few address families implement it)
- pseudo-stream sequenced packets based on datagrams (SOCK_SEQPACKET)
╔═══════════╦══════════════════════════════════════════════════════╗
║           ║                     Socket Type                      ║
║ Address   ╟────────────┬─────────────┬──────────┬────────────────╢
║ Family    ║ SOCK_DGRAM │ SOCK_STREAM │ SOCK_RDM │ SOCK_SEQPACKET ║
╠═══════════╬════════════╪═════════════╪══════════╪════════════════╣
║ IPX/SPX   ║ IPX        │ SPX         │ ?        │ ?              ║
║ NetBIOS   ║ NetBIOS    │ n/a         │ ?        │ ?              ║
║ IPv4      ║ UDP        │ TCP         │ ?        │ SCTP           ║
║ AppleTalk ║ DDP        │ ADSP        │ ?        │ ?              ║
║ IPv6      ║ UDP        │ TCP         │ ?        │ SCTP           ║
║ IrDA      ║ IrLMP      │ IrTTP       │ ?        │ ?              ║
║ Bluetooth ║ ?          │ RFCOMM      │ ?        │ ?              ║
╚═══════════╩════════════╧═════════════╧══════════╧════════════════╝
It's not guaranteed that any given address family will support such protocol choices; but some do.
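If you want to find out what a particular machine supports, one option is simply to ask. This is a rough sketch, assuming a POSIX system; probe_socket_type is a hypothetical helper name, not part of the sockets API:

```c
#include <errno.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: returns 0 if the system will create a socket for this
 * address family and socket type (protocol 0 lets it pick the protocol),
 * otherwise the errno value reported by socket(). */
int probe_socket_type(int family, int type)
{
    int fd = socket(family, type, 0);
    if (fd < 0)
        return errno;
    close(fd);
    return 0;
}
```

For example, probe_socket_type(AF_INET, SOCK_SEQPACKET) returns 0 only where the system has something to offer for that combination; otherwise you get back an error such as EAFNOSUPPORT or EPROTONOSUPPORT.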
Bonus Bonus Chatter
Hopefully now you see why it is redundant to pass the IPPROTO_TCP protocol in your call to create a socket:
socket(AF_INET, SOCK_STREAM, IPPROTO_TCP); // passing IPPROTO_TCP is redundant
socket(AF_INET, SOCK_STREAM, 0); // better
You already said you wanted a SOCK_STREAM. You don't need to force TCP on top of it. In the same way it's redundant to call:
socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP); //passing IPPROTO_UDP is redundant
socket(AF_INET, SOCK_DGRAM, 0); // better
tl;dr: It's a protocol-independent way of asking for TCP or UDP. But since hardly anyone writes socket code for AppleTalk, IPX/SPX, IrDA, or NetBIOS anymore, it's mostly vestigial.
Update: my answer no longer seems relevant, but the original question referred to UDT, which is a connection-oriented protocol built on top of UDP. More info here: http://en.wikipedia.org/wiki/UDP-based_Data_Transfer_Protocol
UDT appears to provide an API which mimics the classic BSD sockets API, so it can be used as a drop-in replacement for both stream-oriented and datagram-oriented applications. Check e.g. sendmsg and recvmsg - both throw an exception if used on a socket created with SOCK_STREAM, and all the stream-oriented APIs throw an exception for a socket created with SOCK_DGRAM as well.
In the case of SOCK_DGRAM it performs some extra processing, however; it doesn't simply wrap the UDP socket transparently, as far as I understand the code after a quick review (I'm not familiar with UDT internals or the protocol spec). Reading the technical papers could help a lot.
The library always creates its underlying, "real" socket as a datagram one (check channel.cpp, CChannel::open).