How large should my recv buffer be when calling recv in the socket library

Solution 1:

The answers to these questions vary depending on whether you are using a stream socket (SOCK_STREAM) or a datagram socket (SOCK_DGRAM) - within TCP/IP, the former corresponds to TCP and the latter to UDP.

How do you know how big to make the buffer passed to recv()?

SOCK_STREAM: It doesn't really matter too much. If your protocol is a transactional / interactive one just pick a size that can hold the largest individual message / command you would reasonably expect (3000 is likely fine). If your protocol is transferring bulk data, then larger buffers can be more efficient - a good rule of thumb is around the same as the kernel receive buffer size of the socket (often something around 256kB).
SOCK_DGRAM: Use a buffer large enough to hold the biggest packet that your application-level protocol ever sends. If you're using UDP, then in general your application-level protocol shouldn't be sending packets larger than about 1400 bytes, because they'll certainly need to be fragmented and reassembled.

What happens if recv gets a packet larger than the buffer?

SOCK_STREAM: The question doesn't really make sense as put, because stream sockets don't have a concept of packets - they're just a continuous stream of bytes. If there's more bytes available to read than your buffer has room for, then they'll be queued by the OS and available for your next call to recv.
SOCK_DGRAM: The excess bytes are discarded.

How can I know if I have received the entire message?

SOCK_STREAM: You need to build some way of determining the end-of-message into your application-level protocol. Commonly this is either a length prefix (starting each message with the length of the message) or an end-of-message delimiter (which might just be a newline in a text-based protocol, for example). A third, lesser-used, option is to mandate a fixed size for each message. Combinations of these options are also possible - for example, a fixed-size header that includes a length value.
SOCK_DGRAM: An single recv call always returns a single datagram.

Is there a way I can make a buffer not have a fixed amount of space, so that I can keep adding to it without fear of running out of space?

No. However, you can try to resize the buffer using realloc() (if it was originally allocated with malloc() or calloc(), that is).

Solution 2:

For streaming protocols such as TCP, you can pretty much set your buffer to any size. That said, common values that are powers of 2 such as 4096 or 8192 are recommended.

If there is more data then what your buffer, it will simply be saved in the kernel for your next call to recv.

Yes, you can keep growing your buffer. You can do a recv into the middle of the buffer starting at offset idx, you would do:

recv(socket, recv_buffer + idx, recv_buffer_size - idx, 0);

Solution 3:

If you have a SOCK_STREAM socket, recv just gets "up to the first 3000 bytes" from the stream. There is no clear guidance on how big to make the buffer: the only time you know how big a stream is, is when it's all done;-).

If you have a SOCK_DGRAM socket, and the datagram is larger than the buffer, recv fills the buffer with the first part of the datagram, returns -1, and sets errno to EMSGSIZE. Unfortunately, if the protocol is UDP, this means the rest of the datagram is lost -- part of why UDP is called an unreliable protocol (I know that there are reliable datagram protocols but they aren't very popular -- I couldn't name one in the TCP/IP family, despite knowing the latter pretty well;-).

To grow a buffer dynamically, allocate it initially with malloc and use realloc as needed. But that won't help you with recv from a UDP source, alas.

Solution 4:

For SOCK_STREAM socket, the buffer size does not really matter, because you are just pulling some of the waiting bytes and you can retrieve more in a next call. Just pick whatever buffer size you can afford.

For SOCK_DGRAM socket, you will get the fitting part of the waiting message and the rest will be discarded. You can get the waiting datagram size with the following ioctl:

#include <sys/ioctl.h>
int size;
ioctl(sockfd, FIONREAD, &size);

Alternatively you can use MSG_PEEK and MSG_TRUNC flags of the recv() call to obtain the waiting datagram size.

ssize_t size = recv(sockfd, buf, len, MSG_PEEK | MSG_TRUNC);

You need MSG_PEEK to peek (not receive) the waiting message - recv returns the real, not truncated size; and you need MSG_TRUNC to not overflow your current buffer.

Then you can just malloc(size) the real buffer and recv() datagram.

Solution 5:

There is no absolute answer to your question, because technology is always bound to be implementation-specific. I am assuming you are communicating in UDP because incoming buffer size does not bring problem to TCP communication.

According to RFC 768, the packet size (header-inclusive) for UDP can range from 8 to 65 515 bytes. So the fail-proof size for incoming buffer is 65 507 bytes (~64KB)

However, not all large packets can be properly routed by network devices, refer to existing discussion for more information:

What is the optimal size of a UDP packet for maximum throughput?
What is the largest Safe UDP Packet Size on the Internet