Socket Programming Python: How to make sure entire message is received?

First of all, you don't need a loop to send all the bytes, because Python sockets provide a simple method for that: socket.sendall().

Now to your questions:

  1. Yes, even to receive just 4 bytes you should have a receive loop that calls recv() on the socket until 4 bytes are read.

  2. You can, if you can guarantee that such characters will not appear in the message itself. However, you'd still need to search every character that you read in for the magic delimiter, so it seems inferior to simply prefixing the message body with a length.

  3. A call to recv(n) is only guaranteed to return at most n bytes, not exactly n bytes. It can return fewer whenever less data has arrived so far.

Here are three different recvall() functions to compare:

def recvall(sock, size):
    received_chunks = []
    buf_size = 4096
    remaining = size
    while remaining > 0:
        received = sock.recv(min(remaining, buf_size))
        if not received:
            raise Exception('unexpected EOF')
        received_chunks.append(received)
        remaining -= len(received)
    return b''.join(received_chunks)

and the much shorter

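import socket  # needed for the MSG_WAITALL constant

def recvall2(sock, size):
    return sock.recv(size, socket.MSG_WAITALL)  # may still return fewer bytes on EOF or a signal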

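and finally another version that is a little shorter than the first but lacks a couple of features: it does not cap the size of each recv() call, and it does not detect EOF, so it loops forever if the peer closes the connection early: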

def recvall3(sock, size):
    result = b''
    remaining = size
    while remaining > 0:
        data = sock.recv(remaining)
        result += data
        remaining -= len(data)
    return result

The second one is nice and short, but it relies on the recv() flag socket.MSG_WAITALL (a flag, not a socket option), which I do not believe is guaranteed to exist on every platform. The first and third ones should work everywhere. I haven't benchmarked any of them.
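To show how a receive loop combines with the length prefix mentioned above, here is a minimal sketch. The names recv_exact, send_msg and recv_msg are made up for this example, and the 4-byte big-endian header is an assumed wire format, not something the question prescribes.

```python
import socket
import struct

# Assumed wire format: 4-byte unsigned length in network byte order, then the payload.
HEADER = struct.Struct('!I')

def recv_exact(sock, size):
    """Read exactly `size` bytes, or raise EOFError if the peer closes first."""
    chunks = []
    remaining = size
    while remaining > 0:
        chunk = sock.recv(min(remaining, 4096))
        if not chunk:
            raise EOFError('connection closed with %d bytes still expected' % remaining)
        chunks.append(chunk)
        remaining -= len(chunk)
    return b''.join(chunks)

def send_msg(sock, payload):
    """Send one length-prefixed message; sendall() handles partial sends."""
    sock.sendall(HEADER.pack(len(payload)) + payload)

def recv_msg(sock):
    """Receive one length-prefixed message and return its payload."""
    (length,) = HEADER.unpack(recv_exact(sock, HEADER.size))
    return recv_exact(sock, length)

if __name__ == '__main__':
    # socketpair() gives two connected stream sockets, handy for a local demo.
    left, right = socket.socketpair()
    with left, right:
        send_msg(left, b'hello')
        send_msg(left, b'second message')
        print(recv_msg(right))
        print(recv_msg(right))
```

Because the receiver always knows how many bytes to expect, two messages sent back to back are separated correctly even if they arrive in a single recv() call.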


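For sending, you only strictly need that loop if you've put the socket in non-blocking mode. If the socket is in blocking mode (the default), sock.send() will in practice usually block until it has handed the entire message to the OS or gets an error, but Python only documents that it returns the number of bytes actually sent, which may be fewer than you passed in. The safe choice is sock.sendall(), which does the loop for you.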
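For reference, this is roughly the loop that sendall() saves you from writing on a blocking socket. It is only a sketch: send_all is a made-up name, and a non-blocking socket would additionally need to wait for writability (for example with select) instead of calling send() straight away.

```python
import socket

def send_all(sock, data):
    """Keep calling send() until every byte of `data` has been accepted."""
    view = memoryview(data)  # slicing a memoryview avoids copying the unsent tail
    total = 0
    while total < len(view):
        sent = sock.send(view[total:])
        if sent == 0:
            raise RuntimeError('socket connection broken')
        total += sent
    return total
```

In real code sock.sendall(data) is the simpler choice; the loop is mainly useful for seeing what "partial send" means.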

However, for receiving there's no equivalent, because TCP doesn't include message boundaries in the protocol. sock.recv() returns as soon as any data is available.
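A small experiment makes the missing message boundaries visible. This sketch uses socket.socketpair() as a stand-in for a real TCP connection, and collect_reads is a name invented for the example. Two separate sends go in, and the receiver simply records whatever each recv() call returns.

```python
import socket

def collect_reads():
    """Send two messages, then return the list of chunks recv() handed back."""
    sender, receiver = socket.socketpair()
    with sender, receiver:
        sender.sendall(b'hello')
        sender.sendall(b'world')
        sender.shutdown(socket.SHUT_WR)  # signal EOF so the read loop terminates
        pieces = []
        while True:
            data = receiver.recv(1024)
            if not data:  # empty bytes means the peer finished sending
                break
            pieces.append(data)
    return pieces

pieces = collect_reads()
print(pieces)            # how the bytes were split is up to the OS
print(b''.join(pieces))  # the byte stream itself is always b'helloworld'
```

The two sends may come back as one chunk, two chunks, or some other split. Only the order and content of the bytes are preserved, which is why the receiver needs its own framing.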

  1. Call sock.recv() in a loop until you get everything you need. Just as your sending routine sends a shorter substring on each iteration, you can reduce the size argument to recv() by the amount you've read so far. It can look like:

def myrecv(self, size):
    buffer = b''  # recv() returns bytes in Python 3
    while size > 0:
        msg = self.sock.recv(size)
        if not msg:
            # An empty result means the peer closed the connection.
            raise EOFError('connection closed before the full message arrived')
        buffer += msg
        size -= len(msg)
    return buffer

If you put a 4-byte length before each message (here, four ASCII decimal digits such as 0042, which is what int() parses), you can do something like:

msgsize = int(self.myrecv(4))
message = self.myrecv(msgsize)

  2. You could do that, but it makes things more complicated. You either need to read one character at a time, checking for the delimiter, or implement a buffer that holds data you've read but haven't yet returned to the caller because it lies past the end of the current message. Also, if the data can contain the delimiter, you need a way to escape it.

  3. No, recv(1024) can return as soon as it gets any data, which may be less than the size of the message that was sent. If it were guaranteed to return 1024 bytes, it would hang when the sender only sent 500 bytes, because it would be waiting for the remaining 524.
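For the delimiter approach, the buffer described above can be sketched like this. DelimitedReader is a hypothetical helper, the newline delimiter is an assumption, and there is no escaping, so it only works if the payload never contains the delimiter.

```python
import socket

class DelimitedReader:
    """Hypothetical helper that returns one delimiter-terminated message per call."""

    def __init__(self, sock, delimiter=b'\n'):
        self.sock = sock
        self.delimiter = delimiter
        self.buffer = b''  # bytes received but not yet returned to the caller

    def read_message(self):
        while True:
            # If a full message is already buffered, return it and keep the rest.
            head, sep, tail = self.buffer.partition(self.delimiter)
            if sep:
                self.buffer = tail
                return head
            data = self.sock.recv(4096)
            if not data:
                raise EOFError('connection closed before a delimiter was seen')
            self.buffer += data
```

Reading in 4096-byte chunks avoids one recv() call per character, at the cost of keeping the leftover bytes around between calls.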