How to (de)construct data frames in WebSockets hybi 08+?

With the update to v14, Chrome went from version three of the draft to version eight of the draft.

I have an internal chat application running on WebSocket, and although I've gotten the new handshake working, the data framing apparently has changed as well. My WebSocket server is based on Nugget.

Does anybody have WebSocket working with version eight of the draft, and an example of how to frame the data being sent over the wire?


(See also: How can I send and receive WebSocket messages on the server side?)


It's fairly easy, but it's important to understand the format.

The first byte is almost always 1000 0001, where the 1 means "last frame", the three 0s are reserved bits without any meaning so far and the 0001 means that it's a text frame (which Chrome sends with the ws.send() method).

(Update: Chrome can now also send binary frames with an ArrayBuffer. The last four bits of the first byte will then be 0010, so you can distinguish between text and binary data. The decoding of the data works exactly the same way.)
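As a minimal sketch of how the first byte breaks down, the helper below splits it into the "last frame" bit, the reserved bits, and the frame type. The names parseFirstByte, OPCODE_TEXT and OPCODE_BINARY are made up for illustration and are not part of any library.

```javascript
// Hypothetical helper: split the first header byte into its fields.
function parseFirstByte(firstByte) {
    return {
        fin:    (firstByte & 0x80) !== 0, // top bit: this is the last frame
        rsv:    (firstByte & 0x70) >> 4,  // three reserved bits, normally 0
        opcode:  firstByte & 0x0f         // low four bits: frame type
    };
}

// Illustrative constants: 0001 is a text frame, 0010 is a binary frame.
var OPCODE_TEXT = 0x1;
var OPCODE_BINARY = 0x2;

var first = parseFirstByte(0x81); // 1000 0001
console.log(first.fin, first.opcode === OPCODE_TEXT); // true true
```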

The second byte consists of a 1 (meaning that the frame is "masked", i.e. encoded) followed by seven bits which represent the frame size. If those bits are between 000 0000 and 111 1101, that value is the size. If they are 111 1110, the following 2 bytes are the length (because it wouldn't fit in seven bits), and if they are 111 1111, the following 8 bytes are the length (because it wouldn't fit in two bytes either). Multi-byte lengths are stored most significant byte first.
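The length rules can be sketched roughly as follows. parseLength is a hypothetical helper, and it assumes 'bytes' is a plain array (or Uint8Array) holding at least the full header. It uses multiplication instead of bit shifts so that large values don't wrap around at 32 bits; lengths above 2^53 still can't be represented exactly in a JavaScript number.

```javascript
// Hypothetical helper: read the mask bit and payload length from a header.
function parseLength(bytes) {
    var masked = (bytes[1] & 0x80) !== 0; // leading 1 of the second byte
    var code = bytes[1] & 0x7f;           // remaining seven bits
    var length, offset;

    if (code <= 125) {
        length = code;                    // the seven bits are the length
        offset = 2;
    } else if (code === 126) {
        length = bytes[2] * 256 + bytes[3]; // 2-byte length, big-endian
        offset = 4;
    } else {
        length = 0;                       // 8-byte length, big-endian
        for (var i = 2; i < 10; i++) {
            length = length * 256 + bytes[i];
        }
        offset = 10;
    }

    // 'offset' is where the masks (or, if unmasked, the data) begin
    return { masked: masked, length: length, offset: offset };
}
```

For example, a header starting with 0x81 0xFE 0x01 0x00 describes a masked frame with a 256-byte payload whose masks start at index 4.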

Following that are four bytes, the "masks", which you need to decode the frame data. Decoding is a bytewise xor: each encoded byte is xor-ed with one of the four mask bytes, chosen by the byte's index within the data mod 4. In other words, decodedByte = encodedByte xor masks[indexOfByteInData mod 4].

Now I must say I'm not experienced with C# at all, but this is some pseudocode (some JavaScript accent I'm afraid):

var length_code = bytes[1] & 127, // remove the first 1 by doing '& 127'
    masks,
    data;

if(length_code === 126) {
    masks = bytes.slice(4, 8);   // 'slice' returns part of the byte array
    data  = bytes.slice(8);      // and accepts 'start' (inclusively)
} else if(length_code === 127) { // and 'end' (exclusively) as arguments
    masks = bytes.slice(10, 14); // Passing no 'end' makes 'end' the length
    data  = bytes.slice(14);     // of the array
} else {
    masks = bytes.slice(2, 6);
    data  = bytes.slice(6);
}

// 'map' replaces each element in the array as per a specified function
// (each element will be replaced with what is returned by the function)
// The passed function accepts the value and index of the element as its
// arguments
var decoded = data.map(function(byte, index) { // index === 0 for the first byte
    return byte ^ masks[ index % 4 ];          // of 'data', not of 'bytes'
    //         xor            mod
});
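To check the approach against known bytes, here is a small runnable sketch that decodes the masked "Hello" frame given as an example in RFC 6455. unmaskShortFrame is a hypothetical helper that only handles frames whose length fits in the seven bits, and String.fromCharCode is only adequate here because the payload is plain ASCII.

```javascript
// Hypothetical helper: unmask the payload of a short masked frame
// (length code <= 125, so the masks start right after the second byte).
function unmaskShortFrame(bytes) {
    var length = bytes[1] & 0x7f;          // assumed to be <= 125
    var masks  = bytes.slice(2, 6);
    var data   = bytes.slice(6, 6 + length);
    return data.map(function (b, index) {
        return b ^ masks[index % 4];
    });
}

// Masked text frame containing "Hello" (example bytes from RFC 6455)
var frame = [0x81, 0x85, 0x37, 0xfa, 0x21, 0x3d, 0x7f, 0x9f, 0x4d, 0x51, 0x58];
var text = String.fromCharCode.apply(null, unmaskShortFrame(frame));
console.log(text); // Hello
```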

You can also download the specification which can be helpful (it of course contains everything you need to understand the format).


This C# code works fine for me. It decodes text data sent from a browser to a C# server over a socket.

    public static string GetDecodedData(byte[] buffer, int length)
    {
        byte b = buffer[1];
        int dataLength = 0;
        int totalLength = 0;
        int keyIndex = 0;

        if (b - 128 <= 125)
        {
            dataLength = b - 128;
            keyIndex = 2;
            totalLength = dataLength + 6;
        }

        if (b - 128 == 126)
        {
            dataLength = BitConverter.ToUInt16(new byte[] { buffer[3], buffer[2] }, 0); // the 16-bit length is unsigned
            keyIndex = 4;
            totalLength = dataLength + 8;
        }

        if (b - 128 == 127)
        {
            dataLength = (int)BitConverter.ToInt64(new byte[] { buffer[9], buffer[8], buffer[7], buffer[6], buffer[5], buffer[4], buffer[3], buffer[2] }, 0);
            keyIndex = 10;
            totalLength = dataLength + 14;
        }

        if (totalLength > length)
            throw new Exception("The buffer length is smaller than the data length");

        byte[] key = new byte[] { buffer[keyIndex], buffer[keyIndex + 1], buffer[keyIndex + 2], buffer[keyIndex + 3] };

        int dataIndex = keyIndex + 4;
        int count = 0;
        for (int i = dataIndex; i < totalLength; i++)
        {
            buffer[i] = (byte)(buffer[i] ^ key[count % 4]);
            count++;
        }

        // Text frames carry UTF-8, so decode as UTF-8 rather than ASCII.
        return Encoding.UTF8.GetString(buffer, dataIndex, dataLength);
    }

To be more accurate, Chrome has gone from the Hixie-76 version of the protocol to the HyBi-10 version of the protocol. HyBi-08 through HyBi-10 all report as version 8 because it was really only the specification text that changed and not the wire format.

The framing has changed from using '\x00...\xff' to using a 2-14 byte header for each frame that contains the length of the payload among other things. There is a diagram of the frame format in section 4.2 of the specification. Also note that data from the client (browser) to the server is masked (4 bytes of the client-to-server frame header contain the unmasking key).
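For the sending direction, a rough sketch of building a frame might look like this. It assumes server-to-client frames are sent unmasked, as the later drafts and RFC 6455 specify, and that 'payload' is an array of bytes that has already been UTF-8 encoded. encodeTextFrame is a hypothetical name, not a function from any library.

```javascript
// Hypothetical helper: build an unmasked text frame (server to client).
function encodeTextFrame(payload) {
    var header = [0x81];                 // 1000 0001: last frame, text
    var len = payload.length;

    if (len <= 125) {
        header.push(len);                // length fits in seven bits
    } else if (len <= 0xffff) {
        header.push(126, (len >> 8) & 0xff, len & 0xff); // 2-byte length
    } else {
        // 8-byte big-endian length; an array length is below 2^32,
        // so the top four bytes are always zero here
        header.push(127, 0, 0, 0, 0,
            (len >>> 24) & 0xff, (len >> 16) & 0xff,
            (len >> 8) & 0xff, len & 0xff);
    }

    // The mask bit is left at 0, so the payload follows the header as-is
    return header.concat(payload);
}

var hello = [0x48, 0x65, 0x6c, 0x6c, 0x6f]; // "Hello"
console.log(encodeTextFrame(hello)); // [ 129, 5, 72, 101, 108, 108, 111 ]
```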

You can look at websockify, which is a WebSockets to TCP socket proxy/bridge that I created to support noVNC. It is implemented in Python, but you should be able to get the idea from the encode_hybi and decode_hybi routines.