What's the difference between WebSocket and plain socket communication?

According to Wikipedia, the only relationship between HTTP and WebSocket is an extra handshake in the form of an HTTP Upgrade request. After that, it seems the browser and HTTP server just communicate in the old client/server (C/S) paradigm over a plain socket.

So my questions are:

  • Is WebSocket just plain socket communication?
  • Is it called WebSocket because the communication targets the server's port 80, i.e. port 80 is just synonymous with "web"?
  • Port 80 is on the server side. What ports are used on the client?
  • If it is just like plain socket communication between a browser and a server, why wasn't WebSocket implemented in browsers until recently? It seems to be nothing but a small C/S extension to the B/S (browser/server) paradigm.

ADD 1 (9:46 AM 5/23/2017)

Today, I revisited @jfriend00's excellent answer. Let me summarize my understanding.

  • A socket is just an end-to-end communication channel. It doesn't impose any restriction on which communication protocol is used over it.
  • webSocket, like HTTP, is just another standalone communication protocol, though the word "socket" in the name confused me at first.
  • webSocket uses the same port number as HTTP so that if we can communicate through HTTP, we can be sure webSocket communication can be made too. Once the channel is open, we can pick the most appropriate protocol to speak over it.

webSockets and regular sockets are not the same thing. A webSocket runs over a regular socket, but runs its own connection scheme, security scheme and framing protocol on top of the regular socket and both endpoints must follow those additional steps for a connection to even be made. You can see the webSocket protocol here: https://www.rfc-editor.org/rfc/rfc6455

The biggest difference right away is that ALL webSocket connections start with an HTTP request from client to server. The client sends an HTTP request to the exact same server and port that is open for normal web communication (default of port 80, but if the web server is running on a different port, then the webSocket communication would follow it on that other port). The client sets a few custom headers, the most important of which is a header that indicates that the client wishes to "upgrade" to the webSocket protocol. In addition both sides exchange some security keys. If the server agrees to the "upgrade", then both client and server switch the protocol being spoken over that original socket from HTTP to webSocket and now the webSocket framing protocol is used.

In addition, the initial HTTP request can have a request path in it to indicate a "sub-destination" for the webSocket request. This allows all sorts of different webSocket requests to all be initiated with the same server and port.

There is also an optional sub-protocol specifier, the Sec-WebSocket-Protocol header, which allows the request to further identify sub-protocols (a common one might be "chat") so that both sides can agree on a specific set of message identifiers and their corresponding meanings.
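
As a rough illustration of that negotiation, here is a minimal sketch of how a server might pick a sub-protocol from the client's Sec-WebSocket-Protocol header. The function name choose_subprotocol and the "first offered name that the server supports" policy are my own assumptions for the example, not something a particular server library prescribes.

```python
def choose_subprotocol(header_value, supported):
    """Return the first client-offered sub-protocol the server supports, else None.

    header_value: raw Sec-WebSocket-Protocol value, e.g. "chat, superchat"
    supported:    set of sub-protocol names this (hypothetical) server implements
    """
    # The header is a comma-separated list, in the client's order of preference.
    offered = [token.strip() for token in header_value.split(",") if token.strip()]
    for name in offered:
        if name in supported:
            return name
    return None  # no overlap: the server omits the header from its response


selected = choose_subprotocol("chat, superchat", {"superchat", "chat"})
print(selected)
```

If a name is selected, the server echoes it back in its own Sec-WebSocket-Protocol response header.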

The fact that a webSocket connection starts with an HTTP connection is critically important because if you can reach the web server for normal web communication, then you can reach it for a webSocket request without any networking infrastructure anywhere between client and server having to open new holes in the firewall or open new ports or anything like that.

You can see an excellent summary of how a webSocket connection is started here: https://developer.mozilla.org/en-US/docs/WebSockets/Writing_WebSocket_servers.

The webSocket protocol also defines ping and pong packets that help both sides know if an idle webSocket is still connected.
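
To make the ping/pong idea concrete, here is a small sketch that builds those control frames by hand, assuming unmasked frames as a server would send them. In RFC 6455 ping is opcode 0x9 and pong is opcode 0xA; the helper names build_control_frame and pong_for are made up for this example.

```python
OPCODE_PING = 0x9
OPCODE_PONG = 0xA


def build_control_frame(opcode, payload=b""):
    """Build an unmasked control frame (the form a server sends to a client)."""
    if len(payload) > 125:
        raise ValueError("control frame payloads are limited to 125 bytes")
    first_byte = 0x80 | opcode  # FIN=1, RSV1-3=0, then the 4-bit opcode
    return bytes([first_byte, len(payload)]) + payload


def pong_for(ping_frame):
    """Answer an unmasked ping with a pong carrying the same payload."""
    payload_length = ping_frame[1] & 0x7F
    payload = ping_frame[2:2 + payload_length]
    return build_control_frame(OPCODE_PONG, payload)


ping = build_control_frame(OPCODE_PING, b"Hello")
pong = pong_for(ping)
print(ping.hex(), pong.hex())
```

A side that sends a ping and never sees a matching pong can conclude the connection is dead and close it.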

One can only assume that the reason it took a while to get webSockets into all common browsers is the same reason that lots of useful capabilities took a while. First, a group of motivated folks has to identify and agree upon a need. Then that group needs to take the lead in developing an approach to solve the problem. Then the idea gets kicked around for a while, either gathering support and dealing with objections or competing with alternate ways of solving the problem, until it appears to have enough momentum to actually become a standard. Then someone decides to do a test/trial implementation in a browser and a matching server implementation (sometimes this step comes much earlier). Then, if it's still finding momentum and appears to be on a standards track, other browser makers pick up the idea and start on their implementations. Usually there are rounds of standards improvement as different implementations find holes in the specification, as early developers identify problems or missing capabilities, or as security issues arise. Then it gets to the point where at least two major browsers have the feature in their latest releases, the standard is considered relatively solid, consumers start to adopt those browsers and some sites start to improve their user experience by using the new capability. At that point, the trailing browsers start to feel pressure to implement it. Then, sometimes years later, all major browsers have the feature and those browsers have enough overall user adoption that web sites can rely on the feature (without having to have a major second fallback design that works when a browser doesn't support the feature). This entire process can take many, many years.


Here's an example of the initial HTTP request to initiate a webSocket connection:

GET /chat HTTP/1.1
Host: example.com:8000
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==
Sec-WebSocket-Version: 13
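
For illustration, a request like the one above could be assembled as follows. This is only a sketch: the function name build_handshake_request is hypothetical, and a real client would also handle things like the Origin header, TLS and reading the response. The Sec-WebSocket-Key is just 16 random bytes, base64-encoded.

```python
import base64
import os


def build_handshake_request(host, path="/"):
    """Return (key, request_bytes) for a minimal webSocket upgrade request."""
    # The key is a fresh random 16-byte nonce, base64-encoded.
    key = base64.b64encode(os.urandom(16)).decode("ascii")
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Upgrade: websocket",
        "Connection: Upgrade",
        f"Sec-WebSocket-Key: {key}",
        "Sec-WebSocket-Version: 13",
        "",  # blank line that terminates the HTTP headers
        "",
    ]
    return key, "\r\n".join(lines).encode("ascii")


key, request = build_handshake_request("example.com:8000", "/chat")
print(request.decode("ascii"))
```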

And, the server response:

HTTP/1.1 101 Switching Protocols
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Accept: s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
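
The Sec-WebSocket-Accept value is not arbitrary: per RFC 6455 the server appends a fixed GUID to the client's Sec-WebSocket-Key, takes the SHA-1 hash and base64-encodes it. Here is a minimal sketch of that calculation (the function name compute_accept is my own), using the key from the request above:

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for the opening handshake.
WEBSOCKET_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def compute_accept(sec_websocket_key):
    """Derive the Sec-WebSocket-Accept value from the client's Sec-WebSocket-Key."""
    digest = hashlib.sha1((sec_websocket_key + WEBSOCKET_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


print(compute_accept("dGhlIHNhbXBsZSBub25jZQ=="))  # s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```

This proves to the client that the other end really understood the webSocket handshake rather than being some unrelated server that happened to answer.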

And, a data frame example:

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-------+-+-------------+-------------------------------+
|F|R|R|R| opcode|M| Payload len |    Extended payload length    |
|I|S|S|S|  (4)  |A|     (7)     |             (16/64)           |
|N|V|V|V|       |S|             |   (if payload len==126/127)   |
| |1|2|3|       |K|             |                               |
+-+-+-+-+-------+-+-------------+ - - - - - - - - - - - - - - - +
|     Extended payload length continued, if payload len == 127  |
+ - - - - - - - - - - - - - - - +-------------------------------+
|                               |Masking-key, if MASK set to 1  |
+-------------------------------+-------------------------------+
| Masking-key (continued)       |          Payload Data         |
+-------------------------------- - - - - - - - - - - - - - - - +
:                     Payload Data continued ...                :
+ - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - +
|                     Payload Data continued ...                |
+---------------------------------------------------------------+
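
To show how that layout is read in practice, here is a rough sketch of a frame parser. It assumes the whole frame is already in one bytes object, and the function name parse_frame and the returned dictionary are choices made for this example only. The two sample frames are the unmasked and masked "Hello" text frames from the RFC.

```python
import struct


def parse_frame(data):
    """Split one complete webSocket frame into its header fields and raw payload."""
    first, second = data[0], data[1]
    fin = bool(first & 0x80)        # FIN bit
    opcode = first & 0x0F           # 4-bit opcode (0x1 = text, 0x2 = binary, ...)
    masked = bool(second & 0x80)    # MASK bit
    length = second & 0x7F          # 7-bit payload length
    offset = 2
    if length == 126:               # next 2 bytes hold the real length
        (length,) = struct.unpack_from("!H", data, offset)
        offset += 2
    elif length == 127:             # next 8 bytes hold the real length
        (length,) = struct.unpack_from("!Q", data, offset)
        offset += 8
    mask_key = None
    if masked:                      # 4-byte masking key precedes the payload
        mask_key = data[offset:offset + 4]
        offset += 4
    payload = data[offset:offset + length]  # still masked if mask_key is set
    return {"fin": fin, "opcode": opcode, "length": length,
            "mask_key": mask_key, "payload": payload}


unmasked = parse_frame(bytes.fromhex("810548656c6c6f"))
masked = parse_frame(bytes.fromhex("818537fa213d7f9f4d5158"))
print(unmasked)
print(masked)
```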

No, WebSockets are more than just plain sockets. They use a framing protocol which requires a handshake and then exchanges messages that, in the client-to-server direction, are masked by XORing them with a random 32-bit key. For more information, read the RFC which standardizes them.
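
The masking itself is simple. As a sketch (the function name apply_mask is mine), each payload byte is XORed with one of the four key bytes in rotation, and applying the same operation again restores the original. The key and result below match the masked "Hello" example in the RFC.

```python
def apply_mask(payload, mask_key):
    """XOR each payload byte with the 4-byte key, cycling through the key."""
    return bytes(byte ^ mask_key[i % 4] for i, byte in enumerate(payload))


mask_key = bytes.fromhex("37fa213d")  # in a real client this is freshly random per frame
masked = apply_mask(b"Hello", mask_key)
restored = apply_mask(masked, mask_key)  # XOR is its own inverse
print(masked.hex(), restored)
```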

The reason for this additional encoding layer is that allowing a web browser to create arbitrary socket connections would open up various security problems. You could, for example, make visitors to your website connect to arbitrary mail servers via SMTP and make them send spam without the user realizing it. That's why the protocol was designed in such a way that any server-side application needs to implement it intentionally before it can be used from web browsers.

Regarding ports: by default, WebSocket connects to port 80 (or port 443 for secure wss:// connections), but the API accepts any port. The client-side port is an ephemeral port picked by the operating system, as in most TCP/IP-based protocols.
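
You can observe the client-side port with ordinary TCP sockets. The sketch below is plain socket code, not webSocket code: it opens a throwaway listener on the loopback interface (on a port chosen by the OS, so no privileges for port 80 are needed) and shows the port the client ends up with. The function name show_ports is just for this example.

```python
import socket


def show_ports():
    """Connect a client to a local listener and report both ends' port numbers."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        listener.listen(1)
        server_port = listener.getsockname()[1]
        with socket.create_connection(("127.0.0.1", server_port)) as client:
            connection, peer_address = listener.accept()
            with connection:
                client_port = client.getsockname()[1]  # chosen by the OS for the client
                return server_port, client_port, peer_address[1]


server_port, client_port, port_seen_by_server = show_ports()
print(f"server listens on {server_port}, client connected from {client_port}")
```

The client never asked for a specific local port; the operating system assigned one, and the server sees that same number as the peer's port.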

Why wasn't it implemented earlier? Because until recently the WHATWG and W3C didn't have the unified support of all major browser developers that they needed to gain the authority to introduce new standards. That's why there has been such a flood of new browser features under the label HTML5 recently.