WebRTC vs Websockets: If WebRTC can do Video, Audio, and Data, why do I need Websockets? [closed]

So I'm looking to build a chat app that will allow video, audio, and text. I spent some time researching into Websockets and WebRTC to decide which to use. Since there are plenty of video and audio apps with WebRTC, this sounds like a reasonable choice, but are there other things I should consider? Feel free to share your thoughts.

Things like:

  • Due to being new WebRTC is available only on some browsers, while WebSockets seems to be in more browsers.

  • Scalability - Websockets uses a server for session and WebRTC seems to be p2p.

  • Multiplexing/multiple chatrooms - Used in Google+ Hangouts, and I'm still viewing demo apps on how to implement.

  • Server - Websockets needs RedisSessionStore or RabbitMQ to scale across multiple machines.


WebRTC is designed for high-performance, high quality communication of video, audio and arbitrary data. In other words, for apps exactly like what you describe.

WebRTC apps need a service via which they can exchange network and media metadata, a process known as signaling. However, once signaling has taken place, video/audio/data is streamed directly between clients, avoiding the performance cost of streaming via an intermediary server.

WebSocket on the other hand is designed for bi-directional communication between client and server. It is possible to stream audio and video over WebSocket (see here for example), but the technology and APIs are not inherently designed for efficient, robust streaming in the way that WebRTC is.

As other replies have said, WebSocket can be used for signaling.

I maintain a list of WebRTC resources: strongly recommend you start by looking at the 2013 Google I/O presentation about WebRTC.


Websockets use TCP protocol.

WebRTC is mainly UDP.

Thus main reason of using WebRTC instead of Websocket is latency. With websocket streaming you will have either high latency or choppy playback with low latency. With WebRTC you may achive low-latency and smooth playback which is crucial stuff for VoIP communications.

Just try to test these technology with a network loss, i.e. 2%. You will see high delays in the Websocket stream.


WebSockets:

  • Ratified IETF standard (6455) with support across all modern browsers and even legacy browsers using web-socket-js polyfill.

  • Uses HTTP compatible handshake and default ports making it much easier to use with existing firewall, proxy and web server infrastructure.

  • Much simpler browser API. Basically one constructor with a couple of callbacks.

  • Client/browser to server only.

  • Only supports reliable, in-order transport because it is built On TCP. This means packet drops can delay all subsequent packets.

WebRTC:

  • Just beginning to be supported by Chrome and Firefox. MS has proposed an incompatible variant. The DataChannel component is not yet compatible between Firefox and Chrome.

  • WebRTC is browser to browser in ideal circumstances but even then almost always requires a signaling server to setup the connections. The most common signaling server solutions right now use WebSockets.

  • Transport layer is configurable with application able to choose if connection is in-order and/or reliable.

  • Complex and multilayered browser API. There are JS libs to provide a simpler API but these are young and rapidly changing (just like WebRTC itself).


webRTC or websockets? Why not use both.

When building a video/audio/text chat, webRTC is definitely a good choice since it uses peer to peer technology and once the connection is up and running, you do not need to pass the communication via a server (unless using TURN).

When setting up the webRTC communication you have to involve some sort of signaling mechanism. Websockets could be a good choice here, but webRTC is the way to go for the video/audio/text info. Chat rooms is accomplished in the signaling.

But, as you mention, not every browser supports webRTC, so websockets can sometimes be a good fallback for those browsers.


Security is one aspect you missed.

With Websockets the data has to go via a central webserver which typically sees all the traffic and can access it.

With WebRTC the data is end-to-end encrypted and does not pass through a server (except sometimes TURN servers are needed, but they have no access to the body of the messages they forward).

Depending on your application this may or may not matter.

If you are sending large amounts of data, the saving in cloud bandwidth costs due to webRTC's P2P architecture may be worth considering too.