What Is WebRTC?

WebRTC is a name we have been hearing a lot lately. It is not a new technology, however: it has been in use since 2011 and provides real-time media communication (audio and video). Its most important feature, among many advantages, is that it works directly in many popular browsers without requiring additional software.

WebRTC stands for Web Real-Time Communication. It lets you build multimedia applications using HTML5 and JavaScript APIs.

The communication model used in WebRTC is peer-to-peer. Media flows directly between peers, so a simple call does not need a media server. WebRTC is open source and released under a BSD license, so you can develop WebRTC applications for free. (For example, you can experience a video conference virtual room with WebRTC at this link)

WebRTC Supported Browsers

Nowadays, the following browsers support WebRTC:

  • PC & MAC
    • Microsoft Edge 12+
    • Google Chrome 28+
    • Mozilla Firefox 22+
    • Safari 11+
    • Opera 18+
    • Vivaldi 1.9+
  • Android
    • Google Chrome 28+
    • Mozilla Firefox 24+
    • Opera Mobile 12+
  • iOS
    • MobileSafari / WebKit (iOS 11+)
  • Chrome OS
  • Firefox OS
  • BlackBerry 10
  • Tizen 3.0

WebRTC Components

There are 3 main components in WebRTC:

1. MediaStream API

The MediaStream API gives an application access to the user's camera, microphone, or screen from JavaScript.

2. RTCPeerConnection API

The RTCPeerConnection API provides NAT traversal, codec processing, mutual SDP negotiation, media transmission, and secure connection functions between peers.

3. RTCDataChannel API

The RTCDataChannel API provides the functionality of establishing bidirectional data transfer channels between peers.

Establishing Peer-to-Peer Connection

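Signaling is the process by which peers exchange the information they need to set up a connection. WebRTC does not prescribe how signaling is carried; it can be done over WebSocket, XMPP, SIP or any other mechanism. For the connection itself, WebRTC relies on protocols such as RTP, STUN, TURN and ICE.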
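Because the signaling transport is left to the application, it can help to see how little it needs to do. The following is a minimal sketch, not a real server: the message shapes, the `InMemorySignaling` class, and the peer ids are all hypothetical, and an in-memory map stands in for a WebSocket, XMPP, or SIP channel.

```typescript
// Hypothetical message shapes; WebRTC itself does not define a signaling format.
type SignalMessage =
  | { kind: "offer"; from: string; to: string; sdp: string }
  | { kind: "answer"; from: string; to: string; sdp: string }
  | { kind: "candidate"; from: string; to: string; candidate: string };

type SignalHandler = (message: SignalMessage) => void;

// In-memory stand-in for a real signaling server (WebSocket, XMPP, SIP, ...).
class InMemorySignaling {
  private handlers = new Map<string, SignalHandler>();

  // Register a peer under an id so that messages addressed to it can be delivered.
  join(peerId: string, handler: SignalHandler): void {
    this.handlers.set(peerId, handler);
  }

  // Deliver a message to its addressee; returns false if that peer is unknown.
  send(message: SignalMessage): boolean {
    const handler = this.handlers.get(message.to);
    if (handler === undefined) return false;
    // Round-trip through JSON to mimic what a network transport would carry.
    handler(JSON.parse(JSON.stringify(message)) as SignalMessage);
    return true;
  }
}
```

In a typical exchange one peer sends an `offer`, the other replies with an `answer`, and both send `candidate` messages as addresses are discovered; the relay only needs to route each message to the right peer.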

WebRTC Signaling Process

Session Description Protocol (SDP)

SDP is a format that peers use to describe their media capabilities (audio and video codecs, IP and port information, etc.) to each other before a connection is established, so that both sides can agree on a common configuration.
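To make the description above more concrete, here is a small sketch that reads the media lines (`m=`) and codec mappings (`a=rtpmap:`) out of an SDP blob. The `summarizeSdp` function and the shortened `exampleSdp` text are illustrative assumptions; a real browser offer contains many more attributes, and this is not a complete SDP parser.

```typescript
interface MediaSection {
  kind: string;     // "audio", "video", "application", ...
  port: number;
  protocol: string;
  codecs: string[]; // encoding names taken from a=rtpmap lines
}

// Collect one entry per m= line, with the codecs announced beneath it.
function summarizeSdp(sdp: string): MediaSection[] {
  const sections: MediaSection[] = [];
  for (const rawLine of sdp.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line.startsWith("m=")) {
      // m=<media> <port> <proto> <fmt> ...
      const [kind = "", port = "0", protocol = ""] = line.slice(2).split(" ");
      sections.push({ kind, port: Number(port), protocol, codecs: [] });
    } else if (line.startsWith("a=rtpmap:")) {
      // a=rtpmap:<payload type> <encoding name>/<clock rate>[/<channels>]
      const current = sections[sections.length - 1];
      const encoding = line.slice("a=rtpmap:".length).split(" ")[1];
      if (current !== undefined && encoding !== undefined) {
        current.codecs.push(encoding.split("/")[0] ?? encoding);
      }
    }
  }
  return sections;
}

// Shortened, hand-written SDP used only for illustration.
const exampleSdp = [
  "v=0",
  "o=- 46117317 2 IN IP4 127.0.0.1",
  "s=-",
  "t=0 0",
  "m=audio 9 UDP/TLS/RTP/SAVPF 111",
  "a=rtpmap:111 opus/48000/2",
  "m=video 9 UDP/TLS/RTP/SAVPF 96",
  "a=rtpmap:96 VP8/90000",
].join("\r\n");

const exampleSections = summarizeSdp(exampleSdp);
```

For the sample text, `exampleSections` holds one audio section listing `opus` and one video section listing `VP8`.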

Interactive Connectivity Establishment (ICE)

ICE is a framework for NAT traversal. It gathers all available candidates: local IP addresses, public addresses discovered through STUN, and relay addresses allocated on a TURN server. The collected candidates are then sent to the remote peer via SDP.
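The three kinds of address show up in the candidate line itself, after the `typ` keyword: `host` for a local address, `srflx` for one discovered through STUN, and `relay` for one allocated on a TURN server. The sketch below pulls the main fields out of such a line. The `parseCandidate` helper and the sample addresses (taken from documentation ranges) are assumptions made for illustration, and optional trailing attributes such as `raddr` are ignored.

```typescript
type CandidateType = "host" | "srflx" | "prflx" | "relay";

interface ParsedCandidate {
  foundation: string;
  component: number;
  transport: string;
  priority: number;
  address: string;
  port: number;
  type: CandidateType;
}

// Parse "candidate:<foundation> <component> <transport> <priority> <address> <port> typ <type> ...".
// Returns null when the line does not have that shape.
function parseCandidate(line: string): ParsedCandidate | null {
  const body = line.trim().replace(/^a=/, "").replace(/^candidate:/, "");
  const [
    foundation = "", component = "", transport = "", priority = "",
    address = "", port = "", keyword = "", type = "",
  ] = body.split(/\s+/);
  if (keyword !== "typ") return null;
  if (type !== "host" && type !== "srflx" && type !== "prflx" && type !== "relay") {
    return null;
  }
  return {
    foundation,
    component: Number(component),
    transport: transport.toLowerCase(),
    priority: Number(priority),
    address,
    port: Number(port),
    type,
  };
}

// Sample line with made-up values; 203.0.113.7 is a documentation-only address.
const exampleCandidate = parseCandidate(
  "candidate:842163049 1 udp 1677729535 203.0.113.7 46154 typ srflx raddr 192.168.1.10 rport 46154",
);
```

Here `exampleCandidate` describes a server-reflexive (`srflx`) UDP candidate, that is, the public address a STUN server reported back to the peer.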

STUN Server

A STUN server lets a peer discover its public IP address, the type of NAT it is behind, and the public (Internet-side) port that the NAT has mapped to a given local port.

TURN Server

When a direct connection cannot be established with the help of STUN, media streams are relayed through a TURN server (you may think of it as a proxy).
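In the browser, STUN and TURN servers are handed to `RTCPeerConnection` through its configuration object (`iceServers`, and optionally `iceTransportPolicy`). The sketch below only builds that object. The `buildPeerConfig` helper, the local type names, and the `example.org` URLs and credentials are hypothetical placeholders, not real services.

```typescript
// Local type names chosen to avoid clashing with the browser's own DOM types.
interface IceServerEntry {
  urls: string | string[];
  username?: string;
  credential?: string;
}

interface PeerConfig {
  iceServers: IceServerEntry[];
  iceTransportPolicy: "all" | "relay";
}

interface TurnSettings {
  url: string;
  username: string;
  credential: string;
}

// Build a configuration with a STUN server and, optionally, a TURN fallback.
// forceRelay asks ICE to use only relayed candidates; it is ignored without TURN.
function buildPeerConfig(
  stunUrl: string,
  turn?: TurnSettings,
  forceRelay = false,
): PeerConfig {
  const iceServers: IceServerEntry[] = [{ urls: stunUrl }];
  if (turn !== undefined) {
    iceServers.push({
      urls: turn.url,
      username: turn.username,
      credential: turn.credential,
    });
  }
  return {
    iceServers,
    iceTransportPolicy: turn !== undefined && forceRelay ? "relay" : "all",
  };
}

// Placeholder hosts and credentials, for illustration only.
const examplePeerConfig = buildPeerConfig("stun:stun.example.org:3478", {
  url: "turn:turn.example.org:3478",
  username: "demo-user",
  credential: "demo-secret",
});
```

In a browser the result could be passed as `new RTCPeerConnection(examplePeerConfig)`. With the default `"all"` policy, ICE tries direct and STUN-derived paths as well and only ends up on the TURN relay when nothing else works.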

WebRTC does not always have to be peer-to-peer (P2P). For multi-party communication (e.g. video conferencing), different solutions are available. Let's take a look at these.

Multi-Point Communication Types

1. Mesh

In a mesh network, each peer sends its stream directly and separately to every other connected peer.

All Peers Communicate With Each Other in Mesh Topology

Since this structure is completely distributed, there is no need for a central media server. The disadvantage of the mesh structure is its high bandwidth usage. In a multi-party video call using a mesh structure, each user sends and receives one stream per other participant. For example, with five participants each generating a 1 Mbps stream, the amount of data sent and received per user will be 4 Mbps in each direction.

2. SFU

SFU stands for Selective Forwarding Unit. An SFU receives incoming media streams from all users and then decides which streams to forward to which users.

SFU Transfers Media To All Peers Separately

In this model, each user sends their own media stream once, to the SFU server. The SFU then forwards it to whichever users want that stream. In this way, bandwidth is used more effectively. As in the mesh example above, if each user generates a 1 Mbps stream, the total outgoing data per user will be 1 Mbps and the total incoming data will be a maximum of 4 Mbps.

3. MCU

MCU stands for Multipoint Control Unit (also called a Multipoint Conferencing Unit). An MCU receives incoming media streams from all users, decodes them, composes them into a new layout, and sends the result to all users as a single stream.

MCU Combines Media of All Peers & Sends a Single Stream to Peers

The difference from the SFU is that each user receives a single combined stream, so the total amount sent and received per user is 1 Mbps in each direction. The disadvantage of this structure, as you can imagine, is the high cost of running a central MCU with enough processing power to decode and re-encode every stream.
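The per-user figures quoted for the three topologies (4 Mbps, 1 Mbps and so on) correspond to a call with five participants, since each user has four other peers. The arithmetic can be sketched as follows. The `perUserLoad` function is a hypothetical helper that assumes at least two participants, that every user sends one stream at the same bitrate, that an SFU forwards every other stream to each user (the maximum case), and that the MCU's combined stream has the same bitrate as a single stream.

```typescript
type Topology = "mesh" | "sfu" | "mcu";

interface PerUserLoad {
  uploadMbps: number;
  downloadMbps: number;
}

// Estimate the bandwidth one participant needs, per direction.
function perUserLoad(
  topology: Topology,
  participants: number,
  streamMbps: number,
): PerUserLoad {
  const others = Math.max(participants - 1, 0);
  switch (topology) {
    case "mesh":
      // One copy of the stream sent to, and one received from, every other peer.
      return { uploadMbps: others * streamMbps, downloadMbps: others * streamMbps };
    case "sfu":
      // One upload to the SFU; at most one forwarded stream per other peer.
      return { uploadMbps: streamMbps, downloadMbps: others * streamMbps };
    case "mcu":
      // One upload to the MCU; one combined stream back.
      return { uploadMbps: streamMbps, downloadMbps: streamMbps };
  }
}

const meshLoad = perUserLoad("mesh", 5, 1); // { uploadMbps: 4, downloadMbps: 4 }
const sfuLoad = perUserLoad("sfu", 5, 1);   // { uploadMbps: 1, downloadMbps: 4 }
const mcuLoad = perUserLoad("mcu", 5, 1);   // { uploadMbps: 1, downloadMbps: 1 }
```

Changing the participant count shows how the options scale: mesh grows in both directions with every added user, an SFU grows only on the download side, and an MCU stays flat for the user while the server does more work.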
