WebRTC is a name that comes up often these days. It is not a new technology, though: it has been in use since 2011 and provides real-time media communication (audio and video). Its most important advantage is that it runs directly in many popular browsers without requiring additional software.
The communication model used in WebRTC is peer-to-peer. Media flows directly between peers, so a basic call does not need a media server. WebRTC is free and released under a BSD license, so you can develop WebRTC applications at no cost. (For example, you can experience a video conference virtual room with WebRTC at this link.)
WebRTC Supported Browsers
Nowadays, the following browsers support WebRTC:
- PC & Mac
  - Microsoft Edge 12+
  - Google Chrome 28+
  - Mozilla Firefox 22+
  - Safari 11+
  - Opera 18+
  - Vivaldi 1.9+
- Android
  - Google Chrome 28+
  - Mozilla Firefox 24+
  - Opera Mobile 12+
- MobileSafari / WebKit (iOS 11+)
- Chrome OS
- Firefox OS
- BlackBerry 10
- Tizen 3.0
There are 3 main components in WebRTC:
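1. MediaStream API
The MediaStream API provides access to media streams from local devices such as the camera and microphone.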
2. RTCPeerConnection API
The RTCPeerConnection API handles NAT traversal, codec handling, SDP negotiation between the two sides, media transmission, and securing the connection between peers.
3. RTCDataChannel API
The RTCDataChannel API lets peers establish bidirectional channels for transferring arbitrary data.
Establishing Peer-to-Peer Connection
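Signaling is the process through which peers exchange the information they need to set up a connection. WebRTC does not prescribe how signaling is carried, so it can be done over WebSocket, XMPP, SIP or any other mechanism. For the connection itself, WebRTC relies on protocols such as RTP, STUN, TURN and ICE.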
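Because the signaling format is left to the application, each project defines its own messages. The sketch below shows one possible shape for them, assuming JSON text is sent over whatever transport is chosen. The `SignalMessage` type and the `encodeSignal` / `decodeSignal` helpers are hypothetical names invented for this illustration, not part of WebRTC.

```typescript
// Hypothetical message shapes: WebRTC itself does not define a signaling format.
type SignalMessage =
  | { kind: "offer"; sdp: string }
  | { kind: "answer"; sdp: string }
  | { kind: "candidate"; candidate: string };

// Serialize a message for the chosen transport (WebSocket, XMPP, ...).
function encodeSignal(message: SignalMessage): string {
  return JSON.stringify(message);
}

// Parse and validate an incoming message; returns null for anything unrecognized.
function decodeSignal(raw: string): SignalMessage | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null; // not valid JSON
  }
  if (typeof data !== "object" || data === null) return null;

  const record = data as Record<string, unknown>;
  const kind = record["kind"];
  const sdp = record["sdp"];
  const candidate = record["candidate"];

  if (kind === "offer" && typeof sdp === "string") return { kind: "offer", sdp };
  if (kind === "answer" && typeof sdp === "string") return { kind: "answer", sdp };
  if (kind === "candidate" && typeof candidate === "string") {
    return { kind: "candidate", candidate };
  }
  return null;
}
```

A receiver would typically switch on `kind` and hand the payload to its peer connection. Rejecting unknown messages early keeps malformed input away from the connection logic.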
Session Description Protocol (SDP)
SDP is used to communicate media capabilities (audio and video codecs, IP and port information, etc.) between peers before a connection is established, so that both peers can settle on a configuration they both support.
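To make this more concrete, here is a minimal sketch that reads the media sections and codec names out of an SDP text. The sample SDP is a shortened, made-up fragment (real descriptions contain many more lines), and `listMediaSections` is a hypothetical helper written for this example.

```typescript
// Illustrative SDP fragment with one audio and one video section.
const sampleSdp = [
  "v=0",
  "o=- 4611731400430051336 2 IN IP4 127.0.0.1",
  "s=-",
  "t=0 0",
  "m=audio 9 UDP/TLS/RTP/SAVPF 111 0",
  "c=IN IP4 0.0.0.0",
  "a=rtpmap:111 opus/48000/2",
  "a=rtpmap:0 PCMU/8000",
  "m=video 9 UDP/TLS/RTP/SAVPF 96",
  "c=IN IP4 0.0.0.0",
  "a=rtpmap:96 VP8/90000",
].join("\r\n");

interface MediaSection {
  kind: string;     // "audio", "video", ...
  port: number;     // port from the m= line
  codecs: string[]; // codec names taken from a=rtpmap lines
}

function listMediaSections(sdp: string): MediaSection[] {
  const sections: MediaSection[] = [];
  let current: MediaSection | null = null;

  for (const line of sdp.split(/\r?\n/)) {
    if (line.startsWith("m=")) {
      // m=<media> <port> <proto> <payload types...>
      const parts = line.slice(2).split(" ");
      current = { kind: parts[0] ?? "", port: Number(parts[1] ?? "0"), codecs: [] };
      sections.push(current);
    } else if (line.startsWith("a=rtpmap:") && current !== null) {
      // a=rtpmap:<payload type> <encoding name>/<clock rate>[/<channels>]
      const encoding = line.slice("a=rtpmap:".length).split(" ")[1] ?? "";
      current.codecs.push(encoding.split("/")[0] ?? "");
    }
  }
  return sections;
}

for (const section of listMediaSections(sampleSdp)) {
  console.log(`${section.kind}: ${section.codecs.join(", ")}`);
}
```

Each peer produces a description like this, and comparing the codec lists on both sides is how the peers arrive at a configuration they share.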
Interactive Connectivity Establishment (ICE)
ICE is a framework for NAT traversal. It gathers all available candidates: local IP addresses (host candidates), public addresses discovered through STUN (server-reflexive candidates), and relay addresses allocated on a TURN server (relayed candidates). The gathered candidates are then sent to the remote peer via SDP.
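The candidate kinds can be told apart by the `typ` field of each candidate line: `host` for a local address, `srflx` for an address learned through STUN, and `relay` for an address on a TURN server. The following sketch parses such lines. The addresses are documentation placeholders and `parseCandidate` is a hypothetical helper, so treat it as an illustration rather than a complete parser.

```typescript
type CandidateType = "host" | "srflx" | "prflx" | "relay";

interface ParsedCandidate {
  foundation: string;
  component: number;
  transport: string;
  priority: number;
  address: string;
  port: number;
  type: CandidateType;
}

function isCandidateType(value: string): value is CandidateType {
  return value === "host" || value === "srflx" || value === "prflx" || value === "relay";
}

// Expected layout:
// candidate:<foundation> <component> <transport> <priority> <address> <port> typ <type> ...
function parseCandidate(line: string): ParsedCandidate | null {
  const body = line.trim().replace(/^a=/, "").replace(/^candidate:/, "");
  const [
    foundation = "", component = "", transport = "", priority = "",
    address = "", port = "", typKeyword = "", candidateType = "",
  ] = body.split(/\s+/);

  if (typKeyword !== "typ" || !isCandidateType(candidateType)) return null;
  return {
    foundation,
    component: Number(component),
    transport: transport.toLowerCase(),
    priority: Number(priority),
    address,
    port: Number(port),
    type: candidateType,
  };
}

// Made-up candidates using documentation address ranges.
const exampleCandidates = [
  "candidate:1 1 udp 2122260223 192.168.1.10 46154 typ host",
  "candidate:2 1 udp 1686052607 203.0.113.7 46154 typ srflx raddr 192.168.1.10 rport 46154",
  "candidate:3 1 udp 41885439 198.51.100.20 50000 typ relay raddr 203.0.113.7 rport 46154",
];

for (const line of exampleCandidates) {
  const parsed = parseCandidate(line);
  if (parsed !== null) console.log(`${parsed.type}: ${parsed.address}:${parsed.port}`);
}
```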
A STUN server lets a peer discover its public IP address, the type of NAT it is behind, and the Internet-side port that the NAT has mapped to a given local port.
When a direct connection cannot be established with the help of STUN, media streams are relayed through a TURN server (you may think of it as a proxy).
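In practice, the STUN and TURN servers are given to the peer connection as a list of ICE servers. The sketch below shows what such a configuration might look like. The host names and credentials are placeholders, and the two interfaces are declared locally only so that the snippet stands alone outside a browser.

```typescript
// Local types mirroring the shape a browser expects for ICE servers.
interface IceServer {
  urls: string | string[];
  username?: string;   // only needed for TURN
  credential?: string; // only needed for TURN
}

interface PeerConfig {
  iceServers: IceServer[];
}

// Placeholder servers: replace with servers you actually operate or are allowed to use.
const peerConfig: PeerConfig = {
  iceServers: [
    // STUN: used to discover the public address and port.
    { urls: "stun:stun.example.org:3478" },
    // TURN: relay used as a fallback when no direct path works.
    {
      urls: "turn:turn.example.org:3478",
      username: "demo-user",
      credential: "demo-secret",
    },
  ],
};

// In a browser, this object could be passed as: new RTCPeerConnection(peerConfig)
console.log(peerConfig.iceServers.length);
```

Listing both kinds of server lets ICE try the direct path first and fall back to the relay only when it has to.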
WebRTC is not always purely peer-to-peer (P2P): for multi-party communication (e.g. video conferencing), several different architectures are available. Let’s take a look at these.
Multi-Point Communication Types
In a mesh network, every peer sends its stream directly to each of the other connected peers.
Since this structure is fully distributed, no central media server is needed. The disadvantage of the mesh structure is its high bandwidth use. In a five-participant video call using a mesh structure, if each user generates a 1 Mbps stream, each user sends 4 Mbps and receives 4 Mbps.
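The 4 Mbps figure follows from simple arithmetic if we assume a call with five participants: each user sends one copy of the stream to each of the four others and receives one stream from each of them. A small sketch of that calculation, with hypothetical function names:

```typescript
interface Bandwidth {
  upMbps: number;
  downMbps: number;
}

// In a mesh, every user exchanges one stream with each of the other users.
function meshBandwidthPerUser(participants: number, streamMbps: number): Bandwidth {
  const others = Math.max(participants - 1, 0);
  return { upMbps: others * streamMbps, downMbps: others * streamMbps };
}

// Number of direct peer-to-peer connections in the whole call.
function meshConnectionCount(participants: number): number {
  return (participants * (participants - 1)) / 2;
}

console.log(meshBandwidthPerUser(5, 1)); // { upMbps: 4, downMbps: 4 }
console.log(meshConnectionCount(5));     // 10
```

Both numbers grow with every added participant, which is why mesh calls are usually kept small.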
SFU stands for Selective Forwarding Unit. An SFU receives incoming media streams from all users and then decides which streams to forward to which users.
In this model, each user sends their own media stream to the SFU server only once. The SFU can then forward it to whichever users want it. In this way, bandwidth is used more efficiently. As in the mesh example above, with five participants each generating a 1 Mbps stream, the total outgoing data per user will be 1 Mbps and the total incoming data will be at most 4 Mbps.
MCU stands for Multipoint Control Unit (sometimes called a Multipoint Conferencing Unit). An MCU receives incoming media streams from all users, decodes them, composes them into a new layout, and sends the result to all users as a single stream.
The difference from an SFU is that each user receives a single combined stream, so the total sent and received per user is 1 Mbps in each direction. The disadvantage, as you can imagine, is the high cost of a central MCU with enough processing power to do this work.
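To put the three architectures side by side, the sketch below computes the per-user load for each one. It assumes every participant sends a single stream of the same bitrate and, for the SFU, that every user receives every other stream (the maximum case). The names `Topology` and `perUserLoad` are made up for this example.

```typescript
type Topology = "mesh" | "sfu" | "mcu";

interface PerUserLoad {
  upMbps: number;
  downMbps: number;
}

function perUserLoad(topology: Topology, participants: number, streamMbps: number): PerUserLoad {
  const others = Math.max(participants - 1, 0);
  switch (topology) {
    case "mesh":
      // One copy sent to, and one stream received from, every other user.
      return { upMbps: others * streamMbps, downMbps: others * streamMbps };
    case "sfu":
      // One upload to the server; at most one forwarded stream per other user.
      return { upMbps: streamMbps, downMbps: others * streamMbps };
    case "mcu":
      // One upload and one combined stream back.
      return { upMbps: streamMbps, downMbps: streamMbps };
  }
}

const topologies: Topology[] = ["mesh", "sfu", "mcu"];
for (const topology of topologies) {
  const load = perUserLoad(topology, 5, 1);
  console.log(`${topology}: up ${load.upMbps} Mbps, down ${load.downMbps} Mbps`);
}
```

For five participants at 1 Mbps this reproduces the figures given above: 4/4 for mesh, 1/4 for an SFU and 1/1 for an MCU. The calculation covers client bandwidth only and leaves out the server-side cost that makes an MCU expensive.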