WebRTC (Web Real-Time Communication) is an open source project and technology that enables real-time communication directly between web browsers and WebRTC-enabled applications.
Using WebRTC you can do video calling, audio calling and data transfer between devices.
This capability is exposed through a set of JavaScript APIs that handle video, audio and data transmission between devices. Under the hood, these APIs rely on protocols and techniques such as ICE, STUN, TURN, NAT traversal and SDP.
We are going to learn more about these protocols below.
ICE (Interactive Connectivity Establishment)
ICE is a protocol used to find the best path for establishing a connection between devices.
ICE navigates a path through NAT routers and firewall rules. It overcomes the connectivity barriers they introduce by using STUN and TURN servers.
How does it work:
ICE gathers all the candidates for the media streams, that is, the potential paths between the devices trying to connect.
It first tries a direct connection, using STUN servers to discover the public IP addresses of the client devices. If that fails, typically because of NAT devices or firewall rules, it falls back to relaying the traffic through a TURN server (Traversal Using Relays around NAT).
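As a rough illustration of how an application tells ICE which STUN and TURN servers to use, here is a minimal sketch that builds the configuration object accepted by the RTCPeerConnection constructor. The helper name buildIceConfig, the example.org hostnames and the credentials are placeholders made up for this example, not real servers.

```javascript
// Build an RTCConfiguration-style object listing STUN and TURN servers.
// All hostnames and credentials used below are placeholders.
function buildIceConfig({ stunUrl, turnUrl, username, credential }) {
  const iceServers = [];
  if (stunUrl) {
    // STUN needs no credentials; it only reports the public address.
    iceServers.push({ urls: stunUrl });
  }
  if (turnUrl) {
    // TURN relays traffic, so servers normally require credentials.
    iceServers.push({ urls: turnUrl, username, credential });
  }
  return { iceServers };
}

const iceConfig = buildIceConfig({
  stunUrl: "stun:stun.example.org:3478",
  turnUrl: "turn:turn.example.org:3478",
  username: "demo-user",
  credential: "demo-pass",
});

// In a browser this object would be passed to the constructor:
//   const pc = new RTCPeerConnection(iceConfig);
console.log(JSON.stringify(iceConfig, null, 2));
```

ICE decides on its own which candidate pair to use, so the application only lists the servers and does not choose between STUN and TURN itself.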
If you are looking for ICE servers and want to know more about ICE, refer to our article: Interactive Connectivity Establishment (ICE) Server: The Complete Guide
If you are looking for a TURN server for your app, you can consider the Metered TURN servers, a global TURN server service provider.
If you are looking for a list of ICE servers
STUN server (Session Traversal Utilities for NAT)
STUN servers are used by devices behind a NAT to find out what their public IP address and port number are.
Devices that are behind a NAT have private IP addresses assigned to them by the NAT router.
All the traffic of the devices behind a particular NAT is routed through a single or a few public IP addresses.
When devices want to connect with each other directly, each needs to know its own public IP address and port number as well as the other device's.
Client devices use a STUN server to find out their own public IP and port number, which they then send to each other so as to establish a direct line of communication.
A client device sends a request to the STUN server, which replies with the IP address and port number from which the request came.
There are a lot of free and paid STUN servers available. Google also provides free STUN servers for public usage: google stun server list
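The address a STUN server reports shows up in WebRTC as a "server reflexive" (srflx) ICE candidate. The sketch below parses such a candidate line to pull out the public address and the private address behind it. The function name parseCandidate and the sample line (which uses documentation-only IP addresses) are assumptions for illustration, and the parser covers only the common fields.

```javascript
// Parse an ICE candidate line into its main fields (simplified).
function parseCandidate(line) {
  const parts = line.trim().replace(/^a=/, "").split(/\s+/);
  if (!parts[0].startsWith("candidate:") || parts.length < 8 || parts[6] !== "typ") {
    throw new Error("not an ICE candidate line");
  }
  const parsed = {
    foundation: parts[0].slice("candidate:".length),
    component: Number(parts[1]),
    protocol: parts[2].toLowerCase(),
    priority: Number(parts[3]),
    address: parts[4], // for srflx: the public IP seen by the STUN server
    port: Number(parts[5]),
    type: parts[7],    // host, srflx, prflx or relay
  };
  // Remaining tokens are name/value pairs such as "raddr 192.168.1.10".
  for (let i = 8; i + 1 < parts.length; i += 2) {
    if (parts[i] === "raddr") parsed.relatedAddress = parts[i + 1];
    if (parts[i] === "rport") parsed.relatedPort = Number(parts[i + 1]);
  }
  return parsed;
}

// Made-up sample candidate using documentation-only IP addresses.
const sampleCandidate =
  "candidate:842163049 1 udp 1677729535 203.0.113.7 54321 typ srflx raddr 192.168.1.10 rport 50000";
const srflx = parseCandidate(sampleCandidate);
console.log(`${srflx.type}: public ${srflx.address}:${srflx.port}, private ${srflx.relatedAddress}:${srflx.relatedPort}`);
// prints "srflx: public 203.0.113.7:54321, private 192.168.1.10:50000"
```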
NAT (Network Address Translation)
NAT, or Network Address Translation, is a method by which NAT devices use a single or a few public IP addresses to channel traffic to and from multiple devices behind them. (These devices are given private IP addresses, and the NAT device maps each of their connections to a port on its public address.)
This process was invented to conserve the limited number of IPv4 addresses. You can learn more about how NAT works here: NAT traversal: How does it work?
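To make the idea of a translation table concrete, here is a toy model of a NAT device. It is only a sketch under simplifying assumptions: the class name ToyNat, the starting port 40000 and the IP addresses are invented for the example, and real NATs also track protocols, destinations and timeouts.

```javascript
// Toy model of a NAT router's translation table (illustration only).
class ToyNat {
  constructor(publicIp, firstPort = 40000) {
    this.publicIp = publicIp;
    this.nextPort = firstPort;
    this.outbound = new Map(); // "privateIp:port" -> public port
    this.inbound = new Map();  // public port -> "privateIp:port"
  }

  // Outgoing traffic: reuse the existing mapping or allocate a new public port.
  translateOutbound(privateIp, privatePort) {
    const key = `${privateIp}:${privatePort}`;
    if (!this.outbound.has(key)) {
      const publicPort = this.nextPort++;
      this.outbound.set(key, publicPort);
      this.inbound.set(publicPort, key);
    }
    return `${this.publicIp}:${this.outbound.get(key)}`;
  }

  // Incoming traffic: look up which private device owns the public port.
  translateInbound(publicPort) {
    return this.inbound.get(publicPort) ?? null;
  }
}

const nat = new ToyNat("203.0.113.7");
const laptop = nat.translateOutbound("192.168.1.10", 5000);
const phone = nat.translateOutbound("192.168.1.11", 5000);
console.log(laptop, phone, nat.translateInbound(40001));
// prints "203.0.113.7:40000 203.0.113.7:40001 192.168.1.11:5000"
```

Two devices share one public IP and are told apart only by the public port, which is why a peer on the outside cannot reach them without first learning that mapping.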
TURN (Traversal Using Relays around NAT)
TURN servers relay the data for WebRTC connections when a direct peer-to-peer connection is not possible due to NAT or firewall restrictions.
TURN is used as a last resort in the ICE process, when direct communication between devices fails.
TURN servers are resource intensive and require a lot of bandwidth and CPU to function.
TURN servers need to be near your users, so if your users are distributed you need TURN servers all over the globe.
If you are looking for a global TURN server provider, you can consider Metered.ca TURN servers.
SDP (Session Description Protocol)
SDP is a standard for describing multimedia communication sessions for the purpose of
- session announcement
- session invitation
- other forms of multimedia session initiation
SDP itself does not deliver media streams or transport data. It only defines the format of a session description, which conveys information about the media streams in a multimedia session so that devices can receive them.
Purpose of SDP
SDP was designed to be extensible and to work with varied network environments and formats.
It is used to describe the multimedia communication and to negotiate the logistics of connectivity and media exchange.
Structure of SDP
SDP describes multimedia sessions using plain text encoding with a simple syntax.
An SDP message consists of lines of text in the form type=value, where type is a single character that signifies the type of the field and value is a structured text string.
These messages are typically transported by other protocols, such as SIP (Session Initiation Protocol), or as part of the WebRTC signalling process when establishing a new connection.
Here are some of the key components of SDP:
- Version: Shows the version of SDP that is being used
- Origin: Identifies the initiator of the session and the session identifier
- Session Name: Provides a human-readable name for the session
- Timing: Describes the start time and the stop time for the session
- Media Descriptions: Describes the media components of the session, including the media type (audio, video or text), port, protocol and the media formats that are being used.
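Because SDP is just type=value lines, the components listed above can be read with a few lines of code. The following is a minimal sketch rather than a complete parser: the function name parseSdp and the sample session text are made up for illustration, and real SDP from a browser contains many more attributes.

```javascript
// Split an SDP message into session-level fields and media sections.
function parseSdp(sdpText) {
  const session = { media: [] };
  let target = session; // session-level fields until the first m= line
  for (const rawLine of sdpText.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "") continue;
    if (line[1] !== "=") throw new Error(`malformed SDP line: ${line}`);
    const type = line[0];        // single character, e.g. v, o, s, t, m, a
    const value = line.slice(2); // structured text after the "="
    if (type === "m") {
      // A media description starts a new section: kind, port, protocol, formats.
      const [kind, port, protocol, ...formats] = value.split(" ");
      target = { kind, port: Number(port), protocol, formats, fields: {} };
      session.media.push(target);
    } else if (target === session) {
      (session[type] ??= []).push(value);
    } else {
      (target.fields[type] ??= []).push(value);
    }
  }
  return session;
}

// Made-up sample with version, origin, session name, timing and one audio section.
const sampleSdp = [
  "v=0",
  "o=- 4611731400430051336 2 IN IP4 127.0.0.1",
  "s=-",
  "t=0 0",
  "m=audio 9 UDP/TLS/RTP/SAVPF 111",
  "a=rtpmap:111 opus/48000/2",
].join("\r\n");

const parsedSdp = parseSdp(sampleSdp);
console.log(parsedSdp.v[0], parsedSdp.media[0].kind, parsedSdp.media[0].formats);
```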
Role of SDP in WebRTC
SDP plays a part in the offer/answer model, a fundamental signalling mechanism in WebRTC that is used to establish a connection between peers.
Here is how SDP works in WebRTC:
- Offer/Answer: One client generates an SDP offer and sends it to the other client device with which it wants to establish a connection.
The other client then responds with an answer. This exchange describes the proposed media capabilities of both client devices, such as supported codecs, media types and encryption requirements for establishing a connection.
- Negotiation:
The SDP exchange includes negotiation between the clients about which codecs and encryption requirements are supported by both and can be used to establish a connection.
- ICE candidates:
SDP also conveys the ICE candidates in WebRTC. These ICE candidates describe the potential pathways that can be taken to establish the connection, including addresses discovered through STUN servers and relay addresses allocated on TURN servers.
As candidates are gathered on both client devices, they are either included in the SDP or sent to the other peer through signalling as they are discovered.
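The offer/answer exchange described above can be sketched as a single function. This is an illustration under assumptions, not a full signalling implementation: negotiate and deliver are hypothetical names, deliver stands in for whatever signalling channel you use, and in a real application the two peers run on different devices rather than in one function.

```javascript
// Run one offer/answer round between two RTCPeerConnection-like objects.
// "deliver" is a hypothetical stand-in for the signalling channel; by
// default it just copies the description as JSON, as a network hop would.
async function negotiate(
  caller,
  callee,
  deliver = async (description) => JSON.parse(JSON.stringify(description))
) {
  // 1. The caller creates the offer and stores it as its local description.
  const offer = await caller.createOffer();
  await caller.setLocalDescription(offer);

  // 2. The offer travels over signalling and becomes the callee's remote description.
  await callee.setRemoteDescription(await deliver(offer));

  // 3. The callee answers and stores the answer as its local description.
  const answer = await callee.createAnswer();
  await callee.setLocalDescription(answer);

  // 4. The answer travels back and becomes the caller's remote description.
  await caller.setRemoteDescription(await deliver(answer));
  return { offer, answer };
}

// Browser usage (both peers in one page, for experimentation only):
//   const a = new RTCPeerConnection();
//   const b = new RTCPeerConnection();
//   a.createDataChannel("demo"); // gives the offer something to describe
//   negotiate(a, b).then(({ offer }) => console.log(offer.sdp));
```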
MediaStream
The MediaStream API is an important component of the WebRTC suite of APIs.
This API manages the flow of media data such as audio and video. With the help of the MediaStream API a broad range of applications can function, like video streaming, video calls and audio calls.
A MediaStream represents a stream of media made up of one or more tracks, such as audio and video tracks, that are synchronized for a seamless experience.
These streams can come from multiple sources such as microphones, cameras, screen recorders and even pre-recorded media.
These streams are then transmitted between peer devices for real-time communication.
Key features of MediaStream API
- Stream Capture:
The MediaStream API can capture the media stream from a user device. This is done with the help of the getUserMedia() method.
This method asks the user's permission to access the microphone and camera inputs and returns a promise that resolves to a MediaStream containing the requested media types.
- Track Manipulation:
The MediaStream returned by getUserMedia() contains multiple tracks, such as audio tracks and video tracks, and these tracks can be individually manipulated as required.
For example, you can easily enable and disable individual tracks, thus muting a user or disabling their video output.
- Stream Combination:
As we have seen above, a MediaStream is made up of tracks. Tracks can be combined into a single stream or separated and individually manipulated as required.
Tracks can be removed from one stream and added to another, thus allowing for dynamic reconfiguration during a video call among many participants.
- Cloning:
MediaStreams can also be cloned. This is particularly useful where the same media stream needs to be used in multiple contexts simultaneously.
For example, a single media stream needs to be shown to multiple users in a video call and also has to be recorded for future reference.
The cloned stream can also be encoded and manipulated as the user wishes without affecting the original stream.
- Compatibility and constraints:
The API provides the ability to place constraints on the media stream, such as a reduction in video quality or noise suppression for audio.
This allows you to specify the media capture needs according to your application and client device compatibility and performance
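The capture, constraint and track features above can be sketched as two small helpers. The names captureCamera and setMicrophoneEnabled are made up for this example, and the constraint values (640x360 video, noise suppression on) are arbitrary choices. The mediaDevices parameter exists only so the sketch can also be exercised outside a browser; in a page it defaults to navigator.mediaDevices.

```javascript
// Ask for camera and microphone; resolves to a MediaStream in the browser.
// "mediaDevices" defaults to navigator.mediaDevices when it exists.
async function captureCamera(mediaDevices = globalThis.navigator?.mediaDevices) {
  if (!mediaDevices) throw new Error("getUserMedia is not available here");
  return mediaDevices.getUserMedia({
    audio: { noiseSuppression: true },
    video: { width: { ideal: 640 }, height: { ideal: 360 } },
  });
}

// Mute or unmute by toggling the "enabled" flag on every audio track.
// The tracks stay attached to the stream; they just send silence while disabled.
function setMicrophoneEnabled(stream, enabled) {
  for (const track of stream.getAudioTracks()) {
    track.enabled = enabled;
  }
}

// Browser usage:
//   const stream = await captureCamera();
//   document.querySelector("video").srcObject = stream;
//   setMicrophoneEnabled(stream, false); // mute
```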
Practical use cases for MediaStream API
- Video Conferencing:
You can conduct video conferencing with the MediaStream API: capture the camera and audio streams of multiple participants and show them to the other participants.
- Media Recording:
You can combine the MediaStream API with the MediaRecorder API to record the stream locally in the browser or to offer features like session recording.
- Real Time Media Processing:
A MediaStream can be processed in real time to apply effects, change the resolution, perform analytics and so on.
- Broadcasting:
MediaStreams can be broadcast to a large audience over the internet through media servers. You can also live stream events by using WebRTC to capture the camera and audio and then using the media servers to broadcast the MediaStream on the internet.
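As an example of the media recording use case, here is a minimal sketch that records a MediaStream for a fixed time with the MediaRecorder API. The function name recordStream and the RecorderClass parameter are assumptions for this example; the parameter defaults to the browser's MediaRecorder and is there only so the function can be tried with a stand-in outside a browser.

```javascript
// Record a MediaStream for a fixed time and resolve with a Blob.
// "RecorderClass" defaults to the browser's MediaRecorder.
function recordStream(stream, durationMs, RecorderClass = globalThis.MediaRecorder) {
  return new Promise((resolve, reject) => {
    if (!RecorderClass) {
      reject(new Error("MediaRecorder is not available here"));
      return;
    }
    const recorder = new RecorderClass(stream);
    const chunks = [];
    // Each dataavailable event carries a Blob with a piece of the recording.
    recorder.ondataavailable = (event) => {
      if (event.data && event.data.size > 0) chunks.push(event.data);
    };
    recorder.onerror = (event) => reject(event.error ?? new Error("recording failed"));
    // When recording stops, join the pieces into one Blob.
    recorder.onstop = () => resolve(new Blob(chunks, { type: recorder.mimeType }));
    recorder.start();
    setTimeout(() => recorder.stop(), durationMs);
  });
}

// Browser usage, assuming "stream" came from getUserMedia():
//   const blob = await recordStream(stream, 5000);
//   const url = URL.createObjectURL(blob); // e.g. for a download link
```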
RTCPeerConnection
RTCPeerConnection is one of the core components of the WebRTC suite of APIs.
The primary function of RTCPeerConnection, as the name implies, is to establish and maintain a connection.
This connection allows direct exchange of data between client devices without the need for an intermediary (apart from the initial signalling process).
RTCPeerConnection handles everything from negotiating the connection details to managing the media and data transfer once the devices are connected.
Key features of RTCPeerConnection
- Connection Setup: RTCPeerConnection handles all the negotiation of the media and network details that are required to set up a connection between two devices.
These details include the offer/answer and the ICE candidates, and they need to be communicated between peers through a signalling server.
- Signalling:
While RTCPeerConnection does not perform signalling, it generates the data that needs to be sent through the signalling server.
This data includes the offer/answer and the ICE candidates. The signalling process is important for establishing a connection between client devices.
- NAT Traversal:
Using ICE together with STUN and TURN servers, RTCPeerConnection finds the best possible way to establish a connection between devices.
If you are looking for STUN and TURN servers, you can consider the metered.ca TURN servers.
- Media Stream Management:
Once the connection is established, the RTCPeerConnection manages the media streams that are provided by the MediaStream API.
The RTCPeerConnection controls the flow of streams to and from the client devices.
- Data Channel Setup:
The RTCPeerConnection can establish data channels using the RTCDataChannel API.
Using the RTCDataChannel API, any arbitrary data can be transferred between devices, so you can build many kinds of applications on top of WebRTC.
- Encryption:
All the data transmitted by the RTCPeerConnection is encrypted: data channels are protected with DTLS, and media is protected with SRTP using keys negotiated over DTLS.
This ensures that the communication is protected in transit between the peers.
- Bandwidth Management:
RTCPeerConnection has inbuilt mechanisms to manage bandwidth consumption based on factors like your application requirements and network conditions.
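Since RTCPeerConnection produces ICE candidates but leaves their delivery to you, a common pattern is to forward them over your own signalling channel. The sketch below shows that wiring under assumptions: wireIceCandidates is a made-up name, and signalling is a hypothetical object with a send(message) method and an onmessage callback (for example a thin wrapper around a WebSocket), because WebRTC does not define a signalling transport.

```javascript
// Forward locally gathered ICE candidates to the remote peer and apply
// candidates received from it. "signalling" is a hypothetical object with
// send(message) and an onmessage callback supplied by the application.
function wireIceCandidates(pc, signalling) {
  pc.onicecandidate = (event) => {
    // A null candidate marks the end of gathering; there is nothing to send.
    if (event.candidate) {
      signalling.send({ type: "candidate", candidate: event.candidate });
    }
  };
  signalling.onmessage = async (message) => {
    if (message.type === "candidate") {
      await pc.addIceCandidate(message.candidate);
    }
  };
}

// Browser usage, assuming "signalling" wraps your own server connection:
//   const pc = new RTCPeerConnection({ iceServers: [/* your servers */] });
//   wireIceCandidates(pc, signalling);
```

Offers and answers would travel over the same channel; only the candidate messages are shown here to keep the sketch short.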
RTCDataChannel
RTCDataChannel is an important component of the WebRTC API. It enables bidirectional transfer of data between devices using WebRTC.
Using this feature, developers can build a wide range of applications apart from the video and audio calling for which WebRTC has traditionally been used.
Developers can build chat apps, collaborative whiteboards and other collaborative apps, and file sharing services.
The data channel is designed to be highly flexible and supports both highly reliable data delivery and unreliable data delivery with low overhead.
The data channel can be configured to suit all kinds of data transfer needs.
Key features of RTCDataChannel
- Bidirectional and Peer to Peer:
Data channels allow direct peer-to-peer and bidirectional transfer of data between client devices.
Unlike media streams, which are used for video and audio data transfer, the RTCDataChannel can transfer pretty much anything you throw at it.
- Configurable transport:
There are two modes of transport available with RTCDataChannel: (1) reliable mode, where data is guaranteed to arrive, and in the order it was sent, but which carries more overhead, and (2) unreliable mode, which is quite lightweight but where the data is not guaranteed to arrive at all.
- Integration with RTCPeerConnection:
Data channels are established on the same RTCPeerConnection and share the same underlying connection as the WebRTC media APIs, and thus use the same TURN servers for communication.
- Security:
Similar to other WebRTC APIs, RTCDataChannels are encrypted using DTLS for end-to-end encryption between the peers.
- Low Overhead:
RTCDataChannels use SCTP (Stream Control Transmission Protocol) over DTLS and UDP.
This combination provides a balance of low latency and reliability compared with TCP-based real-time communication solutions.
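The reliable and unreliable modes described above map onto the options passed to createDataChannel(). Here is a minimal sketch of that mapping. The helper names dataChannelOptions and openChannel and the mode strings are invented for this example, while ordered and maxRetransmits are the standard option names.

```javascript
// Translate a simple mode name into RTCDataChannel options.
function dataChannelOptions(mode) {
  if (mode === "reliable") {
    // Ordered delivery with retransmission until the data arrives.
    return { ordered: true };
  }
  if (mode === "unreliable") {
    // No ordering and no retransmits: lowest overhead, messages may be lost.
    return { ordered: false, maxRetransmits: 0 };
  }
  throw new Error(`unknown mode: ${mode}`);
}

// Open a channel on an existing RTCPeerConnection-like object.
function openChannel(pc, label, mode) {
  return pc.createDataChannel(label, dataChannelOptions(mode));
}

console.log(dataChannelOptions("reliable"), dataChannelOptions("unreliable"));

// Browser usage:
//   const chat = openChannel(pc, "chat", "reliable");
//   const positions = openChannel(pc, "game-state", "unreliable");
//   chat.onopen = () => chat.send("hello");
```

A chat or file transfer would normally use the reliable mode, while frequently refreshed data such as game positions can tolerate the unreliable mode.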
Practical Applications for RTCDataChannel
- Chat Application
- Collaborative tools
- File Sharing
- Gaming
Metered TURN servers
- API: TURN server management with a powerful API. You can do things like add/remove credentials via the API, retrieve per-user/credential and user metrics via the API, enable/disable credentials via the API, and retrieve usage data by date via the API.
- Global Geo-Location targeting: Automatically directs traffic to the nearest servers, for lowest possible latency and highest quality performance. Less than 50 ms latency anywhere around the world.
- Servers in all the Regions of the world: Toronto, Miami, San Francisco, Amsterdam, London, Frankfurt, Bangalore, Singapore, Sydney, Seoul, Dallas, New York
- Low Latency: less than 50 ms latency, anywhere across the world.
- Cost-Effective: pay-as-you-go pricing with bandwidth and volume discounts available.
- Easy Administration: Get usage logs, emails when accounts reach threshold limits, billing records and email and phone support.
- Standards Compliant: Conforms to RFCs 5389, 5769, 5780, 5766, 6062, 6156, 5245, 5768, 6336, 6544, 5928 over UDP, TCP, TLS, and DTLS.
- Multi‑Tenancy: Create multiple credentials and separate the usage by customer, or different apps. Get Usage logs, billing records and threshold alerts.
- Enterprise Reliability: 99.999% Uptime with SLA.
- Enterprise Scale: With no limit on concurrent traffic or total traffic. Metered TURN Servers provide Enterprise Scalability
- 5 GB/mo Free: Get 5 GB of free TURN server usage every month with the Free Plan
- Runs on port 80 and 443
- Supports TURNS + SSL to allow connections through deep packet inspection firewalls.
- Supports both TCP and UDP
- Free Unlimited STUN
Thank you for reading. I hope you like the article