The real challenge in dynamically scaling a media delivery platform is managing high concurrency as conference or stream viewership peaks and eventually subsides.
Consider, for example, concurrent users ranging from 100 to 1,000 across 10 to 100 conferences. Autoscaling the number of active media server instances based on usage can save a significant amount of cost and resources.
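As a rough sketch of what such a policy might look like, the snippet below derives a desired JVB instance count from the current participant count. The per-bridge capacity and the instance bounds are illustrative assumptions, not Jitsi defaults.

```python
# Hypothetical sketch: derive a desired JVB instance count from current usage.
# The capacity figure and scaling bounds are illustrative, not Jitsi defaults.
import math

PARTICIPANTS_PER_JVB = 100   # assumed comfortable load per videobridge
MIN_INSTANCES = 1
MAX_INSTANCES = 20

def desired_jvb_count(active_participants: int) -> int:
    """Return how many JVB instances should be running for the current load."""
    needed = math.ceil(active_participants / PARTICIPANTS_PER_JVB)
    return max(MIN_INSTANCES, min(MAX_INSTANCES, needed))

# 1,000 concurrent users -> 10 instances; 100 users -> 1 instance.
print(desired_jvb_count(1000), desired_jvb_count(100))
```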
Jitsi’s architecture allows for dynamic scaling in real time. Our previous blog provides the rationale behind using Jitsi Videobridge (JVB) as the media server. In addition to being powerful and optimized, JVBs are built to scale, which makes them well suited for dynamic media transport.
Architecture:
The figure below shows a simplified diagram of the scaling architecture for running multiple conferences.
Amazon Route 53
Route 53 is a highly available and scalable DNS web service that routes end users to internet applications by translating human-readable names, e.g., meet.sariska.io, into IP addresses. The routing policy used in our case is geolocation routing, which directs traffic based on the location of your users.
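For illustration, a geolocation record for meet.sariska.io could be created with boto3 roughly as follows; the hosted zone ID, IP address, and set identifier are placeholders.

```python
# Hypothetical sketch using boto3: create a geolocation record that sends
# European users to a region-local endpoint. Zone ID and record values are placeholders.
import boto3

route53 = boto3.client("route53")

route53.change_resource_record_sets(
    HostedZoneId="Z0000000EXAMPLE",          # placeholder hosted zone
    ChangeBatch={
        "Changes": [{
            "Action": "UPSERT",
            "ResourceRecordSet": {
                "Name": "meet.sariska.io",
                "Type": "A",
                "SetIdentifier": "eu-users",             # one record per geography
                "GeoLocation": {"ContinentCode": "EU"},
                "TTL": 60,
                "ResourceRecords": [{"Value": "203.0.113.10"}],  # placeholder IP
            },
        }]
    },
)
```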
Application Load Balancer
The Application Load Balancer, part of AWS Elastic Load Balancing, automatically distributes incoming application traffic across multiple targets and virtual appliances in one or more Availability Zones (AZs).
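As a rough illustration, registering instances from two Availability Zones with an ALB target group could look like this with boto3; the target group ARN and instance IDs are placeholders.

```python
# Hypothetical sketch using boto3: register targets in two AZs with an ALB target
# group so incoming traffic is spread across zones. ARN and IDs are placeholders.
import boto3

elbv2 = boto3.client("elbv2")

elbv2.register_targets(
    TargetGroupArn="arn:aws:elasticloadbalancing:region:account:targetgroup/placeholder",
    Targets=[
        {"Id": "i-0123456789abcdef0", "Port": 443},   # instance in AZ a
        {"Id": "i-0fedcba9876543210", "Port": 443},   # instance in AZ b
    ],
)
```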
HAProxy
The incoming connections need to be load-balanced between the shards. Additionally, new participants that want to join a running conference need to be routed to the correct shard.
A service running multiple instances of HAProxy, a popular open-source TCP/HTTP load balancer and proxy, is used for this purpose. New requests are load-balanced using the round-robin algorithm for fair load distribution. HAProxy uses DNS service discovery to find existing shards.
When a user needs to join an existing conference, HAProxy uses stick tables to route all traffic for that conference to the correct shard. Stick tables work similarly to sticky sessions. In our example, HAProxy stores the mapping of a conference room URI to a specific shard in a dedicated key-value store that is shared with the other HAProxy instances.
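The routing behaviour can be sketched as follows. Here a plain Python dict stands in for the shared stick table and the shard names are made up; in the real setup HAProxy discovers shards via DNS and shares the mapping between its instances.

```python
# Hypothetical sketch of the routing behaviour described above: new conferences are
# assigned a shard round-robin; requests for a known conference URI stick to the
# shard already serving it. A dict stands in for HAProxy's shared stick table.
import itertools

SHARDS = ["shard-0", "shard-1", "shard-2"]          # discovered via DNS in practice
_round_robin = itertools.cycle(SHARDS)
_conference_to_shard: dict[str, str] = {}           # stands in for the stick table

def route(conference_uri: str) -> str:
    """Return the shard that should handle this conference."""
    if conference_uri not in _conference_to_shard:
        _conference_to_shard[conference_uri] = next(_round_robin)
    return _conference_to_shard[conference_uri]

# All participants of the same room land on the same shard.
assert route("meet.sariska.io/standup") == route("meet.sariska.io/standup")
```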
Shard
The term “shard” describes a composition containing single containers for Jicofo and Prosody and multiple JVB containers running in parallel.
A single shard component will contain the following services inside:
- NGINX Server
- Jicofo manages media sessions between each of the participants in a conference and the videobridge. It uses the XMPP protocol for service discovery of all videobridges, chooses a videobridge, and distributes the load if multiple videobridges are used. When a client connects, Jicofo points it to the videobridge it should use, and it keeps track of which conferences run on which videobridges (see the sketch after this list).
- Prosody is an XMPP communication server that is used by Jitsi to create multi-user conferences.
- Jitsi Videobridge
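To make the bridge-selection idea concrete, here is a simplified sketch of how a focus component might pick a videobridge. It is not Jicofo’s actual selection strategy; the load metric and class names are assumptions for illustration.

```python
# Hypothetical sketch: pick the least loaded videobridge from those discovered over
# XMPP, and remember which conference runs where. Not Jicofo's real algorithm.
from dataclasses import dataclass, field

@dataclass
class Videobridge:
    jid: str                 # XMPP address of the bridge
    participants: int = 0    # current load reported by the bridge

@dataclass
class ConferenceFocus:
    bridges: list[Videobridge]
    conference_to_bridge: dict[str, Videobridge] = field(default_factory=dict)

    def bridge_for(self, conference: str) -> Videobridge:
        """Reuse the bridge already hosting the conference, else pick the least loaded one."""
        if conference not in self.conference_to_bridge:
            self.conference_to_bridge[conference] = min(
                self.bridges, key=lambda b: b.participants
            )
        return self.conference_to_bridge[conference]

focus = ConferenceFocus([Videobridge("jvb1", 80), Videobridge("jvb2", 10)])
print(focus.bridge_for("standup").jid)   # -> jvb2, the least loaded bridge
```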
Shard Arrangement:
In the setup shown above for a shard, the videobridges can be scaled up and down depending on the current load (number of video conferences and participants). The videobridge is typically the component with the highest load and therefore the main part that needs to be scaled.
Nevertheless, the single containers (web, Jicofo, Prosody) are also prone to running out of resources. This can be solved by scaling out to multiple shards as shown above. These shards are load-balanced by HAProxy.
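Complementing the JVB-level autoscaling sketched earlier, the snippet below illustrates the shard-level decision: add a shard once the number of active conferences exceeds what a single Prosody/Jicofo pair can comfortably handle. The per-shard capacity figure is an assumption for illustration.

```python
# Hypothetical sketch: decide how many shards to run from the number of active
# conferences, since each shard's single prosody/jicofo/web containers limit how
# many conferences one shard can host. The capacity figure is illustrative.
import math

CONFERENCES_PER_SHARD = 50   # assumed comfortable load for one prosody/jicofo pair

def desired_shard_count(active_conferences: int, min_shards: int = 1) -> int:
    return max(min_shards, math.ceil(active_conferences / CONFERENCES_PER_SHARD))

# 100 active conferences -> 2 shards behind HAProxy.
print(desired_shard_count(100))
```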