Whitepaper multipoint video_conferencing_june2012_wr

Copyright © 2012 Wainhouse Research, LLC

WHITEPAPER

Multipoint Videoconferencing Goes Virtual

Ira M. Weinstein

June 2012

Sponsored by: Vidyo

How a next-generation architecture is changing the rules of multipoint video


Introduction

Today’s business environment is all about doing more with less. Organizations want to process more

transactions and develop more products, all with as few resources as possible. The same rule applies

when organizations make technology investments … they want to get as much utility out of those

investments and resources as possible. This expectation of squeezing every bit of value out of

technology investments has fueled increased interest in server virtualization.

According to Webopedia, “server virtualization is the partitioning of a physical server into smaller virtual

servers.” Each of the resulting virtual servers is then truly independent, meaning that each runs its

own operating system and can even be rebooted individually.

According to a 2011 Symantec survey of 3,700 enterprises around the world, 45% of respondents had

implemented server virtualization [1]. Given the long list of possible benefits associated with server

virtualization (ability to deploy less hardware, decreased up-front and ongoing cost, improved efficiency,

enhanced scalability, increased reliability and availability, decreased carbon footprint, decreased

management burden, etc.), it’s no wonder that enterprises are embracing server virtualization to such a

large degree.

In many organizations, videoconferencing management has shifted away from the A/V or facilities team

over to the IT department. These IT managers are looking to deploy visual communications in a manner

similar to other applications running on the network, and want to architect their videoconferencing

environments to be cost effective, easy to manage, and easy to scale in response to demand – all of

which can be achieved via virtualization. Traditionally it has not been possible to virtualize certain

elements of the videoconferencing environment – mainly video bridging. Recent technology

developments, however, have made it possible to fit aspects of the videoconferencing ecosystem into a

virtualized resources strategy.

This study outlines the challenges associated with scaling a traditional videoconferencing environment

and highlights a next-generation videoconferencing architecture that is changing many of the rules of

video bridging.

1 https://www4.symantec.com/mktginfo/whitepaper/Virt_and_Evolution_Cloud_Survey_060811.pdf,

http://www.eweek.com/c/a/Midmarket/Adoption-of-Server-Virtualization-Widespread-Symantec-Report-125308/



The Fundamentals of Video Bridging

During a video call, each participating location's video and audio signals must be compressed (a.k.a.

encoded) and then transmitted to the remote location, where they are decompressed (a.k.a. decoded) and

sent to the local video displays and speakers. This same process happens continuously in both

directions for the duration of the video call.

Figure 1: Encoding and Decoding during a Video Call

The necessary signal compression and decompression (encoding and decoding) is handled by the

videoconferencing system, which might be a dedicated appliance, personal computer, tablet, or

smartphone.

In a traditional videoconferencing environment, video calls involving more than two participants (a.k.a.

multipoint or multiparty calls) require the use of a device called a video bridge or multipoint control unit

(MCU).

Figure 2: A Traditional Multipoint Video Call

During a multipoint call, each participating location sends its video and audio signals to the MCU. The

MCU then decodes the incoming signal, mixes that signal with other incoming video and audio signals in

a process called compositing, and creates (encodes) and sends a new signal composed of all of the other

signals to each participating location. This decoding and re-encoding process is often called transcoding.

[Figure 1 and Figure 2 labels: Compression Engine (Encoder), Decompression Engine (Decoder), MCU]


A traditional transcoding MCU must do the work of multiple endpoints simultaneously. For example, if six

sites are connected to the MCU, the MCU must decode six incoming video and audio signals, perform six

video mixings, and create / encode six different outgoing video signals – all in real time. The need to

handle multiple video encodes and decodes simultaneously makes hosting traditional multipoint video

calls extremely computationally intensive.
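
To make the scaling concern concrete, the short Python sketch below counts the real-time media operations for one multipoint call under the simple model described above: one decode, one mix, and one encode per connected site. The function names and the per-site model are illustrative assumptions and do not describe any specific MCU.

```python
def mcu_operations(sites: int) -> dict:
    """Count simultaneous media operations for one multipoint call.

    Simplified model from the text: the bridge decodes every incoming
    signal, builds one composite per site, and encodes one outgoing
    signal per site.
    """
    if sites < 3:
        raise ValueError("a multipoint call involves more than two sites")
    return {"decodes": sites, "mixes": sites, "encodes": sites}


def total_operations(sites: int) -> int:
    """Total real-time operations the bridge must sustain for the call."""
    return sum(mcu_operations(sites).values())


if __name__ == "__main__":
    print(mcu_operations(6))    # six decodes, six mixes, six encodes
    print(total_operations(6))  # 18
```

In this model the workload grows linearly with the number of sites per call, and a bridge hosting several calls at once must sustain the sum across all of them.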

Author’s Note

Videoconferencing uses lossy compression techniques. This means that each time a video signal is

encoded, a portion of the original signal is lost, and each additional decode / re-encode cycle compounds

that loss. In addition, encoding and decoding introduce latency (delay), which can impact the ability to

interact comfortably during a video call.

In a traditional multipoint video bridging scenario, before a location’s camera signal is viewed by

another participant, it has been encoded by the source video system, decoded by the MCU, mixed /

composited by the MCU, re-encoded by the MCU, and then decoded by the receiving video system.

The use of multiple encodes and decodes has a negative effect on the overall user experience.
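
The chain of stages in the note above can be written out explicitly. The following Python sketch is a simplified model that assumes every transcoding bridge in the path adds exactly one decode, one composite, and one re-encode. The function names are hypothetical.

```python
def codec_stages(bridges: int) -> list:
    """Return the ordered stages one camera signal passes through.

    bridges=0 models a point-to-point call. Each transcoding bridge in
    the path adds one decode, one composite, and one re-encode.
    """
    if bridges < 0:
        raise ValueError("bridges must be non-negative")
    stages = ["encode (source endpoint)"]
    for i in range(1, bridges + 1):
        stages += [
            f"decode (MCU {i})",
            f"composite (MCU {i})",
            f"re-encode (MCU {i})",
        ]
    stages.append("decode (receiving endpoint)")
    return stages


def lossy_encodes(bridges: int) -> int:
    """Number of lossy encode passes applied to the signal end to end."""
    return sum(
        1 for s in codec_stages(bridges)
        if s.startswith("encode") or s.startswith("re-encode")
    )


if __name__ == "__main__":
    for stage in codec_stages(1):
        print(stage)
    print(lossy_encodes(0), lossy_encodes(1), lossy_encodes(2))
```

Under this model a point-to-point call applies one lossy encode, a call through a single bridge applies two, and each additional cascaded bridge adds one more.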

The Challenge of Scaling Traditional Video Bridging

The task facing videoconferencing / IT managers is to provide reliable, cost-effective, high-quality, and

scalable multipoint videoconferencing for their global organizations. From a 10,000-foot view,

enterprises have the following “traditional” options for addressing this requirement:

Option #1 – Hardware Video Bridges

Due to the heavy computing demands of traditional video bridging, most video bridging today is handled

by hardware-based MCUs running on custom-built platforms. Unfortunately, there are a number of

disadvantages associated with buying a traditional hardware video bridge including:

Up-Front Cost – for hardware MCUs, list prices per port range from $1k to over $10k depending upon

the model, configuration, and video resolution. While this may be suitable for an organization with

a limited number of video systems / users, organizations with thousands of personal video users

expecting to participate in multiparty calls will find this model cost prohibitive.

Operating Cost – hardware video bridges can be

expensive to own and operate. Specific costs include

hardware maintenance, rack space (some MCUs

require 15 or more rack units of space), power

consumption (some MCUs require 1500 watts or

more), and HVAC requirements.

Capacity Planning – hardware video bridge customers must purchase capacity up front as opposed

to purchasing capacity to match demand as it grows. For a service provider, this means an out of

pocket investment vs. an investment that is financed through operating income. For an enterprise

organization, this means additional up-front cost because the MCU must be sized to support both

today’s and tomorrow’s capacity requirements.

The high per port / connection cost of

traditional hardware video bridges is a

barrier to the widespread adoption of

videoconferencing.


Capacity Expansions – expanding a hardware video bridge typically involves a significant cost. In

most cases, expansions of only a few ports are not possible. Instead, customers must purchase a

bank of ports (e.g. 12 or 24) or an entire additional video bridge.

Centralized Architecture – the high cost of even a low capacity hardware video bridge drives many

organizations to centralize their video bridging hardware in a small number of locations. This means

some remote users will be a great distance away from the video bridge, and as a result will

experience audio delays. To avoid this problem, an organization could deploy additional bridges

around the world (a distributed architecture). However, this tends to be cost prohibitive.

Cascading Issues – cascaded meetings include participants connected to two or more video bridges.

The benefits of cascading include expanded capacity beyond that of a single bridge, the ability for

participants to connect to a “local” bridge, and the ability to reduce WAN bandwidth utilization by

sending a single stream between bridges. Unfortunately, cascading traditional video bridges results

in degraded video quality (due to multiple encodes / decodes of each person), increased latency,

and smaller images of many participants. The net result is that traditional bridges are not well suited for

cascaded meetings, which eliminates a key benefit of a distributed architecture.

Time to Deploy – deploying a video bridge or adding additional hardware ports may take weeks to

complete. In the traditional group videoconferencing world in which new rooms took weeks to

bring online, this was acceptable. However, in a world in which an organization can bring thousands

of personal video devices (e.g. tablets) online in a matter of hours, this may be a problem.

Virtualization Unfriendly – the use of custom-built hardware means that hardware video bridges

cannot be virtualized.
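
The Capacity Expansions point in the list above lends itself to a quick calculation. The Python sketch below assumes that ports are sold only in whole banks, as the text describes. The per-port price in the usage example is a hypothetical figure chosen from within the $1k to $10k range quoted earlier, not a vendor quote.

```python
import math


def ports_to_purchase(extra_ports_needed: int, bank_size: int) -> int:
    """Ports that must be bought when capacity is sold only in whole banks."""
    if extra_ports_needed < 0 or bank_size <= 0:
        raise ValueError("need a non-negative port count and a positive bank size")
    return math.ceil(extra_ports_needed / bank_size) * bank_size


def expansion_cost(extra_ports_needed: int, bank_size: int, price_per_port: int) -> int:
    """List-price cost of the smallest expansion that covers the need."""
    return ports_to_purchase(extra_ports_needed, bank_size) * price_per_port


if __name__ == "__main__":
    # Hypothetical scenario: 3 more ports needed, banks of 12, $5,000 per port.
    print(ports_to_purchase(3, 12))      # 12
    print(expansion_cost(3, 12, 5_000))  # 60000
```

In this scenario the customer pays for twelve ports to gain three, which is the over-purchase the bullet describes.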

Option #2 – Software Video Bridges

In recent years, a number of software-based video bridges have become available. These bridges

operate in the same way as their hardware-based cousins, but leverage off-the-shelf servers instead of a

custom-built hardware platform.

Software-based MCUs offer several benefits including lower cost per port and lower operating costs

than hardware-based MCUs. However, they also introduce a handful of additional challenges including:

Performance / Capacity Compromise – traditional software MCUs are limited by the computing

power available within the standard Intel hardware platform. As a result, software MCUs typically

force a compromise between performance, feature-set, and capacity. Given the limited capacity of

standard software MCUs, many organizations will need to deploy multiple software MCUs and

servers to meet the demands of their user community.

Cascading Issues – software MCUs tend to have fewer ports than hardware MCUs, a situation that

lends itself to cascaded meetings as a means of increasing the maximum number of participants per

meeting. Unfortunately, as described above, cascaded meetings on traditional MCUs are

problematic.


Virtualization Unfriendly – Software MCUs are typically installed on dedicated servers. While it may

be possible to run a traditional software MCU on a virtual server, this raises two options:

Option 1 – Dedicate all server resources to the MCU application in order to protect the user

experience. This means that the server cannot host other applications, thereby reducing the

virtualization benefit.

Option 2 – Share server resources across multiple applications to maximize efficiency. Note that

reducing the computing power available to the MCU is likely to impact performance and

capacity significantly, thereby eliminating much of the benefit of a software MCU.

Option #3 – Hosted Video Bridging Service

Enterprises may choose to use a videoconferencing bridging service provider. In this scenario, the

service provider purchases one or more video bridges and makes them available to its customers.

There are several key advantages associated with using a video bridging service including the ability to

avoid the up-front cost of an MCU purchase, and the ability to outsource the burden of managing the

MCU to the service provider. In addition, the service provider will be responsible for ensuring that

ample capacity is available to support its customers.

Unfortunately, there are some notable shortcomings associated with this method including:

Usage-Based Cost – hosted bridging service customers trade the up-front cost of an MCU purchase

for either usage-based fees (e.g. $45 / hour / port used), meeting room fees (e.g. $30 / month /

meeting room for up to 5 callers) or monthly fees (e.g. $300 / month / port). Depending on the

usage pattern, this may be more or less expensive than purchasing a hardware or software MCU.

Lack of Flexibility – the use of a single video bridging platform across many customers means that

individual customers cannot configure the bridge to meet their specific needs. Instead, the service

provider defines the bridge settings to address what the service provider believes are the overall

requirements of its customer base.
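
Whether a usage-based plan costs more or less than a flat monthly fee depends on how many hours each port is used. The Python sketch below works out the break-even point from the two example rates quoted in the Usage-Based Cost bullet ($45 / hour / port and $300 / month / port). The function names are illustrative, and real service pricing will differ.

```python
USAGE_FEE_PER_PORT_HOUR = 45  # US $, example rate quoted in the text
MONTHLY_FEE_PER_PORT = 300    # US $, example rate quoted in the text


def usage_based_cost(port_hours: float) -> float:
    """Monthly cost of one port billed by the hour."""
    return port_hours * USAGE_FEE_PER_PORT_HOUR


def break_even_port_hours() -> float:
    """Hours per month at which both plans cost the same for one port."""
    return MONTHLY_FEE_PER_PORT / USAGE_FEE_PER_PORT_HOUR


def cheaper_plan(port_hours: float) -> str:
    """Name the cheaper plan for a given monthly usage of one port."""
    usage = usage_based_cost(port_hours)
    if usage < MONTHLY_FEE_PER_PORT:
        return "usage-based"
    if usage > MONTHLY_FEE_PER_PORT:
        return "monthly"
    return "equal"


if __name__ == "__main__":
    print(round(break_even_port_hours(), 2))  # hours per port per month
    print(cheaper_plan(5), cheaper_plan(10))
```

With these example rates the break-even point is a little under seven hours per port per month. Lighter users favor the usage-based plan and heavier users the monthly fee.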

Option #4 – Resource Rationing

One way to deal with the increasing demand for multipoint videoconferencing is to allocate video

bridging resources based on availability. In this scenario, meeting requests are rejected when sufficient

bridging capacity is not available. Although this is a viable option, it is not advisable as rejecting meeting

requests will inhibit user adoption and limit the benefits enjoyed by the host organization. In addition,

this will motivate users to use more accessible communication mediums (e.g. audio conferencing, web

conferencing), even in situations where videoconferencing should be used.

Option #5 – Virtualization

This option involves running the video bridging application on standard virtual servers. Key benefits of

this approach for both end-users and service providers include:


Low Cost – virtualized servers are less expensive to deploy and operate than dedicated servers and

custom-built hardware.

Immediate Scalability – in response to demand, enterprises can “spin-up” virtual servers easily and

quickly, with little or no notice.

Deployment Flexibility – the use of standard virtual servers allows enterprises to deploy the

bridging application on their own server farms or on 3rd party virtual servers. In a matter of hours, a

customer could deploy a globally accessible, distributed bridging architecture without the need for

additional hardware.

IT Friendliness – the above model allows enterprises to treat the videoconferencing bridging

application as they treat all other business applications. This decreases management burden and

allows the VC environment to be supported by general IT resources. This also supports the general

IT trend toward virtualization, and is a giant step toward bringing videoconferencing into the IT

mainstream.

While this may sound like a great concept, the vast majority of videoconferencing bridging solutions

available today do not support the virtualized deployment model.

Summary of Options

Options 1 through 3 above suffer from various issues ranging from high up-front cost, limited capacity,

inflexible capacity expansion options, and significant rack space and power requirements to

compromised performance, limited (or no) support for distributed architectures, and ongoing usage-

based fees. As a result, these options are not well suited to meet the large scale bridging requirements

of many enterprises today.

As organizations embrace personal videoconferencing (using UC clients, dedicated PC software, and

mobile devices), the number of users and the number of multipoint participants will increase

dramatically, which makes the problem of scaling video bridging even more acute.

Solution Spotlight – Vidyo

Vidyo, the sponsor of this study, follows a novel approach that supports large scale multipoint video

conferencing without the need for a traditional, transcoding MCU. As a result, the Vidyo solution avoids

many of the issues associated with traditional video bridging.

Vidyo's solution is based on a client / server architecture designed specifically for videoconferencing.

The “secret sauce” within the offering is the use of an intelligent media switching architecture instead of

an MCU-based transcoding architecture.


Figure 3: A Typical Vidyo Deployment

Vidyo Endpoints

The first elements within the Vidyo ecosystem are the Vidyo endpoints. Available for meeting rooms,

PCs (Windows, Mac, Linux) and mobile devices (iOS and Android), all of the endpoints leverage the same

software stack.

The Vidyo endpoints use a video compression standard called H.264 Scalable Video Coding (or SVC), and

SVC leverages a concept called video scaling. A video stream is “scalable” if parts of the stream can be

removed in such a way that the resulting stream can still be decoded. Using SVC, the Vidyo endpoints

convert each video stream into several layers.

Vidyo’s use of SVC and the layering methodology provides two key benefits:

1) Exceptionally strong network resiliency, which enables high quality video calling over

inexpensive, lossy IP networks (e.g. the public Internet).

2) The ability to combine the individual layers to create resulting video streams of different

degrees of quality.
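
The idea of a scalable stream can be illustrated with a deliberately simplified model in which each enhancement layer depends on all layers below it. The Python sketch below rests on that assumption. The layer names and bitrates are hypothetical, and real H.264 SVC streams have a richer dependency structure (temporal, spatial, and quality scalability).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layer:
    name: str
    kbps: int  # hypothetical bitrate, for illustration only


# Ordered from the base layer upward. In this simplified model each
# enhancement layer depends on every layer below it.
STREAM = (
    Layer("base", 300),
    Layer("enhancement-1", 500),
    Layer("enhancement-2", 1200),
)


def is_decodable(layers) -> bool:
    """A subset is decodable if it is a non-empty prefix of the full stream."""
    layers = tuple(layers)
    return len(layers) > 0 and layers == STREAM[: len(layers)]


def total_kbps(layers) -> int:
    """Bitrate of the stream formed by the given layers."""
    return sum(layer.kbps for layer in layers)


if __name__ == "__main__":
    print(is_decodable(STREAM[:1]), total_kbps(STREAM[:1]))  # base only
    print(is_decodable(STREAM[:2]), total_kbps(STREAM[:2]))  # base + one layer
    print(is_decodable(STREAM[1:]))                          # base removed
```

Removing layers from the top still leaves a decodable, lower-quality stream, whereas removing the base layer does not. This is the property that allows streams of different quality to be built from the same encoded source.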

Vidyo Architecture

The next part of the Vidyo ecosystem is the VidyoRouter, a software-based solution (delivered running

on a 1 RU appliance or as a virtual appliance running in VMware and KVM environments) that performs

packet switching of the video signals without transcoding or processing those signals.


The VidyoRouter acts as the traffic cop of the environment, providing each endpoint with the

appropriate video layers to match its capabilities (bandwidth, processor power, screen resolution, etc.).

The VidyoRouter supports up to 1440p / 60 fps and up to 100 connections per server.

Figure 4: Vidyo’s H.264 SVC Encoding and Layer Switching Process

Since the video signals are not decoded, composited, or transcoded by the VidyoRouter, latency is

minimized and video quality is maintained throughout the video call. In addition, systems / users

connected to different VidyoRouters can participate in the same call without the quality and latency

issues associated with cascading traditional video bridges.
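
As a rough illustration of layer switching, the Python sketch below picks, for each endpoint, the layers that fit within a bandwidth budget and forwards them without decoding anything. It is a sketch under stated assumptions rather than Vidyo's actual algorithm: the layer bitrates, endpoint names, and budgets are hypothetical, and bandwidth is the only capability considered, although the text also mentions processor power and screen resolution.

```python
# Hypothetical layer bitrates in kbps, ordered from the base layer upward.
LAYERS = [("base", 300), ("enhancement-1", 500), ("enhancement-2", 1200)]


def select_layers(budget_kbps: int) -> list:
    """Pick layers from the base upward until the budget would be exceeded."""
    chosen = []
    used = 0
    for name, kbps in LAYERS:
        if used + kbps > budget_kbps:
            break
        chosen.append(name)
        used += kbps
    return chosen


def forward_plan(endpoints: dict) -> dict:
    """Map each endpoint to the layers that would be forwarded to it."""
    return {name: select_layers(budget) for name, budget in endpoints.items()}


if __name__ == "__main__":
    plan = forward_plan({"room system": 2000, "laptop": 900, "phone": 350})
    for endpoint, layers in plan.items():
        print(endpoint, layers)
```

Because the selection only chooses which already-encoded layers to forward, no decode, composite, or re-encode step is involved, which is the source of the latency and quality advantages described above.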

Unfortunately, traditional videoconferencing endpoints cannot generate the multiple layers necessary

to work natively within the Vidyo architecture. For group video users, the VidyoGateway (also a

software solution) converts these standard H.264 video streams into H.264 SVC streams, allowing the

traditional devices to participate in Vidyo-powered sessions. For desktop users without a Vidyo

endpoint, Vidyo’s “guest linking” capability allows guests to join Vidyo conferences from their PCs via a

hypertext link, similar to the way people join web conferences.

Author’s Note

Several videoconferencing vendors (e.g. Avaya, Polycom) have announced plans to support H.264

SVC in the near future. In addition, various efforts are underway to finalize H.264 SVC signaling and

media standards. This will hopefully result in basic interoperability between at least some SVC

solutions in the near future.

[Figure 4 labels: Source; Base Layer; High Resolution Layer; High Resolution - High Frame Rate; Medium Resolution - Medium Frame Rate; Low Resolution - Low Frame Rate; 2 Mbps; 500 Kbps; 3G/4G Low Bandwidth]


The VidyoPortal manages the deployed VidyoRouters, VidyoGateways, and Vidyo endpoints, allowing

administrators to manage the global Vidyo deployment

from within a single web-based user interface.

Vidyo Goes Virtual

Unlike traditional video bridges, which are typically too

processor-dependent to be virtualized efficiently, the

VidyoRouter is available in a virtualization-ready version

(called VidyoRouter Virtual Edition) that can run on

VMware or KVM virtualized servers in private, public, or hybrid clouds.

End-user customers and service providers can spin up additional VidyoRouter VE instances on demand,

immediately extending the capacity and global reach of the Vidyo deployment. In addition, non-

virtualized and virtualized VidyoRouters can work together to create a hybrid environment.

Vidyo Licensing

Unlike traditional hardware MCUs, which require customers to pay up front for all ports, Vidyo operates

on a concurrent connection license model. Each Vidyo concurrent license, dubbed a VidyoLine, supports

one concurrent connection to the VidyoRouter. VidyoLines float among all globally deployed

VidyoRouters, which means that customers need to buy only the licenses to support the maximum

number of simultaneous connections to the VidyoRouter at any point in the day.
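
Sizing a floating, concurrent-connection license pool comes down to finding the peak number of simultaneous connections. The Python sketch below shows one common way to compute that peak from a list of connection start and end times. The function name and the sample sessions are hypothetical and are not taken from any Vidyo tool.

```python
def peak_concurrent(sessions) -> int:
    """Return the maximum number of sessions open at the same moment.

    sessions is an iterable of (start, end) pairs. A connection that
    ends at time t frees its license before one that starts at t.
    """
    events = []
    for start, end in sessions:
        if end <= start:
            raise ValueError("each session must end after it starts")
        events.append((start, 1))   # connection opens
        events.append((end, -1))    # connection closes
    # Tuples sort by time first; at equal times -1 sorts before +1,
    # so disconnects are processed before new connects.
    events.sort()
    peak = current = 0
    for _, delta in events:
        current += delta
        peak = max(peak, current)
    return peak


if __name__ == "__main__":
    # Hours of the day for four hypothetical connections, worldwide.
    day = [(9, 10), (9.5, 11), (10, 12), (15, 16)]
    print(peak_concurrent(day))  # 2
```

Because the licenses float among all routers, the input is the combined list of connections across every location rather than a separate count per site.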

Example – Traditional Multipoint Model vs. Vidyo Multipoint Model

The example below highlights the multipoint infrastructure costs for a customer needing to support

multipoint calling for up to 20 sites in each of two locations (e.g. New York and Hong Kong).

                                           | Traditional MCU Method      | Vidyo Multipoint Method
Types of Endpoints Deployed                | H.323 / SIP                 | Vidyo Room and Personal
Equipment Required – New York              | 1 x Traditional 20-port MCU | 1 x VidyoRouter
Equipment Required – Hong Kong             | 1 x Traditional 20-port MCU | 1 x VidyoRouter
Equipment Required – Any Location (Note 1) | None                        | 1 x VidyoPortal
Software Licenses Required (Note 2)        | None                        | 25 x VidyoLines
Up-Front Cost (US $, List Price) (Note 3)  | $250k - $300k               | ~ $42k [2]
Annual Recurring Cost (Note 4)             | ~ $42k (maintenance)        | ~ $6k (maintenance)

Note 1: Traditional MCU method cost does not include commonly deployed videoconferencing infrastructure items such as gatekeepers, NAT / firewall traversal solutions, and management servers. These items may cost an additional $50k - $150k (depending upon the situation).

Note 2: VidyoLines are only required for Video soft-client endpoints. VidyoRooms and VidyoGateway calls do not require VidyoLines.

Note 3: Includes cost for multipoint infrastructure only. Does not include shipping, installation, configuration, and training costs.

Note 4: Assumes a 15% annual maintenance cost.

2 If the customer wishes to support point-to-point or multipoint calls between the Vidyo endpoints and legacy

H.323 / SIP video systems, a VidyoGateway (US $6k list price) would be required. Note that additional VidyoLines would not be required.

The Vidyo environment is 1/6th (or less)

of the price of a traditional video

bridging environment, and allows

customers to scale on demand, one

concurrent connection at a time.


At a list price of US $42k, the cost for the Vidyo multipoint calling infrastructure is only about 14% to 17% of the cost

of the two traditional video bridges. In fact, the annual maintenance for the two video bridges is

roughly the same as the up-front cost for the entire Vidyo infrastructure deployment!
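
The comparison above can be checked with a few lines of arithmetic. The Python sketch below uses only the list-price figures from the example table ($250k to $300k for the two traditional bridges, about $42k for the Vidyo infrastructure, and 15% annual maintenance). The constant and function names are illustrative.

```python
TRADITIONAL_UPFRONT = (250_000, 300_000)  # US $ list price range, from the table
VIDYO_UPFRONT = 42_000                    # US $ list price, from the table
MAINTENANCE_PERCENT = 15                  # assumed annual maintenance, from the table notes


def cost_ratio_range() -> tuple:
    """Vidyo up-front cost as a fraction of the traditional range (low, high)."""
    low, high = TRADITIONAL_UPFRONT
    return VIDYO_UPFRONT / high, VIDYO_UPFRONT / low


def annual_maintenance(upfront: int) -> float:
    """Annual maintenance at the assumed percentage of up-front cost."""
    return upfront * MAINTENANCE_PERCENT / 100


if __name__ == "__main__":
    low, high = cost_ratio_range()
    print(f"{low:.1%} to {high:.1%}")         # share of the traditional cost
    print(annual_maintenance(VIDYO_UPFRONT))  # Vidyo maintenance per year
    print([annual_maintenance(c) for c in TRADITIONAL_UPFRONT])
```

The traditional maintenance range of $37.5k to $45k per year brackets the ~ $42k figure in the table, which is why it is comparable to the entire Vidyo up-front cost.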

Note re: Capacity / Expansions

The above Vidyo cost estimate includes 25 concurrent connections around the world (25 on the Hong

Kong VidyoRouter, 25 on the New York VidyoRouter, or any combination in between). Should the

customer require additional simultaneous connections, the cost would be US $950 (one time list price)

per VidyoLine. This licensing model allows the customer to scale on demand, one connection at a time.

For example, for an additional $9,500 (list price for 10 VidyoLines), the customer would be able to host

up to 35 concurrent connections at any given time around the world.

In the traditional MCU world, a capacity expansion of this kind would require the purchase of one or

perhaps even two additional MCUs. Alternatively, if available, the customer could purchase a software

capacity upgrade. In either case, the customer could not upgrade one port at a time, and the upgrade

cost would likely be $100,000 or more.
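
The difference between the two expansion models is easiest to see per added connection. The Python sketch below uses the $950 VidyoLine list price and the 25 included connections from the example, and treats the "$100,000 or more" traditional upgrade figure as a lower bound. The function names are hypothetical.

```python
VIDYOLINE_LIST_PRICE = 950           # US $ per added connection, from the text
BASE_CONNECTIONS = 25                # connections included in the example estimate
TRADITIONAL_UPGRADE_FLOOR = 100_000  # US $, "likely $100,000 or more"


def vidyo_expansion_cost(added: int) -> int:
    """List-price cost of adding connections one VidyoLine at a time."""
    if added < 0:
        raise ValueError("added connections must be non-negative")
    return added * VIDYOLINE_LIST_PRICE


def capacity_after(added: int) -> int:
    """Total concurrent connections after the expansion."""
    return BASE_CONNECTIONS + added


def traditional_floor_per_connection(added: int) -> float:
    """Lower bound on the traditional upgrade cost per added connection."""
    if added <= 0:
        raise ValueError("added connections must be positive")
    return TRADITIONAL_UPGRADE_FLOOR / added


if __name__ == "__main__":
    print(vidyo_expansion_cost(10), capacity_after(10))  # 9500 35
    print(traditional_floor_per_connection(10))          # 10000.0
```

For a ten-connection expansion, the per-connection cost in this example is $950 versus at least $10,000, and the gap widens for smaller expansions because the traditional upgrade cannot be bought one port at a time.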

Alternatively, for organizations seeking to avoid up-front expenses, the Vidyo solution is also available as

a hosted / cloud offering from a wide range of partners around the world. For ~ $30 / month

(depending upon the specific service selected), a user can host unlimited Vidyo-powered meetings

including up to five (5) guests. Non-Vidyo users using H.323 or SIP systems can participate in meetings

for an additional $0.20 / minute. Essentially, this all-OPEX model provides each user with a personal,

hosted video meeting room for no up-front cost.

Conclusion

In the past, videoconferencing was available on expensive, hardware-based systems installed within

enterprise meeting rooms only. The high cost and limited number of available meeting rooms served to

limit the number of video endpoints installed throughout the organization, as well as the overall video

call volume. This, in turn, limited the number of multipoint ports / connections required.

Today, videoconferencing is no longer trapped within the board room. Instead, videoconferencing is

now available on users’ desktops and on their mobile devices. This ubiquity is driving the need for

increased multipoint scalability and cost effectiveness.

Enterprise managers are no longer interested in “brute force” solutions to IT challenges. Instead, they

expect solutions to be easy to deploy, manage and scale. Traditional video bridges / MCUs are not well

suited to meet these cost and scalability expectations.

Vidyo, the sponsor of this white paper, offers a next-generation architecture based on intelligent media

switching that eliminates the need for processor-intensive (and expensive) transcoding, increases

flexibility in deployment, supports both virtual and non-virtual environments, allows on-demand

expansions of one connection at a time, and is much less expensive than competing solutions.

Organizations seeking to conduct wide scale multipoint videoconferencing on a global basis should

carefully consider Vidyo.


About Wainhouse Research

Wainhouse Research, www.wainhouse.com, is an independent market research firm that focuses on

critical issues in the Unified Communications and rich media conferencing fields, including applications

like distance education and e-Learning. The company conducts multi-client and custom research studies,

consults with end users on key implementation issues, publishes white papers and market statistics, and

delivers public and private seminars as well as speaker presentations at industry group meetings.

Wainhouse Research publishes a variety of reports that cover all aspects of rich media conferencing, and

the free newsletter, The Wainhouse Research Bulletin.

About the Author

Ira M. Weinstein is a senior analyst and partner at Wainhouse Research and a 20-year veteran of the

conferencing, collaboration, and audio-visual industries. His prior experience includes senior positions

with conferencing and AV vendors, distributors, and resellers. In addition, Ira ran the global

conferencing department for a Fortune 50 investment bank. As the lead analyst of WR’s visual

collaboration team, Ira’s focus includes videoconferencing (mobile, desktop, group, and telepresence /

immersive), streaming / webcasting, and the visual elements of unified communications. Ira holds a B.S.

in Electrical Engineering from Lehigh University and can be reached at [email protected].

About Vidyo (copy provided by Vidyo)

Vidyo, Inc. pioneered Personal Telepresence enabling natural, HD multi-point videoconferences on

tablets and smart phones, PCs and Macs, room systems, gateways that interoperate with H.323 and SIP

endpoints, telepresence solutions and affordable cloud-based broadcast solutions. Learn more at

www.vidyo.com, on the Blog or follow @vidyo on Twitter.

