
Journal of Network and Computer Applications 34 (2011) 1683–1694


journal homepage: www.elsevier.com/locate/jnca

An adaptive-predictive architecture for video streaming servers

Stenio Fernandes*, Judith Kelner, Djamel Sadok

Center of Informatics (CIn), Federal University of Pernambuco (UFPE), Recife, PE, Brazil

Article info

Article history:

Received 17 October 2010

Received in revised form 15 May 2011

Accepted 22 May 2011

Available online 12 June 2011

Keywords:

Multimedia architectures

Video streaming

1084-8045/$ - see front matter © 2011 Elsevier Ltd. All rights reserved.

doi:10.1016/j.jnca.2011.05.007

* Corresponding author.

E-mail addresses: [email protected] (S. Fernandes), [email protected] (J. Kelner), [email protected] (D. Sadok).

Abstract

Providing perceptually good quality video streaming over today's Internet is a complex task, as the available bandwidth and the encoded video rate can exhibit significant variability at different time scales. This research investigates the ways that such mismatches can be addressed, suggesting that this can be done by minimizing quality variability and increasing the overall video quality rendered to end-systems. Towards this end, we formulate a combination of innovative congestion control-aware mechanisms and filtering techniques. Our proposal involves extracting explicit information from the network, as well as providing the video source with consistent and stable information. We investigate this proposal throughout the following pages and demonstrate its efficiency by simulating representative scenarios.

© 2011 Elsevier Ltd. All rights reserved.

1. Introduction

The Internet has evolved considerably to embrace applications such as video and audio streaming, peer-to-peer systems, and IP telephony. While audio streaming and IP telephony are highly sensitive to delay despite their small bandwidth requirements, video applications usually require the timely transfer of much larger, bandwidth-hungry objects. Therefore, the deployment of scalable video streaming services with guaranteed end-user quality remains a challenge, even with the increasing presence of broadband local loop access technologies, such as DSL and Cable. Triple Play (3P) services have been widely deployed in Europe, Asia, and North America. 3P bundles telephony, IP TV, and multimedia Internet access over the widely used copper platform. In addition, recent efforts from the MPEG ISO/IEC working group have led to important achievements, including the design of the MPEG-21 Multimedia Framework (MF) (Bormans et al., 2003; Burnett et al., 2003), MPEG-2 (the standard for Digital Television set-top boxes and DVDs), MPEG-4 (the standard for multimedia for the fixed and mobile web), and MPEG-7 (the standard for describing and searching for audio and visual content).

Users are often required to choose between a few available bitrates for video streams, according to their Internet access capacity. Recent measurements show that most Internet streaming video is offered at bitrates of around 300 kbps (Wang et al., 2004; Saxena et al., 2008), which resembles peer-to-peer video service rates (De Cicco et al., 2008). Hence, this bit rate is well below that required for DVD-quality images and sound. In such a scenario, one could foresee an impending collapse if the number of video service subscribers increases faster than the overall network capacity growth rate. To make matters worse, little effort is made by applications to control their sending rates (Nichols and Claypool, 2004). A dire consequence of this is that the quality of end-user perception becomes highly dependent on network traffic fluctuations, often resulting in a sequence of cyclical playout and buffering.

In summary, it becomes necessary to ensure that the video source sending rate adapts to network conditions at a fine-grained level. Rate adaptation can be achieved according to two distinct approaches. First, one may rely on codec adaptation to match variable network capacity in an end-to-end scope (de Cuetos and Ross, 2002; Horn et al., 1999; Rejaie et al., 1999; Weber and de Veciana, 2003; Zhang et al., 2001). As stated by Lakshman et al. (1998), an adaptive encoder that satisfies network traffic constraints may eventually achieve the best possible decoded video quality, as it uses network state observation to minimize losses. Major drawbacks of this strategy include the lack of both good bandwidth estimators and support for explicit congestion feedback information. Second, in addition to end-system congestion control mechanisms, network-based approaches rely on Active Queue Management (AQM), which requires network elements to interfere actively in the traffic to guarantee fairness between competing flows, as well as to prevent large delays and packet loss. A decade after the introduction of AQM (Floyd and Jacobson, 1993), recent studies have begun to focus on scalable AQM solutions with explicit feedback notification to enhance their results (Katabi and Falk, 2006; Kuzmanovic, 2005; Welzl, 2005; Xia et al., 2005). Like other services, fairness among competing flows is seen as an important design requirement. In this paper, we argue that multimedia applications should take advantage of explicit or proactive network information in order to apply fine-grained coding or rate-control procedures precisely.

Traditionally, when a server application tries to minimize quality variability, it does so by adding or dropping video layers (or fractions of layers). However, frequently adding and dropping layers can be annoying to a user and certainly degrades the perceptual video quality. Furthermore, current underlying transport protocols impose upper bounds on fairness, since they need to ensure max–min fairness (e.g., TCP friendliness) and network efficiency. The main contribution of this paper is the proposal of a general architecture for streaming scalable encoded video, which includes carefully selected adaptive and predictive functional blocks for efficient transmission.

We have organized the rest of this paper as follows. We begin by providing the necessary background and presenting related work in Section 2, followed by an overview of the proposed solution in Section 3. Section 4 presents the performance analysis of the proposed architecture. In Section 5, we summarize our work, discuss our main contributions, and point out directions for future work.

[Fig. 1. Combination of BL and EL: the stored video's Base Layer (BL) encoder rate X_BL(t) and Enhancement Layer (EL) encoder rate X_EL(t) are combined into the total encoder rate X(t).]

2. Background and related work

Researchers have recently been addressing the use of adaptation, smoothing, and prediction techniques as promising approaches to providing better quality or utilization of network resources (de Cuetos and Ross, 2002; de Cuetos et al., 2004; Duffield et al., 1998; Kim and Ammar, 2005; Lakshman et al., 1999; Krasic and Legare, 2008; Li et al., 2010; Wagner and Frossard, 2009; Wu et al., 2009). Salehi et al. (1998) propose a video smoothing technique where a video server sends video data ahead of schedule in order to minimize the variability of the transmitted bit rate. Overall, their findings indicate that optimal smoothing can result in a significant reduction in the network resources required for VBR video. Kim and Ammar (2005) proposed an alternative solution to the problem of accommodating the mismatch between the available bandwidth variability and the encoded video variability. Their focus was on quality adaptation algorithms for scalable, encoded, variable bit-rate video over the Internet. To this end, they developed a quality adaptation mechanism that maximizes perceptual video quality by minimizing quality variation, while at the same time increasing the usage of available bandwidth. de Cuetos and Ross (2002) and de Cuetos et al. (2004) investigated a solution for adaptive streaming that adapts to the short- and long-term variations in available bandwidth over a TCP-friendly connection. Using an approach similar to Lakshman et al.'s (1999), Duffield et al. (1998) proposed an adaptive smoothing algorithm for compressed video, called Smoothed Adaptive Video over Explicit rate networks (SAVE). SAVE has been shown to be capable of maintaining the quality of the video at acceptable levels, ensuring that the delay is within acceptable bounds, and achieving significant multiplexing gains. Like our approach, it uses explicit rate-based control mechanisms to transport compressed video. The main difference is that SAVE was developed to work in the context of ATM networks, as it adapts the encoder by modifying the quantization parameter, whereas our strategy uses stored and scalable video and does not perform adaptation within the encoder.

Two important remarks made in the aforementioned research strongly support our proposal. First, the combination of the explicit rate mechanism and the smoothing technique allows SAVE to achieve a higher multiplexing gain. Moreover, smoothing can also maintain a suitably selected rate, which, in our case, can match the average available bandwidth.

3. Overview of the proposed solution

Providing perceptually good quality streaming video over today's best-effort Internet remains a hard task, as its traffic profile exhibits variability at multiple time scales (Ribeiro et al., 2005). For small time-scale bandwidth fluctuations (on the order of a few milliseconds), a small playback buffer at the client side can provide some limited relief. For longer time-scale bandwidth fluctuations (on the order of a few seconds), the use of either multiple versions of the same video or layered-encoded video is seen as the way forward. However, encoded video itself can also exhibit significant rate variability at several time scales. Our main argument centers on the additional fact that even if the end-system knows the available bandwidth information precisely, server applications will need adaptive control policies. Such control policies ought to decide which additional portion should be streamed, e.g. whole layers in a multi-layer approach or a fraction in a Fine-Grained Scalable approach (de Cuetos et al., 2004).

We have narrowed our focus to pre-stored video. Also, adaptation does not refer to encoder manipulation by modifying its quantization levels during compression. Note that our architecture may also be used in the context of live video streaming. We consider stored video encoded into two layers, namely a Base Layer (BL) and a Fine-grained Enhancement Layer (EL). Both BL and EL follow the MPEG-4 Fine-Grain Scalable (FGS) specification (Li, 2001; Radha et al., 2001). The combination of the BL and the EL has a VBR profile. We denote the BL encoded rate as X_BL(t) and the EL encoded rate as X_EL(t). We also denote the encoded rate of the combination as X(t) = X_EL(t) + X_BL(t).
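As a rough illustration of how X(t) relates to the two layers, the sketch below picks the FGS EL fraction that, together with the full BL, fits a bandwidth target. The helper names and the min/spare computation are our assumptions for illustration; the paper does not prescribe this exact calculation.

```python
def el_fraction(x_bl: float, x_el: float, predicted_bw: float) -> float:
    """Fraction of the EL that fits alongside the full BL.

    x_bl, x_el: encoded rates X_BL(t) and X_EL(t) (bps).
    predicted_bw: predicted available bandwidth (bps).
    Assumes the bottleneck always carries the BL (the paper's second assumption).
    """
    spare = predicted_bw - x_bl          # bandwidth left over after the BL
    if spare <= 0:
        return 0.0                       # send the BL only
    return min(1.0, spare / x_el)        # FGS: any fraction of the EL is decodable


def total_rate(x_bl: float, x_el: float, predicted_bw: float) -> float:
    """Aggregate sending rate X(t) = X_BL(t) + fraction * X_EL(t)."""
    return x_bl + el_fraction(x_bl, x_el, predicted_bw) * x_el
```

With a 300 kbps BL and a 500 kbps EL, a 550 kbps prediction yields half the EL and an aggregate rate matching the prediction.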

We introduce the following key assumptions. The first is that the available bandwidth exhibits multiple time-scale variability. The end-to-end network path is reliable and provides each flow with its fair share by relying on transport protocols with explicit feedback notification. In this paper, we use the Explicit Control Protocol (XCP) (Katabi and Falk, 2006). The second assumption is that the network bottleneck has enough capacity for transporting the BL flawlessly and that losses may only occur due to missed play-out deadlines. The buffering capacity at the receiving end allows occasional selective retransmissions; most importantly, it allows the receiver to neglect small transmission delays in the network path by absorbing short time-scale available bandwidth variability. Finally, a server always sends the BL and a portion of the EL altogether. Figure 1 shows how the server merges both BL and EL of the stored video and feeds the aggregation into the network.

We have built a solution that cushions the information provided by the transport protocols in order to achieve as smooth a perception level as possible. In a few words, we take into account the elevated impact of network conditions on continuous media streaming applications and vice versa. We rely on fine-grained information at the network and transport levels in order to provide precise feedback to multimedia applications, which enables them to smooth out and adjust their sending rates. Figure 2 presents an overview of our novel architecture for video streaming over best-effort networks. The streaming server unit is responsible for the aggregation of the BL and the EL, as it decides how much of the EL should be added to the BL based on the information that it receives from the adaptive-predictive (AP) unit. The AP unit, in turn, receives information about network status from the transport protocol. The AP unit consists of two other independent functional blocks (the Prediction and Decision Units).

3.1. Design principles

First, we look at how to extract relevant information from the lower levels (i.e. transport and network levels) and use it as part of application processing. Another crucial design decision concerns the time scale at which system components operate. There is a clear advantage in decoupling the time scale of interest to the encoder from that of the transport level. Due to its intrinsic characteristics, an encoder should not switch EL levels at the same time scale at which transport protocols adapt to network conditions. In other words, an application server should only change its quality level when it is safe to do so. By safe we mean that such a change will presumably have a positive impact on the perceived user quality. Therefore, we argue that a specially designed architectural component should be responsible for accumulating data from the transport levels and performing some adjustments before passing the right decisions on to the streaming server application, thus decoupling the time scale domains of the different components. The streaming server works at the Group of Pictures (GOP) time scale domain (i.e. on the order of a few seconds), whereas transport protocols work at the RTT domain (i.e. on the order of hundreds of milliseconds).
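The RTT-to-GOP time-scale decoupling can be sketched as a simple accumulator that collects per-RTT samples and releases one value per GOP window. The class name, the fixed window length, and the mean-based summary are illustrative assumptions, not part of the paper's specification.

```python
class GopAccumulator:
    """Accumulates per-RTT bandwidth samples; emits one GOP-scale estimate.

    Illustrative sketch of time-scale decoupling: decisions are only
    released once per GOP window, regardless of how often samples arrive.
    """

    def __init__(self, rtts_per_gop: int):
        self.rtts_per_gop = rtts_per_gop
        self.samples = []

    def push(self, bw_sample: float):
        """Called every RTT; returns a GOP-scale estimate once per window."""
        self.samples.append(bw_sample)
        if len(self.samples) < self.rtts_per_gop:
            return None                      # not yet safe to act
        gop_estimate = sum(self.samples) / len(self.samples)
        self.samples.clear()                 # start the next GOP window
        return gop_estimate
```

Feeding four per-RTT samples into a 4-RTT window yields three `None` results and then a single averaged estimate, which is the only value the GOP-scale components ever see.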

With regard to application server adaptability, we focus on how to manage state transitions (adaptability for the EL) to minimize quality variation. First, we evaluate the smoothed available rate using a low-pass filter. We develop this idea further in the next sections. Other possible auxiliary sources of information for the adaptation heuristics include the stochastic volatility and prediction error of the available bandwidth. A low value of these metrics means network stability and indicates that it is apparently safe to increase video quality using additional bits from the EL. Higher values signal instability and the need to maintain lower quality at the server as a measure to decrease variability. Finally, the streaming server simply feeds the network with the aggregated encoder allowed rate, which is the combination of BL and EL content frames.

[Fig. 2. Adaptive-predictive architecture for video streaming: live or stored video is encoded at the source and passed to the streaming server, whose sender buffer feeds a best-effort network toward the receiver's buffer and decoder; the Adaptive-Predictive unit mediates between the streaming server and a transport protocol with explicit network feedback.]

3.2. System components

Figure 3 gives details of the AP unit. Three major components form this unit, namely the Dynamic Low-Pass Filter (DLPF), the Prediction Unit (PU), and the Decision Unit (DU). Information flows from the transport level and then passes through the DLPF, PU, and DU before feeding into the streaming server. Please note that, on the one hand, there could be feedback going from the DU to the DLPF in order to adjust system parameters. On the other hand, the streaming server simply feeds the network with the aggregated encoder allowed rate, i.e. BL and EL content frames.

Figure 3 also emphasizes the time scale that each unit uses. The DLPF receives information about the available bandwidth every RTT and provides it to the PU. When necessary, the PU accumulates some estimates before making any predictions. An important consequence of our approach is its decoupling of the time scales used at the network and application levels. Our placement of the PU between the DLPF and the DU is, therefore, strategic. This way, its benefits are twofold: it provides information about prediction errors and decouples the two different time scales.



[Fig. 3. Adaptive-predictive unit: information from the transport level enters the Filter Unit (DLPF) at the RTT time scale, then passes through the Prediction Unit and the Decision Unit at the GoP time scale (time-scale decoupling) before reaching the streaming server; stochastic volatility (prediction error) is fed back from the Decision Unit to the DLPF.]

[Fig. 4. Filter unit (DLPF): input from the transport level is handled either by adaptive smoothing techniques (statistics-based low-pass filtering) or by Wavelet MRA (multiresolution analysis, wavelet coefficient denoising, and energy-based variability analysis, yielding approximation and detail series); a control decision block selects between the two using the stochastic volatility (prediction error) feedback, and both paths feed the PU.]


3.2.1. Filter unit: Dynamic Low-Pass Filtering (DLPF)

In this work, we apply low-pass filtering techniques to provide the PU with less variable bandwidth estimates in the form of a time series. Exponential smoothing methods have proved to be optimal for a very general class of state-space models, and their corresponding adaptive methods have proved to have trustworthy and improved forecast accuracy over non-adaptive smoothing (Taylor, 2004). In the multimedia streaming scope, the main reason for DLPF use comes from the fact that performance in explicit feedback networks remains heavily dependent on the variations of background traffic. This filter seeks to smooth out the stochastic behavior of network information over time. Please note that Shakkottai et al. (2001) and Karnik and Kumar (2005) have determined that when the RTT is relatively large and rate variations are fast, the end-system performance does not improve. Our design decision was to reduce the complexity of matching the VBR encoding rate to the VBR available bandwidth information. Therefore, we opted to filter only the high-frequency components of the available bandwidth signal, thus providing a smooth available rate to the PU. To put it more simply, we remove high rate changes in the available bandwidth information from the transport protocol. Figure 4 shows details of the DLPF.

In this context, smoothing can be seen as filtering, whereby highly dynamic time series are input, and short-term variations (or noise) are removed to reveal the essential intrinsic information, such as mean, trend, or seasonality. We work with the Exponentially Weighted Moving Average (EWMA), also known as exponential smoothing. Researchers argue that the smoothing parameter (also known as stepsize, learning rate, or gain) of EWMA techniques should vary over time, in order to adjust to the latest characteristics of the data. From now on, we will call such techniques the Dynamic Low-Pass Filter (DLPF). We implemented several algorithms for the choice of the smoothing parameter. The general form of a low-pass filter is given as

W_n = W_{n-1} − a_n(W_{n-1} − X_n) = (1 − a_n)W_{n-1} + a_n X_n

This is the general form of the low-pass filter (George and Powell, 2005), where W_n is the new estimate at time n, W_{n-1} is the previous estimate at time n−1, X_n is the current traffic sample, and a_n ∈ [0, 1] is the filter gain, stepsize, or smoothing parameter.

As a rule of thumb, the stochastic filter gain formulas, also called adaptive stepsize rules, react to the errors in the estimation with respect to the actual sample. Although our selection, implementation, and testing are by no means exhaustive, it is fair to say that the chosen algorithms may be seen as representative of adaptive simple exponential smoothing techniques (Gaivoronski, 1988; George and Powell, 2005; Kesten, 1958; Taylor, 2004; Trigg, 1964; Tukey, 1977; Whybark, 1972).
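As a concrete sketch of one such adaptive stepsize rule, the following implements a Trigg-style tracking-signal EWMA, where the gain a_n is the ratio of the smoothed error to the smoothed absolute error. The specific variant, the `phi` value, and the `eps` guard are our assumptions; the paper evaluates several rules from the works cited above.

```python
def adaptive_ewma(samples, phi=0.2, eps=1e-9):
    """Tracking-signal adaptive exponential smoothing (Trigg-style sketch).

    The gain a_n = |E_n / M_n| compares the smoothed error E_n against the
    smoothed absolute error M_n: persistent one-sided errors drive a_n
    toward 1 (track quickly), while zero-mean noise drives it toward 0.
    """
    w = samples[0]          # W_0: seed the estimate with the first sample
    e = m = 0.0             # smoothed error and smoothed absolute error
    out = [w]
    for x in samples[1:]:
        err = x - w
        e = (1 - phi) * e + phi * err
        m = (1 - phi) * m + phi * abs(err)
        a = abs(e) / (m + eps)            # adaptive gain a_n in [0, 1]
        w = (1 - a) * w + a * x           # W_n = (1 - a_n) W_{n-1} + a_n X_n
        out.append(w)
    return out
```

On a constant input the gain stays at zero and the estimate never moves; after a sustained level shift the tracking signal approaches one and the estimate converges to the new level within a few samples.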

If one is merely interested in extracting the main amplitude components at low frequencies, a simple low-pass filter is the appropriate solution. However, it fails to provide information about variability at different time scales. Therefore, any attempt to evaluate noise, along with trends, should rely on an approach that gathers information in both the time and scale (or frequency) domains. The use of Wavelet Multiresolution Analysis (MRA) meets such requirements. In our opinion, it is very important to look into the properties of the signal at time scales other than the original sampling time scale, which is, in this case, on the order of RTT measurements. Using the Wavelet MRA approach, we are able to identify several signal properties at the time scales of frames, Groups of Pictures (GOP), video scenes, or any other coarser time scale. We are not interested in time scales below the RTT, since we are not trying to investigate fractal or multifractal properties of the network traffic (Crovella and Bestavros, 1997). As a result, our architecture also integrates a Wavelet MRA component and a control decision function. The control decision function plays a strategic role with regard to the DLPF. It decides which filtering technique will be active at any given moment. For example, based on prediction errors from the PU, it may decide to apply a more complex and precise filtering approach by activating Wavelet MRA. The control decision function's usual behavior is simply to evaluate how long the prediction error has been above a given threshold. When the adaptive smoothing techniques are not capable of providing the PU with a less variable time series and high prediction error levels persist, the control decision switches to the Wavelet MRA filtering strategy.

The Wavelet MRA component performs noise filtering along with energy analysis, thus conveying both the denoised series (approximation) and the variability (energy) levels to the DU. It also applies soft-threshold denoising to the Wavelet coefficients of the approximation signal in order to obtain the cleanest signal trend. Meanwhile, the architecture processes the original signal in order to perform the energy-based variability evaluation. Although filtering with Wavelet MRA is more complex than adaptive smoothing, it produces very precise information about the mean available bandwidth. Therefore, we advocate that when the Wavelet MRA functional unit is in use, the DU prediction error threshold be set to a low level.
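A minimal pure-Python Haar decomposition illustrates the three roles of the Wavelet MRA block: multiresolution decomposition, soft-threshold denoising of coefficients, and energy-based variability per scale. The paper does not specify the mother wavelet or threshold rule, so Haar and the helpers below are illustrative stand-ins (signal length is assumed divisible by 2^levels).

```python
import math

def haar_decompose(signal, levels):
    """Haar MRA: returns (approximation, [detail_1 .. detail_levels]).

    Each level halves the series: averages (scaled by 1/sqrt(2)) form the
    next approximation, differences form that level's detail coefficients.
    """
    approx, details = list(signal), []
    for _ in range(levels):
        a = [(approx[2 * i] + approx[2 * i + 1]) / math.sqrt(2)
             for i in range(len(approx) // 2)]
        d = [(approx[2 * i] - approx[2 * i + 1]) / math.sqrt(2)
             for i in range(len(approx) // 2)]
        details.append(d)
        approx = a
    return approx, details

def soft_threshold(coeffs, t):
    """Soft-threshold denoising: shrink each coefficient toward zero by t."""
    return [math.copysign(max(abs(c) - t, 0.0), c) for c in coeffs]

def scale_energy(details):
    """Energy per time scale: the variability indicator conveyed to the DU."""
    return [sum(c * c for c in d) for d in details]
```

A perfectly constant series yields zero detail coefficients and zero energy at every scale, which in the architecture's terms would signal a stable network.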

3.2.2. Prediction unit

Unlike EWMA, wavelet filtering requires the collection of a number of samples before filtering. Additionally, for linear time series modeling, some approaches need more samples in order to carry out accurate parameterization through maximum likelihood optimization. We argue that when a highly smooth available bandwidth time series arrives from the network, the resulting smoothness from the DLPF will eventually be satisfactory for the control decision functional block. We refined our architecture by evaluating when it would be necessary to activate the Wavelet decomposition in the DLPF.

[Fig. 5. Functional description of the Prediction Unit: the denoised series from the DLPF undergoes linear time series modeling and analysis (ARIMA modeling, Box and Jenkins) and n-step-ahead prediction; the prediction errors (MAPE, MPE) and the stochastic volatility (prediction error) feedback go to the Decision Unit and back to the DLPF.]

Figure 5 shows our division of the PU functional block into two phases, namely the Linear Time Series Analysis and the Prediction Error Analysis. During the first phase, the denoised signal is received and submitted to the Box and Jenkins procedure to identify both the trend signal and the model order, parsimoniously (e.g., using a criterion such as the Bayesian Information Criterion). Next, a one-step-ahead prediction is performed. During the second phase, the PU evaluates the prediction error according to a given metric (e.g., the Mean Square Error (MSE)).

We propose that, at this point, the DU should not use only the prediction values from the PU. We have determined that it is wiser for it to use the one-step-ahead forecast value in conjunction with the prediction errors, in order to verify whether the PU is providing accurate values. By analyzing a given metric for evaluating prediction errors, the DU can make a decision about whether the network is in a steady state and whether it is possible to improve quality beyond its current level. If so, it can increase the number of bits of the stored MPEG-4 FGS video in the next GOP. Next, we present the details of the whole heuristic associated with the control decision performed by the DU.
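A full Box-and-Jenkins ARIMA fit is beyond a short sketch, so the following uses a deliberately minimal AR(1) stand-in, fitted by least squares, together with MAPE as the error metric, to show the shape of the one-step-ahead forecast plus error feedback that the PU hands to the DU. The AR(1) restriction and the helper names are our simplifications.

```python
def ar1_fit(series):
    """Least-squares AR(1) fit x_n ≈ c + phi * x_{n-1}; returns (c, phi)."""
    xs, ys = series[:-1], series[1:]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    phi = sxy / sxx if sxx else 0.0
    return my - phi * mx, phi

def one_step_ahead(series):
    """One-step-ahead forecast of the next available-bandwidth sample."""
    c, phi = ar1_fit(series)
    return c + phi * series[-1]

def mape(actuals, preds):
    """Mean Absolute Percentage Error: one possible PU error metric."""
    return sum(abs((a - p) / a) for a, p in zip(actuals, preds)) / len(actuals)
```

A low MAPE over recent forecasts would tell the DU the network is in a steady state and a quality increase is safe; a high MAPE would argue for holding the current level.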

3.2.3. Decision unit (DU)

At the heart of our architecture lies the decision unit. Recall that our streaming server works at the GOP time scale domain, whereas transport protocols work at the RTT domain. The DU focuses on how to manage EL adaptability by controlling its state transitions. The DU uses other auxiliary sources of information for the adaptation heuristics. Among them are the prediction errors of the available bandwidth and the energy-based variability level. Low values for these metrics imply network stability and that it is safe to increase quality using additional bits from the EL. In contrast, higher values point to instability and indicate that the streaming server should maintain its current transmission rate to ensure low variability. All information used in this functional block comes from the PU. Following the work of Balk et al. (2004) and Gotz and Mayer-Patel (2004), we describe the heuristic in terms of a Finite State Machine (FSM). As the encoder limits the maximum sending rate for the EL to R_EL, the streaming server is able to identify the number of bits in a GoP that are needed to generate a sending rate that matches the predicted available bandwidth.

Setting an appropriate set of levels for the grouping of the EL will certainly improve the final quality. With today's available technology, we argue that the difference in the sending rates for each layer should be approximately 200 kbps, although further study is still needed for an accurate specification.

Let EL = {ELi}, i = 1…10, be the set of states, each corresponding to a distinct MPEG-4 FGS EL encoded video level (R_EL), meaning the minimum available bandwidth requirement within the corresponding level. Let also FGI = {FGIi}, i = 1…10, be the Fine-Grained Increase rate states, and FGD = {FGDi}, i = 1…10, the Fine-Grained Decrease rate states.

Figure 6 presents the FSM of the proposed algorithm for the control decision functional block, the DU. The initial state EL0 corresponds to the sending rate of the Base Layer with the lowest quality available. We assume, without loss of generality, that in this state the available bandwidth in the network remains above the BL requirements. When the DU receives the set of information from the PU, namely the one-step-ahead predicted available bandwidth together with the prediction error and possibly the energy-based variability level, the system is able to initiate the transition to FGI0. The FIN state indicates that there is insufficient available bandwidth, so the system is not able to jump to the initial state EL0.

While in any one of the FGIn states, the system keeps track of a set of state parameters. Using the information from the PU that arrives at GOP time scales, the system first checks whether the available bandwidth estimate is equal to or greater than its actual level. Consider that the system starts in a given state, ELn. We first describe the case where the available bandwidth is less than the current sending rate. For this first GOP time window, we argue that such a shortage of network resources can be tolerated for a short period, and that the server and receiver buffers can handle it appropriately. In the next GOP time window, if the available bandwidth is still below the current level, the system switches to the FGD state, where it stays until the next GOP time window. In the case of a persistent lack of bandwidth resources, the system finally switches from FGD to the ELn−1 state. Note that the DU can take up to three GOP time windows to switch to a lower state.

In the following paragraphs, we describe the case where the available bandwidth exceeds the actual ELn requirement. The DU should identify whether the available bandwidth is large enough to support switching to the ELn+1 level. If not, the DU stays in state ELn. If the network can accommodate the minimum sending rate for the ELn+1 level, the DU should first verify how stable the information coming from the PU is. To do this, it checks whether both the energy-based variability level and the prediction error are below their given thresholds. If so, the system can safely switch to ELn+1. Otherwise, if one of these variables

Fig. 6. Finite state machine for the control decision functional block (states EL0…ELn, FGI0…FGIn, FGD, and FIN).

is above the threshold, the network may be subject to the following conditions:

1. In the case of a large prediction error, non-stationary behavior may exist in the available bandwidth series, with a long-term increase or decrease of the mean value. In such a case, the linear model would have been unable to obtain an accurate forecast, leading to a sudden increase in the prediction error.

2. In the case where the variability level is above the given threshold, the network exhibits high variability.

In both cases, the DU should switch neither to the ELn+1 state nor to FGIn. This way, the DU does not react aggressively to sporadic excesses of available bandwidth on the network path. Intuitively, it will only switch to the ELn+1 state in a scenario of low variability and a sustained increase of network resources.
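The DU heuristic described above can be sketched roughly as follows. State and variable names are ours, the 200 kbps level spacing follows the earlier discussion, and we read the stability test as requiring both the prediction error and the variability level to be below their thresholds before an up-switch:

```python
EL_STEP = 200  # per-level rate spacing in kbps, as suggested in the text

class DecisionUnit:
    """Hypothetical sketch of the DU state machine (not the paper's code)."""

    def __init__(self, base_rate=400, levels=10):
        self.level = 0            # index into EL_0 .. EL_{levels-1}
        self.levels = levels
        self.base = base_rate     # assumed BL rate in kbps
        self.phase = "EL"         # "EL", "FGD" (shortage), or "FGI"
        self.short_windows = 0    # consecutive GOP windows of shortage

    def rate(self, level=None):
        lv = self.level if level is None else level
        return self.base + lv * EL_STEP

    def step(self, predicted_bw, pred_error, variability,
             err_thresh=0.10, var_thresh=0.10):
        """One GOP time window; returns the (possibly new) EL level."""
        if predicted_bw < self.rate():
            self.short_windows += 1
            if self.short_windows == 2:
                self.phase = "FGD"          # second shortage window: enter FGD
            elif self.short_windows >= 3 and self.level > 0:
                self.level -= 1             # third window: drop one EL level
                self.phase, self.short_windows = "EL", 0
            return self.level
        self.short_windows, self.phase = 0, "EL"
        # Up-switch only under stability: both signals below their thresholds.
        if (self.level + 1 < self.levels
                and predicted_bw >= self.rate(self.level + 1)
                and pred_error < err_thresh and variability < var_thresh):
            self.level += 1
            self.phase = "FGI"
        return self.level
```

Note how a persistent shortage takes up to three GOP windows to trigger a down-switch, matching the description above, while a single window of excess bandwidth with a high prediction error or high variability leaves the level unchanged.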

4. Performance analysis

The FGS encoding design tries to cover a wide range of bandwidths while maintaining a simple, scalable structure (Li, 2001; Radha et al., 2001; Seeling et al., 2004, 2005). Its EL stream can be truncated anywhere, at the granularity of bits within each frame, before transmission, thus providing partial enhancement proportional to the number of bits decoded at the frame level. In this section, we use the framework and guidelines for evaluating the streaming of FGS video with rate-distortion traces provided by Seeling et al. (2004, 2005). We use the Peak Signal to Noise Ratio (PSNR) as the objective metric for our numerical studies. We also use the rate-distortion traces of the videos from de Cuetos et al. (2004). We have chosen movies with a variety of characteristics, namely a thriller (The Firm), science fiction (Star Wars), a TV show (Oprah), and a cartoon (Toy Story).
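For reference, PSNR is derived from the mean squared error between the original and reconstructed frames; the helper below uses the standard definition for 8-bit video (peak value 255) and is not code from the evaluation framework itself:

```python
import math

def psnr(mse, peak=255.0):
    """Peak Signal-to-Noise Ratio in dB for a given mean squared error."""
    if mse == 0:
        return float("inf")  # identical frames: distortion-free
    return 10.0 * math.log10(peak * peak / mse)
```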

To undertake a proper quantitative evaluation, we first grouped frames into Groups of Pictures (GOPs) and then used GOP-based metrics to evaluate the performance of all system components. Let $Q_n$, $n = 1\ldots N$, be the quality of the nth received GOP. The mean and the sample variance of the GOP quality are calculated as follows:

$$\bar{Q} = \frac{1}{N}\sum_{n=1}^{N} Q_n$$

is the mean quality, and

$$s_Q^2 = \frac{1}{N-1}\sum_{n=1}^{N}\left(Q_n - \bar{Q}\right)^2$$

is the sample variance. The most important metric in our context is the coefficient of variation, which is calculated as follows:

$$\mathrm{CoV}_Q = \frac{s_Q}{\bar{Q}}$$
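These three GOP-level metrics can be computed directly from their definitions; a minimal sketch (helper name is ours):

```python
def gop_quality_stats(q):
    """Mean, sample variance, and coefficient of variation of GOP qualities."""
    n = len(q)
    mean = sum(q) / n
    var = sum((x - mean) ** 2 for x in q) / (n - 1)  # sample variance
    cov = (var ** 0.5) / mean                        # coefficient of variation
    return mean, var, cov
```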

4.1. Simulation scenario, configuration, and parameterization

The simulation scenario is based on a traditional dumbbell profile, which consists of a video server and a client exchanging data through a bottleneck router (we do not present the figure here due to lack of space). In order to evaluate our architecture in highly dynamic networks, we generated self-similar background traffic with five different shifts in level. Such a procedure allowed


us to compare the benefits of applying filtering algorithms with the mere use of the information provided by the underlying transport protocol. We therefore evaluated the received quality for two approaches: DLPF, with several rate adaptation algorithms, and ABR, with no filtering applied. We use the label "gaivo" to refer to Gaivoronski's adaptive stepsize rule.
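As a simplified illustration of this workload (the actual simulations superimpose self-similar traffic, which this sketch omits), the background load can be modeled as a piecewise-constant profile with five level shifts, from which the video flow sees the residual capacity; both helper names are ours:

```python
def level_shift_profile(n_samples, levels_kbps):
    """Piecewise-constant background load: one epoch per level shift."""
    per = n_samples // len(levels_kbps)
    prof = []
    for lv in levels_kbps:
        prof.extend([float(lv)] * per)
    return prof

def available_bandwidth(capacity_kbps, background):
    """Residual capacity seen by the video flow at each sample."""
    return [max(0.0, capacity_kbps - b) for b in background]
```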

4.1.1. Preliminary results

Figure 7 presents the Boxplot for the mean quality and Fig. 8 shows the coefficient of variation for a movie trace when using the Trigg & Leach approach in the DLPF. In Fig. 7, we see no statistically significant difference in the mean, since the notches in both Boxplots (i.e., the confidence intervals) overlap. This alone is hardly a reason for not deploying the DLPF: recall that we are seeking less variability while maintaining acceptable quality levels. To this end, we present the notched Boxplots for the coefficient of variation in Fig. 8. For this particular scenario, that is, the Trigg & Leach implementation in the DLPF unit and the movie trace "The Firm", this first-level deployment provides less variability in video quality, with a statistically significant difference of about 6%. Further

Fig. 7. Mean quality: ABR vs. DLPF (Trigg & Leach)—The Firm.

Fig. 8. Coefficient of variation: ABR vs. DLPF (Trigg & Leach)—The Firm.

investigation is needed to verify whether other algorithms could provide similar performance levels. We also need to examine movie traces with different characteristics.

Figures 9–11 summarize the simulations for most DLPF algorithms when using the three movie traces "Star Wars", "Oprah", and "Toy Story", respectively. In Fig. 9, we explore the performance of our architecture with the deployment of several DLPF algorithms when streaming the movie "Star Wars". It is no surprise that, in terms of the coefficient of variation, all DLPF algorithms outperform the ABR approach. The difference in the CoV is roughly 15% for Kesten, 10% for Gaivoronski and Trigg & Leach, and 8% for the Whybark and Tukey rules. Figure 10 presents the simulation results of several DLPF algorithms when streaming the movie trace "Oprah". Here, only the Kesten rule achieved the same CoV level as the ABR approach. For the rest of the evaluated filtering algorithms, the difference in the CoV is roughly 12% (Gaivoronski, Trigg & Leach, Whybark, and Tukey). We should emphasize that Kesten's rule appears to be highly unstable, since it performs best with some movie traces but worst with others. Similarly, with the streaming of the cartoon-type movie "Toy Story", all DLPF algorithms

Fig. 9. Performance comparison of DLPF algorithms—Star Wars.

Fig. 10. Performance comparison of DLPF algorithms—Oprah.


Fig. 12. Prediction error (MAPE) evaluation after filtering: overall result.

Fig. 13. Prediction error (MPE) evaluation after filtering: overall result.

Fig. 11. Performance comparison of DLPF algorithms—Toy Story.

Fig. 14. Notched Boxplots—standard deviation: ABR vs. DLPF+PU(Tukey)+DU(MAPE, 10%)—The Firm.


outperformed the ABR approach, as can be seen in Fig. 11. The difference in the CoV is roughly in the range of 7–8% for the Kesten, Gaivoronski, Trigg & Leach, Whybark, and Tukey rules.

With the introduction of the new functionalities, including the wavelet MRA, the Prediction Unit, and the Decision Unit, it is imperative that we conduct a new set of experiments in order to verify their correctness and validate our architecture. As the careful reader may have noticed, the incorporation of the PU requires setting a threshold value for the proper behavior of the DU, as described in the last section.

In fact, we are trying to reveal any persistent variability in the available bandwidth estimates. In time series analysis, it is common to observe and evaluate several metrics for the prediction error. The decision concerning which metric is more appropriate essentially depends on the application and the type of data. A normal procedure is to periodically evaluate some error measures, i.e., statistical measures of the goodness of fit. Typical forecasting errors include the Mean Absolute Percentage Error (MAPE) and the Mean Percentage Error (MPE). Such metrics match our purposes, since we can evaluate the prediction error by

inspecting the percentage error in the predictions. In particular, MPE indicates whether the forecasts are positively or negatively biased, whereas MAPE does not consider biases. Both measures roughly hint at the persistence of variability on the network. Having selected MAPE and MPE for prediction error measurement, we next face the problem of defining the appropriate threshold level for the control decision in the DU. Figures 12 and 13 present the overall prediction errors using MAPE and MPE, respectively, after processing 108 000 samples from the network trace. All metrics were measured after selecting some filtering strategies (e.g., Trigg & Leach), followed by the forecasting procedures in the PU. In general, most results for MAPE are below the 10% limit, whereas for MPE we observe errors in the range between −4% and 4%. Although these results suggest that the overall prediction errors remained below some well-defined threshold, we should analyze their behavior with respect to the dynamic changes of the network. For the following studies,


we utilize MAPE for prediction error measurement and have set a threshold level of 10% for the DU control decision.
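Both error measures are straightforward to compute over a window of one-step-ahead forecasts; the helpers below follow the standard definitions (returning fractions rather than percentages; function names are ours):

```python
def mape(actual, predicted):
    """Mean Absolute Percentage Error (magnitude of relative errors)."""
    return sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

def mpe(actual, predicted):
    """Mean Percentage Error (signed, so it exposes forecast bias)."""
    return sum((a - p) / a for a, p in zip(actual, predicted)) / len(actual)
```

Note how symmetric over- and under-forecasts cancel in MPE (signaling no bias) while still contributing to MAPE, which is why we use MAPE for the DU threshold.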

4.2. Extended results: DU with MAPE threshold at 10%

Figures 14–19 show the notched Boxplots for the expected standard deviation of six distinct filter implementations (Tukey, Trigg, STES, FIR, Dennis, Whybark) when the streaming server transmits the movie "The Firm".

The standard deviation for ABR in all simulations presents an unacceptable level of 2 dB. This high variability level is often reflected in video distortions such as flicker. All results achieved when the whole architecture is active exhibited statistically significantly less variability, below 1 dB, hence confirming the reception of a less distorted signal.

Figures 20–22 show the results of the previous simulations from a different perspective: we compute the number of state changes for several parameterizations (filter choices) in the DLPF.

Fig. 15. Notched Boxplots—standard deviation: ABR vs. DLPF+PU(Trigg)+DU(MAPE, 10%)—The Firm.

Fig. 16. Notched Boxplots—standard deviation: ABR vs. DLPF+PU(STES)+DU(MAPE, 10%)—The Firm.

Fig. 17. Notched Boxplots—standard deviation: ABR vs. DLPF+PU(FIR)+DU(MAPE, 10%)—The Firm.

Fig. 18. Notched Boxplots—standard deviation: ABR vs. DLPF+PU(Dennis)+DU(MAPE, 10%)—The Firm.

In addition, we define different threshold levels for the mean absolute percentage error at which the DU makes decisions about changing states. The x-axis represents the MAPE threshold, meaning that the DU makes state change decisions (from EL to FGD or FGI, and vice versa) only if the prediction error is below that threshold. As expected, as we increase the threshold level, we increase the tolerance for large prediction errors, thus increasing the number of changes. On the other hand, a low value for the error measurement implies that the system will never change from its current state. This could prevent an improvement in video quality even if there were plenty of network resources.

At first glance, the number of changes in states seems very high. Recall that all video traces used in this study have 108 000 frames and 9000 GoPs. Figure 20 shows that, for a 50% tolerance in error prediction for the decision control, the number of per-frame changes is around 2000 for all DLPF algorithms. In other words, less than 2% of frames exhibited changes in quality. Although not shown here, the GoP-based evaluation follows the

Fig. 19. Notched Boxplots—standard deviation: ABR vs. DLPF+PU(Whybark)+DU(MAPE, 10%)—The Firm.

Fig. 20. Number of changes in states (200 kbps, frame-based) according to PU thresholds (1–50%), for the FIR, Tukey, Dennis, STES, Whybark, and Trigg filters.

Fig. 21. Bar plot: number of changes in states (200 kbps, frame-based) according to PU thresholds, per DLPF technique.

Fig. 22. Number of changes in states (log scale, frame-based) according to PU thresholds, including the ABR case.


same pattern. Figure 21 shows a bar plot for the same set of simulation results.

It is also important to compare these results with the number of changes that occur when the streaming server relies only on the raw information coming from the transport protocol (i.e., the ABR case). Figure 22 shows the same simulation results as those in Fig. 20, but on a logarithmic scale. The straight line represents the ABR case. Note that the deployment of our proposed architecture reduces the number of changes by one to three orders of magnitude. These results point to promising benefits from using our architecture in network scenarios with high variability, such as some access or wireless networks.
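Counting layer changes from a per-frame (or per-GoP) level trace, as done for Figs. 20–22, reduces to counting transitions between consecutive samples; a trivial sketch with our own helper name:

```python
def count_changes(levels):
    """Number of quality-level transitions in a per-frame level trace."""
    return sum(1 for a, b in zip(levels, levels[1:]) if a != b)
```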

Comparisons with similar strategies are somewhat difficult, since our approach requires explicit feedback notification from the layers below the application level (transport/network), with XCP being the main candidate for providing such data. Although SAVE (Duffield et al., 1998) has roughly the same network-level requirements as ours, it was designed in the context of ATM networks. In that case, given a delay bound and a maximum frame size at the scene level, SAVE's smoothing and rate adaptation algorithm requests a rate from the ATM network in order to allocate sufficient bandwidth to transport the video flow. It is therefore necessary to adapt SAVE's approach to our scenario so that we can perform a fair comparison of results. The main problem lies in the fact that, in ATM networks with explicit rate-based control mechanisms, a host must signal the underlying network in order to obtain its desired transmission rate. In best-effort networks with XCP as the reliable transport protocol, the process associated with the video application instead receives information from XCP about the allowed sending rate for a given time frame. Adapting the SAVE mechanisms to our scenarios (i.e., best-effort networks with XCP) requires some modifications to the original SAVE approach. Mainly, instead of calculating the next rate to be requested from the ATM network, given the video characteristics (e.g., maximum frame size in a scene) and a delay bound (for preserving source and destination buffer resources), SAVE now receives the allowed rate from the network through XCP. It must then calculate the impact on the delays and, consequently, on the amount of memory necessary at the source buffers to accommodate the differences between the encoded rate and the allowed bandwidth. In terms of QoE, SAVE can maintain a fairly steady quality throughout the whole video, but at the expense of large delay variations, which imply large buffer requirements. We analyzed delay variations for different video characteristics


and available bandwidth dynamics and noticed that average delays stay at RTT scales. However, maximum delays can sometimes reach tens of seconds, which would require large buffer spaces to ensure zero frame losses. Therefore, compared to SAVE, our approach has the advantage of maintaining good QoE while keeping buffer requirements low.
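The buffer argument above can be made concrete: when the encoded rate exceeds the allowed rate in a window, the excess queues at the sender, and the peak backlog bounds the buffer space needed for zero frame loss. A hypothetical sketch (our own helper name, per-GOP data volumes assumed):

```python
def peak_backlog(encoded_kbits, allowed_kbits):
    """Peak sender-side backlog (kbits) given per-GOP encoded vs. allowed volumes."""
    backlog = peak = 0.0
    for enc, allow in zip(encoded_kbits, allowed_kbits):
        # Excess encoded data queues; surplus allowance drains the queue.
        backlog = max(0.0, backlog + enc - allow)
        peak = max(peak, backlog)
    return peak
```

Dividing the peak backlog by the allowed rate gives the corresponding worst-case queuing delay, which is the quantity that reaches tens of seconds in the SAVE-style adaptation.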

5. Concluding remarks

In this paper, we designed and evaluated a proposal for streaming scalable pre-stored video over networks with explicit feedback notification. We pointed out that the performance of applications in explicit feedback networks depends on variations in the background traffic. Therefore, under such conditions, smoothing the stochastic behavior of the network state over time emerged as a promising approach. We proposed the use of adaptive low-pass filtering for smoothing the raw traffic information (i.e., the available bandwidth) flowing from the network. We relied on recent proposals for congestion control and scalable encoded video to design a novel architecture for multimedia streaming on the Internet. Many technical challenges facing the wide deployment of video services over the Internet were presented. Scalability is a crucial factor that can determine whether good video quality is delivered over unpredictable, highly dynamic networks. To circumvent some current drawbacks, we propose making the streaming server adapt to network conditions in a fine-grained manner. This is possible by making use of both flexible coding standards and underlying congestion control mechanisms with explicit feedback notification.

We envisaged a mechanism that takes into account the volatility of the available rate in order to determine the target output sending rate. We discussed the possibility of smoothing the information received from the transport layer before making any decision concerning the sending rate. With all these arguments at hand, we advocate that any adaptive multimedia streaming system should rely on explicit feedback notification from the network, in order to provide the streaming media with both uninterrupted transport services and low quality variation. We thereby took a further step in the analysis and found a solution to the problem of accommodating the mismatch between the available bandwidth variability and the encoded video rate variability.

We also undertook a careful analysis and provided a solution to the problem of decision control for streaming fractions of stored, scalable video in networks with explicit feedback. We explored all the important design decisions behind our architecture and provided details about the Prediction Unit as well as the Decision Unit. First, we presented the motivations for using Wavelet Multiresolution Analysis (MRA) and explained how this technique was deployed within the architecture (specifically in the DLPF). Using wavelet MRA, we were able to extract the mean available bandwidth precisely, along with the noise energy, at multiple time scales. Within the PU, we modeled the main signal with a parsimonious linear time series model, namely ARIMA, which forecasted changes in the signal, and we evaluated its prediction error. Using the samples from the DLPF, we were able to perform one-step-ahead forecasting and to analyze these signals in the wavelet domain, in order to extract the essential signal energy as well as the noise energy. In addition, we proposed an algorithm for the Decision Unit (DU), which heuristically uses either the prediction error measurement or the noise energy. To investigate our proposed architecture further, we carried out an extensive simulation-based performance analysis and showed that the degree of dispersion of the mean quality for ABR presents an unacceptable level close to 2 dB. On the other hand, all results from simulations done when the whole

architecture was active demonstrated statistically significantly less variability.

As far as integrating our proposed architecture into existing multimedia frameworks is concerned, it is clear that it can easily fit into any open framework with well-defined application programming interfaces. In fact, GStreamer (2010) and VLC (2010) have all the necessary features (e.g., the capability to dynamically load modules) for a straightforward integration of our architecture.

Our novel architecture opens up a broad avenue for future work. Prospective topics for further research include extensions in the fields of analytical modeling, application-layer scheduling and prioritization, cross-layer design, and performance analysis over wireless networks. Real tests on an experimental testbed are also under consideration.

References

Balk A, Gerla M, Maggiorini D, Sanadidi M. Adaptive video streaming: pre-encoded MPEG-4 with bandwidth scaling. Computer Networks 2004;44:415–39.
Bormans J, Gelissen J, Perkis A. MPEG-21: the 21st century multimedia framework. Signal Processing Magazine, IEEE 2003;20:53–62.
Burnett I, Van de Walle R, Hill K, Bormans J, Pereira F. MPEG-21: goals and achievements. Multimedia, IEEE 2003;10:60–70.
Crovella ME, Bestavros A. Self-similarity in world wide web traffic: evidence and possible causes. IEEE/ACM Transactions on Networking 1997;5(6).
de Cuetos P, Ross KW. Adaptive rate control for streaming stored fine-grained scalable video. In: Proceedings of the 12th ACM NOSSDAV. Miami, Florida, USA; 2002.
de Cuetos P, Reisslein M, Ross K. Evaluating the streaming of FGS-encoded video with rate-distortion traces. Institut Eurecom technical report RR-03-078; June 2004.
De Cicco L, Mascolo S, Palmisano V. Skype video responsiveness to bandwidth variations. In: Proceedings of the 18th international workshop on network and operating systems support for digital audio and video (NOSSDAV '08), Braunschweig, Germany, May 28–30, 2008. ACM, New York, NY; 2008. p. 81–6.
Duffield NG, Ramakrishnan KK, Reibman AR. SAVE: an algorithm for smoothed adaptive video over explicit rate networks. IEEE/ACM Transactions on Networking 1998;6(6):717–28.
Floyd S, Jacobson V. Random early detection gateways for congestion avoidance. IEEE/ACM Transactions on Networking 1993;1(4):397–413.
Gaivoronski A. Implementation of stochastic quasigradient methods. In: Ermoliev Yu, Wets RJ-B, editors. Numerical techniques for stochastic optimization. Berlin: Springer-Verlag; 1988. p. 313–52.
George A, Powell WB. Adaptive stepsizes for recursive estimation with applications in approximate dynamic programming. Technical report. CASTLE Laboratory, Department of Operations Research, Princeton University; January 24, 2005.
Gotz D, Mayer-Patel K. A general framework for multidimensional adaptation. In: Proceedings of the 12th annual ACM international conference on multimedia. New York, USA; 2004. p. 612–9.
GStreamer—Open Source Multimedia Framework. http://gstreamer.freedesktop.org/ [accessed February 2010].
Horn U, Stuhlmuller K, Link M, Girod B. Robust internet video transmission based on scalable coding and unequal error protection. Signal Processing: Image Communication 1999;15:77–94.
Karnik A, Kumar A. Performance of TCP congestion control with explicit rate feedback. IEEE/ACM Transactions on Networking 2005;13(1).
Katabi D, Falk A. Specification for the explicit control protocol (XCP). Work in progress, Network Working Group Internet-Draft; expires April 16, 2006.
Kesten H. Accelerated stochastic approximation. Annals of Mathematical Statistics 1958;29:41–59.
Kim T, Ammar MH. Optimal quality adaptation for scalable encoded video. IEEE Journal on Selected Areas in Communications 2005;23(2).
Krasic C, Legare J-S. Interactivity and scalability enhancements for quality-adaptive streaming. In: Proceedings of the 16th ACM international conference on multimedia (MM '08). ACM, New York, NY, USA; 2008. p. 753–6.
Kuzmanovic A. The power of explicit congestion notification. In: Proceedings of SIGCOMM '05. Philadelphia, Pennsylvania, USA; August 21–26, 2005.
Lakshman TV, Ortega A, Reibman AR. VBR video: tradeoffs and potentials. Proceedings of the IEEE 1998;86(5):952–73.
Lakshman TV, Mishra PP, Ramakrishnan KK. Transporting compressed video over ATM networks with explicit-rate feedback control. IEEE/ACM Transactions on Networking 1999;7(5):710–23.
Li W. Overview of fine granularity scalability in MPEG-4 video standard. IEEE Transactions on Circuits and Systems for Video Technology 2001;11(3).
Li Y, Li Z, Chiang M, Calderbank AR. Intelligent video network engineering with distributed optimization: two case studies. Intelligent Multimedia Communication: Techniques and Applications, Studies in Computational Intelligence 2010;280:253–90.
Nichols J, Claypool M. Measurements of the congestion responsiveness of Windows streaming media. In: Proceedings of the 14th ACM NOSSDAV. Cork, Ireland; 2004.
Radha HM, van der Schaar M, Chen Y. The MPEG-4 fine-grained scalable video coding method for multimedia streaming over IP. IEEE Transactions on Multimedia 2001;3(1):53–69.
Rejaie R, Estrin D, Handley M. Quality adaptation for congestion controlled video playback over the Internet. In: Proceedings of ACM SIGCOMM. Cambridge; 1999.
Ribeiro V, Zhang Z, Moon S, Diot C. Small-time scaling behavior of Internet backbone traffic. Computer Networks 2005;48(3):315–34.
Salehi JD, Zhang Z-L, Kurose J, Towsley D. Supporting stored video: reducing rate variability and end-to-end resource requirements through optimal smoothing. IEEE/ACM Transactions on Networking 1998;6:397–410.
Saxena M, Sharan U, Fahmy S. Analyzing video services in Web 2.0: a global perspective. In: Proceedings of the 18th international workshop on network and operating systems support for digital audio and video (NOSSDAV '08), Braunschweig, Germany, May 28–30, 2008. ACM, New York, NY; 2008. p. 39–44.
Seeling P, Reisslein M, Kulapala B. Network performance evaluation using frame size and quality traces of single-layer and two-layer video: a tutorial. IEEE Communications Surveys & Tutorials, 3rd quarter; 2004.
Seeling P, de Cuetos P, Reisslein M. Fine granularity scalable (FGS) video: implications for streaming and a trace-based evaluation methodology. IEEE Communications Magazine 2005;43(4):138–42.
Shakkottai S, Kumar A, Karnik A, Anvekar A. TCP performance over end-to-end rate control and stochastic available capacity. IEEE/ACM Transactions on Networking 2001;9(4):377–91.
Taylor JW. Smooth transition exponential smoothing. Journal of Forecasting 2004;23(6):385–404.
Trigg D. Monitoring a forecasting system. Operations Research Quarterly 1964;15(3):271–4.
Tukey JW. Exploratory data analysis. Reading, MA: Addison-Wesley; 1977.
VideoLAN projects. http://www.videolan.org/ [accessed February 2010].
Wagner J-P, Frossard P. Layer thickness in congestion-controlled scalable video. In: Multimedia Computing and Networking; 2009.
Wang B, Kurose J, Shenoy P, Towsley D. Multimedia streaming via TCP: an analytic performance study. In: Proceedings of SIGMETRICS '04. New York, USA; June 2004.
Weber S, de Veciana G. Network design for rate adaptive media streams. In: Proceedings of IEEE INFOCOM; March 2003.
Welzl M. Router aided congestion avoidance with scalable performance signalling. In: Proceedings of KiVS 2005. Kaiserslautern, Germany; March 2005.
Whybark DC. A comparison of the surprising conclusions cited by adaptive forecasting techniques. Logistics and Transportation Review 1972;8(3).
Wu D, Ci S, Luo H, Wang H, Katsaggelos A. A quality-driven decision engine for live video transmission under service-oriented architecture. Wireless Communications, IEEE 2009;16(4):48–54. doi:10.1109/MWC.2009.5281255.
Xia Y, Subramanian L, Stoica I, Kalyanaraman S. One more bit is enough. In: Proceedings of SIGCOMM '05. Philadelphia, USA; August 21–26, 2005.
Zhang Q, Zhu W, Zhang Y-Q. Resource allocation for multimedia streaming over the Internet. IEEE Transactions on Multimedia 2001;3.

