IEEE TRANSACTIONS ON MOBILE COMPUTING, TMC-2011-03-0150.R1 1

Quality Prediction-Based Dynamic Content Adaptation Framework Applied to Collaborative Mobile Presentations

Habib Louafi, Student Member, IEEE, Stephane Coulombe, Senior Member, IEEE, and Umesh Chandra

Abstract—Today, professional documents, created in applications such as PowerPoint and Word, can be shared using ubiquitous mobile terminals connected to the Internet. GoogleDocs and EasyMeet are good examples of such collaborative Web applications dedicated to professional documents. The static adaptation of professional documents has been studied extensively. Dynamic adaptation can be very useful and practical for interactive multimedia applications, because it allows the delivery of highly customized content to the end-user without the need to generate and store multiple transcoded versions. In this paper, we propose a dynamic framework that enables us to estimate transcoding parameters on the fly in order to generate near-optimal adapted content for each user. The framework is compared to current dynamic methods as well as to static adaptation solutions. We show that the proposed framework provides a better trade-off between quality and storage than other static and dynamic approaches. To quantify the quality of the adapted content, we introduce a measure of the quality of experience based on the visual quality of the adapted content, as well as on the impact of its total delivery time. The framework has been tested on (but is not limited to) OpenOffice Impress presentations.

Index Terms—Dynamic content adaptation, image transcoding, interactive multimedia applications, mobile device, OpenOffice, presentations, professional documents, SSIM, XHTML.

1 INTRODUCTION

TODAY, Web content can be accessed by PCs, as well as by a wide variety of mobile devices (mobile phones, smartphones, etc.) under many brand names, and with varying features. With the frequent introduction of new devices to the market [1], their number and diversity is constantly growing. There is also a great diversity among the communication networks used by these devices. Depending on its location, the same device could use different networks (e.g., Wi-Fi, GPRS, or UMTS). Such diversity has changed the logic behind Web content authoring, with the result that new patterns of Web content presentation have been designed [2]–[5]. At the same time, Web content is becoming richer, and is using various formats (XHTML, JPEG, GIF, SVG, Flash, etc.) and styles.

In a collaborative environment such as hosting a conference meeting comprising PCs and mobile devices, the presentation slides should be shared and presented synchronously to all participants. The obvious solution is to send the PowerPoint presentation to the participants before the meeting. However, many

• H. Louafi and S. Coulombe are with the Department of Software and IT Engineering, Ecole de technologie superieure, Universite du Quebec, 1100 Notre Dame Street West, Montreal, H3C 1K3, Quebec, Canada. E-mail: [email protected]; [email protected]

• U. Chandra is with the Nokia Research Center, 955 Page Mill Road, Palo Alto, CA 94304 USA. E-mail: [email protected]

mobile terminals will not support the PowerPoint format (or any other Office document format). Those that support the format still face the problem of downloading a potentially large document (often several MB). This operation is not only time-consuming, but drains the device's battery, and is often costly. Finally, the problem of synchronization with the host remains; participants lose track of which slide is being presented at any given time. This causes serious usability problems. Web technologies represent an attractive alternative: because they only require that the terminal be equipped with a browser (widely supported), they can customize the content to each participant and ensure constant synchronization with the presentation by sending each slide only when it is presented (slides are sent one by one).

In this paper, we focus on the customization or adaptation of slide documents for each participant. However, in a collaborative meeting context, it is very important to keep the original layout and all embedded images, i.e., the same view should be shared between all the meeting participants. A similar problem has been studied regarding the delivery of Web pages to be shared in a collaborative manner. This context is known as co-browsing or escorted browsing [6]. In this context, when it is not possible to keep the original layout, an extra view is added for PCs that reflects what mobile users are seeing. That way, PC users can refer to that extra view to collaborate with mobile users. In our approach, we propose to keep the original layout intentionally, since

Digital Object Identifier 10.1109/TMC.2012.173 1536-1233/12/$31.00 © 2012 IEEE

This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication.


in such meeting contexts, the content can be accessed by mobile devices as well as by PCs. In fact, the proposed solutions which completely change the content (shrinking it, converting it to text or audio, etc.) were dedicated to limited mobile devices. Nowadays, more sophisticated devices, such as the Apple iPhone, the Nokia Lumia, etc., have been introduced to the market. With these new devices, it is possible to preserve the original layout of the content and consequently deliver rich content that is visually identical to the original. The current trend is to deliver content that fits the target mobile device's resolution and allows the user to adapt the view by zooming and/or panning.

In recent years, we have witnessed the emergence of context-aware systems [7], which bridge the gap between a diversified and technologically limited clientele and richer Web content. Generally, a context-aware system involves several steps of content analysis and transcoding operations to tailor the content to meet the target mobile device's constraints, user preferences and role (e.g., attendee or host). Two major trends (static and dynamic) in content adaptation have been studied extensively, with a great deal of work done in the area of static content adaptation [8]–[10], where different versions of the original content are created and stored on the server. At runtime, when the content is requested, the best of those versions is selected to be delivered. In dynamic adaptation, a customized version is created on the fly, based on the contextual data gathered mainly at the user's request [5], [11], [12].

There are advantages and disadvantages to using either one of these two strategies. The static approach leads to high processing complexity in generating all the versions, and a great deal of storage space is required to save them. Therefore, to avoid long response times when content is requested, the processing is often performed offline. In this case, the issue of granularity becomes important, and determines the compromise between the quality of the delivered content and its associated processing complexity and available storage space [13]. The more versions available, the better the quality. With the dynamic strategy, the content adaptation is typically performed on the fly, once the terminal's context is known, while the end-user is waiting. In this case, the server could easily be overwhelmed when the number of requests becomes significant [11], [14]. In such a situation, the user might lose interest in the content, owing to an unreasonable wait time.

In both content adaptation strategies, selecting the right format is crucial as well. Current solutions, such as Nokia EasyMeet [15] and GoogleDocs Mobile [16], convert PowerPoint presentations into JPEG images that can be rendered by mobile Web browsers. However, raster formats such as these have major limitations in interactive applications, as they do not allow text editing or keyword searching. The XHTML format could be more suitable in these situations, since instead of converting the whole slide into an image, only the embedded images are adapted and the text is resized. In fact, making the dynamic choice between JPEG and XHTML is a challenging task, with the best choice depending on the rasterized resolution and the amount of text relative to the number of images on a slide.

In this paper, we propose a dynamic framework that enables us to perform an on-the-fly estimation of the near-optimal format and transcoding parameters prior to performing transcoding, in order to reduce computational complexity while improving the user experience. In this framework, we predict the visual quality of the adapted content and the amount of time it takes to reach the end-user (delivery time). The framework we propose has been applied to OpenOffice Impress presentations. It is designed to be quite general, but future work can be carried out to validate its applicability to other professional document types, such as Word (text) and Excel (spreadsheets).

The paper is organized as follows. We begin by stating the transcoding problem in section 2. In section 3, we show how the visual quality and the quality of the experience are evaluated. The experimental setup comparing the static and dynamic approaches is presented in section 4. The experimental results are presented in section 5. Section 6 presents the computational complexity of the proposed dynamic framework. Finally, section 7 concludes the paper.

2 PROBLEM STATEMENT

Let C be a professional document, referred to here as the original document or content, composed of a set of pages c_k made up of various components c_{k,i}. We can write this formally as follows:

C = {c_1, c_2, ..., c_n}, where n is the total number of pages in C.
c_k = {c_{k,1}, c_{k,2}, ..., c_{k,m(k)}}, where m(k) is the total number of components of the kth page.

For instance, C could be a PowerPoint presentation and c_k the kth slide, composed of various components c_{k,i}. Theoretically, a component can be any object. For instance, in a slide c_k composed of a text box and a JPEG image, c_{k,1} represents the text box and c_{k,2} represents the JPEG image.

Given a page c_k, let W(c_k) and H(c_k) be its width and height, in pixels, respectively.

For a page c_k, let h_{k,1}, h_{k,2}, ..., h_{k,m} be sets of characteristics that can be adjusted to adapt that page's components c_{k,1}, c_{k,2}, ..., c_{k,m} respectively. For example, for a JPEG image (represented by c_{k,2}) embedded in a presentation, we may have the set h_{k,2} = {resolution, quality factor}.

To be rendered by the target mobile device, the original document must be adapted. To achieve this, various adaptation operations can be used. Conceptually, different transcoding parameter combinations


can be used by the adaptation operations for each page. Let P be the set of possible transcoding parameters that can be used to adapt the original document's pages and their components, that is:

P = {f, z, QF}

where:

• f ∈ {JPEG, XHTML} is the output format into which the original page is transcoded,
• 0 < z ≤ 100% is a scaling factor that defines the output resolution of the adapted page, and
• 0 < QF ≤ 100 represents the quality factor of the output JPEG images on the adapted page.

These parameters are applied as follows:

• If, for a given page c_k, the selected output format is JPEG, the whole page is rasterized into a JPEG image and wrapped in an XHTML skeleton. As a result, the whole page is converted into a Web page that contains only one JPEG image. In this case, the parameters z and QF are used to create that JPEG image.

• If the selected output format is XHTML, the whole page is transformed into an XHTML file, which may include both text and images. As a result, the output XHTML file will contain the same number of components (text and images) as the original document. In this case, to preserve the initial intentions of the author of the original document, the same z is used for all the components of the original pages and the same QF for all the embedded JPEG images. By preserving the author's intentions, the adapted page will have the same layout (relative sizes and positions of embedded components) as the original one.

Our formulation of the problem is inspired by the work presented in [17]–[19]. Let D be the target mobile device and W(D), H(D), F(D), S(D), BR(D) and NL(D) be its maximum permissible image width, maximum image height, supported formats, maximum file size (in bits), and the bitrate and latency of the network in use, respectively.

We define T(c_k, f, z, QF) as the operation that transforms the original page c_k using the transcoding parameters f, z and QF. Let t_k^{f,z,QF} be the adapted content created by T(c_k, f, z, QF). We say that the adapted content t_k^{f,z,QF} is renderable by D if the following relations are true:

S(t_k^{f,z,QF}) ≤ S(D)
W(t_k^{f,z,QF}) = zW(c_k) ≤ W(D)
H(t_k^{f,z,QF}) = zH(c_k) ≤ H(D)
f ∈ F(D)

where S(t_k^{f,z,QF}), W(t_k^{f,z,QF}) and H(t_k^{f,z,QF}) are the file size, the width and the height of the adapted content generated by T, respectively.

The set of transcoding parameter combinations that can be used by T to transform the original content c_k into adapted contents that are renderable by D is given by:

R(c_k, D) = {(f, z, QF) | t_k^{f,z,QF} is renderable by D}

Since there could be multiple transcoding parameter combinations leading to versions renderable by D, we are interested in the combination that maximizes the quality of the user's experience. Let f*(c_k, D), z*(c_k, D) and QF*(c_k, D) be this combination, which is given by:

(f*(c_k, D), z*(c_k, D), QF*(c_k, D)) = argmax_{(f,z,QF) ∈ R(c_k,D)} Q_E(t_k^{f,z,QF}, D)    (1)

where Q_E is a function that evaluates the quality of the user's experience. When (1) returns more than one combination, the one that presents the best visual quality is arbitrarily selected. The following section shows how the visual quality and the quality of the experience are evaluated.

3 QUALITY OF EXPERIENCE EVALUATION

The quality of the delivered content, as experienced by the end-user, called the quality of experience (Q_E), is affected by three factors [20]:

1) The quality of the content at the source, that is, the quality of the adapted content before delivery.

2) The quality of service (QoS), which is affected by the delivery of the adapted content over the network.

3) The human perception regarding the adapted content.

In other words, Q_E is affected by the visual quality and the transport quality (the quality associated with the total delivery time). The first expresses how the content is appreciated visually, and the second expresses the impact of the total delivery time on the appreciation of the content. Based on these qualities, for an adapted content t_k^{f,z,QF} and a target mobile device D, we propose to evaluate the quality of experience Q_E as follows:

Q_E(t_k^{f,z,QF}, D) = Q_V(t_k^{f,z,QF}, D) Q_T(t_k^{f,z,QF}, D)    (2)

where 0 ≤ Q_V ≤ 1 and 0 ≤ Q_T ≤ 1 represent the visual quality and the transport quality respectively. This is not the only way of evaluating the quality of experience; as explained in [20], Q_E as well as QoS evaluation represent a completely separate research topic. In our framework, we propose the product of Q_V and Q_T rather than the sum, to prevent large disparities in Q_V and Q_T from being able to produce a high Q_E. In this context, the product is more appropriate than the sum, since Q_V and Q_T are not compensatory attributes. When two or more attributes


are combined to produce a single attribute to represent a given problem, the attributes are classified into compensatory and non-compensatory attributes. The former (compensatory) can be summed, whereas the others cannot. This distinction is the fruit of research performed particularly in the marketing and decision-making fields [21], [22]. In the problem at hand, when a JPEG image is aggressively transcoded, its Q_T will be close to 1 (very lightweight image) and its Q_V close to 0 (very distorted image). If Q_V and Q_T are summed, the resulting Q_E will be close to 1, which is misleading. Contrary to the sum, the product will be close to 0, which seems more reasonable.

Note that the Q_E measure we propose is used to illustrate the benefits of our prediction-based dynamic content adaptation over existing methods when a trade-off between the visual quality of the adapted content and its delivery time is considered. However, the framework applies to other quality measures as well.

3.1 Visual Quality Evaluation

Let c_k = {c_{k,1}, c_{k,2}, ..., c_{k,m(k)}} be an original page, and let t_k^{f,z,QF} = T(c_k, f, z, QF) be the adapted version of c_k, composed of the adapted components. We can write t_k^{f,z,QF}, for page c_k, as:

t_k^{f,z,QF} = {t_{k,1}^{f,z,QF}, t_{k,2}^{f,z,QF}, ..., t_{k,m(k,f)}^{f,z,QF}}

where t_{k,i}^{f,z,QF} is the ith transcoded component and m(k, f) is the total number of components. It is given by (ignoring the XHTML wrapper, which has no impact on quality, in both cases):

m(k, f) = { m(k)  if f = XHTML
          { 1     if f = JPEG

Using the SSIM [23] (or any other reliable visual quality index), it is possible to evaluate the visual quality of the adapted content. The latter depends on the visual quality of its components, but also on the area occupied by each component (the larger the area, the larger its weight on quality should be). Therefore, we propose to compute the visual quality as a weighted sum of the components' visual qualities, the weights being the area occupied by each. Thus, we have:

Q_V(t_k^{f,z,QF}, D) = [ Σ_{i=1}^{m(k,f)} A(t_{k,i}^{f,z,QF}) Q_V(t_{k,i}^{f,z,QF}, D) ] / [ Σ_{i=1}^{m(k,f)} A(t_{k,i}^{f,z,QF}) ]    (3)

Q_V(t_{k,i}^{f,z,QF}, D) = { Q_I(t_{k,i}^{f,z,QF}, D)  if t_{k,i}^{f,z,QF} is an image
                           { 1                         if t_{k,i}^{f,z,QF} is text    (4)

where:

• Q_I is a metric that measures the image quality.

• We assume that the text is rendered perfectly and, without loss of generality, the visual quality of text boxes is set to 1. However, more sophisticated metrics could be used to take into account the text size, the color and even the font.

• A(t_{k,i}^{f,z,QF}) is the visible (not hidden) area occupied by t_{k,i}^{f,z,QF}. We always have A(t_{k,i}^{f,z,QF}) ≤ H(t_{k,i}^{f,z,QF}) W(t_{k,i}^{f,z,QF}), since two components are allowed to partially overlap one another. For instance, a text region can completely or partially overlap an image region. However, if two images overlap, the hidden regions of an image should be considered neither in computing its area A nor its quality Q_I. This is particularly important when the page contains a background.

• When the adapted content t_k^{f,z,QF} comprises one image (e.g., JPEG), its visual quality reduces to the visual quality of that image, as is the case when the output format to be used is f = JPEG.

3.2 Visual Quality Estimation

If the adapted content were available, it would be straightforward, using (3), to compute its visual quality. The challenge, with the dynamic content adaptation system we propose, is to be able to estimate the visual quality of components subject to transcoding parameters without having to perform any transcoding operation. This estimation process is the key to the proposed system's reduced computational complexity.

The adapted content's areas can be known at run-time (when the content is requested). That is, if f = XHTML, these areas can be computed by scaling the areas of the original content's components using the scaling parameter z, and when f = JPEG, the area of the whole of the original content is scaled using z. From (4), the visual quality of the adapted components is set to 1 for text, and for images it is defined as the adapted image's quality using a given quality metric (e.g., SSIM). The image quality can be predicted using the solution proposed in [17], in which it is shown to be possible to estimate the SSIM of JPEG images (characterized by QF_in, their actual QF) subject to changing their scaling parameter (z) and quality factor (QF_out), for given viewing conditions (z_v). For an original content c_{k,i}, the value of z_v controls the resolution, z_v W(c_{k,i}) × z_v H(c_{k,i}), at which the original and the transcoded images should be scaled for comparison, in order to compute their SSIM. For instance, when z_v = 1, the two images are compared at the resolution of the original one; when z_v = z, the two images are compared at the resolution of the transcoded one; and when z_v = min(W(D)/W(c_{k,i}), H(D)/H(c_{k,i}), 1), the two images are compared at the maximum resolution supported by the terminal or the original size of the image, whichever is the smaller. In practice, this value can be set by the maximum resolution of the target terminal.

TABLE 1
Sub-array of predicted SSIM values, SSIM(z_v, QF_in, z, QF_out), computed for QF_in = 80 and z_v = 40% (from [17]).

         Scaling, z, %
QF_out   10    20    30    40    50    60    70    80    90    100
10       0.25  0.43  0.55  0.62  0.69  0.73  0.76  0.79  0.80  0.82
20       0.30  0.52  0.65  0.73  0.79  0.82  0.85  0.87  0.88  0.89
30       0.33  0.56  0.69  0.77  0.83  0.86  0.89  0.90  0.91  0.92
40       0.35  0.58  0.72  0.80  0.85  0.88  0.90  0.92  0.92  0.94
50       0.36  0.61  0.74  0.82  0.87  0.90  0.92  0.93  0.94  0.95
60       0.38  0.63  0.76  0.84  0.89  0.92  0.93  0.94  0.95  0.96
70       0.39  0.65  0.78  0.86  0.90  0.93  0.94  0.95  0.95  0.97
80       0.42  0.68  0.81  0.89  0.93  0.95  0.96  0.96  0.97  1.00
90       0.45  0.72  0.85  0.92  0.95  0.96  0.97  0.97  0.98  0.99
100      0.49  0.78  0.91  0.97  0.98  0.98  0.99  0.99  0.99  1.00

In other words, when a JPEG image is transcoded using a scaling parameter z and quality factor QF, the SSIM of the transcoded image can be estimated using the predicted data tabulated in [17], which are indexed by QF_in, z_v, z and QF. For instance, Table 1, which is extracted from [17], shows a sub-array of predicted SSIM values of transcoded JPEG images characterized by their actual QF_in = 80, transcoded using z and QF, and evaluated under viewing conditions of z_v = 40%. As commercial products generally use a QF value between 75 and 85 to encode documents (or re-encode images) into JPEG images in order to preserve their visual quality, Table 1 presents a sub-array for QF_in = 80. OpenOffice, for instance, proposes a default value of QF = 75.

According to this table, we predict that SSIM = 0.90 when an image encoded with QF_in = 80 is transcoded using z = 50% and QF_out = 70, and viewed at z_v = 40%.
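The lookup above amounts to indexing a precomputed array. As a minimal sketch, assuming the z_v = 40%, QF_in = 80 sub-array of Table 1 and hard-coding only its QF_out = 70 row (the full predictors in [17] are four-dimensional, indexed by z_v, QF_in, z and QF_out):

```python
# Illustrative lookup into a predicted-SSIM array like Table 1.
# Keys are grid values of the scaling factor z (in percent); the
# values below are the QF_out = 70 row of the table in the text.
TABLE1_QF70 = dict(zip(
    range(10, 101, 10),
    [0.39, 0.65, 0.78, 0.86, 0.90, 0.93, 0.94, 0.95, 0.95, 0.97]))

def predicted_ssim_qf70(z_percent):
    """Predicted SSIM for QF_in = 80, z_v = 40%, QF_out = 70
    at a grid value of z. No transcoding is performed."""
    return TABLE1_QF70[z_percent]

# The paper's example: z = 50% -> predicted SSIM of 0.90.
assert predicted_ssim_qf70(50) == 0.90
```

Off-grid values of z or QF_out would need interpolation between neighboring grid entries; the paper's example stays on the grid, so a direct lookup suffices here.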

Note that these predicted SSIM values were computed by training and clustering, in which only the QF (QF_in) of the original image and the transcoding parameters z, QF and z_v were considered. A more sophisticated clustering method, taking into consideration two additional features (the original image's number of bits per pixel and QF_out − QF_in), was proposed in [24] to improve the prediction accuracy. Tables such as those in [17] can be used to estimate the visual quality of the adapted content as follows:

• When the format to be used is f = XHTML, the adapted content will comprise the same number of components as the original one. In this case, using the SSIM index, the visual quality of the adapted content's components (4) becomes:

Q_V(t_{k,i}^{f,z,QF}, D) = { SSIM(t_{k,i}^{XHTML,z,QF}, c_{k,i}, z_v)  if c_{k,i} is an image
                           { 1                                         if c_{k,i} is text

where t_{k,i}^{XHTML,z,QF} is the transcoded version of c_{k,i}. The SSIM of the embedded images of the adapted content can be estimated using the predicted SSIM values [17], and thus the estimated visual quality of the adapted components becomes:

Q_V(t_{k,i}^{f,z,QF}, D) = { SSIM(z_v, QF_in(c_{k,i}), z, QF)  if c_{k,i} is an image
                           { 1                                 if c_{k,i} is text

where QF_in(c_{k,i}) represents the quality factor of c_{k,i} and SSIM(z_v, QF_in(c_{k,i}), z, QF) is the estimated SSIM value that can be extracted from the predicted SSIM arrays [17] using z_v, QF_in(c_{k,i}), z and QF. Thus, the visual quality of the whole of the adapted content can be estimated using (3). This process is illustrated in Fig. 1.

Fig. 1. XHTML Q_V estimation

• When the format to be used is f = JPEG, the adapted content will comprise only one JPEG image. Let t_{k,1}^{JPEG,z,QF} be this image transcoded at z and QF, and t_{k,1}^{JPEG,100%,80} be the image created using z = 100% and QF = 80. The latter is used as a reference image. Note that in estimating the visual quality, the image t_{k,1}^{JPEG,100%,80} is not actually created; it is mentioned here only to illustrate the visual quality computing process. However, from [18], the reference image t_{k,1}^{JPEG,100%,80} is needed to estimate the file size of the adapted content, as described in section 3.4. Now, using the SSIM index, the visual quality of the adapted content (3) becomes:

Q_V(t_k^{f,z,QF}, D) = Q_V(t_{k,1}^{JPEG,z,QF}, D) = SSIM(t_{k,1}^{JPEG,z,QF}, t_{k,1}^{JPEG,100%,80}, z_v)

This visual quality can be estimated using the predicted SSIM values computed for various z_v and QF_in, as illustrated in Fig. 2. For example, Table 1 represents such values for z_v = 40% and QF_in = 80. The estimated visual quality is:

Q_V(t_k^{f,z,QF}, D) = SSIM(z_v, 80, z, QF)

where SSIM(z_v, 80, z, QF) is the estimated SSIM value that can be extracted from the predicted SSIM arrays [17] using z_v, QF_in(t_{k,1}^{JPEG,100%,80}) = 80, z and QF. For example, using Table 1, we obtain SSIM(40%, 80, 50%, 80) = 0.93.


Fig. 2. JPEG Q_V estimation

3.3 Transport Quality Evaluation

No doubt, the longer it takes to deliver the adapted content, the less it is appreciated by the end-user. As the total delivery time increases, its associated quality is reduced accordingly. Therefore, transport quality is inversely related to the total delivery time of the adapted content. For an adapted content t_k^{f,z,QF} and a target mobile device D, the total delivery time is given by:

tdt(t_k^{f,z,QF}, D) = S(t_k^{f,z,QF}) / BR(D) + NL(D) + SL(D) + TL(c_k, f, z, QF)    (5)

where:

• S(t_k^{f,z,QF}) is the file size in bits of t_k^{f,z,QF}.

• D is the target mobile device, and BR(D) and NL(D) are, respectively, the bitrate and the latency of the network to which the mobile device is connected.

• SL(D) is the server latency. For a device D, it represents the time spent by the request in the server (e.g., in the queue) waiting to be processed. This value evaluates the performance of the server.

• TL(c_k, f, z, QF) is the transcoding latency. It represents how long the adaptation operation takes to complete. It depends on the original content c_k and the transcoding parameters f, z and QF in use. It can be estimated based on past transcoding operations. On high-end computers, this value should be small.
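Equation (5) is a direct sum of a transmission term and three latencies. A minimal sketch (units assumed to be bits and seconds; the function name is illustrative):

```python
# Total delivery time of (5): transmission time (file size over the
# network bitrate) plus network, server, and transcoding latencies.
# In the framework, every input would itself be estimated at run-time.
def total_delivery_time(size_bits, bitrate_bps, net_latency_s,
                        server_latency_s, transcode_latency_s):
    return (size_bits / bitrate_bps
            + net_latency_s + server_latency_s + transcode_latency_s)

# Example: a 200 KB adapted slide over a 1 Mb/s link, with 0.1 s of
# network latency and 0.05 s each of server and transcoding latency:
tdt = total_delivery_time(200 * 8 * 1024, 1_000_000, 0.1, 0.05, 0.05)
assert abs(tdt - 1.8384) < 1e-3   # about 1.84 seconds
```

Note that for typical slide sizes the transmission term S/BR dominates on slow links, which is why the file-size prediction of section 3.4 matters most for Q_T.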

Fig. 3. Transport quality behavior for α = 5 and β = 10

We propose to evaluate the transport quality using a normalized Z-shaped built-in membership function [25], as illustrated in Fig. 3. This was inspired by the work of [3], [4], in which the authors used sigmf and gaussmf membership functions to map various parameters, such as the network's bandwidth and latency, to quality values between 0 and 1. By varying the values of α and β, it is possible to create a family of curves that have the same behavior. According to research on the waiting time that users can tolerate when accessing Web content [26], the values of α and β can be set to model the user's behavior regarding the waiting time. These values can be determined by experience or defined by the end-user. The value of α expresses the period of time within which the end-user is fully satisfied with the response time. At a total delivery time of (α + β)/2, that appreciation is reduced to 50%, and when the total delivery time reaches the value β, the user's appreciation is nil. Thus, the transport quality can be formulated as follows:

Q_T(t_k^{f,z,QF}, D) = Zmf(tdt(t_k^{f,z,QF}, D), α, β)    (6)

with:

Zmf(x, α, β) = { 1                          if x ≤ α
               { 1 − 2((x − α)/(β − α))²    if α ≤ x ≤ (α + β)/2
               { 2((x − β)/(β − α))²        if (α + β)/2 ≤ x ≤ β
               { 0                          if x ≥ β
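The piecewise definition above transcribes directly into code; with α = 5 and β = 10 (the values of Fig. 3), the curve is flat at 1 up to 5 s, crosses 0.5 at 7.5 s, and reaches 0 at 10 s:

```python
# The Z-shaped membership function used in (6): full satisfaction up
# to alpha, zero appreciation at beta, two quadratic halves between
# (continuous, with value 0.5 at the midpoint (alpha + beta) / 2).
def zmf(x, alpha, beta):
    if x <= alpha:
        return 1.0
    if x <= (alpha + beta) / 2:
        return 1.0 - 2 * ((x - alpha) / (beta - alpha)) ** 2
    if x <= beta:
        return 2 * ((x - beta) / (beta - alpha)) ** 2
    return 0.0

assert zmf(3, 5, 10) == 1.0     # within the fully satisfying range
assert zmf(7.5, 5, 10) == 0.5   # midpoint: appreciation halved
assert zmf(12, 5, 10) == 0.0    # past beta: appreciation is nil
```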

3.4 Transport Quality Estimation

In (5) and (6), all the parameters used in computing the total delivery time and its associated transport quality must, if unknown, be estimated at run-time. Various algorithms have been proposed to estimate the network bitrate at run-time [27]. The network latency is generally estimated by "pinging" the target mobile device at run-time, or by taking the mean of previous probings that could have been performed when the user registered [28].

We propose to estimate the adapted content's file size as well, using the method proposed in [18]. That is, when a JPEG image c_{k,i} (characterized by its QF, denoted QF_in) is transcoded into another JPEG image t_{k,i}^{JPEG,z,QF} using a scaling parameter z and quality factor QF (QF_out), the relative file size between them, denoted r(t_{k,i}^{JPEG,z,QF}, c_{k,i}) = S(t_{k,i}^{JPEG,z,QF}) / S(c_{k,i}), can be predicted by the method presented in [18]. As explained in section 3.2, these predictors were computed by training and clustering, where QF_in, z and QF were considered as features (in [24], two new features were proposed to increase the prediction accuracy). Table 2 shows predicted relative file sizes for QF_in = 80 and various values of z and QF (or QF_out). The predicted relative file size can be used to compute the total delivery time (tdt) and its associated quality (Q_T) as follows:

• When the format to be used is f = XHTML, the file size of the adapted content can be computed by summing the file sizes of its embedded images and text boxes and adding the additional size of the XHTML wrapper, as follows:

S(t_k^{f,z,QF}) = Σ_{i=1}^{m(k)} r(t_{k,i}^{XHTML,z,QF}, c_{k,i}) S(c_{k,i}) + ψ


TABLE 2
Sub-array of predicted relative file sizes r_I(QF_in, z, QF_out), computed for QF_in = 80 (from [18]).

         Scaling, z, %
QF_out   10    20    30    40    50    60    70    80    90    100
10       0.03  0.04  0.05  0.07  0.08  0.10  0.12  0.15  0.17  0.20
20       0.03  0.05  0.07  0.09  0.12  0.15  0.19  0.22  0.26  0.32
30       0.04  0.05  0.08  0.11  0.15  0.19  0.24  0.21  0.34  0.41
40       0.04  0.06  0.09  0.13  0.17  0.22  0.28  0.34  0.40  0.50
50       0.04  0.06  0.10  0.14  0.19  0.25  0.32  0.39  0.46  0.54
60       0.04  0.07  0.11  0.16  0.22  0.28  0.36  0.44  0.53  0.71
70       0.04  0.08  0.13  0.18  0.25  0.33  0.42  0.52  0.63  0.85
80       0.05  0.09  0.15  0.22  0.31  0.41  0.52  0.65  0.78  0.95
90       0.06  0.12  0.21  0.31  0.44  0.59  0.75  0.93  1.12  1.12
100      0.10  0.24  0.47  0.75  1.05  1.46  1.89  2.34  2.86  2.22

where S(c_{k,i}) represents the file size of the component c_{k,i} and r(t_{k,i}^{XHTML,z,QF}, c_{k,i}) the relative file size between c_{k,i} and its transcoded version t_{k,i}^{XHTML,z,QF}.

Using the predicted relative file sizes [18] (e.g., Table 2), the adapted content's file size can be estimated as illustrated in Fig. 4. For instance, Table 2 shows that an image transcoded using QF_out = 80 and z = 80% will occupy 65% of its original file size. Formally, the estimated file size is given by:

Ŝ(t_k^{f,z,QF}) = Σ_{i=1}^{m(k)} r̂(t_{k,i}^{XHTML,z,QF}, c_{k,i}) S(c_{k,i}) + ψ

r̂(t_{k,i}^{XHTML,z,QF}, c_{k,i}) = { r_I(QF_in(c_{k,i}), z, QF)   if c_{k,i} is an image
                                   { 1                             if c_{k,i} is text

where:

– QF_in(c_{k,i}) is the QF of the original image c_{k,i}.
– r_I(QF_in(c_{k,i}), z, QF) is the estimated relative file size between the image c_{k,i} and its transcoded version t_{k,i}^{XHTML,z,QF}, which can be extracted from the predicted relative file size arrays tabulated in [18] (Table 2 shows such an array for QF_in = 80 and various values of z and QF = QF_out).
– ψ represents the added size of the XHTML wrapper. Typically, the file size of the XHTML wrapper for one slide is 1 KB; therefore, we set ψ = 1 KB.

It is interesting to note that although the file size prediction model does not use explicit statistics related to the compressed form of the input image (such as the number of zeroed DCT coefficients), it implicitly takes into account the compressibility of the original image through its file size, S(c_{k,i}).

Fig. 4. XHTML Q_T estimation.

• When the format to be used is f = JPEG, using the reference JPEG image created before (t_{k,1}^{JPEG,100%,80}), the file size of the adapted content becomes:

S(t_k^{f,z,QF}) = r(t_{k,1}^{JPEG,z,QF}, t_{k,1}^{JPEG,100%,80}) S(t_{k,1}^{JPEG,100%,80}) + ψ

The file size of the adapted content can be estimated using the predicted relative file sizes [18], as illustrated in Fig. 5. Formally, the estimated file size of the adapted content is given by:

Ŝ(t_k^{f,z,QF}) = r̂(t_{k,1}^{JPEG,z,QF}, t_{k,1}^{JPEG,100%,80}) S(t_{k,1}^{JPEG,100%,80}) + ψ

r̂(t_{k,1}^{JPEG,z,QF}, t_{k,1}^{JPEG,100%,80}) = r_I(QF_in(t_{k,1}^{JPEG,100%,80}), z, QF) = r_I(80, z, QF)

where:

– QF_in(t_{k,1}^{JPEG,100%,80}) = 80 is the quality factor of t_{k,1}^{JPEG,100%,80}.
– r_I(80, z, QF) is the estimated relative file size between the two images t_{k,1}^{JPEG,z,QF} and t_{k,1}^{JPEG,100%,80}. It can be extracted from Table 2.
– ψ, as before, is the XHTML wrapper size.
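As a concrete sketch of the XHTML file-size estimation above (with a hypothetical two-component slide; the table row below is the QF_out = 80 row of Table 2, i.e. it assumes QF_in = 80 for every embedded image):

```python
# Predicted relative file sizes r_I(80, z, QF_out = 80) from Table 2,
# indexed by z = 10%, 20%, ..., 100%.
R_I_QF80 = [0.05, 0.09, 0.15, 0.22, 0.31, 0.41, 0.52, 0.65, 0.78, 0.95]

PSI = 1024  # XHTML wrapper overhead psi, set to 1 KB in the paper


def estimate_xhtml_size(components, z):
    """Estimate the XHTML page size in bytes for QF_out = 80.

    components: list of (kind, size_bytes) with kind in {"image", "text"};
    z: scaling in percent, quantized to 10, 20, ..., 100.
    """
    total = 0.0
    for kind, size in components:
        # Images shrink by the predicted relative size; text is kept as-is.
        r = R_I_QF80[z // 10 - 1] if kind == "image" else 1.0
        total += r * size
    return total + PSI


# Hypothetical slide: one 40 KB image plus a 2 KB text box, z = 80%:
# 0.65 * 40960 + 2048 + 1024 = 29696 bytes (29 KB).
size = estimate_xhtml_size([("image", 40960), ("text", 2048)], 80)
```

The same look-up applied to the single reference image t_{k,1}^{JPEG,100%,80} gives the JPEG-format estimate.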

Finally, using the proposed dynamic framework, the visual quality, the transport quality and the quality of experience can be estimated on the fly. Using (2), the estimated quality of experience becomes:

Q̂_E(t_k^{f,z,QF}, D) = Q̂_V(t_k^{f,z,QF}, D) Q̂_T(t_k^{f,z,QF}, D)

where Q̂_V(t_k^{f,z,QF}, D) and Q̂_T(t_k^{f,z,QF}, D) are two functions that estimate the visual and transport qualities using the prediction arrays of SSIM and relative file sizes [17], [18], as illustrated in Figs. 1, 2, 4 and 5.


Fig. 5. JPEG Q_T estimation.

3.5 Use of the Estimation Method

To estimate the optimal combination of format and transcoding parameters that should be used to adapt an original content c_k, we compute, using all combinations of f, z and QF, two arrays: Q̂_V(t_k^{f,z,QF}, D) and r̂(t_k^{f,z,QF}). Based on these arrays, we compute the estimated quality of experience array Q̂_E(t_k^{f,z,QF}, D) and solve (1) to determine the best feasible solution (i.e., the best combination of transcoding parameters). We expect the solution to be near-optimal.
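This selection step can be sketched as an exhaustive search over the parameter combinations. The predictor callbacks `est_qv` and `est_qt` below are hypothetical stand-ins for the look-ups into the SSIM and file-size prediction arrays, and the device-feasibility constraints of (1) are omitted for brevity:

```python
from itertools import product

# Quantized transcoding parameter sets used throughout the paper.
FORMATS = ("JPEG", "XHTML")
Z_VALUES = tuple(range(10, 101, 10))   # scaling, percent
QF_VALUES = tuple(range(10, 101, 10))  # JPEG quality factor


def select_parameters(est_qv, est_qt):
    """Return the (f, z, QF) combination that maximizes the estimated
    quality of experience Q_E = Q_V * Q_T, together with that Q_E.

    est_qv / est_qt are placeholder predictor callbacks standing in for
    the prediction-array look-ups of the framework.
    """
    best, best_qe = None, -1.0
    for f, z, qf in product(FORMATS, Z_VALUES, QF_VALUES):
        qe = est_qv(f, z, qf) * est_qt(f, z, qf)
        if qe > best_qe:
            best, best_qe = (f, z, qf), qe
    return best, best_qe
```

With 2 formats, 10 scaling values and 10 quality factors, the loop evaluates exactly 200 candidates per slide, which is cheap compared to a single transcoding operation.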

4 EXPERIMENTAL SETUP

4.1 Slides Corpus

To test and validate the proposed framework, a large corpus of documents (presentation slides) was required. Such slides could have been searched for and collected from the Web. However, to test and analyze the proposed system on a wide range of slide types, we preferred to create slides composed of components with various sizes and positions within the slide. The images, as well as the text, were nevertheless collected from the Web in order to be representative of existing content. We developed a Java-based application that uses the OpenOffice APIs (UNO) to create a set of Impress slides [29]. Note that a slide is allowed to contain text and images sharing the same area (overlap). The background was set to none (no master style), but an inserted image could cover 100% of the slide, and therefore be considered a background. The created slides' file sizes varied between 12 KB and 122 KB. The images were collected from Internet Web sites such as [30]. The positions of the text boxes and images on the slides were set randomly by a random number generator that is part of the same application. Since the dimensions of the images and text boxes could be continuous, and to avoid context dilution, quantized values, representing the percentage of the areas occupied by images and text boxes, were used as follows:

I ∈ {0%, 10%, 20%, . . . , 100%}
T ∈ {0%, 10%, 20%, . . . , 100%}

For instance, I = 40% and T = 30% mean that thearea occupied by the images represents 40% of theslide and that occupied by the text boxes 30%. Anexample of a slide composed of I = 40% and T = 25%is shown in Fig. 7(a).

To facilitate the validation, each document wascomposed of one slide. This restriction, which can beremoved later, does not affect the credibility of thevalidation, since each slide can be seen as separatecontent, and so is converted and sent separately. LetV be this validation set.
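The quantized corpus composition can be enumerated directly. Assuming, as the corpus description in Section 5 suggests, one slide per (I, T) pair with both areas ranging over 10% to 100%, this yields 100 compositions (the actual slide generation is performed by the Java/UNO application):

```python
from itertools import product

STEPS = range(10, 101, 10)          # quantized area percentages

# One (image-area, text-area) composition per grid point; this is an
# assumed reading of the corpus description, not the paper's generator.
corpus_grid = [(i, t) for i, t in product(STEPS, STEPS)]

print(len(corpus_grid))             # 100 compositions
```

Off-grid compositions, such as the I = 40%, T = 25% slide shown in Fig. 7(a), would then come from rendering effects rather than from the grid itself.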

4.2 Transcoding Methodology

To compare the quality of the transcoded content, each slide from V was transcoded using the OpenOffice JPEG and XHTML filters, which produce JPEG- and XHTML-based Web pages, respectively. The first filter converts the whole slide into an image and wraps it in a skeleton Web page. In the proposed dynamic framework, to be able to estimate Q̂_V and r̂ of any adapted content when the format used is JPEG, the created JPEG image t_{k,1}^{JPEG,100%,80} was used as a reference image from which the other images (t_{k,i}^{JPEG,z,QF}) were created using the ImageMagick command-line tools [31]. These images replaced those created by the JPEG-based filter. We could thus use the predicted SSIM and relative file sizes computed by [17], [18] to estimate Q̂_V and r̂ of the adapted contents.

The native OpenOffice XHTML filter, however, was very limited and was found to have numerous bugs. For instance, the layout was not preserved in the output XHTML Web page, and all the components (text boxes and images) were aligned on the left-hand side. Only one font (the default OS font) was used for the entire presentation. Moreover, it was not possible to embed images in the traditional fashion, which consists in including the image URL in the XHTML text and saving the image in a specific folder. The implemented solution involved converting the embedded images into base-64 encoding and then including them directly in the Web page (which resulted in larger images). Also, no graphics were supported, meaning that all the embedded graphics were simply ignored in the XHTML version. Furthermore, it was not possible to change the embedded images' characteristics, such as the resolution and the quality factor. Ultimately, we improved the filter by fixing these important bugs and limitations, and by adding the possibility of manipulating images and their characteristics (with proper URL support), as well as using graphics. After this extension, the modified OpenOffice XHTML filter was able to convert a slide into a standard XHTML



Fig. 6. An example summarizing the extensions made to the native OpenOffice XHTML filter: (a) a slide as exported by the native XHTML filter; (b) the same slide as exported by our extended XHTML filter.

file, which could include both text and images. Fig. 6 summarizes the majority of the fixed bugs; Fig. 6(a) shows the Web page version of a slide as output by the native OpenOffice XHTML filter, while Fig. 6(b) shows the Web page version created by our modified filter (which corresponds visually to the original slide). An example of a slide as exported by the extended OpenOffice XHTML filter is shown in Fig. 7. Fig. 7(a) shows the original slide as rendered by OpenOffice, while Figs. 7(b), 7(c) and 7(d) show its exported versions with the extended XHTML filter using z = 30% and QF = 80, z = 80% and QF = 80, and z = 50% and QF = 60, respectively.

The following are the sets of transcoding parameters used by these filters, as explained in Section 2:

f ∈ {JPEG, XHTML}
z ∈ {10%, 20%, 30%, . . . , 100%}
QF ∈ {10, 20, 30, . . . , 100}

Let W be the set of adapted contents created from the original contents of V using the transcoding parameters f, z and QF.

4.3 Validation Methodology

The OpenOffice (or MS-Office suite) XHTML filters, which produce JPEG-based XHTML pages, offer the possibility of selecting the target resolution and JPEG quality factor. Though only a few preset parameters (high, medium and low quality) are offered via their graphical interfaces, more precise parameters can be set programmatically via their APIs, which is what commercial dynamic solutions use [15]. These techniques do not use any quality of experience measure. They typically adjust the JPEG resolution to the maximum resolution supported by the device and use a fixed JPEG quality factor (e.g., 80) to provide good visual quality regardless of the transfer time incurred. This dynamic system will be denoted below as a fixed-QF system. On the other hand, with static solutions, since different versions of the content are created, the quality of experience of each can be taken into account.


Fig. 7. A slide as exported by our extended OpenOfficeXHTML filter: (a) the original slide, (b) transcoded usingz = 30% and QF = 80, (c) transcoded using z = 80%and QF = 80, (d) transcoded using z = 50%, QF = 60

Therefore, we propose to compare our method with a typical dynamic system (the fixed-QF system) and various static systems using different granularity levels. Most static transcoding systems create different versions to suit various target mobile devices [3], [4], [8]. When the content is requested by the mobile device, the best adapted version among those created in advance is selected for delivery. The granularity of the created versions should be fine enough to deliver the best user experience possible. However, in practice, it is not always possible to reach that level of granularity, owing to numerous constraints, such as storage space or CPU processing time limitations. Thus, we propose to compare our solution with the following hypothetical static transcoding systems, which are inspired by realistic needs:

• Exhaustive system: This system creates the maximum possible number of adapted content versions, using all combinations of the quantized values z ∈ {10%, 20%, . . . , 100%} and QF ∈ {10, 20, . . . , 100} for both the JPEG and XHTML formats. As a result, it creates 200 versions for each slide. We assume that this system provides sufficiently fine-grained content, and therefore constitutes a good benchmark against which to compare the best adapted content provided by each system.

• Granularity-based systems: In practice, it is not always possible nor desirable to transcode content into 200 versions, and so we may consider using only a limited number of values for z and QF. For instance, for an Impress presentation composed of 30 slides, 6000 versions would have to be created. For a server dedicated to organizing meetings, for example, the number of versions to be created could be very high, and so would be the processing time and storage space needed for that purpose. Since the most widely used quality factor is 80, we propose to compare our solution with ten systems based on that quality factor and different quantized values of z (see Table 3). From the first system to the tenth, the granularity is enriched gradually, always building on the previous system (i.e., system i + 1 adds one more version on top of system i, whose characteristics are selected to cover the parameter set). For instance, the first system creates only one version (using z = 100%) for each format, while the next system creates two versions (using z = 100% and 50%), and so on, as shown in Table 3. This scheme is not the only one possible, and, depending on the available resources, other sets of systems could be used, such as varying QF around 80 (e.g., 70 or 60).

TABLE 3
Static transcoding systems used in the validation.

Note that the proposed static systems are actually more sophisticated than those commonly used. Typical static systems select the highest-resolution content supported by a device regardless of the delivery time. The proposed static systems, by using the same Q_E metric, will lead to a fairer comparison between the full potential of the static and dynamic approaches.

The comparison focuses on two aspects. First, we compare the performance of the proposed dynamic system, in terms of quality of experience, to the dynamic fixed-QF system and to a static system with N versions (i.e., how many versions must a static system generate to match our system?). We also compare this quality with that of the exhaustive static system to measure how far we are from optimality. Second, we compare the storage space required by each system.

Each adapted content t_{c_k}^{f,z,QF} in W, which is in fact a Web page, is parsed, and its actual Q_V(t_{c_k}^{f,z,QF}, D) and r(t_{c_k}^{f,z,QF}) values are computed. We then compute, for the desired mobile device D, Q_T(t_{c_k}^{f,z,QF}, D) and Q_E(t_{c_k}^{f,z,QF}, D). On the other hand, for each slide c_k in V, its Q̂_V(t_{c_k}^{f,z,QF}, D) and r̂(t_{c_k}^{f,z,QF}) are computed using the proposed dynamic framework. We then compute, for the same mobile device, Q̂_T(t_{c_k}^{f,z,QF}, D) and Q̂_E(t_{c_k}^{f,z,QF}, D). The SSIM index exhibits a highly non-linear relationship with the DMOS (Differential Mean Opinion Score), and therefore cannot be used directly as a measure of the human perception of quality. Consequently, to address the third requirement regarding the Q_E design [20] (see Section 3), the SSIM values are mapped, using a logistic function and regression, to their corresponding subjective MOS (Mean Opinion Score) values [32]. In other words, we compute or estimate the SSIM, but then map it to the corresponding MOS value. As a result, two arrays were created:

Q_E: [c_k, f, z, QF, Q_T(t_{c_k}^{f,z,QF}, D), Q_E(t_{c_k}^{f,z,QF}, D)]
Q̂_E: [c_k, f, z, QF, Q̂_T(t_{c_k}^{f,z,QF}, D), Q̂_E(t_{c_k}^{f,z,QF}, D)]

The best adapted contents obtained by the previously mentioned transcoding systems are computed as follows:

• Exhaustive static system: the best adapted content is identified by solving (1) on the Q_E array for each slide c_k.

• Granularity-based systems: first, a sub-array is obtained from Q_E by selecting the rows corresponding to the values of z and QF that define each system (see Table 3). Then, the best adapted content, for each slide c_k, is identified by solving (1) on that sub-array.

• Fixed-QF system: first, a sub-array is obtained from Q_E by setting the value of z according to the resolution of the target mobile device and QF = 80. Then, the best adapted content, for each slide c_k, is identified by solving (1) on that sub-array.

• Proposed dynamic system: the best transcoding parameters (f*(c_k, D), z*(c_k, D), QF*(c_k, D)) are estimated by solving (1) on Q̂_E for each slide c_k. Using these optimal parameter estimates, the actual quality of experience is retrieved from the Q_E array, which corresponds to Q_E(t_{c_k}^{f*(c_k,D), z*(c_k,D), QF*(c_k,D)}, D). The latter represents the actual quality of experience obtained by the proposed dynamic system, which is compared to those obtained by the other transcoding systems.
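The SSIM-to-MOS mapping mentioned above can be sketched in Python. The logistic form and coefficients below are illustrative assumptions only; the paper's fitted regression parameters against subjective scores [32] are not given in this excerpt:

```python
import math


def ssim_to_mos(ssim, k=10.0, s0=0.8, mos_max=5.0):
    """Map an SSIM value (0..1) to a MOS-like score in (0, mos_max) via
    a logistic function. The slope k, midpoint s0 and maximum mos_max
    are ILLUSTRATIVE placeholders, not the paper's fitted coefficients.
    """
    return mos_max / (1.0 + math.exp(-k * (ssim - s0)))
```

The mapping is monotonic, so ranking by mapped MOS preserves the ranking by SSIM, but the distances between scores change non-linearly, which is what makes the mapping necessary when averaging qualities.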

5 EXPERIMENTAL RESULTS

To compare the performance and precision of our method with those of the transcoding systems mentioned in the previous section, the Q_E average deviation between the best adapted content obtained by each system and that of the exhaustive system was computed. Since the computed data are too numerous to be presented here, we arbitrarily selected one scenario that uses mobile device and network communication


Fig. 8. Q_E as a function of bitrate for f = JPEG and NL = 488 ms.

Fig. 9. Q_E as a function of bitrate for f = XHTML and NL = 488 ms.

options found in the marketplace. Based on estimated user-tolerable waiting times [26], the values of α and β, which represent the behavior of the transport quality (see Section 3.2), are set as follows: α = 5 s and β = 10 s. In this scenario, the proposed mobile device D is a Nokia N8, which has a resolution of 640×360, and is connected to a GPRS network that has a bitrate BR(D) = 50 kbps and network latency NL(D) = 488 ms [28]. Since the default resolution of the slide, as rendered on a PC by the OpenOffice JPEG filter, is 1058×794, the maximum viewing conditions are min(640/1058, 360/794) ≈ 45%, which suggests, from [17], a comparison of images at z_v = 40%. These mobile device and network characteristics are tested using the validation set V. It should be pointed out that similar conclusions are reached with other mobile devices and network conditions when a compromise between visual quality and transfer time must be achieved.
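The maximum-viewing-condition computation in this scenario works out as follows; rounding the ≈45% result down to the 40% entry of the 10%-step prediction grid is our assumption about how the arrays of [17] are indexed:

```python
def max_viewing_scale(dev_w, dev_h, src_w, src_h):
    """Largest uniform scale at which the source slide fits the device
    screen in both dimensions."""
    return min(dev_w / src_w, dev_h / src_h)


# Nokia N8 screen (640x360) vs. the default slide rendering (1058x794):
scale = max_viewing_scale(640, 360, 1058, 794)   # min(0.605, 0.453), ~45%
z_v = int(scale * 10) * 10                       # down to the 10% grid: 40
```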

First, the average quality of experience is computed, for various bitrate values, for the proposed and the fixed-QF dynamic systems, the static exhaustive system, and the static systems with one and five versions. The results are presented in Fig. 8 for JPEG, and in Fig. 9 for XHTML. As expected, the Q_E increases with the bitrate up to a point of saturation (quite visible for XHTML, and occurring at higher bitrates for JPEG). The average Q_E values obtained by the proposed dynamic solution are close to those of the exhaustive system for JPEG and very close for XHTML. When f = JPEG, our dynamic solution performs better than the static system with up to five versions, and very close to it when f = XHTML.

The Q_E average deviation (which shows the precision of each system), as computed for this example, is plotted in Fig. 10 for BR(D) = 50 kbps and NL(D) = 488 ms. This figure shows the difference between our dynamic system, a fixed-QF dynamic system and the static systems (those with N versions and the exhaustive one with 200 versions). Further, it shows the precision of the adapted content achieved by each system by computing the average deviation of its Q_E from that of the exhaustive one. For this mobile device, when the format used is XHTML, our dynamic system provides better quality than the granularity-based system with N = 3 and slightly lower quality than the other granularity-based systems (although when N is small, larger fluctuations in quality are observed). All are very close to the optimum. When the format used is JPEG, seven versions are needed by the granularity-based systems to reach the quality obtained by the proposed dynamic system (this number depends on the bitrate, and increases for lower bitrates).

As shown in Figs. 8, 9 and 10, although the Q_E of our dynamic solution is, on average, significantly better than that of the fixed-QF system for XHTML, the fixed-QF system can be better under some conditions for JPEG. However, these results hide a defect of the fixed-QF system. For a sub-set of documents (slides), we computed the optimal solution provided by our dynamic method, the fixed-QF system and the exhaustive static system. As shown in Figs. 11 and 12, overall, our dynamic method has the same behavior as the exhaustive static one. On the other hand, the Q_E results provided by the fixed-QF system are highly variable. As shown in these figures, the curves follow a certain periodicity due to the nature and order of the documents submitted to the test. Indeed, the documents were created by varying the areas taken up by images and text boxes. That is, in the first ten documents, the area of the images represents 10% of the slide and the areas of the text boxes vary from 10% to 100%. In the second ten documents, the area of the images is increased to 20% and the areas of the text boxes again vary from 10% to 100%, and so on.

Of course, the dynamic system's performance is highly dependent on the accuracy of the SSIM and file size estimates. With more accurate estimates (and a higher granularity of these estimates, limited here to the 10×10 tables from [17], [18]), the proposed system could perform even better. We can also observe that the estimated XHTML data are more precise than the JPEG data (an average deviation of about 1% for XHTML compared to 6% for JPEG). This is explained by the fact that, in the XHTML solution, only the SSIM of the embedded images and the relative file size are estimated (not the textual parts, for which quality and


Fig. 10. Average Q_E deviation from the exhaustive static system, computed for BR = 50 kbps, NL = 488 ms and the entire slide corpus.

Fig. 11. Optimal Q_E computed for BR = 50 kbps, NL = 488 ms and a sub-set of slides; f = JPEG.

Fig. 12. Optimal Q_E computed for BR = 50 kbps, NL = 488 ms and a sub-set of slides; f = XHTML.

size are known rather than estimated), whereas in theJPEG solution, the estimated data are computed forthe whole slide.

It is important to note that we have been veryconservative in the performance evaluation of staticsystems by assuming that the terminal could renderevery received image. For example, in this scenario, itwas assumed that the static system with N = 1 wouldsend a 1058×794 JPEG image which, upon reception,would then be scaled by the terminal to fit its screenresolution. However, in reality, it is possible that noneof the versions generated by a static system will besupported by the terminal (especially for low valuesof N ), providing another significant advantage to the

Fig. 13. Total delivery time computed for BR = 50 kbps, NL = 488 ms and a sub-set of slides; f = JPEG.

Fig. 14. Total delivery time computed for BR = 50 kbps, NL = 488 ms and a sub-set of slides; f = XHTML.

proposed system, as it sends only content the terminalcan support.

Although the fixed-QF dynamic system usually provides good visual quality by setting QF = 80, it has no control over the file size, which affects the total delivery time, and therefore the Q_E. This aspect has been tested on the same set of slides by computing, for each system, the total delivery time. Figs. 13 and 14 show the total delivery time required to deliver each slide when the format used is JPEG and XHTML, respectively. In these two figures, the proposed dynamic system provides a total delivery time very close to that of the exhaustive static system, whereas the fixed-QF system exhibits a highly variable delivery time (more than 10 s in some instances).
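As an illustration of this delivery-time sensitivity, the sketch below computes a total delivery time as transfer time plus latency. Since equation (5) is not reproduced in this excerpt, this simple form (with 1 KB = 1024 bytes and 1 kbps = 1000 bit/s) is our modeling assumption:

```python
def total_delivery_time(size_kb, bitrate_kbps, latency_ms):
    """Total delivery time in seconds: payload transfer time plus one
    network latency. An assumed simplification of the paper's (5),
    which is not shown in this excerpt."""
    bits = size_kb * 1024 * 8                # 1 KB = 1024 bytes
    return bits / (bitrate_kbps * 1000.0) + latency_ms / 1000.0


# Section 5 scenario (50 kbps GPRS, 488 ms latency) with the dynamic
# system's average JPEG payload of 55.5 KB:
tdt = total_delivery_time(55.5, 50, 488)     # ~9.6 s
```

Under these assumptions, even the dynamic system's average JPEG payload sits near the β = 10 s tolerance bound on a 50 kbps link, which is why controlling file size matters so much at low bitrates.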

Let us now examine the storage space needed for the versions created by each transcoding system. To do so, the average file size of the versions created by each system is computed and plotted. Fig. 15 shows the average storage space for the JPEG and XHTML versions. As expected, the XHTML solution is very lightweight compared to the JPEG one. This is because only the embedded images are rasterized in the XHTML solution, and not the whole slide, as is the case with the JPEG solution. As shown by the curves, our solution becomes increasingly competitive as the granularity of the static


Fig. 15. Average storage space computed for the entire slide corpus.

transcoding systems increases. In this example, we stated that when JPEG is used, seven versions are needed to reach our estimated optimal adapted content. If the seventh system is used, the total storage space required is 487 KB on average, whereas in our solution, only one version is created, and only 55.5 KB is needed on average. This means that nearly 9 times more space must be available to accommodate the seventh transcoding system. In the XHTML solution, three versions are needed to achieve our estimated adapted content. In this case, the total space needed is 75.8 KB. This is less than that needed by the JPEG solution, but still far more than what is needed by our solution (16.9 KB on average for a single version). Finally, for the same example, ten versions (seven JPEG versions and three XHTML ones) would have to be created by the static transcoding systems, whereas using our solution, only one version (JPEG or XHTML) is created.

Transcoding many versions is CPU-intensive, and should be performed off-line, which is not always possible. In contrast, our dynamic solution provides near-optimal adapted content on the fly, while the end-user is still on-line. Overall, the proposed solution is very attractive compared to the static transcoding systems presented above. It achieves a good compromise between performance (little storage space and less processing time) and quality (close to that of the exhaustive system).

In summary, when the bitrate is very high, there is no significant advantage, in terms of quality of experience, to selecting XHTML over JPEG for any of the systems presented. However, XHTML becomes increasingly attractive as the bitrate decreases, since it always ensures crisp and readable text. For static systems, XHTML requires, for a given Q_E average deviation, significantly fewer versions than JPEG and less storage space. It also offers other advantages, such as the ability to edit text. Therefore, it is clear that XHTML is, on average, a better format for sharing presentations than JPEG. Even for the proposed dynamic framework, XHTML leads to more accurate Q_E prediction and better overall system performance. On a final note, the proposed system is not only superior for low-bitrate connections, but for any situation where a compromise between transport time and visual quality is required (e.g., high-definition content over 3G networks). Even in other situations, it performs competitively.

6 COMPLEXITY OF THE PROPOSED METHOD

In contrast with the exhaustive static system, which requires 200 transcoding operations, the proposed dynamic framework requires a single transcoding operation (performed after estimation of the transcoding parameters). To this, we must add the computation of the Q̂_V array, which can be performed off-line; the Q̂_T and Q̂_E arrays, which are computed at run-time; and a look-up search in the Q̂_E array. These added computations are very light compared to a single transcoding operation, and can be considered negligible on high-end servers. The fixed-QF system also requires a single transcoding operation but, as shown, exhibits highly variable quality. The granularity-based systems require more than one transcoding operation and are thus more complex. The proposed system offers excellent quality at minimal computational complexity.

7 CONCLUSION

In this paper, we presented a novel dynamic content adaptation framework applied to professional documents shared using Web technologies. The method estimates, on the fly, near-optimal transcoding parameters based on the original content and its composition (areas occupied by text and images). We proposed a measure of the quality of experience that takes into account the visual quality of the transcoded content in addition to the total delivery time. The proposed measure, although useful, is mostly illustrative, as the proposed framework is general and can be used with other quality measures. The validation results show that dynamic content adaptation based on accurate prediction can provide a good compromise between quality and delivery time, and drastically reduce storage requirements. Although the proposed framework is quite general, future research could validate its applicability to other professional documents, such as Word and Excel documents.

REFERENCES

[1] D. Sudhir and W. Tao, Content Networking in the Mobile Internet,Chapter 7: Content Adaptation for the Mobile Internet. John Wileyand Sons, 2004.

[2] W. Lum and F. Lau, “User-Centric Adaptation of StructuredWeb Documents for Small Devices,” in 19th Int. Conf. onAdvanced Information Networking and Applications (AINA’05),vol. 1. IEEE, 2005, pp. 507–512.

[3] ——, “User-Centric Content Negotiation for Effective Adap-tation Service in Mobile Computing,” IEEE Transactions onSoftware Engineering, vol. 29, no. 12, pp. 1100–1111, Dec 2003.

This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication.

Page 14: Quality Prediction-Based Dynamic Content Adaptation Framework Applied to Collaborative Mobile Presentations

IEEE TRANSACTIONS ON MOBILE COMPUTING, TMC-2011-03-0150.R1 14

[4] Y. Zhang, S. Zhang, and S. Han, “A New Methodology of QoSEvaluation and Service Selection for Ubiquitous Computing,”in Wireless Algorithms, Systems, and Applications, ser. LNCS.Springer Berlin / Heidelberg, 2006, vol. 4138, pp. 69–80.

[5] Y. Hwang, J. Kim, and E. Seo, “Structure-Aware Web Transcod-ing for Mobile Devices,” IEEE Internet Computing, vol. 7, no. 5,pp. 14–21, 2003.

[6] H. Chua, S. Scott, Y. Choi, and P. Blanchfield, “Web-pageAdaptation Framework for PC Mobile Device Collaboration,”in Advanced Information Networking and Applications, AINA2005. 19th Int. Conference on, vol. 2, Mar 2005, pp. 727–732.

[7] J. Hong, E. Suh, and S. Kim, “Context-aware Systems: ALiterature Review and Classification,” Expert Systems withApplications, vol. 36, no. 4, pp. 8509–8522, 2009.

[8] R. Mohan and J. Smith, “Adapting Multimedia Internet Con-tent for Universal Access,” IEEE Transactions on Multimedia,vol. 1, no. 1, pp. 104–114, Mar 1999.

[9] B. Noble, M. Price, and M. Satyanarayanan, “A Program-ming Interface for Application-Aware Adaptation in MobileComputing,” 2nd USENIX Symposium on Mobile and LocationIndependent Computing, vol. 8, no. 4, pp. 57–66, 1995.

[10] R. Jan, C. Lin, and M. Chern, “An Optimization Model forWeb Content Adaptation,” Computer Networks, vol. 50, no. 7,pp. 953–965, 2006.

[11] S. Chandra and C. S. Ellis, “JPEG compression metric as aquality-aware image transcoding,” in Proc. of the 2nd conferenceon USENIX Symposium on Internet Technologies and Systems,1999, pp. 81–92.

[12] F. Kitayama, S. Hitose, G. Kondoh, and K. Kuse, “Designof a Framework for Dynamic Content Adaptation to Web-enabled Terminals and Enterprise Applications,” ProceedingsSixth Asia Pacific Software Engineering Conference ASPEC99 CatNoPR00509, pp. 72–79, 1999.

[13] W. Lum and F. Lau, “On Balancing Between Transcoding Overhead and Spatial Consumption in Content Adaptation,” in Proceedings of the 8th Annual International Conference on Mobile Computing and Networking (MobiCom '02), 2002, p. 239.

[14] R. Han, P. Bhagwat, R. LaMaire, T. Mummert, V. Perret, and J. Rubas, “Dynamic Adaptation in an Image Transcoding Proxy for Mobile Web Browsing,” IEEE Personal Communications, vol. 5, no. 6, pp. 8–17, 1998.

[15] D. Li and U. Chandra, “Building Web-based Collaboration Services on Mobile Phones,” in 2008 Int. Symp. on Collaborative Technologies and Systems. IEEE, May 2008, pp. 295–304.

[16] Google, “GoogleDocs Mobile.” [Online]. Available: http://www.google.ca/mobile/docs/index.html

[17] S. Coulombe and S. Pigeon, “Quality-aware Selection of Quality Factor and Scaling Parameters in JPEG Image Transcoding,” in 2009 IEEE Symp. on Computational Intelligence for Multimedia Signal and Vision Processing, Mar 2009, pp. 68–74.

[18] S. Pigeon and S. Coulombe, “Computationally Efficient Algorithms for Predicting the File Size of JPEG Images Subject to Changes of Quality Factor and Scaling,” in 2008 24th Biennial Symposium on Communications. IEEE, Jun 2008, pp. 378–382.

[19] S. Coulombe and S. Pigeon, “Low-complexity Transcoding of JPEG Images With Near-optimal Quality Using a Predictive Quality Factor and Scaling Parameters,” IEEE Transactions on Image Processing, vol. 19, no. 3, pp. 712–721, Mar 2010.

[20] F. Kuipers, R. Kooij, D. De Vleeschauwer, and K. Brunnstrom, “Techniques for Measuring Quality of Experience,” in Wired/Wireless Internet Communications, ser. Lecture Notes in Computer Science. Springer-Verlag Berlin/Heidelberg, 2010, vol. 6074, pp. 216–227.

[21] L. Lee and R. Anderson, “A Comparison of Compensatory and Non-Compensatory Decision Making Strategies in IT Project Portfolio Management,” 2009. [Online]. Available: http://aisel.aisnet.org/irwitpm2009/9

[22] A. Dieckmann, K. Dippold, and H. Dietrich, “Compensatory versus Noncompensatory Models for Predicting Consumer Preferences,” Judgment and Decision Making, vol. 4, no. 3, pp. 200–213, 2009.

[23] Z. Wang, A. C. Bovik, H. R. Sheikh, and E. P. Simoncelli, “Image Quality Assessment: From Error Visibility to Structural Similarity,” IEEE Transactions on Image Processing, vol. 13, no. 4, pp. 600–612, 2004.

[24] S. Pigeon and S. Coulombe, “Efficient Clustering-based Algorithm for Predicting File Size and Structural Similarity of Transcoded JPEG Images,” in Int. Symp. on Multimedia, Dec 2011, pp. 137–142.

[25] The MathWorks, “Z-shaped Built-in Membership Function.” [Online]. Available: http://www.mathworks.com/help/toolbox/fuzzy/zmf.html

[26] F. Nah, “A Study on Tolerable Waiting Time: How Long Are Web Users Willing to Wait?” Behaviour & Information Technology, vol. 23, no. 3, pp. 153–163, Jan 2004.

[27] H. Ningning and P. Steenkiste, “Evaluation and Characterization of Available Bandwidth Probing Techniques,” IEEE Journal on Selected Areas in Communications, vol. 21, no. 6, pp. 879–894, Aug 2003.

[28] P. Svoboda, F. Ricciato, W. Keim, and M. Rupp, “Measured WEB Performance in GPRS, EDGE, UMTS and HSDPA with and without Caching,” in IEEE Int. Symp. on a World of Wireless, Mobile and Multimedia Networks (WoWMoM), Jun 2007, pp. 1–6.

[29] OpenOffice.org, “The OpenOffice.org API Project.” [Online].Available: http://api.openoffice.org

[30] University of Southern California, Signal and Image Processing Institute, Electrical Engineering Department, “The USC-SIPI Image Database.” [Online]. Available: http://sipi.usc.edu/database/database.cgi?volume=misc&image=11

[31] ImageMagick, “ImageMagick Command Line Tools.” [Online].Available: http://www.imagemagick.org/script/index.php

[32] H. Sheikh, Z. Wang, L. Cormack, and A. Bovik, “LIVE Image Quality Assessment Database Release 2.” [Online]. Available: http://live.ece.utexas.edu/research/quality/subjective.htm

Habib Louafi (S’11) received an Engineering degree in Computer Science from the University of Oran (Algeria) in 1993 and an M.Sc. from the Université du Québec à Montréal (UQAM) in 2006. He is currently working toward a Ph.D. at the École de technologie supérieure (ÉTS, part of the Université du Québec network). His fields of interest include mobile computing, context-aware systems, QoE, content adaptation, and collaborative mobile Web conferencing.

Stéphane Coulombe (S’90-M’98-SM’01) received a B.Eng. in Electrical Engineering from the École Polytechnique de Montréal, Canada, in 1991, and a Ph.D. from INRS-Télécommunications, Montréal, in 1996. He is a Professor in the Software and IT Engineering Department, École de technologie supérieure (ÉTS, part of the Université du Québec network). From 1997 to 1999, he was with the Nortel Wireless Network Group in Montreal, and from 1999 to 2004 he worked with the Nokia Research Center, Dallas, TX, as Senior Engineer and as Program Manager in the Audiovisual Systems Laboratory. He joined ÉTS in 2004, where he currently carries out research and development on video processing and systems, media adaptation, and transcoding. Since 2009, he has held the Vantrix Industrial Research Chair in Video Optimization.

Umesh Chandra received an M.S. in Computer Science (CS) from the University of North Texas in 1996 and a B.S. in CS from Osmania University in 1993. He is currently a research lead at Nokia Research, Bangalore, India, where he works on developing mobile services targeting emerging markets in the social and location domains. Prior to this, he worked on mobile video conferencing and on developing standards in the area of video transport. His areas of interest include mobile technologies, IP multimedia, data management, E2E service creation, and emerging markets.
