
Efficient Content Distribution in DOOH Advertising Networks Exploiting Urban Geo-Social Connectivity

Fang-Zhou Jiang∗, Kanchana Thilakarathna∗, Mahbub Hassan∗, Yusheng Ji†, Aruna Seneviratne∗

∗Data61, CSIRO & UNSW, Australia, [email protected], †National Institute of Informatics, Japan, [email protected]

ABSTRACT

Digital out-of-home (DOOH) advertising networks, comprised of pervasive distributed digital signage (screens), are rapidly growing. It is reported that more than 70% of DOOH revenue comes from local ads, yet it is especially challenging to decide when and where to deliver the most suitable ad due to the spatio-temporal dynamics of human mobility and preferences. Understanding urban geo-social connectivity in terms of people movement would greatly benefit ad content distribution, and could potentially be utilized by a large number of mobile applications and geo-social services. However, existing DOOH ad distribution systems are designed to target individuals, which might not be the best choice in public spaces, and do not consider the preferences of a “cohort of users”. In this paper, we propose an alternative approach that targets cohorts of users by extracting urban geo-social connectivity from large-scale mobile network data and existing geo-social service data. We construct a dynamic urban geo-social connectivity graph, and formulate the problem of distributing ads for maximum exposure to the “right” users under a constrained budget. We then propose a heuristic algorithm. Simulation results show that our system targeting a “cohort of users” achieves up to 300% improvement over a naive distribution method in displaying ads to the “right people” when user preferences are completely known, and at least 25% improvement when knowledge of user preferences is limited.

Keywords

Spatio-Temporal Dynamics; Digital out-of-home Network; Content Distribution

1. INTRODUCTION

Digital signage, as a form of electronic display, is becoming ubiquitous and is a very important form of out-of-home (OOH) advertising in public places. Found almost everywhere, digital out-of-home (DOOH) advertising has expanded rapidly. From 2009 to 2014, it grew at a rate of approximately 30% [7], representing over 40% of the total OOH industry in 2015 [8]. The pervasiveness of DOOH is partially due to its ability to display multimedia content with more flexibility and to adapt to different contexts [4]. However, the challenge has been to reach the “right people” with the “right ads” at the “right time” [21, 24]. Our hypothesis is that the spatial and temporal dynamics of urban users can be exploited to deliver the “right ads” to a “cohort of users”. Our work attempts to bridge the gap between urban social connectivity and better ad delivery in the physical web by leveraging users’ location information.

©2017 International World Wide Web Conference Committee (IW3C2), published under Creative Commons CC BY 4.0 License.
WWW’17 Companion, April 3–7, 2017, Perth, Australia.
ACM 978-1-4503-4914-7/17/04.
http://dx.doi.org/10.1145/3041021.3051156

Advertisers have attempted to apply the successful model of online personalized ads to DOOH. This approach has its limitations in public spaces, especially with regard to privacy [13]. Delivering ad content based on “cohort of users” interest has attracted some attention [13], but, surprisingly, existing work does not take into account the spatial and temporal dynamics of user preferences. Currently, content is distributed via commercialized content management systems [20, 5] using the extensive user information collected by advertisers. These systems generally consist of ad generators, ad servers and digital screens. The advertisers send digital content to content servers, which schedule the content on the digital screens. However, despite studies [23] in the area of urban modeling, these systems do not take into account the spatial and temporal dynamics of user preferences. Furthermore, as a digital screen is designed to reach a large number of people, whether and how personalized ads in DOOH would improve efficiency remains unanswered.

The pervasive use of mobile devices enables a much finer and more complete understanding of user dynamics and interests than traditional methods such as customer surveys. In this paper, we attempt to answer the question of whether it is possible to better serve content to a given set of digital screens by leveraging an urban geo-social connectivity graph built from large-scale mobile user mobility data. We first analyze a large-scale dataset collected by a mobile network provider, and show the strong spatio-temporal dynamics of users. We then model urban connectivity as cohort-of-user movement in a dynamic graph. In addition, an optimization problem of where to distribute a given ad at a given time is formulated assuming full knowledge of user preference. We further propose a heuristic algorithm exploiting spatio-temporal user correlation and spatial similarity when knowledge about users is limited. Finally, we evaluate the improvement of targeting a “cohort of users” with real-world data-driven simulation, and the results show up to 300% improvement in ad eyeballing by the “right people” compared to a distribution scheme that does not take into account the dynamics of “cohort of users” preferences. Our constructed fine-granularity spatial similarity graph of the urban area could potentially be used for other services and applications such as city planning and location-based recommendation systems. To the best of our knowledge, this is the first work that attempts to improve DOOH content distribution efficiency by leveraging the spatio-temporal dynamics of users and urban geo-social connectivity, and that is evaluated with a large-scale mobile dataset. This paper makes the following contributions:

• We analyze fine-grained spatio-temporal dynamics of urban users with large-scale datasets, and propose a method of describing user movement in the form of a geo-social connectivity graph.

• We formulate ad distribution in DOOH as an optimization problem, and propose a novel spatial similarity based heuristic algorithm that provides the best solution possible when knowledge about spatial user preferences is limited.

• We evaluate both the idea of targeting a “cohort of users” and the proposed heuristic algorithm with data-driven simulation, and show an up to 3 times improvement in eyeballing by people who are interested.

• Finally, we identify our work’s theoretical and practical implications as well as potential applications and services.

The rest of the paper is organized as follows. We first present the datasets used in Section 2. In Section 3, we analyze the spatio-temporal dynamics of the collected datasets. We then formulate an optimization problem and describe the solutions in Sections 4-5. Section 6 evaluates our hypothesis and proposed heuristic algorithm, and Section 7 summarizes the related work. Finally, Section 8 provides our discussion and Section 9 concludes the paper.

2. SETS OF DATA

The following datasets are used in this paper to gain insights about the geo-social connectivity of the urban population. We further study the effect of targeting a “cohort of users” with simulation driven by these datasets.

2.1 Mobile Urban Dynamics Dataset

We use the population dynamics of a city derived from the number of mobile devices connected to mobile base stations, collected by NTT DoCoMo1, the largest Mobile Network Operator in Japan. In the first dataset (DS1), the city where the data was collected is split into 9,367 grids of 250m × 250m in inner-city urban areas and 7,160 grids of 500m × 500m in rural areas. The data covers morning (5am), noon (12pm) and evening (6pm) snapshots on both weekdays and weekends across 4 seasons. The registered home addresses of users are also available, e.g., 20 people currently at grid i come from suburb ID j. Additionally, hourly snapshots over 2 days (48 hours) are available for four hot-spots (each with a 2km radius and including over 150 grids) for fine-grained temporal analysis. We denote this dataset as DS2.

1 www.nttdocomo.co.jp/english/

Figure 1: Seasonal Spatial Population Distribution. (a) Weekdays; (b) Weekends. [Plots of PDF versus population per grid for Jan, Apr, Jul and Oct.]

2.2 Yelp Dataset

We obtained business names, ratings, reviews and geo-locations of 499,738 businesses in the same geographical area by crawling the popular business review website Yelp2. The dataset (DS3) provides further insight into the characteristics of fine-grained geographical areas.

3. SPATIO-TEMPORAL DYNAMICS

In this section, we present the basic characteristics of the datasets both spatially and temporally, and show that despite the observed strong dynamics, there are certain spatio-temporal patterns of “cohort of users” mobility and behavior that could be exploited to improve content distribution efficiency.

3.1 Spatial Dynamics and Characteristics

In Figure 1, we first show the spatial distribution of population on both weekdays and weekends across different seasons of the year, where the x-axis represents population per grid. It can be seen that the spatial distribution differs between weekdays and weekends, while seasonal variation is subtle. In addition, a higher proportion of grids with a population over 2000 per grid is observed on weekdays. Although the PDF shows little seasonal variation, it does not give a full picture of fine-grained spatio-temporal dynamics. We then compare detailed spatial and temporal urban population dynamics on both weekdays and weekends in a selected month. Overall, the spatial distribution in the morning is identical on weekdays and weekends. We suspect that this is because most people are still asleep at home at that time, which presents a baseline distribution of the city population. Moreover, we observe a clear human flow from suburbs into the central CBD (center circle) on weekdays from 5am to 12pm, while a similar trend is not obvious on weekends. We also verified the diurnal pattern during the other times of year and did not observe significant differences.

Furthermore, we use the Google Maps API3 to decode the 1804 suburb IDs in DS1 to geo-locations with latitude and longitude (generally the center of the suburb). The home locations along with users’ current locations can be used to find the physical distance that people travel away from home and the urban geo-social connections. Figure 2a illustrates the PDF and CDF of the away-from-home distance. It can be seen that 90% of people are within 20km of their home at all times. Furthermore, very few people travel over 60km away from home. Figure 2b further examines the temporal dynamics of the away-from-home distance in a bar chart for the month of January. As expected, people are generally closer to their homes on weekends than on weekdays. Moreover, we observe a highly dynamic spatial variation across the city even at the same time. Again, as overall statistics do not tell the full story of fine-grained spatio-temporal dynamics, we further study urban spatial characteristics.

2 www.yelp.com
3 https://maps.googleapis.com/

Figure 2: Away from Home Distance. (a) Distribution; (b) Temporal Dynamics. [Panel (a): PDF and CDF versus commuting distance (km). Panel (b): mean displacement (km) at 5am, 12pm and 6pm in January, weekdays versus weekends.]

Figure 3: Business Characteristics. (a) Distribution; (b) Tokyo Bay Area. [Panel (a): PDF versus business density. Panel (b): density map.]
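The away-from-home distance discussed above can be illustrated with a short sketch. It assumes that the physical distance is the great-circle (haversine) distance between a grid center and a suburb center; the paper does not state which distance formula was used, and the function names and input layout here are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def mean_away_distance(grid_center, home_counts, suburb_centers):
    """Population-weighted mean distance from home for one grid.

    grid_center:    (lat, lon) of the grid (assumed input layout)
    home_counts:    {suburb_id: number of people from that suburb in the grid}
    suburb_centers: {suburb_id: (lat, lon)}
    """
    total = sum(home_counts.values())
    if total == 0:
        return 0.0
    weighted = sum(
        n * haversine_km(*grid_center, *suburb_centers[s])
        for s, n in home_counts.items()
    )
    return weighted / total
```

Applied to every grid of a snapshot, the per-grid means could then be aggregated into a chart in the style of Figure 2b.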

We show urban geographical characteristics in Figure 3, where business density is used to describe business activity in a given area. Figure 3a shows the PDF of business density for the whole city, while Figure 3b displays the density map. The land usage of a certain area, e.g. residential or business, could potentially be identified from DS3. In this paper, we mainly use business density as a feature; however, more features could be collected to further describe the characteristics of an area, e.g. social economy [11].

3.2 Temporal Dynamics

We compare the temporal dynamics of four selected hot-spots during a period of 48 hours in Figure 4 using the finer-grained temporal dataset DS2. These hot-spots represent an exhibition center, an entertainment center, an inter-city train station and a local train station, respectively. Figures 4a to 4d show hourly box plots of the spatial distribution of population in these hot-spots. Each box plot shows the median, 25th and 75th percentiles of population among the grids of the area of interest. Along with the box plots, we also include the temporal dynamics of the population spatial entropy. We define $S_n^t$ as the spatial population entropy among $n$ grids at time $t$:

$$S_n^t = -\sum_{i=1}^{n} \frac{p_i^t \log_2(p_i^t)}{\log_2(n)}, \quad \text{where } p_i^t = \frac{N_i^t}{\sum_{j=1}^{n} N_j^t} \tag{1}$$

$N_j^t$ represents the number of people in grid $j$ at time $t$. An entropy $S_n$ close to 0 indicates that the distribution is extremely skewed, while a value close to 1 represents a more even distribution. The dynamic entropy change reflects the spatial variation of sparseness, and gives us a better understanding of the skewness of the distribution in different areas. Furthermore, all selected hot-spots display a clear diurnal pattern in both the box plots and the entropy, although the peak of sparseness occurs at a different time of the day depending on the function of the area. Despite the strong spatio-temporal fluctuation of entropy, the value is generally over 0.9. This indicates that only a limited number of grids are very different from the remaining ones in the same area, suggesting the existence of spatial correlation among neighbouring areas.
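A minimal sketch of the normalized entropy of Eq. (1) is given below. The function name is hypothetical, and empty grids are assumed to contribute zero to the sum (the usual convention that 0 · log 0 = 0), which the paper does not spell out.

```python
import math


def spatial_entropy(populations):
    """Normalized spatial entropy of Eq. (1) for one snapshot.

    populations: list of per-grid head counts N_i^t for the n grids of an area.
    Returns a value in [0, 1]; 1 means a perfectly even distribution.
    """
    n = len(populations)
    total = sum(populations)
    if n < 2 or total == 0:
        raise ValueError("need at least two grids and a non-empty population")
    entropy = 0.0
    for count in populations:
        if count > 0:  # assumption: empty grids contribute nothing
            p = count / total
            entropy -= p * math.log2(p)
    return entropy / math.log2(n)
```

Evaluating this function on each hourly snapshot of a hot-spot would give an entropy curve of the kind overlaid on the box plots in Figure 4.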

We then focus on the temporal dynamics of an individual grid. We define the rate of change of grid $i$ at time $t$, $K_i^t$, as the percentage change of population compared to the previous observation (in our case, the previous hour):

$$K_i^t = \frac{N_i^t - N_i^{t-1}}{N_i^{t-1}} \tag{2}$$
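As a small illustration of Eq. (2), the sketch below computes the rate of change for one grid from a list of consecutive hourly counts. The result is multiplied by 100 to match the percentage axis of Figure 4; returning `None` when the previous count is zero is an assumption made here, since the paper does not say how empty grids are treated.

```python
def rate_of_change(series):
    """Rate of change K_i^t (in percent) for consecutive observations of one grid.

    series: list of head counts N_i^0, N_i^1, ... for a single grid.
    Returns one value per transition; None where the previous count is zero
    (assumption: the ratio is left undefined in that case).
    """
    rates = []
    for prev, curr in zip(series, series[1:]):
        if prev == 0:
            rates.append(None)
        else:
            rates.append(100.0 * (curr - prev) / prev)
    return rates
```

For example, a grid that grows from 100 to 150 people and then drops to 75 yields +50% followed by -50%.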

From Figure 4e to Figure 4h, interestingly, some grids present strong daily dynamic patterns (e.g. some grids at Location 3), while other grids do not change significantly throughout the day. More specifically, the rate of change swings significantly higher in the inter-city station, while it is quite stable in most areas of the local major station. The temporal evolution in these selected hot-spots gives us a better understanding of the fine-grained spatio-temporal dynamics of people.

We constructed a geo-social connectivity graph (details in Section 5), and present some initial findings in this paragraph. Figure 5a illustrates the CDF of diversity (degree of connectivity), i.e. the number of suburbs that the people in a given area come from. Location 1, an exhibition center, is surprisingly less diverse than the overall Tokyo baseline. In contrast, Location 3, an inter-city train station, is more diverse, with more than three times the diversity of the overall city-level observation. Stronger diversity along with higher dynamics could potentially indicate an area more suitable for certain types of ad content. We further study the characteristics and dynamics of connectivity below. Moreover, intuitively, popular nodes attract people from various locations. Figure 5b shows the degree of connectivity of the whole city. The degree of connectivity among ranked nodes approximately follows a power-law distribution, i.e. Zipf (linear in a log-log plot), which suggests that a small number of very popular nodes account for a majority of connections.

Lastly, we investigate how a certain connection evolves over time, and attempt to understand whether the evolution is predictable. We identify a unique connection by the ID numbers of both the grid node and the home node. We first measure how significantly the sets of connections evolve over time using Jaccard similarity. In total, we observe approximately 1M (992,957) unique connections, and 4.9M in total over time when repeated ones are counted. Figure 6 shows the Jaccard similarity of connection sets at different times of day. Quite surprisingly, the pair-wise similarity of connection sets at 5am on different days is lower than at other times of the day. 12pm seems to be the most stable time of day, with around 40% of connections shared across different days. In addition, weekdays are more similar than weekends, with 5am being an exception.
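The Jaccard comparison of connection sets can be sketched as follows. Each connection is represented as a (grid ID, home ID) pair, as described above; averaging over all pairs of days is an assumption about how the per-time-of-day values in Figure 6 were aggregated, and the function names are hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity |A ∩ B| / |A ∪ B| of two connection sets."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def mean_pairwise_jaccard(snapshots):
    """Mean Jaccard similarity over all pairs of snapshots.

    snapshots: list of sets of (grid_id, home_id) connections, e.g. the
    12pm snapshot of several different days.
    """
    pairs = list(combinations(snapshots, 2))
    if not pairs:
        raise ValueError("need at least two snapshots")
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)
```

A value around 0.4 for the 12pm snapshots would correspond to the roughly 40% of shared connections reported above.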

Figure 4: Temporal Evolution in Selected Hot Spots. (a) Location 1 (exhibition center); (b) Location 2 (entertainment); (c) Location 3 (inter-city station); (d) Location 4 (local station); (e) Location 1; (f) Location 2; (g) Location 3; (h) Location 4. [Panels (a)-(d): population and entropy versus hours. Panels (e)-(h): rate of change (%) per area versus hours.]

Figure 5: Spatial Distribution. (a) Spatial Diversity; (b) Spatial Connectivity. [Panel (a): CDF of degree of connectivity for Locations 1-4 and All Tokyo (Baseline). Panel (b): degree of connectivity versus ranked nodes, log-log axes.]

Figure 6: Jaccard Similarity of Connections. [Jaccard similarity at 5am, 12pm and 6pm, weekdays versus weekends.]

4. EFFICIENT CONTENT DISTRIBUTION IN DOOH NETWORK

In existing DOOH ad content distribution systems, advertisers need to decide when and where to display the “right” ads for the “right” people. Therefore, advertisers are required to either bid or come up with a set of locations when they wish to distribute certain ads. To the best of our knowledge, there has been no work optimizing ad distribution for a “cohort of users” in DOOH considering the spatio-temporal dynamics of users. We attempt to solve the problem by exploiting urban geo-social connectivity constructed from mobile users. The assumption we make is that there exists a spatial correlation of user preference based on local community. For example, people living in the city CBD are expected to have different ad preferences from people living in rural areas. This assumption is partially backed by the spatio-temporal correlation of user interest observed at the mobile edge [10] and the spatio-temporal dynamics patterns we showed in Section 3. Thus, we consider the scenario in which an advertiser needs to determine a set of locations at each time window t at which to push their ad content.

5. PROBLEM FORMULATION

We model urban connectivity in the form of user flow as a dynamic graph $G^t$, consisting of two types of nodes for the considered time window $t$: 1) $L^t$, location nodes representing each location grid in the city, and 2) $H^t$, home nodes representing the suburbs of registered homes of people. There is a subgraph $G_i^t$ for each location node $L_i^t \in L^t$, connecting $L_i^t$ to a set of home nodes based on the home locations of the people at $L_i^t$ at time $t$. As such, we define an edge $e_{i,j}^t \in E$ between $L_i^t$ and $H_j^t$ if there is at least one user from $H_j^t$ at $L_i^t$. The edge weight of $e_{i,j}^t$ is defined as the number of people connecting the two nodes, $N_{i,j}^t$.

With the aid of DS1, we attempt to model the geo-social connectivity graph $G^t$. For each unique connection, we construct a temporal list indicating the existence of the connection at different times. We then apply Logistic Regression to predict the existence of a certain geo-social connection. Results show a learning score of around 0.7944, and regularization strength does not significantly change the result. Although the existence of a connection can be predicted relatively easily, with approximately 80% success rate, predicting the strength of a certain geo-social connection is quite challenging. Polynomial regression of various degrees is trained on DS1 to predict how strong each connection is temporally; however, only a considerably low $R^2$ ($< 5 \times 10^{-4}$) could be achieved. Therefore, the individual strength of connectivity is hard to predict due to its high dynamics, and a data-driven approach is more suitable than direct modeling.
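The dynamic graph described above can be held in a simple nested mapping. The sketch below is one possible representation, not the authors' implementation: it assumes the raw data arrives as (time window, grid ID, suburb ID, head count) records, and all names are hypothetical.

```python
def build_connectivity_graph(records):
    """Build the per-time-window bipartite graph G^t from raw records.

    records: iterable of (time_window, grid_id, suburb_id, people) tuples
             (assumed input layout).
    Returns {time_window: {grid_id: {suburb_id: N_ij}}}; an edge exists only
    if at least one person from the suburb is present in the grid.
    """
    graph = {}
    for t, grid, suburb, people in records:
        if people <= 0:
            continue  # no user, hence no edge
        edges = graph.setdefault(t, {}).setdefault(grid, {})
        edges[suburb] = edges.get(suburb, 0) + people
    return graph


def degree_of_connectivity(graph, t, grid):
    """Number of distinct home suburbs connected to a grid at time t."""
    return len(graph.get(t, {}).get(grid, {}))
```

The inner dictionary of a grid plays the role of the subgraph $G_i^t$, and its size is the degree of connectivity plotted in Figure 5.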

5.1 Known Spatial User Preference

We assume that an advertiser needs to distribute an ad $\alpha$ to a set of locations $L'$. As node connectivities change over time, we solve the dynamic problem by solving for each time window $t$. Further, we presume that individual user preference towards one ad is binary, i.e. people either like or dislike the ad $\alpha$. Spatial user preference is defined as the percentage of people who are interested in an ad among the total population of the considered area. For each home node $H_j^t \in H^t$, the spatial user preference for ad $\alpha$ is denoted as $h_j^\alpha$, where $h_j^\alpha \in (0, 1)$ and $j \in (1, 2, ... |H^t|)$. We define the cost of displaying an ad at location $L_i$ to be $C_i$ and the total cost constraint per ad as $C_{max}^\alpha$. Given the number of people $N_i^t$ in grid $L_i^t$ and its connections with home nodes $H_j^t$, the problem is to find a set of locations $L' \subset L^t$ at which to display the ad $\alpha$ at each time window $t$ that maximizes the exposure to the total number of users $\Phi$ who are interested in $\alpha$ under the budget constraint $C_{max}^\alpha$. Thus, we formulate the problem as follows4:

$$\text{Maximize} \sum_{\forall L_i \in L'} \sum_{\forall e_{i,j}} N_{i,j} \times h_j^\alpha$$

$$\text{s.t.} \sum_{\forall L_i \in L'} C_i \le C_{max}^\alpha, \qquad C_i \ge 0$$

$f(L_i) = \sum_{G_i} N_{i,j} \times h_j^\alpha$ computes the total number of people who are interested in $\alpha$ at location $L_i$, by considering its connections with home nodes. If we consider $f(L_i)$ as the value of each location and $C_i$ as the weight of each location $L_i$, then it is straightforward to show that our problem is equivalent to the standard Knapsack problem, where a knapsack of size $C_{max}$ needs to be filled to the maximum value by selecting a subset of items $L' \subset L^t$. It follows that our problem is also NP-Hard, so no polynomial-time exact algorithm is known, even for one time window. However, the Knapsack problem is well studied, with a number of approximation solutions. A reasonable approximation can be found by a local greedy solution that maximizes the value per unit cost. We denote $\Phi' = \sum_{\forall L_i \in L'} f(L_i)$ as the total value of the selected set $L'$.

5.1.1 Multiple knapsack problem

We can further extend the problem to a list of advertisers with ads $\{\alpha, \beta, \gamma ...\}$ to be distributed, where optimal location sets $\{L_1, L_2, L_3 ...\}$ need to be determined that maximize the total value for all advertisers. Similarly, people’s interests towards $\beta$ and $\gamma$ for all nodes are denoted as $h_1^\beta, h_2^\beta, h_3^\beta, ... h_n^\beta$ and $h_1^\gamma, h_2^\gamma, h_3^\gamma, ... h_n^\gamma$ respectively. As a result, the value of each location $L_i$ is different for each advertiser. We assume that the same digital screen can only display one ad in any given time window. From a DOOH system point of view, the selections of each $L_k$ might be conflicting goals. The optimization of the whole system is a more complicated problem to study. In this paper, we focus on the single knapsack problem, but the same methodology can be extended to the multi-knapsack problem.

4 For the remainder of the formalization, we drop the time window notation t for brevity.

5.1.2 Naive Solution

We assume that advertisers currently choose their locations based purely on their budget and the total number of people the ad needs to be displayed to. A higher density of business $M_i$ is generally correlated with higher consumer flow, and business density can easily be determined from yellow pages or online location-based service websites such as Yelp. One simple approach is therefore to maximize the metric $M_i / C_i$ (business density per cost) and only display ads in the areas with the highest number of consumers per dollar over the long term. This solution, however, ignores the dynamics of mobile users and user correlation. We denote this method as “Current Naive Method” and compare its performance with the other algorithms.
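A sketch of the naive baseline, ranking locations by business density per cost, is shown below. How the budget is filled once the ranking is known is not specified in the text; here locations that no longer fit are simply skipped, which is an assumption, as is the function name.

```python
def naive_select(business_density, costs, budget):
    """Static baseline: rank locations by M_i / C_i and fill the budget.

    business_density: {location: M_i}, e.g. number of listed businesses
    costs:            {location: C_i}, assumed positive
    """
    ranked = sorted(
        business_density,
        key=lambda loc: business_density[loc] / costs[loc],
        reverse=True,
    )
    chosen, spent = [], 0
    for loc in ranked:
        if spent + costs[loc] <= budget:  # assumption: skip what does not fit
            chosen.append(loc)
            spent += costs[loc]
    return chosen
```

Because the ranking depends only on static business density and cost, the selected set is the same for every time window.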

5.2 Limited Knowledge on User Preference

In the previous subsection, we formulated the problem of finding the optimal set of locations to distribute a given ad $\alpha$. However, accurate knowledge of fine-grained spatial user preference is generally not available. As a result, in this section, we discuss how we could leverage user correlation when only limited knowledge about user preference is available. We assume advertisers only know the top $k$ home nodes that have the highest preference for an ad $\alpha$. We propose a heuristic algorithm based on spatial similarity that outputs a list of potential locations $L''$ based on the limited knowledge of $h_j^\alpha$, $j \in (1, 2, ... |H^t|)$.

5.2.1 Edge vector Distance

Methods to compare graph similarity have been extensively studied in the literature, e.g. the iterative method SimRank [9], feature extraction methods [6], and graph isomorphism methods such as Gromov-Hausdorff distances [16] and Eigenvector Similarity [12]. Further related work is surveyed in [18]. In our application, we are particularly interested in whether a node in one graph is similar to a node in another graph, based on the neighbor nodes it is connected to.

We use edge vector distance as the measure of user correlation that best suits our scenario. Specifically, let $e_1, e_2, ... \in E$ be the edges shared by graphs $G_1$ and $G_2$ ($E = E_1 \cap E_2$). We construct vectors $v_1$ and $v_2$, where $v_i$ is the edge distance in each graph. We define the edge length $\epsilon_e^t = N_{i,k}^t$, where $N_{i,k}^t$ is the number of people connecting location node $L_i$ with home node $H_k$ at time $t$. We compare edges, giving each edge $e$ a weight $\epsilon_e^t$ that captures its local topology. In Eq. (3), $\epsilon_{e,1}^t$ and $\epsilon_{e,2}^t$ denote the length of the shared edge $e$ in $G_1$ and $G_2$ respectively.

$$d^t(L_1, L_2) = \frac{\sum_{e \in E_1 \cap E_2} \frac{(\epsilon_{e,1}^t - \epsilon_{e,2}^t)}{\max(\epsilon_{e,1}^t, \epsilon_{e,2}^t)}}{|E_1 \cap E_2|} \tag{3}$$
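A minimal sketch of Eq. (3) follows. Two assumptions are made and should be read as such: the difference of edge lengths is taken in absolute value, so that the result is a non-negative, symmetric distance (the printed equation shows plain parentheses), and two locations that share no home node are assigned an infinite distance. The function name is hypothetical.

```python
def edge_vector_distance(edges_1, edges_2):
    """Edge vector distance between two location nodes, after Eq. (3).

    edges_1, edges_2: {home_id: number of people} for each location; an edge
    exists only with at least one person, so all weights are positive.
    """
    shared = edges_1.keys() & edges_2.keys()
    if not shared:
        return float("inf")  # assumption: no common home node, maximally distant
    total = 0.0
    for home in shared:
        w1, w2 = edges_1[home], edges_2[home]
        total += abs(w1 - w2) / max(w1, w2)  # assumption: absolute difference
    return total / len(shared)
```

With this normalization every shared edge contributes a value in [0, 1), so the distance between two locations with at least one common home node also lies in [0, 1).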

5.2.2 Spatial Similarity based Heuristic Algorithm

Given the knowledge of the top $K'$ location set $L'$, we would like to identify a candidate location set $L''$ under the budget constraint. We do this by leveraging spatial similarity derived from edge vector distance. The detailed algorithm is described in Algorithm 1. In general, we first find the points with the shortest mean pair-wise distance to all top $K'$ locations. These are the points that are most similar to the top $K'$ locations with regard to the neighbor nodes they are connected to and how they are connected. We then iteratively add these points to the output set based on their similarity until the budget is used up.

Algorithm 1 Similarity Based Heuristic Algorithm
Input: current graph connectivity, top $K'$ locations $L'$
Output: set of target locations $L''$
initialization: current graph $G^t$, initial top $K'$ locations $L' = L_1, L_2 .. L_k$
forall the possible node pairs $\{L_i, L_j\} \in G^t$ do
    Edge vector distance; $d^t(L_i, L_j) \leftarrow$ Eq.(3);
end
distList = list()
for $L_i \in G^t$ do
    $D_i = \sum_{L_c \in L'} d^t(L_i, L_c)$
    distList.add($D_i$)
end
distList.sort()
while $\sum_{L_i \in L''} (C_i) \le C_{max}^\alpha$ do
    $L''$.add($L_i$) from head of distList
end

Table 1: Summary of Initial Parameters

Parameter            Value
$C_{max}^\alpha$     1000
$C_i$                $\in [1, 100]$
$\alpha_i$           $\in [0, 1]$
$K'$                 3
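The selection loop of Algorithm 1 can be sketched in a few lines. To keep the example independent of any particular distance implementation, the pairwise distance is passed in as a callable; stopping at the first location that would exceed the budget is one reading of the while condition and is an assumption, as are the function and parameter names.

```python
def similarity_heuristic(locations, known_top, distance, costs, budget):
    """Pick locations most similar to the known top-K' locations.

    locations: iterable of candidate location ids (the known ones included,
               as in Algorithm 1, which scores every node of the graph)
    known_top: list of the top-K' location ids the advertiser already knows
    distance:  callable distance(a, b) giving the edge vector distance
    costs:     {location: C_i}
    budget:    C_max
    """
    # D_i: summed distance of each candidate to all known top locations
    ranked = sorted(
        locations,
        key=lambda loc: sum(distance(loc, ref) for ref in known_top),
    )
    targets, spent = [], 0
    for loc in ranked:
        if spent + costs[loc] > budget:
            break  # assumption: stop once the next-best location does not fit
        targets.append(loc)
        spent += costs[loc]
    return targets
```

Ranking by the sum of distances gives the same order as ranking by the mean, since every candidate is compared against the same $K'$ reference locations.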

6. EVALUATION

6.1 Known Spatial User Preference

We developed a Python simulator to evaluate the performance of the system in a real-world setup. First, we assume the spatial variation of user preference is known. In the next subsection, we further evaluate the performance of our proposed algorithm when limited knowledge about spatial user preference is given. A summary of the parameters used in the simulation is given in Table 1. We initialize the cost $C_i$ of each location $L_i$ with a random value between 1 and 100 and the user preference $\alpha_i$ with a random value between 0 and 1. Furthermore, if not specified, the budget $C_{max}^\alpha$ is set to 1000. The random algorithm is repeated 50 times, and the mean value is taken for comparison.

We first show the temporal variation of the total value $\Phi'$ using DS2, which indicates the temporal fluctuation of total eyeballing by users of interest. In Figure 7, we show the performance of four algorithms in four hot-spots with an hourly time window. The greedy algorithm performs approximately the same as the optimal solution ($\sim 99\%$) across all four different locations over time. In addition, advertisers’ current naive solution does not always perform better than the random algorithm; this depends on the time and location. Interestingly, the total value derived by the optimal solution displays a diurnal pattern due to the dynamic changes of the urban geo-social graph.

We further test all algorithms for the whole city, simulating DS1, in Figure 8. The knapsack solution again performs very similarly to the greedy solution. Moreover, it can be seen that the greedy solution, which considers user correlation, performs consistently better than both the current naive solution and the random algorithm. In fact, it achieves on average around 300% improvement in total value $\Phi'$ compared to the current solution and a 2800% improvement against the random algorithm. Furthermore, the current naive solution, which considers both spatial characteristics and cost, outperforms the random algorithm by almost 8 times, although a huge performance gain could still be achieved. This might be a result of the correlation between business density and user density. The performance of the greedy algorithm is time variant, while the random and current naive solutions perform relatively stably over time.

Figure 7: Temporal performance variation. (a) Location 1; (b) Location 2; (c) Location 3; (d) Location 4. [Total value versus time (hr) for the Knapsack, Greedy, Current and Random algorithms.]

Figure 8: Overall City Performance. [Total value of the Greedy, Current and Random algorithms for weekday and weekend morning, noon and evening.]

6.2 Performance of Heuristic AlgorithmWe then evaluate the performance of our proposed heuris-

tic algorithm when user preference is not completely known.We first vary the value of KÕ, and compare the performanceof the proposed similarity based heuristic algorithm to ran-dom algorithm. It could be seen that only by knowingthe top 3 locations, where the highest number of users likethe content, performance could be significantly improved.By leveraging user correlation, distribution e�ciency is im-proved by over 10 times against a random algorithm. Inaddition, this little information could also improve advertis-ers’ current algorithm performance by approximately 35%.However, higher KÕ value does not dramatically improvethe performance further as shown in Figure 9a. In addi-tion, we also vary the budget amount C–

max

and comparethe performance with regards to total value �Õ in Figure 9b.In general, our proposed heuristic algorithm always outper-forms the current solution, although the gap narrows as thebudget amount C–

max

increases.A performance comparison is shown in Table 2 to com-

pare the current solution and heuristic algorithm with theupper (knapsack optimal) and lower bound (random). Thenormalized total value �Õ is used for comparison. We find


[Figure 9: Performance Evaluation. (a) Knowledge of user: x-axis K′ Value (1 to 8); y-axis Performance Ratio; series Heuristic/Random, Std. (b) Varying $C^{\alpha}_{\max}$: x-axis $C^{\alpha}_{\max}$ (500 to 2500); y-axis Total value (×10^7); series Heuristic(3), Current, Random.]

Table 2: Performance Evaluation

Algorithm       Normalized Total Value �′   Per. gain
Random          1                           -
Current Naive   7.5                         650%
Heuristic(1)    9.4                         25.3%
Heuristic(3)    10.1                        7.4%
Optimal         29.2                        189.1%

[Figure 10: Heuristic Performance. Panels: (a) Location 1, (b) Location 2, (c) Location 3, (d) Location 4; x-axis: Time (hr); y-axis: Ratio to Current Algo; series: Heuristic (1), Heuristic (3), Heuristic (5), Optimal.]

that even K′ = 1 would significantly improve current content distribution efficiency, by 25%. A higher K′ translates to a higher performance gain; however, the rate of gain slows down as K′ increases. The last column of Table 2 shows the performance gain of each algorithm over the algorithm in the previous row. Finally, we also evaluate the heuristic algorithm on the four hotspots by varying the amount of known user information K′ in Figure 10. It can be seen again that the performance of the heuristic algorithm depends on both location and time. The heuristic algorithm performs very close to the optimal solution in some locations, while it is far from optimal in others. As a result, a traditional static optimization method would perform poorly in real-world scenarios compared to data-driven approaches.
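The last column of Table 2 can be checked with a few lines of arithmetic. The sketch below uses the normalized total values reported in the table and assumes that each gain is measured against the row directly above it; the function name is illustrative.

```python
# Normalized total values from Table 2, in row order.
rows = [
    ("Random", 1.0),
    ("Current Naive", 7.5),
    ("Heuristic(1)", 9.4),
    ("Heuristic(3)", 10.1),
    ("Optimal", 29.2),
]

def gain_over_previous(rows):
    """Percentage gain of each row over the row before it, to one decimal place."""
    gains = {}
    for (_, prev), (name, cur) in zip(rows, rows[1:]):
        gains[name] = round((cur / prev - 1.0) * 100.0, 1)
    return gains

gains = gain_over_previous(rows)
for name, gain in gains.items():
    print(f"{name}: {gain}%")

# Gain of Heuristic(3) measured directly against the current naive solution.
heuristic3_vs_current = round((10.1 / 7.5 - 1.0) * 100.0, 1)
print(f"Heuristic(3) vs Current Naive: {heuristic3_vs_current}%")
```

The computed values reproduce the 650%, 25.3%, 7.4% and 189.1% entries, and the direct comparison of Heuristic(3) with the current naive solution gives about 34.7%, in line with the roughly 35% improvement quoted in the text.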

7. RELATED WORK
A DOOH ad delivery system was initially proposed in [5] in a distributed manner, where demographic data is tracked by individual displays. Phan et al. [20] present a content management system for delivering both advertising and non-advertising content for digital signage systems. Their system is capable of receiving, storing and scheduling content on a location-based out-of-home advertising network. The claimed scheduling algorithm determines the available inventory slot (screen location and time) based on registered user information. The work in [19] extended the system by creating a network program wheel to manage time slots in the network. The above systems require users to connect to the system for registration. AdTorrent [17] is a system for targeted advertisement distribution in a vehicular network. The system targets individual mobile users and integrates search, ranking and ad content delivery in its architecture. In [25], the authors designed a location-aware mobile digital signage system (LDSS) based on GPS and wireless infrastructure, in contrast to traditional static digital signage. An advertisement recommendation algorithm was proposed and compared with traditional advertising methods, i.e., region-triggered and sequential advertising. However, these works focus on individual users and do not take into account the spatio-temporal dynamics of “cohort of users” interest.

Although targeting individual users has proven successful for online advertising, it has its limitations in DOOH networks due to privacy concerns. There has been intensive research in both academia and industry in the field of DOOH attempting to improve ad distribution without targeting individual users. In [13], the authors addressed the challenges and limitations of personalized advertising in DOOH, and suggested situationalization, which delivers content relevant to an individual or a group of individuals based on the context. They further proposed the PERSIT matrix with an adaptation strategy between personalization and situationalization. PERSIT differs from our work in that it only provides a guide on when to adapt, but does not consider how to improve ad distribution in a system. Satoh [22] presented a framework for context-aware digital signage. This work is limited to displaying location-based content on a single digital screen, rather than across a DOOH system. Furthermore, it does not take into account the spatio-temporal dynamics of the users who would consume the content.

Finally, our optimization problem is related to the “billboard/retail store location selection” problem in the field of land economy [15]. Geo-spotting [11] used location-based social network check-in data (Foursquare) to identify the optimal location for a new retail store. In [14], the authors attempted to tackle the problem by combining visualization and data mining using taxi trajectory data. However, both our problem formulation and our methodology are significantly different, and user dynamics are discussed in neither work.

8. LIMITATIONS AND FUTURE WORK
In this work, we mainly exploit the spatio-temporal dynamics of users in the form of geo-social connectivity with a large-scale mobile dataset, and there are factors that could be studied further, namely spatial user preference and additional spatial similarity features. Firstly, in our work, user interest/preference in a certain area towards ad α is initialized randomly. The existence of spatial user preference could further improve ad distribution by a similarity-based heuristic algorithm. However, details about spatial user preference are not well understood, apart from the discussion in a few studies [26]. Further questions about user preference that might be answered through social networks or survey data are:

• Is there spatial user correlation of preference? Does a shorter distance between two areas link to a higher similarity of user preferences?

• What other spatial characteristics could potentially contribute to the dynamic similarity graph?
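The first question could be explored, once per-area preference data is available, by correlating pairwise distance with pairwise preference similarity. The sketch below is only an illustration under stated assumptions: the area centroids, the three-category preference vectors, and the choice of cosine similarity and Pearson correlation are all hypothetical and are not taken from the dataset or method used in this paper.

```python
import math
from itertools import combinations

def cosine(u, v):
    """Cosine similarity between two preference vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))

def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical areas: centroid in km and preference shares over three ad categories.
areas = {
    "A": ((0.0, 0.0), (0.8, 0.1, 0.1)),
    "B": ((1.0, 0.0), (0.7, 0.2, 0.1)),
    "C": ((5.0, 0.0), (0.2, 0.6, 0.2)),
    "D": ((9.0, 0.0), (0.1, 0.2, 0.7)),
}

distances, similarities = [], []
for a, b in combinations(areas, 2):
    (pos_a, pref_a), (pos_b, pref_b) = areas[a], areas[b]
    distances.append(math.dist(pos_a, pos_b))         # Euclidean distance between centroids
    similarities.append(cosine(pref_a, pref_b))       # similarity of preference vectors

r = pearson(distances, similarities)
print(f"distance vs. preference similarity: r = {r:.2f}")
```

A clearly negative coefficient on real data would suggest that nearby areas tend to share preferences, which would support adding distance as a feature of the similarity graph.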


These questions related to spatial user preference, if understood, could be extended into a broad field of study. The similarity graph could then be further extended to include these correlations and characteristics. Secondly, this paper mainly addresses the problem of spatial optimization within a given time window. We intend to extend this work to optimizing a content distribution system in DOOH, where multiple content items with different user preferences need to be distributed.

Finally, we have constructed a spatial geo-social similarity graph based on spatio-temporal user dynamics. The graph could be built once, used by various services, and updated periodically. We believe the graph could potentially be used by a large number of mobile location-based services and applications, e.g. to improve recommendation in location-based systems [2, 1]. Furthermore, knowledge of spatial user preference could be collected much more easily through social networks or surveys than individual user data. These graphs could be highly suitable for tasks such as urban planning and land economy planning [3]. Our future work also includes blending data from different sources to improve the geo-social graph.

9. CONCLUSION
In this paper, we discuss how mobile data could be exploited to improve content distribution efficiency in DOOH advertising networks. More specifically, we first analyze a mobile dataset consisting of fine-grained population dynamics and urban geo-social links. We construct a dynamic graph of urban connectivity using the dataset, and formulate an optimization problem of reaching the highest number of “right” people at the “right” time. In addition, a heuristic algorithm is proposed to solve the problem when only limited user preference is known. We evaluate multiple algorithms with data-driven simulations, and show an improvement of over 300% compared to advertisers’ current distribution approach. This is achieved by targeting “cohort of users” preference and considering the spatio-temporal dynamics of those user preferences. Finally, our constructed urban similarity graph could potentially be used by many other mobile services and geo-social applications.

Acknowledgment
This research was supported in part by Data61, CSIRO, UNSW, the Australian Government Research Training Program Scholarship and NII. The work was partially conducted under the MOU program when the author was at NII, Japan. We thank our colleagues who provided insights and expertise that greatly assisted the research.

10. REFERENCES
[1] Bao, J., Zheng, Y., and Mokbel, M. F. Location-based and preference-aware recommendation using sparse geo-social networking data. In Proceedings of the 20th international conference on advances in geographic information systems (2012), ACM, pp. 199–208.

[2] Bao, J., Zheng, Y., Wilkie, D., and Mokbel, M. Recommendations in location-based social networks: a survey. Geoinformatica 19, 3 (2015), 525–565.

[3] Batty, M. Big data, smart cities and city planning. Dialogues in Human Geography 3, 3 (2013), 274–279.

[4] Bauer, C., Dohmen, P., and Strauss, C. Interactive digital signage-An innovative service and its future strategies. In Emerging Intelligent Data and Web Technologies (EIDWT), 2011 International Conference on (2011), IEEE, pp. 137–142.

[5] Carney, P. J., Pina, J. B., Boyle, J. J., and Perine, C. A. System and method for delivering out-of-home programming, June 18 2002. US Patent 6,408,278.

[6] Cha, S.-H. Comprehensive survey on distance/similarity measures between probability density functions. City 1, 2 (2007), 1.

[7] EMarkerter. Propped by Digital Growth, Out-of-Home Advertising Holds Its Own in UK, France. Tech. rep., 2015.

[8] EMarkerter. U.S. Digital Out-of-home advertising. Tech. rep., 2015.

[9] Jeh, G., and Widom, J. Simrank: a measure of structural-context similarity. In Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining (2002), ACM, pp. 538–543.

[10] Jiang, F., Thilakarathna, K., Kaafar, M. A., Rosenbaum, F., and Seneviratne, A. A spatio-temporal analysis of mobile internet traffic in public transportation systems: A view of web browsing from the bus. In Proceedings of the 10th ACM MobiCom Workshop on Challenged Networks (2015), ACM, pp. 37–42.

[11] Karamshuk, D., Noulas, A., Scellato, S., Nicosia, V., and Mascolo, C. Geo-spotting: mining online location-based services for optimal retail store placement. In Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining (2013), ACM, pp. 793–801.

[12] Koutra, D., Parikh, A., Ramdas, A., and Xiang, J. Algorithms for graph similarity and subgraph matching. In Technical report. Carnegie-Mellon-University, 2011.

[13] Lasinger, P., and Bauer, C. Situationalization, the new road to adaptive digital-out-of-home advertising. In Proceedings of IADIS International Conference e-Society (2013), pp. 162–169.

[14] Liu, D., Weng, D., Li, Y., Bao, J., Zheng, Y., Qu, H., and Wu, Y. Smartadp: Visual analytics of large-scale taxi trajectories for selecting billboard locations. IEEE transactions on visualization and computer graphics (2016).

[15] Malczewski, J. Gis-based multicriteria decision analysis: a survey of the literature. International Journal of Geographical Information Science 20, 7 (2006), 703–726.

[16] Mémoli, F. Gromov-hausdorff distances in euclidean spaces. In Computer Vision and Pattern Recognition Workshops, 2008. CVPRW’08. IEEE Computer Society Conference on (2008), IEEE, pp. 1–8.

[17] Nandan, A., Das, S., Zhou, B., Pau, G., and Gerla, M. Adtorrent: digital billboards for vehicular networks. In Proc. of IEEE/ACM International Workshop on Vehicle-to-Vehicle Communications (V2VCOM), San Diego, CA, USA (2005).

[18] Papadimitriou, P., Dasdan, A., and Garcia-Molina, H. Web graph similarity for anomaly detection. Journal of Internet Services and Applications 1, 1 (2010), 19–30.

[19] Phan, J. Web-based system and method to implement digital out-of-home advertisements, Dec. 1 2011. US Patent App. 13/115,773.

[20] Phan, M., Woo, D. Q., and Araki, J. Content management in out-of-home advertising networks, 2010.

[21] Ranganathan, A., and Campbell, R. H. Advertising in a pervasive computing environment. In Proceedings of the 2nd international workshop on Mobile commerce (2002), ACM, pp. 10–14.

[22] Satoh, I. A framework for context-aware digital signage. In International Conference on Active Media Technology (2011), Springer, pp. 251–262.

[23] Shimosaka, M., Maeda, K., Tsukiji, T., and Tsubouchi, K. Forecasting urban dynamics with mobility logs by bilinear Poisson regression. In Proceedings of the 2015 ACM International Joint Conference on Pervasive and Ubiquitous Computing (2015), ACM, pp. 535–546.

[24] Stalder, U. Digital out-of-home media: means and effects of digital media in public space. In Pervasive Advertising. Springer, 2011, pp. 31–56.

[25] Yu, K.-M., Yu, C.-Y., Yeh, B.-H., Hsu, C.-H., and Hsieh, H.-N. The design and implementation of a mobile location-aware digital signage system. In Mobile Ad-hoc and Sensor Networks (MSN), 2010 Sixth International Conference on (2010), IEEE, pp. 235–238.

[26] Zheng, Y., Xie, X., and Ma, W.-Y. Geolife: A collaborative social networking service among user, location and trajectory. IEEE Data Eng. Bull. 33, 2 (2010), 32–39.
