Understanding the Impact of Weather for POI Recommendations

Christoph Trattner∗, Know-Center, Austria

Alexander Oberegger, TUG, Austria

Lukas Eberhard, TUG, Austria

Denis Parra, PUC, Chile

Leandro Marinho, UFCG, Brasil

ABSTRACT

POI (point of interest) recommender systems for location-based social network services, such as Foursquare or Yelp, have gained tremendous popularity in the past few years. Much work has been dedicated to improving recommendation services in such systems by integrating different features that are assumed to have an impact on people’s preferences for POIs, such as time and geolocation. Yet, little attention has been paid to the impact of weather on the users’ final decision to visit a recommended POI. In this paper we contribute to this area of research by presenting the first results of a study that aims to predict the POIs that users will visit based on weather data. To this end, we extend the state-of-the-art Rank-GeoFM POI recommender algorithm with additional weather-related features, such as temperature, cloud cover, humidity and precipitation intensity. We show that using weather data not only significantly increases the recommendation accuracy in comparison to the original algorithm, but also outperforms its time-based variant. Furthermore, we present the magnitude of impact of each feature on the recommendation quality, showing the need to study the weather context in more detail in the light of POI recommendation systems.

Keywords

POI Recommender Systems; Location-based services; Weather-Context

∗Corresponding author: [email protected]

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

RecTour’16, September 15–19, 2016, Boston, USA. © 2016 ACM. ISBN xxx-xxxx-xx-xxx/xx/xx. DOI: xxxx

1. INTRODUCTION

Location-based social networks (LBSN) enable users to check in and share places and relevant content, such as photos, tips and comments that help other users in exploring novel and interesting places in which they might not have been before. Foursquare¹, for example, is a popular LBSN with millions of subscribers doing millions of check-ins every day all over the world². This vast amount of check-in data, publicly available through Foursquare’s data access APIs, has recently inspired many researchers to investigate human mobility patterns and behaviors with the aim of assisting users by means of personalized POI (points of interest) recommendation services [15,16].

Problem Statement. The problem we address in this paper is the POI recommendation problem. That is, given a user $u$ and her check-in history $L_u$, i.e., the POIs that she has visited in the past, and current weather conditions $C = \{c_1, \dots, c_{|C|}\}$, where $c_i$ are weather features such as temperature, wind speed, pressure, etc., we want to predict the POIs $\hat{L}_u = \{l_1, \dots, l_{|L|}\}$ that she will likely visit in the future that are not in $L_u$.

Objective. Most of the existing approaches on POI recommendation exploit three main factors (aka contexts) of the data, namely, social, time and geolocation [5, 10, 15]. While these approaches work reasonably well, little attention has been paid to weather, a factor that may potentially have a major impact on users’ decisions about visiting a POI or not. For example, if it is raining in a certain period of time and place, the user may prefer to check in at indoor POIs.

In this paper we contribute to this area of research by presenting the first results of a recently started project that exploits weather data to predict, for a given user within a given city, the POIs that she will likely visit in the future. To this end, we extract several weather features based on data collected from forecast.io, such as temperature, cloud cover, humidity or precipitation intensity, and feed them into a state-of-the-art POI recommender algorithm called Rank-GeoFM [10].

Research Questions. To drive our research, the following three research questions were defined:

• RQ1. Do weather conditions have a relation with the check-in behaviour of Foursquare users?

• RQ2. Is it possible to improve current POI recommendation quality using these weather features?

• RQ3. Which weather features provide the highest impact on the recommendations?

Contributions. To the best of our knowledge, this is the first paper that investigates in detail the extent to which weather features such as temperature, cloud cover, humidity or precipitation intensity have an impact on users’ check-in behaviors and how these features perform in the context of POI recommender systems. Although there is literature showing that POI recommender systems can be improved by using some kind of weather context, such as temperature, it is not yet clear how much such features add or which type of weather feature is the most or the least useful. Another contribution of this paper is the introduction of a weather-aware recommender method that builds upon a very strong state-of-the-art POI recommender system called Rank-GeoFM. The method is implemented and embedded into the very popular recommender framework MyMediaLite [7] and can be downloaded for free from our GitHub repository (see Section 8 for details).

¹ https://foursquare.com/
² https://foursquare.com/infographics/10million

City          #Check-Ins   #Venues   #Users   Sparsity
Minneapolis   37,737       797       436      89.1%
Boston        42,956       1,141     637      94.3%
Miami         29,222       796       410      91.0%
Honolulu      16,042       410       173      77.4%

Table 1: Basic statistics of the dataset.

Outline. The structure of this paper is as follows: In Section 2 we highlight relevant related work in the field. Section 3 describes how we enriched Rank-GeoFM with weather data. Section 4 describes the experimental setup and presents results from our empirical analysis. Section 5 presents insights on the results obtained with our weather-aware recommender approach. Finally, Sections 6 and 7 conclude the paper, with a summary of our main findings and future directions of the work.

2. RELATED WORK

With the advent of LBSNs, POI recommendation rapidly became an active area of research within the recommender systems, machine learning and GIS research communities [2]. Most of the existing research works in this area exploit some combination of (some or all of) the following data sources: check-in history, social (e.g. friendship relations), time and geolocations [1,5,6,8,10,13,15]. While these different sources of data (aka contexts) affect the user’s decision on visiting a POI in different ways, weather data, which according to common sense may have a great influence on this decision, is still rarely used.

Martin et al. [11] proposed a mobile application whose architecture considered the use of weather data to personalize a geocoding mobile service, but no implementation or evaluation was presented. A similar contribution was made by Meehan et al. [12], who proposed a hybrid recommender system based on time, weather and media sentiment when introducing the VISIT mobile tourism recommender, but they neither implemented nor evaluated it.

Among the few works that have actually used weather in the recommendation pipeline, Braunhofer et al. [3] introduced a recommender system designed to run in mobile applications for recommending touristic POIs in Italy. The authors conducted an online study with 54 users and found that recommendations that take weather information into consideration were indeed able to increase user satisfaction. Compared to this work, our implementation is based on a more recent, state-of-the-art algorithm, and we also provide details of which weather features contribute the most to the recommender performance. In an extension of their initial work, Braunhofer et al. [4] implemented and evaluated a context-aware recommender system which uses weather data. They find that the model which leverages the weather context outperformed the version without it. Although more similar to our current work, they did not provide a detailed feature analysis as the present article does.

Sym.                 Description
$U$                  set of users $u_1, u_2, \dots, u_{|U|}$
$L$                  set of POIs $l_1, l_2, \dots, l_{|L|}$
$FC_f$               set of classes for feature $f$
$F$                  set of weather feature classes $f_1, f_2, \dots, f_{|FC_f|}$
$\Theta$             latent model parameters containing the learned weights $\{L^{(1)}, L^{(2)}, L^{(3)}, U^{(1)}, U^{(2)}, F^{(1)}\}$ for locations, users and weather features.
$X_{ul}$             $|U| \times |L|$ matrix containing the check-ins of users at POIs.
$X_{ulc}$            $|U| \times |L| \times |FC_f|$ matrix containing the check-ins of users at POIs at a specific feature class $c$.
$D_1$                user-POI pairs: $(u, l) \mid x_{ul} > 0$.
$D_2$                user-POI-feature class triples: $(u, l, c) \mid x_{ulc} > 0$.
$W$                  geographical probability matrix of size $|L| \times |L|$ where $w_{ll'}$ contains the probability of $l'$ being visited after $l$ has been visited according to their geographical distance. $w_{ll'} = (0.5 + d(l, l'))^{-1}$ where $d(l, l')$ is the geographical distance between the latitude and longitude of $l$ and $l'$.
$WI$                 probability that a weather feature class $c$ is influenced by feature class $c'$. $wi_{cc'} = \mathrm{cos\_sim}(c, c')$.
$N_k(l)$             set of $k$ nearest neighbors of POI $l$.
$y_{ul}$             the recommendation score of user $u$ and POI $l$.
$y_{ulc}$            the recommendation score of user $u$, POI $l$ and weather feature class $c$.
$I(\cdot)$           indicator function returning $I(a) = 1$ when $a$ is true and 0 otherwise.
$\varepsilon$        margin to soften ranking incompatibility.
$\gamma_w$           learning rate for updates on weather latent parameters.
$\gamma_g$           learning rate for updates on latent parameters from the base approach.
$E(\cdot)$           a function that turns the ranking incompatibility $\mathrm{Incomp}(y_{ulc}, \varepsilon)$, which counts the number of locations $l' \in L$ that should be ranked lower than $l$ at the current weather context $c$ and user $u$ but are ranked higher by the model, into a loss $E(r) = \sum_{i=1}^{r} \frac{1}{i}$.
$\delta_{ucll'}$     function to approximate the indicator function with a continuous sigmoid function $s(a) = \frac{1}{1 + \exp(-a)}$. $\delta_{ucll'} = s(y_{ul'c} + \varepsilon - y_{ulc})\,(1 - s(y_{ul'c} + \varepsilon - y_{ulc}))$.
$\lfloor |L|/n \rfloor$   if the $n$th sampled location $l'$ was ranked incorrectly by the model, the expectation is that overall $\lfloor |L|/n \rfloor$ locations are ranked incorrectly.
$g, \mu$             auxiliary variables that save partial results of the calculation of the stochastic gradient.

Table 2: The notations used to describe Rank-GeoFM and the incorporation of the weather context.

In summary, compared to previous works which have used weather as a contextual factor for recommendation systems, we provide detailed information about our recommendation algorithm and we contribute an implementation extending a state-of-the-art matrix factorization model exploiting rich weather data. Moreover, we also provide details on how the weather features were exploited by it, as well as a detailed analysis about the impact of the features on the recommendation quality.

3. RECOMMENDATION APPROACH

Our recommendation approach is built upon a state-of-the-art POI recommender algorithm named Rank-GeoFM [10], a personalized ranking based matrix factorization method. We have selected Rank-GeoFM over other alternatives because it has been shown to be a very strong POI recommender method compared to other approaches often cited in the literature. In Li et al. [10] the authors compared Rank-GeoFM against twelve other recommender methods, showing that Rank-GeoFM significantly outperforms strong generic baselines, such as user-KNN, item-KNN CF, WRMF, BPR-MF [7] as well as specialized POI recommender methods, such as BPP [17]. Another reason for choosing Rank-GeoFM is its ability to easily accommodate additional features that we plan to use in this work. The aim of Rank-GeoFM is to learn latent parameters that model the relationship between the context of interest (in our case weather conditions) and the user/POI.

Algorithm 1: Rank-GeoFM with weather context

Input: check-in data $D_1$, $D_2$, geographical influence matrix $W$, weather influence matrix $WI$, hyperparameters $\varepsilon, C, \alpha, \beta$ and learning rates $\gamma_g$ and $\gamma_w$
Output: parameters of the model $\Theta = \{L^{(1)}, L^{(2)}, L^{(3)}, U^{(1)}, U^{(2)}, F^{(1)}\}$

 1  init: Initialize $\Theta$ with $\mathcal{N}(0, 0.01)$; shuffle $D_1$ and $D_2$ randomly
 2  repeat
 3      for $(u, l) \in D_1$ do
 4          approach from Li et al. [10]
 5      end
 6      for $(u, l, c) \in D_2$ do
 7          Compute $y_{ulc}$ as Equation 3 and set $n = 0$
 8          repeat
 9              Sample $l'$ and $c'$, compute $y_{ul'c'}$ as Equation 3
10              $n$++
11          until $I(x_{ulc} > x_{ul'c'})\, I(y_{ulc} < y_{ul'c'} + \varepsilon) = 1$ or $n > |L|$
12          if $I(x_{ulc} > x_{ul'c'})\, I(y_{ulc} < y_{ul'c'} + \varepsilon) = 1$ then
13              $\eta = E\left(\left\lfloor |L|/n \right\rfloor\right) \delta_{ucll'}$
14              $g = \left( \sum_{c^* \in FC_f} wi_{c'c^*} f^{(1)}_{c^*} - \sum_{c^+ \in FC_f} wi_{cc^+} f^{(1)}_{c^+} \right)$
15              $f^{(1)}_c \leftarrow f^{(1)}_c - \gamma_w \eta \left( l^{(2)}_{l'} - l^{(2)}_{l} \right)$
16              $l^{(3)}_l \leftarrow l^{(3)}_l - \gamma_w \eta g$
17              $l^{(2)}_{l'} \leftarrow l^{(2)}_{l'} - \gamma_w \eta f^{(1)}_c$
18              $l^{(2)}_l \leftarrow l^{(2)}_l + \gamma_w \eta f^{(1)}_c$
19          end
20          Project updated factors to accomplish constraints
21      end
22  until convergence
23  return $\Theta = \{L^{(1)}, L^{(2)}, L^{(3)}, U^{(1)}, U^{(2)}, F^{(1)}\}$

Table 2 describes the symbols used in the recommender algorithm. For each type of contextual data considered, latent model parameters are introduced. The prediction for a <user, POI, context> triple is then made based on these learned latent parameters. The parameters are trained using a fast learning scheme introduced by the authors that is based on Stochastic Gradient Descent (SGD).

To add the weather context into Rank-GeoFM, the weather features’ values needed to be discretized. This was done to reduce data sparsity. For example, if we considered temperature as a real number, the number of check-ins observed for most specific temperature values would probably be zero. Transforming continuous values of weather features (e.g., temperature) into intervals alleviates this problem. Hence, a mapping function is introduced (see Equation 1) that converts the weather features into interval bins. $|FC_f|$ defines the number of bins for the current weather feature. We will refer to these bins as feature classes. Best results were obtained with $|FC_f| = 20$ (validated on held-out data).

$$c_f(\mathit{value}) = \left\lfloor \frac{(\mathit{value} - \min(f)) \cdot (|FC_f| - 1)}{\max(f) - \min(f)} \right\rfloor \qquad (1)$$
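The mapping of Equation 1 can be illustrated with a minimal Python sketch. The function name `feature_class`, its parameter names, and the clamping of out-of-range values are assumptions made for illustration and are not taken from the paper or its MyMediaLite implementation.

```python
import math


def feature_class(value, f_min, f_max, num_classes=20):
    """Map a raw weather value to a feature class index as in Equation 1.

    f_min and f_max stand for min(f) and max(f) of the weather feature,
    num_classes for |FC_f|. The result lies in 0 .. num_classes - 1.
    """
    if f_max <= f_min:
        raise ValueError("f_max must be greater than f_min")
    # Assumption of this sketch: values outside the observed range are
    # clamped so that they still map to a valid class.
    clamped = min(max(value, f_min), f_max)
    return math.floor((clamped - f_min) * (num_classes - 1) / (f_max - f_min))
```

Note that with this formula the highest class index, `num_classes - 1`, is only reached when the value equals the maximum of the feature.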

To extend the original Rank-GeoFM approach with weather context, three additional latent factors are introduced that are represented by matrices in a $K$-dimensional space. The first one incorporates the weather-popularity score that models whether or not a location is popular in a specific weather feature class and is named $L^{(2)} \in \mathbb{R}^{|L| \times K}$, where $K$ denotes the size of the latent parameter space. Furthermore, a matrix $L^{(3)} \in \mathbb{R}^{|L| \times K}$ is introduced to model the influence between two feature classes. In other words, $L^{(3)}$ softens the borders between the particular feature classes. The third latent parameter $F^{(1)} \in \mathbb{R}^{|FC_f| \times K}$ is then used to parametrize the feature classes of the specific weather feature. In addition to the latent parameters, a matrix $WI \in \mathbb{R}^{|FC_f| \times |FC_f|}$ is introduced for storing the probability that a weather feature class $c$ is influenced by feature class $c'$. Denoting $x_{ulc}$ as the frequency with which a user $u$ checked in at POI $l$ under the current weather context $c$, this probability is calculated as follows:

$$wi_{cc'} = \frac{\sum_{u \in U} \sum_{l \in L} x_{ulc}\, x_{ulc'}}{\sqrt{\sum_{u \in U} \sum_{l \in L} x_{ulc}^2}\; \sqrt{\sum_{u \in U} \sum_{l \in L} x_{ulc'}^2}} \qquad (2)$$
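Equation 2 is the cosine similarity between the check-in vectors of two feature classes. The following sketch computes the full matrix from sparse check-in counts. The dictionary-based input format and the name `weather_influence` are assumptions for illustration, and classes without any check-ins are given a similarity of 0 by convention.

```python
import math
from collections import defaultdict


def weather_influence(checkins, num_classes):
    """Compute WI as in Equation 2.

    checkins maps (user, poi, feature_class) -> check-in count x_ulc.
    Returns a num_classes x num_classes list of lists.
    """
    # Group the counts per (user, poi) pair so that products x_ulc * x_ulc'
    # are only formed for classes observed together for the same pair.
    by_pair = defaultdict(dict)
    for (u, l, c), x in checkins.items():
        by_pair[(u, l)][c] = x

    dot = [[0.0] * num_classes for _ in range(num_classes)]
    for counts in by_pair.values():
        for c, x in counts.items():
            for c2, x2 in counts.items():
                dot[c][c2] += x * x2

    wi = [[0.0] * num_classes for _ in range(num_classes)]
    for c in range(num_classes):
        for c2 in range(num_classes):
            denom = math.sqrt(dot[c][c]) * math.sqrt(dot[c2][c2])
            # Convention of this sketch: empty classes get similarity 0.
            wi[c][c2] = dot[c][c2] / denom if denom > 0 else 0.0
    return wi
```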

To calculate the recommendation score for a given user $u$, POI $l$ and weather feature class $c$, Equation 3 is introduced, where $y_{ul}$ denotes the recommendation score as computed in Li et al. [10].

$$y_{ul} = u^{(1)}_u \cdot l^{(1)}_l + u^{(2)}_u \cdot \sum_{l^* \in N_k(l)} w_{ll^*}\, l^{(1)}_{l^*}$$

$$y_{ulc} = y_{ul} + f^{(1)}_c \cdot l^{(2)}_l + l^{(3)}_l \cdot \sum_{c^* \in FC_f} wi_{cc^*}\, f^{(1)}_{c^*} \qquad (3)$$
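As an illustration of Equation 3, the sketch below evaluates the score with plain Python lists. All names (`score`, `neighbors`, the list-of-lists layout of the factor matrices) are assumptions of this sketch and do not mirror the authors' MyMediaLite code.

```python
def dot(a, b):
    """Inner product of two equally sized vectors."""
    return sum(x * y for x, y in zip(a, b))


def weighted_sum(weights, vectors, k_dim):
    """Return sum_i weights[i] * vectors[i] as a k_dim-dimensional list."""
    out = [0.0] * k_dim
    for w, vec in zip(weights, vectors):
        for k in range(k_dim):
            out[k] += w * vec[k]
    return out


def score(u, l, c, U1, U2, L1, L2, L3, F1, W, WI, neighbors):
    """Compute y_ulc of Equation 3 for user u, POI l and feature class c.

    U1, U2: user factors; L1, L2, L3: POI factors; F1: feature class factors.
    W[l][m] is the geographical weight w_lm, WI[c][d] the weather influence,
    neighbors[l] the list of the k nearest POIs of l (N_k(l)).
    """
    k_dim = len(U1[u])
    # y_ul: user preference plus geographical influence of nearby POIs.
    geo = weighted_sum([W[l][m] for m in neighbors[l]],
                       [L1[m] for m in neighbors[l]], k_dim)
    y_ul = dot(U1[u], L1[l]) + dot(U2[u], geo)
    # Weather terms: popularity of l in class c, plus influence of all classes.
    influence = weighted_sum(WI[c], F1, k_dim)
    return y_ul + dot(F1[c], L2[l]) + dot(L3[l], influence)
```

Ranking the candidate POIs of a user under a given feature class then amounts to sorting them by this score in descending order.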

Algorithm 1 shows how we incorporated the weather context features into the base Rank-GeoFM approach. Taking the initialization and the hyperparameters from the original approach, we first iterate over all pairs of users and POIs $(u, l) \in D_1$, where $D_1$ is the set of all check-ins, and adjust the latent parameters as described in Li et al. [10].

We then introduce an iteration over all <user, venue, feature-class> triples $(u, l, c) \in D_2$ in order to adjust the latent parameters for the incorrectly ranked venues according to the specific weather context. This adjustment is necessary because the algorithm might rank a triple $(u, l, c)$ correctly while $(u, l, c')$ is ranked incorrectly. The adjustments are then made analogously to the base algorithm in lines 6-20.
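The weather-specific parameter update for a single violating sample (lines 13 to 18 of Algorithm 1) might be sketched as follows. This is a simplified illustration, not the authors' implementation: the function and argument names are hypothetical, the scores `y_pos` and `y_neg` are assumed to be computed beforehand with Equation 3, the pre-update value of the class factor is used in lines 17 and 18 (the algorithm listing does not state this explicitly), and the projection step of line 20 is omitted.

```python
import math


def harmonic_loss(r):
    """E(r) = sum_{i=1}^{r} 1/i, the loss for r incorrectly ranked POIs."""
    return sum(1.0 / i for i in range(1, r + 1))


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


def weather_update(l, l_neg, c, c_neg, n, y_pos, y_neg,
                   L2, L3, F1, WI, eps, gamma_w):
    """Apply one in-place SGD step for the triple (u, l, c).

    l_neg and c_neg are the sampled POI l' and feature class c',
    n is the number of samples drawn until the violation was found.
    """
    num_pois, k_dim = len(L2), len(F1[0])
    s = sigmoid(y_neg + eps - y_pos)
    # Line 13: eta = E(floor(|L| / n)) * delta, with delta = s * (1 - s).
    eta = harmonic_loss(num_pois // n) * s * (1.0 - s)
    # Line 14: difference of the influence-weighted class factors.
    g = [sum((WI[c_neg][j] - WI[c][j]) * F1[j][k] for j in range(len(F1)))
         for k in range(k_dim)]
    # Assumption: lines 17 and 18 use the class factor before its update.
    f_c = list(F1[c])
    for k in range(k_dim):
        F1[c][k] -= gamma_w * eta * (L2[l_neg][k] - L2[l][k])  # line 15
        L3[l][k] -= gamma_w * eta * g[k]                       # line 16
        L2[l_neg][k] -= gamma_w * eta * f_c[k]                 # line 17
        L2[l][k] += gamma_w * eta * f_c[k]                     # line 18
```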

During our studies we found that with a learning rate of $\gamma_g = .0001$, as used in Li et al. [10], the algorithm did not converge. The reason is that the adjustments are made at a finer granularity, for each $(u, l, c)$ triple and not just on the $(u, l)$ level. Hence, we introduce a new learning rate parameter $\gamma_w = .00001$ for the weather context, for which stable results could be observed (validated on hold-out data). Similarly to Li et al. [10], we found in our experiments that the best values of the hyperparameters are as follows (validated on hold-out data): $\varepsilon = .3$, $C = 1.0$, $\alpha = \beta = .2$, and $K = 100$ as used for the dimensions of the matrices $L^{(1)}$, $L^{(2)}$ and $L^{(3)}$.

[Figure 1: Check-in distributions over the 8 weather features. Each panel plots check-in probability against the feature value, with a fitted approximation curve: (a) Cloud cover (σ = 0.35, µ = 0.47), (b) Visibility (σ = 3.88, µ = 12.02), (c) Moonphase (σ = 0.29, µ = 0.53), (d) Precipitation intensity (σ = 1.49, µ = 0.83), (e) Pressure (σ = 6.45, µ = 1015.99), (f) Temperature (σ = 9.36, µ = 16.69), (g) Humidity (σ = 0.21, µ = 0.6), (h) Windspeed (σ = 2.05, µ = 3.71).]

[Figure 2: Examples of check-in distributions over different types of places in Foursquare. On the left hand side, places where people check in at lower temperatures are shown and on the right higher temperature places are featured. Each panel plots check-in probability against temperature: (a) “Austrian Restaurant”, (b) “Farm”, (c) “Ski Area”, (d) “Ice Cream Shop”.]

4. EXPERIMENTAL SETUP

In this section we describe in detail our experimental setup, i.e., the datasets we used, a brief characterization of these datasets concerning the weather features used, and the evaluation protocol we have chosen to conduct our study.

4.1 Datasets

The dataset we used in this study was obtained from the work of Yang et al. [14]. It is a Foursquare crawl comprising user check-in data from April 2012 to September 2013. The original dataset contains more than 33 million check-ins from 415 cities in 77 countries. However, before dealing with our problem on such a large scale, we decided to first concentrate our investigation on a small set of US cities. We selected four cities that could represent some weather variety in order to investigate whether our model is resilient to such variety of weather conditions (see Figure 3). Table 1 provides an overview of the check-in statistics of the four target cities chosen for our experiments: Minneapolis, Boston, Miami and Honolulu.

Concerning the weather information, we have used the API of forecast.io³ to collect, for each <time, place> tuple present in our dataset, the corresponding weather information. For that, we pass the following request to the API:

https://api.forecast.io/forecast/APIKEY/LAT,LON,TIME

For the purposes of our analysis, we obtained eight weather features provided by forecast.io, namely, cloud cover, visibility, moon phase, precipitation intensity, pressure, temperature, humidity and wind speed, for all places and time-stamps in our dataset.
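A request URL of the form given above could be assembled as in the following sketch. The helper name `build_forecast_url` is hypothetical, and the time is assumed here to be passed as a UNIX timestamp in seconds; the sketch only builds the string and does not contact the service.

```python
def build_forecast_url(api_key, lat, lon, unix_time):
    """Fill the APIKEY, LAT, LON and TIME slots of the request template.

    Assumption of this sketch: TIME is a UNIX timestamp in seconds.
    """
    template = "https://api.forecast.io/forecast/{key}/{lat},{lon},{time}"
    return template.format(key=api_key, lat=lat, lon=lon, time=int(unix_time))


# Example with placeholder values (coordinates roughly in Boston):
url = build_forecast_url("APIKEY", 42.3601, -71.0589, 1346457600)
```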

4.2 Data Analysis

Figure 1 shows the probability distributions of check-ins for each of the eight weather features used. Notice that the distributions of pressure, temperature, humidity and wind speed resemble a normal distribution (see the colored approximation curve). Moreover, moon phase seems to follow a uniform distribution, which indicates that it will likely not help the recommendation model. In contrast, the distribution of precipitation intensity is very skewed, showing that users have a strong preference to check in at places when precipitation intensity is low (i.e., when it is not raining), which indicates that this feature might have good discriminative power.

In addition to this, Figure 2 illustrates the check-in distribution as a function of temperature in four different POI categories. As highlighted in this figure, different patterns occur depending on the category chosen. While people prefer to check in at, e.g., “Austrian Restaurants” or “Ski Areas” when the temperature is low, “Ice Cream Shops” or “Farms” are preferred when temperatures are higher.

³ https://developer.forecast.io/docs/v2

[Figure 3: Weather feature variability (sorted) measured via standard deviation over cities. Left: cities with lowest variability. Right: cities with highest variability. Each panel plots the standard deviation of one feature per city, with cities marked as south or north cities: (a) Cloud cover, (b) Visibility, (c) Moonphase, (d) Precipitation intensity, (e) Pressure, (f) Temperature, (g) Humidity, (h) Windspeed.]

[Figure 4: Mean weather feature values (sorted) for POI categories with standard errors. Each panel plots the mean feature value per POI category: (a) Cloud cover, (b) Visibility, (c) Moonphase, (d) Precipitation intensity, (e) Pressure, (f) Temperature, (g) Humidity, (h) Windspeed.]

Figure 3 shows how the weather features vary in each city of the original Foursquare dataset. Notice that, with the exception of moonphase, all the features depend on the city where they are measured, indicating that a different recommendation model should probably be trained for each city. Moreover, in general, weather shows a higher variability in the north of the US and a very low variability in the south. The extreme case is the island city of Honolulu, which shows almost no variability in terms of weather. Figure 4 shows the different mean values of the eight weather features over the POI categories. The small overlap of the standard errors of the means reveals that categories indeed have a distinct popularity across various weather feature values. Even moonphase shows a divergent category popularity at its tails.

After this analysis we can confidently state that there is indeed a relation between the weather conditions and the check-in behavior of Foursquare users, which answers our first research question (RQ1) stated in Section 1.

4.3 Evaluation

Protocol. To evaluate the performance of our algorithm, we have chosen the same evaluation protocol as described in the original Rank-GeoFM paper [10]. Hence, we split the dataset (according to the time line) into training, validation and testing sets for each city by adding the first 70% of the check-ins of each user to the training set, the following 20% to the test set and the rest to the validation set (=10%). The training set was then used to learn the latent model parameters. During the training phase of the algorithm, the validation set was used to tune the algorithm convergence. When convergence was observed (typically around 3,000-5,000 iterations with fast learning scheme enabled), the training was stopped and the learned parameters were used to evaluate the model on the test set.
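The per-user chronological split described in the protocol can be sketched as follows. The record layout `(user, poi, timestamp)` and the function name `split_checkins` are assumptions for illustration, and integer percentages are used to avoid floating-point rounding when computing the split sizes.

```python
from collections import defaultdict


def split_checkins(checkins, train_pct=70, test_pct=20):
    """Split check-ins per user along the time line.

    checkins: iterable of (user, poi, timestamp) tuples.
    Returns (train, test, validation): the first 70% of each user's
    check-ins, the following 20%, and the remaining ones.
    """
    by_user = defaultdict(list)
    for record in checkins:
        by_user[record[0]].append(record)

    train, test, validation = [], [], []
    for records in by_user.values():
        records.sort(key=lambda r: r[2])  # oldest check-in first
        n_train = len(records) * train_pct // 100
        n_test = len(records) * test_pct // 100
        train.extend(records[:n_train])
        test.extend(records[n_train:n_train + n_test])
        validation.extend(records[n_train + n_test:])
    return train, test, validation
```

For users with very few check-ins, the integer division can leave the training or test part empty; how such users were handled is not specified in the text.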

Baselines. As baselines for our experiments, we used the original Rank-GeoFM approach, which incorporates user preferences as well as geographical influence into the model. Furthermore, we compare to the time-based variant of Rank-GeoFM, which was also introduced by Li et al. [10].

Metric. As the evaluation metric, NDCG@k (Normalized Discounted Cumulative Gain) with k = 20 (see footnote 4) was chosen, as we want to predict the top-k POIs for a user.
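For readers unfamiliar with the metric, NDCG@k can be sketched as below. The sketch assumes binary relevance (a POI counts as relevant if the user visited it in the test set) and a ranked list without duplicates; the paper does not spell out its gain function, so this is one common formulation rather than the exact one used in the experiments.

```python
from math import log2


def ndcg_at_k(ranked_pois, relevant_pois, k=20):
    """NDCG@k with binary relevance.

    ranked_pois: recommended POIs, best first (assumed to be distinct).
    relevant_pois: POIs the user actually visited in the test set.
    """
    relevant = set(relevant_pois)
    # Discounted gain of the hits among the top-k recommendations.
    dcg = sum(
        1.0 / log2(rank + 2)
        for rank, poi in enumerate(ranked_pois[:k])
        if poi in relevant
    )
    # Ideal ranking: all relevant POIs placed at the top positions.
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / log2(rank + 2) for rank in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0


# Hypothetical example: hits at positions 1 and 3.
score = ndcg_at_k(["a", "b", "c", "d"], {"a", "c"})
```

A perfect ranking scores 1, and a list without any relevant POI scores 0; the per-user scores would then be averaged to obtain a value comparable to those plotted in Figure 5.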

5. RESULTS

Figure 5 shows the results of our offline experiment. As shown, in all cases over all four cities, Rank-GeoFM enriched with our proposed weather features significantly outperforms the original Rank-GeoFM algorithm, which answers our RQ2. For all pairwise comparisons (recommenders with weather context vs. without), a standard t-test yielded p < .001. What is even more interesting to note is the performance of Rank-GeoFM that utilizes the time feature as a contextual factor. As highlighted, in all cases, Rank-GeoFM with weather features such as visibility and precipitation intensity outperforms the time-based variant, showing that weather conditions may indeed help to improve the recommendation quality.

We also highlight the fact that certain weather features perform better than others and that this pattern seems to be city-dependent. This can be clearly observed in Figure 5, where the results of Rank-GeoFM with each weather feature are shown. This answers RQ3, showing which features provide the highest gain in recommendation quality. For example, in Honolulu the best-performing feature is precipitation intensity, while in Minneapolis visibility seems to work best among all investigated weather features. Similar patterns can be observed for other features, such as temperature or cloud cover, which change their relative importance across the four cities. These observations are in line with the results in Figure 1, showing a strong tendency of check-ins into POIs under certain weather conditions. However, what is also interesting to note is the good performance of the moonphase feature, which appeared to be uniformly distributed in general (cf. Figure 1). Hence, it appears that, at the level of locations, there is indeed a strong preference for check-ins in different phases of the moon. In recent research, Kohyama et al. [9] found a relation between moonphase, tidal

4 Please note that we also ran simulations with k = 5 and k = 10, with similar trends in the results as obtained with k = 20. However, due to limited space, they were not included in this paper.

[Figure 5 consists of four bar-chart panels. Each panel plots NDCG@20 (y-axis) per context feature (x-axis), in the following left-to-right order:
(a) Minneapolis: Baseline, Wind speed, Humidity, Temperature, Pressure, Prec. Int., Moonphase, Time, Cloud cover, Visibility.
(b) Boston: Baseline, Pressure, Time, Humidity, Wind speed, Visibility, Cloud cover, Temperature, Moonphase, Prec. Int.
(c) Miami: Baseline, Humidity, Time, Cloud cover, Moonphase, Wind speed, Pressure, Temperature, Visibility, Prec. Int.
(d) Honolulu: Baseline, Pressure, Humidity, Moonphase, Wind speed, Temperature, Time, Cloud cover, Visibility, Prec. Int.]

Figure 5: Recommender accuracy for the 8 different weather context features (sorted by importance) compared to Rank-GeoFM without weather context (denoted as “Baseline”). For further comparison the time-aware version of Rank-GeoFM is included, denoted as “Time”. The red dotted line denotes the baseline.

variation, humidity and rainfall. Notably, analyzing our check-in data, we find a small but positive correlation between moonphase and precipitation intensity, humidity, cloud cover and pressure, as seen in the last row of the correlation matrix shown in Figure 6. Although further analysis should be performed to establish a link between our study and theirs, this correlation might help explain the effect of moonphase in our POI recommendation model.
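A correlation matrix of the kind shown in Figure 6 could be computed along the following lines. The text does not state which correlation coefficient was used, so the Pearson coefficient here is an assumption, and the feature values are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equally long sequences.

    Assumes both sequences have non-zero variance.
    """
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def correlation_matrix(features):
    """features: {name: list of values, one value per check-in}."""
    names = list(features)
    return {a: {b: pearson(features[a], features[b]) for b in names} for a in names}


# Hypothetical feature values observed at four check-ins.
features = {
    "humidity": [0.2, 0.4, 0.6, 0.8],
    "cloud_cover": [0.1, 0.3, 0.5, 0.7],
    "moonphase": [0.0, 0.5, 0.5, 1.0],
}
matrix = correlation_matrix(features)
```

The row `matrix["moonphase"]` corresponds to the last row of the matrix discussed above; significance levels such as those marked with asterisks in Figure 6 would require an additional test that this sketch omits.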

Finally, the relative performance improvement over the original Rank-GeoFM also seems to be location-dependent. While our approach works considerably better than the baseline for Miami and Honolulu, the differences are less pronounced for Minneapolis. One reason for this observation could be that there are more POIs available showing similar weather profiles. However, to confirm this hypothesis, additional analyses are needed.

6. CONCLUSIONS

In this paper we presented our preliminary findings on how weather data may affect users’ check-in behavior and how this information can be used in the context of a POI recommender system. As our preliminary analyses on the Foursquare check-in data showed, weather factors indeed have a significant impact on people’s check-in behavior, showing different check-in profiles for different kinds of places (which answers RQ1). Further, we fed the proposed weather features into a state-of-the-art POI recommender and were able to increase the recommender accuracy in comparison to the original method that does not use weather data (thus answering RQ2). Furthermore, our experiments revealed that the weather context is more useful than the context of time and that the usefulness of the weather features used in this work is city-dependent. Finally, our study showed (see RQ3) that, among the considered weather features, precipitation intensity and visibility are the most significant ones to improve the ranking in a weather-aware POI recommender system.

7. FUTURE WORK

Currently, our work only investigates one weather feature at a time. Investigating different hybridization or context-aware recommender system (CARS) methods and other context variables will therefore be a task to be conducted in our future work. Furthermore, it will help to investigate in more detail how the algorithm performs on the whole Foursquare dataset, as more interesting patterns across cities may occur. Finally, we would like to extend our investigations to the user level, since the current ones concentrate only on the weather profiles of the POIs.

8. OPEN SCIENCE

In order to make the results obtained in this work reproducible, we share the code and data of this study. The proposed method, Rank-GeoFM with weather context, is implemented with the help of the MyMediaLite framework [7] and can be downloaded for free from our GitHub repository (see footnote 5). Furthermore, the data samples used in the experiments can be requested for free via email from CT.

Figure 6: Correlation matrix for the 8 weather features investigated (*p < 0.05, **p < 0.01, ***p < 0.001).

Acknowledgements

This work is supported by the Know-Center. The Know-Center is funded within the Austrian COMET Program, managed by the Austrian Research Promotion Agency (FFG).

9. REFERENCES

[1] J. Bao, Y. Zheng, and M. F. Mokbel. Location-based and preference-aware recommendation using sparse geo-social networking data. In Proceedings of the 20th International Conference on Advances in Geographic Information Systems, SIGSPATIAL ’12, pages 199–208, New York, NY, USA, 2012. ACM.

[2] J. Bao, Y. Zheng, D. Wilkie, and M. Mokbel. Recommendations in location-based social networks: A survey. Geoinformatica, 19(3):525–565, July 2015.

[3] M. Braunhofer, M. Elahi, M. Ge, F. Ricci, and T. Schievenin. STS: design of weather-aware mobile recommender systems in tourism. In Proceedings of the First International Workshop on Intelligent User Interfaces: Artificial Intelligence meets Human Computer Interaction (AI*HCI 2013), a workshop of the XIII International Conference of the Italian Association for Artificial Intelligence (AI*IA 2013), Turin, Italy, December 4, 2013.

[4] M. Braunhofer, M. Elahi, F. Ricci, and T. Schievenin. Context-aware points of interest suggestion with dynamic weather data management. In Information and communication technologies in tourism 2014, pages 87–100. Springer, 2014.

[5] C. Cheng, H. Yang, I. King, and M. R. Lyu. Fused matrix factorization with geographical and social influence in location-based social networks. In Proc. of AAAI, pages 17–23, 2012.

5 https://github.com/aoberegg/WPOI

[6] G. Ference, M. Ye, and W.-C. Lee. Location recommendation for out-of-town users in location-based social networks. In Proceedings of the 22nd ACM International Conference on Information & Knowledge Management, CIKM ’13, pages 721–726, New York, NY, USA, 2013. ACM.

[7] Z. Gantner, S. Rendle, C. Freudenthaler, and L. Schmidt-Thieme. MyMediaLite: A free recommender system library. In Proc. of RecSys’11, 2011.

[8] H. Gao, J. Tang, X. Hu, and H. Liu. Exploring temporal effects for location recommendation on location-based social networks. In Proceedings of the 7th ACM Conference on Recommender Systems, RecSys ’13, pages 93–100, New York, NY, USA, 2013. ACM.

[9] T. Kohyama and J. M. Wallace. Rainfall variations induced by the lunar gravitational atmospheric tide and their implications for the relationship between tropical rainfall and humidity. Geophysical Research Letters, 43(2):918–923, 2016. 2015GL067342.

[10] X. Li, G. Cong, X.-L. Li, T.-A. N. Pham, and S. Krishnaswamy. Rank-GeoFM: A ranking based geographical factorization method for point of interest recommendation. In Proc. of SIGIR’15, pages 433–442, New York, NY, USA, 2015. ACM.

[11] D. Martin, A. Alzua, and C. Lamsfus. A Contextual Geofencing Mobile Tourism Service, pages 191–202. Springer Vienna, Vienna, 2011.

[12] K. Meehan, T. Lunney, K. Curran, and A. McCaughey. Context-aware intelligent recommendation system for tourism. In Pervasive Computing and Communications Workshops (PERCOM Workshops), 2013 IEEE International Conference on, pages 328–331. IEEE, 2013.

[13] I. Nunes and L. Marinho. A personalized geographic-based diffusion model for location recommendations in LBSN. In Proceedings of the 2014 9th Latin American Web Congress, LA-WEB ’14, pages 59–67, Washington, DC, USA, 2014. IEEE Computer Society.

[14] D. Yang, D. Zhang, and B. Qu. Participatory cultural mapping based on collective behavior in location based social networks. ACM Transactions on Intelligent Systems and Technology, 2015. In press.

[15] M. Ye, P. Yin, W.-C. Lee, and D. L. Lee. Exploiting geographical influence for collaborative point-of-interest recommendation. In Proc. of SIGIR’11, pages 325–334. ACM, 2011.

[16] H. Yin, Y. Sun, B. Cui, Z. Hu, and L. Chen. LCARS: a location-content-aware recommender system. In Proc. of KDD’13, pages 221–229. ACM, 2013.

[17] Q. Yuan, G. Cong, and A. Sun. Graph-based point-of-interest recommendation with geographical and temporal influences. In Proc. of CIKM’14, pages 659–668. ACM, 2014.
