
Gaussian process models using banded precisions...

Date post: 07-Aug-2020
Gaussian process models using banded precisions matrices Nicolas Durrande, PROWLER.io – Mines St-Étienne ([email protected]) Second workshop on Gaussian processes St-Étienne, Oct. 2018
Source: tugaut.perso.math.cnrs.fr/pdf/workshop02/durrande.pdf

Gaussian process models using banded precisions matrices

Nicolas Durrande, PROWLER.io – Mines St-Étienne ([email protected])

Second workshop on Gaussian processes, St-Étienne, Oct. 2018


This talk presents the paper

which is a joint work with


Motivating example


We want to build a GP regression model with dataset (X, Y), and a GP f ∼ N(0, k) with exponential kernel:

$$k(x, y) = \sigma^2 \exp\left(-\frac{|x - y|}{\theta}\right)$$

[Figure: two panels. Left, "Data": observations y against x on [0, 1]. Right, "Covariance function": k(x, 0) against x on [−1, 1].]


Given the data, f has conditional mean m and covariance c:

$$m(x) = k(x, X)\, k(X, X)^{-1} Y$$

$$c(x, y) = k(x, y) - k(x, X)\, k(X, X)^{-1} k(X, y)$$

[Figure: conditional distribution of f given f(X) = Y, plotted against x on [0, 1].]

The computationally expensive steps are computing the matrix k(X, X) (which is $O(n^2)$) and inverting it ($O(n^3)$).
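The conditional mean and covariance can be sketched in a few lines of NumPy. This is only an illustration of the dense approach: the data values, `sigma2` and `theta` are made up, and the function names are ours rather than anything from the talk.

```python
import numpy as np

def exp_kernel(x, y, sigma2=1.0, theta=0.2):
    """Exponential kernel k(x, y) = sigma2 * exp(-|x - y| / theta)."""
    return sigma2 * np.exp(-np.abs(x[:, None] - y[None, :]) / theta)

def gp_posterior(x_new, X, Y, sigma2=1.0, theta=0.2):
    """Conditional mean and covariance of f at x_new given f(X) = Y."""
    K = exp_kernel(X, X, sigma2, theta)              # O(n^2) to build
    k_star = exp_kernel(x_new, X, sigma2, theta)
    # Solving with K avoids forming its inverse, but is still O(n^3).
    mean = k_star @ np.linalg.solve(K, Y)
    cov = (exp_kernel(x_new, x_new, sigma2, theta)
           - k_star @ np.linalg.solve(K, k_star.T))
    return mean, cov

X = np.array([0.1, 0.3, 0.45, 0.7, 0.9])   # made-up inputs
Y = np.array([1.2, 0.4, 0.9, 1.6, 0.7])    # made-up outputs
mean_at_X, cov_at_X = gp_posterior(X, X, Y)
```

Evaluated at the observation points themselves, the mean reproduces Y and the covariance vanishes, since the model here is noise-free.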


Let’s have a look at K = k(X, X) and its inverse $Q = K^{-1}$:

[Figure: two heat maps for 7 input points. Left: K = k(X, X). Right: $Q = k(X, X)^{-1}$.]

The precision matrix Q is tridiagonal... This is due to the Markov property of f.
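The tridiagonal structure is easy to check numerically. The sketch below uses 7 arbitrary sorted inputs and an assumed lengthscale of 0.3 (with unit variance); these values are not taken from the slides.

```python
import numpy as np

X = np.array([0.0, 0.1, 0.25, 0.4, 0.55, 0.8, 1.0])   # 7 sorted inputs (made up)
theta = 0.3                                            # assumed lengthscale
K = np.exp(-np.abs(X[:, None] - X[None, :]) / theta)   # exponential kernel, sigma^2 = 1
Q = np.linalg.inv(K)

# Largest entry (in absolute value) more than one step away from the diagonal.
i, j = np.indices(Q.shape)
max_off_band = np.max(np.abs(Q[np.abs(i - j) > 1]))
```

Up to round-off, `max_off_band` is zero while the first off-diagonal of `Q` is not.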


An element of a covariance matrix is $K(x_i, x_j) = \mathrm{cov}(f(x_i), f(x_j))$.
⇒ It depends on the marginal distribution.

Things are different for a precision matrix. For I = {i, j} we have:

$$Q_{I,I} = \mathrm{cov}\big(f(X_I), f(X_I) \mid f(X_k),\ k \notin I\big)^{-1}$$

⇒ It depends on the conditional distribution.

As a consequence:

$$Q_{i,i} = \mathrm{var}\big(f(X_i) \mid f(X_k),\ k \neq i\big)^{-1}$$

$Q_{i,j} = 0$ ⇔ $f(X_i)$ and $f(X_j)$ are conditionally independent given the other observations.

There is no equivalent of k(., .) for the precision matrix.


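It is possible to compute directly the elements of the precision matrix.

For a GP with Matérn 1/2 kernel evaluated at X, we get:

$$
Q = \sigma^{-2}
\begin{pmatrix}
\frac{1}{1-\lambda_1^2} & \frac{-\lambda_1}{1-\lambda_1^2} & & & \\
\frac{-\lambda_1}{1-\lambda_1^2} & \frac{1-\lambda_1^2\lambda_2^2}{(1-\lambda_1^2)(1-\lambda_2^2)} & \frac{-\lambda_2}{1-\lambda_2^2} & & \\
 & \ddots & \ddots & \ddots & \\
 & & \frac{-\lambda_{n-2}}{1-\lambda_{n-2}^2} & \frac{1-\lambda_{n-2}^2\lambda_{n-1}^2}{(1-\lambda_{n-2}^2)(1-\lambda_{n-1}^2)} & \frac{-\lambda_{n-1}}{1-\lambda_{n-1}^2} \\
 & & & \frac{-\lambda_{n-1}}{1-\lambda_{n-1}^2} & \frac{1}{1-\lambda_{n-1}^2}
\end{pmatrix}
$$

where $\lambda_i = \exp(-(X_{i+1} - X_i)/\theta)$.

Note that Q has a band structure. Complexity drops from $O(n^2)$ to $O(n)$.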
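As a sanity check, the closed-form entries can be assembled in O(n) and compared with a dense inverse. The function name `matern12_precision`, the inputs and the default parameters are illustrative assumptions; only the diagonal and first off-diagonal are stored.

```python
import numpy as np

def matern12_precision(X, sigma2=1.0, theta=0.3):
    """Tridiagonal precision of a Matern 1/2 GP at sorted, distinct inputs X.

    Returns (diag, off): the diagonal (length n) and first off-diagonal (length n - 1).
    """
    lam = np.exp(-np.diff(X) / theta)      # lambda_i for i = 1..n-1
    a = 1.0 / (1.0 - lam**2)               # 1 / (1 - lambda_i^2)
    diag = np.empty(len(X))
    diag[0], diag[-1] = a[0], a[-1]
    # Interior entries: 1/(1-lam_{i-1}^2) + lam_i^2/(1-lam_i^2),
    # which is the same fraction as in the matrix, written as a sum.
    diag[1:-1] = a[:-1] + lam[1:]**2 * a[1:]
    off = -lam * a
    return diag / sigma2, off / sigma2

X = np.array([0.0, 0.1, 0.25, 0.4, 0.55, 0.8, 1.0])   # made-up inputs
diag, off = matern12_precision(X)
Q_closed = np.diag(diag) + np.diag(off, 1) + np.diag(off, -1)
Q_dense = np.linalg.inv(np.exp(-np.abs(X[:, None] - X[None, :]) / 0.3))
```

The two matrices agree to round-off, but only the first one avoids the dense kernel matrix entirely.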


Sampling

Let $g \sim N(\mu, Q^{-1})$ be a vector of length N.

Given µ and Q, one can generate a sample of g by:
1. computing the Cholesky factorisation $Q = LL^T$: $O(N^3)$
2. sampling N independent variables $v_i \sim N(0, 1)$: $O(N)$
3. computing $\mu + L^{-T} v$: $O(N^2)$

Let’s do it!


Sampling

We start with a regular grid of 20 points on [0, 1]. We compute the Matérn 1/2 precision and its Cholesky factor:

[Figure: heat maps of Q (left) and of its Cholesky factor L (right), both 20 × 20.]

It turns out that L is also a banded matrix! Cholesky drops from $O(n^3)$ to $O(n)$.


Sampling

Computing $L^{-T}$ yields a dense triangular matrix ($O(n^3)$)... but solving $L^T s = v$ is easy ($O(n)$)!

[Figure: the linear system $L^T s = v$, with $L^T$ shown as a banded heat map and s, v as vectors.]

- Is this result surprising?
- Not really given the Markov property...
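The three sampling steps can be written with O(n) loops when Q is tridiagonal. The following is a hand-rolled NumPy sketch for bandwidth one, not the talk's implementation; `tridiag_cholesky`, `solve_LT` and the lengthscale `theta = 0.3` are assumptions made for the example.

```python
import numpy as np

def tridiag_cholesky(diag, off):
    """Cholesky factor of a tridiagonal SPD matrix, in O(n).

    Returns (d, e): the diagonal and sub-diagonal of the bidiagonal factor L.
    """
    n = len(diag)
    d, e = np.empty(n), np.empty(n - 1)
    d[0] = np.sqrt(diag[0])
    for i in range(n - 1):
        e[i] = off[i] / d[i]
        d[i + 1] = np.sqrt(diag[i + 1] - e[i] ** 2)
    return d, e

def solve_LT(d, e, v):
    """Solve L^T s = v by back substitution, in O(n), without forming L^{-T}."""
    s = np.empty_like(v)
    s[-1] = v[-1] / d[-1]
    for i in range(len(v) - 2, -1, -1):
        s[i] = (v[i] - e[i] * s[i + 1]) / d[i]
    return s

# Matern 1/2 precision on a regular grid of 20 points (sigma^2 = 1).
n, theta = 20, 0.3                       # theta is an assumed lengthscale
X = np.linspace(0.0, 1.0, n)
lam = np.exp(-np.diff(X) / theta)
a = 1.0 / (1.0 - lam**2)
q_diag = np.concatenate(([a[0]], a[:-1] + lam[1:]**2 * a[1:], [a[-1]]))
q_off = -lam * a

d, e = tridiag_cholesky(q_diag, q_off)
v = np.random.default_rng(0).standard_normal(n)
mu = np.zeros(n)
g = mu + solve_LT(d, e, v)               # one sample from N(mu, Q^{-1})
```

For a general bandwidth l the same recursions apply with inner loops of length l, which is where the O(n l²) and O(n l) costs quoted later come from.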


The banded structure of the precision allowed us to sample in $O(N)$ time instead of $O(N^3)$!

Question: Can we do the same for inference?


Inference with banded precision


There are many models leading to sparse/banded precisions:

Continuous time
- GP with the Markov property ⇒ Brownian motion, ...
- Linear stochastic differential equations $df(t) = F f(t)\,dt + dB(t)$ ⇒ Brownian motion, GPs from the Matérn family

Discrete
- State space models: $f_t = A_{t-1} f_{t-1} + \varepsilon_t$ ⇒ Autoregressive models
- Gaussian graphical models, Gaussian Markov Random Fields (GMRF).


GMRF are Gaussian vectors indexed by the nodes of a graph. They are defined directly via their precision matrix, which is typically sparse.

Example
Let G = ({1, ..., N}, E) be an undirected graph with adjacency matrix A, and degree matrix D.

We introduce the following norm for vectors indexed by the graph nodes:

$$\|f\|^2 = \sum_{(i,j) \in E} (f_i - f_j)^2 = f^T Q f$$

where Q = D − A. This can be seen as a RKHS with kernel $Q^{-1}$.
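The identity between the edge sum and the quadratic form can be checked on a toy graph. The 4-node graph and the vector `f` below are hypothetical. Note that Q = D − A annihilates constant vectors, so in practice its inverse has to be read as a pseudo-inverse or Q needs a small regularisation; the sketch only checks the norm.

```python
import numpy as np

edges = [(0, 1), (1, 2), (2, 3), (1, 3)]   # hypothetical undirected graph
N = 4
A = np.zeros((N, N))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0                # adjacency matrix
D = np.diag(A.sum(axis=1))                 # degree matrix
Q = D - A                                  # sparse precision (graph Laplacian)

f = np.array([0.5, -1.0, 2.0, 0.0])        # made-up vector indexed by the nodes
quad_form = f @ Q @ f
edge_sum = sum((f[i] - f[j]) ** 2 for i, j in edges)
```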


GP regression

Let $g \sim N(0, Q^{-1})$ be a vector of length N and {A, B} be a partition of {1, ..., N}.

The conditional distribution of $g_A \mid g_B$ is $N(m, P^{-1})$ with:

$$m = -Q_{A,A}^{-1} Q_{A,B}\, g_B$$

$$P = Q_{A,A}$$

If Q is banded (say bandwidth l), we can make this efficient by implementing dedicated operators:

Cholesky factorisation: from $O(n^3)$ to $O(n l^2)$

Triangular solve: from $O(n^2)$ to $O(n l)$
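The precision-form conditional can be compared with the usual covariance-form one. In precision form the mean carries a minus sign, $m = -Q_{A,A}^{-1} Q_{A,B}\, g_B$, which is what the check below confirms. The inputs, the partition and the values of `g_B` are made up, and dense linear algebra is used for clarity rather than banded operators.

```python
import numpy as np

X = np.array([0.0, 0.1, 0.25, 0.4, 0.55, 0.8, 1.0])   # made-up inputs
K = np.exp(-np.abs(X[:, None] - X[None, :]) / 0.3)    # Matern 1/2, assumed theta = 0.3
Q = np.linalg.inv(K)

A, B = [1, 4], [0, 2, 3, 5, 6]                         # a partition of the indices
g_B = np.array([0.3, -0.2, 0.5, 1.0, -0.4])            # made-up observed values

# Precision form: P = Q_AA and m = -Q_AA^{-1} Q_AB g_B.
P = Q[np.ix_(A, A)]
m_prec = -np.linalg.solve(P, Q[np.ix_(A, B)] @ g_B)

# Covariance form, used only as a reference.
K_AB = K[np.ix_(A, B)]
K_BB = K[np.ix_(B, B)]
m_cov = K_AB @ np.linalg.solve(K_BB, g_B)
c_cov = K[np.ix_(A, A)] - K_AB @ np.linalg.solve(K_BB, K_AB.T)
```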


Given a graph with N nodes and a vector Y of n observations at a subset of nodes X, we consider the following model:

$$f \sim N(0, Q^{-1})$$

$$y_i = f(x_i) + \varepsilon_i \quad \text{with } \varepsilon_i \sim N(0, \tau^2) \text{ i.i.d.}$$

where Q is banded and depends on some parameters θ.

Question: Can we efficiently estimate the model parameters θ, $\tau^2$?


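If we have observations only at a subset of nodes, then we can write $K = E Q^{-1} E^T$. The marginal likelihood of the model is then:

$$
\begin{aligned}
L(\theta, \tau^2) &= -\frac{n}{2}\log(2\pi) - \frac{1}{2}\log\left|E Q^{-1} E^T + \tau^2 I\right| - \frac{1}{2} Y^T \left(E Q^{-1} E^T + \tau^2 I\right)^{-1} Y \\
&= -\frac{n}{2}\log(2\pi) - \frac{1}{2}\log\left|Q + \tau^{-2} E^T E\right| + \frac{1}{2}\log|Q| - \frac{1}{2}\log\left|\tau^2 I\right| \\
&\quad - \frac{1}{2\tau^2} Y^T Y + \frac{1}{2\tau^4} Y^T E \left(Q + \tau^{-2} E^T E\right)^{-1} E^T Y \\
&= -\frac{n}{2}\log(2\pi) - \log|L| + \log|L_Q| - \frac{n}{2}\log\tau^2 - \frac{1}{2\tau^2} Y^T Y \\
&\quad + \frac{1}{2\tau^4} Y^T E L^{-T} L^{-1} E^T Y
\end{aligned}
$$

with $LL^T = Q + \tau^{-2} E^T E$ and $L_Q L_Q^T = Q$.

⇒ Previous operators allow doing this in $O(N l^2)$!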
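The first and last lines of this derivation can be checked against each other numerically. The sketch below uses dense NumPy factorisations for readability; with the banded operators, the two Cholesky calls and the triangular solve are the steps that would become cheap. The grid, the observed nodes `obs`, the noise variance `tau2` and the lengthscale are assumptions for the example.

```python
import numpy as np

rng = np.random.default_rng(1)
N, tau2 = 8, 0.1
grid = np.linspace(0.0, 1.0, N)
K_full = np.exp(-np.abs(grid[:, None] - grid[None, :]) / 0.3)
Q = np.linalg.inv(K_full)
Q = 0.5 * (Q + Q.T)                         # remove round-off asymmetry

obs = [0, 2, 3, 6]                          # observed nodes (hypothetical)
n = len(obs)
E = np.eye(N)[obs]                          # n x N selection matrix
Y = rng.standard_normal(n)                  # made-up observations

# First line of the derivation: covariance form.
C = E @ K_full @ E.T + tau2 * np.eye(n)
ll_cov = (-0.5 * n * np.log(2 * np.pi)
          - 0.5 * np.linalg.slogdet(C)[1]
          - 0.5 * Y @ np.linalg.solve(C, Y))

# Last line: precision form, using the two Cholesky factors L and L_Q.
L = np.linalg.cholesky(Q + E.T @ E / tau2)
L_Q = np.linalg.cholesky(Q)
w = np.linalg.solve(L, E.T @ Y)             # L^{-1} E^T Y
ll_prec = (-0.5 * n * np.log(2 * np.pi)
           - np.sum(np.log(np.diag(L)))     # log|L|
           + np.sum(np.log(np.diag(L_Q)))   # log|L_Q|
           - 0.5 * n * np.log(tau2)
           - Y @ Y / (2 * tau2)
           + w @ w / (2 * tau2**2))
```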


More fancy models...

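Another model we are interested in is:

$$f \sim N(0, Q^{-1})$$

$$p(y \mid f) = \prod_{i=1}^{n} p_i(y_i \mid f_i)$$

Example
$y_i \mid f(x_i) \sim B(\phi(x_i))$ with $\phi(x) = \frac{\exp(f(x))}{1 + \exp(f(x))}$:

[Figure: two panels. Left, "GP Prior": f against x. Right, "Data": binary observations Y in {0, 1} against x.]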


In this case, the conditional distribution of f given the data is not Gaussian anymore, and there is usually no analytical solution.

Example
If we consider the classification example with only two observations $Y = (1, 0)^T$, we have the following distributions over $(f_1, f_2)$.

[Figure: three panels over $(f_1, f_2)$: posterior $p_{f \mid y=Y}(F)$, prior $p_f(F)$, likelihood $p_{y \mid f=F}(Y)$.]

Source: Nickisch and Rasmussen, Approximations for Binary Gaussian Process Classification, JMLR 2008.


Variational Inference

Variational inference consists in optimizing the parameters of a distribution (say q) such that it approximates $p_{f \mid y=Y}$.

The objective function is a lower bound to the marginal likelihood:

$$
\log p_y(Y) \geq \mathbb{E}_{F \sim q} \log \frac{p_{y,f}(Y, F)}{q(F)}
= \sum_{i=1}^{n} \mathbb{E}_{F \sim q} \log p_{y_i \mid f_i = F_i}(Y_i) - \mathrm{KL}[q \,\|\, p].
$$

The distribution q is typically chosen to be multivariate Gaussian.

In our settings, $p_f$ has a banded precision, and we choose q to be the pdf of $N(m_q, (L_q L_q^T)^{-1})$ where $m_q$ is a vector and $L_q$ a banded lower triangular matrix.


Variational Inference

Can we compute efficiently $\sum_{i=1}^{n} \mathbb{E}_{F \sim q} \log p_{y_i \mid f_i = F_i}(Y_i) - \mathrm{KL}[q \,\|\, p]$?

The first term depends on the marginal distribution of q: we thus need $m_q$ and the diagonal terms of $Q_q^{-1}$. The expectation may be obtained analytically or numerically.

The second one is

$$
\mathrm{KL}[q \,\|\, p] = \frac{1}{2}\left( \mathrm{tr}(Q_q^{-1} Q_p) + 2 \sum_i \big(\log [L_q]_{ii} - \log [L_p]_{ii}\big) + (m_p - m_q)^T L_p L_p^T (m_p - m_q) - N \right).
$$

The trace term looks challenging, but it boils down to the sum of an element-wise product, where $Q_p$ is banded. Computing $Q_q^{-1}$ only inside the band is sufficient!

The new operators required are sparse inverse subset, and a banded-matrix product. They are $O(N l^2)$ and $O(N l)$.
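A small numerical sketch of the KL term, with the trace computed as an element-wise product restricted to the band of $Q_p$. The bidiagonal factors and means are random placeholders, and `random_bidiagonal` is a helper invented for the example. The dense inverse of $Q_q$ stands in for the sparse inverse subset operator, which would return only the entries used here.

```python
import numpy as np

rng = np.random.default_rng(2)
N = 6

def random_bidiagonal(rng, N):
    """Banded (bandwidth 1) lower-triangular factor with positive diagonal."""
    return (np.diag(rng.uniform(1.0, 2.0, N))
            + np.diag(rng.uniform(-0.5, 0.5, N - 1), -1))

L_p, L_q = random_bidiagonal(rng, N), random_bidiagonal(rng, N)
m_p, m_q = np.zeros(N), rng.standard_normal(N)
Q_p, Q_q = L_p @ L_p.T, L_q @ L_q.T
S_q = np.linalg.inv(Q_q)                    # dense here; only its band is used below

# tr(Q_q^{-1} Q_p) as a sum of element-wise products inside the band of Q_p.
i, j = np.indices((N, N))
band = np.abs(i - j) <= 1
trace_band = np.sum(S_q[band] * Q_p[band])

d = m_p - m_q
kl = 0.5 * (trace_band
            + 2 * np.sum(np.log(np.diag(L_q)) - np.log(np.diag(L_p)))
            + d @ Q_p @ d
            - N)
```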


MCMC

MCMC is the classical approach to perform exact inference when no analytical solutions are available.

Since f typically has high correlations, it is common to apply whitening $f = L^{-T} v$ and to do the sampling directly on v.

Given a prior on the hyper-parameters, the log joint density is then:

$$
\log p(v, \theta, y) = \log p(v) + \log p(\theta) + \sum_{i=1}^{n} \log p\big(y_i \mid \theta, (L_Q^{-T} v)_i\big).
$$

One can see that it does not require any supplementary operator.


For all the operators we have listed so far, a version dedicated to banded matrices can already be found in the literature :)

If we want the methods to be efficient, we need to have access to the gradients of our objectives :(


Implementation in an autodifferentiation framework


The principle of autodifferentiation libraries such as Theano, PyTorch or TensorFlow is that each operator comes with an implementation of its derivative.

Using the chain rule, the autodifferentiation framework can then compute the derivative of any variable with respect to any other.

In practice, these frameworks typically use reverse mode differentiation: given a chain of operations X → Y → ··· → c (with X, Y matrices), they use $\left[\frac{\partial c}{\partial Y_{ij}}\right]$ to compute $\left[\frac{\partial c}{\partial X_{ij}}\right]$.


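With the notation $\bar{Y} = \left[\frac{\partial c}{\partial Y_{ij}}\right]$, we have

$$dc = \sum_{ij} \frac{\partial c}{\partial Y_{ij}}\, dY_{ij} = \mathrm{tr}(\bar{Y}^T dY).$$

$\bar{X}$ can be obtained using the relation between dY and dX and permutations inside the trace.

Example
Consider the following operations: $X \to Y = XX^T \to c = \mathrm{sum}(Y)$.

$\bar{Y}$ is the matrix with entries $\frac{\partial c}{\partial Y_{ij}} = 1$

$\bar{X}$ is $(\bar{Y} + \bar{Y}^T) X$:

$$
\begin{aligned}
dc = \mathrm{tr}(\bar{Y}^T dY) &= \mathrm{tr}\big(\bar{Y}^T (dX\, X^T + X\, dX^T)\big) \\
&= \mathrm{tr}(\bar{Y}^T dX\, X^T) + \mathrm{tr}(\bar{Y}^T X\, dX^T) \\
&= \mathrm{tr}(X^T \bar{Y}^T dX) + \mathrm{tr}(dX\, X^T \bar{Y}) \\
&= \mathrm{tr}(X^T \bar{Y}^T dX) + \mathrm{tr}(X^T \bar{Y}\, dX) \\
&= \mathrm{tr}\big(X^T (\bar{Y}^T + \bar{Y})\, dX\big)
\end{aligned}
$$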
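The result of this derivation can be checked against finite differences, without any autodifferentiation library. The matrix size and random values are arbitrary, and `c_of` is just a helper name for the example.

```python
import numpy as np

def c_of(X):
    """The scalar objective c = sum(X X^T)."""
    return np.sum(X @ X.T)

rng = np.random.default_rng(3)
X = rng.standard_normal((4, 3))

Y_bar = np.ones((4, 4))                  # dc/dY_ij = 1 for every entry
X_bar = (Y_bar + Y_bar.T) @ X            # reverse-mode sensitivity from the derivation

# Central finite differences as an independent check.
eps = 1e-6
X_bar_fd = np.zeros_like(X)
for idx in np.ndindex(*X.shape):
    dX = np.zeros_like(X)
    dX[idx] = eps
    X_bar_fd[idx] = (c_of(X + dX) - c_of(X - dX)) / (2 * eps)
```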


Applying this to our operators gives

Symbol   Input        Forward              Reverse mode sensitivities
P        $B_1, B_2$   $P = B_1 B_2$        $\bar{B}_1 = P(\bar{P}, B_2^T)$,  $\bar{B}_2 = P(B_1^T, \bar{P})$
P        $B, v$       $p = Bv$             $\bar{B} = O(\bar{p}, v)$,  $\bar{v} = P(B^T, \bar{p})$
O        $m, v$       $O = mv^T$           $\bar{m} = P(\bar{O}, v)$,  $\bar{v} = P(\bar{O}^T, m)$
S        $L, v$       $s = L^{-1} v$       $\bar{v} = S(L^T, \bar{s})$,  $\bar{L}^T = -O(S(L, v), S(L^T, \bar{s}))$
S        $L, B$       $S = L^{-1} B$       $\bar{B} = S(L^T, \bar{S})$,  $\bar{L}^T = -P(S(L, B), S(L^T, \bar{S})^T)$

These expressions require the forward operators we already have!

For Cholesky and sparse inverse subset, things are a bit tricky:
⇒ we modified existing non banded code and we used ’tangent’.
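As an illustration of one row of the table, the reverse-mode sensitivities of the triangular solve $s = L^{-1} v$ can be written with two further solves and an outer product. This is a dense NumPy sketch with made-up inputs; keeping only the lower triangle of $\bar{L}$ is our own addition, reflecting that only those entries of L are free parameters.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 5
L = (np.tril(0.3 * rng.standard_normal((n, n)), -1)
     + np.diag(rng.uniform(1.0, 2.0, n)))   # made-up lower-triangular matrix
v = rng.standard_normal(n)
s_bar = rng.standard_normal(n)               # incoming sensitivity dc/ds

def c_of(L, v):
    """Scalar objective whose gradient with respect to s is s_bar."""
    return s_bar @ np.linalg.solve(L, v)

s = np.linalg.solve(L, v)                    # forward pass: S(L, v)
v_bar = np.linalg.solve(L.T, s_bar)          # S(L^T, s_bar)
# The table gives L_bar^T = -O(S(L, v), S(L^T, s_bar)), i.e. L_bar = -v_bar s^T.
L_bar = np.tril(-np.outer(v_bar, s))         # lower triangle only (our assumption)
```

Both sensitivities reuse the forward solve operator, which is the point made by the sentence following the table.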


Experiments


Exp. 1: Mauna Loa CO2 dataset

We looked at the weekly measurement of CO2 concentration at the Mauna Loa observatory (Hawaii):

[Figure: CO2 concentration (ppm) against time (years), 1960 to 2020.]

This dataset consists of 3082 observations.


Exp. 1: Mauna Loa CO2 dataset

We consider 3 implementations of a GP model with exponential kernel:

GPflow: a classic GPR implementation based on covariances
Kalman: a Kalman filter implemented in TensorFlow
Custom: the proposed framework with banded precision and custom ops

Note that the model settings correspond to a bandwidth of one.


Exp. 1: Mauna Loa CO2 dataset

We then compare the execution time for computing the log-likelihood and its gradients (w.r.t. the kernel parameters).

Considering subsets of the data allows us to study the influence of the number of observations:

[Figure: time (ms) against number of data (0 to 3000). Curves: Native TensorFlow Kalman, Custom operator GPR, GPflow GPR.]


Exp. 1: Mauna Loa CO2 dataset

Similarly, we can study the influence of the precision bandwidth on the log-likelihood execution time.

To do so we consider more complex models with quasi-periodic components.

[Figure: time (ms) against bandwidth (5 to 40). Curves: Native TensorFlow Kalman, Custom operator GPR, Dense covariance.]

On this example the number of observations is fixed to 1500.


Exp. 1: Mauna Loa CO2 dataset

The following figure shows the predictions for the model with kernel

$$k(d) = k_{3/2}(d) + k_{1/2}(d)\cos(\omega d) + k_{1/2}(d)\cos(2\omega d)$$

which has a bandwidth of 5:

[Figure: predicted CO2 concentration (ppm) against time (years), 2010 to 2020.]


Exp. 2: Porto taxi dataset

This dataset consists of taxi GPS locations recorded over a year.

[Figure: location of passenger pick-ups.]


Exp. 2: Porto taxi dataset

It has already been successfully modelled by Cox processes:

$$f \sim GP(0, k)$$

$$y_D \sim P\left(\int_D f(x)^2\, dx\right)$$

This model assumes there is a smooth underlying rate.

Ref: S. John and J. Hensman, Large-scale Cox process with variational Fourier Features, ICML 2018


Exp. 2: Porto taxi dataset

We introduce a graph corresponding to the road network and clip the data to the graph nodes (first three weeks):

[Figure: map of the road network graph with pick-up counts per node; colour scale from 0 to 10.]


Exp. 2: Porto taxi dataset

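We choose the following Cox process model with a GP indexed by the graph nodes:

$$f \sim N(0, Q^{-1})$$

$$y_i \sim P(\exp(f_i)\, w_i)$$

where $w_i$ is the length of the edges leading to node i.

Q is defined such that the norm it generates is the sum of the Matérn 1/2 norms on each edge:

$$
g^T Q h = \frac{1}{\sigma^2} \sum_{(i,j) \in E} \left[ \frac{1}{1 - \lambda_{i,j}}
\begin{pmatrix} g_i & g_j \end{pmatrix}
\begin{pmatrix} 1 & -\lambda_{i,j} \\ -\lambda_{i,j} & 1 \end{pmatrix}
\begin{pmatrix} h_i \\ h_j \end{pmatrix}
- \frac{1}{2} g_i h_i - \frac{1}{2} g_j h_j \right]
$$

where $\lambda_{i,j} = \sigma^2 \exp(-d_{i,j}/\ell)$ with $\sigma^2 = 10$, $\ell = 10^4$.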


Exp. 2: Porto taxi dataset

Using variational inference, we obtain the following posterior on f :

[Figure: map of the posterior on f over the road network; colour scale from −11.3 to 2.2.]


Exp. 2: Porto taxi dataset

It corresponds to the following predictions for y :

[Figure: map of the predictions for y over the road network; colour scale from 0 to 149.]


Exp. 2: Porto taxi dataset

Using the next three weeks of the data as a test set, we compare the likelihood of three candidates:

Variational inference              -15778.5
Hamiltonian Monte Carlo            -15873.6
Baseline using empirical rates     -17146.6

Our models perform better than the baseline, which means that our initial assumptions make sense!


Conclusion


In a nutshell...

The proposed framework is applicable to a wide class of models such as state space models, Gaussian Markov Random Fields or continuous Markovian processes.

It allows state-of-the-art inference: maximum likelihood, variational inference or Hamiltonian Monte Carlo.

The resulting models show interesting behaviours and inference is fast!
