Functional data ?

Date post: 31-Dec-2015
Description:
Global sensitivity analysis of computer models with functional inputs. B. Iooss (CEA Cadarache), M. Ribatet (CEMAGREF Lyon). Conference SAMO 2007, Budapest, Hungary.
Transcript
Page 1: Functional data ?

Global sensitivity analysis of computer models with functional inputs

B. Iooss (CEA Cadarache)

M. Ribatet (CEMAGREF Lyon)

Conference SAMO 2007, Budapest, Hungary

Page 2: Functional data ?

B. Iooss – SAMO 2007 - 22/06/07

Functional data ?

• The classical model is written Y = f(X), where Y is a scalar output variable and X is a vector of scalar input variables.

X is considered as a vector of random variables ⇒ Y is a random variable.

• The model with functional variables is written Y(v) = f(X1(u1), …, Xp(up)), where

– v and ui are parameters (scalar or multidimensional),

– Y(v) is an output function,

– Xi(ui) is an input function (possibly constant).

Examples for u and v: time t, spatial coordinates (x, y, z), temperature T, …

The Xi(ui) are considered as random functions ⇒ Y(v) is a random function.

Page 5: Functional data ?

An example of a functional input problem

Pollutant (⁹⁰Sr) transport simulation in porous media [Volkova et al., SERRA 07]

[Figure: concentration maps, August 2002 and December 2010]

First study:

• 20 random input variables (permeability, porosity, Kd, …),

• 20 scalar outputs (concentrations at piezometers),

• Latin hypercube (LH) sample (N = 300) ⇒ 300 model evaluations (3 days),

• construction of metamodels,

• global sensitivity analysis (Sobol) via the metamodels.

Result: the permeability of the second layer is the most influential variable.
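The Latin hypercube (LH) design mentioned for the first study can be sketched as follows. This is a minimal NumPy illustration on the unit hypercube, assuming independent inputs. The function name `latin_hypercube` and the seed are illustrative and not from the slides; a real study would also map each column to the distribution of the corresponding physical input.

```python
import numpy as np

def latin_hypercube(n_samples, n_inputs, rng):
    """Return an (n_samples, n_inputs) Latin hypercube design on [0, 1)."""
    # One point per stratum [k/n, (k+1)/n), placed uniformly inside it.
    strata = np.arange(n_samples)[:, None]
    design = (strata + rng.random((n_samples, n_inputs))) / n_samples
    # Shuffle the strata independently for each input.
    for j in range(n_inputs):
        design[:, j] = rng.permutation(design[:, j])
    return design

rng = np.random.default_rng(0)
design = latin_hypercube(300, 20, rng)  # N = 300 runs, 20 scalar inputs
```

Each column of `design` visits every one of the 300 strata exactly once, which is the property that distinguishes an LH sample from a plain random sample.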

Page 6: Functional data ?

An example of a functional input problem

Second study:

• We want to take into account the spatial heterogeneity of the permeability.

• We represent it by a random field ε(x, y).

Realisations of this random field are obtained via geostatistical simulation techniques.

Classical methods of global sensitivity analysis or metamodel construction are no longer applicable.

[Figure: 2 possible realisations of the permeability field]

Page 7: Functional data ?

Some recent works (not exhaustive)

Functional input:

– Tarantola et al., SERRA 02: environmental assessment problem. Some inputs represent the errors in spatially distributed maps (random fields), obtained by simulations.

– Ruffo et al., RESS 06: hydrocarbon exploration risk evaluation. The basin and petroleum system models are very complex random fields. One scenario variable (32 basin models) is treated as a categorical variable.

– Zabalza-Mezghani et al., JPSE 04: hydrocarbon production optimization. The random field is considered as an uncontrollable input variable ("stochastic uncertainty parameter"). The other scalar inputs are the controllable variables.

Page 10: Functional data ?

Our problem and some possible solutions

Compute the Sobol indices when some input variables are functional.

• Complete discretization: infeasible (several thousand parameters).

• Expansion on an appropriate function basis: impractical in some cases (e.g. if the functional input is a temporal white noise).

• Consider the functional input as a single multi-dimensional parameter. Multidimensional sensitivity indices (Sobol, MCS 01; Jacques et al., RESS 06) are computed via algorithms that use independent samples (simple Monte Carlo). FAST, RBD and quasi-MC methods are not applicable.

• Replace the functional input ε by a scalar parameter ξ ~ U[0,1] that governs whether or not the functional input is simulated (Tarantola et al., SERRA 02). Calculate the Sobol index of ξ by any method. This quantifies the sensitivity of the output to the presence/absence of ε, but not to the variability of ε.
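The option of treating the functional input as a single multi-dimensional parameter can be illustrated with a simple Monte Carlo "pick-freeze" estimator built on independent samples. This is a hedged sketch, not the authors' code. The toy model, the white-noise discretization into 20 values and all function names are assumptions chosen so that the exact indices are known (0.2 for X, 0.6 for ε, and a total index of 0.8 for ε).

```python
import numpy as np

def block_sobol_indices(model, sample_x, sample_eps, n, rng):
    """Pick-freeze Monte Carlo estimates, with eps handled as one block."""
    x_a, x_b = sample_x(n, rng), sample_x(n, rng)      # independent samples of X
    e_a, e_b = sample_eps(n, rng), sample_eps(n, rng)  # independent samples of eps
    y_ref = model(x_a, e_a)
    y_x_frozen = model(x_a, e_b)    # same X, fresh eps
    y_eps_frozen = model(x_b, e_a)  # same eps, fresh X
    var_y = y_ref.var()
    s_x = (np.mean(y_ref * y_x_frozen)
           - y_ref.mean() * y_x_frozen.mean()) / var_y
    s_eps = (np.mean(y_ref * y_eps_frozen)
             - y_ref.mean() * y_eps_frozen.mean()) / var_y
    # With only two blocks (X and eps), total index of eps = 1 - index of X.
    return s_x, s_eps, 1.0 - s_x

def toy_model(x, eps):
    # Hypothetical model: eps enters through a standard normal summary z.
    z = eps.mean(axis=1) * np.sqrt(eps.shape[1])
    return x + (1.0 + x) * z

def sample_x(n, rng):
    return rng.uniform(-1.0, 1.0, size=n)

def sample_eps(n, rng):
    return rng.standard_normal((n, 20))  # discretized white noise

rng = np.random.default_rng(0)
s_x, s_eps, st_eps = block_sobol_indices(
    toy_model, sample_x, sample_eps, 100_000, rng)
```

The estimator never looks inside ε: it only needs to redraw or reuse whole realisations, which is why it works for a random field while FAST, RBD or quasi-MC designs (which need a finite-dimensional parametrization) do not.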

Page 15: Functional data ?

Moreover, in our case, we need metamodels

We deal with complex computer codes: nonlinear effects, time-consuming runs, large number of inputs (> 10).

The Sobol indices cannot be estimated by running the code directly; an intermediate metamodel is used instead.

Zabalza-Mezghani et al., JPSE 04, propose to consider the functional input as an uncontrollable parameter.

With scalar inputs X and functional input ε(u), the metamodel becomes a joint model with a mean component E(Y|X) and a variance component Var(Y|X).

Uncertainty propagation is carried out via this joint model.

[Figure: Y plotted against X, showing E(Y|X) and E(Y|X) + σ(Y|X)]

Page 17: Functional data ?

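Sobol indices of the joint model

$$\mathrm{Var}[Y(X,\varepsilon)] = \mathrm{Var}[E(Y|X)] + E[\mathrm{Var}(Y|X)] = \mathrm{Var}[Y_m(X)] + E[Y_d(X)]$$

Variance decomposition of Y:

$$\mathrm{Var}(Y) = \sum_{i=1}^{p} V_i(Y) + \sum_{i<j}^{p} V_{ij}(Y) + \dots + V_{12\dots p}(Y) + E(Y_d)$$

Variance decomposition of Ym:

$$\mathrm{Var}(Y_m) = \sum_{i=1}^{p} V_i(Y_m) + \sum_{i<j}^{p} V_{ij}(Y_m) + \dots + V_{12\dots p}(Y_m)$$

Then, the Sobol indices of X on Y are obtained by:

$$S_{X_i} = \frac{\mathrm{Var}[E(Y_m|X_i)]}{\mathrm{Var}(Y)}$$

E[Yd(X)] contains all the terms that include effects of ε.

Total Sobol index of ε:

$$S_{T_\varepsilon} = \frac{E(Y_d)}{\mathrm{Var}(Y)}$$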

Page 18: Functional data ?

Modeling the mean Ym and dispersion Yd

Dual modeling by 2 polynomials (Taguchi 86; Vining & Myers, JQT 90).

Joint modeling by 2 Generalized Linear Models (McCullagh & Nelder 89):

– more general theoretical framework (exponential family distributions),

– models the mean and the variance simultaneously through iterative fits,

– no replications needed (fewer computations required).

Mean model:

$$E(Y_i) = \mu_i, \quad \eta_i = g(\mu_i) = \sum_j x_{ij}\beta_j, \quad \mathrm{Var}(Y_i) = \phi_i\, v(\mu_i)$$

Dispersion model:

$$E(d_i) = \phi_i, \quad \zeta_i = \log(\phi_i) = \sum_j u_{ij}\gamma_j, \quad \mathrm{Var}(d_i) = 2\phi_i^2$$

• For the dispersion statistic d, we take the deviance contribution.

• Deviance analysis, Student and Fisher tests, residual analyses, … make it possible to select terms and to choose the functions g and v.
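The iterative fit of the mean and dispersion models can be sketched in the simplest special case. The code below assumes an identity link g and a constant variance function v = 1, so that the deviance contribution is d_i = (y_i − μ_i)². The dispersion is fitted as a Gamma GLM with log link by Fisher scoring. The function name, the synthetic data and the iteration counts are illustrative assumptions; this is not the "JointModeling" implementation.

```python
import numpy as np

def fit_joint_glm(X, U, y, n_outer=10, n_inner=25):
    """Alternate a weighted mean fit and a log-link dispersion fit.

    Assumes g = identity and v = 1, so d_i = (y_i - mu_i)^2.
    """
    phi = np.full(len(y), y.var())           # start from a constant dispersion
    for _ in range(n_outer):
        # Mean model: weighted least squares with weights 1 / phi_i.
        w = 1.0 / np.sqrt(phi)
        beta, *_ = np.linalg.lstsq(X * w[:, None], y * w, rcond=None)
        d = (y - X @ beta) ** 2              # deviance contributions
        # Dispersion model: Gamma GLM with log link, fitted by Fisher scoring
        # (the scoring weights are constant for this family and link).
        eta = np.log(phi)
        for _ in range(n_inner):
            z = eta + d / np.exp(eta) - 1.0  # working response
            gamma, *_ = np.linalg.lstsq(U, z, rcond=None)
            eta = U @ gamma
        phi = np.exp(eta)
    return beta, gamma

# Synthetic heteroscedastic data (illustrative values, not from the slides).
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=20_000)
X = np.column_stack([np.ones_like(x), x])    # mean design matrix
U = X.copy()                                 # dispersion design matrix
noise_sd = np.exp(0.5 * (-1.0 + 1.5 * x))    # log-variance = -1 + 1.5 x
y = 1.0 + 2.0 * x + noise_sd * rng.standard_normal(x.size)

beta, gamma = fit_joint_glm(X, U, y)
```

On this synthetic data set the fit should recover the generating coefficients approximately: `beta` near (1, 2) for the mean and `gamma` near (−1, 1.5) for the log-dispersion. No replicated runs are needed, because the dispersion is learned from the squared residuals of the mean model.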

Page 19: Functional data ?

Joint modeling with Generalized Additive Models

The drawback of the GLM is its parametric form, which limits its ability to model complex computer codes.

Replace it by a popular non-parametric model: the GAM (Hastie & Tibshirani).

Mean model:

$$E(Y) = \mu, \quad g(\mu) = \sum_{i=1}^{p} s_i(X_i) + \sum_{i<j}^{p} s_{ij}(X_i, X_j)$$

Dispersion model:

$$E(d) = \phi, \quad \log(\phi) = \sum_{i} s_i(U_i) + \sum_{i<j} s_{ij}(U_i, U_j)$$

The si's are obtained by fitting a smoother to the data: penalized regression splines (with integrated model selection via Generalized Cross Validation).

Deviance analysis, statistical tests on coefficients, residual analyses, … make it possible to select terms.

Compared to other metamodels (kriging, neural networks):

– GAM offers a direct interpretation of the model,

– its drawback lies in the additive-effect hypothesis.

U

Page 20: Functional data ?

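Simple example: Ishigami function with Xi ~ U[−π, π]

$$Y = \sin(X_1) + 7\sin^2(X_2) + 0.1\,X_3^4\,\sin(X_1)$$

To test our joint models, X3 is considered as an uncontrollable input.

Models are fitted on 1e3 data points. The predictivity coefficient Q2 is computed on 1e4 test data points.

Joint GLM (Q2 = 61 %):

$$Y_m = 1.92 + 2.69\,X_1 + 2.17\,X_2^2 - 0.29\,X_1^3 - 0.29\,X_2^4 \quad\text{and}\quad Y_d = 5.7$$

Simple GAM (Q2 = 75 %):

$$Y = 3.76 \;[\pm]\; 2.67\,X_1 + s(X_1) + s(X_2)$$

Joint GAM (Q2 of the mean = 76 %; explained deviance: 93 % for the mean, 37 % for the dispersion):

$$Y_m = 3.75 \;[\pm]\; 3.06\,X_1 + s(X_1) + s(X_2) \quad\text{and}\quad Y_d = \exp\big(0.59 + s(X_1)\big)$$

In the two GAM formulas, [±] marks a sign that is not recoverable from the transcript.

| Indices | Exact | Joint GLM | Joint GAM | Simple GAM |
|---|---|---|---|---|
| S1 | 0.314 | 0.314 | 0.325 | 0.333 |
| S2 | 0.442 | 0.318 | 0.414 | 0.441 |
| ST3 | 0.244 | 0.366 | 0.261 | 0.25 |
| S13 | 0.244 | 0 | > 0 | unknown |
| S23 | 0 | 0 | 0 | unknown |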
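The "Exact" column can be checked from the joint-model decomposition Var(Y) = Var(Ym) + E(Yd), with X3 playing the role of the uncontrollable input. The sketch below uses the closed-form moments of the uniform distribution on [−π, π] and adds a Monte Carlo cross-check of the total variance. Variable names and the sample size are illustrative choices.

```python
import numpy as np

a, b = 7.0, 0.1                 # Ishigami coefficients used on the slide
pi4, pi8 = np.pi**4, np.pi**8

# Mean component:
#   Ym(X1, X2) = E(Y | X1, X2) = (1 + b*pi^4/5) sin(X1) + a sin^2(X2)
v1 = 0.5 * (1.0 + b * pi4 / 5.0) ** 2   # variance of the X1 part of Ym
v2 = a**2 / 8.0                         # variance of the X2 part of Ym
# Dispersion component:
#   Yd(X1) = Var(Y | X1, X2) = b^2 sin^2(X1) Var(X3^4)
e_yd = b**2 * 0.5 * (pi8 / 9.0 - pi8 / 25.0)

var_y = v1 + v2 + e_yd                  # Var(Y) = Var(Ym) + E(Yd)
s1, s2, st3 = v1 / var_y, v2 / var_y, e_yd / var_y

# Monte Carlo cross-check of the total variance.
rng = np.random.default_rng(0)
x = rng.uniform(-np.pi, np.pi, size=(200_000, 3))
y = (np.sin(x[:, 0]) + a * np.sin(x[:, 1]) ** 2
     + b * x[:, 2] ** 4 * np.sin(x[:, 0]))
var_y_mc = y.var()
```

Rounded to three decimals, `s1`, `s2` and `st3` give 0.314, 0.442 and 0.244, matching the "Exact" column. The total index of X3 comes entirely from E(Yd), which is how the joint model recovers the effect of an input it never sees explicitly.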

Page 23: Functional data ?

A hydrogeological application

Pollutant (⁹⁰Sr) transport simulation in porous media

• 16 scalar input variables: sorption coefficients (kd) and permeabilities (per) of the different hydrogeologic layers, porosity, infiltration rate, …

• 1 functional input: the permeability field ε

• LH sample (N = 300) for the 16 inputs ⇒ 300 model evaluations (8 days)

• 1 output: the concentration at a specified location

• Joint GAM: explained deviance of 98 % for the mean and 29 % for the dispersion

Explanatory terms: mean [ s(kd1), s(kd2), s(per3), s(per2, kd2) ]; dispersion [ kd1, kd2 ]

S(kd2) = 52 %, S(per2) = 8 %, S(kd2, per2) = 6 %, S(kd1) = 4 %

ST(ε) = 28 %, S(kd1, ε) > 0 and S(kd2, ε) > 0

[Figure: one realisation of the permeability field]

Page 26: Functional data ?

Conclusions

• This approach, based on joint models to compute Sobol sensitivity indices, is useful in the following situations:

– models with "complex" functional inputs,

– time-consuming models (so a metamodel is needed),

– heteroscedasticity (the functional input interacts with scalar inputs).

• Another major benefit: uncertainty propagation.

• Current limitations:

– the approach cannot distinguish the effects of different functional inputs,

– only qualitative sensitivity indices are obtained for the interactions between the functional input and the other inputs.

Page 29: Functional data ?

Useful SOFTWARE

R packages:

– "JointModeling"

– "sensitivity" by G. Pujol

