From GLMs to GAMs
April 27, 2021
Radost Roumenova Wenman, FCAS, MAAA, CSPA, Consulting Actuary
Introduction
Generalized Linear Models (GLM) and Generalized Additive Models (GAM)
$g(E[y]) = X^T \beta + f(z)$, where the term $X^T \beta$ is the GLM part and $f(z)$ is the additive smooth term
How about Polynomial Fits?
Background
Trevor Hastie and Robert Tibshirani (1986)
• Replace the linear predictor with an "additive" predictor
• Utilizes smooth functions
• Useful in uncovering nonlinear effects
• Completely automatic: "No detective work is needed..."
GAMs vs. GLMs
General Linear Model (GLM)
Generalized Linear Models (GLMs)
Generalized Additive Models (GAMs)
Modeling Nonlinearity
How Can We Address Nonlinearity?
• Transform X: log(x), x^2, x^3, Box-Cox, ...
• Categorize X
• GAMs
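The first two options (transforming X and categorizing X) can be sketched in a few lines of Python. This is only an illustration: the predictor `age`, the Box-Cox parameter 0.5, and the cut points are hypothetical and do not come from the slides.

```python
import math
from bisect import bisect_right


def box_cox(x, lam):
    """Box-Cox transform of a positive x: (x^lam - 1) / lam, or log(x) when lam == 0."""
    if x <= 0:
        raise ValueError("Box-Cox requires x > 0")
    return math.log(x) if lam == 0 else (x ** lam - 1.0) / lam


def categorize(x, cut_points):
    """Return the index of the bin containing x, given sorted cut points."""
    return bisect_right(cut_points, x)


age = 37.0                                      # hypothetical predictor value
transformed = [math.log(age), age ** 2, age ** 3, box_cox(age, 0.5)]
age_band = categorize(age, [25, 35, 45, 55])    # bands: <25, 25-34, 35-44, 45-54, 55+
```

Both approaches require the analyst to choose the transform or the cut points in advance, which is the manual step a GAM is meant to avoid.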
Basis Functions
Polynomial fit: $g(E[Y \mid X]) = \beta_0 + \beta_1 X^1 + \beta_2 X^2 + \dots + \beta_9 X^9$
Basis functions: $f(X) = g(E[Y \mid X]) = \beta_0 + \beta_1 B_1(X) + \beta_2 B_2(X) + \dots + \beta_9 B_9(X)$
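To make the basis-function idea concrete, here is a minimal Python sketch. It assumes a cubic truncated power basis with six hypothetical knots (giving nine basis functions) and made-up coefficients; the basis actually used on the slide may well be a different spline basis. The point is only that the predictor is nonlinear in X but linear in the coefficients.

```python
def truncated_power_basis(x, knots):
    """Return [B1(x), ..., BK(x)] for a cubic truncated power basis.

    B1..B3 are x, x^2, x^3; each knot k adds max(x - k, 0)^3.
    """
    basis = [x, x ** 2, x ** 3]
    basis += [max(x - k, 0.0) ** 3 for k in knots]
    return basis


def additive_predictor(x, beta0, betas, knots):
    """Evaluate f(x) = beta0 + sum_j beta_j * B_j(x)."""
    basis = truncated_power_basis(x, knots)
    if len(basis) != len(betas):
        raise ValueError("need one coefficient per basis function")
    return beta0 + sum(b * B for b, B in zip(betas, basis))


# Hypothetical knots and coefficients, chosen only for illustration.
knots = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]           # 3 + 6 = 9 basis functions
betas = [0.5, -0.1, 0.02, 0.3, -0.2, 0.1, 0.0, -0.05, 0.04]
value = additive_predictor(2.5, beta0=1.0, betas=betas, knots=knots)
```

Because the model is linear in the coefficients, the basis columns can be treated as ordinary GLM predictors once the knots are fixed.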
Regression Splines
Smoothing Splines and the Bias-Variance Trade-off
GAM - Example with Poisson
GLM: $\log \mu_i = \log e_i + \eta_i = \log e_i + \beta z_i$
GAM: $\log \mu_i = \log e_i + \eta_i = \log e_i + f(z_i)$
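Log-likelihood, writing $y_i$ for the observed count and taking unit exposure so that $\mu_i = \exp(f(z_i))$:
$$\ell(y, f) = \sum_{i=1}^{n} \left[ y_i \log \mu_i - \mu_i - \log(y_i!) \right] = \sum_{i=1}^{n} \left[ y_i f(z_i) - \exp(f(z_i)) - \log(y_i!) \right]$$
Penalized log-likelihood:
$$\ell_{pen}(y, f) = \sum_{i=1}^{n} \left[ y_i f(z_i) - \exp(f(z_i)) - \log(y_i!) \right] - \frac{1}{2} \lambda \int \left( f''(z) \right)^2 dz$$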
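The trade-off in the penalized Poisson log-likelihood can be sketched numerically. The Python example below is a rough illustration under several assumptions: the counts are made up, the function f is represented only by its values on an equally spaced grid, exposure is taken as 1, and the integral of the squared second derivative is approximated with second differences. It is not how a GAM package fits the model.

```python
import math


def poisson_loglik(y, f_vals):
    """Sum of y_i * f_i - exp(f_i) - log(y_i!) with mu_i = exp(f_i)."""
    return sum(
        yi * fi - math.exp(fi) - math.lgamma(yi + 1)
        for yi, fi in zip(y, f_vals)
    )


def roughness_penalty(f_vals, h):
    """Approximate the integral of f''(z)^2 dz on an equally spaced grid.

    Uses central second differences (f[i-1] - 2 f[i] + f[i+1]) / h^2
    at the interior grid points and a Riemann sum with step h.
    """
    total = 0.0
    for i in range(1, len(f_vals) - 1):
        second_diff = (f_vals[i - 1] - 2.0 * f_vals[i] + f_vals[i + 1]) / h ** 2
        total += second_diff ** 2 * h
    return total


def penalized_loglik(y, f_vals, h, lam):
    """Log-likelihood minus (lam / 2) times the roughness penalty."""
    return poisson_loglik(y, f_vals) - 0.5 * lam * roughness_penalty(f_vals, h)


# Hypothetical counts observed at z = 0, 1, 2, 3, 4 (grid spacing h = 1).
y = [1, 3, 2, 6, 4]
wiggly = [math.log(v) for v in y]              # reproduces every count exactly
straight = [0.2 + 0.3 * z for z in range(5)]   # linear in z, so f'' = 0
```

With `lam = 0` the wiggly curve has the higher value because it reproduces the data exactly; with a large `lam` (for example 10) the straight line wins, since its roughness penalty is essentially zero.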
Flexibility of GAMs
• Multiple predictors
• Mixture of smoothing splines, linear terms, and nominal variables
• Smooth interactions
GAM Summary Output
Family: gaussian
Link function: identity

Parametric coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  -25.546      1.951   -13.1   <2e-16 ***

coef(gam_mod) - smooth terms:
  s(z).1   s(z).2   s(z).3   s(z).4   s(z).5
 -63.718   43.476 -110.350  -22.181   35.034
  s(z).6   s(z).7   s(z).8   s(z).9
  93.176    9.283 -111.661   17.603

Approximate significance of smooth terms:
       edf     F p-value
s(z) 8.693 53.52  <2e-16 ***

R-sq.(adj) = 0.783   Deviance explained = 79.8%
GCV = 545.78   Scale est. = 506   n = 133
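As a quick arithmetic check on the fit statistics in this output, the Python sketch below recomputes GCV from n, the scale estimate, and the effective degrees of freedom. It assumes the usual Gaussian definitions, GCV = n·D/(n − edf)² and scale estimate = D/(n − edf) with D the residual sum of squares, and it assumes the total edf is 1 for the intercept plus the edf of s(z). These definitions are assumptions about how the software produced the output; the slide does not state them.

```python
def gcv_from_scale(n, scale_est, total_edf):
    """GCV = n * D / (n - edf)^2, with D = scale_est * (n - edf) for a Gaussian fit."""
    deviance = scale_est * (n - total_edf)    # implied residual sum of squares
    return n * deviance / (n - total_edf) ** 2


n = 133
scale_est = 506.0             # "Scale est." from the output (already rounded)
total_edf = 1.0 + 8.693       # intercept plus the edf of s(z)
gcv = gcv_from_scale(n, scale_est, total_edf)
```

Under these assumptions the result is within rounding of the reported GCV of 545.78. The edf of 8.693 is also close to the nine basis coefficients shown for s(z), which suggests the fit is using most of the available flexibility.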
Hypothesis Testing, GCV, AIC, Stepwise Variable Selection, Shrinkage
Insurance Application of GAMs - Geospatial Smoothing
Including geographic territories directly in a GLM is generally not feasible!
• Popular technique - smoothing and clustering
  - zero exposure?
  - homogeneous?
  - clustering method?
• Alternative technique - GAM
  - directly applies spatial smoothing
  - can use longitude and latitude
GAM Approach to Modeling Geolocation Data
• Method 1 (two-step)
  - include non-geographic variables as predictors in a GLM
  - extract the GLM residuals
  - use GAM to regress the GLM residuals on f(longitude, latitude)
• Method 2 (two-step)
  - include non-geographic variables as predictors in a GLM
  - extract the GLM linear predictor
  - use the GLM linear predictor as an offset in a GAM only with f(longitude, latitude)
• Method 3 (one-step)
  - include all variables, including geolocation, as predictors in a GAM
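To illustrate how the offset in Method 2 works, the short Python sketch below combines a GLM linear predictor with a spatial smooth value. It assumes a log link, and the numbers (a GLM frequency of 0.08 and a spatial effect of 0.25) are hypothetical. It shows only how the two pieces combine at prediction time, not how either model is fitted.

```python
import math


def combine_with_offset(glm_linear_predictor, spatial_effect):
    """Expected value under a log link when the GLM linear predictor is the offset.

    mu = exp(offset + f(lon, lat)) = exp(offset) * exp(f(lon, lat))
    """
    return math.exp(glm_linear_predictor + spatial_effect)


# Hypothetical policy: GLM linear predictor and fitted spatial smooth value.
eta_glm = math.log(0.08)      # the GLM alone predicts a frequency of 0.08
f_spatial = 0.25              # hypothetical value of f(longitude, latitude)
mu = combine_with_offset(eta_glm, f_spatial)
relativity = mu / math.exp(eta_glm)    # equals exp(f_spatial)
```

Under a log link the spatial smooth therefore acts as a multiplicative territory relativity applied on top of the GLM prediction.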
In a Nutshell...
GAMs = Penalized GLMs!
Recommended References
• Hastie, T., Tibshirani, R. (1990). Generalized Additive Models, Chapman & Hall/CRC.
• Wood, S. (2017). Generalized Additive Models: An Introduction with R, Chapman & Hall/CRC.
• Fahrmeir, L., Kneib, T., Lang, S., Marx, B. (2013). Regression: Models, Methods and Applications, Springer.
• Klein, N., Denuit, M., Lang, S., Kneib, T. (2014). Nonlife Ratemaking and Risk Management with Bayesian Generalized Additive Models for Location, Scale, and Shape. Insurance: Mathematics and Economics 55:225-49.
• http://www.variancejournal.org/issues/13-01/141.pdf
• https://www.soa.org/globalassets/assets/files/e-business/pd/events/2020/predictive-analytics-4-0/pd-2020-09-pas-session-006.pdf