From GLMs to GAMs
April 27, 2021
Radost Roumenova Wenman, FCAS, MAAA, CSPA
Consulting Actuary
Introduction
Generalized Linear Models (GLM): $g(E[y]) = \mathbf{X}^{T}\boldsymbol{\beta}$
Generalized Additive Models (GAM): $g(E[y]) = \mathbf{X}^{T}\boldsymbol{\beta} + f(z)$ = GLM + $f(z)$
How about Polynomial Fits?
Background
Trevor Hastie and Robert Tibshirani (1986)
• Replace the linear predictor with an “additive” predictor
• Utilizes smooth functions
• Useful in uncovering nonlinear effects
• Completely automatic: “No detective work is needed…”
GAMs vs. GLMs
General Linear Model (GLM)
Generalized Linear Models (GLMs)
Generalized Additive Models (GAMs)
Modeling Nonlinearity
How Can We Address Nonlinearity?
• Transform X: log(x), x², x³, Box-Cox…
• Categorize X
• GAMs
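The transform and categorize options can be sketched in a few lines of Python. This is a minimal illustration, not taken from the slides: the helper names `box_cox` and `categorize`, the cut points, and the sample values are all assumptions chosen for the example.

```python
import math
from bisect import bisect_right

def box_cox(x, lam):
    """Box-Cox transform of a positive x: log(x) when lam == 0, else (x**lam - 1) / lam."""
    if x <= 0:
        raise ValueError("Box-Cox requires x > 0")
    return math.log(x) if lam == 0 else (x ** lam - 1.0) / lam

def categorize(x, cut_points):
    """Return the index of the band that x falls into, given ascending cut points."""
    return bisect_right(cut_points, x)

x_values = [1.0, 4.0, 9.0, 25.0]  # hypothetical predictor values

# Option 1: transform X and feed the transformed column to the GLM.
transformed = {
    "log(x)": [math.log(x) for x in x_values],
    "x^2": [x ** 2 for x in x_values],
    "x^3": [x ** 3 for x in x_values],
    "Box-Cox (lam=0.5)": [box_cox(x, 0.5) for x in x_values],
}

# Option 2: categorize X into bands (hypothetical bands: x < 2, 2 <= x < 10, x >= 10).
bands = [categorize(x, [2.0, 10.0]) for x in x_values]
```

Both options require the form of the transformation or the cut points to be chosen up front.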
Basis Functions
$$g(E[Y \mid X]) = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_9 X_9$$
$$f(X) = g(E[Y \mid X]) = \beta_0 + \beta_1 B_1(X) + \beta_2 B_2(X) + \cdots + \beta_9 B_9(X)$$
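To make the basis-function form concrete, the sketch below evaluates an additive predictor built from nine piecewise-linear "hat" functions on equally spaced knots. The choice of hat functions, the knot grid on [0, 1], and the coefficient values are assumptions for illustration only; the slides do not say which basis is used.

```python
def hat_basis(x, knots):
    """Evaluate piecewise-linear 'hat' basis functions B_1..B_K at x.

    Assumes equally spaced knots; B_j equals 1 at knot j and falls
    linearly to 0 at the neighbouring knots.
    """
    h = knots[1] - knots[0]
    return [max(0.0, 1.0 - abs(x - k) / h) for k in knots]

def additive_predictor(x, beta0, beta, knots):
    """Return beta0 + sum_j beta_j * B_j(x), which is linear in the coefficients."""
    return beta0 + sum(b * B for b, B in zip(beta, hat_basis(x, knots)))

knots = [j / 8 for j in range(9)]  # 9 knots on [0, 1] give 9 basis functions
beta0 = 1.0                        # illustrative intercept
beta = [0.0, 0.5, 1.5, 2.0, 1.0, -0.5, -1.0, 0.0, 0.5]  # illustrative coefficients

# A nonlinear curve in x, obtained from a model that is linear in beta.
curve = [additive_predictor(x / 20, beta0, beta, knots) for x in range(21)]
```

Because the predictor is linear in the coefficients, each column B_j(X) can be treated like an ordinary GLM predictor once the basis has been chosen.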
Regression Splines
Smoothing Splines and the Bias-Variance Trade-off
GAM – Example with Poisson
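GLM: $\log \mu_i = \log e_i + \eta_i = \log e_i + \beta z_i$
GAM: $\log \mu_i = \log e_i + \eta_i = \log e_i + f(z_i)$

Log-likelihood, with $\mu_i = \exp(f(z_i))$ in the second form (unit exposure):
$$l(y, \mu) = \sum_{i=1}^{n} \left[ y_i \log \mu_i - \mu_i - \log y_i! \right] = \sum_{i=1}^{n} \left[ y_i f(z_i) - \exp(f(z_i)) - \log y_i! \right]$$

Penalized log-likelihood:
$$\sum_{i=1}^{n} \left[ y_i f(z_i) - \exp(f(z_i)) - \log y_i! \right] - \frac{1}{2} \lambda \int f''(z)^2 \, dz$$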
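The penalized Poisson log-likelihood can be evaluated numerically to see how the roughness penalty acts. The sketch below is only an illustration under stated assumptions: the smooth is represented by its values on an equally spaced grid, the integral of the squared second derivative is approximated by second differences, the offset is read as log exposure, and the counts and candidate curves are made up.

```python
import math

def poisson_loglik(y, mu):
    """Sum of y_i * log(mu_i) - mu_i - log(y_i!) over all observations."""
    return sum(yi * math.log(mi) - mi - math.lgamma(yi + 1) for yi, mi in zip(y, mu))

def roughness(f_vals, h):
    """Approximate the integral of f''(z)^2 with second differences on a grid of spacing h."""
    second_diffs = [(f_vals[j - 1] - 2 * f_vals[j] + f_vals[j + 1]) / h ** 2
                    for j in range(1, len(f_vals) - 1)]
    return h * sum(d ** 2 for d in second_diffs)

def penalized_loglik(y, exposure, f_vals, h, lam):
    """Poisson log-likelihood with log(mu_i) = log(exposure_i) + f(z_i), minus (lam / 2) * roughness."""
    mu = [e * math.exp(f) for e, f in zip(exposure, f_vals)]
    return poisson_loglik(y, mu) - 0.5 * lam * roughness(f_vals, h)

# Hypothetical data: counts observed at z = 0, 1, ..., 5 with unit exposure.
y = [1, 0, 2, 1, 3, 2]
exposure = [1.0] * 6
f_linear = [0.1 * z for z in range(6)]      # straight line: no curvature
f_wiggly = [0.0, 0.8, -0.3, 0.9, 0.2, 1.1]  # bumpy candidate
lam = 2.0                                   # illustrative smoothing parameter

penalty_linear = roughness(f_linear, h=1.0)
penalty_wiggly = roughness(f_wiggly, h=1.0)
score_linear = penalized_loglik(y, exposure, f_linear, 1.0, lam)
score_wiggly = penalized_loglik(y, exposure, f_wiggly, 1.0, lam)
```

A straight line incurs essentially no penalty, so its penalized and unpenalized log-likelihoods coincide, while the bumpy candidate is marked down in proportion to the smoothing parameter.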
Flexibility of GAMs
• Multiple predictors
• Mixture of smoothing splines, linear terms, and nominal variables
• Smooth interactions
GAM Summary Output
Family: gaussian
Link function: identity

Parametric coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  -25.546      1.951   -13.1   <2e-16 ***

coef(gam_mod) – smooth terms:
  s(z).1   s(z).2   s(z).3   s(z).4   s(z).5
 -63.718   43.476 -110.350  -22.181   35.034
  s(z).6   s(z).7   s(z).8   s(z).9
  93.176    9.283 -111.661   17.603

Approximate significance of smooth terms:
       edf     F p-value
s(z) 8.693 53.52  <2e-16 ***

R-sq.(adj) = 0.783   Deviance explained = 79.8%
GCV = 545.78   Scale est. = 506   n = 133
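The reported GCV score can be cross-checked against the other figures in the output. The sketch below assumes the usual Gaussian definition, GCV = n · D / (n − edf)², where D is the residual deviance and the scale estimate equals D / (n − edf). It also assumes that the total edf is the edf of s(z) plus one for the intercept.

```python
n = 133             # number of observations, from the summary
edf_smooth = 8.693  # effective degrees of freedom of s(z)
scale_est = 506.0   # scale estimate (rounded in the printed output)

total_edf = edf_smooth + 1.0            # assumption: one extra degree of freedom for the intercept
deviance = scale_est * (n - total_edf)  # because scale = D / (n - edf)
gcv = n * deviance / (n - total_edf) ** 2

print(f"total edf = {total_edf:.3f}, GCV = {gcv:.2f}")
```

Under these assumptions the result agrees with the reported GCV of 545.78 to two decimals; any small remaining difference comes from the rounded scale estimate.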
Hypothesis Testing, GCV, AIC, Stepwise Variable Selection, Shrinkage
Insurance Application of GAMs – Geospatial Smoothing
Including geographic territories directly in a GLM is generally not feasible!
• Popular technique – smoothing and clustering
  – zero exposure?
  – homogeneous?
  – clustering method?
• Alternative technique – GAM
  – directly applies spatial smoothing
  – can use longitude and latitude
GAM Approach to Modeling Geolocation Data
• Method 1 (two-step)
  – include non-geographic variables as predictors in a GLM
  – extract the GLM residuals
  – use a GAM to regress the GLM residuals on f(longitude, latitude)
• Method 2 (two-step)
  – include non-geographic variables as predictors in a GLM
  – extract the GLM linear predictor
  – use the GLM linear predictor as an offset in a GAM with only f(longitude, latitude)
• Method 3 (one-step)
  – include all variables, including geolocation, as predictors in a GAM
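The data flow of Method 1 can be sketched on a toy portfolio. To keep the example free of dependencies, two substitutions are made and should be read as assumptions: the first-step GLM is a Gaussian, identity-link model with one predictor fitted by least squares, and a Gaussian kernel smoother stands in for the GAM smooth f(longitude, latitude). In practice a GAM library would fit the spatial term. All names, data values, and the bandwidth are hypothetical.

```python
import math

def fit_line(x, y):
    """Least-squares intercept and slope (a Gaussian, identity-link GLM with one predictor)."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    slope = (sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
             / sum((a - x_bar) ** 2 for a in x))
    return y_bar - slope * x_bar, slope

def spatial_smooth(lon, lat, values, lon0, lat0, bandwidth):
    """Kernel-weighted average of values near (lon0, lat0); stand-in for f(longitude, latitude)."""
    weights = [math.exp(-((a - lon0) ** 2 + (b - lat0) ** 2) / (2 * bandwidth ** 2))
               for a, b in zip(lon, lat)]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)

# Hypothetical toy portfolio: one non-geographic rating variable, two spatial clusters.
x_rating = [1, 2, 3, 4, 1, 2, 3, 4]
lon = [0.0, 0.1, 0.2, 0.3, 1.0, 1.1, 1.2, 1.3]
lat = [0.0, 0.1, 0.0, 0.1, 1.0, 1.1, 1.0, 1.1]
true_geo = [-2.0] * 4 + [2.0] * 4
loss = [10.0 + 1.5 * x + g for x, g in zip(x_rating, true_geo)]

# Step 1: fit the model on the non-geographic variable only.
intercept, slope = fit_line(x_rating, loss)
# Step 2: extract the residuals.
residuals = [obs - (intercept + slope * x) for obs, x in zip(loss, x_rating)]
# Step 3: smooth the residuals over (longitude, latitude).
geo_west = spatial_smooth(lon, lat, residuals, 0.15, 0.05, bandwidth=0.3)
geo_east = spatial_smooth(lon, lat, residuals, 1.15, 1.05, bandwidth=0.3)
```

Method 2 follows the same first step but carries the fitted linear predictor into the second model as a fixed offset instead of smoothing residuals, and Method 3 estimates everything in a single fit.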
In a Nutshell…
GAMs = Penalized GLMs!
Recommended References
• Hastie, T., Tibshirani, R. (1990). Generalized Additive Models, Chapman & Hall/CRC.
• Wood, S. (2017). Generalized Additive Models: An Introduction with R, Chapman & Hall/CRC.
• Fahrmeir, L., Kneib, T., Lang, S., Marx, B. (2013). Regression: Models, Methods and Applications, Springer.
• Klein, N., Denuit, M., Lang, S., Kneib, T. (2014). Nonlife Ratemaking and Risk Management with Bayesian Generalized Additive Models for Location, Scale, and Shape. Insurance: Mathematics and Economics 55:225–49.
• http://www.variancejournal.org/issues/13-01/141.pdf
• https://www.soa.org/globalassets/assets/files/e-business/pd/events/2020/predictive-analytics-4-0/pd-2020-09-pas-session-006.pdf