Date post: 03-Jul-2020
Towards Explainable and Stable Prediction Peng Cui Tsinghua University
Page 1: Tsinghua Universitypengcui.thumedialab.com/papers/Explainable and... · Transportation Fintech. Black-box Model Risk of Today’s AI Slide from DARPA. Unexplainable Human in the loop

Towards Explainable and Stable Prediction

Peng Cui

Tsinghua University

Page 2

Right time to consider Risk of Today’s AI

Human

Healthcare

Law

Transportation

Fintech

Page 3

Black-box Model

Risk of Today’s AI

Slide from DARPA

Page 4

Risk of Today’s AI Algorithms

Unexplainable

Human in the loop

Medical

Military

Finance

Page 5

Yes

Maybe

No

Risk of Today’s AI Algorithms

Page 6

• Cancer survival rate prediction

Training Data

Predictive Model

Testing Data

City Hospital

University Hospital

Features:

• Body status

• Income

• Treatments

• Medications

Higher income, higher survival rate.

City Hospital

Survival rate is not so correlated with income.

Risk of Today’s AI Algorithms

Page 7

• It’s the fault of Data!

• Our models are designed under the IID hypothesis.

Why do they fail?

Training Distribution

Test Distribution

Model
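Written as a formula, the IID hypothesis says that training and test samples come from the same joint distribution. The subscripts below are notation introduced here for illustration; they are not taken from the slide.

```latex
\[
P_{\mathrm{train}}(X, Y) = P_{\mathrm{test}}(X, Y)
\]
```

A model can fail when this equality does not hold, that is, when the test distribution differs from the training distribution.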

Page 8

Correlation: 0.95

P-value: e-10

Statistical Support

Page 9

Research Problems

• Comes down to the Model

[Diagram: a Model learned on Distribution 1 is evaluated on Distribution 1, Distribution 2, Distribution 3, ..., Distribution n, giving Accuracy 1, Accuracy 2, Accuracy 3, ..., Accuracy n. Labels: I.I.D. Learning; Transfer Learning; Stable Prediction: VAR (Acc).]
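One way to read the VAR (Acc) label is as the variance of a model's accuracy across test distributions. The sketch below assumes that reading; the function name `stability_report` and the accuracy values are hypothetical and do not come from the talk.

```python
from statistics import mean, pvariance


def stability_report(accuracies):
    """Summarize per-distribution accuracies by their mean and variance."""
    return {"mean_acc": mean(accuracies), "var_acc": pvariance(accuracies)}


# Hypothetical accuracies of two models on four test distributions.
iid_model = [0.95, 0.80, 0.65, 0.50]
stable_model = [0.85, 0.84, 0.83, 0.84]

print("iid model   ", stability_report(iid_model))
print("stable model", stability_report(stable_model))
```

Under this reading, a stable model is one whose accuracy variance stays small even when its best-case accuracy is lower.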

Page 10

A fundamental thought

Causality

Explainability

Stability

Page 11

Causality for Explainability

Page 12

Causality for Stability

• Prediction / Classification

• 𝑋: vector of features; 𝑌 ∈ {0, 1}

• Suppose 𝑋 = {𝑆, 𝑉}, and 𝑌 = 𝑓(𝑆) + 𝜀

• 𝑆: set of stable (causal) features

• 𝑉: set of non-causal features

• 𝑃(𝑌|𝑆) is stable, but 𝑃(𝑌|𝑉) is not stable

• 𝑌 and 𝑉 are NOT independent

• Some 𝒗 ⊆ 𝑽 would be learned as important predictors

Spurious Correlation!
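The claim that 𝑃(𝑌|𝑆) is stable while 𝑃(𝑌|𝑉) is not can be checked on synthetic data. The sketch below is an illustration under stated assumptions: the two environments, the 0.9 and 0.3 agreement rates, and the names `sample_env` and `p_y_given` are all hypothetical.

```python
import random


def sample_env(n, v_follows_y, seed):
    """Draw (s, v, y) triples where y is caused by s only.

    v_follows_y is the probability that v equals y in this environment,
    so v is correlated with y without causing it.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        s = rng.randint(0, 1)
        y = s if rng.random() < 0.9 else 1 - s          # Y = f(S) + noise
        v = y if rng.random() < v_follows_y else 1 - y  # environment-specific link
        data.append((s, v, y))
    return data


def p_y_given(data, index):
    """Empirical P(Y = 1 | feature = 1), where index 0 is S and index 1 is V."""
    ys = [y for s, v, y in data if (s, v)[index] == 1]
    return sum(ys) / len(ys)


env_a = sample_env(20000, v_follows_y=0.9, seed=0)
env_b = sample_env(20000, v_follows_y=0.3, seed=1)

for name, env in [("env A", env_a), ("env B", env_b)]:
    print(name,
          "P(Y=1|S=1) =", round(p_y_given(env, 0), 2),
          "P(Y=1|V=1) =", round(p_y_given(env, 1), 2))
```

The conditional on S barely moves between the two environments, while the conditional on V changes a lot, which is why a predictor that leans on V does not transfer.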

Page 13

Towards stable prediction

• Discard spurious correlation and embrace causality.

Typical Causal Framework (graph over X, T, Y)

Estimate the causal effect of treatment T on output Y under the confounder X (A/B Testing).

Typical Correlation Framework (graph over X, T, Y)

Estimate the correlation effect of variable T and output Y without evaluating the relationships between X and T.

Page 14

Causal Inference by Absolute Matching

Typical Causal Framework (graph over X, T, Y)

Analogy of A/B Testing

Given a feature T:

• Find sample pairs in which one sample contains T and the other does not, but which are similar in all other features.

• Calculate the difference of the Y distribution between the treated and control groups (correlation between T and Y).

The requirement is too strong: we can hardly find groups of samples that satisfy it.
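The matching procedure can be sketched for binary features as follows. This is a minimal illustration, not the speaker's implementation: the function name `exact_matching_effect`, the data layout, and the toy samples are assumptions, and "similar in all other features" is simplified to "identical in all other features".

```python
from collections import defaultdict


def exact_matching_effect(samples, t_index):
    """Estimate the effect of binary feature t_index on y by exact matching.

    samples: list of (features, y), with features a tuple of 0/1 values.
    Samples are grouped by every feature except t_index; only groups that
    contain both treated and control samples contribute to the estimate.
    """
    groups = defaultdict(lambda: {0: [], 1: []})
    for features, y in samples:
        others = features[:t_index] + features[t_index + 1:]
        groups[others][features[t_index]].append(y)

    weighted_sum, total = 0.0, 0
    for arms in groups.values():
        if arms[0] and arms[1]:
            diff = sum(arms[1]) / len(arms[1]) - sum(arms[0]) / len(arms[0])
            size = len(arms[0]) + len(arms[1])
            weighted_sum += diff * size
            total += size
    if total == 0:
        return None  # no matched groups at all
    return weighted_sum / total


# Toy data: y = 2 * x1 + 1 * t, with t stored at feature index 0.
toy = [((1, 0), 1.0), ((0, 0), 0.0), ((1, 1), 3.0), ((0, 1), 2.0), ((1, 1), 3.0)]
print(exact_matching_effect(toy, t_index=0))

# With no overlap between treated and control covariates, nothing matches.
print(exact_matching_effect([((1, 0), 1.0), ((0, 1), 0.0)], t_index=0))
```

The second call returns `None`, which is the practical form of the slide's objection: with many features, few or no groups contain both a treated and a control sample.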

Page 15

Causal Inference by Confounder Balancing

Typical Causal Framework (graph over X, T, Y)

Analogy of A/B Testing

Given a feature T:

• Assign different weights to samples so that the samples with T and the samples without T have similar distributions in X.

• Calculate the difference of the Y distribution between the treated and control groups (correlation between T and Y).

Too many parameters: for N samples and K features, we need to learn K*N weights. Not learning-friendly.
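For a single treatment feature T and discrete covariates X, balancing can be illustrated with frequency-ratio weights. This is a simple stand-in chosen for clarity, not the learned weighting the talk refers to; `balancing_weights`, `weighted_effect`, and the toy data are hypothetical.

```python
from collections import Counter


def balancing_weights(xs, ts):
    """Weights that make the control group's X distribution match the treated group's.

    xs: list of hashable covariate tuples; ts: list of 0/1 treatment flags.
    Treated samples keep weight 1. A control sample with covariates x gets
    weight P(x | treated) / P(x | control).
    """
    treated = Counter(x for x, t in zip(xs, ts) if t == 1)
    control = Counter(x for x, t in zip(xs, ts) if t == 0)
    n_t, n_c = sum(treated.values()), sum(control.values())
    weights = []
    for x, t in zip(xs, ts):
        if t == 1:
            weights.append(1.0)
        else:
            weights.append((treated[x] / n_t) / (control[x] / n_c))
    return weights


def weighted_effect(ys, ts, weights):
    """Difference of weighted mean outcomes between treated and control samples."""
    def wmean(flag):
        num = sum(w * y for y, t, w in zip(ys, ts, weights) if t == flag)
        den = sum(w for t, w in zip(ts, weights) if t == flag)
        return num / den
    return wmean(1) - wmean(0)


# Toy data: y = 2 * x + 1 * t, and x is more common among treated samples.
xs = [(1,), (1,), (1,), (0,), (1,), (0,), (0,), (0,)]
ts = [1, 1, 1, 1, 0, 0, 0, 0]
ys = [3.0, 3.0, 3.0, 1.0, 2.0, 0.0, 0.0, 0.0]

print("unweighted:", weighted_effect(ys, ts, [1.0] * len(ys)))
print("balanced:  ", weighted_effect(ys, ts, balancing_weights(xs, ts)))
```

Each choice of T needs its own vector of N weights, so repeating this for every one of K features is where the K*N count on the slide comes from.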

Page 16

Global Balancing: bridging causality and prediction

Typical Causal Framework (graph over X, T, Y)

Analogy of A/B Testing

Given ANY feature T:

• Assign different weights to samples so that the samples with T and the samples without T have similar distributions in X.

• Calculate the difference of the Y distribution between the treated and control groups (correlation between T and Y).

Reduce the number of parameters from K*N to N.

Kun Kuang, Peng Cui, Susan Athey, Ruoxuan Li, Bo Li. Stable Prediction across Unknown Environments. KDD, 2018.
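The idea of one shared weight vector can be sketched as a loss that treats each feature in turn as the treatment and measures how unbalanced the remaining features are. The code below only evaluates such a loss for given weights; it does not optimize them, and `global_balancing_loss` with its toy data is an assumed illustration rather than the cited paper's code.

```python
def global_balancing_loss(X, W):
    """Sum over features j of the squared imbalance of the other features.

    X: list of rows of 0/1 features; W: one non-negative weight per sample.
    For each j, feature j plays the treatment, and the weighted means of
    every other feature are compared between rows with X[i][j] == 1 and
    rows with X[i][j] == 0.
    """
    n_features = len(X[0])
    loss = 0.0
    for j in range(n_features):
        w1 = sum(w for row, w in zip(X, W) if row[j] == 1)
        w0 = sum(w for row, w in zip(X, W) if row[j] == 0)
        if w1 == 0 or w0 == 0:
            continue  # one group is empty under these weights; skip feature j
        for k in range(n_features):
            if k == j:
                continue
            m1 = sum(w * row[k] for row, w in zip(X, W) if row[j] == 1) / w1
            m0 = sum(w * row[k] for row, w in zip(X, W) if row[j] == 0) / w0
            loss += (m1 - m0) ** 2
    return loss


# Two correlated binary features: cells (1,1) and (0,0) are over-represented.
X = [[1, 1], [1, 1], [1, 0], [0, 1], [0, 0], [0, 0]]
uniform = [1.0] * 6
rebalanced = [0.5, 0.5, 1.0, 1.0, 0.5, 0.5]  # every cell now has total weight 1

print("uniform weights:   ", global_balancing_loss(X, uniform))
print("rebalanced weights:", global_balancing_loss(X, rebalanced))
```

A single vector of N weights drives the loss to zero for both features at once here, which is the sense in which the parameter count drops from K*N to N.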

Page 17

Theoretical Guarantee

Kun Kuang, Peng Cui, Susan Athey, Ruoxuan Li, Bo Li. Stable Prediction across Unknown Environments. KDD, 2018.

Page 18

Causal Regularizer

Equation annotations:

• Set feature j as treatment variable

• All features excluding treatment j

• Sample weights

• Indicator of treatment status

Zheyan Shen, Peng Cui, Kun Kuang, Bo Li. Causally Regularized Learning on Data with Agnostic Bias. ACM MM, 2018.
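The slide's equation did not survive extraction. One form consistent with its four annotations is sketched below. The symbols are assumptions introduced here: $W$ for the sample weights, $I_j$ for the indicator of treatment status when feature $j$ is the treatment, $X_{-j}$ for all features excluding $j$, $p$ for the number of features, and $\odot$ for the element-wise product. The cited paper remains the authoritative source for the exact expression.

```latex
\[
\sum_{j=1}^{p}
\left\|
\frac{X_{-j}^{\top} \left( W \odot I_j \right)}{W^{\top} I_j}
-
\frac{X_{-j}^{\top} \left( W \odot (1 - I_j) \right)}{W^{\top} (1 - I_j)}
\right\|_2^2
\]
```

Each term compares the weighted mean of the remaining features between samples with and without feature $j$, so the sum is small when one weight vector balances every feature at once.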

Page 19

Causally Regularized Logistic Regression

Equation annotations:

• Sample reweighted logistic loss

• Causal Contribution

Zheyan Shen, Peng Cui, Kun Kuang, Bo Li. Causally Regularized Learning on Data with Agnostic Bias. ACM MM, 2018.
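The "sample reweighted logistic loss" label suggests a logistic loss in which each sample's term is multiplied by its weight. The sketch below assumes that reading, with labels in {0, 1}; the function name and argument layout are hypothetical, and the regularizer and constraints of the full objective are left out.

```python
import math


def weighted_logistic_loss(W, X, y, beta):
    """Sum over samples of W[i] * log(1 + exp((1 - 2 * y[i]) * <x_i, beta>)).

    W: sample weights; X: list of feature rows; y: labels in {0, 1};
    beta: coefficient vector with one entry per feature.
    """
    total = 0.0
    for w, row, label in zip(W, X, y):
        score = sum(x * b for x, b in zip(row, beta))
        margin = (1 - 2 * label) * score
        # Numerically stable evaluation of log(1 + exp(margin)).
        total += w * (max(margin, 0.0) + math.log1p(math.exp(-abs(margin))))
    return total


W = [0.5, 1.5]
X = [[1.0, 0.0], [0.0, 1.0]]
y = [1, 0]

print(weighted_logistic_loss(W, X, y, beta=[0.0, 0.0]))    # sum(W) * log(2)
print(weighted_logistic_loss(W, X, y, beta=[10.0, -10.0]))  # near zero: both samples fit
```

Samples with larger weights pull the fitted coefficients harder, which is how the balancing weights influence the classifier.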

Page 20

From Shallow to Deep - DGBR

Kun Kuang, Peng Cui, Susan Athey, Ruoxuan Li, Bo Li. Stable Prediction across Unknown Environments. KDD, 2018.

Page 21

From Autoencoder to CNN - CNBB

Yue He, Zheyan Shen, Peng Cui. Towards Non-I.I.D. Image Classification: A Dataset and Baselines. (under review)

Page 22

NICO Dataset (released)

• 19 categories, 10 contexts for each category, ~1300 images for each category

Page 23

NICO Dataset (released)

Page 24

Experimental Result - insights

Page 25

Experimental Result – Stability

Traditional regression models are very sensitive to the non-IID setting, but our model performs stably.

Page 26

Experimental Result - insights

Page 27

Experiment 2 – online advertising

• Environment generation:

• Separate the whole dataset into 4 environments by users’ age: 𝐴𝑔𝑒 ∈ [20,30), 𝐴𝑔𝑒 ∈ [30,40), 𝐴𝑔𝑒 ∈ [40,50), and 𝐴𝑔𝑒 ∈ [50,100).
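The age-based split can be sketched with half-open buckets. Only the four age ranges come from the slide; the record layout (a dict with an "age" key), the function names, and the sample records are assumptions for illustration.

```python
from bisect import bisect_right

AGE_EDGES = [20, 30, 40, 50, 100]  # bucket boundaries taken from the slide
LABELS = ["[20,30)", "[30,40)", "[40,50)", "[50,100)"]


def age_environment(age):
    """Return the environment label for an age, or None if outside [20, 100)."""
    if not AGE_EDGES[0] <= age < AGE_EDGES[-1]:
        return None
    return LABELS[bisect_right(AGE_EDGES, age) - 1]


def split_by_age(records):
    """Group records (dicts with an 'age' key) into the four environments."""
    envs = {label: [] for label in LABELS}
    for record in records:
        label = age_environment(record["age"])
        if label is not None:
            envs[label].append(record)
    return envs


# Hypothetical user records.
users = [{"id": 1, "age": 24}, {"id": 2, "age": 30}, {"id": 3, "age": 57}, {"id": 4, "age": 17}]
for label, members in split_by_age(users).items():
    print(label, [u["id"] for u in members])
```

A model can then be trained on one environment and evaluated on the others to compare accuracy across age groups.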

Page 28

Conclusions

• Predictive modeling is not only about Accuracy.

• Stability and Explainability are critical for us to trust a predictive model.

• Causality has been demonstrated to be useful in stable prediction.

• How to marry causality with predictive modeling effectively and efficiently is still an open problem.

Page 29

Reference

Hao Zou, Kun Kuang, Boqi Chen, Peng Cui, Peixuan Chen. Focused Context Balancing for Robust Offline Policy Evaluation. KDD, 2019.

Kun Kuang, Peng Cui, Susan Athey, Ruoxuan Li, Bo Li. Stable Prediction across Unknown Environments. KDD, 2018.

Zheyan Shen, Peng Cui, Kun Kuang, Bo Li. Causally Regularized Learning on Data with Agnostic Bias. ACM Multimedia, 2018.

Kun Kuang, Peng Cui, Bo Li, Shiqiang Yang. Estimating Treatment Effect in the Wild via Differentiated Confounder Balancing. KDD, 2017.

Kun Kuang, Peng Cui, Bo Li, Shiqiang Yang. Treatment Effect Estimation with Data-Driven Variable Decomposition. AAAI, 2017.

Yue He, Zheyan Shen, Peng Cui. Towards Non-I.I.D. Image Classification: A Dataset and Baselines. (under review)

Zheyan Shen, Peng Cui, Tong Zhang. Stable Learning of Linear Models via Sample Reweighting. (under review)

Page 30

Thanks!

Peng Cui

[email protected]

pengcui.thumedialab.com