Date post: 25-May-2020
Contrastive Relevance Propagation for Interpreting Predictions by a Single-Shot Object Detector

Hideomi Tsunakawa (1), Yoshitaka Kameya (1), Hanju Lee (2), Yosuke Shinya (2), and Naoki Mitsumoto (2)

(1) Department of Information Engineering, Meijo University
(2) DENSO CORPORATION

IJCNN-19

Outline

• Background

• Proposed method: CRP

• Experiments


Background: SSD (1)

• Object detection is a well-known task in computer vision

• SSD (Single-Shot MultiBox Detector) [Liu+ ECCV-16]:
  – Known for its high speed and accuracy
  – Outputs:
    • Confidences for classes
    • Location offsets (center on x-axis, center on y-axis, width, height)

[Figure: an input image and SSD's output on it, showing the classification and localization results]
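The four location offsets are expressed relative to a default box. A minimal sketch of how such offsets could be decoded into a box, following the encoding described in the SSD paper [Liu+ ECCV-16]. The function name `decode_box` is hypothetical, and the sketch assumes no variance scaling (many implementations additionally multiply the offsets by fixed constants):

```python
import math

def decode_box(default_box, offsets):
    """Decode SSD-style offsets relative to a default box.

    default_box: (cx, cy, w, h) of the default box.
    offsets:     (d_cx, d_cy, d_w, d_h) predicted by a localization layer.
    Assumption: plain SSD-paper encoding, without variance scaling.
    """
    cx, cy, w, h = default_box
    d_cx, d_cy, d_w, d_h = offsets
    return (
        cx + d_cx * w,       # center shifts are scaled by the default box size
        cy + d_cy * h,
        w * math.exp(d_w),   # sizes are predicted in log space
        h * math.exp(d_h),
    )

# Illustrative values only: shift the center, keep the size unchanged.
box = decode_box((0.5, 0.5, 0.2, 0.4), (0.1, -0.25, 0.0, 0.0))
```

Under this encoding, each of the four offsets is a signed real value, which is why localization is treated as a regression task.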

Background: SSD (2)

• SSD:
  – Based on a (large) single convolutional network
  – Layers for classification and layers for localization are connected from several convolutional layers → Different resolutions

[Figure: SSD architecture. A 300×300 input image passes through VGG-16 (until the Pool5 layer) and then through extra convolutional layers. Feature maps: Conv4_3 (38×38×512), Conv6 (19×19×1024), Conv7 (19×19×1024), Conv8_2 (10×10×512), Conv9_2 (5×5×256), Conv10_2 (3×3×256), Conv11_2 (1×1×256). Classification (Cls) and localization (Loc) layers are attached to Conv4_3, Conv7, Conv8_2, Conv9_2, Conv10_2, and Conv11_2, and their outputs go through non-maximum suppression.]


Background: LRP (1)

• LRP (Layer-wise Relevance Propagation) [Bach+ 15]:
  – Often used for interpreting predictions of DNNs
  – Propagates relevance backward from the output to the input features
  – Creates a heatmap using relevance at the input features

[Figure: the SSD network with an input image and its output; relevance is propagated backward from the output, giving a heatmap of relevance to “dog”]


Background: LRP (2)

• LRP is equipped with several propagation rules:
  – Common:
    • $R_j^{(l+1)}$: relevance of unit $j$ in layer $l + 1$, distributed to lower units
    • $R_{ij}$: relevance passed through the connection from unit $i$ to unit $j$
    • $R_i^{(l)} := \sum_j R_{ij}$
  – Simple LRP
  – ε-LRP
  – αβ-LRP

[Figure: unit i in layer l, with relevance $R_i^{(l)}$, receives $R_{ij}$ from unit j in layer l + 1, which holds relevance $R_j^{(l+1)}$]
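The slide text does not spell out the formulas of the three rules. As a rough sketch, their standard forms from the LRP literature [Bach+ 15] can be written for a single dense layer as below. The function name `lrp_dense`, the bias-free treatment of αβ-LRP, and the convention α + β = 1 (under which both α = 1, β = 0 and α = 0, β = 1 are valid settings) are assumptions, not taken from the slides.

```python
import numpy as np

def lrp_dense(x, W, b, R_upper, rule="simple", eps=1e-2, alpha=1.0, beta=0.0):
    """Propagate relevance R_upper (layer l+1) down to layer l.

    x: activations of layer l, shape (I,)
    W: weights, shape (I, J); b: biases, shape (J,)
    Returns R_lower with R_lower[i] = sum_j R_ij.
    """
    z = x[:, None] * W                 # z_ij = x_i * w_ij
    z_j = z.sum(axis=0) + b            # pre-activation of upper unit j
    if rule == "simple":
        R_ij = z / z_j * R_upper
    elif rule == "epsilon":
        # A small stabilizer with the sign of z_j keeps the division well behaved.
        R_ij = z / (z_j + eps * np.where(z_j >= 0, 1.0, -1.0)) * R_upper
    elif rule == "alphabeta":
        zp = np.clip(z, 0, None)       # positive parts z_ij^+
        zn = np.clip(z, None, 0)       # negative parts z_ij^-
        pos = zp / (zp.sum(axis=0) + 1e-12)
        neg = zn / (zn.sum(axis=0) - 1e-12)
        R_ij = (alpha * pos + beta * neg) * R_upper   # biases ignored (assumption)
    else:
        raise ValueError(f"unknown rule: {rule}")
    return R_ij.sum(axis=1)

# Illustrative values only.
x = np.array([1.0, 2.0, 0.5])
W = np.array([[1.0, -1.0], [0.5, 2.0], [-2.0, 1.0]])
b = np.zeros(2)
R_upper = np.array([1.0, 0.0])

R_simple = lrp_dense(x, W, b, R_upper, rule="simple")
R_eps = lrp_dense(x, W, b, R_upper, rule="epsilon")
R_ab = lrp_dense(x, W, b, R_upper, rule="alphabeta", alpha=1.0, beta=0.0)
```

With zero biases, Simple LRP and αβ-LRP conserve the total relevance from layer l + 1 to layer l, while ε-LRP absorbs a small amount of it in the stabilizer.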

Background: Indistinguishable Heatmaps (1)

• Heatmaps are almost invariant even when the target class has been changed

• Heatmaps obtained with αβ-LRP (α = 1, β = 0):

[Figure: two heatmaps, one for target class “dog” (actually predicted) and one for target class “cat” (“what-if” analysis)]

Background: Indistinguishable Heatmaps (2)

• Relevance propagated in each layer:

[Figure: relevance propagated in each layer]

Relevance decreases exponentially

Background: Indistinguishable Heatmaps (3)

• Recent works that seem to support our observation:
  – [Adebayo+ NeurIPS-18]:
    • Uses Inception v3 (a large network)
    • If relevance = gradient × input, the input part dominates → Heatmaps will be invariant (since the input is of course fixed)
  – [Ancona+ ICLR-18]:
    • Several methods tend to return similar heatmaps (theoretically or empirically):
      – Gradient × input
      – DeepLIFT (Rescale)
      – Integrated Gradients
      – Simple LRP
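The agreement between Simple LRP and gradient × input can be checked by hand on a small network. The sketch below assumes a toy two-layer ReLU network without biases and with random weights; all names and sizes are illustrative and not taken from the talk. Relevance starts from the output value of the target class.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=4)            # input features
W1 = rng.normal(size=(4, 6))      # input -> hidden (ReLU), no bias
W2 = rng.normal(size=(6, 3))      # hidden -> output (linear), no bias
c = 1                             # target class

z1 = x @ W1                       # hidden pre-activations
h = np.maximum(z1, 0.0)           # hidden activations
y = h @ W2                        # class scores

# Gradient x input: d y_c / d x, multiplied elementwise by x.
grad = W1 @ ((z1 > 0) * W2[:, c])
grad_x_input = grad * x

# Simple LRP, starting from relevance y_c at output unit c.
# Output layer: R_j = (h_j * w_jc / y_c) * y_c, so y_c cancels.
R_h = h * W2[:, c]
# Hidden layer: R_i = x_i * sum_j w_ij * R_j / z_j (inactive units carry no relevance).
s = np.divide(R_h, z1, out=np.zeros_like(R_h), where=z1 > 0)
R_x = x * (W1 @ s)
```

In this bias-free setting the two heatmaps coincide exactly, and both sum to the class score `y[c]`, so every entry carries the fixed input `x` as a factor.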

Background: Our Motivation

• We introduce contrastive relevance, which highlights the parts that are more important to the target class

• We design the meaning of relevance to be consistent across two heterogeneous tasks in SSD:
  – Classification
  – Localization (Regression)

[Figure: heatmaps for target class “dog” and target class “cat”]

Outline

✓ Background

• Proposed method: CRP

• Experiments

Contrastive Relevance Propagation (CRP)

• CRP: LRP tailored for SSD
  – Classifies SSD’s layers into 4 types
  – Applies semantically appropriate propagation rules to each layer type
  – In both classification and localization, the meanings of “relevance” are the same

[Figure: SSD architecture with the layer types marked (low-level feature layers, high-level feature layer, classification layer). For a detected box, relevance to the class k of interest is propagated backward.]

[Figure: the same architecture for localization. For a detected box, relevance to shifting to the right is propagated backward through the localization layer, the high-level feature layer, and the low-level feature layers.]

[Figure: the same architecture for another detected box. Relevance to the class k’ of interest is propagated backward through the classification layer, the high-level feature layer, and the low-level feature layers.]

CRP: Propagation Rules in Classification

[Figure: a classification layer with units for class 1, ..., class k, ..., class k* (target), ..., class K, on top of a high-level feature layer and a low-level feature layer]

• Initial relevance: 1 for the target class k* and 0 for the other classes

• We use the w+-rule (αβ-LRP with α = 1, β = 0) to find units that positively contribute to class k*

• At this moment, we can compute a class-specific relevance $R_i[k^*]$ for the target class k* by summing up the passed relevance

• We compute contrastive relevance to find units that make a significantly positive or a significantly negative contribution to the target class k*

[Equation: contrastive relevance, with one term labeled “average relevance” over other classes]
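One way to read the steps above is sketched below. It assumes that contrastive relevance is the class-specific relevance for the target class minus the mean class-specific relevance over the other classes, and that each class-specific pass starts from initial relevance 1 at its own class. Both points, and the function names, are assumptions made for illustration; see the paper for the exact definition.

```python
import numpy as np

def class_specific_relevance(x, W):
    """w+-rule through a classification layer, one pass per class.

    x: non-negative activations of the high-level feature layer, shape (I,)
    W: classification weights, shape (I, K)
    Returns R with R[i, k] = relevance of unit i to class k.
    """
    zp = np.clip(x[:, None] * W, 0, None)                  # z_ik^+
    return zp / (zp.sum(axis=0, keepdims=True) + 1e-12)

def contrastive_relevance(R, target):
    """Assumed form: target-class relevance minus the mean over the other classes."""
    others = np.delete(R, target, axis=1)
    return R[:, target] - others.mean(axis=1)

# Illustrative values only: 3 feature units, 3 classes, target class 0.
x = np.array([1.0, 1.0, 1.0])
W = np.array([[1.0, 1.0, 1.0],    # unit 0 supports every class equally
              [1.0, 0.0, 0.0],    # unit 1 supports only the target class
              [0.0, 1.0, 1.0]])   # unit 2 supports only the other classes
R = class_specific_relevance(x, W)
R_contrast = contrastive_relevance(R, target=0)
```

In this toy case the unit shared by all classes gets contrastive relevance 0, the target-specific unit gets a positive value, and the unit specific to other classes gets a negative value.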

• Until the input layer, we use the w+-rule to distribute the positivity or the negativity of contrastive relevance (activations $x_i$ are non-negative due to ReLU)
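A small sketch of why the w+-rule can carry signed relevance downward: each upper unit splits its relevance into non-negative shares, so a positive value stays positive and a negative value stays negative on every connection. The function name `w_plus_rule` and the numbers are illustrative assumptions.

```python
import numpy as np

def w_plus_rule(x, W, R_upper):
    """Distribute (possibly negative) relevance to the lower layer with the w+-rule.

    x: non-negative activations of the lower layer, shape (I,)
    W: weights, shape (I, J); R_upper: relevance of the upper layer, shape (J,)
    """
    zp = np.clip(x[:, None] * W, 0, None)                   # z_ij^+
    share = zp / (zp.sum(axis=0, keepdims=True) + 1e-12)    # non-negative, sums to 1 per j
    return share @ R_upper                                  # R_i = sum_j share_ij * R_j

# Illustrative values only: one positive and one negative upper-layer relevance.
x = np.array([1.0, 2.0])
W = np.array([[1.0, -1.0], [0.5, 1.0]])
R_upper = np.array([0.4, -0.6])
R_lower = w_plus_rule(x, W, R_upper)
```

The total relevance is preserved (here, 0.4 − 0.6 = −0.2 in both layers), and a lower unit ends up negative when the negative relevance it receives outweighs the positive.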


CRP: Propagation Rules in Localization

[Figure: a localization layer with units for center on x-axis, center on y-axis (target), width, and height, on top of a high-level feature layer and a low-level feature layer]

• Initial relevance: 1 for the target offset (center on y-axis) and 0 for the other offsets

• Sign-based rule switching: we switch between two rules according to the sign of the activation $x_j$
  – If $x_j$ is positive, use the w+-rule (αβ-LRP with α = 1, β = 0) to find units that positively contribute to center on y-axis

  – If $x_j$ is negative, use the w–-rule (αβ-LRP with α = 0, β = 1) to find units that negatively contribute to center on y-axis
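The switching can be sketched for a single localization output unit as follows. The function name `switched_rule`, the treatment of a zero activation (handled like the negative case), and the bias-free shares are assumptions for illustration.

```python
import numpy as np

def switched_rule(x, w, b=0.0):
    """Relevance passed from one localization unit j to its input units.

    x: non-negative input activations, shape (I,)
    w: weights into unit j, shape (I,); b: bias of unit j
    Starts from initial relevance 1 at unit j.
    """
    z = x * w                                   # contributions z_ij
    x_j = z.sum() + b                           # activation of the linear output unit
    if x_j > 0:
        part = np.clip(z, 0, None)              # w+-rule: positive contributions only
    else:
        part = np.clip(z, None, 0)              # w–-rule: negative contributions only
    total = part.sum()
    return part / total if total != 0 else np.zeros_like(z)

# Illustrative values only.
x = np.array([1.0, 2.0, 1.0])
R_neg = switched_rule(x, np.array([0.5, -1.0, 0.25]))   # x_j = -1.25, w–-rule applies
R_pos = switched_rule(x, np.array([0.5, 1.0, -0.25]))   # x_j = 2.25, w+-rule applies
```

In both cases the shares are non-negative and sum to 1, so relevance always marks the units that pushed the offset in the direction it actually took.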

• We compute contrastive relevance

[Equation: contrastive relevance, with terms labeled “relevance from the localization layer”, “overall average”, and “class-specific relevance”]

• Until the input layer, we use the w+-rule as in classification


Outline

✓ Background

✓ Proposed method: CRP

• Experiments

Experimental Settings

• Dataset: Pascal VOC 2012

• We ported the TensorFlow implementation of LRP (https://github.com/VigneshSrinivasan10/interprettensor) into a TensorFlow implementation of SSD (https://github.com/balancap/SSD-Tensorflow)

• The SSD implementation includes a learned model (we conducted no learning)

• We added CRP-specific routines

• Relevance was normalized before creating heatmaps

(See the paper for details)
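The slides do not say which normalization was used. One common choice, shown here purely as an assumed example, is to divide the relevance map by its largest absolute value so that it lies in [-1, 1] and zero stays at zero; the function name `normalize_relevance` is hypothetical.

```python
import numpy as np

def normalize_relevance(R):
    """Scale a relevance map into [-1, 1] by its largest absolute value.

    This scheme is an assumption; the normalization used in the talk may differ.
    """
    peak = np.abs(R).max()
    return R / peak if peak > 0 else R

# Illustrative values only.
R = np.array([[0.2, -0.4], [0.1, 0.0]])
R_norm = normalize_relevance(R)
```

Keeping zero fixed matters for a signed quantity such as contrastive relevance, because a heatmap then shows positive and negative relevance in different colors around a common neutral point.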

Numerical Example

• Relevance is distributed almost symmetrically around zero

[Figure: distribution of relevance for target class “dog”, centered at 0; positives and negatives are shown in different colors in the heatmap]

Error Analysis (1)

• A dog was misclassified as a sheep

Error Analysis (2)

• A dog was misclassified as a sheep

[Figure: heatmaps for target class “dog” and target class “sheep”]

Error Analysis (3)

• A dog was misclassified as a sheep

[Figure: heatmap for target class “sheep”, with values below the 85th percentile masked]
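Percentile masking of this kind could look like the sketch below. It assumes the percentile is taken over the raw relevance values of the whole map and that masked entries are set to zero; the slide does not specify either point, and the function name `mask_below_percentile` is hypothetical.

```python
import numpy as np

def mask_below_percentile(R, q=85.0):
    """Zero out entries of R that fall below its q-th percentile (assumed scheme)."""
    threshold = np.percentile(R, q)
    return np.where(R >= threshold, R, 0.0)

# Illustrative values only: 20 relevance values from 1 to 20.
R = np.arange(1.0, 21.0)
R_masked = mask_below_percentile(R, q=85.0)
```

Only the top 15% of the values survive, which makes the regions most relevant to the target class easier to see.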

Error Analysis (4)

• Unwanted localizations:
  – Horizontal shift to left with widening
  – Vertical shift to top with heightening

[Figure: the box before localization and after localization]

Error Analysis (5)

• Unwanted localizations:
  – Horizontal shift to left with widening
  – Vertical shift to top with heightening

[Figure: heatmaps for target offset: center on x-axis, and target offset: center on y-axis]

Error Analysis (6)

• Unwanted localizations:
  – Horizontal shift to left with widening
  – Vertical shift to top with heightening

[Figure: heatmaps for target offset: width, and target offset: height]

Summary

• CRP (contrastive relevance propagation) as an LRP method tailored for SSD:
  – Can highlight only significantly important features for a target class
  – Can deal with SSD’s heterogeneous outputs (classification and localization)

• Some error analyses using CRP were conducted

Future work

• Applying CRP to other object detectors such as YOLO

• Applying CRP (retrospectively) to standard CNNs

Thank you for your attention!

