22 April 2020
Compressing Subject-specific Brain-Computer Interface Models into One Model by Superposition in Hyperdimensional Space
Michael Hersche
Philipp Rupp
Luca Benini
Abbas Rahimi
Integrated Systems Laboratory, D-ITET, ETH Zürich
Towards wearable embedded Motor-Imagery Brain-Computer Interfaces (MI-BCIs)
[Figure: MI-BCI pipeline on an embedded device: imagined movement (left hand), EEG data acquisition, classifier, decoded output “Left Hand”]
Why embedded MI-BCI?
▪ User comfort
▪ Latency
▪ Security & privacy
▪ Long-term usability
Subject-specific models pose a challenge for embedded MI-BCIs
▪ MI brain signals are highly subject-dependent: subject-specific models must be trained
▪ Store multiple subject-specific models on the device
▪ Device for multiple subjects
▪ Model selection on unseen subjects
▪ We need to reduce memory footprint!
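[Figure: an embedded MI-BCI storing subject-specific models (Model 1, Model 2, …, Model n) next to a global model; the models M1, M2, …, Mn are combined into one global model; labels: Quantization, Superposition]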
This Work: Model compression by hyperdimensional superposition
▪ Compress subject-specific CNN models with superposition
▪ Novel retraining method to counteract compression noise
▪ Compress two compact state-of-the-art (SoA) CNNs by up to 3x with slightly better accuracy
  ▪ Shallow ConvNet: +1.46%
  ▪ EEGNet: +2.41%
The BCI competition IV-2a dataset is still a big challenge
▪ 9 subjects
▪ 2 sessions per subject: training and test set
▪ 288 trials per session and subject
▪ 4 different MI tasks initiated by visual cue
▪ Left hand/right hand/feet/tongue
▪ 22 EEG channels sampled at 250 Hz
[Figure: Subjects 1 to 9, each with Session 1 (training, day 1) and Session 2 (testing, day 2)]
Shallow ConvNet [1] is a lightweight and accurate CNN for MI classification
One model per subject (sub-spec)
47,240 weights
74.3% on 4-class MI
[Figure: Shallow ConvNet architecture with layers of 1,000 weights, 35,200 weights, and 11,040 weights]
[1] Schirrmeister et al., “Deep learning with convolutional neural networks for EEG decoding and visualization,” Human Brain Mapping, 2017
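Orthogonalization by key-value binding in hyperdimensional space
[Figure: values $W_1, W_2, W_3 \in \mathbb{R}^d$ are bound to keys, which yields orthogonal key-value pairs in $\mathbb{R}^d$, and are then retrieved]
Binding: $V_i = K_i \circledast W_i$
Key-value pairs (orthogonal): $V_1 = K_1 \circledast W_1$, $V_2 = K_2 \circledast W_2$, $V_3 = K_3 \circledast W_3$
Retrieval: $\hat{W}_i = K_i \odot V_i$, giving the retrieved values $\hat{W}_1, \hat{W}_2, \hat{W}_3$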
Orthogonalized key-value pairs are superimposed and retrieved in hyperdimensional space
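$W \in \mathbb{R}^d$: value
$K \in \mathbb{R}^d$, $K \sim \mathcal{N}\left(0, \tfrac{1}{d} I_d\right)$: key
$\circledast$: circular convolution; $\odot$: circular correlation
1) Superimpose multiple key-value pairs
$$S = \sum_i K_i \circledast W_i$$
2) Retrieve values
$$\hat{W}_k = K_k \odot S = K_k \odot (K_k \circledast W_k) + \sum_{i \neq k} K_k \odot (K_i \circledast W_i) = W_k + n$$
Here $\hat{W}_k$ is the retrieved value and $n$ is the retrieval noise.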
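The binding, superposition, and retrieval steps above can be tried out numerically. The following is a minimal NumPy sketch, not the authors' implementation: the helper names `bind`, `unbind`, and `cosine`, the dimension `d = 4096`, and the use of random vectors as stand-ins for model weights are all assumptions made for illustration. Circular convolution and correlation are computed with the FFT.

```python
import numpy as np

def bind(key, value):
    """Circular convolution of key and value, computed with the FFT."""
    return np.fft.irfft(np.fft.rfft(key) * np.fft.rfft(value), n=key.size)

def unbind(key, bound):
    """Circular correlation of key with bound (approximate inverse of bind)."""
    return np.fft.irfft(np.conj(np.fft.rfft(key)) * np.fft.rfft(bound), n=key.size)

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

rng = np.random.default_rng(0)
d, n_pairs = 4096, 3  # illustrative sizes (assumption, not from the slides)

# Keys drawn as K_i ~ N(0, I/d); values are random stand-ins for weights W_i.
keys = rng.normal(0.0, np.sqrt(1.0 / d), size=(n_pairs, d))
values = rng.normal(size=(n_pairs, d))

# 1) Superimpose multiple key-value pairs: S = sum_i K_i (*) W_i
S = np.sum([bind(k, w) for k, w in zip(keys, values)], axis=0)

# 2) Retrieve values: W_hat_k = K_k (.) S = W_k + noise
retrieved = np.array([unbind(k, S) for k in keys])

match = [cosine(retrieved[i], values[i]) for i in range(n_pairs)]
mismatch = cosine(retrieved[0], values[1])
print("similarity to own value:", match)
print("similarity to another value:", mismatch)
```

In this sketch each retrieved vector is clearly correlated with its own value and almost uncorrelated with the others, but the similarity stays well below 1. That gap is the noise term $n$, and $S$ still has only $d$ entries however many pairs are superimposed.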
Key-weight binding of compressible weights
[Figure: the compressible weights of the model of Subject 1 are bound with Key 1; labels: compressible weights, model size]
Superposition of key-value pairs reduces the memory footprint while staying in the same dimensionality
[Figure: the weights of the models of Subject 1, Subject 2, … are each bound with their own key (Key 1, Key 2, …) and superimposed into one vector of the same dimensionality]
$$\text{Compression Rate} = \frac{N_s}{N_s (1 - r) + r}, \qquad r := \frac{d}{\text{model size}}$$
where $N_s$ is the number of subjects.
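The compression-rate formula can be checked against a direct memory count. The sketch below assumes that the uncompressed device stores $N_s$ full models, while the compressed device stores the non-compressible part of each model plus one shared $d$-dimensional superposition. Storage for the keys is not counted, as in the formula. The function names are hypothetical. The inputs (9 subjects, 11,040 superimposed weights, 47,240 weights in total) are taken from the slides, on the assumption that the 11,040-weight layer is the one being superimposed.

```python
def compression_rate(n_subjects: int, d: int, model_size: int) -> float:
    """Compression rate N_s / (N_s * (1 - r) + r) with r = d / model_size."""
    r = d / model_size
    return n_subjects / (n_subjects * (1.0 - r) + r)

def compression_rate_from_memory(n_subjects: int, d: int, model_size: int) -> float:
    """Same quantity, counted directly as stored weights before / after."""
    uncompressed = n_subjects * model_size            # one full model per subject
    compressed = n_subjects * (model_size - d) + d    # private parts + one shared vector
    return uncompressed / compressed

# Values from the slides: 9 subjects, 47,240 weights, 11,040 of them superimposed.
cr = compression_rate(9, 11_040, 47_240)
print(f"compression rate: {cr:.2f}")
```

With these inputs the result rounds to the 1.26 that the slides report for Sup(FC). Results for other layers may differ slightly from the reported rates, since what exactly counts toward the model size is an assumption here. The two limits are also easy to read off: $r = 0$ gives a rate of 1 and $r = 1$ gives a rate of $N_s$.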
Approximate retrieval from compressed representation yields huge accuracy loss
[Figure: retrieval of $W_1$ for Subject 1 from the compressed representation. Accuracy [%] vs. compression rate with Sup(FC): 73.59 at rate 1, 52.44 at rate 1.26]
Iterative Retraining
for i = 1:Ns
1) Retrieve weights for subject i
2) Retrain model for subject i
3) Update compressed representation
[Figure: for Subject 1, Subject 2, … in turn, the weights are retrieved with the subject's key, the model is retrained, and the compressed representation is updated]
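The loop above can be illustrated on a toy problem. This is a sketch under stated assumptions, not the authors' training code: each "model" is a linear least-squares fit with far fewer trials than weights, "retraining" is a few plain gradient steps, and the update rule (adding the key-bound weight change to the compressed representation) is an assumed choice that the slides do not spell out. All names and sizes are hypothetical.

```python
import numpy as np

def bind(key, value):
    """Circular convolution via FFT."""
    return np.fft.irfft(np.fft.rfft(key) * np.fft.rfft(value), n=key.size)

def unbind(key, bound):
    """Circular correlation via FFT."""
    return np.fft.irfft(np.conj(np.fft.rfft(key)) * np.fft.rfft(bound), n=key.size)

def retrain(w, X, y, lr=0.1, n_steps=5):
    """Toy retraining: a few gradient steps on 0.5 * ||X w - y||^2."""
    w = w.copy()
    for _ in range(n_steps):
        w -= lr * X.T @ (X @ w - y)
    return w

def mean_error(S, keys, data):
    """Mean squared training residual of the models retrieved from S."""
    return float(np.mean([np.mean((X @ unbind(k, S) - y) ** 2)
                          for k, (X, y) in zip(keys, data)]))

rng = np.random.default_rng(0)
d, n_subjects, n_trials = 1024, 3, 8  # toy sizes (assumption, not from the slides)

keys = rng.normal(0.0, np.sqrt(1.0 / d), size=(n_subjects, d))
data = [(rng.normal(0.0, np.sqrt(1.0 / d), size=(n_trials, d)),
         rng.normal(size=n_trials)) for _ in range(n_subjects)]

# Subject-specific "pretrained" weights (exact fits), superimposed into S.
pretrained = [np.linalg.pinv(X) @ y for X, y in data]
S = np.sum([bind(k, w) for k, w in zip(keys, pretrained)], axis=0)

errors = [mean_error(S, keys, data)]  # error right after compression
for iteration in range(30):
    for i in range(n_subjects):                # for i = 1:Ns
        X, y = data[i]
        w_old = unbind(keys[i], S)             # 1) retrieve weights for subject i
        w_new = retrain(w_old, X, y)           # 2) retrain model for subject i
        S = S + bind(keys[i], w_new - w_old)   # 3) update compressed representation (assumed rule)
    errors.append(mean_error(S, keys, data))

print(f"training error after compression: {errors[0]:.3e}")
print(f"training error after retraining:  {errors[-1]:.3e}")
```

The toy only works because each model has many weight vectors that fit its data, so retraining can settle on ones that tolerate the retrieval noise. If every model had a single fixed target weight vector, one $d$-dimensional vector could not hold all of them.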
Retraining recovers the performance on training set
[Figure: training misclassification vs. retraining iteration for Subject 1, annotated with the steps 1) retrieve weights, 2) retrain model, 3) update compressed representation]
Randomized subject ordering and hyperparameter selection improve iterative retraining
▪ Randomized subject ordering
  ▪ Change the subject order after every retraining iteration
▪ Hyperparameter selection
  ▪ 5-fold cross-validation on the training set
  ▪ Find the best hyperparameters
    ▪ Batch size
    ▪ Number of epochs per iteration
    ▪ Learning rate
    ▪ Number of retraining iterations
[Figure: Session 1 of Subjects 1 to 9 is used for training and validation]
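The bookkeeping behind these two ideas can be sketched as follows. Only the structure comes from the slides: 5 folds, 288 training trials per subject, 9 subjects, and the four hyperparameter names. The candidate values in `grid` are placeholders chosen for illustration, and the helper `kfold_indices` is hypothetical.

```python
import itertools
import numpy as np

def kfold_indices(n_trials, n_folds, rng):
    """Yield (train_idx, val_idx) pairs for shuffled k-fold cross-validation."""
    folds = np.array_split(rng.permutation(n_trials), n_folds)
    for i in range(n_folds):
        train = np.concatenate([folds[j] for j in range(n_folds) if j != i])
        yield train, folds[i]

rng = np.random.default_rng(0)

# 5-fold split of the 288 training trials of one subject.
splits = list(kfold_indices(288, 5, rng))

# Hyperparameter grid; the candidate values are placeholders (assumption).
grid = {
    "batch_size": [16, 32],
    "epochs_per_iteration": [1, 5],
    "learning_rate": [1e-3, 1e-4],
    "retraining_iterations": [5, 10],
}
configs = [dict(zip(grid, combo)) for combo in itertools.product(*grid.values())]

# A fresh random subject order for every retraining iteration.
n_subjects, n_iterations = 9, 3
orders = [rng.permutation(n_subjects) for _ in range(n_iterations)]

print(len(splits), "folds,", len(configs), "configurations")
print("subject order per iteration:", [o.tolist() for o in orders])
```

Each configuration would be scored by its mean validation result over the five folds, and the best one kept for the final retraining run.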
Retraining recovers the misclassification rate on the validation set
[Figure: validation misclassification during retraining; reference level: validation misclassification before compression]
With retraining, we compress the FC or Conv layer with no accuracy loss
[Figure: Shallow ConvNet architecture with layers of 1,000 weights, 35,200 weights, and 11,040 weights]
Configuration              Compression rate   Accuracy [%]
Uncompressed               1                  73.59
Sup(FC), no retraining     1.26               52.44
Sup(FC)                    1.26               75.14
Sup(Conv)                  2.95               75.05
Sup(FC + Conv)             7.61               60.46
Superposition even compresses tiny EEGNet
[Figure: EEGNet architecture with layers of 352 weights, 512 weights, 1,088 weights, and 512 weights; 2,464 weights in total]
Configuration   Compression rate   Accuracy [%]
Uncompressed    1                  72.32
Sup(FC)         1.9                74.73
Our compression improves both Shallow ConvNet and EEGNet
[Figure: EEGNet: 1.9x compression, +2.41% accuracy; Shallow ConvNet: 3x compression, +1.46% accuracy]
Conclusion
▪ Hyperdimensional superposition compresses already compact MI-BCI CNN models
▪ Iterative retraining recovers the accuracy lost to compression
▪ Compresses two SoA lightweight networks
  ▪ Shallow ConvNet (47k weights) by 3x at 1.46% higher accuracy
  ▪ EEGNet (2.5k weights) by 1.9x at 2.41% higher accuracy
▪ Code is available!
https://github.com/MHersche/bci-model-superpos