


Multi-task Learning for Commercial Brain Computer Interfaces

George Panagopoulos

gpanagopoulos@uh.edu, giorgospanagopoulos.github.io

Computational Physiology Lab, University of Houston, Houston, Texas 77204

Introduction

Data

Methods

Results

Conclusion

Multi-task algorithms are more robust than conventional pooled approaches to the subject generalization problem.

- Improved or steady accuracy with more subjects
- Feature selection that is consistent across datasets

References

[1] Argyriou, Andreas, Theodoros Evgeniou, and Massimiliano Pontil. "Convex multi-task feature learning." Machine Learning 73.3 (2008): 243-272.

[2] Alamgir, Morteza, Moritz Grosse-Wentrup, and Yasemin Altun. "Multitask learning for brain-computer interfaces." International Conference on Artificial Intelligence and Statistics. 2010.

Problem: Improve subject generalization of passive, affordable brain-computer interfaces.

Motivation: Exploit the trade-off between noisy individual recordings and an increased number of subjects.

Advantages:
- Utilize knowledge shared between subjects during training
- Extract patterns present across all subjects rather than in specific individuals

Berkeley Experiment 1

- 30 subjects
- One 5-minute session each
- Two types of stimuli during the session:
  - Math, memorizing colors, thinking of items
  - Listening to music, watching video ads, relaxing
- Cognitive state changes within the same session
- Classes: mental activity or relaxation

Carnegie Mellon Experiment 2

- 9 subjects
- Ten 2-minute sessions with MOOC videos
- Self-classified level of confusion for each session
- Cognitive state changes between sessions
- Classes: confused or not

1 https://www.kaggle.com/berkeley-biosense/synchronized-brainwave-dataset

Discriminative MTL [1]: Logistic regression with a group-sparsity constraint on the K coefficients of all T subjects:

$$\min_{W} \; \sum_{t=1}^{T} \frac{1}{N_t} J(X_t, w_t, Y_t) + \lambda \sum_{k=1}^{K} \lVert W_k \rVert_2$$
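A minimal sketch of this objective, solved with proximal gradient descent; the names `X_tasks`, `y_tasks`, `lam` are illustrative, the labels are assumed to be in {0, 1}, and this is not the authors' implementation:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def prox_l21(W, thresh):
    # Column-wise soft-thresholding: each column of W holds one feature's
    # coefficients across all subjects; whole columns shrink toward zero,
    # which is what the group-sparsity (L2,1) penalty enforces.
    norms = np.linalg.norm(W, axis=0, keepdims=True)
    scale = np.maximum(1.0 - thresh / np.maximum(norms, 1e-12), 0.0)
    return W * scale

def mtl_logistic(X_tasks, y_tasks, lam=0.1, step=0.1, iters=500):
    T = len(X_tasks)                     # number of subjects (tasks)
    K = X_tasks[0].shape[1]              # number of features
    W = np.zeros((T, K))                 # row t = coefficients of subject t
    for _ in range(iters):
        G = np.zeros_like(W)
        for t in range(T):
            Xt, yt = X_tasks[t], y_tasks[t]
            p = sigmoid(Xt @ W[t])
            G[t] = Xt.T @ (p - yt) / len(yt)   # mean logistic-loss gradient
        W = prox_l21(W - step * G, step * lam) # gradient step, then prox
    return W
```

Because the penalty acts on whole feature columns, features irrelevant to every subject are driven to zero jointly, which is what yields the consistent feature selection reported in the results.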

2 https://www.kaggle.com/wanghaohan/eeg-brain-wave-for-confusion

Code: https://github.com/GiorgosPanagopoulos/Multi-task-Learning-for-Commercial-Brain-Computer-Interfaces

MTL methods perform better as the number of subjects increases

Generative MTL [2]: Bayesian estimation of the coefficients' prior distribution:

$$p(W; X, Y) \propto \prod_{t=1}^{T} p(y_t, X_t; w_t)\, p(w_t), \qquad p(w_t) \sim N(\mu, \Sigma), \; \forall\, t \in T$$

$$\min_{w_t, \mu, \Sigma} \; \frac{1}{\sigma^2} \sum_{t=1}^{T} \lVert X_t w_t - Y_t \rVert^2 + \frac{1}{2} \sum_{t=1}^{T} (w_t - \mu)^T \Sigma^{-1} (w_t - \mu) + \frac{T}{2} \log \det(\Sigma)$$
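This objective can be minimized by alternating closed-form updates: a ridge-like solve for each w_t given (μ, Σ), then the empirical mean and covariance of the w_t. A minimal sketch, with illustrative names and a small `ridge` term added for numerical stability (an assumption, not part of the objective):

```python
import numpy as np

def generative_mtl(X_tasks, y_tasks, sigma2=1.0, iters=20, ridge=1e-3):
    T = len(X_tasks)
    K = X_tasks[0].shape[1]
    mu = np.zeros(K)
    Sigma = np.eye(K)
    W = np.zeros((T, K))                 # row t = coefficients of subject t
    for _ in range(iters):
        Sigma_inv = np.linalg.inv(Sigma)
        for t in range(T):
            Xt, yt = X_tasks[t], y_tasks[t]
            # Stationarity condition of the objective in w_t:
            # (2/sigma^2) X_t^T (X_t w_t - y_t) + Sigma^{-1} (w_t - mu) = 0
            A = (2.0 / sigma2) * Xt.T @ Xt + Sigma_inv
            b = (2.0 / sigma2) * Xt.T @ yt + Sigma_inv @ mu
            W[t] = np.linalg.solve(A, b)
        mu = W.mean(axis=0)              # update prior mean
        D = W - mu                       # update prior covariance
        Sigma = D.T @ D / T + ridge * np.eye(K)
    return W, mu, Sigma
```

The fitted (μ, Σ) summarize what the subjects share, which is what allows coefficients for an unseen subject to be drawn from N(μ, Σ).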

[Figure: Example of subject variability. Feature coefficients across subjects for the subject-adaptive, multi-task, and subject-invariant (pooled) approaches.]

Each approach yields a different W, whose rows are the feature coefficients w_t of each subject t. The sign of the regressed value determines the binary class. For a new subject, the generative model samples a coefficient vector w_n from N(μ, Σ); in the multi-task model, the column average of W gives the w_n of a new subject n.
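The generative model's new-subject rule can be sketched in a few lines; the prior mean `mu` and covariance `Sigma` below are illustrative placeholders, not fitted values:

```python
import numpy as np

rng = np.random.default_rng(3)
mu = np.zeros(9)                  # illustrative prior mean (9 features)
Sigma = 0.1 * np.eye(9)           # illustrative prior covariance
w_n = rng.multivariate_normal(mu, Sigma)  # coefficients for a new subject
X_new = rng.normal(size=(5, 9))   # 5 unseen feature vectors
y_pred = np.sign(X_new @ w_n)     # sign of the regressed value -> class
```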

Pooled vs. Multi-Task data layout (K = 9 features):

Pooled: all subjects' recordings are stacked into a single design matrix and one coefficient vector w is fit; a new subject simply reuses it, so w_n = w.

$$\begin{bmatrix} X_{1,1} & \cdots & X_{1,9} \\ \vdots & \ddots & \vdots \\ X_{n,1} & \cdots & X_{n,9} \end{bmatrix} \rightarrow \begin{bmatrix} Y_1 \\ \vdots \\ Y_n \end{bmatrix}$$

Multi-Task: each subject keeps its own block of rows and its own coefficient vector, fit by joint optimization: Subject 1 (rows 1..m) gives Task 1 (w_1) and Subject 2 (rows m+1..n) gives Task 2 (w_2).

$$\begin{bmatrix} X_{1,1} & \cdots & X_{1,9} \\ \vdots & \ddots & \vdots \\ X_{m,1} & \cdots & X_{m,9} \end{bmatrix} \rightarrow \begin{bmatrix} Y_1 \\ \vdots \\ Y_m \end{bmatrix}, \qquad \begin{bmatrix} X_{m+1,1} & \cdots & X_{m+1,9} \\ \vdots & \ddots & \vdots \\ X_{n,1} & \cdots & X_{n,9} \end{bmatrix} \rightarrow \begin{bmatrix} Y_{m+1} \\ \vdots \\ Y_n \end{bmatrix}$$

$$W = \begin{bmatrix} w_{1,1} & \cdots & w_{1,9} \\ w_{2,1} & \cdots & w_{2,9} \end{bmatrix}, \qquad w_n = \tfrac{1}{2}\,(w_1 + w_2)$$
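The two layouts can be sketched as follows; the per-subject least-squares fits stand in for the joint optimization and are purely illustrative (random data, placeholder names):

```python
import numpy as np

rng = np.random.default_rng(2)
# Two subjects, K = 9 features each, as in the layout above
X1, y1 = rng.normal(size=(50, 9)), rng.integers(0, 2, 50).astype(float)
X2, y2 = rng.normal(size=(60, 9)), rng.integers(0, 2, 60).astype(float)

# Pooled (subject-invariant): stack all rows, fit one coefficient vector w
X_pooled = np.vstack([X1, X2])
y_pooled = np.concatenate([y1, y2])
w, *_ = np.linalg.lstsq(X_pooled, y_pooled, rcond=None)
w_n_pooled = w                        # a new subject reuses the single w

# Multi-task: one coefficient vector per subject; here each w_t is a
# per-subject fit just to illustrate the shapes involved
w1, *_ = np.linalg.lstsq(X1, y1, rcond=None)
w2, *_ = np.linalg.lstsq(X2, y2, rcond=None)
W = np.vstack([w1, w2])               # rows of W = subjects' coefficients
w_n_mtl = W.mean(axis=0)              # column average -> new subject's w_n
```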

MTL uncovers patterns that comply with the field’s literature