
    A Detective Method for Multi-class EEG-based Motor Imagery Classification Based on OCSVM

    Yanlei Gu, Jianhua Dai, Bian Wu, Nenggan Zheng, Weidong Chen, Xiaoxiang Zheng

    Journal of Convergence Information Technology, Volume 6, Number 1. January 2011

    Yanlei Gu 1,3, Jianhua Dai 1,2, Bian Wu 1,3, Nenggan Zheng 1, Weidong Chen 1,2, Xiaoxiang Zheng 1,3,*

    1 Qiushi Academy for Advanced Studies, Zhejiang University, Hangzhou 310027, China
    2 College of Computer Science, Zhejiang University, Hangzhou 310027, China
    3 College of Biomedical Engineering and Instrument Science, Zhejiang University, Hangzhou 310027, China

    [email protected]

    doi:10.4156/jcit.vol6.issue1.30

    Abstract

    The aim of a brain-computer interface (BCI) is to translate brain activity into commands that control an external device and so complete a communication task. To achieve this goal, we need to recognize the various patterns of the brain, so improving classification accuracy is essential in BCI. In this paper, a detective method, the one-class support vector machine (OCSVM), is applied for the first time to the classification of three EEG motor imagery (MI) tasks. The EEG signals were recorded from 9 subjects performing left-hand, right-hand and feet MI. For comparison, we also use other classification methods: LDA, PNN and multi-class SVM. The results of the 3-class classification problem show that classification accuracy is significantly improved when using OCSVM.

    1. Introduction

    Over the past years, many research results have shown that brain signals recorded from the scalp or from within the brain could control external devices to complete tasks that people with disabilities cannot otherwise complete. The signals used to communicate through a computer usually include P300, event-related desynchronization (ERD), visual evoked potentials (VEP), etc.

    The aim of BCI is to translate brain activity into commands that control an external device and so complete a communication task [1, 2]. To achieve this goal, we need to recognize the patterns of the brain. In EEG MI classification, much work has so far gone into methods for extracting different features. In this paper, we discuss the classification methods instead. The most common methods include the multi-class support vector machine (M-SVM) and artificial neural networks (ANN). SVM has been applied to multi-class BCI problems using the one-versus-rest (OVR) strategy. ANNs have been applied to binary and multi-class classification of mental tasks.

    Schalk et al. proposed a new concept in BCI: detection instead of classification. They introduced and validated signal detection, which does not require the preliminary analyses otherwise needed to identify the brain signal features best suited for communication [3].

    OCSVM is an extension of SVMs, introduced by Schölkopf et al. [4], that estimates the support of a distribution. OCSVM has applications in various areas: image retrieval [5], geometry invariant texture retrieval [6], clustering [7], communication signal modulation scheme recognition [8], etc. In EEG-based signal processing, OCSVM is usually used to detect epilepsy [3, 9], in cases where the number of samples of one class is much smaller than that of the other.

    In this paper, OCSVM is introduced as a detective method for EEG MI classification, with almost the same number of samples in each of the three classes. We use OCSVM to classify three MI tasks. First, the characteristics of the feature samples in one class are modeled using this algorithm, so that a model is constructed for every class. Then, for every new feature sample, we select the nearest model as its class.

    The introduced method is tested on brain pattern recognition data, where it must correctly distinguish three MI tasks. First, we use the common spatial pattern (CSP) algorithm to derive the feature samples for every class. The method is then compared with linear discriminant analysis (LDA), the multi-class support vector machine (M-SVM) and the probabilistic neural network (PNN). The results show that classification performance can be significantly improved by the OCSVM method.

    *Corresponding author.

    2. Methods

    2.1. Feature extraction

    In this paper, we use the well-known Common Spatial Patterns (CSP) algorithm to derive the feature vectors for training. The main idea of CSP is to use a linear transform to project the multi-channel EEG data into a spatial subspace with a projection matrix, each row of which consists of the weights for each channel. This transformation maximizes the variance of one class of signal matrices while minimizing the variance of the other. The algorithm is based on the simultaneous diagonalization of the covariance matrices of both classes [11]. CSP was originally applied to two imagery tasks [12, 13], but in this paper three kinds of tasks need to be distinguished. We therefore use a method based on the one-versus-the-rest (OVR) algorithm, an extension of the CSP algorithm to the multi-class case [14, 15].

    Suppose there are three classes of tasks, A, B and C [14].

    Step 1: Estimate the covariance matrices using equation (1):

    \[ R_A = X_A X_A^T, \quad R_B = X_B X_B^T, \quad R_C = X_C X_C^T \tag{1} \]

    $X_A$, $X_B$ and $X_C$ are matrices of dimension N (channels) by T (samples), with T > N. They represent one trial of the all-channel signal of class A, class B and class C, respectively.

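    Step 2: Separate one class from the others. Here, we separate feet from hands. Let $R_{\bar{A}} = R_B + R_C$, where $\bar{A}$ denotes the classes other than A, and factorize the sum of the covariance matrices $R_A$, $R_B$ and $R_C$ using equation (2):

    \[ R = R_A + R_{\bar{A}} = R_A + R_B + R_C = U \Lambda U^T \tag{2} \]

    U is the N by N matrix of eigenvectors, which is also the unitary matrix of principal components, and $\Lambda$ is the N by N diagonal matrix of eigenvalues. The signals of the two classes can then be modeled as [16]

    \[ X_A = [\, C_A \;\; C_C \,] \begin{bmatrix} S_A \\ S_C \end{bmatrix} \tag{3} \]

    \[ X_{\bar{A}} = [\, C_{\bar{A}} \;\; C_C \,] \begin{bmatrix} S_{\bar{A}} \\ S_C \end{bmatrix} \tag{4} \]

    where $S_A$ and $S_{\bar{A}}$ are the specific source components for class A and $\bar{A}$, $C_A$ and $C_{\bar{A}}$ are the corresponding spatial patterns, $S_C$ is the common source component, and $C_C$ is its corresponding spatial pattern. With CSP we obtain two spatial filters, which can be used to extract the source components $S_A$ and $S_{\bar{A}}$.

    Step 3: Construct the whitening transformation matrix using equation (5):

    \[ P = \Lambda^{-1/2} U^T \tag{5} \]

    Step 4: Transform the covariance matrices of the two classes with P:

    \[ S_A = P R_A P^T \tag{6} \]

    \[ S_{\bar{A}} = P R_{\bar{A}} P^T \tag{7} \]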


    Step 5: Find the maximum and minimum eigenvalues of the whitened $R_A$ (that is, of $S_A$ in (6)). We can then find the spatial filters (SF1 and SF2) corresponding to these eigenvalues, which make the two classes maximally separable. The filters are used to process the signals as in (8):

    \[ S_1 = SF_1 \, X, \qquad S_2 = SF_2 \, X \tag{8} \]

    X is a data matrix of preprocessed multi-channel EEG. The feature vector corresponding to one source activity is defined as:

    \[ \text{Feature} = \left[ \log\frac{\operatorname{var}(S_1)}{\operatorname{var}(S_1) + \operatorname{var}(S_2)}, \;\; \log\frac{\operatorname{var}(S_2)}{\operatorname{var}(S_1) + \operatorname{var}(S_2)} \right] \tag{9} \]
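Steps 1 to 5 and the feature in (9) can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the class covariance is averaged over trials (the text defines it for a single trial), each filter is taken as an eigenvector of the whitened matrix multiplied by P as in standard CSP, and the names `mean_cov`, `csp_ovr_filters` and `log_variance_features` are hypothetical.

```python
import numpy as np


def mean_cov(trials):
    """Average X X^T over the trials of one class (equation (1) per trial)."""
    return np.mean([x @ x.T for x in trials], axis=0)


def csp_ovr_filters(r_a, r_rest):
    """Spatial filters SF1, SF2 separating class A from the rest."""
    # Equation (2): eigendecomposition of the composite covariance.
    eigvals, u = np.linalg.eigh(r_a + r_rest)
    # Equation (5): whitening matrix P = Lambda^(-1/2) U^T.
    p = np.diag(eigvals ** -0.5) @ u.T
    # Equation (6): whitened class-A covariance.
    s_a = p @ r_a @ p.T
    # eigh returns eigenvalues in ascending order, so the last column of b
    # belongs to the largest eigenvalue and the first to the smallest.
    _, b = np.linalg.eigh(s_a)
    sf1 = b[:, -1] @ p   # assumption: filter = eigenvector^T P (standard CSP)
    sf2 = b[:, 0] @ p
    return sf1, sf2


def log_variance_features(x, sf1, sf2):
    """Equations (8) and (9) for one trial x of shape (channels, samples)."""
    v1 = np.var(sf1 @ x)
    v2 = np.var(sf2 @ x)
    return np.log(np.array([v1, v2]) / (v1 + v2))
```

For the one-versus-the-rest case, the second argument would be the sum of the covariances of the other two classes, for example `csp_ovr_filters(mean_cov(trials_a), mean_cov(trials_b) + mean_cov(trials_c))`.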

    The computation steps and a detailed description of CSP can be found in [14, 15]. A key point in this process is the selection of the frequency band for the features. In this paper, the preprocessed multi-channel EEG signals were cut into 17 overlapping frequency bands, each with a bandwidth of 4 Hz and an overlap of 2 Hz with its neighbor. We then select the single band that gives the best separability, so the feature vectors are 4-dimensional.
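The band layout can be illustrated with a few lines of Python. The text does not state where the first band starts, so the 4 Hz start used here is an assumption, and `overlapping_bands` is a hypothetical helper; with that start, 17 bands of width 4 Hz shifted by 2 Hz run from 4 Hz to 40 Hz.

```python
def overlapping_bands(start_hz, width_hz=4, step_hz=2, count=17):
    """Return `count` (low, high) band edges, each `width_hz` wide.

    Neighboring bands are shifted by `step_hz`, so a width of 4 and a
    step of 2 gives an overlap of 2 between consecutive bands.
    """
    return [(start_hz + k * step_hz, start_hz + k * step_hz + width_hz)
            for k in range(count)]


# Assumption: the first band starts at 4 Hz (not given in the text).
bands = overlapping_bands(start_hz=4)
```

Each band would then be used to band-pass filter the EEG before CSP, and the band with the best separability kept.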

    The results of the following types of classifiers are compared in this paper. In our case, we define left-hand MI as class 1, right-hand MI as class 2, and feet MI as class 3.

    2.2. Linear discriminant analysis (LDA)

    LDA defines two measures: the within-class scatter matrix and the between-class scatter matrix.

    Within-class scatter matrix:

    \[ S_W = \sum_{j=1}^{c} \sum_{i=1}^{N_j} (X_i^j - m_j)(X_i^j - m_j)^T \tag{10} \]

    Between-class scatter matrix:

    \[ S_b = \sum_{j=1}^{c} (m_j - m)(m_j - m)^T \tag{11} \]

    Here c is the number of classes, $N_j$ is the number of samples in class j, $X_i^j$ represents the ith sample of class j, $m_j$ is the mean of class j, and m is the mean of all classes. The basic idea of LDA is to take as the best projection direction the vector for which the Fisher criterion function reaches its extremum, that is, to maximize the between-class measure while minimizing the within-class measure [17].
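The two scatter matrices in (10) and (11) can be computed as in the sketch below. It assumes that m is the mean over all samples (for equally sized classes this equals the mean of the class means); the function name `scatter_matrices` is hypothetical.

```python
import numpy as np


def scatter_matrices(samples, labels):
    """Within-class (10) and between-class (11) scatter matrices.

    samples : array of shape (n_samples, n_features)
    labels  : array of shape (n_samples,) with class identifiers
    """
    samples = np.asarray(samples, dtype=float)
    labels = np.asarray(labels)
    overall_mean = samples.mean(axis=0)      # m (assumed: mean of all samples)
    dim = samples.shape[1]
    s_w = np.zeros((dim, dim))
    s_b = np.zeros((dim, dim))
    for cls in np.unique(labels):
        x_j = samples[labels == cls]
        m_j = x_j.mean(axis=0)
        centered = x_j - m_j
        s_w += centered.T @ centered         # sum_i (x - m_j)(x - m_j)^T
        diff = (m_j - overall_mean)[:, None]
        s_b += diff @ diff.T                 # (m_j - m)(m_j - m)^T
    return s_w, s_b
```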

    In our case, we combine three LDA classifiers to make the decision, as described in Figure 1. First, we combine two of the three classes into one class with the label -1, and the remaining class is labeled 1. Using an LDA classifier, we get a result that we regard as the value for belonging to the class labeled 1. After repeating the process twice more, as shown in Figure 1, we have three results: result1, result2 and result3 (corresponding to the original class 1, class 2 and class 3, respectively). We then compare the three results and choose the maximum as the classification result.
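The three-classifier scheme can be sketched as follows. This is an illustration under assumptions, not the authors' code: each binary classifier is a plain Fisher discriminant whose threshold lies halfway between the two class means, and the names `fit_binary_lda`, `fit_ovr` and `predict_ovr` are hypothetical.

```python
import numpy as np


def fit_binary_lda(x_pos, x_neg):
    """Fisher discriminant for label +1 (x_pos) versus label -1 (x_neg)."""
    m_pos, m_neg = x_pos.mean(axis=0), x_neg.mean(axis=0)
    s_w = ((x_pos - m_pos).T @ (x_pos - m_pos)
           + (x_neg - m_neg).T @ (x_neg - m_neg))
    w = np.linalg.solve(s_w, m_pos - m_neg)
    b = -w @ (m_pos + m_neg) / 2     # threshold halfway between the means
    return w, b


def fit_ovr(samples, labels):
    """One binary LDA per class: that class is +1, the other two are -1."""
    return {cls: fit_binary_lda(samples[labels == cls], samples[labels != cls])
            for cls in np.unique(labels)}


def predict_ovr(models, x):
    """Compare the three results and return the class with the maximum."""
    scores = {cls: w @ x + b for cls, (w, b) in models.items()}
    return max(scores, key=scores.get)
```

Because the three discriminants are trained separately, their scores are not on a common scale; comparing them directly follows the description above but is a design choice rather than a calibrated probability.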


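    [Figure 1 diagram: three LDA classifiers. Classifier 1 labels RH and F as -1 and LH as 1, giving Result1; classifier 2 labels LH and F as -1 and RH as 1, giving Result2; classifier 3 labels RH and LH as -1 and F as 1, giving Result3. The three results are compared to make the decision.]

    Figure 1. Three classifications using three LDA classifiers. LH: left hand, RH: right hand, F: feet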

    2.3. Probabilistic neural network (PNN)

    PNN is a feedforward neural network with 2 hidden layers. It is a typical nonlinear classifier that uses the minimum Bayesian risk criterion. The network has a 4-layer structure: input layer, pattern layer, summation layer and decision layer.

    The input layer receives and normalizes the input vector, which is the feature extracted from the EEG signal using CSP. We set the number of neurons in the pattern layer to 3 for our 3-class MI classification. Every unit in the pattern layer represents a training vector. The pattern layer computes the Euclidean distance between the input vector and every training vector, then realizes a nonlinear mapping with a Gaussian kernel: the spherical Gaussian radial basis function, which is a Parzen probability density function estimator, as in equation (12):

    \[ f_i(X) = \frac{1}{(2\pi)^{q/2} \sigma^{q}} \, \frac{1}{M_i} \sum_{k=1}^{M_i} \exp\left[ -\frac{(X - X_{ik})^T (X - X_{ik})}{2\sigma^2} \right] \tag{12} \]

    where q is the dimension of the input vector X, $\sigma$ is the smoothing parameter, $M_i$ is the number of training vectors in class i, and $X_{ik}$ is the kth training vector of class i.

    The summation layer computes the sum for each class and multiplies it by the loss factor. The decision layer selects the largest value in the summation layer as the classification result.
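The pattern, summation and decision layers can be sketched directly from (12). The smoothing parameter `sigma` and the equal default loss factors are assumptions, since the text gives no values for them, and the function names are hypothetical.

```python
import numpy as np


def parzen_density(x, class_vectors, sigma):
    """Equation (12): Gaussian Parzen estimate f_i(x) for one class."""
    class_vectors = np.asarray(class_vectors, dtype=float)
    q = class_vectors.shape[1]                       # input dimension
    sq_dist = np.sum((class_vectors - x) ** 2, axis=1)
    norm = (2 * np.pi) ** (q / 2) * sigma ** q
    return np.exp(-sq_dist / (2 * sigma ** 2)).mean() / norm


def pnn_predict(x, training_sets, sigma=1.0, loss=None):
    """Summation layer (density times loss factor), then decision layer."""
    # Assumption: equal loss factors unless the caller supplies them.
    loss = loss or {cls: 1.0 for cls in training_sets}
    scores = {cls: loss[cls] * parzen_density(x, vecs, sigma)
              for cls, vecs in training_sets.items()}
    return max(scores, key=scores.get)
```

Here `training_sets` maps each class label to the list of its training feature vectors, so one Gaussian unit is evaluated per training vector.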

    Figure 2. Probabilistic neural network structure

    2.4. Multi-class support vector machine (M-SVM)

    For comparison with OCSVM, we also use M-SVM. We use the LIBSVM software package, a freely available library of SVM tools developed by Dr. Chih-Jen Lin of National Taiwan University. It can solve classification, regression estimation and distribution estimation problems, among others. In this paper, we use LIBSVM to solve the classification problem, with a Gaussian radial basis function as the kernel function.

    2.5. One-class support vector machine (OCSVM)


    SVM is a machine learning method developed from statistical learning theory. It finds the optimal separating hyperplane by learning in the feature space. This method can overcome the shortcomings of rule-based classification algorithms and achieve higher classification accuracy with less training data. However, SVM was originally proposed for the binary case. Solving multi-class problems requires constructing multiple classifiers (one-versus-one, one-versus-rest, etc.), training the models and determining complex parameters. In this paper, we instead select an extension of the method, OCSVM, to solve the problem, and it achieves good results.

    We consider training data

    \[ x_1, x_2, x_3, \ldots, x_n \in X \subseteq R^N \]

    where n is the number of training samples. First, the data are mapped into the feature space; then the smallest separating hyper-sphere containing as many samples as possible is found by learning in the feature space. We want the sphere to be as small as possible while including as many training samples as possible. This problem can be transformed into the following optimization problem:

    \[ \min \left( \frac{1}{2} \|w\|^2 + \frac{1}{\nu l} \sum_{i=1}^{l} \xi_i - \rho \right) \tag{13} \]

    \[ \text{s.t.} \quad (w \cdot \Phi(x_i)) \ge \rho - \xi_i, \quad \xi_i \ge 0, \quad i = 1, 2, \ldots, l \]

    where w and $\rho$ are the hyper-sphere parameters and $\Phi$ is the map from the input space to the feature space. By setting the parameter $\nu$ (0


    \[ f_3(x) = \sum_i \alpha_i^3 K(x_i^3, x) - \rho_3^*, \quad \alpha_i^3 > 0 \tag{19} \]

    The discriminant function makes the decision by finding the maximum of $f_1(x)$, $f_2(x)$ and $f_3(x)$:

    \[ \max( f_1(x), f_2(x), f_3(x) ) \tag{20} \]
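The decision rule in (20) can be sketched as below, assuming that each of the three one-class models has already been trained and exposes its support vectors, coefficients and offset. The Gaussian kernel, the `gamma` value and the dictionary layout of a model are assumptions made for illustration; the text here does not specify them.

```python
import math


def rbf_kernel(a, b, gamma):
    """Gaussian radial basis kernel K(a, b) = exp(-gamma * ||a - b||^2)."""
    return math.exp(-gamma * sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def ocsvm_score(x, support_vectors, alphas, rho, gamma):
    """f_c(x) = sum_i alpha_i K(x_i, x) - rho for one class model."""
    return sum(a * rbf_kernel(sv, x, gamma)
               for a, sv in zip(alphas, support_vectors)) - rho


def classify(x, models, gamma=1.0):
    """Equation (20): return the class whose model gives the largest f_c(x)."""
    scores = {cls: ocsvm_score(x, m["sv"], m["alpha"], m["rho"], gamma)
              for cls, m in models.items()}
    return max(scores, key=scores.get)
```

A model is passed as a dictionary such as `{"sv": [[0.0, 0.0]], "alpha": [1.0], "rho": 0.5}`; these numbers are placeholders, not trained values.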

    3. Results

    3.1. Data description and processing

    We use dataset 2a of BCI Competition IV (2008). The competition data set consists of EEG data from 9 subjects. The cue-based BCI paradigm consisted of four different motor imagery tasks, namely the imagination of movement of the left hand (class 1), right hand (class 2), both feet (class 3), and tongue (class 4). Two sessions on different days were recorded for each subject. Each session comprises 6 runs separated by short breaks. One run consists of 48 trials (12 for each of the four possible classes), yielding a total of 288 trials per session. We process only the first three classes, and for every subject we combine the two sessions into one data set.

    Twenty-two Ag/AgCl electrodes (with inter-electrode distances of 3.5 cm) were used to record the EEG. All signals were recorded monopolarly with the left mastoid serving as reference and the right mastoid as ground. The signals were sampled at 250 Hz and bandpass-filtered between 0.5 Hz and 100 Hz. The sensitivity of the amplifier was set to 100 μV, and an additional 50 Hz notch filter was used to suppress line noise. In this paper, the signals were further bandpass-filtered between 0.5 Hz and 40 Hz.

    3.2. Results of the four algorithms

    For the feature vectors of every subject, we use ten-fold cross-validation to obtain the mean classification accuracy (MCA) of every algorithm (Alg), using the same feature samples extracted through CSP. The results are shown in Table 1. There are 9 subjects, denoted A01-A09.

    Table 1. Mean classification accuracy (MCA) of the four algorithms (Alg) for each subject (Sub)

    Sub    LDA      OCSVM    PNN      SVM
    A01    0.786    0.660    0.621    0.333
    A02    0.538    0.626    0.548    0.307
    A03    0.485    0.590    0.563    0.333
    A04    0.678    0.741    0.698    0.650
    A05    0.697    0.714    0.698    0.581
    A06    0.542    0.605    0.540    0.333
    A07    0.652    0.674    0.669    0.610
    A08    0.595    0.636    0.610    0.555
    A09    0.619    0.645    0.555    0.600
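The values in Table 1 can be summarized across subjects with a short script. The across-subject mean and the per-subject best method computed here are derived from the table and are not reported in the paper; the variable names are hypothetical.

```python
# Table 1, one row per subject: (LDA, OCSVM, PNN, SVM).
table = {
    "A01": (0.786, 0.660, 0.621, 0.333),
    "A02": (0.538, 0.626, 0.548, 0.307),
    "A03": (0.485, 0.590, 0.563, 0.333),
    "A04": (0.678, 0.741, 0.698, 0.650),
    "A05": (0.697, 0.714, 0.698, 0.581),
    "A06": (0.542, 0.605, 0.540, 0.333),
    "A07": (0.652, 0.674, 0.669, 0.610),
    "A08": (0.595, 0.636, 0.610, 0.555),
    "A09": (0.619, 0.645, 0.555, 0.600),
}
algorithms = ("LDA", "OCSVM", "PNN", "SVM")

# Mean accuracy of each algorithm over the nine subjects.
mean_accuracy = {alg: sum(row[k] for row in table.values()) / len(table)
                 for k, alg in enumerate(algorithms)}

# Algorithm with the highest accuracy for each subject.
best_per_subject = {sub: algorithms[row.index(max(row))]
                    for sub, row in table.items()}
```

On these values OCSVM has the highest mean accuracy and is the best method for eight of the nine subjects; for A01, LDA is higher.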

    In addition, for every subject we calculate the variance of the ten classification accuracies, which is also plotted in Figure 3.
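The evaluation procedure, ten-fold cross-validation followed by the mean and variance of the ten accuracies, can be sketched as follows. The shuffling, the way samples are assigned to folds and the use of the population variance are assumptions; `train_and_predict` is a hypothetical callback standing in for any of the four classifiers.

```python
import random
from statistics import mean, pvariance


def ten_fold_accuracy(samples, labels, train_and_predict, folds=10, seed=0):
    """Return (mean, variance) of the per-fold classification accuracies.

    train_and_predict(train_x, train_y, test_x) -> list of predicted labels
    """
    order = list(range(len(samples)))
    random.Random(seed).shuffle(order)
    accuracies = []
    for k in range(folds):
        test_idx = order[k::folds]            # every folds-th shuffled sample
        held_out = set(test_idx)
        train_idx = [i for i in order if i not in held_out]
        predicted = train_and_predict([samples[i] for i in train_idx],
                                      [labels[i] for i in train_idx],
                                      [samples[i] for i in test_idx])
        hits = sum(p == labels[i] for p, i in zip(predicted, test_idx))
        accuracies.append(hits / len(test_idx))
    return mean(accuracies), pvariance(accuracies)
```

The function expects at least as many samples as folds, so that every fold has a non-empty test set.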


    [Figure 3 panels: A01, A02, A03, A04, A05, A06, A07, A08, A09]

    Figure 3. Classification accuracy of the nine subjects using the four algorithms (A01-A09 are the results of subject 1 to subject 9). Black lines represent the variance of the results of the ten-fold cross-validation.


    Figure 4. Line chart of the combined results of the nine subjects. The abscissa is the subject and the ordinate is the classification accuracy.

    4. Discussion and Conclusion

    According to the results, the detective method OCSVM has distinct advantages in the EEG-based classification of three MI tasks. Being a detection method, OCSVM constructs a model for every class. When new samples arrive, every model detects and selects the ones that belong to its own class. This detective method provides a new way to solve the classification problem.

    5. Acknowledgement

    This work was supported in part by the National Science Foundation of China (61031002, 60873125, 30800287, 60703038, 61070074).

    6. References

    [1] McFarland D, Wolpaw J, "Sensorimotor rhythm-based brain-computer interface (BCI): feature selection by regression improves performance", IEEE Transactions on Neural Systems and Rehabilitation Engineering, vol. 13, no. 3, 2005.

    [2] Penny W, Roberts S, "EEG-based communication: a pattern recognition approach", IEEE Transactions on Rehabilitation Engineering, vol. 8, no. 2, pp.214-215, 2000.

    [3] Schalk G, Brunner P, "Brain-computer interfaces (BCIs): detection instead of classification", Journal of Neuroscience Methods, vol. 167, no. 1, pp.51-62, 2008.

    [4] Schölkopf B, Platt J, Williamson R, "Estimating the support of a high-dimensional distribution", Neural Computation, vol. 13, no. 7, pp.1443-1471, 2001.

    [5] Yunqiang Chen, Xiang Zhou, "One-class SVM for learning in image retrieval", Proceedings of IEEE International Conference on Image Processing, pp.34-39, 2001.

    [6] Ma YD, Liu L, "Pulse-coupled neural networks and one-class support vector machines for geometry invariant texture retrieval", Pattern Recognization, vol. 28, no. 11, pp.1524-1529, 2010.

    [7] Huang X, Chen X, "A Novel Clustering Algorithm Based on One-Class SVM", IEEE Computer Society, pp.486-490, 2009.

    [8] Zhendong Y, "Research of communication signal modulation scheme recognition based on one-class SVM bayesian algorithm", Proceedings of the 5th International Conference on Wireless communications, networking and mobile computing, pp.739-742, 2009.

    [9] Gardner A, Krieger A, "One-class novelty detection for seizure analysis from intracranial EEG", The Journal of Machine Learning Research, vol. 7, pp.1025-1044, 2006.

    [10] Sun S, Zhang C, "Adaptive feature extraction for EEG signal classification", Medical and Biological Engineering and Computing, vol. 44, no. 10, pp.931-935, 2006.

    [11] Ramoser H, Müller-Gerking J, "Optimal spatial filtering of single trial EEG during imagined hand movement", IEEE Transactions on Rehabilitation Engineering, vol. 8, no. 4, pp.441-446, 2000.

    [12] Wang Y, Berg P, "Common spatial subspace decomposition applied to analysis of brain responses under multiple task conditions: a simulation study", Clinical Neurophysiology, vol. 110, no. 4, pp.604-614, 1999.

    [13] Müller-Gerking J, Pfurtscheller G, "Designing optimal spatial filters for single-trial EEG classification in a movement task", Clinical Neurophysiology, vol. 110, no. 5, pp.787-798, 1999.

    [14] Wu W, Gao X, "One-versus-the-rest (OVR) algorithm: An extension of common spatial patterns (CSP) algorithm to multi-class case", 27th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, pp.2387-2390, 2005.

    [15] Dornhege G, Blankertz B, "Boosting bit rates in noninvasive EEG single-trial classifications by feature combination and multiclass paradigms", IEEE Transactions on Biomedical Engineering, vol. 51, no. 6, pp.993-1002, 2004.

    [16] Quigguo W, Fei M, "Feature combination for classifying single-trial ECoG during motor imagery of different sessions", Progress in Natural Science, vol. 17, no. 7, pp.851-858, 2007.

    [17] Martínez A, Kak A, "PCA versus LDA", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 23, no. 2, pp.228-233, 2001.