research article enhanced z-lda for small sample size...

Research ArticleEnhanced Z-LDA for Small Sample Size Training inBrain-Computer Interface Systems

Dongrui Gao1 Rui Zhang1 Tiejun Liu12 Fali Li1 Teng Ma1 Xulin Lv1 Peiyang Li1

Dezhong Yao12 and Peng Xu12

1Key Laboratory for Neuro-Information of Ministry of Education School of Life Science and TechnologyUniversity of Electronic Science and Technology of China Chengdu 611731 China2Center for Information in Bio-Medicine University of Electronic Science and Technology of China Chengdu 611731 China

Correspondence should be addressed to Peng Xu xupenguestceducn

Received 10 July 2015 Revised 28 September 2015 Accepted 28 September 2015

Academic Editor Irena Cosic

Copyright copy 2015 Dongrui Gao et al This is an open access article distributed under the Creative Commons Attribution Licensewhich permits unrestricted use distribution and reproduction in any medium provided the original work is properly cited

Background Usually the training set of online brain-computer interface (BCI) experiment is small For the small training set it lacksenough information to deeply train the classifier resulting in the poor classification performance during online testingMethodsIn this paper on the basis of Z-LDA we further calculate the classification probability of Z-LDA and then use it to select the reliablesamples from the testing set to enlarge the training set aiming to mine the additional information from testing set to adjust thebiased classification boundary obtained from the small training set The proposed approach is an extension of previous Z-LDAand is named enhanced Z-LDA (EZ-LDA) Results We evaluated the classification performance of LDA Z-LDA and EZ-LDA onsimulation and real BCI datasets with different sizes of training samples and classification results showed EZ-LDA achieved the bestclassification performance Conclusions EZ-LDA is promising to deal with the small sample size training problem usually existingin online BCI system

1 Introduction

Brain-computer interface (BCI) could translate brain inten-tion into computer commands and it has been widely usedfor cursor control [1] word spelling [2] neurological reha-bilitation [3] and so forth Generally BCI system consists ofstimulus presentation signal acquisition feature extractionand translation modules [4] among them feature extractionand translation algorithms play important roles for the finalBCI performance Three factors including heteroscedasticclass distribution small sample size training and nonstation-ary physiological signals should be taken into considerationwhen selecting the translation algorithms for BCI systemRegarding the translation module linear discriminant analy-sis (LDA) is one of the most popular classification algorithmsfor BCI application due to its simplicity and it has beenwidely used in motor imagery-based BCI [5] P300 speller[6] and motion-onset visual evoked potential-based BCI(mVEP-BCI) [7] Besides the use of linear discriminant

analysis (LDA) for functional near-infrared spectroscopy-(fNIRS-) based BCIs is worth mentioning LDA has beenshown to work effectively for the binary [8] andmulticlass [9]classifications of motor imagery signals for the developmentof fNIRS-based BCIs

However LDA is established on the homoscedastic classdistribution assumption which is usually not held for prac-tical BCI application In order to handle this problem weproposed an improved method named 119911-score LDA (Z-LDA) [10] Z-LDA defines the decision boundary through119911-score utilizing both mean and standard deviation infor-mation of the projected data and evaluation results showedbetter classification performance could be obtained underthe heteroscedastic distribution situation But Z-LDA doesnot take into account the small sample size training problemthat usually existed in actual online BCI system [11] Whenthe number of the training samples is small the estimatedclassifier tends to be overfitted resulting in the poor general-ization during online testing Various approaches have been

Hindawi Publishing CorporationComputational and Mathematical Methods in MedicineVolume 2015 Article ID 680769 7 pageshttpdxdoiorg1011552015680769

2 Computational and Mathematical Methods in Medicine

proposed to address this issue [12] Li et al designed a self-training semisupervised SVM algorithm to train the classifierwith small training data [11] Xu et al proposed a strategywhich enlarges training set by adding test samples withhigh probability to improve the classification performanceof Bayesian LDA under small sample training situation [1314] The strategy hypothesizes that unlabeled samples withhigh probability provide valuable information for refining theclassification boundary

In essence Z-LDA defines the confidence of samples interms of its position in the estimated distribution whichcould be used to update the classifier for the online BCIsystem In the current study we will extend Z-LDA to dealwith the small size training problem

2 Materials and Methods

21 Probability Output of Z-LDA In LDA [15] the weightsum 119910(119883) of the unlabeled sample 119883 is calculated based onthe project vector119882which is estimated from the training setand the corresponding prediction label is then determinedby the shortest distance between 119910(119883) and the labels of eachclass For Z-LDA [10] we assume that119910(119883) of samples in eachclass follow normal distribution and normalize it through 119911-score as

119911119896 =

119910 (119883) minus 120583119896

120590119896

(1)

where 120583119896 120590119896 are the corresponding mean and standarddeviation of the weight sum 119910(119883) for training set 119862119896 Thus119911119896 follows standard normal distribution Z-LDA make theprediction based on the distance between 119911119896 and mean ofthe standard normal distribution (ie 0) Suppose 119911lowast is theclosest one near to 0 then the unlabeled sample will beclassified to training set 119862lowast In the binary classification thedecision boundary of Z-LDA is defined as [10]

119888lowast=

(12059011205832 + 12059021205831)

(1205901 + 1205902)

(2)

Generally the cumulative distribution function of thestandard normal distribution is denoted as

Φ (119909) = 119875 119883 lt 119909 =

1

radic2120587

int

119909

minusinfin

119890minus11990522119889119905

minus infin lt 119909 lt +infin

(3)

For the transformed 119911-score 119911lowast of Z-LDA the area representsthe cumulative probability Φ(119911lowast) that is shown in Figure 1Based on this we define the prediction probability of Z-LDAas

119875 (119911lowast) = 1 minus (Φ (

1003816100381610038161003816119911lowast1003816100381610038161003816) minus Φ (minus

1003816100381610038161003816119911lowast1003816100381610038161003816)) (4)

The area which represents 119875(119911lowast) is also marked on Figure 1It is easy to know that 119875(119911lowast) decreases with the increaseddistance between 119911lowast and the mean of the standard normaldistribution and the range of 119875(119911lowast) is [0 1] Obviously thelarger 119875(119911lowast) denotes the higher confidence that the samplebelongs to class 119862lowast thus the above definition is reasonable

Φ(|zlowast|) minus Φ(minus|z

lowast|)

Φ(zlowast) 1 minus Φ(|z

lowast|)

50

40

30

20

10

0

Prob

abili

ty (

)

minus5 minus4 minus3 minus2 minus1 0 1 2 3 4 5

z

P(zlowast)

Figure 1The prediction probability of EZ-LDA and the cumulativeprobability

Class 1 Class 2

Classifier model

Unlabeled test samples

Label 1 with probability


Training samples

Train

Classify

Output

Prediction result of test samples

ThresholdThreshold

ExceedExceed

EnlargeEnlarge

Figure 2 The flow chart of the training set enlarging strategy

22 Training Set Enlarging Strategy A small training set maynot provide enough information for estimating the distribu-tion parameters of the samples thus the biased classificationboundary could be obtained when training the classifierIn this case the classification accuracy will be decreasedduring online test We propose to add a kind of training setenlarging strategy to alleviate the small sample size trainingeffect in this work and the detailed procedures are illustratedin Figure 2 After classifier model is estimated based on thesmall training set and the prediction results of unlabeled testsamples are obtained the strategy assumes that unlabeledtest samples with high classification probability representcorrect prediction and these correctly predicted samplescould be screened out and then used to enlarge the trainingset Thus more accurate sample distribution estimation andthe improved classifier with the refined classification bound-ary would be obtained [13] In this strategy classificationprobability information exported from Z-LDA classifier isregarded as a confident evaluation criterion to select the highprobability test samples which are then labeled accordingto the prediction results of Z-LDA Next we need to set athreshold the predicted label of a test sample is believed tobe correct if the corresponding classification probability ishigher than the threshold Finally this test sample could be

Computational and Mathematical Methods in Medicine 3

considered as a training sample because its label has beencorrectly predicted and it could be added to the trainingset for classifier calibration The above procedures could berepeated several times thus more samples could be selectedthe training set could be enlarged and the more accurateclassification boundary could be found The above trainingset enlarging strategy incorporated with Z-LDA is named asenhanced Z-LDA (EZ-LDA) in current study

23 Simulations We constructed a simulation dataset inorder to quantitatively investigate the classification perfor-mance of EZ-LDA when dealing with the small sample sizetraining problem The simulation was established by usingthe fundamental two 2-dimensional Gaussian distributionswhere the samples in the first class follow a Gaussian distri-bution with mean (minus5 1) and standard deviation (1 1) andthe samples in the second class follow a Gaussian distributionwith mean (5 minus1) and standard deviation (5 5) 5 outliersamples were added into both the training and testing setswhere the outliers follow a Gaussian distribution with mean(25 minus15) and standard deviation (1 1)Therewere 50 samplesin the training set with 25 samples in each class and thetesting set consisted of 200 samples with 100 samples in eachclass

During classifier training 20 30 40 and 50 samples fromthe training set were selected respectively The predictionresults of LDA and Z-LDA were also calculated for compar-ison Regarding EZ-LDA the test set was divided into 20parts with 10 samples in each partThenwe used the classifierestimated from original training set to predict the labels ofthe first part (10 samples) of the test set and obtained thepredicted label and classification probability of each sampleNext we set a probability threshold the predicted label ofa test sample is believed to be correct if the correspondingclassification probability is higher than the threshold Finallythis kind of test sample could be considered as a trainingsample because its label has been correctly predicted andit could be added to the training set Once the first part(10) samples have been processed we retrain the classifierbased on the extended training set then using the updatedclassifier to process the next 10 test samples the repetitionswill be stopped until all the test samples are predicted Theprobability thresholds ranged from 10 to 90 with a step10 being considered The above procedures were repeated100 times in order to reduce the random effect and all thesamples of the training and testing set were generated at thebeginning of each iteration Finally the average classificationaccuracies were obtained for the three classifiers respectively

24 Real BCI Dataset

241 Dataset Description The evaluation dataset comesfrom motion-onset visual evoked potential-based BCI(mVEP-BCI) experiment mVEP could be evoked by briefmotion of object and it is time locked to the onset of themotion We can achieve the brief motion stimuli by screenvirtual button thus BCI system based on mVEP could bedeveloped [7 16]

Eight subjects (2 females aged 233 plusmn 13 years) par-ticipated in the current study They were with normal orcorrected to normal vision The experimental protocol wasapproved by the Institution Research Ethics Board of theUniversity of Electronic Science andTechnology ofChina Allparticipants were asked to read and sign an informed consentform before participating in the study After the experimentall participants received monetary compensation for theirtime and effort

The experimental paradigm is similar as we described in[17] six virtual buttons were presented on the 14-inch LCDscreen and in each virtual button a red vertical line appearedin the right side of the button and moved leftward until itreached the left side of the button which generated a briefmotion-onset stimulus The entire move took 140ms witha 60ms interval between the consecutive two moves Themotion-onset stimulus in each of the six buttons appeared ina random order and a trial was defined as a complete series ofmotion-onset stimulus of all six virtual buttons successivelyThe interval between two trials was 300ms thus each triallasted 15 s Five trials comprised a block which costs 75 sThe subject needs to focus on the button which is indicated inthe center of the graphical user interface and the instructednumber randomly appeared To increase their attention thesubject was further asked to count in silence the times ofmoving stimulus appearing in the target button A total of 144blocks including 720 trials were collected for each subjectin four equal separate sessions with a 2-minute rest periodbetween sessions

Ten AgAgCl electrodes (CP1 CP2 CP3 CP4 P3 P4 PzO3 O4 and Oz) from extended 10ndash20 system were placedfor EEG recordings by using a Symtop amplifier (SymtopInstrument Beijing China) All electrode impedance waskept below 5 kΩ and AFz electrode was adopted as referenceThe EEG signals were sampled at 1000Hz and band-passfiltered between 05 and 45Hz

242 Preprocessing and Feature Extraction Since the scalprecorded EEG signals are usually contaminated with noisethose trials with absolute amplitude above 50 120583v thresholdwere removed from the following analysis The remainingEEG data were band-pass filtered between 05Hz and 10Hzbecause the mVEP is usually distributed in the low frequencyband [18] Then the EEG epochs of 5 trials in each blockwere averaged by stimulus Similar to P300-based study theinstructed stimulus was defined as target and the others weredefined as nontarget Two-sample 119905-test was applied betweenthe target and nontarget epochs to find the channels and timewindows which exhibit significant difference Finally threesignificant channels were selected for each subject and thetime windows ranged from 140ms to 350ms which variedbetween subjects The selected epochs were further down-sampled to 20Hz and a 9-dimensional or 12-dimensionalfeature vector was generated for each stimulus in the blockat the end

243 Small Sample Size Classification The aim of themVEP-BCI was to distinguish the target and nontarget stimuli that


Table 1 The classification accuracies of LDA Z-LDA and EZ-LDAon simulation dataset

Training size 20 30 40 50LDA 787 804 819 834Z-LDA 770 817 830 840

EZ-LDA(differentthresholds)

10 812lowast 836lowast 849lowast 858lowast









The second column denotes the threshold used in EZ-LDAlowast denotes the classification accuracy of EZ-LDA is significantly higher thanthat of Z-LDA (Mann-Whitney 119880 test 119901 lt 005)

is recognizing the button which subject paid attention tothus it was a binary classification problem The ratio of thenontarget and target number in each trial was 5 1 and weselected equal number of target and nontarget samples forinitially training in the following analysis in order to balancethe two classes 20 40 60 80 and 100 samples were used totrain the classifiers respectively and the remaining sampleswere used for test Similar to the strategies demonstratedin Simulations the higher probability test samples wereadded into the training set to update the classifier for EZ-LDA with the step also being 10 samples and the thresholdwas set as 50 because the highest accuracy was achievedat this threshold on the simulation dataset Similarly theclassification results of LDA and Z-LDA were also calculatedfor comparison

3 Results

Theclassification results of the simulation dataset were shownin Table 1 where the accuracies of all the three classifiersincreased with the extending of the initial training samplesize EZ-LDA achieved the highest average accuracies underall the sample size conditions and the accuracies obtainedby EZ-LDA under all thresholds were significantly higherthan Z-LDA (119901 lt 005 Mann-Whitney 119880 test) When thetraining sample size was bigger than 20 both Z-LDA andEZ-LDA achieved higher accuracies than LDA Regardingthe various thresholds in EZ-LDA the best performance wasobtained for the threshold 50 To further reveal the workingmechanism of EZ-LDA an illustration of the training setenlarging strategy was given in Figure 3 The initial trainingsample size was 20 Z-LDA and EZ-LDA shared the sameclassification boundary (Figure 3(a)) When the high prob-ability samples in testing set were included in the trainingset the corresponding boundary lines of EZ-LDA could beadaptively adjusted (Figures 3(b)ndash3(d))

Figure 4 presented the performances on real mVEP-BCIdataset for LDA Z-LDA and EZ-LDA For all the three

kinds of classifiers the mean accuracies increased when weenlarged the training size and this is consistent with thesimulation result Moreover the classification accuracies ofEZ-LDA consistently outperformed Z-LDA in all of the fiveconsidered training sample sizes (119901 lt 005 Mann-Whitney119880test) comparedwith LDA the significant improvement couldbe also observed except the training size with 40 samples

4 Discussion

Translation algorithms in BCI system could translate intent-related features to computer commands and generally weneed to collect a training set at first for training the classifierBut due to the factors such as experiment time limit electrodeconductivity decrease over time and subject fatigue usuallythe training set is not big enough in BCI application Besidesas for the online BCI system the subjectrsquos mental state mayvary largely from the previous mental state during trainingTherefore in order to track subjectrsquos mental state change it isnecessary to update the classifier by utilizing the informationfrom those new test samples which is also useful for resolvingthe small sample size training problem

The best way to reduce the bias effect is to include enoughsamples to train the classifier but it is unable to achieve inactual BCI experiment Alternatively we could consider theunlabeled test samples with high classification probabilityas the prediction of them could be trusted Thus we mayenlarge the training set by adding those test sampleswith highclassification probability in testing set to refine the classifierResults from both simulation and real BCI datasets in currentstudy showed that the enlarging strategy for training set couldimprove the classification performance of Z-LDAunder smallsample size situation The classification boundary definitionof Z-LDA is based on the distribution of the training samplesvividly it can be viewed as the intersection of the Gaussiandistribution curves [10] If the training set is too small wemay not get the accurate distribution information of thesamples resulting in the biased classifier boundary In thiscase the classification accuracy of Z-LDA may be decreasedas revealed in the results from both simulation and actualBCI datasets Specifically in simulation dataset the meanaccuracies of Z-LDA are lower than those of LDA when thetraining size is 20 similarly in real BCI dataset when 20 or40 samples are selected for training the mean accuraciesof Z-LDA are lower than LDA However EZ-LDA achievedbetter accuracies than LDA in both simulation and real BCIdatasets but EZ-LDA still needs a number of training samplesfor initial classifier training based on the simulation andreal BCI datasets in current study we think the minimumnumber of training samples is 20 for EZ-LDA to performwellAdding the test samples with higher classification probabilityto training set could enlarge the training size thus moreaccurate sample distribution information could be estimatedfor Z-LDA and the biased classification boundary couldbe corrected too As shown in Figure 3 the classificationboundaries are the same between EZ-LDA and Z-LDA forthe initial training set (Figure 3(a)) Then the boundary isused to predict the labels of the test samples and those


minus5

minus5

minus10

minus10

minus15

minus200 5 10 15 20 25

0

5

10

Z-LDAEZ-LDA

(a)

minus5minus10

minus5

minus10

minus15

minus200 5 10 15 20 25

0

5

10

Z-LDAEZ-LDA

(b)

minus5minus10

minus5

minus10

minus15

minus20

0 5 10 15 20 25

0

5

10

Z-LDAEZ-LDA

(c)

minus5minus10

minus5

minus10

minus15

minus20

0 5 10 15 20 25

0

5

10

Z-LDAEZ-LDA

(d)

Figure 3 The classification boundary adjustment processes of EZ-LDA (a) The initial 20 training samples (b) 20 training samples + 20 testsamples (c) 20 training samples + 40 test samples (d) 20 training samples + 60 test samples Star the samples in the first class filled circlethe samples in the second class blue the samples in the training set red the higher probability test samples for extending the training setgrey the lower probability test samples

predicted test samples with high classification probability areselected to extend the training set finally the classificationboundary is refined (Figure 3(b)) The above procedurescould be repeated again to obtainmore accurate classificationboundary (Figures 3(c) and 3(d)) Noted that some of thetest samples may be wrongly classified during predictionusually these kinds of samples are far from the distributioncenter and with lower classification probability for examplethe grey filled circles in Figure 3 Since theymay influence theestimated classifier model we set a threshold to screen themout

The classification probability of Z-LDA is based onthe cumulative distribution function Φ(119909) of the standardnormal distribution The cumulative probability is equal to0 50 and 100 when 119909 are minusinfin 0 and +infin respectivelyMathematically Z-LDA measures similarity of how a samplebelongs to this distribution by the distance between thedistribution center 0 and the sample variable defined in (1)Following (4) the closer to distribution center the samplevariable is the higher probability the sample belongs to theclass with this distribution To expand the training set weneed to set a probability threshold to select the test samples


20 40 60 80 10070

74

78

82

86

90

Training size

Clas

sifica

tion

accu

racy

()

LDAZ-LDAEZ-LDA

lowast

lowast

lowast

lowast lowast

Figure 4 Classification results of LDA Z-LDA and EZ-LDA onmVEP-BCI dataset with different training sample sizes lowast denotesthe classification accuracy of EZ-LDA is significantly higher thanthat of Z-LDA (Mann-Whitney 119880 test 119901 lt 005)

with correctly predicted label Obviously more wrongly pre-dicted samples may be added to the training set when a lowerthreshold was set whereas when the threshold was set higherfewer samples could be selected to add in the training setThebiased classification boundary cannot be effectively correctedunder the two above conditions as proved by Table 1 that therelatively poor performance is achieved for EZ-LDA whenthe threshold is set as 10 or 90 However the highestmean accuracy is achieved when the threshold is set as 50Therefore the threshold should be set as a compensationof the number of correctly and wrongly predicted samplesin the actual application In the real BCI dataset we onlyconsidered the 50 threshold for illustration because thehighest accuracies are achieved under this threshold onsimulation dataset As shown in Figure 4 when the enlargingstrategy is combined with EZ-LDA the performance of EZ-LDA has statistical improvement compared with LDA and Z-LDA because the information in the testing set can be minedto update the classifier

SVM is an another popular classifier used for BCIapplication [19 20] the advantages of LDA are simplicity andlow computational cost while SVM has better generalizationcapability Theoretically the optimal hyperplane of SVMmaximizes the distance between the support vectors (thenearest training points) thus it is highly dependent on thenearest training points which are found from the trainingsamples However when there are fewer samples in thetraining set the selected nearest training points may not berepresentative thus the obtained optimal hyperplane may bebiased tooTherefore SVMcannot solve the small sample sizetraining problem

5 Conclusions

In the current study we proposed adding the test sampleswith higher classification probability to the training set forobtaining comprehensive distribution information thus thebiased classification boundary estimated from the smalltraining set could be corrected The effectiveness of EZ-LDA in handling the small sample size training problem wasvalidated on both simulation and real BCI datasets EZ-LDAis an extension of Z-LDA and it is easy to be implemented inreal-time BCI systems

Conflict of Interests

The authors declare that they have no competing interests

Authorsrsquo Contribution

DongruiGao andRui Zhang contributed equally to this work

Acknowledgments

This work was supported in part by Grants from the 973Program 2011CB707803 the 863 Project 2012AA011601 theNational Nature Science Foundation of China (no 61175117no 81330032 no 91232725 and no 31100745) the Programfor New Century Excellent Talents in University (no NCET-12-0089) and the 111 Project (B12027)

References

[1] J R Wolpaw and D J McFarland ldquoControl of a two-dimen-sional movement signal by a noninvasive brain-computer inter-face in humansrdquo Proceedings of the National Academy of Sciencesof the United States of America vol 101 no 51 pp 17849ndash178542004

[2] Z Zhou E Yin Y Liu J Jiang and D Hu ldquoA novel task-oriented optimal design for P300-based brain-computer inter-facesrdquo Journal of Neural Engineering vol 11 no 5 Article ID056003 2014

[3] M Takahashi K Takeda Y Otaka et al ldquoEvent related desyn-chronization-modulated functional electrical stimulation sys-tem for stroke rehabilitation a feasibility studyrdquo Journal ofNeuroEngineering and Rehabilitation vol 9 article 56 2012

[4] J R Wolpaw N Birbaumer D J McFarland G Pfurtschellerand T M Vaughan ldquoBrain-computer interfaces for communi-cation and controlrdquo Clinical Neurophysiology vol 113 no 6 pp767ndash791 2002

[5] C Guger H Ramoser and G Pfurtscheller ldquoReal-time EEGanalysis with subject-specific spatial patterns for a brain-computer interface (BCI)rdquo IEEE Transactions on RehabilitationEngineering vol 8 no 4 pp 447ndash456 2000

[6] V Bostanov ldquoBCI competition 2003mdashdata sets Ib and IIbfeature extraction from event-related brain potentials with thecontinuous wavelet transform and the t-value scalogramrdquo IEEETransactions on Biomedical Engineering vol 51 no 6 pp 1057ndash1061 2004

[7] F Guo B Hong X Gao and S Gao ldquoA brain-computerinterface using motion-onset visual evoked potentialrdquo Journalof Neural Engineering vol 5 no 4 pp 477ndash485 2008


[8] N Naseer and K-S Hong ldquoClassification of functional near-infrared spectroscopy signals corresponding to the right- andleft-wrist motor imagery for development of a brain-computerinterfacerdquo Neuroscience Letters vol 553 pp 84ndash89 2013

[9] K-S Hong N Naseer and Y-H Kim ldquoClassification ofprefrontal andmotor cortex signals for three-class fNIRS-BCIrdquoNeuroscience Letters vol 587 pp 87ndash92 2015

[10] R Zhang P Xu L Guo Y Zhang P Li and D Yao ldquoZ-Scorelinear discriminant analysis for EEG based brain-computerinterfacesrdquo PLoS ONE vol 8 no 9 Article ID e74433 2013

[11] Y Li C Guan H Li and Z Chin ldquoA self-training semi-supervised SVM algorithm and its application in an EEG-basedbrain computer interface speller systemrdquo Pattern RecognitionLetters vol 29 no 9 pp 1285ndash1294 2008

[12] F Lotte M Congedo A Lecuyer F Lamarche and B ArnaldildquoA review of classification algorithms for EEG-based brain-computer interfacesrdquo Journal of Neural Engineering vol 4 no2 pp R1ndashR13 2007

[13] P Xu P Yang X Lei and D Yao ldquoAn enhanced probabilisticLDA for multi-class brain computer interfacerdquo PLoS ONE vol6 no 1 Article ID e14634 2011

[14] X Lei P Yang and D Yao ldquoAn empirical bayesian frameworkfor brain-computer interfacesrdquo IEEE Transactions on NeuralSystems and Rehabilitation Engineering vol 17 no 6 pp 521ndash529 2009

[15] C M Bishop Pattern Recognition and Machine LearningSpringer New York NY USA 2006

[16] B Hong F Guo T Liu X R Gao and S K Gao ldquoN200-spellerusing motion-onset visual responserdquo Clinical Neurophysiologyvol 120 no 9 pp 1658ndash1666 2009

[17] R Zhang P Xu R Chen et al ldquoAn adaptive motion-onsetVEP-based brain-computer interfacerdquo IEEE Transactions onAutonomous Mental Development 2015

[18] M Kuba Z Kubova J Kremlacek and J Langrova ldquoMotion-onset VEPs characteristics methods and diagnostic userdquoVision Research vol 47 no 2 pp 189ndash202 2007

[19] N Naseer M J Hong and K-S Hong ldquoOnline binary decisiondecoding using functional near-infrared spectroscopy for thedevelopment of brain-computer interfacerdquo Experimental BrainResearch vol 232 no 2 pp 555ndash564 2014

[20] N Naseer and K Hong ldquofNIRS-based brain-computer inter-faces a reviewrdquo Frontiers in Human Neuroscience vol 9 article3 2015

Submit your manuscripts athttpwwwhindawicom

Stem CellsInternational

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014


MEDIATORSINFLAMMATION

of


Behavioural Neurology

EndocrinologyInternational Journal of



Disease Markers


BioMed Research International

OncologyJournal of



Oxidative Medicine and Cellular Longevity


PPAR Research

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Immunology ResearchHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

ObesityJournal of



Computational and Mathematical Methods in Medicine

OphthalmologyJournal of


Diabetes ResearchJournal of



Research and TreatmentAIDS


Gastroenterology Research and Practice


Parkinsonrsquos Disease

Evidence-Based Complementary and Alternative Medicine

Volume 2014Hindawi Publishing Corporationhttpwwwhindawicom


proposed to address this issue [12] Li et al designed a self-training semisupervised SVM algorithm to train the classifierwith small training data [11] Xu et al proposed a strategywhich enlarges training set by adding test samples withhigh probability to improve the classification performanceof Bayesian LDA under small sample training situation [1314] The strategy hypothesizes that unlabeled samples withhigh probability provide valuable information for refining theclassification boundary

In essence Z-LDA defines the confidence of samples interms of its position in the estimated distribution whichcould be used to update the classifier for the online BCIsystem In the current study we will extend Z-LDA to dealwith the small size training problem

2 Materials and Methods

21 Probability Output of Z-LDA In LDA [15] the weightsum 119910(119883) of the unlabeled sample 119883 is calculated based onthe project vector119882which is estimated from the training setand the corresponding prediction label is then determinedby the shortest distance between 119910(119883) and the labels of eachclass For Z-LDA [10] we assume that119910(119883) of samples in eachclass follow normal distribution and normalize it through 119911-score as

119911119896 =

119910 (119883) minus 120583119896

120590119896

(1)

where 120583119896 120590119896 are the corresponding mean and standarddeviation of the weight sum 119910(119883) for training set 119862119896 Thus119911119896 follows standard normal distribution Z-LDA make theprediction based on the distance between 119911119896 and mean ofthe standard normal distribution (ie 0) Suppose 119911lowast is theclosest one near to 0 then the unlabeled sample will beclassified to training set 119862lowast In the binary classification thedecision boundary of Z-LDA is defined as [10]

119888lowast=

(12059011205832 + 12059021205831)

(1205901 + 1205902)

(2)

Generally the cumulative distribution function of thestandard normal distribution is denoted as

Φ (119909) = 119875 119883 lt 119909 =

1

radic2120587

int

119909

minusinfin

119890minus11990522119889119905

minus infin lt 119909 lt +infin

(3)

For the transformed 119911-score 119911lowast of Z-LDA the area representsthe cumulative probability Φ(119911lowast) that is shown in Figure 1Based on this we define the prediction probability of Z-LDAas

119875 (119911lowast) = 1 minus (Φ (

1003816100381610038161003816119911lowast1003816100381610038161003816) minus Φ (minus

1003816100381610038161003816119911lowast1003816100381610038161003816)) (4)

The area which represents 119875(119911lowast) is also marked on Figure 1It is easy to know that 119875(119911lowast) decreases with the increaseddistance between 119911lowast and the mean of the standard normaldistribution and the range of 119875(119911lowast) is [0 1] Obviously thelarger 119875(119911lowast) denotes the higher confidence that the samplebelongs to class 119862lowast thus the above definition is reasonable

Φ(|zlowast|) minus Φ(minus|z

lowast|)

Φ(zlowast) 1 minus Φ(|z

lowast|)

50

40

30

20

10

0

Prob

abili

ty (

)

minus5 minus4 minus3 minus2 minus1 0 1 2 3 4 5

z

P(zlowast)

Figure 1The prediction probability of EZ-LDA and the cumulativeprobability

Class 1 Class 2

Classifier model

Unlabeled test samples



Training samples

Train

Classify

Output

Prediction result of test samples

ThresholdThreshold

ExceedExceed

EnlargeEnlarge

Figure 2 The flow chart of the training set enlarging strategy

22 Training Set Enlarging Strategy A small training set maynot provide enough information for estimating the distribu-tion parameters of the samples thus the biased classificationboundary could be obtained when training the classifierIn this case the classification accuracy will be decreasedduring online test We propose to add a kind of training setenlarging strategy to alleviate the small sample size trainingeffect in this work and the detailed procedures are illustratedin Figure 2 After classifier model is estimated based on thesmall training set and the prediction results of unlabeled testsamples are obtained the strategy assumes that unlabeledtest samples with high classification probability representcorrect prediction and these correctly predicted samplescould be screened out and then used to enlarge the trainingset Thus more accurate sample distribution estimation andthe improved classifier with the refined classification bound-ary would be obtained [13] In this strategy classificationprobability information exported from Z-LDA classifier isregarded as a confident evaluation criterion to select the highprobability test samples which are then labeled accordingto the prediction results of Z-LDA Next we need to set athreshold the predicted label of a test sample is believed tobe correct if the corresponding classification probability ishigher than the threshold Finally this test sample could be





24 Real BCI Dataset






















3 Results




4 Discussion




minus5

minus5

minus10

minus10

minus15

minus200 5 10 15 20 25

0

5

10

Z-LDAEZ-LDA

(a)

minus5minus10

minus5

minus10

minus15

minus200 5 10 15 20 25

0

5

10

Z-LDAEZ-LDA

(b)

minus5minus10

minus5

minus10

minus15

minus20

0 5 10 15 20 25

0

5

10

Z-LDAEZ-LDA

(c)

minus5minus10

minus5

minus10

minus15

minus20

0 5 10 15 20 25

0

5

10

Z-LDAEZ-LDA

(d)





20 40 60 80 10070

74

78

82

86

90

Training size

Clas

sifica

tion

accu

racy

()

LDAZ-LDAEZ-LDA

lowast

lowast

lowast

lowast lowast




5 Conclusions






Acknowledgments


References



























of






Disease Markers



OncologyJournal of





PPAR Research



Journal of

ObesityJournal of




















24 Real BCI Dataset






















3 Results




4 Discussion




minus5

minus5

minus10

minus10

minus15

minus200 5 10 15 20 25

0

5

10

Z-LDAEZ-LDA

(a)

minus5minus10

minus5

minus10

minus15

minus200 5 10 15 20 25

0

5

10

Z-LDAEZ-LDA

(b)

minus5minus10

minus5

minus10

minus15

minus20

0 5 10 15 20 25

0

5

10

Z-LDAEZ-LDA

(c)

minus5minus10

minus5

minus10

minus15

minus20

0 5 10 15 20 25

0

5

10

Z-LDAEZ-LDA

(d)





20 40 60 80 10070

74

78

82

86

90

Training size

Clas

sifica

tion

accu

racy

()

LDAZ-LDAEZ-LDA

lowast

lowast

lowast

lowast lowast




5 Conclusions






Acknowledgments


References



























of






Disease Markers



OncologyJournal of





PPAR Research



Journal of

ObesityJournal of































3 Results




4 Discussion




minus5

minus5

minus10

minus10

minus15

minus200 5 10 15 20 25

0

5

10

Z-LDAEZ-LDA

(a)

minus5minus10

minus5

minus10

minus15

minus200 5 10 15 20 25

0

5

10

Z-LDAEZ-LDA

(b)

minus5minus10

minus5

minus10

minus15

minus20

0 5 10 15 20 25

0

5

10

Z-LDAEZ-LDA

(c)

minus5minus10

minus5

minus10

minus15

minus20

0 5 10 15 20 25

0

5

10

Z-LDAEZ-LDA

(d)





20 40 60 80 10070

74

78

82

86

90

Training size

Clas

sifica

tion

accu

racy

()

LDAZ-LDAEZ-LDA

lowast

lowast

lowast

lowast lowast




5 Conclusions






Acknowledgments


References



























of






Disease Markers



OncologyJournal of





PPAR Research



Journal of

ObesityJournal of

















minus5

minus5

minus10

minus10

minus15

minus200 5 10 15 20 25

0

5

10

Z-LDAEZ-LDA

(a)

minus5minus10

minus5

minus10

minus15

minus200 5 10 15 20 25

0

5

10

Z-LDAEZ-LDA

(b)

minus5minus10

minus5

minus10

minus15

minus20

0 5 10 15 20 25

0

5

10

Z-LDAEZ-LDA

(c)

minus5minus10

minus5

minus10

minus15

minus20

0 5 10 15 20 25

0

5

10

Z-LDAEZ-LDA

(d)





20 40 60 80 10070

74

78

82

86

90

Training size

Clas

sifica

tion

accu

racy

()

LDAZ-LDAEZ-LDA

lowast

lowast

lowast

lowast lowast




5 Conclusions






Acknowledgments


References



























of






Disease Markers



OncologyJournal of





PPAR Research



Journal of

ObesityJournal of

















20 40 60 80 10070

74

78

82

86

90

Training size

Clas

sifica

tion

accu

racy

()

LDAZ-LDAEZ-LDA

lowast

lowast

lowast

lowast lowast




5 Conclusions






Acknowledgments


References



























of






Disease Markers



OncologyJournal of





PPAR Research



Journal of

ObesityJournal of



































of






Disease Markers



OncologyJournal of





PPAR Research



Journal of

ObesityJournal of
















research article enhanced z-lda for small sample size...

Documents