

Statistical identification of bacteria species

A. Suchwalko¹,*, I. Buzalewicz¹ and H. Podbielska¹

¹Bio-Optics Group, Institute of Biomedical Engineering and Instrumentation, Wroclaw University of Technology, Wybrzeze Wyspianskiego 27, 50-370 Wroclaw, Poland

The challenge of developing a fast and trustworthy method for bacteria identification has been taken up by many research groups around the world, but only two have developed methods based on the statistical analysis of forward light scattering on bacterial colonies grown on solid nutrient media. In contrast to other techniques, optical methods offer non-destructive and non-contact bacteria identification, which allows a different method to be applied to the same sample if the identification needs to be confirmed, e.g. in medical diagnostic procedures.

Both of these methods use dedicated optical systems for recording diffraction patterns of bacterial colonies, construct a set of numerical features calculated from the patterns and use statistical analysis to identify unknown samples. The methods use different optical systems, different numerical features and different statistical analysis workflows, and therefore obtain different results. In this chapter we present a brief comparison of the two methods and a more detailed description of the one developed in our group.

Keywords bacteria species identification; Fresnel diffraction patterns; statistical analysis; classification

1. Introduction

The threat posed by pathogenic bacteria species grows worldwide at an alarming rate every year. Microbes identified as possible food- and waterborne pathogens are present in the surrounding environment. Increasing bacterial resistance to well-known antibacterial agents and antibiotics is a serious threat to our health and life [1–9]. The identification of bacteria species is a difficult task that consumes both time and resources. Bacteria can be identified by means of biochemical, molecular or immunological techniques. However, in spite of their high sensitivity, these techniques are time consuming and require high-quality reagents and very pure samples, which makes them expensive. Moreover, the most sensitive methods, such as PCR (polymerase chain reaction), can take up to seven days to identify a particular bacteria species, which in hospital conditions may in extreme cases even lead to the patient's death. Recently, real-time PCR was demonstrated to be a fast and reliable method, but it needs a priori knowledge and therefore cannot deal with unknown samples [9,10]. Various optical methods have been proposed as well, e.g. detection of bacterial cells in water suspension by forward and backward light scattering [11], infrared [12] and fluorescence spectroscopy [13–16], flow cytometry, chromatography, chemiluminescence [17], bioconjugated biomolecules [18,19] and surface plasmon resonance (SPR) [20,21]. Although they offer non-invasive, non-destructive and non-contact examination of bacterial samples, they share the disadvantages of the previous techniques, including demanding, time-consuming preparation of high-quality samples and the need for equipment with high sensitivity and spectral resolution [9]. To fight this increasingly menacing enemy effectively, we need tools that allow fast and reliable detection and identification of bacteria species and their strains.

The challenge of developing a fast and trustworthy method for bacteria identification has been taken up by many research groups around the world, but only two have developed methods based on the statistical analysis of forward light scattering on bacterial colonies grown on solid nutrient media. In contrast to other techniques, optical methods offer non-destructive and non-contact bacteria identification, which allows a different method to be applied to the same sample if the identification needs to be confirmed, e.g. in medical diagnostic procedures [9,10,22–38]. These two methods are the microbial high-throughput screening (HTS) system proposed by researchers from Purdue University and the Bacteria Identification System (BIS) introduced by our Bio-Optics Group at the Wroclaw University of Technology. Although the HTS name first appears in a work from 2012 [26], the first research on the topic dates from 2006 [29]. In the text, the methods are referred to by their acronyms, HTS and BIS. Both methods use dedicated optical systems for recording diffraction patterns of bacterial colonies, construct a set of numerical features calculated from the patterns and use statistical analysis to identify unknown samples. The methods use different optical systems, different numerical features and different statistical analysis workflows, and therefore obtain different results. The methods are compared further in this chapter. The BIS method offers a unique optical system with converging spherical wave illumination that exhibits the following properties:

• compression of the observation plane to a finite region in space,
• observation of Fresnel and Fraunhofer patterns in the same setup,
• scaling of diffraction patterns.

The HTS method is based on detection of multi-angle light-scatter patterns formed by the complex interaction of laser light with the structure of colonies, in order to quantify differences in colony phenotypes [30].

Microbial pathogens and strategies for combating them: science, technology and education (A. Méndez-Vilas, Ed.)

© FORMATEX 2013


Additionally, neither system requires special, labor-intensive preparation of bacteria samples, unlike immunological, biochemical or biomolecular techniques. It should be pointed out that the properties of the BIS optical system listed above are not offered by other optical configurations, such as the HTS set-up [9,32–38]. Moreover, the BIS optical system for bacteria species identification can be integrated with the method developed in our group for automated evaluation of the number of bacterial colonies by analysis of their Fourier spectra [34], which is not offered by the other system. This will further increase the practical applicability of the system in microbiological diagnosis. Results obtained with the BIS system have shown that colonies of different bacteria species and their strains generate unique signatures [32]. In the analysis workflow, registration of the Fresnel patterns is followed by extraction of classification features based on statistical moments, which correspond to the morphological and textural characteristics of the bacterial colonies and their diffraction patterns. From the many features, only the best are selected for further analysis by statistical means (analysis of variance, ANOVA, and Fisher divergence). The selected features are then used to build classification models (Linear Discriminant Analysis, Quadratic Discriminant Analysis and Support Vector Machine) and, finally, the models are used to identify the Fresnel patterns of unknown bacteria colonies [9,36–38]. The workflows of the BIS and HTS systems are compared in Table 1.

Table 1 Summary of analysis steps in the two methods described in the work

| Analysis tasks | Analysis subtasks | BIS | Methods used by BIS | HTS | Methods used by HTS |
| --- | --- | --- | --- | --- | --- |
| 1. Data acquisition | Bacteria samples preparation | + | Incubation on Columbia agar (Oxoid) | + | Incubation on soy agar (BD, catalog #211043) |
| | Diffraction patterns registration | + | Optical system with converging spherical wave illumination | + | Laser scatterometer |
| 2. Image preprocessing | Center and edges of the colony marking | + | ImageJ macro with human interaction | + | Colony locating module |
| | Normalization | + | Histogram stretching with assumption of black background | - | |
| | Pattern partitioning | + | Dividing patterns into 10 rings of equal thickness | - | |
| | Feature extraction | + | Textural and morphological features based on statistical moments calculated within rings | + | Features based on pseudo-Zernike moments and Haralick texture features |
| 3. Statistical analysis | Feature selection | + | ANOVA, SNR | + | Sequential floating forward selection-based wrapper used with the k-nearest neighbor, linear discriminant analysis and recursive-partitioning algorithms |
| | Classification | + | Best classifier was QDA | + | Best classifier was SVM |
| | Classification performance assessment | + | CV, sensitivity, specificity, accuracy | + | CV, sensitivity, specificity, accuracy |

The identification results obtained by the BIS for the seven bacteria species under study (Salmonella enteritidis, Staphylococcus aureus, Staphylococcus intermedius, Escherichia coli, Proteus mirabilis, Pseudomonas aeruginosa and Citrobacter freundii) show an identification accuracy varying from 98.58% to 99.143%, depending on the selected features [9,36–38].


2. Sample preparation

2.1. Sample choosing

The BIS method was developed and tested on seven species of pathogenic bacteria: Salmonella enteritidis, Staphylococcus aureus, Staphylococcus intermedius, Escherichia coli, Proteus mirabilis, Pseudomonas aeruginosa and Citrobacter freundii. The species were chosen according to the difficulty of identifying them with classical methods. The HTS was tested on the most common pathogenic species of Vibrio (Vibrio cholerae, Vibrio parahaemolyticus and Vibrio vulnificus), on Listeria monocytogenes and on other Listeria species [27,30].

2.2. Sample preparation

The seven chosen bacteria species were examined with the BIS: Salmonella Enteritidis (ATCC 13076), Escherichia coli (PCM O119), Staphylococcus aureus (PCM 2267), Staphylococcus intermedius (PCM 2405), Citrobacter freundii (PCM 531), Proteus mirabilis (PCM 547) and Pseudomonas aeruginosa (ATCC 27853) (Table 2). The cultures were obtained from the microbiological laboratory of the Department of Epizootiology and Veterinary Administration with Clinic of Infectious Diseases of the Wroclaw University of Environmental and Life Sciences. Bacteria suspensions were first incubated for 18 hours at 37°C. Respective dilutions were then seeded on the surface of the solid nutrient medium in a Petri dish so as to obtain 12–20 colonies per plate, and were incubated again at 37°C for a further 11 hours. The bacteria colonies were grown on Columbia agar (Oxoid) [9].

Table 2 Number of patterns analyzed for various bacteria species by BIS method.

| Bacteria species | Number of patterns |
| --- | --- |
| Salmonella enteritidis | 55 |
| Staphylococcus aureus | 50 |
| Staphylococcus intermedius | 48 |
| Escherichia coli | 53 |
| Proteus mirabilis | 47 |
| Pseudomonas aeruginosa | 50 |
| Citrobacter freundii | 49 |

Trypticase soy agar (BD, catalog #211043) was used for sample preparation in the HTS method. The agar powder (40 mg) was suspended in 1 litre of Millipore water and boiled for 1 min. After the agar cooled, 35 ml was dispensed onto sterile Petri dishes. Petri dishes containing test samples were incubated at 30°C for 12–18 h. A colony diameter of 1.3 ± 0.2 mm was used as a fixed parameter for each test culture before analysis with the light scatterometer [26,30].

3. Pattern registration

3.1. Optical system

The BIS optical system with converging spherical wave illumination for analysis of light diffraction on bacterial colonies (Fig. 1) includes a laser diode module (635 nm, 1 mW, collimated, Thorlabs), a neutral density filter (optical density: 0–4.0, Edmund Optics), a linear polarizer (Thorlabs), a beam expander (4X, Edmund Optics), an iris diaphragm, a transforming lens (achromatic doublet, focal distance: 48.6 cm, clear aperture: 6.35 cm, Edmund Optics), an XYZ sample positioning stage with a Petri dish, which allows the uniform illumination of a single bacteria colony to be adjusted, a CMOS camera (EO-1312, Edmund Optics) and a computer. In this configuration the bacterial colony is illuminated by a converging spherical wave generated by the transforming lens in order to control lateral scale changes of the diffraction patterns. The sample was located between the back surface of the transforming lens and the CMOS camera. The diameter of the light beam was approximately equal to the diameter of the bacterial colony [9,32–38].

The forward scatterometer was designed and fabricated as the core of the HTS system and consists of a colony locating module and an elastic light scatter (ELS) module. The first module is required to obtain the total number of colonies and the colony centre locations; it captures the back-illuminated image of the Petri dish with a universal serial bus (USB) complementary metal oxide semiconductor (CMOS) camera (PL-B741U-BL, 1280 × 1024 pixels, PixeLINK, Ottawa, ON, Canada). Compared to the previous light source with oblique illumination, back illumination by light-emitting diodes (LED) provides stable and constant illumination without flickering. The USB camera is equipped with an imaging lens (M118FM08, Tamron, Saitama, Japan) with a viewing angle of 34° × 25.6° to cover the whole area of the plate within a given imaging distance. A plate diffuser is placed between the bottom of the plate and the light source. The second module records the ELS patterns, which are generated by the passage of light from a diode laser (λ = 635 nm, Coherent 0221–698-01 REV B, CA, USA) producing a 1 mW circular beam with a 1/e² diameter of 1 mm, and are captured by a second USB CMOS camera. Two linear stages (XN10-0060-M02-71, stroke 6 in., resolution 2 mm, Velmex, Bloomfield, NY, USA) were used for 2D translation of the plate and communicated with the computer through a USB connection [26].

Fig. 1 The configuration of the optical system BIS for bacteria species classification based on Fresnel diffraction patterns

3.2. Pattern characteristics

Diffraction patterns of bacterial colonies exhibit unique spatial structures. They contain a set of diffraction rings, whose number and sizes depend on the bacteria species. This is correlated with the morphology of the colony, as well as with amplitude and phase properties that are unique for each species. The number of rings was estimated as $N_{\mathrm{ring}} \cong \Delta\Phi / (2\pi)$, where ΔΦ is the phase lag [23]. Morphological properties of the bacterial colonies are reflected by the morphological and textural properties of their diffraction patterns. Therefore, morphological and textural properties of the diffraction patterns were examined, as they are easily interpretable [9,26,38].

Exemplary Fresnel patterns of bacterial colonies of various bacteria species, as captured in the BIS system, are depicted in Fig. 2. 8-bit grey-scale images were analysed, which gives 256 intensity levels per pixel. In both systems each image is 1280 pixels wide and 1024 pixels high, but in the HTS the bitmaps were rescaled and squares of 300 × 300 pixels representing the centres of the bitmaps were subjected to further analysis [29,31]. The Fresnel patterns in the BIS system cover at least 40% of the image space and were analysed as recorded. All diffraction patterns were recorded in the same optical system configuration under the same strictly controlled conditions.


4. Image processing

4.1. Pattern normalization

To avoid errors related to human and laboratory factors (e.g. Fresnel patterns recorded by multiple laboratory workers or in slightly different conditions), normalization is applied as a preliminary step of the analysis in both approaches [29,31]. Some differences in the background intensity can be observed in Fig. 2. Normalization in the BIS system was carried out under the assumption that each pattern recorded under controlled conditions has a black background. The mean intensity of the pixels belonging to the background (beyond the pattern edges) was calculated separately for each pattern and set as the left edge of the stretched histogram. The histogram stretching followed the standard algorithm, which transforms pixel intensity values according to the formula

$$n(x,y) = \frac{o(x,y) - \min(o)}{\max(o) - \min(o)} \, \bigl(\max(n) - \min(n)\bigr) + \min(n),$$

where o stands for the original image, n for the new (transformed) image with a known range, and x and y denote pixel coordinates [9,36–40]. Exemplary Fresnel patterns after normalization, as obtained in the BIS method, are shown in Fig. 3.
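The stretching formula above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the BIS implementation: the function name `stretch_histogram`, the `background` argument (standing in for the mean background intensity used as the left edge of the histogram), the clipping of pixels darker than the background and the 0–255 target range are all hypothetical choices, and plain nested lists are used only to keep the sketch free of dependencies.

```python
def stretch_histogram(image, background, new_min=0, new_max=255):
    """Linearly rescale pixel intensities to the range [new_min, new_max].

    `background` plays the role of min(o) in the formula: it becomes the
    left edge of the stretched histogram (an assumption of this sketch).
    """
    old_max = max(max(row) for row in image)
    if old_max <= background:
        raise ValueError("image has no pixels brighter than the background")
    scale = (new_max - new_min) / (old_max - background)
    stretched = []
    for row in image:
        new_row = []
        for value in row:
            # Assumption: pixels darker than the background estimate are
            # clipped to it, so they map to new_min.
            clipped = max(value, background)
            new_row.append(round((clipped - background) * scale + new_min))
        stretched.append(new_row)
    return stretched


# Tiny synthetic "pattern": dark border, bright centre.
pattern = [[20, 22, 24], [22, 122, 70], [24, 70, 20]]
background_mean = 20  # hypothetical mean of the pixels beyond the pattern edge
normalized = stretch_histogram(pattern, background_mean)
```

After the call, the darkest pixels sit at 0 and the brightest at 255, so patterns recorded with slightly different background levels become comparable.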

4.2. Pattern partitioning

The set of diffraction rings is correlated with the morphology of the colony and with its amplitude and phase properties, which are unique for each species. Therefore, in the BIS system the characteristics of the patterns are analysed in concentric annulus-shaped zones (regions of interest, ROIs), while the HTS analyses whole patterns with the use of pseudo-Zernike moments and Haralick texture features [9,26,29,36–38]. In the BIS each Fresnel pattern was bounded by a circle covering the entire pattern, whose centre and radius were marked manually. The radius of this circle was then divided into 10 equal parts, from which the rings were created. Partitioning of the Fresnel patterns into 3, 5 and 10 rings of equal thickness was compared, and the 10-ring split proved to be the most effective of the fixed partitionings [9,37]. The partitioning into 10 disjoint rings was performed in the ImageJ software [41] by a dedicated macro with human interaction. Examples of 10-ring partitioning of various Fresnel patterns are depicted in Fig. 4. They show the differences in the location and size of each pattern.
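The partitioning into rings of equal thickness can be illustrated as follows. This is a sketch only: the BIS used an ImageJ macro with manual marking, whereas here the centre and radius are simply passed in, and the names `ring_index` and `partition_into_rings`, the 0-based ring numbering and the exclusion of pixels lying exactly on the bounding circle are assumptions made for the example.

```python
import math


def ring_index(x, y, center, radius, n_rings=10):
    """Return the 0-based ring containing pixel (x, y), or None if outside."""
    distance = math.hypot(x - center[0], y - center[1])
    if distance >= radius:
        # Assumption: pixels on or beyond the bounding circle are background.
        return None
    return int(distance * n_rings / radius)


def partition_into_rings(image, center, radius, n_rings=10):
    """Group pixel intensities by concentric ring of equal thickness."""
    rings = [[] for _ in range(n_rings)]
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            index = ring_index(x, y, center, radius, n_rings)
            if index is not None:
                rings[index].append(value)
    return rings


# Synthetic 41 x 41 image with a hypothetical centre and radius.
size = 41
image = [[(x + y) % 256 for x in range(size)] for y in range(size)]
rings = partition_into_rings(image, center=(20, 20), radius=20)
```

Each entry of `rings` then holds the intensities of one annulus, counted outwards from the colony centre, ready for per-ring feature calculation.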

Fig. 2 Exemplary Fresnel patterns of bacterial colonies of various bacteria species as captured in the BIS system


5. Patterns to numbers

5.1. Features description

Fig. 3 Exemplary Fresnel patterns before and after normalization, BIS method

Fig. 4 Exemplary partitioning of Fresnel patterns into 10 rings

The features used for building classification models in both approaches are based on statistical moments: central moments in the BIS and pseudo-Zernike moments, computed using pseudo-Zernike polynomials, in the HTS [9,26,31]. They were chosen because they are known to provide a good textural interpretation of images and, in the BIS approach, also of regions of interest within images. The mean value (first raw moment) and the standard deviation (square root of the second central moment) of the pixel intensities within a ring denote the brightness and roughness of the region of interest, respectively. Skewness (third moment) measures the symmetry of the shape of the distribution of the pixel intensities within each ring. A greater value of skewness means a longer tail in the bright direction of the pixel intensity histogram. Kurtosis (fourth moment) is a measure of the flatness or peakedness of the distribution of the pixel intensities within each ring [38].

Apart from the statistical moments, other features describing textural characteristics were calculated from the ring regions in the BIS. Relative smoothness is a measure based on the standard deviation σ and is given for each ring by the formula $1 - 1/(1 + \sigma^2)$; the standard deviation and relative smoothness features are therefore dependent. Uniformity was calculated as $\sum_i p(x_i)^2$, where $p(x)$ is the probability mass function of the pixel intensities within the ring and $i$ takes values from the available range, $i \in [0, 255]$. Lastly, the entropy was calculated, which quantifies the expected value of the information carried by the pattern rings and is given by the formula $-\sum_i p(x_i) \log_2 p(x_i)$, known as the Shannon entropy. All features are labelled feature.x, where feature is the name of the calculated numerical feature and x is the ring number, counting from the centre of the colony [38].
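A minimal sketch of the seven per-ring features described above is given below. Several details are assumptions, since the text does not fix them: population (not sample) moments are used, skewness and kurtosis are taken in their standardized form (third and fourth central moments divided by σ³ and σ⁴), ring numbers in the feature.x labels start at 1, and the function names `ring_features` and `pattern_features` are hypothetical.

```python
import math
from collections import Counter


def ring_features(pixels):
    """Compute seven texture features for one ring of 8-bit pixel values."""
    n = len(pixels)
    mean = sum(pixels) / n
    # Central moments of order 2, 3 and 4 (population form, an assumption).
    m2 = sum((p - mean) ** 2 for p in pixels) / n
    m3 = sum((p - mean) ** 3 for p in pixels) / n
    m4 = sum((p - mean) ** 4 for p in pixels) / n
    std = math.sqrt(m2)
    # Probability mass function over the intensities present in the ring;
    # absent intensities have p = 0 and contribute nothing to either sum.
    pmf = [count / n for count in Counter(pixels).values()]
    return {
        "mean": mean,
        "std": std,
        "skewness": m3 / std ** 3 if std > 0 else 0.0,
        "kurtosis": m4 / m2 ** 2 if std > 0 else 0.0,
        "smoothness": 1 - 1 / (1 + m2),
        "uniformity": sum(p ** 2 for p in pmf),
        "entropy": -sum(p * math.log2(p) for p in pmf),
    }


def pattern_features(rings):
    """Label features as 'feature.x', x being the ring number from the centre."""
    features = {}
    for ring_number, pixels in enumerate(rings, start=1):
        for name, value in ring_features(pixels).items():
            features[f"{name}.{ring_number}"] = value
    return features


# Two toy rings: a two-level ring and a perfectly flat ring.
example = pattern_features([[0, 0, 255, 255], [10, 10, 10, 10]])
```

With 10 rings this yields 7 × 10 = 70 labelled values per pattern, which matches the size of the BIS feature set.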

5.2. Feature extraction

The diffraction patterns of bacteria colonies are represented by discrete pixel intensities. To extract interpretable features, each pattern was partitioned into 10 disjoint rings of equal thickness, which was shown to be the best of the fixed splits (for the BIS). The pixel intensities within each ring were then used to calculate features based on statistical moments, which are often used as texture features of images because they are easily interpretable. The feature values were calculated for each ring of each pattern in the BIS analysis and for the whole pattern in the HTS analysis [9,26,29,38].

6. Statistical identification

6.1. Statistical analysis workflow

The data set in the BIS system consists of 70 features (7 feature groups calculated in each of 10 rings) for each of 352 diffraction pattern images, and it is just the start of the catalogue of pathogenic bacteria species needed to use the method in laboratories. Since the data set is large and contains many classes (bacteria species), it is crucial to develop an efficient analysis workflow that takes into consideration all foreseeable difficulties. The HTS method used 120 features describing each image as inputs for the analysis procedure. The order of the Zernike moment-based features was high enough to include not only low-frequency shape information but also high-frequency details of the light-scatter patterns [29].

6.2. Feature selection

As the BIS data set of Fresnel diffraction patterns consists of 70 features for each pattern (7 feature groups and 10 rings), it was necessary to choose those that differentiate bacteria species best. It is well known that too many features make classification models more complicated and less robust. For this reason we decided to choose only some groups of features. Selecting whole groups of features decreases the diversity of the features used and is thus a step towards better interpretation of the results while maintaining good identification quality. For this task, filter-class feature selection methods were applied, and combinations of different feature groups and rankings of their class (species) separation capability were tested. The combination of 4 feature groups (entropy, mean, standard deviation and uniformity) gave the best identification results and was therefore chosen for the analysis [38].

To decide which features are best for building the classification models, analysis of variance (ANOVA) and the Fisher divergence measure, also called the signal-to-noise ratio (SNR), were used. Both tests were used to find the features that differentiate bacteria species best. The explicit SNR formula is μ/σ. We used it to calculate the separation measure of a given feature by averaging separation values calculated in a one-versus-all scheme. This means that for each feature under study we calculated seven separation measures using the SNR formula for two series of data:

$$\mathrm{SNR} = \frac{1}{n} \sum_{i=1}^{n} \frac{\mu_i - \mu_j}{\sigma_i + \sigma_j},$$

where n is the number of bacteria species (seven in our case), i denotes a given bacteria species and j denotes the complement of species i in the whole data set. SNR was chosen as the better-suited method because texture-based features differ in mean value, as shown by ANOVA in a previous study, but show high variance, which has a great impact on the ability of the features to differentiate species [38]. In the HTS, feature reduction was performed via linear discriminant analysis (LDA) and principal component analysis (PCA) [29], but no feature selection step was reported.
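The one-versus-all averaging of the SNR measure can be sketched as below. This is an illustration under assumptions rather than the authors' code: the function name `snr_separation` is hypothetical, population standard deviations are used, and the absolute value of the mean difference is taken (an addition not present in the formula as printed) so that contributions of opposite sign do not cancel when averaged.

```python
from statistics import mean, pstdev


def snr_separation(values, labels):
    """Average one-versus-all SNR of a single feature over all species."""
    species = sorted(set(labels))
    total = 0.0
    for target in species:
        inside = [v for v, l in zip(values, labels) if l == target]
        outside = [v for v, l in zip(values, labels) if l != target]
        spread = pstdev(inside) + pstdev(outside)
        if spread == 0:
            raise ValueError("feature is constant in both groups")
        # Assumption: absolute difference, so opposite signs do not cancel.
        total += abs(mean(inside) - mean(outside)) / spread
    return total / len(species)


# Toy data: three species (A, B, C) with three patterns each.
labels = ["A", "A", "A", "B", "B", "B", "C", "C", "C"]
informative = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8, 9.0, 9.2, 8.8]
noisy = [1.0, 5.0, 9.0, 1.2, 5.2, 9.2, 0.8, 4.8, 8.8]

score_informative = snr_separation(informative, labels)
score_noisy = snr_separation(noisy, labels)
```

Ranking features by this score puts the feature whose values cluster by species well ahead of the one whose values are spread evenly across species, which is the behaviour a filter-type selection step relies on.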

6.3. Classification and identification

Classification is the process of building statistical classifiers from diffraction pattern features and using them to predict unknown bacteria species, whereas identification is only the prediction of unknown bacteria species. Identification can therefore be regarded as the final step of classification. The two terms are used interchangeably when describing results.

6.4. Classification methods

In the HTS, the scatter patterns, represented by a number of selected numerical features, are classified using Fisher’s linear discriminant, a support vector machine with linear kernel (SVM-L), and a support vector machine with radial-basis function kernel (SVM-RBF). The classification system follows the implementation described previously (Bayraktar et al., 2006; Banada et al., 2007) [30]. In the BIS method three classification algorithms were used: Linear Discriminant Analysis (LDA), Quadratic Discriminant Analysis (QDA) and Support Vector Machine (SVM). Their results were verified by a classifier performance assessment method. QDA separates classes with a quadratic surface and yielded the lowest error; it is therefore the main classifier in the BIS [9,38].
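For reference, the textbook Gaussian form of the discriminant functions behind QDA and LDA is given below. This is the standard formulation and not necessarily the exact implementation used in the BIS. The symbols are introduced here for illustration: x is the feature vector of a pattern, μ_k and Σ_k are the mean vector and covariance matrix of species k, and π_k is its prior probability.

```latex
% QDA: one covariance matrix per class, giving a quadratic decision surface
\delta_k(x) = -\tfrac{1}{2}\log\lvert\Sigma_k\rvert
              -\tfrac{1}{2}(x-\mu_k)^{\mathsf T}\Sigma_k^{-1}(x-\mu_k)
              +\log\pi_k ,
\qquad
\hat{y}(x) = \arg\max_k \delta_k(x)

% LDA: a shared covariance matrix (\Sigma_k = \Sigma), giving a linear decision surface
\delta_k(x) = x^{\mathsf T}\Sigma^{-1}\mu_k
              -\tfrac{1}{2}\mu_k^{\mathsf T}\Sigma^{-1}\mu_k
              +\log\pi_k
```

Because QDA estimates a separate covariance matrix for every species, it needs more training patterns per class than LDA, which is one reason why limiting the number of features matters.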

6.5. Performance assessment

Cross-validation (CV) was chosen as the classifier performance assessment method in the BIS. CV is a technique for estimating how well the classification results will generalize to other, independent data sets. In other words, CV estimates the unknown prediction error. CV splits the data set into two complementary subsets (learning and test sets) and performs the classification analysis. These two steps are repeated a given number of times on the whole data set and, as a result, the classification error is estimated. In our analysis, 100 repetitions were performed for each classifier and data set [9,36–38].

Sensitivity and specificity are well-known statistical measures of the performance of a binary (two-class, e.g. case and control) classification test. Because we are dealing with a multi-class problem (7 classes), sensitivity and specificity cannot be calculated in the ordinary way. The most widely accepted method of calculating these measures in the multi-class case is to treat one class as the case class and to combine the remaining classes into the control class. In this simple way the binary definition can be applied as many times as there are classes. The values of sensitivity and specificity averaged over all classes give measures for the whole experiment. The accuracy of the classification is equal to 1 − CV error [9,36–38].
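The repeated learning/test splitting can be sketched as follows. This is a minimal illustration under stated assumptions, not the BIS workflow: a nearest-centroid classifier stands in for QDA to keep the example dependency-free, and the 30% test fraction, the function names and the toy data are all hypothetical. Only the 100 repetitions come from the text.

```python
import random
from statistics import mean


def nearest_centroid_fit(X, y):
    """Stand-in classifier (not QDA): store the mean feature vector of each class."""
    centroids = {}
    for label in set(y):
        rows = [x for x, l in zip(X, y) if l == label]
        centroids[label] = [mean(col) for col in zip(*rows)]
    return centroids


def nearest_centroid_predict(centroids, x):
    """Return the label of the closest class centroid (squared Euclidean distance)."""
    def dist2(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda label: dist2(centroids[label]))


def repeated_holdout_error(X, y, repetitions=100, test_fraction=0.3, seed=0):
    """Average test error over random learning/test splits of the data set."""
    rng = random.Random(seed)
    indices = list(range(len(X)))
    n_test = max(1, int(len(X) * test_fraction))
    errors = []
    for _ in range(repetitions):
        rng.shuffle(indices)
        test, learn = indices[:n_test], indices[n_test:]
        model = nearest_centroid_fit([X[i] for i in learn], [y[i] for i in learn])
        wrong = sum(nearest_centroid_predict(model, X[i]) != y[i] for i in test)
        errors.append(wrong / n_test)
    return mean(errors)


# Toy data: two well-separated hypothetical species, ten patterns each.
X = [[0.1 * i, 1.0] for i in range(10)] + [[10.0 + 0.1 * i, 1.0] for i in range(10)]
y = ["A"] * 10 + ["B"] * 10
cv_error = repeated_holdout_error(X, y)
accuracy = 1.0 - cv_error  # accuracy = 1 - CV error, as in the text
```

Fixing the random seed makes the estimate reproducible; with real data the spread of the per-repetition errors is also worth reporting alongside the mean.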

Table 3 Best QDA classifier performance for all seven bacteria species under study, with features selected by SNR

| Classifier | Error | Sensitivity | Specificity |
|---|---|---|---|
| QDA with 18 features selected by SNR | 0.857% | 0.9820 | 0.9933 |
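The one-versus-rest averaging of sensitivity and specificity described in Section 6.5 can be sketched as below. The function name and the toy labels are hypothetical and serve only to show the arithmetic.

```python
def one_vs_rest_metrics(y_true, y_pred):
    """Return (mean sensitivity, mean specificity) averaged over all classes.

    Each class in turn is treated as the case class and all others as control.
    """
    classes = sorted(set(y_true))
    sens, spec = [], []
    for c in classes:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == c and p == c for t, p in pairs)
        fn = sum(t == c and p != c for t, p in pairs)
        tn = sum(t != c and p != c for t, p in pairs)
        fp = sum(t != c and p == c for t, p in pairs)
        sens.append(tp / (tp + fn))
        spec.append(tn / (tn + fp))
    return sum(sens) / len(classes), sum(spec) / len(classes)


# Toy example: one pattern of species A is misclassified as B.
y_true = ["A", "A", "B", "B", "C", "C"]
y_pred = ["A", "B", "B", "B", "C", "C"]
mean_sensitivity, mean_specificity = one_vs_rest_metrics(y_true, y_pred)
```

Here the per-class sensitivities are 0.5, 1 and 1 and the per-class specificities are 1, 0.75 and 1, so the averages are 2.5/3 and 2.75/3 respectively.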

In the HTS, the detection capability of the best classifier (SVM-RBF) was evaluated in a setting where a sample is represented by multiple instances (multiple colonies) present on a plate. However, no true multiple-instance learning algorithms were used in this approach. To evaluate the classifier performance, an extended version of the cross-validation procedure was designed. For each of the tested detection tasks, 100 virtual plates containing random combinations of colonies were generated in silico. Of these 100 plates, the first 50 contained one or more colonies representing the class of interest and the other 50 did not. The test was repeated six times for each species of interest. In this setting, a sample (represented by a single plate) was considered positive if at least one colony in that sample belonged to the class of interest, and negative if all colonies on the plate were something other than the target. Separate accuracy scores were computed for each classification system, individually optimized for detection of each of the classes [30].

The results demonstrate very high sensitivity (96–100%) and specificity (100%) of the proposed method. Of the 50 plates containing one or more colonies of V. cholerae and the 50 plates containing V. vulnificus, all were tagged as positive for V. cholerae and V. vulnificus, respectively, in each of the six repeats. In the case of V. parahaemolyticus, 96% of the contaminated plates were detected as positive. In all cases the plates that did not contain contamination were correctly labeled as negative [30].
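The plate-level decision rule (a plate is positive if at least one colony is assigned to the target class) can be illustrated with a short sketch. It is not the HTS implementation: the function names, the plate size of five colonies, the "other" label and the toy colony records are assumptions; only the 50/50 split of 100 virtual plates follows the text.

```python
import random


def plate_is_positive(predicted_labels, target):
    """A plate is called positive if at least one colony is predicted as the target."""
    return any(label == target for label in predicted_labels)


def make_virtual_plates(colonies, target, n_plates=100, per_plate=5, seed=0):
    """Build plates in silico from (true_label, predicted_label) colony records.

    The first half of the plates contains at least one target colony, the
    second half contains none. per_plate is an assumed plate size.
    """
    rng = random.Random(seed)
    targets = [c for c in colonies if c[0] == target]
    others = [c for c in colonies if c[0] != target]
    plates = []
    for k in range(n_plates):
        if k < n_plates // 2:
            plate = [rng.choice(targets)] + [rng.choice(colonies) for _ in range(per_plate - 1)]
        else:
            plate = [rng.choice(others) for _ in range(per_plate)]
        plates.append(plate)
    return plates


def plate_level_metrics(plates, target):
    """Return (sensitivity, specificity) computed over whole plates."""
    tp = fn = tn = fp = 0
    for plate in plates:
        truly_positive = any(true == target for true, _ in plate)
        called_positive = plate_is_positive([pred for _, pred in plate], target)
        if truly_positive and called_positive:
            tp += 1
        elif truly_positive:
            fn += 1
        elif called_positive:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)


# Toy colony records with a perfect colony-level classifier (illustrative only).
colonies = [("V. cholerae", "V. cholerae")] * 3 + [("other", "other")] * 3
plates = make_virtual_plates(colonies, "V. cholerae")
sensitivity, specificity = plate_level_metrics(plates, "V. cholerae")
```

Note that under this rule a single false-positive colony makes the whole plate positive, so plate-level specificity is more sensitive to colony-level errors than plate-level sensitivity is.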

Table 4 Performance assessment of the best classifier, SVM-RBF (support vector machine with radial-basis function kernel). "Mixture" denotes colonies of Vibrio alginolyticus, Vibrio anguillarum, Vibrio mimicus and Vibrio orientalis. AUC: area under the receiver operating curve [30]

| Class | Sensitivity | Specificity | Accuracy | AUC |
|---|---|---|---|---|
| Mixture | 0.96 | 0.99 | 0.98 | 1.0 |
| Vibrio cholerae | 0.94 | 0.99 | 0.98 | 1.0 |
| Vibrio parahaemolyticus | 0.97 | 0.96 | 0.97 | 1.0 |
| Vibrio vulnificus | 0.99 | 1.0 | 1.0 | 1.0 |

7. Conclusions

Fresnel diffraction patterns of bacterial colonies exhibit unique spatial structures that can be used for microbial identification. The BIS method makes it possible to construct easily interpretable features based on the morphological and textural properties of the diffraction patterns of bacterial colonies and to verify, by means of a statistical analysis workflow, how well they differentiate bacteria species. Using morphological and textural features for bacteria identification resulted in models of very high efficiency. With these features and the feature selection algorithms, an identification error as small as 0.857% was obtained for the QDA classifier in the BIS method.

Models of very good quality can be built from a small number of features based on the statistical moments calculated within rings. This suggests that constructing new features that differentiate bacteria species very well, e.g. features related to modifications of the optical system, could lead to models based on as few as 5 or 6 features with an error of less than 1%. This would make visually assisted identification with the BIS method fully interpretable by its users: in a laboratory, the BIS method could show the user additional information about the similarities and differences between the sample being identified and selected samples from the database. Restricting the number of features to only a few would keep the number of such hints small enough for a human to handle.

Both presented methods are open to the identification of new mutations of existing species within just hours and are therefore suitable for fast and reliable diagnosis. They are also inexpensive and require little laboratory labour. Optical methods for microbial identification are developing rapidly and will soon be an important tool in every microbiological laboratory.

Acknowledgements Support from the European Union under the European Social Fund is gratefully acknowledged.

References

[1] S. G. Amyes, “The rise in bacterial resistance is partly because there have been no new classes of antibiotics since the 1960s.,” BMJ (Clinical research ed.) 320, 199–200 (2000).

[2] K. Christen, “Bioterrorism and waterborne pathogens: how big is the threat?,” Environmental science & technology 35, 396A–397A, ACS Publications (2001).

[3] A. S. Colsky, R. S. Kirsner, and F. A. Kerdel, “Analysis of antibiotic susceptibilities of skin wound flora in hospitalized dermatology patients. The crisis of antibiotic resistance has come to the surface.,” Archives of dermatology 134, 1006–1009, Am Med Assoc (1998).

[4] M. E. A. de Kraker, V. Jarlier, J. C. M. Monen, O. E. Heuer, N. van de Sande, and H. Grundmann, “The changing epidemiology of bacteraemias in Europe: trends from the European Antimicrobial Resistance Surveillance System.,” Clinical microbiology and infection: the official publication of the European Society of Clinical Microbiology and Infectious Diseases, 1–9 (2012) [doi:10.1111/1469-0691.12028].

[5] C. Dennis, “The bugs of war.,” Nature 411, 232–235 (2001) [doi:10.1038/35077161].

[6] S. E. Gould, “Underground Network,” Scientific American 307, 28–28 (2012) [doi:10.1038/scientificamerican1012-28a].

[7] S. B. Levy and B. Marshall, “Antibacterial resistance worldwide: causes, challenges and responses,” Nature medicine 10, S122–9, Nature Publishing Group (2004) [doi:10.1038/nm1145].

[8] A. A. M. Nicol, C. Hurrell, W. McDowall, K. Bartlett, and N. Elmieh, “Communicating the risks of a new, emerging pathogen: the case of Cryptococcus gattii,” Risk 28, 373–386, Wiley Online Library (2008).

[9] A. Suchwalko, I. Buzalewicz, A. Wieliczko, and H. Podbielska, “Bacteria species identification by the statistical analysis of bacterial colonies Fresnel patterns,” Optics Express 21, 11322 (2013) [doi:10.1364/OE.21.011322].

[10] J. Robinson, B. Rajwa, E. Bae, V. Patsekin, A. Roumani, A. Bhunia, J. Dietz, V. Davisson, M. M. Dundar, et al., “Using Scattering to Identify Bacterial Pathogens,” Optics and Photonics News 22, 20–27 (2011) [doi:10.1364/OPN.22.10.000020].

[11] J. Auger, K. Aptowicz, R. Pinnick, and Y. Pan, “Angularly resolved light scattering from aerosolized spores: observations and calculations,” Optics letters 32, 3358–3360 (2007).

[12] A. C. Samuels, A. P. Snyder, D. K. Emge, D. S. T. Amant, J. Minter, M. Campbell, and A. Tripathi, “Classification of select category A and B bacteria by Fourier transform infrared spectroscopy,” Applied spectroscopy 63, 14–24, Society for Applied Spectroscopy (2009).

[13] A. Manninen, M. Putkiranta, A. Rostedt, J. Saarela, T. Laurila, M. Marjamäki, J. Keskinen, and R. Hernberg, “Instrumentation for measuring fluorescence cross sections from airborne microsized particles,” Applied optics 47, 110–115, Optical Society of America (2008).

[14] S. Morel, N. Leone, P. Adam, and J. Amouroux, “Detection of bacteria by time-resolved laser-induced breakdown spectroscopy,” Applied optics 42, 6184–6191, Optical Society of America (2003).

[15] A. Alimova, A. Katz, P. Gottlieb, and R. Alfano, “Proteins and dipicolinic acid released during heat shock activation of Bacillus subtilis spores probed by optical spectroscopy,” Applied optics 45, 445–450 (2006).

[16] G. Faris and R. Copeland, “Spectrally resolved absolute fluorescence cross sections for bacillus spores,” Applied optics 36, 958–967 (1997).

[17] R. T. Noble, S. B. Weisberg, and others, “A review of technologies for rapid detection of bacteria in recreational waters,” Journal of water and health 3, 381–392 (2005).

[18] S. J. Mechery, X. J. Zhao, L. Wang, L. R. Hilliard, A. Munteanu, and W. Tan, “Using bioconjugated nanoparticles to monitor E. coli in a flow channel,” Chemistry–An Asian Journal 1, 384–390, Wiley Online Library (2006).

[19] W. Lian, S. A. Litherland, H. Badrane, W. Tan, D. Wu, H. V Baker, P. A. Gulig, D. V Lim, and S. Jin, “Ultrasensitive detection of biomolecules with fluorescent dye-doped nanoparticles.,” Analytical biochemistry 334, 135–144 (2004) [doi:10.1016/j.ab.2004.08.005].

[20] J. Homola, J. Dostálek, S. Chen, A. Rasooly, S. Jiang, and S. S. Yee, “Spectral surface plasmon resonance biosensor for detection of staphylococcal enterotoxin B in milk,” International Journal of Food Microbiology 75, 61–69, Elsevier (2002) [doi:10.1016/S0168-1605(02)00010-7].

[21] A. Subramanian and J. Irudayaraj, “A mixed self-assembled monolayer-based surface plasmon immunosensor for detection of E. coli O157: H7,” Biosensors and Bioelectronics 21, 998–1006 (2006).

[22] E. Bae, A. Aroonnual, A. K. Bhunia, and E. D. Hirleman, “On the sensitivity of forward scattering patterns from bacterial colonies to media composition,” Journal of Biophotonics 4, 236–243 (2011) [doi:10.1002/jbio.201000051].

[23] E. Bae, N. Bai, A. Aroonnual, J. P. Robinson, A. K. Bhunia, and E. D. Hirleman, “Modeling light propagation through bacterial colonies and its correlation with forward scattering patterns,” Journal of Biomedical Optics 15, 045001 (2010) [doi:10.1117/1.3463003].

[24] E. Bae, P. P. Banada, K. Huff, A. K. Bhunia, J. P. Robinson, and E. D. Hirleman, “Biophysical modeling of forward scattering from bacterial colonies using scalar diffraction theory.,” Applied optics 46, 3639–3648 (2007).

[25] E. Bae, P. P. Banada, K. Huff, A. K. Bhunia, J. P. Robinson, and E. D. Hirleman, “Analysis of time-resolved scattering from macroscale bacterial colonies,” Journal of Biomedical Optics 13, 014010 (2008) [doi:10.1117/1.2830655].

[26] E. Bae, V. Patsekin, B. Rajwa, A. K. Bhunia, C. Holdman, V. J. Davisson, E. D. Hirleman, and J. P. Robinson, “Development of a microbial high-throughput screening instrument based on elastic light scatter patterns.,” The Review of scientific instruments 83, 044304 (2012) [doi:10.1063/1.3697853].

[27] P. P. Banada, S. Guo, B. Bayraktar, E. Bae, B. Rajwa, J. P. Robinson, E. D. Hirleman, and A. K. Bhunia, “Optical forward-scattering for detection of Listeria monocytogenes and other Listeria species.,” Biosensors & bioelectronics 22, 1664–1671 (2007) [doi:10.1016/j.bios.2006.07.028].

[28] P. P. Banada, K. Huff, E. Bae, B. Rajwa, A. Aroonnual, B. Bayraktar, A. Adil, J. P. Robinson, E. D. Hirleman, et al., “Label-free detection of multiple bacterial pathogens using light-scattering sensor.,” Biosensors & bioelectronics 24, 1685–1692 (2009) [doi:10.1016/j.bios.2008.08.053].

[29] B. Bayraktar, P. P. Banada, E. D. Hirleman, A. K. Bhunia, J. P. Robinson, and B. Rajwa, “Feature extraction from light-scatter patterns of Listeria colonies for identification and classification.,” Journal of biomedical optics 11, 34006 (2006) [doi:10.1117/1.2203987].

[30] K. Huff, A. Aroonnual, A. E. F. Littlejohn, B. Rajwa, E. Bae, P. P. Banada, V. Patsekin, E. D. Hirleman, J. P. Robinson, et al., “Light-scattering sensor for real-time identification of Vibrio parahaemolyticus, Vibrio vulnificus and Vibrio cholerae colonies on solid agar plate.,” Microbial biotechnology 5, 607–620 (2012) [doi:10.1111/j.1751-7915.2012.00349.x].

[31] B. Rajwa, M. M. Dundar, F. Akova, A. Bettasso, V. Patsekin, E. D. Hirleman, A. K. Bhunia, and J. P. Robinson, “Discovering the unknown: detection of emerging pathogens using a label-free light-scattering system.,” Cytometry. Part A : the journal of the International Society for Analytical Cytology 77, 1103–1112 (2010) [doi:10.1002/cyto.a.20978].

[32] I. Buzalewicz, A. Wieliczko, and H. Podbielska, “Influence of various growth conditions on Fresnel diffraction patterns of bacteria colonies examined in the optical system with converging spherical wave illumination,” Optics Express 19, 21768–21785 (2011).

[33] I. Buzalewicz, K. Wysocka-Król, and H. Podbielska, “Exploiting of optical transforms for bacteria evaluation in vitro,” in SPIE, p. 7371_1H (2009).

[34] I. Buzalewicz, K. Wysocka-Król, and H. Podbielska, “Image processing guided analysis for estimation of bacteria colonies number by means of optical transforms,” Optics Express 18, 12992–13005 (2010).

[35] I. Buzalewicz, K. Wysocka-Król, K. Kowal, and H. Podbielska, “Evaluation of antibacterial agents efficiency,” in Information Technologies in Biomedicine 2, pp. 341–351 (2010).

[36] H. Podbielska, I. Buzalewicz, and A. Suchwalko, “Bacteria Classification by Means of the Statistical Analysis of Fresnel Diffraction Patterns of Bacteria Colonies,” Biomedical Optics (BIOMED) (2012).

[37] A. Suchwalko, I. Buzalewicz, and H. Podbielska, “Computer-based classification of bacteria species by analysis of their colonies Fresnel diffraction patterns,” Proceedings of SPIE 11, 82120R (2012) [doi:10.1117/12.907420].

[38] A. Suchwalko, I. Buzalewicz, and H. Podbielska, “Identification of bacteria species by using morphological and textural properties of bacterial colonies diffraction patterns,” in Proceedings of SPIE, F. Remondino, M. R. Shortis, J. Beyerer, and F. Puente León, Eds., p. 87911M–87911M–7 (2013) [doi:10.1117/12.2020337].

[39] R. M. Rangayyan, Biomedical Image Analysis, in Synthesis Lectures on Image Video and Multimedia Processing 2, M. R. Neuman, Ed., CRC Press (2005) [doi:10.1109/IEMBS.2009.5332535].

[40] R. C. Gonzalez, R. E. Woods, and B. R. Masters, “Digital Image Processing, Third Edition,” Journal of Biomedical Optics 14, 029901, Prentice Hall (2009) [doi:10.1117/1.3115362].

[41] M. D. Abràmoff, P. J. Magalhães, and S. J. Ram, "Image processing with ImageJ," Biophotonics international 11, 36–42 (2004).
