
Icarus 221 (2012) 208–225


Hybridization of partial least squares and neural network models for quantifying lunar surface minerals

Shuai Li a,b,⇑, Lin Li a, Ralph Milliken b, Kaishan Song a

a Department of Earth Sciences, Indiana University–Purdue University Indianapolis, IN 46202, United States
b Department of Geological Sciences, Brown University, RI 02912, United States

Article info

Article history:
Received 3 May 2012
Revised 23 July 2012
Accepted 25 July 2012
Available online 9 August 2012

Keywords:
Moon
Data reduction techniques
Moon, Surface
Regoliths

0019-1035/$ - see front matter. Published by Elsevier Inc.
http://dx.doi.org/10.1016/j.icarus.2012.07.023

⇑ Corresponding author at: Department of Earth Sciences, Indiana University–Purdue University Indianapolis, IN 46202, United States. Fax: +1 317 274 7966.
E-mail addresses: [email protected], njushuaili@gmail.com (S. Li).

Abstract

The goal of this study is to develop an efficient and accurate model for using visible–near infrared reflectance spectra to estimate the abundance of minerals on the lunar surface. Previous studies using partial least squares (PLS) and genetic algorithm–partial least squares (GA–PLS) models for this purpose revealed several drawbacks. PLS has two limitations: (1) redundant spectral bands cannot be removed effectively and (2) nonlinear spectral mixing (i.e., intimate mixtures) cannot be accommodated. Incorporating GA into the model is an effective way of selecting a set of spectral bands that are the most sensitive to variations in the presence/abundance of lunar minerals and to some extent overcomes the first limitation. Because GA–PLS is still subject to the effect of nonlinearity, here we develop and test a hybrid partial least squares–back propagation neural network (PLS–BPNN) model to determine the effectiveness of BPNN for overcoming the two limitations simultaneously. BPNN takes nonlinearity into account with sigmoid functions, and the weights of redundant spectral bands are significantly decreased through the back propagation learning process. PLS, GA–PLS and PLS–BPNN are tested with the Lunar Soil Characterization Consortium (LSCC) dataset, which includes VIS–NIR reflectance spectra and mineralogy for various soil size fractions, and the accuracy of the models is assessed based on R² and root mean square error values. The PLS–BPNN model is further tested with 12 additional Apollo soil samples. The results indicate that: (1) PLS–BPNN exhibits the best performance compared with PLS and GA–PLS for retrieving abundances of minerals that are dominant on the lunar surface; (2) PLS–BPNN can overcome the two limitations of PLS; (3) PLS–BPNN has the capability to accommodate spectral effects resulting from variations in particle size. By analyzing PLS beta coefficients, spectral bands selected by GA, and the loading curve of the latent variable with the largest weight in PLS–BPNN, we conclude that spectral information incorporated into the three models is directly derived from the diagnostic absorption bands associated with the individual minerals. It is concluded that the PLS–BPNN model should be applicable to both Clementine UV–VIS–NIR and Moon Mineralogy Mapper (M3) data.

Published by Elsevier Inc.

1. Introduction

Accurate knowledge of the diversity, distribution, and abundances of minerals on the lunar surface is a critical aspect of investigations into the origin and geologic evolution of the Moon. Such compositional information can be used to examine the ‘magma ocean’ model of lunar crust formation and structural evolution, basaltic volcanism, impact crater/basin formation and ejecta emplacement, and lunar soil evolution and mixing mechanisms (Cahill et al., 2009; Elkins-Tanton et al., 2002; Longhi, 1978; Pieters, 1986; Pieters et al., 1993, 1997; Solomon and Longhi, 1977; Taylor, 1979; Tompkins and Pieters, 1999; Warren, 1990; Wood, 1975). In addition, variations in the chemical and mineralogical composition of lunar soils can inform us on the effects of space weathering (Anand et al., 2004; McKay et al., 1974; Noble et al., 2001; Starukhina and Shkuratov, 2001; Taylor et al., 2003, 2010), and accurate global maps of lunar mineral composition and abundance are necessary for the exploration of mineral resources on the Moon.

Remotely acquired visible–near infrared (VIS–NIR) reflectance spectra have been used to estimate the abundance of minerals on the lunar surface. Commonly used approaches include the Gaussian model (GM), modified Gaussian model (MGM) (Noble et al., 2006; Sunshine and Pieters, 1993; Sunshine et al., 1990; Tsuboi et al., 2010), multiple linear regression model (MLR) (Pieters et al., 2006; Shkuratov et al., 2005a,b, 2007, 2003), principal component regression (PCR) (Pieters et al., 2002), partial least squares (PLS) regression (Li, 2006), genetic algorithm–partial least squares (GA–PLS) (Li and Li, 2010), artificial neural network (ANN) (Korokhin et al., 2008), spectral mixture analysis (SMA) (Combe et al., 2010; Li and Mustard, 2000, 2003) and radiative transfer models (Hapke, 1981; Shkuratov et al., 1999). Among these approaches, GM and MGM can be applied to hyperspectral data, whereas the remaining methods can be applied to both hyper- and multi-spectral data. Radiative transfer methods and ANN are nonlinear models, whereas the remainder (MLR, PCR, PLS, GA–PLS and SMA) are linear.

Fig. 1. The range of modal abundances of major minerals in the Lunar Soil Characterization Consortium (LSCC) samples. The models are trained by samples within this range, thus their predictive capability is most accurate when applied to samples that also fall within this range.

These models aim to provide quantitative information related to lunar surface composition, but planetary geologists and spectroscopists currently lack efficient models to derive accurate estimates of mineral abundances on the lunar surface. The MGM is a common method for fitting mineral absorption features of lunar reflectance spectra, but it cannot uniquely identify or predict the abundances of minerals that lack absorption features at VIS–NIR wavelengths (e.g., minerals that lack transition elements, such as iron-free plagioclase or pure forsterite). Indeed, previous studies that examined this technique for lunar soils showed that it performed well only for the prediction of pyroxenes and olivine (Moroz and Arnold, 1999; Noble et al., 2006; Sunshine and Pieters, 1993). Although MLR is easy to perform and effective for estimating abundances of pyroxenes and agglutinates, its performance can be largely degraded due to co-linearity between adjacent wavelengths (spectral channels) (Pieters et al., 2006; Shkuratov et al., 2005a,b, 2007, 2003). Previous studies that focused on PLS, GA–PLS and PCR to interpret reflectance spectra relied on the assumption of linear relationships between reflectance spectra and a lunar compositional component, but these models have not yet been examined in detail to consider nonlinear relationships between spectra and a constituent of interest (Li, 2006; Li and Li, 2010; Pieters et al., 2002). SMA can perform well in quantification of lunar mineral abundances, but the applicability of this model to remotely acquired data on a global scale is limited because its effectiveness depends on a priori knowledge of the ‘pure’ mineral or spectral endmembers (Li and Mustard, 2000, 2003). Finally, radiative transfer models have limitations in that they are computationally intensive and perform best for ‘fresh’ lunar samples that have experienced minimal space weathering (Li and Li, 2011; Lucey, 2004). In summary, numerous models exist for constraining the mineralogy of the lunar surface, but the limitations associated with each of these models constrain our ability to maximize the science return from existing and future VIS–NIR reflectance data of the Moon.

This study aims to improve the PLS performance for mapping lunar surface mineralogy with hyper- and multi-spectral VIS–NIR reflectance data. To fulfill this objective, a hybrid partial least squares regression and back propagation neural network (PLS–BPNN) model is proposed and evaluated for its effectiveness in estimating the abundances of typical lunar minerals and glasses (agglutinate, pyroxene, plagioclase, olivine, ilmenite and volcanic glass). Hybridization of BPNN with PLS accommodates possible nonlinearity between reflectance and the abundance of a lunar compositional constituent; the effectiveness of PLS–BPNN is evaluated by comparing the PLS–BPNN derived abundances with those derived by the PLS and GA–PLS models. In order to identify the spectral information used in each model for compositional estimation we evaluate three model results: the PLS beta coefficients, the spectral bands that are selected by GA, and the loading curve for the latent variable with the largest weight in PLS–BPNN. The Lunar Soil Characterization Consortium (LSCC) dataset (Taylor et al., 2000, 1999, 2003) is used to calibrate and validate the three models. To further test the PLS–BPNN model, two more samples from each Apollo landing site are examined (total of 12 additional samples). This new algorithm provides a fast and non-destructive technique to assess bulk mineralogy and chemistry for other lunar samples by measuring their reflectance spectra, and it is suitable for both hyper- and multi-spectral data (e.g., Moon Mineralogy Mapper (M3) and Clementine, respectively) to generate spatial distribution maps of major minerals on the lunar surface.

2. Methods

2.1. Spectral datasets

2.1.1. Lunar Soil Characterization Consortium (LSCC) dataset

Reflectance spectra and mineral abundances of lunar soils from the LSCC dataset were used to test our PLS–BPNN model. The LSCC dataset includes 9 unique mare and 10 unique highland soil samples (Taylor et al., 2000, 1999, 2010, 2003), which provide the only ‘ground truth’ data of lunar soils consisting of both VIS–NIR reflectance spectra and mineral abundances. The abundance range for each of the dominant minerals in these soils is presented in Fig. 1. For each soil sample, the LSCC dataset consists of four particle size fractions (<10 μm, 10–20 μm, 20–45 μm, and <45 μm). Reflectance spectra for all samples were measured in the Keck/NASA Reflectance Experiment LABoratory (RELAB) at Brown University using a visible–near infrared bi-directional spectrometer (Pieters, 1983). For this study we have resampled these spectra to the bands used in the global mode resolution of the M3 instrument (73 bands, spanning a wavelength range of ~460–2500 nm) for direct relevance to M3 data. The corresponding modal mineral abundances reported by the LSCC (determined by X-ray mapping using an energy-dispersive spectrometer on an electron microprobe (Taylor et al., 1996)) were used for each size fraction; modal mineralogy for bulk soils (<45 μm) was not measured, so it was estimated as the average of the other three subgroups (size fractions) (Li, 2006). Though the relative masses of the different size fractions are not known, the fact that the measured chemical compositions of bulk soils are very similar to the average chemical compositions of the three subgroups provides some basis for this assumption regarding the mineralogy (Fig. 2).

For the results and discussion below, soil samples in the <10 μm, 10–20 μm, and 20–45 μm size fractions were used as the calibration or ‘training’ dataset for the model (57 samples in total) and spectra of the ‘bulk’ soil samples (<45 μm) were used as the validation data (19 samples in total). We also tested the models by using randomly selected samples for the calibration and validation datasets (e.g., 70% of the total samples were selected randomly as the calibration dataset while the remaining 30% were used as the validation dataset). The results for this random selection process showed no significant differences from the results presented here for each given model. However, in order to directly compare the performance of the three methods, both the calibration and validation datasets should be consistent as inputs between the three models. Therefore, in this paper we focus on the results for the case in which the training and validation datasets are predefined and simply acknowledge that a random selection of training/validation data yields similar results for the samples examined in this study.

Fig. 2. Comparison of measured chemical compositions of LSCC bulk soils with those derived by taking an average of the chemical composition for the three subgroups (size fractions). The average chemical compositions of the size fractions are similar to the chemical compositions of the bulk soils, suggesting the same may be true for mineral abundances, even though the relative mass fractions of the size separates are not known.

2.1.2. Other Apollo samples

In addition to calibration and validation of the PLS–BPNN model with the LSCC dataset, an independent test to further evaluate the performance of PLS–BPNN was conducted by applying the model to reflectance spectra of 12 additional Apollo soil samples that are unrelated to the LSCC dataset (two samples from each landing site; their spectra are resampled to M3 resolution). However, only chemical data, not modal mineralogy, are available for these samples (Table 1) (Morris et al., 1983). Thus, the predicted mineral abundances must be converted to bulk chemistry, and the conversion was accomplished by multiplying the abundance of each mineral by the assumed chemical composition of that mineral. Considering the variation in chemical compositions of minerals from the mare and highlands, we choose to treat the compositions of mare and highland minerals separately. The average chemical compositions of mare minerals are estimated to be equivalent to those of minerals in the finest size fraction (<10 μm) of high-Ti mare soils (Taylor et al., 1996) (Table 2). The average chemical compositions of highland minerals are estimated to be equivalent to those of minerals in the finest size fraction of highland soils (Taylor et al., 2010) (Table 3). However, the range of bulk chemical compositions for lunar soils can vary depending on the size fraction that is being examined. The <10 μm fraction, for example, can be enriched in Al2O3, CaO, Na2O, and K2O but depleted in MgO, FeO, and MnO relative to the bulk soil (Papike et al., 1982). Therefore, estimating the chemical compositions for samples outside of the size fraction on which our assumption was based (i.e., samples larger than the finest fraction) can lead to large (and unquantifiable) uncertainties (e.g., overestimating Al2O3, CaO, Na2O, and K2O and underestimating MgO, FeO, and MnO).

Table 1
Additional (non-LSCC) Apollo soil samples whose reflectance spectra were measured in RELAB at Brown University and for which the bulk chemical composition is known. Note that chemical compositions were derived from the <1 mm size fraction.

Sample name    Spectra label    Particle size (μm)    Chemical compositions (b)
                                                      SiO2     TiO2    Al2O3    MgO      CaO      FeO
10084,896J     LS-JBA-006-P6    0–250                 41.00    7.30    12.80     9.20    12.40    16.20
10084,896K     LS-JBA-006-P7    0–250                 41.00    7.30    12.80     9.20    12.40    16.20
12042,41C      LS-JBA-036-P1    0–250                 45.70    2.71    13.00    10.40    10.60    16.20
12042,41D      LS-JBA-036-P2    0–250                 45.70    2.71    13.00    10.40    10.60    16.20
14148,40R      LS-JBA-081-P     0–1000                48.50    1.71    17.38     9.66    10.40    10.55
14259,13       LS-JBA-091-P1    0–250                 48.16    1.73    17.60     9.26    11.25    10.41
15061,40C (a)  LS-JBA-110-P1    0–250
15021,114D     LS-JBA-105-P2    0–180                 46.56    1.75    13.73    10.37    10.54    15.21
60601,18R      LS-JBA-151-P     0–250                 45.35    0.60    26.75     6.27    15.46     5.49
61241,26C      LS-JBA-155-P1    0–180                 45.32    0.57    27.15     5.75    15.69     5.33
71041,11       LS-JBA-203       0–250                 39.74    9.57    10.80     9.72    10.72    17.73
73121,19C      LS-JBA-218-P1    0–180                 45.56    1.39    21.23     9.73    12.82     8.45

(a) Compositions are unavailable.
(b) Data are from Handbook of Lunar Soils (Morris et al., 1983).

Table 2
Average chemical compositions of minerals in lunar mare high Ti soils (<10 μm) from Table 4 in Taylor et al. (2010).

         Agglutinate    Pyroxene    Plagioclase    Olivine    Ilmenite    Volcanic glass
SiO2     44.28          50.25       46.11          37.32       0.08       38.43
TiO2      3.05           1.14        0.07           0.16      52.10        9.30
Al2O3    16.24           1.55       33.09           0.19       0.22        6.57
MgO       9.68          16.41        0.17          36.51       2.25       12.64
CaO      12.97           8.28       17.76           0.25       0.15        8.01
FeO      11.26          20.51        0.35          24.62      42.37       21.78

Table 3
Average chemical compositions of minerals in lunar highland soils (<10 μm) from Table 9 in Taylor et al. (2010).

         Agglutinate    Pyroxene    Plagioclase    Olivine    Ilmenite    Volcanic glass
SiO2     47.67          53.34       47.18          39.05       0.07       72.72
TiO2      1.24           0.95        0.05           0.08      55.28        0.46
Al2O3    22.94           1.32       36.14           0.07       0.11       13.49
MgO       8.09          17.14        0.07          36.25       3.25        0.35
CaO      13.99           9.77       19.49           0.16       0.23        2.52
FeO       8.21          20.77        0.14          28.64      43.46        3.62

Table 4
List of reflectance spectra of lunar minerals used in the three models. All data are from RELAB at Brown University.

Mineral                    Spectra label
Agglutinate                LU-CMP-007-1
Clinopyroxene              LS-CMP-009
Orthopyroxene              LS-CMP-012
Olivine                    LR-CMP-014
Ilmenite                   MR-MSR-006
Plagioclase                LS-CMP-086
Volcanic glass (black)     DD-MDD-030
Volcanic glass (orange)    LR-CMP-051

Fig. 3. Reflectance spectra of major minerals on the lunar surface; see Table 1 for additional information.
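The conversion from predicted mineral abundances to bulk chemistry described in Section 2.1.2 amounts to an abundance-weighted sum of mineral compositions. The following is a minimal sketch of that arithmetic, not the authors' code. It uses the mare values of Table 2; the function name `bulk_chemistry`, the assumption that abundances are given in percent and sum to 100, and the example abundances are all hypothetical.

```python
import numpy as np

OXIDES = ["SiO2", "TiO2", "Al2O3", "MgO", "CaO", "FeO"]
MINERALS = ["agglutinate", "pyroxene", "plagioclase",
            "olivine", "ilmenite", "volcanic glass"]

# Rows follow OXIDES, columns follow MINERALS (values copied from Table 2).
MARE_COMPOSITION = np.array([
    [44.28, 50.25, 46.11, 37.32,  0.08, 38.43],
    [ 3.05,  1.14,  0.07,  0.16, 52.10,  9.30],
    [16.24,  1.55, 33.09,  0.19,  0.22,  6.57],
    [ 9.68, 16.41,  0.17, 36.51,  2.25, 12.64],
    [12.97,  8.28, 17.76,  0.25,  0.15,  8.01],
    [11.26, 20.51,  0.35, 24.62, 42.37, 21.78],
])

def bulk_chemistry(abundances_pct, composition=MARE_COMPOSITION):
    """Weight each mineral's oxide composition by its abundance (in percent)."""
    fractions = np.asarray(abundances_pct, dtype=float) / 100.0
    return composition @ fractions

# Hypothetical predicted abundances (percent), for illustration only.
example = bulk_chemistry([50, 15, 20, 5, 5, 5])
for oxide, value in zip(OXIDES, example):
    print(f"{oxide}: {value:.2f}")
```

For highland samples the same function could be called with a second matrix built from Table 3.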

2.1.3. Spectra of typical lunar minerals and glasses

In this study, the reflectance spectra of minerals that dominate the lunar surface are used as a reference to determine which spectral bands each model relies upon. Reflectance spectra of individual mineral separates were acquired from the online RELAB spectral database at Brown University (Table 4). All mineral separates, with the exception of ilmenite, were lunar in origin. The reflectance spectra for these minerals are presented in Fig. 3 and are shown simply for comparison to the bands (wavelength regions) that are ultimately chosen by the models; the spectra of the mineral separates are not used as inputs to the models.

2.2. Partial least squares (PLS)

PLS is a type of regression method that bears some relation to PCR. In a PLS model, eigenvectors of the independent variables (i.e., spectral bands) are used such that the corresponding scores (latent variables) not only explain the variance of independent variables but also exhibit a high correlation with response variables (i.e., mineral abundances), which is the advantage of PLS over PCR and other linear regression methods (Li, 2006). A simple PLS model consists of two outer relations and one inner relation. The two outer relations result from eigenstructure decompositions of both the matrix containing independent variables (i.e., spectral bands) (steps 1 and 2 in Fig. 4) and the matrix containing response variables (i.e., mineral abundances) (step 3 in Fig. 4), whereas the inner relation links the resultant score matrices from the two eigenstructure decompositions generated by the outer relations (step 4 in Fig. 4) (Geladi and Kowalski, 1986). The final matrix of latent variables is generated through iterations by replacing independent and dependent variables with their residuals (ΔX and ΔY) from the preceding iteration (step 5 in Fig. 4).


The iteration can be stopped either when ΔX or ΔY falls below a chosen threshold or when the number of iterations reaches a set maximum. After training the model, new estimates of the dependent variables can be calculated as:

$$a = S' P' (B P')^{-1} U \tag{1}$$

where $a$ is the vector of estimated mineral abundances; $S$ is the matrix of reflectance spectra after preprocessing (standardizing); $P$ is the weight loading matrix determined during training of the PLS model; $B$ is the latent variable matrix of the PLS model; and $U$ is the latent variable matrix of $y$ (mineral abundances). All variables are preprocessed by standardizing before running the PLS model through the equation (Geladi and Kowalski, 1986)

$$\langle x \rangle_{i,j} = (x_{i,j} - \bar{x}_j)/\sigma_j \tag{2}$$

where $\bar{x}_j$ and $\sigma_j$ are the mean and standard deviation, respectively, of the values of $x$ (the spectral matrix) in the $j$th dimension. Thus the term $P'(BP')^{-1}U$ in Eq. (1) represents the beta coefficients (also called standardized coefficients), which measure the change in the dependent variables (e.g., standardized mineral or chemical abundances) that results from the independent variables (e.g., standardized spectra) (Schroeder et al., 1986). In practical terms, this means that if a spectral band has a larger beta coefficient then it contributes more to the variation of the corresponding mineral or chemical abundances.
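The standardization of Eq. (2) and the idea of per-band beta coefficients can be illustrated with a short sketch. It uses the textbook NIPALS formulation of single-response PLS (weights `W`, loadings `P`, inner coefficients `q`), which is a stand-in and not necessarily the exact matrix arrangement of Eq. (1). The function names, the use of the sample standard deviation (`ddof=1`), and the synthetic data are assumptions made for illustration.

```python
import numpy as np

def standardize(A):
    """Column-wise z-scores as in Eq. (2); also returns the means and deviations."""
    mean = A.mean(axis=0)
    std = A.std(axis=0, ddof=1)  # ddof=1 is an assumption; the text does not specify it
    return (A - mean) / std, mean, std

def pls1_beta(Xs, ys, n_components):
    """NIPALS PLS for one response; returns one beta coefficient per band."""
    X, y = Xs.copy(), ys.copy()
    n_bands = X.shape[1]
    W = np.zeros((n_bands, n_components))
    P = np.zeros((n_bands, n_components))
    q = np.zeros(n_components)
    for a in range(n_components):
        w = X.T @ y
        w /= np.linalg.norm(w)        # weight vector for this latent variable
        t = X @ w                     # scores (the latent variable)
        tt = t @ t
        p = X.T @ t / tt              # X loadings
        q[a] = (y @ t) / tt           # inner relation coefficient
        X -= np.outer(t, p)           # deflate: continue with the residuals
        y -= q[a] * t
        W[:, a], P[:, a] = w, p
    return W @ np.linalg.solve(P.T @ W, q)

# Synthetic stand-in for spectra (20 samples x 5 bands) and one abundance.
rng = np.random.default_rng(0)
spectra = rng.normal(size=(20, 5))
abundance = spectra @ np.array([1.0, -2.0, 0.5, 0.0, 3.0])

Xs, _, _ = standardize(spectra)
ys, y_mean, y_std = standardize(abundance)
beta = pls1_beta(Xs, ys, n_components=5)
predicted = (Xs @ beta) * y_std + y_mean   # undo the standardization of y
```

Bands with larger absolute entries in `beta` contribute more to the predicted abundance, which is the interpretation used in the text.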

2.3. Genetic algorithm–partial least squares (GA–PLS)

GA is a method that searches for the best solutions to optimization problems by mimicking the process of natural evolution. In our application, GA is used to determine and select which subset of wavelengths (spectral bands) is most sensitive to variations in mineral abundance. In a simplified GA model there are five steps (Fig. 5). (1) Encoding: a genetic algorithm works with a population of “chromosomes”. Each chromosome is formed of as many “bits” as the number of wavelengths (spectral bands), and “zeros” or “ones” are randomly assigned to the bits of that chromosome. If a bit is set to one, then the bit (band) is denoted to contain information (in our case this information is assumed to be directly related to mineral or chemical abundance); otherwise, the bit (band) is set to zero and will not be factored into the “evolutionary” computation. During the GA training process the number of “chromosomes” is equal to the number of samples (spectra). (2) Initializing: GA requires an initial random assignment of chromosome bit (band) values (either zero or one) in order to start. (3) Evaluating: each “chromosome” is evaluated based upon a predefined fitness function that the GA is attempting to minimize. The better the fit between the observed and predicted values (mineral or chemical abundances, for our study) for a given “chromosome”, the more likely it is to survive to future generations (iterations). The selection is done with replacement, such that the same chromosome can be selected many times (Forrest, 1993). (4) Crossover: this is the process of producing new “offspring” in the next iteration by transferring bits from “parent” chromosomes in the previous iteration. The parent bit(s) are determined by evaluating which selected bit(s) occurred the most frequently out of those chromosome/spectra combinations that provided a reasonable fit between the predicted and measured mineral/chemical abundances. The goodness of fit was determined by the residuals between predicted and measured values being below a set threshold. The threshold is determined by the desired goodness of fit between measured and modeled abundances (e.g., the user sets the fit to be within ±5% of the measured values; here we used a value very close to zero). (5) Mutation: this operation follows the crossover and simulates the gene change of a chromosome due to random disturbance; those bits that were not determined by parents are randomly assigned a value of zero or one. In the combined GA–PLS model, the fitting function that the GA is attempting to minimize is a simplified PLS model.

Fig. 4. Flowchart of a simplified partial least squares (PLS) model.

The subset of wavelengths for the reflectance spectra that have been identified by the GA as carrying important information, along with the mineral or chemical abundances corresponding to each reflectance spectrum, are then used as the input data for the PLS model (Fig. 5). Therefore, when the trained GA–PLS model is applied to new spectra (e.g., for validation), only reflectance values at the wavelengths selected by the GA are used to estimate mineral/chemical abundances with Eq. (1).
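The encode, initialize, evaluate, crossover and mutate cycle of Section 2.3 can be sketched as a generic binary genetic algorithm. This is a textbook variant, not the authors' implementation: it uses tournament selection with replacement, single-point crossover and elitism instead of the frequency-based crossover described above, and the fitness function is an ordinary least-squares fit scored on held-out samples rather than a PLS model. All names, parameter values and the synthetic data are hypothetical.

```python
import numpy as np

def fitness(mask, X_cal, y_cal, X_val, y_val):
    """Validation RMSE of a least-squares fit that uses only the selected bands."""
    if not mask.any():
        return np.inf
    A_cal = np.column_stack([X_cal[:, mask], np.ones(len(y_cal))])
    coef, *_ = np.linalg.lstsq(A_cal, y_cal, rcond=None)
    A_val = np.column_stack([X_val[:, mask], np.ones(len(y_val))])
    return float(np.sqrt(np.mean((A_val @ coef - y_val) ** 2)))

def ga_select_bands(X_cal, y_cal, X_val, y_val, pop_size=20,
                    n_generations=30, p_mutation=0.05, seed=0):
    rng = np.random.default_rng(seed)
    n_bands = X_cal.shape[1]
    # Steps 1-2: one bit per band, randomly initialized.
    pop = rng.integers(0, 2, size=(pop_size, n_bands)).astype(bool)
    score = lambda m: fitness(m, X_cal, y_cal, X_val, y_val)
    for _ in range(n_generations):
        scores = np.array([score(m) for m in pop])          # Step 3: evaluate
        best = pop[np.argmin(scores)].copy()
        # Selection with replacement: the better of two random chromosomes wins.
        i, j = rng.integers(0, pop_size, size=(2, pop_size))
        parents = np.where((scores[i] <= scores[j])[:, None], pop[i], pop[j])
        # Step 4: single-point crossover between consecutive parents.
        children = parents.copy()
        for k in range(0, pop_size - 1, 2):
            cut = rng.integers(1, n_bands)
            children[k, cut:] = parents[k + 1, cut:]
            children[k + 1, cut:] = parents[k, cut:]
        # Step 5: mutation flips a small random fraction of bits.
        children ^= rng.random(children.shape) < p_mutation
        children[0] = best                                   # keep the best so far
        pop = children
    scores = np.array([score(m) for m in pop])
    return pop[np.argmin(scores)].copy(), float(scores.min())

# Synthetic example: only bands 2 and 7 carry information about y.
rng = np.random.default_rng(1)
X = rng.normal(size=(40, 12))
y = 2.0 * X[:, 2] - 1.5 * X[:, 7] + 0.05 * rng.normal(size=40)
cal, val = slice(0, 28), slice(28, 40)
mask, best_rmse = ga_select_bands(X[cal], y[cal], X[val], y[val])
print("selected bands:", np.flatnonzero(mask), "validation RMSE:", best_rmse)
```

Scoring on held-out samples is a design choice of this sketch: with an in-sample least-squares fit, adding bands never increases the error, so the search would drift toward selecting every band.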

Fig. 5. Flowchart of the genetic algorithm–partial least squares (GA–PLS) model.

Fig. 6. Flowchart of a partial least squares–back propagation neural network (PLS–BPNN) model. LVs: latent variables; hi: hidden nodes; O: output value.


2.4. Partial least squares–back propagation neural network (PLS–BPNN)

Artificial neural networks (ANNs) constitute a class of nonlinear models. Of this class, the back propagation neural network (BPNN) is the most widely used and is capable of complicated multidimensional mapping (Hecht-Nielsen, 1989; Heermann and Khazenie, 1992; Werbos, 1988). A typical BPNN model is composed of many idealized layers of nodes and is specified by the node characteristics (weights), the learning rules (transfer functions, also called sigmoid functions), network interconnection geometry (different layers), and dimensionality (number of layers and nodes). BPNN resembles the human brain in that the model learns and stores knowledge (Mehra and Wah, 1992; Werbos, 1994). This learning feeds back into the model to change the weights of nodes between layers in order to decrease errors between predicted and measured values. Thus, BPNN takes nonlinearity into account using the sigmoid functions that connect the BPNN layers of nodes, and the weights of redundant spectral bands (e.g., adjacent spectral bands) are significantly decreased through the back propagation learning process. After the node weights and sigmoid functions have been determined through the training process, the BPNN model can be used for predictions with new input data.

In this study, we applied a single hidden layer of nodes in order to avoid long computation time. Thus, a three-layer neural network algorithm was used: one input layer, one hidden layer, and one output layer (Fig. 6). Each layer consists of a number of nodes, and every node in each layer is connected to a node of the preceding and following layers (Hecht-Nielsen, 1989; Heermann and Khazenie, 1992; Werbos, 1994). The first layer distributes the input parameters of latent variables (LVs) selected from the PLS model (Fig. 6). The second layer (hidden layer) has a varying number of nodes, where each input parameter is multiplied by its connection’s weights. Here we have set the number of nodes in the hidden layer to be equal to the number of nodes in the input layer (which is equal to the number of samples); too many hidden nodes may slow down the computation. Each node of the third layer receives the output from each node of the second layer, during which it is processed through a function and weighted again. In our model the third layer is simply the output layer, and it consists of a single node that represents the abundance of a given mineral; the training process and weights are determined separately for each mineral phase. The transfer of information between the input and hidden layers is $h_k$ (Fig. 6):

$$h_k = s_1\!\left( \sum_{j=1}^{n} W^{i}_{1,k,j} \cdot \mathrm{LVs}_j + b_1 \right) \tag{3}$$

where $s_1$ is the sigmoid function, and $W^{i}_{1,k,j}$ is the weight of the $j$th node in the input layer connecting with the $k$th node in the hidden layer during the $i$th iteration. Note that the inputs, LVs, are the latent variables selected by PLS, not reflectance spectra; $b_1$ is the bias between the input and hidden layer. The information transferred between the hidden layer and output layer (mineral abundance) is $O$ (Fig. 6):

$$O = s_2\!\left( \sum_{j=1}^{n} W^{i}_{2,j} \cdot h_j + b_2 \right) \tag{4}$$

where $s_2$ is the sigmoid function between the hidden layer and the output layer; $h$ represents the information stored in the nodes in the hidden layer; $W^{i}_{2,j}$ are the matrices containing the weights of all connections between the hidden layer and the output layer; $j$ is the $j$th node; $i$ stands for the $i$th iteration; and $b_2$ is the bias between the hidden and output layer. In this study, three kinds of sigmoid functions (linear, $s = x$; logarithmic, $s = 1/(1+e^{-x})$; and tangential, $s = 2/(1+e^{-2x}) - 1$) are applied, which enables the network to model nonlinear problems commonly present in quantitative remote sensing applications.
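The forward pass of Eqs. (3) and (4) reduces to two matrix products, each followed by a transfer function. The sketch below is illustrative only; the function names, the default choice of transfer functions and the example weights are assumptions. Note that $2/(1+e^{-2x}) - 1$ is identical to $\tanh(x)$ and $1/(1+e^{-x})$ is the logistic function.

```python
import numpy as np

def linear(x):
    return x

def log_sigmoid(x):
    """Logistic function, output in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

def tan_sigmoid(x):
    """Equal to tanh(x), output in (-1, 1)."""
    return 2.0 / (1.0 + np.exp(-2.0 * x)) - 1.0

def forward(lvs, W1, b1, W2, b2, s1=tan_sigmoid, s2=linear):
    """One input vector of latent variables -> hidden activations and output."""
    h = s1(W1 @ lvs + b1)   # Eq. (3): hidden layer, W1 has shape (n_hidden, n_inputs)
    O = s2(W2 @ h + b2)     # Eq. (4): single output node, W2 has shape (n_hidden,)
    return h, O

# Hypothetical weights for a network with 3 latent variables and 3 hidden nodes.
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(3, 3)), np.zeros(3)
W2, b2 = rng.normal(size=3), 0.0
hidden, output = forward(np.array([0.2, -1.0, 0.5]), W1, b1, W2, b2)
```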

During the training of the model the goal is to optimize the weights for every node by minimizing the error between the output ($O$) and the measurements ($M$) (that is, the predicted and the measured mineral or chemical abundances):

$$E = \sum_{m} (M_m - O_m)^2 \tag{5}$$

where the sum runs over the samples $m$. To minimize $E$, the back propagation neural network changes the weights of all nodes through the feedback processes.

$$W^{i+1}_{1,k,j} = W^{i}_{1,k,j} - \alpha\, \delta^{i}_{1,j}\, \mathrm{LVs}_j \tag{6}$$

$$W^{i+1}_{2,j} = W^{i}_{2,j} - \alpha\, \delta^{i}_{2}\, h_j \tag{7}$$

where $\alpha$ is called the learning rate and can be calculated as $\alpha = C_0/(p\,n)$, where $p$ is the total number of patterns, $n$ is the total number of nodes in the network, $C_0 = 10$ is a constant (based on experience and can change) (Heermann and Khazenie, 1992), and $\delta^{i}_{1,j}$ and $\delta^{i}_{2}$ are the node errors between the input and hidden layer and the hidden and output layer, respectively.

$$\delta^{i}_{2} = (M - O)\,O\,(1 - O) \tag{8}$$

$$\delta^{i}_{1,j} = h_j (1 - h_j) \sum_{j=1}^{n} \delta^{i}_{2}\, W^{i}_{2,j} \tag{9}$$

where $i$ stands for the $i$th iteration. Through the iteration process, $E$ is expected to decrease until a predefined criterion is met. Two criteria are defined to signal a stop to the iteration process: the error, $E$, reaches a minimum threshold (set to 5% of the average measurements), or the total number of iterations reaches a limit (set to 100 here). After the iteration process, all the node weights are optimized and the PLS–BPNN model is ready for prediction of mineral abundances with new input reflectance spectra. The prediction process for the PLS–BPNN model is as computationally efficient as for PLS and GA–PLS, requiring only a simple matrix computation with the weight matrix, spectral matrix and sigmoid functions.
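The training loop with its two stopping criteria can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: both layers use the logistic function so that the factors $O(1-O)$ and $h(1-h)$ of Eqs. (8) and (9) apply, targets are scaled to lie between 0 and 1, weights are updated after each pattern, and the learning rate is passed in directly. The sketch adds $\alpha\delta$ times the node input to each weight, which is the descent direction when $\delta_2$ is defined with $(M - O)$; that sign convention, the function name `train_bpnn` and the synthetic data are assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_bpnn(LV, M, n_hidden, alpha=0.5, max_iter=100, tol=None, seed=0):
    """Train a one-hidden-layer network on latent variables LV and targets M."""
    rng = np.random.default_rng(seed)
    n_inputs = LV.shape[1]
    W1 = rng.normal(scale=0.5, size=(n_hidden, n_inputs))
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(scale=0.5, size=n_hidden)
    b2 = 0.0
    if tol is None:
        tol = 0.05 * float(np.mean(M))       # 5% of the average measurement
    errors = []
    for _ in range(max_iter):                # criterion 2: iteration limit
        for x, m in zip(LV, M):
            h = sigmoid(W1 @ x + b1)         # Eq. (3)
            o = sigmoid(W2 @ h + b2)         # Eq. (4)
            d2 = (m - o) * o * (1.0 - o)     # Eq. (8): output node error
            d1 = h * (1.0 - h) * d2 * W2     # Eq. (9): hidden node errors
            W2 += alpha * d2 * h
            b2 += alpha * d2
            W1 += alpha * np.outer(d1, x)
            b1 += alpha * d1
        O = sigmoid(sigmoid(LV @ W1.T + b1) @ W2 + b2)
        errors.append(float(np.sum((M - O) ** 2)))   # Eq. (5)
        if errors[-1] < tol:                 # criterion 1: error threshold
            break
    return (W1, b1, W2, b2), errors

# Synthetic example: 30 samples, 2 latent variables, nonlinear target in (0, 1).
rng = np.random.default_rng(0)
LV = rng.uniform(-1, 1, size=(30, 2))
M = 0.2 + 0.6 * sigmoid(3.0 * LV[:, 0])
weights, errors = train_bpnn(LV, M, n_hidden=2)
print(f"E after first pass: {errors[0]:.3f}, after last pass: {errors[-1]:.3f}")
```

A separate network of this kind would be trained for each mineral, as described in the text.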


To determine which spectral information (bands) contributes most significantly to the estimation of mineral or chemical abundances for each node, the weights for each node are summed. If the sigmoid function is tangential, which is monotone decreasing at negative values and monotone increasing at positive values, the absolute weights are added together. If a node has the largest sum of weights, the spectral information associated with this node contributes the most to that mineral or element. From the loadings of the corresponding latent variable (which were used as inputs to the neural network), we can determine which spectral bands were selected in the PLS–BPNN model.
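The weight-summing step can be illustrated with a few lines. The sketch assumes that the sum is taken over the absolute weights of all connections leaving each input node (each latent variable); the function name and the example weight matrix are hypothetical.

```python
import numpy as np

def most_influential_input(W1):
    """W1 has shape (n_hidden, n_inputs); returns the index of the input node
    with the largest summed absolute weight, and the sums for all inputs."""
    importance = np.abs(W1).sum(axis=0)
    return int(np.argmax(importance)), importance

# Hypothetical input-to-hidden weights for 3 latent variables and 2 hidden nodes.
W1_example = np.array([[0.1, -2.0,  0.3],
                       [0.2,  1.5, -0.1]])
top_lv, importance = most_influential_input(W1_example)
```

The PLS loading vector of the latent variable at index `top_lv` would then be inspected to see which spectral bands it emphasizes.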

2.5. Cross validation

For a PLS model, the selection of latent variables (LVs) is the key step. Too many LVs will lead to ‘over fitting’ problems and too few LVs will degrade the model performance. In this study, we employ a cross-validation method in which a calibration dataset is used to determine the appropriate number of latent variables (LVs). All LVs generated from the PLS model are ranked based upon the percentage of variance in the independent variables (spectral bands) that they explain. These LVs are then added to the PLS model one by one from the highest rank to the lowest to do predictions. Thus, the change and trend in the errors for the predicted values (for both calibration and validation datasets) can be evaluated as the number of LVs increases. The plot of the response (prediction errors) versus the number of LVs in the model can be used to determine the appropriate number of latent variables.

Fig. 7. Model results for agglutinates. Left column: comparison between modeled and measured agglutinate for (a) PLS, (c) GA–PLS and (e) PLS–BPNN. Right column: spectral information used in the models based on (b) beta coefficients derived by PLS (dashed line), (d) spectral bands selected by the GA–PLS model (black squares), and (f) loading of the latent variable with the largest weight in the PLS–BPNN model (dashed line). Black line is the spectrum for agglutinate and n is the number of samples.

2.6. Model evaluation

To evaluate the results of predicted mineral and chemical abundances from the three models discussed above, we examined the root mean square error (RMSE), relative RMSE (rRMSE), and the coefficient of determination ($R^2$) between modeled ($\hat{y}$) and measured ($y$) values.

RMSE and rRMSE between measured and predicted values are defined as:

$$\mathrm{RMSE} = \sqrt{\frac{\sum_i (y_i - \hat{y}_i)^2}{n - 1}} \tag{10}$$

Table 5
Statistical indicators for PLS, GA–PLS and PLS–BPNN for the estimation of the abundance of lunar surface minerals.

                  PLS                      GA–PLS                   PLS–BPNN
                  R²    RMSE  rRMSE (%)    R²    RMSE  rRMSE (%)    R²    RMSE  rRMSE (%)
Agglutinate       0.54  7.53  14.76        0.81  5.49  10.74        0.85  4.79   9.23
Pyroxene          0.68  3.21  30.04        0.81  3.18  29.70        0.93  1.60  14.87
Plagioclase       0.89  5.62  19.97        0.91  5.30  18.20        0.97  3.55  12.61
Olivine           0.32  0.78  33.11        0.55  0.64  27.01        0.85  0.35  14.60
Ilmenite          0.88  1.15  42.45        0.89  1.05  38.82        0.94  0.88  31.55
Volcanic glass    0.64  2.76  73.27        0.69  2.54  64.24        0.96  1.04  29.80


Fig. 8. Model results for pyroxenes. Left column: comparison between modeled and measured pyroxene for (a) PLS, (c) GA–PLS and (e) PLS–BPNN. Right column: spectral information used in the models based on (b) beta coefficients derived by PLS (dashed line), (d) spectral bands selected by the GA–PLS model (black squares), and (f) loading of latent variable with the largest weight in the PLS–BPNN model (dashed line). Black line is the spectrum for lunar clinopyroxene, red line is the spectrum for lunar orthopyroxene, and n is the number of samples.

$$\mathrm{rRMSE} = \frac{\mathrm{RMSE}}{\bar{y}} \times 100 \tag{11}$$

where $n$ is the number of samples and $\bar{y}$ is the mean value of the measurements $y$.

The R² value is calculated as:

$$R^2 = 1 - \frac{SS_{\mathrm{error}}}{SS_{\mathrm{total}}} \tag{12}$$

where $SS_{\mathrm{total}}$ is the total sum of squares of the measured values (proportional to their variance) and $SS_{\mathrm{error}}$ is the sum of squared residuals, defined as:

$$SS_{\mathrm{total}} = \sum_{i=1}^{n} (y_i - \bar{y})^2 \tag{13}$$

$$SS_{\mathrm{error}} = \sum_{i=1}^{n} (\hat{y}_i - y_i)^2 \tag{14}$$
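As an illustration of the evaluation metrics in Eqs. (10)–(14), and of how a validation error curve could drive the latent-variable selection of Section 2.5, a minimal sketch follows. R² is computed in its standard form, 1 − SS_error/SS_total. The selection rule in `select_n_lvs` (the fewest LVs whose validation RMSE lies within a tolerance of the minimum) is an assumption made for this example, since the text only states that the error-versus-LV plot is inspected; all numbers are hypothetical.

```python
import math


def rmse(measured, modeled):
    """Eq. (10): root mean square error with an n - 1 denominator."""
    n = len(measured)
    ss_error = sum((m_hat - m) ** 2 for m, m_hat in zip(measured, modeled))
    return math.sqrt(ss_error / (n - 1))


def rrmse(measured, modeled):
    """Eq. (11): RMSE as a percentage of the mean measured value."""
    mean_y = sum(measured) / len(measured)
    return rmse(measured, modeled) / mean_y * 100.0


def r_squared(measured, modeled):
    """Eqs. (12)-(14): 1 - SS_error / SS_total."""
    mean_y = sum(measured) / len(measured)
    ss_total = sum((m - mean_y) ** 2 for m in measured)
    ss_error = sum((m_hat - m) ** 2 for m, m_hat in zip(measured, modeled))
    return 1.0 - ss_error / ss_total


def select_n_lvs(validation_rmse, tolerance=0.05):
    """Assumed rule: return the smallest number of LVs whose validation
    RMSE is within `tolerance` (a fraction) of the minimum RMSE.

    validation_rmse[k] is the error of the model built with k + 1 LVs.
    """
    best = min(validation_rmse)
    for n_lvs, err in enumerate(validation_rmse, start=1):
        if err <= best * (1.0 + tolerance):
            return n_lvs


# Hypothetical measured and modeled abundances (wt%).
measured = [1.0, 2.0, 3.0, 4.0]
modeled = [1.1, 1.9, 3.2, 3.8]
print(rmse(measured, modeled), rrmse(measured, modeled), r_squared(measured, modeled))

# Hypothetical validation RMSE for models with 1..5 latent variables.
curve = [8.0, 5.5, 4.9, 4.8, 5.6]
n_selected = select_n_lvs(curve)
print(n_selected)
```

Preferring the smallest acceptable number of LVs, rather than the strict minimum of the curve, is one way to guard against the overfitting that too many LVs introduce.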


3. Results and discussion

Once the three models were developed, they were applied to the LSCC reflectance spectra to estimate the abundance of agglutinate, pyroxene, plagioclase, olivine, ilmenite and volcanic glass in those lunar samples, and the predicted mineral abundances were then compared to the measured modal mineralogy for each sample. The PLS–BPNN model was further tested with 12 additional Apollo soil samples.

3.1. Estimation with the LSCC dataset

3.1.1. Agglutinate

Comparisons between modeled and measured agglutinate abundances from the PLS, GA–PLS and PLS–BPNN models are presented in Fig. 7. The corresponding R² values are 0.54, 0.81 and 0.85, respectively. Similarly, RMSE values for the PLS, GA–PLS and PLS–BPNN models are 7.53 (rRMSE = 14.76%), 5.49 (rRMSE = 10.74%) and 4.79 (rRMSE = 9.23%), respectively (Table 5). According to these indicators, the PLS–BPNN model provides the best performance at estimating the abundance of this constituent, whereas the PLS model exhibits the poorest performance.

Fig. 9. Model results for plagioclase. Left column: comparison between modeled and measured plagioclase for (a) PLS, (c) GA–PLS and (e) PLS–BPNN. Right column: spectral information used in the models based on (b) beta coefficients derived by PLS (dashed line), (d) spectral bands selected by the GA–PLS model (black squares), and (f) loading of latent variable with the largest weight in the PLS–BPNN model (dashed line). Black line is the spectrum for lunar plagioclase and n is the number of samples.

Spectra of agglutinates exhibit two weak absorption bands centered near 1000 nm and 1900 nm, with the latter being weaker (Figs. 3 and 7b). However, several other phases are known to exhibit absorption features near 1000 nm (e.g., olivine, pyroxene) and at 1900 nm (e.g., pyroxenes). Agglutinates exhibit much weaker absorption features at these wavelengths than olivine or pyroxene (Fig. 3), but because of this overlap it is best to avoid using spectral bands at 1000 nm and 1900 nm for the estimation of agglutinate. For PLS, spectral bands from 460 to 800 nm and near 1000 nm, 1300 nm, 1900 nm and 2500 nm (Fig. 7b) appear to contribute to the variation of agglutinate abundance. In contrast, the GA–PLS model selects spectral bands near 500 nm, 800 nm and 1300 nm, but not near 1000 nm and 1900 nm (Fig. 7d). In the PLS–BPNN model, the loading curve for the latent variable with the largest weight (Fig. 7f) shows that spectral bands near 500 nm, 800 nm, 1200 nm, 1600 nm and 2300 nm are used and absorptions near 1000 nm and 1900 nm are ignored. This difference in band selection (i.e., avoidance of non-unique bands) may partially explain why the PLS–BPNN and GA–PLS models perform better than the PLS model. The improved performance of the PLS–BPNN model relative to the PLS model may result from the ability of the former to account for nonlinear solutions.

3.1.2. Pyroxene

In Fig. 8, plots a, c and e show the comparison between measured and modeled pyroxene for PLS, GA–PLS and PLS–BPNN, with R² values of 0.68, 0.81 and 0.93, respectively. The regression line for the PLS–BPNN results is much closer to 1:1 than those for the PLS and GA–PLS results. The PLS–BPNN model also yields the lowest RMSE (1.60; rRMSE = 14.87%) (Table 5). Based on these indicators, PLS–BPNN performs the best at estimating pyroxene abundance, whereas PLS performs the worst.

Fig. 10. Model results for olivine. Left column: comparisons between modeled and measured olivine for (a) PLS, (c) GA–PLS and (e) PLS–BPNN. Right column: spectral information used in the models based on (b) beta coefficients derived by PLS (dashed line), (d) spectral bands selected by the GA–PLS model (black squares), and (f) loading of latent variable with the largest weight in the PLS–BPNN model (dashed line). Black line is the spectrum for lunar olivine and n is the number of samples.

Both clinopyroxene and orthopyroxene are present on the lunar surface, with the former higher in Ca²⁺ and the latter higher in Fe²⁺ (Heiken et al., 1991). Orthopyroxene exhibits two diagnostic absorption features centered near ~950 nm and ~1900 nm caused by Fe²⁺, whereas the two absorption bands for clinopyroxene are shifted to longer wavelengths (around ~1000 nm and ~2000 nm) due to a lower abundance of Fe²⁺ (Adams, 1974; Heiken et al., 1991). In the PLS model, spectral bands near 800 nm, 1000 nm, 1500 nm and 2000 nm show significant contribution to the variation of pyroxene abundance (Fig. 8b), whereas the GA–PLS model selects spectral bands near 700 nm, 950 nm, 1000 nm, 2200 nm and 2500 nm for the PLS regression (Fig. 8d). Although band selections between PLS and GA–PLS are roughly similar, the PLS model incorporates more spectral bands over certain wavelength regions, such as the 1300–1600 nm and 1900–2200 nm regions (Fig. 8b). This redundancy in spectral bands may degrade the performance of PLS. For the PLS–BPNN model, the loading curve of the latent variable with the largest weight shows that spectral information near 950 nm, 1100 nm, 1900 nm and 2400 nm is used (Fig. 8f). This band selection is similar to that of PLS and GA–PLS. The selection of bands near 1100 nm and 2400 nm may account for variations in absorption band shape, because 1100 nm lies on the long-wavelength shoulder of the 950 nm absorption band and 2400 nm lies on the long-wavelength shoulder of the 2000 nm absorption band, even though pyroxenes do not show absorptions centered at 1100 and 2400 nm. The reason PLS–BPNN has the best performance (modeling accuracy) may be its ability to incorporate nonlinear solutions. This could be the same reason that Li (2006) achieved better results with PLS than those presented here by using log(1/R) instead of R (reflectance) as the input.
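The log(1/R) transform mentioned above is a simple nonlinear re-expression of reflectance. A minimal sketch is given below, assuming a base-10 logarithm (the text does not state the base) and reflectance values expressed as fractions between 0 and 1; the function name and sample values are hypothetical.

```python
import math


def apparent_absorbance(reflectance):
    """Convert reflectance values (0 < R <= 1) to log10(1/R)."""
    return [math.log10(1.0 / r) for r in reflectance]


# Hypothetical reflectance values at four bands.
spectrum = [0.10, 0.25, 0.50, 1.00]
absorbance = apparent_absorbance(spectrum)
print(absorbance)
```

Because the transform stretches differences among dark (low-reflectance) values relative to bright ones, feeding log(1/R) to a linear model such as PLS introduces some nonlinearity with respect to the original reflectance.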

3.1.3. Plagioclase

The comparison between measured and modeled plagioclase for PLS, GA–PLS and PLS–BPNN is shown in Fig. 9a, c and e. The R² values for the three models are 0.89, 0.91 and 0.97, respectively, and the regression lines for the three models are very close to 1:1 (Fig. 9a, c and e). Again, though all three models perform well, the statistical results (e.g., Table 5) indicate that the PLS–BPNN model has the best performance and PLS has the worst performance for the estimation of plagioclase.

The reason that all three models work well may be that plagioclase has a unique absorption band centered near ~1250 nm, attributed to the presence of Fe²⁺ (Fig. 3) (Adams and Goullaud, 1978). The difference in performance between PLS and GA–PLS may have a similar cause as for pyroxene: redundant data in PLS lead to worse estimations. It can be seen in Fig. 9b that almost all of the spectral bands are used for PLS regression, whereas for GA–PLS only spectral bands near 500 nm, 900 nm, 1300 nm and 2200 nm are selected (Fig. 9d). In contrast, the PLS–BPNN model relies on spectral information near 900–1300 nm and 2200–2500 nm (Fig. 9f). The former includes absorptions due to Fe²⁺ (Adams, 1974; Adams and Goullaud, 1978).

Fig. 11. Model results for ilmenite. Left column: comparisons between modeled and measured ilmenite for (a) PLS, (c) GA–PLS and (e) PLS–BPNN. Right column: spectral information used in the models based on (b) beta coefficients derived by PLS (dashed line), (d) spectral bands selected by the GA–PLS model (black squares), and (f) loading of latent variable with the largest weight in the PLS–BPNN model (dashed line). Black line is the spectrum for ilmenite and n is the number of samples.

3.1.4. Olivine

Accurately estimating the abundance of olivine is a challenge even though its spectra show diagnostic absorption bands near 1100 nm (Li, 2006; Li and Li, 2010; Pieters et al., 2002; Shkuratov et al., 2003). As mentioned above, agglutinate and pyroxene also exhibit absorption features near 1000 nm, and this overlap in absorption features is a likely cause of the generally poor performance of the linear models. The R² values for the PLS and GA–PLS models are only 0.32 and 0.55, respectively (Fig. 10). The nonlinear PLS–BPNN model, however, has a significantly higher R² value of 0.85, and the RMSE decreases from 0.78 (PLS) to 0.64 (GA–PLS) to 0.35 (PLS–BPNN) (Table 5). Additionally, the regression line for the PLS–BPNN results is much closer to a 1:1 line than for the PLS and GA–PLS results (Fig. 10a, c and e). From these statistical indicators, we conclude that the nonlinear PLS–BPNN model is much more reliable for the estimation of olivine than the PLS and GA–PLS models.

Among the three models, PLS–BPNN performs the best for quantifying olivine. The PLS model relies on spectral bands near 500 nm and 2500 nm, not near 1100 nm. In contrast, the GA–PLS model relies on spectral bands near 700 nm, 1100 nm and 2500 nm (Fig. 10d). In the PLS–BPNN model, spectral information near 900–1200 nm and 2200–2500 nm is used (Fig. 10f). It can be seen that the GA–PLS and PLS–BPNN models use similar spectral information for quantifying olivine; thus we speculate that the reason PLS–BPNN performs better than GA–PLS for this phase is the former's ability to incorporate nonlinear relationships.

Fig. 12. Model results for volcanic glass. Left column: comparisons between modeled and measured volcanic glass for PLS (a), GA–PLS (c) and PLS–BPNN (e). Right column: spectral information used in the models based on (b) beta coefficients derived by PLS (dashed line), (d) spectral bands selected by the GA–PLS model (black squares), and (f) loading of latent variable with the largest weight in the PLS–BPNN model (dashed line). Black line is the spectrum for black glass, orange line is the spectrum for orange glass, and n is the number of samples.

3.1.5. Ilmenite

Ilmenite is an opaque mineral, and its reflectance spectrum is dark and relatively featureless (Fig. 3). Nevertheless, the estimation results for ilmenite are accurate for all three models (Fig. 11).

In PLS, according to the beta coefficient curve (Fig. 11b), spectral bands near 500 nm, 1100 nm, 1700 nm and 2200 nm are used in the regression. The GA–PLS model chooses similar spectral bands (Fig. 11d), which explains why there is no significant difference in the results of these two models (Table 5). For the PLS–BPNN model, the loading curve of the latent variable with the largest weight shows that more spectral bands are selected, such as the 500–700 nm, ~1000 nm, 1300–1700 nm and 2200–2500 nm wavelength regions (Fig. 11f). The inclusion of more spectral information (bands) may explain why the PLS–BPNN model has the best performance (see Fig. 11e and Table 5), though the lack of diagnostic absorption features in ilmenite makes such interpretations ambiguous. Alternatively, the ability of the PLS–BPNN model to account for nonlinear relationships may also explain its improved performance when compared to the PLS and GA–PLS models, which are strictly linear. The R² and RMSE values for the PLS–BPNN prediction are improved to 0.94 and 0.88, respectively (Table 5), indicating that this model provides the best estimation of ilmenite.

3.1.6. Volcanic glass

Fig. 12 presents the comparison between measured and modeled abundances of volcanic glass for the PLS, GA–PLS and PLS–BPNN models. The R² values for the models are 0.64, 0.69 and 0.96, respectively (Table 5), and the regression line between modeled and measured volcanic glass for PLS–BPNN is much closer to a 1:1 line than for PLS and GA–PLS (Fig. 12). Thus, the PLS–BPNN model performs better than PLS or GA–PLS in quantifying volcanic glass, and the GA–PLS model performs marginally better than the PLS model.

Fig. 13. Comparison between measured chemical abundances and those estimated by assuming an average chemical composition of each mineral for samples in the 3 size fraction subgroups of the LSCC dataset (<10 µm, 10–20 µm, 20–45 µm size fractions). The estimated chemical abundances for each sample are derived by combining an average chemical composition for each mineral with the measured abundance of that mineral. Note that CaO and Al₂O₃ are systematically overestimated, suggesting the average plagioclase composition and/or modal abundance of plagioclase may be a source of uncertainty in this method.

Fig. 14. Comparison between measured chemical abundances and those estimated by assuming an average chemical composition of each mineral for bulk LSCC soils (<45 µm). The estimation is similar to that described in Fig. 13, except that here the mineral abundances used in the calculation represent an average of the measured mineral abundances from the three size separates.

As with the other phases, the differences between these models in their ability to accurately estimate the abundance of volcanic glass may result from the use of distinct spectral information and whether or not nonlinear effects can be taken into account. The dominant volcanic glasses on the lunar surface are black, green and orange glass (Delano, 1986). Though spectra of volcanic glass (black and orange glass being considered) are dark (Fig. 3), they show very weak absorption features caused by Fe²⁺ near 1000–1100 nm and 2000 nm. In PLS, spectral bands near 500 nm, 800 nm, 1200 nm, 1700 nm and 2000 nm show significant contribution to the estimation of volcanic glass (Fig. 12b). Spectral bands centered at 750 nm, 1800 nm and 1950 nm are selected for the PLS regression in the GA–PLS model (Fig. 12d). In PLS–BPNN, spectral information near 500 nm, 750–950 nm, 1100 nm, 1950 nm, 2200 nm and 2500 nm is applied (Fig. 12f). Among the three models, only the PLS–BPNN model fully utilizes the spectral information near the two absorption bands of volcanic glass.

3.2. Further validating PLS–BPNN

From the preceding discussion it is clear that the PLS–BPNN model provides the most accurate estimates of mineral abundances for the samples in the LSCC dataset. However, the model was both trained and validated with samples from that dataset. As a second verification, and to test independently whether the PLS–BPNN model works well for samples not related to the LSCC dataset, we tested the model using 12 additional Apollo soil samples (Table 1). The reflectance spectra for these samples were used as the input for the trained PLS–BPNN model (which was trained using a portion of the LSCC dataset) in order to predict their abundances of agglutinate, pyroxene, plagioclase, olivine, ilmenite, and volcanic glass. As mentioned in Section 2, only the bulk chemistry, and not the modal mineralogy, has been reported for these samples. Therefore, the predicted mineral abundances are multiplied by the average compositions of agglutinate, pyroxene, plagioclase, olivine, ilmenite and volcanic glass in order to convert the mineral abundance results to predicted chemical abundances, which can then be compared to the measured chemical abundances.
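The mineral-to-chemical conversion described above amounts to a weighted sum: the abundance of each oxide is the sum over minerals of the mineral abundance times that mineral's average oxide content. A minimal sketch follows. The oxide contents and mineral abundances are illustrative placeholders, not the LSCC average compositions used in this study, and the function name is hypothetical.

```python
def minerals_to_oxides(mineral_abundance, mineral_composition):
    """Convert mineral abundances (wt%) to bulk oxide abundances (wt%).

    mineral_abundance:   {mineral: abundance in wt% of the sample}
    mineral_composition: {mineral: {oxide: wt% of that oxide in the mineral}}
    """
    oxides = {}
    for mineral, abundance in mineral_abundance.items():
        for oxide, content in mineral_composition[mineral].items():
            oxides[oxide] = oxides.get(oxide, 0.0) + abundance * content / 100.0
    return oxides


# Placeholder average compositions (wt% oxide in each mineral).
composition = {
    "plagioclase": {"CaO": 19.0, "Al2O3": 35.0, "FeO": 0.5},
    "pyroxene": {"CaO": 10.0, "FeO": 20.0, "MgO": 15.0},
    "ilmenite": {"TiO2": 52.0, "FeO": 46.0},
}

# Placeholder predicted mineral abundances for one sample (wt%).
predicted = {"plagioclase": 40.0, "pyroxene": 30.0, "ilmenite": 10.0}

bulk = minerals_to_oxides(predicted, composition)
print(bulk)
```

Because a single average composition is assigned to each mineral, any real compositional variability within a phase propagates directly into the estimated oxide abundances.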

To test whether or not this mineral-to-chemical composition conversion is a reliable method and whether or not the assumed average compositions of each mineral are consistent with measurements, we first converted the measured mineral abundances of the LSCC samples to bulk chemical compositions for comparison with the direct measurements of chemical composition (Figs. 13 and 14). Fig. 13 shows the comparison between measured chemical abundances and those estimated from measured mineralogical abundances (assuming an average chemical composition of each mineral) for major oxides for samples in the 3 size separates of the LSCC dataset. Fig. 14 is a similar plot for the bulk samples of the LSCC dataset. The mineral abundances for the bulk samples were not measured, thus they were assumed to be the average of the three subgroups (size fractions) for each sample. Overall, the assumed average chemical composition for each mineral seems plausible, but FeO, MgO, and TiO₂ are systematically underestimated, whereas CaO and Al₂O₃ are systematically overestimated (Figs. 13 and 14). These discrepancies result from assuming a single average composition for each mineral or phase, whereas the true chemical composition of a phase can be quite variable. The discrepancies may also be influenced by uncertainties associated with the measured (modal) mineralogy. Despite these uncertainties, this method is currently the best that can be done to provide a second verification of the PLS–BPNN model until both visible–near infrared reflectance spectra and modal mineralogy are available for other lunar soil samples.

Fig. 15. Comparison between measured and estimated chemical compositions for 12 additional (non-LSCC) Apollo soil samples (see Table 1). Estimated chemical compositions are based on the mineral abundances predicted by the PLS–BPNN model (trained using the LSCC dataset) and an assumed average chemistry for each mineral.

The results of this second verification are presented in Fig. 15, which plots the measured and estimated abundances of major oxides for the 12 additional samples. As expected, FeO, MgO, and TiO₂ are commonly underestimated, whereas CaO and Al₂O₃ are commonly overestimated, mimicking the trend observed for the LSCC samples. Despite these complications, estimated versus measured values exhibit linear trends, as highlighted by the regression lines, and values are commonly within ±2% of the regression line (Fig. 15). Given that different soil size fractions will have different spectral properties and variations in mineral/chemical composition, it is promising that the PLS–BPNN model, which was trained with samples <45 µm from the LSCC dataset, produces linear relationships for larger soil samples (<250 µm; Table 1). The results of this test suggest that PLS–BPNN not only has high precision in predicting compositions of samples, but also that it may be capable of accommodating spectral effects associated with variations in particle size. Much of the uncertainty in this test is likely a result of comparing spectral reflectance data for one size fraction of the soil (<250 µm) to chemical data derived from a different and larger size fraction (<1 mm). Uncertainties associated with these differences will be compounded by the fact that we had to rely on an average chemical composition for each phase, which is likely a poor assumption for such a large particle size range.
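The linear-trend check described above (a regression of estimated on measured abundances, with points commonly within ±2% of the line) can be sketched as follows. This is a minimal ordinary least-squares illustration; it assumes the ±2% band is an absolute difference in wt%, which the text does not specify, and the function names and oxide values are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fraction_within(x, y, slope, intercept, band=2.0):
    """Fraction of points whose distance from the line is <= band (wt%)."""
    hits = sum(
        1 for xi, yi in zip(x, y) if abs(yi - (slope * xi + intercept)) <= band
    )
    return hits / len(x)


# Hypothetical measured and estimated abundances of one oxide (wt%).
measured_oxide = [5.0, 10.0, 15.0, 20.0]
estimated_oxide = [4.0, 9.0, 11.0, 16.0]

slope, intercept = fit_line(measured_oxide, estimated_oxide)
share = fraction_within(measured_oxide, estimated_oxide, slope, intercept)
print(slope, intercept, share)
```

A slope below 1 with small scatter, as in this made-up example, would correspond to a systematic underestimate that still follows a linear trend.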

As noted above, the LSCC dataset also includes measured chemical abundances for all samples (bulk and size fraction samples). Therefore, we could use these chemical data directly to train and validate the PLS–BPNN model as a tool to predict chemical abundances instead of mineral abundances for lunar soils. The trained version of this model would be more directly applicable to our independent verification dataset (the 12 additional Apollo soil samples), for which the chemistry is known. For this scenario, we chose to use the samples in the three subgroups of the LSCC dataset (i.e., the three groups of size fractions) as the calibration dataset and the samples in the bulk group as the initial validation dataset. As before, a second validation was carried out using the 12 additional Apollo soil samples. However, the particle size range of the validation samples should ideally be within the range of those used in the training dataset. In our case the particle size range of the additional samples is much larger than that of the LSCC samples (<250 µm versus <45 µm), and the chemical abundances of these additional samples were measured from the <1 mm size fraction of these soils (Morris et al., 1983). One may expect that these differences could lead to large uncertainties in the values of chemistry predicted using the PLS–BPNN model, especially if variations in chemistry are more size-dependent than variations in mineralogy (Papike et al., 1982). Thus, even though the PLS–BPNN model performs well within the particle size and chemical range of the LSCC samples (Fig. 16; uncertainties are ±2%), predicted chemical abundances of the 12 additional soil samples could have large uncertainties. It must also be kept in mind that many elements have no physical interpretation at visible–near infrared wavelengths, sensu stricto, thus the physical meaning of this version of the PLS–BPNN model is somewhat ambiguous. Nevertheless, this example illustrates the usefulness of this model as a non-destructive predictive tool for estimating the bulk chemistry of lunar soils.

Fig. 16. Comparison between measured and predicted chemical compositions of all LSCC soil samples (three size fraction subgroups and the bulk soil samples) using the PLS–BPNN model. In this scenario, the PLS–BPNN model was trained and validated using the measured chemical abundances instead of the measured mineral abundances. Note that the model performs extremely well for all major elements and can accurately predict abundances within ±2% or better.

3.3. Performance of the PLS–BPNN model

It should be noted that: (1) The initialization of BPNN is random, which means that a sufficient number of training samples is required to achieve a stable estimation. In this study, 57 samples were used in the training dataset. The estimation results presented here are stable but they are likely not optimal; additional samples that extend the compositional range/variability of the training set would improve the predictive capabilities of the model. (2) The training samples should span the range of compositions and physical properties of the samples to be used in the prediction. More specifically, the training samples should cover the full range in the abundance of each mineral (or element) in order to achieve stable and accurate prediction results. In addition, the training set should ideally also cover the appropriate range in particle size and possibly other attributes that affect spectral properties (e.g., abundance of SMFe). If the model is used to predict the mineral or chemical abundances of samples that are grossly dissimilar from those used for the training set, then the predictive capabilities will be diminished and large uncertainties could result. This is evident from our second verification test that relied on the 12 additional Apollo soil samples, which were of a much larger particle size range (<250 µm) than the LSCC samples used to train the model (<45 µm) and could span a much wider range in mineral compositions and abundances. However, the true chemical compositions of the <250 µm size fraction for these soils are not known (recall that the measured values were for the <1 mm size fraction). Additional laboratory measurements akin to the LSCC dataset (spectral measurements and chemical and mineralogical abundances) are required in order to fully understand the limitations of the model.
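The requirement in point (2), that training samples span the range of the samples to be predicted, suggests a simple screening step before applying a trained model. The sketch below flags any property of a new sample that falls outside the minimum–maximum range of the training set. It is an illustrative check under assumed property names and values, not a procedure used in this study.

```python
def out_of_range(sample, training_samples):
    """Return the properties of `sample` that lie outside the
    min-max range spanned by `training_samples`."""
    flagged = []
    for key, value in sample.items():
        column = [t[key] for t in training_samples]
        if not (min(column) <= value <= max(column)):
            flagged.append(key)
    return flagged


# Hypothetical training set: plagioclase abundance (wt%) and maximum grain size (µm).
training = [
    {"plagioclase": 20.0, "max_grain_um": 10.0},
    {"plagioclase": 45.0, "max_grain_um": 20.0},
    {"plagioclase": 60.0, "max_grain_um": 45.0},
]

# Hypothetical new sample with a coarser size fraction than any training sample.
new_sample = {"plagioclase": 35.0, "max_grain_um": 250.0}

flags = out_of_range(new_sample, training)
print(flags)
```

A flagged property does not make a prediction wrong, but it marks an extrapolation for which larger uncertainties should be expected.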

However, we note that the PLS and GA–PLS models have the same limitation (training samples are required) as the PLS–BPNN model, though it is clear from the preceding discussion that the latter is superior as a predictive tool for lunar soil compositions. In practical terms this means that applying this model to remotely acquired reflectance data of the Moon (e.g., Clementine or M³ data) will be most accurate for regions whose mineral compositions and abundances are within the range of the LSCC samples, such as fine-grained soils at the Apollo landing sites. In addition, because reflectance spectroscopy is a non-destructive technique, our results indicate that the PLS–BPNN model may be a powerful tool for estimating mineral or chemical abundances of lunar samples in a rapid and non-destructive way. This aspect is particularly intriguing given the precious nature of lunar material and that bulk chemical data are not available for many lunar soil samples. As mentioned above, the application of this model to reflectance spectra as a non-destructive chemical/mineralogical measurement will reach its fullest potential if the training database can be expanded to include a wider variety of lunar soils and rocks. This would require additional spectral, chemical, and mineralogical measurements of the same size fraction for a given soil sample or rock.

4. Conclusions

This research was aimed at developing a computationally efficient and accurate algorithm for estimating the abundance of minerals in lunar samples and on the lunar surface. A hybrid model of partial least squares (PLS) and back propagation neural network (BPNN) has been developed and validated with the lunar 'ground truth' data – the Lunar Soil Characterization Consortium dataset. Compared with partial least squares (PLS) and genetic algorithm–partial least squares (GA–PLS), the PLS–BPNN model yielded the best estimates for mineral and chemical abundances (Table 5). The PLS–BPNN model is also computationally efficient, which suggests that it is a viable option for deriving global lunar mineral abundance maps from hyperspectral imaging data (e.g., Moon Mineralogy Mapper data). The best prediction results are for plagioclase (R² = 0.97), likely because the spectral features for this phase do not overlap with features in other phases. In contrast, predicted values of olivine have the largest uncertainties, with an R² value of 0.85, because of overlapping bands (e.g., ~1000 nm) with pyroxene and volcanic glass.

In addition to M³ data, the PLS–BPNN model could also be applied to Clementine data to map the global distribution of agglutinate, pyroxene, plagioclase, olivine, ilmenite and volcanic glass on the lunar surface. However, due to the limitation of being an empirical model, such mapping results will be most accurate for regions whose range in mineral abundances and compositions are most similar to those of the LSCC dataset, which was used to train the model. In these locations the model results should be highly reliable. The PLS–BPNN model also provides a non-destructive alternative to estimate the mineral and bulk chemical compositions of lunar soils, and it can be further tested to determine its applicability to rock samples. Because of its calculation efficiency and reliability in prediction, a trained PLS–BPNN model has the potential to be integrated into future lunar rovers to provide rapid and accurate estimates of surface mineralogy based solely on visible–near infrared reflectance spectra, which can be used to determine which samples should be analyzed further or selected for return to Earth.

Acknowledgments

We thank two anonymous reviewers and Dr. Oded Aharonson for their fruitful comments on the manuscript. This work is partially supported by the Research Support Funds Grant (RSFG) program of Indiana University–Purdue University at Indianapolis.

References

Adams, J.B., 1974. Visible and near-infrared diffuse reflectance spectra of pyroxenes as applied to remote sensing of solid objects in the Solar System. J. Geophys. Res. 79, 4829–4836.

Adams, J.B., Goullaud, L.H., 1978. Plagioclase feldspars – Visible and near infrared diffuse reflectance spectra as applied to remote sensing. Proc. Lunar Sci. Conf. 2901–2909 (abstract).

Anand, M. et al., 2004. Space weathering on airless planetary bodies: Clues from the lunar mineral hapkeite. Proc. Natl. Acad. Sci. USA 101, 6847–6851.

Cahill, J.T.S. et al., 2009. Compositional variations of the lunar crust: Results from radiative transfer modeling of central peak spectra. J. Geophys. Res. – Planets 114, 1–17.

Combe, J.-P. et al., 2010. Mixing of surface materials investigated by spectral mixture analysis with the Moon Mineralogy Mapper. Lunar Planet. Sci. 2215 (abstract).

Delano, J.W., 1986. Pristine lunar glasses: Criteria, data, and implications. Proc. Lunar Sci. Conf. 201 (abstract).

Elkins-Tanton, L.T. et al., 2002. Re-examination of the lunar magma ocean cumulate overturn hypothesis: Melting or mixing is required. Earth Planet. Sci. Lett. 196, 239–249.

Forrest, S., 1993. Genetic algorithms – Principles of natural-selection applied to computation. Science 261, 872–878.

Geladi, P., Kowalski, B.R., 1986. Partial least-squares regression: A tutorial. Anal. Chim. Acta 185, 1–17.

Hapke, B., 1981. Bidirectional reflectance spectroscopy: I. Theory. J. Geophys. Res. 86, 3039–3054.

Hecht-Nielsen, R., 1989. Theory of the backpropagation neural network. In: International Joint Conference on Neural Networks, pp. 593–605.

Heermann, P.D., Khazenie, N., 1992. Classification of multispectral remote-sensing data using a back-propagation neural network. IEEE Trans. Geosci. Remote Sens. 30, 81–88.

Heiken, G.H. et al., 1991. Lunar Sourcebook – A User's Guide to the Moon. Cambridge University Press, New York.

Korokhin, V.V. et al., 2008. Prognosis of TiO₂ abundance in lunar soil using a non-linear analysis of Clementine and LSCC data. Planet. Space Sci. 56, 1063–1078.

Li, L., 2006. Partial least squares modeling to quantify lunar soil composition with hyperspectral reflectance measurements. J. Geophys. Res. – Planets 111, 1–13.

Li, L., Li, S., 2010. Deriving lunar mineral abundance maps from Clementine multispectral imagery. Lunar Planet. Sci. 2189 (abstract).

Li, S., Li, L., 2011. Radiative transfer modeling for quantifying lunar surface minerals, particle size, and submicroscopic metallic Fe. J. Geophys. Res. – Planets 116, 1–14.

Li, L., Mustard, J.F., 2000. Compositional gradients across mare-highland contacts: Importance and geological implication of lateral transport. J. Geophys. Res. – Planets 105, 20431–20450.

Li, L., Mustard, J.F., 2003. Highland contamination in lunar mare soils: Improved mapping with multiple end-member spectral mixture analysis (MESMA). J. Geophys. Res. – Planets 108, 1–14.

Longhi, J., 1978. Pyroxene stability and the composition of the lunar magma ocean. Proc. Lunar Sci. Conf. 285–306 (abstract).

Lucey, P.G., 2004. Mineral maps of the Moon. Geophys. Res. Lett. 31, 1–4.

McKay, D.S. et al., 1974. Grain size and the evolution of lunar soils. Proc. Lunar Sci. Conf. 887–906 (abstract).

Mehra, P., Wah, B.W., 1992. Artificial Neural Networks: Concepts and Theory. IEEE Computer Society Press, Los Alamitos, Calif.

Moroz, L., Arnold, G., 1999. Influence of neutral components on relative band contrasts in reflectance spectra of intimate mixtures: Implications for remote sensing 1. Nonlinear mixing modeling. J. Geophys. Res. – Planets 104, 14109–14121.

Morris, R.V. et al., 1983. Handbook of Lunar Soils. Lyndon B. Johnson Space Center, Houston.

Noble, S.K. et al., 2001. The optical properties of the finest fraction of lunar soil: Implications for space weathering. Meteorit. Planet. Sci. 36, 31–42.

Noble, S.K. et al., 2006. Using the modified Gaussian model to extract quantitative data from lunar soils. J. Geophys. Res. – Planets 111, 1–17.

Papike, J.J. et al., 1982. The lunar regolith – Chemistry, mineralogy, and petrology. Rev. Geophys. 20, 761–826.

Pieters, C.M., 1983. Strength of mineral absorption features in the transmitted component of near-infrared reflected light: First results from RELAB. J. Geophys. Res. 88, 9534–9544.

Pieters, C.M., 1986. Composition of the lunar highland crust from near-infrared spectroscopy. Rev. Geophys. 24, 557–578.

Pieters, C.M. et al., 1993. Crustal diversity of the Moon – Compositional analyses of Galileo solid-state imaging data. J. Geophys. Res. – Planets 98, 17127–17148.

Pieters, C.M. et al., 1997. Mineralogy of the mafic anomaly in the South Pole Aitken basin: Implications for excavation of the lunar mantle. Geophys. Res. Lett. 24, 1903–1906.

Pieters, C.M., Stankevich, D.G., Shkuratov, Y.G., Taylor, L.A., 2002. Statistical analysis of the links among lunar mare soil mineralogy, chemistry, and reflectance spectra. Icarus 155, 285–298.

Pieters, C. et al., 2006. Lunar Soil Characterization Consortium analyses: Pyroxene and maturity estimates derived from Clementine image data. Icarus 184, 83–101.

Schroeder, L.D. et al., 1986. Understanding Regression Analysis: An Introductory Guide. Sage Publications, Beverly Hills.

Shkuratov, Y. et al., 1999. A model of spectral albedo of particulate surfaces: Implications for optical properties of the Moon. Icarus 137, 235–246.

Shkuratov, Y.G. et al., 2003. Composition of the lunar surface as will be seen from SMART-1: A simulation using Clementine data. J. Geophys. Res. – Planets 108, 1–13.

Shkuratov, Y.G. et al., 2005a. Lunar clinopyroxene and plagioclase: Surface distribution and composition. Solar Syst. Res. 39, 255–266.

Shkuratov, Y.G. et al., 2005b. Derivation of elemental abundance maps at intermediate resolution from optical interpolation of lunar prospector gamma-ray spectrometer data. Planet. Space Sci. 53, 1287–1301.

Shkuratov, Y.G. et al., 2007. Lunar surface agglutinates: Mapping composition anomalies. Solar Syst. Res. 41, 177–185.

Solomon, S.C., Longhi, J., 1977. Magma oceanography: 1. Thermal evolution. Lunar Planet. Sci. Abstracts 884.

Starukhina, L.V., Shkuratov, Y.G., 2001. A theoretical model of lunar optical maturation: Effects of submicroscopic reduced iron and particle size variations. Icarus 152, 275–281.

Sunshine, J.M., Pieters, C.M., 1993. Estimating modal abundances from the spectra of natural and laboratory pyroxene mixtures using the modified Gaussian model. J. Geophys. Res. – Planets 98, 9075–9087.

Sunshine, J.M. et al., 1990. Deconvolution of mineral absorption-bands – An improved approach. J. Geophys. Res. – Solid Earth Planets 95, 6955–6966.

Taylor, S.R., 1979. Structure and evolution of the Moon. Nature 281, 105–110.

Taylor, L.A. et al., 1996. X-ray digital imaging petrography of lunar mare soils: Modal analyses of minerals and glasses. Icarus 124, 500–512.

Taylor, L.A. et al., 1999. Integration of the chemical and mineralogical characteristics of lunar soils with reflectance spectroscopy. Lunar Planet. Sci. 1859 (abstract).

Taylor, L.A. et al., 2000. Mineralogical characterization of lunar mare soils. Lunar Planet. Sci. Abstracts 1706.

Taylor, L.A. et al., 2003. Mineralogical characterization of lunar highland soils. Lunar Planet. Sci. 1774 (abstract).

Taylor, L.A. et al., 2010. Mineralogical and chemical characterization of lunar highland soils: Insights into the space weathering of soils on airless bodies. J. Geophys. Res. – Planets 115, 1–14.

Tompkins, S., Pieters, C.M., 1999. Mineralogy of the lunar crust: Results from Clementine. Meteorit. Planet. Sci. 34, 25–41.

Tsuboi, N. et al., 2010. A new modified Gaussian model (MGM) using the cross-validation method. Lunar Planet. Sci. Abstracts 1744.

Warren, P.H., 1990. Lunar anorthosites and the magma-ocean plagioclase-flotation hypothesis – Importance of FeO enrichment in the parent magma. Am. Mineral. 75, 46–58.

Werbos, P.J., 1988. Generalization of backpropagation with application to a recurrent gas market model. Neural Networks 1, 339–356.

Werbos, P.J., 1994. The Roots of Backpropagation: From Ordered Derivatives to Neural Networks and Political Forecasting. J. Wiley & Sons, New York.

Wood, J.A., 1975. Lunar petrogenesis in a well-stirred magma ocean. Lunar Planet. Sci. Abstracts 881.
