
QSPR Study of Critical Micelle Concentrations of Nonionic Surfactants

Alan R. Katritzky,*,⊥ Liliana M. Pacureanu,⊥,‡ Svetoslav H. Slavov,⊥ Dimitar A. Dobchev,⊥,§,| and Mati Karelson§

Center for Heterocyclic Compounds, Department of Chemistry, University of Florida, Gainesville, Florida 32611, Institute of Chemistry of Romanian Academy, M. Viteazul 24, Timisoara 300223, Romania, Institute of Chemistry, Tallinn University of Technology, Ehitajate tee 5, Tallinn 19086, Estonia, and MolCode Ltd., Soola 8, Tartu 51013, Estonia

Linear and nonlinear predictive models are derived for a data set of 162 nonionic surfactants. The descriptors in the derived models relate to the molecular shape and size and to the presence of heteroatoms participating in donor-acceptor and dipole-dipole interactions. Steric hindrance in the hydrophobic area also plays an important role in micellization. The derived linear and nonlinear QSPR models could be useful to predict the CMCs of broad classes of nonionic surfactants.

1. Introduction and Interpretation of CMC

Surfactants are amphiphilic molecules that contain at least one polar group and one hydrophobic nonpolar group, each of which may be composed of a variety of constituents that confer the hydrophobic and hydrophilic character. Surfactant systems are of special research interest owing to their wide applications in industrial and domestic areas, for example in detergency and as solubilizing agents, emulsifying agents, dispersing agents, coatings, and pharmaceutical adjuvants. In solution, above a certain concentration called the critical micelle concentration (CMC), surfactants tend to undergo spontaneous self-association into ordered structures called micelles.

The CMC is influenced by many external factors, including temperature, pressure, pH, ionic strength, and volume of the solution, and also by the chemical structure of the surfactant, such as the length of the hydrophobic tail, the headgroup area, etc. The influence of temperature on micellization is much studied: the interactions of the hydrophilic and hydrophobic groups with water vary significantly with temperature.

The CMCs of surfactants are usually determined by measuring physicochemical parameters of their aqueous solutions as a function of surfactant concentration, especially by conductometry, tensiometry, fluorescence emission spectroscopy, calorimetry, kinetic approaches, light scattering, NMR spectrometry, cyclic voltammetry, etc.1-8

Surfactants are classified by their net charge into three main subclasses: cationic, anionic, and nonionic. Ionic surfactants, and especially polymerizable anionic surfactants (and the corresponding polymers), are of most practical interest but are more complex to model. The nonionic surfactants are of particular interest for experimental and theoretical studies because of their good solubility and ease of synthesis. Zwitterionic (amphoteric) surfactants form a subclass of nonionic surfactants. Other classifications of surfactants invoke the structural nature of the hydrophobic tails, e.g., linear, branched, fluorinated, etc. The main varieties of nonionic surfactants are based on fatty alcohols, fatty acids and esters, ethoxylated alcohols and phenols, glycerol esters, alkyl polyglycosides, ethylene oxide/propylene oxide copolymers, polyalcohols and ethoxylated polyalcohols, sorbitol and sorbitan derivatives, alkanolamines and alkanolamides, thiols, etc.

Much scientific effort has been devoted to experimental studies and to the prediction of the CMC of nonionic surfactants by various theoretical and empirical methods.9-12 Thermodynamic studies led to the pseudophase theory of micellization as a phase separation phenomenon and to the mass-action theory of micelles as chemical aggregates of amphiphiles in multiple chemical equilibria.11 Theoretical approaches developed to understand micelle formation, growth, structure, size distribution, and critical micelle concentration include phenomenological, statistical-thermodynamic, and geometric-packing theories.11 Puvada and Blankschtein used a molecular model of micellization13 to predict properties of nonionic surfactants, while the approach of van Lent and Scheutjens was based on self-consistent field theory.14

2. Previous Correlations of CMC with Structure

Various empirical and theoretical studies have been employed to correlate the CMC values and toxicity of the surfactants with their molecular structure.9-12,15

Our group first correlated CMC with molecular structure employing a general QSPR approach for a data set of 77 nonionic surfactants using CODESSA software and a heuristic algorithm.9 The regression equation with $R^2 = 0.983$ then reported9 (eq 1) included two topological descriptors for the hydrophobic fragment of the molecule, (i) the Kier and Hall index of zeroth order (c-KH0) and (ii) the average information content of second order (c-AIC-2), together with (iii) a constitutional descriptor, the relative number of nitrogen and oxygen atoms (RNNO).

$$\log \mathrm{CMC} = -(0.567 \pm 0.009)\,\text{c-KH0} + (1.054 \pm 0.048)\,\text{c-AIC-2} + (7.5 \pm 1.0)\,\mathrm{RNNO} - (1.80 \pm 0.16) \qquad (1)$$

According to eq 1, the CMC of nonionic surfactants depends largely on the hydrophobic fragment, decreasing as its size increases, and increases with the size of the hydrophilic fragment.

For the same data set of nonionic surfactants used to derive eq 1, Saunders and Platt18 applied (i) the LFER (linear free energy relationship) method, which separates the solute-solvent interactions into five physicochemical descriptors, to obtain a correlation coefficient of $R^2 = 0.903$, and (ii) a surface area QSPR approach, which produced $R^2 = 0.856$.16

* Corresponding author. Phone: (352) 392-0554. Fax: (352) 392-9199.
⊥ University of Florida.
‡ Institute of Chemistry of Romanian Academy.
§ Tallinn University of Technology.
| MolCode, Ltd.

Ind. Eng. Chem. Res. 2008, 47, 9687–9695
10.1021/ie800954k CCC: $40.75 2008 American Chemical Society. Published on Web 10/29/2008

Chen correlated the CMC values for aqueous solutions of nonionic polyoxyethylene alcohol surfactants using a modified Aranovich and Donohue (AD) excess Gibbs energy model.12 The p-NRTL [segment-based polymer NRTL (nonrandom two-liquid)] model was used for the same types of surfactants to account for the nonideality of aqueous nonionic surfactant solutions.17 Li et al.11 correlated the CMCs at 25 °C of nonionic surfactants using the s-UNIQUAC (segment-based universal quasi-chemical) model and the SAFT (statistical associating fluid theory) equations. Cheng et al. accurately predicted critical micelle concentrations, including their trends with the hydrophobic chain and the ethylene oxide group, using the UNIFAC equation.18

Yuan and co-workers19 reported a highly significant correlation ($R^2 = 0.998$) for a data set of 37 structurally similar alkyl ethoxylates and octylphenol ethoxylates. The descriptors involved in their model were the octanol/water partition coefficient, heat of formation, molecular volume, and LUMO energy. For the same classes of compounds, Wang et al. proposed a model including the Kier and Hall index of zeroth order, the total energy (or the heat of formation), and the molecular dipole moment, with $R^2 = 0.995$.20

Temperature has a major influence on CMC.9 At constant temperature the CMC increases as the number of oxyethylene groups increases, and the reverse is observed when the number of carbon atoms in the hydrophobic tail increases. The influence of temperature on the micellization of nonionic surfactants is mainly due to the increase in hydrophobicity with temperature, caused by the destruction of hydrogen bonds between water molecules and the hydrophilic groups. Consequently, the CMC decreases. Hence, an interpolation technique can be used to calculate log CMC at different temperatures using the plot of log CMC vs 1/T, which is approximately linear for nonionic surfactants.9 In the case of p,t-octylphenol ethoxylates, a minimum on the CMC-temperature curve was observed to move toward higher temperatures as the number of ethoxy groups increases. Temperature dependence studies of n-dodecylpolyoxyethylene glycol monoether revealed that the CMC initially decreases until a minimum is reached and then increases as the temperature is raised further. The increase of CMC is supposed to be due to the loss of water structure surrounding the hydrophobic group as the temperature increases. As the temperature increases, this effect becomes predominant and leads to the increase of CMC.21
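The interpolation on the log CMC vs 1/T plot mentioned above can be sketched as follows. This is a minimal illustration that assumes strict linearity between two measured points; the function name and the two (temperature, log CMC) pairs are hypothetical and are not taken from the paper.

```python
def interpolate_log_cmc(t_kelvin, t1, log_cmc1, t2, log_cmc2):
    """Estimate log CMC at t_kelvin by linear interpolation in 1/T.

    (t1, log_cmc1) and (t2, log_cmc2) are two measured points; the
    relationship is assumed linear in 1/T between them.
    """
    x, x1, x2 = 1.0 / t_kelvin, 1.0 / t1, 1.0 / t2
    slope = (log_cmc2 - log_cmc1) / (x2 - x1)
    return log_cmc1 + slope * (x - x1)


# Hypothetical measurements (illustrative only): log CMC at 288.15 K and 308.15 K.
estimate_25c = interpolate_log_cmc(298.15, 288.15, -4.00, 308.15, -4.20)
print(round(estimate_25c, 3))
```

Because the minimum in the CMC-temperature curve described above breaks linearity, such an interpolation is only reasonable within a temperature range where the plot is known to be approximately straight.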

3. Present Work

This paper attempts to develop a more general QSPR model for the micellization of nonionic surfactants using molecular and fragment descriptors by (i) extending the number and diversity of the surfactants treated, (ii) testing our previous QSPR model9 on the new data, (iii) searching for better models, and (iv) relating the descriptors involved in the new model to the micellization phenomenon.

4. Data Set

Our present data set (Table 1) consists of 162 nonionic surfactants. In addition to the 77 CMC values previously studied,9 we included 85 newly available CMCs, all measured at 25 °C in water containing no additional salts, cosurfactants, or other additives. The current data set contains several classes of linear ethoxylated alcohols and octylphenols, a large number of carbohydrate derivatives, and dimeric surfactants gathered from references 4, 9, 11, 19, 20, and 22-34. As previously,9 a logarithmic transformation of CMC (given in molar units) was used in order to improve the normal distribution of the experimental data.

All experimental data for the nonionic surfactants are collected in Table 2, which provides (i) the code of the nonionic surfactant (second column), (ii) the negative logarithm of the experimentally determined critical micelle concentration (third column), and (iii) the -log CMC values predicted by the QSPR models of eq 1, Table 4, and the ANN (columns four to six).
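The logarithmic transformation used for the tabulated values can be illustrated with a short sketch. It assumes a base-10 logarithm of the CMC expressed in mol/L; the helper names are hypothetical.

```python
import math


def neg_log_cmc(cmc_molar):
    """Return -log10(CMC) for a CMC given in mol/L."""
    if cmc_molar <= 0:
        raise ValueError("CMC must be a positive concentration in mol/L")
    return -math.log10(cmc_molar)


def cmc_from_neg_log(value):
    """Invert the transformation: CMC in mol/L from a -log CMC value."""
    return 10.0 ** (-value)


# Table 2 lists -log CMC = 4.000 for C12E8, i.e. a CMC of 1e-4 mol/L.
print(cmc_from_neg_log(4.000))
```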

5. Methodology

The geometry of each nonionic surfactant molecule was preoptimized using the molecular mechanics force field (MM+) as implemented in HyperChem 7.5.35 Further refinement of the molecular geometries was obtained using the AM1 (Austin Model 1) semiempirical method36 with a gradient norm limit of 0.1 kcal/(mol Å). The optimized geometries were then used to calculate up to 700 molecular and fragment descriptors classified as (i) constitutional (38), (ii) topological (38), (iii) geometrical (14), (iv) charge-related (313), and (v) semiempirical (316), which were calculated using CODESSA-PRO software.37

The hydrophobic parameter log P and the molar refractivity defined for the whole molecules and for the hydrophobic fragments were calculated using the EPI Suite38 package and uploaded into the CODESSA storage.

CODESSA-PRO has previously been used to correlate and predict many physical properties and biological activities, including boiling points,39 partition coefficients (log D),40 solvent scales,41 correlation of liquid viscosity with molecular structure for organic compounds,42 the binding energies for 1:1 complexation systems between various organic guest molecules and β-cyclodextrin,43 the in vitro minimum inhibitory concentration (MIC) of 3-aryloxazolidin-2-one antibacterials to inhibit growth of Staphylococcus aureus,44 partition coefficients of drugs between human breast milk and plasma,45 HIV-1 protease inhibitory activity of substituted tetrahydropyrimidinones,46 and toxicities of polychlorodibenzofurans, polychlorodibenzo-1,4-dioxins, and polychlorobiphenyls.47

Molecular fragments have repeatedly been used successfully in structure-property studies.48-51 Although fragment-based QSPRs have been criticized for using a larger number of variables compared to whole molecular descriptor-based approaches,48 our previous correlations of critical micelle concentrations used appropriate fragment descriptors that account for structural diversity9,10 and produced physicochemically meaningful correlations with improved statistical characteristics.

Statistical theory demonstrates that QSPR correlations utilizing two or more variables which intercorrelate with $R^2$ higher than 0.6 are not reliable. The best multilinear regression (BMLR) algorithm42 was used to generate reliable QSPR models from an initial pool of orthogonal descriptors preselected by the ABC approach (see the description of the method below). The BMLR selects the best two-, three-, etc., parameter regression equations, based on the highest $R^2$ value in a stepwise regression procedure.51 During the BMLR procedure the descriptor scales are normalized and centered automatically, and the final result is given in natural scales. Thus, the models generated provide the optimum property representation from a given descriptor pool.
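The intercorrelation criterion stated above can be sketched as a simple pairwise check. This is not the CODESSA-PRO implementation; the function names and the descriptor values are hypothetical, and only the 0.6 threshold comes from the text.

```python
from itertools import combinations


def r_squared(x, y):
    """Squared Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)


def intercorrelated_pairs(descriptors, threshold=0.6):
    """Return descriptor-name pairs whose mutual R^2 exceeds the threshold."""
    return [
        (a, b)
        for a, b in combinations(sorted(descriptors), 2)
        if r_squared(descriptors[a], descriptors[b]) > threshold
    ]


# Hypothetical descriptor columns for five compounds (illustrative values only).
descriptors = {
    "d1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "d2": [2.1, 3.9, 6.2, 8.0, 9.9],  # nearly proportional to d1
    "d3": [3.0, 1.0, 4.0, 1.0, 5.0],
}
print(intercorrelated_pairs(descriptors))
```

A pair flagged by such a check would not be allowed to enter the same regression equation.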

Two validation techniques were applied: leave-one-out cross-validation and Y-scrambling.53 The corresponding squared cross-validated correlation coefficient ($R^2_{cv}$) for all selected models is calculated automatically by the validation module implemented in the CODESSA PRO package.

Table 1. Code, Name, and Chemical Structures for Nonionic Surfactants
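Leave-one-out cross-validation can be sketched for a one-descriptor linear model as below. The formula $R^2_{cv} = 1 - \mathrm{PRESS}/SS_{tot}$ is a common convention and is an assumption here; the exact expression used by the CODESSA PRO validation module is not given in the text. All names and data are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares slope and intercept for one descriptor."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def loo_r2cv(x, y):
    """Leave-one-out cross-validated R^2, assumed as 1 - PRESS / SS_tot."""
    press = 0.0
    for i in range(len(x)):
        # Refit the model without compound i, then predict compound i.
        slope, intercept = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        press += (y[i] - (intercept + slope * x[i])) ** 2
    mean_y = sum(y) / len(y)
    ss_tot = sum((v - mean_y) ** 2 for v in y)
    return 1.0 - press / ss_tot


# Hypothetical descriptor values and property values (illustrative only).
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.1, 1.9, 3.2, 3.9, 5.1, 5.8]
print(round(loo_r2cv(x, y), 3))
```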

Nonlinear Artificial Neural Network (ANN) Modeling. The artificial neural network (ANN) algorithm used in this work is a feed-forward multiperceptron network with back-propagation of the error. A typical architecture of such a network is shown in Figure 1.

The ANN method is more computationally intensive than linear regression since the nonlinear coefficients (weights and biases) are changed iteratively, requiring repeated evaluation of the network outputs. However, the greater mathematical flexibility found in ANNs often leads to models superior to those of MLR.

Adjusting the weights and biases to fit target values is known as “training the network”. The increased mathematical flexibility of ANNs and the large number of adjustable parameters can overtrain the neural network and result in apparently good fits by chance. To avoid this situation, the data set needs to be split into training and validation subsets. The weights and biases are adjusted based on the rms (root-mean-squared) error of the training set members, and the rms error of the validation set is calculated periodically throughout the training. Overtraining is considered to occur when the rms error of the validation set begins to rise. The training process stops when the validation error is at its minimum, to provide a network that may be used with reasonable confidence for future predictions.
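The stopping criterion described above can be sketched as a scan over the validation rms errors recorded during training. The `patience` parameter and the error values are assumptions introduced for illustration; the text only states that training stops at the minimum of the validation error.

```python
def best_stopping_epoch(validation_rms, patience=3):
    """Return the 0-based epoch with the lowest validation rms error.

    The scan ends once the error has failed to improve for `patience`
    consecutive checks, which is taken here as the sign of overtraining.
    """
    best_epoch, best_error, stale = 0, float("inf"), 0
    for epoch, error in enumerate(validation_rms):
        if error < best_error:
            best_epoch, best_error, stale = epoch, error, 0
        else:
            stale += 1
            if stale >= patience:
                break
    return best_epoch


# Hypothetical validation rms errors recorded after each epoch.
validation_curve = [0.90, 0.60, 0.45, 0.40, 0.42, 0.47, 0.55, 0.38]
print(best_stopping_epoch(validation_curve))
```

With a patience of three checks, the scan above stops before it reaches the last entry, so the weights saved at the earlier minimum would be the ones retained.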

Training the network is simply an optimization problem. One of the simplest methods for optimization of the weights during the training period is the delta rule. It propagates back the changes of error with respect to the weights on each iteration (epoch) so that the ANN uses supervised learning based on the experimental CMCs. Once the ANN is trained, it can be used for QSPR predictions and analysis of novel surfactants. Because of the mathematical complexity of the ANN models, their physicochemical interpretation is usually difficult or nontrivial.
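As a hedged illustration of the delta rule, the sketch below trains a single linear unit rather than the multilayer network used in this work; the learning rate, epoch count, and training pairs are all hypothetical.

```python
def train_delta_rule(samples, learning_rate=0.05, epochs=2000):
    """Train one linear unit y = w * x + b with the delta rule.

    After each sample the weight and bias move in proportion to the
    error (target - output), which is the supervised-learning signal.
    """
    weight, bias = 0.0, 0.0
    for _ in range(epochs):
        for x, target in samples:
            output = weight * x + bias
            error = target - output
            weight += learning_rate * error * x
            bias += learning_rate * error
    return weight, bias


# Hypothetical noise-free training pairs generated from target = 2 * x + 1.
samples = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
weight, bias = train_delta_rule(samples)
print(round(weight, 4), round(bias, 4))
```

In a multilayer network the same error signal is propagated backward through the hidden layers, which is where the back-propagation step mentioned earlier comes in.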

6. Results and Discussion

QSAR Model Development. The best multilinear regression method (BMLR) was used to correlate the descriptors to the negative logarithm of CMC. The square of the correlation coefficient ($R^2$), the cross-validated squared correlation coefficient ($R^2_{cv}$), the external predictive squared correlation coefficient ($R^2_{ext}$), the Fisher criterion ($F$), and the squared standard deviation ($s^2$) were used as criteria for the stability and robustness of the models. A small difference between $R^2$ and $R^2_{cv}$ denotes high predictive ability of the QSPR model. The regression coefficients and their errors are represented by X and ΔX, respectively.

The “breaking point” rule54 was used to determine the optimal number of descriptors in the model. It is based on the significant improvement of $R^2$ with respect to the number of consecutive descriptors in the model.
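One way to read the “breaking point” rule is as a search for the model size after which adding a descriptor no longer improves $R^2$ appreciably. The sketch below assumes a fixed minimum gain of 0.02, which is an arbitrary illustrative threshold (in practice the break is usually judged from a plot), and the $R^2$ series is hypothetical.

```python
def breaking_point(r2_by_size, min_gain=0.02):
    """Return the number of descriptors at the breaking point.

    r2_by_size[i] is the R^2 of the best model with i + 1 descriptors.
    The breaking point is the last size before the gain in R^2 from
    adding one more descriptor drops below min_gain.
    """
    for i in range(1, len(r2_by_size)):
        if r2_by_size[i] - r2_by_size[i - 1] < min_gain:
            return i
    return len(r2_by_size)


# Hypothetical R^2 values for the best models with 1 to 7 descriptors.
r2_series = [0.70, 0.82, 0.88, 0.91, 0.915, 0.918, 0.920]
print(breaking_point(r2_series))
```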

An Estimation of the Predictive Power of Eq 1. The limited domain of applicability of eq 1 is demonstrated by the moderate quality of the predictions when it is applied to the new structurally diverse set of 85 surfactants ($R^2 = 0.555$). However, eight surfactants (Sorb-Ol-3, bis(C8GA), bis(C12GA), bis(C12GH), bis(C8LA), bis(C12LA), C16-OCO-Glu, C18-OCO-Glu), all characterized by complex hydrophilic and hydrophobic domains,23 are identified as extreme outliers. After the removal of these extreme outliers (see Figure 2), the quality of prediction based on eq 1 increased significantly ($R^2 = 0.873$).
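The effect of an extreme outlier on $R^2$ can be illustrated with a handful of (experimental, eq 1 predicted) -log CMC pairs taken from Table 2. This sketch assumes that $R^2$ is the squared Pearson correlation between predicted and experimental values, and it uses only six compounds, so it does not reproduce the values reported for the full set of 85.

```python
def r_squared(x, y):
    """Squared Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)


# (code, experimental -log CMC, eq 1 prediction) as listed in Table 2.
rows = [
    ("C6EO3", 0.959, 0.997),
    ("C8EO4", 2.063, 2.033),
    ("C10EO5", 3.100, 3.025),
    ("C12EO1", 4.638, 4.299),
    ("C16EO8", 5.921, 5.798),
    ("Sorb-Ol-3", 4.944, 19.151),  # flagged as an extreme outlier
]

r2_all = r_squared([r[1] for r in rows], [r[2] for r in rows])
kept = [r for r in rows if r[0] != "Sorb-Ol-3"]
r2_without_outlier = r_squared([r[1] for r in kept], [r[2] for r in kept])
print(round(r2_all, 3), round(r2_without_outlier, 3))
```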

Table 2. Experimental -log CMC Values and -log CMC Values Predicted by Eq 1, the Model of Table 4, and the ANN^a

                                        -log CMC pred
no.  structure code       -log CMC exp eq 1       Table 4 ANN(AB+C)

New Set
  1  C6EO3                0.959 [31]   0.997      1.372   1.090
  2  C6EO4                1.032 [12]   0.946      1.400   1.168
  3  C6EO5                1.017 [12]   0.908      1.434   1.410
  4  C8EO4                2.063 [12]   2.033      2.453   2.254
  5  C8EO5                1.959 [28]   1.989      2.479   2.067
  6  C9EO8                2.520 [11]   2.420      3.015   2.839
  7  C10EO5               3.100 [11]   3.025      3.325   3.176
  8  C10EO7               3.015 [12]   2.955      3.396   3.264
  9  C12EO1               4.638 [12]   4.299      3.905   4.346
 10  C12EO14              4.260 [19]   3.804      4.295   4.273
 11  C14EO9               5.046 [19]   4.842      4.719   4.735
 12  C16EO8               5.921 [12]   5.798      5.172   5.279
 13  C16EO10              5.699 [23]   5.747      5.232   5.587
 14  C8PhEO30             3.959 [4]    3.522      4.171   4.079*
 15  C8PhEO40             4.119 [4]    3.493      4.481   4.320*
 16  C9PhEO2              3.377 [32]   3.823      3.355   3.365
 17  C9PhEO5              3.328 [32]   3.64       3.400   3.364
 18  C9PhEO12             3.301 [32]   3.454      3.535   3.452*
 19  H4EO3                2.097 [28]   1.448      2.821   2.753*
 20  F4EO3                2.699 [28]   1.800      2.857   2.749
 21  H6EO3                3.523 [28]   3.203      4.165   4.164
 22  F6EO3                4.097 [28]   3.506      4.139   4.125
 23  C12AmEO3             3.292 [24]   3.564      3.395   3.321
 24  C12AmEO6             3.187 [24]   3.446      3.386   3.292
 25  C12AmEO9             3.125 [24]   3.374      3.362   3.258*
 26  F4C3NCOEO2           2.009 [27]   1.661      3.000   2.474
 27  F4C3NCOEO3           2.854 [27]   1.670      2.902   2.874
 28  F6C3NCOEO2           3.824 [27]   3.357      3.927   3.491*
 29  F6C3NCOEO3           4.046 [27]   3.351      3.978   4.022
 30  F8C3NCOEO2           4.620 [27]   5.055      4.965   4.857
 31  F8C3NCOEO3           4.959 [27]   5.037      5.264   5.138
 32  Gly4Ol-1             4.484 [22]   5.057      3.946   4.175
 33  Gly4La-1             4.402 [22]   3.245      3.757   3.768
 34  Gly4St-1             4.650 [22]   6.119      5.329   5.125
 35  Gly6Ol-1             4.562 [22]   4.947      4.076   4.313*
 36  Gly6La-1             4.446 [22]   3.150      3.861   4.188
 37  Gly6St-1             4.553 [22]   6.008      5.260   4.934
 38  Gly10Ol-1            4.676 [22]   4.817      4.334   4.432*
 39  Gly10La-1            4.549 [22]   3.045      4.146   4.396*
 40  Sorb-La-1            4.440 [22]   3.384      3.599   4.020
 41  Sorb-Ol-1            4.578 [22]   5.214      4.079   4.440
 42  Sorb-Ol-3            4.944 [22]   19.151**   5.529   5.484*
 43  C8-Lactose           2.580 [33]   0.891      2.213   2.447
 44  C8-Lactitol          2.561 [33]   1.020      2.015   2.101
 45  C12-Lactose          3.370 [33]   3.035      3.786   3.757
 46  C12-Lactitol         3.370 [33]   3.141      3.586   3.579*
 47  C16-Lactose          5.020 [33]   5.007      4.860   4.816*
 48  C16-Lactitol         5.120 [33]   5.098      4.679   4.772
 49  n-C12-Mpyr           3.740 [23]   3.606      3.897   3.845
 50  C4-OCO-Xyl           0.921 [34]   -0.384     0.397   0.684
 51  C5-OCO-Xyl           1.237 [34]   0.068      0.842   1.036
 52  C6-OCO-Xyl           2.000 [34]   0.640      1.434   1.773
 53  C7-OCO-Xyl           1.745 [34]   1.215      1.981   1.910
 54  C8-OCO-Xyl           2.357 [34]   1.777      2.470   2.412
 55  C9-OCO-Xyl           2.745 [34]   2.323      2.895   2.298*
 56  C4-O-Xyl             1.237 [34]   -0.237     0.348   0.386
 57  C5-O-Xyl             1.420 [34]   0.208      0.809   0.941
 58  C6-O-Xyl             2.027 [34]   0.774      1.408   1.847
 59  C7-O-Xyl             2.036 [34]   1.343      1.961   1.756*
 60  C8-O-Xyl             2.174 [34]   1.899      2.453   2.387*
 61  C9-O-Xyl             2.678 [34]   2.44       2.898   2.877*
 62  C10-O-Xyl            3.092 [34]   2.965      3.288   3.273
 63  C11-O-Xyl            3.523 [34]   3.478      3.644   3.630*
 64  C4-S-Xyl             0.745 [34]   -0.016     0.359   0.511
 65  C5-S-Xyl             1.337 [34]   0.411      0.824   1.120*
 66  C6-S-Xyl             1.796 [34]   0.961      1.420   1.544
 67  C8-OCO-Glu           2.796 [34]   1.665      2.437   2.596
 68  C12-OCO-Glu          3.638 [34]   3.778      3.903   3.775*
 69  C16-OCO-Glu          3.854 [34]   5.720**    4.929   4.599
 70  C18-OCO-Glu          3.699 [34]   6.650**    5.345   4.309
 71  C12-O-Malt           3.482 [29]   3.035      3.519   3.508
 72  C12H25CONH(C2H4O)4H  3.301 [29]   3.488      3.463   3.365
 73  C8TGlupyr            2.071 [26]   1.933      2.406   2.394
 74  bis(C8GA)            4.174 [25]   8.332**    4.426   4.375
 75  bis(C12GA)           5.420 [25]   11.942**   5.146   5.184
 76  bis(C12GH)           5.284 [25]   11.888**   5.138   5.175
 77  bis(C8LA)            3.886 [25]   8.123**    4.484   4.262
 78  bis(C12LA)           5.051 [25]   11.727**   4.714   4.839*
 79  Glupyr-1             2.143 [30]   1.323      2.177   2.162
 80  Glupyr-2             1.883 [30]   1.370      2.122   2.092
 81  Glupyr-3             2.699 [30]   1.323      2.043   2.107
 82  Glupyr-4             2.509 [30]   1.370      2.018   2.023*
 83  Glupyr-5             1.801 [30]   1.385      2.081   2.037*
 84  Glupyr-6             0.959 [30]   -0.384     0.335   0.671*
 85  Glupyr-7             1.886 [30]   1.323      2.123   2.050*

Old Set
 86  C4E1                 0.009 [9]    0.184      0.241   0.196
 87  C4E6                 0.110 [9]    -0.055     0.404   0.344*
 88  C6E3                 1.000 [9]    0.997      1.372   1.321
 89  C6E6                 1.164 [9]    0.878      1.467   1.350*
 90  C9E1                 2.310 [9]    2.274      2.368   2.351
 91  C8E3                 2.125 [9]    2.090      2.428   2.405
 92  C8E6                 2.004 [9]    1.954      2.524   2.324*
 93  C8E9                 1.886 [9]    1.881      2.606   2.255
 94  C10E3                3.222 [9]    3.131      3.268   3.068*
 95  C10E4                3.167 [9]    3.072      3.297   3.247
 96  C10E6                3.046 [9]    2.987      3.368   3.192*
 97  C10E8                3.000 [9]    2.928      3.423   3.290
 98  C10E9                2.886 [9]    2.905      3.443   3.416*
 99  C11E8                3.523 [9]    3.426      3.788   3.695
100  C12E2                4.481 [9]    4.200      3.933   4.266
101  C12E3                4.284 [9]    4.125      3.954   4.204
102  C12E4                4.194 [9]    4.065      3.982   4.052
103  C12E5                4.194 [9]    4.017      4.01    4.051*
104  C12E6                4.060 [9]    3.977      4.054   4.056
105  C12E7                4.086 [9]    3.943      4.067   4.37*
106  C12E8                4.000 [9]    3.915      4.117   4.039
107  C12E9                4.000 [9]    3.89       4.146   4.098
108  C12E12               3.854 [9]    3.832      4.23    4.121*
109  C13E8                4.569 [9]    4.395      4.417   4.500
110  C14E6                5.000 [9]    4.933      4.624   4.655
111  C14E8                5.046 [9]    4.869      4.683   4.799
125  C8PHE9               3.523 [9]    3.713      3.672   3.566
126  C8PHE10              3.481 [9]    3.692      3.71    3.684*
127  IC4E6                0.049 [9]    0.423      0.996   0.938
128  IC6E6                1.016 [9]    0.791      1.047   1.333*
129  IC8E6                1.670 [9]    1.614      1.604   1.240*
130  IC10E6               2.547 [9]    2.476      2.097   2.131*
131  IC10E9               2.526 [9]    2.395      2.186   2.225
132  C8GLYCER             2.237 [9]    2.123      2.378   2.313
133  C10DIOL              2.638 [9]    2.775      3.206   3.202*
134  C11DIOL              2.638 [9]    2.810      3.208   3.139
135  C12DIOL              3.745 [9]    3.761      3.973   3.931
136  C15DIOL              4.886 [9]    4.723      4.581   4.701
137  C8GLUC               1.602 [9]    1.777      2.396   2.262
138  C10GLUC              2.658 [9]    2.853      3.229   3.027
139  C12GLUC              3.721 [9]    3.875      3.9     3.372*
140  C12DELAC             3.222 [9]    3.177      3.898   3.583
141  C12MALT              3.620 [9]    3.606      4.11    3.796
142  C12SUCR              3.469 [9]    3.035      3.703   3.686
143  C18SUCR              5.292 [9]    4.885      4.2     4.964*
144  C11CONEO             3.585 [9]    3.488      3.366   3.444
145  C9CONE3E             2.299 [9]    2.464      2.41    2.337*
146  C9CONE4E             2.193 [9]    2.416      2.148   2.177
147  C11CONE2             3.398 [9]    3.543      3.309   3.369
148  C11CONE3             3.292 [9]    3.47       3.221   3.043*
149  C11CONE4             3.611 [9]    3.417      3.049   3.113
150  C12ALAE4             3.413 [9]    3.943      3.39    3.014*

New QSPR Modeling of the Full Set of 162 Nonionic Surfactants. A modified QSPR approach55 aimed (i) to propose a general QSPR model including all compounds while still keeping the traditional “training/test set” separation in use and (ii) to minimize the possibility of “correlations by chance” by limiting the initial set of descriptors. It consisted of the following steps:

1. All 162 data points of the initial data set were ordered in descending order of their -log CMC values.

2. By selection of every third point from the original data set, three new subsets (conventionally denoted as A, B, and C) were constructed. Thus, for each of the subsets the distribution of the investigated property values is similar.

3. The three binary sum combinations A + B, A + C, and B + C were used to form the training subsets.

4. The standard QSAR modeling procedure, including the best multiple linear regression method (BMLR), was applied to the subsets obtained in step 3.

5. The “breaking point” restriction rule was used to determine the optimal number of descriptors of the generated models.

6. The complementary parts to each of these three subsets (C, B, and A, respectively) were used as external validation data sets by considering their consistency.

7. All the descriptors that appeared in the obtained models of step 5 were tested to obtain a general model including all 162 surfactants from the initial data set.

8. The general model was again validated using classical internal cross-validation and scrambling procedures.
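Steps 1 to 3 and step 6 above can be sketched as follows. This is a minimal illustration of the subset construction only, not of the BMLR modeling; the function names are hypothetical, and the six records are taken from Table 2 purely as sample input.

```python
def abc_subsets(records):
    """Sort by -log CMC (descending) and deal every third point into A, B, C."""
    ordered = sorted(records, key=lambda record: record[1], reverse=True)
    return {"A": ordered[0::3], "B": ordered[1::3], "C": ordered[2::3]}


def training_test_splits(subsets):
    """Pair each binary training combination with its complementary test subset."""
    return {
        "A+B": (subsets["A"] + subsets["B"], subsets["C"]),
        "A+C": (subsets["A"] + subsets["C"], subsets["B"]),
        "B+C": (subsets["B"] + subsets["C"], subsets["A"]),
    }


# (code, experimental -log CMC) pairs as listed in Table 2.
records = [
    ("C6E6", 1.164), ("C8E6", 2.004), ("C10E6", 3.046),
    ("C12E6", 4.060), ("C14E6", 5.000), ("C16E6", 5.780),
]
subsets = abc_subsets(records)
splits = training_test_splits(subsets)
print([code for code, _ in subsets["A"]])
```

Dealing sorted points in turn into the three subsets is what gives each subset a similar spread of -log CMC values.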

QSPR models involving up to seven descriptors for the A + B, A + C, and B + C subsets were generated (step 4). The application of the “breaking point” rule (Figure 3) suggested that models with four descriptors would be optimal (see Table 3).

At the next stage (step 7), only those descriptors listed in Table 3 were used to derive a general model for the whole data set of 162 surfactants. As for the submodels, the “breaking point” rule was used again to identify the optimal number of descriptors allowed to enter the model (see Figure 4), and once again, a four-descriptor model was indicated. This model and its statistical parameters are shown in Table 4 and Figure 5.

To examine the sensitivity of the proposed QSPR model tochance correlations, a Y-scrambling procedure was applied; i.e.,the model was fitted to randomly reordered activity values andthen compared with the one obtained for the actual activities.Twenty such randomizations were performed, which producedR2 ranging from 0.211 to 0.265 (average R2 ) 0.237). The

Table 2. Continued^a

no.  structure code  -log CMC exp  pred, eq 1  pred, model of Table 4  pred, ANN AB+C
112  C15E8           5.456 [9]     5.336       4.942                   5.226
113  C16E6           5.780 [9]     5.864       5.113                   5.272
114  C16E7           5.770 [9]     5.829       5.142                   5.340*
115  C16E9           5.678 [9]     5.771       5.202                   5.439
116  C16E12          5.638 [9]     5.707       5.291                   5.305*
117  C8PHE1          4.305 [9]     4.122       3.502                   3.910*
118  C8PHE2          4.116 [9]     4.022       3.488                   3.708
119  C8PHE3          4.013 [9]     3.945       3.517                   3.882*
120  C8PHE4          3.886 [9]     3.886       3.538                   3.691
121  C8PHE5          3.824 [9]     3.838       3.564                   3.730
122  C8PHE6          3.678 [9]     3.838       3.717                   3.700
123  C8PHE7          3.602 [9]     3.765       3.625                   3.921*
124  C8PHE8          3.553 [9]     3.737       3.638                   3.634
151  C12GLYE4        3.474 [9]     3.915       3.614                   3.575
152  C12SARE4        3.533 [9]     3.943       3.589                   3.752*
153  CF6SE2          4.602 [9]     4.649       4.316                   4.318
154  CF6SE3          4.553 [9]     4.573       4.317                   4.388
155  CF6SE5          4.432 [9]     4.465       4.328                   4.091*
156  CF6SE7          4.319 [9]     4.395       4.361                   4.349
157  CF6SESE2        4.638 [9]     4.634       4.258                   4.269*
158  CF6SE2SE        4.585 [9]     4.574       4.304                   4.325*
159  CF6SE3SE        4.469 [9]     4.486       4.269                   4.190*
160  CF6CONE3        3.260 [9]     3.330       3.693                   3.568
161  CF8CONE3        4.921 [9]     5.002       4.818                   4.849
162  CF10CONE        6.523 [9]     6.630       5.616                   6.076*

^a The ANN test set compounds are marked with an asterisk. The surfactants marked with a double asterisk are outliers.

Figure 1. Feed-forward back-propagation neural network.

Figure 2. Plot of predicted vs experimental -log CMC values for the new set (eight outliers removed) of nonionic surfactants using eq 1.

Figure 3. Number of descriptors used in the submodels vs R2.

9692 Ind. Eng. Chem. Res., Vol. 47, No. 23, 2008


substantial difference between the actual R2 (0.888, Table 4) and the averaged R2 from the scrambling procedure supports the stability of the model.
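The Y-scrambling check described above can be sketched as follows. This is only a minimal illustration, assuming an ordinary least-squares fit on a fixed descriptor matrix; the synthetic data, the function names `r_squared` and `y_scrambling`, and the random seeds are hypothetical and are not taken from the study.

```python
import numpy as np

def r_squared(X, y):
    """R^2 of an ordinary least-squares fit with an intercept term."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return 1.0 - ss_res / ss_tot

def y_scrambling(X, y, n_runs=20, seed=0):
    """Refit the model to randomly reordered responses and collect R^2."""
    rng = np.random.default_rng(seed)
    return np.array([r_squared(X, rng.permutation(y)) for _ in range(n_runs)])

# Synthetic stand-in for 162 compounds and 4 descriptors (not the real data).
rng = np.random.default_rng(1)
X = rng.normal(size=(162, 4))
y = X @ np.array([1.8, 2.9, -1.3, -0.5]) + rng.normal(scale=0.5, size=162)

r2_actual = r_squared(X, y)
r2_scrambled = y_scrambling(X, y, n_runs=20)
print(f"actual R2 = {r2_actual:.3f}")
print(f"scrambled R2: min = {r2_scrambled.min():.3f}, "
      f"max = {r2_scrambled.max():.3f}, mean = {r2_scrambled.mean():.3f}")
```

A large gap between the actual R2 and the scrambled values is the behaviour the test looks for. The sketch keeps the descriptor set fixed during scrambling; whether descriptors were reselected in each randomization of the study is not stated in the text.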

The model of Table 4 contains two fragment descriptors concerning the hydrophobic fragment: the “average complementary information content (order 2)” and the “average information content (order 0)”. Despite the similarity of their formulation, these descriptors are highly orthogonal (R = -0.309). The t test criterion was used to determine the descriptors’ significance, which is as follows: average complementary information content (order 2) > average information content (order 0) > FNSA-2 fractional PNSA (PNSA-2/TMSA) [Zefirov’s PC] > YZ shadow.

The most important descriptor, “average complementary information content (order 2)”, is defined as a sum taken over all the atomic layers in the coordination sphere of a given atom (eqs 2 and 3), where n is the total number of atoms, ni the number of atoms in the ith class, and k is the number of classes. The division of the atoms into classes is based on the coordination sphere defined for the molecule. For example, for the first-order index the atoms fall in the same class if they are of the same type and valence, while for the second order they need to have the same number of neighbors [52]. The positive sign of the regression coefficient implies that surfactants characterized by large, complex hydrophobic fragments will likely possess low CMCs.
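As an illustration of eqs 2 and 3, the following sketch computes the information content and its complement from a list of atom class labels. It assumes the standard Shannon form of eq 3 (with a leading minus sign) and takes the partition of atoms into classes as given; the function names and the example fragment are hypothetical, and grouping atoms by element for order 0 is an assumption of the example.

```python
import math
from collections import Counter

def information_content(classes):
    """Eq 3: -sum over classes of (n_i / n) * log2(n_i / n)."""
    n = len(classes)
    counts = Counter(classes).values()
    return -sum((c / n) * math.log2(c / n) for c in counts)

def complementary_information_content(classes):
    """Eq 2: log2(n) minus the information content."""
    return math.log2(len(classes)) - information_content(classes)

# Hypothetical example: a C8H17 fragment with atoms grouped by element.
fragment = ["C"] * 8 + ["H"] * 17
print(information_content(fragment))
print(complementary_information_content(fragment))
```

When every atom falls in its own class the complementary value is zero, and when all atoms share one class it reaches its maximum of log2 n.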

The “average information content (order 0)” descriptor is defined for the hydrophobic fragment of the surfactant and can be calculated similarly to eq 3. The positive regression coefficient implies that the larger the descriptor value, the lower the CMC of the nonionic surfactant. Its presence in the model might be related to the importance of steric hindrance in the hydrophobic area in the micellar state.

The “FNSA-2 fractional PNSA [Zefirov’s PC]” descriptor is defined by the ratio of the “total charge weighted partial negatively charged molecular surface area” to the “total molecular surface area” and reflects the negative charge redistribution within the molecule (a whole molecule descriptor). Its appearance in the model is probably connected to the presence of heteroatoms (in both the tails and the heads) and their ability to participate in donor-acceptor or dipole-dipole interactions, thus effectively increasing the surfactant solubility (and CMC) in the aqueous phase.

After the molecule is oriented in space along its axes of inertia, the shadows of the molecule are projected on the XY, XZ, and YZ planes [56]. The normalized shadow areas are calculated by applying a 2D square grid to the molecular projection and summing the areas of the squares that overlap with the projection. It is usually stated that the “shadow area” type of descriptors are related to the molecular volume (bulk). The unexpected negative sign of the descriptor coefficient (leading to increased CMCs with the increase of the YZ shadow descriptor) led us to analyze the dependence between the “YZ shadow” values and the molecular volume, which, surprisingly, were found to be almost orthogonal (R2 = 0.251).
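The grid-based shadow calculation can be sketched roughly as below. This is a simplified illustration, not the procedure of ref 56 or of any particular program: it assumes atoms are spheres with given radii, that the coordinates are already expressed in the frame of the axes of inertia, and that a grid square counts as covered when its centre lies inside the projection. The function name, grid spacing, and example coordinates are hypothetical, and no normalization is applied.

```python
import numpy as np

def yz_shadow_area(coords, radii, grid=0.05):
    """Approximate area of the projection of a set of spheres on the YZ plane."""
    coords = np.asarray(coords, dtype=float)
    radii = np.asarray(radii, dtype=float)
    y, z = coords[:, 1], coords[:, 2]   # the x coordinate is dropped by projection
    y_min, y_max = (y - radii).min(), (y + radii).max()
    z_min, z_max = (z - radii).min(), (z + radii).max()
    # Centres of the grid squares covering the bounding box of the shadow.
    gy = np.arange(y_min + grid / 2, y_max, grid)
    gz = np.arange(z_min + grid / 2, z_max, grid)
    Y, Z = np.meshgrid(gy, gz, indexing="ij")
    covered = np.zeros(Y.shape, dtype=bool)
    for yc, zc, r in zip(y, z, radii):
        covered |= (Y - yc) ** 2 + (Z - zc) ** 2 <= r ** 2
    return covered.sum() * grid ** 2

# Two hypothetical atoms stacked along x cast the same shadow as one atom.
one_atom = yz_shadow_area([[0.0, 0.0, 0.0]], [1.0])
stacked = yz_shadow_area([[0.0, 0.0, 0.0], [1.5, 0.0, 0.0]], [1.0, 1.0])
print(one_atom, stacked)
```

The stacked-atom example shows why a shadow area need not track molecular volume: atoms hidden behind one another along the projection axis add volume but no shadow.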

ANN Modeling. A sensitivity analysis performed by building 1-1-1 NN models was aimed at selecting a good starting set of descriptors related to the CMC. The descriptors characterized by the lowest error at the output were selected for further examination. Visual inspection of the scatter plots showing the variability of the CMCs with respect to the descriptors led to a combination of four descriptors: “average complementary information content (order 2)” and “average information content (order 1)”, both defined

Table 3. Statistical Characteristics of the Four-Descriptor Models for the A + B, A + C, and B + C Subsets

training set: A + B; test set: C; R2 = 0.872; R2cv = 0.861; R2ext = 0.895; F = 175.78; s2 = 0.236

ID  X         ∆X       t test  descriptors
0   -5.814    0.5164   -11.26  intercept
1   1.640     0.06894  23.78   average complementary information content (order 2)^a
2   0.1835    0.01153  15.92   number of F atoms
3   0.9222    0.08963  10.29   maximum Coulombic interaction for a C-C bond^a
4   -1.239    0.1927   -6.430  number of double bonds^a

training set: A + C; test set: B; R2 = 0.911; R2cv = 0.901; R2ext = 0.843; F = 263.89; s2 = 0.165

ID  X         ∆X       t test  descriptors
0   -5.429    0.3890   -13.96  intercept
1   2.371     0.08754  27.08   average complementary information content (order 2)^a
2   -1.965    0.1433   -13.71  FNSA-2 fractional PNSA (PNSA-2/TMSA) [Zefirov’s PC]
3   8.265     0.7242   11.41   average bonding information content (order 1)^a
4   -0.01106  0.0013   -8.231  YZ shadow

training set: B + C; test set: A; R2 = 0.884; R2cv = 0.873; R2ext = 0.856; F = 196.19; s2 = 0.212

ID  X         ∆X       t test  descriptors
0   -13.38    2.267    -5.901  intercept
1   1.843     0.06969  26.44   average complementary information content (order 2)^a
2   2.220     0.3116   7.124   average information content (order 0)^a
3   -4.012    0.6652   -6.031  FPSA-1 fractional PPSA (PPSA-1/TMSA) [Zefirov’s PC]
4   13.81     2.654    5.202   min (>0.1) bond order of a C atom^b

^a For hydrophobic fragment. ^b For hydrophilic fragment.

Table 4. Statistical Characteristics of the General QSPR Model (R2 = 0.888; R2cv = 0.879; F = 309.95; s2 = 0.203)

ID  X          ∆X        t test  descriptors
0   -4.270     0.2867    -14.89  intercept
1   1.781      0.05469   32.56   average complementary information content (order 2)^a
2   2.893      0.2408    12.02   average information content (order 0)^a
3   -1.336     0.1620    -8.246  FNSA-2 fractional PNSA (PNSA-2/TMSA) [Zefirov’s PC]
4   -0.009777  0.001272  -7.688  YZ shadow

^a For hydrophobic fragment.
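Read as a regression equation, the general model of Table 4 is a weighted sum of the four descriptors plus the intercept. The sketch below evaluates it, assuming the response is -log CMC (as plotted in Figure 5). The dictionary keys, the function name, and the example descriptor values are hypothetical placeholders, not values from the data set.

```python
# Coefficients X of the general four-descriptor model, copied from Table 4.
INTERCEPT = -4.270
COEFFS = {
    "acic2_hydrophobic": 1.781,   # average complementary information content (order 2)
    "aic0_hydrophobic": 2.893,    # average information content (order 0)
    "fnsa2": -1.336,              # FNSA-2 fractional PNSA (PNSA-2/TMSA)
    "yz_shadow": -0.009777,       # YZ shadow
}

def predict_neg_log_cmc(descriptors):
    """Return the predicted -log CMC for one surfactant (assumed response)."""
    return INTERCEPT + sum(COEFFS[name] * descriptors[name] for name in COEFFS)

# Hypothetical descriptor values chosen only to exercise the function.
example = {
    "acic2_hydrophobic": 3.0,
    "aic0_hydrophobic": 1.0,
    "fnsa2": -0.5,
    "yz_shadow": 50.0,
}
print(round(predict_neg_log_cmc(example), 3))
```

Because the response is taken as -log CMC, a positive coefficient lowers the predicted CMC and a negative one raises it, which matches the sign discussion in the text.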

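$$ {}^{k}\mathrm{CIC} = \log_2 n - {}^{k}\mathrm{IC} \qquad (2) $$

$$ {}^{k}\mathrm{IC} = -\sum_{i=1}^{k} \frac{n_i}{n} \log_2 \frac{n_i}{n} \qquad (3) $$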


for the hydrophobic fragment, “FNSA-2 fractional PNSA (PNSA-2/TMSA) [Zefirov’s PC]”, and “YZ shadow”. These descriptors were used as input vectors for the ANN.

To ensure a similar distribution of the data for the training and test subsets, the ABC separation method (as described above) was applied.

The A + B, A + C, and B + C subsets defined above were used to train the network. To avoid overtraining, the rms error of the respective validation sets was monitored during the training process. Training of the ANN was stopped at the epoch at which the rms for the validation set started to increase.
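The stopping criterion can be expressed compactly. The sketch below is one possible reading, assuming that training halts at the first epoch-to-epoch increase of the validation rms and that the previous epoch is kept; the function name and the rms values are hypothetical.

```python
def early_stopping_epoch(validation_rms):
    """Index of the last epoch before the validation rms first increases."""
    for epoch in range(1, len(validation_rms)):
        if validation_rms[epoch] > validation_rms[epoch - 1]:
            return epoch - 1
    # No increase was observed: keep the final epoch.
    return len(validation_rms) - 1

# Hypothetical validation rms trace recorded once per epoch.
trace = [0.90, 0.60, 0.40, 0.35, 0.37, 0.36]
print(early_stopping_epoch(trace))
```

In practice a tolerance of several epochs is often allowed before stopping, since validation error can fluctuate; the text does not say whether such a tolerance was used.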

To define the ANN topology, we constructed several ANN architectures, varying the number of neurons in the hidden layer from 2 to 6. It was found that the 4-5-1 architecture (i.e., 4 neurons in the input layer, 5 neurons in the hidden layer, and 1 neuron in the output layer) generalizes best and avoids overparameterization of the model. The ANN modeling results are shown in Table 5 and Figure 6 (the case of the “A + B” training set and “C” test set is shown).
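To make the overparameterization trade-off concrete, the sketch below counts the adjustable parameters of fully connected 4-h-1 networks for h = 2 to 6 and runs a forward pass through a 4-5-1 network. It assumes one bias per hidden and output neuron and a tanh hidden activation; neither is specified in the text, and the weights and inputs are random placeholders.

```python
import numpy as np

def n_parameters(n_in, n_hidden, n_out=1):
    """Weights plus biases of a fully connected n_in-n_hidden-n_out network."""
    return (n_in + 1) * n_hidden + (n_hidden + 1) * n_out

def forward(x, w1, b1, w2, b2):
    """Forward pass: tanh hidden layer (assumed) and a linear output neuron."""
    hidden = np.tanh(x @ w1 + b1)
    return hidden @ w2 + b2

# Parameter counts for the hidden-layer sizes examined (2 to 6 neurons).
param_counts = {h: n_parameters(4, h) for h in range(2, 7)}
print(param_counts)

# Random placeholder weights for a 4-5-1 network and three input vectors.
rng = np.random.default_rng(0)
w1, b1 = rng.normal(scale=0.1, size=(4, 5)), np.zeros(5)
w2, b2 = rng.normal(scale=0.1, size=(5, 1)), np.zeros(1)
x = rng.normal(size=(3, 4))
y_hat = forward(x, w1, b1, w2, b2)
print(y_hat.shape)
```

Under these assumptions the 4-5-1 network has 31 adjustable parameters, which is small relative to the roughly two-thirds of the 162 compounds used for training in each split.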

A comparison between the descriptors in Table 4 and the ANN model shows that they are almost identical, with the exception of the “average information content” descriptor, which in the ANN model is first order. A general comparison of the multilinear and nonlinear models shows that BMLR and ANN models provide results of similar quality. However, the multilinear model is relatively easy to interpret from a physicochemical point of view, revealing an insight regarding the micellization phenomenon, whereas the complexity of the ANN procedure does not allow a direct interpretation of the molecular descriptors.

7. Conclusions

A data set consisting of diverse classes of nonionic surfactants has been investigated to relate the logarithm of the CMC to their molecular structure. The topological, geometrical, and electrostatic descriptors involved in the QSPR model emphasize the importance of the size, complexity, and negative charge redistribution within the surfactant molecule for the aggregation phenomenon. This study gives information about the way in which these descriptors influence the micellization potential of the nonionic surfactants. The main models obtained in this work could be used for prediction, screening, and analysis of new nonionic surfactants similar to the compounds employed in the models. Specifically, the QSPR models reported are expected to provide reliable estimations for the following surfactant classes: branched and linear alkyl ethoxylates, octylphenyl ethoxylates, linear ethoxylated alcohols, octylphenols, alkanediols, alkyl mono- and disaccharides, ethoxylated alkylamines and alkylamides, fluorinated alkyl ethoxylates, carbohydrate derivatives, and dimeric surfactants.

Literature Cited

(1) Fuguet, E.; Rafols, C.; Roses, B. Critical micelle concentration of surfactants in aqueous buffered and unbuffered systems. Anal. Chim. Acta 2005, 548, 95.

(2) Arai, T.; Takasugi, K.; Esumi, K. Micellar properties of nonionic saccharide surfactants with amide linkage in aqueous solution. Colloids Surf. 1996, 119, 81.

(3) Paxton, T. R. Adsorption of emulsifier on polystyrene and poly(methyl methacrylate) latex particles. J. Colloid Interface Sci. 1969, 31, 19.

(4) Perkowski, J.; Mayer, J.; Ledakowicz, S. Determination of critical micelle concentration of non-ionic surfactants using kinetic approach. Colloids Surf. 1995, 101, 103.

(5) Kato, T.; Iwai, M.; Seimiya, T. Micellar growth in mixed anionic/cationic surfactant solutions: Sodium dodecyl sulfate/octyltrimethylammonium bromide. J. Colloid Interface Sci. 1989, 130, 439.

Figure 4. Number of descriptors used in the submodels vs R2.

Figure 5. Predicted vs experimental -log CMC values of the general model.

Figure 6. Predicted vs experimental -log CMC values (A + B training set and C test set case is shown).

Table 5. ANN Results for the Training and Test Subsets

training set  R2tr   std dev  test set  R2ext
A + B         0.946  0.309    C         0.947
A + C         0.971  0.232    B         0.938
B + C         0.951  0.297    A         0.942
average       0.956  0.279              0.942


(6) Ray, G. B.; Chakraborty, I.; Moulik, S. P. Pyrene absorption can be a convenient method for probing critical micellar concentration (CMC) and indexing micellar polarity. J. Colloid Interface Sci. 2006, 294, 248.

(7) Kumar, G.; Kanti, K.; Chandra, B. Physicochemical characteristics of reverse micelles of polyoxyethylene nonyl phenol in different organic solvents. J. Colloid Interface Sci. 2004, 279, 523.

(8) Thies, M.; Poschmann, M.; Hinze, U.; Paradies, H. H. Light scattering on micellar solutions of N,N′-butyl-n-alkyl-1,4-diazabicyclo-[2,2,2]-octane derivatives of 3′-O-acylthymidine-5′-phosphates. Colloids Surf. 1995, 101, 159.

(9) Huibers, P. D. T.; Lobanov, V. S.; Katritzky, A. R.; Shah, O. D.; Karelson, M. Prediction of critical micelle concentration using a quantitative structure-property relationship approach. 1. Nonionic surfactants. Langmuir 1996, 12, 1462.

(10) Huibers, P. D. T.; Lobanov, V. S.; Katritzky, A. R.; Shah, O. D.; Karelson, M. Prediction of critical micelle concentration using a quantitative structure-property relationship approach. 2. Anionic surfactants. J. Colloid Interface Sci. 1997, 187, 113.

(11) Li, X.-S.; Lu, J.-F.; Liu, J.-C. Studies on UNIQUAC and SAFT equations for nonionic surfactant solutions. Fluid Phase Equilib. 1998, 153, 215.

(12) Cheng, J.-S.; Chen, Y.-P. Correlation of critical micelle concentration for aqueous solutions of nonionic surfactants. Fluid Phase Equilib. 2005, 232, 37.

(13) Puvada, S.; Blankschtein, D. Thermodynamic description of micellization, phase behavior, and phase separation of aqueous solutions of surfactant mixtures. J. Phys. Chem. 1992, 96, 5567.

(14) van Lent, B.; Scheutjens, J. M. H. M. Influence of association on adsorption properties of block copolymers. Macromolecules 1989, 22, 1931.

(15) Becker, P. Hydrophile-Lipophile Balance: History and recent Developments. J. Dispersion Sci. Technol. 1984, 5, 81.

(16) Ravey, J. C.; Gerbi, A.; Stebe, M. J. Comparative study of fluorinated and hydrogenated nonionic surfactants. I. Surface activity properties and critical concentrations. Prog. Colloid Polym. Sci. 1988, 76, 234.

(17) Chen, C. C. Molecular thermodynamic model for Gibbs energy of mixing of nonionic surfactant solutions. AIChE J. 1996, 42, 3231.

(18) Saunders, R. A.; Platts, J. A. Correlation and prediction of critical micelle concentration using polar surface area and LFER methods. J. Phys. Org. Chem. 2004, 17, 431–438.

(19) Yuan, S.; Cai, Z.; Xu, G.; Jiang, Y. Quantitative structure-property relationships of surfactants: prediction of the critical micelle concentration of nonionic surfactants. Colloid Polym. Sci. 2002, 280, 630.

(20) Wang, Z.; Li, G.; Zhang, X.; Wang, R.; Lou, A. A quantitative structure-property relationship study for the prediction of critical micelle concentration of nonionic surfactants. Colloids Surf. 2002, 197, 37.

(21) Chen, L.-J.; Lin, S.-Y.; Huang, C.-C.; Chen, E.-M. Temperature dependence of critical micelle concentration of polyoxyethylenated non-ionic surfactants. Colloids Surf. 1998, 135, 175.

(22) Koga, K.; Ohyashiki, T.; Murakami, M.; Kawashima, S. Modification of Ceftibuten transport by the addition of nonionic surfactants. Eur. J. Pharm. Biopharm. 2000, 49, 17.

(23) Nakahara, Y.; Kida, T.; Nakatsuji, Y.; Akashi, M. New fluorescence method for determination of the critical micelle concentration by photosensitive monoazacryptand derivatives. Langmuir 2005, 21, 6688.

(24) Paddon-Jones, G.; Regismond, S.; Kwetkat, K.; Zana, R. Micellization of nonionic surfactant dimers and of the corresponding surfactant monomers in aqueous solutions. J. Colloid Interface Sci. 2001, 243, 496.

(25) Komorek, U.; Wilk, K. A. Surface and micellar properties of new nonionic Gemini aldonamide-type surfactants. J. Colloid Interface Sci. 2004, 271, 206.

(26) Molina-Bolivar, J. A.; Aguiar, J.; Peula-Garcia, J. M.; Ruiz, C. Surface activity, micelle formation, and growth of n-octyl-β-thioglucopyranoside in aqueous solutions at different temperatures. J. Phys. Chem. B 2004, 108, 12813.

(27) Mureau, N.; Trabelsi, H.; Guittard, F.; Geribaldi, S. Preparation and evaluation of monodisperse nonionic surfactants based on fluorine-containing dicarbamates. J. Colloid Interface Sci. 2000, 229, 440.

(28) Eastoe, J.; Paul, A.; Rankin, A.; Wat, R. Fluorinated nonionic surfactants bearing either CF3- or H-CF2- terminal groups: adsorption at the surface of aqueous solutions. Langmuir 2001, 17, 7873.

(29) Kjellin, U. R. M.; Reimer, J.; Hanson, P. An investigation of dynamic surface tension, critical micelle concentration, and aggregation number of three nonionic surfactants using NMR, time-resolved fluorescence quenching, and maximum bubble pressure tensiometry. J. Colloid Interface Sci. 2003, 262, 506.

(30) Castro, M.; Kovensky, J.; Fernandez-Cirelli, A. New family of nonionic Gemini surfactants, determination and analysis of interfacial properties. Langmuir 2002, 18, 2477.

(31) Ortona, O.; Vitagliano, V.; Paduano, L.; Costantino, L. Microcalorimetric study of some short-chain nonionic surfactants. J. Colloid Interface Sci. 1998, 203, 477.

(32) Gosh, S. K.; Khatua, P. K.; Battacharya, S. C. Characterization of micelles of polyoxyethylene nonylphenol (Igepal) and its complexation with 3,7-diamino-2,8-dimethyl-5-phenylphenazinium chloride. J. Colloid Interface Sci. 2004, 275, 623.

(33) Drummond, C. J.; Wells, D. Nonionic lactose and lactitol based surfactants: comparison of some physico-chemical properties. Colloids Surf. 1998, 141, 131.

(34) Savelli, M. P.; Van Roekeghem, P.; Douillet, O.; Cave, G.; Gode, P.; Ronco, G.; Villa, P. Effects of tail alkyl chain length (n), head group structure and junction (Z) on amphiphilic properties of 1-Z-R-D,L-xylitol compounds (R = CnH2n+1). Int. J. Pharm. 1999, 182, 221.

(35) www.hyper.com.

(36) Dewar, M. J. S.; Zoebisch, E. G.; Healy, E. F.; Stewart, J. J. P. AM1: A new general purpose quantum mechanical molecular model. J. Am. Chem. Soc. 1985, 107, 3902.

(37) www.codessa-pro.com.

(38) http://www.epa.gov/opptintr/exposure/pubs/episuite.htm.

(39) Katritzky, A. R.; Mu, L.; Lobanov, V.; Karelson, M. Correlation of boiling points with molecular structure. 1. A training set of 298 diverse organics and a test set of 9 simple inorganics. J. Phys. Chem. 1996, 100, 10400.

(40) Katritzky, A. R.; Tamm, K.; Kuanar, M.; Fara, D. C.; Oliferenko, A.; Oliferenko, P.; Huddleston, J. G.; Rogers, R. D. Aqueous biphasic systems. Partitioning of organic molecules: a QSPR treatment. J. Chem. Inf. Comput. Sci. 2004, 44, 136.

(41) Katritzky, A. R.; Kuanar, M.; Fara, D. C.; Hur, E.; Karelson, M. The classification of solvents by combining classical QSPR methodology with principal component analysis. J. Phys. Chem. A 2005, 109, 10323.

(42) Lučić, B.; Basic, I.; Nadramija, D.; Trinajstić, N.; Takahiro, S.; Petrukin, R.; Karelson, M.; Katritzky, A. Correlation of liquid viscosity with molecular structure for organic compounds using different variable selection methods. ARKIVOC 2002, 4, 45.

(43) Katritzky, A. R.; Fara, D. C.; Yang, H.; Karelson, M.; Suzuki, T.; Solov’ev, V. P.; Varnek, A. QSPR Modeling of beta-Cyclodextrin Complexation Free Energies. J. Chem. Inf. Comput. Sci. 2004, 44, 529.

(44) Katritzky, A. R.; Fara, D. C.; Karelson, M. QSPR of 3-aryloxazolidin-2-one antibacterials. Bioorg. Med. Chem. 2004, 12, 3027.

(45) Katritzky, A.; Dobchev, D.; Fara, D.; Karelson, M. QSAR treatment of drugs transfer into human breast milk. Bioorg. Med. Chem. 2005, 13, 1623.

(46) Katritzky, A. R.; Oliferenko, A.; Lomaka, A.; Karelson, M. Six membered cyclic ureas as HIV-1 protease inhibitors: a QSAR study based on CODESSA PRO approach. Bioorg. Med. Chem. Lett. 2002, 12, 3453.

(47) Beteringhe, A.; Balaban, A. T. QSAR for toxicities of polychlorodibenzofurans, polychlorodibenzo-1,4-dioxins, and polychlorobiphenyls. ARKIVOC 2004, 1, 163.

(48) Zefirov, N.; Palyulin, A. Fragmental approach in QSPR. J. Chem. Inf. Comput. Sci. 2002, 42, 1112.

(49) Klopman, G.; Zhu, H. Estimation of the aqueous solubility of organic molecules by the group contribution approach. J. Chem. Inf. Comput. Sci. 2001, 41, 439.

(50) Trepalin, S. V.; Gerasimenko, V. A.; Kozyukov, A. V.; Savchuk, N. P.; Ivaschenko, A. A. New Diversity Calculations Algorithms Used for Compound Selection. J. Chem. Inf. Comput. Sci. 2002, 42, 249.

(51) Bawden, D. Computerized chemical structure handling techniques in structure-activity studies and molecular property prediction. J. Chem. Inf. Comput. Sci. 1983, 23, 14.

(52) Karelson, M. In Molecular Descriptors in QSAR/QSPR; J. Wiley & Sons: New York, 2000.

(53) Kubinyi, H. QSAR in Drug Design. In Handbook of Chemoinformatics; Gasteiger, J., Ed.; Wiley-VCH: Weinheim, 2003; Vol. 4, pp 1532-1554.

(54) Katritzky, A. R.; Kuanar, M.; Dobchev, D.; Vanhoecke, B.; Karelson, M.; Parmar, V.; Stevens, C.; Bracke, M. QSAR modeling of anti-invasive activity of organic compounds using structural descriptors. Bioorg. Med. Chem. 2006, 14, 6933.

(55) Katritzky, A. R.; Slavov, S.; Dobchev, D.; Karelson, M. QSAR modeling of the antifungal activity against Candida albicans for a diverse set of organic compounds. Bioorg. Med. Chem. 2008, 16, 7055.

(56) Rohrbaugh, R. H.; Jurs, P. C. Descriptions of molecular shape applied in studies of structure/activity and structure/property relationships. Anal. Chim. Acta 1987, 199, 99.

Received for review June 17, 2008

Revised manuscript received August 26, 2008

Accepted September 10, 2008

IE800954K
