research article estimation of n -octanol/water...
TRANSCRIPT
Hindawi Publishing CorporationJournal of ChemistryVolume 2013 Article ID 740548 8 pageshttpdxdoiorg1011552013740548
Research ArticleEstimation of n-OctanolWater Partition Coefficients(logKOW) of Polychlorinated Biphenyls by Using QuantumChemical Descriptors and Partial Least Squares
Fang-Li Zhang1 Xing-Jian Yang1 Xiu-Ling Xue2 Xue-Qin Tao3
Gui-Ning Lu14 and Zhi Dang15
1 School of Environment and Energy South China University of Technology Guangzhou 510006 China2 College of Chemical Engineering Huaqiao University Xiamen 361021 China3 School of Environmental Science and Engineering Zhongkai University of Agriculture and Engineering Guangzhou 510225 China4Guangdong Provincial Key Laboratory of Atmospheric Environment and Pollution Control Guangzhou 510006 China5The Key Lab of Pollution Control and Ecosystem Restoration in Industry Clusters Ministry of Education Guangzhou 510006 China
Correspondence should be addressed to Gui-Ning Lu lgn36qqcom
Received 15 May 2013 Revised 29 July 2013 Accepted 22 August 2013
Academic Editor Enrico Veschetti
Copyright copy 2013 Fang-Li Zhang et al This is an open access article distributed under the Creative Commons Attribution Licensewhich permits unrestricted use distribution and reproduction in any medium provided the original work is properly cited
The n-octanolwater partition coefficient (log119870OW) is a useful parameter for the assessment of the environmental fate and impact ofxenobiotic trace contaminantsQuantitative structure-activity relationship (QSAR)model for log119870OW of polychlorinated biphenyls(PCBs) was analyzed by using the density functional theory at B3LYP6-31G(d) level and the partial least squares (PLS)methodwithan optimizing procedure A PLS model with reasonably good coefficient (1198772 = 0992) and cross-validation test (1198762cum = 0988)values was obtained All the predicted values are within the range of plusmn03 log unit from the observed values The log119870OW values of7 PCBs in the test set predicted by the model are very close to those observed indicating that this model has high fitting precisionand good predictability The PLS analysis showed that PCBs with larger electronic spatial extent and lower molecular total energyvalues tend to be more hydrophobic and lipophilic
1 Introduction
Polychlorinated biphenyls (PCBs) are highly stable industrialchemical products that have been used as industrial fluidsflame retardants diluents hydraulic fluids and dielectric flu-ids Due to careless disposal practices and chemical stabilityPCBs are some of the most prevalent pollutants in the totalenvironment Generally a sum of 209 theoretical PCBs existand approximately 150 PCBs are found in the environment[1]
Despite the fact that the production and the use of PCBshave long been banned they are still considered environ-mental contaminants as they are among the most widespreadand persistent man-made products in the ecosystem It iswell known that these chemicals are toxic and lipophilic and
tend to be bioaccumulated Some interfere with mammalianand avian reproduction and others disturb humanrsquos germcell development or are promoters of cancer Some PCBs areincluded in the increasing list of environmental pollutantsthat are able to mimic estrogen action and are termedldquoenvironmental estrogensrdquo [2]
PCBs are categorized as persistent organic pollutants(POPs) [3] Individual PCB congeners exhibit differentphysicochemical properties and biological activities thatresult in different environmental distributions and toxicityprofiles The variable composition of PCB residues in envi-ronmental matrices and their different mechanisms of toxi-city complicate the development of scientifically based regu-lations for the risk assessment [4] The n-octanolwater dis-tribution coefficient (119870OW) is an important physicochemical
2 Journal of Chemistry
property suitable for the characterization of the lipophilicityof the compounds such as PCBs119870OW has been shown to cor-relate with aquatic solubility toxicity and bioaccumulation[5ndash7] and as such it is a useful parameter for the assessmentof the environmental fate and impact of xenobiotic tracecontaminants [8] During the last 50 years 119870OW has provento be a valuable tool in many areas of natural sciences chem-istry biochemistry biology pharmacology environmentalsciences and so forth During that period many hundredsor even thousands of scientific publications have been madein which the successful application of 119870OW in correlationwith many physical chemical or biological properties hasbeen demonstrated for a large variety of organic chemicalsUnfortunately many correlations have been also publishedin which the processes studied have nothing in commonwith the process of partition between water and n-octanolphases It must be emphasized that in order to correlate twoproperties do not have to be related both of themmay simplybe related to a third property The shake flask is the classicalmethod for measuring 119870OW and this procedure will givereproducible and accurate results for chemicalswith log119870OWAccurate results can also be obtained for more hydrophobicchemicals but this will need a very precise and skillfulhandling and the use of very pure n-octanol The problemwith this procedure is that the complete phase separation isdifficult to achieve and this may lead to the underestimationof119870OW particularly for themore hydrophobic chemicals [9]
It is impractical to measure the 119870OW of all the PCBsdirectly in the laboratory because of the extremely lowwater solubility of some high-chlorinated PCBs The lackof complete reliable and comparable data has led to thedevelopment of different119870OW estimation methods With theadvent of inexpensive and rapid computation there has beena remarkable growth in the area of quantitative structure-activity relationships (QSAR) which correlate the proper-ties of pollutants with relevant properties and moleculardescriptors [11] A large number of calculation methods havebeen presently developed for estimation of the partitioncoefficients with varying success and applicability Thesemethods are cable of predicting environmental behaviors notonly for common environmental pollutants but also for thosechemicals that have not yet been synthesized or those thatcannot be examined experimentally due to their extremelyhazardous nature According to the descriptors used thesemethods can be classified into two groups empirical andtheoretical methods [12] Hansen et al [1] used the degreeof chlorination in the orthoposition of the PCB molecule aspredictor variable together with either the calculated molec-ular total surface area (TSA) or the gas chromatographicrelative retention times (RRT) to predict the log119870OW andsoil sorption coefficients (log119870OC) for all 209 PCB congenerswhich were based on 53 and 48 experimental data pointsrespectively Also reporting the relationship between 119870OWand 119870OC which is promising for 119870OW could proceed with119870OC However TSA and RRT are both empirical moleculardescriptors Firstly the accurate determination of empiricalmolecular descriptorsmay be difficult and expensive in termsof cost and time or even impossible for some PCBs whichmight have been not synthesized or purified Meanwhile the
experimental errors may be introduced especially for thosecongeners difficult to be separated and identified by chro-matography Furthermore the empirical molecular descrip-tors often reflect complex and multiple physical interactionsHence the interpretation of the respective property could bedifficult and ambiguous [13] Many attempts have been madeto model the physicochemical properties and bioactivities ofPCBs by utilizingQSAR techniques [14ndash21] For instanceQinet al [17] reported the correlation of bioconcentration factorsand molecular electronegativity distance vector (MEDV)of PCBs based on molecular structure Tuppurainen andRuuskanen [18] introduced a newQSARdescriptor electroniceigenvalue (EEVA) based on molecular orbital energies topredict the Ah receptor binding affinity of PCBs polychlo-rinated dibenzo-p-dioxins and dibenzofurans (PCDDFs)Daren [19] reported QSPR studies of119870OW 119878W 119884W and119870OCof all 209 PCBs and biphenyl by the combination of geneticalgorithms and PLS analysis
Modern theoretical method in quantum chemistry suchas density functional theory (DFT) with high calculation pre-cision should have its advantages in estimating the propertiesof environmental toxics such as PCBs PCDDFs [13 22]The aim of the present study is to explore QSAR modelrelating to log119870OW of 27 PCBs by using partial least squares(PLS) analysis with optimizing procedure based on reportedlog119870OW data and quantum chemical descriptors computedby DFT contained in Gaussian 03 in an attempt to developreliable and predictive QSARmodel and identify the primarymolecular factors governing the 119870OW of PCBs
2 Materials and Methods
21 Target PCBs A total of 27 PCBs whose 119870OW andorlog119870OW were previously published were chosen to constitutethe training set and the test set in this work The trainingset consists of 20 of these 27 PCBs selected randomly andthe remaining 7 PCBs constitute the test set Their chemicalabstracts service numbers (CAS no) and reported log119870OWare listed in Table 1
22 Calculation and Selection of the Descriptors The molec-ular modeling system HyperChem (Release 70 HypercubeInc 2002) was used to construct and view all molecularstructures Molecular geometry was optimized and quantumchemical descriptors were computed using the B3LYP hybridfunctional of DFT in conjunction with 6-31G(d) a split-valence basis set with polarization function The B3LYPcalculations were performed using the quantum chemicalcomputation software Gaussian 03 [23] All calculations wererun on an Intel Core i5-2410M CPU230GHz computerequipped with 200GB of internal memory and MicrosoftWindows 7 professional operating system
According to the chemometrics theory it is suggested thataQSARmodel should include asmany relevant descriptors aspossible to increase the probability of a good characterizationfor a class of compounds [24] In this study 18 independentquantum chemical variables were selected for developingQSAR models The descriptors cover the dihedral angle of
Journal of Chemistry 3
Table 1 n-Octanolwater partition coefficients of the PCBs studied
Noa Compounds CAS no Observed log119870OWb Predicted log119870OW
Reference [10] Model I Model II1 Biphenyl 92-52-4 410 414 396 3872 2-Chlorobiphenyl 2051-60-7 456 464 456 4563 3-Chlorobiphenyl 2051-61-8 472 463 477 4774 221015840-Dichlorobiphenyl 13029-08-8 502 520 503 5055 24-Dichlorobiphenyl 33284-50-3 515 516 505 5066 2210158405-Trichlorobiphenyl 37680-65-2 564 573 557 5597 2441015840-Trichlorobiphenyl 7012-37-5 574 573 582 5808 245-Trichlorobiphenyl 15862-07-4 577 576 557 5609 23101584045-Tetrachlorobiphenyl 73557-53-8 639 630 658 65810 331015840441015840-Tetrachlorobiphenyl 32598-13-3 652 639 668 67111 2210158403451015840-Pentachlorobiphenyl 38380-02-8 685 684 685 69112 2210158404551015840-Pentachlorobiphenyl 37680-73-2 685 680 685 68613 23456-Pentachlorobiphenyl 18259-05-7 685 686 685 68814 221015840331015840441015840-Hexachlorobiphenyl 38380-07-3 744 743 745 75315 221015840441015840551015840-Hexachlorobiphenyl 35065-27-1 744 737 748 74316 221015840441015840661015840-Hexachlorobiphenyl 33979-03-2 712 739 741 73717 221015840331015840441015840551015840-Octachlorobiphenyl 35694-08-7 868 860 860 85918 221015840331015840551015840661015840-Octachlorobiphenyl 2136-99-4 842 850 858 85419 2210158403310158404410158405510158406-Nonachlorobiphenyl 40186-72-9 914 908 890 89020 221015840331015840441015840551015840661015840-Decachlorobiphenyl 2051-24-3 960 972 934 93021 4-Chlorobiphenyl 2051-62-9 469 469 450 45122 25-Dichlorobiphenyl 34883-39-1 518 516 515 51123 441015840-Dichlorobiphenyl 2050-68-2 528 529 534 52524 3441015840-Trichlorobiphenyl 38444-90-5 590 577 598 59725 221015840551015840-Tetrachlorobiphenyl 35693-99-3 626 629 639 63526 231015840410158405-Tetrachlorobiphenyl 32598-11-1 639 631 653 65727 2210158403410158405510158406-Heptachlorobiphenyl 52663-68-0 793 795 804 805aCompounds no 1ndash20 constitute the training set and compounds No 21ndash27 constitute the test set bfrom [10]
the two benzene ring plans (DA) the eigenvalue of thehighest occupied molecular orbital (119864HOMO) the eigenvalueof the lowest unoccupied molecular orbital (119864LUMO) theMulliken atomic charges on 12 carbon atoms from C1 to C12(119876C1ndash119876C12 Figure 1 shows the numbering of selected atomson themolecular structure) the electronic spatial extent (Re)the dipole moment (120583) and molecular total energy (TE) Allthe above descriptors were obtained directly in the outputfiles of Gaussian 03 calculation through a single full opti-mizing process for the molecular structure B3LYP6-31G(d)FOPT The values of these quantum chemical descriptors arelisted in Table 2 The units of dihedral angle atomic chargeelectronic spatial extent dipole moment and energy are asfollows degree atomic charge unit atom unit Debye andHartree respectively
23 Modeling Method and Evaluation Indexes Since a bignumber of descriptors were selected in this study intercor-relation of independent variables (multicollinearity) mightbecome a technical problem It was especially difficultwhen the number of independent variables was equal toor greater than the number of compounds selected in the
ClX ClY
1
2 3
4
56
9 8
7
1211
10
Figure 1 Numbering of selected atoms in the molecular structure119883 and119884 are integers between 0 and 5 inclusiveThe positions 7 8 910 11 and 12 are corresponding to 11015840 21015840 31015840 41015840 51015840 and 61015840 respectively
training set The common regression analysis frequentlyused in QSAR studies became ineffective because over-fitting readily occurred To overcome these problems weused the PLS regression a well-known multivariate methodwhich gives a stepwise solution for a regression modelIt extracts principal component-like latent variables fromoriginal independent variables (predictor variables) anddependent variables (response variables) respectively [24]This method can analyze data which are strongly collinearand more importantly it is a remedy to overfitting PLS finds
4 Journal of Chemistry
Table2Quantum
chem
icaldescrip
torsforthe
PCBs
studied
No
DA119864HOMO119864LU
MO
119876C1
119876C2
119876C3
119876C4
119876C5
119876C6
119876C7
119876C8
119876C9
119876C10
119876C11
119876C12
Re120583
TE1
000minus021730minus00330
0010126minus017937minus01339
5minus01248
7minus01339
5minus017937
010126minus017937minus01339
5minus01248
7minus01339
5minus017937
2336
596
000
00minus46
330
272
5613minus02322
3minus002
747
007
290minus01156
4minus01156
4minus012189minus013101minus01608
9006
717minus01479
3minus01354
8minus01244
2minus013374minus01646
12764
356
16832minus92
289
733
000minus022
671minus004
365
01034
9minus01805
3minus006
979minus012783minus013333minus017359
01003
5minus017756minus01342
1minus01230
2minus013374minus01784
63399
230
21220minus92
289
894
11008minus024
310minus002
654
005
227minus01026
5minus01368
5minus01192
6minus013195minus01356
6005
227minus01026
5minus01368
5minus01192
6minus013195minus01356
63242
151
200
16minus1382
489
95
5559minus023
637minus003
739
007
579minus01130
4minus01368
0minus005
946minus013118minus01600
7006
673minus014837minus013517minus01236
3minus01336
1minus01649
04150
644
21687minus1382
492
16
8121minus024
564minus00322
0004
787minus008
812minus01356
1minus011931minus006
799minus014376
004
642minus009
074minus01370
2minus01179
6minus01306
2minus014315
4428
552
1857
1minus1842
084
47
5536minus023937minus004
578
007
547minus011313minus01365
3minus005
901minus013102minus01598
8006
733minus01448
2minus01358
7minus006
134minus01343
5minus016175
6126
716
1429
7minus1842
087
88
5616minus024
156minus004
537
007
940minus010817minus013619minus006
867minus007
657minus01623
8006
469minus01476
3minus01349
4minus01225
1minus013332minus01636
650
70323
226
21minus1842
082
09
12687minus024
760minus005
378
007
904minus010835minus01357
7minus006
785minus007
710minus016166
006
579minus016413minus007
033minus01236
3minus01336
4minus01428
567
00831
1948
5minus23
01677
210
000minus023
831minus006
630
01096
3minus01783
8minus007
907minus00744
6minus01334
7minus01697
001096
3minus01783
8minus007
907minus00744
6minus01334
7minus01697
080
12630
267
16minus23
016757
118360minus02529
8minus004
434
005
232minus009
797minus009
264minus006
419minus01280
4minus01343
3004
160minus008
793minus013512minus01182
2minus006
856minus014180
718116
2264
02minus2761263
412
8064minus02536
0minus004
744
004
992minus008
328minus01374
8minus006
497minus007
717minus01425
1004
568minus008
834minus013512minus011755minus006
869minus01427
77317571
15120minus2761268
013
8995minus025
762minus004
957
006
408minus009
995minus008
788minus007
241minus008
788minus009
995
002
910minus01336
9minus013182minus012112minus013182minus01336
963
48796
21628minus2761249
114
8565minus026
235minus004
741
004
784minus009
803minus009
259minus006
444minus01280
2minus01338
8004
784minus009
803minus009
259minus006
444minus01280
2minus01338
888
80956
31952minus3220
848
115
8114minus025
785minus005
272
004
819minus008
337minus013735minus006
481minus007
739minus014173
004
819minus008
337minus013735minus006
481minus007
739minus014173
9135369
002
78minus3220
8572
1690
00minus026
706minus004
525
004
104minus007
511minus013762minus005
110minus013762minus007
511
004
104minus007
511minus013762minus005
110minus013762minus007
511
7602
549
000
00minus3220
859
717
8588minus026
513minus005
927
004
910minus009
391minus009
064minus007
515minus007
354minus013392
004
910minus009
391minus009
064minus007
515minus007
354minus013392
1158
265
611
648minus4140
024
218
8998minus026
008minus005
592
003
906minus008
305minus007
831minus01143
7minus007
831minus008
305
003
906minus008
305minus007
831minus01143
7minus007
831minus008
305
9319431
000
00minus4140
027
719
9083minus026
526minus006
331
005
601minus009
023minus008
906minus006
972minus008
907minus009
023
003
596minus008
464minus009
111minus00743
5minus00744
7minus01197
81208
396
910
110minus45
99608
620
9001minus026
670minus006
373
004
327minus007
929minus008
962minus006
896minus008
962minus007
929
004
327minus007
929minus008
962minus006
896minus008
962minus007
929
1257959
3000
00minus50
5919
3321
000minus022
180minus004
264
010353minus01765
2minus01354
1minus006
248minus01354
1minus01765
201034
5minus017931minus01336
7minus012379minus01336
7minus017931
3744
110
21391minus92
289
9022
5589minus023
832minus003
813
00748
7minus01129
8minus01339
4minus012217minus006
823minus016185
006
624minus014755minus013515minus012312minus013355minus016419
3923
000
07279minus1382
492
223
000minus022
591minus005
162
010551minus01763
9minus01350
5minus006
209minus01350
5minus01763
9010551minus01763
9minus01350
5minus006
209minus01350
5minus01763
956
91897
000
00minus1382
495
024
000minus02322
9minus005
925
01124
1minus01806
1minus007
848minus007
511minus01338
0minus01706
7010325minus01745
4minus013518minus006
122minus01348
1minus017556
6796
892
15815minus1842
0855
2582
68minus025
010minus003
958
004
559minus008
791minus013527minus011819minus006
836minus014172
004
559minus008
791minus013527minus011819minus006
836minus014172
5819703
02171minus23
01678
626
5556minus024
687minus005
342
00746
2minus01130
6minus013333minus01200
8minus006
896minus016122
007
015minus01450
7minus008
147minus007
164minus01324
1minus01556
668
27221
239
61minus23
01677
927
9059minus025
929minus005
534
005
332minus009
406minus007
793minus011522minus007
793minus009
406
003168minus00745
5minus013741minus006
394minus007
807minus012514
9224
954
10323minus36
80442
4
Journal of Chemistry 5
Table 3 Fitting results for PLS model I and model II
Model 119884 119883 ℎ 1198772
1198831198772
119883(cum) 1198772
1198841198772
119884(cum) Eig 1198762
1198762
cum
I log119870OW 1198831a 1 0519 0519 0928 0928 935 0922 0922
2 0172 0691 0065 0992 309 0841 0988
II log119870OW 1198832b 1 0542 0542 0928 0928 921 0921 0921
2 0181 0723 0063 0992 308 0830 0987a1198831 containing all 18 independent variables
b1198832 the variable 120583 was eliminated from matrix X for the model
the relationship between a matrix Y (containing dependentvariables often only one for QSAR studies) and a matrix X(containing predictor variables) by reducing the dimension ofthe matrices while concurrently maximizing the relationshipbetween them
SIMCA-P (Version 105 Umetrics AB 2004) software wasemployed to perform the PLS analysis The default valuesgiven by the software were used as the initial conditionsfor computation According to the userrsquos guide to SIMCA-P[25] the criterion used to determine the model dimension-ality the number of significant PLS components is cross-validation (CV) With CV the tested PLS component isthought significant if the fraction of the total variation ofthe dependent variables estimated by a component 1198762 ishigher than a significance limit (00975) for thewhole datasetThe model is thought to have a good predicting ability whenthe cumulative 1198762 for the extracted components 1198762cum ismore than 0500 [25] Model adequacy was assessed basedprimarily on the number of PLS principal components (ℎ)1198762
cum the squared correlation coefficient between observedvalues and fitted values (1198772) the standard error (SE) of theestimate the variance ratio of variance analysis (119865) and thesignificance level (119901)
3 Results and Discussion
31 Modeling and Optimizing For the 20 PCBs contained inthe training set PLS analysis was performed with log119870OWas the dependent variable and the 18 independent variablesas the predictor variables yielding QSAR model I For thismodel we have 1198772 = 0992 SE = 0145 119865 = 212 times 103and 119901 = 400 times 10minus20 The fitting results for this modelare listed in Table 3 In Table 3 1198772
119883and 1198772
119884stand for the
fraction of the sum of squares of all the Xrsquos and Y rsquos explainedby the current component 1198772
119883(cum) and 1198772
119884(cum) stand forthe fraction of the cumulative variance of all the Xrsquos and Y rsquosexplained by all extracted components respectively andEigstands for eigenvalue which denotes the importance of thePLS principal component
Variable importance in the projection (VIP) is a parame-ter that shows the importance of a variable in a model in theassistant analysis of PLS modeling According to the manualof SIMCA-P termswith large VIP (gt10) are themost relevantfor explaining dependent variables [25] To obtain an optimalmodel the following PLS analysis procedure was adopted APLSmodel with all the predictor variables was first calculatedand then the variable with the lowest VIP was eliminatedand a new PLS regression was performed yielding a new PLS
model This procedure was repeated until an optimal modelwas obtained The optimal model was selected with respectto 1198762cum 119877
2 SE 119865 and 119901 According to the statistics andmetrics theories a QSAR model with larger values of 1198762cum1198772 and 119865 and smaller values of SE and 119901 tends to be more
stable and reliable than in the opposite caseThe above described PLS analysis procedure with
log119870OW as dependent variable and the 18 independentvariables as the initial predictor variables for the 20 PCBscontained in the training set resulted in model II For thismodel we have 1198772 = 0992 SE = 0151 119865 = 197 times 103 and119901 = 762 times 10
minus20 The fitting results for this model are shownin Table 3 As shown in the table based on the estimatedindexes model I is better than model II indicating all thatthe selected 18 independent variables are necessary for thePLS modeling In model I 2 PLS principal componentswere selected in the model which explained 691 of thevariance of the predictor variables and 992 of the varianceof the dependent variable Based on the estimated indexesemployed in this study this model is statistically significant
32 Analysis and Discussion Based on the optimal modelI the predicted log119870OW for the 20 PCBs contained in thetraining set are as listed in Table 1 A comparison of theobserved and predicted log119870OW is shown in Figure 2 Ascan be seen in Table 1 and Figure 2 the predicted values arewithin the range of plusmn03 log unit from the observed valuesThe observed log119870OW values are close to those predicted bythe optimalmodel and the correlation between observed andpredicted log119870OW is significant For the PCBs studied thedifferences between observed and predicted log119870OW are verysmall so the predicted values are acceptable Note that thecross-validated 1198762cum value (=0988) of the optimal model ismuch larger than 0500 it would seem that thismodel is stableand has good prediction ability and can be used to predict the119870OW of PCBs effectively
WhenX andY are unscaled and uncentered the unscaledcoefficients of the independent variables and the constanttransformed from the PLS results are listed in Table 4 Thesigns of the coefficients of the independent variables in themodel enable one to see the direction in which each affectsthe119870OW of PCBs Based on the unscaled coefficients a QSARequation like that obtained from multiple linear regressioncan be created for the optimal model and the predicted119870OW for other PCBs can be estimated Based on model Ipredictions of 119870OW for the 7 PAHs contained in the testset were generated and listed in Table 1 and the comparisonof observed and predicted log119870OW is shown in Figure 2
6 Journal of Chemistry
4 5 6 7 8 9
4
5
6
7
8
9
Training setTest set
Observed log KOW
Pred
icte
d lo
g KO
W
Figure 2 Plot of observed values versus those predicted by theoptimal model I
Table 4The unscaled coefficients and VIPs of the optimal model I
Variables Unscaled coefficients VIPsTE minus1645 times 10minus4 1412Re 0747 times 10minus4 1404119864HOMO minus10390 1274119864LUMO minus20295 1220119876C6 3418 1082119876C12 3366 1074119876C8 1924 1009119876C11 5212 0992119876C1 minus1999 0947119876C9 7412 0940119876C7 minus1975 0926119876C2 1087 0922119876C10 4711 0834DA 5726 times 10minus4 0794119876C3 6512 0756119876C5 3434 0716119876C4 3535 0690120583 minus0047 0473Constant 6995
The correlation coefficient between observed and predictedlog119870OW in the test set was as high as 0995 larger than thatin the training set As can be seen in Table 1 and Figure 2the predicted log119870OW for the test set are very close tothose observed This indicates that the PLS model has goodpredictive ability
The VIP values for the optimal model were calculatedand listed in Table 4 We can find that the VIP values of TERe 119864HOMO and 119864LUMO are all greater than 10 indicatingthat they are important in governing the 119870OW values of
PCBs TE and Re are the two most important variablessince their VIP values are larger than 140 which is greaterthan the VIP values of the other variables It is expectedthat log119870OW increases with the increase of molecularsize Considering the following examples 4-chlorobiphenyl(log119870OW = 469 molecular weight (MW) = 1885) 4 41015840-dichlorobiphenyl (log119870OW = 528 MW = 2230) 2 4 41015840-trichlorobiphenyl (log119870OW = 574 MW = 2575) 3 31015840 4 41015840-tetrachlorobiphenyl (log119870OW = 652 MW = 2920) whichdiffer between themselves by the number of chlorine sub-stituents the increase in log119870OW as a function of the numberof chlorine substituents and molecular sizes is clearly seenWhen H atom is replaced by Cl atom on the biphenyl119870OW value will be increased [26] So the more chlorineatoms on the biphenyls molecular the greater the119870OW valueWhen isomers share the same molecular weight molecu-lar sizes differ from the arrangements of chlorine atomswhich can be expressed by molecular volume and surfacearea leading to the difference in log119870OW 2-Chlorobiphenyl(log119870OW = 456) 3-chlorobiphenyl (log119870OW = 472) and 4-chlorobiphenyl (log119870OW = 469) are isomers (MW = 1885)but they have different n-octanolwater partition coefficients
The variety of molecular size could be described bythe quantum chemical descriptors of TE and Re in thiswork Firstly for the whole compounds set in this study thedecrease of TEvalues indicates that the molecule containsmore chlorine atoms which might lead to the increase ofmolecular volume Zou et al [27] found that molecularvolume and molecular surface area were highly correlatedfor PCBs sharing similar structures So the decrease of TEvalues might lead to the increase of molecular surface areaMolecular volume and molecular surface area are measuresof molecular size In general the bigger the molecular sizethe stronger the effect of the intermolecular dispersion forceOn the other hand the molecular volume is a measure ofabsorbed energy of the molecule when it forms the holein a solvent Due to stronger polar of water moleculescontributing to stronger binding energy the larger solutemolecules need to absorbmore energy to destroy the bindingbetween water molecules when distributing in the aqueousphase resulting in tending to be allocated in weakly polarsolvents It means that the decrease of TE values might leadto larger log119870OW which is consistent with the study of Zou etal [27] Secondly Re is the expectation value of the operator120588 in the formula of int 1199032120588(119903)1198893119903 It is occasionally seen inthe literature as a measure of molecular volume Althoughin many cases Re correlates poorly with molecular volumethe correlation for PCBs sharing similar structures is quitesignificant A PCB molecule with large electronic spatialextent will have large molecular volume Therefore it can beconcluded that PCBs molecules with lower molecular totalenergy values and larger electronic spatial extent tend to bemore hydrophobic and lipophilic leading to larger log119870OWvalues This implies that the decrease of the TE value and theincrease of the Re value lead to an increase in log119870OW value
The value of log119870OW is in inverse ratio to 119864HOMO and119864LUMO that is large value of 119864HOMO and 119864LUMO would resultin small value of log119870OW for PCBs This is because 119864HOMO
Journal of Chemistry 7
represents the proton acceptance ability in forming hydrogenbond while 119864LUMO represents the proton donation ability inthe formation of hydrogen bond Therefore the compoundswith large values of 119864HOMO and 119864LUMO tend to donate oraccept protons easily In this case hydrogen bond can beeasily formed between PCBs and H
2O molecules and PCBs
are able to enter water phase resulting in low log119870OW value[13]
33 Comparison with the Results from Other Models In [13]there are 3 variables (119864HOMO 119864LUMO and 120572) obtained fromB3LYP6-31G(d) level in the model and the squared correla-tion coefficient is 1198772 = 09484 less than that of the optimalmodel in the present study It was found that there are highcorrelations relating the number of Cl substitution position(119873) to120572 and119864HOMO And taking119873 as theoretical descriptorsto correlate the three-variable models for predicting log119870OWof PCBs the achieved new model is having 1198772 = 09497and exhibiting optimum stability convenience for use andbetter predictive power than that from AM1 method [28]In contrast the squared correlation coefficient 1198772 = 0992obtained by PLS in the present study is obviously larger thanthat reported
Taking molar weight (119872) and melting point (119879119898 ∘C) as
independent variables Wang et al [10] had developed thebinary linear regression equation
log119870OW = 00155119872 + 000107 (119879119898 minus 25)
+ 171 (119899 = 27 119903 = 0997)
(1)
whose predicted log119870OW are also listed in Table 1 whichseems superior to the optimal model I in the present studyobtained solely on quantum chemical descriptors Howeverthe variable melting point in the model equation is anexperimental physicochemical parameter It is impracticalto determine the melting points of all 209 PCBs directly inthe laboratory which makes the model become of limitedapplicability Therefore our optimal model based only onquantum chemical descriptors has wider applicability thanthat based on known physicochemical propertiesThe overallhigh quality of themodel obtained in this study indicates thatit will find application in the estimation of 119870OW of PCBs forwhich experimental values are lacking
4 Conclusions
In this work based on some quantum chemical descriptorscomputed by DFT at the B3LYP6-31G(d) level by the useof PLS analysis QSAR models were obtained for logarithmicn-octanolwater partition coefficients of PCBs The squaredcorrelation coefficient of the optimalmodel is 0992Theopti-mal model has high fitting precision and good predictabilityso it can be applied in the estimation of the n-octanolwaterpartition behavior of PCBs It can generally be concludedthat PCBs with larger electronic spatial extent and lowermolecular total energy values tend to be more hydrophobicand lipophilic
Conflict of Interests
The authors declare no conflict of interests
Acknowledgments
This study was financially supported by the National NaturalScience Foundation of China (21007017) the Program forNew Century Excellent Talents in University (NCET-12-0199) the Specialized Research Fund for the Doctoral Pro-gram of Higher Education (20100172120037) the China Post-doctoral Science Foundation (201003350) and the Guang-dong Natural Science Foundation (9351064101000001)
References
[1] B G Hansen A B Paya-Perez M Rahman and B RLarsen ldquoQSARs for 119870OW and 119870OC of PCB congeners a criticalexamination of data assumptions and statistical approachesrdquoChemosphere vol 39 no 13 pp 2209ndash2228 1999
[2] S-S Liu Y Liu D-Q Yin X-D Wang and L-S WangldquoPrediction of chromatographic relative retention time ofpolychlorinated biphenyls from the molecular electronegativitydistance vectorrdquo Journal of Separation Science vol 29 no 2 pp296ndash301 2006
[3] United Nations Environment Programme ldquoStockholm Con-vention on Persistent Organic Pollutantsrdquo 2001 httpwwwpopsintdocumentsconvtextconvtext enpdf
[4] J P Giesy and K Kannan ldquoDioxin-like and non-dioxin-liketoxic effects of polychlorinated biphenyls (PCBs) implicationsfor risk assessmentrdquo Critical Reviews in Toxicology vol 28 no6 pp 511ndash569 1998
[5] C T Chiou V H Freed DW Schmedding and R L KohnertldquoPartition coefficient and bioaccumulation of selected organicchemicalsrdquo Environmental Science and Technology vol 11 no 5pp 475ndash478 1977
[6] H Konemann ldquoStructure-activity relationships and additivityin fish toxicities of environmental pollutantsrdquoEcotoxicology andEnvironmental Safety vol 4 no 4 pp 415ndash421 1980
[7] G Patil ldquoCorrelation of aqueous solubility and octanol-waterpartition coefficient based on molecular structurerdquo Chemo-sphere vol 22 no 8 pp 723ndash738 1991
[8] DMackay A BobraW Y Shiu and SH Yalkowsky ldquoRelation-ships between aqueous solubility and octanol-water partitioncoefficientsrdquo Chemosphere vol 9 no 11 pp 701ndash711 1980
[9] A Sabljic H Gusten J Hermens andAOpperhuizen ldquoModel-ing octanolwater partition coefficients by molecular topologychlorinated benzenes and biphenylsrdquo Environmental Scienceand Technology vol 27 no 7 pp 1394ndash1402 1993
[10] L S Wang Y H Zhao and H Gao ldquoPredicting aqueoussolubility and octanolwater partition coefficients of organicchemicals from molar volumerdquo Environmental Chemistry vol11 pp 55ndash70 1992
[11] G-N Lu Z Dang X-Q Tao and X-Y Yi ldquoEstimation of watersolubility of polycyclic aromatic hydrocarbons using quantumchemical descriptors and partial least squaresrdquo QSAR andCombinatorial Science vol 27 no 5 pp 618ndash626 2008
[12] G-N Lu X-Q Tao Z Dang X-Y Yi and C Yang ldquoEstimationof n-octanolwater partition coefficients of polycyclic aromatichydrocarbons by quantum chemical descriptorsrdquo Central Euro-pean Journal of Chemistry vol 6 no 2 pp 310ndash318 2008
8 Journal of Chemistry
[13] W Zhou Z Zhai Z Wang and L Wang ldquoEstimation of n-octanolwater partition coefficients (119870OW) of all PCB congenersby density functional theoryrdquo Journal of Molecular Structurevol 755 no 1ndash3 pp 137ndash145 2005
[14] F Verweij K Booij K Satumalay N Van Der Molen andR Van Der Oost ldquoAssessment of bioavailable PAH PCB andOCP concentrations in water using semipermeable membranedevices (SPMDs) sediments and caged carprdquoChemosphere vol54 no 11 pp 1675ndash1689 2004
[15] L Jantschi S Bolboaca and R Sestras ldquoMeta-heuristics onquantitative structure -activity relationships study on polychlo-rinated biphenylsrdquo Journal of Molecular Modeling vol 16 pp377ndash386 2010
[16] P Ruiz O Faroon C J Moudgal H Hansen C T DeRosa and M Mumtaz ldquoPrediction of the health effects ofpolychlorinated biphenyls (PCBs) and their metabolites usingquantitative structure-activity relationship (QSAR)rdquo ToxicologyLetters vol 181 no 1 pp 53ndash65 2008
[17] L-T Qin S-S Liu H-L Liu and H-L Ge ldquoA new predictivemodel for the bioconcentration factors of polychlorinatedbiphenyls (PCBs) based on the molecular electronegativitydistance vector (MEDV)rdquo Chemosphere vol 70 no 9 pp 1577ndash1587 2008
[18] K Tuppurainen and J Ruuskanen ldquoElectronic eigenvalue(EEVA) a new QSARQSPR descriptor for electronic sub-stituent effects based on molecular orbital energies A QSARapproach to the Ah receptor binding affinity of polychlorinatedbiphenyls (PCBs) dibenzo-p-dioxins (PCDDs) and dibenzofu-rans (PCDFs)rdquo Chemosphere vol 41 no 6 pp 843ndash848 2000
[19] Z Daren ldquoQSPR studies of PCBs by the combination of geneticalgorithms and PLS analysisrdquoComputers and Chemistry vol 25no 2 pp 197ndash204 2001
[20] E B de Melo ldquoA new quantitative structure-property rela-tionship model to predict bioconcentration factors of poly-chlorinated biphenyls (PCBs) in fishes using E-state indexand topological descriptorsrdquo Ecotoxicology and EnvironmentalSafety vol 75 no 1 pp 213ndash222 2012
[21] J Y Long H B Yi X K Liu and Y F Wang ldquoQuantitativestructure-property relationship for polychlorinated biphenylstoxicity and structure by density functional theoryrdquoActa Chim-ica Sinica vol 70 pp 949ndash960 2012
[22] G-N Lu X-Q Tao Z H I Dang W Huang and Z LildquoQuantitative structureproperty relationships on dissolvabilityof pcddfs using quantum chemical descriptors and partial leastsquaresrdquo Journal of Theoretical and Computational Chemistryvol 9 no 1 pp 9ndash22 2010
[23] M J Frisch G W Trucks H B Schlegel et al Gaussian 03Revision B 01 Gaussian Pittsburgh Pa USA 2003
[24] S Wold M Sjostrom and L Eriksson ldquoPLS-regression a basictool of chemometricsrdquoChemometrics and Intelligent LaboratorySystems vol 58 no 2 pp 109ndash130 2001
[25] A B Umetrics USerrsquos Guide to SIMCA-P SIMCA-P+ Version10 0 Umea Sweden 2005
[26] B Z Zhao Y Y Li X R Hao et al ldquoTheoretical calculation ofthe octanolwater partition coefficient for polychlorobiphenylsrdquoJournal of Northeast Normal University vol 36 no 2 pp 41ndash442004
[27] J-W Zou Y-J Jiang G-X Hu M Zeng S-L Zhuang and Q-S Yu ldquoQSPRQSAR studies on the physicochemical propertiesand biological activities of polychlorinated biphenylsrdquo ActaPhysico vol 21 no 3 pp 267ndash272 2005
[28] X-Y Han Z-YWang Z-C Zhai and L-S Wang ldquoEstimationof n-octanolwater partition coefficients (119870OW) of all PCBcongeners by ab initio and a Cl substitution position methodrdquoQSAR and Combinatorial Science vol 25 no 4 pp 333ndash3412006
Submit your manuscripts athttpwwwhindawicom
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Inorganic ChemistryInternational Journal of
Hindawi Publishing Corporation httpwwwhindawicom Volume 2014
International Journal ofPhotoenergy
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Carbohydrate Chemistry
International Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Journal of
Chemistry
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Advances in
Physical Chemistry
Hindawi Publishing Corporationhttpwwwhindawicom
Analytical Methods in Chemistry
Journal of
Volume 2014
Bioinorganic Chemistry and ApplicationsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
SpectroscopyInternational Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014
Medicinal ChemistryInternational Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Chromatography Research International
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Applied ChemistryJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Theoretical ChemistryJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Journal of
Spectroscopy
Analytical ChemistryInternational Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Quantum Chemistry
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Organic Chemistry International
ElectrochemistryInternational Journal of
Hindawi Publishing Corporation httpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
CatalystsJournal of
2 Journal of Chemistry
property suitable for the characterization of the lipophilicityof the compounds such as PCBs119870OW has been shown to cor-relate with aquatic solubility toxicity and bioaccumulation[5ndash7] and as such it is a useful parameter for the assessmentof the environmental fate and impact of xenobiotic tracecontaminants [8] During the last 50 years 119870OW has provento be a valuable tool in many areas of natural sciences chem-istry biochemistry biology pharmacology environmentalsciences and so forth During that period many hundredsor even thousands of scientific publications have been madein which the successful application of 119870OW in correlationwith many physical chemical or biological properties hasbeen demonstrated for a large variety of organic chemicalsUnfortunately many correlations have been also publishedin which the processes studied have nothing in commonwith the process of partition between water and n-octanolphases It must be emphasized that in order to correlate twoproperties do not have to be related both of themmay simplybe related to a third property The shake flask is the classicalmethod for measuring 119870OW and this procedure will givereproducible and accurate results for chemicalswith log119870OWAccurate results can also be obtained for more hydrophobicchemicals but this will need a very precise and skillfulhandling and the use of very pure n-octanol The problemwith this procedure is that the complete phase separation isdifficult to achieve and this may lead to the underestimationof119870OW particularly for themore hydrophobic chemicals [9]
It is impractical to measure the 119870OW of all the PCBsdirectly in the laboratory because of the extremely lowwater solubility of some high-chlorinated PCBs The lackof complete reliable and comparable data has led to thedevelopment of different119870OW estimation methods With theadvent of inexpensive and rapid computation there has beena remarkable growth in the area of quantitative structure-activity relationships (QSAR) which correlate the proper-ties of pollutants with relevant properties and moleculardescriptors [11] A large number of calculation methods havebeen presently developed for estimation of the partitioncoefficients with varying success and applicability Thesemethods are cable of predicting environmental behaviors notonly for common environmental pollutants but also for thosechemicals that have not yet been synthesized or those thatcannot be examined experimentally due to their extremelyhazardous nature According to the descriptors used thesemethods can be classified into two groups empirical andtheoretical methods [12] Hansen et al [1] used the degreeof chlorination in the orthoposition of the PCB molecule aspredictor variable together with either the calculated molec-ular total surface area (TSA) or the gas chromatographicrelative retention times (RRT) to predict the log119870OW andsoil sorption coefficients (log119870OC) for all 209 PCB congenerswhich were based on 53 and 48 experimental data pointsrespectively Also reporting the relationship between 119870OWand 119870OC which is promising for 119870OW could proceed with119870OC However TSA and RRT are both empirical moleculardescriptors Firstly the accurate determination of empiricalmolecular descriptorsmay be difficult and expensive in termsof cost and time or even impossible for some PCBs whichmight have been not synthesized or purified Meanwhile the
experimental errors may be introduced especially for thosecongeners difficult to be separated and identified by chro-matography Furthermore the empirical molecular descrip-tors often reflect complex and multiple physical interactionsHence the interpretation of the respective property could bedifficult and ambiguous [13] Many attempts have been madeto model the physicochemical properties and bioactivities ofPCBs by utilizingQSAR techniques [14ndash21] For instanceQinet al [17] reported the correlation of bioconcentration factorsand molecular electronegativity distance vector (MEDV)of PCBs based on molecular structure Tuppurainen andRuuskanen [18] introduced a newQSARdescriptor electroniceigenvalue (EEVA) based on molecular orbital energies topredict the Ah receptor binding affinity of PCBs polychlo-rinated dibenzo-p-dioxins and dibenzofurans (PCDDFs)Daren [19] reported QSPR studies of119870OW 119878W 119884W and119870OCof all 209 PCBs and biphenyl by the combination of geneticalgorithms and PLS analysis
Modern theoretical method in quantum chemistry suchas density functional theory (DFT) with high calculation pre-cision should have its advantages in estimating the propertiesof environmental toxics such as PCBs PCDDFs [13 22]The aim of the present study is to explore QSAR modelrelating to log119870OW of 27 PCBs by using partial least squares(PLS) analysis with optimizing procedure based on reportedlog119870OW data and quantum chemical descriptors computedby DFT contained in Gaussian 03 in an attempt to developreliable and predictive QSARmodel and identify the primarymolecular factors governing the 119870OW of PCBs
2 Materials and Methods
21 Target PCBs A total of 27 PCBs whose 119870OW andorlog119870OW were previously published were chosen to constitutethe training set and the test set in this work The trainingset consists of 20 of these 27 PCBs selected randomly andthe remaining 7 PCBs constitute the test set Their chemicalabstracts service numbers (CAS no) and reported log119870OWare listed in Table 1
22 Calculation and Selection of the Descriptors The molec-ular modeling system HyperChem (Release 70 HypercubeInc 2002) was used to construct and view all molecularstructures Molecular geometry was optimized and quantumchemical descriptors were computed using the B3LYP hybridfunctional of DFT in conjunction with 6-31G(d) a split-valence basis set with polarization function The B3LYPcalculations were performed using the quantum chemicalcomputation software Gaussian 03 [23] All calculations wererun on an Intel Core i5-2410M CPU230GHz computerequipped with 200GB of internal memory and MicrosoftWindows 7 professional operating system
According to the chemometrics theory it is suggested thataQSARmodel should include asmany relevant descriptors aspossible to increase the probability of a good characterizationfor a class of compounds [24] In this study 18 independentquantum chemical variables were selected for developingQSAR models The descriptors cover the dihedral angle of
Journal of Chemistry 3
Table 1 n-Octanolwater partition coefficients of the PCBs studied
Noa Compounds CAS no Observed log119870OWb Predicted log119870OW
Reference [10] Model I Model II1 Biphenyl 92-52-4 410 414 396 3872 2-Chlorobiphenyl 2051-60-7 456 464 456 4563 3-Chlorobiphenyl 2051-61-8 472 463 477 4774 221015840-Dichlorobiphenyl 13029-08-8 502 520 503 5055 24-Dichlorobiphenyl 33284-50-3 515 516 505 5066 2210158405-Trichlorobiphenyl 37680-65-2 564 573 557 5597 2441015840-Trichlorobiphenyl 7012-37-5 574 573 582 5808 245-Trichlorobiphenyl 15862-07-4 577 576 557 5609 23101584045-Tetrachlorobiphenyl 73557-53-8 639 630 658 65810 331015840441015840-Tetrachlorobiphenyl 32598-13-3 652 639 668 67111 2210158403451015840-Pentachlorobiphenyl 38380-02-8 685 684 685 69112 2210158404551015840-Pentachlorobiphenyl 37680-73-2 685 680 685 68613 23456-Pentachlorobiphenyl 18259-05-7 685 686 685 68814 221015840331015840441015840-Hexachlorobiphenyl 38380-07-3 744 743 745 75315 221015840441015840551015840-Hexachlorobiphenyl 35065-27-1 744 737 748 74316 221015840441015840661015840-Hexachlorobiphenyl 33979-03-2 712 739 741 73717 221015840331015840441015840551015840-Octachlorobiphenyl 35694-08-7 868 860 860 85918 221015840331015840551015840661015840-Octachlorobiphenyl 2136-99-4 842 850 858 85419 2210158403310158404410158405510158406-Nonachlorobiphenyl 40186-72-9 914 908 890 89020 221015840331015840441015840551015840661015840-Decachlorobiphenyl 2051-24-3 960 972 934 93021 4-Chlorobiphenyl 2051-62-9 469 469 450 45122 25-Dichlorobiphenyl 34883-39-1 518 516 515 51123 441015840-Dichlorobiphenyl 2050-68-2 528 529 534 52524 3441015840-Trichlorobiphenyl 38444-90-5 590 577 598 59725 221015840551015840-Tetrachlorobiphenyl 35693-99-3 626 629 639 63526 231015840410158405-Tetrachlorobiphenyl 32598-11-1 639 631 653 65727 2210158403410158405510158406-Heptachlorobiphenyl 52663-68-0 793 795 804 805aCompounds no 1ndash20 constitute the training set and compounds No 21ndash27 constitute the test set bfrom [10]
the two benzene ring plans (DA) the eigenvalue of thehighest occupied molecular orbital (119864HOMO) the eigenvalueof the lowest unoccupied molecular orbital (119864LUMO) theMulliken atomic charges on 12 carbon atoms from C1 to C12(119876C1ndash119876C12 Figure 1 shows the numbering of selected atomson themolecular structure) the electronic spatial extent (Re)the dipole moment (120583) and molecular total energy (TE) Allthe above descriptors were obtained directly in the outputfiles of Gaussian 03 calculation through a single full opti-mizing process for the molecular structure B3LYP6-31G(d)FOPT The values of these quantum chemical descriptors arelisted in Table 2 The units of dihedral angle atomic chargeelectronic spatial extent dipole moment and energy are asfollows degree atomic charge unit atom unit Debye andHartree respectively
23 Modeling Method and Evaluation Indexes Since a bignumber of descriptors were selected in this study intercor-relation of independent variables (multicollinearity) mightbecome a technical problem It was especially difficultwhen the number of independent variables was equal toor greater than the number of compounds selected in the
ClX ClY
1
2 3
4
56
9 8
7
1211
10
Figure 1 Numbering of selected atoms in the molecular structure119883 and119884 are integers between 0 and 5 inclusiveThe positions 7 8 910 11 and 12 are corresponding to 11015840 21015840 31015840 41015840 51015840 and 61015840 respectively
training set The common regression analysis frequentlyused in QSAR studies became ineffective because over-fitting readily occurred To overcome these problems weused the PLS regression a well-known multivariate methodwhich gives a stepwise solution for a regression modelIt extracts principal component-like latent variables fromoriginal independent variables (predictor variables) anddependent variables (response variables) respectively [24]This method can analyze data which are strongly collinearand more importantly it is a remedy to overfitting PLS finds
4 Journal of Chemistry
Table2Quantum
chem
icaldescrip
torsforthe
PCBs
studied
No
DA119864HOMO119864LU
MO
119876C1
119876C2
119876C3
119876C4
119876C5
119876C6
119876C7
119876C8
119876C9
119876C10
119876C11
119876C12
Re120583
TE1
000minus021730minus00330
0010126minus017937minus01339
5minus01248
7minus01339
5minus017937
010126minus017937minus01339
5minus01248
7minus01339
5minus017937
2336
596
000
00minus46
330
272
5613minus02322
3minus002
747
007
290minus01156
4minus01156
4minus012189minus013101minus01608
9006
717minus01479
3minus01354
8minus01244
2minus013374minus01646
12764
356
16832minus92
289
733
000minus022
671minus004
365
01034
9minus01805
3minus006
979minus012783minus013333minus017359
01003
5minus017756minus01342
1minus01230
2minus013374minus01784
63399
230
21220minus92
289
894
11008minus024
310minus002
654
005
227minus01026
5minus01368
5minus01192
6minus013195minus01356
6005
227minus01026
5minus01368
5minus01192
6minus013195minus01356
63242
151
200
16minus1382
489
95
5559minus023
637minus003
739
007
579minus01130
4minus01368
0minus005
946minus013118minus01600
7006
673minus014837minus013517minus01236
3minus01336
1minus01649
04150
644
21687minus1382
492
16
8121minus024
564minus00322
0004
787minus008
812minus01356
1minus011931minus006
799minus014376
004
642minus009
074minus01370
2minus01179
6minus01306
2minus014315
4428
552
1857
1minus1842
084
47
5536minus023937minus004
578
007
547minus011313minus01365
3minus005
901minus013102minus01598
8006
733minus01448
2minus01358
7minus006
134minus01343
5minus016175
6126
716
1429
7minus1842
087
88
5616minus024
156minus004
537
007
940minus010817minus013619minus006
867minus007
657minus01623
8006
469minus01476
3minus01349
4minus01225
1minus013332minus01636
650
70323
226
21minus1842
082
09
12687minus024
760minus005
378
007
904minus010835minus01357
7minus006
785minus007
710minus016166
006
579minus016413minus007
033minus01236
3minus01336
4minus01428
567
00831
1948
5minus23
01677
210
000minus023
831minus006
630
01096
3minus01783
8minus007
907minus00744
6minus01334
7minus01697
001096
3minus01783
8minus007
907minus00744
6minus01334
7minus01697
080
12630
267
16minus23
016757
118360minus02529
8minus004
434
005
232minus009
797minus009
264minus006
419minus01280
4minus01343
3004
160minus008
793minus013512minus01182
2minus006
856minus014180
718116
2264
02minus2761263
412
8064minus02536
0minus004
744
004
992minus008
328minus01374
8minus006
497minus007
717minus01425
1004
568minus008
834minus013512minus011755minus006
869minus01427
77317571
15120minus2761268
013
8995minus025
762minus004
957
006
408minus009
995minus008
788minus007
241minus008
788minus009
995
002
910minus01336
9minus013182minus012112minus013182minus01336
963
48796
21628minus2761249
114
8565minus026
235minus004
741
004
784minus009
803minus009
259minus006
444minus01280
2minus01338
8004
784minus009
803minus009
259minus006
444minus01280
2minus01338
888
80956
31952minus3220
848
115
8114minus025
785minus005
272
004
819minus008
337minus013735minus006
481minus007
739minus014173
004
819minus008
337minus013735minus006
481minus007
739minus014173
9135369
002
78minus3220
8572
1690
00minus026
706minus004
525
004
104minus007
511minus013762minus005
110minus013762minus007
511
004
104minus007
511minus013762minus005
110minus013762minus007
511
7602
549
000
00minus3220
859
717
8588minus026
513minus005
927
004
910minus009
391minus009
064minus007
515minus007
354minus013392
004
910minus009
391minus009
064minus007
515minus007
354minus013392
1158
265
611
648minus4140
024
218
8998minus026
008minus005
592
003
906minus008
305minus007
831minus01143
7minus007
831minus008
305
003
906minus008
305minus007
831minus01143
7minus007
831minus008
305
9319431
000
00minus4140
027
719
9083minus026
526minus006
331
005
601minus009
023minus008
906minus006
972minus008
907minus009
023
003
596minus008
464minus009
111minus00743
5minus00744
7minus01197
81208
396
910
110minus45
99608
620
9001minus026
670minus006
373
004
327minus007
929minus008
962minus006
896minus008
962minus007
929
004
327minus007
929minus008
962minus006
896minus008
962minus007
929
1257959
3000
00minus50
5919
3321
000minus022
180minus004
264
010353minus01765
2minus01354
1minus006
248minus01354
1minus01765
201034
5minus017931minus01336
7minus012379minus01336
7minus017931
3744
110
21391minus92
289
9022
5589minus023
832minus003
813
00748
7minus01129
8minus01339
4minus012217minus006
823minus016185
006
624minus014755minus013515minus012312minus013355minus016419
3923
000
07279minus1382
492
223
000minus022
591minus005
162
010551minus01763
9minus01350
5minus006
209minus01350
5minus01763
9010551minus01763
9minus01350
5minus006
209minus01350
5minus01763
956
91897
000
00minus1382
495
024
000minus02322
9minus005
925
01124
1minus01806
1minus007
848minus007
511minus01338
0minus01706
7010325minus01745
4minus013518minus006
122minus01348
1minus017556
6796
892
15815minus1842
0855
2582
68minus025
010minus003
958
004
559minus008
791minus013527minus011819minus006
836minus014172
004
559minus008
791minus013527minus011819minus006
836minus014172
5819703
02171minus23
01678
626
5556minus024
687minus005
342
00746
2minus01130
6minus013333minus01200
8minus006
896minus016122
007
015minus01450
7minus008
147minus007
164minus01324
1minus01556
668
27221
239
61minus23
01677
927
9059minus025
929minus005
534
005
332minus009
406minus007
793minus011522minus007
793minus009
406
003168minus00745
5minus013741minus006
394minus007
807minus012514
9224
954
10323minus36
80442
4
Journal of Chemistry 5
Table 3 Fitting results for PLS model I and model II
Model 119884 119883 ℎ 1198772
1198831198772
119883(cum) 1198772
1198841198772
119884(cum) Eig 1198762
1198762
cum
I log119870OW 1198831a 1 0519 0519 0928 0928 935 0922 0922
2 0172 0691 0065 0992 309 0841 0988
II log119870OW 1198832b 1 0542 0542 0928 0928 921 0921 0921
2 0181 0723 0063 0992 308 0830 0987a1198831 containing all 18 independent variables
b1198832 the variable 120583 was eliminated from matrix X for the model
the relationship between a matrix Y (containing dependentvariables often only one for QSAR studies) and a matrix X(containing predictor variables) by reducing the dimension ofthe matrices while concurrently maximizing the relationshipbetween them
SIMCA-P (Version 105 Umetrics AB 2004) software wasemployed to perform the PLS analysis The default valuesgiven by the software were used as the initial conditionsfor computation According to the userrsquos guide to SIMCA-P[25] the criterion used to determine the model dimension-ality the number of significant PLS components is cross-validation (CV) With CV the tested PLS component isthought significant if the fraction of the total variation ofthe dependent variables estimated by a component 1198762 ishigher than a significance limit (00975) for thewhole datasetThe model is thought to have a good predicting ability whenthe cumulative 1198762 for the extracted components 1198762cum ismore than 0500 [25] Model adequacy was assessed basedprimarily on the number of PLS principal components (ℎ)1198762
cum the squared correlation coefficient between observedvalues and fitted values (1198772) the standard error (SE) of theestimate the variance ratio of variance analysis (119865) and thesignificance level (119901)
3 Results and Discussion
31 Modeling and Optimizing For the 20 PCBs contained inthe training set PLS analysis was performed with log119870OWas the dependent variable and the 18 independent variablesas the predictor variables yielding QSAR model I For thismodel we have 1198772 = 0992 SE = 0145 119865 = 212 times 103and 119901 = 400 times 10minus20 The fitting results for this modelare listed in Table 3 In Table 3 1198772
119883and 1198772
119884stand for the
fraction of the sum of squares of all the Xrsquos and Y rsquos explainedby the current component 1198772
119883(cum) and 1198772
119884(cum) stand forthe fraction of the cumulative variance of all the Xrsquos and Y rsquosexplained by all extracted components respectively andEigstands for eigenvalue which denotes the importance of thePLS principal component
Variable importance in the projection (VIP) is a parame-ter that shows the importance of a variable in a model in theassistant analysis of PLS modeling According to the manualof SIMCA-P termswith large VIP (gt10) are themost relevantfor explaining dependent variables [25] To obtain an optimalmodel the following PLS analysis procedure was adopted APLSmodel with all the predictor variables was first calculatedand then the variable with the lowest VIP was eliminatedand a new PLS regression was performed yielding a new PLS
model This procedure was repeated until an optimal modelwas obtained The optimal model was selected with respectto 1198762cum 119877
2 SE 119865 and 119901 According to the statistics andmetrics theories a QSAR model with larger values of 1198762cum1198772 and 119865 and smaller values of SE and 119901 tends to be more
stable and reliable than in the opposite caseThe above described PLS analysis procedure with
log119870OW as dependent variable and the 18 independentvariables as the initial predictor variables for the 20 PCBscontained in the training set resulted in model II For thismodel we have 1198772 = 0992 SE = 0151 119865 = 197 times 103 and119901 = 762 times 10
minus20 The fitting results for this model are shownin Table 3 As shown in the table based on the estimatedindexes model I is better than model II indicating all thatthe selected 18 independent variables are necessary for thePLS modeling In model I 2 PLS principal componentswere selected in the model which explained 691 of thevariance of the predictor variables and 992 of the varianceof the dependent variable Based on the estimated indexesemployed in this study this model is statistically significant
32 Analysis and Discussion Based on the optimal modelI the predicted log119870OW for the 20 PCBs contained in thetraining set are as listed in Table 1 A comparison of theobserved and predicted log119870OW is shown in Figure 2 Ascan be seen in Table 1 and Figure 2 the predicted values arewithin the range of plusmn03 log unit from the observed valuesThe observed log119870OW values are close to those predicted bythe optimalmodel and the correlation between observed andpredicted log119870OW is significant For the PCBs studied thedifferences between observed and predicted log119870OW are verysmall so the predicted values are acceptable Note that thecross-validated 1198762cum value (=0988) of the optimal model ismuch larger than 0500 it would seem that thismodel is stableand has good prediction ability and can be used to predict the119870OW of PCBs effectively
WhenX andY are unscaled and uncentered the unscaledcoefficients of the independent variables and the constanttransformed from the PLS results are listed in Table 4 Thesigns of the coefficients of the independent variables in themodel enable one to see the direction in which each affectsthe119870OW of PCBs Based on the unscaled coefficients a QSARequation like that obtained from multiple linear regressioncan be created for the optimal model and the predicted119870OW for other PCBs can be estimated Based on model Ipredictions of 119870OW for the 7 PAHs contained in the testset were generated and listed in Table 1 and the comparisonof observed and predicted log119870OW is shown in Figure 2
6 Journal of Chemistry
4 5 6 7 8 9
4
5
6
7
8
9
Training setTest set
Observed log KOW
Pred
icte
d lo
g KO
W
Figure 2 Plot of observed values versus those predicted by theoptimal model I
Table 4The unscaled coefficients and VIPs of the optimal model I
Variables Unscaled coefficients VIPsTE minus1645 times 10minus4 1412Re 0747 times 10minus4 1404119864HOMO minus10390 1274119864LUMO minus20295 1220119876C6 3418 1082119876C12 3366 1074119876C8 1924 1009119876C11 5212 0992119876C1 minus1999 0947119876C9 7412 0940119876C7 minus1975 0926119876C2 1087 0922119876C10 4711 0834DA 5726 times 10minus4 0794119876C3 6512 0756119876C5 3434 0716119876C4 3535 0690120583 minus0047 0473Constant 6995
The correlation coefficient between observed and predictedlog119870OW in the test set was as high as 0995 larger than thatin the training set As can be seen in Table 1 and Figure 2the predicted log119870OW for the test set are very close tothose observed This indicates that the PLS model has goodpredictive ability
The VIP values for the optimal model were calculatedand listed in Table 4 We can find that the VIP values of TERe 119864HOMO and 119864LUMO are all greater than 10 indicatingthat they are important in governing the 119870OW values of
PCBs TE and Re are the two most important variablessince their VIP values are larger than 140 which is greaterthan the VIP values of the other variables It is expectedthat log119870OW increases with the increase of molecularsize Considering the following examples 4-chlorobiphenyl(log119870OW = 469 molecular weight (MW) = 1885) 4 41015840-dichlorobiphenyl (log119870OW = 528 MW = 2230) 2 4 41015840-trichlorobiphenyl (log119870OW = 574 MW = 2575) 3 31015840 4 41015840-tetrachlorobiphenyl (log119870OW = 652 MW = 2920) whichdiffer between themselves by the number of chlorine sub-stituents the increase in log119870OW as a function of the numberof chlorine substituents and molecular sizes is clearly seenWhen H atom is replaced by Cl atom on the biphenyl119870OW value will be increased [26] So the more chlorineatoms on the biphenyls molecular the greater the119870OW valueWhen isomers share the same molecular weight molecu-lar sizes differ from the arrangements of chlorine atomswhich can be expressed by molecular volume and surfacearea leading to the difference in log119870OW 2-Chlorobiphenyl(log119870OW = 456) 3-chlorobiphenyl (log119870OW = 472) and 4-chlorobiphenyl (log119870OW = 469) are isomers (MW = 1885)but they have different n-octanolwater partition coefficients
The variety of molecular size could be described bythe quantum chemical descriptors of TE and Re in thiswork Firstly for the whole compounds set in this study thedecrease of TEvalues indicates that the molecule containsmore chlorine atoms which might lead to the increase ofmolecular volume Zou et al [27] found that molecularvolume and molecular surface area were highly correlatedfor PCBs sharing similar structures So the decrease of TEvalues might lead to the increase of molecular surface areaMolecular volume and molecular surface area are measuresof molecular size In general the bigger the molecular sizethe stronger the effect of the intermolecular dispersion forceOn the other hand the molecular volume is a measure ofabsorbed energy of the molecule when it forms the holein a solvent Due to stronger polar of water moleculescontributing to stronger binding energy the larger solutemolecules need to absorbmore energy to destroy the bindingbetween water molecules when distributing in the aqueousphase resulting in tending to be allocated in weakly polarsolvents It means that the decrease of TE values might leadto larger log119870OW which is consistent with the study of Zou etal [27] Secondly Re is the expectation value of the operator120588 in the formula of int 1199032120588(119903)1198893119903 It is occasionally seen inthe literature as a measure of molecular volume Althoughin many cases Re correlates poorly with molecular volumethe correlation for PCBs sharing similar structures is quitesignificant A PCB molecule with large electronic spatialextent will have large molecular volume Therefore it can beconcluded that PCBs molecules with lower molecular totalenergy values and larger electronic spatial extent tend to bemore hydrophobic and lipophilic leading to larger log119870OWvalues This implies that the decrease of the TE value and theincrease of the Re value lead to an increase in log119870OW value
The value of log119870OW is in inverse ratio to 119864HOMO and119864LUMO that is large value of 119864HOMO and 119864LUMO would resultin small value of log119870OW for PCBs This is because 119864HOMO
Journal of Chemistry 7
represents the proton acceptance ability in forming hydrogenbond while 119864LUMO represents the proton donation ability inthe formation of hydrogen bond Therefore the compoundswith large values of 119864HOMO and 119864LUMO tend to donate oraccept protons easily In this case hydrogen bond can beeasily formed between PCBs and H
2O molecules and PCBs
are able to enter water phase resulting in low log119870OW value[13]
33 Comparison with the Results from Other Models In [13]there are 3 variables (119864HOMO 119864LUMO and 120572) obtained fromB3LYP6-31G(d) level in the model and the squared correla-tion coefficient is 1198772 = 09484 less than that of the optimalmodel in the present study It was found that there are highcorrelations relating the number of Cl substitution position(119873) to120572 and119864HOMO And taking119873 as theoretical descriptorsto correlate the three-variable models for predicting log119870OWof PCBs the achieved new model is having 1198772 = 09497and exhibiting optimum stability convenience for use andbetter predictive power than that from AM1 method [28]In contrast the squared correlation coefficient 1198772 = 0992obtained by PLS in the present study is obviously larger thanthat reported
Taking molar weight (119872) and melting point (119879119898 ∘C) as
independent variables Wang et al [10] had developed thebinary linear regression equation
log119870OW = 00155119872 + 000107 (119879119898 minus 25)
+ 171 (119899 = 27 119903 = 0997)
(1)
whose predicted log119870OW are also listed in Table 1 whichseems superior to the optimal model I in the present studyobtained solely on quantum chemical descriptors Howeverthe variable melting point in the model equation is anexperimental physicochemical parameter It is impracticalto determine the melting points of all 209 PCBs directly inthe laboratory which makes the model become of limitedapplicability Therefore our optimal model based only onquantum chemical descriptors has wider applicability thanthat based on known physicochemical propertiesThe overallhigh quality of themodel obtained in this study indicates thatit will find application in the estimation of 119870OW of PCBs forwhich experimental values are lacking
4 Conclusions
In this work based on some quantum chemical descriptorscomputed by DFT at the B3LYP6-31G(d) level by the useof PLS analysis QSAR models were obtained for logarithmicn-octanolwater partition coefficients of PCBs The squaredcorrelation coefficient of the optimalmodel is 0992Theopti-mal model has high fitting precision and good predictabilityso it can be applied in the estimation of the n-octanolwaterpartition behavior of PCBs It can generally be concludedthat PCBs with larger electronic spatial extent and lowermolecular total energy values tend to be more hydrophobicand lipophilic
Conflict of Interests
The authors declare no conflict of interests
Acknowledgments
This study was financially supported by the National NaturalScience Foundation of China (21007017) the Program forNew Century Excellent Talents in University (NCET-12-0199) the Specialized Research Fund for the Doctoral Pro-gram of Higher Education (20100172120037) the China Post-doctoral Science Foundation (201003350) and the Guang-dong Natural Science Foundation (9351064101000001)
References
[1] B G Hansen A B Paya-Perez M Rahman and B RLarsen ldquoQSARs for 119870OW and 119870OC of PCB congeners a criticalexamination of data assumptions and statistical approachesrdquoChemosphere vol 39 no 13 pp 2209ndash2228 1999
[2] S-S Liu Y Liu D-Q Yin X-D Wang and L-S WangldquoPrediction of chromatographic relative retention time ofpolychlorinated biphenyls from the molecular electronegativitydistance vectorrdquo Journal of Separation Science vol 29 no 2 pp296ndash301 2006
[3] United Nations Environment Programme ldquoStockholm Con-vention on Persistent Organic Pollutantsrdquo 2001 httpwwwpopsintdocumentsconvtextconvtext enpdf
[4] J P Giesy and K Kannan ldquoDioxin-like and non-dioxin-liketoxic effects of polychlorinated biphenyls (PCBs) implicationsfor risk assessmentrdquo Critical Reviews in Toxicology vol 28 no6 pp 511ndash569 1998
[5] C T Chiou V H Freed DW Schmedding and R L KohnertldquoPartition coefficient and bioaccumulation of selected organicchemicalsrdquo Environmental Science and Technology vol 11 no 5pp 475ndash478 1977
[6] H Konemann ldquoStructure-activity relationships and additivityin fish toxicities of environmental pollutantsrdquoEcotoxicology andEnvironmental Safety vol 4 no 4 pp 415ndash421 1980
[7] G Patil ldquoCorrelation of aqueous solubility and octanol-waterpartition coefficient based on molecular structurerdquo Chemo-sphere vol 22 no 8 pp 723ndash738 1991
[8] DMackay A BobraW Y Shiu and SH Yalkowsky ldquoRelation-ships between aqueous solubility and octanol-water partitioncoefficientsrdquo Chemosphere vol 9 no 11 pp 701ndash711 1980
[9] A Sabljic H Gusten J Hermens andAOpperhuizen ldquoModel-ing octanolwater partition coefficients by molecular topologychlorinated benzenes and biphenylsrdquo Environmental Scienceand Technology vol 27 no 7 pp 1394ndash1402 1993
[10] L S Wang Y H Zhao and H Gao ldquoPredicting aqueoussolubility and octanolwater partition coefficients of organicchemicals from molar volumerdquo Environmental Chemistry vol11 pp 55ndash70 1992
[11] G-N Lu Z Dang X-Q Tao and X-Y Yi ldquoEstimation of watersolubility of polycyclic aromatic hydrocarbons using quantumchemical descriptors and partial least squaresrdquo QSAR andCombinatorial Science vol 27 no 5 pp 618ndash626 2008
[12] G-N Lu X-Q Tao Z Dang X-Y Yi and C Yang ldquoEstimationof n-octanolwater partition coefficients of polycyclic aromatichydrocarbons by quantum chemical descriptorsrdquo Central Euro-pean Journal of Chemistry vol 6 no 2 pp 310ndash318 2008
8 Journal of Chemistry
[13] W Zhou Z Zhai Z Wang and L Wang ldquoEstimation of n-octanolwater partition coefficients (119870OW) of all PCB congenersby density functional theoryrdquo Journal of Molecular Structurevol 755 no 1ndash3 pp 137ndash145 2005
[14] F Verweij K Booij K Satumalay N Van Der Molen andR Van Der Oost ldquoAssessment of bioavailable PAH PCB andOCP concentrations in water using semipermeable membranedevices (SPMDs) sediments and caged carprdquoChemosphere vol54 no 11 pp 1675ndash1689 2004
[15] L Jantschi S Bolboaca and R Sestras ldquoMeta-heuristics onquantitative structure -activity relationships study on polychlo-rinated biphenylsrdquo Journal of Molecular Modeling vol 16 pp377ndash386 2010
[16] P Ruiz O Faroon C J Moudgal H Hansen C T DeRosa and M Mumtaz ldquoPrediction of the health effects ofpolychlorinated biphenyls (PCBs) and their metabolites usingquantitative structure-activity relationship (QSAR)rdquo ToxicologyLetters vol 181 no 1 pp 53ndash65 2008
[17] L-T Qin S-S Liu H-L Liu and H-L Ge ldquoA new predictivemodel for the bioconcentration factors of polychlorinatedbiphenyls (PCBs) based on the molecular electronegativitydistance vector (MEDV)rdquo Chemosphere vol 70 no 9 pp 1577ndash1587 2008
[18] K Tuppurainen and J Ruuskanen ldquoElectronic eigenvalue(EEVA) a new QSARQSPR descriptor for electronic sub-stituent effects based on molecular orbital energies A QSARapproach to the Ah receptor binding affinity of polychlorinatedbiphenyls (PCBs) dibenzo-p-dioxins (PCDDs) and dibenzofu-rans (PCDFs)rdquo Chemosphere vol 41 no 6 pp 843ndash848 2000
[19] Z Daren ldquoQSPR studies of PCBs by the combination of geneticalgorithms and PLS analysisrdquoComputers and Chemistry vol 25no 2 pp 197ndash204 2001
[20] E B de Melo ldquoA new quantitative structure-property rela-tionship model to predict bioconcentration factors of poly-chlorinated biphenyls (PCBs) in fishes using E-state indexand topological descriptorsrdquo Ecotoxicology and EnvironmentalSafety vol 75 no 1 pp 213ndash222 2012
[21] J Y Long H B Yi X K Liu and Y F Wang ldquoQuantitativestructure-property relationship for polychlorinated biphenylstoxicity and structure by density functional theoryrdquoActa Chim-ica Sinica vol 70 pp 949ndash960 2012
[22] G-N Lu X-Q Tao Z H I Dang W Huang and Z LildquoQuantitative structureproperty relationships on dissolvabilityof pcddfs using quantum chemical descriptors and partial leastsquaresrdquo Journal of Theoretical and Computational Chemistryvol 9 no 1 pp 9ndash22 2010
[23] M J Frisch G W Trucks H B Schlegel et al Gaussian 03Revision B 01 Gaussian Pittsburgh Pa USA 2003
[24] S Wold M Sjostrom and L Eriksson ldquoPLS-regression a basictool of chemometricsrdquoChemometrics and Intelligent LaboratorySystems vol 58 no 2 pp 109ndash130 2001
[25] A B Umetrics USerrsquos Guide to SIMCA-P SIMCA-P+ Version10 0 Umea Sweden 2005
[26] B Z Zhao Y Y Li X R Hao et al ldquoTheoretical calculation ofthe octanolwater partition coefficient for polychlorobiphenylsrdquoJournal of Northeast Normal University vol 36 no 2 pp 41ndash442004
[27] J-W Zou Y-J Jiang G-X Hu M Zeng S-L Zhuang and Q-S Yu ldquoQSPRQSAR studies on the physicochemical propertiesand biological activities of polychlorinated biphenylsrdquo ActaPhysico vol 21 no 3 pp 267ndash272 2005
[28] X-Y Han Z-YWang Z-C Zhai and L-S Wang ldquoEstimationof n-octanolwater partition coefficients (119870OW) of all PCBcongeners by ab initio and a Cl substitution position methodrdquoQSAR and Combinatorial Science vol 25 no 4 pp 333ndash3412006
Submit your manuscripts athttpwwwhindawicom
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Inorganic ChemistryInternational Journal of
Hindawi Publishing Corporation httpwwwhindawicom Volume 2014
International Journal ofPhotoenergy
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Carbohydrate Chemistry
International Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Journal of
Chemistry
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Advances in
Physical Chemistry
Hindawi Publishing Corporationhttpwwwhindawicom
Analytical Methods in Chemistry
Journal of
Volume 2014
Bioinorganic Chemistry and ApplicationsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
SpectroscopyInternational Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014
Medicinal ChemistryInternational Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Chromatography Research International
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Applied ChemistryJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Theoretical ChemistryJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Journal of
Spectroscopy
Analytical ChemistryInternational Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Quantum Chemistry
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Organic Chemistry International
ElectrochemistryInternational Journal of
Hindawi Publishing Corporation httpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
CatalystsJournal of
Journal of Chemistry 3
Table 1 n-Octanolwater partition coefficients of the PCBs studied
Noa Compounds CAS no Observed log119870OWb Predicted log119870OW
Reference [10] Model I Model II1 Biphenyl 92-52-4 410 414 396 3872 2-Chlorobiphenyl 2051-60-7 456 464 456 4563 3-Chlorobiphenyl 2051-61-8 472 463 477 4774 221015840-Dichlorobiphenyl 13029-08-8 502 520 503 5055 24-Dichlorobiphenyl 33284-50-3 515 516 505 5066 2210158405-Trichlorobiphenyl 37680-65-2 564 573 557 5597 2441015840-Trichlorobiphenyl 7012-37-5 574 573 582 5808 245-Trichlorobiphenyl 15862-07-4 577 576 557 5609 23101584045-Tetrachlorobiphenyl 73557-53-8 639 630 658 65810 331015840441015840-Tetrachlorobiphenyl 32598-13-3 652 639 668 67111 2210158403451015840-Pentachlorobiphenyl 38380-02-8 685 684 685 69112 2210158404551015840-Pentachlorobiphenyl 37680-73-2 685 680 685 68613 23456-Pentachlorobiphenyl 18259-05-7 685 686 685 68814 221015840331015840441015840-Hexachlorobiphenyl 38380-07-3 744 743 745 75315 221015840441015840551015840-Hexachlorobiphenyl 35065-27-1 744 737 748 74316 221015840441015840661015840-Hexachlorobiphenyl 33979-03-2 712 739 741 73717 221015840331015840441015840551015840-Octachlorobiphenyl 35694-08-7 868 860 860 85918 221015840331015840551015840661015840-Octachlorobiphenyl 2136-99-4 842 850 858 85419 2210158403310158404410158405510158406-Nonachlorobiphenyl 40186-72-9 914 908 890 89020 221015840331015840441015840551015840661015840-Decachlorobiphenyl 2051-24-3 960 972 934 93021 4-Chlorobiphenyl 2051-62-9 469 469 450 45122 25-Dichlorobiphenyl 34883-39-1 518 516 515 51123 441015840-Dichlorobiphenyl 2050-68-2 528 529 534 52524 3441015840-Trichlorobiphenyl 38444-90-5 590 577 598 59725 221015840551015840-Tetrachlorobiphenyl 35693-99-3 626 629 639 63526 231015840410158405-Tetrachlorobiphenyl 32598-11-1 639 631 653 65727 2210158403410158405510158406-Heptachlorobiphenyl 52663-68-0 793 795 804 805aCompounds no 1ndash20 constitute the training set and compounds No 21ndash27 constitute the test set bfrom [10]
the two benzene ring plans (DA) the eigenvalue of thehighest occupied molecular orbital (119864HOMO) the eigenvalueof the lowest unoccupied molecular orbital (119864LUMO) theMulliken atomic charges on 12 carbon atoms from C1 to C12(119876C1ndash119876C12 Figure 1 shows the numbering of selected atomson themolecular structure) the electronic spatial extent (Re)the dipole moment (120583) and molecular total energy (TE) Allthe above descriptors were obtained directly in the outputfiles of Gaussian 03 calculation through a single full opti-mizing process for the molecular structure B3LYP6-31G(d)FOPT The values of these quantum chemical descriptors arelisted in Table 2 The units of dihedral angle atomic chargeelectronic spatial extent dipole moment and energy are asfollows degree atomic charge unit atom unit Debye andHartree respectively
23 Modeling Method and Evaluation Indexes Since a bignumber of descriptors were selected in this study intercor-relation of independent variables (multicollinearity) mightbecome a technical problem It was especially difficultwhen the number of independent variables was equal toor greater than the number of compounds selected in the
ClX ClY
1
2 3
4
56
9 8
7
1211
10
Figure 1 Numbering of selected atoms in the molecular structure119883 and119884 are integers between 0 and 5 inclusiveThe positions 7 8 910 11 and 12 are corresponding to 11015840 21015840 31015840 41015840 51015840 and 61015840 respectively
training set The common regression analysis frequentlyused in QSAR studies became ineffective because over-fitting readily occurred To overcome these problems weused the PLS regression a well-known multivariate methodwhich gives a stepwise solution for a regression modelIt extracts principal component-like latent variables fromoriginal independent variables (predictor variables) anddependent variables (response variables) respectively [24]This method can analyze data which are strongly collinearand more importantly it is a remedy to overfitting PLS finds
4 Journal of Chemistry
Table2Quantum
chem
icaldescrip
torsforthe
PCBs
studied
No
DA119864HOMO119864LU
MO
119876C1
119876C2
119876C3
119876C4
119876C5
119876C6
119876C7
119876C8
119876C9
119876C10
119876C11
119876C12
Re120583
TE1
000minus021730minus00330
0010126minus017937minus01339
5minus01248
7minus01339
5minus017937
010126minus017937minus01339
5minus01248
7minus01339
5minus017937
2336
596
000
00minus46
330
272
5613minus02322
3minus002
747
007
290minus01156
4minus01156
4minus012189minus013101minus01608
9006
717minus01479
3minus01354
8minus01244
2minus013374minus01646
12764
356
16832minus92
289
733
000minus022
671minus004
365
01034
9minus01805
3minus006
979minus012783minus013333minus017359
01003
5minus017756minus01342
1minus01230
2minus013374minus01784
63399
230
21220minus92
289
894
11008minus024
310minus002
654
005
227minus01026
5minus01368
5minus01192
6minus013195minus01356
6005
227minus01026
5minus01368
5minus01192
6minus013195minus01356
63242
151
200
16minus1382
489
95
5559minus023
637minus003
739
007
579minus01130
4minus01368
0minus005
946minus013118minus01600
7006
673minus014837minus013517minus01236
3minus01336
1minus01649
04150
644
21687minus1382
492
16
8121minus024
564minus00322
0004
787minus008
812minus01356
1minus011931minus006
799minus014376
004
642minus009
074minus01370
2minus01179
6minus01306
2minus014315
4428
552
1857
1minus1842
084
47
5536minus023937minus004
578
007
547minus011313minus01365
3minus005
901minus013102minus01598
8006
733minus01448
2minus01358
7minus006
134minus01343
5minus016175
6126
716
1429
7minus1842
087
88
5616minus024
156minus004
537
007
940minus010817minus013619minus006
867minus007
657minus01623
8006
469minus01476
3minus01349
4minus01225
1minus013332minus01636
650
70323
226
21minus1842
082
09
12687minus024
760minus005
378
007
904minus010835minus01357
7minus006
785minus007
710minus016166
006
579minus016413minus007
033minus01236
3minus01336
4minus01428
567
00831
1948
5minus23
01677
210
000minus023
831minus006
630
01096
3minus01783
8minus007
907minus00744
6minus01334
7minus01697
001096
3minus01783
8minus007
907minus00744
6minus01334
7minus01697
080
12630
267
16minus23
016757
118360minus02529
8minus004
434
005
232minus009
797minus009
264minus006
419minus01280
4minus01343
3004
160minus008
793minus013512minus01182
2minus006
856minus014180
718116
2264
02minus2761263
412
8064minus02536
0minus004
744
004
992minus008
328minus01374
8minus006
497minus007
717minus01425
1004
568minus008
834minus013512minus011755minus006
869minus01427
77317571
15120minus2761268
013
8995minus025
762minus004
957
006
408minus009
995minus008
788minus007
241minus008
788minus009
995
002
910minus01336
9minus013182minus012112minus013182minus01336
963
48796
21628minus2761249
114
8565minus026
235minus004
741
004
784minus009
803minus009
259minus006
444minus01280
2minus01338
8004
784minus009
803minus009
259minus006
444minus01280
2minus01338
888
80956
31952minus3220
848
115
8114minus025
785minus005
272
004
819minus008
337minus013735minus006
481minus007
739minus014173
004
819minus008
337minus013735minus006
481minus007
739minus014173
9135369
002
78minus3220
8572
1690
00minus026
706minus004
525
004
104minus007
511minus013762minus005
110minus013762minus007
511
004
104minus007
511minus013762minus005
110minus013762minus007
511
7602
549
000
00minus3220
859
717
8588minus026
513minus005
927
004
910minus009
391minus009
064minus007
515minus007
354minus013392
004
910minus009
391minus009
064minus007
515minus007
354minus013392
1158
265
611
648minus4140
024
218
8998minus026
008minus005
592
003
906minus008
305minus007
831minus01143
7minus007
831minus008
305
003
906minus008
305minus007
831minus01143
7minus007
831minus008
305
9319431
000
00minus4140
027
719
9083minus026
526minus006
331
005
601minus009
023minus008
906minus006
972minus008
907minus009
023
003
596minus008
464minus009
111minus00743
5minus00744
7minus01197
81208
396
910
110minus45
99608
620
9001minus026
670minus006
373
004
327minus007
929minus008
962minus006
896minus008
962minus007
929
004
327minus007
929minus008
962minus006
896minus008
962minus007
929
1257959
3000
00minus50
5919
3321
000minus022
180minus004
264
010353minus01765
2minus01354
1minus006
248minus01354
1minus01765
201034
5minus017931minus01336
7minus012379minus01336
7minus017931
3744
110
21391minus92
289
9022
5589minus023
832minus003
813
00748
7minus01129
8minus01339
4minus012217minus006
823minus016185
006
624minus014755minus013515minus012312minus013355minus016419
3923
000
07279minus1382
492
223
000minus022
591minus005
162
010551minus01763
9minus01350
5minus006
209minus01350
5minus01763
9010551minus01763
9minus01350
5minus006
209minus01350
5minus01763
956
91897
000
00minus1382
495
024
000minus02322
9minus005
925
01124
1minus01806
1minus007
848minus007
511minus01338
0minus01706
7010325minus01745
4minus013518minus006
122minus01348
1minus017556
6796
892
15815minus1842
0855
2582
68minus025
010minus003
958
004
559minus008
791minus013527minus011819minus006
836minus014172
004
559minus008
791minus013527minus011819minus006
836minus014172
5819703
02171minus23
01678
626
5556minus024
687minus005
342
00746
2minus01130
6minus013333minus01200
8minus006
896minus016122
007
015minus01450
7minus008
147minus007
164minus01324
1minus01556
668
27221
239
61minus23
01677
927
9059minus025
929minus005
534
005
332minus009
406minus007
793minus011522minus007
793minus009
406
003168minus00745
5minus013741minus006
394minus007
807minus012514
9224
954
10323minus36
80442
4
Journal of Chemistry 5
Table 3 Fitting results for PLS model I and model II
Model 119884 119883 ℎ 1198772
1198831198772
119883(cum) 1198772
1198841198772
119884(cum) Eig 1198762
1198762
cum
I log119870OW 1198831a 1 0519 0519 0928 0928 935 0922 0922
2 0172 0691 0065 0992 309 0841 0988
II log119870OW 1198832b 1 0542 0542 0928 0928 921 0921 0921
2 0181 0723 0063 0992 308 0830 0987a1198831 containing all 18 independent variables
b1198832 the variable 120583 was eliminated from matrix X for the model
the relationship between a matrix Y (containing dependentvariables often only one for QSAR studies) and a matrix X(containing predictor variables) by reducing the dimension ofthe matrices while concurrently maximizing the relationshipbetween them
SIMCA-P (Version 105 Umetrics AB 2004) software wasemployed to perform the PLS analysis The default valuesgiven by the software were used as the initial conditionsfor computation According to the userrsquos guide to SIMCA-P[25] the criterion used to determine the model dimension-ality the number of significant PLS components is cross-validation (CV) With CV the tested PLS component isthought significant if the fraction of the total variation ofthe dependent variables estimated by a component 1198762 ishigher than a significance limit (00975) for thewhole datasetThe model is thought to have a good predicting ability whenthe cumulative 1198762 for the extracted components 1198762cum ismore than 0500 [25] Model adequacy was assessed basedprimarily on the number of PLS principal components (ℎ)1198762
cum the squared correlation coefficient between observedvalues and fitted values (1198772) the standard error (SE) of theestimate the variance ratio of variance analysis (119865) and thesignificance level (119901)
3 Results and Discussion
31 Modeling and Optimizing For the 20 PCBs contained inthe training set PLS analysis was performed with log119870OWas the dependent variable and the 18 independent variablesas the predictor variables yielding QSAR model I For thismodel we have 1198772 = 0992 SE = 0145 119865 = 212 times 103and 119901 = 400 times 10minus20 The fitting results for this modelare listed in Table 3 In Table 3 1198772
119883and 1198772
119884stand for the
fraction of the sum of squares of all the Xrsquos and Y rsquos explainedby the current component 1198772
119883(cum) and 1198772
119884(cum) stand forthe fraction of the cumulative variance of all the Xrsquos and Y rsquosexplained by all extracted components respectively andEigstands for eigenvalue which denotes the importance of thePLS principal component
Variable importance in the projection (VIP) is a parame-ter that shows the importance of a variable in a model in theassistant analysis of PLS modeling According to the manualof SIMCA-P termswith large VIP (gt10) are themost relevantfor explaining dependent variables [25] To obtain an optimalmodel the following PLS analysis procedure was adopted APLSmodel with all the predictor variables was first calculatedand then the variable with the lowest VIP was eliminatedand a new PLS regression was performed yielding a new PLS
model This procedure was repeated until an optimal modelwas obtained The optimal model was selected with respectto 1198762cum 119877
2 SE 119865 and 119901 According to the statistics andmetrics theories a QSAR model with larger values of 1198762cum1198772 and 119865 and smaller values of SE and 119901 tends to be more
stable and reliable than in the opposite caseThe above described PLS analysis procedure with
log119870OW as dependent variable and the 18 independentvariables as the initial predictor variables for the 20 PCBscontained in the training set resulted in model II For thismodel we have 1198772 = 0992 SE = 0151 119865 = 197 times 103 and119901 = 762 times 10
minus20 The fitting results for this model are shownin Table 3 As shown in the table based on the estimatedindexes model I is better than model II indicating all thatthe selected 18 independent variables are necessary for thePLS modeling In model I 2 PLS principal componentswere selected in the model which explained 691 of thevariance of the predictor variables and 992 of the varianceof the dependent variable Based on the estimated indexesemployed in this study this model is statistically significant
32 Analysis and Discussion Based on the optimal modelI the predicted log119870OW for the 20 PCBs contained in thetraining set are as listed in Table 1 A comparison of theobserved and predicted log119870OW is shown in Figure 2 Ascan be seen in Table 1 and Figure 2 the predicted values arewithin the range of plusmn03 log unit from the observed valuesThe observed log119870OW values are close to those predicted bythe optimalmodel and the correlation between observed andpredicted log119870OW is significant For the PCBs studied thedifferences between observed and predicted log119870OW are verysmall so the predicted values are acceptable Note that thecross-validated 1198762cum value (=0988) of the optimal model ismuch larger than 0500 it would seem that thismodel is stableand has good prediction ability and can be used to predict the119870OW of PCBs effectively
WhenX andY are unscaled and uncentered the unscaledcoefficients of the independent variables and the constanttransformed from the PLS results are listed in Table 4 Thesigns of the coefficients of the independent variables in themodel enable one to see the direction in which each affectsthe119870OW of PCBs Based on the unscaled coefficients a QSARequation like that obtained from multiple linear regressioncan be created for the optimal model and the predicted119870OW for other PCBs can be estimated Based on model Ipredictions of 119870OW for the 7 PAHs contained in the testset were generated and listed in Table 1 and the comparisonof observed and predicted log119870OW is shown in Figure 2
6 Journal of Chemistry
4 5 6 7 8 9
4
5
6
7
8
9
Training setTest set
Observed log KOW
Pred
icte
d lo
g KO
W
Figure 2 Plot of observed values versus those predicted by theoptimal model I
Table 4The unscaled coefficients and VIPs of the optimal model I
Variables Unscaled coefficients VIPsTE minus1645 times 10minus4 1412Re 0747 times 10minus4 1404119864HOMO minus10390 1274119864LUMO minus20295 1220119876C6 3418 1082119876C12 3366 1074119876C8 1924 1009119876C11 5212 0992119876C1 minus1999 0947119876C9 7412 0940119876C7 minus1975 0926119876C2 1087 0922119876C10 4711 0834DA 5726 times 10minus4 0794119876C3 6512 0756119876C5 3434 0716119876C4 3535 0690120583 minus0047 0473Constant 6995
The correlation coefficient between observed and predictedlog119870OW in the test set was as high as 0995 larger than thatin the training set As can be seen in Table 1 and Figure 2the predicted log119870OW for the test set are very close tothose observed This indicates that the PLS model has goodpredictive ability
The VIP values for the optimal model were calculatedand listed in Table 4 We can find that the VIP values of TERe 119864HOMO and 119864LUMO are all greater than 10 indicatingthat they are important in governing the 119870OW values of
PCBs TE and Re are the two most important variablessince their VIP values are larger than 140 which is greaterthan the VIP values of the other variables It is expectedthat log119870OW increases with the increase of molecularsize Considering the following examples 4-chlorobiphenyl(log119870OW = 469 molecular weight (MW) = 1885) 4 41015840-dichlorobiphenyl (log119870OW = 528 MW = 2230) 2 4 41015840-trichlorobiphenyl (log119870OW = 574 MW = 2575) 3 31015840 4 41015840-tetrachlorobiphenyl (log119870OW = 652 MW = 2920) whichdiffer between themselves by the number of chlorine sub-stituents the increase in log119870OW as a function of the numberof chlorine substituents and molecular sizes is clearly seenWhen H atom is replaced by Cl atom on the biphenyl119870OW value will be increased [26] So the more chlorineatoms on the biphenyls molecular the greater the119870OW valueWhen isomers share the same molecular weight molecu-lar sizes differ from the arrangements of chlorine atomswhich can be expressed by molecular volume and surfacearea leading to the difference in log119870OW 2-Chlorobiphenyl(log119870OW = 456) 3-chlorobiphenyl (log119870OW = 472) and 4-chlorobiphenyl (log119870OW = 469) are isomers (MW = 1885)but they have different n-octanolwater partition coefficients
The variety of molecular size could be described bythe quantum chemical descriptors of TE and Re in thiswork Firstly for the whole compounds set in this study thedecrease of TEvalues indicates that the molecule containsmore chlorine atoms which might lead to the increase ofmolecular volume Zou et al [27] found that molecularvolume and molecular surface area were highly correlatedfor PCBs sharing similar structures So the decrease of TEvalues might lead to the increase of molecular surface areaMolecular volume and molecular surface area are measuresof molecular size In general the bigger the molecular sizethe stronger the effect of the intermolecular dispersion forceOn the other hand the molecular volume is a measure ofabsorbed energy of the molecule when it forms the holein a solvent Due to stronger polar of water moleculescontributing to stronger binding energy the larger solutemolecules need to absorbmore energy to destroy the bindingbetween water molecules when distributing in the aqueousphase resulting in tending to be allocated in weakly polarsolvents It means that the decrease of TE values might leadto larger log119870OW which is consistent with the study of Zou etal [27] Secondly Re is the expectation value of the operator120588 in the formula of int 1199032120588(119903)1198893119903 It is occasionally seen inthe literature as a measure of molecular volume Althoughin many cases Re correlates poorly with molecular volumethe correlation for PCBs sharing similar structures is quitesignificant A PCB molecule with large electronic spatialextent will have large molecular volume Therefore it can beconcluded that PCBs molecules with lower molecular totalenergy values and larger electronic spatial extent tend to bemore hydrophobic and lipophilic leading to larger log119870OWvalues This implies that the decrease of the TE value and theincrease of the Re value lead to an increase in log119870OW value
The value of log119870OW is in inverse ratio to 119864HOMO and119864LUMO that is large value of 119864HOMO and 119864LUMO would resultin small value of log119870OW for PCBs This is because 119864HOMO
Journal of Chemistry 7
represents the proton acceptance ability in forming hydrogenbond while 119864LUMO represents the proton donation ability inthe formation of hydrogen bond Therefore the compoundswith large values of 119864HOMO and 119864LUMO tend to donate oraccept protons easily In this case hydrogen bond can beeasily formed between PCBs and H
2O molecules and PCBs
are able to enter water phase resulting in low log119870OW value[13]
33 Comparison with the Results from Other Models In [13]there are 3 variables (119864HOMO 119864LUMO and 120572) obtained fromB3LYP6-31G(d) level in the model and the squared correla-tion coefficient is 1198772 = 09484 less than that of the optimalmodel in the present study It was found that there are highcorrelations relating the number of Cl substitution position(119873) to120572 and119864HOMO And taking119873 as theoretical descriptorsto correlate the three-variable models for predicting log119870OWof PCBs the achieved new model is having 1198772 = 09497and exhibiting optimum stability convenience for use andbetter predictive power than that from AM1 method [28]In contrast the squared correlation coefficient 1198772 = 0992obtained by PLS in the present study is obviously larger thanthat reported
Taking molar weight (119872) and melting point (119879119898 ∘C) as
independent variables Wang et al [10] had developed thebinary linear regression equation
log119870OW = 00155119872 + 000107 (119879119898 minus 25)
+ 171 (119899 = 27 119903 = 0997)
(1)
whose predicted log119870OW are also listed in Table 1 whichseems superior to the optimal model I in the present studyobtained solely on quantum chemical descriptors Howeverthe variable melting point in the model equation is anexperimental physicochemical parameter It is impracticalto determine the melting points of all 209 PCBs directly inthe laboratory which makes the model become of limitedapplicability Therefore our optimal model based only onquantum chemical descriptors has wider applicability thanthat based on known physicochemical propertiesThe overallhigh quality of themodel obtained in this study indicates thatit will find application in the estimation of 119870OW of PCBs forwhich experimental values are lacking
4 Conclusions
In this work based on some quantum chemical descriptorscomputed by DFT at the B3LYP6-31G(d) level by the useof PLS analysis QSAR models were obtained for logarithmicn-octanolwater partition coefficients of PCBs The squaredcorrelation coefficient of the optimalmodel is 0992Theopti-mal model has high fitting precision and good predictabilityso it can be applied in the estimation of the n-octanolwaterpartition behavior of PCBs It can generally be concludedthat PCBs with larger electronic spatial extent and lowermolecular total energy values tend to be more hydrophobicand lipophilic
Conflict of Interests
The authors declare no conflict of interests
Acknowledgments
This study was financially supported by the National NaturalScience Foundation of China (21007017) the Program forNew Century Excellent Talents in University (NCET-12-0199) the Specialized Research Fund for the Doctoral Pro-gram of Higher Education (20100172120037) the China Post-doctoral Science Foundation (201003350) and the Guang-dong Natural Science Foundation (9351064101000001)
References
[1] B G Hansen A B Paya-Perez M Rahman and B RLarsen ldquoQSARs for 119870OW and 119870OC of PCB congeners a criticalexamination of data assumptions and statistical approachesrdquoChemosphere vol 39 no 13 pp 2209ndash2228 1999
[2] S-S Liu Y Liu D-Q Yin X-D Wang and L-S WangldquoPrediction of chromatographic relative retention time ofpolychlorinated biphenyls from the molecular electronegativitydistance vectorrdquo Journal of Separation Science vol 29 no 2 pp296ndash301 2006
[3] United Nations Environment Programme ldquoStockholm Con-vention on Persistent Organic Pollutantsrdquo 2001 httpwwwpopsintdocumentsconvtextconvtext enpdf
[4] J P Giesy and K Kannan ldquoDioxin-like and non-dioxin-liketoxic effects of polychlorinated biphenyls (PCBs) implicationsfor risk assessmentrdquo Critical Reviews in Toxicology vol 28 no6 pp 511ndash569 1998
[5] C T Chiou V H Freed DW Schmedding and R L KohnertldquoPartition coefficient and bioaccumulation of selected organicchemicalsrdquo Environmental Science and Technology vol 11 no 5pp 475ndash478 1977
[6] H Konemann ldquoStructure-activity relationships and additivityin fish toxicities of environmental pollutantsrdquoEcotoxicology andEnvironmental Safety vol 4 no 4 pp 415ndash421 1980
[7] G Patil ldquoCorrelation of aqueous solubility and octanol-waterpartition coefficient based on molecular structurerdquo Chemo-sphere vol 22 no 8 pp 723ndash738 1991
[8] DMackay A BobraW Y Shiu and SH Yalkowsky ldquoRelation-ships between aqueous solubility and octanol-water partitioncoefficientsrdquo Chemosphere vol 9 no 11 pp 701ndash711 1980
[9] A Sabljic H Gusten J Hermens andAOpperhuizen ldquoModel-ing octanolwater partition coefficients by molecular topologychlorinated benzenes and biphenylsrdquo Environmental Scienceand Technology vol 27 no 7 pp 1394ndash1402 1993
[10] L S Wang Y H Zhao and H Gao ldquoPredicting aqueoussolubility and octanolwater partition coefficients of organicchemicals from molar volumerdquo Environmental Chemistry vol11 pp 55ndash70 1992
[11] G-N Lu Z Dang X-Q Tao and X-Y Yi ldquoEstimation of watersolubility of polycyclic aromatic hydrocarbons using quantumchemical descriptors and partial least squaresrdquo QSAR andCombinatorial Science vol 27 no 5 pp 618ndash626 2008
[12] G-N Lu X-Q Tao Z Dang X-Y Yi and C Yang ldquoEstimationof n-octanolwater partition coefficients of polycyclic aromatichydrocarbons by quantum chemical descriptorsrdquo Central Euro-pean Journal of Chemistry vol 6 no 2 pp 310ndash318 2008
8 Journal of Chemistry
[13] W Zhou Z Zhai Z Wang and L Wang ldquoEstimation of n-octanolwater partition coefficients (119870OW) of all PCB congenersby density functional theoryrdquo Journal of Molecular Structurevol 755 no 1ndash3 pp 137ndash145 2005
[14] F Verweij K Booij K Satumalay N Van Der Molen andR Van Der Oost ldquoAssessment of bioavailable PAH PCB andOCP concentrations in water using semipermeable membranedevices (SPMDs) sediments and caged carprdquoChemosphere vol54 no 11 pp 1675ndash1689 2004
[15] L Jantschi S Bolboaca and R Sestras ldquoMeta-heuristics onquantitative structure -activity relationships study on polychlo-rinated biphenylsrdquo Journal of Molecular Modeling vol 16 pp377ndash386 2010
[16] P Ruiz O Faroon C J Moudgal H Hansen C T DeRosa and M Mumtaz ldquoPrediction of the health effects ofpolychlorinated biphenyls (PCBs) and their metabolites usingquantitative structure-activity relationship (QSAR)rdquo ToxicologyLetters vol 181 no 1 pp 53ndash65 2008
[17] L-T Qin S-S Liu H-L Liu and H-L Ge ldquoA new predictivemodel for the bioconcentration factors of polychlorinatedbiphenyls (PCBs) based on the molecular electronegativitydistance vector (MEDV)rdquo Chemosphere vol 70 no 9 pp 1577ndash1587 2008
[18] K Tuppurainen and J Ruuskanen ldquoElectronic eigenvalue(EEVA) a new QSARQSPR descriptor for electronic sub-stituent effects based on molecular orbital energies A QSARapproach to the Ah receptor binding affinity of polychlorinatedbiphenyls (PCBs) dibenzo-p-dioxins (PCDDs) and dibenzofu-rans (PCDFs)rdquo Chemosphere vol 41 no 6 pp 843ndash848 2000
[19] Z Daren ldquoQSPR studies of PCBs by the combination of geneticalgorithms and PLS analysisrdquoComputers and Chemistry vol 25no 2 pp 197ndash204 2001
[20] E B de Melo ldquoA new quantitative structure-property rela-tionship model to predict bioconcentration factors of poly-chlorinated biphenyls (PCBs) in fishes using E-state indexand topological descriptorsrdquo Ecotoxicology and EnvironmentalSafety vol 75 no 1 pp 213ndash222 2012
[21] J Y Long H B Yi X K Liu and Y F Wang ldquoQuantitativestructure-property relationship for polychlorinated biphenylstoxicity and structure by density functional theoryrdquoActa Chim-ica Sinica vol 70 pp 949ndash960 2012
[22] G-N Lu X-Q Tao Z H I Dang W Huang and Z LildquoQuantitative structureproperty relationships on dissolvabilityof pcddfs using quantum chemical descriptors and partial leastsquaresrdquo Journal of Theoretical and Computational Chemistryvol 9 no 1 pp 9ndash22 2010
[23] M J Frisch G W Trucks H B Schlegel et al Gaussian 03Revision B 01 Gaussian Pittsburgh Pa USA 2003
[24] S Wold M Sjostrom and L Eriksson ldquoPLS-regression a basictool of chemometricsrdquoChemometrics and Intelligent LaboratorySystems vol 58 no 2 pp 109ndash130 2001
[25] A B Umetrics USerrsquos Guide to SIMCA-P SIMCA-P+ Version10 0 Umea Sweden 2005
[26] B Z Zhao Y Y Li X R Hao et al ldquoTheoretical calculation ofthe octanolwater partition coefficient for polychlorobiphenylsrdquoJournal of Northeast Normal University vol 36 no 2 pp 41ndash442004
[27] J-W Zou Y-J Jiang G-X Hu M Zeng S-L Zhuang and Q-S Yu ldquoQSPRQSAR studies on the physicochemical propertiesand biological activities of polychlorinated biphenylsrdquo ActaPhysico vol 21 no 3 pp 267ndash272 2005
[28] X-Y Han Z-YWang Z-C Zhai and L-S Wang ldquoEstimationof n-octanolwater partition coefficients (119870OW) of all PCBcongeners by ab initio and a Cl substitution position methodrdquoQSAR and Combinatorial Science vol 25 no 4 pp 333ndash3412006
Submit your manuscripts athttpwwwhindawicom
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Inorganic ChemistryInternational Journal of
Hindawi Publishing Corporation httpwwwhindawicom Volume 2014
International Journal ofPhotoenergy
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Carbohydrate Chemistry
International Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Journal of
Chemistry
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Advances in
Physical Chemistry
Hindawi Publishing Corporationhttpwwwhindawicom
Analytical Methods in Chemistry
Journal of
Volume 2014
Bioinorganic Chemistry and ApplicationsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
SpectroscopyInternational Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014
Medicinal ChemistryInternational Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Chromatography Research International
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Applied ChemistryJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Theoretical ChemistryJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Journal of
Spectroscopy
Analytical ChemistryInternational Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Quantum Chemistry
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Organic Chemistry International
ElectrochemistryInternational Journal of
Hindawi Publishing Corporation httpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
CatalystsJournal of
4 Journal of Chemistry
Table2Quantum
chem
icaldescrip
torsforthe
PCBs
studied
No
DA119864HOMO119864LU
MO
119876C1
119876C2
119876C3
119876C4
119876C5
119876C6
119876C7
119876C8
119876C9
119876C10
119876C11
119876C12
Re120583
TE1
000minus021730minus00330
0010126minus017937minus01339
5minus01248
7minus01339
5minus017937
010126minus017937minus01339
5minus01248
7minus01339
5minus017937
2336
596
000
00minus46
330
272
5613minus02322
3minus002
747
007
290minus01156
4minus01156
4minus012189minus013101minus01608
9006
717minus01479
3minus01354
8minus01244
2minus013374minus01646
12764
356
16832minus92
289
733
000minus022
671minus004
365
01034
9minus01805
3minus006
979minus012783minus013333minus017359
01003
5minus017756minus01342
1minus01230
2minus013374minus01784
63399
230
21220minus92
289
894
11008minus024
310minus002
654
005
227minus01026
5minus01368
5minus01192
6minus013195minus01356
6005
227minus01026
5minus01368
5minus01192
6minus013195minus01356
63242
151
200
16minus1382
489
95
5559minus023
637minus003
739
007
579minus01130
4minus01368
0minus005
946minus013118minus01600
7006
673minus014837minus013517minus01236
3minus01336
1minus01649
04150
644
21687minus1382
492
16
8121minus024
564minus00322
0004
787minus008
812minus01356
1minus011931minus006
799minus014376
004
642minus009
074minus01370
2minus01179
6minus01306
2minus014315
4428
552
1857
1minus1842
084
47
5536minus023937minus004
578
007
547minus011313minus01365
3minus005
901minus013102minus01598
8006
733minus01448
2minus01358
7minus006
134minus01343
5minus016175
6126
716
1429
7minus1842
087
88
5616minus024
156minus004
537
007
940minus010817minus013619minus006
867minus007
657minus01623
8006
469minus01476
3minus01349
4minus01225
1minus013332minus01636
650
70323
226
21minus1842
082
09
12687minus024
760minus005
378
007
904minus010835minus01357
7minus006
785minus007
710minus016166
006
579minus016413minus007
033minus01236
3minus01336
4minus01428
567
00831
1948
5minus23
01677
210
000minus023
831minus006
630
01096
3minus01783
8minus007
907minus00744
6minus01334
7minus01697
001096
3minus01783
8minus007
907minus00744
6minus01334
7minus01697
080
12630
267
16minus23
016757
118360minus02529
8minus004
434
005
232minus009
797minus009
264minus006
419minus01280
4minus01343
3004
160minus008
793minus013512minus01182
2minus006
856minus014180
718116
2264
02minus2761263
412
8064minus02536
0minus004
744
004
992minus008
328minus01374
8minus006
497minus007
717minus01425
1004
568minus008
834minus013512minus011755minus006
869minus01427
77317571
15120minus2761268
013
8995minus025
762minus004
957
006
408minus009
995minus008
788minus007
241minus008
788minus009
995
002
910minus01336
9minus013182minus012112minus013182minus01336
963
48796
21628minus2761249
114
8565minus026
235minus004
741
004
784minus009
803minus009
259minus006
444minus01280
2minus01338
8004
784minus009
803minus009
259minus006
444minus01280
2minus01338
888
80956
31952minus3220
848
115
8114minus025
785minus005
272
004
819minus008
337minus013735minus006
481minus007
739minus014173
004
819minus008
337minus013735minus006
481minus007
739minus014173
9135369
002
78minus3220
8572
1690
00minus026
706minus004
525
004
104minus007
511minus013762minus005
110minus013762minus007
511
004
104minus007
511minus013762minus005
110minus013762minus007
511
7602
549
000
00minus3220
859
717
8588minus026
513minus005
927
004
910minus009
391minus009
064minus007
515minus007
354minus013392
004
910minus009
391minus009
064minus007
515minus007
354minus013392
1158
265
611
648minus4140
024
218
8998minus026
008minus005
592
003
906minus008
305minus007
831minus01143
7minus007
831minus008
305
003
906minus008
305minus007
831minus01143
7minus007
831minus008
305
9319431
000
00minus4140
027
719
9083minus026
526minus006
331
005
601minus009
023minus008
906minus006
972minus008
907minus009
023
003
596minus008
464minus009
111minus00743
5minus00744
7minus01197
81208
396
910
110minus45
99608
620
9001minus026
670minus006
373
004
327minus007
929minus008
962minus006
896minus008
962minus007
929
004
327minus007
929minus008
962minus006
896minus008
962minus007
929
1257959
3000
00minus50
5919
3321
000minus022
180minus004
264
010353minus01765
2minus01354
1minus006
248minus01354
1minus01765
201034
5minus017931minus01336
7minus012379minus01336
7minus017931
3744
110
21391minus92
289
9022
5589minus023
832minus003
813
00748
7minus01129
8minus01339
4minus012217minus006
823minus016185
006
624minus014755minus013515minus012312minus013355minus016419
3923
000
07279minus1382
492
223
000minus022
591minus005
162
010551minus01763
9minus01350
5minus006
209minus01350
5minus01763
9010551minus01763
9minus01350
5minus006
209minus01350
5minus01763
956
91897
000
00minus1382
495
024
000minus02322
9minus005
925
01124
1minus01806
1minus007
848minus007
511minus01338
0minus01706
7010325minus01745
4minus013518minus006
122minus01348
1minus017556
6796
892
15815minus1842
0855
2582
68minus025
010minus003
958
004
559minus008
791minus013527minus011819minus006
836minus014172
004
559minus008
791minus013527minus011819minus006
836minus014172
5819703
02171minus23
01678
626
5556minus024
687minus005
342
00746
2minus01130
6minus013333minus01200
8minus006
896minus016122
007
015minus01450
7minus008
147minus007
164minus01324
1minus01556
668
27221
239
61minus23
01677
927
9059minus025
929minus005
534
005
332minus009
406minus007
793minus011522minus007
793minus009
406
003168minus00745
5minus013741minus006
394minus007
807minus012514
9224
954
10323minus36
80442
4
Journal of Chemistry 5
Table 3 Fitting results for PLS model I and model II
Model 119884 119883 ℎ 1198772
1198831198772
119883(cum) 1198772
1198841198772
119884(cum) Eig 1198762
1198762
cum
I log119870OW 1198831a 1 0519 0519 0928 0928 935 0922 0922
2 0172 0691 0065 0992 309 0841 0988
II log119870OW 1198832b 1 0542 0542 0928 0928 921 0921 0921
2 0181 0723 0063 0992 308 0830 0987a1198831 containing all 18 independent variables
b1198832 the variable 120583 was eliminated from matrix X for the model
the relationship between a matrix Y (containing dependentvariables often only one for QSAR studies) and a matrix X(containing predictor variables) by reducing the dimension ofthe matrices while concurrently maximizing the relationshipbetween them
SIMCA-P (Version 105 Umetrics AB 2004) software wasemployed to perform the PLS analysis The default valuesgiven by the software were used as the initial conditionsfor computation According to the userrsquos guide to SIMCA-P[25] the criterion used to determine the model dimension-ality the number of significant PLS components is cross-validation (CV) With CV the tested PLS component isthought significant if the fraction of the total variation ofthe dependent variables estimated by a component 1198762 ishigher than a significance limit (00975) for thewhole datasetThe model is thought to have a good predicting ability whenthe cumulative 1198762 for the extracted components 1198762cum ismore than 0500 [25] Model adequacy was assessed basedprimarily on the number of PLS principal components (ℎ)1198762
cum the squared correlation coefficient between observedvalues and fitted values (1198772) the standard error (SE) of theestimate the variance ratio of variance analysis (119865) and thesignificance level (119901)
3 Results and Discussion
31 Modeling and Optimizing For the 20 PCBs contained inthe training set PLS analysis was performed with log119870OWas the dependent variable and the 18 independent variablesas the predictor variables yielding QSAR model I For thismodel we have 1198772 = 0992 SE = 0145 119865 = 212 times 103and 119901 = 400 times 10minus20 The fitting results for this modelare listed in Table 3 In Table 3 1198772
119883and 1198772
119884stand for the
fraction of the sum of squares of all the Xrsquos and Y rsquos explainedby the current component 1198772
119883(cum) and 1198772
119884(cum) stand forthe fraction of the cumulative variance of all the Xrsquos and Y rsquosexplained by all extracted components respectively andEigstands for eigenvalue which denotes the importance of thePLS principal component
Variable importance in the projection (VIP) is a parame-ter that shows the importance of a variable in a model in theassistant analysis of PLS modeling According to the manualof SIMCA-P termswith large VIP (gt10) are themost relevantfor explaining dependent variables [25] To obtain an optimalmodel the following PLS analysis procedure was adopted APLSmodel with all the predictor variables was first calculatedand then the variable with the lowest VIP was eliminatedand a new PLS regression was performed yielding a new PLS
model This procedure was repeated until an optimal modelwas obtained The optimal model was selected with respectto 1198762cum 119877
2 SE 119865 and 119901 According to the statistics andmetrics theories a QSAR model with larger values of 1198762cum1198772 and 119865 and smaller values of SE and 119901 tends to be more
stable and reliable than in the opposite caseThe above described PLS analysis procedure with
log119870OW as dependent variable and the 18 independentvariables as the initial predictor variables for the 20 PCBscontained in the training set resulted in model II For thismodel we have 1198772 = 0992 SE = 0151 119865 = 197 times 103 and119901 = 762 times 10
minus20 The fitting results for this model are shownin Table 3 As shown in the table based on the estimatedindexes model I is better than model II indicating all thatthe selected 18 independent variables are necessary for thePLS modeling In model I 2 PLS principal componentswere selected in the model which explained 691 of thevariance of the predictor variables and 992 of the varianceof the dependent variable Based on the estimated indexesemployed in this study this model is statistically significant
32 Analysis and Discussion Based on the optimal modelI the predicted log119870OW for the 20 PCBs contained in thetraining set are as listed in Table 1 A comparison of theobserved and predicted log119870OW is shown in Figure 2 Ascan be seen in Table 1 and Figure 2 the predicted values arewithin the range of plusmn03 log unit from the observed valuesThe observed log119870OW values are close to those predicted bythe optimalmodel and the correlation between observed andpredicted log119870OW is significant For the PCBs studied thedifferences between observed and predicted log119870OW are verysmall so the predicted values are acceptable Note that thecross-validated 1198762cum value (=0988) of the optimal model ismuch larger than 0500 it would seem that thismodel is stableand has good prediction ability and can be used to predict the119870OW of PCBs effectively
WhenX andY are unscaled and uncentered the unscaledcoefficients of the independent variables and the constanttransformed from the PLS results are listed in Table 4 Thesigns of the coefficients of the independent variables in themodel enable one to see the direction in which each affectsthe119870OW of PCBs Based on the unscaled coefficients a QSARequation like that obtained from multiple linear regressioncan be created for the optimal model and the predicted119870OW for other PCBs can be estimated Based on model Ipredictions of 119870OW for the 7 PAHs contained in the testset were generated and listed in Table 1 and the comparisonof observed and predicted log119870OW is shown in Figure 2
6 Journal of Chemistry
4 5 6 7 8 9
4
5
6
7
8
9
Training setTest set
Observed log KOW
Pred
icte
d lo
g KO
W
Figure 2 Plot of observed values versus those predicted by theoptimal model I
Table 4The unscaled coefficients and VIPs of the optimal model I
Variables Unscaled coefficients VIPsTE minus1645 times 10minus4 1412Re 0747 times 10minus4 1404119864HOMO minus10390 1274119864LUMO minus20295 1220119876C6 3418 1082119876C12 3366 1074119876C8 1924 1009119876C11 5212 0992119876C1 minus1999 0947119876C9 7412 0940119876C7 minus1975 0926119876C2 1087 0922119876C10 4711 0834DA 5726 times 10minus4 0794119876C3 6512 0756119876C5 3434 0716119876C4 3535 0690120583 minus0047 0473Constant 6995
The correlation coefficient between observed and predictedlog119870OW in the test set was as high as 0995 larger than thatin the training set As can be seen in Table 1 and Figure 2the predicted log119870OW for the test set are very close tothose observed This indicates that the PLS model has goodpredictive ability
The VIP values for the optimal model were calculatedand listed in Table 4 We can find that the VIP values of TERe 119864HOMO and 119864LUMO are all greater than 10 indicatingthat they are important in governing the 119870OW values of
PCBs TE and Re are the two most important variablessince their VIP values are larger than 140 which is greaterthan the VIP values of the other variables It is expectedthat log119870OW increases with the increase of molecularsize Considering the following examples 4-chlorobiphenyl(log119870OW = 469 molecular weight (MW) = 1885) 4 41015840-dichlorobiphenyl (log119870OW = 528 MW = 2230) 2 4 41015840-trichlorobiphenyl (log119870OW = 574 MW = 2575) 3 31015840 4 41015840-tetrachlorobiphenyl (log119870OW = 652 MW = 2920) whichdiffer between themselves by the number of chlorine sub-stituents the increase in log119870OW as a function of the numberof chlorine substituents and molecular sizes is clearly seenWhen H atom is replaced by Cl atom on the biphenyl119870OW value will be increased [26] So the more chlorineatoms on the biphenyls molecular the greater the119870OW valueWhen isomers share the same molecular weight molecu-lar sizes differ from the arrangements of chlorine atomswhich can be expressed by molecular volume and surfacearea leading to the difference in log119870OW 2-Chlorobiphenyl(log119870OW = 456) 3-chlorobiphenyl (log119870OW = 472) and 4-chlorobiphenyl (log119870OW = 469) are isomers (MW = 1885)but they have different n-octanolwater partition coefficients
The variety of molecular size could be described bythe quantum chemical descriptors of TE and Re in thiswork Firstly for the whole compounds set in this study thedecrease of TEvalues indicates that the molecule containsmore chlorine atoms which might lead to the increase ofmolecular volume Zou et al [27] found that molecularvolume and molecular surface area were highly correlatedfor PCBs sharing similar structures So the decrease of TEvalues might lead to the increase of molecular surface areaMolecular volume and molecular surface area are measuresof molecular size In general the bigger the molecular sizethe stronger the effect of the intermolecular dispersion forceOn the other hand the molecular volume is a measure ofabsorbed energy of the molecule when it forms the holein a solvent Due to stronger polar of water moleculescontributing to stronger binding energy the larger solutemolecules need to absorbmore energy to destroy the bindingbetween water molecules when distributing in the aqueousphase resulting in tending to be allocated in weakly polarsolvents It means that the decrease of TE values might leadto larger log119870OW which is consistent with the study of Zou etal [27] Secondly Re is the expectation value of the operator120588 in the formula of int 1199032120588(119903)1198893119903 It is occasionally seen inthe literature as a measure of molecular volume Althoughin many cases Re correlates poorly with molecular volumethe correlation for PCBs sharing similar structures is quitesignificant A PCB molecule with large electronic spatialextent will have large molecular volume Therefore it can beconcluded that PCBs molecules with lower molecular totalenergy values and larger electronic spatial extent tend to bemore hydrophobic and lipophilic leading to larger log119870OWvalues This implies that the decrease of the TE value and theincrease of the Re value lead to an increase in log119870OW value
The value of log119870OW is in inverse ratio to 119864HOMO and119864LUMO that is large value of 119864HOMO and 119864LUMO would resultin small value of log119870OW for PCBs This is because 119864HOMO
Journal of Chemistry 7
represents the proton acceptance ability in forming hydrogenbond while 119864LUMO represents the proton donation ability inthe formation of hydrogen bond Therefore the compoundswith large values of 119864HOMO and 119864LUMO tend to donate oraccept protons easily In this case hydrogen bond can beeasily formed between PCBs and H
2O molecules and PCBs
are able to enter water phase resulting in low log119870OW value[13]
33 Comparison with the Results from Other Models In [13]there are 3 variables (119864HOMO 119864LUMO and 120572) obtained fromB3LYP6-31G(d) level in the model and the squared correla-tion coefficient is 1198772 = 09484 less than that of the optimalmodel in the present study It was found that there are highcorrelations relating the number of Cl substitution position(119873) to120572 and119864HOMO And taking119873 as theoretical descriptorsto correlate the three-variable models for predicting log119870OWof PCBs the achieved new model is having 1198772 = 09497and exhibiting optimum stability convenience for use andbetter predictive power than that from AM1 method [28]In contrast the squared correlation coefficient 1198772 = 0992obtained by PLS in the present study is obviously larger thanthat reported
Taking molar weight (119872) and melting point (119879119898 ∘C) as
independent variables Wang et al [10] had developed thebinary linear regression equation
log119870OW = 00155119872 + 000107 (119879119898 minus 25)
+ 171 (119899 = 27 119903 = 0997)
(1)
whose predicted log119870OW are also listed in Table 1 whichseems superior to the optimal model I in the present studyobtained solely on quantum chemical descriptors Howeverthe variable melting point in the model equation is anexperimental physicochemical parameter It is impracticalto determine the melting points of all 209 PCBs directly inthe laboratory which makes the model become of limitedapplicability Therefore our optimal model based only onquantum chemical descriptors has wider applicability thanthat based on known physicochemical propertiesThe overallhigh quality of themodel obtained in this study indicates thatit will find application in the estimation of 119870OW of PCBs forwhich experimental values are lacking
4 Conclusions
In this work based on some quantum chemical descriptorscomputed by DFT at the B3LYP6-31G(d) level by the useof PLS analysis QSAR models were obtained for logarithmicn-octanolwater partition coefficients of PCBs The squaredcorrelation coefficient of the optimalmodel is 0992Theopti-mal model has high fitting precision and good predictabilityso it can be applied in the estimation of the n-octanolwaterpartition behavior of PCBs It can generally be concludedthat PCBs with larger electronic spatial extent and lowermolecular total energy values tend to be more hydrophobicand lipophilic
Conflict of Interests
The authors declare no conflict of interests
Acknowledgments
This study was financially supported by the National NaturalScience Foundation of China (21007017) the Program forNew Century Excellent Talents in University (NCET-12-0199) the Specialized Research Fund for the Doctoral Pro-gram of Higher Education (20100172120037) the China Post-doctoral Science Foundation (201003350) and the Guang-dong Natural Science Foundation (9351064101000001)
References
[1] B G Hansen A B Paya-Perez M Rahman and B RLarsen ldquoQSARs for 119870OW and 119870OC of PCB congeners a criticalexamination of data assumptions and statistical approachesrdquoChemosphere vol 39 no 13 pp 2209ndash2228 1999
[2] S-S Liu Y Liu D-Q Yin X-D Wang and L-S WangldquoPrediction of chromatographic relative retention time ofpolychlorinated biphenyls from the molecular electronegativitydistance vectorrdquo Journal of Separation Science vol 29 no 2 pp296ndash301 2006
[3] United Nations Environment Programme ldquoStockholm Con-vention on Persistent Organic Pollutantsrdquo 2001 httpwwwpopsintdocumentsconvtextconvtext enpdf
[4] J P Giesy and K Kannan ldquoDioxin-like and non-dioxin-liketoxic effects of polychlorinated biphenyls (PCBs) implicationsfor risk assessmentrdquo Critical Reviews in Toxicology vol 28 no6 pp 511ndash569 1998
[5] C T Chiou V H Freed DW Schmedding and R L KohnertldquoPartition coefficient and bioaccumulation of selected organicchemicalsrdquo Environmental Science and Technology vol 11 no 5pp 475ndash478 1977
[6] H Konemann ldquoStructure-activity relationships and additivityin fish toxicities of environmental pollutantsrdquoEcotoxicology andEnvironmental Safety vol 4 no 4 pp 415ndash421 1980
[7] G Patil ldquoCorrelation of aqueous solubility and octanol-waterpartition coefficient based on molecular structurerdquo Chemo-sphere vol 22 no 8 pp 723ndash738 1991
[8] DMackay A BobraW Y Shiu and SH Yalkowsky ldquoRelation-ships between aqueous solubility and octanol-water partitioncoefficientsrdquo Chemosphere vol 9 no 11 pp 701ndash711 1980
[9] A Sabljic H Gusten J Hermens andAOpperhuizen ldquoModel-ing octanolwater partition coefficients by molecular topologychlorinated benzenes and biphenylsrdquo Environmental Scienceand Technology vol 27 no 7 pp 1394ndash1402 1993
[10] L S Wang Y H Zhao and H Gao ldquoPredicting aqueoussolubility and octanolwater partition coefficients of organicchemicals from molar volumerdquo Environmental Chemistry vol11 pp 55ndash70 1992
[11] G-N Lu Z Dang X-Q Tao and X-Y Yi ldquoEstimation of watersolubility of polycyclic aromatic hydrocarbons using quantumchemical descriptors and partial least squaresrdquo QSAR andCombinatorial Science vol 27 no 5 pp 618ndash626 2008
[12] G-N Lu X-Q Tao Z Dang X-Y Yi and C Yang ldquoEstimationof n-octanolwater partition coefficients of polycyclic aromatichydrocarbons by quantum chemical descriptorsrdquo Central Euro-pean Journal of Chemistry vol 6 no 2 pp 310ndash318 2008
8 Journal of Chemistry
[13] W Zhou Z Zhai Z Wang and L Wang ldquoEstimation of n-octanolwater partition coefficients (119870OW) of all PCB congenersby density functional theoryrdquo Journal of Molecular Structurevol 755 no 1ndash3 pp 137ndash145 2005
[14] F Verweij K Booij K Satumalay N Van Der Molen andR Van Der Oost ldquoAssessment of bioavailable PAH PCB andOCP concentrations in water using semipermeable membranedevices (SPMDs) sediments and caged carprdquoChemosphere vol54 no 11 pp 1675ndash1689 2004
[15] L Jantschi S Bolboaca and R Sestras ldquoMeta-heuristics onquantitative structure -activity relationships study on polychlo-rinated biphenylsrdquo Journal of Molecular Modeling vol 16 pp377ndash386 2010
[16] P Ruiz O Faroon C J Moudgal H Hansen C T DeRosa and M Mumtaz ldquoPrediction of the health effects ofpolychlorinated biphenyls (PCBs) and their metabolites usingquantitative structure-activity relationship (QSAR)rdquo ToxicologyLetters vol 181 no 1 pp 53ndash65 2008
[17] L-T Qin S-S Liu H-L Liu and H-L Ge ldquoA new predictivemodel for the bioconcentration factors of polychlorinatedbiphenyls (PCBs) based on the molecular electronegativitydistance vector (MEDV)rdquo Chemosphere vol 70 no 9 pp 1577ndash1587 2008
[18] K Tuppurainen and J Ruuskanen ldquoElectronic eigenvalue(EEVA) a new QSARQSPR descriptor for electronic sub-stituent effects based on molecular orbital energies A QSARapproach to the Ah receptor binding affinity of polychlorinatedbiphenyls (PCBs) dibenzo-p-dioxins (PCDDs) and dibenzofu-rans (PCDFs)rdquo Chemosphere vol 41 no 6 pp 843ndash848 2000
[19] Z Daren ldquoQSPR studies of PCBs by the combination of geneticalgorithms and PLS analysisrdquoComputers and Chemistry vol 25no 2 pp 197ndash204 2001
[20] E B de Melo ldquoA new quantitative structure-property rela-tionship model to predict bioconcentration factors of poly-chlorinated biphenyls (PCBs) in fishes using E-state indexand topological descriptorsrdquo Ecotoxicology and EnvironmentalSafety vol 75 no 1 pp 213ndash222 2012
[21] J Y Long H B Yi X K Liu and Y F Wang ldquoQuantitativestructure-property relationship for polychlorinated biphenylstoxicity and structure by density functional theoryrdquoActa Chim-ica Sinica vol 70 pp 949ndash960 2012
[22] G-N Lu X-Q Tao Z H I Dang W Huang and Z LildquoQuantitative structureproperty relationships on dissolvabilityof pcddfs using quantum chemical descriptors and partial leastsquaresrdquo Journal of Theoretical and Computational Chemistryvol 9 no 1 pp 9ndash22 2010
[23] M J Frisch G W Trucks H B Schlegel et al Gaussian 03Revision B 01 Gaussian Pittsburgh Pa USA 2003
[24] S Wold M Sjostrom and L Eriksson ldquoPLS-regression a basictool of chemometricsrdquoChemometrics and Intelligent LaboratorySystems vol 58 no 2 pp 109ndash130 2001
[25] A B Umetrics USerrsquos Guide to SIMCA-P SIMCA-P+ Version10 0 Umea Sweden 2005
[26] B Z Zhao Y Y Li X R Hao et al ldquoTheoretical calculation ofthe octanolwater partition coefficient for polychlorobiphenylsrdquoJournal of Northeast Normal University vol 36 no 2 pp 41ndash442004
[27] J-W Zou Y-J Jiang G-X Hu M Zeng S-L Zhuang and Q-S Yu ldquoQSPRQSAR studies on the physicochemical propertiesand biological activities of polychlorinated biphenylsrdquo ActaPhysico vol 21 no 3 pp 267ndash272 2005
[28] X-Y Han Z-YWang Z-C Zhai and L-S Wang ldquoEstimationof n-octanolwater partition coefficients (119870OW) of all PCBcongeners by ab initio and a Cl substitution position methodrdquoQSAR and Combinatorial Science vol 25 no 4 pp 333ndash3412006
Submit your manuscripts athttpwwwhindawicom
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Inorganic ChemistryInternational Journal of
Hindawi Publishing Corporation httpwwwhindawicom Volume 2014
International Journal ofPhotoenergy
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Carbohydrate Chemistry
International Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Journal of
Chemistry
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Advances in
Physical Chemistry
Hindawi Publishing Corporationhttpwwwhindawicom
Analytical Methods in Chemistry
Journal of
Volume 2014
Bioinorganic Chemistry and ApplicationsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
SpectroscopyInternational Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014
Medicinal ChemistryInternational Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Chromatography Research International
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Applied ChemistryJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Theoretical ChemistryJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Journal of
Spectroscopy
Analytical ChemistryInternational Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Quantum Chemistry
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Organic Chemistry International
ElectrochemistryInternational Journal of
Hindawi Publishing Corporation httpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
CatalystsJournal of
Journal of Chemistry 5
Table 3 Fitting results for PLS model I and model II
Model 119884 119883 ℎ 1198772
1198831198772
119883(cum) 1198772
1198841198772
119884(cum) Eig 1198762
1198762
cum
I log119870OW 1198831a 1 0519 0519 0928 0928 935 0922 0922
2 0172 0691 0065 0992 309 0841 0988
II log119870OW 1198832b 1 0542 0542 0928 0928 921 0921 0921
2 0181 0723 0063 0992 308 0830 0987a1198831 containing all 18 independent variables
b1198832 the variable 120583 was eliminated from matrix X for the model
the relationship between a matrix Y (containing dependentvariables often only one for QSAR studies) and a matrix X(containing predictor variables) by reducing the dimension ofthe matrices while concurrently maximizing the relationshipbetween them
SIMCA-P (Version 105 Umetrics AB 2004) software wasemployed to perform the PLS analysis The default valuesgiven by the software were used as the initial conditionsfor computation According to the userrsquos guide to SIMCA-P[25] the criterion used to determine the model dimension-ality the number of significant PLS components is cross-validation (CV) With CV the tested PLS component isthought significant if the fraction of the total variation ofthe dependent variables estimated by a component 1198762 ishigher than a significance limit (00975) for thewhole datasetThe model is thought to have a good predicting ability whenthe cumulative 1198762 for the extracted components 1198762cum ismore than 0500 [25] Model adequacy was assessed basedprimarily on the number of PLS principal components (ℎ)1198762
cum the squared correlation coefficient between observedvalues and fitted values (1198772) the standard error (SE) of theestimate the variance ratio of variance analysis (119865) and thesignificance level (119901)
3 Results and Discussion
31 Modeling and Optimizing For the 20 PCBs contained inthe training set PLS analysis was performed with log119870OWas the dependent variable and the 18 independent variablesas the predictor variables yielding QSAR model I For thismodel we have 1198772 = 0992 SE = 0145 119865 = 212 times 103and 119901 = 400 times 10minus20 The fitting results for this modelare listed in Table 3 In Table 3 1198772
119883and 1198772
119884stand for the
fraction of the sum of squares of all the Xrsquos and Y rsquos explainedby the current component 1198772
119883(cum) and 1198772
119884(cum) stand forthe fraction of the cumulative variance of all the Xrsquos and Y rsquosexplained by all extracted components respectively andEigstands for eigenvalue which denotes the importance of thePLS principal component
Variable importance in the projection (VIP) is a parame-ter that shows the importance of a variable in a model in theassistant analysis of PLS modeling According to the manualof SIMCA-P termswith large VIP (gt10) are themost relevantfor explaining dependent variables [25] To obtain an optimalmodel the following PLS analysis procedure was adopted APLSmodel with all the predictor variables was first calculatedand then the variable with the lowest VIP was eliminatedand a new PLS regression was performed yielding a new PLS
model This procedure was repeated until an optimal modelwas obtained The optimal model was selected with respectto 1198762cum 119877
2 SE 119865 and 119901 According to the statistics andmetrics theories a QSAR model with larger values of 1198762cum1198772 and 119865 and smaller values of SE and 119901 tends to be more
stable and reliable than in the opposite caseThe above described PLS analysis procedure with
log119870OW as dependent variable and the 18 independentvariables as the initial predictor variables for the 20 PCBscontained in the training set resulted in model II For thismodel we have 1198772 = 0992 SE = 0151 119865 = 197 times 103 and119901 = 762 times 10
minus20 The fitting results for this model are shownin Table 3 As shown in the table based on the estimatedindexes model I is better than model II indicating all thatthe selected 18 independent variables are necessary for thePLS modeling In model I 2 PLS principal componentswere selected in the model which explained 691 of thevariance of the predictor variables and 992 of the varianceof the dependent variable Based on the estimated indexesemployed in this study this model is statistically significant
32 Analysis and Discussion Based on the optimal modelI the predicted log119870OW for the 20 PCBs contained in thetraining set are as listed in Table 1 A comparison of theobserved and predicted log119870OW is shown in Figure 2 Ascan be seen in Table 1 and Figure 2 the predicted values arewithin the range of plusmn03 log unit from the observed valuesThe observed log119870OW values are close to those predicted bythe optimalmodel and the correlation between observed andpredicted log119870OW is significant For the PCBs studied thedifferences between observed and predicted log119870OW are verysmall so the predicted values are acceptable Note that thecross-validated 1198762cum value (=0988) of the optimal model ismuch larger than 0500 it would seem that thismodel is stableand has good prediction ability and can be used to predict the119870OW of PCBs effectively
WhenX andY are unscaled and uncentered the unscaledcoefficients of the independent variables and the constanttransformed from the PLS results are listed in Table 4 Thesigns of the coefficients of the independent variables in themodel enable one to see the direction in which each affectsthe119870OW of PCBs Based on the unscaled coefficients a QSARequation like that obtained from multiple linear regressioncan be created for the optimal model and the predicted119870OW for other PCBs can be estimated Based on model Ipredictions of 119870OW for the 7 PAHs contained in the testset were generated and listed in Table 1 and the comparisonof observed and predicted log119870OW is shown in Figure 2
6 Journal of Chemistry
4 5 6 7 8 9
4
5
6
7
8
9
Training setTest set
Observed log KOW
Pred
icte
d lo
g KO
W
Figure 2 Plot of observed values versus those predicted by theoptimal model I
Table 4The unscaled coefficients and VIPs of the optimal model I
Variables Unscaled coefficients VIPsTE minus1645 times 10minus4 1412Re 0747 times 10minus4 1404119864HOMO minus10390 1274119864LUMO minus20295 1220119876C6 3418 1082119876C12 3366 1074119876C8 1924 1009119876C11 5212 0992119876C1 minus1999 0947119876C9 7412 0940119876C7 minus1975 0926119876C2 1087 0922119876C10 4711 0834DA 5726 times 10minus4 0794119876C3 6512 0756119876C5 3434 0716119876C4 3535 0690120583 minus0047 0473Constant 6995
The correlation coefficient between observed and predictedlog119870OW in the test set was as high as 0995 larger than thatin the training set As can be seen in Table 1 and Figure 2the predicted log119870OW for the test set are very close tothose observed This indicates that the PLS model has goodpredictive ability
The VIP values for the optimal model were calculatedand listed in Table 4 We can find that the VIP values of TERe 119864HOMO and 119864LUMO are all greater than 10 indicatingthat they are important in governing the 119870OW values of
PCBs TE and Re are the two most important variablessince their VIP values are larger than 140 which is greaterthan the VIP values of the other variables It is expectedthat log119870OW increases with the increase of molecularsize Considering the following examples 4-chlorobiphenyl(log119870OW = 469 molecular weight (MW) = 1885) 4 41015840-dichlorobiphenyl (log119870OW = 528 MW = 2230) 2 4 41015840-trichlorobiphenyl (log119870OW = 574 MW = 2575) 3 31015840 4 41015840-tetrachlorobiphenyl (log119870OW = 652 MW = 2920) whichdiffer between themselves by the number of chlorine sub-stituents the increase in log119870OW as a function of the numberof chlorine substituents and molecular sizes is clearly seenWhen H atom is replaced by Cl atom on the biphenyl119870OW value will be increased [26] So the more chlorineatoms on the biphenyls molecular the greater the119870OW valueWhen isomers share the same molecular weight molecu-lar sizes differ from the arrangements of chlorine atomswhich can be expressed by molecular volume and surfacearea leading to the difference in log119870OW 2-Chlorobiphenyl(log119870OW = 456) 3-chlorobiphenyl (log119870OW = 472) and 4-chlorobiphenyl (log119870OW = 469) are isomers (MW = 1885)but they have different n-octanolwater partition coefficients
The variety of molecular size could be described bythe quantum chemical descriptors of TE and Re in thiswork Firstly for the whole compounds set in this study thedecrease of TEvalues indicates that the molecule containsmore chlorine atoms which might lead to the increase ofmolecular volume Zou et al [27] found that molecularvolume and molecular surface area were highly correlatedfor PCBs sharing similar structures So the decrease of TEvalues might lead to the increase of molecular surface areaMolecular volume and molecular surface area are measuresof molecular size In general the bigger the molecular sizethe stronger the effect of the intermolecular dispersion forceOn the other hand the molecular volume is a measure ofabsorbed energy of the molecule when it forms the holein a solvent Due to stronger polar of water moleculescontributing to stronger binding energy the larger solutemolecules need to absorbmore energy to destroy the bindingbetween water molecules when distributing in the aqueousphase resulting in tending to be allocated in weakly polarsolvents It means that the decrease of TE values might leadto larger log119870OW which is consistent with the study of Zou etal [27] Secondly Re is the expectation value of the operator120588 in the formula of int 1199032120588(119903)1198893119903 It is occasionally seen inthe literature as a measure of molecular volume Althoughin many cases Re correlates poorly with molecular volumethe correlation for PCBs sharing similar structures is quitesignificant A PCB molecule with large electronic spatialextent will have large molecular volume Therefore it can beconcluded that PCBs molecules with lower molecular totalenergy values and larger electronic spatial extent tend to bemore hydrophobic and lipophilic leading to larger log119870OWvalues This implies that the decrease of the TE value and theincrease of the Re value lead to an increase in log119870OW value
The value of log119870OW is in inverse ratio to 119864HOMO and119864LUMO that is large value of 119864HOMO and 119864LUMO would resultin small value of log119870OW for PCBs This is because 119864HOMO
Journal of Chemistry 7
represents the proton acceptance ability in forming hydrogenbond while 119864LUMO represents the proton donation ability inthe formation of hydrogen bond Therefore the compoundswith large values of 119864HOMO and 119864LUMO tend to donate oraccept protons easily In this case hydrogen bond can beeasily formed between PCBs and H
2O molecules and PCBs
are able to enter water phase resulting in low log119870OW value[13]
33 Comparison with the Results from Other Models In [13]there are 3 variables (119864HOMO 119864LUMO and 120572) obtained fromB3LYP6-31G(d) level in the model and the squared correla-tion coefficient is 1198772 = 09484 less than that of the optimalmodel in the present study It was found that there are highcorrelations relating the number of Cl substitution position(119873) to120572 and119864HOMO And taking119873 as theoretical descriptorsto correlate the three-variable models for predicting log119870OWof PCBs the achieved new model is having 1198772 = 09497and exhibiting optimum stability convenience for use andbetter predictive power than that from AM1 method [28]In contrast the squared correlation coefficient 1198772 = 0992obtained by PLS in the present study is obviously larger thanthat reported
Taking molar weight (119872) and melting point (119879119898 ∘C) as
independent variables Wang et al [10] had developed thebinary linear regression equation
log119870OW = 00155119872 + 000107 (119879119898 minus 25)
+ 171 (119899 = 27 119903 = 0997)
(1)
whose predicted log119870OW are also listed in Table 1 whichseems superior to the optimal model I in the present studyobtained solely on quantum chemical descriptors Howeverthe variable melting point in the model equation is anexperimental physicochemical parameter It is impracticalto determine the melting points of all 209 PCBs directly inthe laboratory which makes the model become of limitedapplicability Therefore our optimal model based only onquantum chemical descriptors has wider applicability thanthat based on known physicochemical propertiesThe overallhigh quality of themodel obtained in this study indicates thatit will find application in the estimation of 119870OW of PCBs forwhich experimental values are lacking
4 Conclusions
In this work based on some quantum chemical descriptorscomputed by DFT at the B3LYP6-31G(d) level by the useof PLS analysis QSAR models were obtained for logarithmicn-octanolwater partition coefficients of PCBs The squaredcorrelation coefficient of the optimalmodel is 0992Theopti-mal model has high fitting precision and good predictabilityso it can be applied in the estimation of the n-octanolwaterpartition behavior of PCBs It can generally be concludedthat PCBs with larger electronic spatial extent and lowermolecular total energy values tend to be more hydrophobicand lipophilic
Conflict of Interests
The authors declare no conflict of interests
Acknowledgments
This study was financially supported by the National NaturalScience Foundation of China (21007017) the Program forNew Century Excellent Talents in University (NCET-12-0199) the Specialized Research Fund for the Doctoral Pro-gram of Higher Education (20100172120037) the China Post-doctoral Science Foundation (201003350) and the Guang-dong Natural Science Foundation (9351064101000001)
References
[1] B G Hansen A B Paya-Perez M Rahman and B RLarsen ldquoQSARs for 119870OW and 119870OC of PCB congeners a criticalexamination of data assumptions and statistical approachesrdquoChemosphere vol 39 no 13 pp 2209ndash2228 1999
[2] S-S Liu Y Liu D-Q Yin X-D Wang and L-S WangldquoPrediction of chromatographic relative retention time ofpolychlorinated biphenyls from the molecular electronegativitydistance vectorrdquo Journal of Separation Science vol 29 no 2 pp296ndash301 2006
[3] United Nations Environment Programme ldquoStockholm Con-vention on Persistent Organic Pollutantsrdquo 2001 httpwwwpopsintdocumentsconvtextconvtext enpdf
[4] J P Giesy and K Kannan ldquoDioxin-like and non-dioxin-liketoxic effects of polychlorinated biphenyls (PCBs) implicationsfor risk assessmentrdquo Critical Reviews in Toxicology vol 28 no6 pp 511ndash569 1998
[5] C T Chiou V H Freed DW Schmedding and R L KohnertldquoPartition coefficient and bioaccumulation of selected organicchemicalsrdquo Environmental Science and Technology vol 11 no 5pp 475ndash478 1977
[6] H Konemann ldquoStructure-activity relationships and additivityin fish toxicities of environmental pollutantsrdquoEcotoxicology andEnvironmental Safety vol 4 no 4 pp 415ndash421 1980
[7] G Patil ldquoCorrelation of aqueous solubility and octanol-waterpartition coefficient based on molecular structurerdquo Chemo-sphere vol 22 no 8 pp 723ndash738 1991
[8] DMackay A BobraW Y Shiu and SH Yalkowsky ldquoRelation-ships between aqueous solubility and octanol-water partitioncoefficientsrdquo Chemosphere vol 9 no 11 pp 701ndash711 1980
[9] A Sabljic H Gusten J Hermens andAOpperhuizen ldquoModel-ing octanolwater partition coefficients by molecular topologychlorinated benzenes and biphenylsrdquo Environmental Scienceand Technology vol 27 no 7 pp 1394ndash1402 1993
[10] L S Wang Y H Zhao and H Gao ldquoPredicting aqueoussolubility and octanolwater partition coefficients of organicchemicals from molar volumerdquo Environmental Chemistry vol11 pp 55ndash70 1992
[11] G-N Lu Z Dang X-Q Tao and X-Y Yi ldquoEstimation of watersolubility of polycyclic aromatic hydrocarbons using quantumchemical descriptors and partial least squaresrdquo QSAR andCombinatorial Science vol 27 no 5 pp 618ndash626 2008
[12] G-N Lu X-Q Tao Z Dang X-Y Yi and C Yang ldquoEstimationof n-octanolwater partition coefficients of polycyclic aromatichydrocarbons by quantum chemical descriptorsrdquo Central Euro-pean Journal of Chemistry vol 6 no 2 pp 310ndash318 2008
8 Journal of Chemistry
[13] W Zhou Z Zhai Z Wang and L Wang ldquoEstimation of n-octanolwater partition coefficients (119870OW) of all PCB congenersby density functional theoryrdquo Journal of Molecular Structurevol 755 no 1ndash3 pp 137ndash145 2005
[14] F Verweij K Booij K Satumalay N Van Der Molen andR Van Der Oost ldquoAssessment of bioavailable PAH PCB andOCP concentrations in water using semipermeable membranedevices (SPMDs) sediments and caged carprdquoChemosphere vol54 no 11 pp 1675ndash1689 2004
[15] L Jantschi S Bolboaca and R Sestras ldquoMeta-heuristics onquantitative structure -activity relationships study on polychlo-rinated biphenylsrdquo Journal of Molecular Modeling vol 16 pp377ndash386 2010
[16] P Ruiz O Faroon C J Moudgal H Hansen C T DeRosa and M Mumtaz ldquoPrediction of the health effects ofpolychlorinated biphenyls (PCBs) and their metabolites usingquantitative structure-activity relationship (QSAR)rdquo ToxicologyLetters vol 181 no 1 pp 53ndash65 2008
[17] L-T Qin S-S Liu H-L Liu and H-L Ge ldquoA new predictivemodel for the bioconcentration factors of polychlorinatedbiphenyls (PCBs) based on the molecular electronegativitydistance vector (MEDV)rdquo Chemosphere vol 70 no 9 pp 1577ndash1587 2008
[18] K Tuppurainen and J Ruuskanen ldquoElectronic eigenvalue(EEVA) a new QSARQSPR descriptor for electronic sub-stituent effects based on molecular orbital energies A QSARapproach to the Ah receptor binding affinity of polychlorinatedbiphenyls (PCBs) dibenzo-p-dioxins (PCDDs) and dibenzofu-rans (PCDFs)rdquo Chemosphere vol 41 no 6 pp 843ndash848 2000
[19] Z Daren ldquoQSPR studies of PCBs by the combination of geneticalgorithms and PLS analysisrdquoComputers and Chemistry vol 25no 2 pp 197ndash204 2001
[20] E B de Melo ldquoA new quantitative structure-property rela-tionship model to predict bioconcentration factors of poly-chlorinated biphenyls (PCBs) in fishes using E-state indexand topological descriptorsrdquo Ecotoxicology and EnvironmentalSafety vol 75 no 1 pp 213ndash222 2012
[21] J Y Long H B Yi X K Liu and Y F Wang ldquoQuantitativestructure-property relationship for polychlorinated biphenylstoxicity and structure by density functional theoryrdquoActa Chim-ica Sinica vol 70 pp 949ndash960 2012
[22] G-N Lu X-Q Tao Z H I Dang W Huang and Z LildquoQuantitative structureproperty relationships on dissolvabilityof pcddfs using quantum chemical descriptors and partial leastsquaresrdquo Journal of Theoretical and Computational Chemistryvol 9 no 1 pp 9ndash22 2010
[23] M J Frisch G W Trucks H B Schlegel et al Gaussian 03Revision B 01 Gaussian Pittsburgh Pa USA 2003
[24] S Wold M Sjostrom and L Eriksson ldquoPLS-regression a basictool of chemometricsrdquoChemometrics and Intelligent LaboratorySystems vol 58 no 2 pp 109ndash130 2001
[25] A B Umetrics USerrsquos Guide to SIMCA-P SIMCA-P+ Version10 0 Umea Sweden 2005
[26] B Z Zhao Y Y Li X R Hao et al ldquoTheoretical calculation ofthe octanolwater partition coefficient for polychlorobiphenylsrdquoJournal of Northeast Normal University vol 36 no 2 pp 41ndash442004
[27] J-W Zou Y-J Jiang G-X Hu M Zeng S-L Zhuang and Q-S Yu ldquoQSPRQSAR studies on the physicochemical propertiesand biological activities of polychlorinated biphenylsrdquo ActaPhysico vol 21 no 3 pp 267ndash272 2005
[28] X-Y Han Z-YWang Z-C Zhai and L-S Wang ldquoEstimationof n-octanolwater partition coefficients (119870OW) of all PCBcongeners by ab initio and a Cl substitution position methodrdquoQSAR and Combinatorial Science vol 25 no 4 pp 333ndash3412006
Submit your manuscripts athttpwwwhindawicom
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Inorganic ChemistryInternational Journal of
Hindawi Publishing Corporation httpwwwhindawicom Volume 2014
International Journal ofPhotoenergy
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Carbohydrate Chemistry
International Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Journal of
Chemistry
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Advances in
Physical Chemistry
Hindawi Publishing Corporationhttpwwwhindawicom
Analytical Methods in Chemistry
Journal of
Volume 2014
Bioinorganic Chemistry and ApplicationsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
SpectroscopyInternational Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014
Medicinal ChemistryInternational Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Chromatography Research International
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Applied ChemistryJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Theoretical ChemistryJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Journal of
Spectroscopy
Analytical ChemistryInternational Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Quantum Chemistry
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Organic Chemistry International
ElectrochemistryInternational Journal of
Hindawi Publishing Corporation httpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
CatalystsJournal of
6 Journal of Chemistry
4 5 6 7 8 9
4
5
6
7
8
9
Training setTest set
Observed log KOW
Pred
icte
d lo
g KO
W
Figure 2 Plot of observed values versus those predicted by theoptimal model I
Table 4The unscaled coefficients and VIPs of the optimal model I
Variables Unscaled coefficients VIPsTE minus1645 times 10minus4 1412Re 0747 times 10minus4 1404119864HOMO minus10390 1274119864LUMO minus20295 1220119876C6 3418 1082119876C12 3366 1074119876C8 1924 1009119876C11 5212 0992119876C1 minus1999 0947119876C9 7412 0940119876C7 minus1975 0926119876C2 1087 0922119876C10 4711 0834DA 5726 times 10minus4 0794119876C3 6512 0756119876C5 3434 0716119876C4 3535 0690120583 minus0047 0473Constant 6995
The correlation coefficient between observed and predictedlog119870OW in the test set was as high as 0995 larger than thatin the training set As can be seen in Table 1 and Figure 2the predicted log119870OW for the test set are very close tothose observed This indicates that the PLS model has goodpredictive ability
The VIP values for the optimal model were calculatedand listed in Table 4 We can find that the VIP values of TERe 119864HOMO and 119864LUMO are all greater than 10 indicatingthat they are important in governing the 119870OW values of
PCBs TE and Re are the two most important variablessince their VIP values are larger than 140 which is greaterthan the VIP values of the other variables It is expectedthat log119870OW increases with the increase of molecularsize Considering the following examples 4-chlorobiphenyl(log119870OW = 469 molecular weight (MW) = 1885) 4 41015840-dichlorobiphenyl (log119870OW = 528 MW = 2230) 2 4 41015840-trichlorobiphenyl (log119870OW = 574 MW = 2575) 3 31015840 4 41015840-tetrachlorobiphenyl (log119870OW = 652 MW = 2920) whichdiffer between themselves by the number of chlorine sub-stituents the increase in log119870OW as a function of the numberof chlorine substituents and molecular sizes is clearly seenWhen H atom is replaced by Cl atom on the biphenyl119870OW value will be increased [26] So the more chlorineatoms on the biphenyls molecular the greater the119870OW valueWhen isomers share the same molecular weight molecu-lar sizes differ from the arrangements of chlorine atomswhich can be expressed by molecular volume and surfacearea leading to the difference in log119870OW 2-Chlorobiphenyl(log119870OW = 456) 3-chlorobiphenyl (log119870OW = 472) and 4-chlorobiphenyl (log119870OW = 469) are isomers (MW = 1885)but they have different n-octanolwater partition coefficients
The variety of molecular size could be described bythe quantum chemical descriptors of TE and Re in thiswork Firstly for the whole compounds set in this study thedecrease of TEvalues indicates that the molecule containsmore chlorine atoms which might lead to the increase ofmolecular volume Zou et al [27] found that molecularvolume and molecular surface area were highly correlatedfor PCBs sharing similar structures So the decrease of TEvalues might lead to the increase of molecular surface areaMolecular volume and molecular surface area are measuresof molecular size In general the bigger the molecular sizethe stronger the effect of the intermolecular dispersion forceOn the other hand the molecular volume is a measure ofabsorbed energy of the molecule when it forms the holein a solvent Due to stronger polar of water moleculescontributing to stronger binding energy the larger solutemolecules need to absorbmore energy to destroy the bindingbetween water molecules when distributing in the aqueousphase resulting in tending to be allocated in weakly polarsolvents It means that the decrease of TE values might leadto larger log119870OW which is consistent with the study of Zou etal [27] Secondly Re is the expectation value of the operator120588 in the formula of int 1199032120588(119903)1198893119903 It is occasionally seen inthe literature as a measure of molecular volume Althoughin many cases Re correlates poorly with molecular volumethe correlation for PCBs sharing similar structures is quitesignificant A PCB molecule with large electronic spatialextent will have large molecular volume Therefore it can beconcluded that PCBs molecules with lower molecular totalenergy values and larger electronic spatial extent tend to bemore hydrophobic and lipophilic leading to larger log119870OWvalues This implies that the decrease of the TE value and theincrease of the Re value lead to an increase in log119870OW value
The value of log119870OW is in inverse ratio to 119864HOMO and119864LUMO that is large value of 119864HOMO and 119864LUMO would resultin small value of log119870OW for PCBs This is because 119864HOMO
Journal of Chemistry 7
represents the proton acceptance ability in forming hydrogenbond while 119864LUMO represents the proton donation ability inthe formation of hydrogen bond Therefore the compoundswith large values of 119864HOMO and 119864LUMO tend to donate oraccept protons easily In this case hydrogen bond can beeasily formed between PCBs and H
2O molecules and PCBs
are able to enter water phase resulting in low log119870OW value[13]
33 Comparison with the Results from Other Models In [13]there are 3 variables (119864HOMO 119864LUMO and 120572) obtained fromB3LYP6-31G(d) level in the model and the squared correla-tion coefficient is 1198772 = 09484 less than that of the optimalmodel in the present study It was found that there are highcorrelations relating the number of Cl substitution position(119873) to120572 and119864HOMO And taking119873 as theoretical descriptorsto correlate the three-variable models for predicting log119870OWof PCBs the achieved new model is having 1198772 = 09497and exhibiting optimum stability convenience for use andbetter predictive power than that from AM1 method [28]In contrast the squared correlation coefficient 1198772 = 0992obtained by PLS in the present study is obviously larger thanthat reported
Taking molar weight (119872) and melting point (119879119898 ∘C) as
independent variables Wang et al [10] had developed thebinary linear regression equation
log119870OW = 00155119872 + 000107 (119879119898 minus 25)
+ 171 (119899 = 27 119903 = 0997)
(1)
whose predicted log119870OW are also listed in Table 1 whichseems superior to the optimal model I in the present studyobtained solely on quantum chemical descriptors Howeverthe variable melting point in the model equation is anexperimental physicochemical parameter It is impracticalto determine the melting points of all 209 PCBs directly inthe laboratory which makes the model become of limitedapplicability Therefore our optimal model based only onquantum chemical descriptors has wider applicability thanthat based on known physicochemical propertiesThe overallhigh quality of themodel obtained in this study indicates thatit will find application in the estimation of 119870OW of PCBs forwhich experimental values are lacking
4 Conclusions
In this work based on some quantum chemical descriptorscomputed by DFT at the B3LYP6-31G(d) level by the useof PLS analysis QSAR models were obtained for logarithmicn-octanolwater partition coefficients of PCBs The squaredcorrelation coefficient of the optimalmodel is 0992Theopti-mal model has high fitting precision and good predictabilityso it can be applied in the estimation of the n-octanolwaterpartition behavior of PCBs It can generally be concludedthat PCBs with larger electronic spatial extent and lowermolecular total energy values tend to be more hydrophobicand lipophilic
Conflict of Interests
The authors declare no conflict of interests
Acknowledgments
This study was financially supported by the National NaturalScience Foundation of China (21007017) the Program forNew Century Excellent Talents in University (NCET-12-0199) the Specialized Research Fund for the Doctoral Pro-gram of Higher Education (20100172120037) the China Post-doctoral Science Foundation (201003350) and the Guang-dong Natural Science Foundation (9351064101000001)
References
[1] B G Hansen A B Paya-Perez M Rahman and B RLarsen ldquoQSARs for 119870OW and 119870OC of PCB congeners a criticalexamination of data assumptions and statistical approachesrdquoChemosphere vol 39 no 13 pp 2209ndash2228 1999
[2] S-S Liu Y Liu D-Q Yin X-D Wang and L-S WangldquoPrediction of chromatographic relative retention time ofpolychlorinated biphenyls from the molecular electronegativitydistance vectorrdquo Journal of Separation Science vol 29 no 2 pp296ndash301 2006
[3] United Nations Environment Programme ldquoStockholm Con-vention on Persistent Organic Pollutantsrdquo 2001 httpwwwpopsintdocumentsconvtextconvtext enpdf
[4] J P Giesy and K Kannan ldquoDioxin-like and non-dioxin-liketoxic effects of polychlorinated biphenyls (PCBs) implicationsfor risk assessmentrdquo Critical Reviews in Toxicology vol 28 no6 pp 511ndash569 1998
[5] C T Chiou V H Freed DW Schmedding and R L KohnertldquoPartition coefficient and bioaccumulation of selected organicchemicalsrdquo Environmental Science and Technology vol 11 no 5pp 475ndash478 1977
[6] H Konemann ldquoStructure-activity relationships and additivityin fish toxicities of environmental pollutantsrdquoEcotoxicology andEnvironmental Safety vol 4 no 4 pp 415ndash421 1980
[7] G Patil ldquoCorrelation of aqueous solubility and octanol-waterpartition coefficient based on molecular structurerdquo Chemo-sphere vol 22 no 8 pp 723ndash738 1991
[8] DMackay A BobraW Y Shiu and SH Yalkowsky ldquoRelation-ships between aqueous solubility and octanol-water partitioncoefficientsrdquo Chemosphere vol 9 no 11 pp 701ndash711 1980
[9] A Sabljic H Gusten J Hermens andAOpperhuizen ldquoModel-ing octanolwater partition coefficients by molecular topologychlorinated benzenes and biphenylsrdquo Environmental Scienceand Technology vol 27 no 7 pp 1394ndash1402 1993
[10] L S Wang Y H Zhao and H Gao ldquoPredicting aqueoussolubility and octanolwater partition coefficients of organicchemicals from molar volumerdquo Environmental Chemistry vol11 pp 55ndash70 1992
[11] G-N Lu Z Dang X-Q Tao and X-Y Yi ldquoEstimation of watersolubility of polycyclic aromatic hydrocarbons using quantumchemical descriptors and partial least squaresrdquo QSAR andCombinatorial Science vol 27 no 5 pp 618ndash626 2008
[12] G-N Lu X-Q Tao Z Dang X-Y Yi and C Yang ldquoEstimationof n-octanolwater partition coefficients of polycyclic aromatichydrocarbons by quantum chemical descriptorsrdquo Central Euro-pean Journal of Chemistry vol 6 no 2 pp 310ndash318 2008
8 Journal of Chemistry
[13] W Zhou Z Zhai Z Wang and L Wang ldquoEstimation of n-octanolwater partition coefficients (119870OW) of all PCB congenersby density functional theoryrdquo Journal of Molecular Structurevol 755 no 1ndash3 pp 137ndash145 2005
[14] F Verweij K Booij K Satumalay N Van Der Molen andR Van Der Oost ldquoAssessment of bioavailable PAH PCB andOCP concentrations in water using semipermeable membranedevices (SPMDs) sediments and caged carprdquoChemosphere vol54 no 11 pp 1675ndash1689 2004
[15] L Jantschi S Bolboaca and R Sestras ldquoMeta-heuristics onquantitative structure -activity relationships study on polychlo-rinated biphenylsrdquo Journal of Molecular Modeling vol 16 pp377ndash386 2010
[16] P Ruiz O Faroon C J Moudgal H Hansen C T DeRosa and M Mumtaz ldquoPrediction of the health effects ofpolychlorinated biphenyls (PCBs) and their metabolites usingquantitative structure-activity relationship (QSAR)rdquo ToxicologyLetters vol 181 no 1 pp 53ndash65 2008
[17] L-T Qin S-S Liu H-L Liu and H-L Ge ldquoA new predictivemodel for the bioconcentration factors of polychlorinatedbiphenyls (PCBs) based on the molecular electronegativitydistance vector (MEDV)rdquo Chemosphere vol 70 no 9 pp 1577ndash1587 2008
[18] K Tuppurainen and J Ruuskanen ldquoElectronic eigenvalue(EEVA) a new QSARQSPR descriptor for electronic sub-stituent effects based on molecular orbital energies A QSARapproach to the Ah receptor binding affinity of polychlorinatedbiphenyls (PCBs) dibenzo-p-dioxins (PCDDs) and dibenzofu-rans (PCDFs)rdquo Chemosphere vol 41 no 6 pp 843ndash848 2000
[19] Z Daren ldquoQSPR studies of PCBs by the combination of geneticalgorithms and PLS analysisrdquoComputers and Chemistry vol 25no 2 pp 197ndash204 2001
[20] E B de Melo ldquoA new quantitative structure-property rela-tionship model to predict bioconcentration factors of poly-chlorinated biphenyls (PCBs) in fishes using E-state indexand topological descriptorsrdquo Ecotoxicology and EnvironmentalSafety vol 75 no 1 pp 213ndash222 2012
[21] J Y Long H B Yi X K Liu and Y F Wang ldquoQuantitativestructure-property relationship for polychlorinated biphenylstoxicity and structure by density functional theoryrdquoActa Chim-ica Sinica vol 70 pp 949ndash960 2012
[22] G-N Lu X-Q Tao Z H I Dang W Huang and Z LildquoQuantitative structureproperty relationships on dissolvabilityof pcddfs using quantum chemical descriptors and partial leastsquaresrdquo Journal of Theoretical and Computational Chemistryvol 9 no 1 pp 9ndash22 2010
[23] M J Frisch G W Trucks H B Schlegel et al Gaussian 03Revision B 01 Gaussian Pittsburgh Pa USA 2003
[24] S Wold M Sjostrom and L Eriksson ldquoPLS-regression a basictool of chemometricsrdquoChemometrics and Intelligent LaboratorySystems vol 58 no 2 pp 109ndash130 2001
[25] A B Umetrics USerrsquos Guide to SIMCA-P SIMCA-P+ Version10 0 Umea Sweden 2005
[26] B Z Zhao Y Y Li X R Hao et al ldquoTheoretical calculation ofthe octanolwater partition coefficient for polychlorobiphenylsrdquoJournal of Northeast Normal University vol 36 no 2 pp 41ndash442004
[27] J-W Zou Y-J Jiang G-X Hu M Zeng S-L Zhuang and Q-S Yu ldquoQSPRQSAR studies on the physicochemical propertiesand biological activities of polychlorinated biphenylsrdquo ActaPhysico vol 21 no 3 pp 267ndash272 2005
[28] X-Y Han Z-YWang Z-C Zhai and L-S Wang ldquoEstimationof n-octanolwater partition coefficients (119870OW) of all PCBcongeners by ab initio and a Cl substitution position methodrdquoQSAR and Combinatorial Science vol 25 no 4 pp 333ndash3412006
Submit your manuscripts athttpwwwhindawicom
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Inorganic ChemistryInternational Journal of
Hindawi Publishing Corporation httpwwwhindawicom Volume 2014
International Journal ofPhotoenergy
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Carbohydrate Chemistry
International Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Journal of
Chemistry
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Advances in
Physical Chemistry
Hindawi Publishing Corporationhttpwwwhindawicom
Analytical Methods in Chemistry
Journal of
Volume 2014
Bioinorganic Chemistry and ApplicationsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
SpectroscopyInternational Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014
Medicinal ChemistryInternational Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Chromatography Research International
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Applied ChemistryJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Theoretical ChemistryJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Journal of
Spectroscopy
Analytical ChemistryInternational Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Quantum Chemistry
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Organic Chemistry International
ElectrochemistryInternational Journal of
Hindawi Publishing Corporation httpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
CatalystsJournal of
Journal of Chemistry 7
represents the proton acceptance ability in forming hydrogenbond while 119864LUMO represents the proton donation ability inthe formation of hydrogen bond Therefore the compoundswith large values of 119864HOMO and 119864LUMO tend to donate oraccept protons easily In this case hydrogen bond can beeasily formed between PCBs and H
2O molecules and PCBs
are able to enter water phase resulting in low log119870OW value[13]
33 Comparison with the Results from Other Models In [13]there are 3 variables (119864HOMO 119864LUMO and 120572) obtained fromB3LYP6-31G(d) level in the model and the squared correla-tion coefficient is 1198772 = 09484 less than that of the optimalmodel in the present study It was found that there are highcorrelations relating the number of Cl substitution position(119873) to120572 and119864HOMO And taking119873 as theoretical descriptorsto correlate the three-variable models for predicting log119870OWof PCBs the achieved new model is having 1198772 = 09497and exhibiting optimum stability convenience for use andbetter predictive power than that from AM1 method [28]In contrast the squared correlation coefficient 1198772 = 0992obtained by PLS in the present study is obviously larger thanthat reported
Taking molar weight (119872) and melting point (119879119898 ∘C) as
independent variables Wang et al [10] had developed thebinary linear regression equation
log119870OW = 00155119872 + 000107 (119879119898 minus 25)
+ 171 (119899 = 27 119903 = 0997)
(1)
whose predicted log119870OW are also listed in Table 1 whichseems superior to the optimal model I in the present studyobtained solely on quantum chemical descriptors Howeverthe variable melting point in the model equation is anexperimental physicochemical parameter It is impracticalto determine the melting points of all 209 PCBs directly inthe laboratory which makes the model become of limitedapplicability Therefore our optimal model based only onquantum chemical descriptors has wider applicability thanthat based on known physicochemical propertiesThe overallhigh quality of themodel obtained in this study indicates thatit will find application in the estimation of 119870OW of PCBs forwhich experimental values are lacking
4 Conclusions
In this work based on some quantum chemical descriptorscomputed by DFT at the B3LYP6-31G(d) level by the useof PLS analysis QSAR models were obtained for logarithmicn-octanolwater partition coefficients of PCBs The squaredcorrelation coefficient of the optimalmodel is 0992Theopti-mal model has high fitting precision and good predictabilityso it can be applied in the estimation of the n-octanolwaterpartition behavior of PCBs It can generally be concludedthat PCBs with larger electronic spatial extent and lowermolecular total energy values tend to be more hydrophobicand lipophilic
Conflict of Interests
The authors declare no conflict of interests
Acknowledgments
This study was financially supported by the National NaturalScience Foundation of China (21007017) the Program forNew Century Excellent Talents in University (NCET-12-0199) the Specialized Research Fund for the Doctoral Pro-gram of Higher Education (20100172120037) the China Post-doctoral Science Foundation (201003350) and the Guang-dong Natural Science Foundation (9351064101000001)
References
[1] B G Hansen A B Paya-Perez M Rahman and B RLarsen ldquoQSARs for 119870OW and 119870OC of PCB congeners a criticalexamination of data assumptions and statistical approachesrdquoChemosphere vol 39 no 13 pp 2209ndash2228 1999
[2] S-S Liu Y Liu D-Q Yin X-D Wang and L-S WangldquoPrediction of chromatographic relative retention time ofpolychlorinated biphenyls from the molecular electronegativitydistance vectorrdquo Journal of Separation Science vol 29 no 2 pp296ndash301 2006
[3] United Nations Environment Programme ldquoStockholm Con-vention on Persistent Organic Pollutantsrdquo 2001 httpwwwpopsintdocumentsconvtextconvtext enpdf
[4] J P Giesy and K Kannan ldquoDioxin-like and non-dioxin-liketoxic effects of polychlorinated biphenyls (PCBs) implicationsfor risk assessmentrdquo Critical Reviews in Toxicology vol 28 no6 pp 511ndash569 1998
[5] C T Chiou V H Freed DW Schmedding and R L KohnertldquoPartition coefficient and bioaccumulation of selected organicchemicalsrdquo Environmental Science and Technology vol 11 no 5pp 475ndash478 1977
[6] H Konemann ldquoStructure-activity relationships and additivityin fish toxicities of environmental pollutantsrdquoEcotoxicology andEnvironmental Safety vol 4 no 4 pp 415ndash421 1980
[7] G Patil ldquoCorrelation of aqueous solubility and octanol-waterpartition coefficient based on molecular structurerdquo Chemo-sphere vol 22 no 8 pp 723ndash738 1991
[8] DMackay A BobraW Y Shiu and SH Yalkowsky ldquoRelation-ships between aqueous solubility and octanol-water partitioncoefficientsrdquo Chemosphere vol 9 no 11 pp 701ndash711 1980
[9] A Sabljic H Gusten J Hermens andAOpperhuizen ldquoModel-ing octanolwater partition coefficients by molecular topologychlorinated benzenes and biphenylsrdquo Environmental Scienceand Technology vol 27 no 7 pp 1394ndash1402 1993
[10] L S Wang Y H Zhao and H Gao ldquoPredicting aqueoussolubility and octanolwater partition coefficients of organicchemicals from molar volumerdquo Environmental Chemistry vol11 pp 55ndash70 1992
[11] G-N Lu Z Dang X-Q Tao and X-Y Yi ldquoEstimation of watersolubility of polycyclic aromatic hydrocarbons using quantumchemical descriptors and partial least squaresrdquo QSAR andCombinatorial Science vol 27 no 5 pp 618ndash626 2008
[12] G-N Lu X-Q Tao Z Dang X-Y Yi and C Yang ldquoEstimationof n-octanolwater partition coefficients of polycyclic aromatichydrocarbons by quantum chemical descriptorsrdquo Central Euro-pean Journal of Chemistry vol 6 no 2 pp 310ndash318 2008
8 Journal of Chemistry
[13] W Zhou Z Zhai Z Wang and L Wang ldquoEstimation of n-octanolwater partition coefficients (119870OW) of all PCB congenersby density functional theoryrdquo Journal of Molecular Structurevol 755 no 1ndash3 pp 137ndash145 2005
[14] F Verweij K Booij K Satumalay N Van Der Molen andR Van Der Oost ldquoAssessment of bioavailable PAH PCB andOCP concentrations in water using semipermeable membranedevices (SPMDs) sediments and caged carprdquoChemosphere vol54 no 11 pp 1675ndash1689 2004
[15] L Jantschi S Bolboaca and R Sestras ldquoMeta-heuristics onquantitative structure -activity relationships study on polychlo-rinated biphenylsrdquo Journal of Molecular Modeling vol 16 pp377ndash386 2010
[16] P Ruiz O Faroon C J Moudgal H Hansen C T DeRosa and M Mumtaz ldquoPrediction of the health effects ofpolychlorinated biphenyls (PCBs) and their metabolites usingquantitative structure-activity relationship (QSAR)rdquo ToxicologyLetters vol 181 no 1 pp 53ndash65 2008
[17] L-T Qin S-S Liu H-L Liu and H-L Ge ldquoA new predictivemodel for the bioconcentration factors of polychlorinatedbiphenyls (PCBs) based on the molecular electronegativitydistance vector (MEDV)rdquo Chemosphere vol 70 no 9 pp 1577ndash1587 2008
[18] K Tuppurainen and J Ruuskanen ldquoElectronic eigenvalue(EEVA) a new QSARQSPR descriptor for electronic sub-stituent effects based on molecular orbital energies A QSARapproach to the Ah receptor binding affinity of polychlorinatedbiphenyls (PCBs) dibenzo-p-dioxins (PCDDs) and dibenzofu-rans (PCDFs)rdquo Chemosphere vol 41 no 6 pp 843ndash848 2000
[19] Z Daren ldquoQSPR studies of PCBs by the combination of geneticalgorithms and PLS analysisrdquoComputers and Chemistry vol 25no 2 pp 197ndash204 2001
[20] E B de Melo ldquoA new quantitative structure-property rela-tionship model to predict bioconcentration factors of poly-chlorinated biphenyls (PCBs) in fishes using E-state indexand topological descriptorsrdquo Ecotoxicology and EnvironmentalSafety vol 75 no 1 pp 213ndash222 2012
[21] J Y Long H B Yi X K Liu and Y F Wang ldquoQuantitativestructure-property relationship for polychlorinated biphenylstoxicity and structure by density functional theoryrdquoActa Chim-ica Sinica vol 70 pp 949ndash960 2012
[22] G-N Lu X-Q Tao Z H I Dang W Huang and Z LildquoQuantitative structureproperty relationships on dissolvabilityof pcddfs using quantum chemical descriptors and partial leastsquaresrdquo Journal of Theoretical and Computational Chemistryvol 9 no 1 pp 9ndash22 2010
[23] M J Frisch G W Trucks H B Schlegel et al Gaussian 03Revision B 01 Gaussian Pittsburgh Pa USA 2003
[24] S Wold M Sjostrom and L Eriksson ldquoPLS-regression a basictool of chemometricsrdquoChemometrics and Intelligent LaboratorySystems vol 58 no 2 pp 109ndash130 2001
[25] A B Umetrics USerrsquos Guide to SIMCA-P SIMCA-P+ Version10 0 Umea Sweden 2005
[26] B Z Zhao Y Y Li X R Hao et al ldquoTheoretical calculation ofthe octanolwater partition coefficient for polychlorobiphenylsrdquoJournal of Northeast Normal University vol 36 no 2 pp 41ndash442004
[27] J-W Zou Y-J Jiang G-X Hu M Zeng S-L Zhuang and Q-S Yu ldquoQSPRQSAR studies on the physicochemical propertiesand biological activities of polychlorinated biphenylsrdquo ActaPhysico vol 21 no 3 pp 267ndash272 2005
[28] X-Y Han Z-YWang Z-C Zhai and L-S Wang ldquoEstimationof n-octanolwater partition coefficients (119870OW) of all PCBcongeners by ab initio and a Cl substitution position methodrdquoQSAR and Combinatorial Science vol 25 no 4 pp 333ndash3412006
Submit your manuscripts athttpwwwhindawicom
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Inorganic ChemistryInternational Journal of
Hindawi Publishing Corporation httpwwwhindawicom Volume 2014
International Journal ofPhotoenergy
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Carbohydrate Chemistry
International Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Journal of
Chemistry
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Advances in
Physical Chemistry
Hindawi Publishing Corporationhttpwwwhindawicom
Analytical Methods in Chemistry
Journal of
Volume 2014
Bioinorganic Chemistry and ApplicationsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
SpectroscopyInternational Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014
Medicinal ChemistryInternational Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Chromatography Research International
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Applied ChemistryJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Theoretical ChemistryJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Journal of
Spectroscopy
Analytical ChemistryInternational Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Quantum Chemistry
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Organic Chemistry International
ElectrochemistryInternational Journal of
Hindawi Publishing Corporation httpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
CatalystsJournal of
8 Journal of Chemistry
[13] W Zhou Z Zhai Z Wang and L Wang ldquoEstimation of n-octanolwater partition coefficients (119870OW) of all PCB congenersby density functional theoryrdquo Journal of Molecular Structurevol 755 no 1ndash3 pp 137ndash145 2005
[14] F Verweij K Booij K Satumalay N Van Der Molen andR Van Der Oost ldquoAssessment of bioavailable PAH PCB andOCP concentrations in water using semipermeable membranedevices (SPMDs) sediments and caged carprdquoChemosphere vol54 no 11 pp 1675ndash1689 2004
[15] L Jantschi S Bolboaca and R Sestras ldquoMeta-heuristics onquantitative structure -activity relationships study on polychlo-rinated biphenylsrdquo Journal of Molecular Modeling vol 16 pp377ndash386 2010
[16] P Ruiz O Faroon C J Moudgal H Hansen C T DeRosa and M Mumtaz ldquoPrediction of the health effects ofpolychlorinated biphenyls (PCBs) and their metabolites usingquantitative structure-activity relationship (QSAR)rdquo ToxicologyLetters vol 181 no 1 pp 53ndash65 2008
[17] L-T Qin S-S Liu H-L Liu and H-L Ge ldquoA new predictivemodel for the bioconcentration factors of polychlorinatedbiphenyls (PCBs) based on the molecular electronegativitydistance vector (MEDV)rdquo Chemosphere vol 70 no 9 pp 1577ndash1587 2008
[18] K Tuppurainen and J Ruuskanen ldquoElectronic eigenvalue(EEVA) a new QSARQSPR descriptor for electronic sub-stituent effects based on molecular orbital energies A QSARapproach to the Ah receptor binding affinity of polychlorinatedbiphenyls (PCBs) dibenzo-p-dioxins (PCDDs) and dibenzofu-rans (PCDFs)rdquo Chemosphere vol 41 no 6 pp 843ndash848 2000
[19] Z Daren ldquoQSPR studies of PCBs by the combination of geneticalgorithms and PLS analysisrdquoComputers and Chemistry vol 25no 2 pp 197ndash204 2001
[20] E B de Melo ldquoA new quantitative structure-property rela-tionship model to predict bioconcentration factors of poly-chlorinated biphenyls (PCBs) in fishes using E-state indexand topological descriptorsrdquo Ecotoxicology and EnvironmentalSafety vol 75 no 1 pp 213ndash222 2012
[21] J Y Long H B Yi X K Liu and Y F Wang ldquoQuantitativestructure-property relationship for polychlorinated biphenylstoxicity and structure by density functional theoryrdquoActa Chim-ica Sinica vol 70 pp 949ndash960 2012
[22] G-N Lu X-Q Tao Z H I Dang W Huang and Z LildquoQuantitative structureproperty relationships on dissolvabilityof pcddfs using quantum chemical descriptors and partial leastsquaresrdquo Journal of Theoretical and Computational Chemistryvol 9 no 1 pp 9ndash22 2010
[23] M J Frisch G W Trucks H B Schlegel et al Gaussian 03Revision B 01 Gaussian Pittsburgh Pa USA 2003
[24] S Wold M Sjostrom and L Eriksson ldquoPLS-regression a basictool of chemometricsrdquoChemometrics and Intelligent LaboratorySystems vol 58 no 2 pp 109ndash130 2001
[25] A B Umetrics USerrsquos Guide to SIMCA-P SIMCA-P+ Version10 0 Umea Sweden 2005
[26] B Z Zhao Y Y Li X R Hao et al ldquoTheoretical calculation ofthe octanolwater partition coefficient for polychlorobiphenylsrdquoJournal of Northeast Normal University vol 36 no 2 pp 41ndash442004
[27] J-W Zou Y-J Jiang G-X Hu M Zeng S-L Zhuang and Q-S Yu ldquoQSPRQSAR studies on the physicochemical propertiesand biological activities of polychlorinated biphenylsrdquo ActaPhysico vol 21 no 3 pp 267ndash272 2005
[28] X-Y Han Z-YWang Z-C Zhai and L-S Wang ldquoEstimationof n-octanolwater partition coefficients (119870OW) of all PCBcongeners by ab initio and a Cl substitution position methodrdquoQSAR and Combinatorial Science vol 25 no 4 pp 333ndash3412006
Submit your manuscripts athttpwwwhindawicom
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Inorganic ChemistryInternational Journal of
Hindawi Publishing Corporation httpwwwhindawicom Volume 2014
International Journal ofPhotoenergy
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Carbohydrate Chemistry
International Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Journal of
Chemistry
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Advances in
Physical Chemistry
Hindawi Publishing Corporationhttpwwwhindawicom
Analytical Methods in Chemistry
Journal of
Volume 2014
Bioinorganic Chemistry and ApplicationsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
SpectroscopyInternational Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014
Medicinal ChemistryInternational Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Chromatography Research International
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Applied ChemistryJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Theoretical ChemistryJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Journal of
Spectroscopy
Analytical ChemistryInternational Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Quantum Chemistry
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Organic Chemistry International
ElectrochemistryInternational Journal of
Hindawi Publishing Corporation httpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
CatalystsJournal of
Submit your manuscripts athttpwwwhindawicom
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Inorganic ChemistryInternational Journal of
Hindawi Publishing Corporation httpwwwhindawicom Volume 2014
International Journal ofPhotoenergy
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Carbohydrate Chemistry
International Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Journal of
Chemistry
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Advances in
Physical Chemistry
Hindawi Publishing Corporationhttpwwwhindawicom
Analytical Methods in Chemistry
Journal of
Volume 2014
Bioinorganic Chemistry and ApplicationsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
SpectroscopyInternational Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014
Medicinal ChemistryInternational Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Chromatography Research International
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Applied ChemistryJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Theoretical ChemistryJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Journal of
Spectroscopy
Analytical ChemistryInternational Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Quantum Chemistry
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Organic Chemistry International
ElectrochemistryInternational Journal of
Hindawi Publishing Corporation httpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
CatalystsJournal of