Literature Review: Structure Elucidation of Emerging Contaminants

State of the Art: Structure Elucidation and Formula Finding of Emerging Contaminants in Water

Jennifer Schollée
Literature Review
Universiteit van Amsterdam – Institute for Biodiversity and Ecosystem Dynamics
Eawag – Department of Environmental Chemistry
Supervisor: prof. Pim de Voogt
July 16, 2012



Table of Contents
I. Abstract
II. Introduction
III. Overview of Analytical Methods
IV. Steps for Identification
    i. Accurate Mass and MS/MS Fragmentation
    ii. Fiehn-Kind Seven Golden Rules
    iii. Database Searches
    iv. Isotope Patterns
    v. Structure-property Relationships
    vi. Computer-assisted Analysis
V. Transformation Products
    i. Laboratory Prediction
    ii. Degradation-Fragmentation Relationship
    iii. Computational Prediction of Fragmentation
    iv. Computational Prediction of Transformation Products
VI. Non-target Screening with Statistics
VII. Conclusions
VIII. References


Abbreviations
APCI – atmospheric-pressure chemical ionization
APPI – atmospheric-pressure photoionization
CHI – Chromatographic Hydrophobicity Index
CID – collision-induced dissociation
CODA – component detection algorithm
DAIOS – Database-Assisted Identification of Organic Substances
EDA – effect-directed analysis
EI – electron ionization
ESI – electrospray ionization
FMF – Find Molecular Feature
FT-ICR – Fourier transform ion cyclotron resonance
FWHM – full width at half maximum
GC – gas chromatography
H/D – hydrogen/deuterium
HMMM – hexa(methoxymethyl)melamine
HPLC – high-performance liquid chromatography
HRMS – high-resolution mass spectrometry
LC – liquid chromatography
LOD – limit of detection
LSER – Linear Solvation Energy Relationship
LTR – linear ion trap
MFE – Molecular Feature Extraction
MS/MS – tandem mass spectrometry
MultiCASE – Multiple Computer Automated Structure Evaluation
MVA – multivariate analysis
NIST – National Institute of Standards and Technology
NMR – nuclear magnetic resonance
QqQ – triple quadrupole
QTOF – quadrupole/time-of-flight
R – resolution
RIZA – Institute for Inland Water Management and Wastewater Treatment in the Netherlands
SIM – selected-ion monitoring
SRM – selected-reaction monitoring
SPC – sulfophenyl carboxylate
SPE – solid-phase extraction
TOF – time-of-flight
TP – transformation product
UPLC – ultra-high performance liquid chromatography
UV – unit variance
WWTP – wastewater treatment plant


I. Abstract
Emerging contaminants and their transformation products (TPs) pose a threat to the environment but cannot be monitored with traditional targeted screening methods. Methods for non-target screening, i.e., monitoring without prior knowledge of the compounds of interest, have been developed to assess the risk of emerging contaminants in the environment. Analysis with liquid chromatography – mass spectrometry (LC-MS) has become increasingly important, especially for chemicals in aquatic environments, since this technique is suited to measuring polar compounds, which are more mobile in water than apolar compounds. In particular, high-resolution MS (HRMS) is critical for identifying the elemental composition of an unknown, and tandem MS (MS/MS) fragmentation can be used to gain structural information. This literature study evaluated current trends for structure elucidation of emerging contaminants in water samples, focusing on the data analysis processes. Most studies advocated a step-wise approach to the evaluation of LC-MS data (Ibáñez et al. 2008; Hogenboom et al. 2009; Krauss et al. 2010; Nurmi et al. 2012). In the majority of studies it was first recommended to predict the molecular formula from accurate mass measurements collected with HRMS. Elements to be included in the formula could be restricted based on the isotope patterns, particularly for bromine, chlorine, or sulfur (Bobeldijk et al. 2001; Martinez Bueno et al. 2007). Searches of databases such as PubChem or ChemSpider were then used to compile a list of candidate structures. The list of structures could be further refined with the use of physicochemical properties such as log Kow or the Chromatographic Hydrophobicity Index (CHI) (Kern et al. 2009; Ulrich et al. 2011). Confirmation of structures could be carried out with MS/MS fragmentation patterns and finally by comparison to a reference standard.
The identification of TPs poses a larger problem, because searches in a database will not be successful for these compounds. TPs have so far been most effectively identified when assuming some similarity with the parent compound. In particular, the establishment of a fragmentation-degradation relationship has helped to identify TPs in water samples (Gómez et al. 2008; Gómez-Ramos et al. 2011). This idea has also been used to group classes of compounds together to speed identification (Hao et al. 2008; Kaufmann et al. 2011). Additionally, software has been developed to predict degradation products of a parent compound, as well as fragmentation spectra of a compound (Hill et al. 2008; Kern et al. 2009; Helbling et al. 2010a; Nurmi et al. 2012). Calculations with these types of software can be useful because they provide information about the possible TPs without labor-intensive laboratory degradation experiments (Kern et al. 2009). But further work still needs to be done to streamline identification of non-target analytes. In particular, the use of statistical methods for the evaluation of the large LC-MS datasets has not been explored much, although it can be useful for selecting peaks which are most important to identify during non-target analysis (Müller et al. 2011).


II. Introduction
One of the largest issues facing analytical environmental chemistry today is the identification of new and emerging contaminants in the environment. Although analytical techniques have improved in sensitivity and selectivity in recent years, the task of identifying compounds in samples remains daunting. Manual interpretation of chromatograms and spectra can be attempted, but it is extremely time-consuming and requires a high degree of expert knowledge for successful results. The objective of this literature review is to assess the data analysis techniques used in non-target screening of emerging contaminants and their transformation products (TPs) in surface water and wastewater. The focus is on liquid chromatography (LC) coupled to high-resolution mass spectrometry (HRMS) and tandem MS (MS/MS).
Emerging contaminants are an ever-pressing issue in environmental research (Hogenboom et al. 2009; Gómez-Ramos et al. 2011; Fischer et al. 2012; Zedda and Zwiener 2012). They comprise a wide array of chemical classes with varying physicochemical properties (Hernández et al. 2012). Additionally, with increases both in production volumes and in the number of chemicals being produced, the impact of emerging chemicals will only grow (Nurmi et al. 2012). The definition of an emerging contaminant, as explained in Horvat et al. (2012), is any chemical that is not currently considered in standard environmental testing or regulations. Production of the compound need not have started recently, but its toxicological effects are unknown or rarely studied. According to Hogenboom et al. (2009), there is little to no data on approximately 90% of the chemicals commonly available, which indicates that the field of potential emerging contaminants is very broad.
Some of the emerging chemicals under discussion today include pesticides, pharmaceuticals, personal care products, hormones, surfactants, perfluorinated compounds, artificial sweeteners, siloxanes, X-ray contrast media, and brominated flame retardants (Richardson and Ternes 2011; Nurmi et al. 2012; Zedda and Zwiener 2012). In addition to the parent compounds that are being created and released into the environment, both traditional pollutants and emerging contaminants can decompose into a myriad of metabolites and TPs. Compounds such as pharmaceuticals are first ingested and metabolized, forming phase I and phase II metabolites, where phase I generally involves oxidative processes while phase II metabolism includes conjugation, especially with glucuronide or sulfate (Escher and Fenner 2011). It has been shown that these TPs can be even more prevalent and toxic than the parent compounds (Ferrer et al. 2004; Gómez et al. 2008; Kern et al. 2009). Many chemicals are converted to more polar compounds during metabolism (Ibáñez et al. 2006), and these metabolized pharmaceuticals are therefore potentially very mobile within an aquatic system. In many cases polar compounds are able to pass through wastewater treatment without further breakdown (Weiss and Reemtsma 2005; Huset et al. 2008), so organisms that live downstream of wastewater treatment plants (WWTPs) may still be exposed to these TPs. But even compounds that are not directly ingested and metabolized could potentially form toxic TPs through either natural or anthropogenic processes. Within the
environment, processes such as photolysis, oxidation, and hydrolysis have the potential to degrade chemicals. Anthropogenic-induced transformation, namely from wastewater treatment processes, including ozonation, chlorination, chloramination, and advanced oxidation, can also produce new TPs (Escher and Fenner 2011). Microbial degradation is another important process to consider, and one which may occur either through natural or anthropogenic circumstances. Since WWTPs are not designed for emerging contaminants, it is not known how these compounds will transform during treatment. For this reason, non-target screening plays an important role in identifying not only emerging contaminants but their potentially toxic breakdown products. The most critical and labor-intensive step in the non-target screening process is identification of detected unknowns (Gómez-Ramos et al. 2011; Nurmi et al. 2012). Although a lot of work has gone into the development of analytical methods for measuring chemicals in surface water and wastewater, identification of compounds still remains a limiting part of non-target analysis. This review was undertaken to explore some of the proposed methods for structure elucidation of emerging chemicals currently being researched, focusing mainly on the data processing aspect. As a result, this literature study will only include a small overview of the current measurement techniques in use, summarizing the most common methods of analysis. No critical evaluation was made about these different methods and, during the discussion of the identification methods, these analytical differences will be set aside unless relevant. Since this review focuses on non-target screening, ideally the method used should not be tailored for specific classes of chemicals. Unfortunately currently in the literature there are some discrepancies about the definition of non-target screening. García-Reyes et al. 
(2006) defined non-target screening as the search for compounds that are not currently monitored, even if this is done in a targeted way, such as searching for the mass of the suspected contaminant. Ibáñez et al. (2008) referred to this as “non-target known” screening. They also defined “non-target unknown” screening, which involved the identification of all peaks within a sample. In Helbling et al. (2010a), this definition was taken a step further and they defined non-target as a screening where two samples are compared and identification is initiated on all peaks found only in one of the samples. Both this definition and the “non-target unknown” screening described by Ibáñez et al. (2008) are considered non-target screening for the purposes of this review. In general, any identification initiated on compounds detected in a sample that were not set a priori is defined here as non-target. Nevertheless, some of the studies discussed in this review did target one or more compound classes. In these cases, the authors still classified their study as non-target due to the fact that they were not searching for specific chemicals but rather broad classes. For the purposes of this study and to expand the breadth of research analyzed, these results were included in the evaluation. But these studies were considered with a little more care, since they did not fall within the definition of non-target screening.

III. Overview of Analytical Methods
Screening for environmental contaminants started with gas chromatography (GC) and remained focused on that technique for many years (Giger 2009). Because measurement with
GC requires compounds to be volatile or semi-volatile, environmental research focused on nonpolar pollutants. But the introduction of high-performance LC (HPLC) and ultra-high performance LC (UPLC) allowed LC separations to approach GC in resolution. The introduction of new ionization techniques such as electrospray (ESI) also pushed LC-MS into the realm of environmental research, specifically for its ability to detect polar compounds. LC is particularly valuable when concerned with contaminants in water since it is a better method for analysis of polar contaminants, which are mobile and therefore more relevant in an aquatic environment. Finally, since TPs are often more polar than the parent chemicals, LC is likely to be a more suitable technique for identifying these compounds. For this overview an emphasis was made on LC-HRMS and LC-MS/MS methods. These techniques have gained a certain degree of prominence in the field of emerging chemical research within the last 15 years, especially with respect to aqueous samples (Krauss et al. 2010). Although there is much variation in LC instrumentation and implementation, this step of the analytical method will not be discussed. For any LC-MS analysis, the solvents and gradients used to achieve separation need to be optimized, including consideration of a sample clean-up step such as solid-phase extraction (SPE). Extensive research has been devoted to this optimization; Hernández et al. (2005) provided a thorough review of LC-MS and LC-MS/MS instrument set-ups and Farré et al. (2008) provided a list of references of analytical methods for various classes of emerging chemicals. Another element that can be adjusted is the ionization method at the LC-MS interface. Alterations to the ionization method change the chemicals that can be detected. Adjustments to this part of the method are also outside the scope of this report. 
In general, ESI is standard for LC-MS systems, although some laboratories have used atmospheric-pressure chemical ionization (APCI) or atmospheric-pressure photoionization (APPI). All studies discussed in this review used ESI unless otherwise noted. These ionization methods are low-energy and generate very few fragments during MS analysis, so collision-induced dissociation (CID) is used when additional fragmentation is needed for structural information. Much discussion has gone into the MS and MS/MS systems used for non-target analysis. MS instruments are compared on various criteria, including mass accuracy, linear dynamic range, sensitivity, and resolution. Mass accuracy describes how closely the measured m/z value agrees with the theoretical value and is usually expressed as a relative error in ppm. Linear dynamic range is the number of orders of magnitude of concentration over which the instrument response remains linear. Sensitivity describes the lower limit of the concentration range which the instrument can detect, and resolution (R) defines the ability of the instrument to separate two MS peaks. Resolution is calculated as R = m/Δm for a specific mass m, with the peak width Δm taken at a specific peak height. Generally R is defined with Δm as the full width at half maximum (FWHM). High resolution is especially critical for the separation of isobaric compounds (Kind and Fiehn 2006). The choice of MS is driven by the different strengths and weaknesses of various instruments. Initially, triple quadrupole (QqQ) instruments were very popular because of their high selectivity and sensitivity. But these instruments only measure nominal masses
and high sensitivity can only be achieved when operating in selected-ion monitoring (SIM) or selected-reaction monitoring (SRM) modes. Therefore these instruments are more suited to targeted analysis and not for non-target screening, where full-scan mode is necessary to detect unexpected peaks in the sample and accurate mass is needed for structure elucidation. Instead, in non-target monitoring, QqQ instruments have been replaced by time-of-flight (TOF) or Orbitrap instruments. Both TOF and Orbitrap instruments can achieve high resolution; TOF generally in the range of 20,000 and Orbitrap instruments around 100,000 (FWHM, m/z 300-400) (Krauss et al. 2010). Fourier transform ion cyclotron resonance (FT-ICR) instruments are currently the most sensitive, achieving a resolution of 1,000,000 and mass accuracy less than one ppm over a linear dynamic range of four orders of magnitude (Krauss et al. 2010). But these instruments are too expensive for many laboratories. The hybridization of two different MS instruments can often profit from the strengths of both. For example, the quadrupole-TOF (QTOF) or linear ion trap – Orbitrap (LTR-Orbitrap) combinations are especially common currently. Replacing the third quadrupole in a QqQ system with a TOF instrument allows for high mass accuracy measurements of the final spectra (Hollender et al. 2010). High sensitivity and low mass differences are essential for proper identification of unknowns. In general, TOF-MS systems have a high level of sensitivity (and therefore low limits of detection (LODs)), but are lacking in mass accuracy (Nurmi et al. 2012). By comparison, QTOF-MS systems have increased sensitivity and are therefore more suited for environmental analysis. The LTR-Orbitrap combination can reach a resolution on the order of 60,000 (Helbling 2010a) and is able to measure accurate mass in the first MS (prior to fragmentation) as well as the second MS (after fragmentation). 
For this hybrid, full-scan mode followed by triggered (data-dependent) MS/MS provides the advantages of both: molecular ion information from the full scan and fragmentation information from the tandem MS.
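
The resolution and mass accuracy figures quoted in this section can be related to peak widths and mass errors with a short calculation. The following is a minimal sketch in Python, assuming the FWHM definition R = m/Δm and the usual relative (ppm) definition of mass error; the function names are chosen here for illustration and do not come from any of the cited software packages.

```python
def resolving_power(mz: float, fwhm: float) -> float:
    """R = m / delta-m, with delta-m taken as the peak width at half maximum."""
    return mz / fwhm


def peak_width_at(mz: float, resolution: float) -> float:
    """Peak width (FWHM, in m/z units) expected at a given m/z and resolution."""
    return mz / resolution


def ppm_error(measured_mz: float, theoretical_mz: float) -> float:
    """Relative mass error in parts per million."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6


if __name__ == "__main__":
    # Nominal resolutions as quoted in the text (Krauss et al. 2010).
    for name, r in [("TOF", 20_000), ("Orbitrap", 100_000), ("FT-ICR", 1_000_000)]:
        width = peak_width_at(400.0, r)
        print(f"{name:8s} R={r:>9,d}  FWHM at m/z 400 = {width:.5f}")
```

At m/z 400, a resolution of 20,000 corresponds to a peak width of 0.02, so two ions closer together than roughly that distance will not be separated at half height.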

IV. Steps for Identification
After the introduction of these new measurement techniques, it became necessary to understand how best to use the information generated. As mentioned previously, the positive identification of a compound without any prior knowledge of its structure can be a time-consuming process. For this reason, many studies focused on ways to narrow down the list of possible candidates to simplify identification. Some of these steps are explained below.

i. Accurate Mass and MS/MS Fragmentation
One of the earliest studies to look at the screening of unknowns in water samples was Bobeldijk et al. (2001). They used a multi-step process for the tentative identification of unknowns found in surface water samples. Samples were analyzed with an LC-QTOF-MS system run in full-scan mode, and the highest-intensity ion triggered MS/MS. They tested their identification procedure with spiked samples; the set of six target compounds included one insecticide and five herbicides. Samples were first compared to the procedural blank and peaks which appeared in both chromatograms were discounted, a comparison which appears to have been manual.
Then, based on accurate mass measurements, the likely elemental composition was calculated with a tool included in the MassLynx software, version 3.4 (Waters). The elements C, N, O, H, P, and S were automatically considered. The elements Cl and Br were also considered if the isotope pattern suggested it. After a chemical formula was proposed, searches were done in a number of databases for the possible compound and structure, including the Merck Index (https://themerckindex.cambridgesoft.com/), National Institute of Standards and Technology (NIST) mass spectra library (http://www.sisweb.com/software/ms/nist.htm), InfoSpec© GC-MS database (Kiwa), and the Chemical Ionization – Collision-Induced Dissociation (CI-CID) mass spectra database from the Institute for Inland Water Management and Wastewater Treatment in the Netherlands (RIZA). Even at the lowest concentration tested, Bobeldijk et al. (2001) were able to identify five out of six target compounds based on the MS/MS spectra. At the higher concentrations, all of the target compounds were found, even without background subtraction. For two of the target compounds, isotope patterns were needed to confirm the targets. In addition to the validation with target compounds, the authors also attempted to identify some unknowns in the samples. For each unknown, several elemental compositions were listed as options, but these could not be narrowed down further. Based on MS/MS fragmentation, structures were suggested for three of the four unknowns, but these could not be verified. So, through the generally simple method of exact mass measurement, elemental composition calculation, and library searching, the authors were able to generate a list of possible candidate structures for unknowns in surface water samples. In a follow-up study carried out by Bobeldijk et al. (2002), analysis of unknowns was again carried out with LC-MS/MS. 
The selection of interesting unknowns was done from a group of sixty samples, without background subtraction. Peaks which appeared frequently and in high concentrations in the majority of samples were selected for further analysis. The software MassLynx, version 3.4 (Waters) and ACD (Advanced Chemical Development) (Micromass) were used for the data processing. With this method, two compounds were chosen for identification. Initial confirmation of one of the peaks as hexa(methoxymethyl)melamine (HMMM) was done with GC-MS and library matches. Further confirmation of the peak was then attempted with LC-MS/MS, using APCI. The fragmentation pattern of an HMMM standard in LC-MS/MS matched the fragmentation pattern of the frequently occurring peak, and the identification was therefore confirmed. Bobeldijk et al. (2002) also found some likely by-products of the parent compound HMMM, based on their presence in both the standard mixture and the water sample, although the relative abundances of the by-products were different. These initial studies showed the power of LC-HRMS to identify unknowns. Accurate mass was the first step to identifying compounds. But even additional data such as isotopic ratios were insufficient to resolve isobaric compounds. GC-MS was used at this stage because of the wealth of information already collected in GC-MS databases. These two studies showed that accurate mass measurements are especially helpful for determining the molecular formula. Further identification of the unknowns required MS/MS to gather information about the structures. But even though Bobeldijk et al.
(2002) attempted to use GC-MS databases for the confirmation of compounds, this proved to be inconclusive in most of the searches with the LC-MS/MS data. Since GC-MS databases are based on different collision energies and different ionization techniques than those used in LC-MS analysis, comparison of spectra from one technique to the other did not offer much additional information. Nurmi et al. (2012) showed that confirmation of structures could be done with HRMS alone when they used UPLC-TOF-MS for the confirmation of 88 target compounds in wastewater samples. But when they attempted non-target screening on these samples, even high-resolution mass data was not enough to identify more than one of the six compounds from their test set. They acknowledged that the use of tandem MS such as QTOF with CID fragmentation would provide the additional structural information necessary to discriminate isobaric compounds.
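
The elemental composition step described in this section can be illustrated with a brute-force search. The sketch below is not the MassLynx tool used by Bobeldijk et al. (2001); it is a minimal Python example, restricted to C, H, N, and O for brevity, that enumerates formulas whose monoisotopic mass falls within a ppm tolerance of a measured neutral mass. The element limits and the function name `candidate_formulas` are assumptions made for illustration, and no chemical plausibility rules are applied.

```python
# Monoisotopic masses of the most abundant isotopes (Da).
MONO = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}


def _fmt(counts):
    """Render e.g. [("C", 8), ("H", 10)] as 'C8H10', omitting zero counts."""
    return "".join(f"{el}{n if n > 1 else ''}" for el, n in counts if n > 0)


def candidate_formulas(neutral_mass, ppm=5.0, limits=None):
    """List (formula, mass, ppm_error) for CHNO formulas within the tolerance.

    `limits` holds hypothetical upper bounds on atom counts; adjust as needed.
    """
    limits = limits or {"C": 30, "H": 60, "N": 6, "O": 10}
    tol = neutral_mass * ppm * 1e-6
    hits = []
    for c in range(1, limits["C"] + 1):
        for n in range(limits["N"] + 1):
            for o in range(limits["O"] + 1):
                partial = c * MONO["C"] + n * MONO["N"] + o * MONO["O"]
                if partial > neutral_mass + tol:
                    break  # adding more oxygen only makes it heavier
                # The remaining mass must be made up by hydrogen atoms.
                h = round((neutral_mass - partial) / MONO["H"])
                if not 0 <= h <= limits["H"]:
                    continue
                mass = partial + h * MONO["H"]
                if abs(mass - neutral_mass) <= tol:
                    err = (mass - neutral_mass) / neutral_mass * 1e6
                    formula = _fmt([("C", c), ("H", h), ("N", n), ("O", o)])
                    hits.append((formula, mass, err))
    hits.sort(key=lambda t: abs(t[2]))
    return hits


if __name__ == "__main__":
    for formula, mass, err in candidate_formulas(194.0804, ppm=5.0):
        print(f"{formula:12s} {mass:.4f} Da  {err:+.2f} ppm")
```

Widening the tolerance or adding elements such as S, P, Cl, or Br increases the number of hits quickly, which is why the studies above combine accurate mass with isotope patterns and MS/MS data.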

ii. Fiehn-Kind Seven Golden Rules
In 2007, Kind and Fiehn published an article which outlined “seven golden rules” for structure elucidation. Although this work was done with metabolomics in mind, most of the rules are also useful for formula finding in emerging contaminant research. The seven rules outlined are the following:

1. Restriction on the number of elements
2. Adherence to the LEWIS and SENIOR chemical rules
3. Isotopic patterns
4. Ratio of hydrogens/carbons
5. Ratio of additional elements such as nitrogen, oxygen, phosphorus, and sulfur to carbon
6. Adherence to the element ratio probabilities
7. Inclusion of trimethylsilyl groups

With the exception of rule 7 (applicable only to derivatized compounds for GC-MS analysis), each of the golden rules can also be applied to structure elucidation of LC-MS data. The rules were developed by analysis of a large set of biologically relevant compounds. Kind and Fiehn (2007) determined that nearly all structures fell within these rules, and that the candidate lists proposed by software programs could therefore be narrowed simply by enforcing some basic chemical principles.
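
To show how such rules prune a candidate list in practice, the sketch below applies two simplified checks in Python: an element-ratio filter (rules 4 and 5) and a ring-and-double-bond-equivalent test as a rough stand-in for the valence rules (rule 2). The threshold values are illustrative defaults assumed for this example and should be checked against Kind and Fiehn (2007) before any real use; the function names are likewise hypothetical.

```python
# Illustrative upper limits for element-to-carbon ratios (assumed values).
MAX_RATIO_TO_C = {"H": 3.1, "N": 1.3, "O": 1.2, "P": 0.3, "S": 0.8}
MIN_H_TO_C = 0.2  # assumed lower limit for the H/C ratio


def rdbe(counts):
    """Rings plus double bonds for a neutral CHNO(+halogen) formula.

    Simplified: halogens are counted like hydrogen, N as trivalent;
    S and O do not contribute.
    """
    c = counts.get("C", 0)
    h = counts.get("H", 0) + counts.get("Cl", 0) + counts.get("Br", 0)
    n = counts.get("N", 0)
    return c - h / 2 + n / 2 + 1


def passes_rules(counts):
    """Return True if the formula survives the simplified checks."""
    c = counts.get("C", 0)
    if c == 0:
        return False
    r = rdbe(counts)
    if r < 0 or r != int(r):
        return False  # impossible or odd-electron neutral molecule
    if counts.get("H", 0) / c < MIN_H_TO_C:
        return False
    for element, limit in MAX_RATIO_TO_C.items():
        if counts.get(element, 0) / c > limit:
            return False
    return True


if __name__ == "__main__":
    candidates = {
        "C8H10N4O2": {"C": 8, "H": 10, "N": 4, "O": 2},
        "C2H10O": {"C": 2, "H": 10, "O": 1},
        "C8H11N4O2": {"C": 8, "H": 11, "N": 4, "O": 2},
    }
    for name, counts in candidates.items():
        print(name, "kept" if passes_rules(counts) else "rejected")
```

Filters of this kind are cheap to run, so they are typically applied directly after formula generation and before any database search.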

iii. Database Searches
As a next step, groups began to develop databases for LC-MS and LC-MS/MS. Some groups had success with database searching for sample identification. Gómez et al. (2010) proposed a method of automated, non-target screening through database searching. Tandem MS/MS was only initiated when the precursor ion could not be identified through accurate mass, isotope signals, or fragmentation patterns. Their method consisted of the following two steps: accurate mass measurement by LC-QTOF-MS and database searching. The database used was an accurate mass and retention time library set up by the laboratory which contained approximately 400 pharmaceuticals and pesticides. While this method did result in the tentative identification of 26 compounds, it negates
the idea of non-target screening by limiting the identified compounds to those entered into the database. Another study to use a similar method was Gómez-Ramos et al. (2011). Here the authors compiled an internal library of 147 compounds and predicted TPs and used it to evaluate wastewater samples. Measurement was done with LC-QTOF-MS/MS. The next steps of the method followed the process outlined in Gómez et al. (2010). Compounds were identified through matches with the database and confirmation of those matches was done with MS/MS fragmentation patterns. Again, the prior determination of compounds that may be present limits this approach for non-target analysis. Although databases alone are an incomplete way to carry out a non-target investigation, they can be a useful first step in evaluating a chromatogram. In Hogenboom et al. (2009), non-target analysis of wastewater samples was done with an LTR-Orbitrap instrument. In their screening method, accurate masses of the detected peaks were compared to an in-house library composed of 3,000 pollutants. If no identification could be made based on this library, the elemental composition of the accurate masses was predicted with XCalibur version 2.0 software (Thermo Scientific), based on isotope patterns and mass defects. Using the NIST and ChemSpider (www.chemspider.com) databases, a possible structure was then assigned to the chemical formula, which was further confirmed with the MSn patterns. Ibáñez et al. (2008) used a combination of target, post-target, and non-target methods to search for organic pollutants in surface water and wastewater samples from Spain. For the non-target screening they selected peaks which were found frequently (in multiple samples) at relatively high intensities for further identification. These peaks were first compared to in-house libraries which included both experimental and predicted spectra.
If no match was found with these databases, structure elucidation of the unknown compound was carried out with ChromaLynx software (MassLynx version 4.1, Waters). Structure elucidation steps included deconvolution, consideration of the accurate mass and isotope pattern (measured by QTOF-MS; R=100,000 at FWHM), and searches in the NIST library and the Merck Index for possible structures. Tandem MS was not carried out, but the authors acknowledged that this would be a next step for further confirmation of identification results. The authors were able to detect several fungicides, herbicides, and pharmaceuticals (including antibiotics), as well as caffeine and a cocaine metabolite. But while databases continue to be used extensively in GC-MS, this is not the case for LC-MS analysis. There are a couple of reasons for this. Firstly, ionization in GC-MS is generally accomplished through electron ionization (EI), which generates reproducible fragmentation patterns at the standard ionization energy of 70 eV. But for LC-MS, ESI, APCI, and APPI are “softer” ionization techniques, meaning that there is less fragmentation and therefore fewer peaks for comparison. Additionally, there is no standard ionization energy for analysis with ESI since the optimum energy varies for different compounds, so spectra are not reproducible. Similarly, the CID fragmentation used in tandem MS investigations also varies from one instrument to the other, making it difficult to establish large databases useful across many laboratories.
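
The accurate mass and retention time library approach described for Gómez et al. (2010) amounts to a tolerance lookup. The following Python sketch shows the idea under stated assumptions: the three library entries, their retention times, the tolerance values, and all names are hypothetical placeholders, and only protonated ions ([M+H]+) are considered. It is not the authors' software.

```python
from dataclasses import dataclass

# Monoisotopic masses (Da) and the proton mass used to form [M+H]+.
MONO = {"C": 12.0, "H": 1.00782503, "N": 14.0030740,
        "O": 15.9949146, "Cl": 34.96885268}
PROTON = 1.00727646


@dataclass
class LibraryEntry:
    name: str
    formula: dict   # element -> atom count
    rt_min: float   # retention time in minutes (hypothetical values below)


def mz_protonated(formula):
    """Theoretical m/z of the singly protonated molecule."""
    return sum(MONO[el] * n for el, n in formula.items()) + PROTON


def match_feature(mz, rt, library, ppm=5.0, rt_window=0.3):
    """Return (name, ppm_error) for entries matching both m/z and RT."""
    hits = []
    for entry in library:
        theo = mz_protonated(entry.formula)
        err = (mz - theo) / theo * 1e6
        if abs(err) <= ppm and abs(rt - entry.rt_min) <= rt_window:
            hits.append((entry.name, round(err, 2)))
    return hits


# Hypothetical three-compound library; retention times are made up.
LIBRARY = [
    LibraryEntry("caffeine", {"C": 8, "H": 10, "N": 4, "O": 2}, 3.2),
    LibraryEntry("carbamazepine", {"C": 15, "H": 12, "N": 2, "O": 1}, 8.9),
    LibraryEntry("atrazine", {"C": 8, "H": 14, "Cl": 1, "N": 5}, 9.6),
]

if __name__ == "__main__":
    print(match_feature(195.0877, 3.25, LIBRARY))
```

The sketch also makes the limitation discussed above concrete: a feature that is absent from `LIBRARY` returns an empty list, however well it was measured.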

iv. Isotope Patterns

Since database searches did not appear to be the answer for identification of LC-MS samples, additional steps were considered to narrow down the list of possible candidates. One of these additional steps is the use of isotope patterns for identification. Bobeldijk et al. (2001) briefly discussed this option, specifically using isotope patterns to determine if Cl and/or Br should be included in the molecular formula calculation. Grange et al. (2006) and Petrovic and Barceló (2007) also had moderate success using isotope ratios to narrow the list of possible candidate structures. Kind and Fiehn (2006) showed mathematically that HRMS (<1 ppm accuracy) still required additional information for proper structure elucidation. For example, without isotope abundance included in formula generation, at a mass accuracy of 5 ppm, there were 973 possible formulas for a compound with molecular mass 800 Da. But for the same mass and mass accuracy, applying a 5% isotope abundance accuracy filter lowered the number of possible formulas to 111. These theoretical calculations were verified with experimental work done by Erve et al. (2009), using Xcalibur software, version 2.0.7 (Thermo Scientific) for data processing. Using an LTQ-Orbitrap with a 2 ppm mass tolerance range, the correct formula appeared in the top 6% of results for a compound of mass 638 Da and in the top 0.5% for compounds with masses 1202 Da and 1664 Da. Studies which failed to include isotope information during the identification steps, such as Kaufmann et al. (2011) and Perez-Parada et al. (2012), were less successful in eliminating candidate structures.

A related technique is the use of adduct patterns for identification. For example, the presence of sodium adducts was used by Gómez et al. (2008) for confirmation of the molecular ion peak. But this technique has not been used much yet, as evidenced by the lack of literature on this subject.
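
The two filters discussed in this section, a mass tolerance in ppm and an isotope-based check for Cl or Br, can be illustrated with a minimal Python sketch. This is not the procedure of any of the cited studies. The function names, the 20% relative tolerance, and the restriction to a single halogen atom are assumptions made for illustration, and contributions of other elements to the M+2 peak are ignored. The isotope abundances are standard natural-abundance values.

```python
# Natural M+2/M intensity ratios for one halogen atom (standard isotope abundances).
CL_RATIO = 0.2422 / 0.7578  # 37Cl / 35Cl, about 0.32
BR_RATIO = 0.4931 / 0.5069  # 81Br / 79Br, about 0.97


def ppm_error(measured_mz, theoretical_mz):
    """Signed mass error in parts per million."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6


def within_tolerance(measured_mz, theoretical_mz, tol_ppm=5.0):
    """True if the measured mass lies within +/- tol_ppm of the theoretical mass."""
    return abs(ppm_error(measured_mz, theoretical_mz)) <= tol_ppm


def halogen_hint(m_intensity, m2_intensity, rel_tol=0.2):
    """Suggest 'Cl' or 'Br' when the M+2/M ratio matches one halogen atom.

    Hypothetical helper: rel_tol is an assumed relative tolerance on the ratio.
    Returns None when neither ratio fits.
    """
    ratio = m2_intensity / m_intensity
    if abs(ratio - BR_RATIO) <= rel_tol * BR_RATIO:
        return "Br"
    if abs(ratio - CL_RATIO) <= rel_tol * CL_RATIO:
        return "Cl"
    return None
```

A hint of this kind would only decide whether Cl or Br is included in the element set for formula generation; it does not replace a full isotope pattern fit.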

v. Structure-property Relationships

Another way to restrict the candidate structure list generated from a molecular formula is to incorporate structure-property relationships into the screening. For example, consideration of the logarithm of the octanol-water partitioning coefficient (log Kow) was used to eliminate some structures (Hogenboom et al. 2009; Kern et al. 2009; Nurmi et al. 2012). The log Kow describes the affinity of a compound for a polar or nonpolar phase and is therefore related to the retention time of the compound during LC analysis. The retention time can be obtained experimentally from analysis of a standard and log Kow can be estimated using a standard calibration curve of retention time vs. log Kow. If this is not possible, log Kow can be predicted from software such as KowWin found in EPISuite (U.S. EPA 2000). But each of these studies reported some problems with predicted log Kow values, such as a large uncertainty or incorrect predictions, leading to false negatives and false positives in the identification.

Another structure-property relationship that has been investigated is the Chromatographic Hydrophobicity Index (CHI). Instead of predicting retention time from log Kow, Ulrich et al. (2011) used CHI to estimate a retention time range. Calculations of CHI were done using the Linear Solvation Energy Relationship (LSER) developed by Abraham et al. (2004) and Vitha and Carr (2006). Of the 19 test compounds used by Ulrich et al. (2011), between 4 and 15 isomers could be excluded using CHI prediction.
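
The calibration of retention time against log Kow described in this section can be sketched as a simple linear fit followed by a retention time window filter. The sketch below assumes a linear relationship and a fixed window width; both are illustrative assumptions, and the function names are hypothetical rather than taken from the cited studies.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def plausible_candidates(candidates, observed_rt, slope, intercept, rt_window):
    """Keep candidates whose predicted retention time is near the observed one.

    candidates: list of (name, predicted_log_kow) pairs.
    rt_window: assumed half-width of the accepted retention time range (min).
    """
    kept = []
    for name, log_kow in candidates:
        predicted_rt = slope * log_kow + intercept
        if abs(predicted_rt - observed_rt) <= rt_window:
            kept.append(name)
    return kept
```

As the cited studies note, the uncertainty of predicted log Kow values is large, so in practice the window would need to be wide enough to avoid false negatives.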

vi. Computer-assisted Analysis

Another complication in emerging contaminant research is the often complex sample matrix. In non-target analysis it is desirable to retain all compounds in the sample, since prior to analysis it is unknown which compounds are of interest. This can result in samples with hundreds or thousands of peaks, with no information to separate matrix peaks from all other peaks. Structure elucidation is not feasible for all of these peaks and is unnecessary for matrix peaks. Background subtraction using a blank sample is one solution to this issue, as was done by Bobeldijk et al. (2001) and Hogenboom et al. (2009). This step eliminates not only matrix peaks but also peaks from contamination introduced during the analysis. Unfortunately it is not always possible to obtain a reference sample.

Therefore, various studies have advocated for the inclusion of deconvolution software for better non-target analysis (Kind and Fiehn 2010; Krauss et al. 2010; Gómez-Ramos et al. 2011; Perez-Parada et al. 2012). Software that accompanies MS instruments often includes this option, such as the “Molecular Feature Extraction” (MFE) algorithm in Agilent MassHunter software or the “Find Molecular Feature” (FMF) option in Bruker Daltonics DataAnalysis software. This type of software scans the chromatogram and picks peaks that belong to a compound, thereby eliminating peaks attributed to noise. As outlined in Kind and Fiehn (2010), the component detection algorithm (CODA) from Windig (1996) was developed for the analysis of LC-ESI-MS data and has been integrated into various software packages.

One drawback to deconvolution is the loss of information during the analysis. Kaufmann et al. (2011) demonstrated that peak picking was a crucial step for the correct identification of compounds but also that a number of false positives were generated by the program as a result of ignored isotopic peaks. Nurmi et al. (2012) also performed an evaluation of a non-target screening method based on deconvoluted spectra from UPLC-TOF-MS measurements. The exclusion of isotope patterns from the spectra led to a higher percentage of false positives during a library search of predicted spectra. Further software developments that could identify isotope and adduct peaks and group these with the monoisotopic peak as one feature would be very valuable.

Krauss et al. (2010) summarized some of the shared steps used in non-target analysis and structure identification. They recommended a deconvolution step for peak picking and removal of background noise. For the peaks chosen for further analysis, molecular formulas were assigned based on accurate mass measurements and isotope patterns. These formulas were then associated with possible structures through searches of databases such as PubChem (http://pubchem.ncbi.nlm.nih.gov/) or ChemSpider. The list of candidate structures was further refined through structure-property relationships, such as log Kow, and fragmentation patterns, obtained through MS/MS. Final confirmation could only be carried out by comparison to a reference standard.
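
The grouping of isotope and adduct peaks with the monoisotopic peak, which this section identifies as a desirable software development, can be illustrated with a minimal sketch. It is not the algorithm of MFE, FMF, or any other vendor software. It assumes singly charged ions in positive mode, treats the lowest m/z peak of a co-eluting group as the [M+H]+ monoisotopic peak, and only recognizes 13C isotope peaks and sodium adducts. The tolerances and function name are assumptions.

```python
C13_SPACING = 1.003355   # mass difference between 13C and 12C (Da)
NA_MINUS_H = 21.981944   # m/z difference between [M+Na]+ and [M+H]+ (Da)


def group_features(peaks, mz_tol=0.005, rt_tol=0.1):
    """Group co-eluting isotope and sodium adduct peaks with a base peak.

    peaks: list of (mz, rt, intensity) tuples.
    Returns a list of dicts with the keys 'base' and 'related'.
    """
    ordered = sorted(peaks, key=lambda p: p[0])
    used = set()
    features = []
    for i, base in enumerate(ordered):
        if i in used:
            continue
        used.add(i)
        feature = {"base": base, "related": []}
        for j in range(i + 1, len(ordered)):
            if j in used:
                continue
            other = ordered[j]
            if abs(other[1] - base[1]) > rt_tol:
                continue  # peaks do not co-elute
            delta = other[0] - base[0]
            label = None
            for k in (1, 2):
                if abs(delta - k * C13_SPACING) <= mz_tol:
                    label = f"M+{k}"
            if abs(delta - NA_MINUS_H) <= mz_tol:
                label = "[M+Na]+"
            if label is not None:
                used.add(j)
                feature["related"].append((label, other))
        features.append(feature)
    return features
```

Because the M+2 spacing of 37Cl or 81Br differs from twice the 13C spacing, halogen isotope peaks would need an additional rule; they are deliberately left out of this sketch.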

V. Transformation Products

While the identification of emerging contaminants remains time-consuming and challenging, there lies an even greater challenge in finding metabolites and TPs. For many of the parent compounds, the chemical formulas and structures are at least known and spectra have been collected and catalogued for database searching. In the European Union member states, prior to introduction into the market, chemical manufacturers submit the names and chemical composition of new compounds into databases, along with physicochemical properties, predicted environmental and toxicological behavior, intended uses, and expected production volume, as per the REACH legislation (EC 1907/2006). But TPs are in general completely unknown. Even if a TP were detected in a sample, identification of this substance offers a completely different challenge because searches for the molecular formula would be unsuccessful. But there are a number of potential strategies for overcoming this gap in knowledge. The following sections outline proposed strategies for predicting and confirming metabolites and TPs in the environment.

One of the first reports of the identification of TPs in water samples was from Steen et al. (2001). Previous research had reported the detection of TPs, but under laboratory conditions and at concentrations higher than those predicted to be in the environment. In this study, using a QqQ instrument with ESI and APCI, standards of the pesticides of interest were first scanned in MS/MS mode and characteristic product ions and neutral losses were catalogued. Then surface water samples were measured in SRM mode, selecting for those neutral losses. This method allowed Steen et al. (2001) to identify TPs of atrazine, even in a complex sample matrix. Additionally, some of the sensitivity issues of the QqQ instrument were overcome by using the SRM mode for the analysis of the samples, lowering the LODs to environmentally-relevant levels.

Bobeldijk et al. (2002) also carried out identification of chemical by-products. In this study, they analyzed a standard of a compound (HMMM) that they had detected in surface water samples. In the standard chromatogram, peaks were measured which were assumed to be by-products or breakdown products of HMMM since they did not appear in the blank sample. A qualitative MRM procedure was then used to screen for these potential degradation products in the surface water samples and two by-products of HMMM were detected.

Identification of TPs can also occur with elemental assignment based on HRMS data (Kosjek et al. 2007; Martinez Bueno et al. 2007; Hogenboom et al. 2009; Perez-Parada et al. 2011). For example, in Martinez Bueno et al. (2007), accurate mass measurements and an isotope signal (indicative of one sulfur atom) were used to assign molecular formulas using Applied Biosystem/MDS-Sciex Analyst QS Software (Agilent Technologies). They observed a series of related peaks, with similar chemical formulas and a repeated neutral loss of 14 m/z. The series was identified as homologues C7–C11 of sulfophenyl carboxylates (SPCs), known TPs of linear alkylbenzenesulfonates. Based on that information, the corresponding oxidized SPCs C8–C11 could also be identified in the chromatogram. Using a constant neutral loss to classify peaks into groups of similar compounds thus aided the identification of multiple TPs simultaneously. But these types of investigations are generally time-consuming and still not suitable for the large range of possible TPs present in the environment (Gómez-Ramos et al. 2011).
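
The grouping of peaks by a repeated mass difference, as in the SPC homologue example above, can be sketched as a search for chains of masses separated by a constant increment. The sketch assumes that the repeating unit is CH2 (monoisotopic mass 14.01565 Da), which corresponds to the nominal difference of 14 reported in the text. The tolerance, minimum chain length, and function name are illustrative assumptions.

```python
CH2 = 14.01565  # monoisotopic mass of one CH2 unit (Da)


def find_homologue_series(mz_values, unit=CH2, tol=0.003, min_length=3):
    """Return chains of masses separated by a constant repeating unit."""
    masses = sorted(mz_values)
    assigned = set()
    series_list = []
    for start in masses:
        if start in assigned:
            continue
        series = [start]
        current = start
        while True:
            # Look for the next member one repeating unit above the current mass.
            following = next(
                (m for m in masses if abs(m - (current + unit)) <= tol), None
            )
            if following is None:
                break
            series.append(following)
            current = following
        if len(series) >= min_length:
            series_list.append(series)
            assigned.update(series)
    return series_list
```

Once one member of such a chain has been identified, the remaining members can be treated as tentative homologues, which still require confirmation by fragmentation or reference standards.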

i. Laboratory Prediction

Another possibility for the confirmation of TPs is in-house synthesis of possible products. This option can be especially useful for photodegradation since this can be simulated in a laboratory (Ibáñez et al. 2006). Perez-Parada et al. (2011) used laboratory degradation studies to then conduct a targeted screening of amoxicillin and its TPs in wastewater. But laboratory simulations cannot be the only method for identifying potential TPs of concern, mainly because laboratory setups do not always represent environmental conditions (Steen et al. 2001; Kern et al. 2009).

ii. Degradation-Fragmentation Relationship

Another way to look for TPs is to assume similarity between the spectra of the parent compound and the degradation product. Gómez et al. (2010) and Gómez-Ramos et al. (2011) established, with several examples, that there was a link between the fragmentation pattern of a parent compound and the degradation of that compound. The hypothesis was that sites where fragmentation occurred were the same sites which would be vulnerable during degradation. This relationship was initially shown in Thurman et al. (2005) and García-Reyes et al. (2006) for TPs formed in food. This method was further established for aqueous samples by Galmier et al. (2005) and was similar to how Steen et al. (2001) selected precursor ions for SRM analysis. After MS/MS measurement, searches for the exact mass of some of the fragments from the parent compound led to identification of TPs. For example, a hydroxylated derivative of carbamazepine (a pharmaceutical) was found in wastewater samples simply through an exact mass match with the database but a different retention time (Gómez et al. 2010). Formula generation of the unknown peak was carried out with MassHunter software, version B.02.00 (Agilent) and this metabolite was suggested, which was then confirmed with MS/MS. But this particular method was still limited by the use of an established library which included information about the fragmentation patterns of the target compounds and their retention time.

Although the idea of the fragmentation-degradation relationship was shown to have useful applications, as a non-target method it is still limiting. For instance, it assumes that the fragments from the parent must be found in the TPs. A degradation product that is altered and then fragments in a different pattern would not be observed. Similarly, if a parent compound is completely degraded, there would be no mass spectrum from which to determine the identifying mass fragments. Another problem is related to the instability of CID fragmentation in MS/MS. Fragment peaks often have low intensity and may be below the LOD. Finally, this method has limited applicability for lower molecular weight compounds that only fragment into common structures. As seen in Gómez-Ramos et al. (2011), a success rate of 10% in TP identification was achieved in six samples and a success rate of 20% was achieved in only one sample. These results show that in most samples analyzed, more than 90% of the possible TPs were not found.

Similar approaches were used by Hao et al. (2008) and Kaufmann et al. (2011). In both of these papers the authors chose to use specific fragments from the LC-MS analysis as indicative of a chemical class. Through this method, identification of the compounds was easier because the list of possible options was shortened to only include those that fell within the appropriate compound classification. In Hao et al. (2008), they observed a seven-fold decrease in database hits using common diagnostic ions, leading to a success rate of identification above 80%.

iii. Computational Prediction of Fragmentation

Instead of using experimental fragmentation data, various computer software programs now exist for prediction of fragmentation of molecules based on sets of rules. Innovative work was done by Hill et al. (2008) to investigate the effectiveness of using computationally-derived fragmentation patterns for the identification of compounds. For the initial analysis, five test compounds were chosen. They then pulled from PubChem all compounds matching ±10 ppm of the monoisotopic mass of a target compound. They restricted their list to chemicals with C, H, N, O, S, and P, creating a list of possible compounds. Using Mass Frontier software, version 4 (Thermo Scientific), proposed fragmentation spectra were generated for each of these compounds and for multiple fragmentation reactions. The experimentally-derived fragmentation patterns of the five test compounds were then compared to the large set of computational predictions. Using five fragmentation reactions, successful matches were made in most cases, with the correct compound appearing in the top 2% for each target compound.

The method was further validated with a larger test set of 102 compounds. The pooled set of possible candidates (using the same requirements as described before) from PubChem was 27,760 compounds. For each of these, five fragmentation reactions at five CID energies were calculated with Mass Frontier. For 87 of the test compounds, the correct chemical was listed in the top 20 solutions, with the average ranking being fourth. Another positive from this method was that in all but four of the cases, the correct molecular formula was ranked first.

One of the limitations of this approach in non-target analysis is that a decision must be made a priori about the optimum collision energy for CID. Different collision energies produce different fragmentation patterns, which makes comparison difficult. A decision must also be made beforehand about which fragmentation reactions to run. But generating many computed fragmentation patterns costs less than carrying out degradation experiments for all the same compounds. Krauss et al. (2010) suggested that databases could be created based on the computational calculations for comparison, which would at least provide the advantage of narrowing the field of possible candidate structures.

Nurmi et al. (2012) used predicted spectra for the non-target identification of emerging contaminants in wastewater. They selected six compounds as a test set and prediction of the theoretical spectra (including the base peak and isotope peaks) was carried out with ChromaLynx XS software from MassLynx, version 4.1 (Waters). Spectra of the deconvoluted spiked wastewater samples were compared to the library with the six predicted spectra. Only one of the six compounds was successfully identified. The authors attributed the poor identification to the loss of isotope information during the deconvolution step. As per user-defined settings, only one m/z value was retained during the deconvolution. The explanation for this setting was that the TOF-MS used for measurement was operated without CID and therefore without fragmentation, so the authors assumed that only one peak would be formed, corresponding to the monoisotopic mass. But when these deconvoluted spectra were then matched to the predicted library spectra (which included both the base peak and the corresponding isotope peaks), false positives resulted. They therefore advocated that future deconvolution software should group molecular ion peaks together with the isotope and adduct peaks to increase the strength of library matching.

In contrast, photodegradation products of triclosan were identified by Ferrer et al. (2004) with LC-TOF-MS without CID or MS/MS measurements. For their work, the software Data Explorer, version 4.0.0.1 (Applied Biosystems) processed the experimental data by calculating the predicted molecular formulas from the accurate mass and then comparing the experimental spectra to the theoretical spectra for the predicted formula. The match was then scored based on the fit of the isotope patterns. Four degradation products of triclosan were identified with this method and potential pathways for photodegradation were suggested.
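
The candidate selection and ranking used by Hill et al. (2008), as summarized earlier in this section, can be illustrated with a minimal sketch. It does not call PubChem or Mass Frontier. The database is a hypothetical in-memory list, the predicted fragments are assumed to have been computed beforehand, and the score is a simple count of matched fragments rather than the scoring used in the original study.

```python
def candidates_in_window(target_mass, database, tol_ppm=10.0):
    """Select entries whose monoisotopic mass lies within +/- tol_ppm of the target."""
    window = target_mass * tol_ppm / 1e6
    return [entry for entry in database if abs(entry["mass"] - target_mass) <= window]


def rank_by_fragment_match(observed_fragments, candidates, mz_tol=0.01):
    """Rank candidates by the number of observed fragments they explain.

    Each candidate is a dict with a 'predicted_fragments' list of m/z values
    (hypothetical precomputed predictions).
    """
    def score(candidate):
        return sum(
            any(abs(obs - pred) <= mz_tol for pred in candidate["predicted_fragments"])
            for obs in observed_fragments
        )

    return sorted(candidates, key=score, reverse=True)
```

With such a ranking, the rank of the correct structure in the sorted list corresponds to the kind of figure reported above, for example whether it falls within the top 20 solutions.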

iv. Computational Prediction of Transformation Products

Computational calculations have also been used to predict possible TPs. In contrast to the software discussed previously, which predicts the fragmentation pattern of a given structure, these programs predict the TPs of a given compound. Programs such as the University of Minnesota – Pathway Prediction Software (UM-PPS) (http://umbbd.ethz.ch/predict/) have been used successfully to establish knowledge about the possible TPs (Kern et al. 2009; Helbling et al. 2010a; Kern et al. 2010). The University of Minnesota BioCatalysis/Biodegradation Database (UM-BBD) (http://umbbd.ethz.ch/) has collected experimental information on microbial degradation pathways in an online database since 1995 (Gao et al. 2010). This system is coupled to the UM-PPS, which in turn predicts the degradation products of organic compounds based on the rules established within the UM-BBD. The content of the UM-BBD increased by approximately 30% between 2005 and 2010; the inclusion of additional pathways also increases the number of possible degradation products. This in turn may be a problem for emerging chemical research concerned with identifying TPs. Consequently, multi-step approaches have been considered for the refinement of the result dataset to eliminate false positives resulting from the UM-PPS. Additionally, developers of the UM-PPS system realized this concern and have begun to include varying levels of rules, including super rules and likelihood rankings (Gao et al. 2010).

The CATABOL system also simulates the transformation of chemicals, but with a focus on metabolism and not environmental degradation (Mekenyan et al. 2006). The QSARs developed for this system can therefore be used to predict mutagenicity, estrogenicity, and other toxicity endpoints. META is a computer program which predicts degradation of compounds in the environment by applying metabolic rules (Klopman and Tu 1997). The inclusion of a metabolic degradation pathway depends on the functional groups of the chemical, and probability calculations are done by an algorithm in the Multiple Computer Automated Structure Evaluation (MultiCASE) program. The authors did point out some of the limitations of this type of software, namely that degradation pathways can vary across different microbial communities and under different abiotic conditions (such as temperature, pH, etc.).

The application of these systems to emerging contaminant research was tested by Kern et al. (2009). They used UM-PPS to develop a new non-target screening method for a diverse set of TPs. This multi-step method was as follows. First a compiled database of 1,794 possible TPs was constructed from both UM-PPS predictions and experimentally-measured metabolites of chemicals. The exact mass for each of these predicted TPs was extracted from the LC-MS/MS chromatogram and the intensity and retention times recorded. These results were filtered by comparison to the blank sample and any overlap was removed. Additionally an intensity threshold of 10^5 was arbitrarily selected as a cutoff. Log Kow was then used to constrain retention times for the predicted TPs. The next steps for selecting matches were isotope fit and ionization efficiency. A further step of the identification was a second analysis with MS/MS, at a higher collision energy. This analysis prompted greater fragmentation and therefore additional structural information. Kern et al. (2009) surmised that the fragments generated by the TPs should have some logical structural similarity to the parent compound, similar to the fragmentation-degradation relationship advocated by Gómez et al. (2010). The last step was a comparison of the observed fragments to a prediction by Mass Frontier software (Thermo Scientific).
After all of these filtering steps, the authors still advocated for positive identification through confirmation with reference standards or an independent technique such as nuclear magnetic resonance (NMR). This thorough process of non-target screening resulted in identification of 19 TPs in samples collected from WWTPs in Switzerland; some were degradation products observed in laboratory studies that had not been detected in environmental water samples. The screening procedure did show some false negatives due to incorrect log Kow values and peaks that were below the chosen abundance value.

Although not a true non-target method, because the parent compounds of interest are selected beforehand, using computational predictions can help reduce the number of possible candidate structures. But it is important to keep in mind for what purpose the database has been developed. For example, systems such as the UM-PPS are based on enzyme-catalyzed and microbially-mediated degradation and these predictions may therefore not be relevant to predicting TPs from WWTP processes (Helbling et al. 2010b).

Müller et al. (2011) introduced a new database for compound identification, the Database-Assisted Identification of Organic Substances (DAIOS) (http://www.daios-online.de/daios/). While most databases rely on comparing both the mass and abundance of fragments, DAIOS uses spectral comparison only to match masses. The rest of the identification is done with metadata, including linkages to UM-PPS and Metabolite ID (Agilent Technologies) for the prediction of TPs, information about environmental fate, data about usage of the chemicals, and literature citations. For this reason the authors postulate that DAIOS could develop into an instrument-independent database platform.
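
The first filtering steps of the multi-step suspect screening of Kern et al. (2009) described above (blank comparison, intensity cutoff, and a retention time constraint) can be sketched as a short filter cascade. This is an illustration under stated assumptions and not the authors' implementation. The data layout, the m/z tolerance, and the retention time windows are hypothetical, and the cutoff of 1e5 follows the threshold quoted in the text. The later steps (isotope fit, ionization efficiency, and MS/MS comparison) are omitted.

```python
def screen_suspects(peaks, blank_mzs, intensity_cutoff=1e5, mz_tol=0.005):
    """Apply blank subtraction, an intensity cutoff, and a retention time window.

    peaks: list of dicts with keys 'name', 'mz', 'rt', 'intensity', and
    'rt_window' (a (low, high) tuple, assumed to be derived from log Kow).
    blank_mzs: m/z values detected in the blank sample.
    Returns the names of the suspects that pass all three filters.
    """
    kept = []
    for peak in peaks:
        if any(abs(peak["mz"] - blank) <= mz_tol for blank in blank_mzs):
            continue  # also present in the blank
        if peak["intensity"] < intensity_cutoff:
            continue  # below the abundance threshold
        low, high = peak["rt_window"]
        if not low <= peak["rt"] <= high:
            continue  # outside the plausible retention time range
        kept.append(peak["name"])
    return kept
```

Each filter can produce false negatives, which matches the observation above that some TPs were missed because of incorrect log Kow values or intensities below the chosen cutoff.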

VI. Non-target Screening with Statistics

Although the screening methods discussed previously can be effective for the identification of unknowns, researchers have also attempted to apply statistics to gather further information from the large datasets generated with full-scan LC-HRMS. The use of statistics does require more than one sample because unknowns of interest are chosen through their presence or absence in a sample as compared to other samples. For example, influent and effluent samples of a WWTP can be collected and compared, and new peaks that appear in the effluent sample are deemed interesting due to their formation during the wastewater treatment processes. Helbling et al. (2010a) used an m/z vs. retention time matrix to compare pre- and post-treatment samples to select the peaks of interest and remove background noise.

Müller et al. (2011) chose to apply a set of mathematical operators to their dataset. All MS peaks above a certain threshold were considered equally in the mathematical comparisons. First they used MFE (MassHunter software version B.03.01, Agilent Technologies) for deconvolution. For all samples, an m/z vs. retention time plot was generated. In this way, features that belonged together such as adducts and isotope patterns were clustered together. Interactions between the samples were then calculated, including the UNION (A ∪ B), INTERSECTION (A ∩ B), and COMPLEMENT (A \ B). This method is essentially a mathematical version of Venn diagrams, applied to all features in all samples.

This process was tested with samples from groundwater and landfill leachate. The study asked whether compounds from the landfill could leach through groundwater to the influent of treatment plants and, if so, whether any of those compounds were subsequently found in the treated drinking water. For the first question, samples from the leachate were compared to those from the groundwater, after first subtracting peaks found in reference samples.
For the second question, samples from the leachate were compared to those from both the groundwater and effluent of the WWTP, after subtracting peaks in the reference sample. From this analysis, it was determined that ten features were relevant for further investigation in the water treatment samples and three features warranted further investigation in the drinking water samples. Identification of unknowns then proceeded with determination of the molecular formula, comparison to MS/MS fragmentation, and hydrogen/deuterium (H/D) exchange experiments. Final confirmation was done by comparison with reference standards. Overall, nine chemicals were successfully identified from the landfill leachate sample with this process.

TPs were also considered by Müller et al. (2011). For chemicals that were identified in the samples, TPs were predicted by the UM-PPS and then extracted ion chromatograms were generated for the predicted exact masses. One transformation product (1-adamantylamine) was tentatively identified but could not be confirmed. This type of post-analysis targeted searching demonstrates the advantage of measuring samples in full-scan mode. Searches with DAIOS also confirmed the presence of some possible TPs and three of these were successfully identified with references.

This method from Müller et al. (2011) provided a number of novel additions to the analysis of non-target screening. Firstly, the use of mathematical operators to isolate interesting features for further study appeared to be quite successful. This step addresses one of the more labor-intensive problems of non-target analysis: how to locate compounds that are relevant for further study and structure elucidation. The process of clustering mass and retention time made all the information collected useful, including adducts and isotopes. But one struggle in identification is highlighted by this study. Even with exact mass measurements and molecular formula generation, narrowing down the list of possible structures can be a daunting task. Here they used an additional analysis (H/D exchange experiments) to refine the possibilities before confirmation with references.

Identification of TPs was limited with this method. Unknown TPs could only be found after parent compounds were identified and the exact mass of the TP was predicted with UM-PPS. This method excludes TPs whose parent compounds have degraded to levels below the LOD. Perhaps these would still be captured in the first steps of the method but that is unclear.

Müller et al. (2011) also furthered the state of the field through their novel approach to database searches with the DAIOS library. By incorporating more than just the spectral information, it is possible that this library will be a useful tool for non-target analysis. But as with any library, its usefulness will be determined by the breadth of chemicals contained in the database and therefore the range of applicability.
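
The set operators applied by Müller et al. (2011) to features from different samples can be sketched with tolerance-based matching on m/z and retention time. The sketch below is an illustration and not the authors' implementation. Features are represented as (m/z, retention time) pairs, and the tolerances and function names are assumptions.

```python
def matches(f, g, mz_tol=0.005, rt_tol=0.2):
    """True if two (mz, rt) features agree within the given tolerances."""
    return abs(f[0] - g[0]) <= mz_tol and abs(f[1] - g[1]) <= rt_tol


def intersection(a, b):
    """Features of sample a that are also found in sample b."""
    return [f for f in a if any(matches(f, g) for g in b)]


def complement(a, b):
    """Features of sample a that are not found in sample b."""
    return [f for f in a if not any(matches(f, g) for g in b)]


def union(a, b):
    """All features of a, plus the features of b without a match in a."""
    return list(a) + complement(b, a)
```

For the first study question above, a call such as `intersection(complement(leachate, reference), groundwater)` would return the leachate features, after reference subtraction, that also appear in the groundwater.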

VII. Conclusions

While LC-HRMS and LC-MSn methods have been increasingly applied to the problem of measuring and identifying emerging chemicals, there are still some issues to overcome, specifically with respect to structure elucidation. Although targeted and suspect screenings are certainly worthwhile for environmental monitoring, these methods are restrictive when it comes to non-target monitoring since they require prior knowledge. It has been shown that measurement with powerful LC-MSn systems capable of high-resolution mass measurements is the first step in proper identification (Bobeldijk et al. 2001; Bobeldijk et al. 2002). Exact mass measurements can be used to calculate the molecular formula of the unknown with software packages such as MassLynx, Xcalibur, or Applied Biosystem/MDS-Sciex Analyst QS. Using the Seven Golden Rules established by Kind and Fiehn (2007) ensures that unlikely formulas are excluded from the list of candidates.

After generation of the molecular formula, the list of possible candidate structures associated with that formula may still be quite extensive. It was shown that including physicochemical properties such as log Kow is useful to narrow down the list (Nurmi et al. 2012). Incorporating isotope patterns and adducts into the interpretation of the MSn spectra also generally lowered the number of possible hits for a given mass (Kind and Fiehn 2006; Erve et al. 2009). The clustering of families of compounds in a single sample may also help remove some steps in identification. As seen in Martinez Bueno et al. (2007), after one SPC was identified in the sample, locating homologues by recognizing the constant loss of an alkane group was simpler than identification of each peak individually.

One of the strategies for non-target analysis includes carrying out targeted analysis first (Gómez et al. 2010; Krauss et al. 2010; Perez-Parada et al. 2012). Using the information available in targeted analysis, some of the peaks can be eliminated from consideration because they are easily identified in this initial screening. Then non-target screening proceeds with the peaks in the chromatogram that are deemed interesting but have not yet been successfully identified by the target screening method. Also, knowledge about the compounds present in the sample may be used to search for TPs by employing the fragmentation-degradation relationship (Gómez et al. 2010). Screening for the same mass fragments at different retention times can also indicate related compounds.

In general the lack of standard libraries for the comparison of LC-MS/MS spectra is a problem, and a reason why non-target analysis is still time-consuming. The lack of a common collision energy, comparable to the standard ionization energy in GC-MS, is even more detrimental. In true non-target analysis one would not expect compounds to be searchable in a database and structure elucidation would be required nonetheless. But without a common collision energy, confirmation of results, especially between laboratories, is unattainable. Moreover, the irreproducibility of CID fragmentation means it is unlikely that universal databases of LC-MS spectra will be established.

Databases with computationally-derived fragmentation patterns could be part of the solution (Krauss et al. 2010). Software programs such as Mass Frontier and MetFrag predict the possible fragmentation from a given compound under different collision energies and for multiple reactions.
Such predicted spectra have been successfully used to identify emerging chemicals and TPs (Ferrer et al. 2004; Hill et al. 2008). Programs such as the UM-PPS can predict possible TPs from a given parent compound, which allows more targeted screening for these potential degradation products (Kern et al. 2009). Finally, programs such as DAIOS, which incorporate data from many sources and consider not only the spectra but also predicted TPs and environmental fate, could be a powerful tool for compound identification (Müller et al. 2011).

Besides the studies by Helbling et al. (2010a) and Müller et al. (2011), little work has been done to explore the potential of statistical methods for emerging chemical identification. Specifically, a reliable method for the selection of compounds of interest has not yet been developed. As shown in Helbling et al. (2010a) and Müller et al. (2011), a set of samples that captures before and after conditions is very effective. This approach focuses on the compounds that are newly formed in the environment through some degradation process and is therefore a good indication of compounds to which humans or aquatic organisms may be exposed. Paired samples should be possible in many scenarios, since influent/effluent or upstream/downstream samples can be collected.

It is possible that multivariate analysis (MVA) could also be applied. Samples are read into MVA models and then scaled, for example through unit variance (UV) or Pareto scaling methods. This scaling may eliminate some of the losses from setting an abundance limit during data processing. Any features that are characteristic of a particular group could quickly be identified with loading plots. However, this method would also require groups of samples from different locations for comparison. Additionally, deconvolution software that removes instrument noise while retaining most of the sample features is critical.

Most of the research discussed here focuses on exposure-driven approaches to emerging chemicals, i.e., identification of compounds that are present in surface waters or wastewaters without any knowledge about the potential risk of the compound. Another aspect of this field is concerned instead with effect-driven analysis, or effect-directed analysis (EDA). In this research, samples are fractionated during chromatography to reduce their complexity, and toxicity testing is then carried out on these fractions with test organisms (Brack 2003). Unknowns in the toxic LC fractions are then further identified. The argument for this approach is that structure elucidation is only carried out on compounds that are present and causing an effect. The drawback is that with fractionation, additive or synergistic effects of chemicals in a mixture may be missed. Additionally, sensitivity to compounds varies between organisms, so a compound that is nontoxic to the test organism may still cause adverse effects in other organisms.

Other methods may also gain importance in the future of emerging chemical research. Since confirmation of a compound is nearly impossible without a reference standard, analyzing samples with an orthogonal method such as LC-NMR may be necessary for identification. Currently the sensitivity of LC-NMR is an issue, since its LODs are above the levels of chemicals normally found in the environment. If the sensitivity can be improved, however, LC-NMR may become an important technique in environmental analysis for the determination of unknowns.
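The unit variance and Pareto scaling mentioned above in connection with multivariate analysis can be sketched in a few lines of Python. Both methods mean-center a feature across samples; UV scaling then divides by the standard deviation, while Pareto scaling divides by its square root. The function name and the peak intensities below are hypothetical and serve only to show why scaling lets low-abundance features contribute to a model instead of being lost to an abundance threshold.

```python
from statistics import mean, stdev


def scale_feature(intensities, method="uv"):
    """Mean-center one feature across samples and divide by a scaling factor.

    method="uv"     -> divide by the standard deviation (unit variance)
    method="pareto" -> divide by the square root of the standard deviation
    """
    mu = mean(intensities)
    sd = stdev(intensities)  # sample standard deviation
    if sd == 0:
        # A constant feature carries no variation; return zeros.
        return [0.0 for _ in intensities]
    if method == "uv":
        factor = sd
    elif method == "pareto":
        factor = sd ** 0.5
    else:
        raise ValueError(f"unknown scaling method: {method}")
    return [(x - mu) / factor for x in intensities]


# Hypothetical peak intensities of two features in four samples.
abundant = [1.0e6, 1.2e6, 0.8e6, 1.0e6]
trace = [100.0, 120.0, 80.0, 100.0]

uv_abundant = scale_feature(abundant, "uv")
uv_trace = scale_feature(trace, "uv")
pareto_abundant = scale_feature(abundant, "pareto")
pareto_trace = scale_feature(trace, "pareto")
```

With UV scaling the abundant and the trace feature end up with identical scaled values, because they share the same relative pattern. Pareto scaling is an intermediate choice: it reduces the dominance of the abundant feature but still leaves it with larger values than the trace feature.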

VIII. References

Abraham, M. H., A. Ibrahim, et al. (2004). "Determination of sets of solute descriptors from chromatographic measurements." Journal of Chromatography A 1037(1–2): 29-47.

Brack, W. (2003). "Effect-directed analysis: a promising tool for the identification of organic toxicants in complex mixtures?" Analytical and Bioanalytical Chemistry 377(3): 397-407.

Bobeldijk, I., J. P. C. Vissers, et al. (2001). "Screening and identification of unknown contaminants in water with liquid chromatography and quadrupole-orthogonal acceleration-time-of-flight tandem mass spectrometry." Journal of Chromatography A 929(1–2): 63-74.

Bobeldijk, I., P. G. M. Stoks, et al. (2002). "Surface and wastewater quality monitoring: combination of liquid chromatography with (geno)toxicity detection, diode array detection and tandem mass spectrometry for identification of pollutants." Journal of Chromatography A 970(1–2): 167-181.

Erve, J., M. Gu, et al. (2009). "Spectral accuracy of molecular ions in an LTQ/Orbitrap mass spectrometer and implications for elemental composition determination." Journal of The American Society for Mass Spectrometry 20(11): 2058-2069.

Escher, B. I. and K. Fenner (2011). "Recent Advances in Environmental Risk Assessment of Transformation Products." Environmental Science & Technology 45(9): 3835-3847.

Farré, M. l., S. Pérez, et al. (2008). "Fate and toxicity of emerging pollutants, their metabolites and transformation products in the aquatic environment." TrAC Trends in Analytical Chemistry 27(11): 991-1007.

Ferrer, I., M. Mezcua, et al. (2004). "Liquid chromatography/time-of-flight mass spectrometric analyses for the elucidation of the photodegradation products of triclosan in wastewater samples." Rapid Communications in Mass Spectrometry 18(4): 443-450.

Fischer, K., E. Fries, et al. (2012). "New developments in the trace analysis of organic water pollutants." Applied Microbiology and Biotechnology 94(1): 11-28.

Galmier, M.-J., B. Bouchon, et al. (2005). "Identification of degradation products of diclofenac by electrospray ion trap mass spectrometry." Journal of Pharmaceutical and Biomedical Analysis 38(4): 790-796.

Gao, J., L. B. M. Ellis, et al. (2010). "The University of Minnesota Biocatalysis/Biodegradation Database: improving public access." Nucleic Acids Research 38(suppl 1): D488-D491.

García-Reyes, J. F., A. Molina-Díaz, et al. (2006). "Identification of Pesticide Transformation Products in Food by Liquid Chromatography/Time-of-Flight Mass Spectrometry [...]." Analytical Chemistry 79(1): 307-321.

Giger, W. (2009). "Hydrophilic and amphiphilic water pollutants: using advanced analytical methods for classic and emerging contaminants." Analytical and Bioanalytical Chemistry 393(1): 37-44.

Gómez, M. J., C. Sirtori, et al. (2008). "Photodegradation study of three dipyrone metabolites in various water systems: Identification and toxicity of their photodegradation products." Water Research 42(10–11): 2698-2706.

Gómez, M. J., M. M. Gómez-Ramos, et al. (2010). "Rapid automated screening, identification and quantification of organic micro-contaminants and their main transformation products in wastewater and river waters using liquid chromatography–quadrupole-time-of-flight mass spectrometry with an accurate-mass database." Journal of Chromatography A 1217(45): 7038-7054.

Gómez-Ramos, M. d. M., A. Pérez-Parada, et al. (2011). "Use of an accurate-mass database for the systematic identification of transformation products of organic contaminants in wastewater effluents." Journal of Chromatography A 1218(44): 8002-8012.

Grange, A. H., M. C. Zumwalt, et al. (2006). "Determination of ion and neutral loss compositions and deconvolution of product ion mass spectra using an orthogonal acceleration time-of-flight mass spectrometer and an ion correlation program." Rapid Communications in Mass Spectrometry 20(2): 89-102.

Hao, H., N. Cui, et al. (2008). "Global Detection and Identification of Nontarget Components from Herbal Preparations by Liquid Chromatography Hybrid Ion Trap Time-of-Flight Mass Spectrometry and a Strategy." Analytical Chemistry 80(21): 8187-8194.

Helbling, D. E., J. Hollender, et al. (2010a). "High-Throughput Identification of Microbial Transformation Products of Organic Micropollutants." Environmental Science & Technology 44(17): 6621-6627.

Helbling, D. E., J. Hollender, et al. (2010b). "Structure-Based Interpretation of Biotransformation Pathways of Amide-Containing Compounds in Sludge-Seeded Bioreactors." Environmental Science & Technology 44(17): 6628-6635.

Hernández, F., Ó. J. Pozo, et al. (2005). "Strategies for quantification and confirmation of multi-class polar pesticides and transformation products in water by LC–MS2 using triple quadrupole and hybrid quadrupole time-of-flight analyzers." TrAC Trends in Analytical Chemistry 24(7): 596-612.

Hernández, F., J. Sancho, et al. (2012). "Current use of high-resolution mass spectrometry in the environmental sciences." Analytical and Bioanalytical Chemistry 403(5): 1251-1264.

Hill, D. W., T. M. Kertesz, et al. (2008). "Mass Spectral Metabonomics beyond Elemental Formula: Chemical Database Querying by Matching Experimental with Computational Fragmentation Spectra." Analytical Chemistry 80(14): 5574-5582.

Hogenboom, A. C., J. A. van Leerdam, et al. (2009). "Accurate mass screening and identification of emerging contaminants in environmental samples by liquid chromatography–hybrid linear ion trap Orbitrap mass spectrometry." Journal of Chromatography A 1216(3): 510-519.

Hollender, J., H. Singer, et al. (2010). The Challenge of the Identification and Quantification of Transformation Products in the Aquatic Environment Using High Resolution Mass Spectrometry. Xenobiotics in the Urban Water Cycle. D. Fatta-Kassinos, K. Bester and K. Kümmerer, Springer Netherlands. 16: 195-211.

(2012). "Analysis, occurrence and fate of anthelmintics and their transformation products in the environment." TrAC Trends in Analytical Chemistry 31(0): 61-84.

Huset, C. A., A. C. Chiaia, et al. (2008). "Occurrence and Mass Flows of Fluorochemicals in the Glatt Valley Watershed, Switzerland." Environmental Science & Technology 42(17): 6369-6377.

Ibáñez, M., J. Sancho, et al. (2006). "Use of liquid chromatography quadrupole time-of-flight mass spectrometry in the elucidation of transformation products and metabolites of pesticides. Diazinon as a case study." Analytical and Bioanalytical Chemistry 384(2): 448-457.

Ibáñez, M., J. V. Sancho, et al. (2008). "Rapid non-target screening of organic pollutants in water by ultraperformance liquid chromatography coupled to time-of-flight mass spectrometry." TrAC Trends in Analytical Chemistry 27(5): 481-489.

Kaufmann, A., P. Butcher, et al. (2010). "Comprehensive comparison of liquid chromatography selectivity as provided by two types of liquid chromatography detectors (high resolution mass spectrometry and tandem mass spectrometry): “Where is the crossover point?”." Analytica Chimica Acta 673(1): 60-72.

Kaufmann, A., P. Butcher, et al. (2011). "Semi-targeted residue screening in complex matrices with liquid chromatography coupled to high resolution mass spectrometry: current possibilities and limitations." Analyst 136(9): 1898-1909.

Kern, S., K. Fenner, et al. (2009). "Identification of Transformation Products of Organic Contaminants in Natural Waters by Computer-Aided Prediction and High-Resolution Mass Spectrometry." Environmental Science & Technology 43(18): 7039-7046.

Kern, S., R. Baumgartner, et al. (2010). "A tiered procedure for assessing the formation of biotransformation products of pharmaceuticals and biocides during activated sludge treatment." Journal of Environmental Monitoring 12(11): 2100-2111.

Kind, T. and O. Fiehn (2006). "Metabolomic database annotations via query of elemental compositions: Mass accuracy is insufficient even at less than 1 ppm." BMC Bioinformatics 7(1): 234.

Kind, T. and O. Fiehn (2007). "Seven Golden Rules for heuristic filtering of molecular formulas obtained by accurate mass spectrometry." BMC Bioinformatics 8(1): 105.

Kind, T. and O. Fiehn (2010). "Advances in structure elucidation of small molecules using mass spectrometry." Bioanalytical Reviews 2(1): 23-60.

Klopman, G. and M. Tu (1997). "Structure–biodegradability study and computer-automated prediction of aerobic biodegradation of chemicals." Environmental Toxicology and Chemistry 16(9): 1829-1835.

Kosjek, T., E. Heath, et al. (2007). "Mass spectrometry for identifying pharmaceutical biotransformation products in the environment." TrAC Trends in Analytical Chemistry 26(11): 1076-1085.

Krauss, M., H. Singer, et al. (2010). "LC–high resolution MS in environmental analysis: from target screening to the identification of unknowns." Analytical and Bioanalytical Chemistry 397(3): 943-951.

Martínez Bueno, M. J., A. Agüera, et al. (2007). "Application of Liquid Chromatography/Quadrupole-Linear Ion Trap Mass Spectrometry and Time-of-Flight Mass Spectrometry to the Determination of Pharmaceuticals and Related Contaminants in Wastewater." Analytical Chemistry 79(24): 9372-9384.

Mekenyan, O., S. Dimitrov, et al. (2006). "Metabolic activation of chemicals: in-silico simulation." SAR & QSAR in Environmental Research 17(1): 107-120.

Müller, A., W. Schulz, et al. (2011). "A new approach to data evaluation in the non-target screening of organic trace substances in water analysis." Chemosphere 85(8): 1211-1219.

Nurmi, J., J. Pellinen, et al. (2012). "Critical evaluation of screening techniques for emerging environmental contaminants based on accurate mass measurements with time-of-flight mass spectrometry." Journal of Mass Spectrometry 47(3): 303-312.

Pérez-Parada, A., A. Agüera, et al. (2011). "Behavior of amoxicillin in wastewater and river water: identification of its main transformation products by liquid chromatography/electrospray quadrupole time-of-flight mass spectrometry." Rapid Communications in Mass Spectrometry 25(6): 731-742.

Pérez-Parada, A., M. d. M. Gómez-Ramos, et al. (2012). "Analytical improvements of hybrid LC-MS/MS techniques for the efficient evaluation of emerging contaminants in river waters: a case study of the Henares River (Madrid, Spain)." Environmental Science and Pollution Research 19(2): 467-481.

Petrovic, M. and D. Barceló (2007). "LC-MS for identifying photodegradation products of pharmaceuticals in the environment." TrAC - Trends in Analytical Chemistry 26(6): 486-493.

Richardson, S. D. and T. A. Ternes (2011). "Water Analysis: Emerging Contaminants and Current Issues." Analytical Chemistry 83(12): 4614-4648.

Steen, R. J. C. A., I. Bobeldijk, et al. (2001). "Screening for transformation products of pesticides using tandem mass spectrometric scan modes." Journal of Chromatography A 915(1–2): 129-137.

Thurman, E. M., I. Ferrer, et al. (2005). "Discovering metabolites of post-harvest fungicides in citrus with liquid chromatography/time-of-flight mass spectrometry and ion trap tandem mass spectrometry." Journal of Chromatography A 1082(1): 71-80.

Ulrich, N., G. Schüürmann, et al. (2011). "Linear Solvation Energy Relationships as classifiers in non-target analysis—A capillary liquid chromatography approach." Journal of Chromatography A 1218(45): 8192-8196.

U.S. Environmental Protection Agency (EPA). EPI Suite Kowwin, v1.67, 2000.

Vitha, M. and P. W. Carr (2006). "The chemical interpretation and practice of linear solvation energy relationships in chromatography." Journal of Chromatography A 1126(1–2): 143-194.

Weiss, S. and T. Reemtsma (2005). "Determination of Benzotriazole Corrosion Inhibitors from Aqueous Environmental Samples by Liquid Chromatography-Electrospray Ionization-Tandem Mass Spectrometry." Analytical Chemistry 77(22): 7415-7420.

Windig, W., J. M. Phalp, et al. (1996). "A Noise and Background Reduction Method for Component Detection in Liquid Chromatography/Mass Spectrometry." Analytical Chemistry 68(20): 3602-3606.

Zedda, M. and C. Zwiener (2012). "Is nontarget screening of emerging contaminants by LC-HRMS successful? A plea for compound libraries and computer tools." Analytical and Bioanalytical Chemistry 403(9): 2493-2502.