the development of an lc-faims-ms metabolomics workflow: a ... · the development of an...

1
www.owlstonemedical.com The development of an LC-FAIMS-MS metabolomics workflow: a new tool for untargeted metabolite profiling to diagnose disease Lauren Brown; Kayleigh Arthur; Alasdair Edge; Aditya Malkar; Paul Nasca At a S:N of 3, more features were detected using a 2 CF scans s -1 scan rate (Figure 9a). To determine if this result was accurate, or a result of increased noise using the faster scan rate, the analysis was repeated at a S:N of 10 (Figure 9b). 5. Feature extraction Features lists for each CF were aligned into one list based on retention time and m/z. Conditional formatting was applied to the features to deconvolute features in the FAIMS dimension. Features present in multiple CFs were identified as either adjacent or separated into multiple features in the CF dimension, such as in the case of isobars. An example of both is shown in Figure 8. 6. Conclusions References The optimized FAIMS scan settings were used to generate a three-dimensional nested data set (Figure 10). The method is now being applied to the urinary analysis of a 674 patient colorectal cancer cohort in collaboration with University Hospital of Coventry and Warwickshire. The LC-ultraFAIMS-MS features list will be used to build a predictive model, or classifier, from which a probability of disease can be predicted. The additional features available in LC-ultraFAIMS-MS, compared to LC-MS alone, increase the likelihood of successfully building such a classifier. 1. Arthur, KA, Turner, MA, Reynolds, JC, Creaser, CS.; Anal. Chem., 2017, 89, 3452–3459. Figure 7. Comparison of the number of features detected in experiments 3 a-f as described in Table 1 Figure 8. Example CF scans of detected features (a) single feature detected in adjacent CF ands (b) multiple features showing separation of isobars Figure 9. Comparison of features detected at 1 CF scan s -1 and 2 CF scans s -1 at (a) S:N of 3 and (b) S:N of 10 Figure 10. m/z vs t R of a urine sample showing features detected at a S:N of 3; each CF is color differentiated Upon visual verification of the determined features, the lower sensitivity observed at the 2 CF scan s -1 rate resulted in increased noise in the baseline, accounting for the increased number of unique features detected at the S:N 3. 1 CF scan s -1 rate was therefore chosen for subsequent analysis, as a compromise between sensitivity and the number of data points across the peak for untargeted semi-quantitative analysis. The effect of the different MS scan rate and CF ranges on feature detection was investigated. Acquiring data at 2 CF scans s -1 increased the number of features detected, in all cases, despite the decrease in peak intensity associated with higher acquisition speeds (Figure 7b, c). Increasing to 3 CF scans s -1 did not further increase the number of features detected. In non-targeted ‘omics applications, liquid or gas chromatography (LC or GC) is typically combined with mass spectrometry (MS) for analysis of complex biological samples. LC-MS molecular features can be missed due to: Proteowizard’s “msconvert” was used to convert the Agilent MassHunter .d data into mzML. An in-house developed Python based tool was then used to split the LC-FAIMS-MS data and convert it to individual .mzML files associated with each FAIMS CF setting. The saved .mzML files were subjected to feature extraction using XCMS package in R. The workflow for feature extraction from LC-FAIMS-MS files is detailed in Figure 4. A feature was defined as a unique identifier for each component of a m/z, a retention time (t R ) and a compensation field (CF). 1. Introduction Analysis was performed on a 6230 TOF-MS and a 1290 series LC (Agilent, Santa Clara, US) combined with an ultraFAIMS device (Owlstone Medical Ltd, Cambridge, UK). Optimization of the acquisition of nested FAIMS data focussed on: Number and range of FAIMS CF settings to give optimal FAIMS separation Number of data points within the timescale of the chromatographic peaks Optimal sensitivity via chromatographic peak heights and number of TOF scans s -1 . Full data acquisition parameters are shown in Table 1. Optimal DF was determined based on trade off between selectivity and sensitivity. Optimization of chromatography to achieve optimal retention of small polar molecules and coverage of features across the chromatographic range was carried out prior to FAIMS optimization experiments. The key dimensions of the ultraFAIMS device are the 100 μm electrode gap and 700 μm path length. The small scale is key to the ability to integrate into the LC-MS workflow. The short ion residence time means an entire CF scan per second can be achieved, making the scanning approach compatible with chromatographic timescales. The CF values are synchronized with MS acquisition so a spectrum is acquired for each CF (Figure 3). Synchronization was provided via a contact closure interface from the binary pump on the LC. LC was performed using a reversed phase Poroshell 120 EC-C18, 2.1 x 100 mm, particle size 2.7 μm (Agilent Technologies). 2. Methods 3. Data analysis 4. LC-ultraFAIMS-MS optimization • Trace component suppression by chemical noise. • Chromatographically unresolved isomeric species. Orthogonality between FAIMS, LC and MS provides additional unique compound identifiers with detection of features based on: Retention time (RT) FAIMS dispersion and compensation fields (DF and CF) Mass-to-charge (m/z) (Figure 1) Figure 1. LC-MS and LC-ultraFAIMS-MS feature determination Figure 4. LC-ultraFAIMS-MS feature extraction workflow Table 1: LC-ultraFAIMS-MS optimization experiments Figure 5. Comparison of features detected at DFs of 230, 240 and 250 Td Figure 6. (a) Raw EIC for m/z 273 across all CFs (black) 10 spectra s -1 and (red) 20 spectra s -1 , showing twice as many CF scans across the LC peak and (b) deconvoluted LC-MS peak at CF of 1.6 Td showing twice as many data points at 20 spectra s -1 Figure 3. Integrated LC-ultraFAIMS-MS analysis Figure 2. (a) ultraFAIMS chip and source schematic (b) installed on Agilent 6230 ToF-MS Test MS scan rate (s -1 ) per s m/z start m/z end Index Start CF (Td) End CF (Td) N CF steps N CF actual CF step size (Td) N Repeats Start DF (Td) End DF (Td) N DF Steps 1a 12 1 80 1500 0 -0.9 4.1 10 12 0.5 503 250 250 0 1b 12 1 80 1500 0 -0.9 4.1 10 12 0.5 503 240 240 0 1c 12 1 80 1500 0 -0.9 4.1 10 12 0.5 503 230 230 0 2 24 2 80 1500 0 -0.9 4.1 10 12 0.5 1007 Test 1 0 Test 1 3a 10 1 80 1500 0 -0.9 3.1 8 10 0.5 503 Test 1 0 Test 1 3b 20 2 80 1500 0 -0.9 3.1 8 10 0.5 1007 Test 1 0 Test 1 3c 18 1 80 1500 0 -0.9 3.1 16 18 0.25 503 Test 1 0 Test 1 3d 6 1 80 1500 0 -0.9 3.1 4 6 1 503 Test 1 0 Test 1 3e 12 2 80 1500 0 -0.9 3.1 4 6 1 1007 Test 1 0 Test 1 3f 18 3 80 1500 0 0 -0.9 3.1 4 6 1 1511 Test 1 Test 1 500 0 1000 1500 2000 2500 3000 3500 4000 4500 -0.9 -0.4 0.1 0.6 1.1 1.6 2.1 2.6 3.1 3.6 4.1 # of features CF 250 240 230 The CF vs features plot (Figure 5) shows good coverage across the analytical space in the range -1 to +3 Td. More features were detected in the higher CF region at the higher DFs as distribution increased with increasing DF. 240 Td was selected for further experiments based on widest distribution of detected features across the CF range. As the CF scan is synchronized with the MS acquisition, the number of data points across a chromatographic peak is dependent on the number of MS scans s -1 and the CF range. Increased MS scan rate means more data points can be acquired over a given CF range. Alternatively, a smaller CF step size could be applied, increasing the data points across the CF peak whilst maintaining the number of CF data points across the LC peak. Faster ToF scan rates do, however, reduce peak intensity (Figure 6). No. of features 3d 3e 3f No. of features 3a 3b 3c 0 500 1000 1500 2000 2500 3000 3500 4000 4500 0 500 1000 1500 2000 2500 3000 3500 4000 4500 0 500 1000 1500 2000 2500 3000 3500 4000 4500 -0.9 -0.4 0.1 0.6 1.1 1.6 2.1 2.6 3.1 No. of features CF (Td) -0.9 -0.4 0.1 0.6 1.1 1.6 2.1 2.6 3.1 CF (Td) -0.9 -0.4 0.1 0.6 1.1 1.6 2.1 2.6 3.1 CF (Td) 3a 3b 3c 3d 3e 3f 0 200000 400000 600000 800000 1000000 1200000 0 200000 400000 600000 800000 1000000 1200000 -1 1 3 Peak Area CF (Td) 3a 3b 0 100000 200000 300000 400000 500000 600000 700000 250 260 270 280 290 Intensity Retention time (s) CF 1.6 Td 3a 3b -1 1 3 Peak Area CF (Td) 3a 3b RT 1 Identified Feature Identified Feature 1 Identified Feature 2 m/z 1 + = RT 1 LC-ultraFAIMS-MS LC-MS DF+ CF 1 m/z 1 DF+ CF 2 m/z 2 + + = RT 1 + + = LC-FAIMS-MS (Agilent.d Files) Convert All Extracted TIC to .mzML Intermediate .mz5 Files Segregate Samples as Per CF TIC in Foldets CF 1 .mzML Feature List (1) Extract TICs Corresponding to Various CF Values Done via in-house Python Script and Proteowizard Feature Extraction using XCMS in R CF 2 .mzML Feature List (2) Feature Extraction using XCMS in R Liquid Flow Sheath gas flow Nebuliser (a) Nozzle Drying gas flow Inlet capillary Spray shield Agilent 6230 TOF MS Miniaturised chip- based FAIMS in chip housing (b) Chromatographic Peak Mass Spectra 1 per CF Compensation Field Scan (nCFs per s) LC peak width Incorporate FAIMS into LC-MS M a s s S p e c t r o m e t r y F i e l d A s y m m e t r i c W a v e f o r m L i q u i d C h r o m a t o g r a p h y I o n M o b i l i t y S p e c t r o m e t r y (a) (a) (b) 0 400 200 600 800 1000 1200 1400 0 100 200 300 400 500 -0.9 Td -0.4 Td 0.1 Td 0.6 Td 1.1 Td 1.6 Td 2.1 Td 2.6 Td 3.1 Td m/z Retention time (s) (a) (b) (b) (b) (c) 100% 90% 80% 70% 60% 50% 40% 30% 20% 10% 0% Sum of features No. of 3 9372 25 643 8011 4488 4141 10409 32 704 8905 4488 5084 No. of 2 Common features No. of 1 Unique features 100% 90% 80% 70% 60% 50% 40% 30% 20% 10% 0% Sum of features No. of 3 1384 1 62 1257 991 328 1470 3 64 1333 991 406 No. of 2 Common features No. of 1 Unique features 2 CF/s 1 CF/s Mid 2 CF/s 1 CF/s Mid A recent study 1 reported a threefold increase in features detected in non-targeted profiling of human urine with the addition of FAIMS to LC-MS analysis. Here we describe an optimized workflow to produce three-dimensional metabolomics data sets and its application to urinary analysis of a colorectal cancer cohort. (a) Intensity Owlstone Medical Ltd., 162 Cambridge Science Park, Cambridge, CB4 0GH, UK For further information, email: [email protected]

Upload: buithien

Post on 09-Aug-2019

220 views

Category:

Documents


0 download

TRANSCRIPT

www.owlstonemedical.com

The development of an LC-FAIMS-MS metabolomics workflow:a new tool for untargeted metabolite profiling to diagnose diseaseLauren Brown; Kayleigh Arthur; Alasdair Edge; Aditya Malkar; Paul Nasca

At a S:N of 3, more features were detected using a 2 CF scans s-1 scan rate (Figure 9a).To determine if this result was accurate, or a result of increased noise using the faster scan rate, the analysis was repeated at a S:N of 10 (Figure 9b).

5. Feature extractionFeatures lists for each CF were aligned into one list based on retention time and m/z. Conditional formatting was applied to the features to deconvolute features in the FAIMS dimension. Features present in multiple CFs were identified as either adjacent or separated into multiple features in the CF dimension, such as in the case of isobars. An example of both is shown in Figure 8.

6. Conclusions

References

• The optimized FAIMS scan settings were used to generate a three-dimensional nested data set (Figure 10).

• The method is now being applied to the urinary analysis of a 674 patient colorectal cancer cohort in collaboration with University Hospital of Coventry and Warwickshire.

• The LC-ultraFAIMS-MS features list will be used to build a predictive model, or classifier, from which a probability of disease can be

predicted. The additional features available in LC-ultraFAIMS-MS, compared to LC-MS alone, increase the likelihood of successfully building such a classifier.1. Arthur, KA, Turner, MA, Reynolds, JC, Creaser, CS.; Anal. Chem., 2017, 89, 3452–3459.

Figure 7. Comparison of the number of features detected in experiments 3 a-f as described in Table 1

Figure 8. Example CF scans of detected features (a) single feature detected in adjacent CF ands (b) multiple features showing separation of isobars

Figure 9. Comparison of features detected at 1 CF scan s-1 and 2 CF scans s-1 at (a) S:N of 3 and (b) S:N of 10

Figure 10. m/z vs tR of a urine sample showing features detected at a S:N of 3; each CF is color di�erentiated

• Upon visual verification of the determined features, the lower sensitivity observed at the 2 CF scan s-1 rate resulted in increased noise in the baseline, accounting for the increased number of unique features detected at the S:N 3.

• 1 CF scan s-1 rate was therefore chosen for subsequent analysis, as a compromise between sensitivity and the number of data points across the peak for untargeted semi-quantitative analysis.

• The e�ect of the di�erent MS scan rate and CF ranges on feature detection was investigated.

• Acquiring data at 2 CF scans s-1 increased the number of features detected, in all cases, despite the decrease in peak intensity associated with higher acquisition speeds (Figure 7b, c).

• Increasing to 3 CF scans s-1 did not further increase the number of features detected.

In non-targeted ‘omics applications, liquid or gas chromatography (LC or GC) is typically combined with mass spectrometry (MS) for analysis of complex biological samples.LC-MS molecular features can be missed due to:

Proteowizard’s “msconvert” was used to convert the Agilent MassHunter .d data into mzML. An in-house developed Python based tool was then used to split the LC-FAIMS-MS data and convert it to individual .mzML files associated with each FAIMS CF setting. The saved .mzML files were subjected to feature extraction using XCMS package in R.The workflow for feature extraction from LC-FAIMS-MS files is detailed in Figure 4.A feature was defined as a unique identifier for each component of a m/z, a retention time (tR) and a compensation field (CF).

1. Introduction

Analysis was performed on a 6230 TOF-MS and a 1290 series LC (Agilent, Santa Clara, US) combined with an ultraFAIMS device (Owlstone Medical Ltd, Cambridge, UK).

Optimization of the acquisition of nested FAIMS data focussed on:• Number and range of FAIMS CF settings to give optimal FAIMS separation• Number of data points within the timescale of the chromatographic peaks • Optimal sensitivity via chromatographic peak heights and number of TOF scans s-1.Full data acquisition parameters are shown in Table 1.

Optimal DF was determined based on trade o� between selectivity and sensitivity.

Optimization of chromatography to achieve optimal retention of small polar molecules and coverage of features across the chromatographic range was carried out prior to FAIMS optimization experiments.

The key dimensions of the ultraFAIMS device are the 100 µm electrode gap and 700 µm path length.The small scale is key to the ability to integrate into the LC-MS workflow.• The short ion residence time means an entire CF

scan per second can be achieved, making the scanning approach compatible with chromatographic timescales.

• The CF values are synchronized with MS acquisition so a spectrum is acquired for each CF (Figure 3).

• Synchronization was provided via a contact closure interface from the binary pump on the LC.

• LC was performed using a reversed phase Poroshell 120 EC-C18, 2.1 x 100 mm, particle size 2.7 µm (Agilent Technologies).

2. Methods

3. Data analysis

4. LC-ultraFAIMS-MS optimization

• Trace component suppression by chemical noise.

• Chromatographically unresolved isomeric species.

Orthogonality between FAIMS, LC and MS provides additional unique compound identifiers with detection of features based on:• Retention time (RT)• FAIMS dispersion and compensation fields

(DF and CF)• Mass-to-charge (m/z) (Figure 1)

Figure 1. LC-MS and LC-ultraFAIMS-MS feature determination

Figure 4. LC-ultraFAIMS-MS feature extraction workflow

Table 1: LC-ultraFAIMS-MS optimization experiments

Figure 5. Comparison of features detected at DFs of 230, 240 and 250 Td

Figure 6. (a) Raw EIC for m/z 273 across all CFs (black) 10 spectra s-1 and (red) 20 spectra s-1, showing twice as many CF scans across the LC peak and (b) deconvoluted LC-MS peak at CF of 1.6 Td showing twice as many data points at 20 spectra s-1

Figure 3. Integrated LC-ultraFAIMS-MS analysis

Figure 2. (a) ultraFAIMS chip and source schematic (b) installed on Agilent 6230 ToF-MS

Test MS scan rate (s-1) per s m/z start m/z end Index Start CF (Td) End CF (Td) N CF steps N CF actual CF step size (Td) N Repeats Start DF (Td) End DF (Td) N DF Steps

1a 12 1 80 1500 0 -0.9 4.1 10 12 0.5 503 250 250 0

1b 12 1 80 1500 0 -0.9 4.1 10 12 0.5 503 240 240 0

1c 12 1 80 1500 0 -0.9 4.1 10 12 0.5 503 230 230 0

2 24 2 80 1500 0 -0.9 4.1 10 12 0.5 1007 Test 1 0Test 1

3a 10 1 80 1500 0 -0.9 3.1 8 10 0.5 503 Test 1 0Test 1

3b 20 2 80 1500 0 -0.9 3.1 8 10 0.5 1007 Test 1 0Test 1

3c 18 1 80 1500 0 -0.9 3.1 16 18 0.25 503 Test 1 0Test 1

3d 6 1 80 1500 0 -0.9 3.1 4 6 1 503 Test 1 0Test 1

3e 12 2 80 1500 0 -0.9 3.1 4 6 1 1007 Test 1 0Test 1

3f 18 3 80 1500 0 0-0.9 3.1 4 6 1 1511 Test 1 Test 1

500

0

1000

1500

2000

2500

3000

3500

4000

4500

-0.9 -0.4 0.1 0.6 1.1 1.6 2.1 2.6 3.1 3.6 4.1#

of

feat

ures

CF

250

240

230

• The CF vs features plot (Figure 5) shows good coverage across the analytical space in the range -1 to +3 Td.

• More features were detected in the higher CF region at the higher DFs as distribution increased with increasing DF.

• 240 Td was selected for further experiments based on widest distribution of detected features across the CF range.

• As the CF scan is synchronized with the MS acquisition, the number of data points across a chromatographic peak is dependent on the number of MS scans s-1 and the CF range.

• Increased MS scan rate means more data points can be acquired over a given CF range.• Alternatively, a smaller CF step size could be applied, increasing the data points across the• CF peak whilst maintaining the number of CF data points across the LC peak.• Faster ToF scan rates do, however, reduce peak intensity (Figure 6).

No

. of

feat

ures

3d3e3f

No

. of

feat

ures

3a3b3c

0

500

1000

1500

2000

2500

3000

3500

4000

4500

0

500

1000

1500

2000

2500

3000

3500

4000

4500

0

500

1000

1500

2000

2500

3000

3500

4000

4500

-0.9 -0.4 0.1 0.6 1.1 1.6 2.1 2.6 3.1

No

. of

feat

ures

CF (Td)-0.9 -0.4 0.1 0.6 1.1 1.6 2.1 2.6 3.1

CF (Td)-0.9 -0.4 0.1 0.6 1.1 1.6 2.1 2.6 3.1

CF (Td)

3a3b3c3d3e3f

0

200000

400000

600000

800000

1000000

1200000

0

200000

400000

600000

800000

1000000

1200000

-1 1 3

Pea

k A

rea

CF (Td)

3a

3b

0

100000

200000

300000

400000

500000

600000

700000

250 260 270 280 290

Inte

nsit

y

Retention time (s)

CF 1.6 Td

3a

3b

-1 1 3

Pea

k A

rea

CF (Td)

3a

3b

RT1IdentifiedFeature

IdentifiedFeature1

IdentifiedFeature2

m/z1+ =

RT1

LC-ultraFAIMS-MS

LC-MS

DF+CF1

m/z1

DF+CF2

m/z2

+ + =

RT1 + + =

LC-FAIMS-MS(Agilent.d Files)

Convert All ExtractedTIC to .mzML

Intermediate .mz5Files

Segregate Samples asPer CF TIC in Foldets

CF 1 .mzML

Feature List (1)

Extract TICsCorresponding toVarious CF Values

Done via in-housePython Script and

Proteowizard

FeatureExtraction using

XCMS in R

CF 2 .mzML

Feature List (2)

FeatureExtraction using

XCMS in R

Liquid Flow

Sheathgas flow

Nebuliser

(a)

Nozzle

Drying gas flow

Inlet capillary

Spray shield

Agilent6230

TOF MS

Miniaturised chip-based FAIMS in

chip housing

(b)

ChromatographicPeak

Mass Spectra1 per CF

CompensationField Scan

(nCFs per s)

LC peak width

IncorporateFAIMS into

LC-MS

Mass Spectrometry

Field Asym

metric W

aveform

Liquid Chromatography

Ion Mobility Spectrom

etry

(a)

(a) (b)

0

400

200

600

800

1000

1200

1400

0 100 200 300 400 500

-0.9 Td

-0.4 Td

0.1 Td

0.6 Td

1.1 Td

1.6 Td

2.1 Td

2.6 Td

3.1 Td

m/z

Retention time (s)

(a) (b)

(b)

(b) (c)

100%

90%

80%

70%

60%

50%

40%

30%

20%

10%

0%

Sum offeatures

No. of 3

937

2

25 64

3

8011

44

88

414

1

104

09 32

704

890

5

44

88

5084

No. of 2 Commonfeatures

No. of 1 Uniquefeatures

100%

90%

80%

70%

60%

50%

40%

30%

20%

10%

0%

Sum offeatures

No. of 3

1384

1 62

1257

99

1

328

1470

3

64

1333 99

1

40

6

No. of 2 Commonfeatures

No. of 1 Uniquefeatures

2 CF/s

1 CF/s

Mid

2 CF/s

1 CF/s

Mid

A recent study1 reported a threefold increase in features detected in non-targeted profiling of human urine with the addition of FAIMS to LC-MS analysis.

Here we describe an optimized workflow to produce three-dimensional metabolomics data sets and its application to urinary analysis of a colorectal cancer cohort.

(a)

Inte

nsit

y

Owlstone Medical Ltd., 162 Cambridge Science Park, Cambridge, CB4 0GH, UKFor further information, email: [email protected]