![Page 1: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/1.jpg)
02/10/2018 v 1.0.0
NMR TOOLS
Pre-processing
Marie Tremblay-Franco
Cécile Canlet
Manon Martin
![Page 2: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/2.jpg)
NMR workflow
2
1Preprocessing using TopSpin: Fourier transformation, phase correction, baseline correction and calibration (TSP: 0ppm)
3 Fourier transformation
2Free Induction Decay
5Cluster-based Peak Alignment: Vu et al, (2011). BMC Bioinformatics 12:405
4In collaboration with B. Govaerts and M. Martin (Louvain La Neuve University, Belgium) : Martin et al. (2018). AnalyticaChimica Acta, 1019, 1-13.
6Probabilistic Quotient Normalization: Dieterle et al. 2006. Analytical Chemistry 78(13):42817In collaboration with R. Servien, P. Tardivel and C. Canlet (ToxAlim, Toulouse, France) : Tardivel et al. (2017). Metabolomics, 13(10), 109.
PQN6
univariate
PCA
available
in progress
ASICS7
PLS-DA
Stan
dar
dA
dva
nce
d
Lege
nd
Equal size
1. Read 4. Bucketing 5. Normalize 6. Analyze 7. Annotate3. Alignment
Total intensity
CluPa5
Quantitativevariable
FID2 FT3,4
Baseline correction4
Phase correction4
Brukerfiles1
2. TopSpin
Jcamp-dx
![Page 3: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/3.jpg)
Galaxy NMR processing tools
![Page 4: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/4.jpg)
4
MTBLS1 DATA DESCRIPTION
![Page 5: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/5.jpg)
4-weeks washoutperiod
t0 t1
t2 t4
t3 t5 t7
t6
42 volunteers:• 12 healthy
patients• 30 T2DM
patients
Experimental design
• 1H-NMR analysis of 131 urine samples from type II diabetic or control patients (http://www.ebi.ac.uk/metabolights/MTBLS1)
• W4M00004_MTBLS1 reference workflow
![Page 6: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/6.jpg)
Data description
• 134 midstream urine samples from human
– 12 healthy volunteers (8 males/4 females): 7 time points of urine collection
– 30 T2DM patients (17 males/13 females): 1-3 time points of urine collection
• NMR sample preparationa
• NMR analysis:
– Bruker DRX-700 spectrometer using a NEOSY pulse sequenceb
– Fourier transformation was applied, then all spectra were phased and baseline corrected using TopSpinf
a b c
![Page 7: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/7.jpg)
7
HUMAN SERUM DATA DESCRIPTION
![Page 8: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/8.jpg)
Experimental design and data description
• 1 blood sample was collected for 4 different donors
• For each blood sample, 8 sub-samples were measured across 8 days with one sub-sample of each donor per day and permutations according to a latinhypercube sampling method
• The total number of samples is then 4×8 = 32
• Data were acquired with a 500 MHz Bruker Avance spectrometer equipped with a TCI cryoprobe and using a CPMG relaxation-editing sequence with pre-saturation
• Spectra are labelled as: JxDx where Jx is the day of measurement and Dx is the donor label
• This dataset was designed to collect multiple measures from the same experimental unit and capture other technical sources of variation, allowing to compare the inter- and intra-unit variability.
![Page 9: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/9.jpg)
PRE-PROCESSING TOOL
9
![Page 10: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/10.jpg)
Reading and pre-processing tools in W4M
![Page 11: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/11.jpg)
Developed from the R package PepsNMR
Included in W4M
Martin M., Legat B., Leenders J., Vanwinsberghe J., Rousseau R., Boulanger B., Eilers, P.H.C, De Tullio P., Govaerts B., Analytica Chimica Acta, 2018, 1019, 1-13.PepsNMR for 1H NMR metabolomics data pre-processing
https://github.com/ManonMartin/PepsNMR
• Read Bruker files
• Group delay correction
• Solvent residuals suppression
• Apodization
• Fourier Transform
• Referencing to 0 ppm
• Zero order phase correction
• Baseline correction
![Page 12: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/12.jpg)
Data importation (1)
Input for the Preprocessing tool: formatted datasets as outputted by the NMR_Readtool
The NMR_Read tool: import raw Bruker data from a zip file
Choose the zip file
TRUE if there is sub-directory per sample
TRUE for the same namesthan sub-directory
Execute
![Page 13: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/13.jpg)
Data importation (2)
Format: .tabular (as outputted by NMR_Read) or .csv– Field separator: tab
– Decimal separator: .
Sample metadata with the acquisition parameters:
• TD: Time domain size
• BYTORDA: Determine the endianness of stored data. If 0 -> Little Endian; if 1 -> Big Endian
• DIGMOD: Digitization mode
• DECIM: Decimation rate of digital filter
• DSPFVS: DSP firmware version
• SW_h: Sweep width in Hz
• SW: Sweep width in ppm
• O1: Spectrometer frequency offset
• DT: Dwell time in microseconds
Data matrix
Input datasets
![Page 14: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/14.jpg)
Preprocessing tool: incentive
• An inadequate pretreatment will never be compensated by a powerful data analysis
Some crucial steps must be carefully performed before the statistical analysis
• Metabolomics studies can have several hundreds of samples
• Manual preprocessing is time-consuming
• In contrast, the Preprocessing tool in W4M:
– Works semi-automatically => gain in time
– Includes advanced methods (solvent suppression, baseline correction, etc.)
– Has its code in open source (GitHub)
![Page 15: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/15.jpg)
Steps
1. Group Delay suppression
2. Solvent suppression
3. Apodization (increase the resolution/sensitivity)
4. Fourier Transform
5. Zero order phase correction
6. Referencing to 0 ppm
7. Baseline correction
8. Negative values put to zero
Preprocessing tool: overview
![Page 16: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/16.jpg)
Phase shifts
• Spectra have a mix of absorptive and dispersive signals
The phase must be corrected
• Crucial step: the integration of signals depends on an adequate phase correction
![Page 17: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/17.jpg)
1. Group Delay suppression
• Zero (𝜑0) and first order (𝜑1 𝜈 ) phase shifts: S = 𝑆𝑝ℎ𝑎𝑠𝑒𝑑e𝑖 𝜑0+𝜑1 𝜈
• Presence of a digital filter – The first tens of points in the FID are not part of the recorded signal and are
called the group delay
– Since the phase shift differs across signals, it introduces a first order phase shift linearly related to 𝜈
• The Bruker pre-acquisition t is known from the acquisition parameters
• Procedure– Apply Fourier transform on the FID
– Remove the delay t
– Apply inverse Fourier transform on the spectrum to recover the FID
![Page 18: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/18.jpg)
2. Solvent suppression
• Solvent residuals signal is problematic: – Highly variable
– Not of interest & can mask useful information
– Affects other pre-processing steps (baseline correction, phase corrections, etc.)
• Hypothesis: solvent is the main component in the samples
• Principle: – Non-parametric estimation of the solvent signal (W) with a penalized
smoother (Eilers, 2003):
– Remove the estimated solvent signal from the FIDs
• Meta-parameter: smoothing parameter lambda
– Determine how smooth is the solvent signal : λ ↑ => smoother signal
Min Least Squares criterion + lambda*roughness
deviations of W from the FID roughness penalty
![Page 19: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/19.jpg)
Illustration of solvent residuals suppression
![Page 20: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/20.jpg)
3. Apodization
• Purpose: improve the sensitivity
• Procedure: multiply the FID by a factor called a weighting function
• Several classes of factors in W4M:– Negative exponential
– Gaussian
– Hanning
– Hamming
– Cos2
• Example: the negative exponential function:𝑊 𝑡 = exp −𝐿𝐵 ∗ 𝑡 with LB : line broadening factor
![Page 21: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/21.jpg)
Example of apodization with a negative exponential
No apodization Exponential
LB=0.3
Exponential
LB=1
Exponential
LB=5
![Page 22: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/22.jpg)
4. Fourier Transform and ppm scale
• Mathematical function to convert the signal in the time domain into a spectrum in the frequency domain
• Frequencies are expressed in Hertz
• Hertz units are dependant of the external magnetic field
• Frequencies in Hertz are re-expressed as chemical shift d in parts-per-million (ppm), a dimensionless scale
![Page 23: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/23.jpg)
5. Zero order phase correction
• The axis along which the signal is recorded cannot be predicted => arbitrary phase 𝜑0
• Hypothesis: – the real part of the spectrum should be in pure absorptive mode (with
strictly (+) intensities)
• Correction principle: – Rotate the spectrum with a series of angles
– Measure a positiveness criterion on the real part of the spectrum
– Select the angle that maximises this criterion
• Positiveness criteria:
– rms:∑ 𝑝𝑜𝑠𝑖𝑡𝑖𝑣𝑒 𝑖𝑛𝑡𝑒𝑛𝑠𝑖𝑡𝑖𝑒𝑠 2
∑ 𝑖𝑛𝑡𝑒𝑛𝑠𝑖𝑡𝑖𝑒𝑠 2
– max: maximum positive intensity
![Page 24: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/24.jpg)
6. Referencing to 0 ppm
• Incentive: a known standard called an internal reference compound (TMS or TSP) is usually added to the samples to refine the scale calibration
The chemical shift is now defined relative to this reference
• A ppm value is attributed to the reference peak (usually 0 ppm) and spectra are aligned to this peak
![Page 25: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/25.jpg)
7. Baseline correction
• For a good integration, baseline should be flat with no distortion Baseline artefacts need to be removed
• Principle: 1. Non-parametric estimation of the baseline (Z) for the real spectrum with AsLS
2. Removal of the estimated baseline from the real part of the spectrum
• Asymmetric Least Squares smoothing (AsLS) algorithm (Eilers &
Boelens, 2005):
• Meta-parameters: – Asymmetry parameter (p): = weight for (-) deviations, if 𝑝 ↑ => the estimated
baseline will more frequently be above the spectrum. Default value = 0.05
– Smoothing parameter (lambda): λ ↑ => smoother signal
uneven weights p for (+) and (-) deviations of Z from the spectrum will favour positive corrected intensities
Min Asymmetric Least Squares criterion + lambda*roughness
roughness penalty
![Page 26: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/26.jpg)
8. Negative values are put to zero
• After the baseline correction step, the remaining negative values are replaced by 0 since they cannot be interpreted
• Final spectra are ready for further alignment, bucketing, normalization and data analysis steps
![Page 27: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/27.jpg)
Preprocessing tool in W4M (1)
Data importation:- Data matrix: spectral
intensities- Sample metadata:
acquisition parameters
First order phase correction
Solvent suppression- Smoothing parameter:
lambda
Apodization- Method: window
function
![Page 28: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/28.jpg)
Preprocessing tool in W4M (2)
Data importation:- Data matrix: spectral
intensities- Sample metadata:
acquisition parameters
First order phase correction
Solvent suppression- Smoothing parameter:
lambda
Apodization- Method: window
functionMethod option
![Page 29: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/29.jpg)
Preprocessing tool in W4M (3)
Fourier Transform
Zero order phase correction- Method: positiveness criterion- Exclusion area(s): one or several exclusion areas can be specified for the computation of the
positiveness criterion
![Page 30: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/30.jpg)
Preprocessing tool in W4M (4)
Shift referencing- Search zone: define the area where to look for the reference compound peak- shiftHandling: how to deal with spectral borders during realignment: 0 filling, cut, etc.
![Page 31: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/31.jpg)
Preprocessing tool in W4M (4)
Shift referencing- Search zone: define the area where to look for the reference compound peak- shiftHandling: how to deal with spectral borders during realignment: 0 filling, cut, etc.
![Page 32: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/32.jpg)
Preprocessing tool in W4M (4)
Shift referencing- Search zone: define the area where to look for the reference compound peak- shiftHandling: how to deal with spectral borders during realignment: 0 filling, cut, etc.
Search zone method option
![Page 33: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/33.jpg)
Preprocessing tool in W4M (4)
Shift referencing- Search zone: define the area where to look for the reference compound peak- shiftHandling: how to deal with spectral borders during realignment: 0 filling, cut, etc.
![Page 34: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/34.jpg)
Preprocessing tool in W4M (4)
Shift referencing- Search zone: define the area where to look for the reference compound peak- shiftHandling: how to deal with spectral borders during realignment: 0 filling, cut, etc.
![Page 35: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/35.jpg)
Preprocessing tool in W4M (4)
Shift referencing- Search zone: define the area where to look for the reference compound peak- shiftHandling: how to deal with spectral borders during realignment: 0 filling, cut, etc.
![Page 36: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/36.jpg)
Preprocessing tool in W4M (5)
Baseline correction- Smoothing parameter:
lambda- Asymmetry parameter: p
Negative values put to zero
Final spectra
![Page 37: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/37.jpg)
Results : graphs (1)
Spectrum after Fourier transform
• First order phase correction
• Solvent suppression with lambda = 106
• Apodization : exp with lb=0.3
• Fourier transform
![Page 38: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/38.jpg)
Spectrum after zero order phase correction and chemical shift referencing
Results : graphs (2)
• zero order phase correction : method rms
• Shift referencing : nearvalue; zerofilling; 0 ppm
![Page 39: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/39.jpg)
Results : graphs (3)
Spectrum after baseline correction
• Baseline correction : lambda = 100000; p=0.05; numerical precision =
10-8
![Page 40: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/40.jpg)
Results : graphs (4)
Final preprocessed spectrum
Negative values are put to 0
![Page 41: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/41.jpg)
ALIGNMENT TOOL
41
![Page 42: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/42.jpg)
• Variation of chemical shifts between samples
‒ Chemical shift depends on the pH
‒ For urine samples, the pH can vary greatly
‒ Addition of phosphate buffer is not sufficient to ensure the same pH in all samples (500 µl of urine and 200 µl of phosphate buffer pH 7)
Alignment: why?
• Problems with bucketing
‒ A bucket can correspond to different metabolites in the different spectra if the chemical shift is different between spectra
‒ The statistical results may be affected by this problem
![Page 43: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/43.jpg)
Example : superposition of urine NMR spectra
Problematic for the bucketing: the bucket at 2.53 ppm corresponds to citrate in the red spectrum and noise in the other spectra
![Page 44: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/44.jpg)
Bibliography (1)
• Vu et Laukens. (2013): classification of identified algorithms
![Page 45: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/45.jpg)
Bibliography (2)
• Vu et Laukens. (2013): some of identified algorithms in their review have been compared in different papers
![Page 46: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/46.jpg)
CluPA algorithm (1)
• Vu et al. (2011): Cluster-based Peak Alignment
• Three-steps algorithm
‒ Peak peaking: spectra denoising, baseline correction and peak selection: R package based on wavelet transform, developed for mass spectra
![Page 47: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/47.jpg)
Peak picking algorithm
![Page 48: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/48.jpg)
CluPA algorithm (2)
• Vu et al. (2011): Cluster-based Peak Alignment
• Three-steps algorithm
‒ Peak peaking: spectra denoising, baseline correction and peak selection: R package based on wavelet transform, developed for mass spectra
‒ Reference finding: reference pattern against which all the spectra will be aligned. Automatic method proposed
![Page 49: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/49.jpg)
Reference determination
![Page 50: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/50.jpg)
Reference determination
![Page 51: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/51.jpg)
Reference determination
![Page 52: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/52.jpg)
CluPA algorithm (3)
• Vu et al. (2011): Cluster-based Peak Alignment
• Three-steps algorithm
‒ Peak peaking: spectra denoising, baseline correction and peak selection: R package based on wavelet transform, developed for mass spectra
‒ Reference finding: reference pattern against which all the spectra will be aligned. Automatic method proposed
‒ Peak alignment: iterative method based on hierarchical clustering
![Page 53: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/53.jpg)
Alignment algorithm (1)
• Peak lists of reference spectrum and target spectrum are merged
• Steps• 1 – Definition of the spectral zone to align
• 2 – Determination of a shift step, based on the FFT cross correlation
• 3 – Alignment
• 4 – Hierarchical Clustering Algorithm of the shifted merged peak list
![Page 54: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/54.jpg)
Alignment algorithm (2)
![Page 55: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/55.jpg)
Alignment algorithm (1)
• Peak lists of reference spectrum and target spectrum are merged
• Steps• 1 – Definition of the spectral zone to align
• 2 – Determination of a shift step, based on the FFT cross correlation
• 3 – Alignment
• 4 – Hierarchical Clustering Algorithm of the shifted merged peak list
• 5 – Tree is cut at level 2
• Steps 1 – 5 are repeated for each sub-tree
![Page 56: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/56.jpg)
Alignment algorithm (2)
![Page 57: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/57.jpg)
Alignment algorithm (2)
![Page 58: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/58.jpg)
Alignment algorithm (1)
• Peak lists of reference spectrum and target spectrum are merged
• Steps• 1 – Definition of the spectral zone to align
• 2 – Determination of a shift step, based on the FFT cross correlation
• 3 – Alignment
• 4 – Hierarchical Clustering Algorithm of the shifted merged peak list
• 5 – Tree is cut at level 2
• Steps 1 – 5 are repeated for each sub-tree, as long as • More than 1 peak in the peak list
• Peaks from reference and target spectra in the in the peak list
![Page 59: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/59.jpg)
Alignment algorithm (2)
![Page 60: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/60.jpg)
CluPA evaluation (1)
• NMR spectra were simulated using MetAssimulo (Muncey et al. 2010)
– MatLab toolbox to simulate 1H NMR spectra of complex mixtures
– Based on a database including 49 reference compounds and on HMDB
– Spectra including Dimethylamine, Citric acid, L-Histidine, Taurine, Trimethylamine, Hippuric acid, Glycine and TMAO were simulated under different pH conditions (from 4.0 to 10.0)
![Page 61: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/61.jpg)
CluPA evaluation (2)
![Page 62: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/62.jpg)
CluPA evaluation (3)
![Page 63: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/63.jpg)
CluPA evaluation (4)
![Page 64: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/64.jpg)
Form
Type of input files
![Page 65: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/65.jpg)
Form
Type of input files
![Page 66: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/66.jpg)
Bruker file
• Preprocessing using TopSpin performed: Fourier transformation, spectra phased and baseline corrected, calibrated (TSP: 0 ppm)
• For each individual sample, one directory as follows:
![Page 67: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/67.jpg)
Form
Spectral width
Region(s) to exclude
Type of input files
![Page 68: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/68.jpg)
Spectral width
Region(s) to exclude
Type of input files
Form
![Page 69: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/69.jpg)
Form
Left and right boundaries
Spectral width
Region(s) to exclude
Type of input files
![Page 70: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/70.jpg)
Form
Reference spectrum
Segment size
Minimal intensity
Spectral width
Region(s) to exclude
Type of input files
![Page 71: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/71.jpg)
Form
Reference spectrumSegment sizeMinimal intensity
Spectral width
Region(s) to exclude
Type of input files
Graphical zone(s)
![Page 72: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/72.jpg)
Form
Lactic acid
Citric acid
Lactic acid
![Page 73: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/73.jpg)
Form
Reference spectrum
Segment size
Minimal intensity
Spectral width
Region(s) to exclude
Type of input files
![Page 74: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/74.jpg)
Results: aligned matrix
Samples’ IDVariables’ ID = chemical shift
![Page 75: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/75.jpg)
Results: graph
![Page 76: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/76.jpg)
Results: graph
![Page 77: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/77.jpg)
BUCKETING TOOL
77
![Page 78: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/78.jpg)
Bucketing or binning
• Data Reduction
16384 variables
600 variables
![Page 79: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/79.jpg)
• Fixed size bucketing :
– spectra segmentation in 600 (depends on the bucket width) windowsof equal size, for example 0,01 ppm;
‒ exclusion of contaminants;
‒ bucket integration
• Variable size bucketing :
‒ Each window is determined graphically;
‒ The entire signal (singlet, doublet, triplet….) is in the same bucket;
‒ Contaminants and noise are not chosen;
‒ In case of shifts, the signal of a same proton is in the same bucket for all spectra;
‒ The number of buckets is small (100-200)
‒ Bucket integration
Bucketing or binning
![Page 80: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/80.jpg)
Bucketing or binning
![Page 81: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/81.jpg)
Fixed-size Bucketing
• Binning: spectra segmentation in fixed-size windows (e.g. 0.01ppm)
• Spectra regions corresponding to water, solvents, … resonances excluded
![Page 82: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/82.jpg)
1.925 1.915
Integration
• Sum of intensities inside each bucket = area under the curve computation
• Trapezoidal method: cumtrapz function (pracma package)
– For each bucket
• Discretization into N equally spaced “small” rectangles
![Page 83: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/83.jpg)
• Sum of intensities inside each bucket = area under the curve computation
• Trapezoidal method: cumtrapz function (pracma package)
– For each bucket
• Discretization into N equally spaced “small” rectangles
1.925 1.915
Integration
![Page 84: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/84.jpg)
• Sum of intensities inside each bucket = area under the curve computation
• Trapezoidal method: cumtrapz function (pracma package)
– For each bucket
• Discretization into N equally spaced “small” rectangles
1.925 1.915
Integration
![Page 85: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/85.jpg)
• Sum of intensities inside each bucket = area under the curve computation
• Trapezoidal method: cumtrapz function (pracma package)
– For each bucket
• Discretization into N equally spaced “small” rectangles
1.925 1.915
Integration
![Page 86: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/86.jpg)
• Sum of intensities inside each bucket = area under the curve computation
• Trapezoidal method: cumtrapz function (pracma package)
– For each bucket
• Discretization into N equally spaced “small” rectangles
1.925 1.915
Integration
![Page 87: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/87.jpg)
• Sum of intensities inside each bucket = area under the curve computation
• Trapezoidal method: cumtrapz function (pracma package)
– For each bucket
• Discretization into N equally spaced “small” rectangles
• In each “small” rectangles, linear approximation of the curve = trapezium area is used to compute the area under the curve
• Absolute values of rectangles are summed
1.925 1.915
Integration
![Page 88: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/88.jpg)
Form
Type of input files
![Page 89: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/89.jpg)
Form
Bin size
Spectral width
Type of input files
Region(s) to exclude
![Page 90: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/90.jpg)
Form
Left and right boundaries
Bin size
Spectral width
Type of input files
![Page 91: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/91.jpg)
Form
Bin size
Spectral width
Region(s) to exclude
Spectra vizualisation
Type of input files
![Page 92: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/92.jpg)
Results: bucketed and integrated data matrix
Samples’ IDVariables’ ID = Bucket center File download
![Page 93: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/93.jpg)
Variables’ ID=Bucket center: row names of the variable Metadata file have to be identical to the row names of the dataMatrix file
Results: variableMetadata matrix
![Page 94: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/94.jpg)
Results: spectra representation
![Page 95: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/95.jpg)
NORMALIZATION TOOL
95
![Page 96: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/96.jpg)
Normalization
• Operation applied to each sample to make the data from all samples directly comparable with each other (to take into account for variations of the overall concentrations of samples due to biological and technical reasons)
To ensure that a measured concentration observed for a metabolite at the lower end of the dynamic range is as reliable as it is for a metabolite at the upper end
![Page 97: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/97.jpg)
Total intensity
• Intensity of each feature is divided by the intensity of total spectrum
• Each spectrum has a total intensity of 1
![Page 98: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/98.jpg)
Quantitative variable
• Intensity of each feature is divided by the value of a quantitative variable: weight for tissue, osmolality, …
![Page 99: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/99.jpg)
Probabilistic Quotient Normalization (1)
• Hyp.: a majority of molecule concentrations remain unchanged across the samples = biologically interesting concentration changes influence only parts of the NMR spectrum, whereas dilution effects will affect all metabolite signals
• Comparison of spectra to a reference sample (the best approach is the computation of the median spectrum of control samples, Dieterle et al. 2006)
![Page 100: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/100.jpg)
Probabilistic Quotient Normalization (2)
• Hyp.: a majority of molecule concentrations remain unchanged across the samples = biologically interesting concentration changes influence only parts of the NMR spectrum, while dilution effects will affect all metabolite signals
• Comparison of spectra to a reference sample (the best approach is the computation of the median spectrum of control samples, Dieterle et al. 2006)
• Steps
– Total intensity normalization
![Page 101: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/101.jpg)
Probabilistic Quotient Normalization (2)
![Page 102: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/102.jpg)
Probabilistic Quotient Normalization (3)
• Hyp.: a majority of molecule concentrations remain unchanged across the samples = biologically interesting concentration changes influence only parts of the NMR spectrum, while dilution effects will affect all metabolite signals
• Comparison of spectra to a reference sample (the best approach is the computation of the median spectrum of control samples, Dieterle et al. 2006)
• Steps
– Total intensity normalization
– Reference spectrum computation
![Page 103: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/103.jpg)
Probabilistic Quotient Normalization (3)
![Page 104: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/104.jpg)
Probabilistic Quotient Normalization (4)
• Hyp.: a majority of molecule concentrations remain unchanged across the samples = biologically interesting concentration changes influence only parts of the NMR spectrum, while dilution effects will affect all metabolite signals
• Comparison of spectra to a reference sample (the best approach is the computation of the median spectrum of control samples, Dieterle et al. 2006)
• Steps
– Total intensity normalization
– Reference spectrum computation
– Computation of quotients of all variables of interest of the test spectrum with those of the reference spectrum
![Page 105: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/105.jpg)
Probabilistic Quotient Normalization (4)
![Page 106: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/106.jpg)
Probabilistic Quotient Normalization (5)
• Hyp.: a majority of molecule concentrations remain unchanged across the samples = biologically interesting concentration changes influence only parts of the NMR spectrum, while dilution effects will affect all metabolite signals
• Comparison of spectra to a reference sample (the best approach is the computation of the median spectrum of control samples, Dieterle et al. 2006)
• Steps
– Total intensity normalization
– Reference spectrum computation
– Computation of quotients of all variables of interest of the test spectrum with those of the reference spectrum
– Median of these quotients computation
![Page 107: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/107.jpg)
Probabilistic Quotient Normalization (5)
![Page 108: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/108.jpg)
Probabilistic Quotient Normalization (6)
• Hyp.: a majority of molecule concentrations remain unchanged across the samples = biologically interesting concentration changes influence only parts of the NMR spectrum, while dilution effects will affect all metabolite signals
• Comparison of spectra to a reference sample (the best approach is the computation of the median spectrum of control samples, Dieterle et al. 2006)
• Steps
– Total intensity normalization
– Reference spectrum computation
– Computation of quotients of all variables of interest of the test spectrum with those of the reference spectrum
– Median of these quotients computation
– Division of all variables of the test spectrum by this median
![Page 109: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/109.jpg)
Normalization (2)
![Page 110: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/110.jpg)
Form
Preprocessed datamatrixNormalization method
![Page 111: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/111.jpg)
Form
Preprocessed datamatrixNormalization method
![Page 112: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/112.jpg)
Form
Preprocessed datamatrixNormalization method
Biological factor of interest
« Reference » group
Samplemetadata1
1 mandatory for QuantitativeVariableand PQN normalization
![Page 113: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/113.jpg)
Form
Preprocessed datamatrixNormalization method
Spectra vizualisation
Biological factor of interest
« Reference » group
Samplemetadata1
1 mandatory for QuantitativeVariableand PQN normalization
![Page 114: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/114.jpg)
Results: normalized data matrix
SamplesFeatures
![Page 115: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/115.jpg)
Results: spectra representation
…
![Page 116: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/116.jpg)
Results: spectra representation
…
![Page 117: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/117.jpg)
PREPARING DATA
117
![Page 118: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/118.jpg)
Format of data files
• Three tables gathering all the information
• data matrix: intensities / expression (ions, buckets, proteins, …)
• sample metadata file: information regarding samples (individuals)
• variable metadata file: information regarding variables (ions, buckets, proteins, …)
• Tabulated files
• TSV / TXT with tabulation as column separator
• “.” as decimal separator
• Convention for identifiers and column names
• Should not contain any duplicate
• Rather use only alphanumeric characters, and points (.) and underscores (_)
![Page 119: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/119.jpg)
Format of data files: dataMatrixname
vari
able
s
first column = Variables’ ID
samples first row = Samples’ ID
intensitiesONLY intensities (no other information)
Note: missing values should be coding NA
![Page 120: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/120.jpg)
Format of data files: sampleMetadataname
sam
ple
s
first column = samples’ IDmust absolutely match those in the dataMatrix file
Column names first row = treatment, diet, age …
intensities Information about your samplesCan add to this table as many columns as you want or need
![Page 121: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/121.jpg)
Format of data files: variableMetadataname
vari
able
s
first column = variables’ IDmust absolutely match those in the dataMatrix file
Column namesfirst row = retention time,
molar mass, …
intensitiesInformation about your variablesCan add to this table as many columns as you want or need
![Page 122: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/122.jpg)
EXERCISE – MTBLS1 DATA
122
![Page 123: Pre-processing - workflow4metabolomics€¦ · Preprocessing tool: incentive •An inadequate pretreatment will never be compensated by a powerful data analysis Some crucial steps](https://reader034.vdocument.in/reader034/viewer/2022050521/5fa45a8ea305f419b6331578/html5/thumbnails/123.jpg)
EXERCISE
• Do the spectra preprocessing using different parameters :
– Apodization exponential : LB=0 ; 0.3 or 1
– Zero order phase correction : rms or max
– Shift referencing : nearvalue; all; or window
– Baseline correction : use different values for lambda, and for p
• Do the spectra alignment using different segment size, reference spectrum and intensity threshold
• Do the spectra bucketing using different bin sizes
• Do the spectra normalization using different methods
• Compare results