EXPERIMENTAL DESIGN AND DATA TREATMENT IN METABOLOMICS
APPLIED TO NUTRITION AND HEALTH
Mélanie Pétéra
PFEM, MetaboHUB-Clermont, Univ Clermont Auvergne, INRA, UNH
Métabolomique - Lille, 15 October 2019
METABOLOMICS TO BETTER UNDERSTAND
NUTRITION-HEALTH RELATIONSHIP
PARTICULARITIES IN DESIGN
A WHOLE TO CONSIDER
TO CONSIDER BEFORE STARTING
A main question clearly defined
Defining priorities
Gathering information
About individuals
About samples
About experimental context
The goal
Technological choices
Knowing what is possible and what is not
METABOLOMICS AND VARIABILITY
Do not wait to get the data!
BIOLOGICAL AND ANALYTICAL VARIABILITY: BEWARE OF UNFORTUNATE COMBINATION
[Figure: PCA score plot, model EQUOL021006_FM010_1973mz_Genotype.M3 (PCA-X), centred log(data). Axes: t[1] vs t[3]. Observations coloured according to Obs ID (Primary), groups A and B. R2X[1] = 0.174129, R2X[3] = 0.0819166. Ellipse: Hotelling T2 (0.95).]
BIOLOGICAL AND ANALYTICAL VARIABILITY: BEWARE OF UNFORTUNATE COMBINATION
[Figure: PCA score plot (PCA-X). Axes: t[1] vs t[3]. Observations coloured according to classes in M4 (labels 11 to 16). R2X[1] = 0.174967, R2X[3] = 0.0821061. Ellipse: Hotelling T2 (0.95).]
Signal drift (gradual loss in sensitivity)
TO CONSIDER BEFORE STARTING
A main question clearly defined
Defining priorities
Gathering information
About individuals
About samples
About experimental context
The goal
Technological choices
Knowing what is possible and what is not
Confounding factors
Data processing parameters
Cofactors
Involvement in the project design at every level
EXAMPLE – CONFOUNDING EFFECT
Group distribution according to injection sequence
Injection number distribution per group
Biological groups not mentioned before data analysis
Groups not taken into account in injection sequence randomisation
Unlucky distribution of groups along the injection sequence
Confounding effect: is a difference due to the biological groups or to a residual signal drift effect?
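One way to limit this kind of confounding is to take the biological groups into account when building the injection sequence. The sketch below is only an illustration, not the procedure used at PFEM: the function names, the sample IDs, and the two-group design are hypothetical. It deals samples into consecutive blocks containing at most one sample per group, so that no group is concentrated at the start or the end of the run.

```python
import random
from collections import defaultdict


def block_randomised_sequence(sample_groups, seed=0):
    """Return sample IDs in an injection order balanced across groups.

    sample_groups: dict mapping sample ID -> biological group label.
    Samples are shuffled within each group, then dealt into consecutive
    blocks that each contain at most one sample per group; the order
    inside each block is shuffled as well.
    """
    rng = random.Random(seed)
    by_group = defaultdict(list)
    for sample_id, group in sample_groups.items():
        by_group[group].append(sample_id)
    for members in by_group.values():
        rng.shuffle(members)

    sequence = []
    while any(by_group.values()):
        block = [members.pop() for members in by_group.values() if members]
        rng.shuffle(block)
        sequence.extend(block)
    return sequence


def mean_position_per_group(sequence, sample_groups):
    """Mean injection position of each group (a quick balance check)."""
    positions = defaultdict(list)
    for position, sample_id in enumerate(sequence, start=1):
        positions[sample_groups[sample_id]].append(position)
    return {group: sum(p) / len(p) for group, p in positions.items()}


# Hypothetical design: 10 samples in group A and 10 in group B.
sample_groups = {f"A{i}": "A" for i in range(10)}
sample_groups.update({f"B{i}": "B" for i in range(10)})

sequence = block_randomised_sequence(sample_groups, seed=42)
balance = mean_position_per_group(sequence, sample_groups)
```

With such a design the mean injection positions of the groups stay close to each other, so a residual drift along the sequence affects all groups in a similar way. This only works if the biological groups are communicated before the sequence is built.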
DATA TREATMENT: WHAT TO CHOOSE?
WORKFLOW OVERVIEW
Remaining noise?
Redundancy?
Possibility of identification?
Variable selection?
EXAMPLE 1: REDUNDANCY
REDUNDANCY
Choice of methods that can be used anyway
Getting rid of collinearity
Standard methods can still be penalised
Subtle effects
Variables not linked to the question of interest
Analytical redundancy VS biological correlation
How to tell the difference?
Do we want to eliminate biological correlation?
How to select which variables to keep?
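As an illustration of the questions above, here is a minimal sketch of one possible heuristic, assuming LC-MS data where each feature has a retention time. It treats two features as analytically redundant only when they are both highly correlated and co-eluting, and it keeps the most intense feature of each group. The thresholds (`r_min`, `rt_window`), the feature names, and the choice of representative are assumptions made for the example, not a method described in the slides.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # constant feature: treat as uncorrelated
    return cov / sqrt(var_x * var_y)


def filter_redundant(features, r_min=0.9, rt_window=0.1):
    """Keep one representative per group of likely redundant features.

    features: dict name -> {"rt": retention time, "intensities": list}.
    Two features are linked when |r| >= r_min AND their retention times
    differ by at most rt_window. Linked features are merged into groups
    (connected components), and the feature with the highest mean
    intensity represents each group.
    """
    names = list(features)
    parent = {name: name for name in names}

    def find(name):
        # Union-find lookup with path halving.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            close = abs(features[a]["rt"] - features[b]["rt"]) <= rt_window
            r = pearson(features[a]["intensities"], features[b]["intensities"])
            if close and abs(r) >= r_min:
                parent[find(a)] = find(b)

    groups = {}
    for name in names:
        groups.setdefault(find(name), []).append(name)

    def mean_intensity(name):
        values = features[name]["intensities"]
        return sum(values) / len(values)

    return sorted(max(members, key=mean_intensity) for members in groups.values())


# Hypothetical features: the first two co-elute and are correlated;
# the third is correlated with them but elutes much later.
features = {
    "M181T2.10": {"rt": 2.10, "intensities": [10.0, 20.0, 30.0, 40.0]},
    "M182T2.11": {"rt": 2.11, "intensities": [1.1, 2.0, 3.1, 4.0]},
    "M300T7.50": {"rt": 7.50, "intensities": [11.0, 19.0, 31.0, 39.0]},
}
kept = filter_redundant(features)
```

In this toy example the third feature is kept even though it correlates with the first two, because the retention time criterion treats it as a possible biological correlation rather than an analytical duplicate. Whether such correlations should be kept remains a choice that depends on the question of interest.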
EXAMPLE 2: FEATURE SELECTION
statistics
variety of potentially involved variables
biomarker discovery
mechanism understanding
parsimonious models
global models
JK Nicholson – Imperial College London
identification not mandatory
need of large-scale validation
device-dependent models
biological validity
limited number of variables for clinical use
variable selection issues
limits of identification
A path we follow at PFEM
EXAMPLE 2: FEATURE SELECTION
KNOWLEDGE BASED SIGNATURE OPTIMISATION
Identification of a specific signature of MetS
Morrow et al., 2007
TOOLS FOR DATA ANALYSIS
Tools
Choices
Principle and goals
Technical constraints
The forest
The kind of tree needed
The goal
EXAMPLE WITH THE PFEM LANDSCAPE
Graphical user interface
Ergonomic
Parameter completeness
Modularity
Data & workflow sharing
Possibility of integrating new tools
In-house tools to perform new strategies based on research projects, collaborations…
Enhancement of available tools, new tool development to break down current barriers or to undertake new challenges
EXAMPLE OF PFEM TOOL DEVELOPMENT (1)
• What:
A filtered, coloured correlation table between two datasets
• Why:
To give biologists or analysts a combined view of metabolomic results and clinical data in a simple way
• How:
A Galaxy tool that plugs directly into the output files from W4M workflows
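To make the idea of a filtered, coloured correlation table concrete, here is a minimal Python sketch. It is not the PFEM Galaxy tool and does not reproduce its parameters or file formats: the function name `correlation_table`, the threshold `r_min`, the colour labels, and the variable names in the example are all hypothetical. It computes the Pearson correlation for every pair of variables taken from two datasets measured on the same samples, keeps only the pairs above the threshold, and attaches a colour according to the sign.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # constant variable: treat as uncorrelated
    return cov / sqrt(var_x * var_y)


def correlation_table(dataset_a, dataset_b, r_min=0.7):
    """Rows (var_a, var_b, r, colour) for pairs with |r| >= r_min.

    dataset_a, dataset_b: dicts mapping variable name -> list of values,
    with samples in the same order in both datasets.
    Rows are sorted by decreasing absolute correlation.
    """
    rows = []
    for name_a, values_a in dataset_a.items():
        for name_b, values_b in dataset_b.items():
            r = pearson(values_a, values_b)
            if abs(r) >= r_min:
                colour = "red" if r > 0 else "blue"
                rows.append((name_a, name_b, round(r, 3), colour))
    return sorted(rows, key=lambda row: -abs(row[2]))


# Hypothetical data: two metabolites and two clinical variables
# measured on the same five individuals.
metabolites = {"hippurate": [1, 2, 3, 4, 5], "citrate": [5, 3, 4, 1, 2]}
clinical = {"BMI": [2, 4, 6, 8, 10], "age": [3, 1, 2, 3, 1]}

table = correlation_table(metabolites, clinical, r_min=0.7)
for row in table:
    print(row)
```

Only the two pairs involving BMI pass the filter in this toy example; the weak correlations with age are dropped, which is what keeps the table readable when both datasets contain many variables.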
EXAMPLE OF PFEM TOOL DEVELOPMENT (2)
• What:
Functionalities dedicated to MBPLSDA
• Why:
To have a single package providing all the outputs needed for efficient evaluation of MBPLSDA models
• How:
An R package with a small number of functions to perform MBPLS and gather relevant indicators
THANK YOU FOR YOUR ATTENTION
Serge Rezzi, Ziad Ramadan, Laurent B. Fay, and Sunil Kochhar. Journal of Proteome Research 2007, 6 (2), 513-525.
Mark Haid, Caroline Muschet, Simone Wahl, Werner Römisch-Margl, Cornelia Prehn, Gabriele Möller, and Jerzy Adamski. Journal of Proteome Research 2018, 17 (1), 203-211.
Wenbin Zhou, Guigang Zeng, Chunming Lyu, Fang Kou, Shen Zhang, and Hai Wei. Journal of Sports Science and Medicine 2019, 18, 253-263.
Alonso A, Marsal S, Julià A. Front Bioeng Biotechnol. 2015;3:23.
Morrow DA, de Lemos JA. Circulation. 2007;115:949-52.