STATISTICS FOR HIGH DIMENSIONAL BIOLOGICAL RECORDINGS
Dr Cyril Pernet,
Centre for Clinical Brain Sciences
Brain Research Imaging Centre
[email protected]
www.sbirc.ed.ac.uk/cyril/
Biological Recordings
• Behavioural / Electrophysiology / MRI images
• 1D: single channel (time / freq)
• 2D: classification ‘images’ (can actually be spectrograms)
• 3D: MRI (xyz) and MEEG (channels x time / freq / trials)
• 4D: fMRI (time x xyz) and MEEG (channels x freq x time x trials)
Biological Recordings
Often we want:
• To ensure data are OK for analysis -> high-dimensional outlier detection, weighting, etc.
• To analyse each ‘cell’ in the data matrix = ‘massive univariate analyses’ -> multiple comparisons issue
• To find features in the data that distinguish conditions / groups -> dimension reduction (ICA), classification (MVPA)
My toys
• General linear model (WLS, IRLS)
• Robust statistics (trimmed means, winsorized variance, skipped correlations, half-space / minimum covariance determinant, MAD, S-outliers, etc.)
• Bootstrap and permutations
• Cross-validation
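A few of these tools can be sketched in plain Python. This is only an illustration, not code from the talk: the function names, the 20% trimming proportion and the 2.24 MAD cutoff are assumptions chosen for the example.

```python
import random
import statistics

def trimmed_mean(x, prop=0.2):
    """Mean after dropping the lowest and highest `prop` fraction of values."""
    s = sorted(x)
    g = int(prop * len(s))                 # number of values cut from each tail
    return statistics.fmean(s[g:len(s) - g])

def winsorized_variance(x, prop=0.2):
    """Variance after replacing each tail by the nearest retained value."""
    s = sorted(x)
    g = int(prop * len(s))
    if g:
        s = [s[g]] * g + s[g:len(s) - g] + [s[-g - 1]] * g
    return statistics.variance(s)

def mad_outliers(x, cutoff=2.24):
    """Flag values far from the median, in units of the normalised MAD.

    Assumes the MAD is non-zero (fewer than half of the values are tied).
    """
    med = statistics.median(x)
    mad = 1.4826 * statistics.median([abs(v - med) for v in x])
    return [abs(v - med) / mad > cutoff for v in x]

def bootstrap_ci(x, stat, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for any statistic `stat`."""
    rng = random.Random(seed)
    boots = sorted(stat(rng.choices(x, k=len(x))) for _ in range(n_boot))
    return boots[int(n_boot * alpha / 2)], boots[int(n_boot * (1 - alpha / 2)) - 1]

data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]    # made-up sample with one gross outlier
print(statistics.fmean(data), trimmed_mean(data))
print(mad_outliers(data))
print(bootstrap_ci(data, trimmed_mean))
```

The ordinary mean is pulled towards the outlier, while the trimmed mean and the MAD rule are not; the bootstrap interval is built around the robust estimator rather than the mean.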
Example 1: EEG outlier detection
• Weighted least squares of MEEG
-> weights based on time course similarity:
1. dimension reduction (PCA)
2. outlier detection (MAD)
3. weighting (WLS)
OLS: face 1 vs 2 seem a bit different
WLS: face 1 vs 2 seem identical
Bias: trial variability in face 2 leads to a small difference in OLS
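The three steps of Example 1 (PCA, MAD, WLS) can be sketched with NumPy on simulated data. This is a simplified illustration under stated assumptions, not the implementation behind the slide: the weighting rule (cutoff / z for flagged trials), the 99% variance criterion, the 2.24 cutoff, the function names and the simulated trials are all hypothetical, and the weights are computed over all trials at once.

```python
import numpy as np

def trial_weights(Y, var_kept=0.99, cutoff=2.24):
    """Y: trials x time. Returns one weight per trial in (0, 1]."""
    # 1. Dimension reduction: PCA over trials via SVD of the centred data
    Yc = Y - Y.mean(axis=0)
    U, s, _ = np.linalg.svd(Yc, full_matrices=False)
    explained = np.cumsum(s**2) / np.sum(s**2)
    k = int(np.searchsorted(explained, var_kept)) + 1
    scores = U[:, :k] * s[:k]                    # trial coordinates in PC space
    # 2. Outlier detection: MAD-normalised distance from the median trial
    dist = np.linalg.norm(scores - np.median(scores, axis=0), axis=1)
    med = np.median(dist)
    mad = 1.4826 * np.median(np.abs(dist - med))
    z = (dist - med) / mad
    # 3. Weighting: full weight for typical trials, down-weight flagged ones
    w = np.ones(len(dist))
    flagged = z > cutoff
    w[flagged] = cutoff / z[flagged]
    return w

def wls(X, Y, w):
    """Weighted least squares: minimises sum_i w_i * ||Y_i - X_i @ beta||^2."""
    sw = np.sqrt(w)[:, None]
    beta, *_ = np.linalg.lstsq(X * sw, Y * sw, rcond=None)
    return beta

# Simulated data: 40 trials x 100 time points, two conditions of 20 trials
rng = np.random.default_rng(0)
t = np.linspace(0, 1, 100)
erp = np.sin(2 * np.pi * t)                      # same true response in both conditions
Y = erp + 0.2 * rng.standard_normal((40, 100))
outlier_idx = [20, 21, 22]                       # three aberrant trials in condition 2
Y[outlier_idx] += 5 * np.sin(6 * np.pi * t)

X = np.zeros((40, 2))                            # one indicator column per condition
X[:20, 0] = 1
X[20:, 1] = 1

w = trial_weights(Y)
b_ols = wls(X, Y, np.ones(40))                   # unit weights reduce WLS to OLS
b_wls = wls(X, Y, w)
diff_ols = b_ols[1] - b_ols[0]
diff_wls = b_wls[1] - b_wls[0]
print(np.abs(diff_ols).max(), np.abs(diff_wls).max())
```

Both conditions share the same true response here, so any condition difference is spurious. A few aberrant trials are enough to create one under OLS, and down-weighting them largely removes it.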
Example 2: MCC (multiple comparisons correction)
• Threshold-Free Cluster Enhancement (width^e x height^h)
• Smith and Nichols 2009: integrate the cluster mass at multiple thresholds; used for fMRI/TBSS
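As a rough sketch of the idea, the TFCE score at a point p can be written as the sum over thresholds h of extent(h)^E x h^H x dh, where extent(h) is the size of the cluster containing p at threshold h. The 1D version below is a minimal illustration for non-negative statistics only. The function name and the step size dh are assumptions; E = 0.5 and H = 2 are the defaults usually attributed to Smith and Nichols 2009.

```python
def tfce_1d(stat, E=0.5, H=2.0, dh=0.1):
    """Threshold-free cluster enhancement of a 1D list of non-negative statistics."""
    out = [0.0] * len(stat)
    n_steps = int(max(stat) / dh) if stat else 0
    for step in range(1, n_steps + 1):
        h = step * dh                       # current threshold (cluster height)
        i = 0
        while i < len(stat):
            if stat[i] >= h:
                j = i
                while j < len(stat) and stat[j] >= h:
                    j += 1                  # extend the supra-threshold run
                extent = j - i              # cluster width at this threshold
                for k in range(i, j):
                    out[k] += extent ** E * h ** H * dh
                i = j
            else:
                i += 1
    return out

# A wide cluster and an isolated spike of the same height
stat = [0, 0, 2, 2, 2, 0, 0, 2, 0]
enhanced = tfce_1d(stat)
print(enhanced[2], enhanced[7])
```

Points in the wide cluster receive a larger score than the isolated spike even though the raw statistic is identical, which is the intended effect of combining width and height.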
Example 2: MCC for N dimensions
• Threshold-Free Cluster Enhancement:
• Pernet et al 2014: validation for electrophysiology to optimize parameter selection
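One way to think about TFCE in N dimensions is that only the definition of "neighbour" changes: clusters are connected components over whatever adjacency the data has (time, frequency, channels). The sketch below runs TFCE over an arbitrary neighbour graph. It is an illustration under assumptions, not the validated implementation: the function name, the dictionary-based graph and the toy channel x time grid are hypothetical, and it handles non-negative statistics only.

```python
from collections import deque

def tfce_graph(stat, neighbours, E=0.5, H=2.0, dh=0.1):
    """stat: dict node -> non-negative statistic.
    neighbours: dict node -> iterable of adjacent nodes (any dimensionality)."""
    out = {node: 0.0 for node in stat}
    peak = max(stat.values(), default=0.0)
    step = 1
    while step * dh <= peak:
        h = step * dh
        seen = set()
        for start in stat:
            if start in seen or stat[start] < h:
                continue
            # Breadth-first search collects one supra-threshold cluster
            cluster, queue = [], deque([start])
            seen.add(start)
            while queue:
                node = queue.popleft()
                cluster.append(node)
                for nb in neighbours.get(node, ()):
                    if nb in stat and nb not in seen and stat[nb] >= h:
                        seen.add(nb)
                        queue.append(nb)
            bump = len(cluster) ** E * h ** H * dh
            for node in cluster:
                out[node] += bump
        step += 1
    return out

# Toy channels x time grid: a 2 x 2 block and one isolated node, same height
n_chan, n_time = 3, 5
stat = {(c, t): 0.0 for c in range(n_chan) for t in range(n_time)}
for node in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    stat[node] = 2.0
stat[(2, 4)] = 2.0
neighbours = {
    (c, t): [(c + dc, t + dt) for dc, dt in ((1, 0), (-1, 0), (0, 1), (0, -1))]
    for (c, t) in stat
}
enhanced = tfce_graph(stat, neighbours)
print(enhanced[(0, 0)], enhanced[(2, 4)])
```

For real MEEG data the channel adjacency would come from the sensor layout rather than a regular grid; the grid here only keeps the example short.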