
New Challenges in Bioinformatics: Integrative Analysis of Omics Data

Alex Sánchez

1 Statistics and Bioinformatics Research Group, Statistics Department, Universitat de Barcelona

2 Statistics and Bioinformatics Unit, Vall d’Hebron Institut de Recerca

Outline

• Introduction: omics, data integration, integrative analysis
• Integrative analysis: challenges and methods
• Some (prototypical) examples
  – Multivariate statistical approach to integrative analysis
  – Building better predictors from diverse data sources
  – Gene sets and their application to integrative analysis
  – Network methods for visualization and data integration
• Where to now?

Who, where, what?

Omics data

[Figure: examples of omics data (1H NMR metabolites on a ppm axis, Affy transcriptome, LC-MS proteomics) and “non-omic” markers (adiponectin in ug/ml, change from baseline; groups db/+ and db/db; treatments Veh, Met30, Met75; Normal vs. Disease)]

Experimental platforms generate diverse omics and non-omics data

NGS sequences

Genomics

• Uses sequencing technologies to study genomes and intragenomic phenomena.

• Data: DNA sequences

Transcriptomics

• The transcriptome is the set of all RNA molecules in one cell or a population of cells.
• Transcriptomics examines the expression levels of mRNAs in a given cell population.
• Technologies
  – Microarrays
  – Next Generation Sequencing

Proteomics

• The large-scale study of proteins (the proteome)
  – (3D) structures and
  – functions.
• Spectrum of techniques
  – 2D gel based
  – Mass Spectrometry (MS)
  – SELDI-TOF (MS)
  – Protein arrays
  – …

Metabolomics

• Comprehensive and simultaneous systematic determination of
  – metabolite levels in the metabolome and
  – their changes over time as a consequence of stimuli.
• Relies on
  – Separation techniques: GC, CE, HPLC, UPLC
  – Detection techniques: NMR, MS


Altogether: The central dogma and the omics cascade

Why would we want to integrate data?

Why should we integrate data?

What we learn from an experiment may depend on where we look, how we look, and the scope of our view!

The Blind Men and the Elephant

http://www.noogenesis.com/pineapple/blind_men_elephant.html

Focusing on one platform risks missing an obvious signal!!!

From componentwise to global approaches

It is expected that the integrated collection and analysis of diverse types of data, jointly modelled and analyzed in a systems biology approach, can shed light on the global functioning of biological systems.

Ultimate Goal: understanding of complex processes

Integrative Analysis & Data integration: methods, types, challenges

Data Integration is cool

• Everywhere nowadays in Biology, Medicine, Bioinformatics, …
• Meetings
  – Barcelona (Feb. 2013), Leiden (Apr. 2013), Ascona (May 2013)
• Financing (FP7): projects with > 10^6 € each
  – Stategra
  – MimOmics
• Try googling the terms 'omics data integration'

But what is Data Integration?

◦ “Data integration” may mean different things...
  – Computational combination of data
  – Combination of studies performed independently
  – Simultaneous analysis of multiple variables on multiple datasets.
  – Not to mention any possible approach for homogeneously querying heterogeneous data sources

Integrative analysis may be preferable

There are many types of integrative analysis

Hamid et al. 2009

There are many methods…

• Decision trees, Bayesian networks, Support vector machines, Graph algorithms, Multivariate analysis, …

There are many issues to be addressed

• Data pre-processing
  – Data of the same or different types
• High (but "cursed") dimensionality
  – N << p
  – Datasets of different sizes (10^4 genes, 10^3 proteins)
  – Multiple testing issues
• Missing values
  – Some values missing for some individuals
  – Non-rectangularity of the data
• Biological interpretation
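To make the multiple testing issue concrete, here is a minimal sketch of one common remedy, the Benjamini-Hochberg false discovery rate procedure. The slides do not name a specific correction, so the choice of procedure, the function name and the toy p-values are assumptions made for illustration only.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return one boolean per test: True if rejected at FDR level alpha."""
    m = len(p_values)
    # Indices of the tests, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Largest rank k (1-based) such that p_(k) <= (k / m) * alpha.
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_k = rank
    # Reject every hypothesis whose rank is at most max_k.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_k:
            rejected[idx] = True
    return rejected

# Toy p-values (hypothetical): five of them are below 0.05 ...
p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
flags = benjamini_hochberg(p, alpha=0.05)
naive = [value < 0.05 for value in p]
# ... but fewer survive once the number of tests is taken into account.
print(sum(naive), sum(flags))
```

With thousands of features and few samples, comparing each p-value with 0.05 on its own would let many false positives through; the sketch shows how the rejection threshold tightens as the number of tests grows.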

So what?

• We will restrict ourselves to arbitrarily chosen examples that provide an overview of the field without pretending to cover it all.
• Case studies
  – Combining biological knowledge with omics data using multivariate statistics.
  – How to obtain improved cancer predictors by aggregating datasets.
  – Using network biology methods for translational cancer research.

Some examples

Integrative Analysis of the Relationship Between Insulin Resistance and Gut Microbiota

Insulin Resistance

Insulin resistance means that cells become less sensitive to insulin.

This causes the pancreas to over-compensate by working harder and releasing even more insulin.

Insulin resistance combined with insulin over-production leads to two common outcomes: diabetes or obesity.

IS/IR and Gut Microbiota

• The human gut microbiome is related to health and weight
  ◦ it varies among healthy people
  ◦ it varies between lean and obese people
• It is reasonable to postulate that insulin sensitivity is associated with changes in the bacterial microflora.

Data for relating IR/IS with Microbiome

• Clinical variables (BMI, HOMA, Ins, HDL, …)
• Microarrays
  – Expression matrix and related annotations (GO)
• Microbial flora diversity based on
  – Denaturing Gradient Gel Electrophoresis (DGGE)
  – Metagenomic shotgun NGS sequencing

Data layout (samples in rows, one block of columns per data source):

Columns: Clin1 … ClinK1 | DGGE1 … DGGEK2 | Expr1 … ExprK3 | GeneSet1 … GeneSetK4 | Spec1 … SpecK5
Rows: IS_NoD_10, IS_NoD_11, IS_NoD_12, IR_NoD_13, IR_NoD_14, IR_NoD_15, Diab_16, Diab_17

Principal Components Analysis

• Given a K×N data matrix containing K (correlated) measurements on N samples (objects/individuals…)
• PCA decomposes the data matrix into K new components that
  – account for different sources of variability in the data,
  – are uncorrelated, that is, each component accounts for a different source of variability,
  – have decreasing explanatory ability: each component explains more than the following one,
  – allow for a lower-dimensional representation of the data in terms of scores on the principal components.
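The properties listed above can be checked numerically. The following is a minimal sketch, assuming NumPy is available, that computes PCA through a singular value decomposition of the centered data. The toy data are simulated, and samples are stored in rows (N×K) rather than in the K×N layout used on the slide, which is only a convention.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: N = 50 samples (rows), K = 4 correlated variables (columns),
# generated from 2 hidden factors plus a little noise.
latent = rng.normal(size=(50, 2))
mixing = rng.normal(size=(2, 4))
X = latent @ mixing + 0.1 * rng.normal(size=(50, 4))

# Center each variable, then decompose.
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

scores = U * s                              # sample coordinates on the PCs
explained_var = s**2 / (X.shape[0] - 1)     # variance carried by each PC

# Covariance of the scores: diagonal, with decreasing entries.
score_cov = np.cov(scores, rowvar=False)
print(np.round(score_cov, 3))
```

The off-diagonal entries of `score_cov` are zero up to rounding (uncorrelated components) and the diagonal decreases from the first component to the last.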

How does PCA work

• PCA provides a new set of coordinates for the observations
• Original coordinates
  – Values of the variables
• New coordinates
  – Values of the PCs: scores
• Scores are the new coordinates in the orthogonal system defined by the PCs.

Representing data in the PCA space

• PCs have been derived so that
  – They are orthogonal
  – Each PC explains the maximum amount of remaining variation in the data
• This means that it is not necessary to use all PCs to visualize the data in this new coordinate system
  – The first few PCs will often explain a high percentage of the variability.
  – Usually only the first 2 or 3
  – This should always be checked!!!
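One way to perform the check asked for above is to look at the cumulative proportion of variance explained. This is a small sketch assuming NumPy; the 90% cut-off, the function name and the simulated data are illustrative choices, not part of the original analysis.

```python
import numpy as np

def explained_variance_ratio(X):
    """Proportion of total variance carried by each principal component."""
    Xc = X - X.mean(axis=0)
    s = np.linalg.svd(Xc, compute_uv=False)
    var = s**2
    return var / var.sum()

rng = np.random.default_rng(1)
# Toy data: 40 samples, 6 variables driven by 2 hidden factors plus noise.
latent = rng.normal(size=(40, 2))
X = latent @ rng.normal(size=(2, 6)) + 0.05 * rng.normal(size=(40, 6))

ratio = explained_variance_ratio(X)
cumulative = np.cumsum(ratio)

# Smallest number of PCs whose cumulative share reaches 90% (assumed cut-off).
n_keep = int(np.searchsorted(cumulative, 0.90) + 1)
print(np.round(cumulative, 3), n_keep)
```

If `n_keep` turns out to be much larger than 2 or 3, a two-dimensional PCA plot hides a substantial part of the variability and should be interpreted with care.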

Multiple Factor Analysis (MFA)

MFA is a multivariate statistical technique useful for analyzing several groups of variables (numerical and/or categorical) defined on the same samples.

Multiple Factor Analysis (2)

• The core of MFA is a PCA applied to the whole set of variables.
• Each group of variables is weighted, which makes it possible to analyze the different points of view while taking them equally into account.
• MFA makes it possible to look for common factors by providing a representation of each matrix of variables.
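The weighting idea can be sketched in a few lines. The slides do not state which weights are used; the sketch below assumes the usual MFA convention of dividing each standardized group by its first singular value, so that no group dominates the global PCA just because it has more variables. It handles numerical groups only, assumes NumPy, and uses simulated data; `mfa_scores` is a hypothetical helper, not a library function.

```python
import numpy as np

def mfa_scores(groups, n_components=2):
    """groups: list of (n_samples x k_j) numerical arrays on the same samples."""
    weighted = []
    for G in groups:
        Gc = G - G.mean(axis=0)
        Gc = Gc / Gc.std(axis=0, ddof=1)         # standardize each variable
        s1 = np.linalg.svd(Gc, compute_uv=False)[0]
        weighted.append(Gc / s1)                  # first singular value becomes 1
    Z = np.hstack(weighted)                       # global table
    U, s, _ = np.linalg.svd(Z, full_matrices=False)
    return U[:, :n_components] * s[:n_components], weighted

rng = np.random.default_rng(2)
n = 12                                            # samples shared by all groups
clinical = rng.normal(size=(n, 5))                # small group (toy)
expression = rng.normal(size=(n, 200))            # large group (toy)

scores, weighted = mfa_scores([clinical, expression])
top_sv = [np.linalg.svd(W, compute_uv=False)[0] for W in weighted]
print(scores.shape, np.round(top_sv, 6))
```

After weighting, the leading direction of every group has the same inertia, so a 200-variable expression block and a 5-variable clinical block enter the global analysis on an equal footing.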

MFA (3): Multiple displays

MFA (4): Supplementary info

The assets of MFA appear when integrating both numerical and categorical groups of variables, and when supplementary groups of data need to be added to the analysis.

Conclusions

• The good
  ◦ MFA allows the integrated analysis of multiple groups of possibly heterogeneous data types.
  ◦ It can help to highlight previously undetected associations (“adds value”).
  ◦ It can deal with any number of groups and any type of supplementary variables (Gene Sets, Species, …)
• Limitations
  ◦ It assumes individual-based information: no groups (e.g. pools) as input
  ◦ Missing values are difficult to deal with

Complementary idea 1: Improve use of biological knowledge

• The ultimate goal is a better understanding of (changes in) biological processes.
• It seems reasonable to make (increased) use of biological information.
• This can be done in different ways
  – Convert data into networks and align them
  – Project biological units into a common space and rely on commonalities and differences for variable selection

Previous results

goProfiles

Variable selection based on Biological Knowledge

• Preliminary work on functional profiling can be used to project biological units such as genes or proteins into annotation databases such as the Gene Ontology
• An iterative algorithm can be used to select subsets that are either
  – most biologically diverse
  – most biologically homogeneous
• This can be used as a basis for variable selection prior to MFA

Integrative Omics Data Mining and Knowledge Discovery in Colorectal Cancer

Based on a work by Jake Y. Chen, Ph.D., Indiana Center for Systems Biology & Personalized Medicine

Polyp and Colorectal Cancer

Polyp vs. Colorectal Cancer
• Polyps are benign tumors of the large intestine.
• They do not invade nearby tissue or spread to other parts of the body.
• If not removed from the large intestine, they may become malignant (cancerous) over time.
• Most cancers of the large intestine are believed to have developed from polyps.
Photo courtesy of National Cancer Institute

Colon Cancer vs. Rectal Cancer
• Share many commonalities, including molecular mechanisms.
• Tend to be treated differently.

Omics/Clinical Data Source
Proteomics/Metabolomics/Lipidomics/Clinical Data

• Oxidative Stress
• LC-MS Proteomics
• Vitamin D
• GCxGC MS Metabolomics
• Lipidomics
• NMR Metabolomics

Scientific Questions to Answer

Data Analysis
• Which Omics data has the best prediction power?
• Which features in Omics data are important?

Data Mining
• Does integration of Omics data improve the prediction?
• Which combination of Omics data has the best prediction power?

Knowledge Discovery
• Why do those features in Omics data have the best prediction power?

Roadmap
• Knowledge Discovery of Proteomics Data
• Knowledge Discovery of Metabolomics Data
• Integrative Data Mining

Proteomics Data Description

Group: Bindley Biosciences Center at Purdue University

Instruments: Agilent's chip cube coupled to the XCT PLUS ESI ion trap

Data format at CCE webportal: mzXML

Number of Samples: Normal: 80; Polyp: 72; Colorectal: 40

LC-MS Proteomics Data Processing

LC/MS data “heat map”

Total Ion Chromatogram (TIC) summarized from enhanced heat map

Methods adapted from
• N. Jeffries (2005) Bioinformatics, vol. 21, no. 14, pp. 3066.
• S.A. Kazmi, et al. (2006) Metabolomics, vol. 2, no. 2, pp. 75-83.

Image Enhanced LC/MS data “heat map”

LC-MS Major Protein Identification
~25-28 characteristic proteins identified per sample

• Identify the most informative TIC R.T. “Grid”
• Apply the R.T. Grid to the original spectra
• Use Mascot to search for protein IDs at the R.T. Grid regions

No   Scan   RT       Uniprot_ID    Score  Expect  Evidence
1    119    139.48   ADAD2_HUMAN   38     3.3     0
2    229    265.87   NNMT_HUMAN    43     1.1     2
3    372    429.15   ZSA5D_HUMAN   42     1.2     0
4    656    749.8    BRAF_HUMAN    40     2.2     479
5    1162   1276.6   RGS7_HUMAN    47     0.39    1
6    1310   1407.2   TTC9C_HUMAN   35     6.3     0
7    1669   1713.9   CP042_HUMAN   38     3.1     0
8    1866   1879.1   HXD11_HUMAN   34     8.4     0
9    1987   1980.3   ING4_HUMAN    38     3.1     2
10   2114   2086     ZN423_HUMAN   33     10      0
11   2353   2285.7   CL065_HUMAN   37     3.9     0
12   2539   2441.3   CA5BL_HUMAN   47     0.4     1
13   2722   2594.7   NPDC1_HUMAN   38     3.6     0
14   2874   2722.2   DJC27_HUMAN   37     3.8     0
15   3001   2828.5   BORG4_HUMAN   40     2.2     1
16   3165   2965.1   KC1G1_HUMAN   27     43      0
17   3440   3196.1   TPPC5_HUMAN   40     2       0
18   3656   3377.6   UB2D3_HUMAN   43     0.99    1
19   3997   3665.5   TM208_HUMAN   34     8.1     0
20   4257   3885.4   ZBED3_HUMAN   29     23      0

Proteomics Result Interpretation

Proteins Identified from Colon Cancer and Health Group

Uniprot_ID     Frequency in Colon   Frequency in Health (10)   Evidence in PubMed
BRAF_HUMAN     3                    0                          508
DMP46_HUMAN    3                    0                          0
NNMT_HUMAN     3                    1                          4
MRP_HUMAN      1                    3                          0
STK33_HUMAN    0                    3                          0

Proteins Interacted with High-Frequency Proteins from Colon Cancer Group

Uniprot_ID      Gene    Protein Name                                          Evidence in PubMed
BRAF1_HUMAN     BRAF    Serine/threonine-protein kinase B-raf                 508
P53_HUMAN       TP53    Cellular tumor antigen p53                            443
CD44_HUMAN      CD44    CD44 antigen                                          411
MDM2_HUMAN      MDM2    E3 ubiquitin-protein ligase Mdm2                      131
BCR_HUMAN       BCR     Breakpoint cluster region protein                     59
LCK_HUMAN       LCK     Tyrosine-protein kinase Lck                           29
Q7RTZ3_HUMAN    LCK     Tyrosine-protein kinase Lck                           29
CAV1_HUMAN      CAV1    Caveolin-1                                            21
PNPH_HUMAN      PNP     Purine nucleoside phosphorylase                       13
CBL_HUMAN       CBL     E3 ubiquitin-protein ligase CBL                       11
RAF1_HUMAN      RAF1    RAF proto-oncogene serine/threonine-protein kinase    10
CD38_HUMAN      CD38    ADP-ribosyl cyclase 1                                 8
NNMT_HUMAN      NNMT    Nicotinamide N-methyltransferase                      4
IRAK1_HUMAN     IRAK1   Interleukin-1 receptor-associated kinase 1            3
DMPK_HUMAN      DMPK    Myotonin-protein kinase                               2
ITA5_HUMAN      ITGA5   Integrin alpha-5                                      1
ITB1_HUMAN      ITGB1   Integrin beta-1                                       1
ZAP70_HUMAN     ZAP70   Tyrosine-protein kinase ZAP-70                        1

Proteomics Result Interpretation: A Network Biology Context

Protein Network Constructed from the Top 3 Differential Proteins

Green-circled proteins are frequently (>=0.3) detected in the colon patient blood samples by using LC/MS. Node: Protein with evidence from PubMed by searching ("GENE_SYMBOL" AND ("colon" OR "colorectal") AND ("cancer" OR "carcinoma")), Edge: Protein interaction with confidence score from HAPPI 1.31 (4&5-Star)

Proteomics Result Interpretation: A Biological Pathway Context

BRAF (Serine/threonine-protein kinase B-raf) plays major roles in Colorectal Cancer Pathway (KEGG data)

NNMT (Nicotinamide N-methyltransferase) is involved in Biological Oxidations/Phase II Conjugation/Methylation (from Reactome)

Proteomics Result Interpretation: A Biological Pathway Context for NNMT

Roadmap
• Knowledge Discovery of Proteomics Data
• Knowledge Discovery of Metabolomics Data
  – NMR Data
  – GCxGC MS Data
• Integrative Data Mining

Metabolomics Data Description

Group: Daniel Raftery Laboratory at Purdue University

NMR Data
• Instruments: Bruker Avance 500 MHz NMR
• Data format at CCE webportal: Excel spreadsheet
• Number of Samples: Normal: 53; Polyp: 35; Colorectal: 15

GCxGC MS Data
• Instruments: LECO Pegasus 4D GCxGC-TOF
• Data format at CCE webportal: Excel spreadsheet
• Number of Samples: Normal: 83; Polyp: 84; Colorectal: 30

NMR Data Analysis Workflow

• Extract the peaks’ ppm values
• Search against the Human Metabolome Database (2.5) to identify metabolites
• Report only significant metabolites

Sample_ID   1                              2
Top1        Delta-Hexanolactone            Delta-Hexanolactone
Top2        Hypotaurine                    Hypotaurine
Top3        2,3-Diphosphoglyceric acid     Diethanolamine
Top4        Diethanolamine                 3,7-Dimethyluric acid
Top5        3-Phosphoglyceric acid         Methyl isobutyl ketone
Top6        3,7-Dimethyluric acid          1,3,7-Trimethyluric acid
Top7        1,3,7-Trimethyluric acid       Cysteine-S-sulfate
Top8        L-Allothreonine                L-Allothreonine
Top9
Top10

Signal Processing

NMR Peak Metabolite Identification using the Human Metabolome Database

1) Input the peak lists

2) Get the metabolites; leave out those with fewer than 2 matches
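The two steps above can be sketched as a tolerance-based lookup. This is only an illustration: the reference table, the metabolite names, the 0.02 ppm tolerance and the function name are hypothetical, and the real search was run against the Human Metabolome Database rather than a local dictionary.

```python
# Hypothetical reference table: metabolite -> reference peak positions (ppm).
REFERENCE = {
    "metabolite_A": [1.20, 3.45, 4.10],
    "metabolite_B": [2.05, 3.46],
    "metabolite_C": [7.30],
}

def match_metabolites(peaks_ppm, reference, tol=0.02, min_matches=2):
    """Count, per metabolite, the reference peaks found in the sample peak list
    (within tol ppm) and keep metabolites with at least min_matches matches."""
    hits = {}
    for name, ref_peaks in reference.items():
        n_matched = sum(
            any(abs(peak - ref) <= tol for peak in peaks_ppm)
            for ref in ref_peaks
        )
        if n_matched >= min_matches:
            hits[name] = n_matched
    return hits

# 1) Input the peak list of one sample (toy values).
sample_peaks = [1.21, 2.04, 3.45, 7.31]
# 2) Get the metabolites, leaving out those with fewer than 2 matches.
hits = match_metabolites(sample_peaks, REFERENCE)
print(hits)
```

Requiring at least two matching peaks discards `metabolite_C`, which is supported by a single peak only; this mirrors the filter described on the slide.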

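Significant Metabolites Identified from NMR Metabolomics Data

Group                   Metabolites
Polyp vs Health         D-Arabitol, D-Pantethine (2/35 vs 0/53)
Colorectal vs Polyp     None
Colorectal vs Health    D-Arabitol (2/15 vs 0/53)

Population frequency = (number of samples in a group in which the metabolite is identified) / (number of samples in that group), e.g. 2/35 for the Polyp group.

[Figure labels: Marker metabolites? Shared metabolites]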

D-Arabitol Identified from NMR Results
Involved in Pentose and Glucuronate Interconversions Pathways

Roadmap
• Knowledge Discovery of Proteomics Data
• Knowledge Discovery of Metabolomics Data
  – NMR Data
  – GCxGC MS Data
• Integrative Data Mining

Results from GCxGC MS Data I
Metabolite identification is more straightforward

Polyp vs Healthy Colorectal vs Polyp Colorectal vs Healthy

Metabolites Metabolites Metabolites

Methanesulfinic acid, trimethylsilyl ester Acetic acid, (methoxyimino)-, trimethylsilyl ester Butanoic acid, 2-[(trimethylsilyl)oxy]-, trimethylsilyl ester

Propanoic acid, 2-(methoxyimino)-, trimethylsilyl ester

Pentanoic acid, 2-(methoxyimino)-3-methyl-, trimethylsilyl ester

L-Valine, N-(trimethylsilyl)-, trimethylsilyl ester

Hexanedioic acid, bis(2-ethylhexyl) ester Methanesulfinic acid, trimethylsilyl ester Cholesterol trimethylsilyl ether

Mefloquine Pentanedioic acid, 2-(methoxyimino)-, bis(trimethylsilyl) ester

Hexanoic acid, trimethylsilyl ester

Cyclohexane, 1,3,5-trimethyl-2-octadecyl- L-Valine, N-(trimethylsilyl)-, trimethylsilyl ester Pentanoic acid, 2-(methoxyimino)-3-methyl-, trimethylsilyl ester

Tetradecanoic acid, trimethylsilyl ester Butanoic acid, 2-[(trimethylsilyl)oxy]-, trimethylsilyl ester

Hexanoic acid, 2-(methoxyimino)-, trimethylsilyl ester

psi,psi.-Carotene, 3,3',4,4'-tetradehydro-1,1',2,2'-tetrahydro-1,1'-dimethoxy-2,2'-dioxo-

Cyclohexane, 1,3,5-trimethyl-2-octadecyl- 3,6-Dioxa-2,7-disilaoctane, 2,2,4,7,7-pentamethyl-

Silanol, trimethyl-, pyrophosphate (4:1) Butanoic acid, 2-(methoxyimino)-3-methyl-, trimethylsilyl ester

Trimethylsilyl ether of glycerol L-Asparagine, N,N2-bis(trimethylsilyl)-, trimethylsilyl ester

Ethylbis(trimethylsilyl)amine

Cyclotrisiloxane, 2,4,6-trimethyl-2,4,6-triphenyl-

Benzene, (1-hexadecylheptadecyl)-

Pentanedioic acid, 2-(methoxyimino)-, bis(trimethylsilyl) ester

Results from GCxGC MS Data II

[Figure panels: A. Polyp vs Healthy; B. Polyp vs Colorectal; C. Colorectal vs Healthy]

Comparative Results (Intensity vs. Population)
Marker Metabolite Panel Clustering of three groups

• Intensity-based heat map
• Population-frequency-based heat map

Metabolites identified from GCxGC MS Results
Involved in Fatty Acid Biosynthesis Pathways

Roadmap
• Knowledge Discovery of Proteomics Data
• Knowledge Discovery of Metabolomics Data
• Integrative Data Mining

Data Set Description
Diet, Lipidomics, Oxidative and VD

• The number of features and the total number of subjects vary
• Three classes are balanced to the least common denominator
  – Healthy vs. Polyp
  – Healthy vs. Colorectal
  – Polyp vs. Colorectal

                  Diet   Lipid   Oxidative   VD
Total Subjects    150    97      94          195
Total Features    38     49      3           2

Predictive Modeling Methods

• Data Preprocessing
  – Filtering outliers (three standard deviations away from the mean)
  – Data normalization (transforming to the 0-1 range)
  – Binning of categorical data using the quantile binning method
• Missing Value Treatment
  – Replaced with the mean value of the attribute in the group
• Support vector machine (SVM) classifier kernel
  – A Radial Basis Function (RBF) kernel is used
• Feature Selection Methods
  – Approach #1: Two-sample unpaired t-tests at the 5% significance level.
  – Approach #2: SVM Attribute Evaluator with Ranker Algorithm.
  – Features from t-tests are filtered using p-values
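The preprocessing and missing value steps listed above can be sketched for a single feature as follows. This uses only the Python standard library; the order of the steps, the helper names and the toy values are assumptions, and the quantile binning, t-test and SVM steps are not reproduced here.

```python
from statistics import mean, stdev

def drop_outliers(values, groups, n_sd=3.0):
    """Remove samples lying more than n_sd standard deviations from the mean.
    Missing values (None) are ignored when computing mean and SD and are kept."""
    observed = [v for v in values if v is not None]
    m, s = mean(observed), stdev(observed)
    keep = [v is None or abs(v - m) <= n_sd * s for v in values]
    return ([v for v, k in zip(values, keep) if k],
            [g for g, k in zip(groups, keep) if k])

def impute_group_mean(values, groups):
    """Replace None with the mean of the observed values of the same group."""
    out = list(values)
    for g in set(groups):
        obs = [v for v, gg in zip(values, groups) if gg == g and v is not None]
        for i, (v, gg) in enumerate(zip(values, groups)):
            if gg == g and v is None:
                out[i] = mean(obs)
    return out

def min_max(values):
    """Rescale values linearly to the 0-1 range."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

# Toy feature (hypothetical): one missing value and one extreme value.
values = [1.0, 1.2, 0.8, 1.1, 0.9, 1.0, 1.2, 0.8,
          1.1, 0.9, None, 1.0, 1.2, 0.8, 1.0, 12.0]
groups = ["healthy"] * 8 + ["polyp"] * 8

clean, clean_groups = drop_outliers(values, groups)
imputed = impute_group_mean(clean, clean_groups)
scaled = min_max(imputed)
print(len(scaled), min(scaled), max(scaled))
```

Filtering before imputing keeps the extreme value from inflating the group mean that is used to fill the gap; a different order would be equally easy to code but would give a different imputed value.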

[Workflow figure; labels: Raw Dataset, Clean Dataset, Classification Model, K-fold Cross-validation, Hypothesis]
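K-fold cross-validation, the evaluation step in this workflow, can be sketched without any external library. To keep the example self-contained, a nearest-centroid rule on one feature stands in for the SVM used in the study; the classifier, the data and all function names are illustrative assumptions.

```python
import random

def k_fold_indices(n, k, seed=0):
    """Yield (train, test) index lists; every sample is tested exactly once."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, folds[i]

def nearest_centroid_predict(train_x, train_y, x):
    """Stand-in classifier (not an SVM): assign x to the closest class mean."""
    centroids = {}
    for label in set(train_y):
        vals = [v for v, y in zip(train_x, train_y) if y == label]
        centroids[label] = sum(vals) / len(vals)
    return min(centroids, key=lambda label: abs(x - centroids[label]))

def cross_val_accuracy(xs, ys, k):
    correct = 0
    for train, test in k_fold_indices(len(xs), k):
        train_x = [xs[i] for i in train]
        train_y = [ys[i] for i in train]
        for i in test:
            correct += nearest_centroid_predict(train_x, train_y, xs[i]) == ys[i]
    return correct / len(xs)

# Toy, well separated data (hypothetical normalized feature values).
xs = [0.10, 0.20, 0.15, 0.30, 0.25, 0.12, 0.80, 0.90, 0.70, 0.85, 0.75, 0.88]
ys = ["healthy"] * 6 + ["polyp"] * 6
accuracy = cross_val_accuracy(xs, ys, k=3)
print(accuracy)
```

Because each sample is predicted by a model that never saw it during training, the reported accuracy estimates performance on new subjects rather than fit to the training data.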

Dietary Attributes as Predictors

Polyp vs. Healthy: SVM Predictor Accuracy = 64%
Colorectal vs. Healthy: SVM Predictor Accuracy = 65%

[Figure: p-values of dietary attributes (Ice cream, Shellfish, Tomato) in each comparison; values shown: 2.38E-02, 4.21E-01, 4.11E-02, 1.21E-01, 2.53E-02, 9.57E-01, 3.71E-02, 5.60E-02]

Lipidomics T-Tests Results
Significant features selected from t-tests with their corresponding p-values

Features Polyp vs. Healthy Polyp vs. Colorectal Colorectal vs. Healthy

16:0/18:1 PE 1.76E-02

24:1 Cer 6.90E-03

LPE 18:1 <1.00E-04

LPE 20:0 1.50E-03 2.00E-04

An-16:0 LPA 3.23E-02

An-18:1 LPA 3.38E-02 1.33E-02

AA 1.13E-02

18:2 LPA 1.13E-02 4.50E-03

20:4 LPA 2.40E-02

22:6 FA 4.28E-02 3.24E-02

LPE 16:0 3.08E-02 3.40E-03

LPE 18:0 3.90E-03 1.00E-04

LPE 18:1 2.18E-02

Integrating lipidomics with clinical features
Performance comparisons

Comparison                 Accuracy                  Accuracy                       Accuracy
                           (without pre-selection)   (with t-test pre-selection)    (automatic selection)
Polyp vs. Healthy          0.54                      0.71                           0.78
Colorectal vs. Healthy*    0.57                      0.63                           0.73
Polyp vs. Colorectal*      0.70                      0.90                           0.87

* Since the number of subjects was less than 15, 3-fold cross-validation accuracy was reported.

[Figure: accuracy for Polyp vs. Healthy, Colorectal vs. Healthy* and Polyp vs. Colorectal*, without and with clinical features]

Messages

Individual Omics data sets have variable predictive performance

Thorough statistical filtering plus integration of biological knowledge is needed to battle the inherently high level of data noise

Integration of different Omics data with clinical data can improve predictive performance

Network methods and data integration

Network methods

• [Obvious comment]: Networks are everywhere, from social networks such as Facebook to terrorist threats or biochemical processes

• Network science is a (re-)emerging approach that relies on different approaches to modeling systems of interacting elements to describe, model and predict the behavior of diverse systems.

Biological systems modelling

Building and using networks

– Networks can be created from collecting interactions published in papers, or can be reconstructed directly from data.

– Different types of biological intracellular molecular networks can be represented by different types of graphs.

– Protein interaction networks and cell signaling networks can be connected to drugs and diseases

– Network representation can be used to integrate different datasets using genes as anchors
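Using genes as anchors amounts to keying every dataset by gene identifier and merging on that key. The sketch below shows the idea with plain dictionaries; the source names, the expression values and the drug label are made up for illustration, and `integrate_by_gene` is a hypothetical helper.

```python
from collections import defaultdict

def integrate_by_gene(*datasets):
    """Merge several (source_name, {gene: value}) tables into one node table
    keyed by gene, so that each gene carries whatever each source knows."""
    merged = defaultdict(dict)
    for source, table in datasets:
        for gene, value in table.items():
            merged[gene][source] = value
    return dict(merged)

# Toy inputs (hypothetical values), all keyed by gene symbol.
ppi = ("ppi_partners", {"BRAF": ["RAF1"], "TP53": ["MDM2"]})
expression = ("log_fold_change", {"BRAF": 1.8, "NNMT": 2.3})
drugs = ("drugs", {"BRAF": ["drug_X"]})

nodes = integrate_by_gene(ppi, expression, drugs)
# Genes supported by at least two independent sources.
shared = sorted(g for g, info in nodes.items() if len(info) >= 2)
print(shared, nodes["BRAF"])
```

Once every source hangs off the same gene node, interaction partners, expression changes and drug links can be read together, which is the sense in which the network acts as an integration layer.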

Network biology methods integrating biological data for translational science

Bebek G et al. Brief Bioinform 2012;13:446-459

An integrative -omics signaling network identification process workflow

• Start with processing tissue-specific data (instrument outputs)
  – Microarray data is normalized to make comparisons of expression levels and transformed to select genes for further analysis.
  – Genome-wide genotyping signals are analyzed to identify regions (and hence regional genes) for both tumor and normal tissue (or non-cancerous cells).
• Next, genomic regions with significant aberrations are merged with their corresponding microarray probes to create expression profiles.
• In this analysis step, expression profiles are used to calculate Pearson's coexpression correlations among gene pairs.
• These results are fed into the Pathway Analysis Framework.
  – Integrating gene–gene coexpression values, annotations from GO, known signaling pathways, protein sequence information, PPI networks and protein subcellular co-localization data, pathways are predicted and filtered.
• Significant pathway subnetworks are merged to form signaling networks connecting genes of interest.
• The networks and genomic alterations identified are put together to create a descriptive functional network, creating a molecular basis for the cancer studied.
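The coexpression step of this workflow, Pearson correlations among gene pairs, can be sketched with the standard library alone. The gene names, the expression profiles and the 0.8 cut-off for keeping an edge are assumptions made for the example; the published workflow may filter pairs differently.

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equally long profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def coexpression_edges(profiles, threshold=0.8):
    """Return {(gene1, gene2): r} for pairs with |r| >= threshold."""
    edges = {}
    for g1, g2 in combinations(sorted(profiles), 2):
        r = pearson(profiles[g1], profiles[g2])
        if abs(r) >= threshold:
            edges[(g1, g2)] = r
    return edges

# Toy expression profiles across five samples (hypothetical values).
profiles = {
    "GENE_A": [1.0, 2.0, 3.0, 4.0, 5.0],
    "GENE_B": [2.1, 3.9, 6.2, 8.0, 9.9],   # rises with GENE_A
    "GENE_C": [5.0, 4.1, 2.9, 2.0, 1.1],   # falls as GENE_A rises
    "GENE_D": [3.0, 1.0, 4.0, 1.0, 5.0],   # only weakly related to the others
}
edges = coexpression_edges(profiles)
for pair, r in edges.items():
    print(pair, round(r, 3))
```

The surviving pairs form the edges of a coexpression network; both strongly positive and strongly negative correlations are kept, since either can indicate a functional relationship.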

Network-based prioritization of candidate disease genes.

Bebek G et al. Brief Bioinform 2012;bib.bbr075

Conclusions

• Data integration or, better, integrative analysis of 'omics data' is a challenging topic with many open problems.
• Current state: go study by study and consider the nature of the data and the type of question.
• Current approaches are diverse:
  – Machine learning, dimension reduction, pathway visualization, …
• Diverse open research lines, a lot of room for improvement
• Yet to come:
  – the "integrator": an automatic combination that clearly improves biological interpretation.
  – A mathematical framework common to all problems
• Last but not least: integrative analysis requires integrative work, well within the philosophy of Biostatnet or other collaborative networks.

Acknowledgments

Statistics and Bioinformatics Research Group at the Statistics Department of the University of Barcelona.

The Biostatnet group and particularly Carmen Cadarso and Lupe Gomez

My colleagues at the Statistics and Bioinformatics Unit at the Vall d'Hebrón Research Institute

Unitat de Serveis Científico Tècnics (UCTS) at the Vall d'Hebrón Research Institute

Thank you for your attention!
