Reconstruction of regulatory modules based on heterogeneous data sources Karen Lemmens PhD Defence September 29th 2008


Reconstruction of regulatory modules based on heterogeneous data sources

Karen Lemmens

PhD Defence, September 29th 2008


29 September 2008, Karen Lemmens, PhD defence

Outline

1. Introduction & objectives

2. Strategy
   – Data integration
   – Association rule mining algorithms

3. Main achievements
   – ReMoDiscovery: Unraveling the yeast transcriptional network
   – DISTILLER: Condition-dependent combinatorial regulation in E. coli

4. Conclusions and perspectives

1. Introduction 2. Strategy 3. Achievements 4. Conclusions


DNA & genes

TATCCCTCCCTGTTTATCATTAATTTCTAATTATCAGCGTTTTTGGCTGGCGGCGTAGCGATGCGCTGGTTACTCTGAAAACGTCTATGCAAATTAACAAAAGAGAATAGCTATGCATGATGCAAACATCCGCGTTGCCATCGCGGGAGCCGGGGGGCGTAGGGCCGCCAGTTGATTCAGGCGGCGCTGGCATTAGAGGGCGTGCAGTTGGGCGCTGCGCTGGAGCGTGAAGGATCTTCTGAGATCACCCATAAGGCGTCCAGCCGTATGACATTTGCTAACGGCGCGGTAAGATCGGCTTTGTGGTTGAGTGGTAAGGAAAGCGGTCTTTTTGATATGCGAGATGTACTTGATCTCAATAATTTGTAACCACAAAATATTTGTTATGGTGCAAAAATAACACATTTAATTTATTGATTATAAAGGGCTTTAATTTTTGGCCCTTTTATTTTTGGTGTTATGTTTTTAAATTGTCTATAAGTGCCAAATCGTCGGTAAGCAGATTTGCATTGATTTACGTCATCATTGTGAATTAATATGCAAATAAAGTGAGTGAATATTCTCTGGAGGGTGTTTTGATTAAGTCAGCGCTATTGGTTCTGGAAGACGGAACCCAGTTTCACGGTCGGGCCATAGGGGCAACAGGTTCGCCTGACCATCGTTCCGGCGCAAACTTCTGCGGAAGATGTGCTGAAAATGAATCCAGACGGCATCTTCCTCTCCAACGGTCCTGGCGACCCGGCCCCGTGCGATTACGCCATTACCGCCATCCAGAAATTCCTCGAAACCGATATTCCGAATTACATGTTTTG

[Figure: GENE 1 and GENE 2 on the DNA. TRANSCRIPTION: DNA to mRNA. TRANSLATION: mRNA to protein.]
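The two steps named on this slide can be illustrated with a toy sketch. It assumes the input is the coding strand of a gene, uses the standard genetic code, and starts translation at the first AUG; the function names `transcribe` and `translate` are illustrative and not part of the talk.

```python
# Toy illustration of DNA -> mRNA -> protein (not taken from the slides).
from itertools import product

BASES = "UCAG"
# Standard genetic code, codons enumerated in UCAG order; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def transcribe(coding_strand: str) -> str:
    """Return the mRNA for a DNA coding strand (T is replaced by U)."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Translate from the first AUG up to the first stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon, nothing is translated
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


mrna = transcribe("ATGGCCTAA")
protein = translate(mrna)
```

Sequences containing characters other than A, C, G and T/U would raise a `KeyError` in this sketch; real tools handle ambiguity codes explicitly.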

Condition-dependent transcription

[Figure: GENE 1 drawn twice, each with DNA, mRNA and protein levels and the TRANSCRIPTION and TRANSLATION steps.]


Transcriptional regulation

[Figure: GENE 1 with its Regulatory motifs and Regulators.]


Transcriptional network

Transcriptional modules



Data integration

GENE 1

ChIP-chip data
– Library of strains, each with a tagged regulator (Regulator, Tag)
– Chromatin IP to enrich promoters bound by regulator in vivo
– Microarray to identify promoters bound by regulator in vivo

Regulatory motifs

Microarray data


Network reconstruction

• Several methods for reconstruction of the transcriptional network exist

– Expression data, individual interactions: Boolean, ODE, Bayesian, Association (CLR, ARACNE)
– Expression data, transcriptional modules: Clustering, Biclustering, Query-driven biclustering, Method of Segal et al., LeMoNe
– Data integration, individual interactions: Bayesian, SEREND
– Data integration, transcriptional modules: GRAM, MA-Networker, SAMBA, Inferelator, COGRIM

Not all aspects of transcription are taken into account by these methods

** Van den Bulcke T., Lemmens K., Van de Peer Y., Marchal K. (2006) Inferring Transcriptional Networks by Mining Omics Data. Current Bioinformatics, vol. 1, no. 3, pp. 301-313.

** Dhollander T., Sheng Q., Lemmens K., De Moor B., Marchal K., Moreau Y. (2007) Query-driven module discovery in microarray data. Bioinformatics, vol. 23, no. 19, pp. 2573-2580.


Association rule mining

• Association rule mining algorithms

– Advantages:
  • Enable exhaustive search
  • Elegant and concurrent data integration
  • No co-expression assumption between regulator and target
  • Overlapping modules

– Problems:
  • Binary or discretized data
  • Filtering method necessary

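To make the points about exhaustive search, binary data and overlapping results concrete, here is a minimal level-wise (Apriori-style) frequent itemset search. It is a sketch under assumptions, not the thesis implementation: genes play the role of transactions, regulators the role of items, and the binding table, the function name `frequent_itemsets` and the threshold are all made up for illustration.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Level-wise search for itemsets supported by at least min_support genes.

    transactions: dict mapping a gene to the set of items (e.g. regulators) it has.
    Returns a dict mapping each frequent itemset (frozenset) to its supporting genes.
    """
    items = sorted({i for t in transactions.values() for i in t})
    level = {}
    for item in items:
        genes = {g for g, t in transactions.items() if item in t}
        if len(genes) >= min_support:
            level[frozenset([item])] = genes
    result = dict(level)
    while level:
        next_level = {}
        # Join pairs of frequent k-itemsets that differ in exactly one item.
        for (a, genes_a), (b, genes_b) in combinations(level.items(), 2):
            candidate = a | b
            if len(candidate) != len(a) + 1 or candidate in next_level:
                continue
            genes = genes_a & genes_b  # genes that have every item in the candidate
            if len(genes) >= min_support:
                next_level[candidate] = genes
        result.update(next_level)
        level = next_level
    return result


# Hypothetical binarized binding data (gene -> regulators that bind its promoter).
binding = {
    "gene1": {"MBP1", "SWI6"},
    "gene2": {"MBP1", "SWI6", "SWI4"},
    "gene3": {"SWI4", "SWI6"},
    "gene4": {"MBP1"},
}
modules = frequent_itemsets(binding, min_support=2)
```

In this toy run gene2 supports both {MBP1, SWI6} and {SWI4, SWI6}, which shows how overlapping gene sets arise naturally. The number of reported itemsets grows quickly as the support threshold drops, which is one reason a filtering step is listed as necessary.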


ReMoDiscovery: Unraveling the yeast transcriptional network

Represent data in a mathematical way

[Figure: ChIP-chip workflow: library of strains, each with a tagged regulator; chromatin IP to enrich promoters bound by regulator in vivo; microarray to identify promoters bound by regulator in vivo.]

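One possible way to represent such data mathematically is as binary gene-by-feature tables obtained by thresholding. The sketch below is an assumption for illustration only: the scores, the thresholds (0.001 and 0.75) and the helper `binarize` are hypothetical and are not taken from the talk.

```python
def binarize(scores, threshold, keep_below=True):
    """Turn a gene -> {feature: score} table into a gene -> set-of-features table.

    keep_below=True keeps features whose score is <= threshold (e.g. p-values);
    keep_below=False keeps features whose score is >= threshold (e.g. match scores).
    """
    table = {}
    for gene, row in scores.items():
        if keep_below:
            table[gene] = {f for f, s in row.items() if s <= threshold}
        else:
            table[gene] = {f for f, s in row.items() if s >= threshold}
    return table


# Hypothetical ChIP-chip binding p-values (gene x regulator); lower = stronger evidence.
chip_pvalues = {
    "geneA": {"REG1": 0.0004, "REG2": 0.20},
    "geneB": {"REG1": 0.0009, "REG2": 0.0001},
    "geneC": {"REG1": 0.35, "REG2": 0.0005},
}
# Hypothetical motif scores (gene x motif); higher = better motif match.
motif_scores = {
    "geneA": {"M1": 0.92, "M2": 0.10},
    "geneB": {"M1": 0.85, "M2": 0.77},
    "geneC": {"M1": 0.05, "M2": 0.81},
}

regulator_matrix = binarize(chip_pvalues, threshold=0.001)
motif_matrix = binarize(motif_scores, threshold=0.75, keep_below=False)
```

Each gene then becomes a set of present features, which is the shape of input that itemset-mining algorithms expect; the expression data would stay real-valued and be handled separately.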

• Transcriptional module
  – Genes are regulated by a minimum number of regulators
  – Genes share a minimum number of common regulatory motifs
  – Genes are co-expressed

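The three criteria can be read as a simple test on a candidate gene set. The following is a minimal sketch, assuming co-expression is measured by pairwise Pearson correlation above a cut-off; the function `is_module`, its default thresholds and the toy data are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equally long expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def is_module(genes, regulators, motifs, expression,
              min_regulators=1, min_motifs=1, min_corr=0.8):
    """Check the three module criteria for a candidate list of genes."""
    shared_regulators = set.intersection(*(regulators[g] for g in genes))
    shared_motifs = set.intersection(*(motifs[g] for g in genes))
    coexpressed = all(pearson(expression[a], expression[b]) >= min_corr
                      for a, b in combinations(genes, 2))
    return (len(shared_regulators) >= min_regulators
            and len(shared_motifs) >= min_motifs
            and coexpressed)


# Hypothetical toy data.
regulators = {"g1": {"R1", "R2"}, "g2": {"R1"}, "g3": {"R1", "R2"}}
motifs = {"g1": {"M1"}, "g2": {"M1", "M2"}, "g3": {"M2"}}
expression = {
    "g1": [1.0, 2.0, 3.0, 4.0],
    "g2": [2.0, 4.1, 5.9, 8.0],
    "g3": [4.0, 3.0, 2.0, 1.0],
}
ok = is_module(["g1", "g2"], regulators, motifs, expression)
```

Here g1 and g2 share regulator R1 and motif M1 and have nearly proportional profiles, so they pass; g1 and g3 share no motif and are anti-correlated, so they fail. A profile with zero variance would cause a division by zero in this sketch.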


• Regulatory program:

Regulators: Motifs:

MBP1

SWI4

SWI6

STB1

• Co-expressed genes:

YDL003W YER001W YGR109C YGR189C YGR221C YHR149C YER070W YPL256C YNL300W YPL163C YPL267W YPR120C YMR199W YMR199W YMR179W YML027W YKL113C

• ReMoDiscovery outperforms related methods for module detection
  – GRAM
  – SAMBA

• Conclusions
  – Meaningful biological results
  – Better performance than related methods

Association rule mining algorithms are well suited for identification of regulatory modules through data integration

Lemmens K., Dhollander T., De Bie T., Monsieurs P., Engelen K., Smets B., Winderickx J., De Moor B., Marchal K. (2006) Inferring transcriptional module networks from ChIP-chip-, motif- and microarray data. Genome Biology, vol. 7, no. 5, pp. R37.


• Many aspects of transcription taken into account:
  – Regulatory motifs
  – Regulators
  – Co-expression of genes

Condition dependency of the interactions is missing



DISTILLER: Condition-dependent combinatorial regulation in E. coli

• ReMoDiscovery:

– Co-expression in all conditions by correlation

– Apriori algorithm

– No filtering procedure

• DISTILLER:

– Condition dependency: bandwidth concept

– CHARM algorithm

– Filtering procedure to identify the most interesting modules
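The slide names the bandwidth concept without defining it. One plausible reading, assumed here and not confirmed by the slides, is that a set of genes counts as co-expressed in a condition when their expression values in that condition differ by no more than a fixed bandwidth. Under that assumption, the conditions belonging to a gene set can be selected as follows; `coexpression_conditions` and the toy values are hypothetical.

```python
def coexpression_conditions(genes, expression, bandwidth):
    """Return indices of conditions where all genes lie within `bandwidth` of each other.

    This encodes an assumed definition of the bandwidth concept:
    max(values) - min(values) <= bandwidth within a single condition.
    """
    n_conditions = len(expression[genes[0]])
    selected = []
    for c in range(n_conditions):
        values = [expression[g][c] for g in genes]
        if max(values) - min(values) <= bandwidth:
            selected.append(c)
    return selected


# Hypothetical expression values: one list per gene, one entry per condition.
expression = {
    "g1": [0.1, 1.5, -0.8, 2.0],
    "g2": [0.3, 1.4, 0.9, 2.2],
    "g3": [0.2, 1.7, -0.7, 0.1],
}
conditions_all = coexpression_conditions(["g1", "g2", "g3"], expression, bandwidth=0.5)
conditions_pair = coexpression_conditions(["g1", "g2"], expression, bandwidth=0.5)
```

Unlike a single correlation over all arrays, this kind of criterion lets each gene set carry its own subset of conditions, and adding a gene can only shrink that subset.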


Pastor D., Cortes-Calabuig A., Lemmens K., De Moor B., Marchal K., Denecker M. (2007) GeneReg: Integration of Experimental Data on the DNA Transcription Process. Proceedings of BNAIC 2007.

• Example: FNR module

• FNR = one of the most extensively studied regulators

• Experimental validation of novel FNR targets

– High confidence:
  • ydhY (b1674) (Partridge et al., 2008)
  • yfgG (b2504)
  • hscC (b0650)
  • treF (b3519)

– Medium confidence:
  • yjhB (b4279)
  • ydjX (b1750)
  • yjtD (b4403)
  • ydaT (b1358)
  • yehD (b2111)
  • yhjA (b3518) (Partridge et al., 2007)
  • ftnB (b1902)


• Condition dependency

– Arrays were grouped into conditional categories

– Colors show to what extent the conditions of the modules of a particular regulator are enriched for a specific category
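The slide does not say which statistic underlies the enrichment colors. A common choice for this kind of question, assumed here purely for illustration, is the upper tail of the hypergeometric distribution: given all arrays, the arrays in one category and the arrays selected by a regulator's modules, how surprising is the observed overlap? The function name and the numbers are hypothetical.

```python
from math import comb


def enrichment_pvalue(n_total, n_category, n_selected, n_overlap):
    """Upper-tail hypergeometric p-value P(X >= n_overlap).

    n_total:    number of arrays in the compendium
    n_category: arrays belonging to the conditional category
    n_selected: arrays in the conditions of the modules of one regulator
    n_overlap:  selected arrays that belong to the category
    """
    upper = min(n_category, n_selected)
    total = comb(n_total, n_selected)
    tail = sum(comb(n_category, k) * comb(n_total - n_category, n_selected - k)
               for k in range(n_overlap, upper + 1))
    return tail / total


# Toy numbers: 10 arrays, 4 in the category, 3 selected, 2 of them in the category.
p = enrichment_pvalue(10, 4, 3, 2)
```

A small p-value would correspond to a strongly colored cell; with many regulator and category pairs, a multiple-testing correction would normally be applied on top.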

• Combinatorial regulation

– Static

– Highly combinatorial:
  • 42 regulons: one regulator
  • 66 complex regulons: two regulators
  • 70 complex regulons: three or more regulators (maximum of 8)

• Combinatorial regulation at the module level

– Lower combinatorial complexity
  • 25/150 modules: at least two regulators (maximum of 3)
  • 24 modules involve at least one global regulator such as CRP, FNR or ArcA


• Combinatorial regulation at connector gene level

One regulator may be sufficient to alter the expression of a connector gene upon a specific environmental cue


• Conclusions

– Reliable predictions

– Dynamic view on the network

– Combinatorial regulation

** Lemmens K., De Bie T., Dhollander T., Monsieurs P., De Moor B., Collado-Vides J., Engelen K., Marchal K. (2008) The condition-dependent transcriptional network in Escherichia coli. Accepted for publication in Annals of NYAS, DREAM2.

** Lemmens K., De Bie T., Dhollander T., De Keersmaecker S., Thijs I., Schoofs G., De Weerdt A., De Moor B., Vanderleyden J., Collado-Vides J., Engelen K., Marchal K. (2008) DISTILLER: a data integration framework to reveal condition dependency of complex regulons in Escherichia coli. Submitted to Genome Biology.

Conclusions

• Main contributions of this thesis:
  – Automated collection of data
  – ReMoDiscovery
  – DISTILLER

• Goals obtained via:
  – Data integration
  – Association rule mining algorithms well suited for data integration and reconstruction of transcriptional network

• Several algorithmic problems were solved

• Novel biological findings

Perspectives

• Conceptual extensions:
  – Inclusion of other data sources
    • Additional motifs from de novo motif detection
    • Small RNAs
  – Comparison of networks

• Implementation-related and algorithmic improvements:
  – User-friendly interface
  – Microarray compendium
  – Filtering step
  – Motif detection algorithms

Acknowledgements

• CMPG - BioI
  – Prof. Dr. K. Marchal
  – BioI group

• ESAT/SCD – BioI
  – Prof. Dr. B. De Moor
  – Prof. Dr. Y. Moreau
  – BioI group

– Prof. Dr. T. De Bie

• CMPG
  – Prof. Dr. J. Vanderleyden
  – Dr. S. De Keersmaecker

• Computer Sciences
  – Prof. Dr. M. Denecker
  – A. Cortés Calabuig

– Prof. Dr. J. Collado-Vides