![Page 1: Defining Gene Clusters: 24 Ways of Looking at Mount Fuji Anne Bergeron, UQAM Dublin, September 19, 2005 7. Mt Fuji from the Foot](https://reader036.vdocument.in/reader036/viewer/2022062422/56649f065503460f94c1b48d/html5/thumbnails/1.jpg)
Defining Gene Clusters: 24 Ways of Looking at Mount Fuji
Anne Bergeron, UQAM
Dublin, September 19, 2005
7. Mt Fuji from the Foot
![Page 2: Defining Gene Clusters: 24 Ways of Looking at Mount Fuji Anne Bergeron, UQAM Dublin, September 19, 2005 7. Mt Fuji from the Foot](https://reader036.vdocument.in/reader036/viewer/2022062422/56649f065503460f94c1b48d/html5/thumbnails/2.jpg)
"It struck me that it would be good to take one thing in life and regard it from many viewpoints, ... " Roger Zelazny
![Page 3: Defining Gene Clusters: 24 Ways of Looking at Mount Fuji Anne Bergeron, UQAM Dublin, September 19, 2005 7. Mt Fuji from the Foot](https://reader036.vdocument.in/reader036/viewer/2022062422/56649f065503460f94c1b48d/html5/thumbnails/3.jpg)
The basic problem
Genome A
Genome B
Genome C
We start with a set of genomes, labeled by gene names, domains, or synteny blocks, and a similarity relation on those labels.
Highlighting a gene means selecting all labels that are similar.
Genes, or other types of signals, can appear in multiple copies in a genome, or even be missing. In this talk, the similarity relation is "given" and is an equivalence relation.
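As a minimal sketch of the setup above: genomes can be modeled as lists of labels, and the equivalence relation as a map from each label to its class. The data and the names `genomes`, `family`, and `highlight` are hypothetical, chosen only for illustration.

```python
from typing import Dict, List

# Hypothetical toy data: each genome is a list of labels, and `family`
# maps every label to the name of its equivalence class.
genomes: Dict[str, List[str]] = {
    "A": ["a1", "b1", "c1", "a2"],   # two copies of family "a"
    "B": ["c2", "a3", "d1"],
    "C": ["b2", "d2"],               # family "a" is missing here
}
family: Dict[str, str] = {
    "a1": "a", "a2": "a", "a3": "a",
    "b1": "b", "b2": "b",
    "c1": "c", "c2": "c",
    "d1": "d", "d2": "d",
}


def highlight(genomes, family, gene):
    """Return, per genome, the positions whose label is similar to `gene`."""
    target = family[gene]
    return {
        name: [i for i, label in enumerate(labels) if family[label] == target]
        for name, labels in genomes.items()
    }


hits = highlight(genomes, family, "a1")
print(hits)
```

Because the relation is an equivalence, highlighting any member of a class selects the same positions.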
![Page 4: Defining Gene Clusters: 24 Ways of Looking at Mount Fuji Anne Bergeron, UQAM Dublin, September 19, 2005 7. Mt Fuji from the Foot](https://reader036.vdocument.in/reader036/viewer/2022062422/56649f065503460f94c1b48d/html5/thumbnails/4.jpg)
Genome A
Genome B
Genome C
The basic problem
We are interested in what happens when a set of genes is highlighted.
A set of genes : { }
Boring...
![Page 5: Defining Gene Clusters: 24 Ways of Looking at Mount Fuji Anne Bergeron, UQAM Dublin, September 19, 2005 7. Mt Fuji from the Foot](https://reader036.vdocument.in/reader036/viewer/2022062422/56649f065503460f94c1b48d/html5/thumbnails/5.jpg)
Genome A
Genome B
Genome C
The basic problem
Another set of genes: { }
Interesting?
Measures of surprise are studied by Durand, Haque, Hoberman, Sankoff, Raghupathy, etc.
![Page 6: Defining Gene Clusters: 24 Ways of Looking at Mount Fuji Anne Bergeron, UQAM Dublin, September 19, 2005 7. Mt Fuji from the Foot](https://reader036.vdocument.in/reader036/viewer/2022062422/56649f065503460f94c1b48d/html5/thumbnails/6.jpg)
The basic problem
Goal : Given a (big) set of genomes, automatically identify all potentially interesting sets of genes.
![Page 7: Defining Gene Clusters: 24 Ways of Looking at Mount Fuji Anne Bergeron, UQAM Dublin, September 19, 2005 7. Mt Fuji from the Foot](https://reader036.vdocument.in/reader036/viewer/2022062422/56649f065503460f94c1b48d/html5/thumbnails/7.jpg)
1. Mount Fuji from Owari
Towards formal models
![Page 8: Defining Gene Clusters: 24 Ways of Looking at Mount Fuji Anne Bergeron, UQAM Dublin, September 19, 2005 7. Mt Fuji from the Foot](https://reader036.vdocument.in/reader036/viewer/2022062422/56649f065503460f94c1b48d/html5/thumbnails/8.jpg)
Towards formal models
What do labels stand for?
How many labels and genomes do we want to compare ?
What do we want to do with the resulting clusters ?
![Page 9: Defining Gene Clusters: 24 Ways of Looking at Mount Fuji Anne Bergeron, UQAM Dublin, September 19, 2005 7. Mt Fuji from the Foot](https://reader036.vdocument.in/reader036/viewer/2022062422/56649f065503460f94c1b48d/html5/thumbnails/9.jpg)
Towards formal models: Example 1
From: Eichler and Sankoff, Science (301:793-797), 2003
Definition of labels and similarity: Large homology segments disrupted only by local micro-rearrangements.
A total of 281 synteny blocks, colored in the human genome by their mouse chromosome number.
Interesting features:
Chromosome X
Chromosome 17
Chromosome 20
Application:
Genome evolution
![Page 10: Defining Gene Clusters: 24 Ways of Looking at Mount Fuji Anne Bergeron, UQAM Dublin, September 19, 2005 7. Mt Fuji from the Foot](https://reader036.vdocument.in/reader036/viewer/2022062422/56649f065503460f94c1b48d/html5/thumbnails/10.jpg)
Towards formal models: Example 2
Definition of labels and similarity: Gene annotations of chloroplasts.
Trachelium
Campanula
Adenophora
Symphyandra
Wahlenbergia
Merciera
Interesting features:
Rearrangements
Application:
Phylogeny
![Page 11: Defining Gene Clusters: 24 Ways of Looking at Mount Fuji Anne Bergeron, UQAM Dublin, September 19, 2005 7. Mt Fuji from the Foot](https://reader036.vdocument.in/reader036/viewer/2022062422/56649f065503460f94c1b48d/html5/thumbnails/11.jpg)
Towards formal models: Example 3
From: Pasek et al, Genome Research (15:867-874), 2005
Definition of labels and similarity: PFAM domain numbers labeling four bacterial genomes.
Interesting features:
Duplications
Insertions
Rearrangements
Application:
Operon identification
![Page 12: Defining Gene Clusters: 24 Ways of Looking at Mount Fuji Anne Bergeron, UQAM Dublin, September 19, 2005 7. Mt Fuji from the Foot](https://reader036.vdocument.in/reader036/viewer/2022062422/56649f065503460f94c1b48d/html5/thumbnails/12.jpg)
Towards formal models: Example 4
From: Pasek et al, Genome Research (15:867-874), 2005
Definition of labels and similarity: PFAM domain numbers labeling four bacterial genomes.
Application:
Identification of orthologs and/or duplicate segments.
With such a high E-value, the potential duplicate would have been missed by a comparison based on sequence similarity.
![Page 13: Defining Gene Clusters: 24 Ways of Looking at Mount Fuji Anne Bergeron, UQAM Dublin, September 19, 2005 7. Mt Fuji from the Foot](https://reader036.vdocument.in/reader036/viewer/2022062422/56649f065503460f94c1b48d/html5/thumbnails/13.jpg)
Towards formal models: Example 5
Definition of labels and similarity: Large homology segments disrupted only by local micro-rearrangements.
Comparing 16 segments of the mouse and rat chromosome X.
Application:
Reconstructing ancestors
From: Bérard et al, WABI 2005
Mouse = 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
Rat = -4 -3 -2 1 -13 -15 14 -16 8 9 10 -11 12 5 6 7
![Page 14: Defining Gene Clusters: 24 Ways of Looking at Mount Fuji Anne Bergeron, UQAM Dublin, September 19, 2005 7. Mt Fuji from the Foot](https://reader036.vdocument.in/reader036/viewer/2022062422/56649f065503460f94c1b48d/html5/thumbnails/14.jpg)
2. Mt Fuji from a Teahouse at Yoshida
Down to earth details
![Page 15: Defining Gene Clusters: 24 Ways of Looking at Mount Fuji Anne Bergeron, UQAM Dublin, September 19, 2005 7. Mt Fuji from the Foot](https://reader036.vdocument.in/reader036/viewer/2022062422/56649f065503460f94c1b48d/html5/thumbnails/15.jpg)
Down to earth details
Do we allow gaps ?
Do we allow rearrangements?
Do we allow duplicates and missing genes ?
Do we allow multiple genomes or self-comparison ?
How about "extensions" ?
![Page 16: Defining Gene Clusters: 24 Ways of Looking at Mount Fuji Anne Bergeron, UQAM Dublin, September 19, 2005 7. Mt Fuji from the Foot](https://reader036.vdocument.in/reader036/viewer/2022062422/56649f065503460f94c1b48d/html5/thumbnails/16.jpg)
Genome A
Genome B
Genome C
A set of genes: { }
Down to earth details : Model 1
No gaps, no duplications, any rearrangement.
![Page 17: Defining Gene Clusters: 24 Ways of Looking at Mount Fuji Anne Bergeron, UQAM Dublin, September 19, 2005 7. Mt Fuji from the Foot](https://reader036.vdocument.in/reader036/viewer/2022062422/56649f065503460f94c1b48d/html5/thumbnails/17.jpg)
Genome A
Genome B
Genome C
A set of genes: { }
No gaps, no duplications, any rearrangement.
What about this gene? Should we add it ?
Down to earth details : Model 1
![Page 18: Defining Gene Clusters: 24 Ways of Looking at Mount Fuji Anne Bergeron, UQAM Dublin, September 19, 2005 7. Mt Fuji from the Foot](https://reader036.vdocument.in/reader036/viewer/2022062422/56649f065503460f94c1b48d/html5/thumbnails/18.jpg)
Genome A
Genome B
Genome C
A set of genes: { }
No gaps, no duplications, any rearrangement.
What about this gene? Should we add it ?
Down to earth details : Model 1
Extension
![Page 19: Defining Gene Clusters: 24 Ways of Looking at Mount Fuji Anne Bergeron, UQAM Dublin, September 19, 2005 7. Mt Fuji from the Foot](https://reader036.vdocument.in/reader036/viewer/2022062422/56649f065503460f94c1b48d/html5/thumbnails/19.jpg)
Genome A
Genome B
Genome C
A set of genes: { }
No gaps, duplications, any rearrangement.
Genes not in the set
Down to earth details : Model 2
![Page 20: Defining Gene Clusters: 24 Ways of Looking at Mount Fuji Anne Bergeron, UQAM Dublin, September 19, 2005 7. Mt Fuji from the Foot](https://reader036.vdocument.in/reader036/viewer/2022062422/56649f065503460f94c1b48d/html5/thumbnails/20.jpg)
Genome A
Genome B
Genome C
A set of genes: { }
Gaps, no duplications, any rearrangement.
Down to earth details : Model 3
![Page 21: Defining Gene Clusters: 24 Ways of Looking at Mount Fuji Anne Bergeron, UQAM Dublin, September 19, 2005 7. Mt Fuji from the Foot](https://reader036.vdocument.in/reader036/viewer/2022062422/56649f065503460f94c1b48d/html5/thumbnails/21.jpg)
Genome A
Genome B
Genome C
A set of genes: { }
Gaps, missing/inserted genes, any rearrangement.
Down to earth details : Model 4
![Page 22: Defining Gene Clusters: 24 Ways of Looking at Mount Fuji Anne Bergeron, UQAM Dublin, September 19, 2005 7. Mt Fuji from the Foot](https://reader036.vdocument.in/reader036/viewer/2022062422/56649f065503460f94c1b48d/html5/thumbnails/22.jpg)
Genome A
Genome B
Genome C
A set of genes: { }
Gaps, missing genes, duplications, any rearrangement.
With gap size = 1, we get 4 occurrences.
Reducing the number of genes....
Down to earth details : Model 5
![Page 23: Defining Gene Clusters: 24 Ways of Looking at Mount Fuji Anne Bergeron, UQAM Dublin, September 19, 2005 7. Mt Fuji from the Foot](https://reader036.vdocument.in/reader036/viewer/2022062422/56649f065503460f94c1b48d/html5/thumbnails/23.jpg)
Genome A
Genome B
Genome C
A smaller set of genes: { }
... yields 5 occurrences.
Down to earth details : Model 5
![Page 24: Defining Gene Clusters: 24 Ways of Looking at Mount Fuji Anne Bergeron, UQAM Dublin, September 19, 2005 7. Mt Fuji from the Foot](https://reader036.vdocument.in/reader036/viewer/2022062422/56649f065503460f94c1b48d/html5/thumbnails/24.jpg)
24. Mount Fuji in a Summer Storm
A general framework
![Page 25: Defining Gene Clusters: 24 Ways of Looking at Mount Fuji Anne Bergeron, UQAM Dublin, September 19, 2005 7. Mt Fuji from the Foot](https://reader036.vdocument.in/reader036/viewer/2022062422/56649f065503460f94c1b48d/html5/thumbnails/25.jpg)
A general framework
Given a gap g, an occurrence of S is a maximal run of genes of S, separated by gaps of at most g genes not in S, and that contains at least one of each gene of S.
A set S of genes: { }
A set of genes S is an extension of a set T, included in S, if each occurrence of T is contained in an occurrence of S.
S = { } is an extension of T = { }
[Figure: a chromosome with two occurrences of S (Occurrence #1, Occurrence #2), annotated with gaps marked "> g" and "≤ g".]
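The two definitions of this framework (occurrence and extension) can be sketched directly. This is an illustrative reading of the definitions, not the talk's implementation; the function names and the toy chromosome are assumptions.

```python
def occurrences(chromosome, S, g):
    """Return inclusive (start, end) index pairs of the occurrences of S.

    An occurrence is a maximal run of genes of S in which consecutive
    hits are separated by at most g genes not in S, and which contains
    every gene of S at least once.
    """
    S = set(S)
    positions = [i for i, gene in enumerate(chromosome) if gene in S]
    runs, current = [], []
    for p in positions:
        # More than g foreign genes since the previous hit: close the run.
        if current and p - current[-1] - 1 > g:
            runs.append(current)
            current = []
        current.append(p)
    if current:
        runs.append(current)
    # Keep only the runs that contain at least one copy of each gene of S.
    return [
        (run[0], run[-1])
        for run in runs
        if {chromosome[i] for i in run} == S
    ]


def is_extension(chromosome, S, T, g):
    """True if T is included in S and each occurrence of T lies in an occurrence of S."""
    if not set(T) <= set(S):
        return False
    occ_S = occurrences(chromosome, S, g)
    return all(
        any(s <= t_start and t_end <= e for s, e in occ_S)
        for t_start, t_end in occurrences(chromosome, T, g)
    )


# Hypothetical chromosome; x and y stand for genes outside the sets.
chromosome = ["a", "b", "x", "c", "y", "y", "y", "c", "a", "x", "x", "b"]
print(occurrences(chromosome, {"a", "b", "c"}, 1))  # [(0, 3)]
print(occurrences(chromosome, {"a", "b", "c"}, 2))  # [(0, 3), (7, 11)]
print(is_extension(chromosome, {"a", "b", "c"}, {"a", "b"}, 1))  # True
```

Raising g from 1 to 2 merges the trailing `c a` and `b` into a second occurrence, which shows how sensitive the set of occurrences is to the gap parameter.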
![Page 26: Defining Gene Clusters: 24 Ways of Looking at Mount Fuji Anne Bergeron, UQAM Dublin, September 19, 2005 7. Mt Fuji from the Foot](https://reader036.vdocument.in/reader036/viewer/2022062422/56649f065503460f94c1b48d/html5/thumbnails/26.jpg)
![Page 27: Defining Gene Clusters: 24 Ways of Looking at Mount Fuji Anne Bergeron, UQAM Dublin, September 19, 2005 7. Mt Fuji from the Foot](https://reader036.vdocument.in/reader036/viewer/2022062422/56649f065503460f94c1b48d/html5/thumbnails/27.jpg)
• g = 0 or g > 0
Choices
When g = 0, the number of candidates is polynomial in the number of genes.
When g > 0, the number of candidates can be exponential in the number of genes.
A general framework
Even with g = 1, there are problems. For example, with g = 0, the sequence of genes:
a b c d e f
produces one potential cluster that contains both a and f. But with g = 1, there are 8 of them:
a b c d e f
a b c d f
a b c e f
a b d e f
a c d e f
a c e f
a b d f
a c d f
The number of these sequences grows in a Fibonacci progression!
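The count of 8 can be checked by brute force. The sketch below enumerates every subsequence that keeps the first and last gene and never skips more than g genes in a row; the function name `gapped_candidates` is an assumption made for this illustration.

```python
from itertools import combinations


def gapped_candidates(genes, g):
    """Subsequences keeping the first and last gene, skipping at most g genes in a row."""
    n = len(genes)
    inner = range(1, n - 1)
    found = []
    for k in range(len(inner) + 1):
        for kept in combinations(inner, k):
            idx = (0, *kept, n - 1)
            # Every pair of consecutive kept genes must be at most g genes apart.
            if all(b - a - 1 <= g for a, b in zip(idx, idx[1:])):
                found.append(" ".join(genes[i] for i in idx))
    return found


print(len(gapped_candidates(list("abcdef"), 0)))  # 1
print(len(gapped_candidates(list("abcdef"), 1)))  # 8

# Counts for sequences of 2 to 8 genes with g = 1.
counts = [len(gapped_candidates(list("abcdefgh"[:n]), 1)) for n in range(2, 9)]
print(counts)  # [1, 2, 3, 5, 8, 13, 21]
```

With g = 1, choosing which inner genes to drop amounts to choosing a binary string with no two consecutive zeros, which is why the counts follow the Fibonacci sequence.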
![Page 28: Defining Gene Clusters: 24 Ways of Looking at Mount Fuji Anne Bergeron, UQAM Dublin, September 19, 2005 7. Mt Fuji from the Foot](https://reader036.vdocument.in/reader036/viewer/2022062422/56649f065503460f94c1b48d/html5/thumbnails/28.jpg)
• g = 0 or g > 0
Choices
• Duplications or no duplications
Duplications usually mean an exponential number of candidates but, most of the time, are unavoidable.
Models without duplications are, nevertheless, useful in many situations.
A general framework
![Page 29: Defining Gene Clusters: 24 Ways of Looking at Mount Fuji Anne Bergeron, UQAM Dublin, September 19, 2005 7. Mt Fuji from the Foot](https://reader036.vdocument.in/reader036/viewer/2022062422/56649f065503460f94c1b48d/html5/thumbnails/29.jpg)
• g = 0 or g > 0
Choices
• Duplications or no duplications
• Three ways of filtering candidates
Filtering is mostly based on the properties of the extension relation.
If the number of candidates is low, filtering is not necessary, but it can be relevant.
For models with a huge number of candidates, filtering is a must.
A general framework
![Page 30: Defining Gene Clusters: 24 Ways of Looking at Mount Fuji Anne Bergeron, UQAM Dublin, September 19, 2005 7. Mt Fuji from the Foot](https://reader036.vdocument.in/reader036/viewer/2022062422/56649f065503460f94c1b48d/html5/thumbnails/30.jpg)
• g = 0 or g > 0
Choices
• Duplications or no duplications
• Three ways of filtering candidates
• Formal or heuristic
Formal models have inherent computational problems when applied to real data.
Heuristics will always be useful.
A general framework
![Page 31: Defining Gene Clusters: 24 Ways of Looking at Mount Fuji Anne Bergeron, UQAM Dublin, September 19, 2005 7. Mt Fuji from the Foot](https://reader036.vdocument.in/reader036/viewer/2022062422/56649f065503460f94c1b48d/html5/thumbnails/31.jpg)
• g = 0 or g > 0
Choices
• Duplications or no duplications
• Three ways of filtering candidates
• Formal or heuristic
A general framework
2 x 2 x 3 x 2 = 24
How convenient!
![Page 32: Defining Gene Clusters: 24 Ways of Looking at Mount Fuji Anne Bergeron, UQAM Dublin, September 19, 2005 7. Mt Fuji from the Foot](https://reader036.vdocument.in/reader036/viewer/2022062422/56649f065503460f94c1b48d/html5/thumbnails/32.jpg)
20. Mount Fuji from Inume Pass
Common intervals: Voluntary simplicity*
*Voluntary simplicity is a lifestyle considered by its adherents to be a sustainable, ecologically sensitive alternative to the typical, western consumerist lifestyle. [Ref. Wikipedia]
![Page 33: Defining Gene Clusters: 24 Ways of Looking at Mount Fuji Anne Bergeron, UQAM Dublin, September 19, 2005 7. Mt Fuji from the Foot](https://reader036.vdocument.in/reader036/viewer/2022062422/56649f065503460f94c1b48d/html5/thumbnails/33.jpg)
Common intervals: Voluntary simplicity*
A (partial) list of credits:
Uno and Yagiura (2000)
Heber and Stoye (2001)
Bergeron, Heber and Stoye (2002)
Didier (2003)
Schmidt and Stoye (2004)
Figeac and Varré (2004)
Bérard, Bergeron and Chauve (2004)
Blin, Chauve and Fertin (2005)
Landau, Parida and Weizman (2005)
Tannier and Sagot (2005)
Bérard, Bergeron, Chauve and Paul (2005)
Bergeron, Chauve, de Montgolfier and Raffinot (2005)
![Page 34: Defining Gene Clusters: 24 Ways of Looking at Mount Fuji Anne Bergeron, UQAM Dublin, September 19, 2005 7. Mt Fuji from the Foot](https://reader036.vdocument.in/reader036/viewer/2022062422/56649f065503460f94c1b48d/html5/thumbnails/34.jpg)
Common intervals
• g = 0
Choices
• No duplications
• No filtering
• Formal
Genome A
Genome B
Genome C
The basic model of common intervals often yields a large number of 'uninteresting clusters'. However, filtering provides unusual information on whole genome organization.
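With g = 0 and no duplications, each genome is a permutation of the same genes, and a common interval is usually defined as a set of genes that is contiguous in every genome. The following is a naive sketch under that definition, far from the efficient algorithms in the credited papers; the three toy genomes are hypothetical.

```python
def common_intervals(genomes):
    """Gene sets of size >= 2 that are contiguous in every genome.

    `genomes` is a list of permutations of the same genes. Each result is
    a tuple of genes in the order of the first genome. Naive enumeration.
    """
    reference, others = genomes[0], genomes[1:]
    positions = [{gene: i for i, gene in enumerate(g)} for g in others]
    n = len(reference)
    found = []
    for i in range(n):
        for j in range(i + 1, n):
            block = reference[i:j + 1]
            # Contiguous in another genome iff its positions span exactly j - i.
            if all(
                max(pos[x] for x in block) - min(pos[x] for x in block) == j - i
                for pos in positions
            ):
                found.append(tuple(block))
    return found


genome_a = [1, 2, 3, 4, 5]
genome_b = [3, 1, 2, 5, 4]
genome_c = [2, 1, 3, 4, 5]
clusters = common_intervals([genome_a, genome_b, genome_c])
print(clusters)  # [(1, 2), (1, 2, 3), (1, 2, 3, 4, 5), (4, 5)]
```

Even in this tiny example the full gene set is reported, one of the 'uninteresting clusters' that motivate filtering.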
![Page 35: Defining Gene Clusters: 24 Ways of Looking at Mount Fuji Anne Bergeron, UQAM Dublin, September 19, 2005 7. Mt Fuji from the Foot](https://reader036.vdocument.in/reader036/viewer/2022062422/56649f065503460f94c1b48d/html5/thumbnails/35.jpg)
Common intervals -> Strong Intervals
• g = 0
Choices
• No duplications
• Filtering
• Formal
Genome A
Genome B
Common intervals: s, t, u, v
Both t and u are different extensions of the common interval s: remove them.
Strong intervals: s, v
![Page 36: Defining Gene Clusters: 24 Ways of Looking at Mount Fuji Anne Bergeron, UQAM Dublin, September 19, 2005 7. Mt Fuji from the Foot](https://reader036.vdocument.in/reader036/viewer/2022062422/56649f065503460f94c1b48d/html5/thumbnails/36.jpg)
Strong Intervals
From: Bérard et al, WABI 2005
Mouse = 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
Rat = -4 -3 -2 1 -13 -15 14 -16 8 9 10 -11 12 5 6 7
This tree displays the strong intervals between the synteny blocks of the mouse and rat chromosomes X.
This kind of tree is known as a PQ-tree. Strong intervals possess a rich combinatorial structure that can be exploited from both the biological and the computational perspective.
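The strong intervals of the mouse and rat example can be recomputed with a short sketch. It assumes the usual characterization of a strong interval as a common interval that overlaps no other common interval; the slides phrase the filter in terms of extensions, so treating the two as equivalent here is an assumption. Signs are ignored, and since the mouse order is 1 to 16, each interval is reported as a range (lo, hi) of block numbers.

```python
def common_intervals(perm):
    """Ranges (lo, hi), lo < hi, of block numbers that are contiguous in `perm`."""
    found = set()
    for i in range(len(perm)):
        lo = hi = perm[i]
        for j in range(i + 1, len(perm)):
            lo, hi = min(lo, perm[j]), max(hi, perm[j])
            # j - i + 1 distinct values spanning exactly hi - lo + 1 numbers.
            if hi - lo == j - i:
                found.add((lo, hi))
    return found


def overlaps(a, b):
    """True if two ranges intersect without one containing the other."""
    (a_lo, a_hi), (b_lo, b_hi) = a, b
    intersect = a_lo <= b_hi and b_lo <= a_hi
    nested = (a_lo <= b_lo and b_hi <= a_hi) or (b_lo <= a_lo and a_hi <= b_hi)
    return intersect and not nested


def strong_intervals(perm):
    """Common intervals that overlap no other common interval."""
    common = common_intervals(perm)
    return sorted(c for c in common if not any(overlaps(c, d) for d in common))


# Rat block order from the slide, without signs.
rat = [4, 3, 2, 1, 13, 15, 14, 16, 8, 9, 10, 11, 12, 5, 6, 7]
strong = strong_intervals(rat)
print(strong)
# [(1, 4), (1, 16), (5, 7), (5, 16), (8, 12), (13, 16), (14, 15)]
```

Any two strong intervals are either nested or disjoint, which is what allows them to be drawn as a tree. Single blocks, the leaves of that tree, are left out of the output.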
![Page 37: Defining Gene Clusters: 24 Ways of Looking at Mount Fuji Anne Bergeron, UQAM Dublin, September 19, 2005 7. Mt Fuji from the Foot](https://reader036.vdocument.in/reader036/viewer/2022062422/56649f065503460f94c1b48d/html5/thumbnails/37.jpg)
[Figure: tree of strong intervals over the block order 4 3 2 1 13 15 14 16 8 9 10 11 12 5 6 7]
Strong Intervals : transforming a rat into a mouse
This tree provides guidelines to possible rearrangement scenarios that transform the rat chromosome into a mouse chromosome. These scenarios preserve all common intervals.
![Page 38: Defining Gene Clusters: 24 Ways of Looking at Mount Fuji Anne Bergeron, UQAM Dublin, September 19, 2005 7. Mt Fuji from the Foot](https://reader036.vdocument.in/reader036/viewer/2022062422/56649f065503460f94c1b48d/html5/thumbnails/38.jpg)
[Figure: tree of strong intervals over the block order 4 3 2 1 13 15 14 16 8 9 10 11 12 5 6 7]
Strong Intervals : transforming a rat into a mouse
Intervals are first labeled (in red) with respect to their relative orientation.
![Page 39: Defining Gene Clusters: 24 Ways of Looking at Mount Fuji Anne Bergeron, UQAM Dublin, September 19, 2005 7. Mt Fuji from the Foot](https://reader036.vdocument.in/reader036/viewer/2022062422/56649f065503460f94c1b48d/html5/thumbnails/39.jpg)
![Page 40: Defining Gene Clusters: 24 Ways of Looking at Mount Fuji Anne Bergeron, UQAM Dublin, September 19, 2005 7. Mt Fuji from the Foot](https://reader036.vdocument.in/reader036/viewer/2022062422/56649f065503460f94c1b48d/html5/thumbnails/40.jpg)
[Figure: strong interval tree. Block order before and after this step: 4 3 2 1 13 15 14 16 8 9 10 11 12 5 6 7]
Strong Intervals : transforming a rat into a mouse
Then all strong intervals that disagree with their parent are inverted : 1
![Page 41: Defining Gene Clusters: 24 Ways of Looking at Mount Fuji Anne Bergeron, UQAM Dublin, September 19, 2005 7. Mt Fuji from the Foot](https://reader036.vdocument.in/reader036/viewer/2022062422/56649f065503460f94c1b48d/html5/thumbnails/41.jpg)
[Figure: strong interval tree. Block order before this step: 4 3 2 1 13 15 14 16 8 9 10 11 12 5 6 7; after: 1 2 3 4 13 15 14 16 8 9 10 11 12 5 6 7]
Strong Intervals : transforming a rat into a mouse
Then all strong intervals that disagree with their parent are inverted : 4 3 2 1
![Page 42: Defining Gene Clusters: 24 Ways of Looking at Mount Fuji Anne Bergeron, UQAM Dublin, September 19, 2005 7. Mt Fuji from the Foot](https://reader036.vdocument.in/reader036/viewer/2022062422/56649f065503460f94c1b48d/html5/thumbnails/42.jpg)
[Figure: strong interval tree. Block order before and after this step: 1 2 3 4 13 15 14 16 8 9 10 11 12 5 6 7]
Strong Intervals : transforming a rat into a mouse
Then all strong intervals that disagree with their parent are inverted : 13
![Page 43: Defining Gene Clusters: 24 Ways of Looking at Mount Fuji Anne Bergeron, UQAM Dublin, September 19, 2005 7. Mt Fuji from the Foot](https://reader036.vdocument.in/reader036/viewer/2022062422/56649f065503460f94c1b48d/html5/thumbnails/43.jpg)
[Figure: strong interval tree. Block order before and after this step: 1 2 3 4 13 15 14 16 8 9 10 11 12 5 6 7]
Strong Intervals : transforming a rat into a mouse
Then all strong intervals that disagree with their parent are inverted : 14
![Page 44: Defining Gene Clusters: 24 Ways of Looking at Mount Fuji Anne Bergeron, UQAM Dublin, September 19, 2005 7. Mt Fuji from the Foot](https://reader036.vdocument.in/reader036/viewer/2022062422/56649f065503460f94c1b48d/html5/thumbnails/44.jpg)
[Figure: strong interval tree. Block order before and after this step: 1 2 3 4 13 15 14 16 8 9 10 11 12 5 6 7]
Strong Intervals : transforming a rat into a mouse
Then all strong intervals that disagree with their parent are inverted : 16
![Page 45: Defining Gene Clusters: 24 Ways of Looking at Mount Fuji Anne Bergeron, UQAM Dublin, September 19, 2005 7. Mt Fuji from the Foot](https://reader036.vdocument.in/reader036/viewer/2022062422/56649f065503460f94c1b48d/html5/thumbnails/45.jpg)
[Figure: strong interval tree. Block order before this step: 1 2 3 4 13 15 14 16 8 9 10 11 12 5 6 7; after: 1 2 3 4 13 14 15 16 8 9 10 11 12 5 6 7]
Strong Intervals : transforming a rat into a mouse
Then all strong intervals that disagree with their parent are inverted : 14 15
![Page 46: Defining Gene Clusters: 24 Ways of Looking at Mount Fuji Anne Bergeron, UQAM Dublin, September 19, 2005 7. Mt Fuji from the Foot](https://reader036.vdocument.in/reader036/viewer/2022062422/56649f065503460f94c1b48d/html5/thumbnails/46.jpg)
[Figure: strong interval tree. Block order before this step: 1 2 3 4 13 14 15 16 8 9 10 11 12 5 6 7; after: 1 2 3 4 16 15 14 13 8 9 10 11 12 5 6 7]
Strong Intervals : transforming a rat into a mouse
Then all strong intervals that disagree with their parent are inverted : 13 14 15 16
![Page 47: Defining Gene Clusters: 24 Ways of Looking at Mount Fuji Anne Bergeron, UQAM Dublin, September 19, 2005 7. Mt Fuji from the Foot](https://reader036.vdocument.in/reader036/viewer/2022062422/56649f065503460f94c1b48d/html5/thumbnails/47.jpg)
[Figure: strong interval tree. Block order before and after this step: 1 2 3 4 16 15 14 13 8 9 10 11 12 5 6 7]
Strong Intervals : transforming a rat into a mouse
Then all strong intervals that disagree with their parent are inverted : 11
![Page 48: Defining Gene Clusters: 24 Ways of Looking at Mount Fuji Anne Bergeron, UQAM Dublin, September 19, 2005 7. Mt Fuji from the Foot](https://reader036.vdocument.in/reader036/viewer/2022062422/56649f065503460f94c1b48d/html5/thumbnails/48.jpg)
[Figure: strong interval tree. Block order before this step: 1 2 3 4 16 15 14 13 8 9 10 11 12 5 6 7; after: 1 2 3 4 16 15 14 13 12 11 10 9 8 5 6 7]
Strong Intervals : transforming a rat into a mouse
Then all strong intervals that disagree with their parent are inverted : 8 9 10 11 12
![Page 49: Defining Gene Clusters: 24 Ways of Looking at Mount Fuji Anne Bergeron, UQAM Dublin, September 19, 2005 7. Mt Fuji from the Foot](https://reader036.vdocument.in/reader036/viewer/2022062422/56649f065503460f94c1b48d/html5/thumbnails/49.jpg)
[Figure: strong interval tree. Block order before this step: 1 2 3 4 16 15 14 13 12 11 10 9 8 5 6 7; after: 1 2 3 4 16 15 14 13 12 11 10 9 8 7 6 5]
Strong Intervals : transforming a rat into a mouse
Then all strong intervals that disagree with their parent are inverted : 5 6 7
![Page 50: Defining Gene Clusters: 24 Ways of Looking at Mount Fuji Anne Bergeron, UQAM Dublin, September 19, 2005 7. Mt Fuji from the Foot](https://reader036.vdocument.in/reader036/viewer/2022062422/56649f065503460f94c1b48d/html5/thumbnails/50.jpg)
[Figure: strong interval tree. Block order before this step: 1 2 3 4 16 15 14 13 12 11 10 9 8 7 6 5; after: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16]
Strong Intervals : transforming a rat into a mouse
Then all strong intervals that disagree with their parent are inverted : 5 6 7 ... 14 15 16
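The sequence of inversions shown on these slides can be replayed on the signed rat chromosome as a sanity check. This sketch only applies the inversions in the order the slides list them; it does not build the tree or decide which intervals disagree with their parent. The helper `invert` and the list `steps` are names introduced here for illustration.

```python
def invert(chromosome, blocks):
    """Reverse the contiguous segment made of `blocks` and flip every sign in it."""
    wanted = set(blocks)
    idx = [i for i, b in enumerate(chromosome) if abs(b) in wanted]
    start, end = idx[0], idx[-1]
    assert end - start + 1 == len(wanted), "blocks must be contiguous"
    segment = [-b for b in reversed(chromosome[start:end + 1])]
    return chromosome[:start] + segment + chromosome[end + 1:]


rat = [-4, -3, -2, 1, -13, -15, 14, -16, 8, 9, 10, -11, 12, 5, 6, 7]
mouse = list(range(1, 17))

# Inverted intervals, in the order they appear on the slides.
steps = [
    [1], [1, 2, 3, 4], [13], [14], [16], [14, 15],
    [13, 14, 15, 16], [11], [8, 9, 10, 11, 12], [5, 6, 7],
    list(range(5, 17)),
]

current = rat
for blocks in steps:
    current = invert(current, blocks)
    print(current)

print(current == mouse)  # True
```

Each inversion acts on an interval of the tree, which is consistent with the earlier statement that these scenarios preserve all common intervals.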
![Page 51: Defining Gene Clusters: 24 Ways of Looking at Mount Fuji Anne Bergeron, UQAM Dublin, September 19, 2005 7. Mt Fuji from the Foot](https://reader036.vdocument.in/reader036/viewer/2022062422/56649f065503460f94c1b48d/html5/thumbnails/51.jpg)
![Page 52: Defining Gene Clusters: 24 Ways of Looking at Mount Fuji Anne Bergeron, UQAM Dublin, September 19, 2005 7. Mt Fuji from the Foot](https://reader036.vdocument.in/reader036/viewer/2022062422/56649f065503460f94c1b48d/html5/thumbnails/52.jpg)
18. Mt Fuji from the Offing in Kanagawa
Domain Teams: The 'eXtreme' model
![Page 53: Defining Gene Clusters: 24 Ways of Looking at Mount Fuji Anne Bergeron, UQAM Dublin, September 19, 2005 7. Mt Fuji from the Foot](https://reader036.vdocument.in/reader036/viewer/2022062422/56649f065503460f94c1b48d/html5/thumbnails/53.jpg)
A (partial) list of credits:
Bergeron, Corteel and Raffinot (2002)
Luc, Risler, Bergeron and Raffinot (2003)
He and Goldwasser (2004)
Béal, Bergeron, Corteel and Raffinot (2004)
Pasek, Bergeron, Risler, Louis, Ollivier and Raffinot (2005)
Blin, Chauve and Fertin (2005)
Domain Teams: The 'eXtreme' model
![Page 54: Defining Gene Clusters: 24 Ways of Looking at Mount Fuji Anne Bergeron, UQAM Dublin, September 19, 2005 7. Mt Fuji from the Foot](https://reader036.vdocument.in/reader036/viewer/2022062422/56649f065503460f94c1b48d/html5/thumbnails/54.jpg)
Domain Teams
• g > 0
Choices
• Duplications
• Heavy filtering
• Formal
Genome A
Genome B
[Figure: four candidate teams, each marked "has an extension".]
Remove them all!
Surviving teams:
![Page 55: Defining Gene Clusters: 24 Ways of Looking at Mount Fuji Anne Bergeron, UQAM Dublin, September 19, 2005 7. Mt Fuji from the Foot](https://reader036.vdocument.in/reader036/viewer/2022062422/56649f065503460f94c1b48d/html5/thumbnails/55.jpg)
Domain Teams : Example
67591 Domains
50078 Proteins
16 Chromosomes
Maximum gap: 3
16713 Domain Teams
![Page 56: Defining Gene Clusters: 24 Ways of Looking at Mount Fuji Anne Bergeron, UQAM Dublin, September 19, 2005 7. Mt Fuji from the Foot](https://reader036.vdocument.in/reader036/viewer/2022062422/56649f065503460f94c1b48d/html5/thumbnails/56.jpg)
Domain Teams : Example
From: Pasek et al, Genome Research (15:867-874), 2005
![Page 57: Defining Gene Clusters: 24 Ways of Looking at Mount Fuji Anne Bergeron, UQAM Dublin, September 19, 2005 7. Mt Fuji from the Foot](https://reader036.vdocument.in/reader036/viewer/2022062422/56649f065503460f94c1b48d/html5/thumbnails/57.jpg)
The combinatorial beauty of nature
12. Mt Fuji from Lake Kawaguchi
![Page 58: Defining Gene Clusters: 24 Ways of Looking at Mount Fuji Anne Bergeron, UQAM Dublin, September 19, 2005 7. Mt Fuji from the Foot](https://reader036.vdocument.in/reader036/viewer/2022062422/56649f065503460f94c1b48d/html5/thumbnails/58.jpg)
The combinatorial beauty of nature
Does nature allow all possiblerearrangements ?
![Page 59: Defining Gene Clusters: 24 Ways of Looking at Mount Fuji Anne Bergeron, UQAM Dublin, September 19, 2005 7. Mt Fuji from the Foot](https://reader036.vdocument.in/reader036/viewer/2022062422/56649f065503460f94c1b48d/html5/thumbnails/59.jpg)
Six domains can theoretically form 63 potential teams. If they are labelled as {a, b, c, d, e, f}, the possible teams with more than one member are:
{a, b}, {a, c}, {a, d}, {a, e}, {a, f}, {b, c}...
{a, b, c}, {a, b, d}, {a, b, e}, ...
...
{a, b, c, d, e, f}
For 6 domains, of the 63 possibilities, we found 35 teams that had at least two occurrences and no extension.
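The figure 63 is the number of non-empty subsets of six domains, 2^6 - 1. Restricting to subsets with more than one member, as the listing does, leaves 57. A quick check, with variable names chosen for this illustration:

```python
from itertools import combinations

domains = ["a", "b", "c", "d", "e", "f"]

# Every non-empty subset of the six domains.
subsets = [
    set(c)
    for size in range(1, len(domains) + 1)
    for c in combinations(domains, size)
]

non_empty = len(subsets)
multi_member = sum(1 for s in subsets if len(s) >= 2)
print(non_empty)     # 63
print(multi_member)  # 57
```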
The combinatorial beauty of nature
Promiscuous domains
Who are they?
PF00005 ABC transporter
PF00072 Response regulator receiver domain
PF00486 Transcriptional regulatory protein
PF00512 His Kinase A
PF00528 Binding-protein-dependent transport system inner membrane
PF00672 HAMP domain
![Page 60: Defining Gene Clusters: 24 Ways of Looking at Mount Fuji Anne Bergeron, UQAM Dublin, September 19, 2005 7. Mt Fuji from the Foot](https://reader036.vdocument.in/reader036/viewer/2022062422/56649f065503460f94c1b48d/html5/thumbnails/60.jpg)
The need for heuristics
21. Mount Fuji from the Totomi Mountains
![Page 61: Defining Gene Clusters: 24 Ways of Looking at Mount Fuji Anne Bergeron, UQAM Dublin, September 19, 2005 7. Mt Fuji from the Foot](https://reader036.vdocument.in/reader036/viewer/2022062422/56649f065503460f94c1b48d/html5/thumbnails/61.jpg)
The need for heuristics
• g > 0
Choices
• Duplications
• No filtering
• Heuristic
From: St-Onge, et al. Poster RECOMB CG 2005
Very reasonable approximations of the general model can be obtained efficiently (in a few minutes) in the case of very large scale comparisons.
![Page 62: Defining Gene Clusters: 24 Ways of Looking at Mount Fuji Anne Bergeron, UQAM Dublin, September 19, 2005 7. Mt Fuji from the Foot](https://reader036.vdocument.in/reader036/viewer/2022062422/56649f065503460f94c1b48d/html5/thumbnails/62.jpg)
The need for heuristics
An uncertainty principle
With the general model of gene clusters, it is impossible to predict simultaneously the computing time AND the properties of the output.
![Page 63: Defining Gene Clusters: 24 Ways of Looking at Mount Fuji Anne Bergeron, UQAM Dublin, September 19, 2005 7. Mt Fuji from the Foot](https://reader036.vdocument.in/reader036/viewer/2022062422/56649f065503460f94c1b48d/html5/thumbnails/63.jpg)
Credits
Marie-Pierre Béal, Informatique, Marne-la-Vallée
Séverine Bérard, INRA, Toulouse
Mathieu Blanchette, McGill University
Sylvie Corteel, PRiSM, Versailles
Steffen Heber, Raleigh, USA
Hokusai Katsushika: 1760-1849
Nicolas Luc, Génome et informatique, Evry
Fabien de Montgolfier, LIAFA, Paris
Christophe Paul, LIRMM, Montpellier
Sophie Pasek, Génome et informatique, Evry
Jean-Loup Risler, Génome et informatique, Evry
Mathieu Raffinot, Laboratoire Poncelet, Moscow
Jens Stoye, Technische Fakultät, Bielefeld

Cedric Chauve
Annie Chateau
Olivier Gingras
Yannick Gingras
André Levasseur
Jacqueline Rwirangira
Karine St-Onge
![Page 64: Defining Gene Clusters: 24 Ways of Looking at Mount Fuji Anne Bergeron, UQAM Dublin, September 19, 2005 7. Mt Fuji from the Foot](https://reader036.vdocument.in/reader036/viewer/2022062422/56649f065503460f94c1b48d/html5/thumbnails/64.jpg)