Post on 11-Jul-2022


Identification of genetic interactions using computational homology

Javier Arsuaga
Mathematics; Molecular and Cellular Biology
University of California, Davis


V. Nanda

Topological Molecular Biology Lab

In cancer the structure of the genome can be heavily disrupted

Amplifications and deletions across the entire genome can be detected using array comparative genomic hybridization (array CGH)

Topological Analysis of array CGH (TAaCGH)

DeWoskin et al. 2009; DeWoskin et al. 2010; Arsuaga et al. 2012; Arsuaga et al. 2015
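The slides do not spell out how a CGH profile becomes a point cloud, so the following is only a sketch under an assumed sliding-window embedding: each run of `window` consecutive probe values becomes one point in `window`-dimensional space. The default of 2 echoes the "in 2D" label on the B1 curve plot; the function name `sliding_window_cloud` and the sample values are hypothetical.

```python
def sliding_window_cloud(profile, window=2):
    """Turn a 1-D copy number profile into a list of points in R^window.

    Assumption: plain (non-wrapping) sliding window with step 1.
    """
    if window < 1 or window > len(profile):
        raise ValueError("window must be between 1 and len(profile)")
    return [tuple(profile[i:i + window])
            for i in range(len(profile) - window + 1)]


# Toy profile: flat baseline followed by an amplified stretch.
toy_profile = [0.0, 0.1, 1.2, 1.1]
print(sliding_window_cloud(toy_profile))
# → [(0.0, 0.1), (0.1, 1.2), (1.2, 1.1)]
```

Under this assumption, point `i` of the cloud covers probes `i` through `i + window - 1`, which is what makes it possible to trace topological features back to genomic positions.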

We analyzed four breast cancer subtypes

Clin Cancer Res; 16(2) January 15, 2010 663

Our method identified the region of ERBB2 (Her2+ shown in blue)

[Figure: array CGH profiles of Patient X208 and Patient X308 on 17q; x-axis ticks 3e+07 to 8e+07, y-axis ticks −0.5 to 1.5]

Examples of ERBB2/Her2 patient profiles

[Figure: array CGH profiles of Patient X208 and Patient X308 on 17q, with the position of ERBB2 marked; x-axis ticks 3e+07 to 8e+07, y-axis ticks −0.5 to 1.5]

[Figure: array CGH profiles of Patient X167 and Patient X220 on 11q; x-axis ticks 6.0e+07 to 1.2e+08, y-axis ticks −0.5 to 1.5]

Luminal A: an amplification at the site of the progesterone receptor gene

Ion channel

Progesterone receptor

Regions detected in Basal Patients

[Figure: detected regions by subtype (Luminal B, Luminal A, Her2, Basal). Tracks per panel: TAaCGH, Horlings DB; Centers of Mass, Horlings DB; TAaCGH, Bergamaschi DB; Centers of Mass, Bergamaschi DB; regions reported by Horlings et al.]

Gain in 2p

[Figure: array CGH profiles of Patient X324 and Patient X330 on 2p; x-axis ticks 0e+00 to 8e+07, y-axis ticks −0.5 to 1.5]

Results from Arsuaga et al. 2015

TAaCGH is a method within statistical genetics: topological genetics

[Diagram: a binary phenotype (1 or 0) alongside markers zj, each taking the value A or B]

Can we include genetic interactions?

[Diagram: a binary phenotype (1 or 0) alongside markers zj, each taking the value A or B]

Hypothesis: Genetic interactions in the form of co-amplifications/deletions can be detected by β1
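To make β1 concrete, here is a minimal sketch of how the first Betti number of a point cloud could be computed at a single scale. It assumes a Vietoris-Rips complex and Z/2 coefficients, neither of which is stated on the slide; the helper names `rank_gf2` and `betti1` and the toy square are hypothetical and are not the TAaCGH implementation.

```python
from itertools import combinations
from math import dist


def rank_gf2(vectors):
    """Rank over GF(2) of vectors encoded as integer bitmasks."""
    pivots = {}  # position of leading bit -> reduced vector
    for vec in vectors:
        while vec:
            lead = vec.bit_length() - 1
            if lead not in pivots:
                pivots[lead] = vec
                break
            vec ^= pivots[lead]  # cancel the leading bit and keep reducing
    return len(pivots)


def betti1(points, eps):
    """beta_1 (Z/2 coefficients) of the Vietoris-Rips complex at scale eps."""
    n = len(points)
    edges = [e for e in combinations(range(n), 2)
             if dist(points[e[0]], points[e[1]]) <= eps]
    edge_id = {e: k for k, e in enumerate(edges)}
    triangles = [t for t in combinations(range(n), 3)
                 if all(f in edge_id for f in combinations(t, 2))]
    # Boundary of an edge = its two endpoints; of a triangle = its three edges.
    d1 = [(1 << i) | (1 << j) for i, j in edges]
    d2 = [sum(1 << edge_id[f] for f in combinations(t, 2)) for t in triangles]
    # beta_1 = dim ker(d1) - rank(d2) = (#edges - rank d1) - rank d2
    return len(edges) - rank_gf2(d1) - rank_gf2(d2)


square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
print([betti1(square, eps) for eps in (0.5, 1.1, 1.5)])  # → [0, 1, 0]
```

The square shows the behaviour the hypothesis relies on: a loop appears once the four sides connect and disappears once the diagonals fill it in. Evaluating `betti1` over a grid of `eps` values gives a β1 curve for one patient.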

Computer simulations suggest that β1 curves can detect co-occurring copy number changes

• 60% of the probes are in 8p/11q

• Test set: 9 patients that contain the co-amplification (by inspection)

• Control set: a set with no aberration, generated artificially by mixing data

• Significant: p = 0.045
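The slide reports p = 0.045 without naming the test. One plausible way to compare two groups of β1 curves is a permutation test on the distance between the group-mean curves; the sketch below assumes that choice. The function names, the L1 distance, and the toy curves are hypothetical and are not taken from the authors' pipeline.

```python
import random


def mean_curve(curves):
    """Pointwise mean of curves sampled on a common epsilon grid."""
    return [sum(vals) / len(curves) for vals in zip(*curves)]


def curve_distance(group_a, group_b):
    """L1 distance between the mean curves of two groups."""
    ma, mb = mean_curve(group_a), mean_curve(group_b)
    return sum(abs(x - y) for x, y in zip(ma, mb))


def permutation_p_value(test_curves, control_curves, n_perm=2000, seed=0):
    """Share of random relabelings at least as extreme as the observed split."""
    rng = random.Random(seed)
    observed = curve_distance(test_curves, control_curves)
    pooled = list(test_curves) + list(control_curves)
    n_test = len(test_curves)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if curve_distance(pooled[:n_test], pooled[n_test:]) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_perm + 1)


# Toy data: five "test" curves with a peak, five flat "control" curves.
test_group = [[0, 2, 3, 1]] * 5
control_group = [[0, 0, 0, 0]] * 5
print(permutation_p_value(test_group, control_group) < 0.05)  # → True
```

With groups as small as 9 and 5 patients the number of distinct relabelings is limited, so the smallest achievable p-value is bounded below; that is worth keeping in mind when reading a value close to 0.05.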

[Figure: B1 curves for both8p11q (9, blue) vs. non-both8p11q (5, red) on 8p_11q.s1 in 2D for the kwek8p11q data; x-axis: Epsilon (0.0 to 1.5), y-axis: B1 (0 to 4)]

To detect co-occurring copy number changes we need to be able to compute the cycles

β1 also detects single amplifications

Peaks detected do not necessarily persist

Patterns may change during filtration
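Whether a peak persists can be quantified by the life of each cycle: the scale at which it is born and the scale at which it is filled in. The sketch below uses the standard column-reduction algorithm for persistent homology over Z/2 on a Vietoris-Rips filtration. It is an illustrative assumption rather than the software behind these slides, and `h1_lifetimes` and the toy square are hypothetical names and data.

```python
from itertools import combinations
from math import dist


def h1_lifetimes(points, max_eps):
    """(birth, death) pairs of 1-cycles in the Rips filtration up to max_eps."""
    n = len(points)
    length = {e: dist(points[e[0]], points[e[1]])
              for e in combinations(range(n), 2)}
    # A simplex enters the filtration at the length of its longest edge.
    simplices = [((i,), 0.0) for i in range(n)]
    simplices += [(e, w) for e, w in length.items() if w <= max_eps]
    for t in combinations(range(n), 3):
        w = max(length[f] for f in combinations(t, 2))
        if w <= max_eps:
            simplices.append((t, w))
    # Sort by scale, then dimension, so every face precedes its cofaces.
    simplices.sort(key=lambda s: (s[1], len(s[0]), s[0]))
    index = {s: k for k, (s, _) in enumerate(simplices)}

    reduced = []  # reduced boundary columns, as sets of row indices
    owner = {}    # largest row index of a column -> index of that column
    pairs = []
    for k, (s, w) in enumerate(simplices):
        col = set()
        if len(s) > 1:
            col = {index[f] for f in combinations(s, len(s) - 1)}
        while col and max(col) in owner:
            col ^= reduced[owner[max(col)]]  # column addition mod 2
        reduced.append(col)
        if col:
            owner[max(col)] = k
            if len(s) == 3:  # this triangle kills the cycle born at edge max(col)
                pairs.append((simplices[max(col)][1], w))
    # Edges that created a cycle which no triangle killed: still alive at max_eps.
    for k, (s, w) in enumerate(simplices):
        if len(s) == 2 and not reduced[k] and k not in owner:
            pairs.append((w, float("inf")))
    return [(b, d) for b, d in pairs if d > b]  # drop zero-length lives


square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
print(h1_lifetimes(square, 2.0))  # one cycle, born at 1.0, dying at sqrt(2)
```

Cycles with a short life correspond to peaks in the β1 curve that appear and vanish within a narrow range of the filtration parameter, while long-lived cycles are the stable ones.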

Proposed statistical method for inverse problem

[Figure: Avg Cum Width (y-axis, 0.000 to 0.045) against position in Mbp (x-axis, 12.4 to 32.5); Basal, chromosome 01, p36.22 to p35.1; series: Control and Diff: Test − Control]

Computational Homology of Breast Cancer 11

[Figure 5, panels A and B: CGH profile of one patient and its associated point cloud; axis ticks 0.0 to 1.5 and 0 to 40]

Fig. 5. Correspondence between CGH probes and generators. Different values of the filtration parameter detect different generators, which correspond to different probes in the genome. Panel A shows the profile of one patient and its associated point cloud. The probes highlighted in blue correspond to the vertices of the single generator, also in blue. The filtration coefficient was ε = 0.78. Panel B shows the same patient and point cloud for a different value of the filtration coefficient, ε = 0.83.

[...] the bottom ones the histograms for the Climent data set. The histograms on the left are the control and the ones on the right correspond to the ERBB2+. The most remarkable feature is the difference between the control and the ERBB2 data sets. While the control shows no significant concentration of the probes that belong to cycles, the ERBB2+ clearly shows three regions of interest. 17q12 has a significant concentration of cycle elements and corresponds to the position of the gene ERBB2. Two regions extend beyond the position of ERBB2. The first one is at the boundary between 17q21.2 and 17q21.31. The Horlings data set suggests that the region of interest is more localized in 17q21.31, while the Climent data set suggests a region contained in 17q21.2. The last region is located at 17q21.33 and is common to both studies.

Since our simulations show that the first homology group can also identify single amplifications, one may argue that the amplifications found correspond to single independent events. To address this problem we analyzed the distribution


• Her2+: 14 vs. Others: 52


Co-amplifications detected in 17q across three different data sets

PHB

JUN, CDK4, SLUG, WNT, TOP2

ERBB2

Ardanza-Trevijano et al. 2016

Generators are the product of co-amplifications, not of single CNAs

[Figure: cycles mapped onto patient profiles; y-axis: Life of Cycle]

Cycles dispersed over the entire profile, showing multiple co-occurrences

18 S. Ardanza-Trevijano et al.

[Figure 7 panels: Patient 20, Patient 26, Patient 53, Patient 66; x-axis ticks 0 to 40; legend "genindex": 1 to 6]

Fig. 7. Distribution of cycles in CGH profiles. Each plate corresponds to the CGH profile of a patient and shows how the vertices of the cycles are mapped back to the profile. Different colors indicate different cycles and do not represent the same cycle in each plate. The height of the bars represents the life of the cycle.
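Mapping cycle vertices back to the profile can be sketched as follows, assuming the point cloud came from a non-wrapping sliding window in which point `i` covers probes `i` through `i + window - 1`. That embedding is an assumption, not something the caption states, and `probes_in_cycle` is a hypothetical helper.

```python
def probes_in_cycle(vertex_indices, window=2):
    """Probe indices covered by the point-cloud vertices of one cycle.

    Assumption: vertex i of the cloud was built from probes i .. i+window-1.
    """
    probes = set()
    for v in vertex_indices:
        probes.update(range(v, v + window))
    return sorted(probes)


# A cycle through cloud vertices 3, 4 and 7 touches these probes:
print(probes_in_cycle([3, 4, 7]))  # → [3, 4, 5, 7, 8]
```

Plotting a bar of height equal to the cycle's life at each returned probe index would give a picture in the spirit of Fig. 7.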

TAaCGH suggests an interaction in 4q for basals

Conclusions and future research

• We are expanding TAaCGH to identify genetic interactions

• Genetic interactions are in the form of co-occurring copy number changes and/or the finer structure of copy number changes

• In the ERBB2+ subtype we find co-expression of different regions of 17q. In Basal we find co-expression in 4q

• Next:

  • Identify whether gene expression is regulated by these profiles

  • Generalize to other situations in statistical genetics: topological genetics?

Thank you
