


Functional Splicing Network Reveals Extensive Regulatory Potential of the Core Spliceosomal Machinery

Panagiotis Papasaikas,1,2,4 J. Ramón Tejedor,1,2,4 Luisa Vigevani,1,2 and Juan Valcárcel1,2,3,*

1 Centre de Regulació Genòmica, Dr. Aiguader 88, 08003 Barcelona, Spain
2 Universitat Pompeu Fabra, Dr. Aiguader 88, 08003 Barcelona, Spain
3 Institució Catalana de Recerca i Estudis Avançats, Passeig Lluis Companys 23, 08010 Barcelona, Spain
4 Co-first authors

SUMMARY

Pre-mRNA splicing relies on the poorly understood dynamic interplay between >150 protein components of the spliceosome. The steps at which splicing can be regulated remain largely unknown. We systematically analyzed the effect of knocking down the components of the splicing machinery on alternative splicing events relevant for cell proliferation and apoptosis and used this information to reconstruct a network of functional interactions. The network accurately captures known physical and functional associations and identifies new ones, revealing remarkable regulatory potential of core spliceosomal components, related to the order and duration of their recruitment during spliceosome assembly. In contrast with standard models of regulation at early steps of splice site recognition, factors involved in catalytic activation of the spliceosome display regulatory properties. The network also sheds light on the antagonism between hnRNP C and U2AF, and on targets of antitumor drugs, and can be widely used to identify mechanisms of splicing regulation.

   

INTRODUCTION  

Pre-mRNA splicing is carried out by the spliceosome, one of the most complex molecular machineries of the cell, composed of five small nuclear ribonucleoprotein particles (U1, U2, U4/5/6 snRNP) and about 150 additional polypeptides (reviewed by Wahl et al., 2009). Detailed biochemical studies using a small number of model introns have delineated a sequential pathway for the assembly of spliceosomal subcomplexes. For example, U1 snRNP recognizes the 5’ splice site, and U2AF, a heterodimer of 35 and 65 kDa subunits, recognizes sequences at the 3’ end of introns. U2AF binding helps to recruit U2 snRNP to the upstream branchpoint sequence, forming complex A. U2 snRNP binding involves interactions of pre-mRNA sequences with U2 snRNA as well as with U2 proteins (e.g., SF3B1). Subsequent binding of preassembled U4/5/6 tri-snRNP forms complex B, which after a series of conformational changes forms complexes B-act and C, concomitant with the activation of the two catalytic steps that generate splicing intermediates and products. Transition between spliceosomal subcomplexes involves profound dynamic changes in protein composition as well as extensive rearrangements of base-pairing interactions between snRNAs and between snRNAs and splice site sequences (Wahl et al., 2009). RNA structures contributed by base-pairing interactions between U2 and U6 snRNAs serve to coordinate metal ions critical for splicing catalysis (Fica et al., 2013), implying that the spliceosome is an RNA enzyme whose catalytic center is only established upon assembly of its individual components.

Differential selection of alternative splice sites in a pre-mRNA (alternative splicing, AS) is a prevalent mode of gene regulation in multicellular organisms, often subject to developmental regulation (Nilsen and Graveley, 2010) and frequently altered in disease (Cooper et al., 2009; Bonnal et al., 2012). Substantial efforts made to dissect mechanisms of AS regulation on a relatively small number of pre-mRNAs have provided a consensus picture in which protein factors recognizing cognate auxiliary sequences in the pre-mRNA promote or inhibit early events in spliceosome assembly (Fu and Ares, 2014). These regulatory factors include members of the hnRNP and SR protein families, which often display cooperative or antagonistic functions depending on the position of their binding sites relative to the regulated splice sites. Despite important progress (Barash et al., 2010; Zhang et al., 2010), the combinatorial nature of these contributions complicates the formulation of integrative models for AS regulation.

To systematically capture the contribution of splicing regulatory factors, including components of the core spliceosome (Clark et al., 2002; Park et al., 2004; Pleiss et al., 2007; Saltzman et al., 2011) to AS regulation, we set up a high-throughput screen to evaluate the effects of knocking down each individual splicing component or regulator on 36 functionally important AS events (ASEs) and developed a framework for data analysis and network modeling. Network-based approaches can provide rich representations for real-world systems that capture their organization more adequately than more traditional approaches or mere cataloguing of pairwise relationships. Numerous studies have utilized them as a platform for deriving comprehensive transcriptional, metabolic, and physical interaction maps in several organisms and contexts (e.g., Chatr-Aryamontri et al., 2013; Franceschini et al., 2013; Wong et al., 2012). Their application has also proven invaluable for deciphering regulatory relationships in transcriptional circuits, for studying their properties upon perturbation, and for connecting physiological responses and diseases to their molecular underpinnings (e.g., Kim et al., 2012; Watson et al., 2013; Yang et al., 2014).

Key to our approach is the premise that the distinct profiles of splicing changes caused by perturbation of regulatory factors depend on the functional bearings of those factors and can, therefore, be used as proxies for inferring functional relationships. We derive a network that recapitulates the topology of known splicing complexes and provides an extensive map of >500 functional associations among ~200 splicing factors (SFs). We use this network to identify general and particular mechanisms of AS regulation and to identify key SFs that mediate the effects on AS of perturbations induced by iron (see accompanying manuscript by Tejedor et al., 2014, in this issue of Molecular Cell) or by antitumor drugs targeting components of the splicing machinery. Our study offers a unique compendium of functional interactions among SFs, a discovery tool for coupling cell-perturbing stimuli to splicing regulation, and an expansive view of the splicing regulatory landscape and its organization.

 RESULTS

Results from a genome-wide siRNA screen to identify regulators of Fas/CD95 AS revealed that knockdown of a significant fraction of core SFs (defined as proteins identified in highly purified spliceosomal complexes assembled on model, single-intron pre-mRNAs; Wahl et al., 2009) caused changes in Fas/CD95 exon 6 inclusion (Tejedor et al., 2014). The number of core factors involved and the extent of their regulatory effects exceeded those of classical splicing regulatory factors like SR proteins or hnRNPs. To systematically evaluate the contribution of different classes of splicing regulators to AS regulation, we set up a screen in which we assessed the effects of knockdown of each individual factor on 38 ASEs relevant for cell proliferation and/or apoptosis (Figures 1A and 1B). The ASEs were selected due to their clear impact on protein function and their documented biological relevance (see Table S1 available online). A library of 270 siRNA pools (Table S2) was used, corresponding to genes encoding core spliceosomal components as well as auxiliary regulatory SFs and factors involved in other RNA-processing steps, including RNA stability, export, or polyadenylation. In addition, 40 genes involved in chromatin structure modulation were included, to probe for possible functional links between chromatin and splicing regulation (Luco et al., 2011).

The siRNA pools were individually transfected in biological triplicates in HeLa cells using a robotized procedure (Supplemental Experimental Procedures). Seventy-two hours posttransfection, RNA was isolated, cDNA libraries generated using oligo-dT and random primers, and the patterns of splicing probed by PCR using primers flanking each of the alternatively spliced regions (Figure 1A). PCR products were resolved by high-throughput capillary electrophoresis (HTCE) and the ratio between isoforms calculated as the percent spliced in (PSI) index (Katz et al., 2010), which represents the percentage of exon inclusion/alternative splice site usage. Knockdowns of factors affecting one ASE (Fas/CD95) were also validated with siRNA pools from a second library and by a second independent technique (Illumina HiSeq sequencing; Tejedor et al., 2014). Conversely, the effects of knocking down core SFs (P14, SNRPG, and SF3B1) on the ASEs were robust when different siRNAs and RNA purification methods were used (Figure S1A). Several pieces of evidence indicate that the changes in AS are not due to the induction of cell death upon depletion of essential SFs. First, cell viability was not generally compromised after 72 hr of knockdown of 13 individual core SFs (Figure S1C). Second, cells detached from the plate were washed away before RNA isolation. Third, splicing changes upon induction of apoptosis with staurosporine were distinct from those observed upon knockdown of SFs (Figure S1D).
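As an illustration of the PSI index described above, the following minimal sketch computes PSI from the signal of the inclusion and skipping isoforms and summarizes three replicates by median and standard deviation. The function name `psi` and the peak intensities are hypothetical; how the HTCE peaks were actually quantified is not specified in this section.

```python
from statistics import median, stdev


def psi(inclusion: float, skipping: float) -> float:
    """Percent spliced in: inclusion signal as a percentage of total isoform signal."""
    total = inclusion + skipping
    if total <= 0:
        raise ValueError("no isoform signal detected")
    return 100.0 * inclusion / total


# Hypothetical (inclusion, skipping) peak intensities for three biological replicates.
replicates = [(930.0, 70.0), (920.0, 80.0), (940.0, 60.0)]
psi_values = [psi(inc, skip) for inc, skip in replicates]

# Summary of the triplicate: median PSI and sample standard deviation.
psi_median = median(psi_values)
psi_sd = stdev(psi_values)
```

A change in splicing upon knockdown can then be expressed as the difference between the PSI of the knockdown and that of the control condition.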

Figure 1. Comprehensive Mapping of Functional Interactions between SFs in ASEs Implicated in Cell Proliferation and Apoptosis
(A) Flowchart of the pipeline for network generation. See main text for details.
(B) Genes with ASEs relevant for the regulation of cell proliferation and/or apoptosis used in this study.
(C) Examples of HTCE profiles for FAS/CD95 and CHEK2 ASEs under conditions of knockdown of the core SF SF3B1. Upper panels show the HTCE profiles, lower panels represent the relative intensities of the inclusion and skipping isoforms. PSI values indicate the median and standard deviation of three independent experiments.
(D) Spread of PSI changes observed for each of the events analyzed in this study upon the different knockdowns.
(E) Examples of splicing perturbation profiles for different knockdowns. The profiles represent change toward inclusion (>0) or skipping (<0) for each ASE upon knockdown of the indicated factors, quantified as a robust Z score (see Experimental Procedures).
(F) Robust correlation estimates and regression of the perturbation profiles of strongly correlated (PLRG1 versus CDC5L, upper panel) or anticorrelated (DDX52 versus SNRPG, lower panel) factors.
See also Figures S1 and S2, Table S1, and Table S2.

Figure 2. Coordinated Regulation of Splicing Events by Coherent Subsets of SFs
[Figure image not reproduced in this transcript. Recoverable labels: (A) heatmap columns (ASEs): PHF19, CHEK2, MCL1, BIRC5_D3, OLR1, SMN1, CASP9, BMF, FAS, APAF1, GADD45A, MINK1, SMN2, BIRC5_2B, NUMB, PKM2, STAT3, MAP4K2, CCNE1, NOTCH3, RAC1, DIABLO, BCL2L1, BIM_L_EL, MSTR1, MADD, CFLAR, CASP2, PAX6, FN1EDA, FN1EDB, MAP4K3, H2AFY, MAP3K7, BIM_EL; color scale: Z score from -5 (SKIP) to 5 (INC); annotation bars: SPLICING TYPE (ce, 5ss, me, complex, 3ss, ir), CELL FUNCTION (apoptosis, both, cell proliferation), GENE CATEGORY (Spliceosome, Splic. Factors, Other RNApr./Misc., Chrom. Factors). (B) y axis: mean abs Z score; groups: Core Spl. (n = 112), Spl. Factors (n = 32), Other RNApr. (n = 64), Chrom. Factors (n = 40). (C) y axis: mean abs Z score; groups: Persist. Spl. (n = 22), Trans. Early Spl. (n = 44), Trans. Mid Spl. (n = 24), Trans. Late Spl. (n = 22), EJC (n = 7).]
(A) Heatmap representation of the results of the screen. ASEs used in this study are on the x axis, and knockdown conditions are on the y axis. Data are clustered on both dimensions (Ward linkage, similarity measure based on Pearson correlation). Information about ASEs, (a) type (ce, cassette exon; 5ss, alternative 5’ splice site; me, mutually exclusive exon; complex, multiexonic rearrangement; 3ss, alternative 3’ splice site; ir, intron retention) and (b) involvement in apoptosis and/or cell proliferation regulation, as well as information on the category of genes knocked down, is color coded.
(B) Boxplot representation of the mean absolute Z score of AS changes induced by the knockdown of particular classes of factors, including core and noncore SFs, other RNA processing factors, and chromatin remodeling factors. Median and spread (interquartile range) of AS changes of mock siRNA conditions are represented as a green line and box, respectively.
(C) As in (B) for spliceosomal factors that assemble early and stay as detectable components through the spliceosome cycle; factors that are present only transiently at early, mid, or late stages of assembly; and components of the exon junction complex (EJC).
See also Figure S3, Table S2, Table S3, and Table S4.

Overall the screen generated a total of 33,288 PCR data points. These measurements provide highly robust and sensitive estimates of the relative use of competing splice sites. Figure 1D shows the spread of the effects of knockdown for all the factors in the screen for the ASEs analyzed. Three events (VEGFA, SYK, and CCND1) were excluded from further analysis, because they were not affected in the majority of the knockdowns (median absolute ΔPSI <1, Figure 1D). From these data we generated, for every knockdown condition, a perturbation profile that reflects its impact across the 35 ASEs in terms of the magnitude and direction of the change. Figures 1E and 1F show examples of such profiles and the relationships between them: in one case, knockdown of CDC5L or PLRG1 generates very similar perturbation profiles (upper diagrams in Figures 1E and 1F), consistent with their well-known physical interactions within the PRP19 complex (Makarova et al., 2004). In contrast, the profiles associated with the knockdown of SNRPG and DDX52 are to a large extent opposite to each other (lower diagrams in Figures 1E and 1F), suggesting antagonistic functions. Perturbation profiles and relationships between factors were not significantly changed when knockdown of a subset of factors was carried out for 48 hr (Figure S2A), arguing against AS changes being the consequence of secondary effects of SF knockdown in these cases (see also below the discussion on IK/SMU1).
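The two processing steps mentioned above, discarding events with a median absolute ΔPSI below 1 and expressing changes as robust Z scores, can be sketched as follows. This is a sketch under stated assumptions: the median/MAD form of the robust Z score is one common definition and may differ from the one in the Experimental Procedures, the choice of mock measurements as the reference is ours, and all names and values are hypothetical.

```python
from statistics import median

# Scaling constant that makes the MAD comparable to a standard deviation
# for normally distributed data (assumed definition, not taken from the source).
MAD_SCALE = 1.4826


def robust_z(value, reference):
    """Robust Z score of `value` relative to a list of reference measurements."""
    center = median(reference)
    mad = median(abs(x - center) for x in reference)
    if mad == 0:
        raise ValueError("reference values have zero spread")
    return (value - center) / (MAD_SCALE * mad)


def informative_events(delta_psi_by_event, cutoff=1.0):
    """Keep events whose median absolute delta-PSI across knockdowns reaches the cutoff."""
    return [
        event
        for event, deltas in delta_psi_by_event.items()
        if median(abs(d) for d in deltas) >= cutoff
    ]


# Hypothetical delta-PSI values of two events across four knockdowns.
delta_psi = {
    "EVENT_A": [-12.0, 5.0, 0.4, -3.0],  # responsive event, kept
    "EVENT_B": [0.2, -0.5, 0.1, 0.9],    # barely affected event, dropped
}
kept = informative_events(delta_psi)

# Hypothetical PSI values of mock-treated samples and one knockdown measurement.
mock_psi = [92.0, 93.0, 94.0, 93.5, 92.5]
z_knockdown = robust_z(23.5, mock_psi)  # strongly negative: shift toward skipping
```

The vector of such Z scores across the retained events, one entry per ASE, is what the text calls the perturbation profile of a knockdown.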

Pervasive and Distinct Effects of Core Spliceosome Components on Alternative Splicing Regulation

The output of the screening process is summarized in the heatmap of Figure 2A. Three main conclusions can be derived from these results. First, a large fraction of the SF knockdowns have noticeable but distinct effects on AS, indicating that knockdown of many components of the spliceosome causes switches in splice site selection, rather than a generalized inhibition of splicing of every intron. Second, a substantial fraction of AS changes correspond to higher levels of alternative exon inclusion, again arguing against simple effects of decreasing splicing activity upon knockdown of general SFs, which would typically favor skipping of alternative exons that often harbor weaker splice sites. These observations suggest extensive versatility of the effects of modulating the levels of SFs on splice site selection. Third, despite the diversity of the effects of SF knockdowns on multiple ASEs, similarities can also be drawn (Figure 2A). For example, a number of ASEs relevant for the control of programmed cell death (e.g., Fas, APAF1, CASP9, or MCL1) cluster together and are similarly regulated by a common set of core spliceosome components. This observation suggests coordinated regulation of apoptosis by core factors, although the functional effects of these AS changes appear to be complex (Figure S3A). Another example is two distinct ASEs in the fibronectin gene (cassette exons EDA and EDB, of paramount importance to control aspects of development and cancer progression; Muro et al., 2003) which cluster close together, suggesting common mechanisms of regulation of the ASEs in this locus.

To explore the effects of different classes of SFs on AS regulation, we first grouped them in four categories: genes coding for core spliceosome components, noncore SFs/regulators, RNA-processing factors not directly related to splicing, and chromatin-related factors. Core SFs display the greatest spread and average magnitude of effects, followed by noncore SFs (Figure 2B). Factors related to chromatin structure and remodeling showed a narrower range of milder effects (Figure 2B), although their values were clearly above the range of changes observed in mock knockdown samples. Of interest, knockdown of core factors leads to stronger exon skipping effects, while noncore SFs and other RNA processing and chromatin-related factors show more balanced effects (Figures S3B and S3C).

To further dissect the effects of core components, we subdivided them into four categories depending on the timing and duration of their recruitment to splicing complexes during spliceosome assembly (Wahl et al., 2009): (1) persistent components (e.g., 17S U2 snRNPs, Sm proteins) that enter the assembly process before or during complex A formation and remain until completion of the reaction, (2) transient early components that join the reaction prior to B complex activation but are scarce at later stages of the process, (3) transient middle components joining during B-act complex formation, and (4) transient late components that are only present during or after C complex formation. Our results indicate that knockdowns of factors that persist during spliceosome assembly cause AS changes of higher magnitude than those caused by knockdowns of transient factors involved in early spliceosomal complexes, which in turn are stronger and more dispersed than those caused by factors involved in middle complexes and in complex C formation or catalytic activation (Figure 2C). As observed for core factors in general, knockdown of persistent factors tends to favor skipping, while the effects of knockdown of more transient factors are more split between inclusion and skipping (Figures S3B and S3C).

Interestingly, knockdowns of components of the exon junction complex (EJC) display a range of effects that resembles those of transient early or mid splicing complexes (Figure 2C). These results are in line with recent reports documenting effects of EJC components in AS, in addition to their standard function in postsplicing processes (Ashton-Beaucage et al., 2010).

To test whether the knockdown of core factors causes a general inhibition of splicing, we measured the levels of introns (both in constitutive and alternatively spliced regions) relative to exons in four genes, upon the knockdown of four SFs (BRR2, SF3B1, SNRPG, and SLU7). The results shown in Figure 3 reveal a complex picture in which retention of certain introns is clearly increased and retention of other introns is very mild, and there is even one case (the two introns flanking the fibronectin EDI alternative exon) in which the low levels of intron detected under control conditions are reduced upon knockdown of core SFs (notably, this ASE typically displays a distinct pattern of AS upon knockdown of core SFs, Figure 2A). Therefore, rather than a general uniform inhibition of splicing, reduction in the levels of core SFs induces differential effects in different introns, a concept reminiscent of the differential effects observed for antitumor drugs targeting core components like SF3B1 (Bonnal et al., 2012).

Reconstruction of a Functional Network of Splicing Factors

To systematically and accurately map the functional relationships of SFs on AS regulation, we quantified the similarity between AS perturbation profiles for every pair of factors in our screen using a robust correlation estimate for the effects of the factors’ knockdowns across the ASEs. This measure captures the congruence between the shapes of perturbation profiles, while it does not take into account proportional differences in the magnitude of the fluctuations. Crucially, it discriminates between biological and technical outliers and is resistant to distorting effects of the latter (Supplemental Experimental Procedures).
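To illustrate why a correlation-based similarity reflects the shape of a perturbation profile rather than its magnitude, the sketch below uses the plain Pearson coefficient as a simplified stand-in. It is not the robust estimator of the study, which additionally handles technical outliers and is described in the Supplemental Experimental Procedures; the profiles are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Plain Pearson correlation between two equally long profiles."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant profile has no defined correlation")
    return sxy / sqrt(sxx * syy)


# Hypothetical perturbation profiles (Z scores over five ASEs).
profile_a = [2.0, -1.0, 0.5, 3.0, -2.5]
profile_b = [4.0 * z for z in profile_a]  # same shape, four times the magnitude
profile_c = [-z for z in profile_a]       # mirror image: opposite effects

r_same_shape = pearson(profile_a, profile_b)  # close to +1
r_opposite = pearson(profile_a, profile_c)    # close to -1
```

A proportionally stronger knockdown effect therefore still yields a correlation near +1, whereas opposite profiles, as described for SNRPG and DDX52, yield a negative value.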

We next employed glasso, a regularization-based algorithm for graphical model selection (Friedman et al., 2008; Supplemental Experimental Procedures), to reconstruct a network from these correlation estimates (Figures 4A and S4). Glasso seeks a parsimonious network model for the observed correlations. This is achieved by specifying a regularization parameter, which serves as a penalty that can be tuned in order to control the number of inferred connections (network sparsity) and therefore the false discovery rate (FDR) of the final model. For a given regularization parameter, both the number of inferred connections and the FDR are a decreasing function of the number of ASEs assayed (Figure 4B). Random sampling with different subsets of the real or reshuffled versions of the data indicates that network reconstruction converges to ~500 connections with an FDR <5% near 35 events (Figure 4B), implying that analysis of additional ASEs in the screening would have only small effects on network sparsity and accuracy.
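For reference, the optimization problem solved by the graphical lasso, in the general formulation of Friedman et al. (2008), can be written as below. The notation is ours and not taken from this article: S denotes the matrix of pairwise correlation estimates, Θ the estimated inverse covariance (precision) matrix, and ρ ≥ 0 the regularization parameter.

```latex
\hat{\Theta} \;=\; \arg\max_{\Theta \succ 0}\;
  \log\det\Theta \;-\; \operatorname{tr}(S\,\Theta) \;-\; \rho\,\lVert\Theta\rVert_{1},
\qquad
\lVert\Theta\rVert_{1} = \sum_{i,j} \lvert\theta_{ij}\rvert

% Partial correlation implied by a nonzero off-diagonal entry:
r_{ij \mid \text{rest}} \;=\; -\,\frac{\theta_{ij}}{\sqrt{\theta_{ii}\,\theta_{jj}}}
```

In this formulation a nonzero off-diagonal entry θ_ij corresponds to an inferred connection between factors i and j, and the sign of the implied partial correlation distinguishes positive from negative associations. Increasing ρ drives more entries to zero, which is how the penalty controls network sparsity.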

Figure 3. Variety of Effects of Knockdown of Core SFs on Intron Retention
(A–D) Real-time quantification of AS, intron retention, and gene expression for four genes included in the splicing network (FAS, A; MCL1, B; CHEK2, C; and FN1, D) upon knockdown of four core spliceosomal components (BRR2, SF3B1, SNRPG, and SLU7). In each of the panels, the graphs represent the following: top, scheme of genomic locations, AS patterns, and amplicons used; middle top, quantification of fold changes in expression of the different isoforms or introns (indicated by the color code) compared to mock siRNA conditions; middle bottom, relative levels of the different isoforms or introns (indicated by the color code) compared to HPRT1 housekeeping gene mRNAs; and bottom, AS changes measured by HTCE. PSI values are indicated. For all assays, values represent the mean, and error bars the standard deviation of three independent biological replicates.
[Figure image not reproduced in this transcript. Recoverable labels: panels FAS, MCL1, CHEK2, and FN1 EDB; x axis conditions CN (control), BRR2, SF3B1, SNRPG, SLU7; y axes "Fold change (log2)" and "Expression values (compared to HPRT1)"; series INC, SKIP, Intron A, Intron B, Constitutive intron, GE; N = 3. PSI values from the HTCE panels, with SD in parentheses, listed in the order in which the panels appear:]

Panel      Control        BRR2           SF3B1          SNRPG          SLU7
FAS        92.86 (1.01)   92.22 (0.16)   23.51 (0.72)   81.16 (1.05)   87.21 (2.57)
MCL1       97.91 (0.84)   94.20 (0.90)   31.18 (1.40)   96.22 (0.20)   92.83 (0.90)
CHEK2      64.94 (1.96)   39.16 (0.95)    8.82 (0.87)   43.37 (0.74)   70.37 (1.06)
FN1 EDB     6.20 (0.51)   17.84 (0.57)   17.62 (1.24)    6.45 (0.71)    5.84 (0.43)

The complete network of functional interactions among the screened factors is shown in Figure 4A. It comprises 196 nodes representing screened factors and 541 connections (average degree 5.5), of which 518 correspond to positive and 23 to negative functional associations. The topology of the resulting network has several key features. First, two classes of factors are easily distinguishable. The first class encompasses factors that form densely connected clusters composed of core spliceosomal components (Figure 4A). Factors physically linked within U2 snRNP and factors functionally related to U2 snRNP activity, and to the transition from complex A to B, form a tight cluster (orange nodes in the inset of Figure 4A). An adjacent but distinct highly linked cluster corresponds mainly to factors physically associated within U5 snRNP or the U4/5/6 tri-snRNP (blue nodes in Figure 4A, inset). Factors in these clusters and their mutual links encompass 23% of the network nodes but 63% of the total inferred functional associations (average degree 14.5). A second category includes factors that form a periphery of lower connectivity that occasionally projects to the core. This category includes many of the classical splicing regulators (e.g., SR proteins, hnRNPs) as well as several chromatin factors. An intermediate category, more densely interconnected but not reaching the density of interactions of the central modules, includes multiple members of the RNA-dependent DEAD/X box helicase family. While the number of connections (degree) is significantly higher for core spliceosomal factors in general (Figure 4C), this is especially clear for factors that represent persistent components of splicing complexes along the assembly pathway (Figures 4D and 4E). Persistent factors are particularly well connected to each other (50% of the possible functional connections are actually detected, Figure 4E). Factors that belong to consecutive complexes are significantly more linked to each other than factors that belong to early and late complexes, and factors that transiently assemble toward the final stages of spliceosome assembly display the least number of connections (Figures 4D and 4E).
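The summary statistics quoted for the complete network follow from two standard graph formulas, sketched here as a quick consistency check. Only the node and connection counts come from the text; the function names are ours, and the group size of 22 in the last line is a purely illustrative number.

```python
def average_degree(n_nodes: int, n_edges: int) -> float:
    """Each connection contributes to the degree of two nodes."""
    return 2 * n_edges / n_nodes


def possible_connections(n_nodes: int) -> int:
    """Number of distinct pairs among n_nodes nodes."""
    return n_nodes * (n_nodes - 1) // 2


def density(n_nodes: int, n_edges: int) -> float:
    """Fraction of all possible pairwise connections that are present."""
    return n_edges / possible_connections(n_nodes)


# Values reported in the text: 196 nodes, 541 connections (518 positive, 23 negative).
n_nodes, n_positive, n_negative = 196, 518, 23
n_edges = n_positive + n_negative

network_degree = average_degree(n_nodes, n_edges)  # about 5.5, as reported
network_density = density(n_nodes, n_edges)        # under 3% of all possible pairs

# Illustrative only: a group of 22 factors has 231 possible internal connections,
# so a 50% detection rate would correspond to roughly 115 links.
pairs_in_group = possible_connections(22)
```

Seen this way, the contrast between an overall density below 3% and the 50% reported among persistent factors conveys how concentrated the connections are in the core modules.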

Collectively, these results indicate that core SFs, particularly those that persist during assembly, display highly related functions in splice site recognition and fluctuations in their levels or activities have related consequences on the modulation of splice site choice. Of the observed connections, 20% can be attributed to known physical interactions and 82% of these are among core components (Figure 4A). Examples include tight links between the two subunits of U2AF, between the interacting partners PLRG1 and CDC5L or IK and SMU1. We estimate that about 50% of the functional links detected in our network could be predicted by previous knowledge of the composition or function of splicing complexes.

Of relevance, a substantial number of the functional associations closely recapitulate the composition and even, to some extent, the topology of known spliceosomal complexes (Figure S4; see also below and Discussion). These observations suggest that differences in the sensitivity of ASEs to SF perturbations reflect the specific role of these factors in the splicing process. Conversely, these results warrant the use of splicing perturbation profiles as proxies for inferring physical or functional links between factors in the splicing process.

Generality of the Functional Interactions in the AS Regulation Network

The above network was derived from variations in 35 ASEs selected because of their relevance for cell proliferation and apoptosis, but they represent a heterogeneous collection of AS types and, possibly, regulatory mechanisms. To identify functional connections persistent across AS types, we undertook a subsampling approach. We sampled subsets of 17 from the original 35 ASEs and iteratively reconstructed the network of functional associations from each subset (see Supplementary methods). Following 10,000 iterations, we asked how many functional connections were recovered in at least 90% of the sampled networks. The retrieved set of connections is shown in Figure 5A and can be considered to represent a "basal" or indispensable splicing circuitry that captures close similarities in function between SFs regardless of the subset of splicing substrates considered. The majority of these connections correspond to links between core components, with modules corresponding to major spliceosomal complexes. For example, the majority of components of U2 snRNP constitute the most prominent of the core modules (Figure 5A). Strikingly, the module is topologically subdivided into two submodules, the first corresponding to components of the SF3a and SF3b complexes and the second corresponding to proteins in the Sm complex. The connectivity between U2 snRNP components remarkably recapitulates known topological features of U2 snRNP organization, including the assembly of SF3a/b complexes in stem loops I and II and the assembly of the Sm ring in a different region of the U2 snRNA (Behrens et al., 1993) (Figures 5A and S4). That the connectivity between components derived from assessing the effects on AS of depletion of these factors correlates with topological features of the organization of the snRNP further argues that the functions in splice site selection are tightly linked to the structure and function of this particle, highlighting the potential resolution of our approach for identifying meaningful structural and mechanistic links.
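The subsampling logic described above can be sketched as follows. This is a minimal illustration under stated assumptions: network inference is replaced here by a simple threshold on the absolute Pearson correlation between perturbation profiles, which is a stand-in and not the reconstruction method used in this study, and the threshold, the toy profiles, and the function names are all hypothetical. The iteration count is reduced from 10,000 to 200 to keep the example fast.

```python
import random
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Plain Pearson correlation; returns 0.0 if either vector is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def infer_edges(profiles, events, threshold):
    """Stand-in inference (assumption): link factors with |r| >= threshold."""
    edges = set()
    for f, g in combinations(sorted(profiles), 2):
        x = [profiles[f][e] for e in events]
        y = [profiles[g][e] for e in events]
        if abs(pearson(x, y)) >= threshold:
            edges.add((f, g))
    return edges


def stable_edges(profiles, n_events, subset_size, iterations,
                 min_fraction, threshold=0.8, seed=0):
    """Keep edges recovered in at least min_fraction of the subsampled networks."""
    rng = random.Random(seed)
    counts = {}
    for _ in range(iterations):
        events = rng.sample(range(n_events), subset_size)
        for edge in infer_edges(profiles, events, threshold):
            counts[edge] = counts.get(edge, 0) + 1
    return {e for e, c in counts.items() if c / iterations >= min_fraction}


# Hypothetical toy profiles over 35 events: B tracks A exactly, C is unrelated.
a = [i % 5 - 2 for i in range(35)]
profiles = {
    "A": a,
    "B": [2 * v + 1 for v in a],
    "C": [((i // 5) % 2) * 2 - 1 for i in range(35)],
}
core = stable_edges(profiles, n_events=35, subset_size=17,
                    iterations=200, min_fraction=0.9)
```

Only the A-B link survives in this toy case, because it is recovered whichever 17 events are drawn, whereas links that depend on particular events fall below the 90% recovery cutoff.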

A second prominent "core" module corresponds to factors known to play roles in conformational changes prior to or concomitant with catalysis in spliceosomes from yeast to human, including PRP8, PRP31, or U5-200. Once again, some of the topological features of the module are compatible with physical associations between these factors (e.g., PRP8 with PRP31 or U5-116K, or PRP31 with C20ORF14). That knockdown of these late-acting factors, which coordinate the final steps of the splicing process (Bottner et al., 2005; Häcker et al., 2008; Wahl et al., 2009), causes coherent changes in splice site selection strongly argues that the complex process of spliceosome assembly can be modulated at late steps, or even at the time of catalysis, to affect splice site choice. Other links in the "core" network recapitulate physical interactions, including the aforementioned modules involving the two subunits of the 3’ splice site-recognizing factor U2AF, the interacting partners PLRG1 and CDC5L, and IK and SMU1.

To further evaluate the generality of our approach, we tested whether the functional concordance derived from our network analysis and from the iterative selection of a "core" network could be recapitulated in a different cell context and genome-wide. We focused our attention on the strong functional association between IK and SMU1, two factors that have been described as associated with complex B and its transition to Bact (Bessonov et al., 2008) (Figure 5A). Analysis of the effects of knockdown of these factors in the same sets of ASEs in HEK293 cells revealed a similar set of associations, despite the fact that individual events display different splicing ratios upon IK/SMU1 knockdown in this cell line compared to HeLa (Figures 5B and 5C). To explore the validity of this functional association in a larger set of splicing events, RNA was isolated in biological triplicates from HeLa cells transfected with siRNAs against these factors, and changes in AS were assessed using genome-wide splicing-sensitive microarrays. The results revealed a striking overlap between the effects of IK and SMU1 knockdown, both on gene expression changes and on AS changes (61% overlap, p < 1E-20) covering a wide range of gene expression levels, fold differences in isoform ratios and AS types (Figures 5D–5F, S5B, and S5C). The overlap was much more limited with RBM6, a splicing regulator (Bechara et al., 2013) located elsewhere in the network. These results validate the strong functional similarities between IK and SMU1 detected in our analysis. The molecular basis for this link could be explained by the depletion of each protein (but not their mRNA) upon knockdown of the other factor, consistent with formation of a heterodimer of the two proteins and destabilization of one partner after depletion of the other (Figure S5A). Notably, the best correlation between the effects of these factors was between IK knockdown for 48 hr and SMU1 knockdown for 72 hr, perhaps reflecting a stronger functional effect of IK on AS regulation (Figures S2B and S2D).
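The text above reports the significance of the overlap without naming the test. One common way to assess such an overlap, shown here purely as an assumption, is a one-sided hypergeometric test on the number of shared events. The function name and the toy counts below are hypothetical and do not correspond to the numbers in Figure 5.

```python
from math import comb


def overlap_pvalue(universe, n_a, n_b, n_shared):
    """P(X >= n_shared) when n_b events are drawn at random from `universe`
    events, of which n_a are changed in the other condition (hypergeometric)."""
    upper = min(n_a, n_b)
    total = comb(universe, n_b)
    tail = sum(
        comb(n_a, k) * comb(universe - n_a, n_b - k)
        for k in range(n_shared, upper + 1)
    )
    return tail / total


# Hypothetical toy counts: 20 assayed events, 5 changed in each knockdown,
# and all 5 shared between the two knockdowns.
p_full_overlap = overlap_pvalue(universe=20, n_a=5, n_b=5, n_shared=5)
p_any_overlap = overlap_pvalue(universe=20, n_a=5, n_b=5, n_shared=0)
```

With these toy counts, complete overlap has probability 1/15504 under random sampling, whereas requiring at least zero shared events trivially gives 1.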

Gene ontology analysis revealed a clear common enrichment of AS changes in genes involved in cell death and survival (Figure S5D), suggesting that IK/SMU1 could play a role in the control of programmed cell death through AS. To evaluate this possibility, we tested the effects of knockdown of these factors, individually or combined, on cell proliferation and apoptotic assays. The results indicated that their depletion activated cleavage of PARP, substantially reduced cell growth, and increased the fraction of cells undergoing apoptosis (Figures S5E–S5G). We conclude that functional links revealed by the network can be used to infer common mechanisms of splicing regulation and provide insights into the biological framework of the regulatory circuits involved.

Ancillary Functional Network Interactions Reveal Alternative Mechanisms of AS Regulation

The iterative generation of networks from subsets of ASEs explained above could, in addition to identifying general functional interactions, be used to capture functional links that are specific to particular classes of events characterized by distinct regulatory mechanisms. To explore this possibility, we sought the set of functional connections that are robustly recovered in a significant fraction of ASE subsets but are absent in others (see Supplemental Experimental Procedures). Figure 6A shows the set of such identified interactions, which therefore represents specific functional links that only emerge when particular subsets of ASEs are considered. The inset graphically represents the results of principal component analysis (PCA), showing the variability in the strength of these interactions depending on the presence or absence of different ASEs across the subsamples. The most discriminatory set of ASEs for separating these interactions is also shown (see also Figure S6). While this analysis may be constrained by the limited number of events analyzed and a higher FDR due to multiple testing, it can provide a valuable tool for discovering regulatory mechanisms underlying specific classes of ASEs, as illustrated by the example below.

A synergistic link was identified between hnRNP C and U2AF1 (the 35 kDa subunit of U2AF, which recognizes the AG dinucleotide at 3’ splice sites) for a subset of ASEs (Figures 6A and S6A). We explored a possible relationship between the effects of knocking down these factors and the presence of consensus hnRNP C binding sites (characterized by stretches of uridine residues) or 3’ splice site-like motifs in the vicinity of the regulated exons. The results revealed a significant correlation between the effects of knocking down the two factors (0.795), associated with the presence of composite elements comprised of putative hnRNP C binding sites within sequences conforming to a 3’ splice site motif (see Experimental Procedures) upstream and/or downstream of the regulated exons (Figures 6B and S6B). No such correlation was observed for ASEs lacking these sequence features (Figures 6C and S6C). These observations are in line with results from a recent report revealing that hnRNP C prevents exonization of Alu elements by competing with the binding of U2AF to Alu sequences (Zarnack et al., 2013). Our results extend these findings by capturing functional relationships between hnRNP C and U2AF in ASEs that harbor binding sites for these factors in the vicinity of the regulated splice sites, either inside Alu elements or independent of them (Figures 6B and 6D). Our data are compatible with a model in which uridine-rich and 3’ splice site-like sequences sequester U2AF away from the regulated splice sites, causing alternative exon skipping, while hnRNP C displaces U2AF away from these decoy sites, facilitating exon inclusion. In this model, depletion of U2AF and depletion of hnRNP C are expected to display the same effects in these ASEs, as observed in our data (Figures 6B and 6D). In contrast, in ASEs in which hnRNP C binding sites are located within the polypyrimidine tract of the 3’ splice site of the regulated exon, the two factors display antagonistic effects (Figures 6C and 6D), as expected from direct competition between them at a functional splice site. This example illustrates the potential of the network approach to identify molecular mechanisms of regulation on the basis of specific sequence features and the interplay between cognate factors.
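The notion of a composite element (a uridine stretch embedded in a 3’ splice site-like sequence) can be made concrete with a small scanning sketch. The two patterns below are simplifying assumptions chosen only for illustration: a run of at least four T residues in the DNA sequence stands in for an hnRNP C site, and a pyrimidine run of at least eight residues followed by N, Y, and AG stands in for a 3’ splice site-like motif. They are not the motif definitions used in this study, and both example sequences are made up.

```python
import re

# Assumed, simplified patterns (hypothetical; see lead-in).
U_TRACT = re.compile(r"T{4,}")                    # uridine stretch, DNA alphabet
SS3_LIKE = re.compile(r"[CT]{8,}[ACGT][CT]AG")    # Y-run + N + Y + AG


def composite_sites(seq):
    """Return (start, end) spans of 3' splice site-like matches that also
    contain a uridine (T) stretch."""
    seq = seq.upper()
    spans = []
    for match in SS3_LIKE.finditer(seq):
        if U_TRACT.search(match.group(0)):
            spans.append((match.start(), match.end()))
    return spans


# Made-up sequences: the first embeds a T-stretch in the pyrimidine run,
# the second has a pyrimidine run without any T-stretch.
with_u_tract = composite_sites("GGATTTTTCTCCTTTCACAGGTA")
without_u_tract = composite_sites("GGACTCTCCTCTCACAGGTA")
```

Under these assumed patterns only the first sequence yields a composite element, since the second matches the 3’ splice site-like pattern but lacks a uridine stretch.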

   

Figure 4. Functional Splicing Regulatory Network (A) Graphical representation of the reconstructed splicing network. Nodes (circles) correspond to individual factors and edges (lines) to inferred functional associations. Positive or negative functional correlations are represented by green or red edges, respectively. Edge thickness signifies the strength of the functional interaction, while node size is proportional to the overall impact (median Z score) of a given knockdown in the regulation of AS. Node coloring depicts the network’s natural separation in coherent modules (see Supplemental Experimental Procedures). Known physical interactions as reported in the STRING database (Franceschini et al., 2013) are represented in black dotted lines between factors. The inset is an expanded view of factors in the U2 and U4/5/6 snRNP complexes. (B) Number of recovered network edges using actual or randomized data sets (red and green points, respectively) as a function of the number of ASEs used. Lines represent the best fit curves for the points. (C and D) Network degree distribution. The plots represent the cumulative frequency of the number of connections (degree) for different classes of factors (as in Figures 2B and 2C). (E) Functional crosstalk between different categories of spliceosome components. Values represent the fraction of observed functional connections out of the total possible connections among factors that belong to the different categories: persistent, P; transient early, E; transient mid, M; or transient late, L factors. See also Figure S4E and Table S5.

   

[Figure 5 graphic. Recoverable panel content: (B) Z-score (IK) versus Z-score (SMU1), RobCor = 0.76; (C) Z-score (IK_HEK) versus Z-score (SMU1_HEK), RobCor = 0.77; (D and E) Venn diagrams of gene expression changes and alternative splicing changes for IK, SMU1, and RBM6; (F) AS event distribution in the GW array platform (table below), with the IK KD and SMU1 KD AS distributions.]

Alternative Event Type          Number of Events
Alter. First Exon               42926
Alter. Terminal Exon            27784
Exon Cassette                   25912
Mutually Exclusive Exons        3936
Alter. Acceptor Splice Site     6443
Alter. Donor Splice Site        4986
Intron Retention                9475
Complex                         34841
Internal Exon Deletion          3389
Total                           159692

Figure 5. Core Network Involved in AS Regulation (A) Graphical representation of the functional connections present in at least 90% of 10,000 networks generated by iterative selection of subsets of 17 out of the 35 ASEs used to generate the complete network. Network attributes represented as in Figure 4A. Known spliceosome complexes are highlighted by shadowed areas (U1 snRNP, blue; U2 snRNP, yellow; SM proteins, green; and tri-snRNP proteins, red). (B and C) Consistency of inferred functional interactions in different cell lines. Knockdown of IK or SMU1 was carried out in parallel in HeLa (B) or HEK293 cells (C), and changes for the 35 ASEs were analyzed by RT-PCR and HTCE. Robust correlation estimates and regression for the AS changes observed in IK versus SMU1 knockdowns are shown for each of the cell lines. (D) Venn diagrams of the overlap between the number of gene expression changes upon IK, SMU1, or RBM6 knockdowns.

See also Figure S5.

Mapping Drug Targets within the Spliceosome

Another application of our network analysis is to identify possible targets within the splicing machinery of physiological or external (e.g., pharmacological) perturbations that produce changes in AS that can be measured using the same experimental setting. The similarity between profiles of AS changes induced by such perturbations and the knockdown of particular factors can help to uncover SFs that mediate their effects. To evaluate the potential of this approach, cells were treated with drugs known to affect AS decisions and the patterns of changes in the 35 ASEs induced by these treatments were assessed by the same robotized procedure for RNA isolation and RT-PCR analysis used to locate their position within the network.

The results indicate that structurally similar drugs like Spliceostatin A (SSA) and Meayamycin cause AS changes that closely resemble the effects of knocking down components of U2 snRNP (Figure 7A), including close links between these drugs and SF3B1, a known physical target of SSA (Kaida et al., 2007; Hasegawa et al., 2011) previously implicated in mediating the effects of the drug through alterations in the AS of cell cycle genes (Corrionero et al., 2011). Of interest, changes in AS of MCL1 (leading to the production of the proapoptotic mRNA isoform) appear as prominent effects of both drugs (Figure 7B), confirming and extending recent observations obtained using Meayamycin (Gao et al., 2013) and suggesting that this ASE can play a key general role in mediating the antiproliferative effects of drugs that target core splicing components like SF3B1 (Bonnal et al., 2012). A strong link is also captured between each of the drugs and PPIH (Figure 7A), a peptidyl-prolyl cis-trans isomerase that has been found to be associated with U4/5/6 tri-snRNP (Horowitz et al., 1997; Teigelkamp et al., 1998) but had not been implicated directly in 3’ splice site regulation. Indeed, the network indicates a close relationship between PPIH and multiple U2 snRNP components, particularly in the SF3a and SF3b complexes, further arguing that PPIH plays a role in 3’ splice site definition closely linked to branchpoint recognition by U2 snRNP. Treatment of the cells with the Clk (Cdc2-like) kinase family inhibitor TG003, which is also known to modulate AS (Muraki et al., 2004) and causes changes in ASEs analyzed in our network (Figure 7B), led to links with a completely different set of SFs (Figure 7A), attesting to the specificity of the results. These data confirm the potential of the network analysis to identify bona fide SF targets of physiological or pharmacological perturbations and to further our understanding of the underlying molecular mechanisms of regulation.
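The idea of locating a drug within the network by comparing its AS perturbation profile with knockdown profiles can be sketched as a simple ranking. The example below ranks knockdowns by plain Pearson correlation with the drug profile. This similarity measure is an assumption made for illustration, and the profile values and the labels KD_A, KD_B, and KD_C are hypothetical, not measurements from this study.

```python
from math import sqrt


def pearson(x, y):
    """Plain Pearson correlation between two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def rank_knockdowns(drug_profile, knockdown_profiles):
    """Return (name, correlation) pairs sorted from most to least similar."""
    scored = [
        (name, pearson(drug_profile, profile))
        for name, profile in knockdown_profiles.items()
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical Z-score profiles over five ASEs.
drug = [1.0, -2.0, 0.5, 3.0, -1.0]
knockdowns = {
    "KD_A": [0.9, -1.8, 0.7, 2.5, -1.2],    # closely mimics the drug
    "KD_B": [-1.0, 2.0, -0.5, -3.0, 1.0],   # exact opposite of the drug
    "KD_C": [0.5, 0.4, -0.3, 0.1, 0.2],     # largely unrelated
}
ranked = rank_knockdowns(drug, knockdowns)
```

The top-ranked knockdown is the candidate mediator of the drug's effect in this toy setting, while strongly negative correlations would point to factors with opposite effects.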

 DISCUSSION

The combined experimental and computational approach described in this manuscript provides a rich source of information and a powerful toolset for studying the splicing process and its regulation. Similar approaches could be useful to systematically capture information about other aspects of transcriptional or posttranscriptional gene regulation. First, the data provide comprehensive information about SFs (and some additional chromatin factors) relevant to understand the regulation of 35 ASEs important for cell proliferation and apoptosis. Second, the network analysis reveals functional links between SFs, which in some cases are based upon physical interactions. The reconstruction of the known topology of some complexes (e.g., U2 snRNP) by the network suggests that the approach captures important aspects of the operation of these particles and therefore has the potential to provide novel insights into the composition and intricate workings of multiple spliceosomal subcomplexes. Third, the network can serve as a resource for exploring mechanisms of AS regulation induced by perturbations of the system, including genetic manipulations, internal or external stimuli, or exposure to drugs. As an example, the network was used in the accompanying manuscript by Tejedor et al. (2014) to identify a link between AS changes induced by variations in intracellular iron and the function of SRSF7, which can be explained in terms of iron-induced changes in the RNA binding and splicing regulatory activity of this zinc-knuckle-containing SR protein. Other examples include (1) the identification of the extensive regulatory overlap between the interacting proteins IK/RED and SMU1, relevant for the control of programmed cell death; (2) evidence for a general role of uridine-rich hnRNP C binding sequences (some associated with transposable elements; Zarnack et al., 2013) in the regulation of nearby alternative exons via antagonism with U2AF; and (3) delineation of U2 snRNP components and other factors, including PPIH, that are functionally linked to the effects of antitumor drugs on AS.

Our data reveal changes of AS regulation mediated either by core splicing components or by classical regulatory factors like proteins of the SR and hnRNP families. While the central core of functional links is likely to operate globally, alternative configurations (particularly in the periphery of the network) are likely to emerge if similar approaches are applied to other cell types, genes, or biological contexts. We report considerable versatility in the effects of core spliceosome factors on alternative splicing. Precedents exist for a role of core SFs on the regulation of splicing efficiency or splice site selection in yeast, Drosophila, and mammalian cells (Clark et al., 2002; Park et al., 2004; Pleiss et al., 2007; Saltzman et al., 2011). Despite their assumed general roles in the splicing process, depletion or mutation of particular core factors led to differential splicing effects in these studies, and at least in some cases the differential effects could be attributed to particular features of the regulated pre-mRNAs. For example, Clark et al. (2002) found that the core components Prp17p and Prp18p are dispensable for splicing of yeast introns with short branchpoint to 3’ splice site distances. Pleiss et al. (2007) found that introns of yeast ribosomal protein genes are particularly sensitive to mutation of various core factors, including two RNA helicases of the DEAD/X box family (PRP2 and PRP5) involved in spliceosome conformational transitions and PRP8, a highly conserved protein involved in catalytic activation

Figure 5 (continued). (E) Venn diagrams of the overlap between the number of AS changes upon knockdown of IK, SMU1, or RBM6. (F) Distribution of ASEs in the Affymetrix array platform (upper panel) and distribution of the AS changes observed upon IK (lower left) or SMU1 (lower right) knockdown. p values correspond to the difference between the distribution of AS changes upon knockdown and the distribution of ASEs in the array.

   


Figure 6. Variable Functional Interactions Extracted from Networks Generated from Subsets of ASEs (A) Network of ancillary functional connections characteristic of subsets of ASEs. Lines link SFs that display functional similarities in their effects on subsets of ASEs. Line colors indicate the positions of the interactions in the PCA shown in the inset, which captures the different relative contributions of the ASEs in defining these connections (see also Figure S6 and Experimental Procedures). For clarity, only the top ten discriminatory ASEs are shown. (B and C) Correlation values and regression for the effects of U2AF1 versus HNRPC knockdowns for ASEs in the presence (B) or absence (C) of upstream composite HNRPC binding/3’ splice site-like elements.

See also Figure S6.

(Collins and Guthrie, 1999; Mozaffari-Jovin et al., 2012). Distinct effects of the mutations on different ribosomal genes were also observed, arguing again for specific effects on splicing of certain genes. DEAD/H box proteins as well as components of U1, U2, and U4/6 snRNP were found to modulate particular ASEs in an RNAi screen of RNA binding proteins in Drosophila cells (Park et al., 2004). In contrast with the standard function of classical splicing regulators acting through cognate binding sites specific to certain target RNAs, core factors could both carry out general, essential functions for intron removal and in addition display regulatory potential if their levels become limiting for the function of complexes, leading to differences in the efficiency or kinetics of assembly on alternative splice sites. This could be the case for the knockdown of SmB/B′, a component of the Sm complex present in most spliceosomal snRNPs, which leads to autoregulation of its own pre-mRNA as well as to effects on hundreds of other ASEs, particularly in genes encoding RNA processing factors (Saltzman et al., 2011).

The results of our genome-wide siRNA screen for regulators of Fas AS (Tejedor et al., 2014) provide unbiased evidence that an extensive number of core SFs have the potential to contribute to the regulation of splice site selection. This is, in fact, the most populated category of screen hits. Similarly, the results of our network highlight coherent effects on alternative splice site choice of multiple core components, revealing also that the extent of their effects on alternative splice site selection can be largely attributed to the duration and order of their recruitment in the splicing reaction.

Remarkably, effects on splice site selection are associated with depletion not only of complexes involved in early splice site recognition but also of factors involved in complex B formation or even in catalytic activation of fully assembled spliceosomes. While particular examples of AS regulation at the time of the transition from complex A to B and even at later steps had been reported (House and Lynch, 2006; Bonnal et al., 2008; Lallena et al., 2002), our results indicate that the implication of late factors is quite general. Such links, e.g., those involving PRP8 and other interacting factors closely positioned near the reacting chemical groups at the time of catalysis, like PRP31 or U5 200KD, are actually part of the persistent functional interactions that emerge from the analysis of any subsample of ASEs analyzed, highlighting their general regulatory potential in splice site selection. How can factors involved in late steps of the splicing process influence splice site choices? It is conceivable that limiting amounts of late spliceosome components would favor splicing of alternative splice site pairs harboring early complexes that can more efficiently recruit late factors, or influence the kinetics of conformational changes required for catalysis. In this context, the regulatory plasticity revealed by our network analysis may be explained, at least in part, by the emerging realization that a substantial number of SFs contain disordered regions when analyzed in isolation (Korneta and Bujnicki, 2012; Chen and Moore, 2014). Such regions may be flexible enough to adopt different conformations in the presence of other spliceosomal components, allowing alternative routes for spliceosome assembly on different introns, with some pathways being more sensitive than others to the depletion of a general factor.

Kinetic effects on the assembly and/or engagement of splice sites to undergo catalysis would be particularly effective if spliceosome assembly is not an irreversible process. Results of single molecule analysis are indeed compatible with this concept (Tseng and Cheng, 2008; Abelson et al., 2010; Hoskins et al., 2011; Hoskins and Moore, 2012; Shcherbakova et al., 2013), thus opening the extraordinary complexity of conformational transitions and dynamic compositional changes of the spliceosome as possible targets for regulation. Interestingly, while depletion of early factors tends to favor exon skipping, depletion of late factors causes a similar number of exon inclusion and skipping effects, suggesting high plasticity in splice site choice at this stage of the process.

Given this potential, it is conceivable that modulation of the relative concentration of core components can function as a physiological mechanism for splicing regulation, for example during development and cell differentiation. Indeed, variations in the relative levels of core spliceosomal components have been reported (Wong et al., 2013). Furthermore, recent reports revealed the high incidence of mutations in core SFs in cancer. For example, the gene encoding SF3B1, a component of U2 snRNP involved in branchpoint recognition, has been described as among the most highly mutated in myelodysplastic syndromes (Yoshida et al., 2011), chronic lymphocytic leukemia (Quesada et al., 2012), and other cancers (reviewed in Bonnal et al., 2012), correlating with different disease outcomes. Remarkably, SF3B1 is the physical target of antitumor drugs like Spliceostatin A and, likely, Meayamycin, as captured by our study. These observations are again consistent with the idea that depletion, mutation, or drug-mediated inactivation of core SFs may not simply cause a collapse of the splicing process, at least under conditions of limited physical or functional depletion, but rather lead to changes in splice site selection that can contribute to physiological, pathological, or therapeutic outcomes. SF3B1 participates in the stabilization and proofreading of U2 snRNP binding to the branchpoint region (Corrionero et al., 2011), and it is therefore possible that variations in the activity of this protein result in switches between alternative 3’ splice sites harboring branch sites with different strengths and/or flanking decoy binding sites. Similar concepts may apply to explain the effects of mutations in other SFs linked to disease, including, for example, mutations in PRP8 leading to retinitis pigmentosa (Pena et al., 2007).

In summary, the data and methodological approaches presented in this study provide a rich resource for understanding the function of the spliceosome and the mechanisms of AS regulation, including the identification of targets of physiological, pathological, or pharmacological perturbations within the complex splicing machinery.

(D) Schematic representation of the distribution of composite HNRPC binding sites/3’ splice site-like sequences in representative examples of the splicing events analyzed. The relative position of Alu elements in these regions is also shown. Genomic distances are drawn to scale. The direction of the arrows indicates up- or downregulation of cassette exons upon knockdown of U2AF1 (black arrows) or HNRPC (white arrows). See also

[Figure 7 graphic. Panel A: network connecting Spliceostatin, Meayamycin, and TG003 to the nodes PPIH, SNRPD3, SNRPA1, SF3A1, SF3B4, SNRPB2, SF3B2, SF3B1, SART1, SNRPF, SF3B3, SF3A3, and DDX48, with the chemical structures of Spliceostatin A, Meayamycin, and TG003. Panel B: scaled Z-score profiles for Spliceostatin, Meayamycin, and TG003 across the events FAS, OLR1, PAX6, CHEK2, FN1EDB, FN1EDA, NUMB, BCL2L1, RAC1, MSTR1, MADD, CASP2, APAF1, MINK1, MAP3K7, MAP4K3, NOTCH3, MAP4K2, PKM2, MCL1, CFLAR, CASP9, DIABLO, BMF, CCNE1, H2AFY, STAT3, GADD45A, BIM_L_EL, BIM_EL, BIRC5_2B, BIRC5_D3, SMN1, SMN2, and PHF19.]

Figure 7. Mapping the Effects of Pharmacological Treatments to the Splicing Network
(A) Functional connections of splicing inhibitory drugs with the splicing machinery. See text for details.
(B) Splicing perturbation profiles for Spliceostatin A, Meayamycin, or TG003 treatments across the 35 ASEs analyzed


EXPERIMENTAL PROCEDURES

siRNA Library Transfection and mRNA Extraction
HeLa cells were transfected in biological triplicates with siRNA pools (ON TARGET plus smartpool, Dharmacon, Thermo Scientific) against 270 splicing and chromatin remodeling factors (see Table S2). Endogenous mRNAs were purified 72 hr posttransfection using oligo dT-coated 96-well plates (mRNA catcher PLUS, Life Technologies) following the manufacturer’s instructions. RNA samples corresponding to treatments with splicing arresting drugs were isolated with the Maxwell 16 LEV simplyRNA kit (Promega).

RT-PCR and High-Throughput Capillary Electrophoresis
Cellular mRNAs were reverse transcribed using Superscript III Retro Transcriptase (Invitrogen, Life Technologies) following the manufacturer’s recommendations. PCR reactions for every individual splicing event analyzed were carried out using forward and reverse primers in exonic sequences flanking the alternatively spliced region of interest and further reagents provided in the GOTaq DNA polymerase kit (GoTaq, Promega). Primers used in this study are listed in Table S6.

HTCE measurements for the different splicing isoforms were performed in 96-well format in a Labchip GX Caliper workstation (Caliper, Perkin Elmer) using a HT DNA High Sensitivity LabChip chip (Perkin Elmer). Data values were obtained using the Labchip GX software analysis tool (version 3.0).

Quantification of AS Changes from HTCE Measurements
Robust estimates of isoform ratios upon siRNA or pharmacological treatments were obtained using the median PSI indexes of the biological triplicates for each knockdown-AS pair. From this value, x_i, the effect of each treatment was summarized as a robust Z score (Birmingham et al., 2009).

Robust Sample Correlation Estimation
Robust correlation estimation is based on an iterative weighting algorithm that discriminates between technical outliers and reliable measurements with high leverage. The weighting for measurements relies on calculation of a reliability index that takes into account their cumulative influence on correlation estimates of the complete data set. Algorithmic details are provided in Supplemental Experimental Procedures.

Network Reconstruction
The process of network reconstruction relies on the regularization-based graphical lasso (glasso) algorithm for graphical model selection (Friedman et al., 2008). The glasso process for network reconstruction was implemented using the glasso R package by Jerome Friedman, Trevor Hastie, and Rob Tibshirani (http://statweb.stanford.edu/~tibs/glasso/). A detailed exposition of the algorithmic steps for network reconstruction and analysis, glasso, and its use for graphical model selection is provided in Supplemental Experimental Procedures.

Principal Component Analysis of Ancillary Connections
PCA for ancillary edges was performed using R-mode PCA (R function princomp). The input data set is the scaled 95 × 35 matrix containing the absolute average correlation for each of the 95 edges over the subsets of the 10,000 samples that included each of the 35 events. The coordinates for projecting the 95 edges are directly derived from their scores on the first two principal components. The arrows representing the top ten discriminatory events (based on the vector norm of the first two PC loadings) were drawn from the origin to the coordinate defined by the first two PC loadings scaled by a constant factor for display purposes.
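The robust Z score mentioned under "Quantification of AS Changes from HTCE Measurements" can be illustrated with a minimal Python sketch. It assumes the common median/MAD form described for RNAi screens (value minus median, divided by the scaled median absolute deviation). Whether the authors applied the 1.4826 scaling constant, and over which set of treatments the median was taken, is not stated in this section, so both are assumptions here; the function name and the PSI values are hypothetical.

```python
from statistics import median


def robust_z_scores(values, scale=1.4826):
    """Return (x - median) / (scale * MAD) for every x in values.

    `scale` = 1.4826 makes the MAD comparable to a standard deviation
    for normally distributed data (an assumption, not taken from the paper).
    """
    med = median(values)
    mad = median(abs(x - med) for x in values)
    if mad == 0:
        raise ValueError("MAD is zero; the robust Z score is undefined")
    return [(x - med) / (scale * mad) for x in values]


# Hypothetical median PSI values of one splicing event across five knockdowns.
psi_medians = [0.50, 0.52, 0.48, 0.51, 0.90]
for psi, z in zip(psi_medians, robust_z_scores(psi_medians)):
    print(f"PSI={psi:.2f}  robust Z={z:+.2f}")
```

Because the median and MAD are insensitive to a few extreme values, a single strong knockdown effect (the last value above) does not inflate the spread estimate the way it would with a mean and standard deviation.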

Splice Site Scoring, Identification of Probable HNRPC Binding Sites and Alu Elements
For the identification and scoring of 3’ ss we used custom-built position weight matrices (PWMs) of length 21 (18 intronic plus three exonic positions). The matrices were built using a set of human splice sites from constitutive, internal exons compiled from the hg19 UCSC annotation database (Karolchik et al., 2014). Background nucleotide frequencies were estimated from a set of strictly intronic regions. Threshold was set based on a FDR of 0.5/kbp of random intronic sequences generated using a second-order Markov chain of actual intronic regions.
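As an illustration of PWM-based splice site scoring, the sketch below builds a log-odds matrix from aligned sites and scans a sequence for windows above a threshold. It is a toy under stated assumptions: the 5-nt training sites, the uniform background, the pseudocount, and all function names are hypothetical, and the paper's threshold calibration against Markov-chain-generated random intronic sequence is not reproduced.

```python
import math

BASES = "ACGT"


def build_pwm(sites, background, pseudocount=1.0):
    """Build a log2-odds PWM from equal-length aligned sites."""
    pwm = []
    for pos in range(len(sites[0])):
        counts = {b: pseudocount for b in BASES}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        pwm.append({b: math.log2((counts[b] / total) / background[b])
                    for b in BASES})
    return pwm


def score(pwm, seq):
    """Sum the per-position log-odds of seq (same length as the PWM)."""
    return sum(column[base] for column, base in zip(pwm, seq))


def scan(pwm, seq, threshold):
    """Return (start, score) for every window scoring at or above threshold."""
    width = len(pwm)
    hits = []
    for start in range(len(seq) - width + 1):
        s = score(pwm, seq[start:start + width])
        if s >= threshold:
            hits.append((start, s))
    return hits


# Hypothetical aligned 3' splice site-like 5-mers and a uniform background.
toy_sites = ["TTCAG", "TCCAG", "CTTAG", "TTTAG"]
uniform = {b: 0.25 for b in BASES}
toy_pwm = build_pwm(toy_sites, uniform)
print(scan(toy_pwm, "AAATTTAGAAA", threshold=2.0))
```

In the setting described above, the matrix would span 21 positions and the background frequencies would come from intronic sequence rather than a uniform distribution.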

We considered as probable HNRPC binding sites consecutive stretches of U{4} as reported in Zarnack et al. (2013). Annotation of Alu elements was based on the RepeatMasker track developed by Arian Smit (http://www.repeatmasker.org) and the Repbase library (Jurka et al., 2005).
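The U-tract criterion above can be sketched with a regular expression. This is a minimal illustration, assuming that maximal runs of four or more consecutive U residues are reported as single tracts and that DNA-alphabet input (T) is treated as U; the function name and example sequence are hypothetical.

```python
import re

# Maximal runs of four or more consecutive U residues.
U_TRACT = re.compile(r"U{4,}")


def find_u_tracts(seq):
    """Return (start, end) half-open coordinates of U tracts in seq."""
    rna = seq.upper().replace("T", "U")  # accept DNA or RNA alphabets
    return [(m.start(), m.end()) for m in U_TRACT.finditer(rna)]


# Hypothetical sequence with one 5-U tract, one 3-U run (ignored), one 4-T tract.
print(find_u_tracts("ACGUUUUUGCAUUUAGTTTT"))
```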

More detailed experimental procedures are provided in Supplemental Experimental Procedures.

 

AUTHOR CONTRIBUTIONS  

P.P. set up the computational pipeline to generate the splicing network. J.R.T. set up and carried out the experimental analyses. L.V. contributed to the network analyses involving drugs. P.P., J.R.T., and J.V. designed the project and wrote the manuscript.

ACKNOWLEDGMENTS

We thank many CRG colleagues, EURASNET members, and Quaid Morris for their advice, encouragement, and critical reading of the manuscript, and Drs. Koide and Yoshida for reagents. We acknowledge the excellent technical support of the CRG Robotics and Genomics facilities. J.R.T. was supported by a PhD fellowship from Fondo de Investigaciones Sanitarias. Work in our lab is supported by Fundación Botín, Consolider RNAREG, Ministerio de Economía e Innovación, and AGAUR.


REFERENCES

Abelson, J., Blanco, M., Ditzler, M.A., Fuller, F., Aravamudhan, P., Wood, M., Villa, T., Ryan, D.E., Pleiss, J.A., Maeder, C., et al. (2010). Conformational dynamics of single pre-mRNA molecules during in vitro splicing. Nat. Struct. Mol. Biol. 17, 504–512.

Ashton-Beaucage, D., Udell, C.M., Lavoie, H., Baril, C., Lefrançois, M., Chagnon, P., Gendron, P., Caron-Lizotte, O., Bonneil, E., Thibault, P., and Therrien, M. (2010). The exon junction complex controls the splicing of MAPK and other long intron-containing transcripts in Drosophila. Cell 143, 251–262.

Barash, Y., Calarco, J.A., Gao, W., Pan, Q., Wang, X., Shai, O., Blencowe, B.J., and Frey, B.J. (2010). Deciphering the splicing code. Nature 465, 53–59.

Bechara, E.G., Sebestyén, E., Bernardis, I., Eyras, E., and Valcárcel, J. (2013). RBM5, 6, and 10 differentially regulate NUMB alternative splicing to control cancer cell proliferation. Mol. Cell 52, 720–733.

Behrens, S.E., Tyc, K., Kastner, B., Reichelt, J., and Lührmann, R. (1993). Small nuclear ribonucleoprotein (RNP) U2 contains numerous additional proteins and has a bipartite RNP structure under splicing conditions. Mol. Cell. Biol. 13, 307–319.

Bessonov, S., Anokhina, M., Will, C.L., Urlaub, H., and Lührmann, R. (2008). Isolation of an active step I spliceosome and composition of its RNP core. Nature 452, 846–850.

Birmingham, A., Selfors, L.M., Forster, T., Wrobel, D., Kennedy, C.J., Shanks, E., Santoyo-Lopez, J., Dunican, D.J., Long, A., Kelleher, D., et al. (2009). Statistical methods for analysis of high-throughput RNA interference screens. Nat. Methods 6, 569–575.

Bonnal, S., Martínez, C., Förch, P., Bachi, A., Wilm, M., and Valcárcel, J. (2008). RBM5/Luca-15/H37 regulates Fas alternative splice site pairing after exon definition. Mol. Cell 32, 81–95.

Bonnal, S., Vigevani, L., and Valcárcel, J. (2012). The spliceosome as a target of novel antitumour drugs. Nat. Rev. Drug Discov. 11, 847–859.

Bottner, C.A., Schmidt, H., Vogel, S., Michele, M., and Käufer, N.F. (2005). Multiple genetic and biochemical interactions of Brr2, Prp8, Prp31, Prp1 and Prp4 kinase suggest a function in the control of the activation of spliceosomes in Schizosaccharomyces pombe. Curr. Genet. 48, 151–161.

Chatr-Aryamontri, A., Breitkreutz, B.J., Heinicke, S., Boucher, L., Winter, A., Stark, C., Nixon, J., Ramage, L., Kolas, N., O’Donnell, L., et al. (2013). The BioGRID interaction database: 2013 update. Nucleic Acids Res. 41 (Database issue), D816–D823.

Chen, W., and Moore, M.J. (2014). The spliceosome: disorder and dynamics defined. Curr. Opin. Struct. Biol. 24, 141–149.

Clark, T.A., Sugnet, C.W., and Ares, M., Jr. (2002). Genomewide analysis of mRNA processing in yeast using splicing-specific microarrays. Science 296, 907–910.

Collins, C.A., and Guthrie, C. (1999). Allele-specific genetic interactions between Prp8 and RNA active site residues suggest a function for Prp8 at the catalytic core of the spliceosome. Genes Dev. 13, 1970–1982.

Cooper, T.A., Wan, L., and Dreyfuss, G. (2009). RNA and disease. Cell 136, 777–793.

Corrionero, A., Miñana, B., and Valcárcel, J. (2011). Reduced fidelity of branch point recognition and alternative splicing induced by the anti-tumor drug spliceostatin A. Genes Dev. 25, 445–459.

Fica, S.M., Tuttle, N., Novak, T., Li, N.S., Lu, J., Koodathingal, P., Dai, Q., Staley, J.P., and Piccirilli, J.A. (2013). RNA catalyses nuclear pre-mRNA splicing. Nature 503, 229–234.

Franceschini, A., Szklarczyk, D., Frankild, S., Kuhn, M., Simonovic, M., Roth, A., Lin, J., Minguez, P., Bork, P., von Mering, C., and Jensen, L.J. (2013). STRING v9.1: protein-protein interaction networks, with increased coverage and integration. Nucleic Acids Res. 41 (Database issue), D808–D815.

Friedman, J., Hastie, T., and Tibshirani, R. (2008). Sparse inverse covariance estimation with the graphical lasso. Biostatistics 9, 432–441.

Fu, X.D., and Ares, M., Jr. (2014). Context-dependent control of alternative splicing by RNA-binding proteins. Nat. Rev. Genet. 15, 689–701.

Gao, Y., Vogt, A., Forsyth, C.J., and Koide, K. (2013). Comparison of splicing factor 3b inhibitors in human cells. ChemBioChem 14, 49–52.

Häcker, I., Sander, B., Golas, M.M., Wolf, E., Karagöz, E., Kastner, B., Stark, H., Fabrizio, P., and Lührmann, R. (2008). Localization of Prp8, Brr2, Snu114 and U4/U6 proteins in the yeast tri-snRNP by electron microscopy. Nat. Struct. Mol. Biol. 15, 1206–1212.

Hasegawa, M., Miura, T., Kuzuya, K., Inoue, A., Won Ki, S., Horinouchi, S., Yoshida, T., Kunoh, T., Koseki, K., Mino, K., et al. (2011). Identification of SAP155 as the target of GEX1A (Herboxidiene), an antitumor natural product. ACS Chem. Biol. 6, 229–233.

Horowitz, D.S., Kobayashi, R., and Krainer, A.R. (1997). A new cyclophilin and the human homologues of yeast Prp3 and Prp4 form a complex associated with U4/U6 snRNPs. RNA 3, 1374–1387.

Hoskins, A.A., and Moore, M.J. (2012). The spliceosome: a flexible, reversible macromolecular machine. Trends Biochem. Sci. 37, 179–188.

Hoskins, A.A., Friedman, L.J., Gallagher, S.S., Crawford, D.J., Anderson, E.G., Wombacher, R., Ramirez, N., Cornish, V.W., Gelles, J., and Moore, M.J. (2011). Ordered and dynamic assembly of single spliceosomes. Science 331, 1289–1295.

House, A.E., and Lynch, K.W. (2006). An exonic splicing silencer represses spliceosome assembly after ATP-dependent exon recognition. Nat. Struct. Mol. Biol. 13, 937–944.

Jurka, J., Kapitonov, V.V., Pavlicek, A., Klonowski, P., Kohany, O., and Walichiewicz, J. (2005). Repbase Update, a database of eukaryotic repetitive elements. Cytogenet. Genome Res. 110, 462–467.

Kaida, D., Motoyoshi, H., Tashiro, E., Nojima, T., Hagiwara, M., Ishigami, K., Watanabe, H., Kitahara, T., Yoshida, T., Nakajima, H., et al. (2007). Spliceostatin A targets SF3b and inhibits both splicing and nuclear retention of pre-mRNA. Nat. Chem. Biol. 3, 576–583.

Karolchik, D., Barber, G.P., Casper, J., Clawson, H., Cline, M.S., Diekhans, M., Dreszer, T.R., Fujita, P.A., Guruvadoo, L., Haeussler, M., et al. (2014). The UCSC Genome Browser database: 2014 update. Nucleic Acids Res. 42 (Database issue), D764–D770.

Katz, Y., Wang, E.T., Airoldi, E.M., and Burge, C.B. (2010). Analysis and design of RNA sequencing experiments for identifying isoform regulation. Nat. Methods 7, 1009–1015.

Kim, D., Kim, M.S., and Cho, K.H. (2012). The core regulation module of stress-responsive regulatory networks in yeast. Nucleic Acids Res. 40, 8793–8802.

Korneta, I., and Bujnicki, J.M. (2012). Intrinsic disorder in the human spliceosomal proteome. PLoS Comput. Biol. 8, e1002641.

Lallena, M.J., Chalmers, K.J., Llamazares, S., Lamond, A.I., and Valcárcel, J. (2002). Splicing regulation at the second catalytic step by Sex-lethal involves 3’ splice site recognition by SPF45. Cell 109, 285–296.

Luco, R.F., Allo, M., Schor, I.E., Kornblihtt, A.R., and Misteli, T. (2011). Epigenetics in alternative pre-mRNA splicing. Cell 144, 16–26.

Makarova, O.V., Makarov, E.M., Urlaub, H., Will, C.L., Gentzel, M., Wilm, M., and Lührmann, R. (2004). A subset of human 35S U5 proteins, including Prp19, function prior to catalytic step 1 of splicing. EMBO J. 23, 2381–2391.

Mozaffari-Jovin, S., Santos, K.F., Hsiao, H.H., Will, C.L., Urlaub, H., Wahl, M.C., and Lührmann, R. (2012). The Prp8 RNase H-like domain inhibits Brr2-mediated U4/U6 snRNA unwinding by blocking Brr2 loading onto the U4 snRNA. Genes Dev. 26, 2422–2434.

Muraki, M., Ohkawara, B., Hosoya, T., Onogi, H., Koizumi, J., Koizumi, T., Sumi, K., Yomoda, J., Murray, M.V., Kimura, H., et al. (2004). Manipulation of alternative splicing by a newly developed inhibitor of Clks. J. Biol. Chem. 279, 24246–24254.

Muro, A.F., Chauhan, A.K., Gajovic, S., Iaconcig, A., Porro, F., Stanta, G., and Baralle, F.E. (2003). Regulated splicing of the fibronectin EDA exon is essential for proper skin wound healing and normal lifespan. J. Cell Biol. 162, 149–160.

Nilsen, T.W., and Graveley, B.R. (2010). Expansion of the eukaryotic proteome by alternative splicing. Nature 463, 457–463.

Park, J.W., Parisky, K., Celotto, A.M., Reenan, R.A., and Graveley, B.R. (2004). Identification of alternative splicing regulators by RNA interference in Drosophila. Proc. Natl. Acad. Sci. USA 101, 15974–15979.


Pena, V., Liu, S., Bujnicki, J.M., Lührmann, R., and Wahl, M.C. (2007). Structure of a multipartite protein-protein interaction domain in splicing factor prp8 and its link to retinitis pigmentosa. Mol. Cell 25, 615–624.

Pleiss, J.A., Whitworth, G.B., Bergkessel, M., and Guthrie, C. (2007). Transcript specificity in yeast pre-mRNA splicing revealed by mutations in core spliceosomal components. PLoS Biol. 5, e90.

Quesada, V., Conde, L., Villamor, N., Ordóñez, G.R., Jares, P., Bassaganyas, L., Ramsay, A.J., Beà, S., Pinyol, M., Martínez-Trillos, A., et al. (2012). Exome sequencing identifies recurrent mutations of the splicing factor SF3B1 gene in chronic lymphocytic leukemia. Nat. Genet. 44, 47–52.

Saltzman, A.L., Pan, Q., and Blencowe, B.J. (2011). Regulation of alternative splicing by the core spliceosomal machinery. Genes Dev. 25, 373–384.

Shcherbakova, I., Hoskins, A.A., Friedman, L.J., Serebrov, V., Corrêa, I.R., Jr., Xu, M.Q., Gelles, J., and Moore, M.J. (2013). Alternative spliceosome assembly pathways revealed by single-molecule fluorescence microscopy. Cell Rep. 5, 151–165.

Teigelkamp, S., Achsel, T., Mundt, C., Göthel, S.F., Cronshagen, U., Lane, W.S., Marahiel, M., and Lührmann, R. (1998). The 20kD protein of human [U4/U6.U5] tri-snRNPs is a novel cyclophilin that forms a complex with the U4/U6-specific 60kD and 90kD proteins. RNA 4, 127–141.

Tejedor, J.R., Papasaikas, P., and Valcárcel, J. (2014). Genome-wide identification of Fas/CD95 alternative splicing regulators reveals links with iron homeostasis. Mol. Cell 57. Published online December 4, 2014. http://dx.doi.org/10.1016/j.molcel.2014.10.029.

Tseng, C.K., and Cheng, S.C. (2008). Both catalytic steps of nuclear pre-mRNA splicing are reversible. Science 320, 1782–1784.

Wahl, M.C., Will, C.L., and Lührmann, R. (2009). The spliceosome: design principles of a dynamic RNP machine. Cell 136, 701–718.

Watson, E., MacNeil, L.T., Arda, H.E., Zhu, L.J., and Walhout, A.J. (2013). Integration of metabolic and gene regulatory networks modulates the C. elegans dietary response. Cell 153, 253–266.

Wong, A.K., Park, C.Y., Greene, C.S., Bongo, L.A., Guan, Y., and Troyanskaya, O.G. (2012). IMP: a multi-species functional genomics portal for integration, visualization and prediction of protein functions and networks. Nucleic Acids Res. 40 (Web Server issue), W484–W490.

Wong, J.J., Ritchie, W., Ebner, O.A., Selbach, M., Wong, J.W., Huang, Y., Gao, D., Pinello, N., Gonzalez, M., Baidya, K., et al. (2013). Orchestrated intron retention regulates normal granulocyte differentiation. Cell 154, 583–595.

Yang, Y., Han, L., Yuan, Y., Li, J., Hei, N., and Liang, H. (2014). Gene co-expression network analysis reveals common system-level properties of prognostic genes across cancer types. Nat. Commun. 5, 3231.

Yoshida, K., Sanada, M., Shiraishi, Y., Nowak, D., Nagata, Y., Yamamoto, R., Sato, Y., Sato-Otsubo, A., Kon, A., Nagasaki, M., et al. (2011). Frequent pathway mutations of splicing machinery in myelodysplasia. Nature 478, 64–69.

Zarnack, K., König, J., Tajnik, M., Martincorena, I., Eustermann, S., Stévant, I., Reyes, A., Anders, S., Luscombe, N.M., and Ule, J. (2013). Direct competition between hnRNP C and U2AF65 protects the transcriptome from the exonization of Alu elements. Cell 152, 453–466.

Zhang, C., Frias, M.A., Mele, A., Ruggiu, M., Eom, T., Marney, C.B., Wang, H., Licatalosi, D.D., Fak, J.J., and Darnell, R.B. (2010). Integrative modeling defines the Nova splicing-regulatory network and its combinatorial controls. Science 329, 439–443.

 

SUPPLEMENTAL INFORMATION ATTACHED BELOW

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

[Supplemental figure graphic, panels A–D. Labels recoverable from the graphic:]

Scaled Z-score profiles across the analyzed splicing events for SF3B1 KD #1 vs. SF3B1 KD #2, p14 KD #1 vs. p14 KD #2, and SNRPG KD #1 vs. SNRPG KD #2. RobCor values shown: 0.85, 0.77, 0.82.

siRNA sequences as printed:
SF3B1 KD #1 (OTP Pool): CGCCAAGACUCACGAAGAUCCUCGAUUCUACAGGUUAUAGGCGGACCAUGAUAAUUUCCGGGAAGAUGAAUACAAA
SF3B1 KD #2 (oligo): GACAGCAGAUUUGCUGGAUACGUGA (obtained from Corrionero et al., 2011)
P14 KD #1 (SiGENOME Pool): GUGAUCACCUAUCGGGAUUGAACAGCUUAUGUGGUCUAUGAAGUAAAUCGGAUAUUGGAACACACCUGAAACUAGA
P14 KD #1 (OTP Pool): AUAUGGACCUAUUCGUCAACCUGAAGUAAAUCGGAUAUGACUAAAUCCCACGAAUGAGUGAUCACCUAUCGGGAUU
SNRPG KD #1 (SiGENOME Pool): AGCCUUGGAACGAGUAUAACUUUAUGAACCUUGUGAUAGUGGUAAUACGAGGAAAUAGACAAGAAGUUAUCAUUGA
SNRPG KD #1 (OTP Pool): GCAUUAAAGUGAAAGAAUCCGAGGAAAUAGUAUCAUCAAGGAAUAUUGCGGGGGAUUUGAGAUGGCGACUAGUGGAC

Depletion efficiency (western blots with β-tubulin as loading control) for U2AF65, SLU7, SF3B1 KD#1, SF3B1 KD#2, SNRPG KD#1, SNRPG KD#2, SRSF1, and BRR2. Percentages shown: 56%, 82%, 90%, 75%, 96%, 74%, 69%, 83%.

Percentage of normal, early apoptotic, late apoptotic, and necrotic cells (N = 3) for Control, PRPF8, BRR2, PRPF3, PRPF31, PRPF4, PRPF6, SF3B1 KD#1, SRSF1, SLU7, U2AF1, U2AF2, P14 KD#1, P14 KD#2, SNRPG KD#1, SNRPG KD#2, and Stauro.

Scaled Z-score profiles comparing IK, SMU1, and SF3B1 knockdowns with Sta_3h and Sta_8h.

[Supplemental figure graphic, panels A–D. Labels recoverable from the graphic:]

Scatter plots comparing IK knockdown (IK_24h, IK_48h, IK_72h) with SMU1 knockdown (SMU1_24h, SMU1_48h, SMU1_72h). Rcor values shown: 0.12, 0.24, 0.12, 0.56, 0.67, 0.5, 0.52, 0.9, 0.78.

Scaled Z-score profiles across the analyzed splicing events for IK (24h, 48h, 72h), SMU1 (24h, 48h, 72h), SF3B1 (24h, 48h, 72h), and IK_48h vs. SMU1_72h.

Mean abs. Z-score for IK, SMU1, and SF3B1 at 24h, 48h, and 72h.

[Supplemental figure graphic, panels A–C. Labels recoverable from the graphic:]

Heat map of Z-scores (color scale −5 to 5) across the analyzed splicing events, annotated by:
SPLICING TYPE: ce, 5ss, me, complex, 3ss, ir
CELL FUNCTION: apoptosis, both, cell proliferation (+ cell proliferation / − apoptosis; − cell proliferation / + apoptosis)
GENE CATEGORY: Spliceosome, Splic. Factors, Other RNApr./Misc., Chrom. Factors

Mean Z-score by gene category: Core Spl., Spl. Factors, Other RNA pr., Chrom. Factors.

Mean Z-score by spliceosomal class: Persist. Spl., Trans. Early Spl., Trans. Mid Spl., Trans. Late Spl., EJC.

[Supplemental figure graphic, panels A–E. Labels recoverable from the graphic:]

Network nodes: SF3A2, SNRPA1, SF3A1, SF3B4, SNRPB2, SF3B2, SF3B1, P14, SF3B3, SF3A3, LSM3, C20ORF14, TXNL4, PRPF4, CD2BP2, PPIH, U5-200KD, DDX23, LSM6, HPRP8BP, LSM2, NHP2L1, U5-116KD, PRPF31, PRPF8, LSM4, USP39, SART1, LSM7, PRPF3, SNRPD1, SNRPD3, SNRPB, SNRPG, SNRPF, SNRPD2, CRNKL1, XAB2, CTNNBL1, SKIIP, WBP11, PLRG1, PPIL1, PPIE, BCAS2, PRPF19, AQR, CDC5L, G10, KIAA1160.

Connections between factor classes (lower-triangular layout; rows and columns are Persistent, Trans. Early, Trans. Mid., Trans. Late):

Persistent
  vs. Persistent: SF3B4–SNRPA1; SF3B3–SNRPA1; SF3A1–SNRPA1; SF3A3–SF3B3; SF3A3–SNRPA1
Trans. Early
  vs. Persistent: PPIH–SF3B2; PPIH–SNRPA1; PPIH–SF3B4; PRPF31–PRPF8; PPIH–SF3B1
  vs. Trans. Early: SNRPC–SNRP70; RBM17–SR140; IK–SMU1; PRPF31–C20ORF14; U2AF1–U2AF2
Trans. Mid.
  vs. Persistent: KIAA1604–P14; KIAA1604–SF3B1; PRPF19–U5-200KD; XAB2–SNRPA1; XAB2–SF3B4
  vs. Trans. Early: KIAA1604–DDX23; CDC40–DDX23; AQR–DDX23; KIAA1604–PPIH; CDC40–PPIH
  vs. Trans. Mid.: CDC5L–PLRG1; KIAA1604–CRNKL1; XAB2–CRNKL1; XAB2–CDC40; KIAA1604–XAB2
Trans. Late
  vs. Persistent: C19ORF29–PRPF8; SLU7–P14; SLU7–SF3B2; C19ORF29–P14; SLU7–SF3B4
  vs. Trans. Early: LENG1–SMNDC1; PRPF18–LSM7; PRPF18–LSM4; SLU7–PPIH; LENG1–NHP2L1
  vs. Trans. Mid.: SLU7–CRNKL1; SLU7–KIAA1604; SLU7–XAB2; SLU7–CDC40; LENG1–AQR
  vs. Trans. Late: SLU7–C19ORF29; DHX8–DDX41; KIAA0073–FLJ35382; DHX8–C19ORF29

[Supplemental figure graphic, panels A–G. Labels recoverable from the graphic:]

Western blots of IK, SMU1, and α-Tubulin after control (C), IK, or SMU1 siRNA treatment for 48h and 72h.

Cell number (0 to 20000) at 0 h, 24 h, 48 h, and 72 h for Control siRNA, IK KD, SMU1 KD, and Both KD.

Overlap between IK and SMU1 knockdown effects (IK only / SMU1 only / shared):
  Gene expression up: 275 / 277 / 451
  Gene expression down: 218 / 174 / 514
  Alternative splicing up: 100 / 131 / 208
  Alternative splicing down: 103 / 103 / 143

Gene ontologies (category; p-value; # genes)

IK KD gene ontologies
all genes (1458)
  Cellular Growth and Proliferation; 8.49E-12; 348
  Cell Death and Survival; 1.07E-10; 342
  Gene Expression; 1.12E-09; 212
  Cellular Movement; 2.90E-07; 174
  Cell Cycle; 3.41E-06; 135
genes up (726)
  Gene Expression; 2.49E-13; 139
  Cell Death and Survival; 1.91E-05; 175
  Cellular Growth and Proliferation; 1.93E-05; 172
  Cell Cycle; 7.94E-05; 78
  RNA Post-Transcriptional Modification; 8.18E-05; 20
genes down (732)
  Cell-To-Cell Signaling and Interaction; 1.01E-07; 91
  Cellular Growth and Proliferation; 1.15E-07; 176
  Cellular Movement; 7.29E-07; 106
  Cell Death and Survival; 1.72E-06; 170
  Cell Morphology; 5.65E-06; 102
all exons (552)
  RNA Post-Transcriptional Modification; 3.67E-11; 49
  Gene Expression; 5.46E-09; 195
  Cellular Assembly and Organization; 5.27E-05; 171
  Cellular Function and Maintenance; 5.82E-05; 148
  Cell Death and Survival; 5.95E-05; 237
exons up (308)
  Cell Cycle; 2.04E-05; 98
  Cell Morphology; 6.41E-05; 114
  Gene Expression; 8.80E-05; 132
  Cellular Assembly and Organization; 1.88E-04; 125
  Cellular Function and Maintenance; 1.88E-04; 111
exons down (246)
  RNA Post-Transcriptional Modification; 1.28E-09; 26
  Gene Expression; 2.49E-08; 72
  Cell Morphology; 4.15E-05; 41
  Cell Cycle; 5.91E-05; 59
  Cellular Assembly and Organization; 5.91E-05; 54

SMU1 KD gene ontologies
all genes (1416)
  Cell Death and Survival; 1.04E-08; 311
  Cellular Growth and Proliferation; 1.52E-08; 314
  Cell-To-Cell Signaling and Interaction; 1.37E-06; 100
  Gene Expression; 1.59E-06; 196
  Cellular Movement; 6.62E-06; 164
genes up (728)
  Gene Expression; 1.23E-08; 119
  Cellular Development; 1.63E-05; 77
  Cell Death and Survival; 1.42E-04; 154
  Cell Cycle; 3.61E-04; 68
  DNA Replication, Recombination, and Repair; 5.60E-04; 26
genes down (688)
  Cellular Movement; 1.62E-09; 102
  Cell-To-Cell Signaling and Interaction; 1.70E-08; 66
  Cellular Growth and Proliferation; 1.44E-07; 169
  Cell Death and Survival; 2.77E-06; 159
  Cell Cycle; 4.94E-05; 61
all exons (583)
  Gene Expression; 3.58E-09; 202
  RNA Post-Transcriptional Modification; 4.60E-07; 38
  Cellular Assembly and Organization; 2.25E-06; 151
  Cellular Function and Maintenance; 2.25E-06; 127
  Cell Cycle; 4.58E-06; 133
exons up (339)
  Gene Expression; 2.76E-06; 136
  Cellular Assembly and Organization; 7.41E-06; 118
  Cellular Function and Maintenance; 1.87E-05; 101
  Protein Synthesis; 2.75E-05; 48
  Cell Cycle; 1.60E-04; 92
exons down (246)
  RNA Post-Transcriptional Modification; 5.16E-07; 21
  Gene Expression; 3.85E-05; 65
  Cell Death and Survival; 5.15E-05; 86
  Cellular Assembly and Organization; 1.33E-04; 40
  Cellular Function and Maintenance; 1.33E-04; 38

Gel and blot labels: PARP; FAS (ex 6 inc, ex 6 skip); MCL1 (ex 2 inc, ex 2 skip); α-Tubulin.

Percentage of normal, early apoptotic, late apoptotic, and necrotic cells (N=6) for CN48, CN72, CN96, IK KD48, IK KD72, IK KD96, SMU1 KD48, SMU1 KD72, SMU1 KD96, Both KD48, Both KD72, Both KD96, STAURO, and STA3.

[Supplemental figure graphic, panels A–E. Labels recoverable from the graphic:]

PCA projection (PC1 vs. PC2) of edges numbered 1–95. Labeled edges: HNRPC−U2AF1, NHP2L1−SF3B4, SF3B2−SMU1. Labeled events: OLR1, CHEK2, BCL2L1, MINK1, MAP3K7, PKM2, MCL1, CFLAR, STAT3, PHF19.

Edge key (nº, edge):
1 SF3B1–SFRS10; 2 DBR1–SKIIP; 3 HNRPC–U2AF1; 4 NHP2L1–SF3B1; 5 RBM25–SF3B1
6 IK–SF3B2; 7 NHP2L1–SF3A1; 8 NHP2L1–SF3A3; 9 IK–SF3A3; 10 NHP2L1–SF3B4
11 IK–SF3B4; 12 NHP2L1–SNRPA1; 13 SMU1–SNRPA1; 14 NHP2L1–SNRPB2; 15 D6S2654E–USP39
16 C22ORF19–RY1; 17 NHP2L1–SNRPD1; 18 NHP2L1–SNRPF; 19 DDX11–DDX31; 20 SFRS1–XAB2
21 SMNDC1–XAB2; 22 DHX30–MGC2655; 23 BCAS2–MGC13125; 24 THOC2–WTAP; 25 CD2BP2–GTL3
26 CTNNBL1–YBX1; 27 PPIH–RBM25; 28 SF3B1–SMU1; 29 SF3B2–SMU1; 30 SF3A1–SMU1
31 DDX10–MORG1; 32 LSM6–MGC20398; 33 DGCR14–SFRS11; 34 DDX52–SFRS2; 35 ELAVL1–SFRS10
36 P29–SRRM2; 37 EXOSC4–NCBP2; 38 HNRPL–SNRP70; 39 FRG1–SNRP70; 40 P29–SNRPA
41 DICER1–SFPQ; 42 KIAA0073–SFPQ; 43 NCBP2–NONO; 44 DHX30–NONO; 45 HNRPA0–NONO
46 LENG1–RBM35A; 47 BRD4–C20ORF14; 48 BRD4–PRPF4; 49 NOSIP–PRPF31; 50 FRG1–PRPF31
51 DDX31–NHP2L1; 52 DDX52–SART1; 53 EXOSC4–USP39; 54 FRG1–SNRPB; 55 DHX15–SNRPG
56 FRG1–SNRPG; 57 DDX52–LSM2; 58 ASH2L–PPM1G; 59 FNBP3–SHARP; 60 RBM25–SHARP
61 BRD4–RNPC2; 62 NONO–RBM15; 63 ILF3–SKIV2L2; 64 MGC23918–SKIV2L2; 65 AOF2–DDX31
66 ACIN1–MGC2655; 67 HNRPA1–ZNF207; 68 DHX15–HSPC148; 69 CTCF–HNRPD; 70 EED–ILF3
71 SFRS1–TAF15; 72 DIS3–YBX1; 73 DHX9–HNRPH2; 74 FNBP3–HNRPH2; 75 MORF4L1–TIAL1
76 DDX52–MFAP1; 77 FRG1–MFAP1; 78 CCDC55–KIN; 79 EP300–NOSIP; 80 C20ORF14–FRG1
81 CPSF5–FRG1; 82 DEK–PPIE; 83 BUB3–SDCCAG10; 84 ILF3–PPIG; 85 EZH2–PPIG
86 CRK7–SUV39H1; 87 ILF2–SETD1A; 88 HDAC4–SETD2; 89 CCDC55–NAB2; 90 DHX15–JMJD2B
91 DDX52–JMJD2B; 92 MGC13125–SMARCA4; 93 ACIN1–PRMT5; 94 ACIN1–CTCF; 95 KHDRBS1–MORF4L1

Scatter plots of Z-score HNRPC (x axis) vs. Z-score U2AF1 (y axis), with event groups and correlations in the order printed:
Cor = −0.004: OLR1, CHEK2, FN1EDB, FN1EDA, BCL2L1, MSTR1, MADD, MINK1, NOTCH3, MAP4K2, PKM2, MCL1, GADD45A, BIM_L_EL, BIM_EL, PHF19
Cor = 0.773: CASP2, MAP3K7, BMF, APAF1, CFLAR, RAC1, CCNE1, DIABLO, H2AFY, NUMB, STAT3, FAS, MAP4K3, PAX6, CASP9, SMN1, SMN2, BIRC5_D3, BIRC5_2B
Cor = 0.0659: FAS, OLR1, PAX6, CHEK2, FN1EDB, BCL2L1, RAC1, MSTR1, CASP2, MINK1, MAP3K7, MAP4K3, NOTCH3, MAP4K2, PKM2, MCL1, BMF, STAT3, GADD45A, BIM_L_EL, BIM_EL, PHF19
Cor = 0.899: DIABLO, FN1EDA, H2AFY, CCNE1, NUMB, APAF1, MADD, CFLAR, CASP9, BIRC5_2B, BIRC5_D3, SMN1, SMN2


   

Functional splicing network reveals extensive regulatory potential of the core spliceosomal machinery

Panagiotis Papasaikas1,2,*, J. Ramón Tejedor1,2,*, Luisa Vigevani1,2 and Juan Valcárcel1-3

 

Supplemental Information.

This file contains the following items:

- Supplementary Figure legends
- Supplementary Table legends
- Supplementary Methods
- Supplementary References

 

 

Supplementary Figure legends.

Figure S1. Consistent profiles of AS changes and limited cell death induced by the knockdown of core components using different siRNAs. A. AS perturbation profiles obtained for depletion of SNRPG, P14 or SF3B1 using two different siRNAs. Robust correlation values between conditions and the siRNA sequences used for factor depletion are indicated. Positive Z-scores indicate changes towards exon inclusion, negative values indicate exon skipping. B. Knockdown efficiency for a subset of splicing factor components. Western blot analyses of protein depletion upon siRNA treatment for 6 spliceosome factors, for two of them with two independent siRNA libraries, as indicated in A. Depletion efficiency was calculated by quantifying the levels of western blot signals using ImageQuant TL software (GE Healthcare Life Sciences). C. Quantification of cell death upon depletion of core spliceosomal components or treatment with staurosporine. Data represent the percentage of normal, early apoptotic, late apoptotic or necrotic cells analyzed by flow cytometry sorting after staining with annexin V and propidium iodide. Values represent the mean and standard deviation of 3 independent biological replicates. D. Induction of apoptosis by staurosporine leads to AS changes distinct from those induced by the knockdown of core splicing factors. Perturbation profiles were measured after 3 or 8 hours of staurosporine treatment (1 mM) and compared with the perturbation profiles obtained upon knockdown of IK, SMU1 or SF3B1 for 72 hours.

 

Figure S2. Differences in knockdown conditions do not significantly alter correlation values within the network. A. Perturbation profiles upon IK, SMU1 or SF3B1 knockdown for 24, 48 or 72 hours across the 35 splicing events used for network generation. B. Perturbation profiles upon IK or SMU1 depletion at 48 and 72 hours, respectively. C. Overall effects on AS changes across the 35 events after 24, 48 or 72 hours of siRNA treatment. Box plots represent the median and spread of Z-scores of the effects across different events upon the knockdown of IK, SMU1 or SF3B1 for the different time points. D. Correlation matrix and robust correlation values between IK or SMU1 knockdowns at different time points.

 


     

 

Figure S3. Regulation of AS events involved in cell proliferation or apoptosis based on functional annotations. A. Heatmap representation of the results obtained in the screen experiment based on functional annotations of the AS events. Blue or red colors indicate splicing changes towards more cell proliferation/less apoptosis or less cell proliferation/more apoptosis for a given knockdown condition, respectively. Data are clustered on both dimensions (similarity measure based on Pearson correlation, Ward linkage). B and C. Extent and direction of AS changes upon knockdown of subsets of splicing factors. Box plots represent the skewness and spread of the mean Z-scores for AS changes observed across the AS events analyzed for core and non-core splicing factors, other RNA processing factors and chromatin factors (B) or for factors that are either persistent or transient components of the spliceosome at different stages of assembly (C). EJC corresponds to components of the Exon Junction Complex.

 

Figure S4. Sub-network topologies. A. Detailed functional interactions of U2 snRNP proteins. B. Detailed functional interactions among U4/5/6 tri-snRNP proteins. C. Functional interactions between PRP19 and Nineteen Complex components. D. Functional interactions between Sm core proteins. E. Top 5 functional network connections between persistent and different transient components of the spliceosome at different stages of assembly based on robust correlation values.

 

Figure S5. Genome-wide regulation of AS by IK and SMU1 and links with cell proliferation and apoptosis. A. Depletion of IK and SMU1 protein levels by siRNA-mediated knockdown at 48 or 72 hours. The proteins were detected by western blot analysis under the indicated conditions. B. Overlap between gene expression changes upon knockdown of IK or SMU1 by RNAi. Upper panel: up-regulated genes. Lower panel: down-regulated genes. C. Overlap in AS changes upon knockdown of IK or SMU1 by RNAi. Upper panel: changes leading to up-regulation of alternative exons. Lower panel: changes leading to down-regulation of alternative exons. D. Gene ontology analysis for genes or exons up- or down-regulated upon IK or SMU1 knockdown using Ingenuity pathway tools. E. Induction of α-PARP cleavage, a feature of apoptotic cells, upon siRNA-mediated knockdown of IK, SMU1 or both. Western blots (two upper panels) were carried out using protein extracts of cells transfected with specific siRNAs or controls. β-tubulin was used as loading control. RT-PCR assays (lower two panels) were used to detect changes in AS of Fas/CD95 and MCL1 endogenous transcripts. F. IK and/or SMU1 knockdown reduces cell proliferation. HeLa cells were transfected in sextuplicate with siRNAs against IK, SMU1 or the combination of both, and cell numbers were determined using resazurin (which measures aerobic respiration) at different time points. Error bars represent standard deviations for a given siRNA condition. G. IK and/or SMU1 knockdown promotes apoptosis. HeLa cells were transfected in sextuplicate with siRNAs against IK, SMU1 or both, collected at different time points, and the different cell populations were analyzed by flow cytometry sorting after staining with annexin V and propidium iodide. The fraction of cells under different knockdown conditions is shown.

 

Figure S6. Analysis of variable functional interactions in the AS network. A. Principal components analysis of functional interactions extracted by subsets of AS events (see also Methods). Colors indicate the positions of the interactions in the Principal Component Analysis, which captures the different relative contributions of the 35 AS events in defining these connections. The top ten discriminatory AS events based on their loadings on the first two principal components are shown as grey arrows. The numbers identify the functional connections between splicing factors and are detailed in the table at the bottom. B and C. Representation of Z-score values of AS changes upon knockdown of U2AF1 or HNRPC for AS events harboring (B) or not (C) composite HNRPC binding/3' splice site-like elements within 400 nucleotides of the alternative cassette exon boundaries. Robust correlation values are indicated within each panel. D and E. Representation of Z-score values of AS changes upon knockdown of U2AF1 or HNRPC for AS events harboring (D) or not (E) Alu elements within 0.5 kbp of the alternative cassette exon boundaries.

 

Table S1. Information on AS events analyzed in this study. The information includes gene symbol, ID, exon sequence and genomic coordinates for the AS exons analyzed.

 

Table S2. Information about the custom siRNA library utilized to knock down splicing and chromatin factors. Information includes siRNA pool catalog numbers, gene symbol, ID and accession numbers for the targeted genes. It also includes information about the categories in which their protein products were classified according to two publications (Wahl et al., 2009 and Zhou et al., 2002), the classifications used in our network analysis and also information about their functions and known interactions.

 


     

 

Table S3. Raw data of AS changes upon knockdown of splicing and chromatin factors. Each sheet contains the data corresponding to one AS event. Data include information about the genes knocked down, the corresponding siRNA pools, quantification of alternative isoforms by RT-PCR and high-throughput capillary electrophoresis, and percentage of exon inclusion (PSI index) for each of the three biological replicates, as well as the median, standard deviation, p value and Z-scores for the three values.

 

Table S4. Compendium of raw data of AS changes upon knockdown of splicing and chromatin factors (classified according to factor classes), upon knockdown of P14, SF3B1 and SNRPG with two independent siRNA libraries, upon knockdown of the IK and SMU1 core splicing factors at different time points or in different cell lines, upon treatment with drugs targeting the spliceosome and upon pharmacological treatments involved in iron homeostasis modulation (data from Tejedor et al., accompanying manuscript).

 

 

Table S5. Robust correlation values and glasso inverse covariance estimates for the complete network edge-list.

 

Table S6. Oligonucleotide sequences and information about amplicons utilized in semi-quantitative RT-PCR assays (sheet 1) or real time PCR analysis (sheet 2) used in this study.


     

 

Detailed information and references for the splicing events analyzed can be found at the following link:

https://s3.amazonaws.com/SplicingNet/ASEs.zip

The files include a schematic representation for every AS event; information regarding the functional role of the gene, as retrieved from the UniProt and GeneCards databases; the protein domain affected by the AS event; the functional effect of each of the isoforms towards cell proliferation or apoptosis; and 102 references consulted for the selection.

 

 

 

Supplementary Methods

Cell lines

HeLa CCL-2 cells and HEK293 cells were purchased from the American Type Culture Collection (ATCC). Cells were cultured in Glutamax Dulbecco's modified Eagle's medium (Gibco, Life Technologies) supplemented with 10% fetal bovine serum (Gibco, Life Technologies) and penicillin/streptomycin antibiotics (penicillin 500 U/ml; streptomycin 0.5 mg/ml, Life Technologies). Cell culture was performed in cell culture dishes in a humidified incubator at 37 °C under 5% CO2.

 

Drug Treatments

Pharmacological treatments with splicing-arresting drugs were performed as described in Corrionero et al. (2011). HeLa cells were treated for 3 hours with 260 nM spliceostatin A, or for 8 hours with 20 nM meayamycin or 10 µM TG003. RNA was isolated using the Maxwell 16 LEV simplyRNA kit (Promega) and RT-PCRs for the 35 splicing events were performed as described above.

 

Cell proliferation assays

HeLa cells were forward-transfected with siRNAs targeting IK, SMU1 or both genes by plating 2500 cells over a mixture containing siRNA-RNAiMAX lipofectamine complexes, as previously described. Indirect estimations of cell proliferation were obtained 24, 48 and 72 hours post transfection by treating the cells for 4 h with resazurin (5 µM) prior to the fluorescence measurements (544 nm excitation and 590 nm emission wavelengths). Plate values were obtained with an Infinite 200 PRO series multiplate reader (TECAN).

 

Cell apoptosis assays

HeLa cells were transfected with siRNAs against IK, SMU1 or both genes, as described above. Cells were trypsinized and collected for analysis 72 hours post transfection. Pellets were washed once with complete DMEM medium (10% serum plus antibiotics), and three times with 1X PBS. Cells were then resuspended in 200 µl of the binding buffer provided with the Annexin V-FITC Apoptosis Detection Kit (eBioscience), resulting in a final density of 4 x 10^5 cells/ml. 5 µl of Annexin V-FITC was added to each sample, followed by 10 min incubation at room temperature in the dark. Cells were washed once with 500 µl binding buffer and incubated with 10 µl propidium iodide (20 µg/ml) for 5 minutes. FACS analysis was performed using a FACSCalibur flow cytometry system (BD Biosciences).

 


     

 

Western blots

Protein extraction was performed in RIPA buffer and protein abundance was estimated by Bradford assay. Thirty to forty µg of total protein were fractionated by electrophoresis in 10% Bis-Tris polyacrylamide gels and transferred to PVDF membranes. Primary antibodies used in this study are listed below: IK (rabbit polyclonal, Origene TA308401), SMU1 (mouse monoclonal, Santa Cruz Biotech, sc-100896 JS-12), PARP (rabbit polyclonal, Cell Signaling, 9542S), β-Tubulin (mouse monoclonal, Sigma-Aldrich, T4026), SF3B1 (rabbit polyclonal, Abcam, ab39578), SNRPG (rabbit polyclonal, Abcam, ab111194), SLU7 (rabbit polyclonal, gift from Robin Reed), U2AF65 (mouse monoclonal, MC3 clone), SRSF1 (mouse monoclonal, gift from Adrian Krainer), BRR2 (rabbit polyclonal, Sigma, HPA029321).

 

Real time qPCR

First strand cDNA synthesis was set up with 500 ng of RNA, 50 pmol of oligo-dT (Sigma-Aldrich), 75 ng of random primers (Life Technologies), and SuperScript III reverse transcriptase (Life Technologies) in 20 µl final volume, following the manufacturer's instructions. Quantitative PCR amplification was carried out using 1 µl of 1:2 to 1:4 diluted cDNA with 5 µl of 2X SYBR Green Master Mix (Roche) and 4 pmol of specific primer pairs in a final volume of 10 µl in white 384-well microtiter plates (Roche). qPCR mixes were analyzed in triplicate in a LightCycler 480 system (Roche) and fold change ratios were calculated according to the Pfaffl method (Pfaffl, 2001). Primer sequences used in this study are indicated in Table S6.
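The fold change calculation can be illustrated with a minimal Python sketch. It assumes the standard form of the Pfaffl method (Pfaffl, 2001), in which the ratio is the target amplification efficiency raised to the target ΔCt (control minus sample), divided by the reference efficiency raised to the reference ΔCt. The function name, the efficiencies and the Ct values are hypothetical and are not taken from this study.

```python
def pfaffl_ratio(e_target, ct_target_control, ct_target_sample,
                 e_ref, ct_ref_control, ct_ref_sample):
    """Relative expression ratio of a target gene versus a reference gene.

    e_target and e_ref are amplification efficiencies (2.0 = perfect doubling
    per cycle); Ct values are threshold cycles in control and treated samples.
    """
    delta_target = ct_target_control - ct_target_sample
    delta_ref = ct_ref_control - ct_ref_sample
    return (e_target ** delta_target) / (e_ref ** delta_ref)


# Hypothetical example: the target crosses threshold 2 cycles later in the
# treated sample, while the reference gene is unchanged.
ratio = pfaffl_ratio(2.0, 24.0, 26.0, 2.0, 18.0, 18.0)
print(ratio)  # 0.25, i.e. a four-fold reduction
```

With efficiencies of exactly 2 for both amplicons, this expression reduces to the familiar 2^(-ΔΔCt) calculation.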

 

Affymetrix arrays

HeLa cells were transfected with siRNAs against IK, SMU1 or RBM6 in biological triplicate as described above or (for RBM6) as described in Bechara et al., 2013. RNA was isolated using the RNeasy Mini Kit (Qiagen). RNA quality was estimated by Agilent Bioanalyzer nano assay and samples were hybridized to GeneChip Human Exon 1.0 ST Array (Affymetrix) by Genosplice Technology. Array analysis was performed by Genosplice Technology and the data (gene expression, alternative splicing changes and gene ontologies upon IK, SMU1 knockdown conditions) are available at https://s3.amazonaws.com/SplicingNet/HJAY_arrays.zip and in the GEO database under accession number GSE56605.

 

Glasso-based Network Reconstruction from Robust Correlation Estimates

Here we describe the process of deriving an undirected network that models functional interconnections of splicing perturbations based on the effects of gene KDs/cell-perturbing stimuli on ASEs. The main steps are:

1. Data preprocessing and reduction. This involves:
1.1. Removal of sparse variables and imputation of missing values.
1.2. Removal of uninformative events.
1.3. Data scaling (standardization).
2. Robust sample covariance estimation.
3. Network reconstruction based on the graphical lasso (glasso) model selection algorithm.

In the next sections we describe the basis and algorithmic details for each of the above steps.

 


     

 

 

1. Data preprocessing and reduction.

1.1 Removal of sparse variables and imputation of missing values

Estimation of the sample covariance matrix requires complete data. However, the p x n matrix that contains the measurements of the p variables (genes) in n conditions (ASEs) includes missing (NA) values. To overcome this problem, sparse variables (#NA values > n/2) are first removed. The remaining missing values are subsequently filled in by applying k-nearest-neighbor based imputation (Hastie et al., 1999, R function impute available as part of the Bioconductor project at http://www.bioconductor.org/). For the purposes of this work we set k = ceiling(0.25 x sqrt(p)).
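The filtering and imputation steps can be sketched in Python as follows. This is a simplified stand-in, not the Bioconductor impute routine used in the study: the distance measure (mean squared difference over co-observed ASEs), the use of the number of retained variables as p, and the toy matrix are all assumptions made for illustration.

```python
import math


def drop_sparse_rows(matrix):
    """Keep variables (rows) whose number of missing values is at most n/2."""
    n = len(matrix[0])
    return [row for row in matrix if sum(v is None for v in row) <= n / 2]


def knn_impute(matrix):
    """Fill each missing entry with the mean of its k nearest rows observed there."""
    p = len(matrix)
    k = math.ceil(0.25 * math.sqrt(p))  # k as stated in the text

    def distance(r1, r2):
        # Mean squared difference over columns observed in both rows (assumption).
        shared = [(a - b) ** 2 for a, b in zip(r1, r2)
                  if a is not None and b is not None]
        return sum(shared) / len(shared) if shared else math.inf

    filled = [list(row) for row in matrix]
    for i, row in enumerate(matrix):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Candidate neighbours: other rows with an observed value in column j.
            donors = sorted(
                (distance(row, other), other[j])
                for m, other in enumerate(matrix)
                if m != i and other[j] is not None
            )
            nearest = [v for _, v in donors[:k]]
            if nearest:
                filled[i][j] = sum(nearest) / len(nearest)
    return filled


# Hypothetical 4 genes x 4 ASEs matrix; None marks a missing (NA) value.
data = [
    [1.0, 2.0, None, 4.0],
    [1.1, 2.1, 3.1, 4.1],
    [None, None, None, 0.5],  # more than n/2 missing: removed
    [5.0, 6.0, 7.0, 8.0],
]
kept = drop_sparse_rows(data)
completed = knn_impute(kept)
```

In this toy case three variables are retained, so k = ceiling(0.25 x sqrt(3)) = 1 and the missing entry of the first gene is copied from its single nearest neighbour.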

 

1.2 Removal of uninformative ASEs.

ASEs not affected to a significant degree by the KDs (median |ΔPSI| < 1) offer little information on the genes' covariance and can potentially introduce noise. They are thus discarded from subsequent analyses.
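A minimal Python sketch of this filter is given below. It assumes that ΔPSI values are expressed in percentage points and that each event maps to the list of its ΔPSI values across knockdowns; the function name, event names and numbers are hypothetical.

```python
from statistics import median


def informative_events(delta_psi, threshold=1.0):
    """Return the events whose median |ΔPSI| across knockdowns is not below the threshold."""
    return {
        event: values
        for event, values in delta_psi.items()
        if median(abs(v) for v in values) >= threshold
    }


# Hypothetical ΔPSI values (knockdown minus control) for two events.
delta_psi = {
    "EVENT_A": [0.2, -0.5, 0.3, 0.9],    # median |ΔPSI| = 0.4: discarded
    "EVENT_B": [5.0, -12.0, 0.4, -3.0],  # median |ΔPSI| = 4.0: kept
}
kept_events = informative_events(delta_psi)
print(sorted(kept_events))  # ['EVENT_B']
```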

 

 

1.3 Data Scaling.

Prior to sample covariance estimation, the data matrix is standardized along the gene dimensions. Standardization of the events ensures that only the shape, and not the magnitude, of the fluctuations is taken into account for covariance estimation, and it is equivalent to using correlation in lieu of covariance in all downstream analyses.
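The stated equivalence can be checked with a short Python sketch. It assumes that each gene's profile across events is centered and scaled to unit sample standard deviation; the two profiles are hypothetical.

```python
from statistics import mean, stdev


def standardize(profile):
    """Center a profile to zero mean and scale it to unit sample standard deviation."""
    m, s = mean(profile), stdev(profile)
    return [(x - m) / s for x in profile]


def sample_covariance(x, y):
    """Unbiased sample covariance of two equally long profiles."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)


# Hypothetical perturbation profiles of two genes across five events.
gene_a = [1.0, 2.0, 3.0, 4.0, 5.0]
gene_b = [2.0, 1.0, 4.0, 3.0, 5.0]

za, zb = standardize(gene_a), standardize(gene_b)
cov_standardized = sample_covariance(za, zb)
pearson = sample_covariance(gene_a, gene_b) / (stdev(gene_a) * stdev(gene_b))
```

Here `cov_standardized` and `pearson` agree up to floating-point error (both are 0.8 for these profiles), which is why covariance and correlation can be used interchangeably after scaling.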


     

 

 

2. Robust sample correlation estimation.

When outliers are present they dominate the correlation estimates. To overcome the sensitivity of sample correlation to outliers, we derive robust correlation estimates based on a weighting algorithm that attempts to discriminate between technical outliers and reliable influential measurements.

In particular, starting with a p x n dataset M of p standardized variables and n samples, we wish to derive a robust p x p correlation matrix. We calculate the influence of a sample u on the estimation of the correlation of two variables X_a, X_b using the deleted residual distance:

$$D_u(X_a, X_b) = \left| \rho(X_a, X_b) - \rho_{-u}(X_a, X_b) \right|$$

where ρ(X_a, X_b) and ρ_{-u}(X_a, X_b) are the Pearson's correlation estimates before and after removing the observations M_{au}, M_{bu} coming from sample u.

A measure of the reliability R^u_{a,b} of these measurements for the estimation of ρ(X_a, X_b) given all the observed data is then based on their cumulative influence on high correlates of X_a and X_b:

$$R^{u}_{a,b} = 1 \Big/ \frac{\sum_{i=1}^{p} \left[ D_u(X_a, X_i)\, \mathbb{I}(a,i) + D_u(X_b, X_i)\, \mathbb{I}(b,i) \right]}{\sum_{i=1}^{p} \left[ \mathbb{I}(a,i) + \mathbb{I}(b,i) \right]}$$

where 𝕀(a,i) is the indicator function:

$$\mathbb{I}(a,i) := \begin{cases} 1 & \text{if } a \neq i \text{ and } |\rho(X_a, X_i)| > T \\ 0 & \text{otherwise} \end{cases}$$

T being a minimum correlation threshold for considering a pair of variables.

The final robust correlation estimates are calculated as the weighted Pearson's correlation ρ(X_a, X_b; w), where w is a vector of weights of length n containing the values R^u_{a,b} for every observation u.

Intuitively, if an observation M_{au} distorts correlation estimates among all highly correlated partners of X_a, it is considered an outlier and is down-weighted. On the other hand, if including this observation yields similar correlation estimates for most of these pairs, it is considered reliable even if its influence on the individual estimate of ρ(X_a, X_b) is high. The advantage of this procedure is that it derives weights for individual measurements (as opposed to a single weight for every sample) while taking into account the complete dataset.
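The weighting scheme can be sketched in Python as a single, non-iterated pass. This is an illustrative reading of the description above and not the authors' R code: the use of the absolute difference as the deleted residual distance, the default threshold T = 0.5, the small constant eps that avoids division by zero, and the toy data are all assumptions.

```python
import math


def weighted_pearson(x, y, w):
    """Pearson correlation of x and y with non-negative observation weights w."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    cov = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    vx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    vy = sum(wi * (yi - my) ** 2 for wi, yi in zip(w, y))
    return cov / math.sqrt(vx * vy)


def pearson(x, y, skip=None):
    """Plain Pearson correlation, optionally leaving out sample index `skip`."""
    w = [0.0 if u == skip else 1.0 for u in range(len(x))]
    return weighted_pearson(x, y, w)


def robust_correlation(M, a, b, T=0.5, eps=1e-6):
    """One pass of the weighting scheme for variables a and b (rows of M)."""
    p, n = len(M), len(M[0])
    # High correlates of a and of b: the pairs selected by the indicator function.
    partners = {
        v: [i for i in range(p) if i != v and abs(pearson(M[v], M[i])) > T]
        for v in (a, b)
    }
    weights = []
    for u in range(n):
        # Deleted residual distances of sample u over all selected pairs.
        dists = [
            abs(pearson(M[v], M[i]) - pearson(M[v], M[i], skip=u))
            for v in (a, b)
            for i in partners[v]
        ]
        mean_influence = sum(dists) / len(dists) if dists else 0.0
        # Reliability = inverse mean influence; eps (an assumption) avoids 1/0.
        weights.append(1.0 / (mean_influence + eps))
    return weighted_pearson(M[a], M[b], weights)


# Hypothetical profiles: 3 variables x 8 samples; the last sample of the
# first variable departs from the trend shared by the other two variables.
M = [
    [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 2.0],
    [1.2, 1.9, 3.1, 4.2, 4.8, 6.1, 7.0, 8.1],
    [0.9, 2.1, 2.9, 4.1, 5.2, 5.9, 7.1, 7.9],
]
classical = pearson(M[0], M[1])
robust = robust_correlation(M, 0, 1)
```

When no pair of variables passes the threshold T, all samples receive the same weight and the estimate reduces to the ordinary Pearson correlation.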

The algorithm can be used iteratively; however, in our experience the estimates converge within 1 to 2 iterations. R code for implementing this algorithm is available upon request. Examples of robust correlation estimates and regression based on this metric, taken from our dataset, are shown below:

[Figure: two example scatter plots of Z-scores across the AS events. Z-score RBM17 against Z-score CRNKL1: ClassCor = 0.66, RobCor = 0.03. Z-score CDC5L against Z-score SKIIP: ClassCor = 0.26, RobCor = 0.5.]

3. Graphical model selection for Network Reconstruction using Graphical Lasso

     

 

 

Undirected graphical models (UGMs) such as Markov random fields (MRFs) represent real-world networks and attempt to capture important structural and functional aspects of the network in the graph structure. The graph structure encodes conditional independence assumptions between variables corresponding to nodes of the network. The problem of recovering the structure of the graph is known as model selection or covariance estimation.

In particular, let G=(V,E) be an undirected graph on p=|V| nodes. Given n independent, identically distributed (i.i.d.) samples of X=(X1,…,Xp), we wish to identify the underlying graph structure. We restrict the analysis to Gaussian MRFs, where the model assumes that the observations are generated from a multivariate Gaussian distribution N(µ,Σ). Based on the observation that, in the Gaussian setting, zero components of the inverse covariance matrix Σ^-1 correspond to conditional independencies given the other variables, different approaches have been proposed in order to estimate Σ^-1.
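The link between zeros of the inverse covariance matrix and missing edges can be made concrete with a small Python sketch. It relies on the standard Gaussian identity that the partial correlation between two variables is the negated off-diagonal entry of Σ^-1 divided by the square root of the product of the two diagonal entries; the function name, node names, tolerance and matrix are hypothetical.

```python
import math


def edges_from_precision(theta, names, tol=1e-8):
    """List the edges (non-zero off-diagonal entries) with their partial correlations."""
    edges = []
    for i in range(len(theta)):
        for j in range(i + 1, len(theta)):
            if abs(theta[i][j]) > tol:
                partial = -theta[i][j] / math.sqrt(theta[i][i] * theta[j][j])
                edges.append((names[i], names[j], partial))
    return edges


# Hypothetical inverse covariance matrix for three variables. The zero entry
# for (X1, X3) means X1 and X3 are conditionally independent given X2.
theta = [
    [2.0, -1.0, 0.0],
    [-1.0, 2.0, -1.0],
    [0.0, -1.0, 2.0],
]
edges = edges_from_precision(theta, ["X1", "X2", "X3"])
print(edges)  # [('X1', 'X2', 0.5), ('X2', 'X3', 0.5)]
```

The resulting graph is the chain X1 - X2 - X3: although X1 and X3 are marginally correlated through X2, no direct edge connects them.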

Graphical lasso (gLasso) provides an attractive solution to the problem of covariance estimation for undirected models when graph sparsity is a goal (Friedman et al., 2008). The gLasso algorithm has the advantage of being consistent, i.e. as the number of samples grows to infinity its parameter estimates become arbitrarily close to the true parameter values with probability 1.

L1 regularization (lasso) is a smooth form of subset selection for achieving sparsity. In the case of gLasso the model is constructed by maximizing the penalized log-likelihood function:

$$\log\det\Theta - \mathrm{tr}(S\Theta) - r\,\lVert\Theta\rVert_{1}$$

where Θ is an estimate for the inverse covariance matrix Σ^-1, S is the empirical covariance matrix of the data, ||Θ||_1 is the L1 norm, i.e. the sum of the absolute values of all elements in Θ, and r is a regularization parameter (in this case selected based on estimates of the FDR, see below). The glasso optimization problem is convex and can be solved using coordinate descent.

The glasso process for network reconstruction was implemented using the glasso R package by Jerome Friedman, Trevor Hastie and Rob Tibshirani (http://statweb.stanford.edu/~tibs/glasso/).

4. Partition of Network into modules.

Network modules or communities can be defined loosely as sets of nodes with a denser connection pattern among their members than between their members and the remainder of the network. Module detection in real-world graphs is a problem of considerable practical interest, as modules often capture meaningful functional groupings.

Typically, community structure detection methods try to maximize modularity, a measure of the overall quality of a certain network partition in terms of the identified modules (Newman, 2006). In particular, for a network partition p, the network modularity M(p) is defined as:

$$M(p) = \sum_{k=1}^{m} \left[ \frac{l_k}{L} - \left( \frac{d_k}{2L} \right)^{2} \right]$$

where m is the number of modules in p, l_k is the number of connections within module k, L is the total number of network connections and d_k is the sum of the degrees of the nodes in module k.

Here, we identify modules of genes that exhibit similar perturbation profiles among the assayed events by maximizing the network's modularity using the greedy community detection algorithm (Clauset et al., 2004) implemented in the fastgreedy.community function of the igraph package (Csardi and Nepusz, 2006, http://igraph.sourceforge.net/doc/R/fastgreedy.community.html).
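To make the modularity score concrete, the following Python sketch evaluates it for a given partition, summing over modules the fraction of connections inside the module minus the squared fraction of edge endpoints belonging to it. It only scores a partition and does not perform the greedy optimization carried out by fastgreedy.community; the function name, node labels and toy graph are hypothetical.

```python
def modularity(edges, membership):
    """Modularity of a partition of an undirected, unweighted graph.

    edges is a list of (node, node) pairs; membership maps node -> module label.
    """
    total_edges = len(edges)  # L: total number of connections
    score = 0.0
    for k in set(membership.values()):
        # l_k: connections with both endpoints inside module k.
        l_k = sum(1 for u, v in edges
                  if membership[u] == k and membership[v] == k)
        # d_k: sum of the degrees of the nodes in module k.
        d_k = sum((membership[u] == k) + (membership[v] == k) for u, v in edges)
        score += l_k / total_edges - (d_k / (2 * total_edges)) ** 2
    return score


# Hypothetical graph: two triangles joined by a single connection (C-D).
edges = [("A", "B"), ("B", "C"), ("A", "C"),
         ("D", "E"), ("E", "F"), ("D", "F"),
         ("C", "D")]
two_modules = {"A": 1, "B": 1, "C": 1, "D": 2, "E": 2, "F": 2}
one_module = {node: 1 for node in two_modules}

m_two = modularity(edges, two_modules)  # 2 * (3/7 - (7/14)**2) = 5/14
m_one = modularity(edges, one_module)   # 7/7 - (14/14)**2 = 0
```

Splitting the graph into its two triangles scores higher than keeping all nodes in a single module, which is the kind of comparison a modularity-maximizing algorithm makes at each merging step.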

 

 

Supplemental References

Bechara EG, Sebestyén E, Bernardis I, Eyras E, Valcárcel J. (2013). RBM5, 6, and 10 differentially regulate NUMB alternative splicing to control cancer cell proliferation. Mol Cell. 52(5):720-33.

Clauset A, Newman ME, Moore C. (2004). Finding community structure in very large networks. Phys Rev E Stat Nonlin Soft Matter Phys. 70(6 Pt 2):066111.

Corrionero A, Miñana B, Valcárcel J. (2011). Reduced fidelity of branch point recognition and alternative splicing induced by the anti-tumor drug spliceostatin A. Genes Dev. 25(5):445-59.

Csardi G, Nepusz T. (2006). The igraph software package for complex network research. InterJournal. Complex Systems 1695.

Friedman J, Hastie T, Tibshirani R. (2008). Sparse inverse covariance estimation with the graphical lasso. Biostatistics. 9(3):432-41.

Hastie T, Tibshirani R, Sherlock G, Eisen M, Brown P, Botstein D. (1999). Imputing Missing Data for Gene Expression Arrays. Technical report, Stanford Statistics Department.

Newman ME. (2006). Modularity and community structure in networks. Proc Natl Acad Sci U S A. 103(23):8577-82.

Zhou Z, Licklider LJ, Gygi SP, Reed R. (2002). Comprehensive proteomic analysis of the human spliceosome. Nature 419:182-5.