1
Functional Splicing Network Reveals Extensive Regulatory Potential of the Core Spliceosomal Machinery
Panagiotis Papasaikas,1,2,4 J. Ramo n Tejedor,1,2,4 Luisa Vigevani,1,2 and Juan Valca rcel1,2,3,* 1Centre de Regulacio Geno mica, Dr. Aiguader 88, 08003 Barcelona, Spain 2Universitat Pompeu Fabra, Dr. Aiguader 88, 08003 Barcelona, Spain 3Institucio Catalana de Recerca i Estudis Avanc ats, Passeig Lluis Companys 23, 08010 Barcelona, Spain 4Co-first authors
SUMMARY Pre-mRNA splicing relies on the poorly understood dynamic interplay between >150 protein compo- nents of the spliceosome. The steps at which splicing can be regulated remain largely unknown. We sys- tematically analyzed the effect of knocking down the components of the splicing machinery on alterna- tive splicing events relevant for cell proliferation and apoptosis and used this information to reconstruct a network of functional interactions. The network accurately captures known physical and functional associations and identifies new ones, revealing remarkable regulatory potential of core spliceosomal components, related to the order and duration of their recruitment during spliceosome assembly. In contrast with standard models of regulation at early steps of splice site recognition, factors involved in catalytic activation of the spliceosome display regu- latory properties. The network also sheds light on the antagonism between hnRNP C and U2AF, and on tar- gets of antitumor drugs, and can be widely used to identify mechanisms of splicing regulation.
INTRODUCTION
Pre-mRNA splicing is carried out by the spliceosome, one of the most complex molecular machineries of the cell, composed of five small nuclear ribonucleoprotein particles (U1, U2, U4/5/6 snRNP) and about 150 additional polypeptides (reviewed by Wahl et al., 2009). Detailed biochemical studies using a small number of model introns have delineated a sequential pathway for the assembly of spliceosomal subcomplexes. For example, U1 snRNP recognizes the 5’ splice site, and U2AF—a hetero- dimer of 35 and 65 kDa subunits—recognizes sequences at the 3’ end of introns. U2AF binding helps to recruit U2 snRNP to the upstream branchpoint sequence, forming complex A. U2 snRNP binding involves interactions of pre-mRNA sequences with U2 snRNA as well as with U2 proteins (e.g., SF3B1). Subse-
quent binding of preassembled U4/5/6 tri-snRNP forms complex B, which after a series of conformational changes forms com- plexes Bact and C, concomitant with the activation of the two catalytic steps that generate splicing intermediates and prod- ucts. Transition between spliceosomal subcomplexes involves profound dynamic changes in protein composition as well as extensive rearrangements of base-pairing interactions between snRNAs and between snRNAs and splice site sequences (Wahl et al., 2009). RNA structures contributed by base-pairing interac- tions between U2 and U6 snRNAs serve to coordinate metal ions critical for splicing catalysis (Fica et al., 2013), implying that the spliceosome is an RNA enzyme whose catalytic center is only established upon assembly of its individual components.
Differential selection of alternative splice sites in a pre-mRNA (alternative splicing, AS) is a prevalent mode of gene regulation in multicellular organisms, often subject to developmental regula- tion (Nilsen and Graveley, 2010) and frequently altered in disease (Cooper et al., 2009; Bonnal et al., 2012). Substantial efforts made to dissect mechanisms of AS regulation on a relatively small number of pre-mRNAs have provided a consensus picture in which protein factors recognizing cognate auxiliary sequences in the pre-mRNA promote or inhibit early events in spliceosome assembly (Fu and Ares, 2014). These regulatory factors include members of the hnRNP and SR protein families, which often display cooperative or antagonistic functions depending on the position of their binding sites relative to the regulated splice sites. Despite important progress (Barash et al., 2010; Zhang et al., 2010), the combinatorial nature of these contributions compli- cates the formulation of integrative models for AS regulation.
To systematically capture the contribution of splicing regulato- ry factors, including components of the core spliceosome (Clark et al., 2002; Park et al., 2004; Pleiss et al., 2007; Saltzman et al., 2011) to AS regulation, we set up a high-throughput screen to evaluate the effects of knocking down each individual splicing component or regulator on 36 functionally important AS events (ASEs) and developed a framework for data analysis and network modeling. Network-based approaches can provide rich representations for real-world systems that capture their or- ganization more adequately than more traditional approaches or mere cataloguing of pairwise relationships. Numerous studies have utilized them as a platform for deriving comprehensive tran- scriptional, metabolic, and physical interaction maps in several
2
(legend on next page)
3
organisms and contexts (e.g., Chatr-Aryamontri et al., 2013; Franceschini et al., 2013; Wong et al., 2012). Their application has also proven invaluable for deciphering regulatory relation- ships in transcriptional circuits, for studying their properties upon perturbation, and for connecting physiological responses and diseases to their molecular underpinnings (e.g., Kim et al., 2012; Watson et al., 2013; Yang et al., 2014).
Key to our approach is the premise that the distinct profiles of splicing changes caused by perturbation of regulatory factors depend on the functional bearings of those factors and can, therefore, be used as proxies for inferring functional relation- ships. We derive a network that recapitulates the topology of known splicing complexes and provides an extensive map of >500 functional associations among ~200 splicing factors (SFs). We use this network to identify general and particular mechanisms of AS regulation and to identify key SFs that mediate the effects on AS of perturbations induced by iron (see accompanying manuscript by Tejedor et al., 2014, in this issue of Molecular Cell) or by antitumor drugs targeting compo- nents of the splicing machinery. Our study offers a unique com- pendium of functional interactions among SFs, a discovery tool for coupling cell-perturbing stimuli to splicing regulation, and an expansive view of the splicing regulatory landscape and its organization.
RESULTS
Results from a genome-wide siRNA screen to identify regulators of Fas/CD95 AS revealed that knockdown of a significant fraction of core SFs (defined as proteins identified in highly puri- fied spliceosomal complexes assembled on model, single-intron pre-mRNAs; Wahl et al., 2009) caused changes in Fas/CD95 exon 6 inclusion (Tejedor et al., 2014). The number of core fac- tors involved and the extent of their regulatory effects exceeded those of classical splicing regulatory factors like SR proteins or hnRNPs. To systematically evaluate the contribution of different classes of splicing regulators to AS regulation, we set up a screen in which we assessed the effects of knockdown of each individual factor on 38 ASEs relevant for cell proliferation and/or apoptosis (Figures 1A and 1B). The ASEs were selected due to their clear impact on protein function and their docu- mented biological relevance (see Table S1 available online). A library of 270 siRNA pools (Table S2) was used, corresponding to genes encoding core spliceosomal components as well as auxiliary regulatory SFs and factors involved in other RNA-pro- cessing steps, including RNA stability, export, or polyadenyla- tion. In addition, 40 genes involved in chromatin structure mod-
ulation were included, to probe for possible functional links between chromatin and splicing regulation (Luco et al., 2011).
The siRNA pools were individually transfected in biological triplicates in HeLa cells using a robotized procedure (Supple- mental Experimental Procedures). Seventy-two hours posttrans- fection, RNA was isolated, cDNA libraries generated using oligo- dT and random primers, and the patterns of splicing probed by PCR using primers flanking each of the alternatively spliced regions (Figure 1A). PCR products were resolved by high- throughput capillary electrophoresis (HTCE) and the ratio be- tween isoforms calculated as the percent spliced in (PSI) index (Katz et al., 2010), which represents the percentage of exon inclusion/alternative splice site usage. Knockdowns of factors affecting one ASE (Fas/CD95) were also validated with siRNA pools from a second library and by a second independent technique (Illumina HiSeq sequencing; Tejedor et al., 2014). Conversely, the effects of knocking down core SFs (P14, SNRPG, and SF3B1) on the ASEs were robust when different siRNAs and RNA purification methods were used (Figure S1A). Several pieces of evidence indicate that the changes in AS are not due to the induction of cell death upon depletion of essential SFs. First, cell viability was not generally compromised after 72 hr of knockdown of 13 individual core SFs (Figure S1C). Sec- ond, cells detached from the plate were washed away before RNA isolation. Third, splicing changes upon induction of apoptosis with staurosporine were distinct from those observed upon knockdown of SFs (Figure S1D).
Overall the screen generated a total of 33,288 PCR data points. These measurements provide highly robust and sensitive estimates of the relative use of competing splice sites. Figure 1D shows the spread of the effects of knockdown for all the factors in the screen for the ASEs analyzed. Three events (VEGFA, SYK, and CCND1) were excluded from further analysis, because they were not affected in the majority of the knockdowns (median ab- solute DPSI <1, Figure 1D). From these data we generated, for every knockdown condition, a perturbation profile that reflects its impact across the 35 ASEs in terms of the magnitude and di- rection of the change. Figures 1E and 1F show examples of such profiles and the relationships between them: in one case, knock- down of CDC5L or PLRG1 generates very similar perturbation profiles (upper diagrams in Figures 1E and 1F), consistent with their well-known physical interactions within the PRP19 complex (Makarova et al., 2004). In contrast, the profiles associated with the knockdown of SNRPG and DDX52 are to a large extent opposite to each other (lower diagrams in Figures 1E and 1F), suggesting antagonistic functions. Perturbation profiles and relationships between factors were not significantly changed
Figure 1. Comprehensive Mapping of Functional Interactions between SFs in ASEs Implicated in Cell Proliferation and Apoptosis (A) Flowchart of the pipeline for network generation. See main text for details. (B) Genes with ASEs relevant for the regulation of cell proliferation and/or apoptosis used in this study. (C) Examples of HTCE profiles for FAS/CD95 and CHEK2 ASEs under conditions of knockdown of the core SF SF3B1. Upper panels show the HTCE profiles, lower panels represent the relative intensities of the inclusion and skipping isoforms. PSI values indicate the median and standard deviation of three independent experiments. (D) Spread of PSI changes observed for each of the events analyzed in this study upon the different knockdowns. (E) Examples of splicing perturbation profiles for different knockdowns. The profiles represent change toward inclusion (>0) or skipping (<0) for each ASE upon knockdown of the indicated factors, quantified as a robust Z score (see Experimental Procedures). (F) Robust correlation estimates and regression of the perturbation profiles of strongly correlated (PLRG1 versus CDC5L, upper panel) or anticorrelated (DDX52 versus SNRPG, lower panel) factors. See also Figures S1 and S2, Table S1, and Table S2.
4
CELL FUNCTION
SPLICING TYPE
SPLICING TYPE
ce5ssmecomplex3ssir
CELL FUNCTION
apoptosis
bothcell proliferation
GENE CATEGORY
SpliceosomeSplic. FactorsOther RNApr./ Misc.Chrom. Factors
GE
NE
CA
TE
GO
RY
SKIP INC
0.5
1.0
1.5
2.0
2.5
Mean
ab
s��=íVFRUH
&RUH�6S
O�
6SO��
FDFWRUV
2WKHU�51$SU
.
&KURP
��FDFWRUV
0.0
0.5
1.0
1.5
2.0
2.5
Mean
ab
s��=íVFRUH
PHUVLVW��6
SO�
TUa
ns. E
aUO\
�6SO�
TUa
ns��0
LG�6SO�
TUa
ns��/DWH�6S
O�
EJC
A B
C
í� 0 5=íVFRUH
n=112 n=32 n=64 n=40
n=22 n=44 n=24 n=22 n=7
PH
F19
CH
EK
2M
CL1
BIR
C5_
D3
OLR
1S
MN
1C
AS
P9
BM
FFA
SA
PAF1
GA
DD
45A
MIN
K1
SM
N2
BIR
C5_
2BN
UM
BP
KM
2S
TAT3
MA
P4K
2C
CN
E1
NO
TCH
3R
AC
1D
IAB
LOB
CL2
L1B
IM_L
_EL
MS
TR1
MA
DD
CFL
AR
CA
SP
2PA
X6
FN1E
DA
FN1E
DB
MA
P4K
3H
2AFY
MA
P3K
7B
IM_E
L
Figure 2. Coordinated Regulation of Splicing Events by Coherent Subsets of SFs (A) Heatmap representation of the results of the screen. ASEs used in this study are on the x axis, while knockdown conditions on the y axis. Data are clustered on both dimensions (ward linkage, similarity measure based on Pearson correlation). Information about ASEs, (a) type (ce, cassette exon; 5ss, alternative 5’ splice site; me, mutually exclusive exon; complex, multiexonic rearrangement; 3ss, alternative 3’ splice site; ir, intron retention) and (b) involvement in apoptosis and/or cell proliferation regulation, as well as information on the category of genes knocked down, is color coded. (B) Boxplot representation of the mean absolute Z score of AS changes induced by the knockdown of particular classes of factors, including core and noncore SFs, other RNA processing factors, and chromatin remodeling factors. Median and spread (interquartile range) of AS changes of mock siRNA conditions are represented as a green line and box, respectively. (C) As in (B) for spliceosomal factors that assemble early and stay as detectable components through the spliceosome cycle; factors that are present only transiently at early, mid, or late stages of assembly; and components of the exon junction complex (EJC). See also Figure S3, Table S2, Table S3, and Table S4.
when knockdown of a subset of factors was carried out for 48 hr (Figure S2A), arguing against AS changes being the conse- quence of secondary effects of SF knockdown in these cases (see also below the discussion on IK/SMU1).
Pervasive and Distinct Effects of Core Spliceosome Components on Alternative Splicing Regulation The output of the screening process is summarized in the heat- map of Figure 2A. Three main conclusions can be derived from these results. First, a large fraction of the SF knockdowns have
noticeable but distinct effects on AS, indicating that knockdown of many components of the spliceosome causes switches in splice site selection, rather than a generalized inhibition of splicing of every intron. Second, a substantial fraction of AS changes correspond to higher levels of alternative exon inclu- sion, again arguing against simple effects of decreasing splicing activity upon knockdown of general SFs, which would typically favor skipping of alternative exons that often harbor weaker splice sites. These observations suggest extensive versatility of the effects of modulating the levels of SFs on splice site
5
selection. Third, despite the diversity of the effects of SF knock- downs on multiple ASEs, similarities can also be drawn (Fig- ure 2A). For example, a number of ASEs relevant for the control of programmed cell death (e.g., Fas, APAF1, CASP9, or MCL1) cluster together and are similarly regulated by a common set of core spliceosome components. This observation suggests coordinated regulation of apoptosis by core factors, although the functional effects of these AS changes appear to be com- plex (Figure S3A). Another example is two distinct ASEs in the fibronectin gene (cassette exons EDA and EDB, of paramount importance to control aspects of development and cancer pro- gression; Muro et al., 2003) which cluster close together, sug- gesting common mechanisms of regulation of the ASEs in this locus.
To explore the effects of different classes of SFs on AS regu- lation, we first grouped them in four categories: genes coding for core spliceosome components, noncore SFs/regulators, RNA-processing factors not directly related to splicing, and chromatin-related factors. Core SFs display the greater spread and average magnitude of effects, followed by noncore SFs (Fig- ure 2B). Factors related with chromatin structure and remodeling showed a narrower range of milder effects (Figure 2B), although their values were clearly above the range of changes observed in mock knockdown samples. Of interest, knockdown of core fac- tors leads to stronger exon skipping effects, while noncore SFs and other RNA processing and chromatin-related factors show more balanced effects (Figures S3B and S3C).
To further dissect the effects of core components, we sub- divided them into four categories depending on the timing and duration of their recruitment to splicing complexes during spliceosome assembly (Wahl et al., 2009): (1) persistent compo- nents (e.g., 17S U2 snRNPs, Sm proteins) that enter the assem- bly process before or during complex A formation and remain until completion of the reaction, (2) transient early components that join the reaction prior to B complex activation but are scarce at later stages of the process, (3) transient middle components joining during B-act complex formation, and (4) transient late components that are only present during or after C complex for- mation. Our results indicate that knockdowns of factors that persist during spliceosome assembly cause AS changes of higher magnitude than those caused by knockdowns of transient factors involved in early spliceosomal complexes, which in turn are stronger and more dispersed than those caused by factors involved in middle complexes and in complex C formation or cat- alytic activation (Figure 2C). As observed for core factors in gen- eral, knockdown of persistent factors tends to favor skipping, while the effects of knockdown of more transient factors are more split between inclusion and skipping (Figures S3B and S3C).
Interestingly, knockdowns of components of the exon junction complex (EJC) display a range of effects that resembles those of transient early or mid splicing complexes (Figure 2C). These re- sults are in line with recent reports documenting effects of EJC components in AS, in addition to their standard function in post- splicing processes (Ashton-Beaucage et al., 2010).
To test whether the knockdown of core factors causes a gen- eral inhibition of splicing, we measured the levels of introns (both in constitutive and alternatively spliced regions) relative to exons
in four genes, upon the knockdown of four SFs (BRR2, SF3B1, SNRPG, and SLU7). The results shown in Figure 3 reveal a complex picture in which retention of certain introns is clearly increased and retention of other introns is very mild, and there is even one case (the two introns flanking the fibronectin EDI alternative exon) in which the low levels of intron detected under control conditions are reduced upon knockdown of core SFs (notably, this ASE typically displays a distinct pattern of AS upon knockdown of core SFs, Figure 2A). Therefore, rather than a general uniform inhibition of splicing, reduction in the levels of core SFs induces differential effects in different introns, a concept reminiscent of the differential effects observed for antitumor drugs targeting core components like SF3B1 (Bonnal et al., 2012).
Reconstruction of a Functional Network of Splicing Factors To systematically and accurately map the functional relation- ships of SFs on AS regulation, we quantified the similarity be- tween AS perturbation profiles for every pair of factors in our screen using a robust correlation estimate for the effects of the factors’ knockdowns across the ASEs. This measure captures the congruence between the shapes of perturbation profiles, while it does not take into account proportional differences in the magnitude of the fluctuations. Crucially, it discriminates be- tween biological and technical outliers and is resistant to distort- ing effects of the latter (Supplemental Experimental Procedures).
We next employed glasso, a regularization-based algorithm for graphical model selection (Friedman et al., 2008; Supple- mental Experimental Procedures), to reconstruct a network from these correlation estimates (Figures 4A and S4). Glasso seeks a parsimonious network model for the observed correla- tions. This is achieved by specifying a regularization parameter, which serves as a penalty that can be tuned in order to control the number of inferred connections (network sparsity) and there- fore the false discovery rate (FDR) of the final model. For a given regularization parameter, both the number of inferred connec- tions and the FDR are a decreasing function of the number of ASEs assayed (Figure 4B). Random sampling with different sub- sets of the real or reshuffled versions of the data indicates that network reconstruction converges to ~500 connections with a FDR <5% near 35 events (Figure 4B), implying that analysis of additional ASEs in the screening would have only small effects on network sparsity and accuracy.
The complete network of functional interactions among the screened factors is shown in Figure 4A. It is comprised of 196 nodes representing screened factors and 541 connections (average degree 5.5), of which 518 correspond to positive and 23 to negative functional associations. The topology of the re- sulting network has several key features. First, two classes of factors are easily distinguishable. The first class encompasses factors that form densely connected clusters comprised of core spliceosomal components (Figure 4A). Factors physically linked within U2 snRNP and factors functionally related with U2 snRNP activity, and with the transition from complex A to B, form a tight cluster (orange nodes in the inset of Figure 4A). An adjacent but distinct highly linked cluster corresponds mainly to factors physically associated within U5 snRNP or the U4/5/6
6
-2
-1
0
1
2
3
4
5
CN BRR2 SF3B1 SNRPG SLU7
INC SKIP Intron A Intron B GE
Fo
ld c
han
ge (
log
2)
A B
C D
5 6 7 2
GE
Intron A
Intron B
INC
SKIP
3
Constitutive
Intron
Fo
ld
ch
an
ge (
log
2)
-3
-2
-1
0
1
2
3
4
CN BRR2 SF3B1 SNRPG SLU7
INC SKIP Intron A Intron B Constitutive intron GE
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
CN BRR2 SF3B1 SNRPG SLU7
Exp
ressio
n v
alu
es
(co
mp
ared
to
HP
RT
1)
Control BRR2 SF3B1 SNRPG SLU7
PSI
SD
92.86 92.22 23.51 81.16 87.21
1.01 0.16 0.72 1.05 2.57
FAS 1 2 3
GE
Intron A
Intron B
INC
SKIP
MCL1
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
CN BRR2 SF3B1 SNRPG SLU7
Exp
ressio
n
valu
es
(co
mp
ared
to
HP
RT
1)
Control BRR2 SF3B1 SNRPG SLU7
PSI
SD
97.91 94.20 31.18 96.22 92.83
0.84 0.90 1.40 0.20 0.90
7 8 9 2
GE Intron A
Intron B
INC
6
Constitutive
intron
SKIP
CHEK2
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
CN BRR2 SF3B1 SNRPG SLU7
Exp
ressio
n
valu
es
(co
mp
ared
to
HP
RT
1)
-3
-2
-1
0
1
2
3
4
5
CN BRR2 SF3B1 SNRPG SLU7
INC SKIP Intron A Intron B Constitutive intron GE
Fo
ld
ch
an
ge (
log
2)
Control BRR2 SF3B1 SNRPG SLU7
PSI
SD
64.94 39.16 8.82 43.37 70.37
1.96 0.95 0.87 0.74 1.06
24 24A 25 3
Constitutive
intron
Intron A
Intron B
INC
SKIP
4
GE
-2.5
-2
-1.5
-1
-0.5
0
0.5
1
1.5
2
2.5
CN BRR2 SF3B1 SNRPG SLU7
INC SKIP Intron A Intron B Constitutive intron GE
Fo
ld
ch
an
ge (
log
2)
FN1 EDB
Exp
ressio
n
valu
es
(co
mp
ared
to
HP
RT
1)
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
CN BRR2 SF3B1 SNRPG SLU7
Control BRR2 SF3B1 SNRPG SLU7
PSI
SD
6.20 17.84 17.62 6.45 5.84
0.51 0.57 1.24 0.71 0.43
N=3
N=3 N=3
N=3
N=3N=3
N=3 N=3
Figure 3. Variety of Effects of Knockdown of Core SFs on Intron Retention (A–D) Real-time quantification of AS, intron retention, and gene expression for four genes included in the splicing network (FAS, A; MCL1, B; CHEK2, C; and FN1, D) upon knockdown of four core spliceosomal components (BRR2, SF3B1, SNRPG, and SLU7). In each of the panels, the graphs represent the following: top, scheme of genomic locations, AS patterns, and amplicons used; middle top, quantification of fold changes in expression of the different isoforms or introns (indicated by the color code) compared to mock siRNA conditions; middle bottom, relative levels of the different isoforms or introns (indicated by the color code) compared to HPRT1 housekeeping gene mRNAs; and bottom, AS changes measured by HTCE. PSI values are indicated. For all assays, values represent the mean, and error bars the standard deviation of three independent biological replicas.
7
tri-snRNP (blue nodes in Figure 4A, inset). Factors in these clus- ters and their mutual links encompass 23% of the network nodes but 63% of the total inferred functional associations (average degree 14.5). A second category includes factors that form a periphery of lower connectivity that occasionally projects to the core. This category includes many of the classical splicing regulators (e.g., SR proteins, hnRNPs) as well as several chromatin factors. An intermediate category—more densely in- terconnected but not reaching the density of interactions of the central modules—includes multiple members of the RNA- dependent DEAD/X box helicase family. While the number of connections (degree) is significantly higher for core spliceosomal factors in general (Figure 4C), this is especially clear for factors that represent persistent components of splicing complexes along the assembly pathway (Figures 4D and 4E). Persistent factors are particularly well connected to each other (50% of the possible functional connections are actually detected, Fig- ure 4E). Factors that belong to consecutive complexes are signif- icantly more linked to each other than factors that belong to early and late complexes, and factors that transiently assemble to- ward the final stages of spliceosome assembly display the least number of connections (Figures 4D and 4E).
Collectively, these results indicate that core SFs, particularly those that persist during assembly, display highly related func- tions in splice site recognition and fluctuations in their levels or activities have related consequences on the modulation of splice site choice. Of the observed connections, 20% can be attributed to known physical interactions and 82% of these are among core components (Figure 4A). Examples include tight links between the two subunits of U2AF, between the interacting partners PLRG1 and CDC5L or IK and SMU1. We estimate that about 50% of the functional links detected in our network could be pre- dicted by previous knowledge of the composition or function of splicing complexes.
Of relevance, a substantial number of the functional associa- tions closely recapitulate the composition and even—to some extent—the topology of known spliceosomal complexes (Fig- ure S4; see also below and Discussion). These observations suggest that differences in the sensitivity of ASEs to SF pertur- bations reflect the specific role of these factors in the splicing process. Conversely, these results warrant the use of splicing perturbation profiles as proxies for inferring physical or func- tional links between factors in the splicing process.
Generality of the Functional Interactions in the AS Regulation Network The above network was derived from variations in 35 ASEs selected because of their relevance for cell proliferation and apoptosis, but they represent a heterogeneous collection of AS types and, possibly, regulatory mechanisms. To identify func- tional connections persistent across AS types, we undertook a subsampling approach. We sampled subsets of 17 from the original 35 ASEs and iteratively reconstructed the network of functional associations from each subset (see Supplementary methods). Following 10,000 iterations, we asked how many functional connections were recovered in at least 90% of the sampled networks. The retrieved set of connections are shown in Figure 5A and they can be considered to represent a ‘‘basal’’
or indispensable splicing circuitry that captures close similarities in function between SFs regardless of the subset of splicing substrates considered. The majority of these connections corre- spond to links between core components, with modules corre- sponding to major spliceosomal complexes. For example, the majority of components of U2 snRNP constitute the most prom- inent of the core modules (Figure 5A). Strikingly, the module is topologically subdivided in two submodules, the first corre- sponding to components of the SF3a and SF3b complexes and the second corresponding to proteins in the Sm complex. The connectivity between U2 snRNP components remarkably recapitulates known topological features of U2 snRNP organiza- tion, including the assembly of SF3a/b complexes in stem loops I and II and the assembly of the Sm ring in a different region of the U2 snRNA (Behrens et al., 1993) (Figures 5A and S4). That the connectivity between components derived from as- sessing the effects on AS of depletion of these factors correlates with topological features of the organization of the snRNP further argues that the functions in splice site selection are tightly linked to the structure and function of this particle, highlighting the potential resolution of our approach for identifying meaningful structural and mechanistic links.
A second prominent ‘‘core’’ module corresponds to factors known to play roles in conformational changes previous to/ concomitant with catalysis in spliceosomes from yeast to hu- man, including PRP8, PRP31 or U5-200. Once again some of the topological features of the module are compatible with phys- ical associations between these factors (e.g., PRP8 with PRP31 or U5-116K or PRP31 with C20ORF14). That knockdown of these late-acting factors, which coordinate the final steps of the splicing process (Bottner et al., 2005; Ha cker et al., 2008; Wahl et al., 2009), causes coherent changes in splice site selec- tion strongly argues that the complex process of spliceosome assembly can be modulated at late steps, or even at the time of catalysis, to affect splice site choice. Other links in the ‘‘core’’ network recapitulate physical interactions, including the aforementioned modules involving the two subunits of the 3’
splice site-recognizing factor U2AF, the interacting partners PLRG1 and CDC5L and IK and SMU1.
To further evaluate the generality of our approach, we tested whether the functional concordance derived from our network analysis and from the iterative selection of a ‘‘core’’ network could be recapitulated in a different cell context and genome- wide. We focused our attention in the strong functional asso- ciation between IK and SMU1, two factors that have been described as associated with complex B and its transition to Bact (Bessonov et al., 2008) (Figure 5A). Analysis of the effects of knockdown of these factors in the same sets of ASEs in HEK293 cells revealed a similar set of associations, despite the fact that individual events display different splicing ratios upon IK/SMU1 knockdown in this cell line compared to HeLa (Figures 5B and 5C). To explore the validity of this functional association in a larger set of splicing events, RNA was isolated in biological triplicates from HeLa cells transfected with siRNAs against these factors, and changes in AS were assessed using genome-wide splicing-sensitive microarrays. The results revealed a striking overlap between the effects of IK and SMU1 knockdown, both on gene expression changes and on
8
A
SMNDC1
LSM3
C20ORF14
CRNKL1
XAB2
PRPF4
U2AF2
PPIH
SLU78���.'
SF3A2DDX23
SNRPD1
LSM2
SNRPD3 SNRPA1
NHP2L1
8���.'
SF3A1
PLRG1
SNRPB
C19ORF29
SF3B4
SNRPB2
PRPF31 PRPF19
AQR
SNRPGCDC5L
PRPF8
SF3B2U2AF1
SFRS9
LSM4
SF3B1
SART1
SNRPF
P14
SF3B3
CDC40.,$$����
SF3A3
LSM7
PRPF3
SNRPD2
P29 HNRPH1
ACIN1
RNPC2
EIF2C2
HNRPU
SIAHBP1
KIN
THOC3
RNPS1
BUB3
BRD4
HNRPD
)/-�����
RBM15
LENG1
EWSR1
/2&������
EP300
TXNL4
MOV10
DHX32CBX3
DDX11
SETD1A
TIA1
CARM1
CTNNBL1
U2AF2
DDX12
MGC13125
CD2BP2
THRAP3
NOSIP
DDX28
PHC1
CHD1
SNRPC
NONO
/60�
MGC23918
NAB2
HNRNPA3
MAGOH
ILF2
SFPQ
HPRP8BP
ELAVL1
MGC2803
MECP2
CHERP
SNRP70
TET1
LSM2
FUBP3PABPC1
SKIIP
SR140
BAT1
DDX31
DDX52
DNMT1
THOC1
0*&����
NHP2L1HNRPR
RBM17
SFRS1
RALY
SUV39H1
DHX8
WBP11
SNRPB
KIAA0073
PPIL1
&��25)��
THOC4
FRG1
DICER1
C22ORF19
MFAP1
SFRS10
AOF2
PPIE
SMARCA2
THOC2
SNRPB2
EZH2
HNRPA2B1
IK
BCAS2
DHX57
HSPC148
CCAR1
ZNF207
DHX38
DEK
SKIV2L2
HSPA5
C9ORF78
WBP4
FNBP3
SMARCA4
SNRPG
GTL3
JMJD2B
&36)�
HNRPF
ARS2
CPSF5
FLJ35382
ASCL1
SF3B2
MORF4L1
U2AF1
KAT5
DBR1
SFRS9
TCERG1
NCBP2
RBM25
PPIL2
SYNCRIP
SF3B1
HNRPA0
ILF3
SFRS5
SNRPA DDX41
NCBP1
DDX3X
DHX30
EHMT2
USP39
G10FAM32A
SART1
SMU1
DIS3
RBM5
PRPF18
SRRM2
'+;��
HSPA8
PPIG
.,$$����
SFRS11
PTBP1SFRS3
KAT2B
HTATSF1
DGCR14
SFRS4
WTAP
DDX10
PRPF3
,03í�
HNRPH25%0�
SRRM1
PABPN1
HNRPK
DHX9
SNRPD2
0 5 10 15 20 25 30
02
04
06
08
01
00
Degree
Cum
ulat
ive
Freq
uenc
y
All
Spliceosome
Splic. Factors
Other RNApr.
Chrom. Factors
Num
ber o
f Edg
es
Number of Assayed events
0 5 10 15 20 25 30
02
04
06
08
01
00
Degree
Cum
ulat
ive
Freq
uenc
y
Spliceosome
Persistent
Trans. Early
Trans. Mid
Trans. Late
5 10 15 20 25 30 35
10
50
100
500
5000
Random
Actual
DC
B
E
(legend on next page)
9
AS changes (61% overlap, p < 1E-20) covering a wide range of gene expression levels, fold differences in isoform ratios and AS types (Figures 5D–5F, S5B, and S5C). The overlap was much more limited with RBM6, a splicing regulator (Bechara et al., 2013) located elsewhere in the network. These results validate the strong functional similarities between IK and SMU1 detected in our analysis. The molecular basis for this link could be ex- plained by the depletion of each protein (but not their mRNA) upon knockdown of the other factor, consistent with formation of a heterodimer of the two proteins and destabilization of one partner after depletion of the other (Figure S5A). Notably, the best correlation between the effects of these factors was be- tween IK knockdown for 48 hr and SMU1 knockdown for 72 hr, perhaps reflecting a stronger functional effect of IK on AS regu- lation (Figures S2B and S2D).
Gene ontology analysis revealed a clear common enrichment of AS changes in genes involved in cell death and survival (Fig- ure S5D), suggesting that IK/SMU1 could play a role in the control of programmed cell death through AS. To evaluate this possibility, we tested the effects of knockdown of these factors, individually or combined, on cell proliferation and apoptotic as- says. The results indicated that their depletion activated cleav- age of PARP, substantially reduced cell growth, and increased the fraction of cells undergoing apoptosis (Figures S5E–S5G). We conclude that functional links revealed by the network can be used to infer common mechanisms of splicing regulation and provide insights into the biological framework of the regula- tory circuits involved.
Ancillary Functional Network Interactions Reveal Alternative Mechanisms of AS Regulation The iterative generation of networks from subsets of ASEs ex- plained above could, in addition to identifying general functional interactions, be used to capture functional links that are specific to particular classes of events characterized by distinct regula- tory mechanisms. To explore this possibility, we sought the set of functional connections that are robustly recovered in a signif- icant fraction of ASEs subsets but are absent in others (see Supplemental Experimental Procedures). Figure 6A shows the set of such identified interactions, which therefore represents specific functional links that only emerge when particular sub- sets of ASEs are considered. The inset represents graphically the results of principal component analysis (PCA) showing the variability in the strength of these interactions depending on
the presence or absence of different ASEs across the subsam- ples. The most discriminatory set of ASEs for separating these interactions is also shown (see also Figure S6). While this anal- ysis may be constrained by the limited number of events analyzed and higher FDR due to multiple testing, it can provide a valuable tool for discovering regulatory mechanisms underly- ing specific classes of ASEs as illustrated by the example below.
A synergistic link was identified between hnRNP C and U2AF1 (the 35 kDa subunit of U2AF, which recognizes the AG dinucleo- tide at 3’ splice sites) for a subset of ASEs (Figures 6A and S6A). We explored a possible relationship between the effects of knocking down these factors and the presence of consensus hnRNP C binding sites (characterized by stretches of uridine res- idues) or 3’ splice site-like motifs in the vicinity of the regulated exons. The results revealed a significant correlation between the two factors (0.795), associated with the presence of compos- ite elements comprised of putative hnRNP C binding sites within sequences conforming to a 3’ splice site motif (see Experimental Procedures) upstream and/or downstream of the regulated exons (Figures 6B and S6B). No such correlation was observed for ASEs lacking these sequence features (Figures 6C and S6C). These observations are in line with results from a recent report revealing that hnRNP C prevents exonization of Alu elements by competing the binding of U2AF to Alu sequences (Zarnack et al., 2013). Our results extend these findings by capturing func- tional relationships between hnRNP C and U2AF in ASEs that harbor binding sites for these factors in the vicinity of the regu- lated splice sites, both inside of Alu elements or independent of them (Figures 6B and 6D). Our data are compatible with a model in which uridine-rich and 3’ splice site-like sequences sequester U2AF away from the regulated splice sites, causing alternative exon skipping, while hnRNP C displaces U2AF away from these decoy sites, facilitating exon inclusion. In this model, depletion of U2AF and hnRNP C are expected to display the same effects in these ASEs, as observed in our data (Figures 6B and 6D). In contrast, in ASEs in which hnRNP C binding sites are located within the polypyrimidine tract of the 3’ splice site of the regulated exon, the two factors display antagonistic effects (Figures 6C and 6D), as expected from direct competition be- tween them at a functional splice site. This example illustrates the potential of the network approach to identify molecular mechanisms of regulation on the basis of specific sequence fea- tures and the interplay between cognate factors.
Figure 4. Functional Splicing Regulatory Network (A) Graphical representation of the reconstructed splicing network. Nodes (circles) correspond to individual factors and edges (lines) to inferred functional as- sociations. Positive or negative functional correlations are represented by green or red edges, respectively. Edge thickness signifies the strength of the functional interaction, while node size is proportional to the overall impact (median Z score) of a given knockdown in the regulation of AS. Node coloring depicts the network’s natural separation in coherent modules (see Supplemental Experimental Procedures). Known physical interactions as reported in the STRING database (Franceschini et al., 2013) are represented in black dotted lines between factors. The inset is an expanded view of factors in the U2 and U4/5/6 snRNP complexes. (B) Number of recovered network edges using actual or randomized data sets (red and green points respectively) as a function of the number of ASEs used. Lines represent the best fit curves for the points. (C and D) Network degree distribution. The plots represent the cumulative frequency of the number of connections (degree) for different classes of factors (as in Figures 2B and 2C). (E) Functional crosstalk between different categories of spliceosome components. Values represent the fraction of observed functional connections out of the total possible connections among factors that belong to the different categories: persistent, P; transient early, E; transient mid, M; or transient late, L factors. See also Figure S4E and Table S5.
10
SMNDC1
LSM3
C20ORF14
CRNKL1
XAB2
TXNL4
U2AF2PPIH
SLU7
8���.'
SF3A2
SNRPC
DDX23 SNRPD1
CHERP
SNRP70
SR140 SNRPD3
SNRPA1
RBM178���.'
SF3A1
PLRG1
SNRPB
C19ORF29
AOF2
SF3B4
SNRPB2
IK
PRPF31
PRPF19
AQR
CDC5L
PRPF8
SF3B2
U2AF1
NCBP2
LSM4
SF3B1
SART1
SNRPF
SMU1
P14
SF3B3
CDC40
.,$$����
SF3A3
LSM7
PRPF3
SNRPD2
Alternative splicing changes
IK SMU1
RBM6
462 415858
IK SMU1
RBM6
107
547
3631
Gene expression changes
p = 0.03
p = 1.07E -23p = 0.005
p = 3.36E-07
p = 0.73p = 4.60E-11
p = 6.87E-10
p = 3.38E-07
p = 0.14
p = 4.62E-31p = 0.08
p = 4.23E-05
p = 0.32
p = 6.46E-12
p = 9.32E-08
p = 0.1
IK KD AS distribution SMU1 KD AS distribution
A
D
E
338195 226
16
228
33
í� í� í� í� 0 1
í�í�
í�0
1
=íVFRUH��,.
=íVFRUH��608�
FAS
OLR1
PAX6
CHEK2
FN1EDB
FN1EDA
NUMB
BCL2L1
RAC1
MSTR1
MADD
CASP2
APAF1MINK1
MAP3K7MAP4K3
NOTCH3
MAP4K2
PKM2
MCL1
CFLARCASP9
DIABLO
BMFCCNE1
H2AFY
STAT3
GADD45ABIM_L_EL
BIM_ELBIRC5_2B
BIRC5_D3
SMN1SMN2
PHF19RobCor = 0.76
í� í� í� 0 1
í�í�
í�0
1
=íVFRUH��,.B+(.
=íVFRUH��608�B+(.
FAS
OLR1 PAX6 Chek2FN1EDB
FN1EDA
NUMBBCL2L1
RAC1
MSTR1
MADD
CASP2 APAF1
MINK1
MAP3K7
MAP4K3
NOTCH3MAP4K2
PKM2
MCL1
CFLAR
CASP9
DIABLO
BMFCCNE1
H2AFY
STAT3GADD45A
BIM_L_ELBIM_EL
BIRC5_2B
BIRC5_D3
SMN1SMN2
PHF19
RobCor = 0.77
B
C
AltHUnativH�EvHnt�TypH NumbHU�Rf�EvHntVAlter. First Exon 42926
Alter. Terminal Exon 27784Exon Cassette 25912
Mutually Exclusive Exons 3936Alter. Acceptor Splice Site 6443
Alter. Donor Splice Site 4986Intron Retention 9475
Complex 34841Internal Exon Deletion 3389
TRtal 159692
F AS event distribution in GW array platform
Figure 5. Core Network Involved in AS Regulation (A) Graphical representation of the functional connections present in at least 90% of 10,000 networks generated by iterative selection of subsets of 17 out of the 35 ASEs used to generate the complete network. Network attributes represented as in Figure 4A. Known spliceosome complexes are highlighted by shadowed areas (U1 snRNP, blue; U2 snRNP, yellow; SM proteins, green; and tri-snRNP proteins, red). (B and C) Consistency of inferred functional interactions in different cell lines. Knockdown of IK or SMU1 was carried out in parallel in HeLa (B) or HEK293 cells (C), and changes for the 35 ASEs were analyzed by RT-PCR and HTCE. Robust correlation estimates and regression for the AS changes observed in IK versus SMU1 knockdowns are shown for each of the cell lines. (D) Venn diagrams of the overlap between the number of gene expression changes upon IK, SMU1, or RBM6 knockdowns.
(legend continued on next page)
11
Figure S5.
Mapping Drug Targets within the Spliceosome Another application of our network analysis is to identify possible targets within the splicing machinery of physiological or external (e.g., pharmacological) perturbations that produce changes in AS that can be measured using the same experimental setting. The similarity between profiles of AS changes induced by such perturbations and the knockdown of particular factors can help to uncover SFs that mediate their effects. To evaluate the poten- tial of this approach, cells were treated with drugs known to affect AS decisions and the patterns of changes in the 35 ASEs induced by these treatments were assessed by the same robotized procedure for RNA isolation and RT-PCR analysis used to locate their position within the network.
The results indicate that structurally similar drugs like Spli- ceostatin A (SSA) and Meayamycin cause AS changes that closely resemble the effects of knocking down components of U2 snRNP (Figure 7A), including close links between these drugs and SF3B1, a known physical target of SSA (Kaida et al., 2007; Hasegawa et al., 2011) previously implicated in mediating the ef- fects of the drug through alterations in the AS of cell cycle genes (Corrionero et al., 2011). Of interest, changes in AS of MCL1 (leading to the production of the proapoptotic mRNA isoform) appear as prominent effects of both drugs (Figure 7B), confirm- ing and extending recent observations obtained using Meaya- mycin (Gao et al., 2013) and suggesting that this ASE can play a key general role in mediating the antiproliferative effects of drugs that target core splicing components like SF3B1 (Bonnal et al., 2012). A strong link is also captured between each of the drugs and PPIH (Figure 7A), a peptidyl-prolyl cis-trans isomerase which has been found to be associated with U4/5/6 tri-snRNP (Horowitz et al., 1997; Teigelkamp et al., 1998) but had not been implicated directly in 3’ splice site regulation. Indeed, the network indicates a close relationship between PPIH and multi- ple U2 snRNP components, particularly in the SF3a and SF3b complexes, further arguing that PPIH plays a role in 3’ splice site definition closely linked to branchpoint recognition by U2 snRNP. Treatment of the cells with the Clk (Cdc2-like) kinase family inhibitor TG003, which is also known to modulate AS (Mur- aki et al., 2004) and causes changes in ASEs analyzed in our network (Figure 7B), led to links with a completely different set of SFs (Figure 7A), attesting to the specificity of the results. These data confirm the potential of the network analysis to identify bona fide SF targets of physiological or pharmacological pertur- bations and further our understanding of the the underlying mo- lecular mechanisms of regulation.
DISCUSSION
The combined experimental and computational approach described in this manuscript provides a rich source of informa- tion and a powerful toolset for studying the splicing process and its regulation. Similar approaches could be useful to system- atically capture information about other aspects of transcrip-
tional or posttranscriptional gene regulation. The data provide comprehensive information about SFs (and some additional chromatin factors) relevant to understand the regulation of 35 ASEs important for cell proliferation and apoptosis. Second, the network analysis reveals functional links between SFs, which in some cases are based upon physical interactions. The recon- struction of the known topology of some complexes (e.g., U2 snRNP) by the network suggests that the approach captures important aspects of the operation of these particles and therefore has the potential to provide novel insights into the composition and intricate workings of multiple Spliceosomal subcomplexes. Third, the network can serve as a resource for exploring mechanisms of AS regulation induced by perturba- tions of the system, including genetic manipulations, internal or external stimuli, or exposure to drugs. As an example, the network was used in the accompanying manuscript by Tejedor et al. (2014) to identify a link between AS changes induced by variations in intracellular iron and the function of SRSF7, which can be explained in terms of iron-induced changes in the RNA binding and splicing regulatory activity of this zinc-knuckle-con- taining SR protein. Other examples include (1) the identification of the extensive regulatory overlap between the interacting pro- teins IK/RED and SMU1, relevant for the control of programmed cell death; (2) evidence for a general role of uridine-rich hnRNP C binding sequences (some associated with transposable ele- ments; Zarnack et al., 2013) in the regulation of nearby alterna- tive exons via antagonism with U2AF; and (3) delineation of U2 snRNP components and other factors, including PPIH, that are functionally linked to the effects of antitumor drugs on AS.
Our data reveal changes of AS regulation mediated either by core splicing components or by classical regulatory factors like proteins of the SR and hnRNP families. While the central core of functional links is likely to operate globally, alternative config- urations (particularly in the periphery of the network) are likely to emerge if similar approaches are applied to other cell types, genes, or biological contexts. We report considerable versatility in the effects of core spliceosome factors on alternative spicing. Precedents exist for a role of core SFs on the regulation of splicing efficiency or splice site selection in yeast, Drosophila, and mammalian cells (Clark et al., 2002; Park et al., 2004; Pleiss et al., 2007; Saltzman et al., 2011). Despite their assumed gen- eral roles in the splicing process, depletion or mutation of particular core factors led to differential splicing effects in these studies, and at least in some cases the differential effects could be attributed to particular features of the regulated pre-mRNAs. For example, Clark et al. (2002) found that the core components Prp17p and Prp18p are dispensable for splicing of yeast introns with short branchpoint to 3’ splice site distances. Pleiss et al. (2007) found that introns of yeast ribosome protein genes are particularly sensitive to mutation of various core factors, including two DEAD/X box family of RNA helicases (PRP2 and PRP5) involved in spliceosome conformational transitions and PRP8, a highly conserved protein involved in catalytic activation
(E) Venn diagrams of the overlap between the number of AS changes upon knockdown of IK, SMU1, or RBM6. (F) Distribution of ASEs in the Affymetrix array platform (upper panel) and distribution of the AS changes observed upon IK (lower left) or SMU1 (lower right) knockdown. p values correspond to the difference between the distribution of AS changes upon knockdown and the distribution of ASEs in the array. See also
12
A
SFRS11
SFRS1
SFRS2
SFRS10
SRRM2
SKIIP
U2AF1
SMNDC1
NCBP2
SNRP70
SNRPA
SF3B1
SF3B2
SFPQ
SF3A1
SF3A3
SF3B4
SNRPA1
SNRPB2
NONO
RBM35A
C20ORF14
CD2BP2
PRPF4
PRPF31
PPIH
CCDC55
NHP2L1
SART1
USP39
RY1
SNRPB
SNRPD1
SNRPF
DBR1
SNRPG
LSM2
LSM6
DHX15
DIS3
EXOSC4
IK
PPM1G
BCAS2
ACIN1
SHARP
RNPC2RBM15DHX9
SKIV2L2
DHX30
DDX10
DDX52
DEK
DDX31
DDX11XAB2
MGC2655
CRK7
CPSF5
ZNF207
THOC2
C22ORF19
MGC13125
WTAP
HSPC148
DGCR14
GTL3
HNRPA0
HNRPA1
HNRPC
HNRPD
HNRPL
ILF3ILF2
BUB3
ELAVL1
TAF15
YBX1
HNRPH2
KHDRBS1
DICER1
TIAL1 FNBP3RBM25
MFAP1
SMU1
CTNNBL1
KIN
P29 NOSIP
LENG1
MORG1
MGC20398
MGC23918
FRG1D6S2654EKIAA0073
PPIE
SDCCAG10
PPIG
SUV39H1
SETD1A
SETD2
NAB2
EED
AOF2
JMJD2BEZH2
SMARCA4ASH2LEP300
PRMT5
BRD4
CTCF
MORF4L1HDAC4
í� í� 0 � �
í�í�
0�
PC1
3&�
OLR1
CHEK2BCL2L1 MINK1
MAP3K7PKM2
MCL1
CFLAR
STAT3
PHF19
B
C
DIABLO AluSc
MCL1
&+(.��
+�$)Y 1kbp
AluJr
RAC1 AluSx3
AP$)��AluFLAMC AluSp
NUMB AluSx AluSc8
CCNE1
ALU
VL8�$)��
siHNRPC
Cass. Exon
500 bp
í� í� í� 0 1 �
í�í�
01
=íVFRUH��+153&
=íVFRUH��8�$
)�
%0)
AP$)�
&)/$5RAC1
CCNE1
DIABLO
+�$)<
NUMB
STA7�
)AS
0$3�.�
PAX6
CASP9SMN1
601�
%,5&�B'�
%,5&�B�%
&RU� ������
�� �� 0.0 0.5 1.0
í�í�
01
=íVFRUH��+153&
=íVFRUH��8�$
)�
OLR1
&+(.�
)1�('%
)1�(DA
%&/�/�
MSTR1
MADD
&$63�
MINK1
0$3�.�
NO7&+�
0$3�.�
3.0�
MCL1
*$''��$BIM_L_EL
BIM_EL
3+)��
&RU� �����
D
Figure 6. Variable Functional Interactions Extracted from Networks Generated from Subsets of ASEs (A) Network of ancillary functional connections characteristic of subsets of ASEs. Lines link SFs that display functional similarities in their effects on subsets of ASEs. Line colors indicate the positions of the interactions in the PCA shown in the inset, which captures the different relative contributions of the ASEs in defining these connections (see also Figure S6 and Experimental Procedures). For clarity, only the top ten discriminatory ASEs are shown. (B and C) Correlation values and regression for the effects of U2AF1 versus HNRPC knockdowns for ASEs in the presence (B) or absence (C) of upstream composite HNRPC binding/3’ splice site-like elements.
(legend continued on next page)
Figure S6.
13
(Collins and Guthrie, 1999; Mozaffari-Jovin et al., 2012). Distinct effects of the mutations on different ribosomal genes were also observed, arguing again for specific effects on splicing of certain genes. DEAD/H box proteins as well as components of U1, U2, and U4/6 snRNP were found to modulate particular ASEs in an RNAi screen of RNA binding proteins in Drosophila cells (Park et al., 2004). In contrast with the standard function of classical splicing regulators acting through cognate binding sites specific to certain target RNAs, core factors could both carry out general, essential functions for intron removal and in addition display reg- ulatory potential if their levels become limiting for the function of complexes, leading to differences in the efficiency or kinetics of assembly on alternative splice sites. This could be the case for the knockdown of SmB/B0, a component of the Sm complex pre- sent in most spliceosomal snRNPs, which leads to autoregula- tion of its own pre-mRNA as well as to effects on hundreds of other ASEs, particularly in genes encoding RNA processing fac- tors (Saltzman et al., 2011).
The results of our genome-wide siRNA screen for regulators of Fas AS (Tejedor et al., 2014) provide unbiased evidence that an extensive number of core SFs have the potential to contribute to the regulation of splice site selection. This is, in fact, the most populated category of screen hits. Similarly, the results of our network highlight coherent effects on alternative splice site choice of multiple core components, revealing also that the extent of their effects on alternative splice site selection can be largely attributed to the duration and order of their recruitment in the splicing reaction.
Remarkably, effects on splice site selection are associated with depletion not only of complexes involved in early splice site recognition but also of factors involved in complex B forma- tion or even in catalytic activation of fully assembled spliceo- somes. While particular examples of AS regulation at the time of the transition from complex A to B and even at later steps had been reported (House and Lynch, 2006; Bonnal et al., 2008; Lallena et al., 2002), our results indicate that the implication of late factors is quite general. Such links, e.g., those involving PRP8 and other interacting factors closely positioned near the re- acting chemical groups at the time of catalysis, like PRP31 or U5 200KD, are actually part of the persistent functional interactions that emerge from the analysis of any subsample of ASEs analyzed, highlighting their general regulatory potential in splice site selection. How can factors involved in late steps of the splicing process influence splice site choices? It is conceivable that limiting amounts of late spliceosome components would favor splicing of alternative splice site pairs harboring early com- plexes that can more efficiently recruit late factors, or influence the kinetics of conformational changes required for catalysis. In this context, the regulatory plasticity revealed by our network analysis may be contributed, at least in part, by the emerging real- ization that a substantial number of SFs contain disordered re- gions when analyzed in isolation (Korneta and Bujnicki, 2012; Chen and Moore, 2014). Such regions may be flexible to adopt
different conformations in the presence of other spliceosomal components, allowing alternative routes for spliceosome assem- bly on different introns, with some pathways being more sensitive than others to the depletion of a general factor.
Kinetic effects on the assembly and/or engagement of splice sites to undergo catalysis would be particularly effective if spli- ceosome assembly is not an irreversible process. Results of sin- gle molecule analysis are indeed compatible with this concept (Tseng and Cheng, 2008; Abelson et al., 2010; Hoskins et al., 2011; Hoskins and Moore, 2012; Shcherbakova et al., 2013), thus opening the extraordinary complexity of conformational transitions and dynamic compositional changes of the spliceo- some as possible targets for regulation. Interestingly, while depletion of early factors tends to favor exon skipping, depletion of late factors causes a similar number of exon inclusion and skipping effects, suggesting high plasticity in splice site choice at this stage of the process.
Given this potential, it is conceivable that modulation of the relative concentration of core components can function as a physiological mechanism for splicing regulation, for example during development and cell differentiation. Indeed, variations in the relative levels of core spliceosomal components have been reported (Wong et al., 2013). Furthermore, recent reports revealed the high incidence of mutations in core SFs in cancer. For example, the gene encoding SF3B1, a component of U2 snRNP involved in branchpoint recognition, has been described as among the most highly mutated in myelodysplastic syn- dromes (Yoshida et al., 2011), chronic lymphocytic leukemia (Quesada et al., 2012), and other cancers (reviewed in Bonnal et al., 2012), correlating with different disease outcomes. Remarkably, SF3B1 is the physical target of antitumor drugs like Spliceostatin A and—likely—Meayamycin, as captured by our study. These observations are again consistent with the idea that depletion, mutation, or drug-mediated inactivation of core SFs may not simply cause a collapse of the splicing pro- cess—at least under conditions of limited physical or functional depletion—but rather lead to changes in splice site selection that can contribute to physiological, pathological, or therapeutic out- comes. SF3B1 participates in the stabilization and proofreading of U2 snRNP binding to the branchpoint region (Corrionero et al., 2011), and it is therefore possible that variations in the activity of this protein result in switches between alternative 3’ splice sites harboring branch sites with different strengths and/or flanking decoy binding sites. Similar concepts may apply to explain the effects of mutations in other SFs linked to disease, including, for example, mutations in PRP8 leading to retinitis pigmentosa (Pena et al., 2007).
In summary, the data and methodological approaches pre- sented in this study provide a rich resource for understanding the function of the spliceosome and the mechanisms of AS regu- lation, including the identification of targets of physiological, pathological, or pharmacological perturbations within the com- plex splicing machinery.
(D) Schematic representation of the distribution of composite HNRPC binding sites/3’ splice site-like sequences in representative examples of the splicing events analyzed. The relative position of Alu elements in these regions is also shown. Genomic distances are drawn to scale. The di- rection of the arrows indicates up or downregulation of cassette exons upon knockdown of U2AF1 (black arrows) or HNRPC (white arrows). See also
14
A
PPIH
SNRPD3
SNRPA1
SF3A1
SF3B4
SNRPB2
SF3B2SF3B1
SART1
SNRPF
SF3B3
SF3A3
Spliceostatin
Meayamycin
DDX48
TG003HN
OOOR
OHO
O
O
O
O
RSpliceostatin A OCH 3
Meayamycin
CH3
FAS
OLR
1
PAX
6
CH
EK
2
FN1E
DB
FN1E
DA
NU
MB
BC
L2L1
RA
C1
MS
TR1
MA
DD
CA
SP
2
APA
F1
MIN
K1
MA
P3K
7
MA
P4K
3
NO
TCH
3
MA
P4K
2
PK
M2
MC
L1
CFL
AR
CA
SP
9
DIA
BLO
BM
F
CC
NE
1
H2A
FY
STA
T3
GA
DD
45A
BIM
_L_E
L
BIM
_EL
BIR
C5_
2B
BIR
C5_
D3
SM
N1
SM
N2
PH
F19
í�í�
í�í�
01
2
6FDOHG�=í
VFRUH
SpliceostatinMeayamycinTG003
B
N
SO
O
TG003
Figure 7. Mapping the Effects of Pharmaco- logical Treatments to the Splicing Network (A) Functional connections of splicing inhibitory drugs with the splicing machinery. See text for details. (B) Splicing perturbation profiles for Spliceostatin A, Meayamycin, or TG003 treatments across the 35 ASEs analyzed
(
15
EXPERIMENTAL PROCEDURES siRNA Library Transfection and mRNA Extraction HeLa cells were transfected in biological triplicates with siRNAs pools (ON TARGET plus smartpool, Dharmacon, Thermo Scientific) against 270 splicing and chromatin remodeling factors (see Table S2). Endogenous mRNAs were purified 72 hr posttransfection by using oligo dT-coated 96-well plates (mRNA catcher PLUS, Life Technologies) following the man- ufacturer’s instructions. RNA samples corresponding to treatments with splicing arresting drugs were isolated with the Maxwell 16 LEV simplyRNA kit (Promega). RT-PCR and High-Throughput Capillary Electrophoresis Cellular mRNAs were reverse transcribed using Superscript III Retro Tran- scriptase (Invitrogen, Life Technologies) following the manufacturer’s recom- mendations. PCR reactions for every individual splicing event analyzed were carried out using forward and reverse primers in exonic sequences flanking the alternatively spliced region of interest and further reagents provided in the GOTaq DNA polymerase kit (GoTaq, Promega). Primers used in this study are listed in Table S6.
HTCE measurements for the different splicing isoforms were performed in 96-well format in a Labchip GX Caliper workstation (Caliper, Perkin Elmer) us- ing a HT DNA High Sensitivity LabChip chip (Perkin Elmer). Data values were obtained using the Labchip GX software analysis tool (version 3.0). Quantification of AS Changes from HTCE Measurements Robust estimates of isoform ratios upon siRNA or pharmacological treatments were obtained using the median PSI indexes of the biological triplicates for each knockdown-AS pair. From this value xi the effect of each treatment was summarized as a robust Z score (Birmingham et al., 2009): Robust Sample Correlation Estimation Robust correlation estimation is based on an iterative weighting algorithm that discriminates between technical outliers and reliable measure- ments with high leverage. The weighting for mea- surements relies on calculation of a reliability index that takes into account their cumulative influence on correlation estimates of the complete data set. Algorithmic details are provided in Supplemental Experimental Procedures. Network Reconstruction The process of network reconstruction relies on the regularization-based graphical lasso (glasso) algorithm for graphical model selection (Friedman et al., 2008). The glasso process for network reconstruction was implemented using the glasso R package by Jerome Friedman, Trevor Hastie, and Rob Tibshirani (http://statweb.stanford.edu/~tibs/glasso/). A detailed exposition of the algorithmic steps for network reconstruction and analysis, glasso, and its use for graphical model selection is provided in Supplemental Experimental Procedures. Principal Component Analysis of Ancillary Connections PCA for ancillary edges was performed using R-mode PCA (R function prin- comp). The input data set is the scaled 95 3 35 matrix containing the absolute average correlation for each of the 95 edges over the subsets of the 10,000 samples that included each of the 35 events. The coordinates for projecting the 95 edges are directly derived from their scores on the first two principal components. The arrows representing the top ten discriminatory events (based on the vector norm of the first two PC loadings) were drawn from the origin to the coordinate defined by the first two PC loadings scaled by a con- stant factor for display purposes.
Splice Site Scoring, Identification of Probable HNRPC Binding Sites and Alu Elements For the identification and scoring of 3’ ss we used custom-built position weight matrices (PWMs) of length 21—8 intronic plus three exonic positions. The matrices were built using a set of human splice sites from constitutive, internal exons compiled from the hg19 UCSC annotation database (Karolchik et al., 2014). Background nucleotide frequencies were estimated from a set of strictly intronic regions. Threshold was set based
on a FDR of 0.5/kbp of random in- tronic sequences generated using a second-order Markov chain of actual in- tronic regions.
We considered as probable HNRPC binding sites consecutive stretches of U{4} as reported in Zarnack et al. (2013). Annotation of ALU elements was based on the RepeatMasker track developed by Arian Smit (http://www. repeatmasker.org) and the Repbase library (Jurka et al., 2005).
More detailed experimental procedures are provided in Supplemental Experimental Procedures.
AUTHOR CONTRIBUTIONS
P.P. set up the computational pipeline to generate the splicing network. J.R.T. set up and carried out the experimental analyses. L.V. contributed to the network analyses involving drugs. P.P., J.R.T., and J.V. designed the project and wrote the manuscript.
ACKNOWLEDGMENTS
We thank many CRG colleagues, EURASNET members, and Quaid Morris for their advice, encouragement, and critical reading of the manuscript, and Drs. Koide and Yoshida for reagents. We acknowledge the excellent technical sup- port of the CRG Robotics and Genomics facilities. J.R.T. was supported by a PhD fellowship from Fondo de Investigaciones Sanitarias. Work in our lab is supported by Fundacio n Botın, Consolider RNAREG, Ministerio de Economıa e Innovacio n, and AGAUR.
16
REFERENCES Abelson, J., Blanco, M., Ditzler, M.A., Fuller, F., Aravamudhan, P., Wood, M., Villa, T., Ryan, D.E., Pleiss, J.A., Maeder, C., et al. (2010). Conformational dy- namics of single pre-mRNA molecules during in vitro splicing. Nat. Struct. Mol. Biol. 17, 504–512. Ashton-Beaucage, D., Udell, C.M., Lavoie, H., Baril, C., Lefranc ois, M., Chagnon, P., Gendron, P., Caron-Lizotte, O., Bonneil, E., Thibault, P., and Therrien, M. (2010). The exon junction complex controls the splicing of MAPK and other long intron-containing transcripts in Drosophila. Cell 143, 251–262.
Barash, Y., Calarco, J.A., Gao, W., Pan, Q., Wang, X., Shai, O., Blencowe, B.J., and Frey, B.J. (2010). Deciphering the splicing code. Nature 465, 53–59. Bechara, E.G., Sebestye n, E., Bernardis, I., Eyras, E., and Valca rcel, J. (2013). RBM5, 6, and 10 differentially regulate NUMB alternative splicing to control cancer cell proliferation. Mol. Cell 52, 720–733. Behrens, S.E., Tyc, K., Kastner, B., Reichelt, J., and Lu hrmann, R. (1993). Small nuclear ribonucleoprotein (RNP) U2 contains numerous additional pro- teins and has a bipartite RNP structure under splicing conditions. Mol. Cell. Biol. 13, 3’7–319.
Bessonov, S., Anokhina, M., Will, C.L., Urlaub, H., and Lu hrmann, R. (2008). Isolation of an active step I spliceosome and composition of its RNP core. Nature 452, 846–850.
Birmingham, A., Selfors, L.M., Forster, T., Wrobel, D., Kennedy, C.J., Shanks, E., Santoyo-Lopez, J., Dunican, D.J., Long, A., Kelleher, D., et al. (2009). Statistical methods for analysis of high-throughput RNA interference screens. Nat. Methods 6, 569–575. Bonnal, S., Martınez, C., Fo rch, P., Bachi, A., Wilm, M., and Valca rcel, J. (2008). RBM5/Luca-15/H37 regulates Fas alternative splice site pairing after exon definition. Mol. Cell 32, 81–95.
Bonnal, S., Vigevani, L., and Valca rcel, J. (2012). The spliceosome as a target of novel antitumour drugs. Nat. Rev. Drug Discov. 11, 847–859.
Bottner, C.A., Schmidt, H., Vogel, S., Michele, M., and Ka ufer, N.F. (2005). Multiple genetic and biochemical interactions of Brr2, Prp8, Prp31, Prp1 and Prp4 kinase suggest a function in the control of the activation of spliceosomes in Schizosaccharomyces pombe. Curr. Genet. 48, 151–161.
Chatr-Aryamontri, A., Breitkreutz, B.J., Heinicke, S., Boucher, L., Winter, A., Stark, C., Nixon, J., Ramage, L., Kolas, N., O’Donnell, L., et al. (2013). The BioGRID interaction database: 2013 update. Nucleic Acids Res. 41 (Database issue), D816–D823.
Chen, W., and Moore, M.J. (2014). The spliceosome: disorder and dynamics defined. Curr. Opin. Struct. Biol. 24, 141–149.
Clark, T.A., Sugnet, C.W., and Ares, M., Jr. (2002). Genomewide analysis of mRNA processing in yeast using splicing-specific microarrays. Science 296, 907–910.
Collins, C.A., and Guthrie, C. (1999). Allele-specific genetic interactions be- tween Prp8 and RNA active site residues suggest a function for Prp8 at the cat- alytic core of the spliceosome. Genes Dev. 13, 1970–1982.
Cooper, T.A., Wan, L., and Dreyfuss, G. (2009). RNA and disease. Cell 136, 777–793.
Corrionero, A., Min ana, B., and Valca rcel, J. (2011). Reduced fidelity of branch point recognition and alternative splicing induced by the anti-tumor drug spli- ceostatin A. Genes Dev. 25, 445–459.
Fica, S.M., Tuttle, N., Novak, T., Li, N.S., Lu, J., Koodathingal, P., Dai, Q., Staley, J.P., and Piccirilli, J.A. (2013). RNA catalyses nuclear pre-mRNA splicing. Nature 503, 229–234.
Franceschini, A., Szklarczyk, D., Frankild, S., Kuhn, M., Simonovic, M., Roth, A., Lin, J., Minguez, P., Bork, P., von Mering, C., and Jensen, L.J. (2013). STRING v9.1: protein-protein interaction networks, with increased coverage and integration. Nucleic Acids Res. 41 (Database issue), D808–D815.
Friedman, J., Hastie, T., and Tibshirani, R. (2008). Sparse inverse covariance estimation with the graphical lasso. Biostatistics 9, 432–441.
Fu, X.D., and Ares, M., Jr. (2014). Context-dependent control of alternative splicing by RNA-binding proteins. Nat. Rev. Genet. 15, 689–701.
Gao, Y., Vogt, A., Forsyth, C.J., and Koide, K. (2013). Comparison of splicing factor 3b inhibitors in human cells. ChemBioChem 14, 49–52.
Ha cker, I., Sander, B., Golas, M.M., Wolf, E., Karago z, E., Kastner, B., Stark, H., Fabrizio, P., and Lu hrmann, R. (2008). Localization of Prp8, Brr2, Snu114 and U4/U6 proteins in the yeast tri-snRNP by electron microscopy. Nat. Struct. Mol. Biol. 15, 1206–1212.
Hasegawa, M., Miura, T., Kuzuya, K., Inoue, A., Won Ki, S., Horinouchi, S., Yoshida, T., Kunoh, T., Koseki, K., Mino, K., et al. (2011). Identification of SAP155 as the target of GEX1A (Herboxidiene), an antitumor natural product. ACS Chem. Biol. 6, 229–233.
Horowitz, D.S., Kobayashi, R., and Krainer, A.R. (1997). A new cyclophilin and the human homologues of yeast Prp3 and Prp4 form a complex associated with U4/U6 snRNPs. RNA 3, 1374–1387.
Hoskins, A.A., and Moore, M.J. (2012). The spliceosome: a flexible, reversible macromolecular machine. Trends Biochem. Sci. 37, 179–188.
Hoskins, A.A., Friedman, L.J., Gallagher, S.S., Crawford, D.J., Anderson, E.G., Wombacher, R., Ramirez, N., Cornish, V.W., Gelles, J., and Moore, M.J. (2011). Ordered and dynamic assembly of single spliceosomes. Science 331, 1289–1295.
House, A.E., and Lynch, K.W. (2006). An exonic splicing silencer represses spliceosome assembly after ATP-dependent exon recognition. Nat. Struct. Mol. Biol. 13, 937–944.
Jurka, J., Kapitonov, V.V., Pavlicek, A., Klonowski, P., Kohany, O., and Walichiewicz, J. (2005). Repbase Update, a database of eukaryotic repetitive elements. Cytogenet. Genome Res. 110, 462–467.
Kaida, D., Motoyoshi, H., Tashiro, E., Nojima, T., Hagiwara, M., Ishigami, K., Watanabe, H., Kitahara, T., Yoshida, T., Nakajima, H., et al. (2007). Spliceostatin A targets SF3b and inhibits both splicing and nuclear retention of pre-mRNA. Nat. Chem. Biol. 3, 576–583.
Karolchik, D., Barber, G.P., Casper, J., Clawson, H., Cline, M.S., Diekhans, M., Dreszer, T.R., Fujita, P.A., Guruvadoo, L., Haeussler, M., et al. (2014). The UCSC Genome Browser database: 2014 update. Nucleic Acids Res. 42 (Database issue), D764–D770. Katz, Y., Wang, E.T., Airoldi, E.M., and Burge, C.B. (2010). Analysis and design of RNA sequencing experiments for identifying isoform regulation. Nat. Methods 7, 1009–1015. Kim, D., Kim, M.S., and Cho, K.H. (2012). The core regulation module of stress- responsive regulatory networks in yeast. Nucleic Acids Res. 40, 8793–8802. Korneta, I., and Bujnicki, J.M. (2012). Intrinsic disorder in the human spliceoso- mal proteome. PLoS Comput. Biol. 8, e1002641. Lallena, M.J., Chalmers, K.J., Llamazares, S., Lamond, A.I., and Valca rcel, J. (2002). Splicing regulation at the second catalytic step by Sex-lethal involves 3’ splice site recognition by SPF45. Cell 109, 285–296. Luco, R.F., Allo, M., Schor, I.E., Kornblihtt, A.R., and Misteli, T. (2011). Epigenetics in alternative pre-mRNA splicing. Cell 144, 16–26. Makarova, O.V., Makarov, E.M., Urlaub, H., Will, C.L., Gentzel, M., Wilm, M., and Lu hrmann, R. (2004). A subset of human 35S U5 proteins, including Prp19, function prior to catalytic step 1 of splicing. EMBO J. 23, 2381–2391. Mozaffari-Jovin, S., Santos, K.F., Hsiao, H.H., Will, C.L., Urlaub, H., Wahl, M.C., and Lu hrmann, R. (2012). The Prp8 RNase H-like domain inhibits Brr2- mediated U4/U6 snRNA unwinding by blocking Brr2 loading onto the U4 snRNA. Genes Dev. 26, 2422–2434. Muraki, M., Ohkawara, B., Hosoya, T., Onogi, H., Koizumi, J., Koizumi, T., Sumi, K., Yomoda, J., Murray, M.V., Kimura, H., et al. (2004). Manipulation of alternative splicing by a newly developed inhibitor of Clks. J. Biol. Chem. 279, 24246–24254. Muro, A.F., Chauhan, A.K., Gajovic, S., Iaconcig, A., Porro, F., Stanta, G., and Baralle, F.E. (2003). Regulated splicing of the fibronectin EDA exon is essential for proper skin wound healing and normal lifespan. J. Cell Biol. 162, 149–160. Nilsen, T.W., and Graveley, B.R. (2010). Expansion of the eukaryotic proteome by alternative splicing. Nature 463, 457–463. Park, J.W., Parisky, K., Celotto, A.M., Reenan, R.A., and Graveley, B.R. (2004). Identification of alternative splicing regulators by RNA interference in Drosophila. Proc. Natl. Acad. Sci. USA 101, 15974–15979.
15
Pena, V., Liu, S., Bujnicki, J.M., Lu hrmann, R., and Wahl, M.C. (2007). Structure of a multipartite protein-protein interaction domain in splicing factor prp8 and its link to retinitis pigmentosa. Mol. Cell 25, 615–624. Pleiss, J.A., Whitworth, G.B., Bergkessel, M., and Guthrie, C. (2007). Transcript specificity in yeast pre-mRNA splicing revealed by mutations in core spliceosomal components. PLoS Biol. 5, e90. Quesada, V., Conde, L., Villamor, N., Ordo n ez, G.R., Jares, P., Bassaganyas, L., Ramsay, A.J., Bea , S., Pinyol, M., Martınez-Trillos, A., et al. (2012). Exome sequencing identifies recurrent mutations of the splicing factor SF3B1 gene in chronic lymphocytic leukemia. Nat. Genet. 44, 47–52.
Saltzman, A.L., Pan, Q., and Blencowe, B.J. (2011). Regulation of alternative splicing by the core spliceosomal machinery. Genes Dev. 25, 373–384.
Shcherbakova, I., Hoskins, A.A., Friedman, L.J., Serebrov, V., Corre a, I.R., Jr., Xu, M.Q., Gelles, J., and Moore, M.J. (2013). Alternative spliceosome assem- bly pathways revealed by single-molecule fluorescence microscopy. Cell Rep. 5, 151–165.
Teigelkamp, S., Achsel, T., Mundt, C., Go thel, S.F., Cronshagen, U., Lane, W.S., Marahiel, M., and Lu hrmann, R. (1998). The 20kD protein of human [U4/U6.U5] tri-snRNPs is a novel cyclophilin that forms a complex with the U4/U6-specific 60kD and 90kD proteins. RNA 4, 127–141.
Tejedor, J.R., Papasaikas, P., and Valca rcel, J. (2014). Genome-wide identifi- cation of Fas/CD95 alternative splicing regulators reveals links with iron ho- meostasis. Mol. Cell 57. Published online December 4, 2014. http://dx.doi. org/10.1016/j.molcel.2014.10.029.
Tseng, C.K., and Cheng, S.C. (2008). Both catalytic steps of nuclear pre- mRNA splicing are reversible. Science 320, 1782–1784.
Wahl, M.C., Will, C.L., and Lu hrmann, R. (2009). The spliceosome: design prin- ciples of a dynamic RNP machine. Cell 136, 701–718.
Watson, E., MacNeil, L.T., Arda, H.E., Zhu, L.J., and Walhout, A.J. (2013). Integration of metabolic and gene regulatory networks modulates the C. ele- gans dietary response. Cell 153, 253–266.
Wong, A.K., Park, C.Y., Greene, C.S., Bongo, L.A., Guan, Y., and Troyanskaya, O.G. (2012). IMP: a multi-species functional genomics portal for integration, visualization and prediction of protein functions and networks. Nucleic Acids Res. 40 (Web Server issue), W484–W490.
Wong, J.J., Ritchie, W., Ebner, O.A., Selbach, M., Wong, J.W., Huang, Y., Gao, D., Pinello, N., Gonzalez, M., Baidya, K., et al. (2013). Orchestrated intron retention regulates normal granulocyte differentiation. Cell 154, 583–595.
Yang, Y., Han, L., Yuan, Y., Li, J., Hei, N., and Liang, H. (2014). Gene co- expression network analysis reveals common system-level properties of prog- nostic genes across cancer types. Nat. Commun. 5, 3231.
Yoshida, K., Sanada, M., Shiraishi, Y., Nowak, D., Nagata, Y., Yamamoto, R., Sato, Y., Sato-Otsubo, A., Kon, A., Nagasaki, M., et al. (2011). Frequent pathway mutations of splicing machinery in myelodysplasia. Nature 478, 64–69.
Zarnack, K., Ko nig, J., Tajnik, M., Martincorena, I., Eustermann, S., Ste vant, I., Reyes, A., Anders, S., Luscombe, N.M., and Ule, J. (2013). Direct competition between hnRNP C and U2AF65 protects the transcriptome from the exoniza- tion of Alu elements. Cell 152, 453–466.
Zhang, C., Frias, M.A., Mele, A., Ruggiu, M., Eom, T., Marney, C.B., Wang, H., Licatalosi, D.D., Fak, J.J., and Darnell, R.B. (2010). Integrative modeling de- fines the Nova splicing-regulatory network and its combinatorial controls. Science 329, 439–443.
SUPPLEMENTAL INFORMATION ATTACHED BELOW
U2AF65
SLU7 SF3B1 KD#1
SF3B1 KD#2
SNRPG KD#1
SNRPG KD#2
SRSF1
BRR2
-tubulin
-tubulin
-tubulin
-tubulin
- tubulin
- tubulin
- tubulin
- tubulin
56%
82%
90%
75%
96%
74%
69%
83%
β
β
β
β
−10
12
Sca
led
Z−sc
ore
SF3B1 KD #1
SF3B1 KD #2
FAS
OLR
1PA
X6
CH
EK
2FN
1ED
BFN
1ED
AN
UM
BB
CL2
L1R
AC
1M
STR
1M
AD
DC
AS
P2
APA
F1M
INK
1M
AP
3K7
MA
P4K
3N
OTC
H3
MA
P4K
2P
KM
2M
CL1
CFL
AR
CA
SP
9D
IAB
LOB
MF
CC
NE
1H
2AFY
STA
T3G
AD
D45
AB
IM_L
_EL
BIM
_EL
BIR
C5_
2BB
IRC
5_D
3S
MN
1S
MN
2P
HF1
9
−4−3
A B
C
SF3B1 KD #1 (OTP Pool)
CGCCAAGACUCACGAAGAUCCUCGAUUCUACAGGUUAUAGGCGGACCAUGAUAAUUUCCGGGAAGAUGAAUACAAA
-4-2
02
Scal
ed Z
-sco
re
p14 KD #1p14 KD #2
RobCor = 0.85
-3-2
-10
12
3
Scal
ed Z
-sco
reSNRPG KD #1SNRPG KD #2
RobCor = 0.77
RobCor = 0.82
SF3B1 KD #2 (oligo)
GACAGCAGAUUUGCUGGAUACGUGA(obtained from Corrionero et al., 2011)
P14 KD #1 (SiGENOME Pool)
GUGAUCACCUAUCGGGAUUGAACAGCUUAUGUGGUCUAUGAAGUAAAUCGGAUAUUGGAACACACCUGAAACUAGA
U2AF65
SLU7
SNRPG KD#1
SNRPG KD#2
SRSF1
BRR2
-tubulin
-tubulin
-tubulin
-tubulin
-tubulin
-tubulin
-tubulin
-tubulin
56%
90%
96%
69%
0
10
20
30
40
50
60
70
80
90
100
Control PRPF8 BRR2 PRPF3 PRPF31 PRPF4 PRPF6 SF3B1 KD#1 SRSF1 SLU7 U2AF1 U2AF2 P14 KD#1 P14 KD#2 SNRPG KD#1
SNRPG KD#2
Stauro
Normal Early apoptotic Late apoptotic Necrotic
N = 3
-4-2
02
4
Scal
ed Z
-sco
re IKSta_3hSta_8h
-4-2
02
4
Scal
ed Z
-sco
re
FAS
OLR
1PA
X6C
HEK
2FN
1ED
BFN
1ED
AN
UM
BBC
L2L1
RAC
1M
STR
1M
ADD
CAS
P2AP
AF1
MIN
K1M
AP3K
7M
AP4K
3N
OTC
H3
MAP
4K2
PKM
2VE
GFA
MC
L1C
FLAR
CAS
P9SY
KD
IABL
OBM
FC
CN
E1C
CN
D1
H2A
FYST
AT3
GAD
D45
ABI
M_L
_EL
BIM
_EL
BIR
C5_
2BBI
RC
5_D
3LM
NA
SMN
1SM
N2
PHF1
9
SMU1Sta_3hSta_8h
SF3B1Sta_3hSta_8h
-4-2
02
4
Scal
ed Z
-sco
re
FAS
OLR
1PA
X6C
HEK
2FN
1ED
BFN
1ED
AN
UM
BBC
L2L1
RAC
1M
STR
1M
ADD
CAS
P2AP
AF1
MIN
K1M
AP3K
7M
AP4K
3N
OTC
H3
MAP
4K2
PKM
2VE
GFA
MC
L1C
FLAR
CAS
P9SY
KD
IABL
OBM
FC
CN
E1C
CN
D1
H2A
FYST
AT3
GAD
D45
ABI
M_L
_EL
BIM
_EL
BIR
C5_
2BBI
RC
5_D
3LM
NA
SMN
1SM
N2
PHF1
9
D
% c
ells
P14 KD #1 (OTP Pool)
AUAUGGACCUAUUCGUCAACCUGAAGUAAAUCGGAUAUGACUAAAUCCCACGAAUGAGUGAUCACCUAUCGGGAUU
SNRPG KD #1 (SiGENOME Pool)AGCCUUGGAACGAGUAUAACUUUAUGAACCUUGUGAUAGUGGUAAUACGAGGAAAUAGACAAGAAGUUAUCAUUGA
SNRPG KD #1 (OTP Pool)GCAUUAAAGUGAAAGAAUCCGAGGAAAUAGUAUCAUCAAGGAAUAUUGCGGGGGAUUUGAGAUGGCGACUAGUGGAC
β
β
β
β
Depletion Efficiency
BA
-1 0 1 2 3
-2-1
01
23
4 Rcor = 0.12
-3 -2 -1 0 1 2
-2-1
01
23
4 Rcor = 0.24
-4 -3 -2 -1 0 1 2
-2-1
01
23
4 Rcor = 0.12
-1 0 1 2 3
-3-2
-10
1
Rcor = 0.56
-3 -2 -1 0 1 2
-3-2
-10
1 Rcor = 0.67
-4 -3 -2 -1 0 1 2
-3-2
-10
1 Rcor = 0.5
-1 0 1 2 3
-2-1
01
2 Rcor = 0.52
-3 -2 -1 0 1 2
-2-1
01
2 Rcor = 0.9
-4 -3 -2 -1 0 1 2
-2-1
01
2 Rcor = 0.78
IK_24h IK_48h IK_72h
SMU
1_72
hSM
U1_
48h
SMU
1_24
h
-6-4
-20
2
Sca
led
Z-sc
ore
FAS
OLR
1P
AX
6C
HE
K2
FN1E
DB
FN1E
DA
NU
MB
BC
L2L1
RA
C1
MS
TR1
MA
DD
CA
SP
2A
PA
F1M
INK
1M
AP
3K7
MA
P4K
3N
OTC
H3
MA
P4K
2P
KM
2V
EG
FAM
CL1
CFL
AR
CA
SP
9S
YK
DIA
BLO
BM
FC
CN
E1
CC
ND
1H
2AFY
STA
T3G
AD
D45
AB
IM_L
_EL
BIM
_EL
BIR
C5_
2BB
IRC
5_D
3LM
NA
SM
N1
SM
N2
PH
F19
IK_24hIK_48hIK_72h
-4-2
02
4
Sca
led
Z-sc
ore
FAS
OLR
1P
AX
6C
HE
K2
FN1E
DB
FN1E
DA
NU
MB
BC
L2L1
RA
C1
MS
TR1
MA
DD
CA
SP
2A
PA
F1M
INK
1M
AP
3K7
MA
P4K
3N
OTC
H3
MA
P4K
2P
KM
2V
EG
FAM
CL1
CFL
AR
CA
SP
9S
YK
DIA
BLO
BM
FC
CN
E1
CC
ND
1H
2AFY
STA
T3G
AD
D45
AB
IM_L
_EL
BIM
_EL
BIR
C5_
2BB
IRC
5_D
3LM
NA
SM
N1
SM
N2
PH
F19
SMU1_24hSMU1_48hSMU1_72h
-30
-20
-10
0
Sca
led
Z-sc
ore
FAS
OLR
1P
AX
6C
HE
K2
FN1E
DB
FN1E
DA
NU
MB
BC
L2L1
RA
C1
MS
TR1
MA
DD
CA
SP
2A
PA
F1M
INK
1M
AP
3K7
MA
P4K
3N
OTC
H3
MA
P4K
2P
KM
2V
EG
FAM
CL1
CFL
AR
CA
SP
9S
YK
DIA
BLO
BM
FC
CN
E1
CC
ND
1H
2AFY
STA
T3G
AD
D45
AB
IM_L
_EL
BIM
_EL
BIR
C5_
2BB
IRC
5_D
3LM
NA
SM
N1
SM
N2
PH
F19
SF3B1_24hSF3B1_48hSF3B1_72h
-3-2
-10
12
3
Sca
led
Z-sc
ore
Fas
Olr1
Pax
6C
hek2
FN1E
DB
FN1E
DA
NU
MB
BC
L2L1
Rac
1M
STR
1M
AD
DC
AS
P2
AP
AF1
MIN
K1
MA
P3K
7M
AP
4K3
NO
TCH
3M
AP
4K2
PK
M2
MC
L1C
FLA
RC
AS
P9
DIA
BLO
BM
FC
CN
E1
H2A
FYS
TAT3
GA
DD
45A
BIM
_L_E
LB
IM_E
LB
IRC
5_2B
BIR
C5_
D3
SM
N1
SM
N2
PH
F19
IK_48hSMU1_72h
01
23
45
Mea
n ab
s. Z
-sco
re
IK SMU1 SF3B1
24h48h72h
C
D
CELL FUNCTIONSPLICING TYPE
SPLICING TYPEce5ssmecomplex3ssir
CELL FUNCTIONapoptosis
bothcell proliferation
GENE CATEGORYSpliceosomeSplic. FactorsOther RNApr./ Misc.Chrom. Factors
GEN
E C
ATEG
ORY
+ cell proliferation- apoptosis
- cell proliferation+ apoptosis
−5 0 5Z−score
FAS
OLR
1C
HE
K2
APA
F1C
AS
P9
BM
FB
IRC
5_2B
FN1E
DB
FN1E
DA
BIM
_EL
BIM
_L_E
LG
AD
D45
AM
STR
1M
AP
3K7
H2A
FYM
AD
DR
AC
1N
OTC
H3
CC
NE
1M
AP
4K2
STA
T3M
AP
4K3
PK
M2
NU
MB
PAX
6C
AS
P2
CFL
AR
DIA
BLO
BC
L2L1
SM
N2
MIN
K1
PH
F19
SM
N1
BIR
C5_
D3
MC
L1
−1.0
−0.5
0.0
0.5
Mea
n Z−
scor
e
Cor
e S
pl.
Spl
. Fac
tors
Oth
er R
NA
pr.
Chr
om. F
acto
rs
−1.5
−1.0
−0.5
0.0
0.5
Mea
n Z−
scor
e
Pers
ist.
Spl
.
Tran
s. E
arly
Spl
.
Tran
s. M
id S
pl.
Tran
s. L
ate
Spl
.
EJC
B
C
A
A
SF3A2SNRPA1
SF3A1
SF3B4
SNRPB2
SF3B2
SF3B1
P14
SF3B3
SF3A3
LSM3
C20ORF14
TXNL4PRPF4
CD2BP2
PPIH
U5−200KD
DDX23
LSM6
HPRP8BP
LSM2
NHP2L1
U5−116KD
PRPF31
PRPF8
LSM4
USP39
SART1
LSM7PRPF3
SNRPD1
SNRPD3
SNRPB
SNRPG
SNRPF
SNRPD2
CRNKL1
XAB2
CTNNBL1SKIIPWBP11
PLRG1
PPIL1
PPIE
BCAS2
PRPF19
AQR
CDC5L
G10
KIAA1160DC
B
Persistent Trans. Early Trans. Mid. Trans. Late
Persistent
SF3B4 SNRPA1 SF3B3 SNRPA1 SF3A1 SNRPA1 SF3A3 SF3B3 SF3A3 SNRPA1
Trans. Early
PPIH SF3B2 PPIH SNRPA1 PPIH SF3B4 PRPF31 PRPF8 PPIH SF3B1
SNRPC SNRP70 RBM17 SR140 IK SMU1 PRPF31 C20ORF14 U2AF1 U2AF2
Trans. Mid.
KIAA1604 P14 KIAA1604 SF3B1 PRPF19 U5200KD XAB2 SNRPA1 XAB2 SF3B4
KIAA1604 DDX23 CDC40 DDX23 AQR DDX23 KIAA1604 PPIH CDC40 PPIH
CDC5L PLRG1 KIAA1604 CRNKL1 XAB2 CRNKL1 XAB2 CDC40 KIAA1604 XAB2
Trans. Late
C19ORF29 PRPF8 SLU7 P14 SLU7 SF3B2 C19ORF29 P14 SLU7 SF3B4
LENG1 SMNDC1 PRPF18 LSM7 PRPF18 LSM4 SLU7 PPIH LENG1 NHP2L1
SLU7 CRNKL1 SLU7 KIAA1604 SLU7 XAB2 SLU7 CDC40 LENG1 AQR
SLU7 C19ORF29 DHX8 DDX41 KIAA0073 FLJ35382 DHX8 C19ORF29
E
α-Tubulin
C IK SMU1 48h siRNA
72h
SMU1
IK C IK SMU1
C IK SMU1
48h 72h siRNA C IK SMU1
α-Tubulin
SMU1
IK
D
0
2500
5000
7500
10000
12500
15000
17500
20000
0 h 24 h 48 h 72 h
Control siRNA
IK KDSMU1 KDBoth KD
Cel
l num
ber
275 277451
218 174514
IK SMU1
IK SMU1
100 131208
IK SMU1
103 103143
IK SMU1
all genes (1458) p-value # genes all genes (1416) p-value # genesCellular Growth and Proliferation 8.49E-12 348 Cell Death and Survival 1.04E-08 311
Cell Death and Survival 1.07E-10 342 Cellular Growth and Proliferation 1.52E-08 314Gene Expression 1.12E-09 212 Cell-To-Cell Signaling and Interaction 1.37E-06 100Cellular Movement 2.90E-07 174 Gene Expression 1.59E-06 196
Cell Cycle 3.41E-06 135 Cellular Movement 6.62E-06 164
genes up (726) p-value # genes genes up (728) p-value # genesGene Expression 2.49E-13 139 Gene Expression 1.23E-08 119
Cell Death and Survival 1.91E-05 175 Cellular Development 1.63E-05 77Cellular Growth and Proliferation 1.93E-05 172 Cell Death and Survival 1.42E-04 154
Cell Cycle 7.94E-05 78 Cell Cycle 3.61E-04 68RNA Post-Transcriptional Modification 8.18E-05 20 DNA Replication, Recombination, and Repair 5.60E-04 26
genes down (732) p-value # genes genes down (688) p-value # genesCell-To-Cell Signaling and Interaction 1.01E-07 91 Cellular Movement 1.62E-09 102
Cellular Growth and Proliferation 1.15E-07 176 Cell-To-Cell Signaling and Interaction 1.70E-08 66Cellular Movement 7.29E-07 106 Cellular Growth and Proliferation 1.44E-07 169
Cell Death and Survival 1.72E-06 170 Cell Death and Survival 2.77E-06 159Cell Morphology 5.65E-06 102 Cell Cycle 4.94E-05 61
all exons (552) p-value # genes all exons (583) p-value # genesRNA Post-Transcriptional Modification 3.67E-11 49 Gene Expression 3.58E-09 202
Gene Expression 5.46E-09 195 RNA Post-Transcriptional Modification 4.60E-07 38Cellular Assembly and Organization 5.27E-05 171 Cellular Assembly and Organization 2.25E-06 151Cellular Function and Maintenance 5.82E-05 148 Cellular Function and Maintenance 2.25E-06 127
Cell Death and Survival 5.95E-05 237 Cell Cycle 4.58E-06 133
exons up (308) p-value # genes exons up (339) p-value # genesCell Cycle 2.04E-05 98 Gene Expression 2.76E-06 136
Cell Morphology 6.41E-05 114 Cellular Assembly and Organization 7.41E-06 118Gene Expression 8.80E-05 132 Cellular Function and Maintenance 1.87E-05 101
Cellular Assembly and Organization 1.88E-04 125 Protein Synthesis 2.75E-05 48Cellular Function and Maintenance 1.88E-04 111 Cell Cycle 1.60E-04 92
exons down (246) p-value # genes exons down (246) p-value # genesRNA Post-Transcriptional Modification 1.28E-09 26 RNA Post-Transcriptional Modification 5.16E-07 21
Gene Expression 2.49E-08 72 Gene Expression 3.85E-05 65Cell Morphology 4.15E-05 41 Cell Death and Survival 5.15E-05 86
Cell Cycle 5.91E-05 59 Cellular Assembly and Organization 1.33E-04 40Cellular Assembly and Organization 5.91E-05 54 Cellular Function and Maintenance 1.33E-04 38
IK KD gene ontologies SMU1 KD gene ontologies
PARPα-
FAS
MCL1
α-Tubulin
% ce
lls
F
E
CB
G
Gene expression up
Gene expression down
Alternative splicing up
Alternative splicing down
ex 6 inc
ex 6 skip
ex 2 inc
ex 2 skip
A
0%
10%
20%
30%
40%
50%
60%
70%
80%
90%
100%
CN48
CN72
CN96
IK K
D48
IK K
D72
IK K
D96
SMU1 KD48
SMU1 KD72
SMU1 KD96
Both KD48
Both KD72
Both KD96
STAURO
STA3
early apoptotic
late apoptotic
necrotic
normal cells
N=6N=6
−4 −2 0 2 4
−4−2
02
PC1
PC
2
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17 18
19
20
21
22
23
24
25
26
2728
293031
32
33
34
35
36
3738
39
40
41
42
43
44
45
46
4748
49
50
51
52
53
54
55
56
57
58
59
60
61
6263
64
65
66
67
68 69
70
71
72
7374
75
76
77
78
79
80
81
82
83
84
85
8687
88
89
90
91
92
93
94
95
HNRPC−U2AF1
NHP2L1−SF3B4
SF3B2−SMU1
OLR1
CHEK2
BCL2L1MINK1
MAP3K7PKM2
MCL1
CFLARSTAT3
PHF19
nº nº nº nº nº nº nº nº nº nº1 SF3B1 SFRS10 11 IK SF3B4 21 SMNDC1 XAB2 31 DDX10 MORG1 41 DICER1 SFPQ 51 DDX31 NHP2L1 61 BRD4 RNPC2 71 SFRS1 TAF15 81 CPSF5 FRG1 91 DDX52 JMJD2B2 DBR1 SKIIP 12 NHP2L1 SNRPA1 22 DHX30 MGC2655 32 LSM6 MGC20398 42 KIAA0073 SFPQ 52 DDX52 SART1 62 NONO RBM15 72 DIS3 YBX1 82 DEK PPIE 92 MGC13125 SMARCA43 HNRPC U2AF1 13 SMU1 SNRPA1 23 BCAS2 MGC13125 33 DGCR14 SFRS11 43 NCBP2 NONO 53 EXOSC4 USP39 63 ILF3 SKIV2L2 73 DHX9 HNRPH2 83 BUB3 SDCCAG10 93 ACIN1 PRMT54 NHP2L1 SF3B1 14 NHP2L1 SNRPB2 24 THOC2 WTAP 34 DDX52 SFRS2 44 DHX30 NONO 54 FRG1 SNRPB 64 MGC23918 SKIV2L2 74 FNBP3 HNRPH2 84 ILF3 PPIG 94 ACIN1 CTCF5 RBM25 SF3B1 15 D6S2654E USP39 25 CD2BP2 GTL3 35 ELAVL1 SFRS10 45 HNRPA0 NONO 55 DHX15 SNRPG 65 AOF2 DDX31 75 MORF4L1 TIAL1 85 EZH2 PPIG 95 KHDRBS1 MORF4L16 IK SF3B2 16 C22ORF19 RY1 26 CTNNBL1 YBX1 36 P29 SRRM2 46 LENG1 RBM35A 56 FRG1 SNRPG 66 ACIN1 MGC2655 76 DDX52 MFAP1 86 CRK7 SUV39H17 NHP2L1 SF3A1 17 NHP2L1 SNRPD1 27 PPIH RBM25 37 EXOSC4 NCBP2 47 BRD4 C20ORF14 57 DDX52 LSM2 67 HNRPA1 ZNF207 77 FRG1 MFAP1 87 ILF2 SETD1A8 NHP2L1 SF3A3 18 NHP2L1 SNRPF 28 SF3B1 SMU1 38 HNRPL SNRP70 48 BRD4 PRPF4 58 ASH2L PPM1G 68 DHX15 HSPC148 78 CCDC55 KIN 88 HDAC4 SETD29 IK SF3A3 19 DDX11 DDX31 29 SF3B2 SMU1 39 FRG1 SNRP70 49 NOSIP PRPF31 59 FNBP3 SHARP 69 CTCF HNRPD 79 EP300 NOSIP 89 CCDC55 NAB2
10 NHP2L1 SF3B4 20 SFRS1 XAB2 30 SF3A1 SMU1 40 P29 SNRPA 50 FRG1 PRPF31 60 RBM25 SHARP 70 EED ILF3 80 C20ORF14 FRG1 90 DHX15 JMJD2B
edge edge edge edgeedge edge edge edge edge edge
B
C
−1.0 −0.5 0.0 0.5 1.0
−2−1
01
Z−score HNRPC
Z−sc
ore
U2A
F1
OLR1
CHEK2
FN1EDB
FN1EDA
BCL2L1
MSTR1
MADD
MINK1NOTCH3
MAP4K2
PKM2
MCL1
GADD45A
BIM_L_EL
BIM_EL
PHF19
Cor = −0.004
−3 −2 −1 0 1 2
−2−1
01
Z−score HNRPC
Z−sc
ore
U2A
F1
CASP2
MAP3K7
BMF
APAF1
CFLARRAC1
CCNE1
DIABLO
H2AFY
NUMB
STAT3
FAS
MAP4K3
PAX6
CASP9SMN1
SMN2
BIRC5_D3
BIRC5_2B
Cor = 0.773
A
−1.0 −0.5 0.0 0.5 1.0 1.5
−2−1
01
Z−score HNRPC
Z−sc
ore
U2A
F1
FASOLR1
PAX6
CHEK2
FN1EDB
BCL2L1
RAC1
MSTR1
CASP2
MINK1
MAP3K7
MAP4K3
NOTCH3
MAP4K2
PKM2
MCL1
BMF
STAT3
GADD45A
BIM_L_ELBIM_EL
PHF19
Cor = 0.0659
−3 −2 −1 0 1
−2.0
−1.5
−1.0
−0.5
0.0
0.5
1.0
Z−score HNRPC
Z−sc
ore
U2A
F1
DIABLO
FN1EDA
H2AFY
CCNE1
NUMB
APAF1
MADD
CFLAR
CASP9BIRC5_2B
BIRC5_D3
SMN1SMN2
Cor = 0.899D
E
Functional splicing network reveals extensive regulatory potential of the core
Spliceosomal machinery
Panagiotis Papasaikas1,2, *, J. Ramón Tejedor1,2,*, Luisa Vigevani1,2
and Juan Valcárcel1-‐3
Supplemental Information.
This file contains the following items:
-‐ Supplementary Figure legends
-‐ Supplementary Table legends
-‐ Supplementary Methods
-‐ Supplementary References
Supplementary Figure legends.
Figure S1. Consistent profiles of AS changes and limited cell death induced by the
knockdown of core components using different siRNAs. A. AS perturbation profiles
obtained for depletion of SNRPG, P14 or SF3B1 using two different siRNAs . Robust
correlation between conditions and siRNA sequences used for factor depletion are
indicated. Positive Z-‐scores indicate changes towards exon inclusion, negative values
indicate exon skipping. B. Knockdown efficiency for a subset of splicing factor
components. Western blot analyses of protein depletion upon siRNA treatment for 6
spliceosome factors, for two of them two independent siRNA libraries, as indicated
in A. Depletion efficiency was calculated by quantifying the levels of western blot
signals using ImageQuant TL software (GE Healthcare Life sicences). C.
Quantification of cell death upon depletion of core spliceosomal components or
treatment with staurosporin. Data represents the percentage of normal, early
apoptotic, late apoptotic or necrotic cells analyzed by flow cytometry sorting after
staining with annexin V and propidium iodide. Values represent the mean and
standard deviation of 3 independent biological replicates. D. Induction of apoptosis
by staurosporine leads to AS changes distinct from those induced by the knock down
or core splicing factors. Perturbation profiles were measured after 3 or 8 hours of
staurosporin treatment (1 mM) and compared with the perturbation profiles
obtained upon knockdown of IK, SMU1 or SF3B1 for 72 hours.
Figure S2. Differences in knockdown conditions do not significantly alter
correlation values within the network. A. Perturbation profiles upon IK, SMU1 or
SF3B1 knockdown for 24, 48 or 72 hours across the 35 splicing events used for
network generation. B. Perturbation profiles upon IK or SMU1 depletion at 48 and 72
hours, respectively. C. Overall effects in AS changes across the 35 events after 24, 48
or 72 hours of siRNA treatment. Box plots represent the median and spread of Z-‐
scores of the effects across different events upon the knockdown of IK, SMU1 or
SF3B1 for the different timepoints. D. Correlation matrix and robust correlation
values between IK or SMU1 knockdowns at different time points.
Figure S3. Regulation of AS events involved in cell proliferation or apoptosis based
on functional annotations. A. Heatmap representation of the results obtained in the
screen experiment based on functional annotations of the AS events. Blue or red
colors indicate splicing changes towards more cell proliferation/less apoptosis or less
cell proliferation/more apoptosis for a given knockdown condition, respectively.
Data are clustered on both dimensions (similarity measure based on Pearson
correlation, ward linkage). B and C. Extent and direction of AS changes upon
knockdown of subsets of splicing factors. Box-‐plots represent the skewness and
spread of the mean z-‐scores for AS splicing changes observed across the AS events
analyzed for core and non-‐core splicing factors, other RNA processing factors and
chromatin factors (A) or for factors that are either persistent or transient
components of the spliceosome at different stages of assembly (B). EJC corresponds
to components of the Exon Junction Complex.
Figure S4. Sub-‐network topologies. A. Detailed functional interactions of U2snRNP
proteins. B. Detailed functional interactions among U4/5/6 tri-‐snRNP proteins. C.
Functional interactions between PRP19 and Nineteen Complex components. D.
Functional interactions between Sm core proteins. E. Top 5 functional network
connections between persistent and different transient components of the
spliceosome at different stages of assembly based on robust correlation values.
Figure S5. Genome-‐wide regulation of AS by IK and SMU1 and links with cell
proliferation and apoptosis. A. Depletion of IK and SMU1 protein levels by siRNA-‐
mediated knockdown at 48 or 72 hours. The proteins were detected by western blot
analysis under the indicated conditions. B. Overlap between gene expression
changes upon knockdown of IK or SMU1 by RNAi. Upper panel: up-‐regulated genes.
Lower panel: down-‐regulated genes. C. Overlap in AS changes upon knockdown of
IK or SMU1 by RNAi. Upper panel: changes leading to up-‐regulation of alternative
exons. Lower panel: changes leading to down-‐regulation of alternative exons. D.
Gene ontology analysis for genes or exons up-‐ or down-‐regulated upon IK or SMU1
knockdown using Ingenuity pathway tools. E. Induction of α-‐PARP-‐cleavage, a
feature of apoptotic cells, upon siRNA-‐mediated knockdown of IK, SMU1 or both.
Western blots (two upper panels) were carried out using protein extracts of cells
transfected with specific siRNAs or controls. β-‐tubulin was used as loading control.
RT-‐PCR assays (lower two panels) were used to detect changes in AS of Fas/CD95
and MCL1 endogenous transcripts. F. IK and/or SMU1 knock reduce cell
proliferation. HeLa cells were transfected in sixtuplicate with siRNAs against IK,
SMU1 or the combination of both and cell numbers were determined using
resazurin (which measures aerobic respiration) at different time points. Error bars
represent standard deviations for a given siRNA condition. G. IK and/or SMU1 knock
promote apoptosis. HeLa cells were transfected in sextuplicate with siRNAs against
IK, SUM1 or both and IK and SMU1 were depleted by siRNA transfection, isolated at
different time points and different cell populations analyzed by flow cytometry
sorting after staining with annexin V and propidium iodide. The fraction of cells
under different knockdown conditions is shown.
Figure S6. Analysis of variable functional interactions in the AS network. A.
Principal components analysis of functional interactions extracted by subsets of AS
events (see also Methods). Colors indicate the positions of the interactions in the
Principal Component Analysis, which captures the different relative contributions of
the 35 AS events in defining these connections. The top ten discriminatory AS
events based on their loadings on the first two principal components are shown as
grey arrows. The numbers identify the functional connections between splicing
factors and are detailed in the table at the bottom. B and C. Representation of z-‐
score values of AS changes upon knockdown of U2AF1 or HNRPC for AS events
harboring (B) or not (C) composite HNRPC binding/3’ splice site-‐like elements within
400 nucleotides of the alternative cassette exon boundaries. Robust correlation
values are indicated within each panel. D and E. Representation of z-‐score values of
AS changes upon knockdown of U2AF1 or HNRNPC for AS events harboring (D) or
not (E) Alu elements within 0.5 kbp of the alternative cassette exon boundaries.
Table S1. Information on AS events analyzed in this study. The information includes
gene symbol, id, exon sequence and genomic coordinates for the AS exons analyzed.
Table S2. Information about the custom siRNA library utilized to knockdown
splicing and chromatin factors. Information includes siRNA pools catalog numbers,
gene symbol, id and accession numbers for the targeted genes. It also includes
information about the categories in which their protein products were classified
according to two publications (Wahl et al., 2009 and Zhou et al., 2002), the
classifications used in our network analysis and also information about their
functions and known interactions.
Table S3. Raw data of AS changes upon knockdown of splicing and chromatin
factors. Each sheet contains the data corresponding to one AS event. Data include
information about the genes knocked down, the corresponding siRNA pools,
quantification of alternative isoforms by RT-‐PCR and high-‐throughput capillary
electrophoresis and percentage of exon inclusion (PSI index) for each of the three
biological replicas as well as the median, standard deviation, p value and z-‐scores for
the three values.
Table S4 . Compendium of raw data of AS changes upon knockdown of splicing and
chromatin factors (classified according to factor classes), upon knockdown of P14,
SF3B1 and SNRPG with two independent siRNA libraries, upon knockdown of IK-‐
SMU1 core splicing factors at different time points or in different cell lines, upon
treatment with drugs targeting the spliceosome and upon pharmacological
treatments involved in iron homeostasis modulation (data from Tejedor et al.,
accompanying manuscript).
Table S5. Robust correlation values and glasso inverse covariance estimates for the
complete network edge-‐list.
Table S6. Oligonucleotide sequences and information about amplicons utilized in
semi quantitative RT-‐PCR (sheet 1) assays or real time PCR analysis (sheet 2) used
in this study
Detailed information and references for the Splicing events analyzed can be found
at the following link:
https://s3.amazonaws.com/SplicingNet/ASEs.zip
The files include a schematic representation for every AS event; information
regarding the functional role of the gene, as retrieved from uniprot / genecards
database; the protein domain affected by the AS event; the functional effect of each
of the isoforms towards cell proliferation or apoptosis; and 102 references consulted
for the selection.
Supplementary Methods
Cell lines
HeLa CCL-‐2 cells and HEK293 cells were purchased from the American Type Culture
Collection (ATCC). Cells were cultured in Glutamax Dulbecco’s modified Eagle’s
medium (Gibco, Life Technologies) supplemented with 10% fetal bovine serum
(Gibco, Life Technologies) and penicillin/streptomycin antibiotics (penicillin 500
u/ml; streptomycin 0.5 mg/ml, Life Technologies). Cell culture was performed in cell
culture dishes in a humidified incubator at 37oC under 5% CO2.
Drug Treatments
Pharmacological treatments with splicing arresting drugs were performed as
described in Corrionero et al. (2011). HeLa cells were treated for 3 hours with
Spliceostatin A 260 nM or 8 hours with Meayamycin 20 nM or TG003 10 µM. RNA
was isolated using the Maxwell 16 LEV simplyRNA kit (Promega) and RT PCRs for the
35 splicing events were performed as described above.
Cell proliferation assays
HeLa cells were forward-‐transfected with siRNAs targeting IK, SMU1 or both genes
by plating 2500 cells over a mixture containing siRNA-‐RNAiMAX lipofectamine
complexes, as previously described. Indirect estimations of cell proliferation were
obtained 24, 48 and 72 hours post transfection by treating the cells for 4h with
resazurin (5 µM) prior to the fluorescence measurements (544 nm excitation and
590 nm emission wavelengths). Plate values were obtained with an Infinite 200 PRO
series multiplate reader (TECAN).
Cell apoptosis assays
Hela cells were transfected with siRNAs against IK, SMU1 or both genes, as described
above. Cells were trypsinized and collected for analysis 72 hours post transfection.
Pellets were washed once with complete DMEM medium (10% serum plus
antibiotics), and three times with PBS 1X. Cells were then resuspended in 200 µl
Binding buffer provided by the Annexin V-‐FITC Apoptosis Detection Kit
(eBiosciences), resulting in a final density of 4x105 cells/ml. 5 µl of Annexin V-‐FITC
was added to each sample followed by 10 min incubation at room temperature in
dark conditions. Cells were washed once with 500 µl binding buffer and incubated
with 10 µl propidium iodide (20 µg/ml) for 5 minutes. FACS analysis was performed
using a FACSCalibur flow cytometry system (BD Biosciences).
Western blots
Protein extraction was performed in RIPA buffer and protein abundance was
estimated by Bradford assay. Thirty to Forty µg of total protein were fractionated by
electrophoresis in 10% Bis-‐Tris polyacrilamide gels and transferred to PVDF
membranes. Primary antibodies used in this study are listed below: IK (rabbit
polyclonal, Origene TA308401), SMU1 (mouse monoclonal, Santa Cruz Biotech, sc-‐
100896 JS-‐12), PARP (rabbit polyclonal, cell signaling, 9542S), β-‐Tubulin (mouse
monoclonal, Sigma-‐Aldrich, T4026), SF3B1 (rabbit polyclonal, Abcam, ab39578),
SNRPG (rabbit polyclonal, Abcam, ab111194), SLU7 (rabbit polyclonal, gift from
Robin Reed), U2AF65 (mouse monoclonal, MC3 clone), SRSF1 (mouse monoclonal,
gift from Adrian Krainer), BRR2 (rabbit polyclonal, Sigma, -‐HPA029321)
Real time qPCR
First strand cDNA synthesis was set up with 500 ng of RNA, 50 pmol of oligo-‐dT
(Sigma-‐Aldrich), 75 ng of random primers (Life Technologies), and superscript III
reverse transcriptase (Life Technologies) in 20 µl final volume, following the
manufacturer’s instructions. Quantitative PCR amplification was carried out using 1
µl of 1:2 to 1:4 diluted cDNA with 5 µl of 2X SYBR Green Master Mix (Roche) and 4
pmol of specific primer pairs in a final volume of 10 µl in 384 well-‐white microtiter
plates, Roche). qPCR mixes were analyzed in triplicates in a Light Cycler 480 system
(Roche) and fold change ratios were calculated according to the Pfaffl method
(Pfaffl, 2001). Primers sequences used in this study are indicated in Table S6.
Affymetrix arrays
HeLa cells were transfected with siRNAs against IK, SMU1 or RBM6 in biological
triplicate as described above or –for RBM6-‐ as described in Bechara et al., 2013. RNA
was isolated using the RNeasy minikit (Quiagen). RNA quality was estimated by
Agilent Bioanalyzer nano assay and samples were hybridized to GeneChip Human
Exon 1.0 ST Array (Affymetrix) by Genosplice Technology. Array analysis was
performed by Genosplice Technology and the data (gene expression, alternative
splicing changes and gene ontologies upon IK, SMU1 knockdown conditions) is
available at: https://s3.amazonaws.com/SplicingNet/HJAY_arrays.zip and at GEO
database with the accession number GSE56605 .
Glasso-‐based Network Reconstruction from Robust Correlation Estimates
Here we describe the process of deriving an undirected network that models
functional interconnections of splicing perturbations based on the effects of gene
KDs/cell-‐perturbing stimuli on ASEs. The main steps are:
1. Data preprocessing and reduction. This involves:
1.1. Removal of sparse variables and imputation of missing values.
1.2. Removal of uninformative events.
1.3. Data scaling (standardization).
2. Robust sample covariance estimation.
3. Network reconstruction based on the graphical lasso (glasso) model selection
algorithm.
In the next sections we describe the basis and algorithmic details for each of the
above steps.
1. Data preprocessing and reduction.
1.1 Removal of sparse variables and imputation of missing values
Estimation of the sample covariance matrix requires complete data. However the p x
n matrix that contains the measurements of the p variables (genes) in n conditions
(ASEs) includes missing (NA) values. In order to overcome this problem, first sparse
variables (#NA values > n/2) are removed. The remaining missing values are
subsequently filled-‐in by applying k-‐nearest-‐neighbor based imputation (Hastie et al.,
1999, R function impute available as part of the bioconductor project at
http://www.bioconductor.org/). For the purposes of this work we set k=ceiling(0.25
sqrt(p)).
1.2 Removal of uninformative ASEs.
ASEs not affected to a significant degree (median absolute (ΔPSI)<1) by the KDs offer
little information on the genes’ covariance and can potentially introduce noise and
are thus discarded from subsequent analyses.
1.3 Data Scaling.
Prior to sample covariance estimation the data matrix is standardized along the gene
dimensions. Standardization of the events ensures that only the shape and not the
magnitude of the fluctuations is taken into account for covariance estimation and it
is equivalent to using correlation in lieu of covariance in all downstream analyses.
2. Robust sample correlation estimation.
When outliers are present they dominate the correlation estimates. In order to
overcome the sensitivity of sample correlation to outliers we derive robust
correlation estimates based on a weighting algorithm that attempts to discriminate
between technical outliers and reliable influential measurements.
In particular, starting with an M = p x n dataset of p standardized variables and n
samples we wish to derive a robust correlation p x p matrix. We calculate the
influence of a sample u in the estimation of the correlation of two variables Xa, Xb
using the deleted residual distance:
Where p(Xa, Xb) and p-‐u(Xa,Xb) are the Pearson’s correlation estimates before or after
removing the observations Mau, Mbu coming from sample u.
A measure of the reliability 𝑅!,!! of these measurements for the estimation of p(Xa,
Xb) given all the observed data is then based on their cumulative influence on high
correlates of Xa, and Xb:
where 𝕀 𝑎, 𝑖 is the indicator function:
T being a minimum correlation threshold for considering a pair of variables.
!! !! ,!! = ! !! ,!! − !!! !! ,!! !
!! ,!! = 1/ ! !! !! ,!! !(!, !)+ !! !! ,!! !(!, !)!!!!
!!!!
!(!, !)+ !(!, !)!!!!
!!!!
!
! !, ! ≔ 1!if#! ≠ !!!"#!|! !! ,!! | > !!0!otherwise !
The final robust correlation estimates are finally calculated as the weighted
Pearson’s correlation p(Xa, Xb;w) where w is a vector of weights of length n
containing the values 𝑅!,!! for every observation u.
Intuitively, if an observation Mau distorts correlation estimates among all highly
correlated partners of Xa it is considered an outlier and is down-‐weighted. On the
other hand if including this observation yields similar correlation estimates for most
of these pairs it is considered reliable even if its influence on the individual estimate
of p(Xa, Xb) is high. The advantage of this procedure is that it derives weights for
individual measurements (as opposed to a single weight for every sample) while
taking into account the complete dataset.
The algorithm can be used iteratively, however in our experience the estimates
converge within 1 to 2 iterations. R code for implementing this algorithm is available
upon request. Examples of robust correlation estimates and regression based on this
metric and taken from our dataset are shown below:
3. Graphical model selection for Network Reconstruction using Graphical Lasso
−4 −3 −2 −1 0 1
−5−4
−3−2
−10
1
Z−score RBM17
Z−sc
ore
CRN
KL1
FasOlr1 Pax6Chek2
FN1EDB
FN1EDA NUMB
BCL2L1
Rac1
MSTR1
MADD
CASP2
APAF1
MINK1
MAP3K7
MAP4K3NOTCH3 MAP4K2PKM2
MCL1
CFLARCASP9
DIABLO
BMF
CCNE1
H2AFY
STAT3
GADD45ABIM_L_EL
BIM_ELBIRC5_2B
BIRC5_D3
SMN1
SMN2
PHF19
ClassCor = 0.66RobCor = 0.03
−4 −3 −2 −1 0 1 2
−3−2
−10
12
34
Z−score CDC5L
Z−sc
ore
SKI
IP
FasOlr1
Pax6Chek2
FN1EDB
FN1EDA
NUMB
BCL2L1
Rac1
MSTR1
MADD
CASP2
APAF1
MINK1MAP3K7
MAP4K3
NOTCH3
MAP4K2
PKM2
MCL1
CFLARCASP9
DIABLO
BMF
CCNE1
H2AFY
STAT3
GADD45A
BIM_L_ELBIM_EL
BIRC5_2B
BIRC5_D3
SMN1SMN2PHF19
ClassCor = 0.26RobCor = 0.5
Undirected graphical models (UGMs) such as Markov random fields (MRFs)
represent real-‐world networks and attempt to capture important structural and
functional aspects of the network in the graph structure. The graph structure
encodes conditional independence assumptions between variables
corresponding to nodes of the network. The problem of recovering the structure
of the graph is known as model selection or covariance estimation.
In particular, let G=(V,E) be an undirected graph on p=|V| nodes. Given n
independent, identically distributed (i.i.d) samples of X=(X1,…,Xp), we wish to
identify the underlying graph structure. We restrict the analysis to Gaussian
MRFs where the model assumes that the observations are generated from a
multivariate Gaussian distribution N(µ,Σ). Based on the observation that in the
Gaussian setting, zero components of the inverse covariance matrix Σ-‐1
correspond to conditional independencies given the other variables, different
approaches have been proposed in order to estimate Σ-‐1.
Graphical lasso (gLasso) provides an attractive solution to the problem of
covariance estimation for undirected models, when graph sparsity is a goal
(Friedman et al., 2008). The gLasso algorithm has the advantage of being
consistent, i.e. in the presence of infinite samples its parameter estimates will be
arbitrarily close to the true estimates with probability 1.
L1 regularization (lasso) is a smooth form of subset selection for achieving
sparsity. In the case of gLasso the model is constructed by optimizing the log-‐
likelihood function:
Where Θ is an estimate for the inverse covariance matrix Σ-‐1, S is the empirical
covariance matrix of the data, ||Θ||1 is the L1 norm i.e the sum of the absolute
values of all elements in Θ -‐1, and r is a regularization parameter (in this case
selected based on estimates of the FDR, see below). The solution to the glasso
optimization is convex and can be obtained using coordinate descent.
The glasso process for network reconstruction was implemented using the glasso
R package by Jerome Friedman, Trevor Hastie and Rob Tibshirani
(http://statweb.stanford.edu/~tibs/glasso/).
4. Partition of Network into modules.
Network modules or communities can be defined loosely as sets of nodes with a
more dense connection pattern among their members than between their members
and the remainder of the. Module detection in real-‐world graphs is a problem of
considerable practical interest as they often capture meaningful functional
groupings.
Typically community structure detection methods try to maximize
modularity, a measure of the overall quality of a certain network partition in terms
of the identified modules (Newman, 2006). In particular, for a network partition p,
the network modularity M(p) is defined as:
log detΘ− !" !Θ − ! Θ !!
! ! = !!! −
!!2!
!!
!!!!
where m is the number of modules in p, lk is the number of connections within
module k, L is the total number of network connections and dk is the sum of the
degrees of the nodes in module k.
Here, we identify modules of genes that exhibit similar perturbation profiles among
the assayed events by maximizing the network’s modularity using the greedy
community detection algorithm (Clauset et al., 2004) implemented in the
fastgreedy.community function of the igraph package (Csardi and Nepusz, 2006,
http://igraph.sourceforge.net/doc/R/fastgreedy.community.html).
Supplemental References
Bechara EG, Sebestyén E, Bernardis I, Eyras E, Valcárcel J. (2013). RBM5, 6, and 10
differentially regulate NUMB alternative splicing to control cancer cell proliferation.
Mol Cell. 52(5):720-‐33.
Clauset A, Newman ME, Moore C. (2004). Finding community structure in very large
networks. Phys Rev E Stat Nonlin Soft Matter Phys. 70(6 Pt 2):066111.
Corrionero A, Miñana B, Valcárcel J. (2011). Reduced fidelity of branch point
recognition and alternative splicing induced by the anti-‐tumor drug spliceostatin A.
Genes Dev. 25(5):445-‐59.
Csardi G, Nepusz T. (2006). The igraph software package for complex network
research. InterJournal. Complex Systems 1695.
Friedman J, Hastie T, Tibshirani R. (2008). Sparse inverse covariance estimation with
the graphical lasso. Biostatistics. 9(3):432-‐41.
Hastie T, Tibshirani R, Sherlock G, Eisen M, Brown P, Botstei D. (1999). Imputing
Missing Data for Gene Expression Arrays. Technical report Stanford Statistics
Department.
Newman ME. (2006). Modularity and community structure in networks. Proc Natl
Acad Sci U S A. 103(23):8577-‐82.
Zhou Z, Licklider LJ, Gygi SO, Reed R. (2002). Comprehensive proteomic analysis of
the human spliceosome. Nature 419: 182-‐5.