Post on 23-Jan-2016
Smith-Waterman vs Blast in siRNA Oligonucleotide Design and Selection
Christine Lee
Dr. Cecilie Boysen, Ph.D.
Paracel, Applied High Performance Computing
Southern California Bioinformatics Institute
Summer 2004
Funded by the National Science Foundation and National Institutes of Health
Outline
•History of RNAi
•Small interfering RNA (siRNA) Mechanism
•siRNA design and selection
•Blast vs Smith-Waterman
•Project Objectives and Results
•Conclusions & Future Work
History of RNAi
•Discovered in 1998 by Andrew Fire, Craig Mello, and colleagues
•RNAi – silencing of gene expression by dsRNA molecules
•Organism used: Caenorhabditis elegans
Short interfering RNA (siRNA) Mechanism
http://www.bioteach.ubc.ca/MolecularBiology/AntisenseRNA/siRNA.gif
siRNA Selection & Design: Avoiding Cross-Hybridization
•Important to guard against strong cross-hybridization to other genes
•Cross-hybridization with non-specific targets results in wasted lab time and materials, as well as inaccurate conclusions
•Preliminary sequence analysis allows verification of candidate oligos to protect against cross-hybridization
siRNA Selection & Design
Hybridization concerns:
•siRNA mismatch tolerance
•Insertion/deletion vs mismatch
Query: 1    GAACTTATCTTCCTTCTTC 19
            |||||||||||||||||||
Sbjct: 3783 GAACTTATCTTCCTTCTTC 3801
Query: 19  GAAGAAGGAAGATAAGTTC 1
           ||||||||| || ||||||
Sbjct: 778 GAAGAAGGATGAGAAGTTC 796
Blast vs Smith-Waterman
•Blast may potentially miss relevant alignments
  •Using word size seven, nearly 6% of all possible alignments with three mismatches between 21-mers will be missed
  •Increasing word size or allowing more mismatches contributes to a higher rate of missed hits
•Smith-Waterman is said to have higher sensitivity, so why not use it?
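The missed-alignment figure can be checked by brute force. The sketch below is an illustration, not the method used in the project: it assumes an ungapped 21-mer alignment and assumes a hit is found only when at least one exact run of word-size matching bases exists, ignoring extension and E-value cutoffs. The function name `count_seedless` is hypothetical.

```python
from itertools import combinations


def count_seedless(length=21, mismatches=3, word_size=7):
    """Count mismatch placements that leave no exact run of word_size matches.

    Returns (missed, total). Assumes an ungapped alignment and that a hit is
    found only if at least one exact word of word_size bases exists.
    """
    missed = 0
    total = 0
    for positions in combinations(range(length), mismatches):
        total += 1
        # Sentinels mark the position before the first base and after the last.
        bounds = (-1,) + positions + (length,)
        longest_run = max(b - a - 1 for a, b in zip(bounds, bounds[1:]))
        if longest_run < word_size:
            missed += 1
    return missed, total


if __name__ == "__main__":
    for w in (7, 11):
        missed, total = count_seedless(word_size=w)
        print(f"word size {w}: {missed}/{total} = {missed / total:.1%} missed")
```

Under these assumptions the enumeration gives 84 of 1,330 placements (about 6.3%) for word size seven, in line with the "nearly 6%" figure, and a much larger share for word size eleven.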
Project Objectives
•Test set: 10,000 19-mer oligos/siRNAs
•Test database: RefSeq
•Comparison study between Blast and Smith-Waterman
•15/19 -> Percent Identity threshold set to 78% … e-value adjustment from default of 10. E-value 500 used
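The link between the 15/19 cutoff and the 78% threshold is plain arithmetic. The following is a minimal sketch with hypothetical helper names, not the project's own filtering code.

```python
import math


def percent_identity(matches, length):
    """Percent identity of an alignment with `matches` identical columns."""
    return 100.0 * matches / length


def min_matches_for_identity(length, threshold):
    """Smallest number of matches whose identity reaches `threshold` (0 to 1)."""
    return math.ceil(length * threshold)


if __name__ == "__main__":
    print(min_matches_for_identity(19, 0.78))  # matches needed out of 19
    print(round(percent_identity(15, 19), 1))  # identity of a 15/19 hit
    print(round(percent_identity(14, 19), 1))  # identity of a 14/19 hit
```

A 15/19 hit is about 78.9% identical and passes a 78% threshold, while a 14/19 hit (about 73.7%) does not.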
A Closer Look at Smith-Waterman & Blast Parameters

Smith-Waterman
Query: 19   TCACCGTAGATGCTCTTTC 1
            || |||| |||||||||||
Sbjct: 2376 TC-CCGTGGATGCTCTTTC 2393
Score / (ID): 29, 17/19 (89%)
Param: default
Match: +2   Mismatch: -2   Gap Total: -3

Blast
Query: 1    gaaagagcatctacgg 16
            ||||||||||| ||||
Sbjct: 2393 gaaagagcatccacgg 2378
Score / (ID): 12, 15/16 (93%)
Param: W 7, e 500, Default
Match: +1   Mismatch: -3   GO/GE: G -5, E -2   Gap Total: -7

Blast
Query: 1    gaaagagcatctacggtga 19
            ||||||||||| |||| ||
Sbjct: 2393 gaaagagcatccacgg-ga 2376
Score / (ID): 29, 17/19 (89%)
Param: W7, e 500, G 1, q 2, r 2
Match: +2   Mismatch: -2   GO/GE: G -1, E -2   Gap Total: -3
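The scores in this comparison can be reproduced with a small Smith-Waterman scorer. This is a sketch under stated assumptions rather than the Paracel implementation: it uses a linear gap cost (one fixed penalty per gapped base) in place of separate gap-open and gap-extend values, and the subject string is the 18-base subject row of the alignment with its gap character removed. The function name `smith_waterman_score` is hypothetical.

```python
def smith_waterman_score(a, b, match=2, mismatch=-2, gap=-3):
    """Best local alignment score of a and b with a linear gap cost (assumed)."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            # Diagonal move: align a[i-1] with b[j-1].
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Local alignment: a cell never drops below zero.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


if __name__ == "__main__":
    query = "TCACCGTAGATGCTCTTTC"
    subject = "TCCCGTGGATGCTCTTTC"  # subject row with the gap removed
    print(smith_waterman_score(query, subject))            # +2 / -2 / gap -3
    print(smith_waterman_score(query, subject, 1, -3, -7))  # +1 / -3 / gap -7
```

With +2/-2 and a gap cost of -3 the scorer returns 29, matching the gapped 17/19 alignment (17 x 2 - 2 - 3). With +1/-3 and a one-base gap cost of -7 it returns 12, because opening the gap costs more than the two extra matches gain, which is consistent with the shorter ungapped 15/16 alignment reported by default Blast.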
Smith-Waterman vs. Blast Results
Percent Identity: 89%, GATA3 gene
>gi|4503928|ref|NM_002051.1| Homo sapiens GATA binding protein 3
(GATA3), mRNA
Length = 2365
Score = 31.7 bits (38), Expect = 0.041
Identities = 19/19 (100%)
Strand = Plus / Plus
Query: 1   ctttttaacatcgacggtc 19
           |||||||||||||||||||
Sbjct: 299 ctttttaacatcgacggtc 317
SWN hit-4 bin
Blast hit-1 bin
W7 G1 r2 q2 e500 E2
Original Query Sequence: CTTTTTAACATCGACGGTC
Smith-Waterman vs. Blast Results
>gi|4557424|ref|NM_001248.1| Homo sapiens ectonucleoside triphosphate
diphosphohydrolase 3 (ENTPD3), mRNA
Length = 2797
Score = 24.6 bits (29), Expect = 5.7
Identities = 17/19 (89%), Gaps = 1/19 (5%)
Strand = Plus / Minus
Query: 1    aa-aatactgagagaggga 18
            || ||||||||| ||||||
Sbjct: 2044 aagaatactgagggaggga 2026
SWN hit-1 bin
Blast hit-4 bin
Original Query Sequence: AAAATACTGAGAGAGGGAG
Conclusions and Future Work
•Produce more conclusive statistics for occurrences of more accurate Smith-Waterman results
•No consensus exists as to which hits are considered dangerous or significant for cross-hybridization
•Creation of a position-specific matrix
  •Mutation tolerance on the 5’ end
  •Low tolerance on the 3’ end
  •GU wobble
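One way to picture the proposed position-specific matrix is a per-position mismatch weight that grows from the tolerant 5’ end toward the less tolerant 3’ end. The sketch below is purely illustrative: the linear ramp, the `low` and `high` values, and the function name are hypothetical placeholders, not measured tolerances, and GU wobble pairs are not modelled.

```python
def position_weighted_penalty(mismatch_positions, length=19, low=0.5, high=2.0):
    """Sum hypothetical mismatch weights along an oligo.

    Positions are 0-based from the 5' end. Weights rise linearly from `low`
    at the 5' end to `high` at the 3' end; both values are placeholders.
    """
    step = (high - low) / (length - 1)
    return sum(low + step * p for p in mismatch_positions)


if __name__ == "__main__":
    # Same number of mismatches, different positions, different totals.
    print(position_weighted_penalty([0, 1]))    # two mismatches near the 5' end
    print(position_weighted_penalty([17, 18]))  # two mismatches near the 3' end
```

Under such a scheme two candidate off-target hits with the same percent identity could be ranked differently depending on where their mismatches fall.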
References
•Novina, C and Sharp, P. The RNAi revolution. Nature. 2004 Jul 8;430(6996):161-4.
•Dorsett, Y and Tuschl, T. siRNAs: applications in functional genomics and potential as therapeutics. Nat Rev Drug Discov. 2004 Apr;3(4):318-29.
•Snove, O Jr. and Holen, T. Many commonly used siRNAs risk off-target activity. Biochem Biophys Res Commun. 2004 Jun 18;319(1):256-63.
•Paroo, Z and Corey, DR. Challenges for RNAi in vivo. Trends Biotechnol. 2004 Aug;22(8):390-4.
•Amarzguioui, M. et al. Tolerance for mutations and chemical modifications in siRNA. Nucl Acids Research. 2003;31(2):589-595.
Acknowledgements
•Dr. Cecilie Boysen (advisor), Paracel Scientific Staff
•David Meyer, Paracel Software Engineer
•Stephanie Pao, Paracel Technical Sales Engineer
•Frances Tong, Paracel Intern
•William White, Paracel Technical Writer
•Southern California Bioinformatics Institute 2004 Faculty and Staff: Dr. Jamil Momand, Dr. Nancy Warter-Perez, Dr. Sandra Sharp & Dr. Wendie Johnston, & Jackie Leung
•Fellow interns
•NIH & NSF
Short interfering RNA Mechanism
Post-transcriptional gene silencing.
Novina, C and Sharp, P. The RNAi revolution. Nature Vol 430. July 8, 2004.
Dorsett, Y and Tuschl, T. siRNAs: applications in functional genomics and potential as therapeutics. Nat Rev Drug Discov. 2004 Apr;3(4):318-29.
•Reverse genetic approaches – expensive and time consuming
•siRNA may be chemically synthesized or expressed from DNA vectors
MicroRNAs
•Short RNAs 19-25 nucleotides
•Abundant, single stranded RNAs encoded in genomes of most multicellular organisms: from few thousand to 40,000 molecules per cell
•Some evolutionarily conserved and developmentally regulated
Translational silencing.
Picture from: Novina, C and Sharp, P. The RNAi revolution. Nature Vol 430. July 8, 2004.
Differences between siRNA and miRNA
siRNA
•Promote the cleavage or degradation of mRNAs
•Sense strand has “exactly the same sequence as the target strand”
•Target genes or genetic elements from which they originated
miRNA
•Regulate the expression of mRNAs; transcription is not impeded and mRNAs not destroyed
•Imperfect base-pairing between mRNA targets and miRNA
•Regulate separate genes
Interchangeability of siRNAs and miRNAs
•miRNA may act like siRNA
  * perfect or near-perfect complementarity to cellular mRNAs
•Could siRNA also work like miRNA?
  * synthetic siRNA partially complementary to ‘reporter’ gene inhibited its expression
•Distinction between single site with almost exact complementarity and numerous partially complementary binding sites
Laboratory and Clinical Applications of siRNA
•In C. elegans, simple experiment: inject dsRNA, soak in dsRNA solution, or feed with bacteria expressing dsRNA
•In worms, screening for obesity and ageing
•In fruitflies, purified long dsRNA used to identify roles of genes in cholesterol metabolism and heart formation
•Therapeutic potential of siRNAs for humans
File                         Type    Bases   Sequences   # of Oligos
BRCA1                        fasta   3243    1           3255
GATA3                        fasta   3070    1           3070
HLA-molecule                 fasta   2918    1           2918
Insulin-like-growth-factor   fasta   4989    1           4971
Interleukin-receptor         fasta   1451    1           1433
NFKB1                        fasta   4104    1           4186
Serine kinase                fasta   3506    1           3488
Serotonin-receptor           fasta   1927    1           1909
TNF2                         fasta   1669    1           1651
Vinculin                     fasta   5647    1           5629
Total                                32554   10
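Several rows of this table (for example 4989 bases giving 4971 oligos) are consistent with sliding a 19-base window along each sequence one base at a time, which yields N - 18 oligos for N bases. The sketch below shows that windowing; it is an assumption about how the test oligos were generated, and the function name is hypothetical.

```python
def sliding_oligos(sequence, k=19):
    """Return every k-base window of `sequence`, in order of start position."""
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


if __name__ == "__main__":
    # A 21-base example yields 21 - 19 + 1 = 3 overlapping 19-mers.
    for oligo in sliding_oligos("ACGTACGTACGTACGTACGTA"):
        print(oligo)
```

A sequence shorter than the window length yields an empty list.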
Paroo, Z and Corey, DR. Challenges for RNAi in vivo. Trends Biotechnol. 2004 Aug;22(8):390-4.
Blast vs Smith-Waterman Speed Test Results
[Bar chart: time in minutes for SWN and Blast under the Default and e500 settings; data labels 11.35, 205.69, 46.7, 346.08]