Post on 21-Dec-2015
An analysis of “Alignments anchored on genomic
landmarks can aid in the identification of regulatory elements”
by Kannan Tharakaraman et al.
Sarah Aerni
July 8, 2005
Gene Regulation
Transcription factors
– Cis-acting elements: gene expression is regulated by sequence on the same DNA molecule as the gene (the gene region acts upon itself)
– Trans-acting elements: gene expression is regulated by the products of other genes (for example, one gene inhibits another)
Gene Regulation
[Figure. Image credit: US Department of Energy Office of Science]
Motifs
Binding sites
– Transcription factors
– Zinc finger
Hard to identify
– Relatively short sequences
– Only some positions are well conserved
– Usually located within a certain distance of the gene
Techniques to Identify Regulatory Elements
Enumerative Methods
– Create w-mers and find over-represented motifs
– Frequency may be misconstrued due to repeats
Alignment Methods
– Align sequences, usually using orthologous genes
– Depend on local alignments
– Sequences cannot be too similar or too distant
Tharakaraman Technique
– Combine both methods
– Include word placement along with frequency
– Is the location of cis-regulatory regions correlated?
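The enumerative idea (create w-mers, then look for over-represented ones) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names `count_wmers` and `top_wmers` are hypothetical, w = 8 is chosen to match the octonucleotides discussed later, and repeats are assumed to be hard-masked as `N`.

```python
from collections import Counter

def count_wmers(sequences, w=8):
    """Count how many sequences contain each w-mer.

    Each word is counted at most once per sequence, so a word repeated
    many times inside one sequence does not inflate its frequency.
    """
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        # Distinct words of length w in this sequence
        words = {seq[i:i + w] for i in range(len(seq) - w + 1)}
        # Assumption: masked or ambiguous bases appear as non-ACGT letters
        counts.update(word for word in words if set(word) <= set("ACGT"))
    return counts

def top_wmers(sequences, w=8, n=5):
    """Return the n most frequent w-mers as (word, sequence_count) pairs."""
    return count_wmers(sequences, w).most_common(n)
```

Counting per sequence rather than per occurrence is one simple way to reduce the effect of repeats on frequency; it does not replace repeat masking.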
Initial Steps
Mask repeats
– Avoid identifying repeats as motifs
– Maintain one position for possible motifs
Align on the Transcription Start Site (TSS)
– Depends on proximity to TSS
– Allow for slight shifts: look for clusters
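Anchoring on the TSS and allowing slight shifts can be illustrated with a small sketch. It assumes each sequence comes with the index of its TSS, that repeats are hard-masked as `N`, and that a "cluster" is simply the densest window of offsets; the names `anchored_positions`, `best_cluster` and the `jitter` parameter are hypothetical and not taken from the paper.

```python
from collections import defaultdict

def anchored_positions(sequences, tss_indices, w=8):
    """Map each w-mer to its start positions measured relative to the TSS."""
    positions = defaultdict(list)
    for seq, tss in zip(sequences, tss_indices):
        seq = seq.upper()
        for i in range(len(seq) - w + 1):
            word = seq[i:i + w]
            # Assumption: masked bases are non-ACGT letters and are skipped
            if set(word) <= set("ACGT"):
                positions[word].append(i - tss)
    return positions

def best_cluster(offsets, jitter=0):
    """Return (center, count) for the densest window of TSS-relative offsets.

    A window centered at c covers offsets in [c - jitter, c + jitter],
    so jitter=0 requires exact positional agreement.
    """
    if not offsets:
        return (None, 0)
    best = (None, 0)
    for center in range(min(offsets), max(offsets) + 1):
        n = sum(1 for o in offsets if abs(o - center) <= jitter)
        if n > best[1]:
            best = (center, n)
    return best
```

With `jitter=0` a word only clusters if it starts at exactly the same distance from the TSS in several sequences; raising `jitter` tolerates small shifts at the cost of more chance clustering.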
Define Significance
Alignment scores
– Assign significance using gap penalties from Mock Set
– Jittering: watch for overrepresented octonucleotides
– ρ = 5 determined to be significant without jittering
TRANSFAC
Database of Eukaryotic Transcriptional Regulatory Elements
Comparison of TRANSFAC octonucleotides to those identified by paper’s technique
GLAM
Sequence input
– In every sequence, an arbitrary position and window size are chosen
– Gapless multiple alignment of the window sequences
– Uses probability to determine whether windows are repositioned or resized (Gibbs sampling)
“Seed” constraints
– OOPS (one occurrence per sequence)
– ZOOPS (zero or one occurrence per sequence)
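The repositioning step can be illustrated with a generic Gibbs site sampler under the OOPS constraint. This is a simplified sketch and not GLAM itself: the window width is fixed (no resizing), sequences are assumed to contain only A, C, G and T, and the function name `gibbs_motif_search` and its parameters are hypothetical.

```python
import random

def gibbs_motif_search(sequences, width, iterations=200, seed=0, pseudocount=1.0):
    """Fixed-width Gibbs site sampler with one occurrence per sequence (OOPS).

    Returns a list with one window start position per sequence.
    """
    rng = random.Random(seed)
    alphabet = "ACGT"
    # Start from an arbitrary window in every sequence
    starts = [rng.randrange(len(s) - width + 1) for s in sequences]
    # Every column sums to the same total: other windows plus pseudocounts
    total = (len(sequences) - 1) + pseudocount * len(alphabet)
    for _ in range(iterations):
        for held_out, seq in enumerate(sequences):
            # Column-wise base counts built from all other windows
            counts = [{b: pseudocount for b in alphabet} for _ in range(width)]
            for j, (other, start) in enumerate(zip(sequences, starts)):
                if j == held_out:
                    continue
                for col, base in enumerate(other[start:start + width]):
                    counts[col][base] += 1
            # Probability of each candidate window under the current profile
            weights = []
            for pos in range(len(seq) - width + 1):
                p = 1.0
                for col in range(width):
                    p *= counts[col][seq[pos + col]] / total
                weights.append(p)
            # Reposition the held-out window by sampling, not by taking the maximum
            starts[held_out] = rng.choices(range(len(weights)), weights=weights)[0]
    return starts
```

Sampling the new position in proportion to its probability, rather than always taking the best window, is what lets the search escape poor initial alignments. A ZOOPS variant would additionally need a way to sample "no window" for a sequence.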
Alignment Techniques
Different techniques show different results
A-GLAM determined to be best
– Compared against TRANSFAC
– AlignACE is too computationally expensive to run at genomic scale
Distance to TSS
Cis-acting element locations determined by blocks
Largest number close to 0 (TSS)
Identified elements correlated with TRANSFAC
Further Discussion
Discussion is limited to method results
– Little information given on whether location is truly correlated
– No biological discussion
Proximity of TSS and cis-acting binding sites
– Narrow search range to a smaller field
– Use in identifying types of element?