Promoter Sequence Alignment

Daniel Rico, PhD. [email protected]


Post on 14-Dec-2015


[Overview diagram: tools for promoter sequence alignment and conserved TFBS detection, arranged by number of sequences (two or multiple) and alignment type (local or global). Alignment: Blastz (zPicture, dcode.org; local), LAGAN (mVISTA; global), TBA/Multiz (Mulan, dcode.org; local). Conserved TFBS: rVISTA, and tools at dcode.org.]

Whole Genome Alignments

• Local aligners
  – Work by “stacking” pairwise alignments
  – High specificity
  – BlastZ, LastZ, TBA + MultiZ

• Global aligners
  – Need to pre-define collinear segments
  – Better sensitivity
  – AVID/MAVID, LAGAN/MLAGAN, Pecan

• Mixed aligners
  – Combine both approaches
  – Shuffle-LAGAN, MAUVE


Reference Sequence Idea

– A sequence is fixed as the reference to which all other sequences are compared

S1: A T G C T C
S2: A G A G C
S3: T T C T G
S4: A T T G C A T G C

S1: A T - G C - T - C
S2: A - - G A - G - C
S3: - T - T C - T - G
S4: A T T G C A T G C

S1: A T G C T C
S2: A - G A G C

S1: A T G C T C
S2: A - G A G C
S3: - T T C T G
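The reference-sequence idea can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used by any of the tools named in these slides: the function name `stack_on_reference` and its input format (one gapped pairwise alignment per non-reference sequence) are assumptions made for the example. Each pairwise alignment is projected onto the reference coordinates, so columns where the reference has a gap (insertions in the other sequence) are dropped.

```python
def stack_on_reference(reference, pairwise):
    """Stack pairwise alignments on a fixed reference sequence.

    reference: ungapped reference sequence (hypothetical input format).
    pairwise:  list of (aligned_reference, aligned_other) gapped strings.
    Returns one row per sequence, each as long as the reference.
    """
    rows = [reference]
    for aligned_ref, aligned_other in pairwise:
        if len(aligned_ref) != len(aligned_other):
            raise ValueError("aligned strings must have equal length")
        if aligned_ref.replace("-", "") != reference:
            raise ValueError("alignment does not match the reference")
        # Keep only columns that contain a reference base; insertions
        # relative to the reference are discarded in this simple view.
        projected = [o for r, o in zip(aligned_ref, aligned_other) if r != "-"]
        rows.append("".join(projected))
    return rows


# S4 from the slide is used as the reference here.
s4 = "ATTGCATGC"
stacked = stack_on_reference(
    s4,
    [
        (s4, "AT-GC-T-C"),  # S1 aligned to S4
        (s4, "A--GA-G-C"),  # S2 aligned to S4
        (s4, "-T-TC-T-G"),  # S3 aligned to S4
    ],
)
for row in stacked:
    print(" ".join(row))
```

Because every row is expressed in reference coordinates, the rows can be compared column by column, at the cost of losing anything that is absent from the reference.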


BlastZ: Improved pairwise alignment of Genomic Sequences

Nucleotide local alignment program developed by Webb Miller's group (http://www.bx.psu.edu/miller_lab/)

BlastZ computes local alignments for sequences of any length based on the assumption that the input sequences are related and share blocks of high conservation that are separated by regions that lack homology and vary in length in the two sequences.

Penalizes gaps using a large gap-opening penalty and small gap-extension penalty, to reduce the over-penalization of longer gaps
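The effect of this scoring scheme can be illustrated with a small Python sketch. The function names and the numeric values (400 to open a gap, 30 per gapped base) are illustrative assumptions, not settings confirmed by these slides; the sketch also assumes the convention in which a gap of length k costs the opening penalty plus k times the extension penalty.

```python
def affine_gap_penalty(length, gap_open, gap_extend):
    """Cost of one gap under an affine model (assumed convention:
    gap_open is charged once, gap_extend for every gapped base)."""
    if length < 0:
        raise ValueError("gap length must be non-negative")
    if length == 0:
        return 0
    return gap_open + gap_extend * length


def linear_gap_penalty(length, per_base):
    """Cost of one gap when every gapped base is charged equally."""
    if length < 0:
        raise ValueError("gap length must be non-negative")
    return per_base * length


# Illustrative values only (assumptions for this example).
GAP_OPEN, GAP_EXTEND = 400, 30

for k in (1, 10, 100):
    affine = affine_gap_penalty(k, GAP_OPEN, GAP_EXTEND)
    # Linear model calibrated so that a 1-base gap costs the same.
    linear = linear_gap_penalty(k, GAP_OPEN + GAP_EXTEND)
    print(k, affine, linear)
```

With these values a single long gap is much cheaper under the affine model than under the linear one, while many short gaps remain expensive because each one pays the opening penalty.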

zPicture is a web server for aligning two sequences with BlastZ:

http://zpicture.dcode.org/

mVISTA: AVID, LAGAN and Shuffle-LAGAN


AVID, LAGAN AND MLAGAN ASSUME THAT ONE HAS ALREADY IDENTIFIED APPARENT ORTHOLOGOUS REGIONS BETWEEN TWO SPECIES, AND THAT THERE ARE NO GENOMIC REARRANGEMENTS


Copyright OpenHelix.

VISTA Enhancer Browser

Enhancer Browser: combines computational and experimental data


VISTA Alignment display

104637349 GTAGTGCCACTGAGTGTGACAGGGATGGCAAGAAAAGCATTAAGTTCCAAGGGGAAAGAA 104637408
>>>>>>>>> |   || |||  |||   |||| ||||||||||  | || || |||| |  ||||||||  <<<<<<<<<
052290302 GAGATGTCACCAAGTA-AACAGAGATGGCAAGAGGACCAATAGGTTCTAGTGGGAAAGAC 052290360

“Sliding window” to measure sequence conservation (default window size 100 bp)

Graphical presentation of sequence conservation as a “peaks-and-valleys” curve

[Plot labels: >70% identity; base sequence coordinates; % identity]
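A sliding-window identity curve of this kind can be sketched as follows. This is a minimal illustration under stated assumptions, not VISTA's actual implementation: the function name `sliding_identity` is hypothetical, the window slides over alignment columns one column at a time, and a column counts as identical only when both sequences carry the same base (gaps never match).

```python
def sliding_identity(aligned_a, aligned_b, window=100):
    """Percent identity in each window of an alignment.

    aligned_a, aligned_b: gapped strings of equal length.
    window: window size in alignment columns (100 mirrors the default
            quoted on the slide; how VISTA handles gaps is not assumed).
    Returns a list with one value per window start position.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned strings must have equal length")
    if window <= 0:
        raise ValueError("window must be positive")
    # 1 for an identical, non-gap column, otherwise 0.
    match = [int(a == b and a != "-") for a, b in zip(aligned_a, aligned_b)]
    # Prefix sums give each window total in constant time.
    prefix = [0]
    for m in match:
        prefix.append(prefix[-1] + m)
    return [
        100.0 * (prefix[i + window] - prefix[i]) / window
        for i in range(len(match) - window + 1)
    ]


# Toy example with a short window so the result is easy to check by hand.
curve = sliding_identity("AAAAAAAAAA", "AAAAATTTTT", window=5)
print(curve)
```

Plotting these values against the window position gives the peaks-and-valleys curve; windows above a chosen cut-off, such as the 70% on the slide, would be the ones flagged as conserved.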

http://dcode.org/

(A) Standard stacked-pairwise visualization (smooth graph) of Mulan alignments of the NOS-2 gene promoter. The human sequence (from -10 kb to +1 kb) was selected as the reference species. Repeats were masked in all species with RepeatMasker (Mulan settings); green regions in the base sequence indicate the human repeats. The graphical representations of the other sequences are displayed according to their similarity to the base sequence: the closer they are to human, the higher the conservation (top sequences are less conserved). Parameters selected for detection of evolutionarily conserved regions (ECR) were 90 bp minimum length and minimum similarity of 65% (50% bottom cut-off). Red indicates regions that are upstream from the transcription start site; pink regions are downstream from it. Two conserved motifs in rodent NOS-2 promoters indicate the presence of distal and fragmented sequences that are very similar to the unique enhancer region conferring NF-κB regulation in human NOS-2. (B) A schematic representation of the hypothetical translocation of these sequences in human and rodents; double-headed arrows indicate the positional translocation.

Rico et al. BMC Genomics 2007 8:271 doi:10.1186/1471-2164-8-271

ECRs: Evolutionarily Conserved Regions with Mulan
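The two ECR parameters quoted in the figure legend (90 bp minimum length, 65% minimum similarity) can be read as a simple filter over a per-position conservation profile. The sketch below is only one plausible reading and is not Mulan's algorithm: the function name `find_ecrs`, the input (one percent-identity value per base position) and the rule "maximal run of positions at or above the cut-off" are all assumptions made for illustration.

```python
def find_ecrs(identity, min_length=90, min_identity=65.0):
    """Return (start, end) half-open intervals of candidate ECRs.

    identity: percent-identity value for each base position
              (hypothetical input, e.g. a smoothed conservation profile).
    A candidate is a maximal run of positions with identity at or above
    min_identity that spans at least min_length positions.
    """
    regions = []
    start = None
    for pos, value in enumerate(identity):
        if value >= min_identity:
            if start is None:
                start = pos  # a new run begins here
        elif start is not None:
            if pos - start >= min_length:
                regions.append((start, pos))
            start = None
    # Close a run that extends to the end of the profile.
    if start is not None and len(identity) - start >= min_length:
        regions.append((start, len(identity)))
    return regions


# Toy profile: a 100 bp conserved block and a 20 bp block that is too short.
profile = [50.0] * 10 + [80.0] * 100 + [50.0] * 5 + [90.0] * 20
ecrs = find_ecrs(profile)
print(ecrs)
```

Raising `min_identity` or `min_length` makes the filter stricter, which is the trade-off the legend's parameter choice reflects.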
