Structural Bioinformatics Tools at PDBj

Daron M. Standley

Friday, August 7, 2009

Today’s topics

Searching PDBj

Making multiple sequence alignments

Predicting protein function from sequence

Searching PDBj

Sequence Navigator

Structure Navigator

Making multiple sequence alignments

MAFFTash

Predicting protein function from sequence

SFAS

Spanner

PDBj Quat and PDBj Het

SeSAW

Sequence Navigator

type a PDB ID here: 2qfb

type a chain ID here: B

paste a sequence here

Sequence Navigator results

Structure Navigator

type a PDB ID here: 2qfb

type a chain ID here: B

Structure-based queries take a lot of time, so please type your email address here!

Structure Navigator (cont.)

Sequence Navigator finds 29 hits

Structure Navigator finds 62 hits

Sequence vs. Structure Alignment

Sequence alignment is fast, but not very sensitive

Structure alignment is very sensitive, but slow

[Figure: Distribution of hits for Sequence Navigator and Structure Navigator. x-axis: % sequence ID of hit (0 to 100); y-axis: % of hits (0 to 80)]

Question: why don’t we see hits in the 40-90% range?
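The distribution of hits is plotted against % sequence identity, and that quantity can be computed for each hit from a pairwise alignment. The following is a minimal Python sketch, not the code behind Sequence Navigator or Structure Navigator. The function names, the choice to ignore gapped columns, and the 10% bin width are all assumptions made for illustration.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over columns where both sequences have a residue.

    Assumption: gapped columns ('-') are excluded from the denominator.
    Other conventions (e.g. dividing by query length) give different values.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    paired = [(a, b) for a, b in zip(aligned_a, aligned_b)
              if a != "-" and b != "-"]
    if not paired:
        return 0.0
    matches = sum(1 for a, b in paired if a == b)
    return 100.0 * matches / len(paired)


def histogram_percent(identities, bin_width=10):
    """Return {bin_start: % of hits}; a value of 100 falls in the last bin."""
    n_bins = 100 // bin_width
    counts = [0] * n_bins
    for value in identities:
        index = min(int(value // bin_width), n_bins - 1)
        counts[index] += 1
    total = len(identities)
    return {i * bin_width: (100.0 * c / total if total else 0.0)
            for i, c in enumerate(counts)}
```

Feeding the identities of all hits from one search into `histogram_percent` gives the "% of hits" value for each "% sequence ID" bin, which is the kind of data the figure plots.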

MAFFTash

Multiple sequence alignment based on MAFFT program by Katoh, Misawa, Kuma, Miyata 2002 (Nucleic Acids Res. 30:3059-3066)

Uses structural information from ASH program by Standley, Toh, and Nakamura 2005 (BMC Bioinformatics 6, 221)

type sequences like this:

>Q6Q899...
MTAAQRQNL...

type PDB IDs like this:

>PDBID2plhA
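The input convention shown here mixes FASTA-style sequence entries with structure entries written as `>PDBID`, followed by the PDB ID and the chain ID. Below is a small Python sketch of how such input could be assembled. The helper names are hypothetical, the entry order (structures first) is an assumption, and the sequence is only the truncated fragment from the slide, used as toy data.

```python
def pdb_header(pdb_id: str, chain_id: str) -> str:
    """Format a structure entry as '>PDBID' + 4-character PDB ID + chain ID."""
    if len(pdb_id) != 4 or len(chain_id) != 1:
        raise ValueError("expected a 4-character PDB ID and a 1-character chain ID")
    return f">PDBID{pdb_id}{chain_id}"


def build_mafftash_input(sequences, structures) -> str:
    """Assemble input text from (name, sequence) and (pdb_id, chain_id) pairs.

    Structures are listed first, then sequences in FASTA style. This ordering
    is an assumption of the sketch, not a documented requirement.
    """
    lines = [pdb_header(pdb_id, chain_id) for pdb_id, chain_id in structures]
    for name, sequence in sequences:
        lines.append(f">{name}")
        lines.append(sequence)
    return "\n".join(lines) + "\n"


# Toy data: the sequence is the truncated fragment shown on the slide.
example = build_mafftash_input(
    sequences=[("Q6Q899", "MTAAQRQNL")],
    structures=[("2plh", "A")],
)
```

The resulting text can then be pasted into the MAFFTash input box.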

MAFFTash results

Jalview alignment viewer

Waterhouse A.M., Procter J.B., Martin D.M., Clamp M., Barton G.J. “Jalview Version 2--a multiple sequence alignment editor and analysis workbench.” Bioinformatics. (2009) 25(9):1189-91.

If typing input is difficult, please use Prep-MAFFTash

Prep-MAFFTash automatically finds PDB IDs and sequences and formats them for MAFFTash

>PDBID2qfbB

Paste the Prep-MAFFTash output into MAFFTash:

>PDBID2rmjA
>PDBID2rqbA
>PDBID3eqtB
>UniRef100_UPI000184A4B6
CKKCSKQACCGTDIQVIATAHHVNTTPKFKTLYSKGPNKTLQEKFADYQINGDIICKECGKTWGTTMVHKGIEVPCLQIRNFVVKYDDKKMTKDTYDKWSELP
>UniRef100_UPI000180C60D
LCKKCKKVATTSDKFQHINHQHHVVVDAGFIERAEIVQYPAIKHKVFGEQTIIGIVKCKSCRSDWGAYMRHRQQPISVLKIVNFVLQH
>UniRef100_O44165
KIICKKCEAILCTSKDIRSRNTQYLVCDPGFWSLVRKTRLTDEQQALIKYNATGSINCRRENCGLKLGQLIEVNTVDLPCLSALSIVLLVEGTDKRIIVKKWKNILDKYFTPTEIR
>UniRef100_UPI00015A774A
QLQCRSCFASVCSGGDIRKIENSHHVNVNTEFNETMFIMPENKPQFSNDLEVLKNADLHPSFSQDWGFEIKFKKVAILPCLKIKSFSFNTPKETKPYKKWKDVEFQVTEFDF
>UniRef100_B6NSL9
LHCKKCNQKACDAAELRLIEGAHHVHPDPEFLRNVGKADEAEEPLQKFQEWSRIGSVKCRKCEQDWGMLMLHKGMNLPCLSVKNFTLRKQGIPPRRPKKWKDSGIAVAEFDYAQD
>UniRef100_UPI00016E726B
KFSCRGCSQEVCTGEDIEVIEDIHRVNVTPQFRELFIRKESTKRKDSLLDYETNGYIVCGKCGQRWGSMMYFRGIQCPSLHVKNFVVAISGKNMPKCSKWTDVPARFSAFDY
>UniRef100_UPI000175F4DC
QLQCRSCFASVCSGGDIRKIENSHHVNVNTEFKNHYKVGDQVNMERTFEDWEPGRIISCRKCKKDWGFEIKFKKVAILPCLKIKSFSFNTPKETKPYKKWKDVEFQVTEFDF
>UniRef100_UPI0001796BE9
QLLCINCMVAVGYGSDLRKVEGTHHVNVNPNFSIYYKVSPKAVVIDREFKDWKPGGAVSCRNCGESWGMQMIYKSVKLPVIKVRSMLLETPRGRVQAKKWSHVPFP
>UniRef100_Q6Q899
KENKKLLCGKCKNFACYTADIRVVETSHYTVLGDAFKERFVCKPHPKPKIYDNFEKKAKIFCAKQNCSHDWGIFVRYKTFEIPVIKIESFVVEDIVSGVQNRHSKWKDFHFERIQFDPAEMS
>UniRef100_UPI000194DF7F
KKLYCGKCKAYACSTDDIRIIKGSHHIVLGNAFQERYTTKPHRKPVQFDDFVKKSKMHCRNTECQHDWGIIVKYKIFDNLPVIKIRSFVLEDVESGSQMDFQKWRSINLSLKNFDE
>UniRef100_UPI000069F51F
NRKLLCKKCKTYACNTDDIRVIKDSHHMIIDKTFKDRYITKKHPKPRTFEGYKKMYKIFCKRPECHEDWGVSGTYQGFQDLPLIKIEQFVIENPDGTQEYKDKWVDVHFT
>UniRef100_Q9GLV6
AFACYTADIRMVEKCHFTVVGDAFRERFVSKLHPKPKSFGNIEKRAKIYCARPDCSHDWGIYVRYKAFEMPFIKIESFVVEDIATGVQTVHAKWKDFNFEKLSFDAAEMA
>UniRef100_UPI00005BBBB1
FLCKNCGVPACSGEDIHVIEKMHHVNMTPEFKKLYLVRGNKALQTMCVDYQTNGEIICNKCGQAWGTMMVHKGLDLPCLKIKNFVVVFQNNLPKKQYKKWV
>UniRef100_UPI00015A61DA
LSCRQCSVFVCSGEDIEIIEKMHHVNVTKQFSMEFKTFRENASLQERLLDYETNGVIACKKCGQQWGSMMLYRSTECPCLHIKNFVVTYGSKKKTFSKWRELSISFPAFDY
>UniRef100_UPI00017B3B5E
QLLCRNCFRQVASGSDIRLVDNTHYVNINPDFKRFYKTGERVIINRRFEDWEPGRTISCSNGSCNKKWGAEIKYKKVALLPNLSIENFALETPEGRMTPRKWKDITFT
>UniRef100_UPI0000EB32D5
KENKKLLCRKCKAFACYTADIRVVEECHYTVVGDAFTKCFVSKLHPKPKSFGHFEKRAKIFCARRNCGHDWGIHVKYKTFEIPVIKIESFVVEDIATGAQKLYAKWKDFPFEKIPFGSPEIP
>UniRef100_UPI000194AF36
EHVQLLCINCMVAVGHGSDLRKVEGTHHVNVNPNFSNYYNVSRDPVVINKVFKDWKPGGVISCRNCGEVWGLQMIYKSVKLPVLKVRSMLLETPQGRIQAKKWSRVPFSVPDFDF
>UniRef100_B3KN51
KENKKLLCRKCKALACYTADVRVIEECHYTVLGDAFKECFVSRPHPKPKQFSSFEKRAKIFCARQNCSHDWGIHVKYKTFEIPVIKIESFVVEDIATGVQTLYSKWKDFHFEKIPFDPAEMS
>UniRef100_UPI00017978A2...

What can we do with multiple sequence alignments?

Easily locate conserved (i.e., important) residues

We can also use them for predicting the structure of a protein from its sequence
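To make "locate conserved residues" concrete, here is a minimal Python sketch that scores each column of a multiple sequence alignment by how many sequences share the most common residue. The scoring rule, the function names, and the three-sequence alignment are assumptions for illustration. Alignment viewers such as Jalview compute conservation with their own, more refined measures.

```python
from collections import Counter


def column_conservation(msa):
    """Per-column fraction of sequences that carry the most common residue.

    Gaps ('-') never count as a residue but stay in the denominator, so
    gappy columns score lower.
    """
    if not msa:
        return []
    length = len(msa[0])
    if any(len(row) != length for row in msa):
        raise ValueError("all aligned sequences must have the same length")
    scores = []
    for col in range(length):
        residues = [row[col] for row in msa if row[col] != "-"]
        if not residues:
            scores.append(0.0)
            continue
        top_count = Counter(residues).most_common(1)[0][1]
        scores.append(top_count / len(msa))
    return scores


def conserved_columns(msa, threshold=1.0):
    """Indices (0-based) of columns whose conservation meets the threshold."""
    return [i for i, score in enumerate(column_conservation(msa))
            if score >= threshold]


# Hypothetical toy alignment, not taken from the MAFFTash results.
toy_msa = ["CKKC", "CRKC", "C-QC"]
fully_conserved = conserved_columns(toy_msa)
```

In the toy alignment only the first and last columns are identical in every sequence, so those are the positions reported.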

Protein structure prediction

query sequence -> MAFFT -> query MSA

Also: psiblast, clustalw, Muscle, etc.

template structures 1, 2, 3 -> MAFFTash -> template MSAs 1, 2, 3

query MSA + template MSAs 1, 2, 3 -> method to compare MSAs -> best query-template alignment
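The "method to compare MSAs" step can be illustrated with a generic profile-profile comparison. Each MSA is reduced to per-column residue frequencies, columns are scored by their overlap, and a Needleman-Wunsch recursion gives a global alignment score for each query-template pair. This is a textbook-style sketch and not the method used by SFAS or any other PDBj tool. The 0.5 score offset, the gap penalty, and the toy alignments are arbitrary assumptions.

```python
from collections import Counter


def profile(msa):
    """Per-column residue frequencies of an MSA (gaps ignored)."""
    columns = []
    for col in zip(*msa):
        residues = [r for r in col if r != "-"]
        total = len(residues)
        counts = Counter(residues)
        columns.append({r: c / total for r, c in counts.items()} if total else {})
    return columns


def column_score(p, q, offset=0.5):
    """Dot product of two frequency columns, shifted so dissimilar columns score below zero."""
    overlap = sum(freq * q.get(res, 0.0) for res, freq in p.items())
    return overlap - offset


def align_score(prof_a, prof_b, gap=-0.5):
    """Needleman-Wunsch global alignment score between two profiles."""
    m = len(prof_b)
    prev = [j * gap for j in range(m + 1)]
    for i in range(1, len(prof_a) + 1):
        cur = [i * gap] + [0.0] * m
        for j in range(1, m + 1):
            cur[j] = max(
                prev[j - 1] + column_score(prof_a[i - 1], prof_b[j - 1]),
                prev[j] + gap,      # gap in profile b
                cur[j - 1] + gap,   # gap in profile a
            )
        prev = cur
    return prev[m]


def best_template(query_msa, template_msas):
    """Return (name of best-scoring template, all scores)."""
    query_profile = profile(query_msa)
    scores = {name: align_score(query_profile, profile(msa))
              for name, msa in template_msas.items()}
    return max(scores, key=scores.get), scores


# Hypothetical toy alignments.
query = ["ACDE", "ACDE"]
templates = {"t1": ["ACDE", "ACDQ"], "t2": ["WWWW", "WWWW"]}
best, all_scores = best_template(query, templates)
```

A practical method would also return the alignment itself through a traceback and would use substitution-aware column scores. The sketch only ranks templates.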

SFAS (http://sysimm.ifrec.osaka-u.ac.jp/sfas/index.html)

Example: hERG potassium channel protein

AVLFLLMCTFALIAHWLACIWYAIGNMEQPHMDSRIGWLHNLGDQIGKPYNSSGLGGPSIKDKYVTALYFTFSSLTSVGFGNVSPNTNSEKIFSICVMLIGSLMYASIFGNVSAIIQRLYSGTARY

hERG

SFAS results for HHpred alignment

J. Söding, A. Biegert, A. N. Lupas (2005) “The HHpred interactive server for protein homology detection and structure prediction.” Nucleic Acids Res, 33:W244-248.

Spanner

PDB Quat (http://sysimm.ifrec.osaka-u.ac.jp/pdb_quat/index.html)

The homo-tetramer can be generated by extracting the symmetry information from the template

Template: 2r9rA

Monomeric model
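Generating an oligomer from a monomeric model amounts to applying a set of symmetry operators (a rotation matrix plus a translation vector each) to the monomer's coordinates. The Python sketch below shows the arithmetic only. The four-fold rotation about the z axis is an assumed toy operator set, and the function names are hypothetical. In practice the operators would be taken from the template entry, where PDB-format files commonly record them as REMARK 350 BIOMT matrices. How PDB Quat obtains them is not described here.

```python
import math


def rotation_z(angle_deg):
    """3x3 rotation matrix for a rotation about the z axis."""
    a = math.radians(angle_deg)
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def apply_operator(coords, rotation, translation):
    """Apply x' = R x + t to every (x, y, z) coordinate."""
    transformed = []
    for x, y, z in coords:
        transformed.append(tuple(
            rotation[i][0] * x + rotation[i][1] * y + rotation[i][2] * z
            + translation[i]
            for i in range(3)
        ))
    return transformed


def build_oligomer(monomer_coords, operators):
    """Return one coordinate list per symmetry copy of the monomer."""
    return [apply_operator(monomer_coords, rot, trans)
            for rot, trans in operators]


# Assumed toy operators: C4 symmetry about z with no translation.
c4_operators = [(rotation_z(90.0 * k), (0.0, 0.0, 0.0)) for k in range(4)]

# Hypothetical single-atom "monomer" placed on the x axis.
monomer = [(1.0, 0.0, 0.0)]
tetramer = build_oligomer(monomer, c4_operators)
```

Each element of `tetramer` would be written out as a separate chain to obtain the homo-tetrameric model.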

Example 2: JUN transcription factor

MTAKMETTFYDDALNASFLQSESGAYGYSNPKILKQSMTLNLADPVGSLKPHLRAKNSDLLTSPDVGLLKLASPELERLIIQSSNGHITTTPTPTQFLCPKNVTDEQEGFAEGFVRALAELHSQNTLPSVTSAAQPVSGAGMVAPAVASVAGAGGGGGYSASLHSEPPVYANLSNFNPGALSSGGGAPSYGAAGLAFPSQPQQQQQPPQPPHHLPQQIPVQHPRLQALKEEPQTVPEMPGETPPLSPIDMESQERIKAERKRMRNRIAASKCRKRKLERIARLEEKVKTLKAQNSELASTANMLREQVAQLKQKVMNHVNSGCQLMLTQQLQTF

JUN_MOUSE

PDB Het (http://sysimm.ifrec.osaka-u.ac.jp/pdb_het/index.html)

1jnmA

SeSAW

Upload PDB-formatted model

type template PDB ID here: 2rqr

type email address here

SeSAW considers both sequence and structure

The original template does not have the highest score!

Acknowledgements

Prof. Haruki Nakamura (PDBj, Osaka Univ.)

Mr. Mieszko Lis (MIT)

Assoc. Prof. Akira Kinjo (PDBj, Osaka Univ.)

Ms. Reiko Yamashita (PDBj)

Mr. Atsuro Yoshihara (PDBj)
