rna.genomics.purdue.edurna.genomics.purdue.edu/.../files/447/=genemark_summary.docx · web...

Upload: hoangtruc

Post on 18-Mar-2018

Genes predicted by GENEMARK (Monocot model) that have significant similarity to known proteins in the SwissProt database.

Note: Please click on hyperlinks for more information on predicted gene functions.
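The top-scoring match reported for each gene can be gathered into one small table for a quick overview. The sketch below copies the bit scores and E-values from the first "Score = ..." line under each gene in this report. The function name `rank_hits` and the default cutoff of 1e-5 are illustrative assumptions; the report does not state which threshold was used to call a match significant.

```python
# (gene, best-match name, bit score, E-value) copied from the first
# "Score = ..." line under each gene heading in this report.
BEST_HITS = [
    ("Gene2", "Aspartic proteinase nepenthesin-1", 206.0, 3e-52),
    ("Gene3", "Pentatricopeptide repeat-containing protein", 135.0, 3e-31),
    ("Gene8", "Cell division control protein 15", 73.9, 3e-12),
    ("Gene9", "ARABIDILLO-1", 822.0, 0.0),
    ("Gene10", "Calcium/calmodulin-dependent protein kinase type 1G", 127.0, 6e-29),
    ("Gene13", "Inositol-tetrakisphosphate 1-kinase 1", 370.0, 4e-102),
    ("Gene15", "Flavanone-3-hydroxylase", 186.0, 2e-46),
    ("Gene17", "Histone-lysine N-methyltransferase", 85.9, 6e-17),
    ("Gene19", "Mediator of RNA polymerase II transcription subunit", 75.1, 2e-13),
    ("Gene20", "Pentatricopeptide repeat-containing protein", 324.0, 3e-87),
    ("Gene21", "Protein kinase G11A", 683.0, 0.0),
    ("Gene22", "Pentatricopeptide repeat-containing protein", 347.0, 2e-94),
    ("Gene23", "Bark agglutinin LECRPA3", 105.0, 3e-22),
    ("Gene24", "1-aminocyclopropane-1-carboxylate oxidase homolog 4", 166.0, 1e-40),
]


def rank_hits(hits, max_evalue=1e-5):
    """Keep hits at or below max_evalue (an assumed cutoff) and sort by bit score."""
    kept = [h for h in hits if h[3] <= max_evalue]
    return sorted(kept, key=lambda h: h[2], reverse=True)


if __name__ == "__main__":
    for gene, name, bits, evalue in rank_hits(BEST_HITS):
        print(f"{gene:7s} {bits:7.1f} {evalue:9.0e}  {name}")
```

Sorting by bit score rather than E-value avoids ties between the two hits reported with Expect = 0.0.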

Gene2: Aspartic proteinase nepenthesin-1

Details of best match

Aspartic proteinase nepenthesin-1

Score = 206 bits (524), Expect = 3e-52, Method: Compositional matrix adjust. Identities = 152/452 (33%), Positives = 221/452 (48%), Gaps = 57/452 (12%)
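Every alignment in this report is introduced by a statistics line of the same shape as the one above. The sketch below parses such a line with a regular expression; the function names and the returned dictionary layout are illustrative choices, not part of BLAST. It also recomputes the percentages: in this report they appear to be truncated rather than rounded (221/452 is about 48.9% but is printed as 48%).

```python
import re

# Pattern for one HSP statistics line as it appears in this report.
HSP_STATS = re.compile(
    r"Score = (?P<bits>[\d.]+) bits \((?P<raw>\d+)\), "
    r"Expect = (?P<expect>[\d.eE+-]+), "
    r"Method: (?P<method>[^.]+)\. "
    r"Identities = (?P<id_n>\d+)/(?P<id_d>\d+) \((?P<id_pct>\d+)%\), "
    r"Positives = (?P<pos_n>\d+)/(?P<pos_d>\d+) \((?P<pos_pct>\d+)%\), "
    r"Gaps = (?P<gap_n>\d+)/(?P<gap_d>\d+) \((?P<gap_pct>\d+)%\)"
)


def parse_hsp_stats(line):
    """Return the numbers from a 'Score = ... Gaps = ...' line as a dict."""
    m = HSP_STATS.search(line)
    if m is None:
        raise ValueError("not a BLAST HSP statistics line")
    stats = {
        "bits": float(m["bits"]),
        "raw_score": int(m["raw"]),
        "expect": float(m["expect"]),
        "method": m["method"],
    }
    # Each of identities/positives/gaps is stored as (count, length, percent).
    for key in ("id", "pos", "gap"):
        stats[key] = (int(m[key + "_n"]), int(m[key + "_d"]), int(m[key + "_pct"]))
    return stats


def truncated_percent(count, length):
    """Integer percentage with the fractional part dropped."""
    return 100 * count // length


if __name__ == "__main__":
    line = ("Score = 206 bits (524), Expect = 3e-52, Method: Compositional "
            "matrix adjust. Identities = 152/452 (33%), Positives = 221/452 "
            "(48%), Gaps = 57/452 (12%)")
    print(parse_hsp_stats(line))
```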

Query 17 ALSEHH----PGYRLELKQVDSYASGTLTRAKRLEKALKTSKSRANFFSDRRKTSNSPTK 72 AL+ H G+++ L+ VDS LT+ + LE+A++ R N P+ Sbjct 29 ALNHRHEAKVTGFQIMLEHVDS--GKNLTKFQLLERAIERGSRR---LQRLEAMLNGPSG 83

Query 73 SNNTRDRHSPELSSSIYQAGGGEGEYMMELSIGTPPQLIPAMIDTGSDLVWLKCDNCDHC 132 + +S+Y G+GEY+M LSIGTP Q A++DTGSDL+W +C C CSbjct 84 -----------VETSVY---AGDGEYLMNLSIGTPAQPFSAIMDTGSDLIWTQCQPCTQC 129

Query 133 DLDHHGETIFFSDASSSYKKLPCNSTHCSGMSSAGIGPRCEET-CKYKYEYGDGSRTSGD 191 + IF SSS+ LPC+S C +SS P C C+Y Y YGDGS T G Sbjct 130 --FNQSTPIFNPQGSSSFSTLPCSSQLCQALSS----PTCSNNFCQYTYGYGDGSETQGS 183

Query 192 VGSDRISFRSHGAGEDHRSFFDGFLFGCARKLKGDWNFTQ----GLIGLGQKSHSLIQQL 247 +G++ ++F S FGC +G F Q GL+G+G+ SL QLSbjct 184 MGTETLTFGSVS--------IPNITFGCGENNQG---FGQGNGAGLVGMGRGPLSLPSQL 232

Query 248 GDKLGYKFSYCLVSYDSPPSAKSFLFLGSSAALRGHDVVSTPILHGDHLDQTLYYVDLQS 307 KFSYC+ S S S L LGS A +T ++ + T YY+ L Sbjct 233 DVT---KFSYCMTPIGS--STPSNLLLGSLANSVTAGSPNTTLIQSSQIP-TFYYITLNG 286

Query 308 ITIGGVPVVVYDKESGHNTSVGPFLANKTVIDSGTTYTLLTPPVYEAMRKSIEEQVILPT 367 +++G + + N++ G +IDSGTT T Y+++R+ Q+ LP Sbjct 287 LSVGSTRLPIDPSAFALNSNNG---TGGIIIDSGTTLTYFVNNAYQSVRQEFISQINLPV 343

Query 368 L-GNSAGLDLCFNSSGDTS-YGFPSVTFYFANQVQLVLPFENIFQVTSRDVVCLSMDSSG 425 + G+S+G DLCF + D S P+ +F + L LP EN F S ++CL+M SS Sbjct 344 VNGSSSGFDLCFQTPSDPSNLQIPTFVMHF-DGGDLELPSENYFISPSNGLICLAMGSSS 402

Query 426 GDLSIIGNMQQQNFHILYDLVASQISFQRVEC 457 +SI GN+QQQN ++YD S +SF +CSbjct 403 QGMSIFGNIQQQNMLVVYDTGNSVVSFASAQC 434

Gene3: Pentatricopeptide repeat-containing protein

Details of best match

Pentatricopeptide repeat-containing protein

Score = 135 bits (339), Expect = 3e-31, Method: Compositional matrix adjust. Identities = 84/261 (32%), Positives = 121/261 (46%), Gaps = 44/261 (16%)

Query 1 MPERTPAAWNAMIEAFFSIGDISSATKMFSSMPHRSPSSWNT------------------ 42 M ER +W AM+ + GDIS+A +F MP R SWN Sbjct 188 MSERNVVSWTAMLSGYARSGDISNAVALFEDMPERDVPSWNAILAACTQNGLFLEAVSLF 247

Query 43 ------------------VLSAYAQAGHIDIAKGIFASTPHRNVVS----WTSMIAANAQ 80 VLSA AQ G + +AKGI A R++ S S++ +Sbjct 248 RRMINEPSIRPNEVTVVCVLSACAQTGTLQLAKGIHAFAYRRDLSSDVFVSNSLVDLYGK 307

Query 81 VGDLVEVRKLFESMPEGDPVAWSSILSAYAQRGESLATIATFTNMKLFDM----PDKNCF 136 G+L E +F+ + AW+S+++ +A G S IA F M ++ PD FSbjct 308 CGNLEEASSVFKMASKKSLTAWNSMINCFALHGRSEEAIAVFEEMMKLNINDIKPDHITF 367

Query 137 LAALLACTHEGRVESAKEYFVGMAVDFGVDPGMEHYCCLVDVLARAGQLGSALELAASMP 196 + L ACTH G V + YF M FG++P +EHY CL+D+L RAG+ ALE+ ++M Sbjct 368 IGLLNACTHGGLVSKGRGYFDLMTNRFGIEPRIEHYGCLIDLLGRAGRFDEALEVMSTMK 427

Query 197 FEAGAWQWRKLLAACRSFGDV 217 +A W LL AC+ G +Sbjct 428 MKADEAIWGSLLNACKIHGHL 448

Score = 75.1 bits (183), Expect = 4e-13, Method: Compositional matrix adjust. Identities = 54/171 (31%), Positives = 82/171 (47%), Gaps = 14/171 (8%)

Query 26 TKMFSSMPHRSPSSWNTVLSAYAQA-GHIDIAKGIFASTPHRNVVSWTSMIAANAQVGDL 84 T +F S H +L +YA + HI +A+ +F RNVVSWT+M++ A+ GD+Sbjct 150 THLFKSGFHLYVVVQTALLHSYASSVSHITLARQLFDEMSERNVVSWTAMLSGYARSGDI 209

Query 85 VEVRKLFESMPEGDPVAWSSILSAYAQRGESLATIATFTNM--KLFDMPDKNCFLAALLA 142 LFE MPE D +W++IL+A Q G L ++ F M + P++ + L ASbjct 210 SNAVALFEDMPERDVPSWNAILAACTQNGLFLEAVSLFRRMINEPSIRPNEVTVVCVLSA 269

Query 143 CTHEGRVESAK-----EYFVGMAVDFGVDPGMEHYCCLVDVLARAGQLGSA 188 C G ++ AK Y ++ D V LVD+ + G L ASbjct 270 CAQTGTLQLAKGIHAFAYRRDLSSDVFVSNS------LVDLYGKCGNLEEA 314

Score = 62.4 bits (150), Expect = 3e-09, Method: Compositional matrix adjust. Identities = 38/101 (37%), Positives = 53/101 (52%), Gaps = 5/101 (4%)

Query 18 SIGDISSATKMFSSMPHRSPSSWNTVLSAYAQAGHIDIAKGIFASTPHRNVVSWTSMIAA 77 S+ I+ A ++F M R+ SW +LS YA++G I A +F P R+V SW +++AASbjct 174 SVSHITLARQLFDEMSERNVVSWTAMLSGYARSGDISNAVALFEDMPERDVPSWNAILAA 233

Query 78 NAQVGDLVEVRKLFESM---PEGDP--VAWSSILSAYAQRG 113 Q G +E LF M P P V +LSA AQ GSbjct 234 CTQNGLFLEAVSLFRRMINEPSIRPNEVTVVCVLSACAQTG 274
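Gene3 has three separate alignments against the same best match, and their query ranges overlap while mapping to different offsets in the subject. That pattern would be consistent with a repeat-containing protein, though this interpretation is an assumption rather than something the report states. The sketch below uses the start and end coordinates read off the three alignments above; the names `GENE3_HSPS` and `overlap` are illustrative.

```python
from itertools import combinations

# (bit score, query start, query end, subject start, subject end),
# read from the first and last coordinates of each Gene3 alignment.
GENE3_HSPS = [
    (135.0, 1, 217, 188, 448),
    (75.1, 26, 188, 150, 314),
    (62.4, 18, 113, 174, 274),
]


def overlap(start_a, end_a, start_b, end_b):
    """Number of positions shared by two inclusive 1-based ranges."""
    return max(0, min(end_a, end_b) - max(start_a, start_b) + 1)


def query_overlaps(hsps):
    """Pairwise query-range overlap for every pair of HSPs."""
    result = {}
    for (i, a), (j, b) in combinations(enumerate(hsps), 2):
        result[(i, j)] = overlap(a[1], a[2], b[1], b[2])
    return result


if __name__ == "__main__":
    for (i, j), shared in query_overlaps(GENE3_HSPS).items():
        print(f"HSP {i + 1} and HSP {j + 1} share {shared} query positions")
```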

Gene8: Cell division control protein 15

Details of best match

Cell division control protein 15

Score = 73.9 bits (180), Expect = 3e-12, Method: Compositional matrix adjust. Identities = 66/258 (25%), Positives = 108/258 (41%), Gaps = 41/258 (15%)

Query 248 LKHRIGRGPFGDVWLATIHSHREEFDEFHEVA-------------VKMLPAISEDHIRAF 294 LK IGRG +G V+ A I+ H ++ EV + +L ++ ++I +Sbjct 27 LKQVIGRGSYGVVYKA-INKHTDQVVAIKEVVYENDEELNDIMAEISLLKNLNHNNIVKY 85

Query 295 TSL--RDHQVFRAFNRRQNGTAAR--------------KFYGVNLAQGVLDLHSRGITAL 338 + ++++ NG+ R K Y G+ LH G+ Sbjct 86 HGFIRKSYELYILLEYCANGSLRRLISRSSTGLSENESKTYVTQTLLGLKYLHGEGVIHR 145

Query 339 NLKPFNFLLDEHDQAVLGEFGIPFLLMDAISSDGPLVWLGTPNYMAPEQWEPKLRGPVSY 398 ++K N LL + L +FG+ I + L GT N+MAPE RG S Sbjct 146 DIKAANILLSADNTVKLADFGVS-----TIVNSSALTLAGTLNWMAPEILGN--RG-AST 197

Query 399 ETDSWGFACSFIEMLTGVKPWNTMSPSEIFHAVVEKGDKPVVPSGLPIALTRMLTSCLAS 458 +D W + +EMLT P++ ++ + I++AV + D PS L L+ C Sbjct 198 LSDIWSLGATVVEMLTKNPPYHNLTDANIYYAV--ENDTYYPPSSFSEPLKDFLSKCFVK 255

Query 459 DRRDRPTPNVLLKEADWV 476 + RPT + LLK W+Sbjct 256 NMYKRPTADQLLKHV-WI 272

Gene9: Arabidillo-1

Details of best match

ARABIDILLO-1

Score = 822 bits (2124), Expect = 0.0, Method: Compositional matrix adjust. Identities = 497/939 (52%), Positives = 599/939 (63%), Gaps = 132/939 (14%)

Query 2 RRVRRKCVHTLATKSSAGAENGDGIAEEEESRIPKHDGQVLVRCERESG-VDWTRLADDT 60 RRVRRK + G + + E+ I + LV E G VDW L DTSbjct 3 RRVRRKL------EEEKGKDKVVVLPSYPETSISNEED--LVAPELLHGFVDWISLPYDT 54

Query 61 LLGLFSLLNYRDRASVGSVCRAWHALSSSPSLWTSLDLRAHTLDSNMASSLASRCAKLSK 120 +L LF+ LNYRDRAS+ S C+ W L +S LWTSLDLR H D++MA+SLASRC L Sbjct 55 VLQLFTCLNYRDRASLASTCKTWRCLGASSCLWTSLDLRPHKFDASMAASLASRCVNLHY 114

Query 121 LKFRGASGASLIIDLQARQLKGLIGDGCKDLTDATLSMLVARHENLESLQLGPEL-EKIT 179 L+FRG A +I L+AR L + GD CK +TDATLSM+VARHE LESLQLGP+ E+IT Sbjct 115 LRFRGVESADSLIHLKARNLIEVSGDYCKKITDATLSMIVARHEALESLQLGPDFCERIT 174

Query 180 NEAIKVVAVCCRRLKCLRLAGIRDVDSEAIGDLVKHCPSLTELALLDCAVVDEAALGEAK 239 ++AIK VA CC +LK LRL+GIRDV SEAI L KHCP L +L LDC +DE ALG+ Sbjct 175 SDAIKAVAFCCPKLKKLRLSGIRDVTSEAIEALAKHCPQLNDLGFLDCLNIDEEALGKVV 234

Query 240 SLRYLSVAGSRNIMWTQAMQAWSKLENLVALDVSRTEVTPAAVMSFLSAPR-LRVLCALS 298 S+RYLSVAG+ NI W+ A W KL L LDVSRT++ P AV FL++ + L+VLCAL+Sbjct 235 SVRYLSVAGTSNIKWSIASNNWDKLPKLTGLDVSRTDIGPTAVSRFLTSSQSLKVLCALN 294

Query 299 CSALEDGSNSVSYVS-KDRVLLARFTELMNGLACISSLEQQDESRVL------------- 344 C LE+ + +SY K +VLLA FT + +GLA I + + + Sbjct 295 CHVLEEDESLISYNRFKGKVLLALFTNVFDGLASIFADNTKKPKDIFAYWRELMKTTKDK 354

Query 345 -----VCWTEWVLSHALLRIAENNTQGLDAFWLKQGTSVMLRLIKSMQEDVQERAATALA 399 + W EW++SH LLR AE N +GLD FWL +G +++L L++S QEDVQER+AT LASbjct 355 TINDFIHWIEWIISHTLLRTAECNPEGLDDFWLNEGAALLLNLMQSSQEDVQERSATGLA 414

Query 400 TFVVVDDENATVDSSRAEAVMHGGGIRSLLDLARSSREGVQSEAAKAIANLSVNAEVAKA 459 TFVVVDDENA++D RAEAVM GGIR LL+LA+S REG+QSEAAKAIANLSVNA +AK+Sbjct 415 TFVVVDDENASIDCGRAEAVMKDGGIRLLLELAKSWREGLQSEAAKAIANLSVNANIAKS 474

Query 460 VATEGGINILAGLARSPNRWVAEEAAGGLWNLSVGEEHK--------------------- 498 VA EGGI ILAGLA+S NR VAEEAAGGLWNLSVGEEHK Sbjct 475 VAEEGGIKILAGLAKSMNRLVAEEAAGGLWNLSVGEEHKNAIAQAGGVKALVDLIFRWPN 534

Query 499 ------ERAAGALANLAADDKCSMKVANAGGVNALAARA---------------LANLAA 537 ERAAGALANLAADDKCSM+VA AGGV+AL A LANLAASbjct 535 GCDGVLERAAGALANLAADDKCSMEVAKAGGVHALVMLARNCKYEGVQEQAARALANLAA 594

Query 538 HGDSNGNNAAVGREAG--KKLLAPCGTCH--LTTEIGRQLLQLVV---LRHCIAIGREGG 590 HGDSN NNAAVG+EAG + L+ + H + E L L R I++ GGSbjct 595 HGDSNNNNAAVGQEAGALEALVQLTKSPHEGVRQEAAGALWNLSFDDKNRESISVA--GG 652

Query 591 VAPLVAL--------------------------------------------ARSDAEDVH 606 V LVAL ARS+AEDVHSbjct 653 VEALVALAQSCSNASTGLQERAAGALWGLSVSEANSVAIGREGGVPPLIALARSEAEDVH 712

Query 607 ETAAGALWNLAFNPGNALRIVEEDGVSALVRLCSSSRSKMARFMAALALAYMFDGRMDEV 666 ETAAGALWNLAFNPGNALRIVEE GV ALV LCSSS SKMARFMAALALAYMFDGRMDE Sbjct 713 ETAAGALWNLAFNPGNALRIVEEGGVPALVHLCSSSVSKMARFMAALALAYMFDGRMDEY 772

Query 667 -----TTNEVVYCDSITKNGVARQSAMKNIEAFVQAFSDQPSLAAVPASQWGPSALQQVS 721 T++ +I+ +G AR A+K+IEAFV +F D P + P P+ L QV+Sbjct 773 ALMIGTSSSESTSKNISLDG-ARNMALKHIEAFVLSFID-PHIFESPVVSSTPTMLAQVT 830

Query 722 DSARIQEAGHLRCSGAEIGRFVAMLRNGSSVLRSCAAFALLQFTMPGGRHALHHANLLQR 781 + ARIQEAGHLRCSGAEIGRFV MLRN S L++CAAFALLQFT+PGGRHA+HH +L+Q Sbjct 831 ERARIQEAGHLRCSGAEIGRFVTMLRNPDSTLKACAAFALLQFTIPGGRHAMHHVSLMQN 890

Query 782 SGAARVLRGAAASTTAPLQARVFARLVLRNLELCQSEKS 820 G +R LR AAAS P +A++F +++LRNLE Q+E SSbjct 891 GGESRFLRSAAASAKTPREAKIFTKILLRNLEHHQAESS 929

Gene10: Calcium/calmodulin-dependent protein kinase type 1G

Details of best match

Calcium/calmodulin-dependent protein kinase type 1G

Score = 127 bits (320), Expect = 6e-29, Method: Compositional matrix adjust. Identities = 89/257 (34%), Positives = 140/257 (54%), Gaps = 23/257 (8%)

Query 26 LGCGGQGQVFALTHRGFGHKRYAGKMFRSSSEARREASIMDMLHACPGVAHLEGFVKERD 85 LG G +VF + R G K +A K + S A R++S+ + + + H E V DSbjct 29 LGSGAFSEVFLVKQRLTG-KLFALKCIKKSP-AFRDSSLENEIAVLKKIKH-ENIVTLED 85

Query 86 --GAQCYGSIVMELC-GPSLFDRLLKAGPMCEEDAARTIKKLAETIKEIHSRGIVHRDLK 142 + + +VM+L G LFDR+L+ G E+DA+ I+++ +K +H GIVHRDLKSbjct 86 IYESTTHYYLVMQLVSGGELFDRILERGVYTEKDASLVIQQVLSAVKYLHENGIVHRDLK 145

Query 143 PENVFLKLDAAAADDVVIGDFGMATDDPRE-MAQCCGTGKYLAPEVIAIKFGKSSYTEAV 201 PEN+ L L ++I DFG++ + M+ CGT Y+APEV+A K Y++AVSbjct 146 PENL-LYLTPEENSKIMITDFGLSKMEQNGIMSTACGTPGYVAPEVLAQK----PYSKAV 200

Query 202 DVWGLGLIAYELLRGC------TEWRVMDLLRAGRFP-DGLF----SDGAADLLRGMLAV 250 D W +G+I Y LL G TE ++ + ++ G + + F S+ A D + +L Sbjct 201 DCWSIGVITYILLCGYPPFYEETESKLFEKIKEGYYEFESPFWDDISESAKDFICHLLEK 260

Query 251 EPRKRTTLDRVLAHPWI 267 +P +R T ++ L+HPWISbjct 261 DPNERYTCEKALSHPWI 277

Gene13: Inositol-tetrakisphosphate 1-kinase 1

Details of best match

Inositol-tetrakisphosphate 1-kinase 1

Score = 370 bits (951), Expect = 4e-102, Method: Compositional matrix adjust. Identities = 179/312 (57%), Positives = 227/312 (72%), Gaps = 6/312 (1%)

Query 6 RFEVGYALAQKKQKSFVQPSLVEHARSRGIDMVCIDLDKPLVEQGPFDAILHKLSGKEWH 65 R+ +GYALA KKQ+SF+QPSLV A SRG+D+V +D +PL EQGPF ++HKL G +W Sbjct 18 RYVIGYALAPKKQQSFIQPSLVAQAASRGMDLVPVDASQPLAEQGPFHLLIHKLYGDDWR 77

Query 66 KELEEYEKKHPDVIIIDSPDAIERLHNRISMLQAVSDL-QVGDEQETFGIPKQSVMDRAD 124 +L + +HP V I+D P AI+RLHNRISMLQ VS+L D+ TFGIP Q V+ A Sbjct 78 AQLVAFAARHPAVPIVDPPHAIDRLHNRISMLQVVSELDHAADQDSTFGIPSQVVVYDAA 137

Query 125 CLGDLKAMSGLKFPVIAKPLVADGSAKSHAMSLIFNQEGLTKLKPPVVLQEFVNHGGVIF 184 L D ++ L+FP+IAKPLVADG+AKSH MSL++++EGL KL+PP+VLQEFVNHGGVIFSbjct 138 ALADFGLLAALRFPLIAKPLVADGTAKSHKMSLVYHREGLGKLRPPLVLQEFVNHGGVIF 197

Query 185 KVYVVGDYVKCVKRRSLPDV-PEDELNRSEALCFSQISNMGSTQQC----GASDYLQAEL 239 KVYVVG +V CVKRRSLPDV PED+ + ++ FSQ+SN+ + + G A +Sbjct 198 KVYVVGGHVTCVKRRSLPDVSPEDDASAQGSVSFSQVSNLPTERTAEEYYGEKSLEDAVV 257

Query 240 PPTKFVAELAKGLRENLGLRLFNFDLIRDSKAGNHYHVIDINYFPGYAKMPAYETVLTDF 299 PP F+ ++A GLR LGL+LFNFD+IRD +AG+ Y VIDINYFPGYAKMP YETVLTDFSbjct 258 PPAAFINQIAGGLRRALGLQLFNFDMIRDVRAGDRYLVIDINYFPGYAKMPGYETVLTDF 317

Query 300 FLSLAKLKASSN 311 F + NSbjct 318 FWEMVHKDGVGN 329

Gene15: Flavanone-3-hydroxylase

Details of best match

Flavanone-3-hydroxylase

Score = 186 bits (471), Expect = 2e-46, Method: Compositional matrix adjust. Identities = 107/291 (36%), Positives = 154/291 (52%), Gaps = 21/291 (7%)

Query 27 DQSLPDKYIKPEIAR--VRCNTPLAGIPLIDFSQIHGQSRSKIIQDIANAAQEWGFFQVI 84 D L +++ E R V N IP+I + I G+ R +I + I A ++WG FQV+Sbjct 15 DDKLNSNFVRDEDERPKVAYNEFSNDIPVISLAGIDGEKRGEICRKIVEACEDWGIFQVV 74

Query 85 NHSVPLALMDAMMSAGLEFFNLPLEEKMAYFSEDYKLKLRFCTSFVPSTEAHWDWHDNLT 144 +H V L+ M EFF LP EEK+ + K K F S E DW + +TSbjct 75 DHGVGDDLIADMTRLAREFFALPAEEKLRFDMSGGK-KGGFIVSSHLQGEVVQDWREIVT 133

Query 145 HYFPPYG--DEHPWPKQPPSYEKAAREYFDEVLALGKTISRALSQGLGLEPDFLIKAFRE 202 ++ P D WP +P + K EY ++++ L T+ LS+ +GLE + L KA +Sbjct 134 YFSYPTNSRDYTRWPDKPEGWIKVTEEYSNKLMTLACTLLGVLSEAMGLELEALTKACVD 193

Query 203 GMNSIRLNYYPPCPRPDLAVGMSPHSDFGGFTILMQDQAGGLQVKRNG--EWYSVKPI-- 258 I +NYYP CP+PDL +G+ H+D G T+L+QDQ GGLQ R+G W +V+P+ Sbjct 194 MDQKIVVNYYPKCPQPDLTLGLKRHTDPGTITLLLQDQVGGLQATRDGGKTWITVQPVPG 253

Query 259 ------------FSNGKFQSAEHRVAVNSSSQRLSIATFFEPSEDVVVAPI 297 SNG+F++A+H+ VNS RLSIATF PS D V P+Sbjct 254 AFVVNLGDHGHFLSNGRFKNADHQAVVNSECSRLSIATFQNPSPDATVYPL 304

Gene17: Histone-lysine N-methyltransferase

Details of best match

Histone-lysine N-methyltransferase

Score = 85.9 bits (211), Expect = 6e-17, Method: Compositional matrix adjust. Identities = 47/93 (50%), Positives = 54/93 (58%), Gaps = 33/93 (35%)

Query 69 VIEYAGEIIRPTVAD-----------GAGTYMFCIDNERVVDATRAGSIAHLINHSCE-- 115 VIEY GE++RP +AD GAGTYMF IDNERV+DATR GSIAHLINHSCE Sbjct 922 VIEYTGELVRPPIADKREHLIYNSMVGAGTYMFRIDNERVIDATRTGSIAHLINHSCEPN 981

Query 116 --------------------DIAGGDEVTYDYR 128 D+A +E+TYDYRSbjct 982 CYSRVISVNGDEHIIIFAKRDVAKWEELTYDYR 1014

Gene19: Mediator of RNA polymerase II transcription subunit

Details of best match

Mediator of RNA polymerase II transcription subunit

Score = 75.1 bits (183), Expect = 2e-13, Method: Compositional matrix adjust. Identities = 64/226 (28%), Positives = 102/226 (45%), Gaps = 63/226 (27%)

Query 26 LDRNLVFDYFV--LSPFYDRSCSNEQLRMRSVHPLDMTQLSKMTGVEYVLLEAQEPNLFV 83 L+ V DYF +PFYDR+C+NE ++M+ L + L++M G+EY+LL AQEP LF+Sbjct 24 LNSGSVLDYFSERSNPFYDRTCNNEVVKMQR---LTLEHLNQMVGIEYILLHAQEPILFI 80

Query 84 LRKQKRESPDKA----------------------------RAVHHISAAFSQVSAKL--- 112 +RKQ+R+SP + AVH I +AF + + Sbjct 81 IRKQQRQSPAQVIPLADYYIIAGVIYQAPDLGSVINSRVLTAVHGIQSAFDEAMSYCRYH 140

Query 113 -EKIGYDDENEHESDTSGRVDLK---------EILRIDQILGNVLRKLP-------PAPP 155 K + EHE R K + R+D +L ++ +K P P Sbjct 141 PSKGYWWHFKEHEEQDKVRPKAKRKEEPSSIFQRQRVDALLLDLRQKFPPKFVQLKPGEK 200

Query 156 PPPM--------PTPPSGVPEQPQDSQPEQST--EQGPPAKKAKVE 191 P P+ P P + PE+ + ++ Q T +GPP K+ +++Sbjct 201 PVPVDQTKKEAEPVPETVKPEEKETTKNVQQTVSAKGPPEKRMRLQ 246

Gene20: Pentatricopeptide repeat-containing protein

Details of best match

Pentatricopeptide repeat-containing protein

Score = 324 bits (830), Expect = 3e-87, Method: Compositional matrix adjust. Identities = 177/500 (35%), Positives = 282/500 (56%), Gaps = 9/500 (1%)

Query 589 LSTYARGGCVEEARIFFEDMPSRDMVSWNALLSAYARSGHVEEAKQVFSSMPSSNLVSWT 648 +S +R G + EAR FF+ + + + SWN+++S Y +G +EA+Q+F M N+VSW Sbjct 24 ISRLSRIGKINEARKFFDSLQFKAIGSWNSIVSGYFSNGLPKEARQLFDEMSERNVVSWN 83

Query 649 SLLAAYTQNGHIKLAKSVFEEMPQRDMMAWTIMLTALTQRYLVMEAENVFFNMPEYNLVS 708 L++ Y +N I A++VFE MP+R++++WT M+ Q +V EAE++F+ MPE N VSSbjct 84 GLVSGYIKNRMIVEARNVFELMPERNVVSWTAMVKGYMQEGMVGEAESLFWRMPERNEVS 143

Query 709 WTAMLTCYSQSGHIEEASLVFHAMEQRDIVAWTAMVAAYAQSGYVKEAIRIFSKMPELDC 768 WT M G I++A ++ M +D+VA T M+ + G V EA IF +M E + Sbjct 144 WTVMFGGLIDDGRIDKARKLYDMMPVKDVVASTNMIGGLCREGRVDEARLIFDEMRERNV 203

Query 769 VTCSTMICVYTQDADFRKAEEVYNAMPEWSVVTMNAMLSCYAQSSQVERAKRVFDEIPEK 828 VT +TMI Y Q+ A +++ MPE + V+ +ML Y S ++E A+ F+ +P KSbjct 204 VTWTTMITGYRQNNRVDVARKLFEVMPEKTEVSWTSMLLGYTLSGRIEDAEEFFEVMPMK 263

Query 829 SLVSWNAMLSGYAQNGEIEKAKNVFDRMEERDVVSWDAMVSGYAQNGYVEEARRIFNAMP 888 +++ NAM+ G+ + GEI KA+ VFD ME+RD +W M+ Y + G+ EA +F M Sbjct 264 PVIACNAMIVGFGEVGEISKARRVFDLMEDRDNATWRGMIKAYERKGFELEALDLFAQMQ 323

Query 889 ER-------NLVAWNALLCGLALNGSVEEAEELLLSTAMDDRNIVSWTAVAIGYAQVGHL 941 ++ +L++ ++ LA + L+ DD V+ + + Y + G LSbjct 324 KQGVRPSFPSLISILSVCATLASLQYGRQVHAHLVRCQFDDDVYVA-SVLMTMYVKCGEL 382

Query 942 QKTRRVFDAMPERDAVAWNAMLETYAYNGRTESTLELFHTMALMQT-PGEAGFVWILLAC 1000 K + VFD +D + WN+++ YA +G E L++FH M T P + + IL ACSbjct 383 VKAKLVFDRFSSKDIIMWNSIISGYASHGLGEEALKIFHEMPSSGTMPNKVTLIAILTAC 442

Query 1001 SHAGKLRSGLGYFASMTRDWKLVPLKQHFCCVVDLLGRAGYLGEAEVLVSAMPSDARELG 1060 S+AGKL GL F SM + + P +H+ C VD+LGRAG + +A L+ +M Sbjct 443 SYAGKLEEGLEIFESMESKFCVTPTVEHYSCTVDMLGRAGQVDKAMELIESMTIKPDATV 502

Query 1061 WSCLLGACRGHTEMDMSRGA 1080 W LLGAC+ H+ +D++ ASbjct 503 WGALLGACKTHSRLDLAEVA 522

Score = 218 bits (555), Expect = 2e-55, Method: Compositional matrix adjust. Identities = 145/507 (28%), Positives = 248/507 (48%), Gaps = 59/507 (11%)

Query 555 NKLLQHYSKQGNVQQARIVFDNITVKDSFSWNIMLSTYARGGCVEEARIFFEDMPSRDMV 614 N ++ Y G ++AR +FD ++ ++ SWN ++S Y + + EAR FE MP R++VSbjct 52 NSIVSGYFSNGLPKEARQLFDEMSERNVVSWNGLVSGYIKNRMIVEARNVFELMPERNVV 111

Query 615 SWNALLSAYARSGHVEEAKQVFSSMPSSNLVSWTSLLAAYTQNGHIKLAKSVFEEMPQRD 674 SW A++ Y + G V EA+ +F MP N VSWT + +G I A+ +++ MP +DSbjct 112 SWTAMVKGYMQEGMVGEAESLFWRMPERNEVSWTVMFGGLIDDGRIDKARKLYDMMPVKD 171

Query 675 MMAWTIMLTALTQRYLVMEAENVFFNMPEYNLVSWTAMLTCYSQSGHIEEASLVFHAMEQ 734 ++A T M+ L + V EA +F M E N+V+WT M+T Y Q+ ++ A +F M +Sbjct 172 VVASTNMIGGLCREGRVDEARLIFDEMRERNVVTWTTMITGYRQNNRVDVARKLFEVMPE 231

Query 735 RDIVAWTAMVAAYAQSGYVKEAIRIFSKMPELDCVTCSTMICVYTQDADFRKAEEVYNAM 794 + V+WT+M+ Y SG +++A F MP + C+ MI + + + KA V++ MSbjct 232 KTEVSWTSMLLGYTLSGRIEDAEEFFEVMPMKPVIACNAMIVGFGEVGEISKARRVFDLM 291

Query 795 PEWSVVTMNAMLSCYAQSSQVERAKRVFDEIPEK-------SLVSW-------------- 833 + T M+ Y + A +F ++ ++ SL+S Sbjct 292 EDRDNATWRGMIKAYERKGFELEALDLFAQMQKQGVRPSFPSLISILSVCATLASLQYGR 351

Query 834 ------------------NAMLSGYAQNGEIEKAKNVFDRMEERDVVSWDAMVSGYAQNG 875 + +++ Y + GE+ KAK VFDR +D++ W++++SGYA +GSbjct 352 QVHAHLVRCQFDDDVYVASVLMTMYVKCGELVKAKLVFDRFSSKDIIMWNSIISGYASHG 411

Query 876 YVEEARRIFNAMPER----NLVAWNALLCGLALNGSVEEAEELLLSTAMDDRNIVS---- 927 EEA +IF+ MP N V A+L + G +EE E+ S M+ + V+ Sbjct 412 LGEEALKIFHEMPSSGTMPNKVTLIAILTACSYAGKLEEGLEIFES--MESKFCVTPTVE 469

Query 928 -WTAVAIGYAQVGHLQKTRRVFDAMPER-DAVAWNAMLETYAYNGRTESTLELFHTMA-- 983 ++ + G + K + ++M + DA W A+L +T S L+L A Sbjct 470 HYSCTVDMLGRAGQVDKAMELIESMTIKPDATVWGALLGAC----KTHSRLDLAEVAAKK 525

Query 984 -LMQTPGEAGFVWILLACSHAGKLRSG 1009 P AG ++LL+ +A + + GSbjct 526 LFENEPDNAG-TYVLLSSINASRSKWG 551

Score = 158 bits (399), Expect = 3e-37, Method: Compositional matrix adjust. Identities = 82/245 (33%), Positives = 134/245 (54%), Gaps = 29/245 (11%)

Query 769 VTCSTMICVYTQDADFRKAEEVYNAMPEWSVVTMNAMLSCYAQSSQVERAKRVFDEIPEK 828 V CS I ++ +A + ++++ ++ + N+++S Y + + A+++FDE+ E+Sbjct 18 VNCSFEISRLSRIGKINEARKFFDSLQFKAIGSWNSIVSGYFSNGLPKEARQLFDEMSER 77

Query 829 SLVSWNAMLSGYAQNGEIEKAKNVFDRMEERDVVSWDAMVSGYAQNGYVEEARRIFNAMP 888 ++VSWN ++SGY +N I +A+NVF+ M ER+VVSW AMV GY Q G V EA +F MPSbjct 78 NVVSWNGLVSGYIKNRMIVEARNVFELMPERNVVSWTAMVKGYMQEGMVGEAESLFWRMP 137

Query 889 ERNLVAWNALLCGLALNGSVEEAEEL-----------------------------LLSTA 919 ERN V+W + GL +G +++A +L L+ Sbjct 138 ERNEVSWTVMFGGLIDDGRIDKARKLYDMMPVKDVVASTNMIGGLCREGRVDEARLIFDE 197

Query 920 MDDRNIVSWTAVAIGYAQVGHLQKTRRVFDAMPERDAVAWNAMLETYAYNGRTESTLELF 979 M +RN+V+WT + GY Q + R++F+ MPE+ V+W +ML Y +GR E E FSbjct 198 MRERNVVTWTTMITGYRQNNRVDVARKLFEVMPEKTEVSWTSMLLGYTLSGRIEDAEEFF 257

Query 980 HTMAL 984 M +Sbjct 258 EVMPM 262

Score = 129 bits (324), Expect = 1e-28, Method: Compositional matrix adjust. Identities = 87/390 (22%), Positives = 176/390 (45%), Gaps = 76/390 (19%)

Query 549 NDIPMSNKLLQHYSKQGNVQQARIVFDNITVKDSFSWNIMLSTYARGGCVEEARIFFEDM 608 D+ S ++ ++G V +AR++FD + ++ +W M++ Y + V+ AR FE MSbjct 170 KDVVASTNMIGGLCREGRVDEARLIFDEMRERNVVTWTTMITGYRQNNRVDVARKLFEVM 229

Query 609 PSRDMVSWNALLSAYARSGHVEEAKQVFSSMPSSNLVSWTSLLAAYTQNGHIKLAKSVFE 668 P + VSW ++L Y SG +E+A++ F MP +++ +++ + + G I A+ VF+Sbjct 230 PEKTEVSWTSMLLGYTLSGRIEDAEEFFEVMPMKPVIACNAMIVGFGEVGEISKARRVFD 289

Query 669 EMPQRDMMAWTIMLTALTQRYLVMEAENVFFNM------PEY-NLVSW------------ 709 M RD W M+ A ++ +EA ++F M P + +L+S Sbjct 290 LMEDRDNATWRGMIKAYERKGFELEALDLFAQMQKQGVRPSFPSLISILSVCATLASLQY 349

Query 710 --------------------TAMLTCYSQSGHIEEASLVFHAMEQRDIVAWTAMVAAYAQ 749 + ++T Y + G + +A LVF +DI+ W ++++ YA Sbjct 350 GRQVHAHLVRCQFDDDVYVASVLMTMYVKCGELVKAKLVFDRFSSKDIIMWNSIISGYAS 409

Query 750 SGYVKEAIRIFSKMPELDCVTCSTMICVYTQDADFRKAEEVYNAMPEWSVVTMNAMLSCY 809 G +EA++IF +MP MP + VT+ A+L+ Sbjct 410 HGLGEEALKIFHEMPS-------------------------SGTMP--NKVTLIAILTAC 442

Query 810 AQSSQVERAKRVFDEIPEKSLVS-----WNAMLSGYAQNGEIEKAKNVFDRMEER-DVVS 863 + + ++E +F+ + K V+ ++ + + G+++KA + + M + D Sbjct 443 SYAGKLEEGLEIFESMESKFCVTPTVEHYSCTVDMLGRAGQVDKAMELIESMTIKPDATV 502

Query 864 WDAMVSGYAQNGYVE----EARRIFNAMPE 889 W A++ + ++ A+++F P+Sbjct 503 WGALLGACKTHSRLDLAEVAAKKLFENEPD 532

Gene21: Protein kinase G11A

Details of best match

Protein kinase G11A

Score = 683 bits (1762), Expect = 0.0, Method: Compositional matrix adjust. Identities = 334/446 (74%), Positives = 368/446 (82%), Gaps = 10/446 (2%)

Query 89 SSLSRGSNSSDVSDESSCSSFSSSANKPHKANDKRWEAIQSVRMRDGSLGLSHFRLLKRL 148 SS R S SSDVSDES+CSS SS KPHKAND RWEAIQ +R RDG LGLSHF+LLK+LSbjct 143 SSRCRPSTSSDVSDESACSSISS-VTKPHKANDSRWEAIQMIRTRDGILGLSHFKLLKKL 201

Query 149 GCGDIGSVYLAELRSTSCHFAMKVMDKASLASRKKLLRAQTEKEILQSLDHPFLPTLYTH 208 GCGDIGSVYL+EL T +FAMKVMDKASLASRKKLLRAQTEKEILQ LDHPFLPTLYTHSbjct 202 GCGDIGSVYLSELNGTKSYFAMKVMDKASLASRKKLLRAQTEKEILQCLDHPFLPTLYTH 261

Query 209 FETDKFSCLVMEFCMGGDLHTLRQRQPGKHFTEQAAKFYASEVLLSLEYLHMLGVVYRDL 268 FETDKFSCLVMEFC GGDLHTLRQRQ GK+F EQA KFY +E+LL++EYLHMLG++YRDLSbjct 262 FETDKFSCLVMEFCPGGDLHTLRQRQRGKYFPEQAVKFYVAEILLAMEYLHMLGIIYRDL 321

Query 269 KPENVLVREDGHIMLSDFDLSLRCVVSPTLVKSSM-DGD---KRGPAYCIQPACVQPSC- 323 KPENVLVREDGHIMLSDFDLSLRC VSPTL++SS D + K AYC+QPACV+PSC Sbjct 322 KPENVLVREDGHIMLSDFDLSLRCAVSPTLIRSSNPDAEALRKNNQAYCVQPACVEPSCM 381

Query 324 IQPACVVQPSCLLPRFLSKAKSKKSRKPRNDVGNQVSPLPELVAEPTGARSMSFVGTHEY 383 IQP+C +C PRF S KSKK RKP+ +V NQVSP PEL+AEP+ ARSMSFVGTHEYSbjct 382 IQPSCATPTTCFGPRFFS--KSKKDRKPKPEVVNQVSPWPELIAEPSDARSMSFVGTHEY 439

Query 384 LAPEIIKGEGHGSAVDWWTFGIFLYELLHGKTPFKGSGNRATLFNVVGQPLKFPETSHVS 443 LAPEIIKGEGHGSAVDWWTFGIFLYELL GKTPFKGSGNRATLFNV+GQPL+FPE VSSbjct 440 LAPEIIKGEGHGSAVDWWTFGIFLYELLFGKTPFKGSGNRATLFNVIGQPLRFPEYPVVS 499

Query 444 FAARDLIRGLLVKDPQHRLASKRGATEIKQHPFFEGVNWALIRSTVPPEIPKPFEPE--P 501 F+ARDLIRGLLVK+PQ RL KRGATEIKQHPFFEGVNWALIR PPE+P+P E E PSbjct 500 FSARDLIRGLLVKEPQQRLGCKRGATEIKQHPFFEGVNWALIRCASPPEVPRPVEIERPP 559

Query 502 VLPSRVAPPPPPPPPLSSTTTQHSLE 527 P + P P + ++ LESbjct 560 KQPVSTSEPAAAPSDAAQKSSDSYLE 585

Gene22: Pentatricopeptide repeat-containing protein

Details of best match

Pentatricopeptide repeat-containing protein

Score = 347 bits (891), Expect = 2e-94, Method: Compositional matrix adjust. Identities = 189/570 (33%), Positives = 304/570 (53%), Gaps = 41/570 (7%)

Query 446 GNLLIQMYGNCGKIEEARSVFNMLDEKNVFSWNIMQAAFIQNGFVQGARQIFDANPDKSV 505 N+ I GKI EAR +F+ D K++ SWN M A + N + AR++FD PD+++Sbjct 20 ANVRITHLSRIGKIHEARKLFDSCDSKSISSWNSMVAGYFANLMPRDARKLFDEMPDRNI 79

Query 506 VSWNSMIAAYAHRGMLDEAKNLFESMPIKNVVSWTGMLQALSRSGNVEDAKQLFDKMENK 565 +SWN +++ Y G +DEA+ +F+ MP +NVVSWT +++ +G V+ A+ LF KM KSbjct 80 ISWNGLVSGYMKNGEIDEARKVFDLMPERNVVSWTALVKGYVHNGKVDVAESLFWKMPEK 139

Query 566 DPVTWNTMLSAFASKGMLKETKSLFEEMPFRDRVTWTAMVTAHSQAGQGKEAIRYYYQMA 625 + V+W ML F G + + L+E +P +D + T+M+ + G+ EA + +M+ Sbjct 140 NKVSWTVMLIGFLQDGRIDDACKLYEMIPDKDNIARTSMIHGLCKEGRVDEAREIFDEMS 199

Query 626 LEGLQPNRVTSLTVLHACAEYKNFKVARAFHELFVENGLEVDVTIGTALVDMYAKCGNLH 685 E V T +V Y + + Sbjct 200 ---------------------------------------ERSVITWTTMVTGYGQNNRVD 220

Query 686 QAQVVFDRMPDRDVVTWTAMTTAYANAGKFGDAQGLFAAMPIKNVVSHNTMLGALINAGR 745 A+ +FD MP++ V+WT+M Y G+ DA+ LF MP+K V++ N M+ L G Sbjct 221 DARKIFDVMPEKTEVSWTSMLMGYVQNGRIEDAEELFEVMPVKPVIACNAMISGLGQKGE 280

Query 746 LDEAREMFERMSEKNWASWNWMILGYAKSGFGREALRLFGLMDLEGYYADRSTYASALTA 805 + +AR +F+ M E+N ASW +I + ++GF EAL LF LM +G T S L+ Sbjct 281 IAKARRVFDSMKERNDASWQTVIKIHERNGFELEALDLFILMQKQGVRPTFPTLISILSV 340

Query 806 CSSIPAPVQGKLLHQELLESGGLEDDAVLTTALLDMYASCGGLETAEILFREMIFKDEVA 865 C+S+ + GK +H +L+ + D + + L+ MY CG L ++++F KD + Sbjct 341 CASLASLHHGKQVHAQLVRCQ-FDVDVYVASVLMTMYIKCGELVKSKLIFDRFPSKDIIM 399

Query 866 WTAMIAGYVRNDLDVKAVELFREM-LANGLSPGAVPFLHLFSACSHLGFVEESRWYFLMM 924 W ++I+GY + L +A+++F EM L+ P V F+ SACS+ G VEE + MSbjct 400 WNSIISGYASHGLGEEALKVFCEMPLSGSTKPNEVTFVATLSACSYAGMVEEGLKIYESM 459

Query 925 LEDYKVVPELDHYLCLIDLLGRAGQLDRAEELIETMPFQPVAGAWRTLLSACKTHNDKER 984 + V P HY C++D+LGRAG+ + A E+I++M +P A W +LL AC+TH+ + Sbjct 460 ESVFGVKPITAHYACMVDMLGRAGRFNEAMEMIDSMTVEPDAAVWGSLLGACRTHSQLDV 519

Query 985 ADRAAKKNSELDPGCGSPYLILSNLNAEAG 1014 A+ AKK E++P Y++LSN+ A GSbjct 520 AEFCAKKLIEIEPENSGTYILLSNMYASQG 549

Score = 171 bits (432), Expect = 3e-41, Method: Compositional matrix adjust. Identities = 118/462 (25%), Positives = 221/462 (47%), Gaps = 25/462 (5%)

Query 433 RLAEGFYDRHTYLGNLLIQMYGNCGKIEEARSVFNMLDEKNVFSWNIMQAAFIQNGFVQG 492 +L + DR+ N L+ Y G+I+EAR VF+++ E+NV SW + ++ NG V Sbjct 69 KLFDEMPDRNIISWNGLVSGYMKNGEIDEARKVFDLMPERNVVSWTALVKGYVHNGKVDV 128

Query 493 ARQIFDANPDKSVVSWNSMIAAYAHRGMLDEAKNLFESMPIKNVVSWTGMLQALSRSGNV 552 A +F P+K+ VSW M+ + G +D+A L+E +P K+ ++ T M+ L + G VSbjct 129 AESLFWKMPEKNKVSWTVMLIGFLQDGRIDDACKLYEMIPDKDNIARTSMIHGLCKEGRV 188

Query 553 EDAKQLFDKMENKDPVTWNTMLSAFASKGMLKETKSLFEEMPFRDRVTWTAMVTAHSQAG 612 ++A+++FD+M + +TW TM++ + + + + +F+ MP + V+WT+M+ + Q GSbjct 189 DEAREIFDEMSERSVITWTTMVTGYGQNNRVDDARKIFDVMPEKTEVSWTSMLMGYVQNG 248

Query 613 QGKEAIRYYYQM----------ALEGL-QPNRVTSLTVLHACAEYKNFKVARAFHELFVE 661 + ++A + M + GL Q + + + +N + ++ Sbjct 249 RIEDAEELFEVMPVKPVIACNAMISGLGQKGEIAKARRVFDSMKERNDASWQTVIKIHER 308

Query 662 NGLEVDVTIGTALVDMYAKCGNLHQAQVVFDRMPDRDVVTWTAMTTAYANAGKFGDAQGL 721 NG E++ +D++ L Q Q V P + + A + GK AQ +Sbjct 309 NGFELEA------LDLFI----LMQKQGVRPTFPTLISILSVCASLASLHHGKQVHAQLV 358

Query 722 FAAMPIKNVVSHNTMLGALINAGRLDEAREMFERMSEKNWASWNWMILGYAKSGFGREAL 781 + +V + ++ I G L +++ +F+R K+ WN +I GYA G G EALSbjct 359 RCQFDV-DVYVASVLMTMYIKCGELVKSKLIFDRFPSKDIIMWNSIISGYASHGLGEEAL 417

Query 782 RLFGLMDLEGYY-ADRSTYASALTACSSIPAPVQGKLLHQELLESGGLEDDAVLTTALLD 840 ++F M L G + T+ + L+ACS +G +++ + G++ ++DSbjct 418 KVFCEMPLSGSTKPNEVTFVATLSACSYAGMVEEGLKIYESMESVFGVKPITAHYACMVD 477

Query 841 MYASCGGL-ETAEILFREMIFKDEVAWTAMI-AGYVRNDLDV 880 M G E E++ + D W +++ A + LDVSbjct 478 MLGRAGRFNEAMEMIDSMTVEPDAAVWGSLLGACRTHSQLDV 519

Score = 117 bits (293), Expect = 5e-25, Method: Compositional matrix adjust. Identities = 98/426 (23%), Positives = 201/426 (47%), Gaps = 33/426 (7%)

Query 567 PVTWNTMLSAFASKGMLKETKSLFEEMPFRDRVTWTAMVTAHSQAGQGKEAIRYYYQMAL 626 P T N ++ + G + E + LF+ + +W +MV + ++A + + +M Sbjct 17 PPTANVRITHLSRIGKIHEARKLFDSCDSKSISSWNSMVAGYFANLMPRDARKLFDEM-- 74

Query 627 EGLQPNR-VTSLTVLHACAEYKNFKV--ARAFHELFVENGLEVDVTIGTALVDMYAKCGN 683 P+R + S L KN ++ AR +L E +V TALV Y G Sbjct 75 ----PDRNIISWNGL-VSGYMKNGEIDEARKVFDLMPER----NVVSWTALVKGYVHNGK 125

Query 684 LHQAQVVFDRMPDRDVVTWTAMTTAYANAGKFGDAQGLFAAMPIKNVVSHNTMLGALINA 743 + A+ +F +MP+++ V+WT M + G+ DA L+ +P K+ ++ +M+ L Sbjct 126 VDVAESLFWKMPEKNKVSWTVMLIGFLQDGRIDDACKLYEMIPDKDNIARTSMIHGLCKE 185

Query 744 GRLDEAREMFERMSEKNWASWNWMILGYAKSGFGREALRLFGLMDLEGYYADRSTYASAL 803 GR+DEARE+F+ MSE++ +W M+ GY ++ +A ++F +M + + +Sbjct 186 GRVDEAREIFDEMSERSVITWTTMVTGYGQNNRVDDARKIFDVMPEK----------TEV 235

Query 804 TACSSIPAPVQGKLLH--QELLESGGLEDDAVLTTALLDMYASCGGLETAEILFREMIFK 861 + S + VQ + +EL E ++ + A++ G + A +F M +Sbjct 236 SWTSMLMGYVQNGRIEDAEELFEVMPVK-PVIACNAMISGLGQKGEIAKARRVFDSMKER 294

Query 862 DEVAWTAMIAGYVRNDLDVKAVELFREMLANGLSPGAVPFLHLFSACSHLGFVEESRWYF 921 ++ +W +I + RN +++A++LF M G+ P + + S C+ L + + Sbjct 295 NDASWQTVIKIHERNGFELEALDLFILMQKQGVRPTFPTLISILSVCASLASLHHGKQVH 354

Query 922 LMMLEDYKVVPELDHYL--CLIDLLGRAGQLDRAEELIETMPFQPVAGAWRTLLSACKTH 979 ++ + ++D Y+ L+ + + G+L +++ + + P + + W +++S +HSbjct 355 AQLV---RCQFDVDVYVASVLMTMYIKCGELVKSKLIFDRFPSKDII-MWNSIISGYASH 410

Query 980 NDKERA 985 E ASbjct 411 GLGEEA 416

Score = 82.0 bits (201), Expect = 2e-14, Method: Compositional matrix adjust. Identities = 44/177 (24%), Positives = 93/177 (52%), Gaps = 11/177 (6%)

Query 411 LGDALRECARTRNLAEGRKIHARLAEGFYDRHTYLGNLLIQMYGNCGKIEEARSVFNMLD 470 L L CA +L G+++HA+L +D Y+ ++L+ MY CG++ +++ +F+ Sbjct 334 LISILSVCASLASLHHGKQVHAQLVRCQFDVDVYVASVLMTMYIKCGELVKSKLIFDRFP 393

Query 471 EKNVFSWNIMQAAFIQNGFVQGARQIFDANP-----DKSVVSWNSMIAAYAHRGMLDEAK 525 K++ WN + + + +G + A ++F P + V++ + ++A ++ GM++E Sbjct 394 SKDIIMWNSIISGYASHGLGEEALKVFCEMPLSGSTKPNEVTFVATLSACSYAGMVEEGL 453

Query 526 NLFESMP----IKNVVS-WTGMLQALSRSGNVEDAKQLFDKME-NKDPVTWNTMLSA 576 ++ESM +K + + + M+ L R+G +A ++ D M D W ++L ASbjct 454 KIYESMESVFGVKPITAHYACMVDMLGRAGRFNEAMEMIDSMTVEPDAAVWGSLLGA 510

Gene23: LECRPA3

Details of best match

Bark agglutinin LECRPA3

Score = 105 bits (261), Expect = 3e-22, Method: Compositional matrix adjust. Identities = 89/265 (33%), Positives = 141/265 (53%), Gaps = 35/265 (13%)

Query 24 SFRFSFSPPFTKNDR--ILVGGNATKTGSCLRLTSRSR--FETGRAIYAERIRLVDSSSN 79 SF F+ FT+ D+ L+G L LT+ + + TGRA+Y++ + + DS++ Sbjct 34 SFNFT---NFTRGDQGVTLLGQANIMANGILALTNHTNPTWNTGRALYSKPVPIWDSATG 90

Query 80 TVSSFSTNFIFRIRQ--GLISADGLAFFLTSSTEDPRVPPEESSGRQLGLISANRDGYPS 137 V+SF T+F F +++ G I ADG+ FFL + R+P + S+G QLG+++AN+ Sbjct 91 NVASFVTSFSFVVQEIKGAIPADGIVFFLA---PEARIP-DNSAGGQLGIVNANK---AY 143

Query 138 NQMVAVEFDTYPNVNETQDQHVGIDINSVRNSYRVANLSSSGLQFTNMTLMSAWIDYSSN 197 N V VEFDTY N + + H+GID +S+ + V SG +L+ I Y S Sbjct 144 NPFVGVEFDTYSNNWDPKSAHIGIDASSLISLRTVKWNKVSG------SLVKVSIIYDSL 197

Query 198 SSVLEVRLGYFYEPRPEEPMVSGVVRLNDFLGDRVWVGFSAATGAFADGYEVLAWEFAAG 257 S L V + + + ++ VV L LG++V VGF+AAT + Y++ AW Sbjct 198 SKTLSVVVTH---ENGQISTIAQVVDLKAVLGEKVRVGFTAATTTGRELYDIHAW----- 249

Query 258 ATVTTFTPTFIATVTTSTSTNTKLS 282 +FT T + T T+STS N ++Sbjct 250 ----SFTSTLV-TATSSTSKNMNIA 269
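The coordinates of the Gene23 alignment above can be checked against its reported gap count. The alignment covers Query 24 to 282 and Sbjct 34 to 269, with an alignment length of 265 columns and 35 gap columns. The sketch below works through that arithmetic: every column not occupied by a residue in one row must be a gap in that row. The function name `gap_columns` is illustrative only.

```python
def gap_columns(q_start, q_end, s_start, s_end, aln_len):
    """Return (gaps in the Query row, gaps in the Sbjct row) for one alignment.

    Each row spans aln_len columns, so gaps = columns - residues covered.
    """
    q_residues = q_end - q_start + 1
    s_residues = s_end - s_start + 1
    return aln_len - q_residues, aln_len - s_residues


# Values read from the Gene23 alignment: Query 24..282, Sbjct 34..269,
# alignment length 265 (the denominator of "Identities = 89/265").
q_gaps, s_gaps = gap_columns(24, 282, 34, 269, 265)
print(q_gaps, s_gaps, q_gaps + s_gaps)
```

For this alignment the two counts are 6 and 29, which sum to the 35 gaps reported in the Score line.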

Gene24: 1-aminocyclopropane-1-carboxylate oxidase homolog 4

Details of best match

1-aminocyclopropane-1-carboxylate oxidase homolog 4

Score = 166 bits (420), Expect = 1e-40, Method: Compositional matrix adjust. Identities = 111/359 (30%), Positives = 166/359 (46%), Gaps = 60/359 (16%)

Query 13 DFSSFKESLRGVKDLVDSGIRELPRFYIRSDSRMRVQS---SVLPPGGEVPIVDLRELDG 69 + +F E+ GVK LVDSG+ ++PR + ++ S L +P +DL D Sbjct 15 ELKAFDETKTGVKGLVDSGVSQVPRIFHHPTVKLSTPKPLPSDLLHLKTIPTIDLGGRDF 74

Query 70 SD---RGRIVEAVARASEEWGFFQV----------------AKEFFAMPVEDRMEIFSAD 110 D R +E + A+ +WGFFQV ++F E R E +S DSbjct 75 QDAIKRNNAIEEIKEAAAKWGFFQVINHGVSLELLEKMKKGVRDFHEQSQEVRKEFYSRD 134

Query 111 LFKRTRFGTSHNPSQETSLEWKDYLRHPCLPLEESMQSWPTKPASYRRVASDYCRGVKGL 170 +R + ++ + + W+D P Q P R + +Y + V LSbjct 135 FSRRFLYLSNFDLFSSPAANWRDTFSCTMAPDTPKPQDLP---EICRDIMMEYSKQVMNL 191

Query 171 ADKLLEVLSESLGLERRYLGSVFGSERLQEMFCNYYPPCPNPELTIGIGEHSDVGGITVL 230 L E+LSE+LGLE +L + S+ L M +YYPPCP P+LT+G +HSD +TVLSbjct 192 GKFLFELLSEALGLEPNHLNDMDCSKGLL-MLSHYYPPCPEPDLTLGTSQHSDNSFLTVL 250

Query 231 LQNEVEGLEVRKDGHWYSIKPVKDAFVVNLGDQLQ---------------PSRGARIR-- 273 L +++EGL+VR++GHW+ + V A ++N+GD LQ +R R R Sbjct 251 LPDQIEGLQVRREGHWFDVPHVSGALIINIGDLLQLITNDKFISLEHRVLANRATRARVS 310

Query 274 -----------------PIPELLDEEHPPAYKEVTFQDYLADFFKHKLQGKRCLDSYKI 315 PI EL+ EE+PP Y+E T +DY F L G L +KISbjct 311 VACFFTTGVRPNPRMYGPIRELVSEENPPKYRETTIKDYATYFNAKGLDGTSALLHFKI 369
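The Score line that heads each alignment packs several statistics into one sentence. As a minimal sketch, assuming the line follows the layout used throughout this report, the statistics can be pulled out with a regular expression and the percentages recomputed from the fractions. The names `STATS_RE`, `parse_stats`, and `truncated_percent` are illustrative. The percentages printed in this report appear consistent with truncation rather than rounding (for example, 33/426 is reported as 7%), so the sketch uses integer division.

```python
import re

# Layout assumed from the Score lines in this report.
STATS_RE = re.compile(
    r"Score = (?P<bits>[\d.]+) bits \((?P<raw>\d+)\), "
    r"Expect = (?P<evalue>[^,\s]+), .*?"
    r"Identities = (?P<ident>\d+)/(?P<length>\d+) \((?P<ident_pct>\d+)%\), "
    r"Positives = (?P<pos>\d+)/\d+ \((?P<pos_pct>\d+)%\), "
    r"Gaps = (?P<gaps>\d+)/\d+ \((?P<gaps_pct>\d+)%\)"
)


def parse_stats(line):
    """Extract the statistics from one Score line (hypothetical helper)."""
    m = STATS_RE.search(line)
    if m is None:
        raise ValueError("not a Score line in the expected layout")
    return {
        "bits": float(m["bits"]),
        "raw_score": int(m["raw"]),
        "evalue": float(m["evalue"]),
        "identities": int(m["ident"]),
        "positives": int(m["pos"]),
        "gaps": int(m["gaps"]),
        "length": int(m["length"]),
        "reported_pct": (int(m["ident_pct"]), int(m["pos_pct"]), int(m["gaps_pct"])),
    }


def truncated_percent(count, length):
    """Percentage with the fractional part dropped."""
    return 100 * count // length


line = ("Score = 166 bits (420), Expect = 1e-40, Method: Compositional matrix adjust. "
        "Identities = 111/359 (30%), Positives = 166/359 (46%), Gaps = 60/359 (16%)")
stats = parse_stats(line)
recomputed = tuple(truncated_percent(stats[k], stats["length"])
                   for k in ("identities", "positives", "gaps"))
print(stats["evalue"], recomputed == stats["reported_pct"])
```

A mismatch between the recomputed and reported percentages would suggest a transcription error in the line rather than a problem with the alignment itself.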

Gene26: Peroxidase 52

Details of best match

Peroxidase 52

Score = 301 bits (770), Expect = 8e-81, Method: Compositional matrix adjust. Identities = 170/311 (54%), Positives = 216/311 (69%), Gaps = 18/311 (5%)

Query 246 DNTYFQLLQSTQGLL-FSDQQLLSGGQSELASMVNEFAGDQQA-----FFTAFANGMGCD 299 DN Y Q T S LLS Q+ + S VN A + F F NG CDSbjct 21 DNNYVVEAQLTTNFYSTSCPNLLSTVQTAVKSAVNSEARMGASILRLFFHDCFVNG--CD 78

Query 300 ASILLDGSS---GEKNAGPNVNSARGFDVIDNVKAAVESSCKGVVSCADILALSAREAVV 356 SILLD +S GE+NA PN NSARGF+VIDN+K+AVE +C GVVSCADILA++AR++VVSbjct 79 GSILLDDTSSFTGEQNAAPNRNSARGFNVIDNIKSAVEKACPGVVSCADILAIAARDSVV 138

Query 357 ALRGPSWTVVFGRRDSTTSSQSTANSAIPPPSSTASRLITSFQNQGLSTQDLVALSGSHT 416 AL GP+W V GRRD+ T+SQ+ ANS IP P+S+ S+LI+SF GLST+D+VALSG+HTSbjct 139 ALGGPNWNVKVGRRDARTASQAAANSNIPAPTSSLSQLISSFSAVGLSTRDMVALSGAHT 198

Query 417 IGQAQCTNFRARLYNGTSGDTIDASFKSNLERNCP--STGGNSNLAPLDLQTPVTFDNLY 474 IGQ++CTNFRAR+YN T+ I+A+F + +R CP S G+ NLAPLD+ T +FDN YSbjct 199 IGQSRCTNFRARIYNETN---INAAFATTRQRTCPRASGSGDGNLAPLDVTTAASFDNNY 255

Query 475 FKNLQAQKGLLFSDQQLFSGGQSSLMSTVNTYANNQQAFFSAFATAMVKMGNINPLTGSN 534 FKNL Q+GLL SDQ LF+GG + S V Y+NN +F S F AM+KMG+I+PLTGS+Sbjct 256 FKNLMTQRGLLHSDQVLFNGGSTD--SIVRGYSNNPSSFNSDFTAAMIKMGDISPLTGSS 313

Query 535 GQIRANCRKTN 545 G+IR C +TN Sbjct 314 GEIRKVCGRTN 324

Score = 284 bits (726), Expect = 1e-75, Method: Compositional matrix adjust. Identities = 158/296 (53%), Positives = 208/296 (70%), Gaps = 17/296 (5%)

Query 24 AQLSSSFYSSTCPNLTDIVRNVIQSAVANENRMAASILRLHFHDCFVNGCDGSVLLQG-- 81 AQL+++FYS++CPNL V+ ++SAV +E RM ASILRL FHDCFVNGCDGS+LL Sbjct 28 AQLTTNFYSTSCPNLLSTVQTAVKSAVNSEARMGASILRLFFHDCFVNGCDGSILLDDTS 87

Query 82 ---GEENAPPNQRSARGFEVIDSVKSAVESACPGVVSCADILALSAHESVTALGGPSWTV 138 GE+NA PN+ SARGF VID++KSAVE ACPGVVSCADILA++A +SV ALGGP+W VSbjct 88 SFTGEQNAAPNRNSARGFNVIDNIKSAVEKACPGVVSCADILAIAARDSVVALGGPNWNV 147

Query 139 VFGRRDSLSPASVADVSANLPGPGFTALRLIRSFQNQDLSPRDLVALSGGHTIGQAQCFT 198 GRRD+ + AS A ++N+P P + +LI SF LS RD+VALSG HTIGQ++C Sbjct 148 KVGRRDART-ASQAAANSNIPAPTSSLSQLISSFSAVGLSTRDMVALSGAHTIGQSRCTN 206

Query 199 FRARLYNGTAGDSIDPALKSRLEQNCPPSAPNGDRNLENLD-TSPATFDNTYFQLLQSTQ 257 FRAR+YN T +I+ A + ++ CP ++ +GD NL LD T+ A+FDN YF+ L + +Sbjct 207 FRARIYNET---NINAAFATTRQRTCPRASGSGDGNLAPLDVTTAASFDNNYFKNLMTQR 263

Query 258 GLLFSDQQLLSGGQSELASMVNEFAGDQQAF---FTAFANGMGCDASILLDGSSGE 310 GLL SDQ L +GG ++ S+V ++ + +F FTA MG D S L GSSGESbjct 264 GLLHSDQVLFNGGSTD--SIVRGYSNNPSSFNSDFTAAMIKMG-DISPLT-GSSGE 315

Gene27: Pentatricopeptide repeat containing protein + Cationic Peroxidase-1

Details of best match

Details of match 2

Pentatricopeptide repeat-containing protein

Score = 307 bits (787), Expect = 1e-82, Method: Compositional matrix adjust. Identities = 158/485 (32%), Positives = 285/485 (58%), Gaps = 18/485 (3%)

Query 20 GYLQDAEKVFDSMPELDLVTWNAMLTGNAHNGHLQGAVLVFQSMREKNLVSYNAMLAAYG 79 G + +A K+FDS + +WN+M+ G N + A +F M ++N++S+N +++ Y Sbjct 31 GKIHEARKLFDSCDSKSISSWNSMVAGYFANLMPRDARKLFDEMPDRNIISWNGLVSGYM 90

Query 80 QNGNLCQARRIFEEMPNRDLVSWNTMLAACAQSGDLESAKIVFDSMKERNLVSWTTILAA 139 +NG + +AR++F+ MP R++VSW ++ +G ++ A+ +F M E+N VSWT +L Sbjct 91 KNGEIDEARKVFDLMPERNVVSWTALVKGYVHNGKVDVAESLFWKMPEKNKVSWTVMLIG 150

Query 140 YAQNGHLQDAMKLFDRMKEHDLIASNAMLSGFALNGQLQQARGIFNQMGERNVVSWNAML 199 + Q+G + DA KL++ + + D IA +M+ G G++ +AR IF++M ER+V++W M+Sbjct 151 FLQDGRIDDACKLYEMIPDKDNIARTSMIHGLCKEGRVDEAREIFDEMSERSVITWTTMV 210

Query 200 TACVRNGDMGEAKRIFDRMEVRTLVSWNAMLAAYVQAGQLPKAKELFQQMPDRDLISWNA 259 T +N + +A++IFD M +T VSW +ML YVQ G++ A+ELF+ MP + +I+ NASbjct 211 TGYGQNNRVDDARKIFDVMPEKTEVSWTSMLMGYVQNGRIEDAEELFEVMPVKPVIACNA 270

Query 260 ILSMHAWNGDVERAREVYESLHEKDIVSCTAMLSVYAQNGHLEESKQIF-----DGI-PE 313 ++S G++ +AR V++S+ E++ S ++ ++ +NG E+ +F G+ P Sbjct 271 MISGLGQKGEIAKARRVFDSMKERNDASWQTVIKIHERNGFELEALDLFILMQKQGVRPT 330

Query 314 W----DLVSWNAMLSS--YSQNGYLQDAKFMFDEIPQKDLVSCNALLAAYAQNGHLLEAR 367 + ++S A L+S + + + Q + FD D+ + L+ Y + G L++++Sbjct 331 FPTLISILSVCASLASLHHGKQVHAQLVRCQFD----VDVYVASVLMTMYIKCGELVKSK 386

Query 368 RVFSWMIERDIVSWNTLLAGYAQNGHSSEALDLFASMKM--SEIPDEIAFTCALVAASHA 425 +F +DI+ WN++++GYA +G EAL +F M + S P+E+ F L A S+ASbjct 387 LIFDRFPSKDIIMWNSIISGYASHGLGEEALKVFCEMPLSGSTKPNEVTFVATLSACSYA 446

Query 426 GVVYPGWSLFIAMRMDYGLIPSKQHYCCLIDLLSRARYLDEAEDLITRMPFVPDVYDWTC 485 G+V G ++ +M +G+ P HY C++D+L RA +EA ++I M PD W Sbjct 447 GMVEEGLKIYESMESVFGVKPITAHYACMVDMLGRAGRFNEAMEMIDSMTVEPDAAVWGS 506

Query 486 ILASC 490 +L +C Sbjct 507 LLGAC 511

Score = 242 bits (617), Expect = 8e-63, Method: Compositional matrix adjust. Identities = 117/330 (35%), Positives = 205/330 (62%), Gaps = 6/330 (1%)

Query 103 NTMLAACAQSGDLESAKIVFDSMKERNLVSWTTILAAYAQNGHLQDAMKLFDRMKEHDLI 162 N + ++ G + A+ +FDS +++ SW +++A Y N +DA KLFD M + ++ISbjct 21 NVRITHLSRIGKIHEARKLFDSCDSKSISSWNSMVAGYFANLMPRDARKLFDEMPDRNII 80

Query 163 ASNAMLSGFALNGQLQQARGIFNQMGERNVVSWNAMLTACVRNGDMGEAKRIFDRMEVRT 222 + N ++SG+ NG++ +AR +F+ M ERNVVSW A++ V NG + A+ +F +M + Sbjct 81 SWNGLVSGYMKNGEIDEARKVFDLMPERNVVSWTALVKGYVHNGKVDVAESLFWKMPEKN 140

Query 223 LVSWNAMLAAYVQAGQLPKAKELFQQMPDRDLISWNAILSMHAWNGDVERAREVYESLHE 282 VSW ML ++Q G++ A +L++ +PD+D I+ +++ G V+ ARE+++ + ESbjct 141 KVSWTVMLIGFLQDGRIDDACKLYEMIPDKDNIARTSMIHGLCKEGRVDEAREIFDEMSE 200

Query 283 KDIVSCTAMLSVYAQNGHLEESKQIFDGIPEWDLVSWNAMLSSYSQNGYLQDAKFMFDEI 342 + +++ T M++ Y QN ++++++IFD +PE VSW +ML Y QNG ++DA+ +F+ +Sbjct 201 RSVITWTTMVTGYGQNNRVDDARKIFDVMPEKTEVSWTSMLMGYVQNGRIEDAEELFEVM 260

Query 343 PQKDLVSCNALLAAYAQNGHLLEARRVFSWMIERDIVSWNTLLAGYAQNGHSSEALDLFA 402 P K +++CNA+++ Q G + +ARRVF M ER+ SW T++ + +NG EALDLF Sbjct 261 PVKPVIACNAMISGLGQKGEIAKARRVFDSMKERNDASWQTVIKIHERNGFELEALDLFI 320

Query 403 SMKMSEI----PDEIAF--TCALVAASHAG 426 M+ + P I+ CA +A+ H GSbjct 321 LMQKQGVRPTFPTLISILSVCASLASLHHG 350

Score = 231 bits (589), Expect = 1e-59, Method: Compositional matrix adjust. Identities = 128/431 (29%), Positives = 242/431 (56%), Gaps = 24/431 (5%)

Query 1 MPQRDLVAWTTLLVAYTQRGYLQDAEKVFDSMPELDLVTWNAMLTGNAHNGHLQGAVLVF 60 MP R++++W L+ Y + G + +A KVFD MPE ++V+W A++ G HNG + A +FSbjct 74 MPDRNIISWNGLVSGYMKNGEIDEARKVFDLMPERNVVSWTALVKGYVHNGKVDVAESLF 133

Query 61 QSMREKNLVSYNAMLAAYGQNGNLCQARRIFEEMPNRDLVSWNTMLAACAQSGDLESAKI 120 M EKN VS+ ML + Q+G + A +++E +P++D ++ +M+ + G ++ A+ Sbjct 134 WKMPEKNKVSWTVMLIGFLQDGRIDDACKLYEMIPDKDNIARTSMIHGLCKEGRVDEARE 193

Query 121 VFDSMKERNLVSWTTILAAYAQNGHLQDAMKLFDRMKEHDLIASNAMLSGFALNGQLQQA 180 +FD M ER++++WTT++ Y QN + DA K+FD M E ++ +ML G+ NG+++ ASbjct 194 IFDEMSERSVITWTTMVTGYGQNNRVDDARKIFDVMPEKTEVSWTSMLMGYVQNGRIEDA 253

Query 181 RGIFNQMGERNVVSWNAMLTACVRNGDMGEAKRIFDRMEVRTLVSWNAMLAAYVQAGQLP 240 +F M + V++ NAM++ + G++ +A+R+FD M+ R SW ++ + + G Sbjct 254 EELFEVMPVKPVIACNAMISGLGQKGEIAKARRVFDSMKERNDASWQTVIKIHERNGFEL 313

Query 241 KAKELFQQMPDRDL----ISWNAILSMHAWNGDVERAREVYESL----HEKDIVSCTAML 292 +A +LF M + + + +ILS+ A + ++V+ L + D+ + ++Sbjct 314 EALDLFILMQKQGVRPTFPTLISILSVCASLASLHHGKQVHAQLVRCQFDVDVYVASVLM 373

Query 293 SVYAQNGHLEESKQIFDGIPEWDLVSWNAMLSSYSQNGYLQDAKFMFDEIP-----QKDL 347 ++Y + G L +SK IFD P D++ WN+++S Y+ +G ++A +F E+P + + Sbjct 374 TMYIKCGELVKSKLIFDRFPSKDIIMWNSIISGYASHGLGEEALKVFCEMPLSGSTKPNE 433

Query 348 VSCNALLAAYAQNGHLLEARRVFSWMIERDIVSWNTLLAGYA-------QNGHSSEALDL 400 V+ A L+A + G + E +++ M + + A YA + G +EA+++Sbjct 434 VTFVATLSACSYAGMVEEGLKIYESM--ESVFGVKPITAHYACMVDMLGRAGRFNEAMEM 491

Query 401 FASMKMSEIPD 411 SM + PD Sbjct 492 IDSMTVE--PD 500

Cationic peroxidase 1

Score = 264 bits (675), Expect = 1e-69, Method: Compositional matrix adjust. Identities = 136/258 (52%), Positives = 180/258 (69%), Gaps = 14/258 (5%)

Query 569 GCDASILLDGAN---LEQNAFPNAGSARGFDIVDSIKSSVESSCPGVVSCADLLALIARD 625 GCDAS+LLD + E+ A PNA S RGF+++D+IKS VES CPGVVSCAD+LA+ ARDSbjct 70 GCDASVLLDDTSNFTGEKTAGPNANSIRGFEVIDTIKSQVESLCPGVVSCADILAVAARD 129

Query 626 SVVALNGPSWTVVFGRRDSLTASQSAANANLPPPTLNASALIASFQNQGLSTTDMVALSG 685 SVVAL G SW V+ GRRDS TAS S+AN++LP P N S LI++F N+G +T ++V LSGSbjct 130 SVVALGGASWNVLLGRRDSTTASLSSANSDLPAPFFNLSGLISAFSNKGFTTKELVTLSG 189

Query 686 AHTIGQARCTTFKARLYGPFQRGDQMDQSFNTSLQSSCPSSNGDTNLSPLDVQTPTSFDN 745 AHTIGQA+CT F+ R+Y +D ++ SLQ++CPS GDTNLSP DV TP FDNSbjct 190 AHTIGQAQCTAFRTRIYNE----SNIDPTYAKSLQANCPSVGGDTNLSPFDVTTPNKFDN 245

Query 746 RYFRNLQNRRGLLFSDQTLFSGNQASTRNLVNSYASSQSTFFQDFGNAMTQCANLKARLY 805 Y+ NL+N++GLL SDQ LF+G ST + V +Y+++ +TF DFGNAM + NL Sbjct 246 AYYINLRNKKGLLHSDQQLFNG--VSTDSQVTAYSNNAATFNTDFGNAMIKMGNLS---- 299

Query 806 RPFQSGDQTLQSSCVGSN 823 P ++++C +NSbjct 300 -PLTGTSGQIRTNCRKTN 316

Score = 62.0 bits (149), Expect = 1e-08, Method: Compositional matrix adjust. Identities = 31/71 (43%), Positives = 43/71 (60%), Gaps = 6/71 (8%)

Query 796 QCANLKARLYRPFQ---SGDQTLQSSCVGSNGDTNLSPLDIQTPTSFDNRYFRNLHNRRG 852 QC + R+Y + ++LQ++C GDTNLSP D+ TP FDN Y+ NL N++GSbjct 197 QCTAFRTRIYNESNIDPTYAKSLQANCPSVGGDTNLSPFDVTTPNKFDNAYYINLRNKKG 256

Query 853 LLFS---LFSG 860 LL S LF+GSbjct 257 LLHSDQQLFNG 267

Gene28: Cationic Peroxidase-1

Details of best match

Cationic peroxidase 1

Score = 214 bits (546), Expect = 3e-55, Method: Compositional matrix adjust. Identities = 115/255 (45%), Positives = 155/255 (60%), Gaps = 31/255 (12%)

Query 23 GCDASIMLNGSNN---EQFAFPNINSLRGYNVIENIKALVEAKCPNTVSCADIIVIVARE 79 GCDAS++L+ ++N E+ A PN NS+RG+ VI+ IK+ VE+ CP VSCADI+ + AR+Sbjct 70 GCDASVLLDDTSNFTGEKTAGPNANSIRGFEVIDTIKSQVESLCPGVVSCADILAVAARD 129

Query 80 CVMA--------------------TAANVELPPFFLNVSRLIANFQSHGLSVQDLVALSG 119 V+A ++AN +LP F N+S LI+ F + G + ++LV LSGSbjct 130 SVVALGGASWNVLLGRRDSTTASLSSANSDLPAPFFNLSGLISAFSNKGFTTKELVTLSG 189

Query 120 SHTIGQGQCGNFKSRLYGPSLSSSPDYMNPYYNQSLRSQCPSSGGDSNLSPLDLQTPVVF 179 +HTIGQ QC F++R+Y S ++P Y +SL++ CPS GGD+NLSP D+ TP FSbjct 190 AHTIGQAQCTAFRTRIYNES------NIDPTYAKSLQANCPSVGGDTNLSPFDVTTPNKF 243

Query 180 DNKYYKNLINFSGLFHSDQTLWSGGDWTVAQLVHTYAMNQARFFQDFATGMINMGNLKPL 239 DN YY NL N GL HSDQ L++G + V Y+ N A F DF MI MGNL PLSbjct 244 DNAYYINLRNKKGLLHSDQQLFNG--VSTDSQVTAYSNNAATFNTDFGNAMIKMGNLSPL 301

Query 240 LAPNGQIRKYCGKVN 254 +GQIR C K N Sbjct 302 TGTSGQIRTNCRKTN 316

Gene29: Pentatricopeptide Repeat containing protein

Details of best match

Pentatricopeptide repeat-containing protein

Score = 312 bits (800), Expect = 5e-84, Method: Compositional matrix adjust. Identities = 162/530 (30%), Positives = 289/530 (54%), Gaps = 17/530 (3%)

Query 451 LTAYAHFGHLEKSQDIFERMPQRNLFSWNAMLALYGVKGLMEKANRLFQEMPEWNSVSWT 510 +T + G + +++ +F+ +++ SWN+M+A Y + A +LF EMP+ N +SW Sbjct 24 ITHLSRIGKIHEARKLFDSCDSKSISSWNSMVAGYFANLMPRDARKLFDEMPDRNIISWN 83

Query 511 TMLDGYSQNGYLGRSKLVFDSMPERNLISWGCMLAAYAHNGHLMDAKRVFDTMPEHNLVC 570 ++ GY +NG + ++ VFD MPERN++SW ++ Y HNG + A+ +F MPE N V Sbjct 84 GLVSGYMKNGEIDEARKVFDLMPERNVVSWTALVKGYVHNGKVDVAESLFWKMPEKNKVS 143

Query 571 FNAILTAFAQNGHLAKAKHAFDTMPETNVVTWNAMLTAYSDNGRVEQAKVMFDSMPYRNL 630 + +L F Q+G + A ++ +P+ + + +M+ GRV++A+ +FD M R++Sbjct 144 WTVMLIGFLQDGRIDDACKLYEMIPDKDNIARTSMIHGLCKEGRVDEAREIFDEMSERSV 203

Query 631 VSWTCMLAMHAQYGQVTEARRTFDTLPERTINVTDALLTVYAHNGRIEDSKILFDGMPHW 690 ++WT M+ + Q +V +AR+ FD +PE+T ++L Y NGRIED++ LF+ MP Sbjct 204 ITWTTMVTGYGQNNRVDDARKIFDVMPEKTEVSWTSMLMGYVQNGRIEDAEELFEVMPVK 263

Query 691 DMLAYSTMLSAFAQNGHVEDAKNLYDSMPEKHLVSKTSMLAMYAQHGSINEAQSMFDSM- 749 ++A + M+S Q G + A+ ++DSM E++ S +++ ++ ++G EA +F M Sbjct 264 PVIACNAMISGLGQKGEIAKARRVFDSMKERNDASWQTVIKIHERNGFELEALDLFILMQ 323

Query 750 ------AFQDIVAWNSMLA--AYTQHGYVDQAKTI---FDIMPERDAVSWSTMLAAYARK 798 F +++ S+ A A HG A+ + FD+ D S ++ Y + Sbjct 324 KQGVRPTFPTLISILSVCASLASLHHGKQVHAQLVRCQFDV----DVYVASVLMTMYIKC 379

Query 799 GHLPQAKKFFSTIPEPSFVSWNALLTAYYQNSVPSGVFESFDMLKMVDGSLSG-ICFVLV 857 G L ++K F P + WN++++ Y + + + F + + + + FV Sbjct 380 GELVKSKLIFDRFPSKDIIMWNSIISGYASHGLGEEALKVFCEMPLSGSTKPNEVTFVAT 439

Query 858 LVACSHGGKLVEAGQRFVSMRLDYAFEPAKQHYACMVDVLGRAGRLEDARELVHAMPFVA 917 L ACS+ G + E + + SM + +P HYACMVD+LGRAGR +A E++ +M Sbjct 440 LSACSYAGMVEEGLKIYESMESVFGVKPITAHYACMVDMLGRAGRFNEAMEMIDSMTVEP 499

Query 918 NGFEWVTLLGSCRSNDDFKQGARVAKAMMDANPASGAAYILLANLYDSRA 967 + W +LLG+CR++ AK +++ P + YILL+N+Y S+ Sbjct 500 DAAVWGSLLGACRTHSQLDVAEFCAKKLIEIEPENSGTYILLSNMYASQG 549

Score = 251 bits (641), Expect = 1e-65, Method: Compositional matrix adjust. Identities = 148/543 (27%), Positives = 269/543 (49%), Gaps = 56/543 (10%)

Query 324 NKLIELYGRVGSPLHAQEVFDAIAHKNSLSWVMLLNAYCRNRQLDRTNQVFRSMPHRDLI 383 N I R+G A+++FD+ K+ SW ++ Y N ++F MP R++ISbjct 21 NVRITHLSRIGKIHEARKLFDSCDSKSISSWNSMVAGYFANLMPRDARKLFDEMPDRNII 80

Query 384 SWTCLLTALAQNGHLIKAQQVFDQMPMRDLVCWNSMLVAYGRSGRIEEAVKVFEDMPEKD 443 SW L++ +NG + +A++VFD MP R++V W +++ Y +G+++ A +F MPEK+Sbjct 81 SWNGLVSGYMKNGEIDEARKVFDLMPERNVVSWTALVKGYVHNGKVDVAESLFWKMPEKN 140

Query 444 LITWTSVLTAYAHFGHLEKSQDIFERMPQRNLFSWNAMLALYGVKGLMEKANRLFQEMPE 503 ++WT +L + G ++ + ++E +P ++ + +M+ +G +++A +F EM ESbjct 141 KVSWTVMLIGFLQDGRIDDACKLYEMIPDKDNIARTSMIHGLCKEGRVDEAREIFDEMSE 200

Query 504 WNSVSWTTMLDGYSQNGYLGRSKLVFDSMPERNLISWGCMLAAYAHNGHLMDAKRVFDTM 563 + ++WTTM+ GY QN + ++ +FD MPE+ +SW ML Y NG + DA+ +F+ MSbjct 201 RSVITWTTMVTGYGQNNRVDDARKIFDVMPEKTEVSWTSMLMGYVQNGRIEDAEELFEVM 260

Query 564 PEHNLVCFNAILTAFAQNGHLAKAKHAFDTMPETNVVTWNAMLTAYSDNGRVEQAKVMFD 623 P ++ NA+++ Q G +AKA+ FD+M E N +W ++ + NG +A +F Sbjct 261 PVKPVIACNAMISGLGQKGEIAKARRVFDSMKERNDASWQTVIKIHERNGFELEALDLFI 320

Query 624 SM-------PYRNLVSW--------------------------------TCMLAMHAQYG 644 M + L+S + ++ M+ + GSbjct 321 LMQKQGVRPTFPTLISILSVCASLASLHHGKQVHAQLVRCQFDVDVYVASVLMTMYIKCG 380

Query 645 QVTEARRTFDTLPERTINVTDALLTVYAHNGRIEDSKILFDGMP-----HWDMLAYSTML 699 ++ +++ FD P + I + +++++ YA +G E++ +F MP + + + LSbjct 381 ELVKSKLIFDRFPSKDIIMWNSIISGYASHGLGEEALKVFCEMPLSGSTKPNEVTFVATL 440

Query 700 SAFAQNGHVEDAKNLYDSMPEKHLVSKTS-----MLAMYAQHGSINEAQSMFDSMAFQ-D 753 SA + G VE+ +Y+SM V + M+ M + G NEA M DSM + DSbjct 441 SACSYAGMVEEGLKIYESMESVFGVKPITAHYACMVDMLGRAGRFNEAMEMIDSMTVEPD 500

Query 754 IVAWNSMLAAYTQHGYVDQ----AKTIFDIMPERDA--VSWSTMLAAYARKGHLPQAKKF 807 W S+L A H +D AK + +I PE + S M A+ R + + +K Sbjct 501 AAVWGSLLGACRTHSQLDVAEFCAKKLIEIEPENSGTYILLSNMYASQGRWADVAELRKL 560

Query 808 FST 810 T Sbjct 561 MKT 563

Score = 205 bits (521), Expect = 1e-51, Method: Compositional matrix adjust. Identities = 88/272 (32%), Positives = 164/272 (60%), Gaps = 0/272 (0%)

Query 572 NAILTAFAQNGHLAKAKHAFDTMPETNVVTWNAMLTAYSDNGRVEQAKVMFDSMPYRNLV 631 N +T ++ G + +A+ FD+ ++ +WN+M+ Y N A+ +FD MP RN++Sbjct 21 NVRITHLSRIGKIHEARKLFDSCDSKSISSWNSMVAGYFANLMPRDARKLFDEMPDRNII 80

Query 632 SWTCMLAMHAQYGQVTEARRTFDTLPERTINVTDALLTVYAHNGRIEDSKILFDGMPHWD 691 SW +++ + + G++ EAR+ FD +PER + AL+ Y HNG+++ ++ LF MP + Sbjct 81 SWNGLVSGYMKNGEIDEARKVFDLMPERNVVSWTALVKGYVHNGKVDVAESLFWKMPEKN 140

Query 692 MLAYSTMLSAFAQNGHVEDAKNLYDSMPEKHLVSKTSMLAMYAQHGSINEAQSMFDSMAF 751 ++++ ML F Q+G ++DA LY+ +P+K +++TSM+ + G ++EA+ +FD M+ Sbjct 141 KVSWTVMLIGFLQDGRIDDACKLYEMIPDKDNIARTSMIHGLCKEGRVDEAREIFDEMSE 200

Query 752 QDIVAWNSMLAAYTQHGYVDQAKTIFDIMPERDAVSWSTMLAAYARKGHLPQAKKFFSTI 811 + ++ W +M+ Y Q+ VD A+ IFD+MPE+ VSW++ML Y + G + A++ F +Sbjct 201 RSVITWTTMVTGYGQNNRVDDARKIFDVMPEKTEVSWTSMLMGYVQNGRIEDAEELFEVM 260

Query 812 PEPSFVSWNALLTAYYQNSVPSGVFESFDMLK 843 P ++ NA+++ Q + FD +KSbjct 261 PVKPVIACNAMISGLGQKGEIAKARRVFDSMK 292

Score = 165 bits (418), Expect = 1e-39, Method: Compositional matrix adjust. Identities = 79/242 (32%), Positives = 143/242 (59%), Gaps = 0/242 (0%)

Query 601 TWNAMLTAYSDNGRVEQAKVMFDSMPYRNLVSWTCMLAMHAQYGQVTEARRTFDTLPERT 660 T N +T S G++ +A+ +FDS +++ SW M+A + +AR+ FD +P+R Sbjct 19 TANVRITHLSRIGKIHEARKLFDSCDSKSISSWNSMVAGYFANLMPRDARKLFDEMPDRN 78

Query 661 INVTDALLTVYAHNGRIEDSKILFDGMPHWDMLAYSTMLSAFAQNGHVEDAKNLYDSMPE 720 I + L++ Y NG I++++ +FD MP ++++++ ++ + NG V+ A++L+ MPESbjct 79 IISWNGLVSGYMKNGEIDEARKVFDLMPERNVVSWTALVKGYVHNGKVDVAESLFWKMPE 138

Query 721 KHLVSKTSMLAMYAQHGSINEAQSMFDSMAFQDIVAWNSMLAAYTQHGYVDQAKTIFDIM 780 K+ VS T ML + Q G I++A +++ + +D +A SM+ + G VD+A+ IFD MSbjct 139 KNKVSWTVMLIGFLQDGRIDDACKLYEMIPDKDNIARTSMIHGLCKEGRVDEAREIFDEM 198

Query 781 PERDAVSWSTMLAAYARKGHLPQAKKFFSTIPEPSFVSWNALLTAYYQNSVPSGVFESFD 840 ER ++W+TM+ Y + + A+K F +PE + VSW ++L Y QN E F+Sbjct 199 SERSVITWTTMVTGYGQNNRVDDARKIFDVMPEKTEVSWTSMLMGYVQNGRIEDAEELFE 258

Query 841 ML 842 ++ Sbjct 259 VM 260

Score = 135 bits (340), Expect = 1e-30, Method: Compositional matrix adjust. Identities = 100/387 (25%), Positives = 187/387 (48%), Gaps = 30/387 (7%)

Query 319 DVFVQNKLIELYGRVGSPLHAQEVFDAIAHKNSLSWVMLLNAYCRNRQLDRTNQVFRSMP 378 D + +I + G A+E+FD ++ ++ ++W ++ Y +N ++D ++F MPSbjct 171 DNIARTSMIHGLCKEGRVDEAREIFDEMSERSVITWTTMVTGYGQNNRVDDARKIFDVMP 230

Query 379 HRDLISWTCLLTALAQNGHLIKAQQVFDQMPMRDLVCWNSMLVAYGRSGRIEEAVKVFED 438 + +SWT +L QNG + A+++F+ MP++ ++ N+M+ G+ G I +A +VF+ Sbjct 231 EKTEVSWTSMLMGYVQNGRIEDAEELFEVMPVKPVIACNAMISGLGQKGEIAKARRVFDS 290

Query 439 MPEKDLITWTSVLTAYAHFGHLEKSQDIFERMPQR----------NLFSWNAMLA-LYGV 487 M E++ +W +V+ + G ++ D+F M ++ ++ S A LA L+ Sbjct 291 MKERNDASWQTVIKIHERNGFELEALDLFILMQKQGVRPTFPTLISILSVCASLASLHHG 350

Query 488 KGLMEKANRLFQEMPEWNSVSWTTMLDGYSQNGYLGRSKLVFDSMPERNLISWGCMLAAY 547 K + + R ++ + + TM Y + G L +SKL+FD P +++I W +++ YSbjct 351 KQVHAQLVRCQFDVDVYVASVLMTM---YIKCGELVKSKLIFDRFPSKDIIMWNSIISGY 407

Query 548 AHNGHLMDAKRVFDTMP-----EHNLVCFNAILTAFAQNGHLAKAKHAFDTMPETNVVT- 601 A +G +A +VF MP + N V F A L+A + G + + +++M V Sbjct 408 ASHGLGEEALKVFCEMPLSGSTKPNEVTFVATLSACSYAGMVEEGLKIYESMESVFGVKP 467

Query 602 ----WNAMLTAYSDNGRVEQAKVMFDSMPYR-NLVSWTCMLAMHAQYGQ--VTE--ARRT 652 + M+ GR +A M DSM + W +L + Q V E A++ Sbjct 468 ITAHYACMVDMLGRAGRFNEAMEMIDSMTVEPDAAVWGSLLGACRTHSQLDVAEFCAKKL 527

Query 653 FDTLPERTINVTDALLTVYAHNGRIED 679 + PE + L +YA GR DSbjct 528 IEIEPENS-GTYILLSNMYASQGRWAD 553

Score = 52.0 bits (123), Expect = 2e-05, Method: Compositional matrix adjust. Identities = 39/188 (20%), Positives = 91/188 (48%), Gaps = 14/188 (7%)

Query 291 LLDLLQSCRSIPAAH---EIRSRIQSTLGAADVFVQNKLIELYGRVGSPLHAQEVFDAIA 347 L+ +L C S+ + H ++ +++ DV+V + L+ +Y + G + ++ +FD Sbjct 334 LISILSVCASLASLHHGKQVHAQLVRCQFDVDVYVASVLMTMYIKCGELVKSKLIFDRFP 393

Query 348 HKNSLSWVMLLNAYCRNRQLDRTNQVFRSMP-----HRDLISWTCLLTALAQNGHLIKAQ 402 K+ + W +++ Y + + +VF MP + +++ L+A + G + + Sbjct 394 SKDIIMWNSIISGYASHGLGEEALKVFCEMPLSGSTKPNEVTFVATLSACSYAGMVEEGL 453

Query 403 QVFDQMP----MRDLVC-WNSMLVAYGRSGRIEEAVKVFEDMP-EKDLITWTSVLTAYAH 456 ++++ M ++ + + M+ GR+GR EA+++ + M E D W S+L A Sbjct 454 KIYESMESVFGVKPITAHYACMVDMLGRAGRFNEAMEMIDSMTVEPDAAVWGSLLGACRT 513

Query 457 FGHLEKSQ 464 L+ ++ Sbjct 514 HSQLDVAE 521

Gene31: Clavaminate synthase-like protein

Details of best match

Clavaminate synthase-like protein

Score = 108 bits (271), Expect = 2e-23, Method: Compositional matrix adjust. Identities = 55/122 (45%), Positives = 76/122 (62%), Gaps = 5/122 (4%)

Query 156 RAKEGNSRIEWNQNGTASLFMGPKIGTKFCKSKGRKVWFNSIGSTY---ELMLISPPGEH 212 RA + ++EW ++G A MGP K+ +S+ RKVWFNS+ + Y E P Sbjct 210 RAVDLGMKLEWTEDGGAKTVMGPIPAIKYDESRNRKVWFNSMVAAYTGWEDKRNDP--RK 267

Query 213 GISFGDGTPLNEKFLAACKRIMEEEKVAFKWRKGDVLIIDNDAVLHAREPSRPPRKILAA 272 ++FGDG PL + C RI+EEE VA W++GDVL+IDN AVLH+R P PPR++LA+Sbjct 268 AVTFGDGKPLPADIVHDCLRILEEECVAVPWQRGDVLLIDNWAVLHSRRPFDPPRRVLAS 327

Query 273 LA 274 L Sbjct 328 LC 329

Score = 90.9 bits (224), Expect = 6e-18, Method: Compositional matrix adjust. Identities = 55/126 (43%), Positives = 75/126 (59%), Gaps = 9/126 (7%)

Query 8 VLVEGHTPEQR---SHPFRIPHVFVPFDSSCA----ALLMLLEGIQSQKADIEHALHQSG 60 +LVE P+Q+ S PF P V P +S +L + + I++QK ++ LH+SGSbjct 5 LLVETPIPQQKHYESKPF--PAVISPPSASIPIPALSLPLFTQTIKTQKHYLDSLLHESG 62

Query 61 AVLLRGFEVLTASDFNDVLEAFGYDNFVYNGRGPHKKAIIGRVVTANEKLVHFPIGFHNE 120 AVL RGF V +A DFNDV+EAFG+D Y G + +++GRV TANE I FH+ESbjct 63 AVLFRGFPVNSADDFNDVVEAFGFDELPYVGGAAPRTSVVGRVFTANESPPDQKIPFHHE 122

Query 121 MAYVPE 126 MA V E Sbjct 123 MAQVRE 128

Gene32: Anthranilate N-benzoyltransferase protein 3

Details of best match

Anthranilate N-benzoyltransferase protein 3

Score = 112 bits (280), Expect = 4e-24, Method: Compositional matrix adjust. Identities = 105/370 (28%), Positives = 162/370 (43%), Gaps = 56/370 (15%)

Query 50 PNPGLQVFKEILRKGLSKVLAAYPCCAGRLRINSSDKLEIDCNNQGARLGVGSCKLSISE 109 P+ + IL + LSK L Y AGRL+IN D+ EIDCN +GA S Sbjct 56 PSSSMYFDANILIEALSKALVPYYPMAGRLKING-DRYEIDCNGEGALFVEAE-----SS 109

Query 110 VTAADFDEFFLEPSDNM-----------ESFEDLPLLNIQVTEMN-NGYAISMLQHHTLG 157 DF +F P+D + + PLL +Q+T G +I QHH + Sbjct 110 HVLEDFGDF--RPNDELHRVMVPTCDYSKGISSFPLLMVQLTRFRCGGVSIGFAQHHHVC 167

Query 158 EATSAICFLMNFAGQCRGEELWLVPEFDR-TQMKASDHPMPSFQHHEFGKQEKDSRYTVT 216 + S F ++A +G L P DR + + P + H +F Sbjct 168 DRMSHFEFNNSWARIAKGLLPALEPVHDRYLHLCPRNPPQIKYTHSQFEPF--------- 218

Query 217 SESVLNAESLRRSVAKGSV-KQRKKYHLSRSRLAQIKQAALTDVSNC----STFEALAAQ 271 SL + + G K + + LSR ++ +KQ D SN ST+E +A Sbjct 219 ------VPSLPKELLDGKTSKSQTLFKLSREQINTLKQK--LDWSNTTTRLSTYEVVAGH 270

Query 272 VWKA--NVAALPKKEVARMRFLVDTRS-IIQPPLSRGFFGSAVYVVMVQARTEELLTEPL 328 VW++ L E ++ VD RS I P L +G+ G+ V++ + A +L PLSbjct 271 VWRSVSKARGLSDHEEIKLIMPVDGRSRINNPSLPKGYCGNVVFLAVCTATVGDLACNPL 330

Query 329 GVTAMRIQQAKKSVTEEYVRSGLDFLELHPD----YWYHPD----CDTVINAWPRAMSNS 380 TA ++Q+A K + ++Y+RS +D E PD Y P+ + ++N+W R +Sbjct 331 TDTAGKVQEALKGLDDDYLRSAIDHTESKPDLPVPYMGSPEKTLYPNVLVNSWGRIPYQA 390

Query 381 TQLDFGLGKP 390 +DFG G P Sbjct 391 --MDFGWGNP 398

Gene33: Alcohol dehydrogenase class-3

Details of best match

Alcohol dehydrogenase class-3

Score = 243 bits (619), Expect = 3e-63, Method: Compositional matrix adjust. Identities = 143/361 (39%), Positives = 197/361 (54%), Gaps = 65/361 (18%)

Query 1 MAS-TVGRAIQCRAAVLHSAGSEFELETINVEPPKSGEIRMQVLYSSLCHTDITIADWGT 59 MAS T G+ I C+AAV + +E + V PP++GE+R+++L+++LCHTD W Sbjct 1 MASPTQGQVITCKAAVAYEPNKPLVIEDVQVAPPQAGEVRVKILFTALCHTDHYT--WSG 58

Query 60 LK----YPVILGHEGSGVVESVGEGVTEFAPGDHVICVYQGECGKCKLCKLSTTNHCEVS 115 +P ILGHE +G+VESVGEGVT+ PGDHVI YQ EC +CK CK TN C Sbjct 59 KDPEGLFPCILGHEAAGIVESVGEGVTDVQPGDHVIPCYQAECKECKFCKSGKTNLCGKV 118

Query 116 FGNLFTPFMPLDGTARFSSLDGSAIHHFVNTSTFTEYTVLDKTSVVKVDP-VPLEKACLL 174 M D +RF S++G I+HF+ TSTF++YTV+ SV K++P PL+K CLLSbjct 119 RSATGVGVMMNDMKSRF-SVNGKPIYHFMGTSTFSQYTVVHDVSVAKINPQAPLDKVCLL 177

Query 175 GCGVPTGLGSALNLANVEAGSTVAVIGLGTVGLA-------------------------- 208 GCGVPTGLG+ N A VE+GS VAV GLGTVGLA Sbjct 178 GCGVPTGLGAVWNTAKVESGSVVAVFGLGTVGLAVAEGAKAAGASRVIGIDIDNKKFDVA 237

Query 209 --------------------VLLELTNGGVDYCFECVGKPNLL----------YGTTVMV 238 VL++LT+GGVDY FEC+G +++ +GT+V+VSbjct 238 KNFGVTEFVNPKEHDKPIQQVLVDLTDGGVDYSFECIGNVSIMRAALECSDKGWGTSVIV 297

Query 239 GAPRPDEMVTFPPVILLSGRQLKSGYFGGFRGKSDMRKLVDMCSTKGAAIDPARAASIQL 298 G + ++ P L++GR K FGGF+ ++ + LVD K +D ++ LSbjct 298 GVAASGQEISTRPFQLVTGRVWKGTAFGGFKSRTQVPWLVDKYMKKEIKVDEYITHNMNL 357

Query 299 A 299 A Sbjct 358 A 358