project report on influenza virus


Upload: brijesh-singh-yadav

Post on 18-Nov-2014


DESCRIPTION

This project report focuses on the bird flu virus.

TRANSCRIPT

Page 1: Project  Report on Influenza Virus

Contents

Preface

Chapter 1: Introduction

Chapter 2: Material and methods

Chapter 3: Result and discussion

Chapter 4: Conclusion

Chapter 5: Bibliography and references

Chapter 6: Abbreviation

Appendix

Preface


Viruses are masters of interspecies navigation. Mutating rapidly and often grabbing the genetic material of other

viruses, they can jump from animals to humans with a quick flick of their DNA. Sometimes, as in West Nile

fever, the transfer occurs through an intermediate host such as a mosquito. But viruses can also make the leap

directly.

Since the 1980s, the list of diseases that have hitchhiked directly from animals to people has grown

rapidly: Hantavirus, SARS, monkeypox and, most recently, avian influenza, commonly called bird flu. With

the exception of HIV/AIDS, perhaps none of these illnesses has more potential to create widespread harm than

bird flu does.

In people, bird flu usually begins much like conventional influenza, with fever, cough, sore throat and

muscle aches, but bird flu can lead to life-threatening complications.

So far, bird flu is hard for humans to contract, but health officials warn a major flu outbreak could occur if the

virus mutates into a form that can spread easily from person to person. The grimmest scenario would be a global

epidemic to rival the flu pandemic of 1918 and 1919, which claimed millions of lives worldwide. In the

meantime, researchers are trying to sort out options for a vaccine. Bird flu seems to be developing resistance to

the flu drug Tamiflu. And a French vaccine maker has produced a bird flu vaccine that promoted an immune

system response but still needs further study.

Much work is under way on the bird flu virus, so I decided to contribute to this topic by

performing a phylogenetic analysis of different strains of influenza A virus.

Our current work aims to analyze the phylogenetic relationship between different strains of

influenza A virus and to examine the cause of virulence in the light of evolution. We compare the evolutionary

position of different strains in the phylogenetic trees, taking five different types of strains that are classified as either

HPAI or LPAI. This study sheds some light on the evolutionary relationship between different strains of influenza A

virus, for a better understanding of the evolution of pathogenesis in terms of antigenic drift and shift. We also aim to model

the unknown protein structure of influenza A virus and to find an appropriate drug target for it.

The gene sequences for the strains of influenza A virus were collected from the NCBI site for the purpose of

phylogenetic analysis. We obtained nearly 40 sequences. The sequences were aligned using the CLUSTAL W package to

determine their similarity and relationships. Using the output of CLUSTAL W as the input for PHYLIP, we

performed the phylogenetic analysis. We then used NJplot to visualize the tree constructed by PHYLIP.
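Tree building from an alignment rests on some measure of how different each pair of aligned sequences is. As a minimal sketch, and not the computation that CLUSTAL W or PHYLIP performs internally, the proportion of differing sites between two aligned sequences (often called the p-distance) could be computed as below. The function names and the three toy sequences are hypothetical and are not taken from the NCBI data used in this study.

```python
from itertools import combinations


def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Columns where either sequence has a gap ('-') are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return differing / compared


def distance_matrix(alignment: dict) -> dict:
    """Pairwise p-distances keyed by (name_1, name_2), names in sorted order."""
    names = sorted(alignment)
    return {
        (x, y): p_distance(alignment[x], alignment[y])
        for x, y in combinations(names, 2)
    }


# Hypothetical toy alignment, for illustration only.
toy_alignment = {
    "strain_A": "ATGGAGAAAATAGTGC",
    "strain_B": "ATGGAGAAGATAGTGC",
    "strain_C": "ATGAAGACAAT-CTGC",
}

if __name__ == "__main__":
    for pair, dist in distance_matrix(toy_alignment).items():
        print(pair, round(dist, 4))
```

A matrix of this kind is the sort of input that distance-based tree methods such as neighbor joining work from; real analyses usually apply a substitution model correction rather than the raw proportion.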

After that, I collected protein sequences of the H5N1 strain of influenza A virus from the NCBI website and obtained

nearly 10 such sequences. For the modelling step we took protein sequences of the H5N1 strain only and submitted

them to the SWISS-MODEL server.

We then proceeded to docking to continue the structure prediction analysis. We found that the

neuraminidase protein of influenza A virus, encoded by the NA gene, is one of the reasons for its pathogenicity. We

identified an appropriate ligand from PDBsum or CSA and then performed docking with the HEX software.

We also performed family analysis with GENSCAN and ORF Finder to find consensus sequences, motifs, exons,

introns, ORFs, etc.

Chapter 1

INTRODUCTION

Influenza viruses that infect birds are called avian influenza A viruses. Only influenza A viruses

infect birds, and all known subtypes of influenza A viruses can infect birds. However, there are substantial genetic

differences between the subtypes that typically infect people and those that infect birds. Avian influenza A H5 and H7 viruses

can be distinguished as low pathogenic and high pathogenic forms on the basis of genetic features of the virus.

Influenza A virus, the virus that causes avian flu. Transmission electron micrograph of negatively stained

virus particles in late passage. (Source: Dr. Erskine Palmer, Centers for Disease Control and Prevention

Public Health Image Library)

Avian influenza is a disease of birds caused by influenza viruses closely related to human influenza viruses.

Transmission to humans in close contact with poultry or other birds occurs rarely and only with some strains of

avian influenza. The potential for transformation of avian influenza into a form that both causes severe disease in

humans and spreads easily from person to person is a great concern for world health.

Wild birds are the natural host for all known subtypes of influenza A viruses. Typically, wild birds do not become

sick when they are infected with avian influenza A viruses. However, domestic poultry, such as turkeys and

chickens, can become very sick and die from avian influenza, and some avian influenza A viruses also can cause

serious disease and death in wild birds.

Influenza Virus Types, Subtypes, and Strains

Three distinct types of influenza virus, dubbed A, B, and C, have been identified. Together these viruses, which

are antigenically distinct from one another, comprise their own viral family, Orthomyxoviridae. Most cases of

the flu, especially those that occur in epidemics or pandemics, are caused by the influenza A virus, which can

affect a variety of animal species, but the B virus, which normally is only found in humans, is responsible for

many localized outbreaks. The influenza C virus is morphologically and genetically different from the other two

viruses and is generally asymptomatic, so it is of little medical concern.


Influenza Type A

Influenza type A viruses can infect people, birds, pigs, horses, seals, whales, and other animals, but wild birds are

the natural hosts for these viruses. Influenza type A viruses are divided into subtypes based on two proteins on the

surface of the virus. These proteins are called hemagglutinin (HA) and neuraminidase (NA). There are 15

different HA subtypes and 9 different NA subtypes. Many different combinations of HA and NA proteins are

possible. Only some influenza A subtypes (i.e., H1N1, H1N2, and H3N2) are currently in general circulation

among people. Other subtypes are found most commonly in other animal species. For example, H7N7 and H3N8

viruses cause illness in horses and dogs.

Subtypes of influenza A virus are named according to their HA and NA surface proteins. For example, an “H7N2

virus” designates an influenza A subtype that has an HA 7 protein and an NA 2 protein. Similarly, an “H5N1” virus

has an HA 5 protein and an NA 1 protein.
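The naming rule can be made concrete with a short sketch. The code below splits a subtype name such as "H5N1" into its HA and NA numbers and counts the possible HA/NA pairings using the subtype totals quoted in this section (15 HA, 9 NA). The function and constant names are hypothetical, and the pattern only checks the general shape of a name, not whether that subtype has actually been observed.

```python
import re

# Subtype totals as quoted in this report; treat them as assumptions.
HA_SUBTYPES = 15
NA_SUBTYPES = 9

# A subtype name is 'H' + HA number + 'N' + NA number, e.g. H5N1.
SUBTYPE_PATTERN = re.compile(r"^H(\d{1,2})N(\d{1,2})$")


def parse_subtype(name: str) -> tuple:
    """Return (HA number, NA number) for a name such as 'H5N1'."""
    match = SUBTYPE_PATTERN.match(name.strip().upper())
    if match is None:
        raise ValueError(f"not a valid influenza A subtype name: {name!r}")
    return int(match.group(1)), int(match.group(2))


def possible_combinations(ha: int = HA_SUBTYPES, na: int = NA_SUBTYPES) -> int:
    """Number of HA/NA pairings if every HA could pair with every NA."""
    return ha * na


if __name__ == "__main__":
    print(parse_subtype("H7N2"))
    print(parse_subtype("H5N1"))
    print(possible_combinations())
```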

Influenza Type B

 Influenza B viruses are normally found only in humans. Unlike influenza A viruses, these viruses are not

classified according to subtype. Although influenza type B viruses can cause human epidemics, they have not

caused pandemics.

Influenza Type C 

Influenza type C viruses cause mild illness in humans and do not cause epidemics or pandemics. These viruses are

not classified according to subtype.

 Strains

Influenza B viruses and subtypes of influenza A virus are further characterized into strains. There are many

different strains of influenza B viruses and of influenza A subtypes. New strains of influenza viruses appear and

replace older strains. This process occurs through a type of change called “drift” (see How Influenza Viruses

Can Change: Shift and Drift). When a new strain of human influenza virus emerges, antibody protection that may

have developed after infection or vaccination with an older strain may not provide protection against the new

strain. Thus, the influenza vaccine is updated on a yearly basis to keep up with the changes in influenza viruses.

Subtypes

Influenza A viruses are significant for their potential for disease and death in humans and other animals. Influenza

A virus subtypes that have been confirmed in humans, in order of the number of known human pandemic deaths

that they have caused, include:

H1N1, which caused "Spanish Flu" and currently causes seasonal human flu

H2N2, which caused "Asian Flu"

H3N2, which caused "Hong Kong Flu" and currently causes seasonal human flu

H5N1, the world's major current pandemic threat

H9N2, which has infected three people


Human influenza virus versus avian influenza virus

Humans can be infected with influenza types A, B, and C. However, the only subtypes of influenza A virus that

normally infect people are influenza A subtypes H1N1, H1N2, and H3N2. Between 1957 and 1968, H2N2 viruses

also circulated among people, but currently do not.

 

Only influenza A viruses infect birds. Wild birds are the natural host for all subtypes of influenza A virus.

Typically wild birds do not get sick when they are infected with influenza virus. However, domestic poultry, such

as turkeys and chickens, can get very sick and die from avian influenza, and some avian viruses also can cause

serious disease and death in wild birds.

 

Structure

The structure of the influenza virus (see Figure 1) is somewhat variable, but the virion particles are usually

spherical or ovoid in shape and 80 to 120 nanometres in diameter. Sometimes filamentous forms of the virus

occur as well, and are more common among some influenza strains than others. The influenza virion is an

enveloped virus that derives its lipid bilayer from the plasma membrane of a host cell. Two different varieties of

glycoprotein spike are embedded in the envelope. Approximately 80 percent of the spikes are hemagglutinin, a

trimeric protein that functions in the attachment of the virus to a host cell. The remaining 20 percent or so of the

glycoprotein spikes consist of neuraminidase, which is thought to be predominantly involved in facilitating the

release of newly produced virus particles from the host cell. On the inner side of the envelope that surrounds an

influenza virion is an antigenic matrix protein lining. Within the envelope is the influenza genome, which is

organized into eight pieces of single-stranded RNA (A and B forms only; influenza C has 7 RNA segments). The

RNA is packaged with nucleoprotein into a helical ribonucleoprotein form, with three polymerase peptides for

each RNA segment.

Diagrammatic representation of the morphology of an influenza virion.

The virion is generally rounded but may be long and filamentous.

A single-stranded RNA genome is closely associated with a helical nucleoprotein (NP), and is present in eight


separate segments of ribonucleoprotein (RNP), each of which has to be present for

successful replication. The segmented genome is enclosed within an outer lipoprotein

envelope. An antigenic protein called the matrix protein (MP1)

lines the inside of the envelope and is chemically bound

to the RNP. The envelope carries two types of protruding

spikes. One is a box-shaped protein, called the neuraminidase

(NA), of which there are nine major antigenic types, and which

has enzymic properties as the name implies. The other type of

envelope spike is a trimeric protein called the hemagglutinin

(HA) (illustrated on the right) of which there are 15 major antigenic types. The

hemagglutinin functions during attachment of the virus particle to the cell membrane, and

can combine with specific receptors on a variety of cells including red blood cells. The lipoprotein envelope

makes the virion rather labile - susceptible to heat, drying, detergents and solvents.

Genes of Influenza A virus

Influenza A viruses have 10 genes on eight separate RNA molecules (called: PB2,

PB1, PA, HA, NP, NA, M, and NS). HA, NA, and M specify the structure of proteins that are most medically

relevant as targets for antiviral drugs and antibodies. (An eleventh recently discovered gene called PB1-F2

sometimes creates a protein but is absent from some influenza virus isolates.) This segmentation of the influenza

genome facilitates genetic recombination by segment reassortment in hosts who are infected with two different

influenza viruses at the same time. Influenza A virus is the only species in the Influenzavirus A genus of the

Orthomyxoviridae family; its members are negative-sense, single-stranded, segmented RNA viruses.

Surface encoding gene segments

Surface antigen encoding gene segments (RNA molecule): (HA, NA)

o HA codes for hemagglutinin, which is an antigenic glycoprotein, found on the surface of the

influenza viruses and is responsible for binding the virus to the cell that is being infected.

o NA codes for neuraminidase, which is an antigenic glycoprotein enzyme, found on the surface

of the influenza viruses. It helps the release of progeny viruses from infected cells.

Internal encoding gene segments

Internal viral protein encoding gene segments (RNA molecule): (M, NP, NS, PA, PB1, PB2)

Matrix encoding gene segments:

o M codes for the matrix proteins (M1 and M2) that along with the two surface proteins

(hemagglutinin and neuraminidase) make up the capsid (protective coat) of the virus. It encodes

both proteins by using different reading frames from the same RNA segment.

M1 is a protein that binds to the viral RNA.


M2 is a protein that uncoats the virus exposing its contents (the eight RNA segments)

to the cytoplasm of the host cell. The M2 transmembrane protein is an ion channel

required for efficient infection.

Nucleoprotein encoding gene segments

o NP codes for nucleoprotein.

o NS codes for two nonstructural proteins (NS1 and NEP). "[T]he pathogenicity of influenza

virus was related to the nonstructural (NS) gene of the H5N1/97 virus"

Polymerase encoding gene segments

o PA codes for the PA protein which is a critical component of the viral polymerase.

o PB1 codes for the PB1 protein and the PB1-F2 protein.

o PB2 codes for the PB2 protein which is a critical component of the viral polymerase.
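The gene list above can be summarised as a small lookup table. The sketch below simply records, for each of the eight RNA segments, the proteins this chapter attributes to it, and checks that the totals match the counts stated earlier (10 genes, or 11 when PB1-F2 is included). The variable and function names are hypothetical.

```python
# Segment-to-protein map, transcribed from the gene list in this chapter.
SEGMENT_PRODUCTS = {
    "PB2": ["PB2"],
    "PB1": ["PB1", "PB1-F2"],  # PB1-F2 is absent from some isolates
    "PA": ["PA"],
    "HA": ["HA"],
    "NP": ["NP"],
    "NA": ["NA"],
    "M": ["M1", "M2"],
    "NS": ["NS1", "NEP"],
}


def count_products(include_pb1_f2: bool = True) -> int:
    """Count encoded proteins, optionally leaving out PB1-F2."""
    products = [p for ps in SEGMENT_PRODUCTS.values() for p in ps]
    if not include_pb1_f2:
        products = [p for p in products if p != "PB1-F2"]
    return len(products)


if __name__ == "__main__":
    print("segments:", len(SEGMENT_PRODUCTS))
    print("proteins without PB1-F2:", count_products(include_pb1_f2=False))
    print("proteins with PB1-F2:", count_products())
```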

How the Flu Virus Can Change - "Drift" and "Shift"

Influenza viruses can change in two different ways. One is called "antigenic drift." These are small changes in

the virus that happen continually over time. Antigenic drift produces new virus strains that may not be

recognized by the body's immune system. This process works as follows: a person infected with a particular flu

virus strain develops antibody against that virus. As newer virus strains appear, the antibodies against the older

strains no longer recognize the "newer" virus, and reinfection can occur. This is one of the main reasons why

people can get the flu more than one time. In most years, one or two of the three virus strains in the influenza

vaccine are updated to keep up with the changes in the circulating flu viruses. So, people who want to be

protected from flu need to get a flu shot every year.

The other type of change is called "antigenic shift." Antigenic shift is an abrupt, major change in the influenza A

viruses, resulting in a new hemagglutinin, or new hemagglutinin and neuraminidase proteins, in influenza

viruses that infect humans. Shift results in a new influenza A subtype. When shift happens, most people have

little or no protection against the new virus. While influenza viruses are changing by antigenic drift all the time,

antigenic shift happens only occasionally. Type A viruses undergo both kinds of changes; influenza type B

viruses change only by the more gradual process of antigenic drift.

Influenza viruses are dynamic and are continuously evolving.

Genetics of drift and shift:

Antigenic drift refers to small, gradual changes that occur through point mutations in the two genes that contain

the genetic material to produce the main surface proteins, hemagglutinin, and neuraminidase. These point


mutations occur unpredictably and result in minor changes to these surface proteins. Antigenic drift produces

new virus strains that may not be recognized by antibodies to earlier influenza strains. This process works as

follows: a person infected with a particular influenza virus strain develops antibody against that strain. As newer

virus strains appear, the antibodies against the older strains might not recognize the "newer" virus, and infection

with a new strain can occur. This is one of the main reasons why people can become infected with influenza

viruses more than one time and why global surveillance is critical in order to monitor the evolution of human

influenza virus strains and to select which strains should be included in the annual production of influenza

vaccine. In most years, one or two of the three virus strains in the influenza vaccine are updated to keep up with

the changes in the circulating influenza viruses. For this reason, people who want to be immunized against

influenza need to be vaccinated every year.
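Drift as described here is an accumulation of point mutations. The toy sketch below illustrates only that idea: it applies a chosen number of random single amino acid substitutions to a short peptide and counts how many positions then differ from the original. The peptide, the function names and the uniform choice of replacement residue are assumptions made for illustration; real mutation and selection in hemagglutinin or neuraminidase are not uniform or random in this way.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def drift(sequence: str, n_substitutions: int, rng: random.Random) -> str:
    """Return a copy of `sequence` with substitutions at distinct positions.

    Each chosen position receives a residue different from the original,
    so the result differs from the input at exactly `n_substitutions` sites.
    """
    if n_substitutions > len(sequence):
        raise ValueError("more substitutions requested than positions available")
    residues = list(sequence)
    for pos in rng.sample(range(len(residues)), n_substitutions):
        choices = [aa for aa in AMINO_ACIDS if aa != residues[pos]]
        residues[pos] = rng.choice(choices)
    return "".join(residues)


def hamming(seq_a: str, seq_b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have equal length")
    return sum(a != b for a, b in zip(seq_a, seq_b))


if __name__ == "__main__":
    rng = random.Random(0)              # fixed seed so runs are repeatable
    old_strain = "MEKIVLLFAIVSLVKS"     # hypothetical toy peptide
    new_strain = drift(old_strain, 3, rng)
    print(old_strain)
    print(new_strain)
    print("differences:", hamming(old_strain, new_strain))
```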

Antigenic shift refers to an abrupt, major change to produce a novel influenza A virus subtype in humans that

was not currently circulating among people (see more information below under Influenza Type A and Its

Subtypes). Antigenic shift can occur either through direct animal (poultry)-to-human transmission or through

mixing of human influenza A and animal influenza A virus genes to create a new human influenza A subtype

virus through a process called genetic reassortment. Antigenic shift results in a new human influenza A subtype.

A global influenza pandemic (worldwide spread) may occur if three conditions are met:

A new subtype of influenza A virus is introduced into the human population.

The virus causes serious illness in humans.

The virus can spread easily from person to person in a sustained manner

Diagrammatic representation of Antigenic drift:


Diagrammatic representation of Antigenic shift


Low Pathogenic versus Highly Pathogenic Avian Influenza A Viruses

Avian influenza A virus strains are further classified as low pathogenic (LPAI) or highly pathogenic (HPAI) on

the basis of specific molecular genetic and pathogenesis criteria that require specific testing. Most avian

influenza A viruses are LPAI viruses that are usually associated with mild disease in poultry. In contrast, HPAI

viruses can cause severe illness and high mortality in poultry. More recently, some HPAI viruses (e.g., H5N1)

have been found to cause no illness in some poultry, such as ducks. LPAI viruses have the potential to evolve

into HPAI viruses and this has been documented in some poultry outbreaks. Avian influenza A viruses of the

subtypes H5 and H7, including H5N1, H7N7, and H7N3 viruses, have been associated with HPAI, and human

infections with these viruses have ranged from mild (H7N3, H7N7) to severe and fatal disease (H7N7, H5N1).

Human illness due to infection with LPAI viruses has been documented, ranging from very mild symptoms (e.g.,

conjunctivitis) to influenza-like illness. Examples of LPAI viruses that have infected humans include H7N7,

H9N2, and H7N2.

In general, direct human infection with avian influenza viruses occurs very infrequently, and has been associated

with direct contact (e.g., touching) with sick or dead infected birds (domestic poultry).

Mutation

Influenza viruses have a relatively high mutation rate that is characteristic of RNA viruses. The H5N1 virus has

mutated into a variety of types with differing pathogenic profiles; some pathogenic to one species but not others,

some pathogenic to multiple species. The ability of various influenza strains to show species-selectivity is largely

due to variation in the hemagglutinin genes. Genetic mutations in the hemagglutinin gene that cause single amino


acid substitutions can significantly alter the ability of viral hemagglutinin proteins to bind to receptors on the

surface of host cells. Such mutations in avian H5N1 viruses can change virus strains from being inefficient at

infecting human cells to being as efficient in causing human infections as more common human influenza virus

types. This doesn't mean one amino acid substitution can cause a pandemic but it does mean one amino acid

substitution can cause an avian flu virus that is not pathogenic in humans to become pathogenic in humans. H3N2

("swine flu") is endemic in pigs in China, and has been detected in pigs in Vietnam, increasing fears of the

emergence of new variant strains. The dominant strain of annual flu virus in January 2006 was H3N2, which is

now resistant to the standard antiviral drugs amantadine and rimantadine. The possibility of H5N1 and H3N2

exchanging genes through reassortment is a major concern. If a reassortment in H5N1 occurs, it might remain an

H5N1 subtype, or it could shift subtypes, as H2N2 did when it evolved into the Hong Kong Flu strain of

H3N2. Both the H2N2 and H3N2 pandemic strains contained avian flu virus RNA segments. "While the pandemic

human influenza viruses of 1957 (H2N2) and 1968 (H3N2) clearly arose through reassortment between human

and avian viruses, the influenza virus causing the 'Spanish flu' in 1918 appears to be entirely derived from an

avian source".

Transmission of Influenza A Viruses between Animals and People

Influenza A viruses have infected many different animals, including ducks, chickens, pigs, whales, horses, and

seals. However, certain subtypes of influenza A virus are specific to certain species, except for birds, which are

hosts to all known subtypes of influenza A. Subtypes that have caused widespread illness in people either in the

past or currently are H3N2, H2N2, H1N1, and H1N2. H1N1 and H3N2 subtypes also have caused outbreaks in

pigs, and H7N7 and H3N8 viruses have caused outbreaks in horses.

Influenza A viruses normally seen in one species sometimes can cross over and cause illness in another species.

For example, until 1998, only H1N1 viruses circulated widely in the U.S. pig population. However, in 1998,

H3N2 viruses from humans were introduced into the pig population and caused widespread disease among pigs.

Most recently, H3N8 viruses from horses have crossed over and caused outbreaks in dogs.

Avian influenza A viruses may be transmitted from animals to humans in two main ways:

Directly from birds or from avian virus-contaminated environments to people.

Through an intermediate host, such as a pig.

Influenza A viruses have eight separate gene segments. The segmented genome allows influenza A viruses from

different species to mix and create a new influenza A virus if viruses from two different species infect the same

person or animal. For example, if a pig were infected with a human influenza A virus and an avian influenza A

virus at the same time, the new replicating viruses could mix existing genetic information (reassortment) and

produce a new virus that had most of the genes from the human virus, but a hemagglutinin and/or neuraminidase

from the avian virus. The resulting new virus might then be able to infect humans and spread from person to

person, but it would have surface proteins (hemagglutinin and/or neuraminidase) not previously seen in influenza

viruses that infect humans.
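The mixing described above can be counted directly. As a sketch under simplifying assumptions (two parent viruses, each of the eight segments inherited independently from either parent, every combination treated as possible), the code below enumerates all segment combinations and picks out the case named in the text: a human-virus backbone carrying an avian hemagglutinin and/or neuraminidase. The function names are hypothetical, and viable reassortants in nature are far fewer than this count suggests.

```python
from itertools import product

# The eight RNA segments of influenza A, as listed earlier in this chapter.
SEGMENTS = ("PB2", "PB1", "PA", "HA", "NP", "NA", "M", "NS")


def all_genotypes(parents=("human", "avian")):
    """Every way of drawing each segment from one of the parent viruses."""
    return [
        dict(zip(SEGMENTS, origins))
        for origins in product(parents, repeat=len(SEGMENTS))
    ]


def is_reassortant(genotype) -> bool:
    """True when the segments come from more than one parent."""
    return len(set(genotype.values())) > 1


def avian_surface_on_human_backbone(genotype) -> bool:
    """Internal segments all human, with an avian HA and/or NA."""
    internal_human = all(
        genotype[s] == "human" for s in SEGMENTS if s not in ("HA", "NA")
    )
    avian_surface = genotype["HA"] == "avian" or genotype["NA"] == "avian"
    return internal_human and avian_surface


if __name__ == "__main__":
    genotypes = all_genotypes()
    print("all combinations:", len(genotypes))
    print("reassortants:", sum(is_reassortant(g) for g in genotypes))
    print(
        "human backbone with avian HA and/or NA:",
        sum(avian_surface_on_human_backbone(g) for g in genotypes),
    )
```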


This type of major change in the influenza A viruses is known as antigenic shift. Antigenic shift results when a

new influenza A subtype to which most people have little or no immune protection infects humans. If this new

virus causes illness in people and can be transmitted easily from person to person, an influenza pandemic can

occur.

It is possible that the process of genetic reassortment could occur in a human who is co-infected with avian

influenza A virus and a human strain of influenza A virus. The genetic information in these viruses could reassort

to create a new virus with a hemagglutinin from the avian virus and other genes from the human virus.

Theoretically, influenza A viruses with a hemagglutinin against which humans have little or no immunity that

have reassorted with a human influenza virus are more likely to result in sustained human-to-human transmission

and pandemic influenza. Therefore, careful evaluation of influenza viruses recovered from humans who are

infected with avian influenza is very important to identify reassortment if it occurs.

Although it is unusual for people to get influenza virus infections directly from animals, sporadic human

infections and outbreaks caused by certain avian influenza A viruses and pig influenza viruses have been reported.

(For more information see Avian Influenza Infections in Humans.) These sporadic human infections and

outbreaks, however, rarely result in sustained transmission among humans.

Symptoms in humans

Avian influenza hemagglutinins bind alpha 2-3 sialic acid receptors, while human influenza hemagglutinins bind

alpha 2-6 sialic acid receptors. Usually other differences also exist. There is as yet no human form of H5N1, so all

humans who have caught it so far have caught avian H5N1.

Humans who catch a humanized Influenza A virus (in other words a human flu virus of type A) usually have

symptoms that include fever, cough, sore throat, muscle aches, conjunctivitis and, in severe cases, severe

breathing problems and pneumonia that may be fatal. The severity of the infection will depend to a large part on

the state of the infected person's immune system and if the victim has been exposed to the strain before, and is

therefore partially immune. No one knows if these or other symptoms will be the symptoms of a humanized H5N1

flu.

Highly pathogenic H5N1 avian flu in a human is far worse, killing 50% of humans that catch it. In one case, a boy

with H5N1 experienced diarrhea followed rapidly by a coma without developing respiratory or flu-like symptoms.

There have been studies of the levels of cytokines in humans infected by the H5N1 flu virus. Of particular concern

is an elevated level of tumor necrosis factor alpha (TNFα), a protein that is associated with tissue destruction at

sites of infection and increased production of other cytokines. Flu virus-induced increases in the level of cytokines

are also associated with flu symptoms including fever, chills, vomiting and headache. Tissue damage associated

with pathogenic flu virus infection can ultimately result in death. The inflammatory cascade triggered by H5N1

has been called a 'cytokine storm' by some, because of what seems to be a positive feedback process of damage to

the body resulting from immune system stimulation. H5N1 type flu virus induces higher levels of cytokines than

the more common flu virus types such as H1N1.


PREVENTION

Vaccines

A new vaccine is formulated annually with the types and strains of influenza predicted to be the major problems

for that year (predictions are based on worldwide monitoring of influenza). The vaccine is multivalent and the

current one targets two strains of influenza A and one of influenza B. The vaccine given to adults at present is an

inactivated preparation of egg-grown virus. It is contraindicated for those with allergies to eggs. It has a short

lived protective effect and so is usually given in the fall (figure 11) so that protection is high in December/January

- the usual peak months for flu in the northern hemisphere. It needs to be given every year since, besides the short

lived nature of  the protection, the most effective strains for the vaccine will change due to drift or shift. Only

certain formulations of the vaccine are approved for young children. Previously, a subunit vaccine was

recommended.

In 2003, a live, attenuated (much less pathogenic than wild-type virus) vaccine (marketed as FluMist) was

approved for use in the United States. It is only approved for healthy individuals (those not at risk for

complications from influenza infection) from five to forty nine years of age. It is given nasally and should provide

mucosal, humoral and cell-mediated immunity. In this vaccine, the vaccine virus is a cold-adapted strain which

can grow in the upper respiratory tract where it is cooler, but grows poorly in the lower respiratory tract. It is

attenuated due to multiple changes in the various genome segments. Reassortment is used to generate viruses

which have six gene segments from the attenuated virus and the HA and NA coding segments from the virus

which is likely to be a problem in the up-coming influenza season. A reassortant is generated for each strain

expected to be a problem. Since this is a live vaccine, given intranasally as a spray, it generates an IgA response

and an IgM/G response. FluMist vaccine virus is also grown on eggs and so is contraindicated for people with an

egg allergy. Since this is a live viral vaccine, it is also contraindicated for children and young adolescents on any

therapy containing aspirin due to the potential risk of Reye's syndrome.

The CDC recommends: “Physicians should administer influenza vaccine to any person who wishes to reduce the

likelihood of becoming ill with influenza (the vaccine can be administered to children as young as 6 months).

Persons who provide essential community services should be considered for vaccination to minimize disruption of

essential activities during influenza outbreaks. Students or other persons in institutional settings (e.g., those who

reside in dormitories) should be encouraged to receive vaccine to minimize the disruption of routine activities

during epidemics.”

Chemotherapy

Rimantadine and amantadine block virus entry across the endosome and also interfere with virus release (see the

anti-viral chemotherapy section). They are good prophylactic agents for influenza A, but there are some problems in

taking them on a long term basis. They may be given as protective agents during an outbreak, especially to those

at severe risk and key personnel. They may also be given at the time of vaccination for a few weeks, until the

humoral response has time to develop. (There is some evidence that these drugs can help prevent more serious

complications if given early in infection.)


Two neuraminidase inhibitors have recently been approved by the FDA (zanamivir [Relenza] and oseltamivir).

They are active against influenza A and influenza B. These drugs can reduce the duration of uncomplicated

influenza (by approximately 1 day). Oseltamivir is approved for prophylaxis as well as treatment. At the moment,

zanamivir is only approved for treatment but trials indicate it is probably as effective as oseltamivir in

prophylaxis.

As yet there are no clear data on the ability of any of these drugs to reduce serious complications when used to

treat influenza (as contrasted with when they are used prophylactically).

Treatment and prevention for humans

The best treatments are rest, liquids and anti-febrile agents (not aspirin in children or adolescents, since Reye's

syndrome is a potential problem). Be aware of complications and treat them appropriately. There is no highly effective

treatment for H5N1 flu, but oseltamivir (commercially marketed by Roche as Tamiflu) can sometimes inhibit the

influenza virus from spreading inside the user's body. This drug has become a focus for some governments and

organizations trying to be seen as making preparations for a possible H5N1 pandemic. On April 20, 2006, Roche

AG announced that a stockpile of three million treatment courses of Tamiflu is waiting at the disposal of the

World Health Organization to be used in case of a flu pandemic; separately Roche donated two million courses to

the WHO for use in developing nations that may be affected by such a pandemic but lack the ability to purchase

large quantities of the drug.

There are several H5N1 vaccines for several of the avian H5N1 varieties, but the continual mutation of H5N1

renders them of limited use to date: while vaccines can sometimes provide cross-protection against related flu

strains, the best protection would be from a vaccine specifically produced for any future pandemic flu virus strain.

Dr. Daniel Lucey, co-director of the Biohazardous Threats and Emerging Diseases graduate program at

Georgetown University, has made this point: "There is no H5N1 pandemic so there can be no pandemic

vaccine". However, "pre-pandemic vaccines" have been created, are being refined and tested, and do have some

promise both in furthering research and preparedness for the next pandemic. Vaccine manufacturing companies are

being encouraged to increase capacity so that if a pandemic vaccine is needed, facilities will be available for rapid

production of large amounts of a vaccine specific to a new pandemic strain.

Animal and laboratory studies suggest that Relenza (zanamivir), which is in the same class of drugs as Tamiflu, may also

be effective against H5N1. In a study performed on mice in 2000, "zanamivir was shown to be efficacious in

treating avian influenza viruses H9N2, H6N1, and H5N1 transmissible to mammals" (Leneva 2001). However,

another paper, de Jong 2005, suggested that zanamivir might not provide protection in humans from the current

avian strain of H5N1 if "systemic involvement of influenza infection is suspected - as has recently been suggested

by some reports on avian H5N1 influenza in humans." While no one knows whether zanamivir will be useful against a

pandemic strain of H5N1 that does not yet exist, it might be useful to stockpile zanamivir as well as oseltamivir in the event

of an H5N1 influenza pandemic. Neither oseltamivir nor zanamivir can currently be manufactured in quantities

that would be meaningful once efficient human transmission starts.

Phylogenetic analysis

Phylogenetic analysis tools are applied to reconstruct evolutionary trees at the molecular level.

Phylogenetic Trees: Presenting Evolutionary Relationships

Systematics describes the pattern of relationships among taxa and is intended to help us understand the history of all life. But

history is not something we can see—it has happened once and leaves only clues as to the actual events. Scientists use these clues

to build hypotheses, or models, of life's history. In phylogenetic studies, the most convenient way of visually presenting

evolutionary relationships among a group of organisms is through illustrations called phylogenetic trees.

 

Node: represents a taxonomic unit. This can be either an existing species or an ancestor.

Branch: defines the relationship between the taxa in terms of descent and ancestry.

Topology: the branching patterns of the tree.

Branch length: represents the number of changes that have occurred in the branch.

Root: the common ancestor of all taxa.

Distance scale: scale that represents the number of differences between organisms or sequences.

Clade: a group of two or more taxa or DNA sequences that includes both their common ancestor and all of their descendants.

Operational Taxonomic Unit (OTU): taxonomic level of sampling selected by the user to be used in a study, such as individuals, populations, species, genera, or bacterial strains.

 

A phylogenetic tree is composed of nodes, each representing a taxonomic unit (species, populations, individuals), and branches,

which define the relationship between the taxonomic units in terms of descent and ancestry. Only one branch can connect any

two adjacent nodes. The branching pattern of the tree is called the topology, and the branch length usually represents the number

of changes that have occurred in the branch. This is called a scaled branch. Scaled trees are often calibrated to represent the

passage of time. Such trees have a theoretical basis in the particular gene or genes under analysis. Branches can also be unscaled,

which means that the branch length is not proportional to the number of changes that has occurred, although the actual number

may be indicated numerically somewhere on the branch. Phylogenetic trees may also be either rooted or unrooted. In rooted

trees, there is a particular node, called the root, representing a common ancestor, from which a unique path leads to any other

node. An unrooted tree only specifies the relationship among species,

without identifying a common ancestor, or evolutionary path.

Figure 1. Possible ways of drawing a tree.

Phylogenetic trees, a convenient way of representing evolutionary relationships among a group of organisms, can be drawn in

various ways. Branches on phylogenetic trees may be scaled (top panel) representing the amount of evolutionary change, time, or

both, when there is a molecular clock, or they may be unscaled (middle panel) and have no direct correspondence with either time

or amount of evolutionary change. Phylogenetic trees may be rooted (top and middle panels) or unrooted (bottom panels). In the

case of unrooted trees, branching relationships between taxa are specified by the way they are connected to each other, but the

position of the common ancestor is not. For example, on an unrooted tree with four species, there are five branches (four external,

one internal) on which the tree can be rooted. Rooting on each of the five branches has different implications for evolutionary

relationships.
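The number of possible root positions can be checked with a short calculation. The sketch below is only an illustration and assumes a fully resolved (bifurcating) unrooted tree, for which the number of branches is 2n - 3 for n taxa; the function name is hypothetical.

```python
def unrooted_branch_counts(n_taxa):
    """Return (external, internal, total) branch counts for a fully
    resolved (bifurcating) unrooted tree with n_taxa tips."""
    if n_taxa < 3:
        raise ValueError("a bifurcating unrooted tree needs at least 3 taxa")
    external = n_taxa        # one terminal branch per taxon
    internal = n_taxa - 3    # branches joining two internal nodes
    return external, internal, external + internal


for n in (4, 5):
    ext, inner, total = unrooted_branch_counts(n)
    # Every branch is a possible root position, so `total` is also the
    # number of rooted trees obtainable from one unrooted tree.
    print(f"{n} taxa: {ext} external + {inner} internal = {total} branches")
```

Because the root can be placed on any branch, the total also counts the rooted trees that are compatible with a single unrooted topology.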

Methods of Phylogenetic Analysis

Two major groups of analyses exist to examine phylogenetic relationships: phenetic methods and cladistic methods. It is

important to note that phenetics and cladistics have had an uneasy relationship over the last 40 years or so. Most of today's

evolutionary biologists favor cladistics, although a strictly cladistic approach may result in counterintuitive results.

Phenetic Method of Analysis

Phenetics, also known as numerical taxonomy, involves the use of various measures of overall similarity for the ranking of

species. There is no restriction on the number or type of characters (data) that can be used, although all data must be first

converted to a numerical value, without any character "weighting". Each organism is then compared with every other for all

characters measured, and the number of similarities (or differences) is calculated. The organisms are then clustered in such a way

that the most similar are grouped close together and the more different ones are linked more distantly. The taxonomic clusters,

called phenograms, that result from such an analysis do not necessarily reflect genetic similarity or evolutionary relatedness. The

lack of evolutionary significance in phenetics has meant that this system has had little impact on animal classification, and as a

consequence, interest in and use of phenetics has been declining in recent years.
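As a rough illustration of the phenetic procedure, the sketch below compares every organism with every other over a set of unweighted characters and reports the most similar pair. The character matrix and taxon names are invented for the example, and the simple matching coefficient is just one of several similarity measures that could be used.

```python
from itertools import combinations

# Hypothetical character matrix: 1 = character present, 0 = absent.
characters = {
    "taxon_A": [1, 1, 0, 1, 0],
    "taxon_B": [1, 1, 0, 0, 0],
    "taxon_C": [0, 0, 1, 1, 1],
}


def simple_matching(a, b):
    """Fraction of characters in which two organisms share the same state."""
    if len(a) != len(b):
        raise ValueError("character lists must have equal length")
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / len(a)


# Compare each organism with every other one (no character weighting).
similarity = {
    (p, q): simple_matching(characters[p], characters[q])
    for p, q in combinations(sorted(characters), 2)
}

# The most similar pair would be clustered first in a phenogram.
closest_pair = max(similarity, key=similarity.get)
print(closest_pair, similarity[closest_pair])
```

A full phenetic analysis would continue by merging the closest pair and repeating the comparison until all organisms are clustered.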

Cladistic Method of Analysis

An alternative approach to diagramming relationships between taxa is called cladistics. The basic assumption behind cladistics is

that members of a group share a common evolutionary history. Thus, they are more closely related to one another than they are to

other groups of organisms. Related groups of organisms are recognized because they share a set of unique features (apomorphies)

that were not present in distant ancestors but which are shared by most or all of the organisms within the group. These shared

derived characteristics are called synapomorphies. Therefore, in contrast to phenetics, cladistic groupings do not depend on

whether organisms share physical traits but depend on their evolutionary relationships. Indeed, in cladistic analyses two organisms

may share numerous characteristics but still be considered members of different groups.

Cladistic analysis entails a number of assumptions. For example, species are assumed to arise primarily by bifurcation, or

separation, of the ancestral lineage; species are often considered to become extinct upon hybridization (crossbreeding); and

hybridization is assumed to be rare or absent. In addition, cladistic groupings must possess the following characteristics: all

species in a grouping must share a common ancestor and all species derived from a common ancestor must be included in the

taxon. The application of these requirements results in the following terms being used to describe the different ways in which

groupings can be made:

A monophyletic grouping is one in which all species share a common ancestor, and all species derived from that

common ancestor are included. This is the only form of grouping accepted as valid by cladists.

A paraphyletic grouping is one in which all species share a common ancestor, but not all species derived from that

common ancestor are included.

A polyphyletic grouping is one in which species that do not share an immediate common ancestor are lumped together,

while excluding other members that would link them.
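The definition of a monophyletic grouping can be turned into a simple test on a rooted tree. The sketch below is an illustration only: it assumes the tree is written as nested Python tuples with taxon names as strings, and the tree and taxon names are invented.

```python
def leaves(tree):
    """Return the set of taxon names found under a nested-tuple tree."""
    if isinstance(tree, str):
        return {tree}
    found = set()
    for child in tree:
        found |= leaves(child)
    return found


def is_monophyletic(tree, group):
    """True if some node of the rooted tree has exactly `group` as its
    descendants, i.e. an ancestor together with all of its descendants."""
    group = set(group)
    if leaves(tree) == group:
        return True
    if isinstance(tree, str):
        return False
    return any(is_monophyletic(child, group) for child in tree)


# Hypothetical rooted tree: ((A,B),(C,(D,E)))
tree = (("A", "B"), ("C", ("D", "E")))

print(is_monophyletic(tree, {"C", "D", "E"}))  # ancestor and all descendants
print(is_monophyletic(tree, {"C", "D"}))       # leaves out descendant E
print(is_monophyletic(tree, {"A", "D"}))       # linking taxa are excluded
```

In this example the second group is paraphyletic and the third is polyphyletic, so only the first call reports a valid cladistic grouping.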

The Origins of Molecular Phylogenetics

Macromolecular data, meaning gene (DNA) and protein sequences, are accumulating at an increasing rate because of recent

advances in molecular biology. For the evolutionary biologist, the rapid accumulation of sequence data from whole genomes has

been a major advance, because the very nature of DNA allows it to be used as a "document" of evolutionary history. Comparisons

of the DNA sequences of various genes between different organisms can tell a scientist a lot about the relationships of organisms

that cannot otherwise be inferred from morphology, or an organism's outer form and inner structure. Because genomes evolve by

the gradual accumulation of mutations, the amount of nucleotide sequence difference between a pair of genomes from different

organisms should indicate how recently those two genomes shared a common ancestor. Two genomes that diverged in the recent

past should have fewer differences than two genomes whose common ancestor is more ancient. Therefore, by comparing different

genomes with each other, it should be possible to derive evolutionary relationships between them, the major objective of

molecular phylogenetics.
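The idea that sequence difference reflects time since divergence can be illustrated by counting differences between aligned sequences. The sketch below computes the proportion of differing sites (often called the p-distance); the three short sequences are invented for the example and are not taken from the data analysed in this report.

```python
def p_distance(seq1, seq2):
    """Proportion of aligned sites, ignoring gapped positions, at which
    two sequences differ."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a == "-" or b == "-":
            continue  # skip positions where either sequence has a gap
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


# Hypothetical aligned sequences.
s1 = "ACGTACGTAC"
s2 = "ACGTACGTAT"   # 1 difference from s1
s3 = "ACTTACGAAT"   # 3 differences from s1

print(p_distance(s1, s2), p_distance(s1, s3))
```

Under the reasoning above, s1 and s2 would be taken to share a more recent common ancestor than s1 and s3.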

Molecular phylogenetics attempts to determine the rates and patterns of change occurring in DNA and proteins and to reconstruct

the evolutionary history of genes and organisms. Two general approaches may be taken to obtain this information. In the first

approach, scientists use DNA to study the evolution of an organism. In the second approach, different organisms are used to study

the evolution of DNA. Whatever the approach, the general goal is to infer process from pattern: the processes of organismal

evolution deduced from patterns of DNA variation and processes of molecular evolution inferred from the patterns of variations in

the DNA itself.

 

Molecular Phylogenetic Analysis: Fundamental Elements

As we just discussed, macromolecules, especially gene and protein sequences, have surpassed morphological and other organismal

characters as the most popular forms of data for phylogenetic analyses. Therefore, this next section will concentrate only on

molecular data.

It is important to point out that a single, all-purpose recipe does not exist for phylogenetic analysis of molecular data. Although

numerous algorithms, procedures, and computer programs have been developed, their reliability and practicality are, in all cases,

dependent upon the size and structure of the dataset under analysis. The merits and shortfalls of these various methods are subject

to much scientific debate, because the danger of generating incorrect results is greater in computational molecular phylogenetics

than in many other fields of science. Occasionally, the limiting factor in such analyses is not so much the computational method

used, but the users' understanding of what the method is actually doing with the data. Therefore, the goal of this section is to

demonstrate to the reader that practical analysis should be thought of both as a search for a correct model (analysis) as well as a

search for the correct tree (outcome).

Phylogenetic tree-building methods presume particular evolutionary models. For any given set of data, these models may be

violated because of various occurrences, such as the transfer of genetic material between organisms. Therefore, when interpreting

a given analysis, a person should always consider the model used and entertain possible explanations for the results obtained. For

example, models used in molecular phylogenetic analysis methods make "default" assumptions, including:

The sequence is correct and originates from the specified source.

The sequences are homologous—all descended in some way from a shared ancestral sequence.

Each position in a sequence alignment is homologous with every other in that alignment.

Each of the multiple sequences included in a common analysis has a common phylogenetic history with the other

sequences.

The sampling of taxa is adequate to resolve the problem under study.

Sequence variation among the samples is representative of the broader group.

The sequence variability in the sample contains phylogenetic signal adequate to resolve the problem under study.

 

A straightforward phylogenetic analysis consists of four steps:

1. Alignment—building the data model and extracting a dataset.

2. Determining the substitution model—consider sequence variation.

3. Tree building.

4. Tree evaluation.
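Step 2 can be illustrated with one of the simplest substitution models. The sketch below applies the Jukes-Cantor correction, d = -(3/4) ln(1 - 4p/3), which converts an observed proportion of differing sites p into an estimated number of substitutions per site. It is offered only as an example of what a substitution model does; it is not necessarily the model used in the analyses of this report.

```python
import math


def jukes_cantor(p):
    """Jukes-Cantor corrected distance for an observed proportion p of
    differing nucleotide sites (valid for 0 <= p < 0.75)."""
    if not 0 <= p < 0.75:
        raise ValueError("p must satisfy 0 <= p < 0.75")
    return -0.75 * math.log(1 - 4 * p / 3)


for p in (0.05, 0.30, 0.60):
    # The correction grows with p because multiple substitutions at the
    # same site hide part of the real change.
    print(f"observed {p:.2f} -> corrected {jukes_cantor(p):.3f}")
```

The corrected distance is always at least as large as the observed one, and the gap widens as the sequences become more divergent.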

Introduction to Homology modelling

One method that can be applied to generate a reasonable model of a protein's structure is homology modelling. This procedure is also

termed comparative modelling or knowledge-based modelling.

Why homology modelling is useful

Homology models are useful for getting a rough idea of where the alpha carbon of each residue sits in the folded protein. They

can guide hypotheses about structure-function relationships. Homology models are unreliable in predicting the

conformation of insertions or deletions. They are also unlikely to be useful for ligand docking or

drug design unless the sequence identity with the template is > 70%, and even then they are less reliable than an

empirical crystallographic or NMR structure.

Aim of Comparative Modelling

The aim of comparative modelling, or homology protein structure modelling, is to build a 3D model for a protein of

unknown structure (the target) based on one or more related proteins of known structure.

Introduction to Docking

Docking studies are molecular modeling studies aiming at finding a proper fit between a ligand and its binding site.

There are two classes of protein docking:

1) Protein-protein docking

2) Protein receptor-ligand docking

Protein-Protein Docking interactions

Protein-protein interactions occur between two proteins that are similar in size. The interface between the two

molecules tends to be flatter and smoother than those in protein-ligand interactions. Protein-protein interactions are

usually more rigid; the interfaces of these interactions do not have the ability to alter their conformation in order

to improve binding and ease movement. Conformational changes are limited by steric constraint and thus are said

to be rigid.

Fig: Protein-Protein docking.

Protein Receptor–Ligand docking

Protein receptor-ligand motifs fit together tightly, and are often referred to as a lock and key mechanism. There is

both high specificity and induced fit within these interfaces with specificity increasing with rigidity. Protein

receptor-ligand can either have a rigid ligand and a flexible receptor, or a flexible ligand with a rigid receptor.

Fig: Protein Ligand-Receptor Docking

Rigid Ligand with a Flexible Receptor

The native structure of a rigid-ligand, flexible-receptor complex often maximizes the interface area between the

molecules. They move with respect to one another in a direction perpendicular to the interface. This

allows for binding of a receptor with a larger than usual ligand. Normally, when there is ligand overlap in the

docking interface, energy penalties are incurred. If the van der Waals forces can be decreased, energy loss in the system

will be minimized. This can be accomplished by allowing flexibility in the receptor. Flexible receptors allow

for docking of a larger ligand than would be allowed for with a rigid receptor.

Flexible Ligand with a Rigid Receptor

When the fit between the ligand and receptor does not need to be induced, the receptor can retain its rigidity while

maintaining the free energy of the system. For successful docking, the parameters of the ligand need to be

maintained and the ligand must be slightly smaller in size than that of the receptor interface. No docking is

completely rigid though; there is intrinsic movement which allows for small conformational adaptation for ligand

binding. When the six degrees of freedom for protein movement are taken into consideration (three rotational,

three translational), the amount of inherent flexibility allowed the receptor is even greater. This further offsets any

energy penalty between the receptor and ligand, allowing for easier, more energetically favorable binding between

the two.

Aim of docking

The aim of docking is to identify new drug targets, which will open new vistas for further drug development. The

findings of our docking study will be useful in the search for a cure for the infectious disease bird flu, and will also open new

avenues for finding other possible drug targets in influenza A virus. The docking results can be used to design

new lead compounds and hence can aid the drug discovery process.

Receptor

A structure on the surface of the cell that serves as a recognition or binding site for antigens, antibodies or other

cellular or immunological components. It is a molecule within a cell or on the cell surface to which a substance (such as

a hormone or a drug) selectively binds, causing a change in the activity of the cell.

Ligand

The molecule that binds to a protein molecule (e.g., a receptor). As a ligand binds through the interaction of many

weak, noncovalent bonds formed to the binding site of a protein, the tight binding of a ligand depends upon a

precise fit to the surface-exposed amino acid residues on the protein.

Active Site

The active site of a protein/enzyme is the region that binds the substrates (and the cofactor, if any). It also contains

the residues that directly participate in the making and breaking of bonds. These residues are called the catalytic

groups. In essence, the interaction of the enzyme and substrate at the active site promotes the formation of the

transition state. The active site is the region of the enzyme that most directly lowers the activation free energy (ΔG‡) of the reaction,

which results in the rate enhancement characteristic of enzyme action.

Amino acids in protein active sites:

It is difficult to generalize which amino acids are likely to be in a protein active/functional site as this greatly

depends on the type of function. With that in mind, below are preferences for the 20 amino acids to lie within

functional regions on proteins. These were worked out by considering how often particular amino acids were in

contact with bound non-protein atoms in protein three-dimensional structures. Positive values mean that the amino

acid makes more contacts than one would expect by chance; negative values mean that it makes fewer. The below

does not include protein-protein, or protein-peptide interactions, where many of the amino acids with negative

values (e.g. tryptophan or proline) can play critical roles.

His  0.360    Tyr -0.040    Asp  0.045    Gly -0.070
Trp -0.140    Met  0.025    Val -0.060    Asn  0.080
Leu -0.180    Phe -0.120    Gln  0.050    Cys  0.210
Ile -0.005    Ala  0.025    Glu  0.050    Arg  0.055
Pro -0.200    Lys  0.100    Thr  0.100    Ser  0.130
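The preference values listed above can be used to compare candidate sets of residues. The sketch below stores the values in a dictionary and averages them over a list of residues. Averaging is an assumption made for this illustration rather than a method described in the source of the table, and the two example residue lists are invented.

```python
# Preference values for the 20 amino acids, as listed in the text.
PROPENSITY = {
    "His": 0.360, "Tyr": -0.040, "Asp": 0.045, "Gly": -0.070,
    "Trp": -0.140, "Met": 0.025, "Val": -0.060, "Asn": 0.080,
    "Leu": -0.180, "Phe": -0.120, "Gln": 0.050, "Cys": 0.210,
    "Ile": -0.005, "Ala": 0.025, "Glu": 0.050, "Arg": 0.055,
    "Pro": -0.200, "Lys": 0.100, "Thr": 0.100, "Ser": 0.130,
}


def mean_propensity(residues):
    """Average preference value of a list of three-letter residue codes.
    (Illustrative score only; averaging is an assumption of this sketch.)"""
    if not residues:
        raise ValueError("need at least one residue")
    return sum(PROPENSITY[r] for r in residues) / len(residues)


# Hypothetical candidate patches on a protein surface.
patch_1 = ["His", "Cys", "Ser"]
patch_2 = ["Leu", "Pro", "Trp"]

print(mean_propensity(patch_1), mean_propensity(patch_2))
```

A higher average only indicates that the residues are, taken individually, found in contact with bound non-protein atoms more often than expected by chance; it does not identify an active site on its own.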

Neuraminidase

Neuraminidase ribbon diagram

Neuraminidase is an antigenic glycoprotein enzyme (EC 3.2.1.18) found on the surface of the influenza virus.

Subtypes

Nine neuraminidase subtypes are known; many occur only in various species of duck and chicken. Subtypes N1 and N2 have been positively linked to epidemics in man, and strains with N3 or N7 subtypes have been identified in a number of isolated deaths.

Structure

The neuraminidase enzyme exists as a mushroom-shape projection on the surface of the influenza virus. It has a head consisting of four co-planar and roughly spherical subunits, and a hydrophobic region that is embedded within the interior of the virus' membrane. It is comprised of a single polypeptide chain that is oriented in the opposite direction to the hemagglutinin antigen. The composition of the polypeptide is a single chain of six conserved polar amino acids, followed by hydrophilic, variable amino acids.

Function

Neuraminidase has functions that aid in the efficiency of virus release from cells. Neuraminidase cleaves terminal sialic acid residues from carbohydrate moieties on the surfaces of infected cells. This promotes the release of progeny viruses from infected cells. Neuraminidase also cleaves sialic acid residues from viral proteins, preventing aggregation of viruses. Administration of chemical inhibitors of neuraminidase is a treatment that limits the severity and spread of viral infections.

Neuraminidase is also a virulence factor for the bacterium Bacteroides fragilis.

Ideally, influenza virus neuraminidase (NA) should act on the same type of virus receptor that the virus hemagglutinin (HA) binds to. This is not always so. It is not quite clear how the virus manages to function if there is no close match between the specificities of NA and HA.

Neuraminidase inhibitors

Inhibitors are used for combating the virus. They are zanamivir and oseltamivir.

Neuraminidase inhibitors are a class of antiviral drugs whose mode of action relies on blocking the function of viral neuraminidase protein, thus preventing the virus from budding from the host cell.

Oseltamivir, Zanamivir and Peramivir belong to this class.

Unlike the M2 inhibitors, which work only against influenza A, neuraminidase inhibitors act against both influenza A and B.

Chapter 2

MATERIALS AND METHODS

Materials and methods

Influenza virus belongs to the Orthomyxoviridae family. It is a special kind of virus whose genome sequence is

available as separate segments (8 in total) rather than as a single molecule. The influenza A virus genome is contained on 8 single,

non-paired RNA strands that code for 10 proteins. The segmented nature of the genome allows for the exchange of

entire genes between different viral strains when they cohabit the same cell.

For our analysis we take three different types of sequences. First, we take gene sequences of five

different strains (i.e. H5N1, H2N2, H1N1, H9N2, and H3N2), available as separate segments, collected from NCBI

(www.ncbi.nlm.nih.gov); we get 41 such gene sequences. We also take genome sequences of these

strains and protein sequences, alongside the gene sequences.

We collect gene and protein sequences from the influenza virus resources available on the NCBI website

(www.ncbi.nlm.nih.gov). We get around 40 such nucleotide sequences and around 60 such protein sequences.

We take these three types of sequences because each is informative in its own way.

After collecting these sequences from their repositories, we proceed with further analysis.

Phylogenetic analysis:

There are four steps for phylogenetic analysis:

Sequence alignment

Determining the substitution model

Tree building

Tree evaluation

Multiple Sequence Alignment

The first step following data retrieval is the execution of a multiple sequence alignment, obtained via

CLUSTALW (progressive alignment method). The purpose of this step is to place the most closely related

sequences in the user's data set together prior to initiating tree construction. PHYLIP uses the patterns gleaned

from the multiple sequence alignment when building phylogenies.

 2. Phylogenetic Method

Analyses in the present work are carried out according to the distance method. Four programs within PHYLIP are

employed here: SEQBOOT, DNADIST, NEIGHBOR and CONSENSE.

[A] Once multiple alignment has been completed, the data set is transmitted to SEQBOOT. SEQBOOT generates

multiple resampled versions of the alignment (bootstrap replicates) by drawing alignment columns at random with replacement.
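The resampling idea behind this step can be sketched in a few lines. The code below is an illustration of bootstrap resampling of alignment columns, not SEQBOOT itself; the toy alignment and the function name are invented for the example.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Return one bootstrap replicate: columns of the alignment drawn at
    random, with replacement, until the original length is reached."""
    names = list(alignment)
    length = len(alignment[names[0]])
    if any(len(alignment[name]) != length for name in names):
        raise ValueError("all sequences must have the same aligned length")
    columns = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(alignment[name][c] for c in columns) for name in names}


# Hypothetical aligned sequences.
toy_alignment = {
    "seq1": "ACGTACGT",
    "seq2": "ACGTTCGT",
    "seq3": "ACCTTCGA",
}

rng = random.Random(0)  # fixed seed so the sketch is repeatable
replicates = [bootstrap_alignment(toy_alignment, rng) for _ in range(3)]
for rep in replicates:
    print(rep)
```

Each replicate keeps the columns intact (the same sites are drawn for every sequence), so distances computed from a replicate remain comparable with those of the original alignment.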

[B] DNADIST reads in the data from SEQBOOT and computes a distance score for each pair of nucleotide sequences. This step is

most critical, since no subsequent analysis can be made without a measure of sequence divergence or similarity.

(For protein sequences, the corresponding PHYLIP program PROTDIST can compute distance scores using a Dayhoff PAM matrix.) A distance score

reflects the number of single-residue alterations required in order to generate one sequence from a second

sequence.

[C] NEIGHBOR implements the Neighbor-joining method (Saitou and Nei 1987) to determine the most

reasonable positioning of branches. At each step, the two sequences that are closest once their distance is corrected for each sequence's average distance to all the others are joined as "neighbors"

and will share a node below them (or to their left) in the final tree.

Alternatively, if the user specifies a rooted tree, then NEIGHBOR implements another algorithm, the

unweighted pair group method with arithmetic mean (UPGMA). The UPGMA algorithm assumes a

molecular clock and generates rooted trees.
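The first joining step of the neighbor-joining method can be sketched as follows. The code computes the usual criterion Q(i, j) = (n - 2) d(i, j) - sum of d(i, k) - sum of d(j, k) and picks the pair with the lowest value. It is an illustration only, not the NEIGHBOR program, and the five-taxon distance matrix is invented for the example.

```python
from itertools import combinations


def nj_first_pair(names, dist):
    """Return the pair joined first by neighbor joining and its Q value.
    `dist` maps frozenset({i, j}) to the distance between i and j."""
    n = len(names)
    # Sum of distances from each taxon to all the others.
    total = {
        i: sum(dist[frozenset((i, k))] for k in names if k != i)
        for i in names
    }
    q = {
        (i, j): (n - 2) * dist[frozenset((i, j))] - total[i] - total[j]
        for i, j in combinations(names, 2)
    }
    best = min(q, key=q.get)
    return best, q[best]


# Hypothetical distance matrix for five taxa.
raw = {
    ("a", "b"): 5, ("a", "c"): 9, ("a", "d"): 9, ("a", "e"): 8,
    ("b", "c"): 10, ("b", "d"): 10, ("b", "e"): 9,
    ("c", "d"): 8, ("c", "e"): 7,
    ("d", "e"): 3,
}
dist = {frozenset(pair): d for pair, d in raw.items()}

pair, q_value = nj_first_pair(["a", "b", "c", "d", "e"], dist)
print(pair, q_value)
```

In this example taxa d and e have the smallest raw distance, yet a and b are joined first, because the criterion also takes into account how far each taxon is from all the others.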

[D] Then the branch ordering data is passed to CONSENSE, which builds a consensus tree from the trees obtained for the resampled data sets. Any phylogenetic

method renders the most likely tree, i.e., those relationships that are most reasonable given the sequence

alignments. As such, any single tree is only one of many possible trees that could have arisen over evolutionary

time. Resampling methods, therefore, are designed to find the most probable tree among the many possible

evolutionary paths that could have generated a given set of proteins.
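How resampling leads to a support value can be sketched by counting how often a grouping appears among the trees built from the resampled data sets. The code below is a simplified illustration, not CONSENSE: it assumes rooted trees written as nested tuples, and the four replicate trees are invented.

```python
def clades(tree):
    """Return (leaf set of this subtree, set of all clades below it)."""
    if isinstance(tree, str):
        return frozenset([tree]), set()
    leaf_set = frozenset()
    found = set()
    for child in tree:
        child_leaves, child_clades = clades(child)
        leaf_set = leaf_set | child_leaves
        found |= child_clades
    found.add(leaf_set)  # the group formed by this internal node
    return leaf_set, found


def support(replicate_trees, group):
    """Fraction of replicate trees that contain `group` as a clade."""
    group = frozenset(group)
    hits = sum(1 for tree in replicate_trees if group in clades(tree)[1])
    return hits / len(replicate_trees)


# Hypothetical trees from four bootstrap replicates.
replicate_trees = [
    (("A", "B"), ("C", "D")),
    (("A", "B"), ("C", "D")),
    (("A", "C"), ("B", "D")),
    ((("A", "B"), "C"), "D"),
]

print(support(replicate_trees, {"A", "B"}))
print(support(replicate_trees, {"C", "D"}))
```

Groupings found in most replicates are the ones retained in a majority-rule consensus tree, and the fraction is what is reported as the bootstrap value at a node.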

[E] Lastly we draw the tree using NJplot. NJplot is a tree drawing program able to draw any phylogenetic tree

expressed in the Newick phylogenetic tree format (e.g., the format used by the PHYLIP package). NJplot is

especially convenient for rooting the unrooted trees obtained from parsimony, distance or maximum likelihood

tree-building methods. The trees were drawn as unrooted trees.

Family analysis

Next we carry out family analysis. Here we use the GENSCAN tool to find motifs,

exons and introns in our genome sequences. To find the open reading frames (ORFs), we use the getorf program of EMBOSS.
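What an ORF search looks for can be shown with a small sketch. The code below scans the three forward reading frames for stretches that start with ATG and end at the first in-frame stop codon. It is a simplified illustration rather than the getorf algorithm (it ignores the reverse strand, for instance), and the test sequence and the `min_codons` parameter are invented for the example.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=3):
    """Return (start, end, frame) for ATG...stop ORFs on the forward
    strand; `end` is the index just past the stop codon."""
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for pos in range(frame, len(seq) - 2, 3):
            codon = seq[pos:pos + 3]
            if start is None and codon == "ATG":
                start = pos  # first start codon in this frame
            elif start is not None and codon in STOP_CODONS:
                if (pos - start) // 3 >= min_codons:
                    orfs.append((start, pos + 3, frame))
                start = None  # look for the next ORF in this frame
    return orfs


# Hypothetical sequence containing one short ORF.
sequence = "CCATGAAATTTGGGTAACC"
for start, end, frame in find_orfs(sequence):
    print(frame, start, end, sequence[start:end])
```

On real genome segments a much larger minimum length would normally be required so that short chance ORFs are not reported.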

Modelling

Taking protein sequences of the H5N1 strain of influenza A virus from NCBI, we perform homology modelling of

these protein sequences using the SWISS-MODEL server.

As a first step, we run a BLAST search of these sequences against the PDB to find an appropriate template for our

sequences. We get many hits; among them we select the best template following some criteria.

Then we carry out modelling through the SWISS-MODEL server.

Then we visualize the structure modelled by the SWISS-MODEL server in Swiss-PdbViewer. After that,

our next step is docking of the neuraminidase protein.

Docking

Finally, we take the neuraminidase protein of avian influenza virus; this protein is one of the reasons for its

pathogenicity. To perform docking we use the HEX software. It automatically searches for the site

where our ligand fits best.

To perform the docking, we identify the ligand and receptor of our protein using several receptor and ligand

finding tools such as PDBsum, SUMO, CSA, and the Jena Library.

Chapter 3

RESULT AND DISCUSSION

Result and discussion

In the current phylogenetic analysis, the trees were constructed using the neighbor-joining method and were

represented as unrooted trees. The bootstrap values at the nodes, representing the robustness of the trees, were also

satisfactory. We determine the branch lengths and distances of the genes.

The general nature of the tree and the relative distances of different strains from common ancestors are analyzed.

Bootstrapping is used to evaluate the reliability of a phylogenetic tree. In a bootstrapped tree, a value is shown

at each node. These values indicate how strongly the data support each node. The scale bar shows

the number of substitutions per residue.

Table 1:

[A] Genes showing higher branch length in H5N1

GENE NAME       BRANCH LENGTH   PRESENT IN STRAINS
PA              2.23            H5N1
NS1,NS2         1.56            H5N1

[B] Genes showing higher branch length in H9N2

GENE NAME       BRANCH LENGTH   PRESENT IN STRAINS
NP              1.767           H9N2
PB2             1.665           H9N2

[C] Genes showing higher branch length in H3N2

GENE NAME       BRANCH LENGTH   PRESENT IN STRAINS
HA              2.03            H3N2
M1,M2           2.45            H3N2

[D] Genes showing higher branch length in H1N1

GENE NAME       BRANCH LENGTH   PRESENT IN STRAINS
NA              2.206           H1N1
PB1,PB1-F2      1.858           H1N1

Our analysis of the gene sequences shows that the same genes (PA, HA, PB1, PB1-F2, NS, PB2, M1, M2, NP and

NA) are present in all strains. This suggests that H5N1, H2N2, H9N2, H3N2 and H1N1 evolved from the same

common ancestor at the same rate.

The genes PA, NS1 and NS2 remain more conserved in H9N2, H2N2, H3N2 and H1N1 than in H5N1 [Table 1(A)].

The genes NP and PB2 remain more conserved in H5N1, H2N2, H3N2 and H1N1 than in H9N2 [Table 1(B)].

In contrast, for the genes HA, M1 and M2, the H3N2 strain appears to diverge more from the common ancestor than

H1N1, H2N2, H5N1 and H9N2 [Table 1(C)].

The genes PB1 and PB1-F2 are more highly conserved in H3N2, H2N2, H9N2 and H5N1 than in H1N1 [Table 1(D)].

Therefore, from this observation it might be concluded that in the course of evolution, the genes underwent

suitable modifications in strains H1N1, H9N2, H5N1 and H3N2 as compared to H2N2. This suggests that H2N2 has less

pandemic potential than the others, which are the main causes of pandemic bird flu nowadays.

Overall, our current analysis indicates that these strains have diverged considerably from the common ancestor in

the course of evolution. This drift appears more prominent as the strains adopt better survival strategies.

Outputs

ClustalW output

 ClustalW   Results

Results of search

Number of sequences 41

Alignment score 1714686

Sequence format Pearson

Sequence type nt

ClustalW version 1.83

Output file clustalw-20060728-05490446.output

Alignment file clustalw-20060728-05490446.aln

Guide tree file clustalw-20060728-05490446.dnd

Your input file clustalw-20060728-05490446.input
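The guide tree file (.dnd) listed above is written by ClustalW in Newick format, in which each leaf appears as name:branch_length. As a minimal sketch, assuming a well-formed Newick string with unquoted leaf names, the leaf branch lengths could be read as follows. The function name leaf_branch_lengths and the toy tree are illustrative only and are not taken from the actual output file.

```python
import re

# A leaf is a label (no brackets, commas, colons, semicolons or whitespace)
# followed by ':' and a possibly negative decimal number.
_LEAF = re.compile(r"([^(),:;\s]+):\s*(-?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?)")


def leaf_branch_lengths(newick):
    """Return [(leaf_name, branch_length), ...] in order of appearance."""
    return [(name, float(length)) for name, length in _LEAF.findall(newick)]


if __name__ == "__main__":
    toy_tree = "((A:0.12,B:0.30):0.05,C:-0.01);"
    for name, length in leaf_branch_lengths(toy_tree):
        print(name, length)
```

Internal branches, which follow a closing bracket rather than a label, are skipped by this pattern, so only the per-sequence lengths are returned.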

Alignment
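The alignment under this heading appears to be in an interleaved PHYLIP-style layout: a header giving the number of sequences and columns (41 2392), a first block in which each row starts with a name cut to 10 characters, and later blocks that carry residues only. A minimal sketch of a reader for that layout is given below. It assumes one row per line with blank lines between blocks, which the flattened text in this transcript no longer preserves. Because truncated names such as gi|7385295 repeat, rows are kept in lists by position rather than in a dictionary. The function names are hypothetical.

```python
def parse_interleaved_phylip(text):
    """Parse interleaved PHYLIP text into (names, sequences), both ordered by row."""
    lines = [ln.rstrip() for ln in text.splitlines() if ln.strip()]
    n_seq, n_col = (int(x) for x in lines[0].split())
    names, seqs = [], []
    for i, line in enumerate(lines[1:]):
        if i < n_seq:
            # First block: 10-character name field, then residues.
            names.append(line[:10].strip())
            seqs.append(line[10:].replace(" ", ""))
        else:
            # Later blocks: residues only, rows repeat in the same order.
            seqs[i % n_seq] += line.replace(" ", "")
    if any(len(s) != n_col for s in seqs):
        raise ValueError("row length does not match the header")
    return names, seqs


def percent_identity(a, b):
    """Percent identity over columns where neither sequence has a gap ('-')."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return 100.0 * sum(x == y for x, y in pairs) / len(pairs)


if __name__ == "__main__":
    toy = (
        "2 20\n"
        "seqA      AGCAAAAGCA\n"
        "seqB      AGCGAAAGCA\n"
        "\n"
        "GGTCAATTAT\n"
        "GGTCAATTAT\n"
    )
    names, seqs = parse_interleaved_phylip(toy)
    print(names, percent_identity(seqs[0], seqs[1]))
```

At 50 columns per row, the 2392 columns stated in the header would occupy 48 blocks of 41 rows each.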

41 2392

gi|7385295 ---------- ---------- ---------- -----AGCAA AAGCAGGTAC
gi|3214017 ---------- ---------- ---------- ---------A AAGCAGGTAC
gi|7391268 ---------- ---------- ---------- -----AGCAA AAGCAGGTAC
gi|8486136 ---------- ---------- ---------- -----AGCGA AAGCAGGTAC
gi|7391913 ---------- ---------- ---------- -----AGCAA AAGCAGGTAC
gi|3214015 ---------- ---------- ---------- ---------- ----------
gi|9316315 ---------- ---------- ---------- ---------- ----------


gi|7385295 ---------- ---------- ---------- ---------- ----------
gi|7392130 ---------- ---------- ---------- ---------- ----------
gi|7391914 ---------- ---------- ---------- ---------- ----------
gi|8486129 ---------- ---------- ---------- ---------- ----------
gi|7385294 AGCAAAAGCA GGTCAATTAT ATTCAATATG GAAAGAATAA AAGAACTAAG
gi|3214016 GCCAAAAGCA GGTCAATTAT ATTCAATATG GAAAGAATAA AAGAACTAAG
gi|7391882 AGCAAAAGCA GGTCAATTAT ATTCAATATG GAAAGAATAA AAGAACTACG
gi|7391905 AGCAAAAGCA GGTCAATTAT ATTCAGTATG GAAAGAATAA AAGAACTACG
gi|8486138 AGCGAAAGCA GGTCAATTAT ATTCAATATG GAAAGAATAA AAGAACTAAG
gi|7392156 ---------- ---------- ---------- ---------- ----------
gi|7391921 ---------- ---------- ---------- ---------- ----------
gi|8486131 ---------- ---------- ---------- ---------- ----------
gi|3214016 ---------- ---------- ---------- ---------- ----------
gi|7385294 ---------- ---------- ---------- ---------- ----------
gi|7385295 ---------- ---------- ---------- ---------- ----------
gi|3214142 ---------- ---------- ---------- ---------- ----------
gi|7391268 ---------- ---------- ---------- ---------- ----------
gi|7391915 ---------- ---------- ---------- ---------- ----------
gi|8486122 ---------- ---------- ---------- ---------- ----------
gi|7385295 ---------- ---------- ---------- ---------- ----------
gi|7391914 ---------- ---------- ---------- ---------- ----------
gi|8486125 ---------- ---------- ---------- ---------- ----------
gi|3214016 ---------- ---------- ---------- ---------- ----------
gi|7391920 ---------- ---------- ---------- ---------- ----------
gi|7392126 ---------- ---------- ---------- ---------- ----------
gi|8486127 ---------- ---------- ---------- ---------- ----------
gi|7392130 ---------- ---------- ---------- ---------- ----------
gi|7391913 ---------- ---------- ---------- ---------- ----------
gi|3214016 ---------- ---------- ---------- ---------- ----------
gi|7385294 ---------- ---------- ---------- ----AGCAAA AGCAGGCAAA
gi|3214016 ---------- ---------- ---------- -----GCAAA AGCAGGCAAA
gi|7391268 ---------- ---------- ---------- ----AGCAAA AGCAGGCAAA
gi|7391914 ---------- ---------- ---------- ----AGCAAA AGCAGGCAAA
gi|8486134 ---------- ---------- ---------- ----AGCGAA AGCAGGCAAA

TGATCCAAAA TGGAAGACTT TGTGCGACAA TGCTTCAATC CAATGATTGT TGATCCAAAA TGGAAGACTT TGTGCGACAG TGCTTCAATC CAATGATTGT TGATTCGAAA TGGAAGATTT TGTGCGACAA TGCTTCAATC CGATGATTGT TGATCCAAAA TGGAAGATTT TGTGCGACAA TGCTTCAATC CGATGATTGT TGATTCGAAA TGGAAGATTT TGTGCGACAA TGCTTCAACC CGATGATTGT ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- AGATCTAATG TCGCAGTCCC GCACTCGCGA GATACTAACA AAAACCACTG AAATTTGATG TCGCAATCTC GCACTCGCGA GATACTGACA AAAACCACTG GAATCTGATG TCGCAGTCTC GCACTCGCGA GATACTAACA AAAACCACAG GAACCTGATG TCGCAGTCTC GCACTCGCGA GATACTGACA AAAACCACAG AAATCTAATG TCGCAGTCTC GCACCCGCGA GATACTCACA AAAACCACCG ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- CCATTTGAAT 
GGATGTCAAT CCGACTTTAC TTTTCTTAAA AGTGCCAGCG CCATTTGAAT GGATGTCAAT CCGACTTTAC TTTTCTTAAA AGTGCCAGCG CCATTTGAAT GGATGTCAAT CCGACCTTAC TTTTCTTGAA AGTTCCAGCG CCATTTGAAT GGATGTCAAT CCGACTCTAC TGTTCCTAAA GGTTCCAGCG CCATTTGAAT GGATGTCAAT CCGACCTTAC TTTTCTTAAA AGTGCCAGCA

CGAGCTTGCG GAAAAGGCAA TGAAAGAATA TGGGGAAGAT CCGAAAATCG CGAGCTTGCG GAAAAGACAA TGAAGGAATA TGGGGAAGAC CCGAAAATTG CGAACTTGCG GAAAAGGCAA TGAAAGAGTA TGGAGAAGAT CTGAAAATCG CGAGCTTGCG GAAAAAACAA TGAAAGAGTA TGGGGAGGAC CTGAAAATCG CGAACTTGCA GAAAAAGCAA TGAAAGAGTA TGGAGAGGAT CTGAAAATTG ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ----------


---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- TGGATCATAT GGCCATAATC AAGAAATACA CATCAGGAAG ACAAGAGAAG TGGATCATAT GGCCATAATT AAGAAGTACA CATCAGGAAG ACAGGAGAAG TGGACCATAT GGCCATAATT AAGAAGTACA CATCAGGGAG ACAGGAAAAG TGGACCATAT GGCCATAATT AAGAAGTACA CATCGGGGAG ACAGGAAAAG TGGACCATAT GGCCATAATC AAGAAGTACA CATCAGGAAG ACAGGAGAAG ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- CAAAATGCTA TAAGTACCAC ATTCCCTTAT ACTGGAGATC CTCCATACAG CAAAATGCAA TAAGTACCAC ATTCCCTTAT ACTGGAGATC CCCCATATAG CAAAATGCCA TAAGTACTAC ATTCCCTTAT ACTGGAGATC CTCCATACAG CAAAATGCCA TAAGCACCAC ATTCCCTTAT ACTGGAGATC CTCCATACAG CAAAATGCTA TAAGCACAAC TTTCCCTTAT ACCGGAGACC CTCCTTACAG

AAACGAACAA ATTTGCCGCA ATATGCACGC ACTTAGAAGT CT---GTTTC AAACAAATAA GTTCGCTGCA ATATGCACAC ACTTAGAAGT CT---GCTTC AAACAAACAA ATTTGCAGCA ATATGCACTC ACTTGGAAGT AT---GCTTC AAACAAACAA ATTTGCAGCA ATATGCACTC ACTTGGAAGT AT---GCTTC AAACAAACAA ATTTGCAGCA ATATGCACCC ACTTGGAGGT AT---GTTTC ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- AACCCTGCTC TCAGAATGAA ATGGATGATG GCAATGAAAT ATCCAATCAC AATCCCGCTC TTAGAATGAA ATGGATGATG GCGATGAAAT ACCCGATCAC AACCCGTCAC TTAGGATGAA ATGGATGATG GCAATGAAAT ATCCAATTAC AACCCGTCAC TTAGGATGAA ATGGATGATG GCAATGAAAT ACCCAATCAC AACCCAGCAC TTAGGATGAA ATGGATGATG GCAATGAAAT ATCCAATTAC ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- CCATGGAACA 
GGAACAGGAT ACACCATGGA CACAGTCAAC AGAACACATC CCATGGAACA GGAACAGGAT ACACCATGGA CACAGTCAAC AGAACACATC CCATGGGACA GGAACAGGAT ACACCATGGA CACAGTCAAC AGAACACATC CCATGGAACA GGAACAGGAT ACACCATGGA CACAGTCAAC AGAACACACC CCATGGGACA GGAACAGGAT ACACCATGGA TACTGTCAAC AGGACACATC

ATGTATTCAG ATTTCCACTT TATTGATGAA CGGGGCGAAT CAACAATTAT ATGTATTCAG ACTTCCATTT CATTGACGAA CGAGGCGAAT CAATAATTGT ATGTATTCAG ATTTTCATTT CATCAATGAG CAAGGCGAGT CAATAATGGT ATGTATTCAG ATTTCCACTT CATCAATGAG CAAGGCGAGT CAATAATCGT ATGTATTCAG ATTTTCATTT CATCAATGAA CAAGGCGAAT CAATAGTGGT ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ----------


---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- AGCAGACAAG AGAATAATGG AGATGATTCC TGAAAGGAAT GAGCAAGGAC AGCTGACAAA AGAATAATGG AGATGATCCC TGAAAGGAAT GAGCAAGGCC AGCTGACAAG AGGATAACAG AAATGGTTCC TGAGAGAAAT GAGCAAGGAC TGCTGACAAA AGGATAACAG AAATGGTTCC GGAGAGAAAT GAACAAGGAC AGCAGACAAG AGGATAACGG AAATGATTCC TGAGAGAAAT GAGCAAGGAC ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- -------GCA GGGGT----- --ATAATCTG TCAAAATGGA ---------- AGCAAAAGCA GGGGTTATA- CCATAGACAA CCAAAAGCAT ---------- AGCAAAAGCA GGGGAAA--- ATAAAAACAA CCAAAATGAA ---------- -GCAAAAGCA GGGGAAT--- TACTTAACTA GCAAAATGGA ---------- AGCAAAAGCA GGGGATAATT CTATTAACCA TGAAGACTAT ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- AATATTCAGA AAAGGGGAAA TGGACAACGA ACACAGAGAC TGGAGCACCC AATATTCAGA AAAAGGGAGG TGGACAACAA ACACAGAGAC CGGAGCACCC AATATTCAGA AAAGGGGAAG TGGACAACAA ACACGGAAAC TGGAGCGCCC AATATTCAGA GAAGGGGAAG TGGACGACAA ATACAGAAAC TGGGGCACCC AGTACTCAGA AAAGGCAAGA TGGACAACAA ACACCGAAAC TGGAGCACCG

AGAATCTGGC GATCCCAATG CATTATTGAA ACACCGGTTT GAAATAATCG GGAATCTGGT GATCCAAATG CATTGTTGAA GCACAGGTTT GAAATAATTG AGAGCTTGAT GATCCAAATG CACTTTTGAA GCACAGATTT GAAATAATAG AGAACTTGGT GATCCTAATG CACTTTTGAA GCACAGATTT GAAATAATCG AGAACTTGAT GATCCAAATG CACTGTTAAA GCACAGATTT GAAATAATCG ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- AAACGCTTTG GAGCAAGACA AATGATGCTG GGTCGGACAG AGTGATGGTG AAACTCTTTG GAGCAAAACA AATGACGCTG GATCAGACAG GGTAATGGTA AAACTCTATG GAGTAAAATG AGTGATGCCG GGTCAGATCG AGTAATGGTA AAACTCTATG GAGTAAAATG AGTGATGCTG GATCAGATCG AGTGATGGTA AAACTTTATG GAGTAAAATG AATGATGCCG GATCAGACCG AGTGATGGTA ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- GAAAAT---- AGTGCTTCTT CTTGCAATAG TCAGTCTT-- ---------G AACAAT---- GGCCATCATT TATCTCATAC TCCTGTTCAC AG-----CAG GGCAA----- ACCTACTGGT CCTGTTATGT GCACTTGC-- AG-----CTG AACAAT---- ATCACTAATA ACTATACTAC TAGTAGTAAC AG-----CAA CATTGCTTTG AGCTACATTC TATGTCTGGT TTTCGCTCAA AAACTTCCCG ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- CAACTCAATC 
CGATTGATGG ACCACTACCT GAGGATAATG AGCCGAGTGG CAACTCAACC CTATTGATGG ACCATTACCT GAAGACAATG AGCCGAGCGG CAACTTAACC CAATTGATGG ACCACTACCT GAGGACAATG AACCAAGTGG CAACTCAACC CAATTGATGG ACCACTACCT GAGGATAATG AGCCAAGTGG CAACTCAACC CGATTGATGG GCCACTGCCA GAAGACAATG AACCAAGTGG

AAGGGAGGGA CCGAACAATG GCCTGGACAG TGGTGAATAG TATCTGCAAC AAGGAAGAGA CCGAGCAATG GCCTGGACAG TGGTGAATAG CATCTGCAAC AGGGAAGAGA TCGCACAATG GCCTGGACAG TAGTAAACAG TATTTGCAAC AGGGAAGAGA TCGCACAATG GCCTGGACAG TAGTAAACAG TATTTGCAAC AGGGGAGAGA CAGAACAATG GCCTGGACAG TAGTAAACAG TATCTGCAAC ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ----------


---------- ---------- ---------- ---------- ---------- TCTCCCCTAG CTGTAACTTG GTGGAACAGG AATGGGCCGA CAACAAGTAC TCACCTCTGG CTGTAACGTG GTGGAACAGA AATGGACCAA CAACAAGTAC TCACCTTTGG CAGTGACATG GTGGAATAGA AATGGACCAA TGACAAGTAC TCACCTTTGG CTGTAACATG GTGGAATAGA AATGGACCCG TGGCAAGTAC TCACCTCTGG CTGTGACATG GTGGAATAGG AATGGACCAA TGACAAATAC ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- TCAAAAGTGA TCAGA----- ----TTTGCA TTGGTTACCA TGCAAACAAC TGAGGGGGGA CCAGA----- ----TATGCA TTGGATACCA TGCCAATAAT CAGATGCAGA CACAA----- ----TATGTA TAGGCTACCA TGCGAACAAT GCAATGCAGA TAAAA----- ----TCTGCA TCGGCCACCA GTCAACAAAC GAAATGACAA CAGCACGGCA ACGCTGTGCC TTGGGCACCA TGCAGT--AC ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- GTATGCACAA ACAGATTGTG TATTGGAAGC AATGGCTTTC CTTGAAGAAT GTATGCACAA ACAGATTGTG TATTGGAAGC AATGGCTTTC CTTGAAGAAT ATATGCACAA ACAGACTGCG TCCTGGAAGC AATGGCTTTC CTTGAGGAAT ATATGCACAA ACAGACTGTG TCCTGGAGGC TATGGCCTTC CTTGAAGAAT TTATGCCCAA ACAGATTGTG TATTGGAAGC AATGGCTTTC CTTGAGGAAT

ACCACAGGAG TTGAGAAGCC T-AAATTTCT CCCAGATTTG TATGACTACA ACAACAGGAG TCGATAAACC C-AAATTTCT TCCGGATCTA TACGACTACA ACCACAGGAG CTGAGAAACC G-AAGTTTCT GCCAGATTTG TATGATTACA ACTACAGGGG CTGAGAAACC A-AAGTTTCT ACCAGATTTG TATGATTACA ACTACTGGAG CAGAAAAACC A-AAGTTTCT ACCAGATTTG TATGATTACA ---------- ---------- ---------- ----AGCAAA AGCAGGGTAG ---------- ---------- ---------- ---------- AGCAGGGTTA ---------- ---------- ---------- ----AGCAAA AGCAGGGTAG ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ----AGCAAA AGCAGGGTTA ---------- ---------- ---------- ----AGCAAA AGCAGGGTAG AGTCCATTAT CCAAAGGTTT ACAAAACATA CTTTGAGAAG GTTGAAAGGT AGTCCATTAT CCAAAGGTGT ATAAAACCTA CTTTGAAAAG GTTGAAAGAT GGTTCATTAT CCAAAAATCT ACAAGACTTA TTTTGAGAAA GTCGAAAGGT GGTCCATTAC CCAAAAGTAT ACAAGACTTA TTTTGACAAA GTCGAAAGGT AGTTCATTAT CCAAAAATCT ACAAAACTTA TTTTGAAAGA GTCGAAAGGC ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- TCGACAGAG- -CAGGTTGAC ACAATAATGG AAAAGAACGT TACTGTTACA TCCACAGAA- -AAGGTCGAC ACAATTCTAG AGCGGAATGT CACTGTGACT TCAACCGAC- -ACTGTTGAC ACAGTGCTCG AGAAGAATGT GACAGTGACA TCCACAGAA- -ACTGTGGAC ACGCTAACAG AAACCAATGT TCCTGTGACA CAAACGGAAC GATAGTGAAA ACAATCACGA ATGACCAAAT TGAAGTCACT ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- CCCACCCAGG 
GATCTTTGAA AACTCGTGTC TTGAAACGAT GGAAGTTGTT CCCACCCAGG ACTCTTTGAA AACTCATGTC TTGAAACGAT GGAAGTTGTC CACACCCAGG AATCTTTGAA AATTCGTGTC TTGAAACGAT GGAAGTTATT CCCACCCAGG TATCTTTGAG AACTCATGCC TTGAAACAAT GGAAGTCGTT CCCATCCTGG TATTTTTGAA AACTCGTGTA TTGAAACGAT GGAGGTTGTT

AGGAGAACCG ATTTATTGAA ATTGGAGTGA CACGGAGGGA AGTTCACACA AGGAAAACCG ATTCACTGAA ATTGGTGTGA CACGGAGGGA AGTTCACATA AGGAGAATAG ATTCATCGAG ATTGGAGTGA CAAGGAGAGA AGTCCACATA AGGAAAATAG ATTCATCGAA ATTGGAGTAA CAAGGAGAGA AGTTCACATA AGGAGAATAG ATTCATCGAA ATTGGAGTGA CAAGAAGAGA AGTCCACATA ATAATCACTC ACTGAGTGAC ATCAACATCA TGGCGTCCCA AGGCACCAAA ATAATCACTC ACTGAGTGAC ATCAACATCA TGGCGTCGCA AGGCACCAAA ATAATCACTC ACTGAGTGAC ATCAACATCA TGGCGTCTCA GGGCACCAAA ---------- ---------- ---------A TGGCGTCCCA AGGCACCAAA ATAATCACTC ACCGAGTGAC ATCAAAATCA TGGCGTCCCA AGGCACCAAA ATAATCACTC ACTGAGTGAC ATCAAAATCA TGGCGTCCCA AGGCACCAAA


TAAAACATGG AACCTTCGGT CCCGTTCATT TCCGAAACCA AGTTAAAATA TAAAACACGG AACCTTTGGC CCTGTTCATT TCCGGAATCA AGTCAAAATA TAAAACATGG AACCTTTGGC CCTGTCCATT TTAGAAACCA AGTCAAAATA TAAAACATGG AACCTTTGGC CCTGTTCATT TTAGAAATCA AGTCAAGATA TAAAGCATGG AACCTTTGGC CCTGTCCATT TTAGAAACCA AGTCAAAATA ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- CATGCCCAAG ACATACTGGA AAAGACACAC AATGGGAAGC TCTGCGATCT CATGCCAAGG ACATCCTTGA GAAGACCCAT AACGGAAAGC TATGCAAACT CACTCTGTTA ACCTGCTCGA AGACAGCCAC AACGGAAAAC TATGTAGATT CATGCCAAAG AATTGCTCCA CACAGAGCAT AATGGAATGC TGTGTGCAAC AATGCTACTG AACTGGTTCA GAGTTCCTCA ACAGGTGGAA TATGCGA--C ---------- ---------- -------AGC AAAAGCAGGA GATTAAAATG ---------- ---------- -------AGC GAAAGCAGGG GTTTAAAATG ---------- ---------- ---------- ---------- -------ATG ---------- ---------- -------AGC AAAAGCAGGA G-TAAAGATG ---------- ---------- ---------- ---------- -------ATG CAGCAAACAA GAGTGGATAA GCTGACCCAA GGTCGCCAAA CCTATGACTG CAGCAAACGA GAGTGGATAA GCTGACCCAA GGTCGCCAGA CTTATGACTG CAACAAACAA GAGTGGACAA ACTGACCCAA GGTCGTCAGA CCTATGACTG CAACAAACAA GGGTGGACAA ACTAACCCAA GGCCGCCAGA CTTATGATTG CAGCAAACAC GAGTAGACAA GCTGACACAA GGCCGACAGA CCTATGACTG

TACTATCTAG AAAAAGCCAA -CAAGATAAA ATCTGAGAAG ACACACATTC TATTACTTAG AAAAAGCTAA -CAAGATAAA ATCCGAGAAA ACACATATCC TACTATCTTG AAAAGGCCAA -TAAAATTAA ATCTGAGAAT ACACACATCC TACTATCTGG AAAAGGCCAA -TAAAATTAA ATCTGAGAAA ACACACATCC TATTACCTTG AAAAGGCCAA -TAAAATTAA ATCTGAGAAC ACACACATTC CGATCCTATG AACAGATGGA -AACTGGTGG AGAACGCCAG AATGCCACTG CGATCCTATG AACAGATGGA -AACTGGTGG AGAACGCCAG AATGCCACTG CGATCTTATG AACAGATGGA -AACTGGTGG AGAACGCCAG AATGCTACTG CGGTCTTATG AACAGATGGA -AACTGATGG GGAACGCCAG AATGCAACTG CGGTCTTATG AACAGATGGA -AACTGATGG GGATCGCCAG AATGCAACTG CGGTCTTACG AACAGATGGA -GACTGATGG AGAACGCCAG AATGCCACTG CGTCGCCGGG TGGATATAAA -CCCGGGCCA TGCAGATCTC AGTGCTAAAG CGCCGCAGGG TTGACATGAA -CCCTGGCCA TGCAGATCTC AGCGCTAAAG CGCCGAAGAG TTGACATAAA -CCCTGGTCA TGCAGACCTC AGTGCCAAGG CGCAGAAGAG TAGACATAAA -CCCTGGTCA TGCAGACCTC AGTGCCAAAG CGTCGGAGAG TTGACATAAA -TCCTGGTCA TGCAGATCTC AGTGCCAAGG ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- AAATGGAGTG AAGCCTCTCA -TTTTGAGAG ATTGTAGTGT AGCTGGATGG AAACGGAATC CCTCCACTTG -AACTAGGGG ACTGTAGCAT TGCCGGATGG AAAAGGAATA GCCCCACTAC -AATTGGGGA AATGTAACAT CGCCGGATGG AAGCCTGGGA CATCCCCTCA -TTCTAGACA CATGCACTAT TGAAGGACTA AGTCCTCATC AGATC-CTTG -ATGGAGAAA ACTGCACACT AATAGATGCT AATCCAAATC AGAAGATAAT -AACCATTGG ATCAATCTGT ATGGTAGTTG AATCCAAATC AGAAAATAAT -AACCATTGG ATCAATCTGT CTGGTAGTCG AATCCAAATC AAAAGATAAT -AACAATTGG CTCTGTCTCT CTCACCATTG AATCCAAATC AAAAGATAAT -AACGATTGG CTCTGTTTCT CTCACCATTT AATCCAAATC AAAAGATAAT -AGCACTTGG CTCTGTTTCT ATAACTATTG GACATTGAAA 
AGAAACCAGC CGGCTGCAAC CGCTTTGGCC AACACTATAG GACATTGAAT AGAAACCAGC CGGCTGCAAC TGCTTTGGCC AACACCATAG GACATTGAAC AGAAATCAGC CGGCTGCAAC TGCGCTAGCC AACACTATAG GACATTAAAC AGAAATCAAC CGGCAGCAAC TGCATTAGCC AACACCATAG GACTTTAAAT AGAAACCAGC CTGCTGCAAC AGCATTGGCC AACACAATAG

ACATATTCTC ATTCACTGGA GAGGAAATGG CCACCAAAGC GGACTACACC ACATCTTTTC ATTCACTGGA GAAGAAATGG CCACTAAAGC TGACTACACC ACATTTTCTC ATTCACTGGG GAAGAAATGG CCACAAAGGC CGACTACACT ACATTTTCTC GTTCACTGGG GAAGAAATGG CCACAAAGGC CGACTACACT ACATCTTCTC ATTCACTGGG GAGGAAATAG CCACAAAGGC AGACTACACT AGATCAGGGC ATCTGTTGGA AGAATGGTT- ----GGTGG- AATTGGGAGG AGATCAGGGC ATCTGTTGGA AGAATGGTT- ----GGTGG- AATTGGGAGG AGATCAGAGC ATCTGTTGGA AGAATGGTT- ----GGTGG- AATTGGGAGG AGATCAGAGC ATCCGTCGGG AAGATGATT- ----GATGG- AATTGGACGA AGATTAGGGC ATCCGTCGGG AAGATGATT- ----GATGG- AATTGGGAGA AAATCAGAGC ATCCGTCGGA AAAATGATT- ----GGTGG- AATTGGACGA AAGCACAAGA TGTTATCATG GAGGTCGTTT TCCCAAATG- AAGTGGGAGC


AAGCACAAGA TGTCATCATG GAGGTCGTTT TCCCAAATG- AAGTTGGAGC AGGCACAAGA CGTAATCATG GAAGTTGTTT TCCCCAATG- AAGTGGGGGC AGGCACAAGA TGTAATTATG GAAGTTGTTT TTCCCAATG- AAGTGGGAGC AGGCACAGGA TGTAATCATG GAAGTTGTTT TCCCTAACG- AAGTGGGAGC ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- CTCCTCGGAA ACCCTATGTG TGACGAATTC ATCAATGTGC CGGAATGGTC CTCCTTGGAA ATCCAGAATG TGATAGGCTT CTAAGTGTGC CAGAATGGTC CTCTTGGGAA ACCCAGAATG CGACCCACTG CTTCCAGTGA GATCATGGTC GTCTATGGCA ACCCTTCTTG TGACCTGCTG TTGGGAGGAA GAGAATGGTC CTATTGGGAG ACCCTCAGTG TGATGGCTTC C--AAAATAA GAAATGGGAC GGATAATTAG CTTGATGTTA CAAATTGGGA ACATAATCTC AA-TATGGGT GACTAATTAG CCTAATATTG CAAATAGGGA ATATAATCTC AA-TATGGAT CAACAGTATG CTTCCTCATG CAGATTGCCA TCCTGGTAAC TACTGTGACA CCACAATATG CTTCTTCATG CAAATTGCCA TCCTGATAAC CACTGTAACA CGACAATATG TTTACTCATG CAGATTGCCA TCTTAGCAAC GACTATGACA AGGTCTTCAG ATCGAATGGT CTAACAGCCA ATGAATCGGG AAGGCTAATA AAGTATTCAG ATCGAACGGT CTAACAGCCA ATGAGTCAGG AAGGTTAATA AGGTCTTCAG ATCGAATGGA CTGACAGCTA ATGAGTCGGG AAGGCTAATA AAGTTTTTAG ATCGAATGGA CTAACAGCCA ATGAATCAGG AAGGCTAATA AAGTGTTCAG ATCAAATGGC CTCACGGCCA ATGAGTCTGG AAGGCTCATA

CTTGATGAAG AAAGCAGGGC CCGAATCAAA ACCAGGCTGT TCACTATAAG CTTGATGAAG AGAGCAGGGC AAGAATAAAA ACCAGACTAT TCACCATAAG CTCGATGAGG AAAGCAGGGC TAGGATCAAA ACCAGACTAT TCACCATAAG CTCGATGAAG AAAGCAGGGC TAGGATCAAA ACCAGGCTAT TCACCATAAG CTCGACGAGG AAAGCAGGGC TAGGATTAAA ACCAGGCTAT TTACCATAAG TTTTACGTAC AGATGTGCAC TGAACTCAAA CTCAGCGACC AAGAAGGAAG TTTTACGTAC AGATGTGCAC TGAACTCAAA CTCAGCGACC AAGAAGGAAG TTTTATATAC AGATGTGCAC TGAACTCAAA CTCAGCGACT ATGAAGGAAG TTCTACATCC AAATGTGCAC CGAACTTAAA CTCAGTGATT ATGAGGGGCG TTCTACATCC AAATGTGCAC TGAACTTAAA CTCAGTGATC ATGAAGGGCG TTCTACATCC AAATGTGCAC AGAACTTAAA CTCAGTGATT ATGAGGGACG TAGAATATTG ACATCAGAGT CGCAATTGAC AATAACAAAA GAGAAGAAAG CAGGATATTG ACATCAGAAT CACAGCTGAC AATAACAAAG GAAAAGAGGG CAGGATACTA ACGTCGGAAT CACAATTAAC AATAACCAAA GAGAAAAAAG CAGGATACTA ACATCAGAAT CGCAATTAAC AATAACTAAA GAGAAAAAAG CAGGATACTA ACATCGGAAT CGCAACTAAC GATAACCAAA GAGAAGAAAG ---------- ---------- ---------- ---------- ---------- ---------- ---------- ----AGCAAA AGCAGGGTGA CAAAGACATA ---------- ---------- ----AGCAAA AGCAGGGTGA CAAAGACATA ---------- ---------- ----AGCAAA AGCAGGGTGA CAAAGACATA ---------- ---------- ---------- ------GTGA CAAAGACATA ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- TTACATAGTG GAGAAGGCCA GTCCAGCCAA TGACCTCTGT TACCCAGGGG CTATATAATG GAGAAAGAAA ACCCGAGATA CAGTTTGTGT TACCCAGGCA CTACATTGTA GAAACACCAA ACTCTGAGAA TGGAATATGT TATCCAGGAG CTACATCGTC GAAAGATCAT CAGCTGTAAA TGGAACGTGT TACCCTGGGA CTTTTTGTTG AACGCAGCAA AGCCT----A CAGCAACTGT TACCCTTATG CAGTCATTCA ATTCAGACAG GGAATCAACA CCAAGCTGAA CCATG-CAAT TAGCCATTCA ATTCAAACTG GAAGTCAAAA CCATACTGGA ATATG-CAAC TTGCATTTTA AGCAACAT-- GAGTGCGACT CCCCCGCGAG CAACC-AAGT TTGCATTTCA AGCAATAT-- GAATTCAACT CCCCCCCAAA CAACC-AAGT CTACATTTCA ------AT-- GAATGTACCA ACCCATCGAA CAATC-AAGC GATTTCCTCA 
AAGACGTGAT GGAATCAATG GATAAGGGAG AAATGGAAAT GATTTCCTCA AGGACGTAAT GGAATCAATG GATAAGGAAG AAATGGAAAT GATTTCCTCA AGGATGTGAT AGAATCAATG GATAAAGAGG AGATGGAAAT GATTTCCTCA AGGATGTGAT GGAATCAATG GATAAAGAGG AAATGGAGAT GACTTCCTTA AGGATGTAAT GGAGTCAATG AAAAAAGAAG AAATGGGGAT

GCAGG----- AAATGGCCAG TAGGGGTTTA TGGGATTCCT TTCGTC-AGT ACAGG----- AAATGGCAAG CAGGGGTCTA TGGGATTCCT TTCGTC-AGT ACAAG----- AAATGGCCAA CAGAGGCCTC TGGGATTCCT TTCGTC-AGT ACAAG----- AAATGGCCAG CAGAGGCCTC TGGGATTCCT TTCGTC-AGT ACAAG----- AAATGGCCAA CAGAGGCCTC TGGGATTCCT TTCGTC-AGT GTTGA-TCCA GAACAGTATA ACAATAGAGA GAATGGTTCT CTCCGC-ATT GTTGA-TCCA GAACAGTATA ACAATAGAGA GAATGGTTCT CTCCGC-ATT GCTGA-TTCA GAACAGCATA ACAATAGAGA GAATGGTTCT CTCTGC-ATT ACTGA-TCCA GAACAGCTTA ACAATAGAGA GAATGGTGCT CTCTGC-TTT GTTGA-TCCA GAACAGCTTG ACAATAGAGA AAATGGTGCT CTCTGC-TTT GTTGA-TCCA AAACAGCTTA ACAATAGAGA GAATGGTGCT CTCTGC-TTT AAGAGCTCCA GGATTGTAAA ATTGCTCCTT TAATGGTGGC ATACAT-GTT AGGAACTCAA GAATTGTAAT ATTGCTCCTT TAATGGTGGC ATATAT-GTT


AAGAACTCCA AGATTGCAAA ATTTCTCCTT TGATGGTTGC ATACAT-GTT AAGAACTCCG AGATTGCAAA ATTTCTCCCT TGATGGTTGC ATACAT-GTT AAGAACTCCA GGATTGCAAA ATTTCTCCTT TGATGGTTGC ATACAT-GTT ATGGATTCTA ACACTGTGTC AAGTTTTCAG GTAGATTGCT TCCTTT-GGC ATGGATTCCA ACACTGTGTC AAGTTTCCAG GTAGATTGCT TTCTTT-GGC ATGGATCCAA ACACTGTGTC AAGCTTTCAG GTAGATTGCT TTCTTT-GGC ATGGATTCCA ACACTGTGTC AAGCTTTCAG GTAGACTGCT TTCTTT-GGC ATGGATTCCA ACACGATAAC CTCGTTTCAG GTAGATTGTT ATCTAT-GGC ---------- ---------- ---------A GCAAAAGCAG GTAGAT-ATT ---------- ---------- --GGGGAATT CCAAAAGCAG GTAGAT-ATT ---------- ---------- ---------A GCAAAAGCAG GTAGAT-ATT ---------- ---------- ---------A GCAAAAGCAG GTAGAT-ATT ---------- ---------- ---------A GCGAAAGCAG GTAGAT-ATT ATTTC---AA CGACTATGAA GAACTGAAAC ACCTATTGAG CAGAAC-AAA GCTTC---AA TGACTATGAA GAATTGAAAC ATCTCCTCAG CAGCGT-GAA ATTTC---AT CGACTATGAG GAGCTGAGGG AGCAATTGAG CTCAGT-GTC ATGTA---GA AAACCTAGAG GAACTCAGGA CACTTTTTAG TTCCGC-TAG ATGTG---CC GGATTATGCC TCCCTTAGGT CACTAGTTGC CTCATC-CGG CAAAGCATTA TTACTTATGA AAACAACACC TGGGTAAATC A------AAC CAAAACATCA TTACCTATAA AAATAGCACC TGGGTAAAGG A------CAC AATGCCGTGT GAACCAATAA TAATAGAAAG -GAACATAAC A------GAG GATGCTGTGT GAACCAACAA TAATAGAAAG -AAACATAAC A------GAG AGTGCCATGT GAACCAATCA TAATAGAAAG -GAACATAAC A------GAG AATAACACAT TTCCAGAGAA AGAGAAGAGT GAGGGACAAC ATGACCAAGA AACAACACAT TTCCAGAGAA AGAGAAGAGT GAGGGACAAC ATGACCAAGA AACAACACAC TTCCAAAGAA AAAGAAGAGT AAGAGACAAC ATGACCAAGA AACAACACAC TTTCAAAGAA AAAGGAGAGT AAGAGACAAC ATGACCAAGA CACAACTCAT TTTCAGAGAA AGAGACGGGT GAGAGACAAT ATGACTAAGA

CCGAGAGAGG CGAAGAGAC- -AGTTGAAGA AAG-ATTTGA AATCACAGGG CCGAGAGAGG CGAAGAGAC- -AATTGAAGA AAG-ATTTGA AATCACAGGG CCGAAAGAGG CGAAGAAAC- -AATTGAAGA AAG-ATTTGA AATCACAGGG CCGAGAGAGG AGAAGAGAC- -AATTGAAGA AAG-GTTTGA AATCACAGGA CCGAAAGAGG CGAAGAAAC- -AATTGAAGA AAA-ATTTGA AATCTCAGGA TGATGAAAGG AGGAACAGG- -TACCTAGAG G-------AA CATCCCAGTG TGATGAAAGG AGGAACAGG- -TACCTAGAG G-------AA CATCCCAGTG TGATGAAAGG AGGAACAAA- -TACCTGGAA G-------AA CATCCCAGTG TGACGAGAGA AGGAATAAA- -TATCTGGAA G-------AA CATCCCAGCG TGATGAAAGA AGGAATAAA- -TACCTGGAA G-------AA CACCCCAGCG TGACGAAAGG AGAAATAAA- -TACCTGGAA G-------AA CATCCCAGTG GGAAAGAGAA CTGGTCCGC- -AAAACCAGA TTT-C--TAC CGGTAGCAGG GGAAAGAGAA TTGGTTCGC- -AAGACCAGA TTC-C--TAC CCGTGGCTGG AGAGAGAGAA CTTGTCCGA- -AAAACGAGA TTT-C--TCC CAGTTGCTGG AGAGAGAGAA CTTGTCCGA- -AAAACAAGA TTT-C--TCC CAGTTGCTGG GGAGAGAGAA CTGGTCCGC- -AAAACGAGA TTC-C--TCC CAGTGGCTGG ATGTCCGAAA ACAAGTTGT- -AGACCAAGA ACT-AGGTGA TGCCCCATTC ATATCCGGAA ACAAGTTGT- -AGACCAAGA ACT-GAGTGA TGCCCCATTC ATGTCCGCAA ACGAGTTGC- -AGACCAAGA ACT-AGGTGA TGCCCCATTC ATGTCCGCAA ACGATTTGC- -AGACCAAGA ACT-GGGTGA TGCCCCATTC ACATAAGAAA GCTACTCAG- -TATGAGAGA CAT-GTGTGA TGCCCCCTTT GAAAAATGAG TCTTCTAACC GAGGTCGAAA CGT-ACGTTC TCTCTATCGT GAAAGATGAG TCTTCTAACC GAGGTCGAAA CGT-ACGTTC TCTCTATCAT GAAAGATGAG CCTTCTAACC GAGGTCGAAA CGT-ACGTTC TCTCTATCGT GAAAGATGAG CCTTCTAACC GAGGTCGAAA CGT-ATGTTC TCTCTATCGT GAAAGATGAG TCTTCTAACC GAGGTCGAAA CGT-ACGTTC TCTCTATCAT CCATTTTGAG AAAATTCAGA TCATCCCCAA A---AGTTCT TGGTCCAATC ACATTTTGAG AAAGTTAAGA TTTTGCCCAA A---GATAGA TGGAC---AC ATCATTCGAA AGATTCGAAA TATTTCCCAA AGAAAGCTCA TGGCCCAACC TTCCTACCAA AGAATCCAAA TCTTCCCAGA C---ACAACC TGGA------ CACACTGGAG TTTAACAATG AAAGCTTCAA T--------- TGGAC---TG ATATGTCAAC ATCAGCAATA CCAATTTTCT TACTG--AAA AAGCTGTGGC A--------- ---------- ---------- ---------- --------AC ATA-GTGTAT TTGAATAACA CCACCATAGA GAAAG--AGA TCTGCCCCGA ATA-GTGTAT CTGACCAACA CCACCATAGA GAAGG--AAA TGTGCCCCAA ATA-GTGCAT TTGAATAATA CTACCATAGA GAAGG--AAA GTTGTCCTAA AAATGGTCAC 
ACAAAGAACA ATAGGGAAGA AAAAACAAAG GCTGAACAAA AAATGGTCAC ACAAAGAACA ATAGGGAAGA AGAAGCAAAA GCTGACAAAA AAATGGTCAC ACAACGAACA ATAGGAAAGA AGAAGCAAAG ATTGAACAAG AAATGGTCAC ACAAAGAACA ATAGGGAAGA AAAAACAAAG AGTGAATAAG AAATGATAAC ACAGAGAACA ATAGGTAAAA GGAAACAGAG ATTGAACAAA

ACTATGTGCA GGCTTGCCGA CCAAAGTCTC CCACCTAATT TCTCCAGCCT ACCATGCGTA GGCTTGCCGA CCAAAGTCTC CCACCTAACT TCTCCAGCCT ACAATGCGCA GGCTTGCCGA CCAAAGTCTC CCGCCGAACT TCTCCTGCCT ACAATGCGCA AGCTTGCCGA CCAAAGTCTC CCGCCGAACT TCTCCAGCCT ACTATGCGTA GGCTTGCCGA CCAAAGTCTC CCACCGAAAT TCTCCTGCCT CGGGGAAGGA CCCGAAGAAG ACCGGAGGTC CAATCTACCG AAGGAGAGAC CGGGGAAGGA CCCGAAGAAG ACCGGAGGTC CAATCTACCG AAGGAGAGAC CGGGGAAGGA CCCAAAGAAA ACTGGAGGTC CAATCTACCG AAGAAGAGAC CGGGGAAGGA TCCTAAGAAA ACTGGAGGAC CCATATACAA GAGAGTAGAT CGGGGAAAGA TCCCAAGAAA ACTGGGGGGC CCATATACAG GAGAGTAGAT CGGGGAAGGA TCCTAAGAAA ACTGGAGGAC CTATATACAG AAGAGTAAAC CGGAACAAGC AGTGTGTACA TTGAGGTATT GCATTTGACT CAAGGGACCT CGGGACAAGC AGCGTATATA TAGAAGTATT GCATTTGACT CAAGGAACTT TGGAACAAGC AGTGTGTACA TTGAAGTGTT ACACTTGACT CAAGGAACAT

CGGAACAAGC AGTATATACA TTGAAGTCTT ACATTTGACT CAAGGAACGT TGGAACAAGC AGTGTGTACA TTGAAGTGTT GCATTTGACT CAAGGAACAT CTTGATCGGC TTCGCCGAGA TCAGAAGTCC CTAAGGGGAA GAGGCAGCAC CTTGATCGGC TTCGCCGAGA TCAGAGGTCC CTAAGGGGAA GAGGCAATAC CTTGATCGGC TTCGCCGAGA TCAGAAATCC CTAAGAGGAA GGGGCAGCAC CTTGACCGGC TTCGCCGAGA TCAGAAGTCC CTAAGAGGAA GAGGCAGCAC GATGACAGGC TCCGAAGAGA CCAAAAGGCA TTAAAGGGAA GAGGCAGCAC CCCGTCAGGC CCCCTCAAAG CCGAGATCGC GCAGAGACTT GAGGATGTCT CCCATCAGGC CCCCTCAAAG CCGAGATCGC GCAGAGACTT GAGGATGTTT CCCGTCAGGC CCCCTCAAAG CCGAGATCGC ACAGAGACTT GAAGATGTCT TCCATCAGGC CCCCTCAAAG CCGAGATCGC GCAGAGACTT GAAGATGTCT CCCGTCAGGC CCCCTCAAAG CCGAGATCGC ACAGAGACTT GAAGATGTCT ATGATGCCTC ATCAGGGGTG AGCTCAGCAT GTCCATACCA TGGGAGGTCC AGCATACAAC AACTGGAGGT TCATGGGCCT GCGCGGTGTC AGGTAAACCA ACAACACAAC CAAAGGAGTA ACGGCAGCAT GCTCCCATGC GGGGAAAAGC ATGTGACTTA CACTGGAACA AGCAGAGCAT GTTC------ ------AGGT GAGTCACTCA AAATGGAACA AGCTCTGCTT GCAAAAGGAG ATCTAATAAC TTCAGTAA-- -CATTAGCGG GCAATTCA-- -TCTCTTTGC CCCAT----T TTCAGTGA-- -TATTAACCG GCAATTCA-- -TCTCTTTGT CCCAT----C AGTAGTGG-- -AATACAGAA ATTGGTCAAA GCCGCAATGT CAAAT----T ACTAGCAG-- -AATACAGAA ATTGGTCAAA GCCGCAATGT GACAT----T AGTAGCAG-- -AATACAAGA ATTGGTCAAA ACCGCAATGT CAAAT----T AGGAGCTACC TAATAAGAGC ACTGACACTG AACACAATGA CAAAAGACGC AAGAGCTACC TAATAAGAGC ACTGACACTG AACACAATGA CAAAAGATGC AGAAGCTATC TGATAAGAGC ACTGACATTG AACACAATGA CTAAAGATGC AGAGGCTATC TAATAAGAGC TTTGACATTG AACACGATGA CCAAAGATGC AGGAGTTATC TAATTAGAGC ATTGACCCTG AACACAATGA CCAAAGATGC

TGAAAAATTT AGAGC-CTAT GTGGATGGAT TCGAACCGAA CGGCTGCATT TGAAAACTTT AGAGC-CTAT GTGGATGGAT TCAAACCGAA CGGCTGCATT TGAGAATTTT AGAGC-CTAT GTGGATGGAT TCGAACCGAA CGGCTACATT TGAAAATTTT AGAGC-CTAT GTGGATGGAT TCGAACCGAA CGGCTACATT TGAGAATTTT AGAGC-CTAT GTGGATGGAT TCGAACCGAA CGGCTGCATT G--GGAAATG GGTGAGAGAG CTGATTCTGT ATGACAAAGA GGAGA----T G--GGAAATG GGTGAGAGAG CTGATTCTGT ATGACAAAGA GGAGA----T G--GAAAATG GGTGAGAGAG CTGATTCTGT ATGACAAAGA GGAGA----T G--GAAAGTG GATGAGGGAA CTCGTCCTTT ATGACAAAGA AGAAA----T G--GAAAATG GATGAGGGAA CTCGTCCTTT ATGACAAAGA AGAGA----T G--GAAAGTG GATGAGAGAA CTCATCCTTT ATGACAAAGA AGAAA----T GTTGGGAACA GATGTACACT CCCGGCGGAG AAGTAAGAAA TGATGATGTT GCTGGGAGCA GATGTACACA CCAGGAGGGG AGGTAAGAAA TGATGATGTT GTTGGGAACA GATGTACACC CCAGGTGGAG AAGTGAGGAA TGATGATGTT GTTGGGAACA AATGTACACT CCAGGTGGAG AAGTGAGGAA TGACGATGTT GCTGGGAACA GATGTATACT CCAGGAGGGG AAGTGAAGAA TGATGATGTT TCTCGATCTA GACAT-CGAA GCAGCCACCC GTGTTGGAAA GCAGATAGTA TCTCGGTCTA GACAT-CAAA GCAGCCACCC ATGTTGGAAA GCAAATTGTA TCTTGGTCTG GACAT-CGAG ACAGCCACAC GTGCTGGAAA GCAGATAGTG TCTTGGTCTG GACAT-CAGA ACTGCCACTC GTGAAGGAAA GCATATAGTG ACTTGGACTC GATTT-AAGA GTGGCTACAA TGGAGGGGAA AAAGATCGTT TTGCAGGAAA GAACACCGAT CTCGA-GGCT CTCATGGAAT GGCTAAAGAC TTGCAGGGAA GAACACAGAT CTTGA-GGCT CTCATGGAAT GGCTAAAGAC TTGCTGGGAA GAACACAGAT CTTGA-GGCT CTCATGGAAT GGCTAAAGAC TTGCTGGGAA AAACACAGAT CTTGA-GGCT CTCATGGAAT GGCTAAAGAC TTGCAGGGAA GAACACCGAT CTTGA-GGTT CTCATGGAAT GGCTAAAGAC TCCTTTTTCA GAAATGTGGT ATGGCTTAT- CAAAAAGAAC AGTGCATACC TCATTCTTCA GGAACATGGT CTGGCTGACA CGTAAAGGAT CAAAT-TATC AGTTTTTACA GAAATTTGCT ATGGCTGAC- GGAGAAGGAG GGCTCATACC TCATTCTACA GGAGTATGAG ATGGCTGAC- TCAAAAGAGC GGTTTTTACC AGTTTCTTTA GTAGATTGAA TTGGTTGACC CACTTAAAAT T-CAAATACC AGCGGATGGG CTGTACACAG TAAGGACAAC GGTATAAGAA TCGGTTCCAA CGTGGGTGGG CTATATACAG CAAAGACAAT AGCATAAGAA TTGGTTCCAA ACAGGATTTG CACCTTTTTC TAAGGACAAT TCAATCCGGC TTTCTGCTGG ACAGGATTTG CACCTTTTTC TAAGGACAAT TCGATTAGGC TTTCCGCTGG ACAGGGTTCG CCCCTTTCTC CAAGGACAAC TCAATTAGGC TTTCTGCAGG AGAAAGAGGC 
AAATTGAAGA GGCGGGCAAT TGCAACACCC GGGATGCAAA TGAAAGGGGA AAATTGAAAA GACGAGCGAT TGCAACACCC GGAATGCAAA AGAGAGAGGT AAATTAAAAA GAAGAGCAAT TGCAACACCC GGTATGCAGA AGAGAGAGGT AAATTAAAAA GAAGGGCTAT TGCAACACCC GGGATGCAAA TGAGAGAGGG AAGCTAAAAC GGAGAGCAAT TGCAACCCCA GGGATGCAAA

GAGGGCAAGC TTTCTCAAAT GTCGAAAGAA GTAAACGCCA GAATTGAGCC GAGGGCAAGC TTTCTCAAAT GTCGAAAGAA GTGAACGCCA GAATTGAGCC GAGGGCAAGC TTTCTCAAAT GTCCAAAGAA GTAAATGCAA AAATTGAACC GAGGGCAAGC TGTCTCAAAT GTCCAAAGAA GTAAATGCTA GAATTGAACC GAGGGCAAGC TTTCTCAAAT GTCCAAAGAA GTGAATGCCA AAATTGAACC AAGGAGAATT TGGCG--TCA AGCGAACAAT GGAGAAGACG CAACTGCTGG AAGGAGAATT TGGCG--TCA AGCGAACAAT GGAGAAGACG CAACTGCTGG CAGGAGAATT TGGCG--TCA AGCGAACAAT GGAGAAGATG CAACTGCTGG AAGGCGAATC TGGCG--CCA AGCCAATAAT GGTGATGATG CAACAGCTGG AAGGCGAATC TGGCG--CCA AGCCAACAAT GGTGAGGATG CGACAGCTGG AAGGCGAATC TGGCG--CCA AGCTAATAAT GGTGACGATG CAACGGCTGG GACCAGAGTT TGATCATCGC TGCCAGAAAC ATTGTTAGGA GAGCAACAGT GACCAAAGTT TAATCATTGC TGCTAGGAAC ATTGTCAGGA GAGCAACAGT GATCAAAGTC TAATTATTGC AGCCAGGAAC ATAGTGAGAA GAGCAGCAGT GACCAAAGCC TAATTATTGC GGCCAGGAAC ATAGTAAGAA GAGCTGCAGT

GATCAAAGCT TGATTATTGC TGCTAGGAAC ATAGTGAGAA GAGCTGCAGT GAGAGGATTC TGAAGGAAGA ATCCGATGAG GCACTTAAAA TGACCATGGC GAAAAGATTC TGAAAGAAGA ATCTGATGAG GCACTTAAAA TGACCATGGT GAGCGGATTC TGAAAGAAGA ATCCGATGAG GCACTTAAAA TGACCATGGC GAGCGGATTC TGGAGGAAGA ATCTGACGAG GCACTTAAAA TGACTATCGC GAGGACATCC TGAAGAGTGA GACAAATGAA AACCTCAAAA TAGCCATTGC AAGACCAATC CTGTC---AC CTCTGACTAA AGGGATTTTA GGATT--TGT AAGACCAATC CTGTC---AC CTCTGACTAA GGGGATTTTA GGGTT--TGT AAGACCAATC CTGTC---AC CTCTGACTAA GGGGATTTTG GGATT--TGT AAGACCAATT CTGTC---AC CTCTGACTAA GGGGATTTTG GGGTT--TGT AAGACCAATC CTGTC---AC CTCTGACTAA GGGGATTTTA GGATT--TGT CAACAATAAA GAGGAGCTAC AATAATACCA ACCAAGAAGA TCTTTTAGTA CGGTTGCCAA AGGATCGTAC AACAATACAA GCGGAGAACA AATGCTAATA CAAAGCTGAA AAATTCTTAT GTGAACAAGA AAGGGAAAGA AGTCCTTGTA CTGTTCAAGA CGCCCAATAC ACAAATAACA GGGGAAAGAG CATTCTTTTC CAGCATTGAA CGTGACTATG CCAAACAATG AAAAATTTGA CAAACTGTAC GGGGGATGTG TTTGTTATAA GA-GAGCCGT TCATCTCATG CTCCCACTTG AGGAGACGTT TTTGTCATAA GA-GAGCCCT TTATTTCATG TTCTCACTTG TGGGGACATT TGGGTGACGA GA-GAACCTT ATGTGTCATG CGATCCTGGC TGGGGACATC TGGGTGACAA GA-GAACCTT ATGTGTCATG CGACCCTGAC CGGGGATATT TGGGTGACAA GA-GAACCTT ATGTATCGTG CGGTCTTGGT TCAGAGGATT CGTGTACTTT GTCGAAACAC TAGCGAGGAG TATCTGTGAG TCAGAGGATT CGTGCACTTT GTCGAAGCAC TAGCAAGGAG CATCTGTGAA TCAGAGGGTT CGTGCACTTT GTCGAAACAC TAGCGAGAAA TATTTGTGAG TTAGAGGGTT CGTGTACTTC GTTGAAACTT TAGCTAGAAG CATTTGCGAA TAAGGGGGTT TGTATACTTT GTTGAGACAC TGGCAAGGAG TATATGTGAG

ATTTCTGAAG ACAACA---C CACGCCCTCT TAGATTACCT GATGGG---C ATTTCTGAAG ACAACA---C CACGTCCCCT CAGATTGCCT GATGGA---C TTTTCTGAAA ACAACA---C CAAGACCAAT TAGACTTCCG GATGGG---C TTTTTTGAAA ACAACA---C CACGACCACT TAGACTTCCG AATGGG---C TTTTCTGAAG ACAACA---C CAAGACCAAT CAAACTTCCT AATGGA---C TCTCACTCAT ATGATG---A TCTGGCATTC CAACCTAAAT GATGCC--AC TCTCACTCAT ATGATG---A TCTGGCATTC CAACCTAAAT GATGCC--AC TCTCACTCAC ATGATG---A TCTGGCATTC CAATCTAAAT GATGCC--AC GCTGACTCAC ATGATG---A TCTGGCATTC CAATTTGAAT GATACA--AC TCTAACTCAC ATAATG---A TCTGGCATTC CAATTTGAAT GATGCA--AC TCTGACTCAC ATGATG---A TCTGGCATTC CAATTTGAAT GATGCA--AC ATCAGCGGAC CCACTG---G CATCACTCTT GGAGATGTGT CACAGCACAC ATCAGCAGAC CCATTG---G CTTCACTCCT GGAAATGTGC CATAGCACAC ATCAGCAGAT CCACTA---G CATCTTTATT GGAGATGTGC CACAGCACAC ATCAGCAGAT CCACTA---G CATCTTTATT GGAGATGTGC CACAGCACAC ATCAGCAGAC CCACTA---G CATCTTTATT GGAGATGTGC CACAGCACAC CTCCGCACCT GCTTCG---C GATACCTAAC TGACATG-AC TATTGAGGAA CTCCACACCT GCTTCG---C GATACATAAC TGACATG-AC TATTGAGGAA CTCTGTACCT GCGTCG---C GTTACCTAAC CGACATG-AC TCTTGAGGAA TTCAGTGCCT GCTTCA---C GCTACCTAAC TGAAATG-AC TCTTGAGGAA TTCCAGTCCT GCTCCT---C GGTATATCAC CGATATG-AG CATAGAGGAG GTTCACGCTC ACCGTG---C CCAGTGAGCG AGGACTGCAG CGTAGA---- GTTCACGCTC ACCGTG---C CCAGTGAGCG AGGACTGCAG CGTAGA---- ATTCACGCTC ACCGTG---C CAAGTGAGCG AGGACTGCAG CGTAGA---- GTTCACGCTC ACCGTG---C CCAGTGAGCG AGGACTGCAG CGTAGA---- GTTCACGCTC ACCGTG---C CCAGTGAGCG AGGACTGCAG CGTAGA---- CTGTGGGGGA TTCACC---A TCCTAATGAT GCGGCAGAGC AGACAAAGCT ATTTGGGGAG TGCACC---A TCCTAATGAT GAGGCAGAAC AAAGAGCATT CTGTGGGGTA TTCATC---A CCCGTCTAAC AGTAAGGATC AACAGAATAT GTGTGGGGCA TACATC---A CCCACCCACC TATACCGAGC AAACAAATTT ATTTGGGGGG TTCACC---A CCCGGGTACG GACAATGACC AAATCAGCCT GAATGCAGAA CTTTCT---T TTTGACTCAG GGAGCCTTGC TGAATGACAA GAATGCAGGA CCTTTT---T TCTGACCCAA GGTGCCTTAC TGAATGACAG AAGTGTTATC AATTTG---C ACTCGGGCAG GGGACCACAC TAGACAACAA AAGTGTTACC AATTTG---C CCTTGGACAG GGAACAACAC TAAACAACGT AAATGTTACC AATTTG---C ACTTGGGCAG GGAACCACTT TGAACAACAA AAACTTGAGC 
AATCTGGACT CCCCGTCGGA GGGAATGAAA AGAAGGCTAA AAACTTGAGC AATCTGGACT CCCCGTTGGA GGGAATGAGA AGAAGGCTAA AAACTTGAAC AGTCTGGGCT TCCGGTTGGA GGTAATGAAA AGAAGGCTAA AAGCTTGAAC AGTCTGGACT TCCGGTTGGG GGTAATGAAA AGAAGGCCAA AAACTTGAAC AATCAGGGTT GCCAGTTGGA GGCAATGAGA AGAAAGCAAA

CTCCCTGCTC TCAGCGGTCG AAGTT--TTT GCTGATGGAT GCCCTTA--- CTCCCTGCTC CCAGCGGTCG AAATT--CTT GCTGATGGAT GCTCTGA--- CTCCTTGTTT TCAGCGGTCC AAATT--CCT GCTGATGGAT GCTTTAA--- CTCCCTGTTC TCAGCGGTCC AAATT--CCT GCTGATGGAT GCCTTAA--- CTCCTTGTTA TCAGCGGTCC AAATT--CCT CCTGATGGAT GCTTTGA--- ATACCAGAGA ACAAGAGCCC TCGTG--CGG ACTGGAATGG ACCCCAG--A ATACCAGAGA ACAAGAGCCC TCGTG--CGG ACTGGAATGG ACCCCAG--A ATACCAGAGA ACAAGAGCTC TCGTG--CGT ACTGGGATGG ACCCTAG--A ATACCAGAGG ACAAGAGCTC TTGTT--CGC ACCGGAATGG ATCCCAG--G ATACCAGAGG ACAAGAGCTC TTGTT--CGA ACTGGAATGG ATCCCAG--A TTATCAGAGG ACAAGGGCTC TTGTT--CGC ACCGGAATGG ATCCCAG--G AAATTGGGGG AATAAGGATG GTGGA--CAT CCTTAGGCAA AACCCAACTG AAATTGGCGG AGTAAGAATG GTAGA--CAT CCTTAAACAA AACCCAACAG AGATTGGCGG GACAAGGATG GTGGA--CAT TCTTAGGCAG AACCCAACGG AAATTGGCGG GACAAGGATG GTGGA--CAT TCTTAGACAG AACCCGACTG AGATTGGTGG AATTAGGATG GTAGA--CAT CCTTAAGCAG AACCCAACAG

TTGTCAAGGG ACT--GGTTC ATGCT--AAT GCCCAAGCAG AAAGTGGAAG TTGTCAAGAA ACT--GGTTC ATGCT--AAT GCCCAAGCAG AAAGTGGAAG ATGTCAAGGG AAT--GGTCC ATGCT--CAT ACCCAAGCAG AAAGTGGCAG ATGTCAAGGG ATT--GGTTA ATGCT--CAT TCCCAAGCAG AAAGTGACAG ATGAGCCGAG AAT--GGTAC ATGCT--GAT GCCTAGGCAG AAAATAACTG CGCTTTGTCC AGAATGCCTT AAATG--GAA ATGGAGATCC AAACAATATG CGATTTGTCC AAAATGCCCT AAATG--GGA ATGGAGACCC AAACAACATG CGCTTTGTCC AAAATGCCCT CAATG--GGA ATGGGGATCC AAATAACATG CGCTTTGTCC AAAATGCCCT CAATG--GGA ATGGAGATCC AAATAACATG CGCTTTGTCC AAAATGCCCT TAATG--GGA ACGGGGATCC AAATAACATG CTATCAAAAC CCAACCACTT ACATTTCCGT TGGAACATCA ACACTGAACC GTACCAGAAT GTGGGAACCT ATGTTTCCGT AGCCACATCA ACATTGTACA CTATCAGAAT GAAAATGCTT ATGTCTCTGT AGTGACTTCA AATTATAACA GTACATAAGA AACGACACAA CAACAAGCGT GACAACAGAA GATTTGAATA ATATGCTCAA GCATCAGGAA GAATCACAGT CTCTACCAAA AGAAGCCAAC GCACTCCAAT GGGACCGTCA AAGACAGAAG CC-CTCACAG AACATTGA-- GCATTCAAAT GGGACTGTTA AGGACAGAAG CC-CTTATAG GGCCTTAA-- ACATTCAAAT GACACAATAC ATGATAGAAT CC-CTCATCG AACCCTAT-- GCATTCAAAT GACACAGTAC ATGATAGGAC CC-CTTATCG GACCCTAT-- ACACTCAAAT GGCACAATAC ATGATAGGAG TC-CCCATAG AACCCTTT-- ATTGGCAAAT GTCGTGAGGA AGATGATGAC TAACTCACAA GATACAGAGC ATTGGCAAAT GTTGTGAGAA AGATGATGAC TAACTCACAA GACACAGAGC ACTAGCAAAT GTTGTTAGAA AAATGATGAC TAATTCACAA GACACAGAGC ACTGGCAAAT GTTGTGAGAA AAATGATGAC TAATTCACAA GACACTGAGC GTTGGCAAAT GTTGTAAGGA AGATGATGAC CAATTCTCAG GACACCGAAC

AATTAAGCAT CGAAGACCCG AGTCATGAGG GGGAGGGGAT -ACCGCTATA AATTAAGCAT TGAGGACCCG AGCCATGAGG GGGAGGGGAT -ACCGCTATA AATTAAGCAT TGAGGACCCA AGTCACGAAG GGGAGGGAAT -ACCACTATA AATTAAGCAT TGAGGACCCA AGTCATGAAG GAGAGGGAAT -ACCGCTATA AATTGAGCAT TGAAGACCCA AGTCATGAAG GAGAAGGGAT -TCCATTATA ATGTGCTCTC TGATGCAAGG ATCAACCCTC CCGAGGAGAT CTGGAGCTGC ATGTGCTCTC TGATGCAAGG ATCAACCCTC CCGAGGAGAT CTGGAGCTGC ATGTGCTCTC TGATGCAAGG ATCAACTCTC CCGAGGAGAT CTGGAGCTGC ATGTGCTCTT TGATGCAGGG TTCGACTCTC CCTAGGAGGT CTGGAGCTGC ATGTGCTCTC TGATGCAGGG CTCGACTCTC CCTAGAAGGT CCGGAGCTGC ATGTGCTCTC TGATGCAAGG TTCAACTCTC CCTAGGAGGT CTGGAGCCGC AGGAGCAAGC TGTGGATATA TGCAAAGCAG CAATGGGTTT GAGGATCAGT AAGAGCAAGC TGTAGATATA TGCAAGGCAG CAATGGGTTT GAAAATCAGC AAGAACAAGC TGTGGATATA TGCAAGGCTG CAATGGGACT GAGAATCAGC AAGAACAAGC TGTGGATATA TGCAAGGCTG CAATGGGATT GAGAATCAGC AAGAGCAAGC CGTGGGTATA TGCAAGGCTG CAATGGGACT GAGAATTAGC GCCCTCTTTG CATCAGAATA GACCAGGCAA TCATGGATAA GAACATCATG GACCTCTTTG CATCAGAATG GACCAGGCAA TCATGGAGAA AAACATCATG GCCCTCTTTG TATCAGAATG GACCAGGCGA TCATGGATAA AAACATCATA GGCCCCTTTG CATTAGAATG GACCAGGCAG TAATGGGTAA AACCATCATA GAGGCCTTAT GGTGAAAATG GACCAAGCCA TAATGGATAA AAGAATTATC GATAGGGCAG TTAAGCTATA CAAGAAGCTG AAAAGAGAAA TAACATTCCA GACAGGGCAG TTAAACTATA CAAGAAGCTG AAGAGGGAAA TGACATTCCA GACAGAGCAG TTAAACTGTA TAGAAAGCTT AAGAGGGAGA TAACATTCCA GACAAAGCAG TTAAACTGTA TAGGAAACTT AAGAGGGAGA TAACGTTCCA GACAAAGCAG TTAAACTGTA TAGGAAGCTC AAGAGGGAGA TAACATTCCA AGAGATTGGT TCCAGAAATA GCTACTAGAC CCAAAGTAAA CGGGCAAAGT AAAGGTCAAT CCCAGAAATA GCAGCAAGGC CTAAAGTGAA TGGACTAGGA GGAGATTTAC CCCGGAAATA GCAGAAAGAC CCAAAGTAAG AGATCAAGCT GGACCTTCAA ACCAGTGATA GGGCCAAGGC CCCTTGTCAA TGGTCTGCAG AAACCGTAAT CCCGAGTATC GGATCTAGAC CCAGGATAAG GGATGTCCCC TGAGTTGTCC TGTGGGTGA- GGCTCCCTCC CCATATAACT CAAGGTTTGA TGAGCTGCCC TGTCGGTGA- AGCTCCGTCC CCGTACAATT CAAGATTTGA TAATGAATGA GTTGGGTG-- --TTCCATTT CATTTAGGAA CCAGGCAAGT TGATGAATGA ATTAGGTG-- --TTCCATTT CATCTGGGGA CCAAGCAAGT TAATGAACGA GTTGGGTG-- --TTCCATTT CATTTGGGAA CCAAACAAGT TCTCTTTTAC 
AATTACTG-- GAGACAACAC CAAATGGAAT GAGAATCAGA TCTCCTTTAC AGTTACCG-- GAGACAACAC CAAATGGAAT GAGAATCAGA TCTCTTTCAC AATTACTG-- GAGACAACAC CAAATGGAAT GAGAATCAAA TTTCTTTCAC AATCACTG-- GGGACAACAC TAAGTGGAAT GAAAATCAAA TTTCTTTGAC CATCACTG-- GAGATAACAC CAAATGGAAC GAAAATCAGA

TGATGCAATC AAATGCATGA AAACATTTTT CGGCTGGAAA GAGCCCAACA TGATGCGATA AAATGCATGA AAACATTCTT CGGCTGGAGA GAGCCCAACA TGATGCGATC AAGTGCATGA GAACATTCTT TGGATGGAAA GAACCCTATA TGATGCAATC AAATGCATGA GAACATTCTT TGGATGGAAG GAACCCAATG TGATGCGATC AAGTGCATAA AAACATTCTT TGGATGGAAA GAACCTTATA TGGTGCAGCA A--TAAAGGG AGTCGGGACA ATGGTAATGG AACTA--AT- TGGTGCAGCA A--TAAAGGG AGTCGGGACA ATGGTAATGG AACTA--AT- TGGTGCGGCA G--TAAAGGG AGTCGGAACG ATGGTGATGG AACTA--AT- AGGCGCTGCA G--TCAAAGG AGTTGGGACA ATGGTGATGG AGTTG--AT- AGGTGCTGCA G--TCAAAGG AATCGGGACA ATGGTGATGG AACTG--AT- AGGTGCTGCA G--TCAAAGG AGTTGGAACA ATGGTGATGG AATTG--GT- TCATCCTTTA GCTTTGGAGG CTTCACTTTC AAAAGAACAA ATGGATCAT- TCATCCTTCA GCTTTGGAGG GTTCACTTTC AAAAGAACAA AGGGGTCTT- TCATCCTTCA GTTTTGGCGG GTTCACATTT AAGAGAACAA GCGGGTCAT- TCATCCTTCA GCTTTGGTGG GTTTACATTT AAAAGAACAA GCGGGTCAT- TCATCCTTCA GTTTTGGTGG ATTCACATTT AAGAGAACAA GCGGATCAT- TTGAAAGCGA ATTTCAGTGT GATTTTTGAC CGGCTAGAGA CCCTAATAT-

TTGAAAGCGA ATTTCAGTGT GATTTTTGAC CGACTAGAGA CCATAGTAT- CTGAAAGCGA ACTTCAGTGT GATTTTTGAC CGGCTGGAGA CTCTAATAT- TTGAAAGCAA ACTTTAGTGT GATTTTTAAT CGACTTGAAG CTCTGATAC- CTTAAAGCAA ATTTCTCAGT TCTATTTGAT CAACTAGAGA CATTAGTCT- TGGGGCTAAG GAGGTCGCAC -TCAGCTACT CAACCGGTGC ACTTGCCAGT TGGAGCAAAG GAAGTTGCAC -TCAGTTACT CAACTGGTGC GCTTGCCAGT TGGGGCCAAA GAAGTAGCGC -TCAGTTATT CTGCTGGTGC ACTTGCCAGT TGGGGCCAAA GAAATAGCTC -TCAGTTATT CTGCTGGTGC ACTTGCCAGT TGGGGCCAAA GAAATCTCAC -TCAGTTATT CTGCTGGTGC ACTTGCCAGT GGAAGAATGG AGTTCTTCTG GACAATTTTA AAGCCGAATG ATGCCATCAA CGTAGAATGG AATTCTCTTG GACCCTCTTG GATATGTGGG ACACCATAAA GGGAGGATGA ACTATTACTG GACCTTGCTA AAACCCGGAG ACACAATAAT GGAAGAATTG ATTATTATTG GTCGGTACTA AAACCAGGCC AAACATTGCG AGCAGAATAA GCATCTATTG GACAATAGTA AAACCGGGAG ACATACTTTT GTCTGTTGCT TGGTCGGCAA GTGCTTGCCA TGATGGCACC AGTTGGTTGA ATCGGTTGCT TGGTCAGCAA GTGCATGTCA TGATGGCATG GGCTGGCTAA GTGTGTAGCA TGGTCCAGCT CAAGTTGTCA CGATGGAAAA GCATGGTTGC GTGCATAGCA TGGTCCAGCT CAAGTTGTCA CGATGGAAAA GCATGGCTGC GTGCATAGCA TGGTCCAGCT CAAGCTGCCA TGATGGGAAG GCATGGTTAC ACCCTCGGAT GTTTCTAGCA ATGATAACAT ACATCACAAG GAACCAACCT ATCCTCGAAT ATTTCTAGCA ATGATAACAT ACATCACAAG GAACCAACCT ATCCTCGAGT GTTTCTGGCG ATGATAACAT ACATCACAAG AAATCAACCT ACCCTCGAAT GTTTTTGGCG ATGATTACAT ATATCACAAA AAATCAACCT ATCCTCGGAT GTTTTTGGCC ATGATCACAT ATATGACCAG AAATCAGCCC

TTGTAAAACC ACATGAAA-- -AAGGCATAA ACCCCAATTA CCTCCTGGCT TCATCAAGCC ACACGAGA-- -AGGGCATAA ATCCCAATTA TCTTCTGGCT TTGTTAAACC ACACGAAA-- -AGGGAATAA ATCCAAATTA TCTGCTGTCA TTGTTAAACC ACACGAAA-- -AGGGAATAA ATCCAAATTA TCTTCTGTCA TAGTCAAACC ACACGAAA-- -AGGGAATAA ATTCAAATTA CCTGCTGTCA TCGGATGATA AAGCGAGG-- -CATTAATGA CCGGAACTTC TGGAGAGGCG TCGGATGATA AAGCGAGG-- -CATTAATGA CCGGAACTTC TGGAGAGGCG TCGGATGATA AAGCGAGG-- -GATTAACGA TCGGAATTTC TGGAGAGGTG CAGGATGATC AAACGTGG-- -GATCAATGA TCGGAACTTC TGGAGAGGTG CAGAATGGTC AAACGGGG-- -GATCAACGA TCGAAATTTC TGGAGAGGTG CAGGATGATC AAACGTGG-- -GATCAATGA TCGGAACTTC TGGAGGGGTG CCGTCAAGAA GGAAGAGG-- -AAGTGCTTA CAGGCAACCT CCAAACATTG CTGTCAAAAG AGAGGAAG-- -AAGTGCTTA CAGGCAACCT CCAAACATTG CAATCAAGAG AGAGGAAG-- -AAGTGCTTA CGGGCAATCT CCAAACATTG CAGTCAAAAA AGAGGAAG-- -AAGTGCTTA CAGGCAATCT CCAAACATTG CAGTCAAGAG AGAGGAAG-- -AGGTGCTTA CGGGCAATCT TCAAACATTG TACTAAGGGC TTTCACCG-- -AAGAGGGAG CAATTGTTGG CGAAATTTCA TACTAAGGGC TTTCACCG-- -AAGAGGGAG CAATTGTTGG CGAAATCTCA TGCTAAGGGC TTTCACCG-- -AAGAGGGAG CAATTGTTGG CGAAATTTCA TACTTAGAGC GTTTACAG-- -ATGAAGGAG CAATAGTGGG CGAAATCTCA CTCTGAGGGC ATTCACAG-- -AAAGTGGTG CTATTGTGGC TGAAATATTT TGTATGGGTC TCATATAC-- -AACAGGATG GGAACGGTGA CC-ACAGAAG TGCATGGGTC TCATATAC-- -AACCGGATG GGAACAGTGA CC-ACAGAAG TGCATGGGCC TCATATAC-- -AACAGGATG GGGGCTGTGA CC-ACTGAAG TGCATGGGCC TCATATAC-- -AATAGGATG GGGGCTGTAA CC-ACTGAAG TGTATGGGCC TCATATAC-- -AACAGGATG GGGGCTGTGA CC-ACTGAAG TTTCGAGAGT AATGGAAATT TCATTGCTCC AGAATATGCA TACAAAATTG TTTTGAGAGC ACTGGTAATC TAGTTGCACC AGAGTATGGG TTCAAAATAT ATTTGAGGCA AATGGAAATC TAATAGCACC AAGGTATGCT TTCGCACTGA AGTACGATCC AATGGGAATC TAATTGCTCC ATGGTATGGA CACGTTCTTT GATTAACAGC ACAGGGAATC TAATTGCTCC TCGGGGTTAC TTCAAAATAC CAATTGGAAT TTCTGGCCCA GACAATGGGG CTGTGGCTGT ATTGAAATAC CAATCGGAAT TTCAGGTCCA GATAATGGAG CAGTGGCTGT ATTAAAATAC ATGTTTGTGT CACTGGGGAT GATAAAAATG CAACTGCTAG CTTCATTTAT ATGTTTGTGT AACGGGGGAT GATAAAAATG CAACTGCTAG CTTCATTTAC ATGTTTGTGT CACTGGGGAT GATAGAAATG CGACTGCTAG CATCATTTAT GAATGGTTTA 
GAAATGTCTT AAGCATTGCT CCTATAATGT TCTCAAACAA GAATGGTTTA GAAATGTCTT GAGCATTGCC CCTATAATGT TCTCAAATAA GAATGGTTTA GAAACGTCCT GAGCATTGCA CCCATAATGT TCTCAAATAA GAGTGGTTCA GAAACATCCT GAGCATCGCA CCAATAATGT TCTCAAACAA GAATGGTTCA GAAATGTTCT AAGTATTGCT CCAATAATGT TCTCAAACAA

TGGAAGCAGG TGCTGGCAGA GCTCCAAGAT ATTGAAAACG AGGAGAAAAT TGGAAGCAGG TGCTGGCAGA ACTCCAGGAT ATTGAAAATG AGGATAAAAT TGGAAGCAAG TACTGGCGGA ACTGCAGGAC ATTGAGAATG AGGAGAAGAT TGGAAGCAAG TACTGGCAGA ACTGCAGGAC ATTGAGAATG AGGAGAAAAT TGGAAGCAAG TATTGTCAGA ATTGCAGGAC ATTGAAAATG AGGAGAAGAT ATAATGGACG AAGAACAAGG ATTGCATATG AGAGAATGTG C---AACATC ATAATGGACG AAGAACAAGG ATTGCATATG AGAGAATGTG C---AACATC AAAATGGGCG AAGAACAAGA ATTGCATATG AGAGAATGTG C---AACATC AGAATGGACG GAAAACAAGG AGTGCTTACG AGAGAATGTG C---AACATT AGAATGGGCG GAAAACAAGA AGTGCTTATG AGAGAATGTG C---AACATT AGAATGGACG AAAAACAAGA ATTGCTTATG AAAGAATGTG C---AACATT AAAATAAAAG TACATGAGGG G-TATGAAGA ATTCACAATG G---TTGGGC AAGATAAAAG TACATGAAGG A-TATGAGGA ATTCACAATG G---TTGGAC AAAATAAGGG TGCATGAGGG G-TACGAGGA ATTCACAATG G---TGGGGA AAGATAAGAG TACATGAGGG G-TATGAGGA GTTCACAATG G---TGGGGA AAGATAAGAG TGCATGAGGG A-TATGAAGA GTTCACAATG G---TTGGGA CCATTGCCTT CTCTTCCAGG A-CAT--ACT ATTGAGGATG T---CAAAAA CCATTGCCTT CTTTTCCAGG A-CAT--ACT ATTGAGGATG T---CAAAAA

CCATTGCCTT CTCTTCCAGG A-CAT--ACT GCTGAGGATG T---CAAAAA CCATTACCTT CCCTTCCAGG A-CAT--ACT GACGAGGATG T---CAAAAA CCCATTCCCT CCGTACCAGG A-CAT--TTT ACAGAGGATG T---CAAAAA TGGCTTTTGG CCTAGTGTGT GCCACTTGTG AGC-AGATTG C-----AGAT TGGCTCTTGG CCTAGTATGT GCCACTTGTG AAC-AGATTG C-----TGAT TGGCCTTTGC CGTGGTATGT GCAACCTGTG AAC-AGATTG C-----TGAC TGGCATTTGG CCTGGTATGT GCAACATGTG AAC-AGATTG C-----TGAC TGGCATTTGG CCTGGTATGT GCAACCTGTG AAC-AGATTG C-----TGAC TCAAGAAAGG GGACTCAGCA ATTATGAAAA GTGAATTGGA ATATGGTAAC CGAAAAGAGG TAGTTCAGGG ATCATGAAGA CAGAAGGAAC ACTTGAGAAC GTAGAGGCTT TGGGTCCGGC ATCATCACCT CAAACGCATC AATGCATGAG CAGGAGGGAG CCATGGAAGA ATCCTGAAGA CTGATTTAAA AGGTGGTAAT GAAGTGGGAA AAGCTCA--- ATAATGAGAT CAGATGCACC CATTGGCAAA AACGGCATAA TAACAGACAC TATCAAGAGT TG-GAGGAAC AACAT---AC AACGGCATAA TAACTGAAAC CATAAAAAGT TG-GAGGAAG AAAAT---AT GACGGGAGGC TTATGGACAG TATTGGTTCA TG-GTCTCAA AATAT---CC AATGGGAGGC TTGTAGATAG TATTGTTTCA TG-GTCCAAA AAAAT---CC GATGGGATGC TTACCGACAG TATTGGTTCA TG-GTCTAAG AACAT---CC GATGGCAAGA TTAGGGAAAG GATACATGTT CGAAAGTAAG AGCATGAAGC AATGGCGAGG TTAGGAAAAG GATACATGTT CGAGAGTAAG AGCATGAAGC AATGGCTAGA CTAGGGAAAG GTTACATGTT CGAAAGCAAG AGCATGAAGC AATGGCAAGA CTAGGAAAAG GATACATGTT CGAGAGTAAG AGAATGAAGC AATGGCGAGA CTGGGAAAAG GGTATATGTT TGAGAGCAAG AGTATGAAAC

TCCAAAGACA AAGAACAT-- GAGGAAAACA AGCCAATTGA AGTGGGCACT CCCAAAAACA AAGAACAT-- GAAGAAAACA AGCCAATTAA TGTGGGCACT TCCAAGAACT AAAAACAT-- GAAGAAAACG AGTCAGCTAA AGTGGGCACT TCCAAAGACT AAAAATAT-- GAAAAAAACA AGTCAGCTAA AGTGGGCACT CCCAAGGACT AAAAACAT-- GAAGAAAACG AGTCAACTAA AGTGGGCTCT CTCAAAGGGA AATTTCAA-- ACAGCAGCAC AAAGAGCAAT GATGGATCAG CTCAAAGGGA AATTTCAA-- ACAGCAGCAC AAAGAGCAAT GATGGATCAG CTCAAAGGGA AATTCCAA-- ACAGCAGCAC AAAGAGCAAT GATGGATCAG CTCAAAGGAA AATTTCAA-- ACAGCTGCAC AAAGAGCAAT GATGGATCAA CTTAAAGGAA AATTTCAA-- ACAGCTGCAC AAAGAGCAAT GGTGGATCAA CTCAAAGGGA AATTTCAA-- ACTGCTGCAC AAAAAGCAAT GATGGATCAA GGAGAGCAAC AGCTATCC-- TGAGGAAAGC AACTAG-AAG GCTGATTCAG GAAGAGCAAC AGCCATTC-- TAAGAAAAGC AACCAG-AAG GATGATCCAA AAAGGGCAAC AGCTATAC-- TCAGAAAAGC AACCAG-GAG ATTGGTTCAG AAAGAGCAAC AGCTATAC-- TCAGAAAAGC AACCAG-AAG ATTGGTTCAG GAAGAGCAAC AGCCATAC-- TCAGAAAAGC AACCAG-GAG ATTGATTCAG TGCAATTGGG GTCCTCAT-- CGGAGGACTT GAATGG-AAT GATAACACAG TGCAATTGGG GTCCTCAT-- CGGAGGACTT GAATGG-AAT GATAACACAG TGCAGTTGGA GTCCTCAT-- CGGAGGACTT GAATGG-AAT GATAACACAG TGCAATTGGG GTCCTCAT-- CGGAGGACTT GAATGG-AAT GATAACACAG TGCAATTGGA ATCCTCAT-- CGGTGGACTT GAATGG-AAT GATAACTCAA TCACAGCATC GGTCTCAC-- AGACAGAT-- -GGCAACTAC CACCAACCCA GCCCAACATC GGTCCCAC-- AGGCAGAT-- -GGCGACTAC CACCAACCCA TCCCAGCATA GGTCTCAC-- AGGCAAAT-- -GGTGACAAC AACCAATCCA TCCCAGCACA GGTCTCAT-- AGGCAAAT-- -GGTGGCAAC AACCAATCCA TCCCAGCATC GGTCTCAT-- AGGCAAAT-- -GGTGACAAC AACCAACCCA TGCAACACCA AGTGTCAA-- ACTCCAATGG GGGCGATAAA CTCTAGTATG TGTGAAACCA AATGCCAA-- ACTCCTTTGG GAGCAATAAA TACAACACTA TGTAACACGA AGTGTCAA-- ACACCCCTGG GAGCTATAAA CAGCAGTCTC TGTGTAGTGC AATGTCAG-- ACTGAAAAAG GTGGCTTAAA CAGTACATTG TGCAATTCTG AATGCATC-- ACTCCAAATG GAAGCATTCC CAATGACAAA TGAGAACTCA AGAGTCTG-- AATGTGCATG TGTAAATGGC TCTTGCTTTA TGAGGACACA AGAGTCTG-- AATGTGCCTG TGTAAATGGT TCATGTTTTA TCAGGACCCA GGAGTCGG-- AATGCGTTTG TATCAATGGG ACTTGCACAG TCAGGACCCA GGAGTCAG-- AATGCGTTTG TATCAATGGA ACTTGTACAG TCAGAACTCA GGAGTCAG-- AATGCGTTTG CATCAATGGA ACTTGTACAG TACGGACACA 
AATACCAGCA GAAATGCTTG CAAGCATTG- ACTTGAAATA TACGGACACA AATACCAGCA GAAATGCTTG CAAACATTG- ACTTGAAATA TCCGAACACA AATACCAGCA GAAATGCTAG CAAGTATTG- ACCTGAAATA TCCGAACACA AATACCCGCA GAAATGCTAG CAAGCATTG- ACCTGAAGTA TTAGAACTCA AATACCTGCA GAAATGCTAG CAAGCATTG- ATTTGAAATA

TGGTGAGAAT ATGGCACCAG AGAAAGTAGA CTT--TGAGG ATTGCAAAGA CGGGGAGAAT ATGGCACCGG AAAAATTGGA CTT--TGAGG ACTGCAAAGA TGGTGAGAAC ATGGCACCAG AGAAGGTAGA CTT--TGACA ACTGTAGAGA TGGTGAGAAC ATGGCACCAG AAAAGGTAGA CTT--TGACG ACTGTAAAGA TGGTGAAAAC ATGGCACCAG AGAAAGTAGA CTT--TGACA ACTGCAGAGA GTGCGAGAAA GCAGAAATCC TGGGAATGCT GA---AATTG AAGATCTCAT GTGCGAGAAA GCAGAAATCC TGGGAATGCT GA---AATTG AAGATCTCAT GTACGGGAAA GCAGAAATCC TGGGAATGCT GA---GATTG AAGATCTCAT GTGAGAGAAA GCCGGAACCC AGGAAATGCT GA---GATCG AAGATCTAAT GTGAGAGAAA GTCGGAACCC AGGAAATGCT GA---GATCG AAGATCTCAT GTGAGAGAGA GCCGGGACCC AGGGAATGCT GA---GTTCG AAGATCTCAC TTGATAGTAA GTGGAAG-AG ATGAACAATC AAT--CGCTG AAGCGATCAT CTGATAGTCA GCGGAAG-GG ACGAGCAATC AAT--TGCTG AGGCAATTAT CTGATAGTGA GTGGAAG-AG ACGAACAGTC AAT--AGCCG AAGCAATAAT CTCATAGTGA GTGGAAG-AG ACGAACAGTC AAT--AGCCG AAGCAATAAT CTGATAGTGA GTGGGAG-AG ACGAACAGTC GAT--TGCCG AAGCAATAAT TTCG-AGTCT CTAAAACTCT ACAGAGATTC GCT--TGGAG AAGCAGTAAT TTCG-AGTCT CTAAAAATCT ACAGAGATTC GCT--TGGAG AAGCAGTAAT TTCG-AGTCT CTGAAACTCT ACAGAGATTC GCT--TGGAG AAGCAGTAAT

TTCG-AGTCT CTGAAACTCT ACAGAGATTC ACT--TGGAG AAGCAGTGAT TTCG-AGCGT CTGAAAATAT ACAGAGATTC GCT--TGGGG AATCCATGAT CTAATCAGGC ATGAGAACAG AATGGTGCTG GCCAGCACTA CAGCTAAGGC CTAATCAGGC ATGAGAACAG AATGGTACTA GCCAGCACTA CGGCTAAGGC CTAATAAGAC ATGAGAACAG AATGGTTCTG GCCAGCACTA CAGCTAAGGC TTAATAAAAC ATGAGAACAG AATGGTTTTG GCCAGCACTA CAGCTAAGGC CTAATCAGAC ATGAGAACAG AATGGTTTTA GCCAGCACTA CAGCTAAGGC CCATTCCACA ACATACACCC CCTCACCATC GGG--GAATG CCCCAAATAT CCTTTTCACA ATGTCCACCC ACTGACAATA GGT--GAATG CCCCAAATAT CCTTTCCAGA ATATACACCC AGTCACAATA GGA--GAGTG CCCAAAATAC CCATTCCACA ATATCAGTAA ATATGCATTT GGA--ACCTG CCCCAAATAT CCATTTCAAA ATGTAAACAG GATCACATAT GGG--GCCTG TCCCAGATAT CTGTAATGAC TGACGGACCA AGTAATGGGC AGGCCTCATA T-AAGATCTT CTATAATGAC TGATGGCCCG AGTGATGGGC TGGCCTCGTA C-AAAATTTT TAGTAATGAC TGATGGAA-G TGCTTCAGGA AGAGCCGATA CTAGAATACT TAGTAATGAC TGATGGGA-G TGCTTCAGGA AAAGCTGATA CTAAAATACT TAGTAATGAC TGATGGAA-G TGCATCAGGA AGGGCTGATA CTAAAATACT CTTCAACGAA TCAACGAG-- ----AAAGAA AATCGAGAAA ATAAGACCTC CTTCAACGAA TCGACGAG-- ----AAAGAA AATTGAGAAA ATAAGACCTC CTTTAATGAA TCAACCAG-- ----AAAGAA AATTGAGAAA ATAAGGCCTC TTTCAATGAA TCAACAAG-- ----GAAGAA AATTGAGAAA ATAAGGCCTC TTTCAATGAT TCAACAAG-- ----AAAGAA GATTGAAAAA ATCCGACCGC

TGTTAGCGAT CTAAGGCAGT ATGACAGTGA TGAACCAAAG CCTAGATCAC TATTGGCGAT CTGAAACAGT ATCAAAGTGA TGAGCCAGAG CTCAGATCGA CATAAGCGAT TTGAAGCAAT ATGATAGTGA CGAACCTGAA TTAAGGTCAC TGTAGGTGAT TTGAAGCAAT ATGATAGTGA TGAACCAGAA TTGAGGTCGC CATAAGCGAT TTGAAGCAAT ATGATAGTGA CGAACCTGAA TTAAGGTCAC CTTTCTGGCA CGGTCTGCAC ----TCATCC TGAG--AGGA TC--CGTAGC CTTTCTGGCA CGGTCTGCAC ----TCATCC TGAG--AGGA TC--CGTAGC ATTTCTGGCA CGGTCTGCAC ----TCATCC TGAG--AGGA TC--AGTGGC CTTTCTGGCA CGGTCTGCAC ----TCATAT TGAG--AGGG TC--AGTTGC ATTTTTGGCA AGATCTGCAT ----TGATAT TGAG--AGGG TC--AGTTGC TTTTCTAGCA CGGTCTGCAC ----TCATAT TGAG--AGGG TC--GGTTGC TGTAGCAATG GTGTTCTCAC AGGAGGATTG CATGATAAAG GC--AGTCCG TGTGGCAATG GTGTTCTCAC AAGAAGATTG CATGGTAAAG GC--AGTCCG TGTAGCCATG GTGTTTTCAC AAGAAGATTG CATGATAAAA GC--AGTTAG CGTGGCCATG GTGTTTTCAC AAGAGGATTG CATGATAAAA GC--AGTTAG TGTGGCCATG GTATTTTCAC AAGAGGATTG TATGATAAAA GC--AGTTAG -GAGAATGGG AGACCTCCAC ----TCACTC CAAAACAGAA AC--GGAAAA -GAGAATGGG GGACCTCCAC ----TTACTC CAAAACAGAA AC--GGAAAA -GAGAATGGG AGACCTCCAC ----TCACTC CAAAACAGAA AC--GAGAAA -GAGAATGGG AGATCTCCAC ----TCCCTC CAAAACAGAA AC--GGAAAG -GAGAATGGG GGACCTTCAC ----TCCCTC CAAAACAGAA AC--GCTACA TATGGAGCAG ATGGCTGGAT CGAGTGAGCA GGCAGCGGAA GCCATGGAGG CATGGAGCAG ATGGCTGGAT CAAGTGAGCA GGCAGCAGAA GCCATGGAAG TATGGAGCAA ATGGCTGGAT CGAGTGAGCA AGCAGCAGAG GCCATGGAGG TATGGAGCAA ATGGCTGGAT CAAGTGAGCA GGCAGCGGAG GCCATGGAAA TATGGAGCAA ATGGCTGGAT CGAGTGAGCA AGCAGCAGAG GCCATGGAGG GTGAAATCAA ACAGATTAGT CCTTGCGACT GGACTCAGAA ATACCCCTCA GTAAAATCGG AGAAATTGGT CTTAGCAACA GGACTAAGGA ATGTTCCCCA GTCAGGAGTG CCAAATTGAG GATGGTTACA GGACTAAGGA ACATTCCGTC GTAAGAGTTA ATAGTCTCAA ACTGGCAGTC GGTCTGAGGA ACGTGCCTGC GTTAAGCAAA ACACTCTGAA ATTGGCAACA GGGATGCGAA ATGTACCAGA CAAAATGGAA AAAGGGAAAG TAGTTAAATC AGTCGAATTG AA--TGCCCC CAAGATCGAA AAGGGGAAGG TTACTAAATC AATAGAGTTG AA--TGCACC ATTCATTGAA GAGGGGAAAA TTGTCCATAT TAGCCCATTG TC--AGGAAG ATTCATTGAG GAGGGGAAAA TCATTCATAC TAGCACATTG TC--AGGAAG ATTCATTAGA GAAGGGAAAA TTGTCCACAT TGGTCCACTG TC--AGGAAG TACTAATAGA 
TGGCACAGCC TCATTGAGTC CTGGAATGAT GA--TGGGCA TACTAATAGA GGGCACAGCC TCATTGAGTC CAGGGATGAT GA--TGGGCA TCCTAATAGA TGGCACAGTC TCATTGAGTC CTGGAATGAT GA--TGGGCA TTCTAATAGA TGGCACAGCA TCATTGAGCC CTGGGATGAT GA--TGGGCA TCTTAATAGA GGGGACTGCA TCATTGAGCC CTGGAATGAT GA--TGGGCA

TAGCAAGCTG GATCCA---- ---G-AGTGA ATTCAACAAG GCATGCGAAT TAGCAAGCTG GATCCA---- ---G-AGTGA GTTCAACAAG GCATGTGAAT TTTCAAGCTG GATCCA---- ---G-AATGA GTTCAACAAG GCATGCGAGC TTGCAAGTTG GATTCA---- ---G-AATGA GTTCAACAAG GCATGCGAAC TTTCAAGCTG GATACA---- ---G-AATGA GTTCAACAAG GCCTGCGAGC CCATAAGTCC TGCTTG---- ---CCTGCTT GTGTGTA--- -----CGGGC CCATAAGTCC TGCTTG---- ---CCTGCTT GTGTGTA--- -----CGGGC CCACAAGTCC TGCTTG---- ---CCTGCTT GTGTGTA--- -----CGGGC TCACAAATCT TGTCTG---- ---CCCGCCT GTGTGTA--- -----TGGAC TCACAAATCT TGCCTA---- ---CCTGCCT GTGCGTA--- -----TGGAC TCACAAGTCC TGCCTG---- ---CCTGCCT GTGTGTA--- -----TGGAC AGGCGATCTG AATTTC---- ---GTGAACA GAGCAAACCA AAGATTGAAC AGGTGATTTG AATTTC---- ---GTAAACA GAGCAAATCA ACGACTGAAT AGGTGACCTG AATTTC---- ---GTTAATA GGGCAAATCA GCGATTGAAT AGGTGACCTG AATTTC---- ---GTCAACA GAGCAAATCA ACGGTTGAAC AGGTGATCTG AATTTC---- ---GTCAATA GGGCGAATCA GCGACTGAAT TGGCGAGAAC AATTAG---- ---GTCAAAA GTTCGAAGAG ATAAGATGGC TGGCGAGAAC AGCTAG---- ---GTCAAAA GTTTGAAGAG ATAAGATGGC TGGCGGGAAC AATTAG---- ---GTCAGAA GTTTGAAGAA ATAAGATGGT TGGAGAGAAC AATTGA---- ---GCCAGAA GTTTGAAGAG ATAAGATGGT

TGGCGAAACG AGTTGA---- ---GTCAGAA GTTTGAAGAG ATCAGATGGC TTGCTAGTCA GGCTAG---- ---GCAGATG GTGCAGGCAA ---TGAGGAC TCGCAAGTCA GGCTAG---- ---GCAAATG GTGCAGGCTA ---TGAGGAC TTGCTAGTCA GGCCAG---- ---GCAAATG GTGCAGGCAA ---TGAGAGC TTGCTAGTCA GGCCAG---- ---GCAAATG GTGCAGGCAA ---TGAGAGC TTGCTAGTCA GGCTAG---- ---GCAAATG GTGCAAGCGA ---TGAGAAC GAGAGAGAGA AGAAGAAAAA AGAGAGGACT ATTTGGAGCT ATAGCAGGTT GATTGAATCA AG-------- ----AGGATT GTTTGGGGCA ATAGCTGGTT CATTCAATCC AG-------- ----AGGTCT ATTTGGAGCC ATTGCCGGTT TAGATCAAGT AG-------- ----AGGACT ATTTGGAGCC ATAGCTGGAT GAAACAAACT AG-------- ----AGGCAT ATTTGGCGCA ATCGCGGGTT TAATTATCAC TATGAGGAGT GCTCCTGTTA TCCTGATGCT G---GCGAAA TAATTCTCAC TATGAGGAAT GTTCCTGTTA CCCTGATACC G---GCAAAG TGCTCAGCAT GTAGAGGAGT GTTCCTGTTA TCCTCGATAT C---CTGACG TGCTCAGCAT GTCGAGGAGT GCTCCTGCTA TCCTCGATAT C---CTGGTG TGCTCAGCAT GTGGAGGAAT GCTCCTGTTA CCCCCGGTAT C---CAGAAG TGTTCAATAT GCTGAGTACA GTCTTAGGAG TTTCAATCCT GAATCTTGGG TGTTTAATAT GCTAAGTACG GTCTTAGGAG TCTCAATCTT AAATCTTGGG TGTTCAACAT GCTAAGTACA GTCTTAGGAG TCTCAATCCT GAATCTCGGG TGTTCAACAT GCTAAGTACG GTTTTAGGAG TCTCGGTACT GAATCTTGGG TGTTCAATAT GTTAAGCACT GTATTAGGCG TCTCCATCCT GAATCTTGGA

TGACAGATT- CAAGTTGGAT TGAACTTGAT GAAATAGGGG AAGACGTTGC TGACCGATT- CGAGCTGGAT AGAACTCGAT GAGATAGGGG AAGATGTTGC TGACCGATT- CAATCTGGAT AGAGCTCGAT GAGATTGGAG AAGACGTGGC TGACAGATT- CAAGCTGGAT AGAGCTTGAT GAGATTGGAG AAGATGTGGC TAACTGATT- CAATCTGGAT AGAGCTCGAT GAAATTGGAG AGGACGTAGC TCGCTGTG-- GCCAGTGGAT ATGATTTTGA GAGGGAAGGG TACTCTCTGG TCGCTGTG-- GCCAGTGGAT ATGATTTTGA GAGGGAAGGG TACTCTCTGG TTGCCGTG-- GCCAGTGGAT ATGACTTTGA GAGAGAAGGG TACTCTCTGG CTGCCATA-- GCCAGTGGGT ACAACTTCGA AAAAGAGGGA TACTCTCTAG CTGCAGTA-- TCCAGTGGGT ACGACTTCGA AAAAGAGGGA TATTCCTTGG CTGCCGTA-- GCCAGTGGGT ACGACTTTGA AAGAGAGGGA TACTCTCTAG CCCATGCATC AACTCCTGAG GCACTTCCAA AAAGATGCAA AAGTGCTGTT CCCATGCACC AACTCCTGAG ACACTTTCAA AAGGATGCAA AGGTGCTGTT CCCATGCATC AACTTTTAAG ACATTTTCAG AAAGATGCAA AAGTGCTCTT CCCATGCATC AGCTTTTAAG GCATTTTCAG AAAGATGCGA AAGTGCTTTT CCTATGCATC AACTTTTAAG ACATTTTCAG AAGGATGCGA AAGTGCTTTT TGATTGAA-- GAAGTGAGAC ACAGATTGAA GATAACAGAG AATAGTTTTG TGATTGAA-- GAAGTGAGAC ACAGACTAAA AACAACTGAA AATAGCTTTG TGATTGAA-- GAAGTGAGAC ACAAACTGAA GGTAACAGAG AATAGTTTTG TAATTGAA-- GAAATGCGAC ATAGGTTAAG AATTACAGAG AATAGCTTTG TCATTGCT-- GAATGTAGAA ATATACTGAC AAAGACTGAA AATAGCTTTG AATTGGGACT CATCCTAGCT CCAGTGCCGG TCTGAAAGAT AATCTTCTTG AATTGGGACT CACCCTAGTT CCAGTGCAGG TCTAAAAGAT GATCTTATTG CATTGGGACT CCTCCTAGCT CCAGTGCTGG TCTAAAAGAT GATCTTCTTG CGTTGGGACT CATCCTAGCT CCAGTACTGG TCTAAGAGAT GATCTTCTTG CATTGGGACT CATCCTAGCT CCAGTGCTGG TCTGAAAAAT GATCTTCTTG TTATAGAGGG AGGATGGCAG GGAATGGTAG ATGGTTGGTA TGGGTACCAC TTATAGAAGG AGGATGGCAA GGAATGGTTG ATGGTTGGTA TGGATACCAT TTATTGAAGG GGGATGGACT GGAATGATAG ATGGATGGTA CGGTTATCAT TCATAGAAGG AGGTTGGCCA GGACTAGTCG CTGGCTGGTA TGGTTTCCAG TCATAGAAAA TGGTTGGGAG GGAATGGTAG ACGGTTGGTA CGGTTTCAGG TCACATGTGT GTGCAGGGAT AAT----TGG CATGGCTCAA ATCGGCCATG TGATGTGTGT GTGCAGAGAC AAT----TGG CATGGTTCGA ACCGGCCATG TCAGATGTAT CTGCAGAGAC AAC----TGG AAAGGCTCTA ATAGGCCCGT TCAGATGTGT CTGCAGAGAC AAC----TGG AAAGGCTCCA ATAGGCCCAT TTAGATGTGT TTGCAGAGAC AAT----TGG AAGGGCTCCA ATAGACCCGT CAGAAGAGGT 
ACACCAAAAC CACATACTGG TGGGACGGAC TCCAATCCTC CAGAAGAGGT ACACCAAAAC CACATACTGG TGGGATGGGC TCCAATCCTC CAAAAGAAAT ACACCAAAAC NACATACTGG TGGGACGGAC TCCAATCCTC CAAAAGAAAT ACACCAAGAC AACATACTGG TGGGATGGGC TCCAATCCTC CAAAAGAGAT ACACCAAGAC TACTTACTGG TGGGATGGTC TTCAATCCTC

TCCAATTGAG CACATTGCAA --GTATGAGA AGGAACTATT TCACAGCGGA CCCAATTGAG CACATTGCAA --GCATGAGA AGGAACTACT TCACAGCGGA TCCAATTGAA CACATTGCAA --GCATGAGA AGGAATTACT TCACAGCAGA TCCAATTGAA CACATTGCAA --GCATGAGA AGGAATTATT TCACATCAGA CCCAATTGAG TACATTGCAA --GCATGAGG AGGAATTATT TCACAGCAGA TTGGGATAGA TCCTTTCCGT CTGCTTCAGA ACAGTCAGGT CTTCAG-TCT TTGGGATAGA TCCTTTCCGT CTGCTTCAGA ACAGTCAGGT CTTCAG-TCT TCGGGATTGA TCCTTTCCGT CTGCTGCAAA ACAGCCAGGT CTTTAG-TCT TGGGAATAGA CCCTTTCAAA CTGCTTCAAA ACAGCCAAGT ATACAG-CCT TGGGAATAGA CCCTTTCAAA CTACTTCAAA ATAGCCAAAT ATACAG-CCT TCGGAATAGA CCCTTTCAGA CTGCTTCAAA ACAGCCAAGT GTACAG-CCT TCAGAACTGG GGAATTGAAC --CTATTGAC AATGTCATGG GGATGATCGG TCAAAACTGG GGAATTGAAC --CCATCGAC AATGTCATGG GTATGATTGG TCAAAATTGG GGAATTGAAC --ATATCGAC AATGTAATGG GAATGATTGG TCAAAATTGG GGAATTGAAC --ACATCGAC AGTGTGATGG GAATGGTTGG TCAAAATTGG GGAGTTGAAC --CTATCGAC AATGTGATGG GAATGATTGG AGCAAATAAC ATTTATGCAA --GCCTTACA GCTACTATTT GAAG---TGG AACAAATAAC ATTCATGCAA --GCATTACA ACTGCTGTTT GAAG---TGG AGCAAATAAC ATTTATGCAA --GCCTTACA TCTATTGCTT GAAG---TGG AGCAAATAAC CTTTATGCAA --GCCTTACA ACTATTGCTT GAAG---TGG AACAGATAAC ATTTTTGCAA --GCATTGCA ACTCTTACTT GAAG---TTG


AAAATTTGCA GGCCTACCAA --------AA ACGAATGGGA GTGCAAATGC AAAATTTGCA GGCTTACCAG --------AA ACGGATGGGA GTGCAAATGC AAAATTTGCA GGCCTATCAG --------AA ACGAATGGGG GTGCAGATGC AAAATTTGCA GACCTATCAG --------AA ACGAATGGGG GTGCAGATGC AAAATTTGCA GGCCTATCAG --------AA ACGAATGGGG GTGCAGATGC CATAGCAATG AGCAGGGGAG TGGATACGCT GCAGACAAAG AATCCACTCA CACAGCAATG ACCAGGGATC AGGGTATGCA GCAGACAAAG AATCCACTCA CATCAGAATG AACAGGGATC AGGCTATGCA GCGGATCAAA AAAGCACACA CATTCAAATG ATCAAGGGGT TGGTATGGCT GCAGATAGGG ATTCAACTCA CATCAAAATT CTGAGGGAAC AGGACAAGCA GCAGATCTCA AAAGCACTCA GGTATCTTTC AA--TCAAAA --TTTGGAGT ATCAAATA-G GATATATATG GGTGTCTTTC GA--TCAAAA --CCTGGATT ATCAAATA-G GATACATCTG CATAGACATA AATATGGAAG --ATTATAGC ATTGATTCCA GTTATGTGTG CGTAGATATA AACATAAAGG --ATTATAGC ATTGTTTCCA GTTATGTGTG GCTATATATA AATGTGGCAG --ATTATAGT GTTGATTCTA GTTATGTGTG TGATGATTTC GCTCTCATAG --TGAATGCA CCAAATCATG AGGGAATAGA TGATGATTTC GCTCTCATAG --TGAATGCA CCAAATCATG AGGGAATACA TGATGACTTC GCTCTCATAG --TGAATGCA CCAAATCATG AGGGAATACA CGACGATTTT GCCCTCATAG --TGAATGCA CCAAATCATG AGGGAATACA TGACGATTTT GCTCTGATTG --TGAATGCA CCCAATCATG AAGGGATTCA

AGTATCCCAT TGCAGGGC-- ----TACTGA ATACATAATG AAGGGAGTGT AGTGTCTCAT TGCAGGGC-- ----CACTGA GTACATAATG AAGGGGGTTT GGTGTCCCAT TGCAGAGC-- ----CACAGA ATATATAATG AAGGGGGTAT GGTGTCTCAC TGCAGAGC-- ----CACAGA ATACATAATG AAGGGGGTGT GGTGTCCCAT TGTAGAGC-- ----CACTGA GTACATAATG AAGGGGGTAT TATTAGACCA AATGAGAATC C-AGCACATA AAAGTCAATT GGTATGGATG TATTAGACCA AATGAGAATC C-AGCACATA AAAGTCAATT GGTATGGATG AATTAGACCA AATGAGAATC C-AGCACATA AAAGTCAATT GGTGTGGATG AATCAGACCG AACGAGAATC C-AGCACACA AGAGTCAGCT GGTGTGGATG AATCAGACCT AACGAGAATC C-AGCACACA AGAGTCAGCT GGTGTGGATG AATCAGACCA AATGAGAATC C-AGCACACA AGAGTCAACT GGTGTGGATG AATATTACCT GACATGACTC CAAGCGCAGA GATGTCACTG AGAGGAGTGA AATATTGCCT GACATGACCC CCAGCACGGA AATGTCACTA AGAGGAGTGA AGTATTACCA GACATGACTC CAAGCACAGA GATGTCAATG AGAGGGATAA AGTATTACCA GATATGACTC CAAGCACAGA GATGTCAATG AGAGGAATAA GATATTGCCC GACATGACTC CAAGCATCGA GATGTCAATG AGAGGAGTGA AACAAGAGAT AAGAACTTTC TCGTTTCAGC TTATTTAA-- ---------- AACAGGAGAT AAGAACTTTC TCATTTCAGC TTATTTAATG ATAAAAAACA AGCAAGAGAT AAGAACTTTC TCATTTCAGC TTATTTAATA ATAAAAAACA AGCAAGAGAT AAGAACTTTC TCGTTTCAGC TTATTTAATG ATAAAAAACA AGAGTGAGAT AAGGACCTTC TCTTTTCAGC TTATTTAATA CTAAAAAACA AGCGATTCAA GTGATCCTCT TGTTGTTGCC GCAAGTATCA TTGGGATACT AGAGATTCAA GTGATCCTCT CGTTGTTGCA GCAAGTATCA TTGGGATATT AACGATTCAA GTGACCCCCT TGTTGTTGCT GCGAGTATCA TTGGGATCTT AACGATTCAA GTGACCCGCT TGTTGTTGCC GCGAGTATCA TTGGGATCTT AACGGTTCAA GTGATCCTCT CGCTATTGCC GCAAATATCA TTGGGATCTT AAAGGCAATA GATGGAGTCA CCAATAAGGT CAACTCGATC ATTGACAAAA AAAGGCATTT AATGGAATCA CCAACAAGGT AAATTCTGTG ATTGAAAAGA AAATGCCATT AACGGGATTA CAAACAAGGT GAACTCTGTT ATCGAGAAAA AAAGGCAATT GATAAAATAA CATCCAAGGT GAATAATATA GTCGACAAGA AGCAGCAATC AACCAAATCA ATGGGAAGCT GAATAGGTTG ATCGGGAAAA CAGTGGAGTT TTCGG----- -AGACAATCC ACGCCCCAAT GATGGAACAG CAGTGGGGTT TTCGG----- -TGACAACCC GCGTCCCAAA GATGGAACAG CTCAGGGCTT GTTGG----- -CGACACACC CAGAAACGAC GACAGATCTA CTCAGGGCTT GTTGG----- -AGACACACC CAGAAAAAAC GACAGCTCCA CTCAGGACTT GTTGG----- -CGACACACC AAGAAATGAC GATAGCTCCA AGCAGGGGTG 
GATAGGTT-- -CTATAGGAC TTGCAAACTA GTTGGAATCA AGCAGGAGTG GATAGATT-- -CTATAGGAC TTGCAAGCTA GTTGGAATCA AGCAGGGGTG AATAGATT-- -CTACAGAAC CTGCAAGCTA GTCGGAATCA AGCAGGAGTG GATAGATT-- -CTACAGGAC CTGCAAGTTA GTGGGAATCA AGCCGGAGTC GACAGGTT-- -TTATCGAAC CTGTAAGCTA CATGGAATCA

ACATAAACAC AGCTTTGTTG AATGCATCCT GTGCAGCCAT GGATGACTTC ACATAAATAC AGCTTTGCTC AATGCATCTT GTGCAGCCAT GGATGACTTC ACATTAATAC TGCCTTGCTT AATGCATCCT GTGCAGCAAT GGACGATTTC ACATCAATAC TGCCTTACTT AATGCATCTT GTGCAGCAAT GGATGATTTC ACATTAATAC TGCCCTGCTC AATGCATCCT GTGCAGCAAT GGACGATTTT GCAT--GCCA TTCTGCAGCA TTTGAGGACC TGAGAGTCTC AAGTTTCATT GCAT--GCCA TTCTGCAGCA TTTGAGGACC TGAGAGTCTC AAGTTTCATT GCAT--GCCA TTCTGCAGCA TTTGAAGATC TGAGAGTCTC AAGCTTCATC GCAT--GCAA TTCTGCTGCA TTTGAAGATC TAAGAGTATT AAGCTTCATC GCAT--GCCA TTCTGCTGCA TTTGAAGATT TAAGATTGTT AAGCTTCATC GCAT--GCCA TTCTGCCGCA TTTGAAGATC TAAGAGTATT GAGCTTCATC GAGTTAGTAA GATGGGAGTA GATGAATATT CCAGCACGGA GAGAGTGGTG GAGTTAGCAA AATGGGGGTG GATGAATATT CTAGCACTGA AAGGGTGGTC GAGTCAGCAA AATGGGCGTG GATGAATACT CCAGCACAGA GAGGGTAGTG GAGTCAGCAA AATGGGTGTG GATGAATACT CCAGTACAGA GAGGGTGGTG GAATCAGCAA AATGGGTGTA GATGAGTACT CCAGCACGGA GAGGGTAGTG ---------- ---------- ---------- ---------- ---------- CCCTTGTTTC TACT------ ---------- ---------- ---------- CCCTTGTTTC TACT------ ---------- ---------- ---------- CCCTTGTTTC TACT------ ---------- ---------- ---------- C--------- ---------- ---------- ---------- ---------- GCACTTGATA TTGTGGATTC TTGATCGTCT TTTCTTCAAA TGCATTTATC


GCACTTGATA TTGTGGATTC TTGATCGTCT TTTCTTCAAA TGCATTTATC GCACTTTATA TTGTGGATTC TTGATCGTCT TTTTTTCAAA TGCATTTATC GCACTTGATA TTGTGGATTC TTGATCGTCT TTTTTTCAAA TGCGTCTATC GCACTTGATA TTGTGGATTC TTGATCGTCT TTTTTTCAAA TGCATTTACC TGAACACTCA GTTTGAGGCC GTTGGAAGGG AATTTAATAA CTTGGAAAGG TGAACACCCA ATTTGAAGCT GTTGGGAAAG AATTCAGTAA CTTAGAGAAA TGAACATTCA ATTCACAGCT GTGGGTAAAG AATTCAACAA ATTAGAAAAA TGAACAAGCA ATATGAAATA ATTGATCATG AATTCAGTGA GGTTGAAACT CAAACGAGAA ATTCCATCAG ATTGAAAAAG AATTCTCAGA AGTAGAAGGG GCAGTTGTGG TCCGGTGTCC CCTAAC---- -----GGGGC ATATGGAGTA GCAGCTGTGG TCCAGTGTAT GTTGAT---- -----GGAGC AAACGGAGTA GCAATAGTAA TTGCAGGAAT CCTAACAATG AGAGAGGGAA TCCAGGAGTG GCAGTAGCCA TTGCTTGGAT CCTAACAATG AAGAAGGTGG TCATGGAGTG GCAGCAGTAA CTGCAGGGAT CCTAATAACG AGAGAGGGGG CCCAGGAGTG ATATGACCAA GAAGAAGTCT TACATAAATC GGACAGGAAC ATGTGAATTC ACATGAGCAA AAAGAAGTCT TACATAAATC GGACAGGAAC ATTTGAGTTC ATATGAGCAA AAAGAAGTCC TACATAAATA GGACAGGGAC ATTTGAATTC ACATGAGCAA AAAGAAGTCC TATATAAATA AAACAGGGAC ATTTGAATTC ATATGAGCAA GAAAAAGTCT TACATAAACA GAACAGGTAC ATTTGAATTC

CAACTGATCC CAATGATAAG CAAATGCAGA ACCAAAGAAG GAAGACGGAA CAACTGATTC CAATGATAAG CAAATGCAGA ACAAAAGAAG GAAGAAGGAA CAACTAATTC CCATGATAAG CAAGTGTAGA ACTAAAGAGG GAAGGCGAAA CAATTAATTC CAATGATAAG CAAGTGTAGA ACTAAGGAGG GAAGGCGAAA CAACTAATTC CCATGATAAG CAAGTGCAGA ACTAAAGAGG GAAGGCGAAA AGAGGAACAA GAGTGATCCC AAGAGGACAA CTATCCACTA GAGGAGTTCA AGAGGAACAA GAGTGATCCC AAGAGGACAA CTATCCACTA GAGGAGTTCA AGAGGGACAA GAGTGGCCCC AAGGGGACAA CTATCTACTA GAGGAGTTCA AGAGGGACCA AAGTATCCCC AAGGGGGAAA CTTTCCACTA GAGGAGTACA AGAGGGACAA AAGTATCTCC GCGGGGGAAA CTGTCAACTA GAGGAGTACA AAAGGGACGA AGGTGGTCCC AAGAGGGAAG CTTTCCACTA GAGGAGTTCA GTGAGTATTG ACCGTTTCTT GAGGGTCCGA GATCAGCAGG GGAACGTACT GTGAGCATTG ACCGTTTCTT AAGGGTCCGA GATCAGCGAG GAAATGTACT GTAAGCATTG ACCGGTTTTT GAGAGTTCGA GACCAACGAG GAAATGTACT GTTAGCATTG ATCGGTTTTT GAGAGTTCGA GACCAACGCG GGAATGTATT GTGAGCATTG ACCGGTTCTT GAGAGTCCGG GACCAACGAG GAAATGTACT ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- GTCGCCTTAA ATACGGTTTG AAAAGAGGGC CTTCTACGGA AGGGGTACCT GTCGCTTTAA ATACGGTTTG AAAAGAGGGC CTTCTACGGA AGGAGTGCCT GCTTCTTTAA ACACGGTCTG AAAAGAGGGC CTTCTACGGA AGGAGTACCT GACTCTTCAA ACACGGCCTT AAAAGAGGCC CTTCTACGGA AGGAGTACCT GTCGCTTTAA ATACGGACTG AAAGGAGGGC CTTCTACGGA AGGAGTGCCA AGGATAGAGA ATTTAAACAA GCAGATGGAA GACGGATTCC TAGATGTCTG AGACTGGAGA ACTTGAACAA AAAGATGGAA GACGGGTTTC TAGATGTGTG AGGATGGAAA ATTTAAATAA AAAAGTTGAT GATGGATTTC TGGACATTTG AGACTCAATA TGATCAATAA TAAGATTGAT GACCAAATAC AAGACGTATG AGAATTCAGG ACCTCGAGAA ATATGTTGAG GACACTAAAA TAGATCTCTG AAAGGGTTTT CATTTAAATA CGGCAATG-- ----GTGTTT GGATCGGGAG AAGGGATTTT CATATAGGTA TGGTAATG-- ----GTGTTT GGATAGGAAG AAAGGCTGGG CCTTTGACAA TGGAGATG-- ----ACGTGT GGATGGGAAG AAAGGCTGGG CCTTTGATGA TGGAAATG-- ----ACGTGT GGATGGGAAG AAAGGGTGGG CCTTTGACAA TGGAAATG-- ----ATGTTT GGATGGGACG ACAAGCTTCT 
TCTACCGCTA TGGGTTCGTA GCCAACTTCA GTATGGAGCT ACAAGCTTTT TCTACCGCTA TGGGTTTGTA GCCAACTTCA GCATGGAGCT ACAAGCTTTT TCTATCGCTA TGGATTTGTA GCCAATTTTA GCATGGAGCT ACAAGCTTTT TTTATCGATA TGGATTTGTG GCTAATTTTA GCATGGAGCT ACAAGTTTTT TCTATCGTTA TGGGTTTGTT GCCAATTTCA GCATGGAGCT

AACTAACCTG TATGGATTCC TTATAAAAGG AAGATCC--C ATTTGAGAAA GACAAACCTG TATGGGTTCA TTATAAAAGG AAGGTCC--C ATTTGAGAAA GACCAATTTA TATGGTTTCA TCATAAAAGG AAGATCT--C ACTTAAGGAA GACCAACTTG TATGGTTTCA TCATAAAAGG AAGATCC--C ACTTAAGGAA AACCAATTTA TATGGATTCA TCATAAAGGG AAGATCT--C ATTTAAGGAA GATTGCTTCA AATGAGAACG TGGAAGCAAT GGATTCCAGC ACTCTTGAAC GATTGCTTCA AATGAGAACG TGGAAGCAAT GGATTCCAGC ACTCTTGAAC AATTGCTTCA AATGAGAACA TGGAAACAAT GGACTCCAGC ACTCTTGAAC AATTGCTTCA AATGAAAACA TGGATACTAT GGAATCAAGT ACTCTTGAAC AATTGCTTCA AATGAGAACA TGGATAATAT GGGATCGAGC ACTCTTGAAC AATTGCTTCC AATGAAAATA TGGAGACTAT GGAATCAAGT ACACTTGAAC CTTATCTCCT GAAGAGGTTA GTGAAACACA GGGAACAGAG AAGTTGACAA CCTATCCCCT GAAGAAGTTA GTGAAACACA GGGAATGGAA AAGTTGACGA ACTATCTCCT GAGGAGGTCA GTGAAACACA GGGGACAGAG AAACTGACAA ATTGTCTCCT GAGGAGGTCA GTGAAACACA GGGAACTGAA AGATTGACAA ACTGTCTCCC GAGGAGGTCA GTGAAACACA GGGAACAGAG AAACTGACAA ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- GAGTCTATGA GGGAAGAGTA TCGGCAGGAA CAGCAGAGTG CTGTGGATGT GAGTCTATGA GGGAAGAGTA TCGGCAGGAA CAGCAGAATG CTGTGGATGT


GAGTCTATGA GGGAAGAATA TCGAAAGGAA CAGCAGAGTG CTGTGGATGC GAGTCTATGA GGGAAGAATA TCGAAAGGAA CAGCAGAATG CTGTGGATGC AAGTCTATGA GGGAAGAATA TCGAAAGGAA CAGCAGAGTG CTGTGGATGC GACTTATAAT GCTGAACTTC TGGTTCTCAT GGAAAAT--G AGAGAACTCT GACATACAAT GCAGAGCTTC TAGTTCTGAT GGAAAAT--G AGAGGACACT GACATATAAT GCAGAATTGT TAGTTCTACT GGAAAAT--G AAAGGACTCT GGCATATAAT GCAGAATTGC TAGTACTACT TGAAAAT--C AAAAAACACT GTCATACAAC GCGGAGCTTC TTGTGGCCCT GGAGAAC--C AACATACAAT AACCAAAAGC ACTAATTCCA GGAGCGGCTT TGAAATGATT TGGGATCCAA GACCAAAAGT CACAGTTCCA GACATGGGTT TGAGATGATT TGGGATCCTA AACGATCAGC AAGGATTTAC GCTCAGGTTA TGAAACTTTC AAAGTCATTG AACGATCAGC GAGAAGTTAC GCTCAGGATA TGAAACCTTC AAAGTCATTG AACAATCAAG AAAGATTCGC GCTCTGGTTA TGAGACTTTC AGGGTCGTTG GCCCAGCTTT GGAGTGTCTG GGATTAATGA ATCGGCTGAC ATGAGCATTG GCCCAGCTTT GGAGTTTCCG GAATTAATGA ATCGGCTGAC ATGAGCATTG GCCCAGCTTT GGAGTGTCTG GAATTAATGA ATCGGCTGAT ATGAGCATTG TCCCAGTTTT GGAGTGTCTG GAATAAACGA GTCAGCTGAT ATGAGTATTG TCCCAGTTTT GGTGTGTCTG GGAGCAACGA GTCAGCGGAC ATGAGTATTG

TGACACCGAT GTGGTAAACT TTGTGAGTAT GGAATTCTCT CTTACTGATC TGATACTGAC GTGGTGAACT TTGTGAGTAT GGAATTCTCC CTTACTGACC TGACACCGAC GTGGTAAACT TTGTGAGCAT GGAGTTTTCT CTCACTGACC TGACACCGAC GTGGTAAACT TTGTGAGCAT GGAGTTTTCT CTCACTGACC TGACACAGAT GTGGTAAACT TTGTGAGCAT GGAGTTTTCT CTCACTGACC TGAGAAGCAG ATATTGGGCT ATAAGGACCA GGAGTGGAGG AAACACCAAT TGAGAAGCAG ATATTGGGCT ATAAGGACCA GGAGTGGAGG AAACACCAAT TGAGAAGCAG ATATTGGGCT ATAAGGACCA GGAGTGGAGG AAACACCAAC TAAGAAGCAG GTACTGGGCC ATAAGGACCA GAAGTGGAGG AAACACTAAT TGAGAAGCGG GTACTGGGCC ATAAGGACCA GGAGTGGAGG AAACACTAAT TGAGAAGCAG GTACTGGGCC ATAAGGACCA GAAGTGGAGG AAACACCAAT TAACATATTC ATCCTCAATG ATGTGGGAAA TCAACGGTCC TGAGTCAGTG TAACTTATTC ATCGTCTATG ATGTGGGAGA TTAACGGGCC AGAATCAGTG TAACTTACTC ATCGTCAATG ATGTGGGAGA TTAATGGCCC TGAGTCAGTG TAACATATTC ATCGTCGATG ATGTGGGAGA TTAACGGTCC TGAGTCGGTT TAACTTACTC ATCGTCAATG ATGTGGGAGA TTAATGGTCC TGAATCAGTG ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- TGACGATGGT CATTTTGTCA ACATAGAGCT GGAGTAAAAA ACTACCTTGT TGACGATGGT CATTTTGTCA ACATAGAGCT GGAGTAAAAG ATCTTCCT-- TGACGATAGT CATTTTGTCA GCATAGAGCT GGAGTAAAAA ACTACCTTGT TGACGACAGT CATTTTGTCA GCATAGAGTT GGAGTAAAAA ACTACCTTGT TGACGATGGT CATTTTGTCA GCATAGAGCT GGAGTAAAAA ACTACCTTGT AGACTTTCAT GACTCAAATG TCAAGAACCT TTA-TGACAA GGTCCGACTA TGACTTTCAT GATTCTAATG TCAAGAATCT GTA-TGATAA AGTCAGAATG GGATTTCCAT GACTCAAATG TGAAGAATCT GTA-TGAGAA AGTAAAAAGC CGATGAGCAT GATGCGAACG TGAACAATCT ATA-TAACAA GGTGAAGAGG TGATCTAACT GACTCAGAAA TGAACAAACT GTT-TGAAAG AACAAAGAAG ATGGGTGGAC TGGAACGGAC AGTA-GCTTC TCGGTGA--- --AACAAGAT ATGGATGGAC AGAGACTGAT AGTA-AGTTC TCTGTGA--- --GGCAAGAT GTGGTTGGTC CACACCTAAT TCCA-AATCG CAGATCAA-- TAGACAGGTC AAGGCTGGTC CAAACCTAAT TCCA-AATTG CAGATAAA-- TAGGCAAGTC GTGGTTGGAC TACGGCTAAT TCCA-AGTCA CAAATAAA-- TAGGCAAGTC GTGTTACAGT 
GATAAAGAAC AATATGATGG ACAACGACCT TGGACCAGCA GAGTTACAGT GATAAAGAAT AATATGATAA ACAACGACCT TGGACCAGCA GGGTAACAGT GATAAAGAAC AATATGATAA ATAATGACCT TGGGCCAGCA GAGTAACAGT GATAAAGAAC AACATGATAA ACAATGACCT TGGGCCAGCA GAGTTACTGT CATCAAAAAC AATATGATAA ACAATGATCT TGGTCCAGCA

CGAGGCTGGA GCCACACAGA TGGGAAAAGT ACTGCGTTCT TCGGATAGGA CAAGGCTGGA GCCACACAAA TGGGAAAAGT ACTGTGTTCT TGAAGTAGGG CGAGACTTGA GCCACACAAA TGGGAGAAGT ACTGTGTCCT TGAGATAGGA CAAGACTTGA ACCACACAAA TGGGAGAAGT ACTGTGTTCT TGAGATAGGA CGAGACTTGA GCCACATAAA TGGGAGAAAT ACTGTGTCCT TGAGATAGGA CAACAGAGAG CATCTGCAGG ACAAATCAGT GTACAGCCCA CTTTCTCAGT CAACAGAGAG CATCTGCAGG ACAAATCAGT GTACAGCCCA CTTTCTCAGT CAGCAGAGAG CATCTGCAGG ACAAATCAGT GTGCAGCCTA CTTTCTCGGT CAACAGAGGG CCTCTGCAGG TCAAATCAGT GTACAACCTG CATTTTCTGT CAACAGAGGG CCTCCGCAGG CCAAACCAGT GTGCAACCTA CGTTTTCTGT CAACAGAGGG CATCTGCGGG CCAAATCAGC ATACAACCTA CGTTCTCAGT CTTGTTAACA CTTATCAATG GATCATCAGG AATTGGGAGA CTGTAAAGAT CTAGTTAACA CATATCAATG GATCATTAGG AATTGGGAGA CTGTAAAGAT TTGGTCAATA CCTATCAGTG GATCATCAGA AACTGGGAAA CTGTTAAAAT TTGGTCAATA CCTATCAATG GATCATCAGA AATTGGGAAG CTGTCAAAAT TTGGTCAATA CCTATCAATG GATCATCAGA AACTGGGAAA CTGTTAAAAT ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- TTCTACT--- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- TTCTACT--- ---------- ---------- ---------- ----------


TTCTACT--- ---------- ---------- ---------- ---------- TTCTACT--- ---------- ---------- ---------- ---------- CAGCTTAGGG ATAATGCAAA GGAGCTGGGT AATGGTTGTT TCGAGTTCTA CAGCTGAGAG ACAACGTCAA AGAACTAGGA AATGGATGTT TTGAATTTTA CAATTAAAGA ATAATGCCAA AGAAATCGGA AATGGATGTT TTGAGTTCTA GCACTGGGCT CCAATGCTAT GGAAGATGGG AAAGGCTGTT TCGAGCTATA CAACTGAGGG AAAATGCTGA GGATATGGGC AATGGTTGTT TCAAAATATA ATCGTAGCAA TAACTG---- ---ATTGGTC AGGATATAGC GGGAGTTTTG GTTGTGGCAA TGACTG---- ---ATTGGTC AGGGTATAGC GGGAGTTTCG ATAGTTGACA GCAATA---- ---ATTGGTC AGGTTACTCT GGTATTTTCT ATAGTTGACA GAGGTA---- ---ATAGGTC CGGTTATTCT GGTATTTTCT ATAGTTGACA GTGATA---- ---ACTGGTC TGGGTATTCT GGTATATTCT ACAGCTCAGA TGGCTCTTCA GCTATTCATT AAGGACTACA GATACCCATA ACAGCCCAGA TGGCTCTTCA GCTGTTCATT AAAGACTACA GATACACCTA ACAGCCCAAA TGGCTCTTCA ACTATTCATC AAAGACTACA GATACACGTA ACAGCCCAGA TGGCTCTCCA ATTGTTCATC AAAGACTACA GATATACATA ACAGCTCAAA TGGCCCTTCA GTTGTTCATC AAAGATTACA GGTACACGTA

GACATGCTCT TACGGACTGA AATAGGCCAA GTGTCAAGGC CCATGTTTCT GAAATGCTCT TGCGGACTGC AATAGGCCAG GTGTCAAGGC CCATGTTCCT GATATGCTAC TAAGAAGTGC CATAGGCCAG ATGTCAAGGC CTATGTTCTT GATATGCTTC TAAGAAGTGC CATAGGCCAG GTTTCAAGGC CCATGTTCTT GATATGTTAC TAAGAAGTGC CATAGGCCAA ATTTCAAGGC CTATGTTCTT ACAGAGAAAT CTTCCCTT-C GAAAGACCGA CCATTATGGC TGCGTTTAAG ACAGAGAAAT CTTCCCTT-C GAAAGACCGA CCATTATGGC TGCGTTTAAG ACAGAGAAAT CTTCCCTT-C GAAAGAGCGA CCATTATGGC GGCATTCACA GCAAAGAAAC CTCCCATT-T GACAAACCAA CCATCATGGC AGCATTCACT ACAAAGAAAC CTCCCATT-T GAAAAGTCAA CCATCATGGC AGCATTCACT ACAGAGAAAT CTCCCTTT-T GACAGAACAA CCGTTATGGC AGCATTCACT TCAATGGTCT CAAGATCC-C ACAATGCTGT ACAATAAGAT GGAGTTTGA- CCAATGGTCC CAAGAACC-C ACCATGCTAT ACAATAAGAT GGAGTTTGA- TCAATGGTCT CAGAATCC-T ACAATGCTAT ACAATAAAAT GGAATTTGA- TCAATGGTCT CAGAATCC-T GCAATGTTGT ACAACAAAAT GGAATTTGA- TCAGTGGTCC CAGAACCC-T ACAATGCTAT ACAATAAAAT GGAATTTGA- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- TCACAAATGT GATAATGAAT GTATGGAAAG TGTAAAAAAC GGAACGTATG TCACAAATGT GACAATGAAT GCATGGATAG TGTGAAAAAC GGGACATATG CCACAAGTGT GACAATGAAT GCATGGAAAG TGTAAGAAAT GGGACTTATG CCATAAATGT GATGATCAGT GCATGGAAAC AATTCGGAAC GGGACCTATA CCACAAATGT GACAATGCCT GCATAGGGTC AATCAGAAAT GGAACTTATG TCCAGCATCC AGAACTGACA GGATTAGATT GCATAAGACC TTGTTTCTGG TTCAACATCC TGAGCTAACA GGGCTAGACT GTATAAGGCC GTGCTTCTGG CT-------- ---GTTGA-G GGCAAAAGAT GCATCAATAG GTGCTTTTAT CT-------- ---GTTGA-A GGCAAAAGCT GCATCAATCG GTGCTTTTAT CT-------- ---GTTGA-A GGAAAAACCT GCATCAACAG GTGTTTTTAT CCGATGCCAC 
AGGGGGGA-T ACACAAATCC AAACGAGGAG ATCATTCGAG CCGATGCCAC AGAGGTGA-T ACACAAATTC AAACTAGAAG ATCATTTGAA CCGGTGCCAC AGAGGGGA-C ACACAAATTC AGACAAGGAG ATCATTCGAG TAGGTGCCAT AGAGGAGA-C ACACAAATTC AGACGAGAAG ATCATTCGAG CCGATGCCAT AGAGGTGA-C ACACAAATAC AAACCCGAAG ATCATTTGAA

TTATGTGAGA ACCAATGGAA CCTCCAAGAT CAAGATGAAA TGGGGCATGG GTATGTGAGA ACTAACGGAA CCTCCAAAAT TAAGATGAAA TGGGGGATGG GTATGTGAGG ACAAATGGAA CATCAAAGAT TAAAATGAAA TGGGGAATGG GTATGTGAGG ACAAATGGAA CCTCAAAAAT TAAAATGAAA TGGGGAATGG GTATGTGAGG ACAAACGGAA CATCAAAGGT CAAAATGAAA TGGGGAATGG GGGAATACCG AGGGCAGAAC ATCTGACATG AGGACTGAAA TCATAAGGAT GGGAATACCG AGGGCAGAAC ATCTGACATG AGGACTGAAA TCATAAGGAT GGGAATACAG AGGGCAGAAC ATCTGACATG AGGACTGAAA TCATAAGGAT GGGAATACAG AGGGAAGAAC ATCAGACATG AGGGCAGAAA TCATAAGGAT GGAAATACGG AGGGAAGGAC TTCAGACATG AGGGCAGAAA TCATAAGAAT GGGAATACAG AGGGGAGAAC ATCTGACATG AGGACCGAAA TCATAAGGAT ATCGTTCCAA TCCTTGGTGC CAAAGGCTGC CAGAAGCCAA TATAGTGGAT ACCATTTCAA TCTTTAGTAC CAAAGGCTGC CAGAAGCCAA TATAGTGGAT GCCATTTCAG TCTTTAGTTC CTAAGGCCAT TAGAGGCCAA TACAGTGGAT ACCATTTCAA TCTTTAGTCC CCAAGGCCAT TAGAAGCCAA TACAGTGGGT ACCATTTCAG TCTTTAGTAC CTAAGGCCAT TAGAGGCCAA TACAGTGGGT ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ----------


---------- ---------- ---------- ---------- ---------- ACTACCCGCA GTATTCAGAA GAAGCAAGAC TAAACAGAGA GGAAATAAGT ATTATCCCAA GTATGAAGAA GAATCTAAAC TAAATAGAAA TGAAATCAAA ATTATCCCAA ATATTCAGAA GAGTCAAAGT TGAACAGGGA AAAGGTAGAT ATAGGAGAAA GTATAGAGAG GAATCAAGAC TAGAAAGGCA GAAAATAGAG ACCATGATGT ATACAGAGAT GAAGCATTAA ACAACCGGTT CCAGATCAAA GTTGAGCTAA TCAGAGGGCG GCCCAAAGAG AGCACAATTT GGACTAGTGG GTTGAATTAA TCAGGGGACG ACCTAAAGAA AAAACAATCT GGACTAGTGC GTGGAGTTGA TAAGGGGAAG GCAACAGGAG ACTAGAGTAT GGTGGACCTC GTGGAGTTGA TAAGGGGAAG AAAAGAGGAA ACTGAAGTCT TGTGGACCTC GTGGAGTTGA TAAGAGGGAG ACCACAGGAG ACCAGAGTAT GGTGGACTTC CTGAAGAAGC TGTGGGAGCA GACCCGCTCA AAGGCAGGAC TGTTGGTTTC TTGAAGAAGC TGTGGGAGCA GACCCGCTCA AAGGCAGGAC TGTTGGTTTC CTAAAGAAGC TGTGGGAGCA AACCCGCTCA AAGGCAGGAC TTTTGGTGTC CTAAAGAAGC TGTGGGATCA AACCCAATCA AGGGCAGGAC TATTGGTATC ATAAAGAAAC TGTGGGAGCA AACCCGTTCC AAAGCTGGAC TGCTGGTCTC

AAATGAGGCG ATGCCCTTTT CAATCCCTTC AACAGATTGA GAGCATGATT AAATGAGACG CTGCCTTCTT CAATCTCTTC AACAGATTGA GAGCATGATC AGATGAGGCC TTGCCTCCTT CAGTCACTAC AACAAATCGA GAGTATGGTT AGATGAGGCG TTGTCTCCTC CAGTCACTTC AACAAATTGA GAGTATGATT AGATGAGACG TTGCCTCCTT CAGTCACTCC AGCAGATCGA GAGCATGATT GATGGAAAGT GCCAGACCAG AAGATGTGTC TTTCCAGGGG CGGGGAGTCT GATGGAAAGT GCCAGACCAG AAGATGTGTC TTTCCAGGGG CGGGGAGTCT GATGGAAAGC TCCAGACCAG AAGATGTGTC TTTCCAGGGG CGGGGAGTCT GATGGAAGGT GCAAAACCAG AAGAAATGTC CTTCCAGGGG CGGGGAGTCT GATGGAAGGT GCAAAACCAG AAGAAGTGTC ATTCCGGGGG AGGGGAGTTT GATGGAAAGT GCAAGACCAG AAGATGTGTC TTTCCAGGGG CGGGGAGTCT TTGTGAGAAC ACTATTCCAA CAGATGCGTG ATGTTTTGGG ---GACATTT TTGTGAGAAC GCTATTCCAG CAGATGCGTG ATGTTTTGGG ---AACGTTC TTGTTAGGAC TCTATTCCAA CAAATGAGGG ATGTACTTGG ---GACATTT TTGTCAGAAC TCTATTCCAA CAAATGAGAG ACGTACTTGG ---GACATTT TTGTGAGAAC TCTGTTCCAA CAAATGAGGG ATGTGCTTGG ---GACATTT ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- GGAGTAAAAT TGGAATCAAT GGGAACTTAC CAAATACTGT CAATTTATTC GGGGTAAAAT TGAGCAGCAT GGGGGTTTAT CAAATCCTTG CCATTTATGC GGAGTGAAAT TGGAATCAAT GGGGATCTAT CAGATTCTGG CGATCTACTC GGGGTTAAGC TGGAATCTGA GGGAACTTAC AAAATCCTCA CCATTTATTC GGTGTTGAGT TGAAGTCAGG ATACAAAGAT TGGATCCTAT GGATTTCCTT GAGCAGCATA TCTTTTTGTG GTGTAAATAG TGACAC-TGT GGGTTGGTCT GAGCAGCATT TCTTTTTGTG GCGTGAATAG TGATAC-TGT AGATTGGTCT AAACAGTATT GTTGTGTTTT GTGGCACTTC AGGTACTTAT GGAACAGGCT AAACAGTATT GTTGTGTTTT GTGGCACCTC AGGTACATAT GGAACAGGCT AAATAGCATC ATTGTATTTT GTGGAACTTC AGGTACCTAT GGAACAGGCT AGATGGAGGA 
CCAAACCCAT ACAATATCCG GAATCTCCAC ATTCCGGAGG AGATGGAGGG CCGAATTTAT ACAACATCCG GAATCTTCAC ATTCCAGAAG GGATGGAGGA TCAAACTTAT ACAATATCCG GAATCTCCAC ATTCCAGAAG AGATGGGGGA CCAAACTTAT ACAATATCCG GAACCTTCAC ATCCCTGAAG CGACGGAGGC CCAAATTTAT ACAACATTAG AAATCTCCAC ATTCCTGAAG

GAGGCCGAGT CTTCTGTCAA AGAAAAAGAC ATGACTAAAG AATTCTTTGA GAGGCTGAGT CTTCTATCAA AGAGAAAGAC ATGACCAAAG AATTCTTTGA GAAGCCGAGT CCTCTGTCAA AGAGAAAGAC ATGACCAAAG AGTTTTTTGA GAAGCTGAGT CCTCTGTCAA AGAGAAAGAC ATGACCAAAG AGTTCTTTGA GAAGCCGAGT CCTCGATTAA AGAGAAAGAC ATGACCAAAG AGTTTTTTGA TCGAGCTC-T CGGACGAAAA GGCAACGAAC CCGATCGTGC CTTCCTTTGA TCGAGCTC-T CGGACGAAAA GGCAACGAAC CCGATCGTGC CTTCCTTTGA TCGAGCTC-T CGGACGAAAA GGCAACGAAC CCGATCGTGC CTTCCTTTGA TCGAGCTC-T CGGACGAAAA GGCAACGAAC CCGATCGTGC CCTCTTTTGA TCGAGCTC-T CAGACGAGAA GGCAACGAAC CCGATCGTGC CCTCTTTTGA TCGAGCTC-T CGGACGAAAA GGCAGCGAGC CCGATCGTGC CTTCCTTTGA GATACTGT-C CAAATAATCA AGCTGCTACC ATTTGCAGCA GCCCCACCGG GACACTGT-T CAAATAATCA AACTACTACC ATTTGCAGCA GCCCCACCGG GATACCAC-C CAGATAATAA AGCTTCTTCC CTTTGCAGCC GCCCCACCAA GACACCAC-C CAGATAATAA AGCTTCTCCC TTTTGCAGCC GCTCCACCAA GATACCGC-A CAGATAATAA AACTTCTTCC CTTCGCAGCC GCTCCACCAA ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ----------


AACAGTGGCG AGTTCCCTAG CACTGGCAAT CATGGTAGCT GGTCTATCTT TACAGTAGCA GGTTCTCTGT CACTGGCAAT CATGATGGCT GGGATCTCTT AACTGTCGCC AGTTCACTGG TGCTTTTGGT CTCCCTGGGG GCAATCAGTT GACTGTCGCC TCATCTCTTG TGCTTGCAAT GGGGTTTGCT GCCTTCCTGT TGCCATATCA TGTTTTTT-- -GCTTTGTGT TGCTTTGTTG GGGTTCATCA --TGGCCAGA -CGATGCCGA GTTGCCATTC ACCATTG-AC AAGT------ --TGGCCAGA -CGGTGCTGA GTTGCCATTC ACCATTG-AC AAGT------ CATGGCCTGA -TGGGGCGAA CATCAATTTC ATGCCTATAT AA-------- CATGGCCTGA -TGGGGCGGA CATCAATCTC ATGCCTATAT AAGCTTTCGC CATGGCCTGA -TGGAGCGAA TATCAATTTC ATGTCTATAT AAGCTTTCGC CTGGCTTGAA GTGGGAATTG ATGGATGAAG ACTACCAGGG CAGACTGTGT TTTGCTTGAA GTGGGAGTTG ATGGATGAAG ATTACCAGGG AAGACTGTGT TCTGCTTGAA ATGGGAGCTA ATGGATGAAG ACTATCAGGG GAGGCTTTGT TCTGCTTAAA GTGGGAGCTA ATGGATGAGA ATTATCGGGG AAGACTTTGT TCTGCCTAAA ATGGGAATTG ATGGATGAGG ATTACCAGGG GCGTTTATGC

AAACAAATCA GAAACATGGC CAATTGGAGA ATC-ACCCAA GGGAGTGGAG AAACAGATCG GAGACATGGC CAATTGGAGA GTC-ACCTAA GGGAGTGGAG GAATAAATCA GAAACATGGC CCATTGGGGA GTC-CCCCAA AGGAGTGGAA GAACAAATCA GAAACATGGC CCATTGGAGA GTC-TCCCAA AGGAGTGGAG GAATAAATCA GAAGCATGGC CCATTGGGGA GTC-CCCCAA GGGAGTGGAA CATGAGTAAT GAAGGATCTT ATTTCTTCGG AGA-CAATGC AGAGGAATAT CATGAGTAAT GAAGGATCTT ATTTCTTCGG AGA-CAATGC AGAGGAATAT CATGAGTAAT GAAGGATCTT ATTTCTTCGG AGA-CAATGC AGAGGAATAT CATGAGTAAT GAAGGATCTT ATTTCTTCGG AGA-CAATGC AGAGGAGTAC TATGAGTAAT GAAGGATCTT ATTTCTTCGG AGA-CAATGC AGAAGAGTAC CATGAGTAAT GAAGGATCTT ATTTCTTCGG AGA-CAATGC AGAGGAGTAC AGCCGAGCAG AATGCAGTTT TCTTCTCTAA CTG-TGAATG TGAGAGGCTC AACAGAGTAG GATGCAATTT TCTTCTCTGA CTG-TGAATG TGAGGGGATC AGCAAAGTAG AATGCAGTTC TCTTCATTGA CTG-TGAATG TGAGGGGATC AGCAAAGCAG AATGCAGTTC TCTTCACTGA CTG-TAAATG TGAGGGGATC AGCAAAGTAG AATGCAGTTC TCCTCATTTA CTG-TGAATG TGAGGGGATC ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- TATGGATGTG CTCCAATGGA TCGTTACAAT GCA-GAATTT GCATTTAAAT TCTGGATGTG CTCCAACGGG TCTCTGCAGT GCA-GAATCT GCATATGA-- TCTGGATGTG TTCTAATGGA TCTTTGCAGT GCA-GAATAT GCATCTGAGA TCTGGGCCAT GTCCAATGGA TCTTGCAGAT GCA-ACATTT GTATATAA-- TGTGGGCCTG CCAAAAAGGC AACATTAGGT GCA-ACATTT GCATTTGAG- AGTTTGTTCA AAAAACTCCT TGTTTCTACT ---------- ---------- AGTCTGTTCA AAAAACTCCT TGTTTCTACT ---------- ---------- ---------- ---------- ---------- ---------- ---------- AATTTTAGAA AAAAACTCCT TGTTTCTACT ---------- ---------- AATTTT---- ---------- ---------- ---------- ---------- AATCCTCTGA 
ACCCGTTTGT TAGTCATAAG GAAATTGAGT CTGTCAACAA AACCCTCTGA ACCCGTTTGT CAGTCATAAG GAAGTTGAAT CCGTCAACAA AATCCCCTGA ATCCATTTGT CAGTCATAAG GAAATTGAGT CTGTAAACAA AACCCCCTGA ATCCCTTTGT CAGCCATAAA GAAATTGAGT CTGTAAACAA AACCCACTGA ACCCATTTGT CAGCCATAAA GAAATTGAAT CAATGAACAA

GAAGGCTCCA TCGGGAAGGT GTGCAGAACC TTACTGGCTA AATCTGTTTT GAAGGCTCAA TCGGGAAGGT GTGCAGAACC TTACTAGCAA AATCTGTGTT GAAGGTTCCA TTGGGAAGGT CTGCAGGACT TTATTAGCCA AGTCGGTATT GAAAGTTCCA TTGGGAAGGT CTGCAGGACT TTATTAGCAA AGTCGGTATT GAAGGTTCCA TTGGGAAAGT CTGTAGGACT CTATTGGCTA AGTCAGTGTT GACAATTGAA GAAAAA-TAC CCTTGTTTCT ACT------- ---------- GACAATTGAG GAAAAA-TAC CCTTGTTTCT A--------- ---------- GACAATTGAA GAAAAA-TAC CCTTGTTTCT ACT------- ---------- GACAATTAA- ---------- ---------- ---------- ---------- GACAATTAAG GAAAAAATAC CCTTGTTTCT ACT------- ---------- GACAATTAAA GAAAAA-TAC CCTTGTTTCT ACT------- ---------- AGGAATGAGA ATACTCGTGA GGGGTAACTC CCCCGTGTTC AACTACAACA AGGAATGAGA ATACTTGTGA GAGGTAACTC CCCTGCATTT AACTACAACA AGGAATGAGA ATACTTGTAA GGGGCAATTC TCCTGTATTC AACTACAACA AGGGATGAGA ATACTTGTAA GGGGCAATTC TCCTGTATTC AACTACAACA AGGAATGAGA ATACTTGTAA GGGGCAATTC TCCTGTATTC AACTACAACA ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- TTGTGAGTTC AGAT-TGTAG TTAAAAACAC C--------- ----------


TTGTAAGT-C ATTT-TATAA TTAAAAACAC CCTTGTTTCC TGA------- TTAGAATTTC AGAAATATGA GGAAAAACAC CCTTGTTTCT ACT------- ---------- ---------- ---------- ---------- ---------- -------TGC AT-----TAA TTAAAAACAC CCTTGTTTCT ACT------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- TGCTGTGGTA ATGCCAGCTC ATGGCCCAGC CAAGAGCATG GAATATGATG TGCTGTGGTA ATGCCAGCCC ATGGTCCGGC CAAGAGCATG GAATATGATG TGCTGTGGTA ATGCCAGCTC ACGGTCCAGC CAAGAGCATG GAATATGATG TGCTGTAGTG ATGCCAGCCC ACGGTCCAGC CAAAAGTATG GAATATGATG TGCAGTGATG ATGCCAGCAC ATGGTCCAGC CAAAAACATG GAGTATGATG

CAACAGTCTA TATGCATCTC CACAACTCGA GGGGTTTTCA GCTGAATCAA CAACAGCCTA TATTCATCTC CACAACTCGA AGGATTTTCA GCTGAATCGA CAATAGCCTG TATGCATCCC CACAATTAGA AGGATTTTCA GCTGAATCAA TAACAGCTTG TATGCATCTC CACAACTAGA AGGATTTTCA GCTGAATCAA CAATAGCCTG TATGCATCAC CACAATTGGA AGGATTTTCA GCGGAGTCAA ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- AGGCAACCAA AAGGCTTACA GTCCTCGGAA AGGACGCAGG TGCATTAACA AGACAACTAA GAGGCTTACA ATACTTGGGA AGGACGCAGG TGCGCTTACA AGACCACTAA GAGACTAACA ATTCTCGGAA AGGATGCTGG CACTTTAACT AGACCACTAA AAGACTAACA ATTCTCGGAA AAGATGCCGG CACTTTAATT AGGCCACGAA GAGACTCACA GTTCTCGGAA AGGATGCTGG CACTTTAACC ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- CAGTTGCGAC 
TACACATTCA TGGATTCCCA AGAGGAATCG TTCCATTCTC CCGTTGCAAC TACACATTCA TGGATTCCCA AGAGAAATCG CTCCATTCTC CTGTTGCTAC TACACACTCC TGGACCCCTA AGAGGAACCG CTCCATTCTC CCGTTGCAAC TACACACTCC TGGAATCCCA AGAGGAACCG CTCTATTCTA CTGTTGCAAC AACACACTCC TGGATCCCCA AAAGAAATCG ATCCATCTTG

GAAAATTGCT TCTCATTGTT CAGGCACTTA GGGACAACCT GGAACCTGGA GAAAACTACT ACTCATTGTT CAAGCACTTA GGGACAACCT GGAACCTGGA GAAAACTGCT TCTTGTCGTT CAGGCTCTTA GGGACAATCT TGAACCTGGA GAAAACTGCT TCTTATCGTT CAGGCTCTTA GGGACAATCT GGAACCTGGG GAAAACTGCT TCTTGTTGTT CAGGCTCTTA GGGACAACCT CGAACCTGGG ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- GAAGATCCAG ACGAGGGAAC AGCCGGGGTG GAATCTGCAG TATTGAGGGG GAGGACCCAG ATGAAGGAAC AGCAGGAGTA GAGTCTGCAG TATTGAGAGG GAAGACCCAG ATGAAGGCAC ATCCGGAGTG GAGTCCGCTG TTCTGAGAGG GAAGACCCAG ATGAAAGCAC ATCCGGAGTG GAGTCCGCCG TCTTGAGAGG GAAGACCCAG ATGAAGGCAC AGCTGGAGTG GAGTCCGCTG TTCTGAGGGG ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ----------


---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- AACACCAGCC AAAGGGGGAT TCTTGAGGAT GAACAGATGT ATCAGAAGTG AACACTAGCC AAAGGGGAAT TCTTGAGGAT GAACAAATGT ACCAGAAGTG AACACAAGCC AAAGGGGAAT TCTTGAAGAT GAACAGATGT ATCAGAAGTG AACACTAGCC AAAGGGGAAT TCTTGAGGAT GAACAGATGT ACCAAAAGTG AATACAAGTC AAAGAGGAGT ACTTGAAGAT GAACAAATGT ACCAAAGGTG

ACCTTCGATC TTGGGGGGCT ATATGAAGCA ATTGAGGAGT GCCTGATTAA ACCTTTGATC TTGAAGGGCT ATATGGAGCA ATTGAGGAGT GCCTGATTAA ACCTTTGATC TTGGGGGGCT ATATGAAGCA ATTGAGGAGT GCCTGATTAA ACCTTTGATC TTGGGGGGCT ATATGAAGCA ATTGAGGAGT GCCTAATTAA ACCTTTGATC TCGGGGGGCT ATATGAAGCA ATTGAGGAGT GCCTGATTAA ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ATTCCTAATT CTAGGCAGAG AGGACAAAAG ATATGGACCC GCATTGAGCA ATTTCTAATC CTCGGCAAAG AAGACAAAAG ATATGGACCA GCATTAAGCA ATTCCTCATT CTGGGCAAGG AAGATAGAAG ATATGGACCA GCATTAAGCA GTTTCTCATT ATAGGTAAGG AAGACAGAAG ATACGGACCA GCATTAAGCA ATTCCTCATT CTGGGCAAAG AAGACAGGAG ATATGGGCCA GCATTAAGCA ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- CTGCAATCTA 
TTCGAGAAAT TCTTCCCTAG CAGTTCATAT CGGAGGCCAG CTGCACTCTA TTCGAGAAAT TCTTCCCTAG CAGTTCATAT CGGAGGCCAG TTGCAATCTA TTTGAGAAAT TCTTCCCTAG CAGTTCGTAC AGGAGACCAG CTGCAACTTG TTCGAGAAAT TTTTCCCTAG TAGTTCATAT AGGAGACCGA CTGCAATTTA TTTGAAAAAT TCTTCCCCAG CAGTTCATAC AGAAGACCAG

TGATCCCTGG GTTTTGCTTA ATGCATCTTG GTTCAACTCC TTCCTCACAC TGATCCCTGG GTTTTGCTTA ATGCATCTTG GTTCAACTCC TTCCTCACAC TGATCCCTGG GTTTTGCTTA ATGCGTCTTG GTTCAACTCC TTCCTAACAC TGATCCCTGG GTTTTGCTTA ATGCTTCTTG GTTCAACTCC TTCCTTACAC TGATCCCTGG GTTTTGCTCA ATGCATCTTG GTTCAACTCC TTCCTGACAC ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- TCAATGAACT GAGCAATCTT GCAAAAGGGG AGAAGGCTAA TGTATTGATA TCAATGAACT GAGCAATCTT ACGAAAGGGG AGAAAGCTAA TGTATTGATA TCAATGAACT GAGTACCCTT GCAAAAGGAG AAAAGGCTAA TGTACTAATT TCAATGAACT GAGTAACCTT GCAAAAGGGG AAAAGGCTAA TGTGCTAATC TCAATGAACT GAGCAACCTT GCGAAAGGAG AGAAGGCTAA TGTGCTAATT ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ----------


---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- TTGGAATTTC CAGCATGGTG GAGGCCATGG TGTCTAGGGC CCGAATTGAT TTGGAATTTC CAGCATGATG GAGGCCATGG TGTCTAGGGC CCGAATTGAT TTGGAATTTC CAGCATGGTG GAGGCCATGG TGTCTAGGGC TCGGATTGAT TTGGAATTTC TAGCATGGTG GAGGCCATGG TGTCTAGGGC CCGGATTGAT TCGGGATATC CAGTATGGTG GAGGCTATGG TTTCCAGAGC CCGAATTGAT

ATGCACTAAG A--TAGTTGT GGCAATGCTA CTATTTGCTA TCCATACTGT ATGCACTAAA A--TAGTTGT GGCAATGCTA CTATTTGCTA TCCATACTGT ATGCATTAAG A--TAGTTGT GGCAATGCTA CTATTTGCTA TCCATACTGT ATGCATTGAG T--TAGTTGT GGCAGTGCTA CTATTTGCTA TCCATACTGT ATGCATTAAA A--TAGTTAT GGCAGTGCTA CTATTTGTTA TCCGTACTGT ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ATGCAAGGAG ACGTGGTGTT GGTAATGAAA CGGAAACGGG ACTTTAGCAT GGGCAAGGAG ACGTAGTGTT GGTAATGAAA CGGAAACGGG ACTCTAGCAT GGGCAAGGAG ACGTGGTGTT GGTAATGAAA CGAAAACGGG ACTCTAGCAT GGGCAAGGAG ACGTGGTGTT GGTAATGAAA CGAAAACGGG ACTCTAGCAT GGGCAAGGAG ACGTGGTGTT GGTAATGAAA CGAAAACGGG ACTCTAGCAT ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- GCACGAATTG 
ACTTCGAGTC TGGAAGGATT AAGAAAGAAG AGTTTGCTGA GCACGGATTG ACTTCGAGTC TGGAAGGATT AAGAAAGAAG AATTTGCTGA GCACGGATTG ACTTCGAGTC TGGACGGATT AAGAAAGAGG AGTTCGCTGA GCCAGAATTG ACTTCGAGTC TGGACGGATT AAGAAGGAAG AGTTCTCTGA GCACGGATTG ATTTCGAATC TGGAAGGATA AAGAAAGAAG AGTTCACTGA

CCAAAAAAGT ACCTTGTTTC TACT------ ---------- ---------- CCAAAAAAGT ACCTTGTTTC ---------- ---------- ---------- CCAAAAAAGT ACCTTGTTTC TACT------ ---------- ---------- CCAAAAAAGT ACCTTGTTTC TACT------ ---------- ---------- CCAAAAAAGT ACCTTGTTTC TACT------ ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ACTTACTGAC AGCCAGACAG CGACCAAAAG AATTCGGATG GCCATCAATT ACTTACTGAC AGCCAGACAG CGACCAAAAG AATTCGGATG GCCATCAATT ACTTACTGAC AGCCAGACAG CGACCAAAAG AATTCGGATG GCCATCAATT ACTTACTGAC AGCCAGACAG CGACCAAAAG AATTCGGATG GCCATCAATT ACTTACTGAC AGCCAGACAG CGACCAAAAG AATTCGGATG GCCATCAATT ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ----------


---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- GATCATGAAG ATCTGTTCCA CCATTGAAGA GCTCGGACGG CAAAAATAGT GATCTTGAAG ATCTGTTCCA CCATTGAAGA GCTCGGACGG CAAGGGAAGT GATCATGAAG ATCTGTTCCA CCATTGAAGA GCTCAGACGG CAAAAATAGT GATCATGAAG ATCTGTTCCA CCATTGAAGA ACTCAGACGG CAAAAATAAT GATCATGAAG ATCTGTTCCA CCATTGAAGA GCTCAGACGG CAAAAATAGT

---------- ---------- ---------- ---------- -- ---------- ---------- ---------- ---------- -- ---------- ---------- ---------- ---------- -- ---------- ---------- ---------- ---------- -- ---------- ---------- ---------- ---------- -- ---------- ---------- ---------- ---------- -- ---------- ---------- ---------- ---------- -- ---------- ---------- ---------- ---------- -- ---------- ---------- ---------- ---------- -- ---------- ---------- ---------- ---------- -- ---------- ---------- ---------- ---------- -- AGTGTTGAAT AGTTTAAAAA CGACCTTGTT TCTACT---- -- AGTGTCGAAT TGTTTAAAAA CGACCTTGTT TCTACT---- -- AATGTTGAAT AGTTTAAAAA CGACCTTGTT TCTACT---- -- AATGTTGAAT AGTTTAAAAA CGACCTTGTT TCTACT---- -- AGTGTCGAAT AGTTTAAAAA CGACCTTGTT TCTACT---- -- ---------- ---------- ---------- ---------- -- ---------- ---------- ---------- ---------- -- ---------- ---------- ---------- ---------- -- ---------- ---------- ---------- ---------- -- ---------- ---------- ---------- ---------- -- ---------- ---------- ---------- ---------- -- ---------- ---------- ---------- ---------- -- ---------- ---------- ---------- ---------- -- ---------- ---------- ---------- ---------- -- ---------- ---------- ---------- ---------- -- ---------- ---------- ---------- ---------- -- ---------- ---------- ---------- ---------- -- ---------- ---------- ---------- ---------- -- ---------- ---------- ---------- ---------- -- ---------- ---------- ---------- ---------- -- ---------- ---------- ---------- ---------- -- ---------- ---------- ---------- ---------- -- ---------- ---------- ---------- ---------- -- ---------- ---------- ---------- ---------- -- ---------- ---------- ---------- ---------- -- GAATTTAGCT TGTCCTTCAT GAAAAAATGC CTTGTTTCTA CT GAATTTGGCT TGTCCTTCAT GAAAAAATGC ---------- -- GAATTTAGCT TGTCCTTCAT GAAAAAATGC CTTGTTTCTA CT GAATTTAGCT TGTCCTTCAT GAAAAAATGC CTTGTTTCTA CT GAATTTAGCT TGTCCTTCAT GAAAAAATGC CTTGTTCCTA CT

Output of Dnadist method

41

gi|7385295 0.0000 0.1063 0.1753 0.1641 0.1931 2.3471 2.3223 2.3333 2.3923 2.3492 2.4457 2.4589 2.4414 2.4361 2.4361 2.4175 2.1281 2.1334 2.1353 2.1765 2.4725 2.3944 2.4102 2.3769 2.3570 2.3404 3.4066 3.1170 3.3324 3.3557 3.9585 3.7485 3.8024 3.4048 3.3406 3.4574 3.6922 3.6989 3.7171 3.5242 3.9746 gi|3214017 0.1063 0.0000 0.1819 0.1662 0.1978 2.2872 2.2655 2.2463 2.3663 2.3324 2.3903 2.4156 2.3836 2.3723 2.3888 2.3549 2.1077 2.1311 2.1038 2.1778 2.3779 2.3026 2.3312 2.3061 2.3164 2.2842 3.2983 3.0012 3.2202 3.3403 3.8456 3.6957 3.7325 3.3231 3.2845 3.4185 3.5965 3.6382 3.6024 3.4367 3.9079 gi|7391268 0.1753 0.1819 0.0000 0.0635 0.0732 2.3310 2.3093 2.2899 2.3079 2.2873 2.3792 2.3645 2.3543 2.3290 2.3290 2.3346 2.0488 2.0537 2.0979 2.0716 2.4847 2.1932 2.2132 2.2194 2.1773 2.1622 3.1839 2.9733 3.2670 3.3618 3.6743 3.4250 3.3827 3.3130 3.2729 3.4077 3.6756 3.6942 3.6854 3.5127 3.9462 gi|8486136 0.1641 0.1662 0.0635 0.0000 0.1143 2.2752 2.2540 2.2418 2.2874 2.2447 2.3460 2.3970 2.3782 2.3763 2.3879 2.3699 2.0142 2.0117 2.0652 2.0141 2.4847 2.2457 2.2457 2.2790 2.2420 2.1932


3.2076 2.9808 3.3757 3.3544 3.7509 3.4675 3.4940 3.4097 3.4056 3.5080 3.6887 3.6989 3.6968 3.5370 4.0509 gi|7391913 0.1931 0.1978 0.0732 0.1143 0.0000 2.3164 2.2949 2.2982 2.2653 2.2545 2.3346 2.3900 2.3915 2.3745 2.3763 2.3722 2.0646 2.0454 2.1000 2.0535 2.5042 2.1984 2.2083 2.2405 2.2301 2.1671 3.3522 3.0557 3.2515 3.3894 3.6161 3.3418 3.3619 3.2329 3.1764 3.2901 3.7411 3.8351 3.7548 3.5715 4.1240 gi|3214015 2.3471 2.2872 2.3310 2.2752 2.3164 0.0000 0.0026 0.0604 0.2072 0.2188 0.1729 1.8735 1.8961 1.8326 1.8229 1.8459 1.7862 1.8365 1.8321 1.9505 1.9020 2.5268 2.4282 2.3710 2.3772 2.4389 3.2524 3.5546 2.8988 3.2747 4.0283 3.7818 4.1020 3.5122 3.2693 3.4376 3.3673 3.3846 3.2598 3.4325 3.6583 gi|9316315 2.3223 2.2655 2.3093 2.2540 2.2949 0.0026 0.0000 0.0629 0.2081 0.2185 0.1773 1.8978 1.9210 1.8593 1.8535 1.8706 1.7862 1.8365 1.8321 1.9505 1.9020 2.5268 2.4282 2.3710 2.3772 2.4389 3.2085 3.5254 2.8775 3.2445 3.9636 3.7506 4.0654 3.5122 3.2480 3.4376 3.3452 3.3622 3.2386 3.4065 3.6268 gi|7385295 2.3333 2.2463 2.2899 2.2418 2.2982 0.0604 0.0629 0.0000 0.1857 0.1931 0.1538 1.8691 1.8946 1.8198 1.8043 1.8482 1.8428 1.9098 1.9150 2.0378 1.9624 2.5193 2.4126 2.3697 2.3658 2.4131 3.2091 3.3703 2.8235 3.1243 3.7117 3.7628 4.0000 3.4275 3.2102 3.3757 3.2474 3.2770 3.1404 3.3142 3.5072 gi|7392130 2.3923 2.3663 2.3079 2.2874 2.2653 0.2072 0.2081 0.1857 0.0000 0.0840 0.0854 1.7959 1.7853 1.7906 1.8189 1.8426 1.7581 1.8321 1.8591 1.9264 1.9050 2.3097 2.2709 2.2180 2.2387 2.2348 3.4449 3.4819 3.0244 3.2002 3.8465 3.5716 3.8485 3.2679 3.1513 3.2696 3.3755 3.4057 3.3646 3.5577 3.8782 gi|7391914 2.3492 2.3324 2.2873 2.2447 2.2545 0.2188 0.2185 0.1931 0.0840 0.0000 0.1316 1.8751 1.9080 1.8509 1.8905 1.8919 1.8452 1.9098 1.9579 1.9903 2.0053 2.2506 2.2334 2.2124 2.2378 2.2339 3.4797 3.6735 3.0447 3.2764 3.8861 3.6433 3.9515 3.2690 3.1515 3.1859 3.5497 3.5460 3.5616 3.7566 3.8968 gi|8486129 2.4457 2.3903 2.3792 2.3460 2.3346 0.1729 0.1773 0.1538 0.0854 0.1316 0.0000 1.9223 1.9150 
1.8833 1.9165 1.9450 1.7786 1.8124 1.9045 1.9465 1.9350 2.2993 2.2342 2.1553 2.1970 2.2077 3.4354 3.6329 2.9941 3.2301 3.7635 3.6555 3.9188 3.3270 3.1548 3.4501 3.5787 3.6013 3.5044 3.6913 3.9760 gi|7385294 2.4589 2.4156 2.3645 2.3970 2.3900 1.8735 1.8978 1.8691 1.7959 1.8751 1.9223 0.0000 0.1329 0.1899 0.1958 0.1861 1.8359 1.7804 1.8717 1.7894 1.6858 2.3617 2.3911 2.4332 2.5037 2.4545 3.4453 2.9944 3.2243 3.2154 3.6537 3.8403 3.7811 3.5032 3.4476 3.6604 3.3555 3.3149 3.3699 3.2953 3.1628 gi|3214016 2.4414 2.3836 2.3543 2.3782 2.3915 1.8961 1.9210 1.8946 1.7853 1.9080 1.9150 0.1329 0.0000 0.1985 0.2048 0.1866 1.8219 1.7786 1.8343 1.7768 1.6980 2.3391 2.3485 2.4835 2.5265 2.4560 3.4125 2.9268 3.2629 3.0825 3.4834 4.0980 3.8438 3.4661 3.4527 3.7056 3.2613 3.2418 3.2920 3.2639 3.1153 gi|7391882 2.4361 2.3723 2.3290 2.3763 2.3745 1.8326 1.8593 1.8198 1.7906 1.8509 1.8833 0.1899 0.1985 0.0000 0.0718 0.0771 1.7394 1.6830 1.7922 1.7328 1.6304 2.3199 2.3441 2.4214 2.4810 2.4421 3.2508 2.8484 2.9419 3.0785 3.3668 3.6878 3.5423 3.4478 3.4432 3.5653 3.4535 3.4243 3.4878 3.5116 3.3345 gi|7391905 2.4361 2.3888 2.3290 2.3879 2.3763 1.8229 1.8535 1.8043 1.8189 1.8905 1.9165 0.1958 0.2048 0.0718 0.0000 0.1157 1.7256 1.6556 1.7738 1.6841 1.6127 2.2979 2.3320 2.3968 2.4595 2.4125 3.1872 2.8279 2.8438 3.0565 3.4205 3.7099 3.5393 3.5158 3.5048 3.5986 3.4109 3.3748 3.4148 3.4229 3.3269 gi|8486138 2.4175 2.3549 2.3346 2.3699 2.3722 1.8459 1.8706 1.8482 1.8426 1.8919 1.9450 0.1861 0.1866 0.0771 0.1157 0.0000 1.7464 1.6892 1.8241 1.7445 1.6850 2.3446 2.3642 2.4330 2.4839 2.4675 3.3591 2.8793 3.0757 3.1889 3.4216 3.5794 3.4283 3.3558 3.3524 3.4877 3.3955 3.3630 3.4362 3.4367 3.2463 gi|7392156 2.1281 2.1077 2.0488 2.0142 2.0646 1.7862 1.7862 1.8428 1.7581 1.8452 1.7786 1.8359 1.8219 1.7394 1.7256 1.7464 0.0000 0.0672 0.0791 0.1454 0.4088 1.9652 2.0763 1.9867 2.0224 2.0344 2.7992 2.6308 2.6861 3.4638 3.2188 3.7354 3.3783 3.5087 3.4517 3.3484 3.2180 3.0730 3.3767 3.2010 3.3079 gi|7391921 
2.1334 2.1311 2.0537 2.0117 2.0454 1.8365 1.8365 1.9098 1.8321 1.9098 1.8124 1.7804 1.7786 1.6830 1.6556 1.6892 0.0672 0.0000 0.1177 0.1624 0.3727 1.9452 2.0526 1.9835 2.0339 2.0339


2.7025 2.4562 2.5497 3.3621 3.0388 3.8921 3.5995 3.4530 3.4882 3.3117 3.0655 2.9303 3.2099 3.0375 3.2489 gi|8486131 2.1353 2.1038 2.0979 2.0652 2.1000 1.8321 1.8321 1.9150 1.8591 1.9579 1.9045 1.8717 1.8343 1.7922 1.7738 1.8241 0.0791 0.1177 0.0000 0.1252 0.3973 2.0882 2.1954 2.1324 2.1609 2.1728 2.8688 2.6256 2.7224 3.3832 3.4314 4.2119 3.7273 3.6344 3.6545 3.4832 3.1278 3.0100 3.3041 3.1280 3.4143 gi|3214016 2.1765 2.1778 2.0716 2.0141 2.0535 1.9505 1.9505 2.0378 1.9264 1.9903 1.9465 1.7894 1.7768 1.7328 1.6841 1.7445 0.1454 0.1624 0.1252 0.0000 0.4015 2.0495 2.1645 2.1431 2.1843 2.1541 2.9173 2.7117 2.7484 3.3454 3.3862 4.5655 3.9304 3.9352 3.7668 3.7097 3.3456 3.1959 3.5743 3.4190 3.7992 gi|7385294 2.4725 2.3779 2.4847 2.4847 2.5042 1.9020 1.9020 1.9624 1.9050 2.0053 1.9350 1.6858 1.6980 1.6304 1.6127 1.6850 0.4088 0.3727 0.3973 0.4015 0.0000 2.1572 2.2267 2.2642 2.2859 2.2775 3.1556 2.6173 3.0914 3.5086 3.3677 3.8865 4.2283 3.4490 3.3421 3.4440 2.8189 2.7691 2.8591 2.7993 3.0417 gi|7385295 2.3944 2.3026 2.1932 2.2457 2.1984 2.5268 2.5268 2.5193 2.3097 2.2506 2.2993 2.3617 2.3391 2.3199 2.2979 2.3446 1.9652 1.9452 2.0882 2.0495 2.1572 0.0000 0.0786 0.1123 0.1362 0.1063 4.8457 3.5818 3.5454 4.2749 4.3292 5.3580 5.6194 4.9122 4.8318 4.5861 4.5477 4.3458 4.3820 5.0460 4.4859 gi|3214142 2.4102 2.3312 2.2132 2.2457 2.2083 2.4282 2.4282 2.4126 2.2709 2.2334 2.2342 2.3911 2.3485 2.3441 2.3320 2.3642 2.0763 2.0526 2.1954 2.1645 2.2267 0.0786 0.0000 0.1233 0.1396 0.1160 4.4229 3.3648 3.6261 4.2044 4.2963 5.4914 5.7825 5.6269 5.4956 5.1057 4.6878 4.4745 4.5093 5.2488 4.6878 gi|7391268 2.3769 2.3061 2.2194 2.2790 2.2405 2.3710 2.3710 2.3697 2.2180 2.2124 2.1553 2.4332 2.4835 2.4214 2.3968 2.4330 1.9867 1.9835 2.1324 2.1431 2.2642 0.1123 0.1233 0.0000 0.0525 0.0567 5.3095 3.8582 3.7704 4.5809 4.8015 5.8650 6.0600 5.0782 4.9995 4.8603 4.7422 4.6551 4.6975 5.3408 4.9884 gi|7391915 2.3570 2.3164 2.1773 2.2420 2.2301 2.3772 2.3772 2.3658 2.2387 2.2378 2.1970 2.5037 2.5265 
2.4810 2.4595 2.4839 2.0224 2.0339 2.1609 2.1843 2.2859 0.1362 0.1396 0.0525 0.0000 0.0779 5.1091 3.7759 3.8247 4.2165 4.6605 5.5875 5.9400 4.7465 4.8260 4.5831 4.6156 4.4064 4.4469 4.9537 4.6771 gi|8486122 2.3404 2.2842 2.1622 2.1932 2.1671 2.4389 2.4389 2.4131 2.2348 2.2339 2.2077 2.4545 2.4560 2.4421 2.4125 2.4675 2.0344 2.0339 2.1728 2.1541 2.2775 0.1063 0.1160 0.0567 0.0779 0.0000 4.7002 3.6553 3.6434 4.0542 4.4690 5.7112 6.0600 4.6885 4.6312 4.5197 4.6864 4.4694 4.5141 5.0517 4.7604 gi|7385295 3.4066 3.2983 3.1839 3.2076 3.3522 3.2524 3.2085 3.2091 3.4449 3.4797 3.4354 3.4453 3.4125 3.2508 3.1872 3.3591 2.7992 2.7025 2.8688 2.9173 3.1556 4.8457 4.4229 5.3095 5.1091 4.7002 0.0000 0.4017 0.5474 0.7250 0.9789 4.1633 3.8804 3.0973 3.2133 3.2358 3.7356 3.5969 3.6391 3.7716 3.6338 gi|7391914 3.1170 3.0012 2.9733 2.9808 3.0557 3.5546 3.5254 3.3703 3.4819 3.6735 3.6329 2.9944 2.9268 2.8484 2.8279 2.8793 2.6308 2.4562 2.6256 2.7117 2.6173 3.5818 3.3648 3.8582 3.7759 3.6553 0.4017 0.0000 0.5016 0.7396 0.9167 3.8304 3.9581 3.0471 3.1679 3.2251 3.3705 3.2747 3.2685 3.3578 3.3159 gi|8486125 3.3324 3.2202 3.2670 3.3757 3.2515 2.8988 2.8775 2.8235 3.0244 3.0447 2.9941 3.2243 3.2629 2.9419 2.8438 3.0757 2.6861 2.5497 2.7224 2.7484 3.0914 3.5454 3.6261 3.7704 3.8247 3.6434 0.5474 0.5016 0.0000 0.7361 0.8937 3.3880 3.4460 2.9635 2.9855 3.1497 3.4162 3.3643 3.2892 3.3775 3.3498 gi|3214016 3.3557 3.3403 3.3618 3.3544 3.3894 3.2747 3.2445 3.1243 3.2002 3.2764 3.2301 3.2154 3.0825 3.0785 3.0565 3.1889 3.4638 3.3621 3.3832 3.3454 3.5086 4.2749 4.2044 4.5809 4.2165 4.0542 0.7250 0.7396 0.7361 0.0000 0.9653 3.9915 3.8362 3.1780 3.0869 3.3141 3.6510 3.4541 3.3768 3.3963 3.5212 gi|7391920 3.9585 3.8456 3.6743 3.7509 3.6161 4.0283 3.9636 3.7117 3.8465 3.8861 3.7635 3.6537 3.4834 3.3668 3.4205 3.4216 3.2188 3.0388 3.4314 3.3862 3.3677 4.3292 4.2963 4.8015 4.6605 4.4690 0.9789 0.9167 0.8937 0.9653 0.0000 3.5835 3.6419 3.6494 3.7049 3.7925 3.7010 3.5998 3.4375 3.3912 3.7110 gi|7392126 
3.7485 3.6957 3.4250 3.4675 3.3418 3.7818 3.7506 3.7628 3.5716 3.6433 3.6555 3.8403 4.0980 3.6878 3.7099 3.5794 3.7354 3.8921 4.2119 4.5655 3.8865 5.3580 5.4914 5.8650 5.5875 5.7112


4.1633 3.8304 3.3880 3.9915 3.5835 0.0000 0.2097 0.9395 0.9020 0.8816 1.8960 1.8747 1.8492 1.7900 1.8792 gi|8486127 3.8024 3.7325 3.3827 3.4940 3.3619 4.1020 4.0654 4.0000 3.8485 3.9515 3.9188 3.7811 3.8438 3.5423 3.5393 3.4283 3.3783 3.5995 3.7273 3.9304 4.2283 5.6194 5.7825 6.0600 5.9400 6.0600 3.8804 3.9581 3.4460 3.8362 3.6419 0.2097 0.0000 0.9753 0.9052 0.9255 1.8904 1.8394 1.8416 1.7969 1.8803 gi|7392130 3.4048 3.3231 3.3130 3.4097 3.2329 3.5122 3.5122 3.4275 3.2679 3.2690 3.3270 3.5032 3.4661 3.4478 3.5158 3.3558 3.5087 3.4530 3.6344 3.9352 3.4490 4.9122 5.6269 5.0782 4.7465 4.6885 3.0973 3.0471 2.9635 3.1780 3.6494 0.9395 0.9753 0.0000 0.1261 0.1695 1.7832 1.7161 1.8218 1.7742 1.8920 gi|7391913 3.3406 3.2845 3.2729 3.4056 3.1764 3.2693 3.2480 3.2102 3.1513 3.1515 3.1548 3.4476 3.4527 3.4432 3.5048 3.3524 3.4517 3.4882 3.6545 3.7668 3.3421 4.8318 5.4956 4.9995 4.8260 4.6312 3.2133 3.1679 2.9855 3.0869 3.7049 0.9020 0.9052 0.1261 0.0000 0.2151 1.8357 1.7788 1.8401 1.7820 1.9628 gi|3214016 3.4574 3.4185 3.4077 3.5080 3.2901 3.4376 3.4376 3.3757 3.2696 3.1859 3.4501 3.6604 3.7056 3.5653 3.5986 3.4877 3.3484 3.3117 3.4832 3.7097 3.4440 4.5861 5.1057 4.8603 4.5831 4.5197 3.2358 3.2251 3.1497 3.3141 3.7925 0.8816 0.9255 0.1695 0.2151 0.0000 1.6716 1.6186 1.6838 1.6851 1.8115 gi|7385294 3.6922 3.5965 3.6756 3.6887 3.7411 3.3673 3.3452 3.2474 3.3755 3.5497 3.5787 3.3555 3.2613 3.4535 3.4109 3.3955 3.2180 3.0655 3.1278 3.3456 2.8189 4.5477 4.6878 4.7422 4.6156 4.6864 3.7356 3.3705 3.4162 3.6510 3.7010 1.8960 1.8904 1.7832 1.8357 1.6716 0.0000 0.0715 0.1065 0.1524 0.1954 gi|3214016 3.6989 3.6382 3.6942 3.6989 3.8351 3.3846 3.3622 3.2770 3.4057 3.5460 3.6013 3.3149 3.2418 3.4243 3.3748 3.3630 3.0730 2.9303 3.0100 3.1959 2.7691 4.3458 4.4745 4.6551 4.4064 4.4694 3.5969 3.2747 3.3643 3.4541 3.5998 1.8747 1.8394 1.7161 1.7788 1.6186 0.0715 0.0000 0.1206 0.1522 0.1953 gi|7391268 3.7171 3.6024 3.6854 3.6968 3.7548 3.2598 3.2386 3.1404 3.3646 3.5616 3.5044 3.3699 3.2920 
3.4878 3.4148 3.4362 3.3767 3.2099 3.3041 3.5743 2.8591 4.3820 4.5093 4.6975 4.4469 4.5141 3.6391 3.2685 3.2892 3.3768 3.4375 1.8492 1.8416 1.8218 1.8401 1.6838 0.1065 0.1206 0.0000 0.1089 0.1903 gi|7391914 3.5242 3.4367 3.5127 3.5370 3.5715 3.4325 3.4065 3.3142 3.5577 3.7566 3.6913 3.2953 3.2639 3.5116 3.4229 3.4367 3.2010 3.0375 3.1280 3.4190 2.7993 5.0460 5.2488 5.3408 4.9537 5.0517 3.7716 3.3578 3.3775 3.3963 3.3912 1.7900 1.7969 1.7742 1.7820 1.6851 0.1524 0.1522 0.1089 0.0000 0.2047 gi|8486134 3.9746 3.9079 3.9462 4.0509 4.1240 3.6583 3.6268 3.5072 3.8782 3.8968 3.9760 3.1628 3.1153 3.3345 3.3269 3.2463 3.3079 3.2489 3.4143 3.7992 3.0417 4.4859 4.6878 4.9884 4.6771 4.7604 3.6338 3.3159 3.3498 3.5212 3.7110 1.8792 1.8803 1.8920 1.9628 1.8115 0.1954 0.1953 0.1903 0.2047 0.0000
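The Dnadist output above is a square matrix: a sequence count, then one label per row followed by that many distances. As a minimal sketch of how such a listing could be read back into rows, the following Python function tokenizes the text on whitespace. The function name `parse_square_matrix` is hypothetical, and the sketch assumes labels contain no spaces. Rows are kept by position rather than by label, because the truncated labels repeat in this matrix (for example, gi|7385295 appears more than once). The three-sequence excerpt is the top-left corner of the matrix above.

```python
def parse_square_matrix(text):
    """Parse a whitespace-separated square distance matrix.

    Expected layout: sequence count n, then n rows of
    "<label> <d1> ... <dn>". Returns (labels, rows).
    """
    tokens = text.split()
    n = int(tokens[0])
    labels, rows = [], []
    pos = 1
    for _ in range(n):
        labels.append(tokens[pos])
        rows.append([float(t) for t in tokens[pos + 1:pos + 1 + n]])
        pos += 1 + n
    return labels, rows


# Top-left 3 x 3 corner of the Dnadist matrix, for illustration only.
excerpt = """
3
gi|7385295 0.0000 0.1063 0.1753
gi|3214017 0.1063 0.0000 0.1819
gi|7391268 0.1753 0.1819 0.0000
"""
labels, rows = parse_square_matrix(excerpt)
print(labels[1], rows[0][1])
```

A quick sanity check on any parsed matrix is that the diagonal is zero and `rows[i][j] == rows[j][i]`.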

Output tree of neighbor method

41 Populations

Neighbor-Joining/UPGMA method version 3.573c

Neighbor-joining method

Negative branch lengths allowed

+gi|3214017 ! ! +gi|7391268 ! +-19 ! +-20 +gi|7391913 ! ! ! ! ! +gi|8486136 ! ! ! ! +gi|7385295 ! ! +-16 ! ! ! +gi|3214142 ! ! +------------17 ! ! ! ! +gi|7391268 ! ! ! ! +-14 ! ! ! +-15 +gi|7391915


! ! ! ! ! ! ! +gi|8486122-18-21 ! ! ! ! +gi|7385294 ! ! ! +-23 ! ! ! ! +gi|3214016 ! ! ! +-------26 ! ! ! ! ! +gi|7391882 ! ! ! ! ! +-24 ! ! ! ! +-25 +gi|8486138 ! ! ! ! ! ! ! ! ! +gi|7391905 ! ! ! ! ! ! ! ! +gi|7392156 ! ! ! ! +-28 ! ! ! ! ! ! +gi|8486131 ! +----------31 +-33 +-29 +-27 ! ! ! ! ! ! +gi|3214016 ! ! ! ! +------30 ! ! ! ! ! ! ! +gi|7391921 ! ! ! ! ! ! ! ! ! ! ! +--gi|7385294 ! ! ! ! ! ! ! ! ! ! +---gi|7385295 ! ! ! ! ! +--9 ! ! ! ! ! ! +gi|7391914 ! ! ! +-32 +-11 ! ! ! ! ! ! +---gi|3214016 ! ! ! ! +------------12 +-10 ! ! ! ! ! ! +-------gi|7391920 ! ! ! ! ! ! ! ! ! ! ! +-gi|8486125 ! ! ! ! ! ! ! ! ! ! +gi|7392126 ! ! ! ! ! +------1 ! ! ! +------22 ! +gi|8486127 ! ! ! ! +------8 ! +-34 ! ! ! +gi|7392130 ! ! ! ! ! +--2 ! ! ! ! +--3 +gi|7391913 ! ! ! ! ! ! ! ! ! +gi|3214016 ! ! +----------13 ! ! ! +gi|7385294 ! ! ! +--5 ! ! ! ! ! +gi|7391914 ! ! ! +--6 +--4 ! ! ! ! ! +-gi|8486134 ! ! +-------7 ! ! ! ! +gi|7391268 ! ! ! ! ! +gi|3214016 ! ! ! ! +gi|3214015 ! ! +-35 ! ! +-36 +gi|9316315 ! ! ! ! ! ! ! +gi|7385295 ! +--------37 ! ! +gi|7392130 ! ! +-38 ! +-39 +gi|7391914 ! ! ! +gi|8486129 ! +gi|7385295

remember: (although rooted by outgroup) this is an unrooted tree!

Between        And            Length
-------        ---            ------
18             gi|3214017     0.03154
18             21             0.06092
21             20             0.01757
20             19             0.01126
19             gi|7391268     0.03236
19             gi|7391913     0.04084
20             gi|8486136     0.04104
21             31             0.99624
31             17             1.13561
17             16             0.01035
16             gi|7385295     0.02910
16             gi|3214142     0.04950
17             15             0.03620
15             14             0.01454
14             gi|7391268     0.02217
14             gi|7391915     0.03033
15             gi|8486122     0.02651
31             34             0.26861
34             33             0.07196
33             26             0.78395
26             23             0.05711
23             gi|7385294     0.06604
23             gi|3214016     0.06686
26             25             0.02485
25             24             0.01715
24             gi|7391882     0.02677
24             gi|8486138     0.05033
25             gi|7391905     0.03805
33             32             0.06004
32             30             0.61962
30             29             0.09053
29             28             0.01212
28             gi|7392156     0.00981
28             27             0.03984
27             gi|8486131     0.03722
27             gi|3214016     0.08798
29             gi|7391921     0.03538
30             gi|7385294     0.24606
32             22             0.62308
22             12             1.17380
12             11             0.08424
11             9              0.08769
9              gi|7385295     0.36795
9              gi|7391914     0.03375
11             10             0.06886
10             gi|3214016     0.33587
10             gi|7391920     0.62943
12             gi|8486125     0.16543
22             13             1.01545
13             8              0.57819
8              1              0.57132
1              gi|7392126     0.09325
1              gi|8486127     0.11645
8              3              0.14471
3              2              0.03457
2              gi|7392130     0.08332
2              gi|7391913     0.04278
3              gi|3214016     0.09468
13             7              0.69229
7              6              0.03131
6              5              0.01490
5              gi|7385294     0.04241
5              4              0.02914
4              gi|7391914     0.06240
4              gi|8486134     0.14230
6              gi|7391268     0.02620
7              gi|3214016     0.02627
34             37             0.82306
37             36             0.07466
36             35             0.04020
35             gi|3214015     0.00078
35             gi|9316315     0.00182
36             gi|7385295     0.02015
37             39             0.02873
39             38             0.01388
38             gi|7392130     0.02606
38             gi|7391914     0.05794
39             gi|8486129     0.05262
18             gi|7385295     0.07476
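The neighbor output above reports which nodes are joined and the length of each branch. As a rough illustration of how one such join is chosen, the sketch below applies the standard neighbor-joining criterion of Saitou and Nei to a small distance matrix: it picks the pair (i, j) minimizing Q(i, j) = (n - 2) d(i, j) - r_i - r_j, where r_i is the sum of row i, and then computes the two branch lengths to the new node. The function name `nj_first_join` and the five-taxon matrix are hypothetical and are not taken from this report. The sketch shows a single join only and is not the PHYLIP implementation.

```python
def nj_first_join(d):
    """Return ((i, j), (len_i, len_j)) for the first neighbor-joining step.

    d is a symmetric square distance matrix (list of lists), n >= 3.
    """
    n = len(d)
    r = [sum(row) for row in d]  # row sums
    best = None
    for i in range(n):
        for j in range(i + 1, n):
            q = (n - 2) * d[i][j] - r[i] - r[j]
            if best is None or q < best[0]:
                best = (q, i, j)
    _, i, j = best
    # Branch lengths from i and j to the newly created internal node.
    len_i = d[i][j] / 2 + (r[i] - r[j]) / (2 * (n - 2))
    len_j = d[i][j] - len_i
    return (i, j), (len_i, len_j)


# Hypothetical toy distances between five taxa (not data from this report).
toy = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]
pair, lengths = nj_first_join(toy)
print(pair, lengths)  # (0, 1) (2.0, 3.0)
```

The full method repeats this step, replacing the joined pair with the new node and recomputing distances until the tree is resolved. Because the branch lengths come from differences of distances, they can be negative, which is what the "Negative branch lengths allowed" line in the program header refers to.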

Output of consense

Majority-rule and strict consensus tree program, version 3.573c

Species in order:

gi|3214017 gi|7391268 gi|7391913 gi|8486136 gi|7385295 gi|3214142 gi|7391268 gi|7391915 gi|8486122 gi|7385294 gi|3214016 gi|7391882 gi|8486138


gi|7391905 gi|7392156 gi|8486131 gi|3214016 gi|7391921 gi|7385294 gi|7385295 gi|7391914 gi|3214016 gi|7391920 gi|8486125 gi|7392126 gi|8486127 gi|7392130 gi|7391913 gi|3214016 gi|7385294 gi|7391914 gi|8486134 gi|7391268 gi|3214016 gi|3214015 gi|9316315 gi|7385295 gi|7392130 gi|7391914 gi|8486129 gi|7385295

Sets included in the consensus tree

Set (species in order) How many times out of 1.00

....****** ********** ********** ********** . 1.00

.......... ....****.. .......... .......... . 1.00

....*****. .......... .......... .......... . 1.00

.......... .........* ***....... .......... . 1.00

.......... ....***... .......... .......... . 1.00

.......... .......... .........* **........ . 1.00

.......... .***...... .......... .......... . 1.00

.......... .......... .**....... .......... . 1.00

.***...... .......... .......... .......... . 1.00

......**.. .......... .......... .......... . 1.00

....**.... .......... .......... .......... . 1.00

.********* ********** ********** ********** . 1.00

.......... .......... ....**.... .......... . 1.00

.......... .......... ......**.. .......... . 1.00

.......... .**....... .......... .......... . 1.00

.**....... .......... .......... .......... . 1.00

.......... .....**... .......... .......... . 1.00

.........* ********** ********** ****...... . 1.00

.......... .......... .........* ***....... . 1.00

.......... .......... .......... ....****** . 1.00

.......... .......... ......***. .......... . 1.00

.......... ....****** ********** ****...... . 1.00

.........* *......... .......... .......... . 1.00

.......... .......... .......... ....***... . 1.00

.......... .......... .......... .......**. . 1.00

.......... .........* ********** ****...... . 1.00

......***. .......... .......... .......... . 1.00

.......... .........* *......... .......... . 1.00

.......... .......... .......... ....**.... . 1.00

.......... .........* ****...... .......... . 1.00

.......... .......... .......... **........ . 1.00

.......... ....*****. .......... .......... . 1.00

.........* ****...... .......... .......... . 1.00

.......... .......... ....****** ****...... . 1.00

.......... .......... .......... .......*** . 1.00

.......... .......... ....*****. .......... . 1.00

.........* ********** ********** ********** . 1.00

.......... .......... .........* ****...... . 1.00

Sets NOT included in consensus tree: NONE
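Each line in the list of sets above marks the members of one group with `*` and all other sequences with `.`, in the order given under "Species in order", followed by the fraction of trees containing that group. As a small sketch of how such a line can be read, the Python below maps the marks back to labels. The function name `decode_set` is hypothetical. The label list is copied from the species order above, and the example line is one of the sets listed above.

```python
# Labels in the order printed under "Species in order".
SPECIES = """
gi|3214017 gi|7391268 gi|7391913 gi|8486136 gi|7385295 gi|3214142
gi|7391268 gi|7391915 gi|8486122 gi|7385294 gi|3214016 gi|7391882
gi|8486138 gi|7391905 gi|7392156 gi|8486131 gi|3214016 gi|7391921
gi|7385294 gi|7385295 gi|7391914 gi|3214016 gi|7391920 gi|8486125
gi|7392126 gi|8486127 gi|7392130 gi|7391913 gi|3214016 gi|7385294
gi|7391914 gi|8486134 gi|7391268 gi|3214016 gi|3214015 gi|9316315
gi|7385295 gi|7392130 gi|7391914 gi|8486129 gi|7385295
""".split()


def decode_set(line, species):
    """Return (member labels, frequency) for one consense set line."""
    *groups, freq = line.split()
    pattern = "".join(groups)  # drop the spaces between blocks of ten
    if len(pattern) != len(species):
        raise ValueError("pattern length does not match species count")
    members = [name for name, mark in zip(species, pattern) if mark == "*"]
    return members, float(freq)


members, freq = decode_set(
    ".***...... .......... .......... .......... . 1.00", SPECIES
)
print(members, freq)
```

For this line the group is the second, third and fourth sequences in the species order, which are the same three sequences joined under node 20 in the neighbor output.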

CONSENSUS TREE: the numbers at the forks indicate the number of times the group consisting of the species which are to the right of that fork occurred among the trees, out of 1.00 trees

+----gi|8486131 +--1.0


+--1.0 +----gi|3214016 ! ! +--1.0 +---------gi|7392156 ! ! +------------1.0 +--------------gi|7391921 ! ! ! +-------------------gi|7385294 ! ! +----gi|7391914 ! +--1.0 ! ! +----gi|7385295 ! +--1.0 +--1.0 ! ! +----gi|3214016 ! ! +------------1.0 +--1.0 ! ! ! ! +----gi|7391920 ! ! ! ! ! ! ! +--------------gi|8486125 ! ! ! ! ! ! +---------gi|3214016 ! ! ! +--1.0 ! +--1.0 ! ! +----gi|7391913 ! ! ! +--1.0 ! ! +-------1.0 +----gi|7392130 ! ! ! ! ! ! ! ! +----gi|7392126 ! ! ! +-------1.0 ! ! ! +----gi|8486127 +--1.0 +--1.0 ! ! ! +--------------gi|7391268 ! ! ! ! ! ! ! +--1.0 +----gi|7391914 ! ! ! ! ! +--1.0 ! ! ! ! +--1.0 +----gi|8486134 ! ! +--1.0 ! ! ! ! +---------gi|7385294 ! ! ! ! ! +-------------------gi|3214016 ! ! ! ! +----gi|7391882 +--1.0 ! +--1.0 ! ! ! +--1.0 +----gi|8486138 ! ! ! ! ! ! ! +----------------------1.0 +---------gi|7391905 ! ! ! ! ! ! +----gi|3214016 ! ! +-------1.0 ! ! +----gi|7385294 ! ! ! ! +----gi|7391914 ! ! +--1.0 ! ! +--1.0 +----gi|7392130 +--1.0 ! ! ! ! ! +---------------------------1.0 +---------gi|8486129 ! ! ! ! ! ! +---------gi|7385295 ! ! +--1.0 ! ! ! +----gi|9316315 ! ! +--1.0 ! ! +----gi|3214015 ! ! ! ! +----gi|7391268 +--1.0 ! +--1.0 ! ! ! +--1.0 +----gi|7391915 ! ! ! ! ! ! ! +--------------------------------1.0 +---------gi|8486122 ! ! ! ! ! ! +----gi|3214142 ! ! +-------1.0 ! ! +----gi|7385295 ! ! ! ! +----gi|7391268 ! ! +--1.0 ! +------------------------------------------1.0 +----gi|7391913 ! ! ! +---------gi|8486136 ! +-----------------------------------------------------------gi|3214017 ! +-----------------------------------------------------------gi|7385295


Neighbor tree with branch length:


[Figure: neighbor tree with branch lengths. Leaf labels as extracted, in order: gi|7385295, gi|7391914, gi|3214016, gi|7391920, gi|8486125, gi|7392156, gi|8486131, gi|3214016, gi|7391921, gi|7385294, gi|7385294, gi|3214016, gi|7391882, gi|8486138, gi|7391905, gi|3214015, gi|9316315, gi|7385295, gi|7392130, gi|7391914, gi|8486129, gi|7385295, gi|3214142, gi|7391268, gi|7391915, gi|8486122, gi|7391268, gi|7391913, gi|8486136, gi|3214017, gi|7385295, gi|7392126, gi|8486127, gi|7392130, gi|7391913, gi|3214016, gi|7385294, gi|7391914, gi|8486134, gi|7391268, gi|3214016. Scale bar: 0.2]

Family Analysis

We also performed family analysis in the current work to determine the number of base pairs, exons, introns, and open reading frames (ORFs) present in the sequences of these different strains.
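The ORF search described above can be illustrated with a minimal Python sketch. It is not the EMBOSS GETORF implementation. It assumes an ORF runs from the first in-frame ATG to the codon before the next in-frame stop, and it ignores ORFs that run off the end of the sequence. The names `find_orfs` and `min_nt` are hypothetical. Coordinates are 1-based and, for the reverse strand, reported with start greater than end, in the same style as the `[1706 - 1359] (REVERSE SENSE)` headers in the GETORF results of this report.

```python
STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def orfs_in_strand(seq, min_nt):
    """Scan the three frames of one strand.

    Returns 0-based half-open (start, end) spans covering ATG up to, but
    not including, the stop codon, for spans of at least `min_nt` bases.
    """
    spans = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOPS:
                if i - start >= min_nt:
                    spans.append((start, i))
                start = None
    return spans


def find_orfs(seq, min_nt=30):
    """Return (start, end, strand) tuples in 1-based forward coordinates."""
    seq = seq.upper()
    n = len(seq)
    hits = [(s + 1, e, "forward") for s, e in orfs_in_strand(seq, min_nt)]
    for s, e in orfs_in_strand(reverse_complement(seq), min_nt):
        # Map reverse-strand positions back onto the forward sequence,
        # so that start > end for reverse-sense ORFs.
        hits.append((n - s, n - e + 1, "reverse"))
    return hits
```

For example, `find_orfs("CCATGAAATTTTAGGG", min_nt=9)` returns `[(3, 11, 'forward')]`: the ORF ATG AAA TTT occupies bases 3 to 11, and the TAG stop codon is excluded from the reported range.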

Result of GETORF tool

>NC_007359.1_1 [25 - 2172] Influenza A virus (A/Goose/Guangdong/1/96(H5N1)) segment 3, complete sequence


MEDFVRQCFNPMIVELAEKAMKEYGEDPKIETNKFAAICTHLEVCFMYSDFHFIDERGESTIIESGDPNALLKHRFEIIEGRDRTMAWTVVNSICNTTGVEKPKFLPDLYDYKENRFIEIGVTRREVHTYYLEKANKIKSEKTHIHIFSFTGEEMATKADYTLDEESRARIKTRLFTIRQEMASRGLWDSFRQSERGEETVEERFEITGTMCRLADQSLPPNFSSLEKFRAYVDGFEPNGCIEGKLSQMSKEVNARIEPFLKTTPRPLRLPDGPPCSQRSKFLLMDALKLSIEDPSHEGEGIPLYDAIKCMKTFFGWKEPNIVKPHEKGINPNYLLAWKQVLAELQDIENEEKIPKTKNMRKTSQLKWALGENMAPEKVDFEDCKDVSDLRQYDSDEPKPRSLASWIQSEFNKACELTDSSWIELDEIGEDVAPIEHIASMRRNYFTAEVSHCRATEYIMKGVYINTALLNASCAAMDDFQLIPMISKCRTKEGRRKTNLYGFLIKGRSHLRNDTDVVNFVSMEFSLTDPRLEPHRWEKYCVLRIGDMLLRTEIGQVSRPMFLYVRTNGTSKIKMKWGMEMRRCPFQSLQQIESMIEAESSVKEKDMTKEFFENKSETWPIGESPKGVEEGSIGKVCRTLLAKSVFNSLYASPQLEGFSAESRKLLLIVQALRDNLEPGTFDLGGLYEAIEECLINDPWVLLNASWFNSFLTHALR

>NC_007359.1_2 [1706 - 1359] (REVERSE SENSE) Influenza A virus (A/Goose/Guangdong/1/96(H5N1)) segment 3, complete sequenceMGLDTWPISVRKSMSPIRRTQYFSHLCGSSLGSVRENSILTKFTTSVSFLKWDLPFIRNPYRLVFRLPSLVLHLLIIGISWKSSMAAQDAFNKAVFMYTPFIMYSVALQWDTSAVK>NC_007359.1_3 [1337 - 1026] (REVERSE SENSE) Influenza A virus (A/Goose/Guangdong/1/96(H5N1)) segment 3, complete sequenceMCSIGATSSPISSSSIQLESVNSHALLNSLWIQLASDLGFGSSLSYCLRSLTSLQSSKSTFSGAIFSPSAHFNWLVFLMFFVFGIFSSFSISWSSASTCFQARR>NC_007359.1_4 [311 - 3] (REVERSE SENSE) Influenza A virus (A/Goose/Guangdong/1/96(H5N1)) segment 3, complete sequenceMQILFTTVQAIVRSLPSIISNRCFNNALGSPDSIIVDSPRSSIKWKSEYMKQTSKCVHIAANLFVSIFGSSPYSFIAFSASSTIIGLKHCRTKSSILDQYLLL>NC_007357.1_1 [28 - 2304] Influenza A virus (A/Goose/Guangdong/1/96(H5N1)) segment 1, complete sequenceMERIKELRDLMSQSRTREILTKTTVDHMAIIKKYTSGRQEKNPALRMKWMMAMKYPITADKRIMEMIPERNEQGQTLWSKTNDAGSDRVMVSPLAVTWWNRNGPTTSTVHYPKVYKTYFEKVERLKHGTFGPVHFRNQVKIRRRVDINPGHADLSAKEAQDVIMEVVFPNEVGARILTSESQLTITKEKKEELQDCKIAPLMVAYMLERELVRKTRFLPVAGGTSSVYIEVLHLTQGTCWEQMYTPGGEVRNDDVDQSLIIAARNIVRRATVSADPLASLLEMCHSTQIGGIRMVDILRQNPTEEQAVDICKAAMGLRISSSFSFGGFTFKRTNGSSVKKEEEVLTGNLQTLKIKVHEGYEEFTMVGRRATAILRKATRRLIQLIVSGRDEQSIAEAIIVAMVFSQEDCMIKAVRGDLNFVNRANQRLNPMHQLLRHFQKDAKVLFQNWGIEPIDNVMGMIGILPDMTPSAEMSLRGVRVSKMGVDEYSSTERVVVSIDRFLRVRDQQGNVLLSPEEVSETQGTEKLTITYSSSMMWEINGPESVLVNTYQWIIRNWETVKIQWSQDPTMLYNKMEFESFQSLVPKAARSQYSGFVRTLFQQMRDVLGTFDTVQIIKLLPFAAAPPEPSRMQFSSLTVNVRGSGMRILVRGNSPVFNYNKATKRLTVLGKDAGALTEDPDEGTAGVESAVLRGFLILGREDKRYGPALSINELSNLAKGEKANVLIMQGDVVLVMKRKRDFSILTDSQTATKRIRMAIN>NC_007357.1_2 [2303 - 2001] (REVERSE SENSE) Influenza A virus (A/Goose/Guangdong/1/96(H5N1)) segment 1, complete sequenceMMAIRILLVAVWLSVSMLKSRFRFITNTTSPCIINTLAFSPFARLLSSLMLNAGPYLLSSLPRIRNPLNTADSTPAVPSSGSSVNAPASFPRTVSLLVALL>NC_007364.1_1 [15 - 704] Influenza A virus (A/Goose/Guangdong/1/96(H5N1)) segment 8, complete 
sequenceMDSNTITSFQVDCYLWHIRKLLSMRDMCDAPFDDRLRRDQKALKGRGSTLGLDLRVATMEGKKIVEDILKSETNENLKIAIASSPAPRYITDMSIEEMSREWYMLMPRQKITGGLMVKMDQAIMDKRIILKANFSVLFDQLETLVSLRAFTESGAIVAEIFPIPSVPGHFTEDVKNAIGILIGGLEWNDNSIRASENIQRFAWGIHDENGGPSLPPKQKRYMAKRVESEV>NC_007364.1_2 [481 - 849] Influenza A virus (A/Goose/Guangdong/1/96(H5N1)) segment 8, complete sequenceMWLKYFPFPPYQDILQRMSKMQLESSSVDLNGMITQFERLKIYRDSLGESMMRMGDLHSLQNRNATWRNELSQKFEEIRWLIAECRNILTKTENSFEQITFLQALQLLLEVESEIRTFSFQLI>NC_007364.1_3 [382 - 56] (REVERSE SENSE) Influenza A virus (A/Goose/Guangdong/1/96(H5N1)) segment 8, complete sequenceMAWSIFTIRPPVIFCLGISMYHSRLISSMLISVIYRGAGLEAMAILRFSFVSLFRMSSTIFFPSIVATLKSSPSVLPLPFNAFWSLRSLSSKGASHMSLILSSFLMCHR>NC_007361.1_1 [21 - 1427] Influenza A virus (A/Goose/Guangdong/1/96(H5N1)) segment 6, complete sequenceMNPNQKIITIGSICMVVGIISLMLQIGNIISIWVSHSIQTGNQHQAEPCNQSIITYENNTWVNQTYVNISNTNFLTEKAVASVTLAGNSSLCPISGWAVHSKDNGIRIGSKGDVFVIREPFISCSHLECRTFFLTQGALLNDKHSNGTVKDRSPHRTLMSCPVGEAPSPYNSRFESVAWSASACHDGTSWLTIGISGPDNGAVAVLKYNGIITDTIKSWRNNILRTQESECACVNGSCFTVMTDGPSNGQASYKIFKMEKGKVVKSVELNAPNYHYEECSCYPDAGEITCVCRDNWHGSNRPWVSFNQNLEYQIGYICSGVFGDNPRPNDGTGSCGPVSPNGAYGVKGFSFKYGNGVWIGRTKSTNSRSGFEMIWDPNGWTGTDSSFSVKQDIVAITDWSGYSGSFVQHPELTGLDCIRPCFWVELIRGRPKESTIWTSGSSISFCGVNSDTVGWSWPDDAELPFTIDK>NC_007361.1_2 [1426 - 959] (REVERSE SENSE) Influenza A virus (A/Goose/Guangdong/1/96(H5N1)) segment 6, complete sequenceMSMVNGNSASSGQDQPTVSLFTPQKDMLLPLVQIVLSLGRPLISSTQKQGLMQSNPVSSGCWTKLPLYPDQSVIATISCFTEKLLSVPVHPFGSQIISKPLLELVLLVLPIQTPLPYLNENPFTPYAPLGDTGPQLPVPSLGRGLSPKTPLHIYPI>NC_007362.1_1 [22 - 1725] Influenza A virus (A/Goose/Guangdong/1/96(H5N1)) segment 4, complete 
sequenceMEKIVLLLAIVSLVKSDQICIGYHANNSTEQVDTIMEKNVTVTHAQDILEKTHNGKLCDLNGVKPLILRDCSVAGWLLGNPMCDEFINVPEWSYIVEKASPANDLCYPGDFNDYEELKHLLSRTNHFEKIQIIPKSSWSNHDASSGVSSACPYHGRSSFFRNVVWLIKKNSAYPTIKRSYNNTNQEDLLVLWGIHHPNDAAEQTKLYQNPTTYISVGTSTLNQRLVPEIATRPKVNGQSGRMEFFWTILKPNDAINFESNGNFIAPEYAYKIVKKGDSAIMKSELEYGNCNTKCQTPMGAINSSMPFHNIHPLTIGECPKYVKSNRLVLATGLRNTPQRERRRKKRGLFGAIAGFIEGGWQGMVDGWYGYHHSNEQGSGYAADKESTQKAIDGVTNKVNSIIDKMNTQFEAVGREFNNLERRIENLNKQMEDGFLDVWTYNAELLVLMENERTLDFHDSNVKNLYDKVRLQLRDNAKELGNGCFEFYHKCDNECMESVKNGTYDYPQYSEEARLNREEISGVKLESMGTYQILSIYSTVASSLALAIMVAGLSLWMCSNGSLQCRICI>NC_007360.1_1 [28 - 1539] Influenza A virus (A/Goose/Guangdong/1/96(H5N1)) segment 5, complete sequenceMSDINIMASQGTKRSYEQMETGGERQNATEIRASVGRMVGGIGRFYIQMCTELKLSDYEGRLIQNSITIERMVLSAFDERRNKYLEEHPSAGKDPKKTGGPIYRRRDGKWVRELILYDKEEIRRIWRQANNGEDATAGLTHMMIWHSNLNDATYQRTRALVRTGMDPRMCSLMQGSTLPRRSGAAGAAVKGVGTMVMELIRMIKRGINDRNFWRGENGRRTRIAYERMCNILKGKFQTAAQRAMMDQVRESRNPGNAEIEDLIFLARSALILRGSVAHKSCLPACVYGLAVASGYDFEREGYSLVGIDPFRLLQNSQVFSLIRPNENPAHKSQLVWMACHSAAFEDLRVSSFIRGTRVAPRGQLSTRGVQIASNENMETMDSSTLELRSRYWAIRTRSGGNTNQQRASAGQISVQPTFSVQRNLPFERATIMAAFTGNTEGRTSDMRTEIIRMMESSRPEDVSFQGRGVFELSDEKATNPIVPSFDMSNEGSYFFGDNAEEYDN>NC_007360.1_2 [1487 - 1137] (REVERSE SENSE) Influenza A virus (A/Goose/Guangdong/1/96(H5N1)) segment 5, complete sequenceMSKEGTIGFVAFSSESSKTPRPWKDTSSGLELSIILMISVLMSDVLPSVFPVNAAIMVALSKGRFLCTEKVGCTLICPADALCWLVFPPLLVLIAQYLLLSSRVLESIVSMFSFEAI>NC_007360.1_3 [465 - 28] (REVERSE SENSE) Influenza A virus (A/Goose/Guangdong/1/96(H5N1)) segment 5, complete sequenceMPDHHVSETSSCIFSIVRLTPNSPDLLFVIQNQLSHPFSVSSSVDWTSSFLWVLPRTGMFFQVFVPPFIKCRENHSLYCYAVLNQPSFIVAEFEFSAHLYIKPPNSTNHSSNRCSDLSSILAFSTSFHLFIRSFGALRRHDVDVTQ>NC_007358.1_1 [25 - 2295] Influenza A virus (A/Goose/Guangdong/1/96(H5N1)) segment 2, complete 
sequenceMDVNPTLLFLKVPAQNAISTTFPYTGDPPYSHGTGTGYTMDTVNRTHQYSEKGKWTTNTETGAPQLNPIDGPLPEDNEPSGYAQTDCVLEAMAFLEESHPGIFENSCLETMEVVQQTRVDKLTQGRQTYDWTLKRNQPAATALANTIEVFRSNGLTANESGRLIDFLKDVMESMDKGEMEIITHFQRKRRVRDNMTKKMVTQRTIGKKKQRLNKRSYLIRALTLNTMTKDAERGKLKRRAIATPGMQIRGFVYFVETLARSICEKLEQSGLPVGGNEKKAKLANVVRKMMTNSQDTELSFTITGDNTKWNENQNPRMFLAMITYITRNQPEWFRNVLSIAPIMFSNKMARLGKGYMFESKSMKLRTQIPAEMLASIDLKYFNESTRKKIEKIRPLLIDGTASLSPGMMMGMFNMLSTVLGVSILNLGQKRYTKTTYWWDGLQSSDDFALIVNAPNHEGIEAGVDRFYRTCKLVGINMTKKKSYINRTGTCEFTSFFYRYGFVANFSMELPSFGVSGINESADMSIGVTVIKNNMMDNDLGPATAQMALQLFIKDYRYPYRCHRGDTQIQTRRSFELKKLWEQTRSKAGLLVSDGGPNPYNIRNLHIPEAGLKWELMDEDYQGRLCNPLNPFVSHKEIESVNNAVVMPAHGPAKSMEYDAVATTHSWIPKRNRSILNTSQRGILEDEQMYQKCCNLFEKFFPSSSYRRPVGISSMVEAMVSRARIDARIDFESGRIKKEEFAEIMKICSTIEELGRQK>NC_007358.1_2 [1373 - 963] (REVERSE SENSE) Influenza A virus (A/Goose/Guangdong/1/96(H5N1)) segment 2, complete sequenceMRAKSSEDWSPSHQYVVLVYLFCPRFRIETPKTVLSILNMPIIIPGLNEAVPSISRGLIFSIFFLVDSLKYFKSMLASISAGICVRSFMLLLSNMYPFPNLAILFENIIGAMLKTFLNHSGWFLVMYVIIARNIRGF>NC_007363.1_1 [26 - 781] Influenza A virus (A/Goose/Guangdong/1/96(H5N1)) segment 7, complete sequenceMSLLTEVETYVLSIVPSGPLKAEIAQRLEDVFAGKNTDLEALMEWLKTRPILSPLTKGILGFVFTLTVPSERGLQRRRFVQNALNGNGDPNNMDRAVKLYKKLKREITFHGAKEVALSYSTGALASCMGLIYNRMGTVTTEVAFGLVCATCEQIADSQHRSHRQMATTTNPLIRHENRMVLASTTAKAMEQMAGSSEQAAEAMEVASQARQMVQAMRTIGTHPSSSAGLKDNLLENLQAYQKRMGVQMQRFK>NC_007363.1_2 [865 - 542] (REVERSE SENSE) Influenza A virus (A/Goose/Guangdong/1/96(H5N1)) segment 7, complete sequence


MHLKKRRSRIHNIKCSIPMILAATTRGSLESLHLHSHSFLVGLQIFKKIIFQTGTGARMSPNCPHCLHHLPSLTSNLHGFRCLLTRSSHLLHSLSCSAGQHHSVLMPD>NC_007363.1_3 [378 - 1] (REVERSE SENSE) Influenza A virus (A/Goose/Guangdong/1/96(H5N1)) segment 7, complete sequenceMSATSLAPWNVISLFSFLYSLTALSILFGSPFPFKAFWTKRLRCSPRSLGTVSVNTNPKIPLVRGDRIGLVFSHSMRASRSVFFPAKTSSSLCAISALRGPDGTIERTYVSTSVRRLIFQYLPAFA>NC_004907.1_1 [33 - 788] Influenza A virus (A/Hong Kong/1073/99(H9N2)) segment 7, complete sequenceMSLLTEVETYVLSIIPSGPLKAEIAQRLEDVFAGKNTDLEALMEWLKTRPILSPLTKGILGFVFTLTVPSERGLQRRRFVQNALNGNGDPNNMDRAVKLYKKLKREMTFHGAKEVALSYSTGALASCMGLIYNRMGTVTTEVALGLVCATCEQIADAQHRSHRQMATTTNPLIRHENRMVLASTTAKAMEQMAGSSEQAAEAMEVASQARQMVQAMRTIGTHPSSSAGLKDDLIENLQAYQKRMGVQMQRFK>NC_004907.1_2 [385 - 2] (REVERSE SENSE) Influenza A virus (A/Hong Kong/1073/99(H9N2)) segment 7, complete sequenceMSATSFAPWNVISLFSFLYSLTALSMLFGSPFPFRAFWTNRLRCSPRSLGTVSVNTNPKIPLVRGDRIGLVFSHSMRASRSVFFPAKTSSSLCAISALRGPDGMIERTYVSTSVRRLIFQYLPAFGIP>NC_004912.1_1 [21 - 2168] Influenza A virus (A/Hong Kong/1073/99(H9N2)) segment 3, complete sequenceMEDFVRQCFNPMIVELAEKTMKEYGEDPKIETNKFAAICTHLEVCFMYSDFHFIDERGESIIVESGDPNALLKHRFEIIEGRDRAMAWTVVNSICNTTGVDKPKFLPDLYDYKENRFTEIGVTRREVHIYYLEKANKIKSEKTHIHIFSFTGEEMATKADYTLDEESRARIKTRLFTIRQEMASRGLWDSFRQSERGEETIEERFEITGTMRRLADQSLPPNFSSLENFRAYVDGFKPNGCIEGKLSQMSKEVNARIEPFLKTTPRPLRLPDGPPCSQRSKFLLMDALKLSIEDPSHEGEGIPLYDAIKCMKTFFGWREPNIIKPHEKGINPNYLLAWKQVLAELQDIENEDKIPKTKNMKKTSQLMWALGENMAPEKLDFEDCKDIGDLKQYQSDEPELRSIASWIQSEFNKACELTDSSWIELDEIGEDVAPIEHIASMRRNYFTAEVSHCRATEYIMKGVYINTALLNASCAAMDDFQLIPMISKCRTKEGRRKTNLYGFIIKGRSHLRNDTDVVNFVSMEFSLTDPRLEPHKWEKYCVLEVGEMLLRTAIGQVSRPMFLYVRTNGTSKIKMKWGMEMRRCLLQSLQQIESMIEAESSIKEKDMTKEFFENRSETWPIGESPKGVEEGSIGKVCRTLLAKSVFNSLYSSPQLEGFSAESRKLLLIVQALRDNLEPGTFDLEGLYGAIEECLINDPWVLLNASWFNSFLTHALK>NC_004912.1_2 [1735 - 1412] (REVERSE SENSE) Influenza A virus (A/Hong Kong/1073/99(H9N2)) segment 3, complete 
sequenceMEVPLVLTYRNMGLDTWPIAVRKSISPTSRTQYFSHLCGSSLGSVRENSILTKFTTSVSFLKWDLPFIMNPYRFVFLLPSFVLHLLIIGISWKSSMAAQDALSKAVFM>NC_004912.1_3 [1688 - 1119] (REVERSE SENSE) Influenza A virus (A/Hong Kong/1073/99(H9N2)) segment 3, complete sequenceMAYCSPQEHFPYFKNTVLFPFVWLQPWVSKGEFHTHKVHHVSIISQMGPSFYNEPIQVCLPSSFFCSAFAYHWNQLEVIHGCTRCIEQSCIYVNPLHYVLSGPAMRHFRCEVVPSHACNVLNWGNIFPYLIEFYPARIGQFTCLVELTLDPACYRSELWLITLILFQIANIFAVLKVQFFRCHILPECPH>NC_004912.1_4 [331 - 2] (REVERSE SENSE) Influenza A virus (A/Hong Kong/1073/99(H9N2)) segment 3, complete sequenceMGLSTPVVLQMLFTTVQAIARSLPSIISNLCFNNAFGSPDSTIIDSPRSSMKWKSEYMKQTSKCVHIAANLFVSIFGSSPYSFIVFSASSTIIGLKHCRTKSSILDQYLL>NC_004906.1_1 [27 - 716] Influenza A virus (A/Hong Kong/1073/99(H9N2)) segment 8, complete sequenceMDSNTVSSFQVDCFLWHVRKRFADQELGDAPFLDRLRRDQKSLRGRGSTLGLDIRTATREGKHIVERILEEESDEALKMTIASVPASRYLTEMTLEEMSRDWLMLIPKQKVTGPLCIRMDQAVMGKTIILKANFSVIFNRLEALILLRAFTDEGAIVGEISPLPSLPGHTDEDVKNAIGVLIGGLEWNDNTVRVSETLQRFTWRSSDENGRSPLPPKQKRKVERTIEPEV>NC_004906.1_2 [535 - 861] Influenza A virus (A/Hong Kong/1073/99(H9N2)) segment 8, complete sequenceMTRMSKMQLGSSSEDLNGMITQFESLKLYRDSLGEAVMRMGDLHSLQNRNGKWREQLSQKFEEIRWLIEEMRHRLRITENSFEQITFMQALQLLLEVEQEIRTFSFQLI>NC_004906.1_3 [643 - 293] (REVERSE SENSE) Influenza A virus (A/Hong Kong/1073/99(H9N2)) segment 8, complete sequenceMLLQVNLCRVSETRTVLSFHSSPPMRTPIAFLTSSSVCPGREGNGEISPTIAPSSVNALSSIRASSRLKITLKFAFNMMVLPITAWSILMQRGPVTFCLGMSINQSLDISSRVISVR>NC_004905.1_1 [28 - 1539] Influenza A virus (A/Hong Kong/1073/99(H9N2)) segment 5, complete 
sequenceMSDINIMASQGTKRSYEQMETGGERQNATEIRASVGRMVGGIGRFYVQMCTELKLSDQEGRLIQNSITIERMVLSAFDERRNRYLEEHPSAGKDPKKTGGPIYRRRDGKWVRELILYDKEEIRRIWRQANNGEDATAGLTHMMIWHSNLNDATYQRTRALVRTGMDPRMCSLMQGSTLPRRSGAAGAAIKGVGTMVMELIRMIKRGINDRNFWRGDNGRRTRIAYERMCNILKGKFQTAAQRAMMDQVRESRNPGNAEIEDLIFLARSALILRGSVAHKSCLPACVYGLAVASGYDFEREGYSLVGIDPFRLLQNSQVFSLIRPNENPAHKSQLVWMACHSAAFEDLRVSSFIRGTRVIPRGQLSTRGVQIASNENVEAMDSSTLELRSRYWAIRTRSGGNTNQQRASAGQISVQPTFSVQRNLPFERPTIMAAFKGNTEGRTSDMRTEIIRMMESARPEDVSFQGRGVFELSDEKATNPIVPSFDMSNEGSYFFGDNAEEYDN>NC_004905.1_2 [1047 - 625] (REVERSE SENSE) Influenza A virus (A/Hong Kong/1073/99(H9N2)) segment 5, complete sequenceMACHPYQLTFMCWILIWSNKTEDLTVLKQTERIYPNQRVPFPLKIISTGHSEPVHTSRQAGLMGYGSSQDECRPCQKDEIFNFSIPRISAFSHLIHHCSLCCCLKFPFEDVAHSLICNPCSSSIIASPEVPVINASLYHPN

>NC_004908.1_1 [32 - 1711] Influenza A virus (A/Hong Kong/1073/99(H9N2)) segment 4, complete sequenceMETISLITILLVVTASNADKICIGHQSTNSTETVDTLTETNVPVTHAKELLHTEHNGMLCATSLGHPLILDTCTIEGLVYGNPSCDLLLGGREWSYIVERSSAVNGTCYPGNVENLEELRTLFSSASSYQRIQIFPDTTWNVTYTGTSRACSGSFYRSMRWLTQKSGFYPVQDAQYTNNRGKSILFVWGIHHPPTYTEQTNLYIRNDTTTSVTTEDLNRTFKPVIGPRPLVNGLQGRIDYYWSVLKPGQTLRVRSNGNLIAPWYGHVLSGGSHGRILKTDLKGGNCVVQCQTEKGGLNSTLPFHNISKYAFGTCPKYVRVNSLKLAVGLRNVPARSSRGLFGAIAGFIEGGWPGLVAGWYGFQHSNDQGVGMAADRDSTQKAIDKITSKVNNIVDKMNKQYEIIDHEFSEVETRLNMINNKIDDQIQDVWAYNAELLVLLENQKTLDEHDANVNNLYNKVKRALGSNAMEDGKGCFELYHKCDDQCMETIRNGTYNRRKYREESRLERQKIEGVKLESEGTYKILTIYSTVASSLVLAMGFAAFLFWAMSNGSCRCNICI*>NC_004909.1_1 [1 - 1401] Influenza A virus (A/Hong Kong/1073/99(H9N2)) segment 6, complete sequenceMNPNQKIIALGSVSITIATICLLMQIAILATTMTLHFNECTNPSNNQAVPCEPIIIERNITEIVHLNNTTIEKESCPKVAEYKNWSKPQCQITGFAPFSKDNSIRLSAGGDIWVTREPYVSCGLGKCYQFALGQGTTLNNKHSNGTIHDRSPHRTLLMNELGVPFHLGTKQVCIAWSSSSCHDGKAWLHVCVTGDDRNATASIIYDGMLTDSIGSWSKNILRTQESECVCINGTCTVVMTDGSASGRADTKILFIREGKIVHIGPLSGSAQHVEECSCYPRYPEVRCVCRDNWKGSNRPVLYINVADYSVDSSYVCSGLVGDTPRNDDSSSSSNCRDPNNERGGPGVKGWAFDNGNDVWMGRTIKKDSRSGYETFRVVGGWTTANSKSQINRQVIVDSDNWSGYSGIFSVEGKTCINRCFYVELIRGRPQETRVWWTSNSIIVFCGTSGTYGTGSWPDGANINFMSI>NC_004910.1_1 [28 - 2304] Influenza A virus (A/Hong Kong/1073/99(H9N2)) segment 1, complete 
sequenceMERIKELRNLMSQSRTREILTKTTVDHMAIIKKYTSGRQEKNPALRMKWMMAMKYPITADKRIMEMIPERNEQGQTLWSKTNDAGSDRVMVSPLAVTWWNRNGPTTSTVHYPKVYKTYFEKVERLKHGTFGPVHFRNQVKIRRRVDMNPGHADLSAKEAQDVIMEVVFPNEVGARILTSESQLTITKEKREELKNCNIAPLMVAYMLERELVRKTRFLPVAGGTSSVYIEVLHLTQGTCWEQMYTPGGEVRNDDVDQSLIIAARNIVRRATVSADPLASLLEMCHSTQIGGVRMVDILKQNPTEEQAVDICKAAMGLKISSSFSFGGFTFKRTKGSSVKREEEVLTGNLQTLKIKVHEGYEEFTMVGRRATAILRKATRRMIQLIVSGRDEQSIAEAIIVAMVFSQEDCMVKAVRGDLNFVNRANQRLNPMHQLLRHFQKDAKVLFQNWGIEPIDNVMGMIGILPDMTPSTEMSLRGVRVSKMGVDEYSSTERVVVSIDRFLRVRDQRGNVLLSPEEVSETQGMEKLTITYSSSMMWEINGPESVLVNTYQWIIRNWETVKIQWSQEPTMLYNKMEFEPFQSLVPKAARSQYSGFVRTLFQQMRDVLGTFDTVQIIKLLPFAAAPPEQSRMQFSSLTVNVRGSGMRILVRGNSPAFNYNKTTKRLTILGKDAGALTEDPDEGTAGVESAVLRGFLILGKEDKRYGPALSINELSNLTKGEKANVLIGQGDVVLVMKRKRDSSILTDSQTATKRIRMAIN>NC_004910.1_2 [2303 - 2001] (REVERSE SENSE) Influenza A virus (A/Hong Kong/1073/99(H9N2)) segment 1, complete sequenceMMAIRILLVAVWLSVSMLESRFRFITNTTSPCPINTLAFSPFVRLLSSLMLNAGPYLLSSLPRIRNPLNTADSTPAVPSSGSSVSAPASFPSIVSLLVVLL>NC_004910.1_3 [1295 - 735] (REVERSE SENSE) Influenza A virus (A/Hong Kong/1073/99(H9N2)) segment 1, complete sequenceMFTKFKSPRTAFTMQSSCENTIATIIASAIDCSSLPLTISWIILLVAFLRMAVALRPTIVNSSYPSCTFIFNVWRLPVSTSSSLLTEDPFVLLKVNPPKLKDELIFKPIAALHISTACSSVGFCLRMSTILTPPICVLWHISRSEANGSADTVALLTMFLAAMIKLWSTSSFLTSPPGVYICSQQVP>NC_004905.2_1 [22 - 1533] Influenza A virus (A/Hong Kong/1073/99(H9N2)) segment 5, complete sequenceMSDINIMASQGTKRSYEQMETGGERQNATEIRASVGRMVGGIGRFYVQMCTELKLSDQEGRLIQNSITIERMVLSAFDERRNRYLEEHPSAGKDPKKTGGPIYRRRDGKWVRELILYDKEEIRRIWRQANNGEDATAGLTHMMIWHSNLNDATYQRTRALVRTGMDPRMCSLMQGSTLPRRSGAAGAAIKGVGTMVMELIRMIKRGINDRNFWRGDNGRRTRIAYERMCNILKGKFQTAAQRAMMDQVRESRNPGNAEIEDLIFLARSALILRGSVAHKSCLPACVYGLAVASGYDFEREGYSLVGIDPFRLLQNSQVFSLIRPNENPAHKSQLVWMACHSAAFEDLRVSSFIRGTRVIPRGQLSTRGVQIASNENVEAMDSSTLELRSRYWAIRT


RSGGNTNQQRASAGQISVQPTFSVQRNLPFERPTIMAAFKGNTEGRTSDMRTEIIRMMESARPEDVSFQGRGVFELSDEKATNPIVPSFDMSNEGSYFFGDNAEEYDN>NC_004905.2_2 [1041 - 619] (REVERSE SENSE) Influenza A virus (A/Hong Kong/1073/99(H9N2)) segment 5, complete sequenceMACHPYQLTFMCWILIWSNKTEDLTVLKQTERIYPNQRVPFPLKIISTGHSEPVHTSRQAGLMGYGSSQDECRPCQKDEIFNFSIPRISAFSHLIHHCSLCCCLKFPFEDVAHSLICNPCSSSIIASPEVPVINASLYHPN>NC_004911.1_1 [24 - 2297] Influenza A virus (A/Hong Kong/1073/99(H9N2)) segment 2, complete sequenceMDVNPTLLFLKVPAQNAISTTFPYTGDPPYSHGTGTGYTMDTVNRTHQYSEKGRWTTNTETGAPQLNPIDGPLPEDNEPSGYAQTDCVLEAMAFLEESHPGLFENSCLETMEVVQQTRVDKLTQGRQTYDWTLNRNQPAATALANTIEVFRSNGLTANESGRLIDFLKDVMESMDKEEMEITTHFQRKRRVRDNMTKKMVTQRTIGKKKQKLTKKSYLIRALTLNTMTKDAERGKLKRRAIATPGMQIRGFVHFVEALARSICEKLEQSGLPVGGNEKKAKLANVVRKMMTNSQDTELSFTVTGDNTKWNENQNPRIFLAMITYITRNQPEWFRNVLSIAPIMFSNKMARLGKGYMFESKSMKLRTQIPAEMLANIDLKYFNESTRKKIEKIRPLLIEGTASLSPGMMMGMFNMLSTVLGVSILNLGQKRYTKTTYWWDGLQSSDDFALIVNAPNHEGIQAGVDRFYRTCKLVGINMSKKKSYINRTGTFEFTSFFYRYGFVANFSMELPSFGVSGINESADMSIGVTVIKNNMINNDLGPATAQMALQLFIKDYRYTYRCHRGDTQIQTRRSFELKKLWEQTRSKAGLLVSDGGPNLYNIRNLHIPEVCLKWELMDEDYQGRLCNPLNPFVSHKEVESVNNAVVMPAHGPAKSMEYDAVATTHSWIPKRNRSILNTSQRGILEDEQMYQKCCTLFEKFFPSSSYRRPVGISSMMEAMVSRARIDARIDFESGRIKKEEFAEILKICSTIEELGRQGK>NC_004911.1_2 [2272 - 1925] (REVERSE SENSE) Influenza A virus (A/Hong Kong/1073/99(H9N2)) segment 2, complete sequenceMVEQIFKISANSSFLILPDSKSIRASIRALDTMASIMLEIPTGLRYELLGKNFSNRVQHFWYICSSSRIPLWLVLRMERFLLGIHECVVATASYSMLLAGPWAGITTALLTDSTSL>NC_004911.1_3 [1372 - 962] (REVERSE SENSE) Influenza A virus (A/Hong Kong/1073/99(H9N2)) segment 2, complete sequenceMRAKSSEDWSPSHQYVVLVYLFCPRFKIETPKTVLSILNMPIIIPGLNEAVPSISRGLIFSIFFLVDSLKYFKSMFASISAGICVRSFMLLLSNMYPFPNLAILFENIIGAMLKTFLNHSGWFLVMYVIIARNIRGF>NC_007374.1_1 [44 - 1729] Influenza A virus (A/Korea/426/68(H2N2)) segment 4, complete 
sequenceMAIIYLILLFTAVRGDQICIGYHANNSTEKVDTILERNVTVTHAKDILEKTHNGKLCKLNGIPPLELGDCSIAGWLLGNPECDRLLSVPEWSYIMEKENPRYSLCYPGSFNDYEELKHLLSSVKHFEKVKILPKDRWTQHTTTGGSWACAVSGKPSFFRNMVWLTRKGSNYPVAKGSYNNTSGEQMLIIWGVHHPNDEAEQRALYQNVGTYVSVATSTLYKRSIPEIAARPKVNGLGRRMEFSWTLLDMWDTINFESTGNLVAPEYGFKISKRGSSGIMKTEGTLENCETKCQTPLGAINTTLPFHNVHPLTIGECPKYVKSEKLVLATGLRNVPQIESRGLFGAIAGFIEGGWQGMVDGWYGYHHSNDQGSGYAADKESTQKAFNGITNKVNSVIEKMNTQFEAVGKEFSNLEKRLENLNKKMEDGFLDVWTYNAELLVLMENERTLDFHDSNVKNLYDKVRMQLRDNVKELGNGCFEFYHKCDNECMDSVKNGTYDYPKYEEESKLNRNEIKGVKLSSMGVYQILAIYATVAGSLSLAIMMAGISFWMCSNGSLQCRICI>NC_007376.1_1 [25 - 2172] Influenza A virus (A/Korea/426/68(H2N2)) segment 3, complete sequenceMEDFVRQCFNPMIVELAEKAMKEYGEDLKIETNKFAAICTHLEVCFMYSDFHFINEQGESIMVELDDPNALLKHRFEIIEGRDRTMAWTVVNSICNTTGAEKPKFLPDLYDYKENRFIEIGVTRREVHIYYLEKANKIKSENTHIHIFSFTGEEMATKADYTLDEESRARIKTRLFTIRQEMANRGLWDSFRQSERGEETIEERFEITGTMRRLADQSLPPNFSCLENFRAYVDGFEPNGYIEGKLSQMSKEVNAKIEPFLKTTPRPIRLPDGPPCFQRSKFLLMDALKLSIEDPSHEGEGIPLYDAIKCMRTFFGWKEPYIVKPHEKGINPNYLLSWKQVLAELQDIENEEKIPRTKNMKKTSQLKWALGENMAPEKVDFDNCRDISDLKQYDSDEPELRSLSSWIQNEFNKACELTDSIWIELDEIGEDVAPIEHIASMRRNYFTAEVSHCRATEYIMKGVYINTALLNASCAAMDDFQLIPMISKCRTKEGRRKTNLYGFIIKGRSHLRNDTDVVNFVSMEFSLTDPRLEPHKWEKYCVLEIGDMLLRSAIGQMSRPMFLYVRTNGTSKIKMKWGMEMRPCLLQSLQQIESMVEAESSVKEKDMTKEFFENKSETWPIGESPKGVEEGSIGKVCRTLLAKSVFNSLYASPQLEGFSAESRKLLLVVQALRDNLEPGTFDLGGLYEAIEECLINDPWVLLNASWFNSFLTHALR>NC_007376.1_2 [858 - 550] (REVERSE SENSE) Influenza A virus (A/Korea/426/68(H2N2)) segment 3, complete sequenceMKTRRPIRKSNWSWCCFQKRFNFCIYFFGHLRKLALNVAVRFESIHIGSKILKAGEVRRETLVGKPAHCPCDFKSFFNCFFASFGLTKGIPEASVGHFLSYGE>NC_007377.1_1 [26 - 781] Influenza A virus (A/Korea/426/68(H2N2)) segment 7, complete sequenceMSLLTEVETYVLSIVPSGPLKAEIAQRLEDVFAGKNTDLEALMEWLKTRPILSPLTKGILGFVFTLTVPSERGLQRRRFVQNALNGNGDPNNMDRAVKLYRKLKREITFHGAKEVALSYSAGALASCMGLIYNRMGAVTTEVAFAVVCATCEQIADSQHRSHRQMVTTTNPLIRHENRMVLASTTAKAMEQMAGSSEQAAEAMEVASQARQMVQAMRAIGTPPSSSAGLKDDLLENLQAYQKRMGVQMQRFK>NC_007377.1_2 [517 - 185] (REVERSE SENSE) Influenza A 
virus (A/Korea/426/68(H2N2)) segment 7, complete sequenceMPVRPMLGVSNLFTGCTYHGKGHFSGHSPHPVVYEAHATGKCTSRITERYFFGPMECYLPLKLSIQFNCSVHVIWIPIPIEGILDKASTLQSSLTWHGEREYKSQNPLSQR>NC_007377.1_3 [378 - 1] (REVERSE SENSE) Influenza A virus (A/Korea/426/68(H2N2)) segment 7, complete sequenceMSATSLAPWNVISLLSFLYSLTALSMLFGSPFPLRAFWTKRLRCSPRSLGTVSVNTNPKIPLVRGDRIGLVFSHSMRASRSVFFPAKTSSSLCAISALRGPDGTIERTYVSTSVRRLIFQYLPAFA>NC_007382.1_1 [1 - 1407] Influenza A virus (A/Korea/426/68(H2N2)) segment 6, complete sequenceMNPNQKIITIGSVSLTIATVCFLMQIAILVTTVTLHFKQHECDSPASNQVMPCEPIIIERNITEIVYLNNTTIEKEICPEVVEYRNWSKPQCQITGFAPFSKDNSIRLSAGGDIWVTREPYVSCDPGKCYQFALGQGTTLDNKHSNDTIHDRIPHRTLLMNELGVPFHLGTRQVCVAWSSSSCHDGKAWLHVCVTGDDKNATASFIYDGRLMDSIGSWSQNILRTQESECVCINGTCTVVMTDGSASGRADTRILFIEEGKIVHISPLSGSAQHVEECSCYPRYPDVRCICRDNWKGSNRPVIDINMEDYSIDSSYVCSGLVGDTPRNDDRSSNSNCRNPNNERGNPGVKGWAFDNGDDVWMGRTISKDLRSGYETFKVIGGWSTPNSKSQINRQVIVDSNNWSGYSGIFSVEGKRCINRCFYVELIRGRQQETRVWWTSNSIVVFCGTSGTYGTGSWPDGANINFMPI*>NC_007381.1_1 [1 - 1494] Influenza A virus (A/Korea/426/68(H2N2)) segment 5, complete sequenceMASQGTKRSYEQMETDGERQNATEIRASVGKMIDGIGRFYIQMCTELKLSDYEGRLIQNSLTIERMVLSAFDERRNKYLEEHPSAGKDPKKTGGPIYKRVDGKWMRELVLYDKEEIRRIWRQANNGDDATAGLTHMMIWHSNLNDTTYQRTRALVRTGMDPRMCSLMQGSTLPRRSGAAGAAVKGVGTMVMELIRMIKRGINDRNFWRGENGRKTRSAYERMCNILKGKFQTAAQRAMMDQVRESRNPGNAEIEDLIFLARSALILRGSVAHKSCLPACVYGPAIASGYNFEKEGYSLVGIDPFKLLQNSQVYSLIRPNENPAHKSQLVWMACNSAAFEDLRVLSFIRGTKVSPRGKLSTRGVQIASNENMDTMESSTLELRSRYWAIRTRSGGNTNQQRASAGQISVQPAFSVQRNLPFDKPTIMAAFTGNTEGRTSDMRAEIIRMMEGAKPEEMSFQGRGVFELSDEKATNPIVPSFDMSNEGSYFFGDNAEEYDN*>NC_007381.1_2 [723 - 184] (REVERSE SENSE) Influenza A virus (A/Korea/426/68(H2N2)) segment 5, complete sequenceMIHHCSLCSCLKFSFENVAHSLVSTPCFPSILTSPEVPIIDPTFDHPDQLHHHCPNSFDCSACSSRPPRESRTLHQRAHPGIHSGANKSSCPLVCCIIQIGMPDHHVSQPSCCIITIIGLAPDSPYFFFVIKDEFPHPLSIYSLVYGSSSFLRILPRAGMFFQIFIPSLVKSREHHSLYC>NC_007375.1_1 [25 - 2295] Influenza A virus (A/Korea/426/68(H2N2)) segment 2, complete 
sequenceMDVNPTLLFLKVPAQNAISTTFPYTGDPPYSHGTGTGYTMDTVNRTHQYSEKGKWTTNTETGAPQLNPIDGPLPEDNEPSGYAQTDCVLEAMAFLEESHPGIFENSCLETMEVIQQTRVDKLTQGRQTYDWTLNRNQPAATALANTIEVFRSNGLTANESGRLIDFLKDVIESMDKEEMEITTHFQRKRRVRDNMTKKMVTQRTIGKKKQRLNKRSYLIRALTLNTMTKDAERGKLKRRAIATPGMQIRGFVHFVETLARNICEKLEQSGLPVGGNEKKAKLANVVRKMMTNSQDTELSFTITGDNTKWNENQNPRVFLAMITYITRNQPEWFRNVLSIAPIMFSNKMARLGKGYMFESKSMKLRTQIPAEMLASIDLKYFNESTRKKIEKIRPLLIDGTVSLSPGMMMGMFNMLSTVLGVSILNLGQKKYTKTTYWWDGLQSSDDFALIVNAPNHEGIQAGVNRFYRTCKLVGINMSKKKSYINRTGTFEFTSFFYRYGFVANFSMELPSFGVSGINESADMSIGVTVIKNNMINNDLGPATAQMALQLFIKDYRYTYRCHRGDTQIQTRRSFELKKLWEQTRSKAGLLVSDGGSNLYNIRNLHIPEVCLKWELMDEDYQGRLCNPLNPFVSHKEIESVNNAVVMPAHGPAKSMEYDAV


ATTHSWTPKRNRSILNTSQRGILEDEQMYQKCCNLFEKFFPSSSYRRPVGISSMVEAMVSRARIDARIDFESGRIKKEEFAEIMKICSTIEELRRQK>NC_007380.1_1 [1 - 711] Influenza A virus (A/Korea/426/68(H2N2)) segment 8, complete sequenceMDSNTVSSFQVDCFLWHVRKQVVDQELGDAPFLDRLRRDQKSLRGRGSTLDLDIEAATRVGKQIVERILKEESDEALKMTMASAPASRYLTDMTIEELSRDWFMLMPKQKVEGPLCIRIDQAIMDKNIMLKANFSVIFDRLETLILLRAFTEEGAIVGEISPLPSLPGHTIEDVKNAIGVLIGGLEWNDNTVRVSKTLQRFAWRSSNENGRPPLTPKQKRKMARTIRSKVRRDKMAD>NC_007380.1_2 [467 - 835] Influenza A virus (A/Korea/426/68(H2N2)) segment 8, complete sequenceMLAKFHHCLLFQDILLRMSKMQLGSSSEDLNGMITQFESLKLYRDSLGEAVMRMGDLHSLQNRNGKWREQLGQKFEEIRWLIEEVRHRLKITENSFEQITFMQALQLLFEVEQEIRTFSFQLI*>NC_007380.1_3 [767 - 120] (REVERSE SENSE) Influenza A virus (A/Korea/426/68(H2N2)) segment 8, complete sequenceMLFAQNYSLLSSICVSLLQSAILSLRTFDLIVLAIFRFCFGVSGGLPFSLLLLQANLCRVLETRTVLSFHSSPPMRTPIAFLTSSIVCPGREGNGEISPTIAPSSVKALSNIRVSSRSKITLKFAFNMMFLSMIAWSILMQRGPSTFCLGISMNQSLDNSSIVMSVRYREAGAEAMVILSASSDSSFRILSTICFPTRVAASMSRSRVLPLPLRDF>NC_007378.1_1 [28 - 2304] Influenza A virus (A/Korea/426/68(H2N2)) segment 1, complete sequenceMERIKELRNLMSQSRTREILTKTTVDHMAIIKKYTSGRQEKNPSLRMKWMMAMKYPITADKRITEMVPERNEQGQTLWSKMSDAGSDRVMVSPLAVTWWNRNGPMTSTVHYPKIYKTYFEKVERLKHGTFGPVHFRNQVKIRRRVDINPGHADLSAKEAQDVIMEVVFPNEVGARILTSESQLTITKEKKEELQDCKISPLMVAYMLERELVRKTRFLPVAGGTSSVYIEVLHLTQGTCWEQMYTPGGEVRNDDVDQSLIIAARNIVRRAAVSADPLASLLEMCHSTQIGGTRMVDILRQNPTEEQAVDICKAAMGLRISSSFSFGGFTFKRTSGSSIKREEEVLTGNLQTLKIRVHEGYEEFTMVGKRATAILRKATRRLVQLIVSGRDEQSIAEAIIVAMVFSQEDCMIKAVRGDLNFVNRANQRLNPMHQLLRHFQKDAKVLFQNWGIEHIDNVMGMIGVLPDMTPSTEMSMRGIRVSKMGVDEYSSTERVVVSIDRFLRVRDQRGNVLLSPEEVSETQGTEKLTITYSSSMMWEINGPESVLVNTYQWIIRNWETVKIQWSQNPTMLYNKMEFEPFQSLVPKAIRGQYSGFVRTLFQQMRDVLGTFDTTQIIKLLPFAAAPPKQSRMQFSSLTVNVRGSGMRILVRGNSPVFNYNKTTKRLTILGKDAGTLTEDPDEGTSGVESAVLRGFLILGKEDRRYGPALSINELSTLAKGEKANVLIGQGDVVLVMKRKRDSSILTDSQTATKRIRMAIN>NC_007378.1_2 [2303 - 2001] (REVERSE SENSE) Influenza A virus (A/Korea/426/68(H2N2)) segment 1, complete 
sequenceMMAIRILLVAVWLSVSMLESRFRFITNTTSPCPISTLAFSPFARVLSSLMLNAGPYLLSSLPRMRNPLRTADSTPDVPSSGSSVKVPASFPRIVSLLVVLL>NC_007378.1_3 [1163 - 798] (REVERSE SENSE) Influenza A virus (A/Korea/426/68(H2N2)) segment 1, complete sequenceMVAFLSIAVALFPTIVNSSYPSCTLIFNVWRLPVSTSSSLLIDDPLVLLNVNPPKLKDELILSPIAALHISTACSSVGFCLRMSTILVPPICVLWHISNKDASGSADTAALLTMFLAAIIRL>NC_007373.1_1 [28 - 2304] Influenza A virus (A/New York/392/2004(H3N2)) segment 1, complete sequenceMERIKELRNLMSQSRTREILTKTTVDHMAIIKKYTSGRQEKNPSLRMKWMMAMKYPITADKRITEMVPERNEQGQTLWSKMSDAGSDRVMVSPLAVTWWNRNGPVASTVHYPKVYKTYFDKVERLKHGTFGPVHFRNQVKIRRRVDINPHADLSAKEAQDVIMEVVFPNEVGARILTSESQLTITKEKKEELRDCKISPLMVAYMLERELVRKTRFLPVAGGTSSIYIEVLHLTQGTCWEQMYTPGGEVRNDDVDQSLIIAARNIVRRAAVSADPLASLLEMCHSTQIGGTRMVDILRQNPTEEQAVDICKAAMGLRISSSFSFGGFTFKRTSGSSVKKEEEVLTGNLQTLKIRVHEGYEEFTMVGKRATAILRKATRRLVQLIVSGRDEQSIAEAIIVAMVFSQEDCMIKAVRGDLNFVNRANQRLNPMHQLLRHFQKDAKVLFQNWGIEHIDSVMGMVGVLPDMTPSTEMSMRGIRVSKMGVDEYSSTERVVVSIDRFLRVRDQRGNVLLSPEEVSETQGTERLTITYSSSMMWEINGPESVLVNTYQWIIRNWEAVKIQWSQNPAMLYNKMEFEPFQSLVPKAIRSQYSGFVRTLFQQMRDVLGTFDTTQIILLPFAAAPPKQSRMQFSSLTVNVRGSGMRILVRGNSPVFNYNKTTKRLTILGKDAGTLIEDPDESTSGVESAVLRGFLIIGKEDRRYGPALSINELSNLAKGEKANVLIGQGDVVLVMKRKRDSSILTDSQTATKRIRMAIN>NC_007373.1_2 [2303 - 2001] (REVERSE SENSE) Influenza A virus (A/New York/392/2004(H3N2)) segment 1, complete sequenceMMAIRILLVAVWLSVSMLESRFRFITNTTSPCPISTLAFSPFARLLSSLMLNAGPYLLSSLPIMRNPLKTADSTPDVLSSGSSIKVPASFPRIVSLLVVLL>NC_007373.1_3 [1670 - 1368] (REVERSE SENSE) Influenza A virus (A/New York/392/2004(H3N2)) segment 1, complete sequenceMTKTDSGPLISHIIDDEYVIVNLSVPCVSLTSSGDNNTFPRWSRTLKNRSMLTTTLSVLEYSSTPILLTLIPLIDISVLGVISGNTPTIPITLSMCSIPQF>NC_007373.1_4 [1163 - 735] (REVERSE SENSE) Influenza A virus (A/New York/392/2004(H3N2)) segment 1, complete sequenceMVAFLSIAVALFPTIVNSSYPSCTLIFNVWRLPVSTSSSFLTDDPLVLLNVNPPKLKDELILNPIAALHISTACSSVGFCLRMSTILVPPICVLWHISNKDASGSADTAALLTMFLAAIIRLWSTSSFLTSPPGVYICSQHVP>NC_007369.1_1 [46 - 1539] Influenza A virus (A/New York/392/2004(H3N2)) segment 5, complete 
sequenceMASQGTKRSYEQMETDGDRQNATEIRASVGKMIDGIGRFYIQMCTELKLSDHEGRLIQNSLTIEKMVLSAFDERRNKYLEEHPSAGKDPKKTGGPIYRRVDGKWMRELVLYDKEEIRRIWRQANNGEDATAGLTHIMIWHSNLNDATYQRTRALVRTGMDPRMCSLMQGSTLPRRSGAAGAAVKGIGTMVMELIRMVKRGINDRNFWRGENGRKTRSAYERMCNILKGKFQTAAQRAMVDQVRESRNPGNAEIEDLIFLARSALILRGSVAHKSCLPACAYGPAVSSGYDFEKEGYSLVGIDPFKLLQNSQIYSLIRPNENPAHKSQLVWMACHSAAFEDLRLLSFIRGTKVSPRGKLSTRGVQIASNENMDNMGSSTLELRSGYWAIRTRSGGNTNQQRASAGQTSVQPTFSVQRNLPFEKSTIMAAFTGNTEGRTSDMRAEIIRMMEGAKPEEVSFRGRGVFELSDEKATNPIVPSFDMSNEGSYFFGDNAEEYDN>NC_007369.1_2 [768 - 445] (REVERSE SENSE) Influenza A virus (A/New York/392/2004(H3N2)) segment 5, complete sequenceMIHHCSLCSCLKFSFKNVAHSLISTSCFPPILTSPEISIVDPPFDHSDQFHHHCPDSFDCSTCSSGPSRESRALHQRAHSGIHSSSNKSSCPLVCCIIQIGMPDHYVS>NC_007369.1_3 [499 - 194] (REVERSE SENSE) Influenza A virus (A/New York/392/2004(H3N2)) segment 5, complete sequenceMSSGMLHHSNWNARSLCELDQLSHPHHCWLGARFALSLLCHKGRVPSSIFHLLSCIWAPQFSWDLSPRWGVLPGIYSFFHQKQRAPFSLLSSCSGSTALHDH>NC_007369.1_4 [411 - 28] (REVERSE SENSE) Influenza A virus (A/New York/392/2004(H3N2)) segment 5, complete sequenceMAPDSPYLFFVIKDEFPHPFSIYSPVYGPPSFLGIFPRAGVFFQVFIPSFIKSREHHFLYCQAVLDQPPFMITEFKFSAHLDVESPNSINHLPDGCPNLSCILAIPISFHLFIRPFGALGRHDFDVTR>NC_007367.1_1 [26 - 781] Influenza A virus (A/New York/392/2004(H3N2)) segment 7, complete sequenceMSLLTEVETYVLSIVPSGPLKAEIAQRLEDVFAGKNTDLEALMEWLKTRPILSPLTKGILGFVFTLTVPSERGLQRRRFVQNALNGNGDPNNMDKAVKLYRKLKREITFHGAKEIALSYSAGALASCMGLIYNRMGAVTTEVAFGLVCATCEQIADSQHRSHRQMVATTNPLIKHENRMVLASTTAKAMEQMAGSSEQAAEAMEIASQARQMVQAMRAVGTHPSSSTGLRDDLLENLQTYQKRMGVQMQRFK>NC_007367.1_2 [517 - 185] (REVERSE SENSE) Influenza A virus (A/New York/392/2004(H3N2)) segment 7, complete sequenceMPMRPVLGVSNLFTCCTYQAKCHFSGYSPHPIVYEAHATGKCTSRITESYFFGPMERYLPLKFPIQFNCFVHVIWISIPIEGILDKASTLQSSLTGHGEREHKPQNPLSQR>NC_007367.1_3 [378 - 1] (REVERSE SENSE) Influenza A virus (A/New York/392/2004(H3N2)) segment 7, complete 
sequenceMRAISLAPWNVISLLSFLYSLTALSMLFGSPFPLRAFWTKRLRCSPRSLGTVSVNTNPKIPLVRGDRIGLVFSHSMRASRSVFFPAKTSSSLCAISALRGPDGTIERTYVSTSVRRLIFQYLPAFA>NC_007371.1_1 [25 - 2172] Influenza A virus (A/New York/392/2004(H3N2)) segment 3, complete sequenceMEDFVRQCFNPMIVELAEKAMKEYGEDLKIETNKFAAICTHLEVCFMYSDFHFINEQGESIVVELDDPNALLKHRFEIIEGRDRTMAWTVVNSICNTTGAEKPKFLPDLYDYKENRFIEIGVTRREVHIYYLEKANKIKSENTHIHIFSFTGEEIATKADYTLDEESRARIKTRLFTIRQEMANRGLWDSFRQSERGEETIEEKFEISGTMRRLADQSLPPKFSCLENFRAYVDGFEPNGCIEGKLSQMSKEVNAKIEPFLKTTPRPIKLPNGPPCYQRSKFLLMDALKLSIEDPSHEGEGIPLYDAIKCIKTFFGWKEPYIVKPHEKGINSNYLLSWKQVLSELQDIENEEKIPRTKNMKKTSQLKWALGENMAPEKVDFDNCRDISDLKQYDSDEPELRSLSSWIQNEFNKACELTDSIWIELDEIGEDVAPIEYIASMRRNYFTAEVSHCRATEYIMKGVYINTALLNASCAAMDDFQLIPMISKCRTKEGRRKTNLYGFIIKGRSH


LRNDTDVVNFVSMEFSLTDPRLEPHKWEKYCVLEIGDMLLRSAIGQISRPMFLYVRTNGTSKVKMKWGMEMRRCLLQSLQQIESMIEAESSIKEKDMTKEFFENKSEAWPIGESPKGVEEGSIGKVCRTLLAKSVFNSLYASPQLEGFSAESRKLLLVVQALRDNLEPGTFDLGGLYEAIEECLINDPWVLLNASWFNSFLTHALK>NC_007368.1_1 [20 - 1426] Influenza A virus (A/New York/392/2004(H3N2)) segment 6, complete sequenceMNPNQKIITIGSVSLTISTICFFMQIAILITTVTLHFKQYEFNSPPNNQVMLCEPTIIERNITEIVYLTNTTIEKEMCPKLAEYRNWSKPQCDITGFAPFSKDNSIRLSAGGDIWVTREPYVSCDPDKCYQFALGQGTTLNNVHSNDTVHDRTPYRTLLMNELGVPFHLGTKQVCIAWSSSSCHDGKAWLHVCVTGDDKNATASFIYNGRLVDSIVSWSKKILRTQESECVCINGTCTVVMTDGSASGKADTKILFIEEGKIIHTSTLSGSAQHVEECSCYPRYPGVRCVCRDNWKGSNRPIVDINIKDYSIVSSYVCSGLVGDTPRKNDSSSSSHCLDPNNEEGGHGVKGWAFDDGNDVWMGRTISEKLRSGYETFKVIEGWSKPNSKLQINRQVIVDRGNRSGYSGIFSVEGKSCINRCFYVELIRGRKEETEVLWTSNSIVVFCGTSGTYGTGSWPDGADINLMPI>NC_007368.1_2 [360 - 34] (REVERSE SENSE) Influenza A virus (A/New York/392/2004(H3N2)) segment 6, complete sequenceMSPPAESLIELSLEKGANPVMSHCGFDQFLYSASLGHISFSMVVLVRYTISVMFLSIIVGSHSITWLFGGELNSYCLKCNVTVVIRMAICMKKHIVEMVRETEPIVIIF>NC_007370.1_1 [27 - 716] Influenza A virus (A/New York/392/2004(H3N2)) segment 8, complete sequenceMDSNTVSSFQVDCFLWHIRKQVVDQELSDAPFLDRLRRDQRSLRGRGNTLGLDIKAATHVGKQIVEKILKEESDEALKMTMVSTPASRYITDMTIEELSRNWFMLMPKQKVEGPLCIRMDQAIMEKNIMLKANFSVIFDRLETIVLLRAFTEEGAIVGEISPLPSFPGHTIEDVKNAIGVLIGGLEWNDNTVRVSKNLQRFAWRSSNENGGPPLTPKQKRKMARTARSKV>NC_007370.1_2 [493 - 861] Influenza A virus (A/New York/392/2004(H3N2)) segment 8, complete sequenceMLAKSHHCLLFQDILLRMSKMQLGSSSEDLNGMITQFESLKIYRDSLGEAVMRMGDLHLLQNRNGKWREQLGQKFEEIRWLIEEVRHRLKTTENSFEQITFMQALQLLFEVEQEIRTFSFQLI>NC_007370.1_3 [793 - 146] (REVERSE SENSE) Influenza A virus (A/New York/392/2004(H3N2)) segment 8, complete sequenceMLFVQSYFQLFLVCVSLLQSAILSLQTFDLAVLAIFRFCFGVSGGPPFSLLLLQANLCRFLETRTVLSFHSSPPMRTPIAFLTSSIVCPGKEGNGEISPTIAPSSVKALSNTMVSSRSKITLKFAFNMMFFSMIAWSILMQRGPSTFCLGISMNQFLDNSSIVMSVMYREAGVETMVILSASSDSSFRIFSTICFPTWVAALMSRPRVLPLPLRDL>NC_007372.1_1 [25 - 2295] Influenza A virus (A/New York/392/2004(H3N2)) segment 2, complete 
sequenceMDVNPTLLFLKVPAQNAISTTFPYTGDPPYSHGTGTGYTMDTVNRTHQYSEKGKWTTNTETGAPQLNPIDGPLPEDNEPSGYAQTDCVLEAMAFLEESHPGIFENSCLETMEVVQQTRVDKLTQGRQTYDWTLNRNQPAATALANTIEVFRSNGLTANESGRLIDFLKDVMESMDKEEMEITTHFQRKRRVRDNMTKKMVTQRTIGKKKQRVNKRGYLIRALTLNTMTKDAERGKLKRRAIATPGMQIRGFVYFVETLARSICEKLEQSGLPVGGNEKKAKLANVVRKMMTNSQDTELSFTITGDNTKWNENQNPRMFLAMITYITKNQPEWFRNILSIAPIMFSNKMARLGKGYMFESKRMKLRTQIPAEMLASIDLKYFNESTRKKIEKIRPLLIDGTASLSPGMMMGMFNMLSTVLGVSVLNLGQKKYTKTTYWWDGLQSSDDFALIVNAPNHEGIQAGVDRFYRTCKLVGINMSKKKSYINKTGTFEFTSFFYRYGFVANFSMELPSFGVSGINESADMSIGVTVIKNNMINNDLGPATAQMALQLFIKDYRYTYRCHRGDTQIQTRRSFELKKLWDQTQSRAGLLVSDGGPNLYNIRNLHIPEVCLKWELMDENYRGRLCNPLNPFVSHKEIESVNNAVVMPAHGPAKSMEYDAVATTHSWNPKRNRSILNTSQRGILEDEQMYQKCCNLFEKFFPSSSYRRPIGISSMVEAMVSRARIDARIDFESGRIKKEEFSEIMKICSTIEELRRQK>NC_007372.1_2 [2285 - 1884] (REVERSE SENSE) Influenza A virus (A/New York/392/2004(H3N2)) segment 2, complete sequenceMSSSMVEQIFMISENSSFLIRPDSKSILASIRALDTMASTMLEIPIGLLYELLGKNFSNKLQHFWYICSSSRIPLWLVFRIERFLLGFQECVVATASYSILLAGPWAGITTALFTDSISLWLTKGFRGLQSLPR>NC_007372.1_3 [1373 - 1011] (REVERSE SENSE) Influenza A virus (A/New York/392/2004(H3N2)) segment 2, complete sequenceMRAKSSEDWSPSHQYVVLVYFFCPRFSTETPKTVLSMLNMPIIIPGLNDAVPSIRRGLIFSIFFLVDSLKYFRSMLASISAGICVRSFILLLSNMYPFPSLAILFENIIGAMLRMFLNHSG>NC_007366.1_1 [30 - 1727] Influenza A virus (A/New York/392/2004(H3N2)) segment 4, complete sequenceMKTIIALSYILCLVFAQKLPGNDNSTATLCLGHHAVPNGTIVKTITNDQIEVTNATELVQSSSTGGICDSPHQILDGENCTLIDALLGDPQCDGFQNKKWDLFVERSKAYSNCYPYDVPDYASLRSLVASSGTLEFNNESFNWTGVTQNGTSSACKRRSNNSFFSRLNWLTHLKFKYPALNVTMPNNEKFDKLYIWGVHHPGTDNDQISLYAQASGRITVSTKRSQQTVIPSIGSRPRIRDVPSRISIYWTIVKPGDILLINSTGNLIAPRGYFKIRSGKSSIMRSDAPIGKCNSECITPNGSIPNDKPFQNVNRITYGACPRYVKQNTLKLATGMRNVPEKQTRGIFGAIAGFIENGWEGMVDGWYGFRHQNSEGTGQAADLKSTQAAINQINGKLNRLIGKTNEKFHQIEKEFSEVEGRIQDLEKYVEDTKIDLWSYNAELLVALENQHTIDLTDSEMNKLFERTKKQLRENAEDMGNGCFKIYHKCDNACIGSIRNGTYDHDVYRDEALNNRFQIKGVELKSGYKDWILWISFAISCFLLCVALLGFIMWACQKGNIRCNICI>NC_007366.1_2 [1259 - 888] (REVERSE SENSE) Influenza A virus (A/New 
York/392/2004(H3N2)) segment 4, complete sequenceMMEFLVCFPDQPIQLPIDLVDCCLSAFEICCLSCSLRILMPETVPTVYHSLPTIFYETRDCAKYASSLFLWYISHPCCQFQSVLLNISGTGPICDPVYILKWFVIGNASIWSDAFRIAFANGCI>NC_002018.1_1 [21 - 1382] Influenza A virus (A/Puerto Rico/8/34(H1N1)) segment 6, complete sequenceMNPNQKIITIGSICLVVGLISLILQIGNIISIWISHSIQTGSQNHTGICNQNIITYKNSTWVKDTTSVILTGNSSLCPIRGWAIYSKDNSIRIGSKGDVFVIREPFISCSHLECRTFFLTQGALLNDRHSNGTVKDRSPYRALMSCPVGEAPSPYNSRFESVAWSASACHDGMGWLTIGISGPDNGAVAVLKYNGIITETIKSWRKKILRTQESECACVNGSCFTIMTDGPSDGLASYKIFKIEKGKVTKSIELNAPNSHYEECSCYPDTGKVMCVCRDNWHGSNRPWVSFDQNLDYQIGYICSGVFGDNPRPKDGTGSCGPVYVDGANGVKGFSYRYGNGVWIGRTKSHSSRHGFEMIWDPNGWTETDSKFSVRQDVVAMTDWSGYSGSFVQHPELTGLDCIRPCFWVELIRGRPKEKTIWTSASSISFCGVNSDTVDWSWPDGAELPFTIDK>NC_002023.1_1 [28 - 2304] Influenza A virus (A/Puerto Rico/8/34(H1N1)) segment 1, complete sequenceMERIKELRNLMSQSRTREILTKTTVDHMAIIKKYTSGRQEKNPALRMKWMMAMKYPITADKRITEMIPERNEQGQTLWSKMNDAGSDRVMVSPLAVTWWNRNGPMTNTVHYPKIYKTYFERVERLKHGTFGPVHFRNQVKIRRRVDINPGHADLSAKEAQDVIMEVVFPNEVGARILTSESQLTITKEKKEELQDCKISPLMVAYMLERELVRKTRFLPVAGGTSSVYIEVLHLTQGTCWEQMYTPGGEVKNDDVDQSLIIAARNIVRRAAVSADPLASLLEMCHSTQIGGIRMVDILKQNPTEEQAVGICKAAMGLRISSSFSFGGFTFKRTSGSSVKREEEVLTGNLQTLKIRVHEGYEEFTMVGRRATAILRKATRRLIQLIVSGRDEQSIAEAIIVAMVFSQEDCMIKAVRGDLNFVNRANQRLNPMHQLLRHFQKDAKVLFQNWGVEPIDNVMGMIGILPDMTPSIEMSMRGVRISKMGVDEYSSTERVVVSIDRFLRVRDQRGNVLLSPEEVSETQGTEKLTITYSSSMMWEINGPESVLVNTYQWIIRNWETVKIQWSQNPTMLYNKMEFEPFQSLVPKAIRGQYSGFVRTLFQQMRDVLGTFDTAQIIKLLPFAAAPPKQSRMQFSSFTVNVRGSGMRILVRGNSPVFNYNKATKRLTVLGKDAGTLTEDPDEGTAGVESAVLRGFLILGKEDRRYGPALSINELSNLAKGEKANVLIGQGDVVLVMKRKRDSSILTDSQTATKRIRMAIN>NC_002023.1_2 [2303 - 2001] (REVERSE SENSE) Influenza A virus (A/Puerto Rico/8/34(H1N1)) segment 1, complete sequenceMMAIRILLVAVWLSVSMLESRFRFITNTTSPCPISTLAFSPFARLLSSLMLNAGPYLLSSLPRMRNPLRTADSTPAVPSSGSSVKVPASFPRTVSLFVALL>NC_002017.1_1 [33 - 1730] Influenza A virus (A/Puerto Rico/8/34(H1N1)) segment 4, complete 
sequenceMKANLLVLLCALAAADADTICIGYHANNSTDTVDTVLEKNVTVTHSVNLLEDSHNGKLCRLKGIAPLQLGKCNIAGWLLGNPECDPLLPVRSWSYIVETPNSENGICYPGDFIDYEELREQLSSVSSFERFEIFPKESSWPNHNTTKGVTAACSHAGKSSFYRNLLWLTEKEGSYPKLKNSYVNKKGKEVLVLWGIHHPSNSKDQQNIYQNENAYVSVVTSNYNRRFTPEIAERPKVRDQAGRMNYYWTLLKPGDTIIFEANGNLIAPRYAFALSRGFGSGIITSNASMHECNTKCQTPLGAINSSLPFQNIHPVTIGECPKYVRSAKLRMVTGLRNIPSIQSRGLFGAIAGFIEGGWTGMIDGWYGYHHQNEQGSGYAADQKSTQNAINGITNKVNSVIEKMNIQFTAVGKEFNKLEKRMENLNKKVDDGFLDIWTYNAELLVLLENERTLDFHDSNVKNLYEKVKSQLKNNAKEIGNGCFEFYHKCDNECMESVRNGTYDYPKYSEESKLNREKVDGVKLESMGIYQILAIYSTVASSLVLLVSLGAISFWMCSNGSLQCRICI>NC_002017.1_2 [1680 - 1276] (REVERSE SENSE) Influenza A virus (A/Puerto Rico/8/34(H1N1)) segment 4, complete sequenceMPPGRPKAPVNWRQLSRSPESDRSPLIPISLHLPFPCSTLTLLNIWDNHKSHFLHFPCIHCHTCGRTQNIHFRFLWHYSLIGFLLSHTDSSHLSHGNPESFHFPVELTILHYMSKCPEIHHQLFYLNFPSFFLIC>NC_002019.1_1 [28 - 1539] Influenza A virus (A/Puerto Rico/8/34(H1N1)) segment 5, complete sequenceMSDIKIMASQGTKRSYEQMETDGERQNATEIRASVGKMIGGIGRFYIQMCTELKLSDYEGRLIQNSLTIERMVLSAFDERRNKYLEEHPSAGKDPKKTGGPIYRRVNGKWMRELILYDKEEIRRIWRQANNGDDATAGLTHMMIWHSNLNDATYQRTRALVRTGMDPRMCSLMQGSTLPRRSGAAGAAVKGVGTMVMELVRMIKRGINDRNFWRGENGRKTRIAYERMCNILKGKFQTAAQKAMMDQVRESRDPGNAEFEDLTFLARSALILRGSVAHKSCLPACVYGPAVASGYDFEREGYSLVGIDPFRLLQNSQVYSLIRPNENPAHKSQLVWMACHSAAFEDLRVLSFIKGTKVVPRGKLSTRGVQIASNENMETMESSTLELRSRYWAI

RTRSGGNTNQQRASAGQISIQPTFSVQRNLPFDRTTVMAAFTGNTEGRTSDMRTEIIRMMESARPEDVSFQGRGVFELSDEKAASPIVPSFDMSNEGSYFFGDNAEEYDN>NC_002019.1_2 [549 - 229] (REVERSE SENSE) Influenza A virus (A/Puerto Rico/8/34(H1N1)) segment 5, complete sequenceMHQRAHPGIHSGANKSPCPLISCIIQIGMPDHHVSQTSRCIVTIISLAPDSPYFFFVIKDEFSHPLSVYSSVYRSSSFLRILPRTGMFFQVFISPFVKSREHHSLYC>NC_002022.1_1 [25 - 2172] Influenza A virus (A/Puerto Rico/8/34(H1N1)) segment 3, complete sequenceMEDFVRQCFNPMIVELAEKTMKEYGEDLKIETNKFAAICTHLEVCFMYSDFHFINEQGESIIVELGDPNALLKHRFEIIEGRDRTMAWTVVNSICNTTGAEKPKFLPDLYDYKENRFIEIGVTRREVHIYYLEKANKIKSEKTHIHIFSFTGEEMATKADYTLDEESRARIKTRLFTIRQEMASRGLWDSFRQSERGEETIEERFEITGTMRKLADQSLPPNFSSLENFRAYVDGFEPNGYIEGKLSQMSKEVNARIEPFLKTTPRPLRLPNGPPCSQRSKFLLMDALKLSIEDPSHEGEGIPLYDAIKCMRTFFGWKEPNVVKPHEKGINPNYLLSWKQVLAELQDIENEEKIPKTKNMKKTSQLKWALGENMAPEKVDFDDCKDVGDLKQYDSDEPELRSLASWIQNEFNKACELTDSSWIELDEIGEDVAPIEHIASMRRNYFTSEVSHCRATEYIMKGVYINTALLNASCAAMDDFQLIPMISKCRTKEGRRKTNLYGFIIKGRSHLRNDTDVVNFVSMEFSLTDPRLEPHKWEKYCVLEIGDMLLRSAIGQVSRPMFLYVRTNGTSKIKMKWGMEMRRCLLQSLQQIESMIEAESSVKEKDMTKEFFENKSETWPIGESPKGVEESSIGKVCRTLLAKSVFNSLYASPQLEGFSAESRKLLLIVQALRDNLEPGTFDLGGLYEAIEECLINDPWVLLNASWFNSFLTHALS>NC_002022.1_2 [1706 - 1380] (REVERSE SENSE) Influenza A virus (A/Puerto Rico/8/34(H1N1)) segment 3, complete sequenceMGLETWPMALLRSISPISRTQYFSHLCGSSLGSVRENSMLTKFTTSVSFLKWDLPFMMKPYKLVFRLPSLVLHLLIIGINWKSSIAAQDALSKAVLMYTPFIMYSVALQ>NC_002022.1_3 [858 - 550] (REVERSE SENSE) Influenza A virus (A/Puerto Rico/8/34(H1N1)) segment 3, complete sequenceMRTGRPIRKSKWSWCCFQKRFNSSIYFFGHLRQLALNVAVRFESIHIGSKIFKAGEVRRETLVGKLAHCSCDFKPFFNCLFSSLGLTKGIPEASAGHFLSYGE>NC_002016.1_1 [26 - 781] Influenza A virus (A/Puerto Rico/8/34(H1N1)) segment 7, complete sequenceMSLLTEVETYVLSIIPSGPLKAEIAQRLEDVFAGKNTDLEVLMEWLKTRPILSPLTKGILGFVFTLTVPSERGLQRRRFVQNALNGNGDPNNMDKAVKLYRKLKREITFHGAKEISLSYSAGALASCMGLIYNRMGAVTTEVAFGLVCATCEQIADSQHRSHRQMVTTTNPLIRHENRMVLASTTAKAMEQMAGSSEQAAEAMEVASQARQMVQAMRTIGTHPSSSAGLKNDLLENLQAYQKRMGVQMQRFK>NC_002016.1_2 [378 - 1] (REVERSE SENSE) Influenza A 
virus (A/Puerto Rico/8/34(H1N1)) segment 7, complete sequenceMSEISLAPWNVISLLSFLYSLTALSMLFGSPFPLRAFWTKRLRCSPRSLGTVSVNTNPKIPLVRGDRIGLVFSHSMRTSRSVFFPAKTSSSLCAISALRGPDGMIERTYVSTSVRRLIFQYLPAFA>NC_002020.1_1 [27 - 716] Influenza A virus (A/Puerto Rico/8/34(H1N1)) segment 8, complete sequenceMDPNTVSSFQVDCFLWHVRKRVADQELGDAPFLDRLRRDQKSLRGRGSTLGLDIETATRAGKQIVERILKEESDEALKMTMASVPASRYLTDMTLEEMSREWSMLIPKQKVAGPLCIRMDQAIMDKNIILKANFSVIFDRLETLILLRAFTEEGAIVGEISPLPSLPGHTAEDVKNAVGVLIGGLEWNDNTVRVSETLQRFAWRSSNENGRPPLTPKQKREMAGTIRSEV>NC_002020.1_2 [493 - 861] Influenza A virus (A/Puerto Rico/8/34(H1N1)) segment 8, complete sequenceMLAKFHHCLLFQDILLRMSKMQLESSSEDLNGMITQFESLKLYRDSLGEAVMRMGDLHSLQNRNEKWREQLGQKFEEIRWLIEEVRHKLKVTENSFEQITFMQALHLLLEVEQEIRTFSFQLI>NC_002020.1_3 [793 - 293] (REVERSE SENSE) Influenza A virus (A/Puerto Rico/8/34(H1N1)) segment 8, complete sequenceMLFAQNYSLLPSVCVSLLQSTILFLQTSDLIVPAISRFCFGVSGGLPFSLLLLQANLCRVSETRTVLSFHSSPPMRTPTAFLTSSAVCPGREGNGEISPTIAPSSVKALSNIRVSSRSKITLKFAFSMMFLSMIAWSILIQRGPATFCLGMSMDHSLDISSRVMSVR>NC_002021.1_1 [25 - 2295] Influenza A virus (A/Puerto Rico/8/34(H1N1)) segment 2, complete sequenceMDVNPTLLFLKVPAQNAISTTFPYTGDPPYSHGTGTGYTMDTVNRTHQYSEKARWTTNTETGAPQLNPIDGPLPEDNEPSGYAQTDCVLEAMAFLEESHPGIFENSCIETMEVVQQTRVDKLTQGRQTYDWTLNRNQPAATALANTIEVFRSNGLTANESGRLIDFLKDVMESMKKEEMGITTHFQRKRRVRDNMTKKMITQRTIGKRKQRLNKRSYLIRALTLNTMTKDAERGKLKRRAIATPGMQIRGFVYFVETLARSICEKLEQSGLPVGGNEKKAKLANVVRKMMTNSQDTELSLTITGDNTKWNENQNPRMFLAMITYMTRNQPEWFRNVLSIAPIMFSNKMARLGKGYMFESKSMKLRTQIPAEMLASIDLKYFNDSTRKKIEKIRPLLIEGTASLSPGMMMGMFNMLSTVLGVSILNLGQKRYTKTTYWWDGLQSSDDFALIVNAPNHEGIQAGVDRFYRTCKLHGINMSKKKSYINRTGTFEFTSFFYRYGFVANFSMELPSFGVSGSNESADMSIGVTVIKNNMINNDLGPATAQMALQLFIKDYRYTYRCHRGDTQIQTRRSFEIKKLWEQTRSKAGLLVSDGGPNLYNIRNLHIPEVCLKWELMDEDYQGRLCNPLNPFVSHKEIESMNNAVMMPAHGPAKNMEYDAVATTHSWIPKRNRSILNTSQRGVLEDEQMYQRCCNLFEKFFPSSSYRRPVGISSMVEAMVSRARIDARIDFESGRIKKEEFTEIMKICSTIEELRRQK
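The ORF headers in the listing above carry the accession, the ORF number, the nucleotide coordinates and, where applicable, a "(REVERSE SENSE)" flag. The following is a minimal Python sketch, not part of the original analysis, showing how such a header could be read and the ORF length derived from its coordinates. The function name parse_orf_header and the dictionary keys are hypothetical, and the pattern assumes every header follows the layout seen in this listing.

```python
import re

# Header layout assumed from the listing above, for example:
# >NC_002016.1_2 [378 - 1] (REVERSE SENSE) Influenza A virus (...) segment 7, complete sequence
ORF_HEADER = re.compile(
    r"^>(?P<accession>\S+?)_(?P<orf>\d+)\s+"
    r"\[(?P<start>\d+)\s*-\s*(?P<end>\d+)\]"
    r"(?P<reverse>\s+\(REVERSE SENSE\))?"
)


def parse_orf_header(header):
    """Return accession, ORF number, coordinates, strand and length of one ORF header."""
    match = ORF_HEADER.match(header)
    if match is None:
        raise ValueError("unrecognised ORF header: " + header[:40])
    start = int(match["start"])
    end = int(match["end"])
    # Reverse-sense ORFs are listed with start > end, so use the absolute span.
    length_nt = abs(end - start) + 1
    return {
        "accession": match["accession"],
        "orf": int(match["orf"]),
        "start": start,
        "end": end,
        "strand": "-" if match["reverse"] else "+",
        "length_nt": length_nt,
        "codons": length_nt // 3,  # includes the stop codon if the span covers it
    }


if __name__ == "__main__":
    example = (">NC_002016.1_2 [378 - 1] (REVERSE SENSE) Influenza A virus "
               "(A/Puerto Rico/8/34(H1N1)) segment 7, complete sequence")
    print(parse_orf_header(example))
```

Whether the reported span includes the stop codon depends on the ORF tool's settings, so the codon count should be checked against the listed peptide before it is relied on.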

GENSCANW output for sequence

Predicted genes/exons:

Gn.Ex Type S .Begin ...End .Len Fr Ph I/Ac Do/T CodRg P.... Tscr..
----- ---- - ------ ------ ---- -- -- ---- ---- ----- ----- ------

 1.01 Sngl +     27    719  693  2  0   73   49   328 0.996  23.61
 1.02 PlyA +    869    874    6                               -0.45

 2.00 Prom +    928    967   40                             -13.61
 2.01 Sngl +    986   2482 1497  1  0   92   55   810 0.998  73.55
 2.02 PlyA +   3149   3154    6                               -3.44

 3.00 Prom +   3955   3994   40                              -6.66
 3.01 Sngl +   4020   5516 1497  2  0   83   44   805 0.935  71.05
 3.02 PlyA +   6328   6333    6                               -3.24

 4.00 Prom +   6396   6435   40                             -12.11
 4.01 Sngl +   6488   8767 2280  1  0   51   35   750 0.423  59.21
 4.02 PlyA +   8781   8786    6                               -5.12

 5.00 Prom +   8819   8858   40                             -13.78
 5.01 Sngl +   8880   9638  759  2  0   65   46   323 0.920  21.54
 5.02 PlyA +   9860   9865    6                               -0.45

 6.00 Prom +   9899   9938   40                             -13.78
 6.01 Sngl +   9959  12109 2151  1  0   59   33   826 0.988  68.37
 6.02 PlyA +  12146  12151    6                               -0.45

 7.00 Prom +  12292  12331   40                             -12.59
 7.01 Sngl +  12362  14518 2157  1  0   89   40   940 0.565  83.44
 7.02 PlyA +  14541  14546    6                               -0.45

 8.00 Prom +  14579  14618   40                             -13.78
 8.01 Sngl +  14658  16346 1689  2  0   80   38   557 0.990  45.02
 8.02 PlyA +  16363  16368    6                               -1.75

 9.00 Prom +  16428  16467   40                             -11.43
 9.01 Sngl +  16470  18749 2280  2  0   41   35   702 0.651  53.41
 9.02 PlyA +  18763  18768    6                               -5.12

10.00 Prom +  18818  18857   40                             -15.29
10.01 Sngl +  18863  21136 2274  1  0    8   39   973 0.267  77.66
10.02 PlyA +  21159  21164    6                               -0.45

11.00 Prom +  21188  21227   40                             -15.72
11.01 Sngl +  21259  23409 2151  0  0   59   33   992 0.971  84.97
11.02 PlyA +  23446  23451    6                               -0.45

12.00 Prom +  23507  23546   40                             -13.96
12.01 Sngl +  23549  24241  693  1  0   73   49   329 0.977  23.71
12.02 PlyA +  24391  24396    6                               -0.45

13.00 Prom +  24451  24490   40                             -11.72
13.01 Sngl +  24513  26009 1497  2  0   84   49   641 0.963  55.25
13.02 PlyA +  26012  26017    6                               -0.45

14.00 Prom +  26042  26081   40                             -15.72
14.01 Init +  26108  27308 1201  1  1   74   75   443 0.399  34.87
14.02 Intr +  27688  27999  312  2  0   55   15   237 0.042   9.26
14.03 Intr +  28828  30279 1452  2  0   15   36   589 0.010  36.05
14.04 Term +  31521  32227  707  1  2   64   42   332 0.123  19.88
14.05 PlyA +  32449  32454    6                               -0.45

15.00 Prom +  32479  32518   40                             -13.24
15.01 Init +  32554  34197 1644  0  0   93   25   583 0.346  44.59
15.02 Intr +  34783  35570  788  0  2   33   33   272 0.364   6.47
15.03 Term +  35886  37416 1531  0  1  -31   49   737 0.285  47.86
15.04 PlyA +  37420  37425    6                               -5.51

16.00 Prom +  37448  37487   40                             -13.24
16.01 Sngl +  37526  39676 2151  1  0   72   33  1201 0.998 107.17
16.02 PlyA +  39713  39718    6                               -0.45

17.00 Prom +  39868  39907   40                              -7.96
17.01 Sngl +  39938  42094 2157  1  0   89   40  1247 0.932 114.14
17.02 PlyA +  42117  42122    6                               -0.45

18.00 Prom +  42146  42185   40                             -13.24
18.01 Sngl +  42227  44506 2280  1  0   51   36   825 0.480  66.81
18.02 PlyA +  44520  44525    6                               -5.12

19.00 Prom +  44543  44582   40                             -15.08
19.01 Sngl +  44617  46767 2151  0  0   72   33  1153 0.997 102.37
19.02 PlyA +  46804  46809    6                               -0.45

20.00 Prom +  46924  46963   40                              -9.06
20.01 Sngl +  47018  49177 2160  1  0   89   38  1193 0.949 108.53
20.02 PlyA +  49197  49202    6                               -0.45

21.00 Prom +  49247  49286   40                             -12.96
21.01 Sngl +  49439  51568 2130  1  0   77   36   779 0.704  65.84
21.02 PlyA +  52825  52830    6                                1.05

22.00 Prom +  53053  53092   40                              -9.36
22.01 Init +  53164  54702 1539  0  0   72   -6   469 0.165  28.30
22.02 Intr +  55106  55616  511  1  1    3   25   291 0.042   6.84
22.03 Term +  55711  55916  206  2  2   66   42   181 0.752   8.83
22.04 PlyA +  55939  55944    6                               -5.99

23.00 Prom +  55971  56010   40                             -12.78
23.01 Sngl +  56013  58292 2280  2  0   51   36   890 0.257  73.31
23.02 PlyA +  58306  58311    6                               -5.12

24.00 Prom +  58331  58370   40                             -13.78
24.01 Sngl +  58409  60559 2151  1  0   72   33   900 0.979  77.07
24.02 PlyA +  60596  60601    6                               -0.45

25.00 Prom +  60720  60759   40                             -11.14
25.01 Sngl +  60817  62973 2157  0  0   85   40   799 0.946  68.94
25.02 PlyA +  62996  63001    6                               -0.45

26.00 Prom +  63021  63060   40                             -13.78
26.01 Sngl +  63101  63793  693  1  0   73   49   386 0.985  29.41
26.02 PlyA +  63940  63945    6                                1.05

27.00 Prom +  64006  64045   40                             -11.72
27.01 Sngl +  64068  65564 1497  2  0   84   43   719 0.960  62.45
27.02 PlyA +  65682  65687    6                               -3.24

28.00 Prom +  65827  65866   40                              -2.16
28.01 Init +  67149  68696 1548  2  0   83   71   698 0.591  59.74

28.02 Term +  69149  69736  588  1  0    3   42   289 0.529  10.32
28.03 PlyA +  69958  69963    6                               -0.45
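
The exon table above can be read programmatically. The following Python sketch, added here for illustration only, splits one table row into named fields, checks that End - Begin + 1 equals the reported length, and lists the genes whose coding exon has a probability at or above a cut-off. The function name parse_genscan_row, the dictionary keys and the 0.9 cut-off are assumptions made for this example. Rows are assumed to be whitespace-separated, with 13 fields for coding exons and 7 for promoter and poly-A rows, as in this table.

```python
def parse_genscan_row(line):
    """Parse one row of the GENSCAN exon table into a dictionary."""
    fields = line.split()
    gene, exon = fields[0].split(".")
    row = {
        "gene": int(gene),
        "exon": int(exon),
        "type": fields[1],           # Prom, Init, Intr, Term, Sngl or PlyA
        "strand": fields[2],
        "begin": int(fields[3]),
        "end": int(fields[4]),
        "length": int(fields[5]),
        "score": float(fields[-1]),  # Tscr column
    }
    # Coding rows carry 13 fields; promoter and poly-A rows carry only 7.
    if len(fields) == 13:
        row["frame"] = int(fields[6])
        row["phase"] = int(fields[7])
        row["probability"] = float(fields[11])
    return row


# Three rows copied from the table above.
rows = [parse_genscan_row(text) for text in (
    "1.01 Sngl + 27 719 693 2 0 73 49 328 0.996 23.61",
    "1.02 PlyA + 869 874 6 -0.45",
    "4.01 Sngl + 6488 8767 2280 1 0 51 35 750 0.423 59.21",
)]

# Consistency check: the reported length should match the coordinates.
for row in rows:
    assert abs(row["end"] - row["begin"]) + 1 == row["length"]

# Hypothetical cut-off of 0.9 on the exon probability column (P).
confident_genes = [r["gene"] for r in rows if r.get("probability", 0.0) >= 0.9]
print(confident_genes)
```

With these three rows only gene 1 passes the cut-off, since exon 4.01 has a probability of 0.423.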

Predicted peptide sequence(s):
Predicted coding sequence(s):

>gi|GENSCAN_predicted_peptide_1|230_aaMDSNTVSSFQVDCFLWHVRKRFADQELGDAPFLDRLRRDQKSLRGRGSTLGLDIRTATREGKHIVERILEEESDEALKMTIASVPASRYLTEMTLEEMSRDWLMLIPKQKVTGPLCIRMDQAVMGKTIILKANFSVIFNRLEALILLRAFTDEGAIVGEISPLPSLPGHTDEDVKNAIGVLIGGLEWNDNTVRVSETLQRFTWRSSDENGRSPLPPKQKRKVERTIEPEV>gi|GENSCAN_predicted_CDS_1|693_bpatggattccaacactgtgtcaagctttcaggtagactgctttctttggcatgtccgcaaacgatttgcagaccaagaactgggtgatgccccattccttgaccggcttcgccgagatcagaagtccctaagaggaagaggcagcactcttggtctggacatcagaactgccactcgtgaaggaaagcatatagtggagcggattctggaggaagaatctgacgaggcacttaaaatgactatcgcttcagtgcctgcttcacgctacctaactgaaatgactcttgaggaaatgtcaagggattggttaatgctcattcccaagcagaaagtgacagggcccctttgcattagaatggaccaggcagtaatgggtaaaaccatcatattgaaagcaaactttagtgtgatttttaatcgacttgaagctctgatactacttagagcgtttacagatgaaggagcaatagtgggcgaaatctcaccattaccttcccttccaggacatactgacgaggatgtcaaaaatgcaattggggtcctcatcggaggacttgaatggaatgataacacagttcgagtctctgaaactctacagagattcacttggagaagcagtgatgagaatgggagatctccactccctccaaaacagaaacggaaagtggagagaacaattgagccagaagtttga>gi|GENSCAN_predicted_peptide_2|498_aaMASQGTKRSYEQMETGGERQNATEIRASVGRMVGGIGRFYVQMCTELKLSDQEGRLIQNSITIERMVLSAFDERRNRYLEEHPSAGKDPKKTGGPIYRRRDGKWVRELILYDKEEIRRIWRQANNGEDATAGLTHMMIWHSNLNDATYQRTRALVRTGMDPRMCSLMQGSTLPRRSGAAGAAIKGVGTMVMELIRMIKRGINDRNFWRGDNGRRTRIAYERMCNILKGKFQTAAQRAMMDQVRESRNPGNAEIEDLIFLARSALILRGSVAHKSCLPACVYGLAVASGYDFEREGYSLVGIDPFRLLQNSQVFSLIRPNENPAHKSQLVWMACHSAAFEDLRVSSFIRGTRVIPRGQLSTRGVQIASNENVEAMDSSTLELRSRYWAIRTRSGGNTNQQRASAGQISVQPTFSVQRNLPFERPTIMAAFKGNTEGRTSDMRTEIIRMMESARPEDVSFQGRGVFELSDEKATNPIVPSFDMSNEGSYFFGDNAEEYDN>gi|GENSCAN_predicted_CDS_2|1497_bpatggcgtcgcaaggcaccaaacgatcctatgaacagatggaaactggtggagaacgccagaatgccactgagatcagggcatctgttggaagaatggttggtggaattgggaggttttacgtacagatgtgcactgaactcaaactcagcgaccaagaaggaaggttgatccagaacagtataacaatagagagaatggttctctccgcatttgatgaaaggaggaacaggtacctagaggaacatcccagtgcggggaaggacccgaagaagaccggaggtccaatctaccgaaggagagacgggaaatgggtgagagagctgattctgtatgacaaagaggagataaggagaatttggcgtcaagcgaacaatggagaagacgcaactgctggtctcactcatatgatgatctggcattccaacctaaatga
tgccacataccagagaacaagagccctcgtgcggactggaatggaccccagaatgtgctctctgatgcaaggatcaaccctcccgaggagatctggagctgctggtgcagcaataaagggagtcgggacaatggtaatggaactaattcggatgataaagcgaggcattaatgaccggaacttctggagaggcgataatggacgaagaacaaggattgcatatgagagaatgtgcaacatcctcaaagggaaatttcaaacagcagcacaaagagcaatgatggatcaggtgcgagaaagcagaaatcctgggaatgctgaaattgaagatctcatctttctggcacggtctgcactcatcctgagaggatccgtagcccataagtcctgcttgcctgcttgtgtgtacgggctcgctgtggccagtggatatgattttgagagggaagggtactctctggttgggatagatcctttccgtctgcttcagaacagtcaggtcttcagtcttattagaccaaatgagaatccagcacataaaagtcaattggtatggatggcatgccattctgcagcatttgaggacctgagagtctcaagtttcattagaggaacaagagtgatcccaagaggacaactatccactagaggagttcagattgcttcaaatgagaacgtggaagcaatggattccagcactcttgaactgagaagcagatattgggctataaggaccaggagtggaggaaacaccaatcaacagagagcatctgcaggacaaatcagtgtacagcccactttctcagtacagagaaatcttcccttcgaaagaccgaccattatggctgcgtttaaggggaataccgagggcagaacatctgacatgaggactgaaatcataaggatgatggaaagtgccagaccagaagatgtgtctttccaggggcggggagtcttcgagctctcggacgaaaaggcaacgaacccgatcgtgccttcctttgacatgagtaatgaaggatcttatttcttcggagacaatgcagaggaatatgacaattga>gi|GENSCAN_predicted_peptide_3|498_aaMASQGTKRSYEQMETDGERQNATEIRASVGKMIDGIGRFYIQMCTELKLSDYEGRLIQNSLTIERMVLSAFDERRNKYLEEHPSAGKDPKKTGGPIYKRVDGKWMRELVLYDKEEIRRIWRQANNGDDATAGLTHMMIWHSNLNDTTYQRTRALVRTGMDPRMCSLMQGSTLPRRSGAAGAAVKGVGTMVMELIRMIKRGINDRNFWRGENGRKTRSAYERMCNILKGKFQTAAQRAMMDQVRESRNPGNAEIEDLIFLARSALILRGSVAHKSCLPACVYGPAIASGYNFEKEGYSLVGIDPFKLLQNSQVYSLIRPNENPAHKSQLVWMACNSAAFEDLRVLSFIRGTKVSPRGKLSTRGVQIASNENMDTMESSTLELRSRYWAIRTRSGGNTNQQRASAGQISVQPAFSVQRNLPFDKPTIMAAFTGNTEGRTSDMRAEIIRMMEGAKPEEMSFQGRGVFELSDEKATNPIVPSFDMSNEGSYFFGDNAEEYDN>gi|GENSCAN_predicted_CDS_3|1497_bpatggcgtcccaaggcaccaaacggtcttatgaacagatggaaactgatggggaacgccagaatgcaactgagatcagagcatccgtcgggaagatgattgatggaattggacgattctacatccaaatgtgcaccgaacttaaactcagtgattatgaggggcgactgatccagaacagcttaacaatagagagaatggtgctctctgcttttgacgagagaaggaataaatatctggaagaacatcccagcgcggggaaggatcctaagaaaactggaggacccatatacaagagagtagatggaaagtggatgagggaactcgtcctttatgacaaagaagaaataaggcgaatctggcgccaa
gccaataatggtgatgatgcaacagctgggctgactcacatgatgatctggcattccaatttgaatgatacaacataccagaggacaagagctcttgttcgcaccggaatggatcccaggatgtgctctttgatgcagggttcgactctccctaggaggtctggagctgcaggcgctgcagtcaaaggagttgggacaatggtgatggagttgatcaggatgatcaaacgtgggatcaatgatcggaacttctggagaggtgagaatggacggaaaacaaggagtgcttacgagagaatgtgcaacattctcaaaggaaaatttcaaacagctgcacaaagagcaatgatggatcaagtgagagaaagccggaacccaggaaatgctgagatcgaagatctaatctttctggcacggtctgcactcatattgagagggtcagttgctcacaaatcttgtctgcccgcctgtgtgtatggacctgccatagccagtgggtacaacttcgaaaaagagggatactctctagtgggaatagaccctttcaaactgcttcaaaacagccaagtatacagcctaatcagaccgaacgagaatccagcacacaagagtcagctggtgtggatggcatgcaattctgctgcatttgaagatctaagagtattaagcttcatcagagggaccaaagtatccccaagggggaaactttccactagaggagtacaaattgcttcaaatgaaaacatggatactatggaatcaagtactcttgaactaagaagcaggtactgggccataaggaccagaagtggaggaaacactaatcaacagagggcctctgcaggtcaaatcagtgtacaacctgcattttctgtgcaaagaaacctcccatttgacaaaccaaccatcatggcagcattcactgggaatacagagggaagaacatcagacatgagggcagaaatcataaggatgatggaaggtgcaaaaccagaagaaatgtccttccaggggcggggagtcttcgagctctcggacgaaaaggcaacgaacccgatcgtgccctcttttgacatgagtaatgaaggatcttatttcttcggagacaatgcagaggagtacgacaattaa>gi|GENSCAN_predicted_peptide_4|759_aaMERIKELRNLMSQSRTREILTKTTVDHMAIIKKYTSGRQEKNPSLRMKWMMAMKYPITADKRITEMVPERNEQGQTLWSKMSDAGSDRVMVSPLAVTWWNRNGPMTSTVHYPKIYKTYFEKVERLKHGTFGPVHFRNQVKIRRRVDINPGHADLSAKEAQDVIMEVVFPNEVGARILTSESQLTITKEKKEELQDCKISPLMVAYMLERELVRKTRFLPVAGGTSSVYIEVLHLTQGTCWEQMYTPGGEVRNDDVDQSLIIAARNIVRRAAVSADPLASLLEMCHSTQIGGTRMVDILRQNPTEEQAVDICKAAMGLRISSSFSFGGFTFKRTSGSSIKREEEVLTGNLQTLKIRVHEGYEEFTMVGKRATAILRKATRRLVQLIVSGRDEQSIAEAIIVAMVFSQEDCMIKAVRGDLNFVNRANQRLNPMHQLLRHFQKDAKVLFQNWGIEHIDNVMGMIGVLPDMTPSTEMSMRGIRVSKMGVDEYSSTERVVVSIDRFLRVRDQRGNVLLSPEEVSETQGTEKLTITYSSSMMWEINGPESVLVNTYQWIIRNWETVKIQWSQNPTM

LYNKMEFEPFQSLVPKAIRGQYSGFVRTLFQQMRDVLGTFDTTQIIKLLPFAAAPPKQSRMQFSSLTVNVRGSGMRILVRGNSPVFNYNKTTKRLTILGKDAGTLTEDPDEGTSGVESAVLRGFLILGKEDRRYGPALSINELSTLAKGEKANVLIGQGDVVLVMKRKRDSSILTDSQTATKRIRMAIN>gi|GENSCAN_predicted_CDS_4|2280_bpatggaaagaataaaagaactacggaatctgatgtcgcagtctcgcactcgcgagatactaacaaaaaccacagtggaccatatggccataattaagaagtacacatcagggagacaggaaaagaacccgtcacttaggatgaaatggatgatggcaatgaaatatccaattacagctgacaagaggataacagaaatggttcctgagagaaatgagcaaggacaaactctatggagtaaaatgagtgatgccgggtcagatcgagtaatggtatcacctttggcagtgacatggtggaatagaaatggaccaatgacaagtacggttcattatccaaaaatctacaagacttattttgagaaagtcgaaaggttaaaacatggaacctttggccctgtccattttagaaaccaagtcaaaatacgccgaagagttgacataaaccctggtcatgcagacctcagtgccaaggaggcacaagacgtaatcatggaagttgttttccccaatgaagtgggggccaggatactaacgtcggaatcacaattaacaataaccaaagagaaaaaagaagaactccaagattgcaaaatttctcctttgatggttgcatacatgttagagagagaacttgtccgaaaaacgagatttctcccagttgctggtggaacaagcagtgtgtacattgaagtgttacacttgactcaaggaacatgttgggaacagatgtacaccccaggtggagaagtgaggaatgatgatgttgatcaaagtctaattattgcagccaggaacatagtgagaagagcagcagtatcagcagatccactagcatctttattggagatgtgccacagcacacagattggcgggacaaggatggtggacattcttaggcagaacccaacggaagaacaagctgtggatatatgcaaggctgcaatgggactgagaatcagctcatccttcagttttggcgggttcacatttaagagaacaagcgggtcatcaatcaagagagaggaagaagtgcttacgggcaatctccaaacattgaaaataagggtgcatgaggggtacgaggaattcacaatggtggggaaaagggcaacagctatactcagaaaagcaaccaggagattggttcagctgatagtgagtggaagagacgaacagtcaatagccgaagcaataattgtagccatggtgttttcacaagaagattgcatgataaaagcagttagaggtgacctgaatttcgttaatagggcaaatcagcgattgaatcccatgcatcaacttttaagacattttcagaaagatgcaaaagtgctctttcaaaattggggaattgaacatatcgacaatgtaatgggaatgattggagtattaccagacatgactccaagcacagagatgtcaatgagagggataagagtcagcaaaatgggcgtggatgaatactccagcacagagagggtagtggtaagcattgaccggtttttgagagttcgagaccaacgaggaaatgtactactatctcctgaggaggtcagtgaaacacaggggacagagaaactgacaataacttactcatcgtcaatgatgtgggagattaatggccctgagtcagtgttggtcaatacctatcagtggatcatcagaaactgggaaactgttaaaattcaatggtctcagaatcctacaatgctatacaataaaatggaatttgagccatttcagtctttagttcctaaggccattagaggccaatac
agtggatttgttaggactctattccaacaaatgagggatgtacttgggacatttgataccacccagataataaagcttcttccctttgcagccgccccaccaaagcaaagtagaatgcagttctcttcattgactgtgaatgtgaggggatcaggaatgagaatacttgtaaggggcaattctcctgtattcaactacaacaagaccactaagagactaacaattctcggaaaggatgctggcactttaactgaagacccagatgaaggcacatccggagtggagtccgctgttctgagaggattcctcattctgggcaaggaagatagaagatatggaccagcattaagcatcaatgaactgagtacccttgcaaaaggagaaaaggctaatgtactaattgggcaaggagacgtggtgttggtaatgaaacgaaaacgggactctagcatacttactgacagccagacagcgaccaaaagaattcggatggccatcaattaa>gi|GENSCAN_predicted_peptide_5|252_aaMSLLTEVETYVLSIVPSGPLKAEIAQRLEDVFAGKNTDLEALMEWLKTRPILSPLTKGILGFVFTLTVPSERGLQRRRFVQNALNGNGDPNNMDRAVKLYRKLKREITFHGAKEVALSYSAGALASCMGLIYNRMGAVTTEVAFAVVCATCEQIADSQHRSHRQMVTTTNPLIRHENRMVLASTTAKAMEQMAGSSEQAAEAMEVASQARQMVQAMRAIGTPPSSSAGLKDDLLENLQAYQKRMGVQMQRFK>gi|GENSCAN_predicted_CDS_5|759_bpatgagccttctaaccgaggtcgaaacgtacgttctctctatcgtcccgtcaggccccctcaaagccgagatcgcacagagacttgaagatgtctttgctgggaagaacacagatcttgaggctctcatggaatggctaaagacaagaccaatcctgtcacctctgactaaggggattttgggatttgtattcacgctcaccgtgccaagtgagcgaggactgcagcgtagacgctttgtccaaaatgccctcaatgggaatggggatccaaataacatggacagagcagttaaactgtatagaaagcttaagagggagataacattccatggggccaaagaagtagcgctcagttattctgctggtgcacttgccagttgcatgggcctcatatacaacaggatgggggctgtgaccactgaagtggcctttgccgtggtatgtgcaacctgtgaacagattgctgactcccagcataggtctcacaggcaaatggtgacaacaaccaatccactaataagacatgagaacagaatggttctggccagcactacagctaaggctatggagcaaatggctggatcgagtgagcaagcagcagaggccatggaggttgctagtcaggccaggcaaatggtgcaggcaatgagagccattgggactcctcctagctccagtgctggtctaaaagatgatcttcttgaaaatttgcaggcctatcagaaacgaatgggggtgcagatgcaacgattcaagtga>gi|GENSCAN_predicted_peptide_6|716_aaMEDFVRQCFNPMIVELAEKAMKEYGEDLKIETNKFAAICTHLEVCFMYSDFHFINEQGESIMVELDDPNALLKHRFEIIEGRDRTMAWTVVNSICNTTGAEKPKFLPDLYDYKENRFIEIGVTRREVHIYYLEKANKIKSENTHIHIFSFTGEEMATKADYTLDEESRARIKTRLFTIRQEMANRGLWDSFRQSERGEETIEERFEITGTMRRLADQSLPPNFSCLENFRAYVDGFEPNGYIEGKLSQMSKEVNAKIEPFLKTTPRPIRLPDGPPCFQRSKFLLMDALKLSIEDPSHEGEGIPLYDAIKCMRTFFGWKEPYIVKPHEKGINPNYLLSWKQVLAELQDIENEEKIPRTKNMKKTSQLKWALGENMA
PEKVDFDNCRDISDLKQYDSDEPELRSLSSWIQNEFNKACELTDSIWIELDEIGEDVAPIEHIASMRRNYFTAEVSHCRATEYIMKGVYINTALLNASCAAMDDFQLIPMISKCRTKEGRRKTNLYGFIIKGRSHLRNDTDVVNFVSMEFSLTDPRLEPHKWEKYCVLEIGDMLLRSAIGQMSRPMFLYVRTNGTSKIKMKWGMEMRPCLLQSLQQIESMVEAESSVKEKDMTKEFFENKSETWPIGESPKGVEEGSIGKVCRTLLAKSVFNSLYASPQLEGFSAESRKLLLVVQALRDNLEPGTFDLGGLYEAIEECLINDPWVLLNASWFNSFLTHALR>gi|GENSCAN_predicted_CDS_6|2151_bpatggaagattttgtgcgacaatgcttcaatccgatgattgtcgaacttgcggaaaaggcaatgaaagagtatggagaagatctgaaaatcgaaacaaacaaatttgcagcaatatgcactcacttggaagtatgcttcatgtattcagattttcatttcatcaatgagcaaggcgagtcaataatggtagagcttgatgatccaaatgcacttttgaagcacagatttgaaataatagagggaagagatcgcacaatggcctggacagtagtaaacagtatttgcaacaccacaggagctgagaaaccgaagtttctgccagatttgtatgattacaaggagaatagattcatcgagattggagtgacaaggagagaagtccacatatactatcttgaaaaggccaataaaattaaatctgagaatacacacatccacattttctcattcactggggaagaaatggccacaaaggccgactacactctcgatgaggaaagcagggctaggatcaaaaccagactattcaccataagacaagaaatggccaacagaggcctctgggattcctttcgtcagtccgaaagaggcgaagaaacaattgaagaaagatttgaaatcacagggacaatgcgcaggcttgccgaccaaagtctcccgccgaacttctcctgccttgagaattttagagcctatgtggatggattcgaaccgaacggctacattgagggcaagctttctcaaatgtccaaagaagtaaatgcaaaaattgaaccttttctgaaaacaacaccaagaccaattagacttccggatgggcctccttgttttcagcggtccaaattcctgctgatggatgctttaaaattaagcattgaggacccaagtcacgaaggggagggaataccactatatgatgcgatcaagtgcatgagaacattctttggatggaaagaaccctatattgttaaaccacacgaaaagggaataaatccaaattatctgctgtcatggaagcaagtactggcggaactgcaggacattgagaatgaggagaagattccaagaactaaaaacatgaagaaaacgagtcagctaaagtgggcacttggtgagaacatggcaccagagaaggtagactttgacaactgtagagacataagcgatttgaagcaatatgatagtgacgaacctgaattaaggtcactttcaagctggatccagaatgagttcaacaaggcatgcgagctgaccgattcaatctggatagagctcgatgagattggagaagacgtggctccaattgaacacattgcaagcatgagaaggaattacttcacagcagaggtgtcccattgcagagccacagaatatataatgaagggggtatacattaatactgccttgcttaatgcatcctgtgcagcaatggacgatttccaactaattcccatgataagcaagtgtagaactaaagagggaaggcgaaagaccaatttatatggtttcatcataaaaggaagatctcacttaaggaatgacaccgacgtggtaaactttgtgagcatggagttttctctcactgacccgagacttgagccacacaaatgggagaagtactgtg
tccttgagataggagatatgctactaagaagtgccataggccagatgtcaaggcctatgttcttgtatgtgaggacaaatggaacatcaaagattaaaatgaaatggggaatggagatgaggccttgcctccttcagtcactacaacaaatcgagagtatggttgaagccgagtcctctgtcaaagagaaagacatgaccaaagagttttttgagaataaatcagaaacatggcccattggggagtcccccaaaggagtggaagaaggttccattgggaaggtctgcaggactttattagccaagtcggtattcaatagcctgtatgcatccccacaattagaaggattttcagctgaatcaagaaaactgcttcttgtcgttcaggctcttagggacaatcttgaacctggaacctttgatcttggggggctatatgaagcaattgaggagtgcctgattaatgatccctgggttttgcttaatgcgtcttggttcaactccttcctaacacatgcattaagatag>gi|GENSCAN_predicted_peptide_7|718_aaMDTVNRTHQYSEKGKWTTNTETGAPQLNPIDGPLPEDNEPSGYAQTDCVLEAMAFLEESHPGIFENSCLETMEVIQQTRVDKLTQGRQTYDWTLNRNQPAATALANTIEVFRSNGLTANESGRLIDFLKDVIESMDKEEMEITTHFQRKRRVRDNMTKKMVTQRTIGKKKQRLNKRSYLIRALTLNTMTK

DAERGKLKRRAIATPGMQIRGFVHFVETLARNICEKLEQSGLPVGGNEKKAKLANVVRKMMTNSQDTELSFTITGDNTKWNENQNPRVFLAMITYITRNQPEWFRNVLSIAPIMFSNKMARLGKGYMFESKSMKLRTQIPAEMLASIDLKYFNESTRKKIEKIRPLLIDGTVSLSPGMMMGMFNMLSTVLGVSILNLGQKKYTKXTYWWDGLQSSDDFALIVNAPNHEGIQAGVNRFYRTCKLVGINMSKKKSYINRTGTFEFTSFFYRYGFVANFSMELPSFGVSGINESADMSIGVTVIKNNMINNDLGPATAQMALQLFIKDYRYTYRCHRGDTQIQTRRSFELKKLWEQTRSKAGLLVSDGGSNLYNIRNLHIPEVCLKWELMDEDYQGRLCNPLNPFVSHKEIESVNNAVVMPAHGPAKSMEYDAVATTHSWTPKRNRSILNTSQRGILEDEQMYQKCCNLFEKFFPSSSYRRPVGISSMVEAMVSRARIDARIDFESGRIKKEEFAEIMKICSTIEELRRQK>gi|GENSCAN_predicted_CDS_7|2157_bpatggacacagtcaacagaacacatcaatattcagaaaaggggaagtggacaacaaacacggaaactggagcgccccaacttaacccaattgatggaccactacctgaggacaatgaaccaagtggatatgcacaaacagactgcgtcctggaagcaatggctttccttgaggaatcacacccaggaatctttgaaaattcgtgtcttgaaacgatggaagttattcaacaaacaagagtggacaaactgacccaaggtcgtcagacctatgactggacattgaacagaaatcagccggctgcaactgcgctagccaacactatagaggtcttcagatcgaatggactgacagctaatgagtcgggaaggctaatagatttcctcaaggatgtgatagaatcaatggataaagaggagatggaaataacaacacacttccaaagaaaaagaagagtaagagacaacatgaccaagaaaatggtcacacaacgaacaataggaaagaagaagcaaagattgaacaagagaagctatctgataagagcactgacattgaacacaatgactaaagatgcagagagaggtaaattaaaaagaagagcaattgcaacacccggtatgcagatcagagggttcgtgcactttgtcgaaacactagcgagaaatatttgtgagaaacttgaacagtctgggcttccggttggaggtaatgaaaagaaggctaaactagcaaatgttgttagaaaaatgatgactaattcacaagacacagagctctctttcacaattactggagacaacaccaaatggaatgagaatcaaaatcctcgagtgtttctggcgatgataacatacatcacaagaaatcaacctgaatggtttagaaacgtcctgagcattgcacccataatgttctcaaataaaatggctagactagggaaaggttacatgttcgaaagcaagagcatgaagctccgaacacaaataccagcagaaatgctagcaagtattgacctgaaatactttaatgaatcaaccagaaagaaaattgagaaaataaggcctctcctaatagatggcacagtctcattgagtcctggaatgatgatgggcatgttcaacatgctaagtacagtcttaggagtctcaatcctgaatctcgggcaaaagaaatacaccaaaacnacatactggtgggacggactccaatcctctgatgacttcgctctcatagtgaatgcaccaaatcatgagggaatacaagcaggggtgaatagattctacagaacctgcaagctagtcggaatcaatatgagcaaaaagaagtcctacataaataggacagggacatttgaattcacaagctttttctatcgctatggatttgtagccaattttagcatggagctgcccagctttggagtgtctggaattaat
gaatcggctgatatgagcattggggtaacagtgataaagaacaatatgataaataatgaccttgggccagcaacagcccaaatggctcttcaactattcatcaaagactacagatacacgtaccggtgccacagaggggacacacaaattcagacaaggagatcattcgagctaaagaagctgtgggagcaaacccgctcaaaggcaggacttttggtgtcggatggaggatcaaacttatacaatatccggaatctccacattccagaagtctgcttgaaatgggagctaatggatgaagactatcaggggaggctttgtaatcccctgaatccatttgtcagtcataaggaaattgagtctgtaaacaatgctgtggtaatgccagctcacggtccagccaagagcatggaatatgatgctgttgctactacacactcctggacccctaagaggaaccgctccattctcaacacaagccaaaggggaattcttgaagatgaacagatgtatcagaagtgttgcaatctatttgagaaattcttccctagcagttcgtacaggagaccagttggaatttccagcatggtggaggccatggtgtctagggctcggattgatgcacggattgacttcgagtctggacggattaagaaagaggagttcgctgagatcatgaagatctgttccaccattgaagagctcagacggcaaaaatag>gi|GENSCAN_predicted_peptide_8|562_aaMAIIYLILLFTAVRGDQICIGYHANNSTEKVDTILERNVTVTHAKDILEKTHNGKLCKLNGIPPLELGDCSIAGWLLGNPECDRLLSVPEWSYIMEKENPRYSLCYPGSFNDYEELKHLLSSVKHFEKVKILPKDRWTQHTTTGGSWACAVSGKPSFFRNMVWLTRKGSNYPVAKGSYNNTSGEQMLIIWGVHHPNDEAEQRALYQNVGTYVSVATSTLYKRSIPEIAARPKVNGLGRRMEFSWTLLDMWDTINFESTGNLVAPEYGFKISKRGSSGIMKTEGTLENCETKCQTPLGAINTTLPFHNVHPLTIGECPKYVKSEKLVLATGLRNVPQIESRGLFGAIAGFIEGGWQGMVDGWYGYHHSNDQGSGYAADKESTQKAFNGITNKVNSVIEKMNTQFEAVGKEFSNLEKRLENLNKKMEDGFLDVWTYNAELLVLMENERTLDFHDSNVKNLYDKVRMQLRDNVKELGNGCFEFYHKCDNECMDSVKNGTYDYPKYEEESKLNRNEIKGVKLSSMGVYQILAIYATVAGSLSLAIMMAGISFWMCSNGSLQCRICI

>gi|GENSCAN_predicted_CDS_8|1689_bpatggccatcatttatctcatactcctgttcacagcagtgaggggggaccagatatgcattggataccatgccaataattccacagaaaaggtcgacacaattctagagcggaatgtcactgtgactcatgccaaggacatccttgagaagacccataacggaaagctatgcaaactaaacggaatccctccacttgaactaggggactgtagcattgccggatggctccttggaaatccagaatgtgataggcttctaagtgtgccagaatggtcctatataatggagaaagaaaacccgagatacagtttgtgttacccaggcagcttcaatgactatgaagaattgaaacatctcctcagcagcgtgaaacattttgagaaagttaagattttgcccaaagatagatggacacagcatacaacaactggaggttcatgggcctgcgcggtgtcaggtaaaccatcattcttcaggaacatggtctggctgacacgtaaaggatcaaattatccggttgccaaaggatcgtacaacaatacaagcggagaacaaatgctaataatttggggagtgcaccatcctaatgatgaggcagaacaaagagcattgtaccagaatgtgggaacctatgtttccgtagccacatcaacattgtacaaaaggtcaatcccagaaatagcagcaaggcctaaagtgaatggactaggacgtagaatggaattctcttggaccctcttggatatgtgggacaccataaattttgagagcactggtaatctagttgcaccagagtatgggttcaaaatatcgaaaagaggtagttcagggatcatgaagacagaaggaacacttgagaactgtgaaaccaaatgccaaactcctttgggagcaataaatacaacactaccttttcacaatgtccacccactgacaataggtgaatgccccaaatatgtaaaatcggagaaattggtcttagcaacaggactaaggaatgttccccagattgaatcaagaggattgtttggggcaatagctggttttatagaaggaggatggcaaggaatggttgatggttggtatggataccatcacagcaatgaccagggatcagggtatgcagcagacaaagaatccactcaaaaggcatttaatggaatcaccaacaaggtaaattctgtgattgaaaagatgaacacccaatttgaagctgttgggaaagaattcagtaacttagagaaaagactggagaacttgaacaaaaagatggaagacgggtttctagatgtgtggacatacaatgcagagcttctagttctgatggaaaatgagaggacacttgactttcatgattctaatgtcaagaatctgtatgataaagtcagaatgcagctgagagacaacgtcaaagaactaggaaatggatgttttgaattttatcacaaatgtgacaatgaatgcatggatagtgtgaaaaacgggacatatgattatcccaagtatgaagaagaatctaaactaaatagaaatgaaatcaaaggggtaaaattgagcagcatgggggtttatcaaatccttgccatttatgctacagtagcaggttctctgtcactggcaatcatgatggctgggatctctttctggatgtgctccaacgggtctctgcagtgcagaatctgcatatga>gi|GENSCAN_predicted_peptide_9|759_aaMERIKELRNLMSQSRTREILTKTTVDHMAIIKKYTSGRQEKNPSLRMKWMMAMKYPITADKRITEMVPERNEQGQTLWSKMSDAGSDRVMVSPLAVTWWNRNGPVASTVHYPKVYKTYFDKVERLKHGTFGPVHFRNQVKIRRRVDINPGHADLSAKEAQDVIMEVVFPNEVGARILTSESQLTITKEKKEELRDCKISPLMVAYMLERELVRKTRFLPVAGGTSSIYIEVLHLTQGT
CWEQMYTPGGEVRNDDVDQSLIIAARNIVRRAAVSADPLASLLEMCHSTQIGGTRMVDILRQNPTEEQAVDICKAAMGLRISSSFSFGGFTFKRTSGSSVKKEEEVLTGNLQTLKIRVHEGYEEFTMVGKRATAILRKATRRLVQLIVSGRDEQSIAEAIIVAMVFSQEDCMIKAVRGDLNFVNRANQRLNPMHQLLRHFQKDAKVLFQNWGIEHIDSVMGMVGVLPDMTPSTEMSMRGIRVSKMGVDEYSSTERVVVSIDRFLRVRDQRGNVLLSPEEVSETQGTERLTITYSSSMMWEINGPESVLVNTYQWIIRNWEAVKIQWSQNPAMLYNKMEFEPFQSLVPKAIRSQYSGFVRTLFQQMRDVLGTFDTTQIIKLLPFAAAPPKQSRMQFSSLTVNVRGSGMRILVRGNSPVFNYNKTTKRLTILGKDAGTLIEDPDESTSGVESAVLRGFLIIGKEDRRYGPALSINELSNLAKGEKANVLIGQGDVVLVMKRKRDSSILTDSQTATKRIRMAIN>gi|GENSCAN_predicted_CDS_9|2280_bpatggaaagaataaaagaactacggaacctgatgtcgcagtctcgcactcgcgagatactgacaaaaaccacagtggaccatatggccataattaagaagtacacatcggggagacaggaaaagaacccgtcacttaggatgaaatggatgatggcaatgaaatacccaatcactgctgacaaaaggataacagaaatggttccggagagaaatgaacaaggacaaactctatggagtaaaatgagtgatgctggatcagatcgagtgatggtatcacctttggctgtaacatggtggaatagaaatggacccgtggcaagtacggtccattacccaaaagtatacaagacttattttgacaaagtcgaaaggttaaaacatggaacctttggccctgttcattttagaaatcaagtcaagatacgcagaagagtagacataaaccctggtcatgcagacctcagtgccaaagaggcacaagatgtaattatggaagttgtttttcccaatgaagtgggagccaggatactaacatcagaatcgcaattaacaataactaaagagaaaaaagaagaactccgagattgcaaaatttctcccttgatggttgcatacatgttagagagagaacttgtccgaaaaacaagatttctcccagttgctggcggaacaagcagtatatacattgaagtcttacatttgactcaaggaacgtgttgggaacaaatgtacactccaggtggagaagtgaggaatgacgatgttgaccaaagcctaattattgcggccaggaacatagtaagaagagctgcagtatcagcagatccactagcatctttattggagatgtgccac

agcacacaaattggcgggacaaggatggtggacattcttagacagaacccgactgaagaacaagctgtggatatatgcaaggctgcaatgggattgagaatcagctcatccttcagctttggtgggtttacatttaaaagaacaagcgggtcatcagtcaaaaaagaggaagaagtgcttacaggcaatctccaaacattgaagataagagtacatgaggggtatgaggagttcacaatggtggggaaaagagcaacagctatactcagaaaagcaaccagaagattggttcagctcatagtgagtggaagagacgaacagtcaatagccgaagcaataatcgtggccatggtgttttcacaagaggattgcatgataaaagcagttagaggtgacctgaatttcgtcaacagagcaaatcaacggttgaaccccatgcatcagcttttaaggcattttcagaaagatgcgaaagtgctttttcaaaattggggaattgaacacatcgacagtgtgatgggaatggttggagtattaccagatatgactccaagcacagagatgtcaatgagaggaataagagtcagcaaaatgggtgtggatgaatactccagtacagagagggtggtggttagcattgatcggtttttgagagttcgagaccaacgcgggaatgtattattgtctcctgaggaggtcagtgaaacacagggaactgaaagattgacaataacatattcatcgtcgatgatgtgggagattaacggtcctgagtcggttttggtcaatacctatcaatggatcatcagaaattgggaagctgtcaaaattcaatggtctcagaatcctgcaatgttgtacaacaaaatggaatttgaaccatttcaatctttagtccccaaggccattagaagccaatacagtgggtttgtcagaactctattccaacaaatgagagacgtacttgggacatttgacaccacccagataataaagcttctcccttttgcagccgctccaccaaagcaaagcagaatgcagttctcttcactgactgtaaatgtgaggggatcagggatgagaatacttgtaaggggcaattctcctgtattcaactacaacaagaccactaaaagactaacaattctcggaaaagatgccggcactttaattgaagacccagatgaaagcacatccggagtggagtccgccgtcttgagagggtttctcattataggtaaggaagacagaagatacggaccagcattaagcatcaatgaactgagtaaccttgcaaaaggggaaaaggctaatgtgctaatcgggcaaggagacgtggtgttggtaatgaaacgaaaacgggactctagcatacttactgacagccagacagcgaccaaaagaattcggatggccatcaattaa>gi|GENSCAN_predicted_peptide_10|757_aaMDVNPTLLFLKVPAQNAISTTFPYTGDPPYSHGTGTGYTMDTVNRTHQYSEKGKWTTNTETGAPQLNPIDGPLPEDNEPSGYAQTDCVLEAMAFLEESHPGIFENSCLETMEVVQQTRVDKLTQGRQTYDWTLNRNQPAATALANTIEVFRSNGLTANESGRLIDFLKDVMESMDKEEMEITTHFQRKRRVRDNMTKKMVTQRTIGKKKQRVNKRGYLIRALTLNTMTKDAERGKLKRRAIATPGMQIRGFVYFVETLARSICEKLEQSGLPVGGNEKKAKLANVVRKMMTNSQDTELSFTITGDNTKWNENQNPRMFLAMITYITKNQPEWFRNILSIAPIMFSNKMARLGKGYMFESKRMKLRTQIPAEMLASIDLKYFNESTRKKIEKIRPLLIDGTASLSPGMMMGMFNMLSTVLGVSVLNLGQKKYTKTTYWWDGLQSSDDFALIVNAPNHEGIQAGVDRFYRTCKLVGINMSKKKSYINKTGTFEFTSFFYRYGFVANFSMELPSFGVSGINESADMSIGVTVIKNNMIN
NDLGPATAQMALQLFIKDYRYTYRCHRGDTQIQTRRSFELKKLWDQTQSRAGLLVSDGGPNLYNIRNLHIPEVCLKWELMDENYRGRLCNPLNPFVSHKEIESVNNAVVMPAHGPAKSMEYDAVATTHSWNPKRNRSILNTSQRGILEDEQMYQKCCNLFEKFFPSSSYRRPIGISSMVEAMVSRARIDARIDFESGRIKKEEFSEIMKICSTIEELRRQK>gi|GENSCAN_predicted_CDS_10|2274_bpatggatgtcaatccgactctactgttcctaaaggttccagcgcaaaatgccataagcaccacattcccttatactggagatcctccatacagccatggaacaggaacaggatacaccatggacacagtcaacagaacacaccaatattcagagaaggggaagtggacgacaaatacagaaactggggcaccccaactcaacccaattgatggaccactacctgaggataatgagccaagtggatatgcacaaacagactgtgtcctggaggctatggccttccttgaagaatcccacccaggtatctttgagaactcatgccttgaaacaatggaagtcgttcaacaaacaagggtggacaaactaacccaaggccgccagacttatgattggacattaaacagaaatcaaccggcagcaactgcattagccaacaccatagaagtttttagatcgaatggactaacagccaatgaatcaggaaggctaatagatttcctcaaggatgtgatggaatcaatggataaagaggaaatggagataacaacacactttcaaagaaaaaggagagtaagagacaacatgaccaagaaaatggtcacacaaagaacaatagggaagaaaaaacaaagagtgaataagagaggctatctaataagagctttgacattgaacacgatgaccaaagatgcagagagaggtaaattaaaaagaagggctattgcaacacccgggatgcaaattagagggttcgtgtacttcgttgaaactttagctagaagcatttgcgaaaagcttgaacagtctggacttccggttgggggtaatgaaaagaaggccaaactggcaaatgttgtgagaaaaatgatgactaattcacaagacactgagctttctttcacaatcactggggacaacactaagtggaatgaaaatcaaaaccctcgaatgtttttggcgatgattacatatatcacaaaaaatcaacctgagtggttcagaaacatcctgagcatcgcaccaataatgttctcaaacaaaatggcaagactaggaaaaggatacatgttcgagagtaagagaatgaagctccgaacacaaatacccgcagaaatgctagcaagcattgacctgaagtatttcaatgaatcaacaaggaagaaaattgagaaaataaggcctcttctaatagatggcacagcatcattgagccctgggatgatgatgggcatgttcaacatgctaagtacggttttaggagtctcggtactgaatcttgggcaaaagaaatacaccaagacaacatactggtgggatgggctccaatcctccgacgattttgccctcatagtgaatgcaccaaatcatgagggaatacaagcaggagtggatagattctacaggacctgcaagttagtgggaatcaacatgagcaaaaagaagtcctatataaataaaacagggacatttgaattcacaagctttttttatcgatatggatttgtggctaattttagcatggagcttcccagttttggagtgtctggaataaacgagtcagctgatatgagtattggagtaacagtgataaagaacaacatgataaacaatgaccttgggccagcaacagcccagatggctctccaattgttcatcaaagactacagatatacatataggtgccatagaggagacacacaaattcagacgagaagatcattcgagctaaagaagctgtgggat
caaacccaatcaagggcaggactattggtatcagatgggggaccaaacttatacaatatccggaaccttcacatccctgaagtctgcttaaagtgggagctaatggatgagaattatcggggaagactttgtaaccccctgaatccctttgtcagccataaagaaattgagtctgtaaacaatgctgtagtgatgccagcccacggtccagccaaaagtatggaatatgatgccgttgcaactacacactcctggaatcccaagaggaaccgctctattctaaacactagccaaaggggaattcttgaggatgaacagatgtaccaaaagtgctgcaacttgttcgagaaatttttccctagtagttcatataggagaccgattggaatttctagcatggtggaggccatggtgtctagggcccggattgatgccagaattgacttcgagtctggacggattaagaaggaagagttctctgagatcatgaagatctgttccaccattgaagaactcagacggcaaaaataa>gi|GENSCAN_predicted_peptide_11|716_aaMEDFVRQCFNPMIVELAEKAMKEYGEDLKIETNKFAAICTHLEVCFMYSDFHFINEQGESIVVELDDPNALLKHRFEIIEGRDRTMAWTVVNSICNTTGAEKPKFLPDLYDYKENRFIEIGVTRREVHIYYLEKANKIKSENTHIHIFSFTGEEIATKADYTLDEESRARIKTRLFTIRQEMANRGLWDSFRQSERGEETIEEKFEISGTMRRLADQSLPPKFSCLENFRAYVDGFEPNGCIEGKLSQMSKEVNAKIEPFLKTTPRPIKLPNGPPCYQRSKFLLMDALKLSIEDPSHEGEGIPLYDAIKCIKTFFGWKEPYIVKPHEKGINSNYLLSWKQVLSELQDIENEEKIPRTKNMKKTSQLKWALGENMAPEKVDFDNCRDISDLKQYDSDEPELRSLSSWIQNEFNKACELTDSIWIELDEIGEDVAPIEYIASMRRNYFTAEVSHCRATEYIMKGVYINTALLNASCAAMDDFQLIPMISKCRTKEGRRKTNLYGFIIKGRSHLRNDTDVVNFVSMEFSLTDPRLEPHKWEKYCVLEIGDMLLRSAIGQISRPMFLYVRTNGTSKVKMKWGMEMRRCLLQSLQQIESMIEAESSIKEKDMTKEFFENKSEAWPIGESPKGVEEGSIGKVCRTLLAKSVFNSLYASPQLEGFSAESRKLLLVVQALRDNLEPGTFDLGGLYEAIEECLINDPWVLLNASWFNSFLTHALK>gi|GENSCAN_predicted_CDS_11|2151_bpatggaagattttgtgcgacaatgcttcaacccgatgattgtcgaacttgcagaaaaagcaatgaaagagtatggagaggatctgaaaattgaaacaaacaaatttgcagcaatatgcacccacttggaggtatgtttcatgtattcagattttcatttcatcaatgaacaaggcgaatcaatagtggtagaacttgatgatccaaatgcactgttaaagcacagatttgaaataatcgaggggagagacagaacaatggcctggacagtagtaaacagtatctgcaacactactggagcagaaaaaccaaagtttctaccagatttgtatgattacaaggagaatagattcatcgaaattggagtgacaagaagagaagtccacatatattaccttgaaaaggccaataaaattaaatctgagaacacacacattcacatcttctcattcactggggaggaaatagccacaaaggcagactacactctcgacgaggaaagcagggctaggattaaaaccaggctatttaccataagacaagaaatggccaacagaggcctctgggattcctttcgtcagtccgaaagaggcgaagaaacaattgaagaaaaatttgaaatctcaggaactatgcgtaggcttgccgaccaaagtctcccaccgaaattctcctgcctt
gagaattttagagcctatgtggatggattcgaaccgaacggctgcattgagggcaagctttctcaaatgtccaaagaagtgaatgccaaaattgaaccttttctgaagacaacaccaagaccaatcaaacttcctaatggacctccttgttatcagcggtccaaattcctcctgatggatgctttgaaattgagcattgaagacccaagtcatgaaggagaagggattccattatatgatgcgatcaagtgcataaaaacattctttggatggaaagaaccttatatagtcaaaccacacgaaaagggaataaattcaaattacctgctgtcatggaagcaagtattgtcagaattgcaggacattgaaaatgaggagaagatcccaaggactaaaaacatgaagaaaacgagtcaactaaagtgggctcttggtgaaaacatggcaccagagaaagtagactttgacaactgcagagacataagcgatttgaagcaatatgatagtgacgaacctgaattaaggtcactttcaagctggatacagaatgagttcaacaaggcctgcgagctaactgattcaatctggatagagctcgatgaaattggagaggacgtagccccaattgagtacattgcaagcatgaggaggaattatttcacagcagaggtgtcccattgtagagccactgagtacataatgaagggggtatacattaatactgccctgctcaatgcatcctgtgcagcaatggacgattttcaactaattcccatgataagcaagtgcagaactaaagagggaaggcgaaaaaccaatttatatggattcatcataaagggaagatctcatttaaggaatgacacagatgtggtaaactttgtgagcatggagttttctctcactgacccgagacttgagccacataaatgggaga

aatactgtgtccttgagataggagatatgttactaagaagtgccataggccaaatttcaaggcctatgttcttgtatgtgaggacaaacggaacatcaaaggtcaaaatgaaatggggaatggagatgagacgttgcctccttcagtcactccagcagatcgagagcatgattgaagccgagtcctcgattaaagagaaagacatgaccaaagagttttttgagaataaatcagaagcatggcccattggggagtcccccaagggagtggaagaaggttccattgggaaagtctgtaggactctattggctaagtcagtgttcaatagcctgtatgcatcaccacaattggaaggattttcagcggagtcaagaaaactgcttcttgttgttcaggctcttagggacaacctcgaacctgggacctttgatctcggggggctatatgaagcaattgaggagtgcctgattaatgatccctgggttttgctcaatgcatcttggttcaactccttcctgacacatgcattaaaatag>gi|GENSCAN_predicted_peptide_12|230_aaMDSNTVSSFQVDCFLWHIRKQVVDQELSDAPFLDRLRRDQRSLRGRGNTLGLDIKAATHVGKQIVEKILKEESDEALKMTMVSTPASRYITDMTIEELSRNWFMLMPKQKVEGPLCIRMDQAIMEKNIMLKANFSVIFDRLETIVLLRAFTEEGAIVGEISPLPSFPGHTIEDVKNAIGVLIGGLEWNDNTVRVSKNLQRFAWRSSNENGGPPLTPKQKRKMARTARSKV>gi|GENSCAN_predicted_CDS_12|693_bpatggattccaacactgtgtcaagtttccaggtagattgctttctttggcatatccggaaacaagttgtagaccaagaactgagtgatgccccattccttgatcggcttcgccgagatcagaggtccctaaggggaagaggcaatactctcggtctagacatcaaagcagccacccatgttggaaagcaaattgtagaaaagattctgaaagaagaatctgatgaggcacttaaaatgaccatggtctccacacctgcttcgcgatacataactgacatgactattgaggaattgtcaagaaactggttcatgctaatgcccaagcagaaagtggaaggacctctttgcatcagaatggaccaggcaatcatggagaaaaacatcatgttgaaagcgaatttcagtgtgatttttgaccgactagagaccatagtattactaagggctttcaccgaagagggagcaattgttggcgaaatctcaccattgccttcttttccaggacatactattgaggatgtcaaaaatgcaattggggtcctcatcggaggacttgaatggaatgataacacagttcgagtctctaaaaatctacagagattcgcttggagaagcagtaatgagaatgggggacctccacttactccaaaacagaaacggaaaatggcgagaacagctaggtcaaaagtttga>gi|GENSCAN_predicted_peptide_13|498_aaMASQGTKRSYEQMETDGDRQNATEIRASVGKMIDGIGRFYIQMCTELKLSDHEGRLIQNSLTIEKMVLSAFDERRNKYLEEHPSAGKDPKKTGGPIYRRVDGKWMRELVLYDKEEIRRIWRQANNGEDATAGLTHIMIWHSNLNDATYQRTRALVRTGMDPRMCSLMQGSTLPRRSGAAGAAVKGIGTMVMELIRMVKRGINDRNFWRGENGRKTRSAYERMCNILKGKFQTAAQRAMVDQVRESRNPGNAEIEDLIFLARSALILRGSVAHKSCLPACAYGPAVSSGYDFEKEGYSLVGIDPFKLLQNSQIYSLIRPNENPAHKSQLVWMACHSAAFEDLRLLSFIRGTKVSPRGKLSTRGVQIASNENMDNMGSSTLELRSGYWAIRTRSGGNTNQQRASAGQTSVQPTFSVQRNLPFEKSTIMAA
FTGNTEGRTSDMRAEIIRMMEGAKPEEVSFRGRGVFELSDEKATNPIVPSFDMSNEGSYFFGDNAEEYDN>gi|GENSCAN_predicted_CDS_13|1497_bpatggcgtcccaaggcaccaaacggtcttatgaacagatggaaactgatggggatcgccagaatgcaactgagattagggcatccgtcgggaagatgattgatggaattgggagattctacatccaaatgtgcactgaacttaaactcagtgatcatgaagggcggttgatccagaacagcttgacaatagagaaaatggtgctctctgcttttgatgaaagaaggaataaatacctggaagaacaccccagcgcggggaaagatcccaagaaaactggggggcccatatacaggagagtagatggaaaatggatgagggaactcgtcctttatgacaaagaagagataaggcgaatctggcgccaagccaacaatggtgaggatgcgacagctggtctaactcacataatgatctggcattccaatttgaatgatgcaacataccagaggacaagagctcttgttcgaactggaatggatcccagaatgtgctctctgatgcagggctcgactctccctagaaggtccggagctgcaggtgctgcagtcaaaggaatcgggacaatggtgatggaactgatcagaatggtcaaacgggggatcaacgatcgaaatttctggagaggtgagaatgggcggaaaacaagaagtgcttatgagagaatgtgcaacattcttaaaggaaaatttcaaacagctgcacaaagagcaatggtggatcaagtgagagaaagtcggaacccaggaaatgctgagatcgaagatctcatatttttggcaagatctgcattgatattgagagggtcagttgctcacaaatcttgcctacctgcctgtgcgtatggacctgcagtatccagtgggtacgacttcgaaaaagagggatattccttggtgggaatagaccctttcaaactacttcaaaatagccaaatatacagcctaatcagacctaacgagaatccagcacacaagagtcagctggtgtggatggcatgccattctgctgcatttgaagatttaagattgttaagcttcatcagagggacaaaagtatctccgcgggggaaactgtcaactagaggagtacaaattgcttcaaatgagaacatggataatatgggatcgagcactcttgaactgagaagcgggtactgggccataaggaccaggagtggaggaaacactaatcaacagagggcctccgcaggccaaaccagtgtgcaacctacgttttctgtacaaagaaacctcccatttgaaaagtcaaccatcatggcagcattcactggaaatacggagggaaggacttcagacatgagggcagaaatcataagaatgatggaaggtgcaaaaccagaagaagtgtcattccgggggaggggagttttcgagctctcagacgagaaggcaacgaacccgatcgtgccctcttttgatatgagtaatgaaggatcttatttcttcggagacaatgcagaagagtacgacaattaa>gi|GENSCAN_predicted_peptide_14|1223_aaMNPNQKIITIGSVSLTISTICFFMQIAILITTVTLHFKQYEFNSPPNNQVMLCEPTIIERNITEIVYLTNTTIEKEMCPKLAEYRNWSKPQCDITGFAPFSKDNSIRLSAGGDIWVTREPYVSCDPDKCYQFALGQGTTLNNVHSNDTVHDRTPYRTLLMNELGVPFHLGTKQVCIAWSSSSCHDGKAWLHVCVTGDDKNATASFIYNGRLVDSIVSWSKKILRTQESECVCINGTCTVVMTDGSASGKADTKILFIEEGKIIHTSTLSGSAQHVEECSCYPRYPGVRCVCRDNWKGSNRPIVDINIKDYSIVSSYVCSGLVGDTPRKNDSSSSSHCLDPNNEEGGHGVKGWAFDDG
NDVWMGRTISEKLRSGYETFKVIEGWSKPNSKLQINRQVIVDRGPLKAEIAQRLEDVFAGKNTDLEALMEWLKTRPILSPLTKGILGFVFTLTVPSERGLQRRRFVQNALNGNGDPNNMDKAVKLYRKLKREITFHGAKEIALSYSAVPNGTIVKTITNDQIEVTNATELVQSSSTGGICDSPHQILDGENCTLIDALLGDPQCDGFQNKKWDLFVERSKAYSNCYPYDVPDYASLRSLVASSGTLEFNNESFNWTGVTQNGTSSACKRRSNNSFFSRLNWLTHLKFKYPALNVTMPNNEKFDKLYIWGVHHPGTDNDQISLYAQASGRITVSTKRSQQTVIPSIGSRPRIRDVPSRISIYWTIVKPGDILLINSTGNLIAPRGYFKIRSGKSSIMRSDAPIGKCNSECITPNGSIPNDKPFQNVNRITYGACPRYVKQNTLKLATGMRNVPEKQTRGIFGAIAGFIENGWEGMVDGWYGFRHQNSEGTGQAADLKSTQAAINQINGKLNRLIGKTNEKFHQIEKEFSEVEGRIQDLEKYVEDTKIDLWSYNAELLVALENQHTIDLTDSEMNKLFERTKKQLRENAEDMGNGCFKIYHKCDNACIGSIRNGTYDHDVYRDEALNNRFQIKGPLKAEIAQRLEDVFAGKNTDLEALMEWLKTRPILSPLTKGILGFVFTLTVPSERGLQRRRFVQNALNGNGDPNNMDRAVKLYKKLKREITFHGAKEVALSYSTGALASCMGLIYNRMGTVTTEVAFGLVCATCEQIADSQHRSHRQMATTTNPLIRHENRMVLASTTAKAMEQMAGSSEQAAEAMEVASQARQMVQAMRTIGTHPSSSAGLKDNLLENLQAYQKRMGVQMQRFK>gi|GENSCAN_predicted_CDS_14|3672_bpatgaatccaaatcaaaagataataacgattggctctgtttctctcaccatttccacaatatgcttcttcatgcaaattgccatcctgataaccactgtaacattgcatttcaagcaatatgaattcaactcccccccaaacaaccaagtgatgctgtgtgaaccaacaataatagaaagaaacataacagagatagtgtatctgaccaacaccaccatagagaaggaaatgtgccccaaactagcagaatacagaaattggtcaaagccgcaatgtgacattacaggatttgcacctttttctaaggacaattcgattaggctttccgctggtggggacatctgggtgacaagagaaccttatgtgtcatgcgaccctgacaagtgttaccaatttgcccttggacagggaacaacactaaacaacgtgcattcaaatgacacagtacatgataggaccccttatcggaccctattgatgaatgaattaggtgttccatttcatctggggaccaagcaagtgtgcatagcatggtccagctcaagttgtcacgatggaaaagcatggctgcatgtttgtgtaacgggggatgataaaaatgcaactgctagcttcatttacaatgggaggcttgtagatagtattgtttcatggtccaaaaaaatcctcaggacccaggagtcagaatgcgtttgtatcaatggaacttgtacagtagtaatgactgatgggagtgcttcaggaaaagctgatactaaaatactattcattgaggaggggaaaatcattcatactagcacattgtcaggaagtgctcagcatgtcgaggagtgctcctgctatcctcgatatcctggtgtcagatgtgtctgcagagacaactggaaaggctccaataggcccatcgtagatataaacataaaggattatagcattgtttccagttatgtgtgctcagggcttgttggagacacacccagaaaaaacgacagctccagcagtagccattgcttggatcctaacaatgaagaaggtggtcatggagtgaaaggctgggcctttgatgatggaaatgacgtgtggatgggaagaacgatc
agcgagaagttacgctcaggatatgaaaccttcaaagtcattgaaggctggtccaaacctaattccaaattgcagataaataggcaagtcatagttgacagaggccccctcaaagccgagatcgcgcagagacttgaagatgtctttgctgggaaaaacacagatcttgaggctctcatggaatggctaaagacaagaccaattctgtcacctctgactaaggggattttggggtttgtgttcacgctcaccgtgcccagtgagcgaggactgcagcgtagacgctttgtccaaaatgccctcaatgggaatggagatccaaataacatggacaaagcagttaaactgtataggaaacttaagagggagataacgttccatggggccaaagaaatagctctcagttattctgctgtaccaaacggaacgatagtgaaaacaatcacgaatgaccaaattgaagtcactaatgctactgaactggttcagagttcctcaacaggtggaatatgcgacagtcctcatcagatccttgatggagaaaactgcacactaatagatgctctattgggagaccctcagtgtgatggcttccaaaataagaaatgggac

ctttttgttgaacgcagcaaagcctacagcaactgttacccttatgatgtgccggattatgcctcccttaggtcactagttgcctcatccggcacactggagtttaacaatgaaagcttcaattggactggagtcactcaaaatggaacaagctctgcttgcaaaaggagatctaataacagtttctttagtagattgaattggttgacccacttaaaattcaaatacccagcattgaacgtgactatgccaaacaatgaaaaatttgacaaactgtacatttggggggttcaccacccgggtacggacaatgaccaaatcagcctatatgctcaagcatcaggaagaatcacagtctctaccaaaagaagccaacaaaccgtaatcccgagtatcggatctagacccaggataagggatgtccccagcagaataagcatctattggacaatagtaaaaccgggagacatacttttgattaacagcacagggaatctaattgctcctcggggttacttcaaaatacgaagtgggaaaagctcaataatgagatcagatgcacccattggcaaatgcaattctgaatgcatcactccaaatggaagcattcccaatgacaaaccatttcaaaatgtaaacaggatcacatatggggcctgtcccagatatgttaagcaaaacactctgaaattggcaacagggatgcgaaatgtaccagagaaacaaactagaggcatatttggcgcaatcgcgggtttcatagaaaatggttgggagggaatggtagacggttggtacggtttcaggcatcaaaattctgagggaacaggacaagcagcagatctcaaaagcactcaagcagcaatcaaccaaatcaatgggaagctgaataggttgatcgggaaaacaaacgagaaattccatcagattgaaaaagaattctcagaagtagaagggagaattcaggacctcgagaaatatgttgaggacactaaaatagatctctggtcatacaacgcggagcttcttgtggccctggagaaccaacatacaattgatctaactgactcagaaatgaacaaactgtttgaaagaacaaagaagcaactgagggaaaatgctgaggatatgggcaatggttgtttcaaaatataccacaaatgtgacaatgcctgcatagggtcaatcagaaatggaacttatgaccatgatgtatacagagatgaagcattaaacaaccggttccagatcaaaggccccctcaaagccgagatcgcgcagagacttgaggatgtctttgcaggaaagaacaccgatctcgaggctctcatggaatggctaaagacaagaccaatcctgtcacctctgactaaagggattttaggatttgtgttcacgctcaccgtgcccagtgagcgaggactgcagcgtagacgctttgtccagaatgccttaaatggaaatggagatccaaacaatatggatagggcagttaagctatacaagaagctgaaaagagaaataacattccatggggctaaggaggtcgcactcagctactcaaccggtgcacttgccagttgtatgggtctcatatacaacaggatgggaacggtgaccacagaagtggcttttggcctagtgtgtgccacttgtgagcagattgcagattcacagcatcggtctcacagacagatggcaactaccaccaacccactaatcaggcatgagaacagaatggtgctggccagcactacagctaaggctatggagcagatggctggatcgagtgagcaggcagcggaagccatggaggttgctagtcaggctaggcagatggtgcaggcaatgaggacaattgggactcatcctagctccagtgccggtctgaaagataatcttcttgaaaatttgcaggcctaccaaaaacgaatgggagtgcaaatgcagcgattcaagtga>gi|GENSCAN_predicted_peptide_15|1320_
aaMEKIVLLLAIVSLVKSDQICIGYHANNSTEQVDTIMEKNVTVTHAQDILEKTHNGKLCDLNGVKPLILRDCSVAGWLLGNPMCDEFINVPEWSYIVEKASPANDLCYPGDFNDYEELKHLLSRTNHFEKIQIIPKSSWSNHDASSGVSSACPYHGRSSFFRNVVWLIKKNSAYPTIKRSYNNTNQEDLLVLWGIHHPNDAAEQTKLYQNPTTYISVGTSTLNQRLVPEIATRPKVNGQSGRMEFFWTILKPNDAINFESNGNFIAPEYAYKIVKKGDSAIMKSELEYGNCNTKCQTPMGAINSSMPFHNIHPLTIGECPKYVKSNRLVLATGLRNTPQRERRRKKRGLFGAIAGFIEGGWQGMVDGWYGYHHSNEQGSGYAADKESTQKAIDGVTNKVNSIIDKMNTQFEAVGREFNNLERRIENLNKQMEDGFLDVWTYNAELLVLMENERTLDFHDSNVKNLYDKVRLQLRDNAKELGNGCFEFYHKCDNECMESVKNGTYDYPQYSEEARLNREEISGVKLESMGTYQILSIYSTVASSLALAIMGALLNDKHSNGTVKDRSPHRTLMSCPVGEAPSPYNSRFESVAWSASACHDGTSWLTIGISGPDNGAVAVLKYNGIITDTIKSWRNNILRTQESECACVNGSCFTVMTDGPSNGQASYKIFKMEKGKVVKSVELNAPNYHYEECSCYPDAGEITCVCRDNWHGSNRPWVSFNQNLEYQIGYICSGVFGDNPRPNDGTGSCGPVSPNGAYGVKGFSFKYGNGVWIGRTKSTNSRSGFEMIWDPNGWTGTDSSFSVKQDIVAITDWVDNHSLSDINIMASQGTKRSYEQMETGGERQNATEIRASVGRMVGGIGRFYIQMCTELKLSDYEGRLIQNSITIERMVLSAFDERRNKYLEEHPSAGKDPKKTGGPIYRRRDGKWVRELILYDKEEIRRIWRQANNGEDATAGLTHMMIWHSNLNDATYQRTRALVRTGMDPRMCSLMQGSTLPRRSGAAGAAVKGVGTMVMELIRMIKRGINDRNFWRGENGRRTRIAYERMCNILKGKFQTAAQRAMMDQVRESRNPGNAEIEDLIFLARSALILRGSVAHKSCLPACVYGLAVASGYDFEREGYSLVGIDPFRLLQNSQVFSLIRPNENPAHKSQLVWMACHSAAFEDLRVSSFIRGTRVAPRGQLSTRGVQIASNENMETMDSSTLELRSRYWAIRTRSGGNTNQQRASAGQISVQPTFSVQRNLPFERATIMAAFTGNTEGRTSDMRTEIIRMMESSRPEDVSFQGRGVFELSDEKATNPIVPSFDMSNEGSYFFGDNAEEYDN>gi|GENSCAN_predicted_CDS_15|3963_bpatggagaaaatagtgcttcttcttgcaatagtcagtcttgtcaaaagtgatcagatttgcattggttaccatgcaaacaactcgacagagcaggttgacacaataatggaaaagaacgttactgttacacatgcccaagacatactggaaaagacacacaatgggaagctctgcgatctaaatggagtgaagcctctcattttgagagattgtagtgtagctggatggctcctcggaaaccctatgtgtgacgaattcatcaatgtgccggaatggtcttacatagtggagaaggccagtccagccaatgacctctgttacccaggggatttcaacgactatgaagaactgaaacacctattgagcagaacaaaccattttgagaaaattcagatcatccccaaaagttcttggtccaatcatgatgcctcatcaggggtgagctcagcatgtccataccatgggaggtcctcctttttcagaaatgtggtatggcttatcaaaaagaacagtgcatacccaacaataaagaggagctacaataataccaaccaagaagatcttttagtactgtgggggattcaccatcctaatgatgcggcagagcagacaaagctctatcaaaacccaaccacttacatt
tccgttggaacatcaacactgaaccagagattggttccagaaatagctactagacccaaagtaaacgggcaaagtggaagaatggagttcttctggacaattttaaagccgaatgatgccatcaatttcgagagtaatggaaatttcattgctccagaatatgcatacaaaattgtcaagaaaggggactcagcaattatgaaaagtgaattggaatatggtaactgcaacaccaagtgtcaaactccaatgggggcgataaactctagtatgccattccacaacatacaccccctcaccatcggggaatgccccaaatatgtgaaatcaaacagattagtccttgcgactggactcagaaatacccctcagagagagagaagaagaaaaaagagaggactatttggagctatagcaggttttatagagggaggatggcagggaatggtagatggttggtatgggtaccaccatagcaatgagcaggggagtggatacgctgcagacaaagaatccactcaaaaggcaatagatggagtcaccaataaggtcaactcgatcattgacaaaatgaacactcagtttgaggccgttggaagggaatttaataacttggaaaggaggatagagaatttaaacaagcagatggaagacggattcctagatgtctggacttataatgctgaacttctggttctcatggaaaatgagagaactctagactttcatgactcaaatgtcaagaacctttatgacaaggtccgactacagcttagggataatgcaaaggagctgggtaatggttgtttcgagttctatcacaaatgtgataatgaatgtatggaaagtgtaaaaaacggaacgtatgactacccgcagtattcagaagaagcaagactaaacagagaggaaataagtggagtaaaattggaatcaatgggaacttaccaaatactgtcaatttattcaacagtggcgagttccctagcactggcaatcatgggagccttgctgaatgacaagcactccaatgggaccgtcaaagacagaagccctcacagaacattgatgagttgtcctgtgggtgaggctccctccccatataactcaaggtttgagtctgttgcttggtcggcaagtgcttgccatgatggcaccagttggttgacaattggaatttctggcccagacaatggggctgtggctgtattgaaatacaacggcataataacagacactatcaagagttggaggaacaacatactgagaactcaagagtctgaatgtgcatgtgtaaatggctcttgctttactgtaatgactgacggaccaagtaatgggcaggcctcatataagatcttcaaaatggaaaaagggaaagtagttaaatcagtcgaattgaatgcccctaattatcactatgaggagtgctcctgttatcctgatgctggcgaaatcacatgtgtgtgcagggataattggcatggctcaaatcggccatgggtatctttcaatcaaaatttggagtatcaaataggatatatatgcagtggagttttcggagacaatccacgccccaatgatggaacaggcagttgtggtccggtgtcccctaacggggcatatggagtaaaagggttttcatttaaatacggcaatggtgtttggatcgggagaaccaaaagcactaattccaggagcggctttgaaatgatttgggatccaaatgggtggactggaacggacagtagcttctcggtgaaacaagatatcgtagcaataactgattgggtagataatcactcactgagtgacatcaacatcatggcgtctcagggcaccaaacgatcttatgaacagatggaaactggtggagaacgccagaatgctactgagatcagagcatctgttggaagaatggttggtggaattgggaggttttatatacagatgtgcactgaactcaaactcagcgactatgaaggaaggctgattcagaa
cagcataacaatagagagaatggttctctctgcatttgatgaaaggaggaacaaatacctggaagaacatcccagtgcggggaaggacccaaagaaaactggaggtccaatctaccgaagaagagacggaaaatgggtgagagagctgattctgtatgacaaagaggagatcaggagaatttggcgtcaagcgaacaatggagaagatgcaactgctggtctcactcacatgatgatctggcattccaatctaaatgatgccacataccagagaacaagagctctcgtgcgtactgggatggaccctagaatgtgctctctgatgcaaggatcaactctcccgaggagatctggagctgctggtgcggcagtaaagggagtcggaacgatggtgatggaactaattcggatgataaagcgagggattaacgatcggaatttctggagaggtgaaaatgggcgaagaacaagaattgcatatgagagaatgtgcaacatcctcaaagggaaattccaaacagcagcacaaagagcaatgatggatcaggtacgggaaagcagaaatcctgggaatgctgagattgaagatctcatatttctggcacggtctgcactcatcctgagaggatcagtggcccacaagtcctgcttgcctgcttgtgtgtacgggcttgccgtggccagtggatatgactttgagagagaagggtactctctggtcgggattgatcctttccgtctgctgcaaaacagccaggtctttagtctaattagaccaaatgagaatccagcacataaaagtcaattggtgtggatggcatgccattctgcagcatttgaagatctgagagtctcaagcttcatcagagggacaagagtggccccaaggggacaactatctactagaggagttcaaattgctt

caaatgagaacatggaaacaatggactccagcactcttgaactgagaagcagatattgggctataaggaccaggagtggaggaaacaccaaccagcagagagcatctgcaggacaaatcagtgtgcagcctactttctcggtacagagaaatcttcccttcgaaagagcgaccattatggcggcattcacagggaatacagagggcagaacatctgacatgaggactgaaatcataaggatgatggaaagctccagaccagaagatgtgtctttccaggggcggggagtcttcgagctctcggacgaaaaggcaacgaacccgatcgtgccttcctttgacatgagtaatgaaggatcttatttcttcggagacaatgcagaggaatatgacaattga>gi|GENSCAN_predicted_peptide_16|716_aaMEDFVRQCFNPMIVELAEKAMKEYGEDPKIETNKFAAICTHLEVCFMYSDFHFIDERGESTIIESGDPNALLKHRFEIIEGRDRTMAWTVVNSICNTTGVEKPKFLPDLYDYKENRFIEIGVTRREVHTYYLEKANKIKSEKTHIHIFSFTGEEMATKADYTLDEESRARIKTRLFTIRQEMASRGLWDSFRQSERGEETVEERFEITGTMCRLADQSLPPNFSSLEKFRAYVDGFEPNGCIEGKLSQMSKEVNARIEPFLKTTPRPLRLPDGPPCSQRSKFLLMDALKLSIEDPSHEGEGIPLYDAIKCMKTFFGWKEPNIVKPHEKGINPNYLLAWKQVLAELQDIENEEKIPKTKNMRKTSQLKWALGENMAPEKVDFEDCKDVSDLRQYDSDEPKPRSLASWIQSEFNKACELTDSSWIELDEIGEDVAPIEHIASMRRNYFTAEVSHCRATEYIMKGVYINTALLNASCAAMDDFQLIPMISKCRTKEGRRKTNLYGFLIKGRSHLRNDTDVVNFVSMEFSLTDPRLEPHRWEKYCVLRIGDMLLRTEIGQVSRPMFLYVRTNGTSKIKMKWGMEMRRCPFQSLQQIESMIEAESSVKEKDMTKEFFENKSETWPIGESPKGVEEGSIGKVCRTLLAKSVFNSLYASPQLEGFSAESRKLLLIVQALRDNLEPGTFDLGGLYEAIEECLINDPWVLLNASWFNSFLTHALR>gi|GENSCAN_predicted_CDS_16|2151_bpatggaagactttgtgcgacaatgcttcaatccaatgattgtcgagcttgcggaaaaggcaatgaaagaatatggggaagatccgaaaatcgaaacgaacaaatttgccgcaatatgcacgcacttagaagtctgtttcatgtattcagatttccactttattgatgaacggggcgaatcaacaattatagaatctggcgatcccaatgcattattgaaacaccggtttgaaataatcgaagggagggaccgaacaatggcctggacagtggtgaatagtatctgcaacaccacaggagttgagaagcctaaatttctcccagatttgtatgactacaaggagaaccgatttattgaaattggagtgacacggagggaagttcacacatactatctagaaaaagccaacaagataaaatctgagaagacacacattcacatattctcattcactggagaggaaatggccaccaaagcggactacacccttgatgaagaaagcagggcccgaatcaaaaccaggctgttcactataaggcaggaaatggccagtaggggtttatgggattcctttcgtcagtccgagagaggcgaagagacagttgaagaaagatttgaaatcacagggactatgtgcaggcttgccgaccaaagtctcccacctaatttctccagccttgaaaaatttagagcctatgtggatggattcgaaccgaacggctgcattgagggcaagctttctcaaatgtcgaaagaagtaaacgccagaattgagccatttctgaagacaacaccacgccctcttagattac
ctgatgggcctccctgctctcagcggtcgaagtttttgctgatggatgcccttaaattaagcatcgaagacccgagtcatgagggggaggggataccgctatatgatgcaatcaaatgcatgaaaacatttttcggctggaaagagcccaacattgtaaaaccacatgaaaaaggcataaaccccaattacctcctggcttggaagcaggtgctggcagagctccaagatattgaaaacgaggagaaaattccaaagacaaagaacatgaggaaaacaagccaattgaagtgggcacttggtgagaatatggcaccagagaaagtagactttgaggattgcaaagatgttagcgatctaaggcagtatgacagtgatgaaccaaagcctagatcactagcaagctggatccagagtgaattcaacaaggcatgcgaattgacagattcaagttggattgaacttgatgaaataggggaagacgttgctccaattgagcacattgcaagtatgagaaggaactatttcacagcggaagtatcccattgcagggctactgaatacataatgaagggagtgtacataaacacagctttgttgaatgcatcctgtgcagccatggatgacttccaactgatcccaatgataagcaaatgcagaaccaaagaaggaagacggaaaactaacctgtatggattccttataaaaggaagatcccatttgagaaatgacaccgatgtggtaaactttgtgagtatggaattctctcttactgatccgaggctggagccacacagatgggaaaagtactgcgttcttcggataggagacatgctcttacggactgaaataggccaagtgtcaaggcccatgtttctttatgtgagaaccaatggaacctccaagatcaagatgaaatggggcatggaaatgaggcgatgcccttttcaatcccttcaacagattgagagcatgattgaggccgagtcttctgtcaaagaaaaagacatgactaaagaattctttgaaaacaaatcagaaacatggccaattggagaatcacccaagggagtggaggaaggctccatcgggaaggtgtgcagaaccttactggctaaatctgttttcaacagtctatatgcatctccacaactcgaggggttttcagctgaatcaagaaaattgcttctcattgttcaggcacttagggacaacctggaacctggaaccttcgatcttggggggctatatgaagcaattgaggagtgcctgattaatgatccctgggttttgcttaatgcatcttggttcaactccttcctcacacatgcactaagatag>gi|GENSCAN_predicted_peptide_17|718_aaMDTVNRTHQYSEKGKWTTNTETGAPQLNPIDGPLPEDNEPSGYAQTDCVLEAMAFLEESHPGIFENSCLETMEVVQQTRVDKLTQGRQTYDWTLKRNQPAATALANTIEVFRSNGLTANESGRLIDFLKDVMESMDKGEMEIITHFQRKRRVRDNMTKKMVTQRTIGKKKQRLNKRSYLIRALTLNTMTKDAERGKLKRRAIATPGMQIRGFVYFVETLARSICEKLEQSGLPVGGNEKKAKLANVVRKMMTNSQDTELSFTITGDNTKWNENQNPRMFLAMITYITRNQPEWFRNVLSIAPIMFSNKMARLGKGYMFESKSMKLRTQIPAEMLASIDLKYFNESTRKKIEKIRPLLIDGTASLSPGMMMGMFNMLSTVLGVSILNLGQKRYTKTTYWWDGLQSSDDFALIVNAPNHEGIEAGVDRFYRTCKLVGINMTKKKSYINRTGTCEFTSFFYRYGFVANFSMELPSFGVSGINESADMSIGVTVIKNNMMDNDLGPATAQMALQLFIKDYRYPYRCHRGDTQIQTRRSFELKKLWEQTRSKAGLLVSDGGPNPYNIRNLHIPEAGLKWELMDEDYQGRLCNPLNPFVSHKEIESVNNAVVMPAHGPAKSMEYDAV
ATTHSWIPKRNRSILNTSQRGILEDEQMYQKCCNLFEKFFPSSSYRRPVGISSMVEAMVSRARIDARIDFESGRIKKEEFAEIMKICSTIEELGRQK>gi|GENSCAN_predicted_CDS_17|2157_bpatggacacagtcaacagaacacatcaatattcagaaaaggggaaatggacaacgaacacagagactggagcaccccaactcaatccgattgatggaccactacctgaggataatgagccgagtgggtatgcacaaacagattgtgtattggaagcaatggctttccttgaagaatcccacccagggatctttgaaaactcgtgtcttgaaacgatggaagttgttcagcaaacaagagtggataagctgacccaaggtcgccaaacctatgactggacattgaaaagaaaccagccggctgcaaccgctttggccaacactatagaggtcttcagatcgaatggtctaacagccaatgaatcgggaaggctaatagatttcctcaaagacgtgatggaatcaatggataagggagaaatggaaataataacacatttccagagaaagagaagagtgagggacaacatgaccaagaaaatggtcacacaaagaacaatagggaagaaaaaacaaaggctgaacaaaaggagctacctaataagagcactgacactgaacacaatgacaaaagacgcagaaagaggcaaattgaagaggcgggcaattgcaacacccgggatgcaaatcagaggattcgtgtactttgtcgaaacactagcgaggagtatctgtgagaaacttgagcaatctggactccccgtcggagggaatgaaaagaaggctaaattggcaaatgtcgtgaggaagatgatgactaactcacaagatacagagctctcttttacaattactggagacaacaccaaatggaatgagaatcagaaccctcggatgtttctagcaatgataacatacatcacaaggaaccaacctgaatggtttagaaatgtcttaagcattgctcctataatgttctcaaacaagatggcaagattagggaaaggatacatgttcgaaagtaagagcatgaagctacggacacaaataccagcagaaatgcttgcaagcattgacttgaaatacttcaacgaatcaacgagaaagaaaatcgagaaaataagacctctactaatagatggcacagcctcattgagtcctggaatgatgatgggcatgttcaatatgctgagtacagtcttaggagtttcaatcctgaatcttgggcagaagaggtacaccaaaaccacatactggtgggacggactccaatcctctgatgatttcgctctcatagtgaatgcaccaaatcatgagggaatagaagcaggggtggataggttctataggacttgcaaactagttggaatcaatatgaccaagaagaagtcttacataaatcggacaggaacatgtgaattcacaagcttcttctaccgctatgggttcgtagccaacttcagtatggagctgcccagctttggagtgtctgggattaatgaatcggctgacatgagcattggtgttacagtgataaagaacaatatgatggacaacgaccttggaccagcaacagctcagatggctcttcagctattcattaaggactacagatacccataccgatgccacaggggggatacacaaatccaaacgaggagatcattcgagctgaagaagctgtgggagcagacccgctcaaaggcaggactgttggtttcagatggaggaccaaacccatacaatatccggaatctccacattccggaggctggcttgaagtgggaattgatggatgaagactaccagggcagactgtgtaatcctctgaacccgtttgttagtcataaggaaattgagtctgtcaacaatgctgtggtaatgccagctcatggcccagccaagagcatggaatatgatgcagttgcga
ctacacattcatggattcccaagaggaatcgttccattctcaacaccagccaaagggggattcttgaggatgaacagatgtatcagaagtgctgcaatctattcgagaaattcttccctagcagttcatatcggaggccagttggaatttccagcatggtggaggccatggtgtctagggcccgaattgatgcacgaattgacttcgagtctggaaggattaagaaagaagagtttgctgagatcatgaagatctgttccaccattgaagagctcggacggcaaaaatag>gi|GENSCAN_predicted_peptide_18|759_aaMERIKELRDLMSQSRTREILTKTTVDHMAIIKKYTSGRQEKNPALRMKWMMAMKYPITADKRIMEMIPERNEQGQTLWSKTNDAGSDRVMVSPLAVTWWNRNGPTTSTVHYPKVYKTYFEKVERLKHGTFGPVHFRNQVKIRRRVDINPGHADLSAKEAQDVIMEVVFPNEVGARILTSESQLTITKEKKEELQDCKIAPLMVAYMLERELVRKTRFLPVAGGTSSVYIEVLHLTQGTCWEQMYTPGGEVRNDDVDQSLIIAARNIVRRATVSADPLASLLEMCHSTQIGGIRMVDILRQNPTEEQAVDICKAAMGLRISSSFSFGGFTFKRTNGSSVKKEEEVLTGNLQTLKIKVHEGYEEFTMVGRRATAILRKATRR

LIQLIVSGRDEQSIAEAIIVAMVFSQEDCMIKAVRGDLNFVNRANQRLNPMHQLLRHFQKDAKVLFQNWGIEPIDNVMGMIGILPDMTPSAEMSLRGVRVSKMGVDEYSSTERVVVSIDRFLRVRDQQGNVLLSPEEVSETQGTEKLTITYSSSMMWEINGPESVLVNTYQWIIRNWETVKIQWSQDPTMLYNKMEFESFQSLVPKAARSQYSGFVRTLFQQMRDVLGTFDTVQIIKLLPFAAAPPEPSRMQFSSLTVNVRGSGMRILVRGNSPVFNYNKATKRLTVLGKDAGALTEDPDEGTAGVESAVLRGFLILGREDKRYGPALSINELSNLAKGEKANVLIMQGDVVLVMKRKRDFSILTDSQTATKRIRMAIN>gi|GENSCAN_predicted_CDS_18|2280_bpatggaaagaataaaagaactaagagatctaatgtcgcagtcccgcactcgcgagatactaacaaaaaccactgtggatcatatggccataatcaagaaatacacatcaggaagacaagagaagaaccctgctctcagaatgaaatggatgatggcaatgaaatatccaatcacagcagacaagagaataatggagatgattcctgaaaggaatgagcaaggacaaacgctttggagcaagacaaatgatgctgggtcggacagagtgatggtgtctcccctagctgtaacttggtggaacaggaatgggccgacaacaagtacagtccattatccaaaggtttacaaaacatactttgagaaggttgaaaggttaaaacatggaaccttcggtcccgttcatttccgaaaccaagttaaaatacgtcgccgggtggatataaacccgggccatgcagatctcagtgctaaagaagcacaagatgttatcatggaggtcgttttcccaaatgaagtgggagctagaatattgacatcagagtcgcaattgacaataacaaaagagaagaaagaagagctccaggattgtaaaattgctcctttaatggtggcatacatgttggaaagagaactggtccgcaaaaccagatttctaccggtagcaggcggaacaagcagtgtgtacattgaggtattgcatttgactcaagggacctgttgggaacagatgtacactcccggcggagaagtaagaaatgatgatgttgaccagagtttgatcatcgctgccagaaacattgttaggagagcaacagtatcagcggacccactggcatcactcttggagatgtgtcacagcacacaaattgggggaataaggatggtggacatccttaggcaaaacccaactgaggagcaagctgtggatatatgcaaagcagcaatgggtttgaggatcagttcatcctttagctttggaggcttcactttcaaaagaacaaatggatcatccgtcaagaaggaagaggaagtgcttacaggcaacctccaaacattgaaaataaaagtacatgaggggtatgaagaattcacaatggttgggcggagagcaacagctatcctgaggaaagcaactagaaggctgattcagttgatagtaagtggaagagatgaacaatcaatcgctgaagcgatcattgtagcaatggtgttctcacaggaggattgcatgataaaggcagtccgaggcgatctgaatttcgtgaacagagcaaaccaaagattgaaccccatgcatcaactcctgaggcacttccaaaaagatgcaaaagtgctgtttcagaactggggaattgaacctattgacaatgtcatggggatgatcggaatattacctgacatgactccaagcgcagagatgtcactgagaggagtgagagttagtaagatgggagtagatgaatattccagcacggagagagtggtggtgagtattgaccgtttcttgagggtccgagatcagcaggggaacgtactcttatctcctgaagaggttagtgaaacacagggaacagagaagttgacaa
taacatattcatcctcaatgatgtgggaaatcaacggtcctgagtcagtgcttgttaacacttatcaatggatcatcaggaattgggagactgtaaagattcaatggtctcaagatcccacaatgctgtacaataagatggagtttgaatcgttccaatccttggtgccaaaggctgccagaagccaatatagtggatttgtgagaacactattccaacagatgcgtgatgttttggggacatttgatactgtccaaataatcaagctgctaccatttgcagcagccccaccggagccgagcagaatgcagttttcttctctaactgtgaatgtgagaggctcaggaatgagaatactcgtgaggggtaactcccccgtgttcaactacaacaaggcaaccaaaaggcttacagtcctcggaaaggacgcaggtgcattaacagaagatccagacgagggaacagccggggtggaatctgcagtattgaggggattcctaattctaggcagagaggacaaaagatatggacccgcattgagcatcaatgaactgagcaatcttgcaaaaggggagaaggctaatgtattgataatgcaaggagacgtggtgttggtaatgaaacggaaacgggactttagcatacttactgacagccagacagcgaccaaaagaattcggatggccatcaattag>gi|GENSCAN_predicted_peptide_19|716_aaMEDFVRQCFNPMIVELAEKTMKEYGEDPKIETNKFAAICTHLEVCFMYSDFHFIDERGESIIVESGDPNALLKHRFEIIEGRDRAMAWTVVNSICNTTGVDKPKFLPDLYDYKENRFTEIGVTRREVHIYYLEKANKIKSEKTHIHIFSFTGEEMATKADYTLDEESRARIKTRLFTIRQEMASRGLWDSFRQSERGEETIEERFEITGTMRRLADQSLPPNFSSLENFRAYVDGFKPNGCIEGKLSQMSKEVNARIEPFLKTTPRPLRLPDGPPCSQRSKFLLMDALKLSIEDPSHEGEGIPLYDAIKCMKTFFGWREPNIIKPHEKGINPNYLLAWKQVLAELQDIENEDKIPKTKNMKKTSQLMWALGENMAPEKLDFEDCKDIGDLKQYQSDEPELRSIASWIQSEFNKACELTDSSWIELDEIGEDVAPIEHIASMRRNYFTAEVSHCRATEYIMKGVYINTALLNASCAAMDDFQLIPMISKCRTKEGRRKTNLYGFIIKGRSHLRNDTDVVNFVSMEFSLTDPRLEPHKWEKYCVLEVGEMLLRTAIGQVSRPMFLYVRTNGTSKIKMKWGMEMRRCLLQSLQQIESMIEAESSIKEKDMTKEFFENRSETWPIGESPKGVEEGSIGKVCRTLLAKSVFNSLYSSPQLEGFSAESRKLLLIVQALRDNLEPGTFDLEGLYGAIEECLINDPWVLLNASWFNSFLTHALK>gi|GENSCAN_predicted_CDS_19|2151_bpatggaagactttgtgcgacagtgcttcaatccaatgattgtcgagcttgcggaaaagacaatgaaggaatatggggaagacccgaaaattgaaacaaataagttcgctgcaatatgcacacacttagaagtctgcttcatgtattcagacttccatttcattgacgaacgaggcgaatcaataattgtggaatctggtgatccaaatgcattgttgaagcacaggtttgaaataattgaaggaagagaccgagcaatggcctggacagtggtgaatagcatctgcaacacaacaggagtcgataaacccaaatttcttccggatctatacgactacaaggaaaaccgattcactgaaattggtgtgacacggagggaagttcacatatattacttagaaaaagctaacaagataaaatccgagaaaacacatatccacatcttttcattcactggagaagaaatggccactaaagctgactacacccttgatgaagagagcagggcaagaataa
aaaccagactattcaccataagacaggaaatggcaagcaggggtctatgggattcctttcgtcagtccgagagaggcgaagagacaattgaagaaagatttgaaatcacagggaccatgcgtaggcttgccgaccaaagtctcccacctaacttctccagccttgaaaactttagagcctatgtggatggattcaaaccgaacggctgcattgagggcaagctttctcaaatgtcgaaagaagtgaacgccagaattgagccatttctgaagacaacaccacgtcccctcagattgcctgatggacctccctgctcccagcggtcgaaattcttgctgatggatgctctgaaattaagcattgaggacccgagccatgagggggaggggataccgctatatgatgcgataaaatgcatgaaaacattcttcggctggagagagcccaacatcatcaagccacacgagaagggcataaatcccaattatcttctggcttggaagcaggtgctggcagaactccaggatattgaaaatgaggataaaatcccaaaaacaaagaacatgaagaaaacaagccaattaatgtgggcactcggggagaatatggcaccggaaaaattggactttgaggactgcaaagatattggcgatctgaaacagtatcaaagtgatgagccagagctcagatcgatagcaagctggatccagagtgagttcaacaaggcatgtgaattgaccgattcgagctggatagaactcgatgagataggggaagatgttgccccaattgagcacattgcaagcatgagaaggaactacttcacagcggaagtgtctcattgcagggccactgagtacataatgaagggggtttacataaatacagctttgctcaatgcatcttgtgcagccatggatgacttccaactgattccaatgataagcaaatgcagaacaaaagaaggaagaaggaagacaaacctgtatgggttcattataaaaggaaggtcccatttgagaaatgatactgacgtggtgaactttgtgagtatggaattctcccttactgacccaaggctggagccacacaaatgggaaaagtactgtgttcttgaagtaggggaaatgctcttgcggactgcaataggccaggtgtcaaggcccatgttcctgtatgtgagaactaacggaacctccaaaattaagatgaaatgggggatggaaatgagacgctgccttcttcaatctcttcaacagattgagagcatgatcgaggctgagtcttctatcaaagagaaagacatgaccaaagaattctttgaaaacagatcggagacatggccaattggagagtcacctaagggagtggaggaaggctcaatcgggaaggtgtgcagaaccttactagcaaaatctgtgttcaacagcctatattcatctccacaactcgaaggattttcagctgaatcgagaaaactactactcattgttcaagcacttagggacaacctggaacctggaacctttgatcttgaagggctatatggagcaattgaggagtgcctgattaatgatccctgggttttgcttaatgcatcttggttcaactccttcctcacacatgcactaaaatag>gi|GENSCAN_predicted_peptide_20|719_aaMDTVNRTHQYSEKGRWTTNTETGAPQLNPIDGPLPEDNEPSGYAQTDCVLEAMAFLEESHPGLFENSCLETMEVVQQTRVDKLTQGRQTYDWTLNRNQPAATALANTIEVFRSNGLTANESGRLIDFLKDVMESMDKEEMEITTHFQRKRRVRDNMTKKMVTQRTIGKKKQKLTKKSYLIRALTLNTMTKDAERGKLKRRAIATPGMQIRGFVHFVEALARSICEKLEQSGLPVGGNEKKAKLANVVRKMMTNSQDTELSFTVTGDNTKWNENQNPRIFLAMITYITRNQPEWFRNVLSIAPIMFSNKMARLGKGYMFESKSMK
LRTQIPAEMLANIDLKYFNESTRKKIEKIRPLLIEGTASLSPGMMMGMFNMLSTVLGVSILNLGQKRYTKTTYWWDGLQSSDDFALIVNAPNHEGIQAGVDRFYRTCKLVGINMSKKKSYINRTGTFEFTSFFYRYGFVANFSMELPSFGVSGINESADMSIGVTVIKNNMINNDLGPATAQMALQLFIKDYRYTYRCHRGDTQIQTRRSFELKKLWEQTRSKAGLLVSDGGPNLYNIRNLHIPEVCLKWELMDEDYQGRLCNPLNPFVSHKEVESVNNAVVMPAHGPAKSMEYDAVATTHSWIPKRNRSILNTSQRGILEDEQMYQKCCTLFEKFFPSSSYRRPVGISSMMEAMVSRARIDARIDFESGRIKKEEFAEILKICSTIEELGRQGK>gi|GENSCAN_predicted_CDS_20|2160_bpatggacacagtcaacagaacacatcaatattcagaaaaagggaggtggacaacaaacacagagaccggagcaccccaactcaaccctattgatggaccattacctgaagacaatgagccgagcgggtatgcacaaacagattgtgtattggaagcaatggctttccttgaagaatcccacccaggactctttgaaaactcatgtcttgaaacgatggaagttgtccagcaaacgagagtggataagctgacccaaggtcgccagacttatgactggacattgaatagaaaccagccggctgcaactgctttggccaacaccatagaagtattcagatcgaacggtctaacagccaatgagtcaggaaggttaatagattt


cctcaaggacgtaatggaatcaatggataaggaagaaatggaaataacaacacatttccagagaaagagaagagtgagggacaacatgaccaagaaaatggtcacacaaagaacaatagggaagaagaagcaaaagctgacaaaaaagagctacctaataagagcactgacactgaacacaatgacaaaagatgctgaaaggggaaaattgaaaagacgagcgattgcaacacccggaatgcaaatcagaggattcgtgcactttgtcgaagcactagcaaggagcatctgtgaaaaacttgagcaatctggactccccgttggagggaatgagaagaaggctaaattggcaaatgttgtgagaaagatgatgactaactcacaagacacagagctctcctttacagttaccggagacaacaccaaatggaatgagaatcagaatcctcgaatatttctagcaatgataacatacatcacaaggaaccaacctgaatggtttagaaatgtcttgagcattgcccctataatgttctcaaataaaatggcgaggttaggaaaaggatacatgttcgagagtaagagcatgaagctacggacacaaataccagcagaaatgcttgcaaacattgacttgaaatacttcaacgaatcgacgagaaagaaaattgagaaaataagacctctactaatagagggcacagcctcattgagtccagggatgatgatgggcatgtttaatatgctaagtacggtcttaggagtctcaatcttaaatcttgggcagaagaggtacaccaaaaccacatactggtgggatgggctccaatcctctgatgatttcgctctcatagtgaatgcaccaaatcatgagggaatacaagcaggagtggatagattctataggacttgcaagctagttggaatcaacatgagcaaaaagaagtcttacataaatcggacaggaacatttgagttcacaagctttttctaccgctatgggtttgtagccaacttcagcatggagctgcccagctttggagtttccggaattaatgaatcggctgacatgagcattggagttacagtgataaagaataatatgataaacaacgaccttggaccagcaacagcccagatggctcttcagctgttcattaaagactacagatacacctaccgatgccacagaggtgatacacaaattcaaactagaagatcatttgaattgaagaagctgtgggagcagacccgctcaaaggcaggactgttggtttcagatggagggccgaatttatacaacatccggaatcttcacattccagaagtttgcttgaagtgggagttgatggatgaagattaccagggaagactgtgtaaccctctgaacccgtttgtcagtcataaggaagttgaatccgtcaacaatgctgtggtaatgccagcccatggtccggccaagagcatggaatatgatgccgttgcaactacacattcatggattcccaagagaaatcgctccattctcaacactagccaaaggggaattcttgaggatgaacaaatgtaccagaagtgctgcactctattcgagaaattcttccctagcagttcatatcggaggccagttggaatttccagcatgatggaggccatggtgtctagggcccgaattgatgcacggattgacttcgagtctggaaggattaagaaagaagaatttgctgagatcttgaagatctgttccaccattgaagagctcggacggcaagggaagtga>gi|GENSCAN_predicted_peptide_21|709_aaMAMKYPITADKRIMEMIPERNEQGQTLWSKTNDAGSDRVMVSPLAVTWWNRNGPTTSTVHYPKVYKTYFEKVERLKHGTFGPVHFRNQVKIRRRVDMNPGHADLSAKEAQDVIMEVVFPNEVGARILTSESQLTITKEKREELKNCNIAPLMVAYMLERELVRKTRFLPVAGGTSSVYIEV
LHLTQGTCWEQMYTPGGEVRNDDVDQSLIIAARNIVRRATVSADPLASLLEMCHSTQIGGVRMVDILKQNPTEEQAVDICKAAMGLKISSSFSFGGFTFKRTKGSSVKREEEVLTGNLQTLKIKVHEGYEEFTMVGRRATAILRKATRRMIQLIVSGRDEQSIAEAIIVAMVFSQEDCMVKAVRGDLNFVNRANQRLNPMHQLLRHFQKDAKVLFQNWGIEPIDNVMGMIGILPDMTPSTEMSLRGVRVSKMGVDEYSSTERVVVSIDRFLRVRDQRGNVLLSPEEVSETQGMEKLTITYSSSMMWEINGPESVLVNTYQWIIRNWETVKIQWSQEPTMLYNKMEFEPFQSLVPKAARSQYSGFVRTLFQQMRDVLGTFDTVQIIKLLPFAAAPPEQSRMQFSSLTVNVRGSGMRILVRGNSPAFNYNKTTKRLTILGKDAGALTEDPDEGTAGVESAVLRGFLILGKEDKRYGPALSINELSNLTKGEKANVLIGQGDVVLVMKRKRDSSILTDSQTATKRIRMAIN>gi|GENSCAN_predicted_CDS_21|2130_bpatggcgatgaaatacccgatcacagctgacaaaagaataatggagatgatccctgaaaggaatgagcaaggccaaactctttggagcaaaacaaatgacgctggatcagacagggtaatggtatcacctctggctgtaacgtggtggaacagaaatggaccaacaacaagtacagtccattatccaaaggtgtataaaacctactttgaaaaggttgaaagattaaaacacggaacctttggccctgttcatttccggaatcaagtcaaaatacgccgcagggttgacatgaaccctggccatgcagatctcagcgctaaagaagcacaagatgtcatcatggaggtcgttttcccaaatgaagttggagccaggatattgacatcagaatcacagctgacaataacaaaggaaaagagggaggaactcaagaattgtaatattgctcctttaatggtggcatatatgttggaaagagaattggttcgcaagaccagattcctacccgtggctggcgggacaagcagcgtatatatagaagtattgcatttgactcaaggaacttgctgggagcagatgtacacaccaggaggggaggtaagaaatgatgatgttgaccaaagtttaatcattgctgctaggaacattgtcaggagagcaacagtatcagcagacccattggcttcactcctggaaatgtgccatagcacacaaattggcggagtaagaatggtagacatccttaaacaaaacccaacagaagagcaagctgtagatatatgcaaggcagcaatgggtttgaaaatcagctcatccttcagctttggagggttcactttcaaaagaacaaaggggtcttctgtcaaaagagaggaagaagtgcttacaggcaacctccaaacattgaagataaaagtacatgaaggatatgaggaattcacaatggttggacgaagagcaacagccattctaagaaaagcaaccagaaggatgatccaactgatagtcagcggaagggacgagcaatcaattgctgaggcaattattgtggcaatggtgttctcacaagaagattgcatggtaaaggcagtccgaggtgatttgaatttcgtaaacagagcaaatcaacgactgaatcccatgcaccaactcctgagacactttcaaaaggatgcaaaggtgctgtttcaaaactggggaattgaacccatcgacaatgtcatgggtatgattggaatattgcctgacatgacccccagcacggaaatgtcactaagaggagtgagagttagcaaaatgggggtggatgaatattctagcactgaaagggtggtcgtgagcattgaccgtttcttaagggtccgagatcagcgaggaaatgtactcctatcccctgaagaagttagtgaaacacagggaatggaaaagttgacgat
aacttattcatcgtctatgatgtgggagattaacgggccagaatcagtgctagttaacacatatcaatggatcattaggaattgggagactgtaaagatccaatggtcccaagaacccaccatgctatacaataagatggagtttgaaccatttcaatctttagtaccaaaggctgccagaagccaatatagtggatttgtgagaacgctattccagcagatgcgtgatgttttgggaacgttcgacactgttcaaataatcaaactactaccatttgcagcagccccaccggaacagagtaggatgcaattttcttctctgactgtgaatgtgaggggatcaggaatgagaatacttgtgagaggtaactcccctgcatttaactacaacaagacaactaagaggcttacaatacttgggaaggacgcaggtgcgcttacagaggacccagatgaaggaacagcaggagtagagtctgcagtattgagaggatttctaatcctcggcaaagaagacaaaagatatggaccagcattaagcatcaatgaactgagcaatcttacgaaaggggagaaagctaatgtattgatagggcaaggagacgtagtgttggtaatgaaacggaaacgggactctagcatacttactgacagccagacagcgaccaaaagaattcggatggccatcaattag>gi|GENSCAN_predicted_peptide_22|751_aaMETISLITILLVVTASNADKICIGHQSTNSTETVDTLTETNVPVTHAKELLHTEHNGMLCATSLGHPLILDTCTIEGLVYGNPSCDLLLGGREWSYIVERSSAVNGTCYPGNVENLEELRTLFSSASSYQRIQIFPDTTWNVTYTGTSRACSGSFYRSMRWLTQKSGFYPVQDAQYTNNRGKSILFVWGIHHPPTYTEQTNLYIRNDTTTSVTTEDLNRTFKPVIGPRPLVNGLQGRIDYYWSVLKPGQTLRVRSNGNLIAPWYGHVLSGGSHGRILKTDLKGGNCVVQCQTEKGGLNSTLPFHNISKYAFGTCPKYVRVNSLKLAVGLRNVPARSSRGLFGAIAGFIEGGWPGLVAGWYGFQHSNDQGVGMAADRDSTQKAIDKITSKVNNIVDKMNKQYEIIDHEFSEVETRLNMINNKIDDQIQDVWAYNAELLVLLENQKTLDEHDANVNNLYNKVKRALGSNAMEDGKGCFELYHKCDDQCMETIRNGTYNRRKYREESRLERQKIEGGILGFVFTLTVPSERGLQRRRFVQNALNGNGDPNNMDRAVKLYKKLKREMTFHGAKEVALSYSTGALASCMGLIYNRMGTVTTEVALGLVCATCEQIADAQHRSHRQMATTTNPLIRHENRMVLASTTAKAMEQMAGSSEQAAEAMEVASQARQMVQAMRTIGTHPSSSAASIIGILHLILWILDRLFFKCIYRRFKYGLKRGPSTEGVPESMREEYRQEQQNAVDVDDGHFVNIELE>gi|GENSCAN_predicted_CDS_22|2256_bpatggaaacaatatcactaataactatactactagtagtaacagcaagcaatgcagataaaatctgcatcggccaccagtcaacaaactccacagaaactgtggacacgctaacagaaaccaatgttcctgtgacacatgccaaagaattgctccacacagagcataatggaatgctgtgtgcaacaagcctgggacatcccctcattctagacacatgcactattgaaggactagtctatggcaacccttcttgtgacctgctgttgggaggaagagaatggtcctacatcgtcgaaagatcatcagctgtaaatggaacgtgttaccctgggaatgtagaaaacctagaggaactcaggacactttttagttccgctagttcctaccaaagaatccaaatcttcccagacacaacctggaatgtgacttacactggaacaagcagagcatgttcaggttcattctacaggagtatgaga
tggctgactcaaaagagcggtttttaccctgttcaagacgcccaatacacaaataacaggggaaagagcattcttttcgtgtggggcatacatcacccacccacctataccgagcaaacaaatttgtacataagaaacgacacaacaacaagcgtgacaacagaagatttgaataggaccttcaaaccagtgatagggccaaggccccttgtcaatggtctgcagggaagaattgattattattggtcggtactaaaaccaggccaaacattgcgagtacgatccaatgggaatctaattgctccatggtatggacacgttctttcaggagggagccatggaagaatcctgaagactgatttaaaaggtggtaattgtgtagtgcaatgtcagactgaaaaaggtggcttaaacagtacattgccattccacaatatcagtaaatatgcatttggaacctgccccaaatatgtaagagttaatagtctcaaactggcagtcggtctgaggaacgtgcctgctagatcaagtagaggactatttggagccatagctggattcatagaaggaggttggccaggactagtcgctggctggtatggtttccagcattcaaatgatcaaggggttggtatggctgcagatagggattcaactcaaaaggcaattgataaaataacatccaaggtgaataatatagtcgacaagatgaacaagcaatatgaaataattgatcatgaattcagtgaggttgaaactagactcaatatgatcaataataagattgatgaccaaatacaagacgtatgggcatataatgcagaattgctagtactacttgaaaatcaaa


aaacactcgatgagcatgatgcgaacgtgaacaatctatataacaaggtgaagagggcactgggctccaatgctatggaagatgggaaaggctgtttcgagctataccataaatgtgatgatcagtgcatggaaacaattcggaacgggacctataataggagaaagtatagagaggaatcaagactagaaaggcagaaaatagagggggggattttagggtttgtgttcacgctcaccgtgcccagtgagcgaggactgcagcgtagacgatttgtccaaaatgccctaaatgggaatggagacccaaacaacatggacagggcagttaaactatacaagaagctgaagagggaaatgacattccatggagcaaaggaagttgcactcagttactcaactggtgcgcttgccagttgcatgggtctcatatacaaccggatgggaacagtgaccacagaagtggctcttggcctagtatgtgccacttgtgaacagattgctgatgcccaacatcggtcccacaggcagatggcgactaccaccaacccactaatcaggcatgagaacagaatggtactagccagcactacggctaaggccatggagcagatggctggatcaagtgagcaggcagcagaagccatggaagtcgcaagtcaggctaggcaaatggtgcaggctatgaggacaattgggactcaccctagttccagtgcagcaagtatcattgggatattgcacttgatattgtggattcttgatcgtcttttcttcaaatgcatttatcgtcgctttaaatacggtttgaaaagagggccttctacggaaggagtgcctgagtctatgagggaagagtatcggcaggaacagcagaatgctgtggatgttgacgatggtcattttgtcaacatagagctggagtaa>gi|GENSCAN_predicted_peptide_23|759_aaMERIKELRNLMSQSRTREILTKTTVDHMAIIKKYTSGRQEKNPALRMKWMMAMKYPITADKRITEMIPERNEQGQTLWSKMNDAGSDRVMVSPLAVTWWNRNGPMTNTVHYPKIYKTYFERVERLKHGTFGPVHFRNQVKIRRRVDINPGHADLSAKEAQDVIMEVVFPNEVGARILTSESQLTITKEKKEELQDCKISPLMVAYMLERELVRKTRFLPVAGGTSSVYIEVLHLTQGTCWEQMYTPGGEVKNDDVDQSLIIAARNIVRRAAVSADPLASLLEMCHSTQIGGIRMVDILKQNPTEEQAVGICKAAMGLRISSSFSFGGFTFKRTSGSSVKREEEVLTGNLQTLKIRVHEGYEEFTMVGRRATAILRKATRRLIQLIVSGRDEQSIAEAIIVAMVFSQEDCMIKAVRGDLNFVNRANQRLNPMHQLLRHFQKDAKVLFQNWGVEPIDNVMGMIGILPDMTPSIEMSMRGVRISKMGVDEYSSTERVVVSIDRFLRVRDQRGNVLLSPEEVSETQGTEKLTITYSSSMMWEINGPESVLVNTYQWIIRNWETVKIQWSQNPTMLYNKMEFEPFQSLVPKAIRGQYSGFVRTLFQQMRDVLGTFDTAQIIKLLPFAAAPPKQSRMQFSSFTVNVRGSGMRILVRGNSPVFNYNKATKRLTVLGKDAGTLTEDPDEGTAGVESAVLRGFLILGKEDRRYGPALSINELSNLAKGEKANVLIGQGDVVLVMKRKRDSSILTDSQTATKRIRMAIN>gi|GENSCAN_predicted_CDS_23|2280_bpatggaaagaataaaagaactaagaaatctaatgtcgcagtctcgcacccgcgagatactcacaaaaaccaccgtggaccatatggccataatcaagaagtacacatcaggaagacaggagaagaacccagcacttaggatgaaatggatgatggcaatgaaatatccaattacagcagacaagaggataacggaaatgattcctgagagaaatgagcaaggacaaactttatggagtaaa
atgaatgatgccggatcagaccgagtgatggtatcacctctggctgtgacatggtggaataggaatggaccaatgacaaatacagttcattatccaaaaatctacaaaacttattttgaaagagtcgaaaggctaaagcatggaacctttggccctgtccattttagaaaccaagtcaaaatacgtcggagagttgacataaatcctggtcatgcagatctcagtgccaaggaggcacaggatgtaatcatggaagttgttttccctaacgaagtgggagccaggatactaacatcggaatcgcaactaacgataaccaaagagaagaaagaagaactccaggattgcaaaatttctcctttgatggttgcatacatgttggagagagaactggtccgcaaaacgagattcctcccagtggctggtggaacaagcagtgtgtacattgaagtgttgcatttgactcaaggaacatgctgggaacagatgtatactccaggaggggaagtgaagaatgatgatgttgatcaaagcttgattattgctgctaggaacatagtgagaagagctgcagtatcagcagacccactagcatctttattggagatgtgccacagcacacagattggtggaattaggatggtagacatccttaagcagaacccaacagaagagcaagccgtgggtatatgcaaggctgcaatgggactgagaattagctcatccttcagttttggtggattcacatttaagagaacaagcggatcatcagtcaagagagaggaagaggtgcttacgggcaatcttcaaacattgaagataagagtgcatgagggatatgaagagttcacaatggttgggagaagagcaacagccatactcagaaaagcaaccaggagattgattcagctgatagtgagtgggagagacgaacagtcgattgccgaagcaataattgtggccatggtattttcacaagaggattgtatgataaaagcagttagaggtgatctgaatttcgtcaatagggcgaatcagcgactgaatcctatgcatcaacttttaagacattttcagaaggatgcgaaagtgctttttcaaaattggggagttgaacctatcgacaatgtgatgggaatgattgggatattgcccgacatgactccaagcatcgagatgtcaatgagaggagtgagaatcagcaaaatgggtgtagatgagtactccagcacggagagggtagtggtgagcattgaccggttcttgagagtccgggaccaacgaggaaatgtactactgtctcccgaggaggtcagtgaaacacagggaacagagaaactgacaataacttactcatcgtcaatgatgtgggagattaatggtcctgaatcagtgttggtcaatacctatcaatggatcatcagaaactgggaaactgttaaaattcagtggtcccagaaccctacaatgctatacaataaaatggaatttgaaccatttcagtctttagtacctaaggccattagaggccaatacagtgggtttgtgagaactctgttccaacaaatgagggatgtgcttgggacatttgataccgcacagataataaaacttcttcccttcgcagccgctccaccaaagcaaagtagaatgcagttctcctcatttactgtgaatgtgaggggatcaggaatgagaatacttgtaaggggcaattctcctgtattcaactacaacaaggccacgaagagactcacagttctcggaaaggatgctggcactttaaccgaagacccagatgaaggcacagctggagtggagtccgctgttctgaggggattcctcattctgggcaaagaagacaggagatatgggccagcattaagcatcaatgaactgagcaaccttgcgaaaggagagaaggctaatgtgctaattgggcaaggagacgtggtgttggtaatgaaacgaaaacgggactctagcatacttactgacag
ccagacagcgaccaaaagaattcggatggccatcaattag>gi|GENSCAN_predicted_peptide_24|716_aaMEDFVRQCFNPMIVELAEKTMKEYGEDLKIETNKFAAICTHLEVCFMYSDFHFINEQGESIIVELGDPNALLKHRFEIIEGRDRTMAWTVVNSICNTTGAEKPKFLPDLYDYKENRFIEIGVTRREVHIYYLEKANKIKSEKTHIHIFSFTGEEMATKADYTLDEESRARIKTRLFTIRQEMASRGLWDSFRQSERGEETIEERFEITGTMRKLADQSLPPNFSSLENFRAYVDGFEPNGYIEGKLSQMSKEVNARIEPFLKTTPRPLRLPNGPPCSQRSKFLLMDALKLSIEDPSHEGEGIPLYDAIKCMRTFFGWKEPNVVKPHEKGINPNYLLSWKQVLAELQDIENEEKIPKTKNMKKTSQLKWALGENMAPEKVDFDDCKDVGDLKQYDSDEPELRSLASWIQNEFNKACELTDSSWIELDEIGEDVAPIEHIASMRRNYFTSEVSHCRATEYIMKGVYINTALLNASCAAMDDFQLIPMISKCRTKEGRRKTNLYGFIIKGRSHLRNDTDVVNFVSMEFSLTDPRLEPHKWEKYCVLEIGDMLLRSAIGQVSRPMFLYVRTNGTSKIKMKWGMEMRRCLLQSLQQIESMIEAESSVKEKDMTKEFFENKSETWPIGESPKGVEESSIGKVCRTLLAKSVFNSLYASPQLEGFSAESRKLLLIVQALRDNLEPGTFDLGGLYEAIEECLINDPWVLLNASWFNSFLTHALS>gi|GENSCAN_predicted_CDS_24|2151_bpatggaagattttgtgcgacaatgcttcaatccgatgattgtcgagcttgcggaaaaaacaatgaaagagtatggggaggacctgaaaatcgaaacaaacaaatttgcagcaatatgcactcacttggaagtatgcttcatgtattcagatttccacttcatcaatgagcaaggcgagtcaataatcgtagaacttggtgatcctaatgcacttttgaagcacagatttgaaataatcgagggaagagatcgcacaatggcctggacagtagtaaacagtatttgcaacactacaggggctgagaaaccaaagtttctaccagatttgtatgattacaaggaaaatagattcatcgaaattggagtaacaaggagagaagttcacatatactatctggaaaaggccaataaaattaaatctgagaaaacacacatccacattttctcgttcactggggaagaaatggccacaaaggccgactacactctcgatgaagaaagcagggctaggatcaaaaccaggctattcaccataagacaagaaatggccagcagaggcctctgggattcctttcgtcagtccgagagaggagaagagacaattgaagaaaggtttgaaatcacaggaacaatgcgcaagcttgccgaccaaagtctcccgccgaacttctccagccttgaaaattttagagcctatgtggatggattcgaaccgaacggctacattgagggcaagctgtctcaaatgtccaaagaagtaaatgctagaattgaaccttttttgaaaacaacaccacgaccacttagacttccgaatgggcctccctgttctcagcggtccaaattcctgctgatggatgccttaaaattaagcattgaggacccaagtcatgaaggagagggaataccgctatatgatgcaatcaaatgcatgagaacattctttggatggaaggaacccaatgttgttaaaccacacgaaaagggaataaatccaaattatcttctgtcatggaagcaagtactggcagaactgcaggacattgagaatgaggagaaaattccaaagactaaaaatatgaaaaaaacaagtcagctaaagtgggcacttggtgagaacatggcaccagaaaaggtagactttgacgactgtaaagatgtaggtgattt
gaagcaatatgatagtgatgaaccagaattgaggtcgcttgcaagttggattcagaatgagttcaacaaggcatgcgaactgacagattcaagctggatagagcttgatgagattggagaagatgtggctccaattgaacacattgcaagcatgagaaggaattatttcacatcagaggtgtctcactgcagagccacagaatacataatgaagggggtgtacatcaatactgccttacttaatgcatcttgtgcagcaatggatgatttccaattaattccaatgataagcaagtgtagaactaaggagggaaggcgaaagaccaacttgtatggtttcatcataaaaggaagatcccacttaaggaatgacaccgacgtggtaaactttgtgagcatggagttttctctcactgacccaagacttgaaccacacaaatgggagaagtactgtgttcttgagataggagatatgcttctaagaagtgccataggccaggtttcaaggcccatgttcttgtatgtgaggacaaatggaacctcaaaaattaaaatgaaatggggaatggagatgaggcgttgtctcctccagtcacttcaacaaattgagagtatgattgaagctgagtcctctgtcaaagagaaagacatgaccaaagagttctttgagaacaaatcagaaacatggcccattggagagtctcccaaaggagtggaggaaagttccattgggaaggtctgcaggactttattagcaaagtcggtatttaacagcttgtatgcatctccacaactagaaggattttcagctgaatcaagaaaactg


cttcttatcgttcaggctcttagggacaatctggaacctgggacctttgatcttggggggctatatgaagcaattgaggagtgcctaattaatgatccctgggttttgcttaatgcttcttggttcaactccttccttacacatgcattgagttag>gi|GENSCAN_predicted_peptide_25|718_aaMDTVNRTHQYSEKARWTTNTETGAPQLNPIDGPLPEDNEPSGYAQTDCVLEAMAFLEESHPGIFENSCIETMEVVQQTRVDKLTQGRQTYDWTLNRNQPAATALANTIEVFRSNGLTANESGRLIDFLKDVMESMKKEEMGITTHFQRKRRVRDNMTKKMITQRTIGKRKQRLNKRSYLIRALTLNTMTKDAERGKLKRRAIATPGMQIRGFVYFVETLARSICEKLEQSGLPVGGNEKKAKLANVVRKMMTNSQDTELSLTITGDNTKWNENQNPRMFLAMITYMTRNQPEWFRNVLSIAPIMFSNKMARLGKGYMFESKSMKLRTQIPAEMLASIDLKYFNDSTRKKIEKIRPLLIEGTASLSPGMMMGMFNMLSTVLGVSILNLGQKRYTKTTYWWDGLQSSDDFALIVNAPNHEGIQAGVDRFYRTCKLHGINMSKKKSYINRTGTFEFTSFFYRYGFVANFSMELPSFGVSGSNESADMSIGVTVIKNNMINNDLGPATAQMALQLFIKDYRYTYRCHRGDTQIQTRRSFEIKKLWEQTRSKAGLLVSDGGPNLYNIRNLHIPEVCLKWELMDEDYQGRLCNPLNPFVSHKEIESMNNAVMMPAHGPAKNMEYDAVATTHSWIPKRNRSILNTSQRGVLEDEQMYQRCCNLFEKFFPSSSYRRPVGISSMVEAMVSRARIDARIDFESGRIKKEEFTEIMKICSTIEELRRQK>gi|GENSCAN_predicted_CDS_25|2157_bpatggatactgtcaacaggacacatcagtactcagaaaaggcaagatggacaacaaacaccgaaactggagcaccgcaactcaacccgattgatgggccactgccagaagacaatgaaccaagtggttatgcccaaacagattgtgtattggaagcaatggctttccttgaggaatcccatcctggtatttttgaaaactcgtgtattgaaacgatggaggttgttcagcaaacacgagtagacaagctgacacaaggccgacagacctatgactggactttaaatagaaaccagcctgctgcaacagcattggccaacacaatagaagtgttcagatcaaatggcctcacggccaatgagtctggaaggctcatagacttccttaaggatgtaatggagtcaatgaaaaaagaagaaatggggatcacaactcattttcagagaaagagacgggtgagagacaatatgactaagaaaatgataacacagagaacaataggtaaaaggaaacagagattgaacaaaaggagttatctaattagagcattgaccctgaacacaatgaccaaagatgctgagagagggaagctaaaacggagagcaattgcaaccccagggatgcaaataagggggtttgtatactttgttgagacactggcaaggagtatatgtgagaaacttgaacaatcagggttgccagttggaggcaatgagaagaaagcaaagttggcaaatgttgtaaggaagatgatgaccaattctcaggacaccgaactttctttgaccatcactggagataacaccaaatggaacgaaaatcagaatcctcggatgtttttggccatgatcacatatatgaccagaaatcagcccgaatggttcagaaatgttctaagtattgctccaataatgttctcaaacaaaatggcgagactgggaaaagggtatatgtttgagagcaagagtatgaaacttagaactcaaatacctgcagaaatgctagcaagcattgatttgaaatatttcaatgattcaacaagaaagaagattg
aaaaaatccgaccgctcttaatagaggggactgcatcattgagccctggaatgatgatgggcatgttcaatatgttaagcactgtattaggcgtctccatcctgaatcttggacaaaagagatacaccaagactacttactggtgggatggtcttcaatcctctgacgattttgctctgattgtgaatgcacccaatcatgaagggattcaagccggagtcgacaggttttatcgaacctgtaagctacatggaatcaatatgagcaagaaaaagtcttacataaacagaacaggtacatttgaattcacaagttttttctatcgttatgggtttgttgccaatttcagcatggagcttcccagttttggtgtgtctgggagcaacgagtcagcggacatgagtattggagttactgtcatcaaaaacaatatgataaacaatgatcttggtccagcaacagctcaaatggcccttcagttgttcatcaaagattacaggtacacgtaccgatgccatagaggtgacacacaaatacaaacccgaagatcatttgaaataaagaaactgtgggagcaaacccgttccaaagctggactgctggtctccgacggaggcccaaatttatacaacattagaaatctccacattcctgaagtctgcctaaaatgggaattgatggatgaggattaccaggggcgtttatgcaacccactgaacccatttgtcagccataaagaaattgaatcaatgaacaatgcagtgatgatgccagcacatggtccagccaaaaacatggagtatgatgctgttgcaacaacacactcctggatccccaaaagaaatcgatccatcttgaatacaagtcaaagaggagtacttgaagatgaacaaatgtaccaaaggtgctgcaatttatttgaaaaattcttccccagcagttcatacagaagaccagtcgggatatccagtatggtggaggctatggtttccagagcccgaattgatgcacggattgatttcgaatctggaaggataaagaaagaagagttcactgagatcatgaagatctgttccaccattgaagagctcagacggcaaaaatag>gi|GENSCAN_predicted_peptide_26|230_aaMDPNTVSSFQVDCFLWHVRKRVADQELGDAPFLDRLRRDQKSLRGRGSTLGLDIETATRAGKQIVERILKEESDEALKMTMASVPASRYLTDMTLEEMSREWSMLIPKQKVAGPLCIRMDQAIMDKNIILKANFSVIFDRLETLILLRAFTEEGAIVGEISPLPSLPGHTAEDVKNAVGVLIGGLEWNDNTVRVSETLQRFAWRSSNENGRPPLTPKQKREMAGTIRSEV>gi|GENSCAN_predicted_CDS_26|693_bpatggatccaaacactgtgtcaagctttcaggtagattgctttctttggcatgtccgcaaacgagttgcagaccaagaactaggtgatgccccattccttgatcggcttcgccgagatcagaaatccctaagaggaaggggcagcactcttggtctggacatcgagacagccacacgtgctggaaagcagatagtggagcggattctgaaagaagaatccgatgaggcacttaaaatgaccatggcctctgtacctgcgtcgcgttacctaaccgacatgactcttgaggaaatgtcaagggaatggtccatgctcatacccaagcagaaagtggcaggccctctttgtatcagaatggaccaggcgatcatggataaaaacatcatactgaaagcgaacttcagtgtgatttttgaccggctggagactctaatattgctaagggctttcaccgaagagggagcaattgttggcgaaatttcaccattgccttctcttccaggacatactgctgaggatgtcaaaaatgcagttggagtcctcatcggaggacttgaatggaatgataacacagttcgagtctctgaaac
tctacagagattcgcttggagaagcagtaatgagaatgggagacctccactcactccaaaacagaaacgagaaatggcgggaacaattaggtcagaagtttga>gi|GENSCAN_predicted_peptide_27|498_aaMASQGTKRSYEQMETDGERQNATEIRASVGKMIGGIGRFYIQMCTELKLSDYEGRLIQNSLTIERMVLSAFDERRNKYLEEHPSAGKDPKKTGGPIYRRVNGKWMRELILYDKEEIRRIWRQANNGDDATAGLTHMMIWHSNLNDATYQRTRALVRTGMDPRMCSLMQGSTLPRRSGAAGAAVKGVGTMVMELVRMIKRGINDRNFWRGENGRKTRIAYERMCNILKGKFQTAAQKAMMDQVRESRDPGNAEFEDLTFLARSALILRGSVAHKSCLPACVYGPAVASGYDFEREGYSLVGIDPFRLLQNSQVYSLIRPNENPAHKSQLVWMACHSAAFEDLRVLSFIKGTKVVPRGKLSTRGVQIASNENMETMESSTLELRSRYWAIRTRSGGNTNQQRASAGQISIQPTFSVQRNLPFDRTTVMAAFTGNTEGRTSDMRTEIIRMMESARPEDVSFQGRGVFELSDEKAASPIVPSFDMSNEGSYFFGDNAEEYDN>gi|GENSCAN_predicted_CDS_27|1497_bpatggcgtcccaaggcaccaaacggtcttacgaacagatggagactgatggagaacgccagaatgccactgaaatcagagcatccgtcggaaaaatgattggtggaattggacgattctacatccaaatgtgcacagaacttaaactcagtgattatgagggacggttgatccaaaacagcttaacaatagagagaatggtgctctctgcttttgacgaaaggagaaataaatacctggaagaacatcccagtgcggggaaggatcctaagaaaactggaggacctatatacagaagagtaaacggaaagtggatgagagaactcatcctttatgacaaagaagaaataaggcgaatctggcgccaagctaataatggtgacgatgcaacggctggtctgactcacatgatgatctggcattccaatttgaatgatgcaacttatcagaggacaagggctcttgttcgcaccggaatggatcccaggatgtgctctctgatgcaaggttcaactctccctaggaggtctggagccgcaggtgctgcagtcaaaggagttggaacaatggtgatggaattggtcaggatgatcaaacgtgggatcaatgatcggaacttctggaggggtgagaatggacgaaaaacaagaattgcttatgaaagaatgtgcaacattctcaaagggaaatttcaaactgctgcacaaaaagcaatgatggatcaagtgagagagagccgggacccagggaatgctgagttcgaagatctcacttttctagcacggtctgcactcatattgagagggtcggttgctcacaagtcctgcctgcctgcctgtgtgtatggacctgccgtagccagtgggtacgactttgaaagagagggatactctctagtcggaatagaccctttcagactgcttcaaaacagccaagtgtacagcctaatcagaccaaatgagaatccagcacacaagagtcaactggtgtggatggcatgccattctgccgcatttgaagatctaagagtattgagcttcatcaaagggacgaaggtggtcccaagagggaagctttccactagaggagttcaaattgcttccaatgaaaatatggagactatggaatcaagtacacttgaactgagaagcaggtactgggccataaggaccagaagtggaggaaacaccaatcaacagagggcatctgcgggccaaatcagcatacaacctacgttctcagtacagagaaatctcccttttgacagaacaaccgttatggcagcattcactgggaatacagaggggagaacatctgacatgagga
ccgaaatcataaggatgatggaaagtgcaagaccagaagatgtgtctttccaggggcggggagtcttcgagctctcggacgaaaaggcagcgagcccgatcgtgccttcctttgacatgagtaatgaaggatcttatttcttcggagacaatgcagaggagtacgacaattaa>gi|GENSCAN_predicted_peptide_28|711_aaMKANLLVLLCALAAADADTICIGYHANNSTDTVDTVLEKNVTVTHSVNLLEDSHNGKLCRLKGIAPLQLGKCNIAGWLLGNPECDPLLPVRSWSYIVETPNSENGICYPGDFIDYEELREQLSSVSSFERFEIFPKESSWPNHNTTKGVTAACSHAGKSSFYRNLLWLTEKEGSYPKLKNSYVNKKGKEVLVLWGIHHPSNSKDQQNIYQNENAYVSVVTSNYNRRFTPEIAERPKVRDQAGRMNYYWTLLKPGDTIIFEANGNLIAPRYAFALSRGFGSGIITS


NASMHECNTKCQTPLGAINSSLPFQNIHPVTIGECPKYVRSAKLRMVTGLRNIPSIQSRGLFGAIAGFIEGGWTGMIDGWYGYHHQNEQGSGYAADQKSTQNAINGITNKVNSVIEKMNIQFTAVGKEFNKLEKRMENLNKKVDDGFLDIWTYNAELLVLLENERTLDFHDSNVKNLYEKVKSQLKNNAKEIGNGCFEFYHKCDNECMESVRNGTYDYPKYSEESKLNREKGILGFVFTLTVPSERGLQRRRFVQNALNGNGDPNNMDKAVKLYRKLKREITFHGAKEISLSYSAGALASCMGLIYNRMGAVTTEVAFGLVCATCEQIADSQHRSHRQMVTTTNPLIRHENRMVLASTTAKAMEQMAGSSEQAAEAMEVASQARQMVQAMRTIGTHPSSSAGLKNDLLENLQAYQKRMGVQMQRFK>gi|GENSCAN_predicted_CDS_28|2136_bpatgaaggcaaacctactggtcctgttatgtgcacttgcagctgcagatgcagacacaatatgtataggctaccatgcgaacaattcaaccgacactgttgacacagtgctcgagaagaatgtgacagtgacacactctgttaacctgctcgaagacagccacaacggaaaactatgtagattaaaaggaatagccccactacaattggggaaatgtaacatcgccggatggctcttgggaaacccagaatgcgacccactgcttccagtgagatcatggtcctacattgtagaaacaccaaactctgagaatggaatatgttatccaggagatttcatcgactatgaggagctgagggagcaattgagctcagtgtcatcattcgaaagattcgaaatatttcccaaagaaagctcatggcccaaccacaacacaaccaaaggagtaacggcagcatgctcccatgcggggaaaagcagtttttacagaaatttgctatggctgacggagaaggagggctcatacccaaagctgaaaaattcttatgtgaacaagaaagggaaagaagtccttgtactgtggggtattcatcacccgtctaacagtaaggatcaacagaatatctatcagaatgaaaatgcttatgtctctgtagtgacttcaaattataacaggagatttaccccggaaatagcagaaagacccaaagtaagagatcaagctgggaggatgaactattactggaccttgctaaaacccggagacacaataatatttgaggcaaatggaaatctaatagcaccaaggtatgctttcgcactgagtagaggctttgggtccggcatcatcacctcaaacgcatcaatgcatgagtgtaacacgaagtgtcaaacacccctgggagctataaacagcagtctccctttccagaatatacacccagtcacaataggagagtgcccaaaatacgtcaggagtgccaaattgaggatggttacaggactaaggaacattccgtccattcaatccagaggtctatttggagccattgccggttttattgaagggggatggactggaatgatagatggatggtacggttatcatcatcagaatgaacagggatcaggctatgcagcggatcaaaaaagcacacaaaatgccattaacgggattacaaacaaggtgaactctgttatcgagaaaatgaacattcaattcacagctgtgggtaaagaattcaacaaattagaaaaaaggatggaaaatttaaataaaaaagttgatgatggatttctggacatttggacatataatgcagaattgttagttctactggaaaatgaaaggactctggatttccatgactcaaatgtgaagaatctgtatgagaaagtaaaaagccaattaaagaataatgccaaagaaatcggaaatggatgttttgagttctaccacaagtgtgacaatgaatgcatggaaagtgtaagaaatgggacttatgattatcccaaatattcagaagagtcaaagttgaa
cagggaaaaggggattttaggatttgtgttcacgctcaccgtgcccagtgagcgaggactgcagcgtagacgctttgtccaaaatgcccttaatgggaacggggatccaaataacatggacaaagcagttaaactgtataggaagctcaagagggagataacattccatggggccaaagaaatctcactcagttattctgctggtgcacttgccagttgtatgggcctcatatacaacaggatgggggctgtgaccactgaagtggcatttggcctggtatgtgcaacctgtgaacagattgctgactcccagcatcggtctcataggcaaatggtgacaacaaccaacccactaatcagacatgagaacagaatggttttagccagcactacagctaaggctatggagcaaatggctggatcgagtgagcaagcagcagaggccatggaggttgctagtcaggctaggcaaatggtgcaagcgatgagaaccattgggactcatcctagctccagtgctggtctgaaaaatgatcttcttgaaaatttgcaggcctatcagaaacgaatgggggtgcagatgcaacggttcaagtga

Explanation

Gn.Ex : gene number, exon number (for reference)
Type  : Init = Initial exon (ATG to 5' splice site)
        Intr = Internal exon (3' splice site to 5' splice site)
        Term = Terminal exon (3' splice site to stop codon)
        Sngl = Single-exon gene (ATG to stop)
        Prom = Promoter (TATA box / initiation site)
        PlyA = poly-A signal (consensus: AATAAA)
S     : DNA strand (+ = input strand; - = opposite strand)
Begin : beginning of exon or signal (numbered on input strand)
End   : end point of exon or signal (numbered on input strand)
Len   : length of exon or signal (bp)
Fr    : reading frame (a forward strand codon ending at x has frame x mod 3)
Ph    : net phase of exon (exon length modulo 3)
I/Ac  : initiation signal or 3' splice site score (tenth bit units)
Do/T  : 5' splice site or termination signal score (tenth bit units)
CodRg : coding region score (tenth bit units)
P     : probability of exon (sum over all parses containing exon)
Tscr  : exon score (depends on length, I/Ac, Do/T and CodRg scores)
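
As a small illustration of the Len and Ph columns, and of how the peptide and CDS lengths in the GENSCAN FASTA headers relate to each other, the following Python sketch may help. The lengths are copied from headers in the output above; the function names are hypothetical, and the length check assumes that each predicted CDS includes one trailing stop codon.

```python
import re

# Lengths copied from GENSCAN FASTA headers in this report:
# gene number -> (peptide length in aa, CDS length in bp)
PREDICTIONS = {
    20: (719, 2160),
    21: (709, 2130),
    26: (230, 693),
    27: (498, 1497),
}


def exon_phase(length_bp):
    """Net phase as defined in the legend: length modulo 3."""
    return length_bp % 3


def expected_cds_length(peptide_aa):
    """CDS length in bp for a peptide, assuming one trailing stop codon."""
    return 3 * (peptide_aa + 1)


def parse_header(header):
    """Split a header such as 'gi|GENSCAN_predicted_CDS_26|693_bp'
    into (kind, gene number, length)."""
    pattern = r"gi\|GENSCAN_predicted_(peptide|CDS)_(\d+)\|(\d+)_(aa|bp)"
    match = re.fullmatch(pattern, header)
    if match is None:
        raise ValueError(f"unrecognised header: {header!r}")
    return match.group(1), int(match.group(2)), int(match.group(3))


if __name__ == "__main__":
    for gene, (aa, bp) in sorted(PREDICTIONS.items()):
        consistent = expected_cds_length(aa) == bp
        print(gene, aa, bp, exon_phase(bp), consistent)
```

A complete coding sequence has a length divisible by 3, so its net phase is 0; a mismatch between the two header lengths would suggest a truncated or mis-copied sequence.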

[The graphical view (PDF) of the GENSCAN output is to be inserted here.]


Table showing the genomes of the different strains, with the number of base pairs in each segment

SEGMENT      H1N1    H2N2    H3N2    H9N2    H5N1

SEGMENT 1    2341    2341    2341    2341    2341
SEGMENT 2    2341    2341    2341    2328    2341
SEGMENT 3    2233    2233    2233    2225    2233
SEGMENT 4    1778    1773    1762    1714    1760
SEGMENT 5    1565    1497    1566    1557    1565
SEGMENT 6    1413    1410    1467    1418    1458
SEGMENT 7    1027    1027    1027    1025    1027
SEGMENT 8     890     838     890     890     865
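
The segment lengths in the table above can be summed to compare total genome sizes. The Python sketch below does this with the values copied from the table; the names `SEGMENT_LENGTHS`, `genome_length` and `variable_segments` are illustrative and do not come from any tool used in this project.

```python
# Segment lengths in base pairs (segments 1 to 8), copied from the table above.
SEGMENT_LENGTHS = {
    "H1N1": [2341, 2341, 2233, 1778, 1565, 1413, 1027, 890],
    "H2N2": [2341, 2341, 2233, 1773, 1497, 1410, 1027, 838],
    "H3N2": [2341, 2341, 2233, 1762, 1566, 1467, 1027, 890],
    "H9N2": [2341, 2328, 2225, 1714, 1557, 1418, 1025, 890],
    "H5N1": [2341, 2341, 2233, 1760, 1565, 1458, 1027, 865],
}


def genome_length(strain):
    """Total number of base pairs over all eight segments of a strain."""
    return sum(SEGMENT_LENGTHS[strain])


def variable_segments():
    """Segment numbers (1 to 8) whose length differs between strains."""
    numbers = []
    for index in range(8):
        lengths = {segments[index] for segments in SEGMENT_LENGTHS.values()}
        if len(lengths) > 1:
            numbers.append(index + 1)
    return numbers


if __name__ == "__main__":
    for strain in SEGMENT_LENGTHS:
        print(strain, genome_length(strain))
    print("segments with differing lengths:", variable_segments())
```

With these values, segment 1 is the only segment with the same length in all five strains.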

Modelling

We modelled the proteins of the H5N1 strain using the SWISS-MODEL server.
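
Homology modelling needs a structural template for each target protein, and BLAST hits against PDB entries, such as those tabulated in this section, are one common way to shortlist candidates. The Python sketch below is illustrative only: it assumes the hits are in the 12-column BLAST tabular layout, the names `parse_hit` and `best_hits` and the E-value cut-off of 1e-5 are assumptions made here, and it does not describe how the SWISS-MODEL server itself chooses templates. The sample rows are copied from the BLAST results in this report.

```python
from collections import namedtuple

# One row of 12-column BLAST tabular output.
Hit = namedtuple(
    "Hit",
    "query subject identity length mismatches gaps "
    "q_start q_end s_start s_end evalue bitscore",
)

SAMPLE_ROWS = [
    "gi|73921266|ref|YP_308668.1| pdb|1NMB|N 44.51 474 248 9 1 465 1 468 6.00E-111 396",
    "gi|73921266|ref|YP_308668.1| pdb|5NN9| 48.81 379 188 5 91 465 10 386 2.00E-102 367",
    "gi|73852953|ref|YP_308667.1| pdb|1I7A|D 25.47 106 73 2 254 357 3 104 0.88 30.4",
    "gi|73852957|ref|YP_308671.1| pdb|1EA3|B 95.12 164 8 0 1 164 1 164 3.00E-87 316",
    "gi|90093247|ref|YP_529486.1| pdb|1JSM|A 94.14 324 19 0 1 324 1 324 0 645",
]


def parse_hit(line):
    """Convert one whitespace-separated row into a Hit record."""
    f = line.split()
    if len(f) != 12:
        raise ValueError(f"expected 12 columns, got {len(f)}")
    return Hit(f[0], f[1], float(f[2]), *(int(x) for x in f[3:10]),
               float(f[10]), float(f[11]))


def best_hits(lines, max_evalue=1e-5):
    """Return the highest bit-score hit per query, skipping weak hits.

    max_evalue is an assumed cut-off, not a value taken from the report.
    """
    best = {}
    for line in lines:
        if not line.strip():
            continue
        hit = parse_hit(line)
        if hit.evalue > max_evalue:
            continue
        current = best.get(hit.query)
        if current is None or hit.bitscore > current.bitscore:
            best[hit.query] = hit
    return best


if __name__ == "__main__":
    for query, hit in best_hits(SAMPLE_ROWS).items():
        print(query, hit.subject, hit.identity, hit.evalue)
```

Ranking by bit score favours long, high-scoring alignments; a query whose only hits have large E values, such as the third sample row, is left without a candidate template.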

BLAST results for the H5N1 strain:

Columns: query sequence | subject | % identity | alignment length | mismatches | gap openings | q start | q end | s start | s end | E value | bit score

gi|73921266|ref|YP_308668.1| pdb|1NMB|N 44.51 474 248 9 1 465 1 468 6.00E-111 396

gi|73921266|ref|YP_308668.1| pdb|5NN9| 48.81 379 188 5 91 465 10 386 2.00E-102 367

gi|73921266|ref|YP_308668.1| pdb|1XOG|A 48.81 379 188 5 91 465 9 385 5.00E-102 366

gi|73921266|ref|YP_308668.1| pdb|1L7F|A 48.81 379 188 5 91 465 10 386 5.00E-102 366

gi|73921266|ref|YP_308668.1| pdb|4NN9| 48.81 379 188 5 91 465 10 386 5.00E-102 366

gi|73921266|ref|YP_308668.1| pdb|1NCC|N 48.81 379 188 5 91 465 11 387 5.00E-102 366

gi|73921266|ref|YP_308668.1| pdb|1NCA|N 48.81 379 188 5 91 465 11 387 5.00E-102 366

gi|73921266|ref|YP_308668.1| pdb|1NMA|N 48.28 379 190 5 91 465 10 386 8.00E-102 365

gi|73921266|ref|YP_308668.1| pdb|1NCD|N 48.28 379 190 5 91 465 11 387 8.00E-102 365

gi|73921266|ref|YP_308668.1| pdb|1L7H|A 48.55 379 189 5 91 465 10 386 1.00E-101 365

gi|73921266|ref|YP_308668.1| pdb|1W20|D 48.41 378 188 5 91 463 10 385 2.00E-101 364

gi|73921266|ref|YP_308668.1| pdb|1NCB|N 48.55 379 189 5 91 465 11 387 2.00E-101 364


gi|73921266|ref|YP_308668.1| pdb|6NN9| 48.55 379 189 5 91 465 10 386 2.00E-101 364

gi|73921266|ref|YP_308668.1| pdb|3NN9| 48.55 379 189 5 91 465 10 386 2.00E-101 364

gi|73921266|ref|YP_308668.1| pdb|1INY| 48.55 379 189 5 91 465 10 386 2.00E-101 364

gi|73921266|ref|YP_308668.1| pdb|1L7G|A 48.55 379 189 5 91 465 10 386 3.00E-101 363

gi|73921266|ref|YP_308668.1| pdb|2AEQ|A 46.7 379 189 7 92 463 18 390 4.00E-95 343

gi|73921266|ref|YP_308668.1| pdb|2BAT| 45.93 381 193 7 92 465 11 385 4.00E-95 343

gi|73921266|ref|YP_308668.1| pdb|1INX| 45.93 381 193 7 92 465 11 385 6.00E-95 343

gi|73921266|ref|YP_308668.1| pdb|1VCJ|A 36.36 352 203 10 116 455 37 379 7.00E-59 223

gi|73921266|ref|YP_308668.1| pdb|1B9V|A 36.36 352 203 10 116 455 38 380 7.00E-59 223

gi|73921266|ref|YP_308668.1| pdb|1INF| 36.36 352 203 10 116 455 38 380 7.00E-59 223

gi|73921266|ref|YP_308668.1| pdb|1A4Q|B 35.9 351 206 9 116 455 38 380 3.00E-58 221

gi|73921266|ref|YP_308668.1| pdb|2AZD|B 50 22 11 0 294 315 159 180 0.22 32.3

gi|73921266|ref|YP_308668.1| pdb|1QM5|B 50 22 11 0 294 315 159 180 0.22 32.3

gi|73921266|ref|YP_308668.1| pdb|2ECP|B 50 22 11 0 294 315 159 180 0.22 32.3

gi|73921266|ref|YP_308668.1| pdb|1AHP|B 50 22 11 0 294 315 160 181 0.22 32.3

gi|73921266|ref|YP_308668.1| pdb|1U8C|B 25 120 70 4 226 345 496 595 0.28 32

gi|73921266|ref|YP_308668.1| pdb|1SSK|A 31.75 63 42 1 146 207 76 138 1.1 30

gi|73921266|ref|YP_308668.1| pdb|1EGI|B 41.67 24 14 0 422 445 72 95 3.1 28.5

gi|73921266|ref|YP_308668.1| pdb|1EGZ|C 30.3 66 35 3 354 410 9 72 5.3 27.7

gi|73921266|ref|YP_308668.1| pdb|1WB0|A 28.75 80 51 2 328 407 263 336 9.1 26.9

gi|73921266|ref|YP_308668.1| pdb|1HKM|A 28.75 80 51 2 328 407 263 336 9.1 26.9

gi|73921266|ref|YP_308668.1| pdb|1HKK|A 28.75 80 51 2 328 407 263 336 9.1 26.9

gi|73921266|ref|YP_308668.1| pdb|1LQ0|A 28.75 80 51 2 328 407 263 336 9.1 26.9

gi|73921266|ref|YP_308668.1| pdb|1GUV|A 28.75 80 51 2 328 407 263 336 9.1 26.9

gi|73852953|ref|YP_308667.1| pdb|1I7A|D 25.47 106 73 2 254 357 3 104 0.88 30.4

gi|73852953|ref|YP_308667.1| pdb|1TBG|H 32.73 55 35 1 41 95 6 58 5.7 27.7

gi|73852953|ref|YP_308667.1| pdb|1B9Y|B 32.73 55 35 1 41 95 6 58 5.7 27.7

gi|73852953|ref|YP_308667.1| pdb|1A0R|G 32.73 55 35 1 41 95 5 57 5.7 27.7

gi|73852953|ref|YP_308667.1| pdb|1GOT|G 32.73 55 35 1 41 95 13 65 5.7 27.7

gi|73852947|ref|YP_308664.1| pdb|1W1W|D 21.88 128 85 4 106 231 173 287 0.83 31.2

gi|73852947|ref|YP_308664.1| pdb|1GTM|C 20.65 92 72 1 149 240 274 364 4.1 28.9

gi|73852947|ref|YP_308664.1| pdb|1EUZ|F 20.21 94 74 1 149 242 274 366 5.3 28.5

gi|73852947|ref|YP_308664.1| pdb|1J0W|B 28.26 46 33 0 503 548 1 46 5.3 28.5

gi|73852947|ref|YP_308664.1| pdb|1BVU|F 21.74 92 71 1 149 240 273 363 7 28.1

gi|73852947|ref|YP_308664.1| pdb|1BBW|A 22.78 79 59 1 166 242 33 111 9.1 27.7

gi|73852947|ref|YP_308664.1| pdb|1KRT| 22.78 79 59 1 166 242 5 83 9.1 27.7

gi|73852957|ref|YP_308671.1| pdb|1EA3|B 95.12 164 8 0 1 164 1 164 3.00E-87 316

gi|73852957|ref|YP_308671.1| pdb|1AA7|B 94.94 158 8 0 1 158 1 158 3.00E-83 303

gi|73852957|ref|YP_308671.1| pdb|2CRL|A 28.24 85 52 3 118 199 2 80 4.1 26.9

gi|73852957|ref|YP_308671.1| pdb|1QGE|D 36.36 44 27 1 178 221 70 112 9.1 25.8

gi|73852957|ref|YP_308671.1| pdb|1CVL| 36.36 44 27 1 178 221 70 112 9.1 25.8

gi|73852957|ref|YP_308671.1| pdb|1TAH|D 36.36 44 27 1 178 221 69 111 9.1 25.8

gi|90093247|ref|YP_529486.1| pdb|1JSM|A 94.14 324 19 0 1 324 1 324 0 645

gi|90093247|ref|YP_529486.1| pdb|1RUY|L 58.02 324 135 1 1 323 5 328 2.00E-118 420

gi|90093247|ref|YP_529486.1| pdb|1RVT|L 57.72 324 136 1 1 323 5 328 2.00E-118 420

gi|90093247|ref|YP_529486.1| pdb|1RV0|L 57.72 324 136 1 1 323 5 328 2.00E-118 420

gi|90093247|ref|YP_529486.1| pdb|1RUZ|L 58.02 324 135 1 1 323 5 328 5.00E-117 416

gi|90093247|ref|YP_529486.1| pdb|1RD8|E 57.49 327 138 1 1 326 9 335 6.00E-117 415

gi|90093247|ref|YP_529486.1| pdb|1RVZ|K 58.02 324 134 2 1 323 5 327 6.00E-115 409

gi|90093247|ref|YP_529486.1| pdb|1RU7|K 58.02 324 134 2 1 323 5 327 6.00E-115 409

gi|90093247|ref|YP_529486.1| pdb|1JSD|A 44.89 323 172 2 1 323 1 317 7.00E-83 302

gi|90093247|ref|YP_529486.1| pdb|1MQN|G 38.27 324 193 6 3 326 13 329 1.00E-62 235

gi|90093247|ref|YP_529486.1| pdb|4HMG|E 38.82 322 190 6 3 324 13 327 1.00E-61 231

gi|90093247|ref|YP_529486.1| pdb|1KEN|E 38.51 322 191 6 3 324 13 327 8.00E-61 229


gi|90093247|ref|YP_529486.1| pdb|1HA0|A 38.51 322 191 6 3 324 5 319 8.00E-61 229

gi|90093247|ref|YP_529486.1| pdb|2HMG|E 38.51 322 191 6 3 324 13 327 8.00E-61 229

gi|90093247|ref|YP_529486.1| pdb|1HGD|E 38.51 322 191 6 3 324 13 327 1.00E-60 228

gi|90093247|ref|YP_529486.1| pdb|2VIU|A 38.51 322 191 6 3 324 13 327 1.00E-60 228

gi|90093247|ref|YP_529486.1| pdb|1TI8|A 37.46 323 190 8 3 322 1 314 3.00E-60 227

gi|90093247|ref|YP_529486.1| pdb|2VIT|C 37.72 289 173 6 18 306 1 282 7.00E-51 196

gi|90093247|ref|YP_529486.1| pdb|2VIS|C 37.72 289 173 6 18 306 1 282 7.00E-51 196

gi|90093247|ref|YP_529486.1| pdb|2VIR|C 37.37 289 174 6 18 306 1 282 5.00E-50 193

gi|90093247|ref|YP_529486.1| pdb|1EAG|A 28.36 67 44 1 12 74 209 275 0.24 31.6

gi|90093247|ref|YP_529486.1| pdb|1KXP|D 25.32 79 58 1 37 114 265 343 0.41 30.8

gi|90093247|ref|YP_529486.1| pdb|1MA9|A 25.32 79 58 1 37 114 265 343 0.41 30.8

gi|90093247|ref|YP_529486.1| pdb|1LOT|A 25.32 79 58 1 37 114 265 343 0.41 30.8

gi|90093247|ref|YP_529486.1| pdb|2CSE|U 31.82 44 30 0 91 134 75 118 0.91 29.6

gi|90093247|ref|YP_529486.1| pdb|1TBM|B 40 30 18 0 45 74 136 165 1.6 28.9

gi|90093247|ref|YP_529486.1| pdb|1XP2|C 22.99 87 67 0 111 197 8 94 2.7 28.1

gi|90093247|ref|YP_529486.1| pdb|2AIV|A 34.09 44 25 2 90 130 94 136 4.5 27.3

gi|90093247|ref|YP_529486.1| pdb|1WMR|B 28.07 57 38 1 11 64 464 520 4.5 27.3

gi|90093247|ref|YP_529486.1| pdb|1AOV| 34.09 44 29 0 233 276 74 117 5.9 26.9

gi|73852955|ref|YP_308669.1| pdb|1JSM|A 94.14 324 19 0 17 340 1 324 0 645

gi|73852955|ref|YP_308669.1| pdb|1HA0|A 44.11 501 269 7 19 519 5 494 7.00E-125 442

gi|73852955|ref|YP_308669.1| pdb|1RVT|L 57.54 325 137 1 16 339 4 328 4.00E-118 420

gi|73852955|ref|YP_308669.1| pdb|1RUY|L 58.02 324 135 1 17 339 5 328 4.00E-118 420

gi|73852955|ref|YP_308669.1| pdb|1RV0|L 57.72 324 136 1 17 339 5 328 5.00E-118 420

gi|73852955|ref|YP_308669.1| pdb|1RUZ|L 57.85 325 136 1 16 339 4 328 7.00E-117 416

gi|73852955|ref|YP_308669.1| pdb|1RD8|E 57.49 327 138 1 17 342 9 335 1.00E-116 415

gi|73852955|ref|YP_308669.1| pdb|1RVZ|K 57.85 325 135 2 16 339 4 327 8.00E-115 409

gi|73852955|ref|YP_308669.1| pdb|1RU7|K 58.02 324 134 2 17 339 5 327 1.00E-114 409

gi|73852955|ref|YP_308669.1| pdb|1JSM|B 98.3 176 3 0 347 522 1 176 1.00E-101 365

gi|73852955|ref|YP_308669.1| pdb|1RD8|F 82.49 177 31 0 347 523 1 177 1.00E-90 329

gi|73852955|ref|YP_308669.1| pdb|1RUZ|M 83.75 160 26 0 347 506 1 160 5.00E-83 303

gi|73852955|ref|YP_308669.1| pdb|1RUY|M 83.75 160 26 0 347 506 1 160 5.00E-83 303

gi|73852955|ref|YP_308669.1| pdb|1RVT|M 84.38 160 25 0 347 506 1 160 6.00E-83 303

gi|73852955|ref|YP_308669.1| pdb|1JSD|A 44.89 323 172 2 17 339 1 317 1.00E-82 302

gi|73852955|ref|YP_308669.1| pdb|1RVZ|L 81.88 160 29 0 347 506 1 160 2.00E-81 298

gi|73852955|ref|YP_308669.1| pdb|1MQN|H 51.8 222 106 1 347 568 1 221 3.00E-68 254

gi|73852955|ref|YP_308669.1| pdb|1JSD|B 59.66 176 71 0 347 522 1 176 3.00E-64 241

gi|73852955|ref|YP_308669.1| pdb|1MQN|G 38.27 324 193 6 19 342 13 329 2.00E-62 235

gi|73852955|ref|YP_308669.1| pdb|4HMG|E 38.82 322 190 6 19 340 13 327 2.00E-61 231

gi|73852955|ref|YP_308669.1| pdb|1KEN|E 38.51 322 191 6 19 340 13 327 1.00E-60 229

gi|73852955|ref|YP_308669.1| pdb|2HMG|E 38.51 322 191 6 19 340 13 327 1.00E-60 229

gi|73852955|ref|YP_308669.1| pdb|1HGD|E 38.51 322 191 6 19 340 13 327 2.00E-60 228

gi|73852955|ref|YP_308669.1| pdb|2VIU|A 38.51 322 191 6 19 340 13 327 3.00E-60 228

gi|73852955|ref|YP_308669.1| pdb|1TI8|A 37.46 323 190 8 19 338 1 314 6.00E-60 227

gi|73852955|ref|YP_308669.1| pdb|1KEN|F 56 175 77 0 347 521 1 175 6.00E-60 227

gi|73852955|ref|YP_308669.1| pdb|5HMG|F 55.43 175 78 0 347 521 1 175 4.00E-59 224

gi|73852955|ref|YP_308669.1| pdb|1TI8|B 55.88 170 75 0 347 516 1 170 6.00E-57 217

gi|73852955|ref|YP_308669.1| pdb|2VIT|C 37.72 289 173 6 34 322 1 282 1.00E-50 196

gi|73852955|ref|YP_308669.1| pdb|2VIS|C 37.72 289 173 6 34 322 1 282 1.00E-50 196

gi|73852955|ref|YP_308669.1| pdb|2VIR|C 37.37 289 174 6 34 322 1 282 9.00E-50 193

gi|73852955|ref|YP_308669.1| pdb|1QU1|F 52 150 72 0 377 526 1 150 7.00E-44 173

gi|73852955|ref|YP_308669.1| pdb|1HTM|F 51.82 137 66 0 385 521 2 138 1.00E-40 162

gi|73852955|ref|YP_308669.1| pdb|1XOP|A 84.21 19 3 0 348 366 2 20 8.00E-04 40.8

gi|73852955|ref|YP_308669.1| pdb|1XOO|A 84.21 19 3 0 348 366 2 20 8.00E-04 40.8

gi|73852955|ref|YP_308669.1| pdb|1IBO|A 84.21 19 3 0 348 366 2 20 8.00E-04 40.8

gi|73852955|ref|YP_308669.1| pdb|2DCI|A 80 20 4 0 347 366 1 20 0.006 37.7

gi|73852955|ref|YP_308669.1| pdb|1EAG|A 28.36 67 44 1 28 90 209 275 0.46 31.6

gi|73852955|ref|YP_308669.1| pdb|1KXP|D 25.32 79 58 1 53 130 265 343 0.78 30.8

gi|73852955|ref|YP_308669.1| pdb|1MA9|A 25.32 79 58 1 53 130 265 343 0.78 30.8

gi|73852955|ref|YP_308669.1| pdb|1LOT|A 25.32 79 58 1 53 130 265 343 0.78 30.8

gi|73852955|ref|YP_308669.1| pdb|2CSE|U 31.82 44 30 0 107 150 75 118 1.7 29.6

gi|73852955|ref|YP_308669.1| pdb|1TBM|B 40 30 18 0 61 90 136 165 3 28.9

gi|73852955|ref|YP_308669.1| pdb|1XP2|C 22.99 87 67 0 127 213 8 94 5.1 28.1

gi|73852955|ref|YP_308669.1| pdb|1P97|A 29.07 86 57 3 448 529 3 88 5.1 28.1

gi|73852955|ref|YP_308669.1| pdb|2BIB|A 27.91 43 31 0 344 386 402 444 6.6 27.7

gi|73852955|ref|YP_308669.1| pdb|1YDU|A 38.46 39 22 1 392 428 2 40 6.6 27.7

gi|73852955|ref|YP_308669.1| pdb|2AIV|A 34.09 44 25 2 106 146 94 136 8.6 27.3

gi|73852955|ref|YP_308669.1| pdb|1WMR|B 28.07 57 38 1 27 80 464 520 8.6 27.3

gi|90093248|ref|YP_529487.1| pdb|1JSM|B 98.3 176 3 0 1 176 1 176 3.00E-102 365

gi|90093248|ref|YP_529487.1| pdb|1RD8|F 82.49 177 31 0 1 177 1 177 3.00E-91 329

gi|90093248|ref|YP_529487.1| pdb|1RUZ|M 83.75 160 26 0 1 160 1 160 1.00E-83 303

gi|90093248|ref|YP_529487.1| pdb|1RUY|M 83.75 160 26 0 1 160 1 160 1.00E-83 303

gi|90093248|ref|YP_529487.1| pdb|1RVT|M 84.38 160 25 0 1 160 1 160 2.00E-83 303

gi|90093248|ref|YP_529487.1| pdb|1RVZ|L 81.88 160 29 0 1 160 1 160 6.00E-82 298

gi|90093248|ref|YP_529487.1| pdb|1MQN|H 51.8 222 106 1 1 222 1 221 1.00E-68 254

gi|90093248|ref|YP_529487.1| pdb|1JSD|B 59.66 176 71 0 1 176 1 176 9.00E-65 241

gi|90093248|ref|YP_529487.1| pdb|1KEN|F 56 175 77 0 1 175 1 175 2.00E-60 227

gi|90093248|ref|YP_529487.1| pdb|5HMG|F 55.43 175 78 0 1 175 1 175 1.00E-59 224

gi|90093248|ref|YP_529487.1| pdb|1HA0|A 56.07 173 76 0 1 173 322 494 1.00E-59 224

gi|90093248|ref|YP_529487.1| pdb|1TI8|B 55.88 170 75 0 1 170 1 170 2.00E-57 217

gi|90093248|ref|YP_529487.1| pdb|1QU1|F 52 150 72 0 31 180 1 150 2.00E-44 173

gi|90093248|ref|YP_529487.1| pdb|1HTM|F 51.82 137 66 0 39 175 2 138 4.00E-41 162

gi|90093248|ref|YP_529487.1| pdb|1XOP|A 84.21 19 3 0 2 20 2 20 2.00E-04 40.8

gi|90093248|ref|YP_529487.1| pdb|1XOO|A 84.21 19 3 0 2 20 2 20 2.00E-04 40.8

gi|90093248|ref|YP_529487.1| pdb|1IBO|A 84.21 19 3 0 2 20 2 20 2.00E-04 40.8

gi|90093248|ref|YP_529487.1| pdb|2DCI|A 80 20 4 0 1 20 1 20 0.002 37.7

gi|90093248|ref|YP_529487.1| pdb|1P97|A 29.07 86 57 3 102 183 3 88 1.5 28.1

gi|90093248|ref|YP_529487.1| pdb|1YDU|A 38.46 39 22 1 46 82 2 40 2 27.7

gi|90093248|ref|YP_529487.1| pdb|2BIB|A 31.43 35 24 0 6 40 410 444 2.6 27.3

gi|90093248|ref|YP_529487.1| pdb|2BIB|A 27.27 33 24 0 7 39 471 503 5.9 26.2

gi|90093248|ref|YP_529487.1| pdb|1MG7|B 23.49 149 88 6 72 211 6 137 3.4 26.9

gi|90093248|ref|YP_529487.1| pdb|2PHL|C 34 50 27 2 59 102 215 264 3.4 26.9

gi|90093248|ref|YP_529487.1| pdb|1FDO| 29.03 93 57 3 116 206 237 322 5.9 26.2

gi|90093248|ref|YP_529487.1| pdb|1L8W|D 27.06 85 53 2 23 107 3 78 5.9 26.2

gi|90093248|ref|YP_529487.1| pdb|1K7G|A 36 25 16 0 9 33 153 177 7.7 25.8

gi|90093248|ref|YP_529487.1| pdb|1YV0|I 20.41 49 39 0 38 86 46 94 7.7 25.8

gi|90093248|ref|YP_529487.1| pdb|1YTZ|I 20.41 49 39 0 38 86 46 94 7.7 25.8

gi|90093248|ref|YP_529487.1| pdb|1K7Q|A 36 25 16 0 9 33 153 177 7.7 25.8

gi|90093248|ref|YP_529487.1| pdb|1K7I|A 36 25 16 0 9 33 153 177 7.7 25.8

gi|90093248|ref|YP_529487.1| pdb|1GO8|P 36 25 16 0 9 33 136 160 7.7 25.8

gi|90093248|ref|YP_529487.1| pdb|1GO7|P 36 25 16 0 9 33 136 160 7.7 25.8

gi|90093248|ref|YP_529487.1| pdb|1PSZ|A 36.11 36 23 0 21 56 1 36 7.7 25.8

gi|90093248|ref|YP_529487.1| pdb|1DAB|A 52.38 21 9 1 4 23 243 263 7.7 25.8

gi|90093248|ref|YP_529487.1| pdb|1VIF| 40.74 27 13 1 14 40 22 45 7.7 25.8

gi|90093248|ref|YP_529487.1| pdb|1HNG|B 24.56 57 34 2 172 220 64 119 7.7 25.8

gi|73852944|ref|YP_308673.1| pdb|1AIL| 66.67 72 24 0 1 72 1 72 4.00E-23 103

gi|73852944|ref|YP_308673.1| pdb|1NS1|B 65.28 72 25 0 2 73 2 73 1.00E-22 101

gi|73852944|ref|YP_308673.1| pdb|1SIF|A 41.94 31 18 0 51 81 9 39 6.1 26.2

gi|73852944|ref|YP_308673.1| pdb|1ANV| 46.67 30 13 1 121 147 16 45 6.1 26.2

gi|73852944|ref|YP_308673.1| pdb|1ADT| 46.67 30 13 1 121 147 16 45 6.1 26.2

gi|73852944|ref|YP_308673.1| pdb|1VPK|A 31.58 57 34 2 51 107 323 374 8 25.8

Results of the modelling

Template selection: For modelling, we chose the appropriate template from the list of PDB BLAST hits according to its E-value, bit score, % identity and alignment length.

H5N1

Target: Neuraminidase
Template: PDB id = 1W21
E-value = 1e-96
Bit score = 348
% identity = 47.15%

Target: HA1
Template: PDB id = 1JSM, chain A
E-value = 0
Bit score = 645
% identity = 94.14%

Target: Non-structural protein
Template: PDB id = 1AIL
E-value = 4e-23
Bit score = 103
% identity = 66.67%

Target: Matrix protein
Template: PDB id = 1EA3
E-value = 3e-87
Bit score = 316
% identity = 94.9%
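
The template selection rule can be sketched in Python. This is only a minimal illustration. It assumes the hit list is in the 12-column tabular BLAST layout used in the table above (query, subject, % identity, alignment length, mismatches, gap openings, query start, query end, subject start, subject end, E-value, bit score). The names `Hit`, `parse_hits` and `best_template` are hypothetical, and the ranking order (lowest E-value, then highest bit score, then highest % identity) is one reasonable reading of the stated criteria, not necessarily the exact procedure followed in this project.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    query: str
    subject: str
    identity: float   # % identity
    length: int       # alignment length
    evalue: float
    bitscore: float


def parse_hits(text: str) -> list[Hit]:
    """Parse whitespace-separated, 12-column tabular BLAST lines."""
    hits = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 12:
            continue  # skip blank lines and anything that is not a hit row
        hits.append(Hit(fields[0], fields[1], float(fields[2]), int(fields[3]),
                        float(fields[10]), float(fields[11])))
    return hits


def best_template(hits: list[Hit], query: str) -> Hit:
    """Pick the hit with the lowest E-value; ties go to higher bit score, then identity."""
    candidates = [h for h in hits if h.query == query]
    return min(candidates, key=lambda h: (h.evalue, -h.bitscore, -h.identity))


# Four rows copied from the BLAST table in this report.
SAMPLE = """\
gi|73852955|ref|YP_308669.1| pdb|1JSM|A 94.14 324 19 0 17 340 1 324 0 645
gi|73852955|ref|YP_308669.1| pdb|1HA0|A 44.11 501 269 7 19 519 5 494 7.00E-125 442
gi|73852944|ref|YP_308673.1| pdb|1AIL| 66.67 72 24 0 1 72 1 72 4.00E-23 103
gi|73852944|ref|YP_308673.1| pdb|1NS1|B 65.28 72 25 0 2 73 2 73 1.00E-22 101
"""

hits = parse_hits(SAMPLE)
for q in ("gi|73852955|ref|YP_308669.1|", "gi|73852944|ref|YP_308673.1|"):
    best = best_template(hits, q)
    print(q, "->", best.subject, best.evalue, best.bitscore, best.identity)
```

With these sample rows, the sketch picks 1JSM chain A for the first query and 1AIL for the second, which agrees with the HA1 and non-structural protein templates listed above.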

Structure of the matrix protein of the H5N1 strain:

Docking:

We found that a het group (NAG, N-acetyl-D-glucosamine) is present as an inhibitor of the neuraminidase protein, and we docked this inhibitor with the protein. After docking, we obtained the energy value (E-value) of each docked complex and chose the best score, that is, the complex with the lowest energy value.

Details of Neuraminidase protein:

PDB id: 7NN9
Name: Hydrolase (O-glycosyl)
Title: Native influenza virus neuraminidase subtype N9 (tern)
Structure: Neuraminidase N9. Chain: null. Synonym: sialidase. EC: 3.2.1.18
Source: Avian influenza virus. Strain: A/tern/Australia/G70C/75
UniProt: P03472 (NRAM_IATRA)
Enzyme class: E.C.3.2.1.18
Reaction: Hydrolysis of alpha-(2->3)-, alpha-(2->6)-, alpha-(2->8)-glycosidic linkages of terminal sialic residues in oligosaccharides, glycoproteins, glycolipids, colominic acid and synthetic substrates.
Functions:
  Cellular component: membrane
  Biological process: carbohydrate metabolism
  Biochemical function: exo-alpha-sialidase activity
Resolution: 2.00 Å
R-factor: 0.152

Structure of NAG:

Het group: NAG

Chemical Formula : C8H15NO6

LIGPLOT of ligand's interactions with protein

 

NAG 200A to MAN 200G

We obtained the following results after docking:

Calculations:

Receptor: 7NN9
Ligand: 1AOH
E-START: 8826.22 kJ/mol
E-END: -609 kJ/mol
Total time taken: 22 min 24 sec

Ligand: 1A3K
E-START: 3.6 kJ/mol
E-END: -240 kJ/mol
Total time taken: 17 min 27 sec

Ligand: 1A3K
E-START: 3.96 kJ/mol
E-END: -239.49 kJ/mol
Total time taken: 17 min 27 sec

Graphical view of docking of Neuraminidase protein (7NN9) with NAG (1A3K)
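
The choice of the best docking result can be illustrated with a short Python sketch. It assumes that the "best score" means the lowest final docking energy (E-END), and it uses the E-END values reported in this section for receptor 7NN9. The dictionary name `final_energy` is hypothetical.

```python
# Final docking energies (E-END, kJ/mol) for receptor 7NN9, as reported in this section.
final_energy = {
    "1AOH": -609.0,
    "1A3K": -239.49,
}

# Assumption: the most favourable result is the one with the lowest final energy.
best_ligand = min(final_energy, key=final_energy.get)
print(best_ligand, final_energy[best_ligand])
```

Under this assumption, the run with 1AOH gives the lower final energy of the two.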

Conclusion

After analysing sequences from different strains of influenza A virus, we conclude that although the strains are all closely related, they show distinctly different pathogenic behaviour, which plays an important role in their survival in different species. It is worth taking a closer look at this by studying the strains at the gene level, and a phylogenetic analysis can be very helpful in understanding the evolutionary pattern.

Based on the current analysis, the different strains appear to have diverged at different levels.

We noticed that the same genes are present in all strains, which suggests that they evolved together.

Influenza viruses change through the well-known processes of antigenic drift and antigenic shift. Our comparison of H5N1 with four other strains suggests that they were related to each other in the past and that these strains may have given rise to one another (for example, H5N1 may have evolved from H1N1 or another strain, or vice versa).

Studies on the ecology of influenza viruses have led to the hypothesis that

all mammalian influenza viruses derive from the avian influenza reservoir.

Once the ongoing gene sequencing project on avian influenza is finished, we hope it will be possible in the near future to draw firm conclusions about the true picture of evolution and to identify the genes responsible for pathogenesis.

A complete inference can only be drawn from a comprehensive list of the gene products and their functions.

To predict the unknown structures of proteins present in the H5N1 strain, we carried out homology modelling. Until now, the structures submitted have been determined by X-ray crystallography or NMR techniques. We take a step forward by presenting a theoretical model built with available online modelling tools.

Our study indicates that the neuraminidase protein, which is encoded by the NA gene, is one of the causes of the pathogenicity of influenza A virus. We therefore docked this protein with an appropriate ligand in order to inhibit its activity, as a basis on which drugs can be developed.

Future prospects

The work presented in this report may be just a stepping stone towards such discoveries. It is a small finding within a much larger problem.

Phylogenetics is the field of biology that deals with identifying and understanding the relationships between the many different kinds of life on earth. This includes methods for collecting and analysing data, as well as the interpretation of those results as new biological information.

With the aid of sequences, it should be possible to find closely related organisms. Experience shows that closely related organisms have similar sequences, while more distantly related organisms have more dissimilar sequences. One objective is to reconstruct the evolutionary relationships between species. Another is to estimate the time of divergence between two organisms since they last shared a common ancestor.

The purpose of modelling is to help drug developers and biotechnologists develop drugs more efficiently and effectively in future by analysing the modelled protein structures.

As new drug targets are identified, new vistas will open for further drug development. The findings of our docking study may be useful in the search for a cure for the infectious disease bird flu, and may also open new avenues for finding other possible drug targets in influenza A virus.

The docking results can be used to design new lead compounds and can thus aid the drug discovery process.

Finally, a similar process can be applied to other pathogens to identify possible therapeutic sites in them. The same method can also be applied to other infectious diseases, so we can look forward to a better, disease-free world.

The work presented is just a small part of a big problem, and a lot of work still needs to be done to establish a reliable phylogenetic relationship and a full-fledged cure for bird flu. We hope that these findings will go a long way and prove fruitful to anyone working in a similar area.

Abbreviation

CSA: Catalytic Site Atlas

EMBOSS: European Molecular Biology Open Software Suite

NCBI: National Center for Biotechnology Information

NDB: Nucleic Acid Database

ORF: Open Reading Frame

OTU: Operational Taxonomic Unit

PDB: Protein Data Bank

PHYLIP: Phylogeny Inference Package

Appendix

PDBsum:- A database of the known 3D structures of proteins and nucleic acids. PDBsum is a pictorial database providing an at-a-glance overview of every macromolecular structure deposited in the Protein Data Bank (PDB). It provides schematic diagrams of the molecules in each structure and of the interactions between them. Entries are accessed by their PDB code (http://www.ebi.ac.uk/thornton-srv/databases/pdbsum/).

Jena Library:- The Jena Library of Biological Macromolecules (JenaLib) is aimed at a better

dissemination of information on three-dimensional biopolymer structures with an emphasis on

visualization and analysis.

It provides access to all structure entries deposited at the Protein Data Bank (PDB) or at the Nucleic

Acid Database (NDB). ( http://www.fli-leibniz.de/IMAGE.html )

CSA (Catalytic Site Atlas):- The Catalytic Site Atlas (CSA) is a database documenting enzyme active

sites and catalytic residues in enzymes of 3D structure.

The Catalytic Site Atlas (CSA) provides catalytic residue annotation for enzymes in the Protein Data

Bank.

The CSA contains 2 types of entry:

1. Original hand-annotated entries, derived from the primary literature. References for these

entries are given.

2. Homologous entries, found by PSI-BLAST alignment (using an e-value cut-off of 0.00005) to

one of the original entries. The equivalent residues, which align in sequence to the catalytic

residues found in the original entry are documented.

CSA Version 2.1.7 ( http://www.ebi.ac.uk/thornton-srv/databases/CSA )

Swiss model

Swiss-Model is an automated homology modelling server developed within the Swiss Institute of Bioinformatics in collaboration between Glaxo and SBG. It makes it easy to submit a target sequence and get back an automatically generated homology model, provided that an empirical structure with >30% sequence identity exists to use as a template. These automated models may be useful, but they will sometimes contain errors that could be avoided if an expert made manual adjustments to the sequence alignment.

SwissPDB Viewer: Swiss-PdbViewer can load and display several molecules simultaneously. Each molecule is loaded into its own layer. Each molecule is composed of groups (i.e. amino acids, nucleotides, substrates...). Each group is composed of atoms, whose coordinates are taken directly from a PDB file.

Swiss-PdbViewer is a free program to display, analyse and manipulate PDB protein structures. In addition to features such as protein superimposition, H-bond detection and amino acid mutation, the program is tightly linked to Swiss-Model, an automated homology modelling server running at the Geneva Biomedical Research Center. This allows a protein primary sequence to be threaded onto a 3D template and the homology to be analysed. The display options of the program include spacefill, ball & stick, stick and ribbon representations, all of which can be applied simultaneously within one structure model.

SwissPDB Viewer Version 3.7 http://www.expasy.ch/spdbv/text/main.htm

Hex: - Hex is an interactive molecular graphics program for calculating and displaying feasible docking

modes of pairs of protein and DNA molecules. Hex can also calculate small-ligand/protein docking

(provided the ligand is rigid), and it can superpose pairs of molecules using only knowledge of their 3D

shapes.

In Hex's docking calculations, each molecule is modelled using 3D parametric functions which are used to encode both surface shape and electrostatic charge and potential distributions.

Hex Version 4.5 ( http://www.csd.abdn.ac.uk/hex/ )

PHYLIP: (the PHYLogeny Inference Package) is a package of programs for inferring phylogenies

(evolutionary trees). Methods that are available in the package include parsimony, distance matrix, and

likelihood methods, including bootstrapping and consensus trees. Data types that can be handled

include molecular sequences, gene frequencies, restriction sites and fragments, distance matrices, and

discrete characters.

Some sequence analysis programs such as the ClustalW alignment program can write data files in the

PHYLIP format. Most of the programs look for the data in a file called "infile" -- if they do not find this

file they then ask the user to type in the file name of the data file.

Output is written onto special files with names like "outfile" and "outtree". Trees written onto "outtree"

are in the Newick format, an informal standard agreed to in 1986 by authors of a number of major

phylogeny packages.

http://evolution.genetics.washington.edu/phylip
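
As a small illustration of the Newick format mentioned above, the following Python sketch extracts the leaf (taxon) names from a tree string such as one written to "outtree". The tree string and its strain labels are made-up examples rather than output from this project, the function name `leaf_names` is hypothetical, and the regular expression handles only simple, unquoted labels.

```python
import re


def leaf_names(newick: str) -> list[str]:
    """Return the leaf labels of a Newick tree (simple, unquoted labels only)."""
    # A leaf label starts right after "(" or "," and runs up to the next
    # structural character. Labels of internal nodes follow ")" and are skipped.
    found = re.findall(r"[(,]\s*([^(),:;]+)", newick)
    return [name.strip() for name in found if name.strip()]


# Hypothetical tree with branch lengths; labels are for illustration only.
tree = "((H5N1:0.10,H1N1:0.20):0.05,(H3N2:0.15,H2N2:0.12):0.07,H9N2:0.30);"
print(leaf_names(tree))
```

A full Newick parser would also have to handle quoted labels, comments and internal node names. This sketch only shows how the nesting of parentheses encodes the grouping of taxa.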

Get ORF: Get ORF (getorf) is a freely available program in the EMBOSS package. It finds and extracts open reading frames (ORFs) and outputs their sequences.

The ORFs can be defined as regions of a specified minimum size between STOP codons or between START and STOP codons. The ORFs can be output as the nucleotide sequence or as the translation.

The program can also output the region around the START or the initial STOP codon or the ending

STOP codons of an ORF for those doing analysis of the properties of these regions.

The START and STOP codons are defined in the Genetic Code tables. A suitable Genetic Code table

can be selected for the organism you are investigating.

(http://www.3rog.org/general/software/packages/emboss/getorf.html)
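
The START-to-STOP definition of an ORF can be sketched in a few lines of Python. This is a simplified illustration and not the getorf implementation. It assumes the standard genetic code (START = ATG; STOP = TAA, TAG, TGA) and scans only the three forward reading frames. The function name `find_orfs`, the `min_len` parameter and the example sequence are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq: str, min_len: int = 30) -> list[tuple[int, int, str]]:
    """Return (start, end, orf) for START..STOP ORFs on the forward strand.

    start and end are 0-based and end-exclusive; the STOP codon is included.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i                      # first START codon opens an ORF
            elif start is not None and codon in STOP_CODONS:
                if i + 3 - start >= min_len:   # keep only ORFs of minimum size
                    orfs.append((start, i + 3, seq[start:i + 3]))
                start = None                   # look for the next START
    return orfs


dna = "CCATGAAATTTGGGTAACCATGCCCTGA"  # made-up example sequence
for start, end, orf in find_orfs(dna, min_len=9):
    print(start, end, orf)
```

Selecting a different genetic code table, as described above, would correspond to changing the START codon and the `STOP_CODONS` set in this sketch.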

ClustalW: ClustalW is a general-purpose multiple sequence alignment program for DNA or proteins. It produces biologically meaningful multiple sequence alignments of divergent sequences. It calculates the best match for the selected sequences and lines them up so that the identities, similarities and differences can be seen. Evolutionary relationships can be seen by viewing cladograms or phylograms.

(http://www.ebi.ac.uk/clustalw/)
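
To make the idea of "identities" in an alignment concrete, the Python sketch below computes the percent identity between two sequences that have already been aligned, counting only the columns in which neither sequence has a gap. The example sequences and the function name `percent_identity` are invented for illustration. ClustalW itself performs the alignment step, which is not reproduced here, and other definitions of percent identity (for example, dividing by the full alignment length) are also in use.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent identity over gap-free columns of two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")
    compared = matches = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            continue            # ignore columns that contain a gap
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Made-up aligned protein fragments ("-" marks a gap).
seq1 = "MKT-IIALSY"
seq2 = "MKTAIVAL-Y"
print(round(percent_identity(seq1, seq2), 2))
```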

GENSCAN: GENSCAN is a general-purpose gene identification program which analyzes

genomic DNA sequences from a variety of organisms including human, other vertebrates,

invertebrates and plants.

This server provides access to the program Genscan for predicting the locations and exon-intron

structures of genes in genomic sequences from a variety of organisms. This server can accept sequences

up to 1 million base pairs (1 Mbp) in length.

http://genes.mit.edu/GENSCAN.html

bioinformatics.ubc.ca/resources/tools/index.php?name=genscan
