AP Biology Lab-03


Copyright © 2013 Quality Science Labs, LLC

Lab 3

Comparing DNA Sequences to Understand Evolutionary Relationships with BLAST

Big Idea 1: Evolution

How can bioinformatics be used as a tool to determine evolutionary relationships and to better understand genetic diseases?

Please be sure you have read the student intro packet before you do this lab.

(If needed, the student intro packet is available at www.qualitysciencelabs.com/AdvancedBioIntro.pdf)

Lab Investigations Summary

Pre-lab and Questions: Cladograms

Part A - Structural Cladogram

Part B - Biochemical Analysis for Cladogram Construction

GAPDH gene and protein percentage similarities

Part C - Biochemical Analysis of Cytochrome c

Cytochrome c protein percentage of difference in amino acid sequences between 17 organisms

Lab Investigation 3.1

Part 1 - BLAST Practice

BLAST practice with an unknown fossil specimen

Cladogram Prediction and Tutorial using BLAST

Part 2 - Student Guided Inquiry

Student guided inquiry using BLAST to query a gene responsible for producing a mutant disease-causing protein of interest

Tutorial on using Entrez Gene to obtain nucleotide sequence for gene of interest


LAB 3 - Comparing DNA Sequences to Understand Evolutionary Relationships with BLAST

Big Idea 1 Evolution

How can bioinformatics be used as a tool to determine evolutionary relationships and to better understand genetic diseases?

BACKGROUND

How can we apply science to the study of origins? In the past, this meant the application of historical science and interpretations of the complex and variable fossil records and dating methods. What evidence do we have to support observable science when it comes to the evolutionary changes in populations and gradual, linear, primitive-to-advanced taxonomic groups of organisms? Are we relying on conjuring up explanations for observations or are we using the scientific methodology to form testable hypotheses to build on the evolutionary theory? The following is a quote from Washington State University, School of Biological Sciences (July 2008; Mack, R.N. and Black, R. A.)

“Although essential, the formation and testing of hypotheses (sometimes called ‘strong inference’) can be a formidable exercise. Observations must be reduced to simple hypotheses: a Null Hypothesis that says that a factor has no effect in causing an observed phenomenon and a companion Alternate Hypothesis that states the factor is having an effect. Equally important is the recognition that multiple hypotheses must be formulated and tested together. The overall goal is to disprove or falsify each of the multiple hypotheses. This approach seems initially an odd choice, but it arises because few hypotheses can be proven. Among an initial group of hypotheses, the hypothesis that cannot be disproved is taken as the likely explanation. Use of multiple hypotheses avoids unconsciously becoming the guardian of a pet hypothesis… the unfortunate consequence of having only one hypothesis is that it may become either a so-called Expandable Hypothesis (in which the original hypothesis is simply expanded to account for all new evidence, often as exceptions or special cases) or worse, a Ruling Theory (in which facts are deliberately sought out that support the theory while conflicting facts are attacked or ignored).”

Consider how (or if) a multitude of testable hypotheses can be applied to using bioinformatics to determine evolutionary relationships and genetic diseases. Molecular genetics has opened up a quantitative world of sequencing genes (DNA) and gene products (proteins). The thirteen-year Human Genome Project (HGP) mapped over 20,000 genes from a human. With sequencing of genes and proteins in many different organisms, we now have evidence showing changes in the genetic material of different taxonomic groups of organisms and their populations. Molecular evolutionary geneticists create models to analyze and reveal the evolutionary relationships among organisms and populations. As gene sequences are identified that relate to specific genetic diseases in a particular organism, similar sequences can be quickly compared using the powerful BLAST database.

“If all but one hypothesis explaining a phenomenon has been disproved or falsified, the remaining hypothesis is considered the best current approximation of the truth. The operative word here is current because any hypothesis (even if elevated to the rank of a theory, i.e., a hypothesis for which some evidence has been assembled) is subject to continual review and testing.” (July 2008; Mack, R.N. and Black, R. A.)

What do we know about human DNA and gene sequencing? Thirteen years of sequencing human DNA (Human Genome Project – HGP) has mapped the nucleotide sequences of over 20,000 gene sites in humans. This information has been made public to scientists all over the world.

How can knowing the human genome help in understanding genetic diseases? If gene sites for genetic diseases can be compared to the nucleotide sequences of the same normal human gene site (from HGP data), then science can progress toward alleviating these diseases. Gene therapy is the use of DNA to treat disease. Scientists introduce genes into human cells via a bacterial or viral vector (a bacterium or virus that carries a foreign gene in its DNA and transfers it to the host DNA), focusing on diseases caused by single-gene defects, such as cystic fibrosis, hemophilia, muscular dystrophy, thalassemia, and sickle-cell anemia. Today, most gene therapy studies are aimed at cancer and hereditary diseases linked to a genetic defect.

How can molecular information help in developing evolutionary relationships? Evolutionary biologists are looking to biochemistry to provide evidence of gradual change between taxonomic groups. By comparing the sequences of DNA nucleotides or of amino acids in proteins, they hope to create trees of life showing ancestral relationships. Cladograms are branching diagrams used to illustrate what is predicted to be the successive points of species divergence from common ancestral lines. Traditional cladograms were generated largely on the basis of physical characteristics, but many have been replaced by more recent ones developed using DNA and RNA sequencing information, which is now very commonly used to generate cladograms. The Pre-labs in this unit will give you practice in developing your own cladograms from anatomical information (physical characteristics). Consider whether cladograms based on molecular sequencing provide irrefutable evidence of evolutionary development and the tree of life. This is for you to think about as you view the biochemistry databases comparing cytochrome c and decipher the basic assumptions inherent in the cladogram diagram in the Pre-labs. Then Lab Investigation 3.1 will use molecular information to develop the cladogram.

Lemurs killed in Madagascar for bushmeat


Are similarities in biochemical molecules evidence of evolution? For biological evolution, similarities are a major foundational line of evidence. According to Darwinian theory, it stands to reason that the larger the number of similarities between two organisms, the closer their evolutionary relationship is likely to be. Physical characteristics are not always easy to discern and have proven unreliable, and the fossil record has not provided conclusive evidence. However, today we have access to DNA sequencing, gene sites, and protein amino acid sequencing in databases easily accessible to the world. You will examine the molecular evidence in this lab and ask the following: Has the molecular evidence proved gradual evolutionary progression from one taxonomic group to another, and is there evidence for intermediate or transitional species?

In accordance with our promise to be open-minded and consider other interpretations, another perspective based on similarities could be considered. What if the reason for similarities in structure, function, and a biochemical base was that it allowed all living organisms to fit into a common ecological web and a food chain for survival?

Molecular sequencing, implications, and assumptions: Protein sequencing looks at sequences of amino acids. Cytochrome c oxidase subunit I (part of an enzyme involved in mitochondrial electron transport for ATP production) is a protein whose gene locus (COX1) has been studied in depth. Its sequence is known for many organisms. In fact, it is used today in DNA barcoding to identify otherwise unrecognizable carcasses and determine whether they come from endangered species, in an attempt to control the bushmeat crisis in developing countries.

Cytochrome c, a small electron-carrier protein in the same mitochondrial electron transport chain, has also been studied extensively, and its amino acid sequence is known in many organisms. This facilitates comparisons on a molecular level among several different phyla. In a comparison of 17 organisms ranging from humans to photosynthesizing bacteria, cytochrome c has been analyzed and compared. Is there evidence of progressively more divergence as biological evolution would predict? Logically, Darwinian evolution would expect progressively more divergence on the molecular level as we move up the evolutionary scale from silkworms to humans. When comparing living organisms, it would be reasonable to predict a greater molecular distance from the insect to the amphibian than to the living fish, greater distance still to the reptile, and greater than that to the mammal. This pattern is not found. If fish evolved into amphibians in the evolutionary scenario, one would expect the cytochrome c amino acid sequence in fish to be more similar to the cytochrome c sequence in amphibians than to that in other vertebrates. But this is not the case. What is found across the variety of vertebrates is the same or a similar percent sequence difference for each cytochrome c protein. Instead of amphibians being closer to fish (having a smaller % difference in sequence), as would be predicted in the evolutionary scale, on a molecular level amphibians are no closer to fish than to reptiles or mammals. Hence, this is still a puzzle to solve.

Has evidence of intermediate species been found in molecular sequencing? If intermediate species are considered (bridges transitioning between amphibians and reptiles), Darwinian evolution would predict some species of amphibians to be closer to fish (more primitive species) and others to be closer to reptiles (more advanced species). This is also NOT the case in cytochrome c analysis. All vertebrates from horses, rabbits, chickens, and turtles to bullfrogs show the same percent difference in cytochrome c sequence from that in fish (carp). Another puzzle to be solved.

What is the perspective and what are the assumptions underlying cladogram construction? At this time, evolutionary biologists have high expectations that biochemistry will provide evidence of gradual change between taxonomic groups. They are counting on gene site DNA sequencing and protein amino acid sequencing to provide information for a progressive sequence on which to base progressive evolutionary changes from one taxonomic group to another: from bacteria to invertebrates to vertebrates.

BLAST: Evolutionary connections. BLAST, which stands for Basic Local Alignment Search Tool, is a tool used to detect similarities and differences in genomes. Using BLAST, you can input a gene sequence of interest and search entire genomic libraries for identical or similar sequences in a matter of seconds. Students will use BLAST to compare several genes and then use that information to construct a cladogram, of the kind commonly used, to show the implied evolutionary relatedness of an unknown fossil. Students will also use the Entrez Gene website and BLAST in Part 2 of Lab Investigation 3.1 to distinguish nucleotide differences between normal and mutant genes associated with a selected disease.

In this unit, you will examine currently available databases based on molecular sequencing of amino acids (proteins) and nucleotides (genes) and analyze similarities and differences, as well as their applications and significance to the study of origins.


PREPARATION

Materials and Equipment

A computer with internet access

Timing and Length of Labs

The Pre-lab has three parts: building structural and biochemical cladograms, and analyzing cytochrome c similarities and differences among diverse taxonomic groups. Some research is required in the first part and that could be done outside of lab. Two lab periods, as a minimum, are required to complete the Pre-labs.

Lab Investigation 3.1

Part 1

This is a tutorial on using BLAST, which could be done together as a class or individually. Screenshots are provided for each step of the way. A second lab period could be used for discussion and analysis of results of the bioinformatics provided by BLAST.

Part 2 - Student Guided Inquiry

Guided inquiry starts with a tutorial (with step-by-step screenshots) on finding a gene sequence in the Entrez Gene website; then students will need to do some research for the name of the gene associated with the student’s selected disease. They will then copy the gene sequence from the Entrez Gene website and paste it into BLAST to compare the normal to the mutant gene sequence. This could take two lab periods to research, complete and analyze.

Learning objectives aligned to standards and science practices (SP):

• To evaluate data-based evidence from bioinformatics tools like Entrez Gene and BLAST to analyze evolutionary degrees of similarities and differences; and to analyze genetic data related to diseases (1A2 and SP 5.3)

• To construct a scientific explanation that uses the structures and mechanisms of DNA and RNA to support the claim that DNA, and in some cases RNA, is the primary source of heritable information (3A1 and SP 5.6)

• To use cladograms and bioinformatics tools to ask other questions and test the student's ability to apply learned concepts relating to genetics and evolution (1A2, 1A4 and SP 1.1, 1.2)

General Safety Precautions

There are no safety precautions associated with this investigation.


Pre-lab and Questions: Cladograms

What is a cladogram (CLAY-doe-gram)? It is a diagram that depicts the probable evolutionary relationships among groups. It is based on phylogeny, which is the study of evolutionary relationships. Sometimes a cladogram is called a phylogenetic tree. In the past, biologists would group organisms based solely on their physical appearance. Today, with the advances in genetics and biochemistry, biologists can look more closely at individuals in the hopes of discovering patterns of evolution, and group them accordingly; this strategy is called evolutionary classification. Cladistics is a form of analysis that looks at features of organisms that are considered "innovations," or newer features that serve some kind of purpose. It is important to recognize that cladograms were created by biologists to illustrate their understanding of progressive evolutionary development over time, that is, progressive evolution from one taxonomic group to another, such as the evolution of reptiles to birds.

In this Pre-lab you will use the cladogram to illustrate group relationships by their anatomical structures. Corresponding organs and other body parts that are alike in basic structure and origin are said to be homologous structures (for example, the front legs of a horse, wings of a bird, flippers of a whale, and the arms of a person are all homologous to each other). Biological evolutionists have defined homology as correspondence of structure derived from a common primitive origin. This definition assumes progressive evolution over time from one taxonomic group to another. Human hands and dog paws are an example of Darwinian homology. Since the similarity of a human hand and a dog paw is somewhat structural, but not functional, it is assumed that their similarity results from a common ancestor that possessed this basic arrangement of bones. However, it should be noted that some scientists’ interpretations of the actual embryonic origin of some supposedly homologous structures differ.

Cladograms are used to show these assumed homologous structures. When studies are done in comparative anatomy and different numbers of shared derived characters are found to exist between different groups, a diagram can be created with branching lines that connect those groups, showing their different degrees of relationship. These diagrams look like trees. The organisms are at the tips of the stems. The shared features of the homologous structures are shown on the cladogram by solid square boxes along the branches, and predicted common ancestors are shown by open circles. The more derived structures two organisms share, the closer is their presumed evolutionary relationship — that is, the more likely their common ancestor lived recently. On the cladogram, closer predicted relationships are shown by a recent fork from the supporting branch. The closer the fork in the branch between two organisms, the closer is their relationship.
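The counting of shared derived characters described above can be sketched in a few lines of Python. This is only an illustration: the taxa and traits below are hypothetical placeholders (they are not the animals or answers for Pre-lab Table 1), and the function names are invented for this sketch.

```python
from itertools import combinations

# Hypothetical taxa and traits, invented purely for illustration.
# They are NOT the answers to Pre-lab Table 1.
traits = {
    "Taxon A": {"trait 1"},
    "Taxon B": {"trait 1", "trait 2"},
    "Taxon C": {"trait 1", "trait 2", "trait 3"},
    "Taxon D": {"trait 1", "trait 2", "trait 3", "trait 4"},
}


def shared_trait_counts(trait_table):
    """Return {(taxon1, taxon2): number of traits both taxa possess}."""
    return {
        (a, b): len(trait_table[a] & trait_table[b])
        for a, b in combinations(sorted(trait_table), 2)
    }


def branching_order(trait_table):
    """List taxa from fewest to most traits (ties broken by name)."""
    return sorted(trait_table, key=lambda t: (len(trait_table[t]), t))


counts = shared_trait_counts(traits)
order = branching_order(traits)

for pair, n in counts.items():
    print(pair, "share", n, "trait(s)")
print("Suggested order along the main branch:", order)
```

Under the assumption stated in the text, pairs that share more traits would be drawn with a more recent fork, and the taxon with the fewest traits would branch off first.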


Procedures

Part A: Structural Cladogram

(Adapted from ENSI/SENSI lesson plan: Making Cladograms http://www.indiana.edu/~ensiweb/home.html)

1. Using the internet, your textbook, and the explanations below, determine which of the characteristics each animal has. In the Data Table provided on the next pages, place an “X” in the box if the animal has the characteristic.

Explanations of Characteristics:

Set #1:

• Dorsal nerve cord: running along the back or “dorsal” body surface.

• Notochord: a flexible but supporting cartilage-like rod running along the back or “dorsal” surface.

Set #2:

• Paired appendages: legs, arms, wings, fins, flippers, antennae.

• Vertebral column: backbone.

Set #3:

• Paired legs

Set #4:

• Amnion: a membrane that holds in the amniotic fluid surrounding the embryo; may or may not be inside an egg shell.

Set #5:

• Mammary glands: milk-secreting glands that nourish the young.

Set #6:

• Placenta: structure attached to inside of uterus of mother, and joined to the embryo by the umbilical cord; provides nourishment and oxygen to the embryo.

Set #7:

• Canine teeth short: same length as other teeth.

• Foramen magnum forward: spinal cord opening, located forward, under skull.


Pre-lab Table 1: Animals

Sets | Traits | Kangaroo | Lamprey | Rhesus Monkey | Bullfrog | Human | Snapping Turtle | Tuna
Set 1 | Dorsal Nerve Cord/Notochord | | | | | | |
Set 2 | Paired Appendages/Vertebral Column | | | | | | |
Set 3 | Paired Legs | | | | | | |
Set 4 | Amnion (Amniotic sac) | | | | | | |
Set 5 | Mammary Glands | | | | | | |
Set 6 | Placenta | | | | | | |
Set 7 | Canine teeth short/Foramen magnum forward | | | | | | |
Total Number of X | | | | | | | |

2. Make a Venn diagram, placing your seven animals in groups to illustrate the characteristics that different animals have in common. See the example below.


3. Using the Venn diagram of the groupings just completed (as a guide), draw a cladogram to illustrate the ancestry of these animals. The diagram should reflect shared characteristics as time proceeds. An example is shown to the left. Notice how the different animals are all at the same time level (across the top) since they all live today.


Part B: Biochemical Analysis for Cladogram Construction

GAPDH (glyceraldehyde 3-phosphate dehydrogenase) is an enzyme that catalyzes the sixth step in glycolysis, an important reaction in the process of cellular respiration. The following data table shows the percentage similarity of this gene, and of the protein it encodes, in humans versus other species. For example, according to the table below, the GAPDH gene in chimpanzees is 99.6% identical to the gene found in humans.
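To make "percentage similarity" concrete, here is a minimal Python sketch of how a percent identity could be computed for two sequences that are already aligned to the same length. The two short sequences are hypothetical and are not real GAPDH data; real comparisons are done on full-length alignments with tools such as BLAST.

```python
def percent_identity(seq_a, seq_b):
    """Percent of aligned positions at which two sequences match.

    Assumes the sequences are already aligned and equal in length.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(1 for x, y in zip(seq_a.upper(), seq_b.upper()) if x == y)
    return 100.0 * matches / len(seq_a)


# Hypothetical 20-base sequences (NOT real GAPDH data); they differ at one position.
reference = "ATGGCTAAGGTCGGTGTGAA"
other = "ATGGCTAAGGTCGGAGTGAA"

identity = percent_identity(reference, other)
print(f"{identity:.1f}% identical")
```

With 19 of 20 positions matching, this sketch reports 95.0% identity; percentages such as the 99.6% quoted for chimpanzees express the same idea, matching positions out of compared positions, for a much longer gene.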

Pre-lab Table 2

Here is a fossilized walrus skull and tusks. This demonstrates that not all fossils that we find represent extinct organisms. Other examples like this include clam shells, shark teeth, and bones.

Questions to consider

• Are fossils only from extinct organisms?

• Do fossils have to be thousands or millions of years old?

• How short a time can the fossilization process take?


Pre-lab Table 3: Comparisons of the Cytochrome c Molecule Sequencing % Differences of 17 Organisms

Part C: Biochemical Analysis of Cytochrome c

Cytochrome c is another protein. It is involved in mitochondrial electron transport at the later stages of ATP production. The sequence of cytochrome c differs from organism to organism. Table 3 shows the percent difference in amino acid sequence in cytochrome c between 17 organisms. Reading across the line marked “humans,” we see that the differences in sequence become greater the farther away we move on the taxonomic scale. From human to Rhesus monkey is only 1% divergence; from human to pig is 10%; from human to fish (carp) is 17%; from human to an insect is 29%. This finding is not surprising since it corroborates traditional taxonomic categories.

Take a closer look at the silkworm, number 15 at the top of the table. As you progress down the silkworm column through the different vertebrate classes, it becomes apparent that the difference from diverse organisms is the same. It doesn’t matter if it is a human, a penguin, a turtle, fish, or lamprey. The silkworm differs from all of these (reptiles, birds, mammals) by almost exactly the same percent.
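Reading a row or column of a pairwise difference table, as described above, can be sketched in Python as a lookup on unordered pairs. The sketch below holds only the four human-row values quoted in the text (1%, 10%, 17%, 29%); the rest of Table 3 is not reproduced, and the dictionary layout and function names are assumptions made for illustration.

```python
# Percent differences quoted in the text for the "humans" row of Table 3.
# Only these four values are included; the full table is not reproduced here.
differences = {
    frozenset({"human", "rhesus monkey"}): 1,
    frozenset({"human", "pig"}): 10,
    frozenset({"human", "carp"}): 17,
    frozenset({"human", "insect"}): 29,
}


def percent_difference(org_a, org_b, table=differences):
    """Look up the % difference for a pair, in either order."""
    if org_a == org_b:
        return 0
    return table[frozenset({org_a, org_b})]


def rank_by_distance(reference, table=differences):
    """Return (organism, % difference) pairs sorted from closest to farthest."""
    rows = []
    for pair, value in table.items():
        if reference in pair:
            (other,) = pair - {reference}
            rows.append((other, value))
    return sorted(rows, key=lambda row: row[1])


ranking = rank_by_distance("human")
for organism, value in ranking:
    print(f"human vs {organism}: {value}%")
```

Storing each pair as a `frozenset` means the lookup gives the same answer whether you read across a row or down a column, which mirrors how the symmetric chart is used in the questions that follow.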


Questions

Part A: Structural Cladogram

1. What are three types of information that can be obtained from this cladogram?

2. Three previously unknown vertebrates have been discovered in a rain forest in South America. One animal is very similar to an iguana lizard. The second animal resembles a large rat. The third is similar to a goldfish. Place these animals on your cladogram and explain why you placed them where you did.

Part B: Biochemical Analysis for Cladogram Construction

3. Why is the percentage similarity in the gene always lower than the percentage similarity in the protein for each of the species? (Hint: Recall how a gene is expressed to produce a protein.)

4. Draw a cladogram depicting the evolutionary relationships among all five species (including humans) according to their percentage similarity in the GAPDH gene.


5. What are the assumptions and evolutionary perspectives associated with the cladogram?

Part C: Biochemical Analysis of Cytochrome c

6. Refer to the Cytochrome c Comparison Chart above. Carp (fish) is in position #12. If you go to the Bullfrog amphibian, position #10 and look across the horizontal line of numbers until you get to the vertical column #12 (carp), you will find a difference of 13% in the cytochrome c sequencing. Repeat for the turtle (a reptile). Record the Cytochrome c % Difference for each organism listed.

7. If fish evolved into amphibians, do you see molecular evidence from cytochrome c sequencing % differences that fish are closer to bullfrogs than to reptiles or mammals?

Pre-lab Table 4: MODEL Data Chart, Position #12 Carp (fish)

Organism of comparison | Position # | Organism of comparison | Cytochrome c from organism of comparison, % Difference
Carp (fish) | #10 | Bullfrog (amphibian) | 13%
Carp (fish) | #9 | Turtle (reptile) | 13%
Carp (fish) | | Chicken (bird) |
Carp (fish) | | Rabbit (mammal) |
Carp (fish) | | Horse (mammal) |


8. How does this relate to what biological evolution might expect if evolution from one taxonomic group to another occurred?

9. Does this evidence support or contradict the expectations of Darwinian evolution?

10. Do you see any indication from the amino acid sequence differences that there are transitional or intermediate species from one taxonomic group to another?


Lab Investigation 3.1

Part 1 - BLAST Practice

The figure above is a fossil cladogram.

BLAST Practice with an Unknown Fossil Specimen

A team of scientists uncovered the fossil specimen pictured at left in China. DNA was extracted from a small amount of soft tissue and the sequence of nucleotides was determined. You will use the BLAST database to analyze the sequence and determine the most likely placement of the fossil species on the fossil cladogram above.

Photo by Sam Ose & Olai Skjaervoy.


Procedures

1. Form a hypothesis as to where you believe the fossil specimen should be placed on the cladogram based on your observations.

2. Locate and download gene files. Download three gene files from the following website http://blogging4biology.edublogs.org/2010/08/28/college-board-lab-files/

Click on Gene 1 and this is what you see:

Click on Download and select Save File, OK


3. Upload the gene sequence into BLAST by doing the following:

•Go to the BLAST homepage http://blast.ncbi.nlm.nih.gov/Blast.cgi

• Click on “Saved Strategies” from the menu at the top of the page (see the red arrow 1 pointing to the highlighted area).

1. “Saved Strategies” button

2. Scientists search within several databases. In this lab, students will be using nucleotide BLAST to compare the nucleotides found in the fossil specimen to those in known organisms.


•Under “Upload Search Strategy,” click on “Browse” and locate Gene 1 files that you saved onto your computer (from your download files).

•Click “View.”

•A screen will appear with the parameters for your query already configured. NOTE: DO NOT alter any of the parameters. Scroll down the page and click on the “BLAST” button at the bottom.

•After collecting and analyzing all of the data for that particular gene (see instructions below), repeat this procedure for the other two gene sequences.

1. Click on “Browse” and then locate the Gene 1 file that you downloaded from the website and saved to your computer.

2. Then click “View.”

1. This is the nucleotide sequence found in Gene 1. DO NOT alter any of the parameters set on this page.

2. This setting indicates the entire database will be searched. Selecting the human database + transcript would only yield similar sequences found in humans. By selecting “Others,” all of the genomes in the BLAST database will be searched.

3. Select “BLAST” (Step 3e).


4. The results page has two sections. The first section is a graphical display of the matching sequences.

• Scroll down to the section titled “Sequences producing significant alignments.” The species in the list that appears below this section are those with sequences identical or most similar to the gene of interest. The most similar sequences are listed first, and as you move down the list, the sequences become less similar to your gene of interest.

• If you click on a particular species listed, you’ll get a full report that includes the classification scheme of the species, the research journal in which the gene was first reported, and the sequence of bases that appear to align with your gene of interest.

1. This bar represents the gene sequence you entered into BLAST.

2. This indicates the number of results (53 BLAST hits in this example for Gene 1).

3. These bars represent the top results in the query and how well aligned each result is with the gene of interest.

4. This is the “Distance Tree of Results” button.

1. This is the species and gene name that matches the gene of interest.

2. The score (Max) summarizes how well the sequences align: matching bases raise it, while gaps and substitutions lower it. The higher the score, the more similar the alignment.

3. The e value estimates how many matches of this quality would be expected purely by chance. The lower the e value, the better the match.


• If you go back to the graphical display (Distribution of BLAST hits) and click on “Distance Tree of Results” (callout 4 in the figure at left points to this button), you will see a cladogram in which the species with sequences similar to your gene of interest are placed according to how closely their matched gene aligns with your gene of interest.

Analyzing Results

5. After you have completed your data inquiry for all three genes, you should be thinking about your original hypothesis and whether the data support or cause you to reject your original placement of the fossil species on the cladogram. For each BLAST query, consider the following:

• The higher the score, the closer the alignment.

• The lower the e value, the closer the alignment.

• Sequences with e values less than 1 × 10⁻⁴ can be considered related with an error rate of less than 0.01%.
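The three rules of thumb above can be applied mechanically once you have copied the score and e value for each hit. The following Python sketch is one way to do that; the hit records are hypothetical (invented species names and numbers, not real BLAST output), and the field names are assumptions chosen for this example.

```python
E_VALUE_CUTOFF = 1e-4  # threshold quoted in the lab text

# Hypothetical hit records typed in by hand; NOT real BLAST output.
hits = [
    {"species": "Species Z", "max_score": 38.0, "e_value": 0.5},
    {"species": "Species Y", "max_score": 410.0, "e_value": 2e-110},
    {"species": "Species X", "max_score": 950.0, "e_value": 0.0},
]


def significant_hits(hit_list, cutoff=E_VALUE_CUTOFF):
    """Keep hits below the e-value cutoff, best match first.

    Hits are ordered by lowest e value, then by highest score.
    """
    kept = [h for h in hit_list if h["e_value"] < cutoff]
    return sorted(kept, key=lambda h: (h["e_value"], -h["max_score"]))


ranked = significant_hits(hits)
for hit in ranked:
    print(hit["species"], hit["max_score"], hit["e_value"])
```

In this made-up example, Species Z is dropped because its e value of 0.5 is above the cutoff, and the remaining two hits are listed with the lowest e value first.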

1. This indicates the species the aligned sequence is found in and the gene/phenotype.

2. This describes the number of identical nucleotides found in this sequence.

3. This shows the exact pattern of alignment. The top line is the gene of interest (Gene 1 in this case) and the bottom line is the matching sequence.


Data Analysis and Conclusions

1. What species in the BLAST result has the most similar gene sequence to the gene of interest?

2. Where is that species located on our cladogram?

3. How similar is that gene sequence?

4. What species has the next most similar gene sequence to the gene of interest?

5. Based on what you have learned from the sequence analysis and what you know from the structure, decide where the new fossil species belongs on the cladogram with the other organisms and redraw your original cladogram.


Part 2 - Student Guided Inquiry

In this guided inquiry, you will select a gene of interest, run a BLAST query on your gene, and determine the difference between the mutant, disease-causing gene and the normal gene.

You will need one more skill before beginning this activity. In Lab 3.1, you were provided with gene sequences from a database. This time, you will choose your own gene to investigate, so you will need to find your gene’s sequence in the Entrez Gene database.

The following short practice activity uses the gene for actin (a muscle protein in humans that forms the microfilaments) to guide you through the Entrez Gene website and show you how to get the nucleotide sequence for actin.

Tutorial on getting gene sequences:

1. Go to the Entrez Gene website http://www.ncbi.nlm.nih.gov/gene and type human actin in the top search block.

2. This screenshot is the result of your search for human actin. Click on the first link that appears.

2. (Continued) Scroll down about two-thirds of the way to the section “NCBI Reference Sequences” (in the upper left hand corner of this screenshot).

3. Under “mRNA and Proteins,” click on the first file name. In this case, it is under 1. NM_007353.2. It may be named something similar depending on your gene of interest. These standardized numbers make cataloging sequence files easier. Do not worry about the file number for now.

4. Just below the gene title, click on “FASTA.” This is the name for a particular format for displaying sequences.

5. The nucleotide sequence displayed is that of the actin gene in humans. The screenshot does not show all of it. In the next step, it is important to highlight and copy ALL of it for BLAST.

6. Copy the entire gene sequence, and then go to the BLAST homepage http://blast.ncbi.nlm.nih.gov/Blast.cgi

7. Click on “nucleotide blast” under the Basic BLAST menu.

8. Paste the entire gene sequence into the box where it says “Enter Query Sequence.”

9. Under “Choose Search Set,” select whether you want to search the human genome only or others.

10. Under “Program Selection,” choose whether you want highly similar sequences (best for a human disease inquiry) or somewhat similar sequences, which return more results and are better suited to an evolutionary relationship inquiry.

11. Click BLAST.

This ends the tutorial on finding your own gene sequences for BLAST.
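Because steps 5 and 6 depend on copying the whole sequence, it can help to check the copied FASTA text before pasting it into BLAST. The sketch below is one way to do that with plain Python. The record shown is invented and very short, and `read_fasta` is a hypothetical helper name; a real record from Entrez Gene will have a longer header and many more lines.

```python
def read_fasta(text):
    """Return (header, sequence) from a single-record FASTA string.

    The header is the first line without its leading ">". All remaining
    lines are joined into one uppercase sequence string.
    """
    lines = [line.strip() for line in text.strip().splitlines() if line.strip()]
    if not lines or not lines[0].startswith(">"):
        raise ValueError("FASTA text must start with a '>' header line")
    header = lines[0][1:]
    sequence = "".join(lines[1:]).upper()
    unexpected = set(sequence) - set("ACGTN")
    if unexpected:
        raise ValueError(f"unexpected characters in sequence: {sorted(unexpected)}")
    return header, sequence


# Invented two-line record, used only to show the format.
copied_text = """>example_record hypothetical sequence
ATGGATGATG
atatcgccgc
"""

header, sequence = read_fasta(copied_text)
print(header)
print(len(sequence), "nucleotides copied")
```

Comparing the printed length with the length shown on the sequence page is a quick way to confirm that nothing was cut off while highlighting.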

Student Guided Inquiry Activity:

Identify a disease that is known to be related to proteins; identify the gene involved; search for the normal sequence of the gene in the Entrez Gene website; then upload it into BLAST. (The Entrez Gene website will only have the normal gene sequence. Usually there are hundreds to thousands of different chromosomal aberrations or mutations, in the form of deletions, duplications, insertions, inversions, or point mutations, that will cause many of these diseases.)

Using BLAST for the normal gene involved in the disease, research the mutant versions of the gene and find out what is different about their sequences. Here are some diseases to choose from:

•Genetic defects cause diseases in a variety of ways. The simplest way is through a “loss-of-function” mutation. In this type of defect, a change in the DNA nucleotides prevents the gene from making protein, or prevents the protein from functioning once it is made. Genetic diseases due to loss-of-function mutations are very common, and include cystic fibrosis (which affects the lungs and pancreas), Duchenne muscular dystrophy, and the hemophilias, a group of blood clotting disorders.

•A second mechanism for causing disease is called a “toxic-gain-of-function” mutation. In this type of defect, the gene takes on a new function that is harmful to the organism—the protein produced may interfere with cell functions, or may no longer be controllable by its normal regulatory partners, for instance. Many degenerative diseases of the brain are due to this type of mutation, including Huntington's disease.

•Read more: http://www.biologyreference.com/Fo-Gr/Genetic-Diseases.html

•You will need to research what gene is associated with your selected disease before going to the Entrez Gene website.
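One way to see how a single nucleotide change can produce a loss-of-function mutation is to compare a normal and a mutant sequence and translate both. The sketch below uses the standard genetic code. The two sequences are invented and do not come from any real disease gene, and `find_point_mutations` and `translate` are hypothetical helper names.

```python
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def find_point_mutations(normal, mutant):
    """List (position, normal_base, mutant_base) for each mismatch.

    Positions are 1-based. Only works for sequences of equal length,
    so insertions and deletions need a real alignment tool instead.
    """
    if len(normal) != len(mutant):
        raise ValueError("sequences must be the same length")
    return [
        (i + 1, n, m)
        for i, (n, m) in enumerate(zip(normal, mutant))
        if n != m
    ]


def translate(dna):
    """Translate DNA codon by codon, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":  # stop codon ends the protein
            break
        protein.append(amino_acid)
    return "".join(protein)


# Invented sequences: the mutant differs by one base, which turns the
# second codon into a stop codon and cuts the protein short.
normal = "ATGGAGAAGTAA"
mutant = "ATGTAGAAGTAA"

print(find_point_mutations(normal, mutant))
print(translate(normal), "vs", translate(mutant))
```

In this made-up case the mutant protein ends after its first amino acid, which is the kind of change that prevents a protein from functioning once it is made.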

Data Analysis and Conclusions

1. What were your findings in comparing normal to mutant gene sequences of your selected disease?

2. How can this help in treating or preventing this disease?

3. What are scientists doing today in genetic engineering with the knowledge of this gene deficiency?