Upload: kelley-arnold

Post on 13-Jan-2016


Monday, October 25, 5:56:17 PM

What are gene families?

A gene family is a group of genes that share important characteristics.

In many cases, genes in a family share a similar sequence of DNA building blocks (nucleotides). These genes provide instructions for making products (such as proteins) that have a similar structure or function.

In other cases, dissimilar genes are grouped together in a family because proteins produced from these genes work together as a unit or participate in the same process.

Classifying individual genes into families helps researchers:

• To describe how genes are related to each other.

• To predict the function of newly identified genes based on their similarity to known genes.

• Similarities among genes in a family can also be used to predict where and when a specific gene is active (expressed).

• Additionally, gene families may provide clues for identifying genes that are involved in particular diseases.

Sometimes genes may fit into more than one family.
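A minimal sketch of predicting the likely family of a newly identified gene from its similarity to known genes. The sequences, gene names, and helper functions are hypothetical and made up for illustration, and the sequences are assumed to be pre-aligned and of equal length; real analyses use alignment tools rather than position-by-position comparison.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of identical positions between two pre-aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


# Hypothetical toy sequences, not real genes.
known_genes = {
    "familyA_gene1": "ATGGCCAAGTTGACC",
    "familyA_gene2": "ATGGCTAAGTTGACC",
    "familyB_gene1": "ATGTTTCCGGGAAAT",
}


def closest_known_gene(query: str, known: dict[str, str]) -> tuple[str, str]:
    """Return the (name, sequence) of the known gene most similar to the query."""
    return max(known.items(), key=lambda item: percent_identity(query, item[1]))


new_gene = "ATGGCCAAGCTGACC"
best_name, best_seq = closest_known_gene(new_gene, known_genes)
print(best_name, round(percent_identity(new_gene, best_seq), 1))
```

Here the new gene is closest to a member of the hypothetical family A, so family A would be the first guess for its function.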

Lec 07

Slide 133


Slide 134

Members of DNA sequence families can be identified by a variety of different approaches:

• DNA sequencing: allows direct calculations on the degree of sequence relationship of family members.

• DNA hybridization and cloning: a probe from a gene family member typically gives a complex band pattern when hybridized against a Southern blot of genomic DNAs. Individual family members can then be cloned by screening genomic DNA libraries.

• PCR cloning: permits identification of novel family members by designing degenerate primers corresponding to highly conserved nucleotide or amino acid sequences.
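The idea of a degenerate primer can be sketched as follows. A degenerate primer is written with IUPAC ambiguity codes and stands for a pool of concrete sequences. The example primer is hypothetical: it encodes the peptide Cys-Gly-Lys (codons TGY, GGN, AAR) and is far shorter than a practical primer.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer: str) -> int:
    """Number of distinct sequences represented by a degenerate primer."""
    count = 1
    for base in primer:
        count *= len(IUPAC[base])
    return count


def expand(primer: str) -> list[str]:
    """List every concrete sequence in the degenerate primer pool."""
    return ["".join(bases) for bases in product(*(IUPAC[b] for b in primer))]


primer = "TGYGGNAAR"  # hypothetical: Cys (TGY) - Gly (GGN) - Lys (AAR)
print(degeneracy(primer))
print(expand(primer)[:4])
```

Lower degeneracy generally means a more specific primer pool, which is one reason conserved regions with few synonymous codons are attractive targets.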

In some gene families there is particularly pronounced homology within specific strongly conserved regions of the genes; the corresponding sequence similarity between the remaining portion of the coding sequence in the different genes may be quite low.
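A small sketch of how such a pattern shows up numerically: identity is computed in consecutive windows along two aligned sequences, so a strongly conserved region stands out against a poorly conserved remainder. Both sequences are made up for illustration, and the window size is an arbitrary assumption.

```python
def window_identity(a: str, b: str, window: int) -> list[float]:
    """Percent identity in consecutive, non-overlapping windows of two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    result = []
    for start in range(0, len(a) - window + 1, window):
        pairs = zip(a[start:start + window], b[start:start + window])
        matches = sum(x == y for x, y in pairs)
        result.append(100.0 * matches / window)
    return result


# Hypothetical protein fragments: only the middle block is conserved.
seq_a = "MKTLLVAGSA" + "GDSGGPLVCN" + "QWERTYIPAS"
seq_b = "MSPAIYRDNA" + "GDSGGPLVCN" + "HKLMNVCDEF"

print(window_identity(seq_a, seq_b, window=10))
```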

Some gene families encode products with large, highly conserved domains. Domains are fundamental units of protein organization.

Gene families

The shape and structure of proteins

Slide 135

A protein molecule is made from a long chain of amino acids, each linked to its neighbor through a covalent peptide bond.

Proteins are therefore also known as polypeptides. Each type of protein has a unique sequence of amino acids.

The repeating sequence of atoms along the polypeptide chain is called the polypeptide backbone. Attached to this repetitive chain are the side chains: those portions of the amino acids that are not involved in making a peptide bond and that give each amino acid its unique properties.


The folding of a protein chain

The requirement that no two atoms overlap greatly limits the possible bond angles in a polypeptide chain.

These steric constraints severely restrict the variety of conformations that are possible.

The folding of a protein chain is, however, further controlled by many different sets of weak noncovalent bonds that form between one part of the chain and another.

The weak bonds are of three types:

• hydrogen bonds

• ionic bonds

• van der Waals attractions

Slide 136


The folding of a protein chain

A fourth weak force, which arises from the distribution of polar and nonpolar side chains, also has a central role in determining the shape of a protein.

• The nonpolar (hydrophobic) side chains in a protein tend to cluster in the interior of the molecule.

• In contrast, polar side chains tend to arrange themselves near the outside of the molecule, where they can form hydrogen bonds with water and with other polar molecules.
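One rough way to put numbers on the difference between nonpolar and polar side chains is a hydropathy scale. The sketch below uses the Kyte-Doolittle scale, which is not mentioned in the slides and is only one of several such scales; the two peptides are hypothetical. Positive averages suggest a segment that would tend to be buried, negative averages a segment that would tend to be exposed to water.

```python
# Kyte-Doolittle hydropathy values (an assumption: the slides name no scale).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydropathy(peptide: str) -> float:
    """Average Kyte-Doolittle hydropathy of a peptide (higher = more hydrophobic)."""
    if not peptide:
        raise ValueError("peptide must not be empty")
    return sum(KYTE_DOOLITTLE[res] for res in peptide) / len(peptide)


nonpolar_segment = "LLVVIIAA"  # hypothetical nonpolar stretch
polar_segment = "KKDDEERR"     # hypothetical charged stretch

print(mean_hydropathy(nonpolar_segment))
print(mean_hydropathy(polar_segment))
```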

As a result of all of these interactions, each type of protein has a particular three-dimensional structure, which is determined by the order of the amino acids in its chain.

The final folded structure, or conformation, adopted by any polypeptide chain is generally the one in which the free energy is minimized. The folding process is often assisted by special proteins called molecular chaperones.

Slide 137


Proteins come in a wide variety of shapes, and they are generally between 50 and 2000 amino acids long.

Large proteins generally consist of several distinct protein domains (structural units) that fold more or less independently of each other.

The detailed structure of any protein can be depicted in several different ways, each emphasizing different features of the protein:

• polypeptide backbone model (A)

• ribbon model (B)

• wire model that includes the amino acid side chains (C)

• space-filling model (D)

Protein models

The images are colored in a way that allows the polypeptide chain to be followed from its N-terminus (purple) to its C-terminus (red).

Slide 138


A protein’s conformation is amazingly complex, but the description of protein structures can be simplified by the recognition that they are built up from several common folding patterns.

Two regular conformations of the polypeptide backbone are observed in many proteins: the α helix and the β sheet.

These two patterns are particularly common because they result from hydrogen-bonding between the N–H and C=O groups in the polypeptide backbone, without involving the side chains of the amino acids.

An α helix is generated when a single polypeptide chain twists around on itself to form a rigid cylinder. A hydrogen bond is made between every fourth peptide bond, linking the C=O of one peptide bond to the N–H of another. This gives rise to a regular helix with a complete turn every 3.6 amino acids.
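The helix geometry above lends itself to a little arithmetic. In this sketch the 3.6 residues per turn comes from the slide, while the rise of 0.15 nm per residue is an assumed textbook value that the slides do not state. The hydrogen-bond helper pairs the C=O of residue i with the N–H of residue i + 4, using 0-based residue indices.

```python
RESIDUES_PER_TURN = 3.6      # from the slide
RISE_PER_RESIDUE_NM = 0.15   # assumed textbook value, not stated in the slides


def helix_turns(n_residues: int) -> float:
    """Number of complete or partial turns in an ideal α helix."""
    return n_residues / RESIDUES_PER_TURN


def helix_length_nm(n_residues: int) -> float:
    """Approximate length along the helix axis, in nanometres."""
    return n_residues * RISE_PER_RESIDUE_NM


def backbone_hbond_pairs(n_residues: int) -> list[tuple[int, int]]:
    """(C=O residue, N-H residue) index pairs for the i -> i+4 hydrogen bonds."""
    return [(i, i + 4) for i in range(n_residues - 4)]


print(helix_turns(36), helix_length_nm(36))
print(backbone_hbond_pairs(6))
```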

The regular conformation of the polypeptide backbone

Slide 139


The regular conformation of the polypeptide backbone

In other proteins, α helices wrap around each other to form a particularly stable structure, known as a coiled-coil. This structure can form when the two (or in some cases three) α helices have most of their nonpolar (hydrophobic) side chains on one side, so that they can twist around each other with these side chains facing inward.

Long rod-like coiled-coils provide the structural framework for many elongated proteins. Examples are α-keratin, which forms the intracellular fibers that reinforce the outer layer of the skin, and the myosin molecules responsible for muscle contraction.
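The "nonpolar side chains on one side" pattern is commonly described as a heptad repeat (positions a to g, with a and d mostly nonpolar). That description is textbook background rather than something the slides state, and the sketch below is only a crude check, not a coiled-coil predictor. The hydrophobic residue set and the example sequence are assumptions chosen for illustration.

```python
HYDROPHOBIC = set("AVILMFW")  # assumed definition of "nonpolar" for this sketch


def heptad_core_fraction(seq: str, offset: int = 0) -> float:
    """Fraction of heptad 'a' and 'd' positions occupied by hydrophobic residues.

    `offset` is the 0-based index of the first 'a' position in the sequence.
    """
    core = [res for i, res in enumerate(seq) if (i - offset) % 7 in (0, 3)]
    if not core:
        return 0.0
    return sum(res in HYDROPHOBIC for res in core) / len(core)


def best_register(seq: str) -> int:
    """Offset (0 to 6) that places the most hydrophobic residues at 'a' and 'd'."""
    return max(range(7), key=lambda off: heptad_core_fraction(seq, off))


candidate = "LKELEDK" * 4  # hypothetical sequence with Leu at positions a and d
print(best_register(candidate), heptad_core_fraction(candidate, 0))
```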

Slide 140


The regular conformation of the polypeptide backbone

Two types of β sheet structures

The core of many proteins contains extensive regions of β sheet. These β sheets can form either from neighboring polypeptide chains that run in the same orientation (A, parallel chains) or from a polypeptide chain that folds back and forth upon itself, with each section of the chain running in the direction opposite to that of its immediate neighbors (B, antiparallel chains).

Both types of β sheet produce a very rigid structure, held together by hydrogen bonds that connect the peptide bonds in neighboring chains.

Slide 141


There are four levels of protein structure: primary, secondary, tertiary, and quaternary.

• The linear sequence of amino acids is known as the primary structure of the protein.

• Stretches of polypeptide chain that form α helices and β sheets constitute the protein’s secondary structure.

• The three-dimensional structure of a single polypeptide chain is termed its tertiary structure. Tertiary structures are different combinations of the secondary structures (α helices, β sheets, and loops). Tertiary structure is subdivided into certain portions that are termed motifs and domains.

• If a particular protein molecule is formed as a complex of more than one polypeptide chain, the complete structure (the full three-dimensional organization) is designated as the quaternary structure.

Organization of protein structure

Slide 142

Protein motifs

Secondary structure elements are observed to combine in specific geometric arrangements known as motifs or supersecondary structures. Proteins have a limited number of structural motifs:

• Alpha-alpha (two α helices linked by a loop)

• Beta-beta (two β strands linked by a loop)

• Beta-alpha-beta (a β strand linked by a loop to an α helix, which is in turn linked by a loop to another β strand)

[Figure panels: helix-loop-helix motif; hairpin β sheet motif; beta-alpha-beta motif]
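A sketch of how the three motifs listed above could be spotted in a per-residue secondary-structure string. The three-letter alphabet (H = helix, E = strand, C = loop), the minimum element lengths, and the example strings are all assumptions made for illustration. A string of this kind carries no 3D information, so this cannot confirm, for example, that two strands actually pair in a hairpin.

```python
import re

# Assumed conventions: H = α helix, E = β strand, C = loop/coil.
MOTIF_PATTERNS = {
    "alpha-alpha": re.compile(r"H{4,}C+H{4,}"),
    "beta-beta": re.compile(r"E{3,}C{1,5}E{3,}"),
    "beta-alpha-beta": re.compile(r"E{3,}C+H{4,}C+E{3,}"),
}


def find_motifs(secondary_structure: str) -> list[tuple[str, int, int]]:
    """Return (motif name, start, end) for every pattern match in the string."""
    hits = []
    for name, pattern in MOTIF_PATTERNS.items():
        for match in pattern.finditer(secondary_structure):
            hits.append((name, match.start(), match.end()))
    return hits


print(find_motifs("CCEEEECCHHHHHHCCEEEECC"))
print(find_motifs("HHHHHHCCCHHHHHH"))
print(find_motifs("EEEECCEEEE"))
```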

Slide 143

Protein domains

Studies of the conformation, function, and evolution of proteins have also revealed the central importance of a unit of organization different from the four just described (primary, secondary, tertiary, and quaternary).

This is the protein domain, a substructure produced by any part of a polypeptide chain that can fold independently into a compact, stable structure.

A domain usually contains between 40 and 350 amino acids, and it is the modular unit from which many larger proteins are constructed.

The smallest protein molecules contain only a single domain, whereas larger proteins can contain as many as several dozen domains, usually connected to each other by short, relatively unstructured lengths of polypeptide chain.

The different domains of a protein are often associated with different functions.

Slide 144

Once again, what are gene families?

In some gene families there is particularly pronounced homology within specific strongly conserved regions of the genes; the corresponding sequence similarity between the remaining portion of the coding sequence in the different genes may be quite low.

• Gene families with large conserved domains (other parts may be poorly conserved).

• Gene families with short conserved motifs.

It follows that proteins can be classified into many families.

One example is the serine proteases, a large family of proteolytic enzymes that includes the digestive enzymes chymotrypsin, trypsin, and elastase, as well as several proteases involved in blood coagulation. When the protease portions of any two of these enzymes are compared, parts of their amino acid sequences are found to match. The similarity of their three-dimensional conformations is even more striking: most of the detailed twists and turns in their polypeptide chains, which are several hundred amino acids long, are virtually identical.

Slide 145

Gene superfamilies

Gene superfamilies encode proteins that are functionally related in a general sense but show only weak homology.

• Immunoglobulin superfamily (A): IG genes, T-cell receptor genes, HLA genes, and others.

• Globin superfamily (B): myoglobin, alpha- and beta-globins, neuroglobin, and others.

• G-protein-coupled receptor superfamily (C).

[Figure panels A, B, C]

Slide 146


Human Genome Organisation (HUGO) http://www.genenames.org/

Gene families

Slide 147


SMART (a Simple Modular Architecture Research Tool) allows the identification and annotation of genetically mobile domains and the analysis of domain architectures.

http://smart.embl-heidelberg.de/

Homology-based gene prediction

Slide 148

Homology-based gene prediction

The Conserved Domain Search Service (CD-Search) identifies the conserved domains present in a protein sequence. CD-Search uses RPS-BLAST (Reverse Position-Specific BLAST) to compare a query sequence against position-specific score matrices that have been prepared from conserved domain alignments present in the Conserved Domain Database (CDD).
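To make the idea of a position-specific score matrix concrete, here is a toy sketch. It is not RPS-BLAST and does not reproduce how CDD matrices are built: the alignment, the query, the uniform background frequency, and the pseudocount are all assumptions chosen for illustration. Each column of a small gapless alignment is turned into log-odds scores, and the query is then scanned for its best-scoring window.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_pssm(alignment: list[str], pseudocount: float = 1.0) -> list[dict[str, float]]:
    """Log2-odds scores per column, against a uniform background (an assumption)."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    background = 1.0 / len(AMINO_ACIDS)
    total = len(alignment) + pseudocount * len(AMINO_ACIDS)
    pssm = []
    for col in range(length):
        column = [seq[col] for seq in alignment]
        pssm.append({
            aa: math.log2(((column.count(aa) + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        })
    return pssm


def best_hit(query: str, pssm: list[dict[str, float]]):
    """Return (start, score) of the best-scoring gapless window, or None if too short."""
    width = len(pssm)
    best = None
    for start in range(len(query) - width + 1):
        score = sum(pssm[i][query[start + i]] for i in range(width))
        if best is None or score > best[1]:
            best = (start, score)
    return best


# Hypothetical aligned domain fragments and query.
domain_alignment = ["GDSGGP", "GDSGGP", "GDSGGA", "GDAGGP"]
query = "MKTAYGDSGGPLVQ"

start, score = best_hit(query, build_pssm(domain_alignment))
print(start, query[start:start + 6], round(score, 2))
```

Because the scores are position-specific, a residue that is tolerated in one column can be penalized in another, which a single substitution matrix cannot express.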

Slide 149