molecular marker by anil bl gather

MOLECULAR MARKERS

[Figure: highway milepost signs reading "NH 1", "Delhi 5 KM" and "Thanks for your visit"]

Just as mileposts guide motorists along a linear highway, molecular tools enable geneticists to establish specific ‘DNA markers’ at defined places along each chromosome. DNA markers can then be used to tell when one has reached or passed a particular gene of interest.

Molecular Marker

A molecular marker is a polypeptide or piece of DNA with an easily identified phenotype, such that cells or individuals carrying different alleles are distinguishable. It can be a protein, an isozyme, or a DNA sequence whose inheritance can be monitored: a short one, such as the sequence surrounding a single base-pair change (single nucleotide polymorphism, SNP), or a long one, such as a minisatellite.

Ideal characteristics

• Polymorphic

• Reproducible

• Co-dominant

• Wide genome coverage

• Easy and inexpensive

• Easy exchange of data

1. Represent the genetic difference

2. May or may not represent the target gene

3. Occupy specific loci

Molecular marker

• Protein marker: Isozyme

• Hybridization based marker: RFLP, VNTR

• PCR based marker: RAPD, AFLP, SSR, ISSR, SCAR, CAPS, EST

Isozyme markers

[Figure: zymogram]

Isozymes are proteins with the same enzymatic function but different structural, chemical, or immunological characteristics.

Isozyme: multiple forms of the same enzyme coded by different genes (one enzyme, more than one locus).

Allozyme: variant forms of one enzyme coded by alleles at a single locus (one enzyme, one locus).

Use: Population genetics, phylogeny, diversity.

Advantages: Easy and inexpensive; requires no sequence information; co-dominant.

Disadvantages: Limited availability of enzyme systems, low level of polymorphism, environment and tissue dependent.

Restriction Fragment Length Polymorphism (RFLP)

Probe development: construct a genomic or cDNA library, develop probes, then label the probe DNA.

1. Digest genomic DNA with a restriction enzyme (e.g. Eco RI)

2. Electrophoresis and Southern blotting

3. Hybridization with the labelled probe

4. Marker scoring (AA, aa, Aa)


Polymorphism: Arises from point mutation, insertion, deletion or unequal crossing over. The polymorphism detected depends on the enzyme-probe combination.

Probes can be cDNA or PstI-derived genomic clones.

Advantages: Highly reproducible, co-dominant, wide genome coverage, no sequence information required, can be used across species.

Disadvantages: Laborious, requires a large amount of DNA, requires radioactivity, difficult to automate.

Variable Number of Tandem Repeats (VNTR)

Repetitive DNA

The majority of the genome is made up of tandem arrays of different types of repetitive DNA, which play important roles in absorbing mutations. Length polymorphisms arise from polymerase slippage during replication or from unequal recombination.

Minisatellites: tandem repeats with a monomer repeat length of about 11–60 bp.

Microsatellites: short tandem repeats with a monomer sequence 1 to 6 bp long.

VNTR: Restriction-digested genomic DNA is hybridized with a particular minisatellite sequence.

http://upload.wikimedia.org/wikipedia/en/5/5e/VNTRDemo.gif


Each variant acts as an inherited allele, allowing VNTRs to be used for personal or parental identification. VNTRs have become essential to forensic crime investigations, via DNA fingerprinting and the CODIS database.


The Combined DNA Index System (CODIS) is a DNA database funded by the United

States Federal Bureau of Investigation (FBI). It is a computer system that stores DNA

profiles created by federal, state, and local crime laboratories in the United States, with the

ability to search the database to assist in the identification of suspects in crimes

http://en.wikipedia.org/wiki/File:D1S80Demo.gif


Random Amplified Polymorphic DNA (RAPD)

1. Genomic DNA

2. Use one random decamer primer and amplify by polymerase chain reaction (40-45 cycles)

3. Electrophoresis

4. Marker scoring (AA, aa, Aa)

[Figure: first, second and third PCR cycles yielding unit-length RAPD products]

Multiple Arbitrary Amplicon Profiling (MAAP): Collective term for techniques using single arbitrary primers

Use: Diversity, phylogeny

Advantage: Easy, inexpensive, no sequence information, low quantity of DNA, wide genome coverage.

Disadvantage: Dominant, low reproducibility, sensitive to contamination

Arbitrarily Primed Polymerase Chain Reaction (AP-PCR): Uses longer arbitrary primers than RAPDs

DNA Amplification Fingerprinting (DAF): Uses shorter, 5–8 bp primers

Amplified Fragment Length Polymorphism (AFLP)

1. Genomic DNA

2. Digestion with two restriction enzymes

3. Ligation of adapter DNA

4. Preselective amplification

5. Selective PCR amplification using two longer selective primers (selective bases shown as N in the figure)

6. Electrophoresis

7. Marker scoring (AA, aa, Aa)

Uses: Genetic diversity, phylogeny, fingerprint & cultivar identification, contig map, criminal & paternity test

Advantages: High polymorphism, wide genome coverage, low quantity of DNA, amenable to automation

Disadvantages: Dominant, require good quality of DNA, use of radioactivity

AFLP fingerprint

http://en.wikipedia.org/wiki/File:Electropherogram_trace.jpg


Simple Sequence Repeats (SSR)

1. Genomic DNA

2. Synthesize two primers (P1, P2) per locus from the flanking conserved regions

3. Polymerase Chain Reaction

4. Electrophoresis

5. Marker scoring (AA, aa, Aa)

Microsatellites are sections of DNA consisting of tandemly repeating mono-, di-, tri-, tetra- or penta-nucleotide units that are arranged throughout the genomes of most eukaryotic species.

Di-nucleotide repeat: CACACACA

Tri-nucleotide repeat: ATGATGATGATG

Example of allelic variation in SSRs:

Allele A: CACACACA (4 repeats of the CA sequence)

Allele B: CACACACACACA (6 repeats of the CA sequence)
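The allelic difference between SSR alleles is the number of tandem copies of the motif. A minimal Python sketch of that count is given here for illustration; the function name `count_repeats` is hypothetical and is not taken from the slides, and the sequences are the Allele A and Allele B examples.

```python
import re


def count_repeats(sequence, motif):
    """Return the longest uninterrupted run of `motif`, in repeat units."""
    # Find every stretch made only of back-to-back copies of the motif.
    runs = re.findall(f"(?:{re.escape(motif.upper())})+", sequence.upper())
    return max((len(run) // len(motif) for run in runs), default=0)


allele_a = "CACACACA"
allele_b = "CACACACACACA"

print(count_repeats(allele_a, "CA"))  # 4
print(count_repeats(allele_b, "CA"))  # 6
```

On a gel the two alleles separate because Allele B is two repeat units (4 bp) longer than Allele A.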

Uses: Fingerprinting, genome mapping & linkage analysis, marker-assisted selection

Advantages: Highly reproducible, co-dominant, wide genome coverage, low quantity of DNA, PCR-based and nonradioactive, amenable to automation

Disadvantages: Need sequence information, cannot be used across species

SSR fingerprint

Inter Simple Sequence Repeat (ISSR)

Sequence Tagged Site (STS)

PCR amplification of a unique, single-copy segment of the genome using long primers designed from already available sequence information. It therefore combines the advantages of RAPD (no probe, PCR-based, rapid) and RFLP (codominant, highly reproducible).

Three types: SCAR, CAPS, STMS

Sequence Characterized Amplified Region (SCAR)

1. Select a unique RAPD fragment that is linked to a specific trait and shows polymorphism.

2. Cut the fragment from the gel, clone it into a suitable vector and sequence its ends.

3. Design primers based on the sequence information, amplify by PCR and run on a gel.

High reproducibility because long sequence-specific primers are used, codominant inheritance, low quantity of DNA, quick & easy, locus specific, and can be used in gene mapping and marker-assisted selection (MAS). However, sequence information is needed.

Sequence Tagged Microsatellite Site (STMS): when microsatellite locus is targeted

Cleaved Amplified Polymorphic Sequence (CAPS)

1. Amplify the target region using long PCR primers.

2. Digest the amplified fragment with a restriction enzyme and run on a gel.

High reproducibility because long sequence-specific primers are used, codominant inheritance, low quantity of DNA, quick & easy, locus specific, and can be used in gene mapping and marker-assisted selection (MAS). However, sequence information is needed.
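CAPS scoring can be illustrated with an in-silico digest. The sketch below is only an illustration under stated assumptions: the two 66 bp amplicons are invented, the enzyme is taken to be Eco RI (site GAATTC, cut after the first G), only the top strand is searched, and `digest` and `band_pattern` are hypothetical helper names.

```python
def digest(amplicon, site="GAATTC", cut_offset=1):
    """Return fragment lengths after cutting at every occurrence of `site`.

    cut_offset is the cut position inside the site (1 for G^AATTC).
    """
    fragments, start = [], 0
    pos = amplicon.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(cut - start)
        start = cut
        pos = amplicon.find(site, pos + 1)
    fragments.append(len(amplicon) - start)
    return fragments


def band_pattern(allele_1, allele_2):
    """Distinct fragment sizes expected on the gel for a diploid genotype."""
    return sorted(set(digest(allele_1) + digest(allele_2)), reverse=True)


# Hypothetical amplicons that differ by one base inside the restriction site.
CUT = "ATGC" * 5 + "GAATTC" + "ATGC" * 10    # site present
UNCUT = "ATGC" * 5 + "GAGTTC" + "ATGC" * 10  # site lost by a point mutation

print(band_pattern(CUT, CUT))      # [45, 21]
print(band_pattern(UNCUT, UNCUT))  # [66]
print(band_pattern(CUT, UNCUT))    # [66, 45, 21]
```

The heterozygote shows the bands of both homozygotes, which is what makes the marker codominant.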

SNPs are single base-pair positions in genomic DNA at which different sequence alternatives (alleles) exist in normal individuals in some population(s), where the least frequent allele has an abundance of 1% or greater.
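The 1% criterion can be checked directly from genotype counts. The following Python sketch assumes genotypes are given as two-letter strings for a single site; the function names and the sample of 100 individuals are hypothetical.

```python
from collections import Counter


def minor_allele_frequency(genotypes):
    """Frequency of the least frequent allele at one site.

    genotypes: list of two-letter strings such as "AA", "AG", "GG".
    Returns 0.0 when only one allele is observed.
    """
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    if len(counts) < 2:
        return 0.0
    return min(counts.values()) / sum(counts.values())


def meets_snp_threshold(genotypes, threshold=0.01):
    """True when the least frequent allele reaches the 1% abundance criterion."""
    return minor_allele_frequency(genotypes) >= threshold


# Hypothetical sample: 90 AA, 8 AG and 2 GG individuals (200 alleles in total).
sample = ["AA"] * 90 + ["AG"] * 8 + ["GG"] * 2
print(minor_allele_frequency(sample))  # 12 G alleles out of 200
print(meets_snp_threshold(sample))
```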

Single Nucleotide Polymorphism

(SNP)

http://en.wikipedia.org/wiki/File:Dna-SNP.svg


Several types of SNPs are distinguished, according to their assignment to a structural element of genomic DNA or their functional effect:

Regulatory SNPs (rSNPs): involve regulatory regions that control gene expression (promoter SNPs and some intron SNPs)

Anonymous SNPs: functional effect is unknown

Candidate SNPs: presumably have a functional effect

Protein SNPs: change protein function or expression


SNP Genotyping

• The detection of SNP markers on the basis of overlapping genomic DNA sequences

• The detection of SNP markers on the basis of overlapping EST sequences

• The detection of SNP markers on the basis of unique (nonoverlapping) genomic and EST sequences

• The detection of SNP markers on the basis of “shotgun” sequencing

Allele Discrimination

• Allele-specific hybridization: a matched probe (A opposite T) hybridizes; a mismatched probe (C opposite T) gives no hybridization.

• Allele-specific PCR: a primer whose 3' end matches the template (A opposite T) is extended; a mismatched primer (C opposite T) gives no primer extension.

• Allele-specific single-base primer extension: a nucleotide that matches the template base (A opposite T) is incorporated; a mismatched nucleotide (C) gives no nucleotide incorporation.

http://upload.wikimedia.org/wikipedia/en/0/01/SNP-invader-1.jpg


Direct Fluorescence Detection

The most straightforward way to detect an allele-specific product is to label it by incorporating one or more nucleotides conjugated to a fluorescent dye. Direct fluorescence detection is generally used with solid-phase assay formats (microarrays and bead arrays) and where allele-specific products are separated by gel electrophoresis or capillary electrophoresis.

Fluorescence-based Detection in Homogeneous Solution

Fluorescence Polarization

When a fluorophore is excited by plane-polarized light, the fluorescence emitted by the dye is also polarized. This phenomenon is termed fluorescence polarization (FP). Complete FP occurs only when the dye molecule is stationary. The degree of observed FP therefore depends on how fast a molecule tumbles in solution, which in turn depends on the volume of the molecule, which is related to its molecular mass. Changes in molecular mass (e.g., caused by primer extension, probe hydrolysis, or invasive cleavage) can therefore be detected as changes in FP as long as all other conditions (temperature, viscosity, etc.) remain constant. FP is used as the detection method in the SNaPshot (Applied Biosystems) and AcycloPrime (Perkin Elmer) commercial genotyping systems.

Mass Spectrometry

Signal detection is based on the difference in molecular weights of small DNA fragments rather than the behavior of a label. The analysis of DNA by mass spectrometry requires soft ionization (i.e., without fragmentation) and is usually achieved by Matrix-Assisted Laser Desorption/Ionization-Time of Flight (MALDI-TOF) analysis. The MALDI procedure involves mixing the allele-specific products of the discrimination assay with a matrix compound on a metal plate. The mixture is then heated with a short laser pulse, causing it to expand into the gas phase, where ionization is achieved by applying a strong potential difference. Ions are accelerated toward the detector and the time of flight (the time taken to reach the detector) is measured, allowing the mass/charge ratio to be calculated.
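How a mass/charge ratio follows from a flight time can be sketched for an idealized linear time-of-flight tube. The assumption here is that an ion of charge z accelerated through a potential U gains kinetic energy zeU and then drifts a length L, so that t = L * sqrt(m / (2zeU)). The 20 kV potential, 1 m tube and the two product masses are hypothetical values chosen only for illustration.

```python
import math

E_CHARGE = 1.602176634e-19  # elementary charge in coulombs
DALTON = 1.66053906660e-27  # one dalton in kilograms


def flight_time(mass_da, charge=1, voltage=20_000.0, length_m=1.0):
    """Drift time in seconds of an ion in an idealized linear TOF tube."""
    mass_kg = mass_da * DALTON
    return length_m * math.sqrt(mass_kg / (2 * charge * E_CHARGE * voltage))


def mass_to_charge(time_s, voltage=20_000.0, length_m=1.0):
    """Invert the flight-time relation to recover m/z in daltons per charge."""
    return 2 * E_CHARGE * voltage * time_s ** 2 / (length_m ** 2 * DALTON)


# Two hypothetical allele-specific products that differ by a single base.
for mass in (6000.0, 6016.0):
    t = flight_time(mass)
    print(f"{mass:.0f} Da -> {t * 1e6:.3f} microseconds, m/z = {mass_to_charge(t):.1f}")
```

The heavier product arrives later, which is how the two alleles are told apart without any label.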

Pyrosequencing

Pyrosequencing is a novel method for sequencing short stretches of DNA based on the detection of pyrophosphate, a normal by-product of DNA synthesis. Although similar in principle to primer extension allele discrimination methods, pyrosequencing is suitable not only for typing SNPs but also for scoring entire haplotypes (groups of linked SNPs).

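[Figure: pyrosequencing reaction cascade]

1. Incorporation of a nucleotide complementary to the template (e.g. A opposite T) releases pyrophosphate (PPi).

2. ATP sulfurylase converts PPi and adenosine 5' phosphosulfate (APS) to ATP.

3. Luciferase uses the ATP to convert luciferin to oxyluciferin + light.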
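The light signal can be modelled in a very simplified way: nucleotides are dispensed one at a time, and each dispensation gives a signal proportional to the number of bases incorporated. The sketch below assumes ideal, noise-free signals; the function name `pyrogram`, the dispensation order and the two allele sequences are hypothetical.

```python
def pyrogram(strand_to_synthesize, dispensation_order):
    """Return (nucleotide, signal) pairs for each dispensed nucleotide.

    The signal is the number of bases incorporated in that dispensation,
    so a homopolymer run gives a proportionally higher peak.
    """
    signals = []
    pos = 0
    for nucleotide in dispensation_order:
        incorporated = 0
        while pos < len(strand_to_synthesize) and strand_to_synthesize[pos] == nucleotide:
            incorporated += 1
            pos += 1
        signals.append((nucleotide, incorporated))
    return signals


# Hypothetical SNP: the second base to be synthesized is A in one allele, G in the other.
print(pyrogram("CAGT", "CAGT"))  # [('C', 1), ('A', 1), ('G', 1), ('T', 1)]
print(pyrogram("CGGT", "CAGT"))  # [('C', 1), ('A', 0), ('G', 2), ('T', 1)]
```

The two alleles give different peak patterns for the same dispensation order, which is what allows the SNP to be typed.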

Genic Molecular Marker (GMM)

Markers derived from genomic DNA may belong to either the transcribed or the non-transcribed part of the genome, with no information available on their functions. In contrast, GMMs are developed from coding sequences such as ESTs or fully characterized genes with known functions.

Gene-targeted markers (GTMs): derived from polymorphisms within genes, but not necessarily involved in phenotypic trait variation, e.g. untranslated regions (UTRs) of EST sequences.

Functional markers (FMs): derived from polymorphic sequences or sites within genes and, thus, more likely to be causally involved in phenotypic trait variation (e.g. candidate gene-based molecular markers).

Indirect functional markers (IFMs): the role in phenotypic trait variation is indirectly known.

Direct functional markers (DFMs): the role in phenotypic trait variation is well proven.

Expressed Sequence Tag

Unigene Sequence Data

• In silico mining: software tools are used to identify SSRs, SNPs, etc.

• Direct mapping: can be used as an RFLP probe or for designing primers for STS, CAPS, etc.

DNA Fingerprinting

• Genetic variation

• Cultivar and strain identification

• Eco-geographical variation

• Mutation detection

• Selection of diverse parents

• Germplasm conservation

Genetic Variation

[Figure: dendrogram of cultivars K. Swarna, K. Giriraj, K. Chipsona2, K. Jeevan, K. Alankar, K. Khasigaro, K. Kumar, K. Megha, K. Jyoti, K. Muthu, K. Badshah, K. Sutlej, K. Chandramukhi, K. Dewa, K. Lauvkar, K. Ashoka, K. Red, K. Safed, K. Chamatkar, K. Jawahar, K. Bahar, K. Sheetman, K. Lalima, K. Kuber; axis: similarity coefficient, 0.56 to 0.95]
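A similarity coefficient of the kind plotted on such a dendrogram can be computed from presence/absence band profiles. The slides do not say which coefficient was used; the sketch below uses the Jaccard coefficient purely as an assumed example, and the band sizes and sample names are invented.

```python
from itertools import combinations


def jaccard(bands_a, bands_b):
    """Shared bands divided by all bands seen in either profile."""
    a, b = set(bands_a), set(bands_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical band profiles (fragment sizes in bp) for three samples.
profiles = {
    "sample_1": {300, 450, 600, 800},
    "sample_2": {300, 450, 700, 800},
    "sample_3": {300, 900},
}

for first, second in combinations(profiles, 2):
    print(first, second, round(jaccard(profiles[first], profiles[second]), 2))
```

A clustering method is then applied to the pairwise values to draw the tree; that step is not shown here.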

Cultivar and Strain Identification

Descriptors

• Morphological

• Biochemical

• DNA based

[Figure: gel profiles obtained with primers OPC 05, OPC 06, OPC 09 and OPD 05]

Eco-Geographical Variation

[Figure: Eco-geographical variability in Pinus gerardiana. Dendrogram of samples Bokuta 7, Bokuta 9, Spilow 1, Dubling 5, Dubling 7, Spilow 8, Rispa 10, Kalpa 9, Kalpa 10, Kanan 10, Rispa 2, Moorang 1, Skiba 9, Shongthong 9, Kalpa 3, Kanan 2, Skiba 1, Shongthong 10, Dubling 10, Akpa 2, Akpa 10, Moorang 6, Skiba 3, Shongthong 1; axis: similarity coefficient, 0.50 to 0.95]

Genome Mapping

Assigning/locating genes/markers to a particular region of a chromosome and determining the location of, and relative distances between, genes/markers on the chromosome.

• There are two types of maps: genetic linkage map and physical map

• The genetic linkage map shows the arrangement of genes and genetic markers along the chromosomes as calculated by the frequency with which they are inherited together. The physical map is a representation of the chromosomes, providing the physical distance between landmarks on the chromosome, ideally measured in nucleotide bases. The ultimate physical map is the complete sequence itself.

Genetic Linkage Map

The three main steps of linkage map construction are:

(1) Production of a mapping population

(2) Identification of polymorphism

(3) Linkage analysis of markers
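Step (3) can be sketched for the simplest case. The example assumes a backcross population in which every individual is scored at each marker as "A" (homozygous for the recurrent-parent allele) or "H" (heterozygous), with the markers in coupling phase; individuals that differ between the two markers are counted as recombinants. The genotype strings are invented, and the Haldane mapping function is used only as one common way of converting a recombination fraction to map distance.

```python
import math

VALID_SCORES = {"A", "H"}


def recombination_fraction(marker_1, marker_2):
    """Fraction of scored individuals whose genotypes differ between two markers."""
    pairs = [
        (x, y)
        for x, y in zip(marker_1, marker_2)
        if x in VALID_SCORES and y in VALID_SCORES  # skip missing scores such as "-"
    ]
    recombinants = sum(1 for x, y in pairs if x != y)
    return recombinants / len(pairs)


def haldane_cm(r):
    """Map distance in centimorgans from a recombination fraction r < 0.5."""
    return -50.0 * math.log(1.0 - 2.0 * r)


# Hypothetical scores for 20 backcross individuals at two markers.
marker_a = "AAAAAAAAAAHHHHHHHHHH"
marker_b = "AAAAAAAAAHHHHHHHHHHA"

r = recombination_fraction(marker_a, marker_b)
print(r)                        # 2 recombinants out of 20
print(round(haldane_cm(r), 2))
```

Repeating this for every pair of markers gives the pairwise table from which marker order is deduced.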

Development of Mapping Population

[Figure: recurrent parent, donor parent and F1 hybrid, with Markers A, B, C and D]

Linkage Analysis

      B     C     D
A     14    6     4
B     ….    10    12
C     ….    ….    2
D     ….    ….    ….

Linkage Map

[Figure: linkage map showing markers A, D and C; labels: 4/20, 2/2 06/20]

Comparative Mapping or Synteny

• The grass model

• Tomato-potato co-linearity

• Map-based prediction of location of orthologous genes

Cross mapping of DNA probes in a range of cereal crops has led to the widespread identification of chromosomal regions in which marker orders are highly conserved. It is now possible to describe all the genomes of grass crop species by their relationship to a single reference genome, rice.

The potato genome was mapped into 12 linkage groups corresponding to the 12 potato chromosomes using tomato cDNA clones. Three paracentric inversions, in chromosomes 5, 9 and 10, were detected in relation to the tomato chromosomes. The other nine chromosomes are homosequential.

Physical Mapping

Species        kilobases/centimorgan
Arabidopsis    139
Tomato         510
Corn           2140

Physical and genetic distances are resolved by hybridizing clones, which define closely linked genetic markers, to DNA that has been cut with rare-cutting enzymes. If two clones hybridize to the same fragment, then the maximum distance between those two clones is the size of the restriction fragment. As the genetic distance between the loci is known, genetic and physical distances in this region can be correlated.

The correspondence between genetic and physical distance varies widely at different locations within a genome because of regions where recombination is suppressed and recombination hotspots.
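The kilobases/centimorgan figures allow a rough conversion from genetic to physical distance. The Python sketch below uses the three values from the table as genome-wide averages; because the ratio varies widely along a chromosome, the result is only a first estimate. The function name and the 0.5 cM example interval are hypothetical.

```python
# Average kilobases per centimorgan, taken from the table.
KB_PER_CM = {"Arabidopsis": 139, "Tomato": 510, "Corn": 2140}


def estimated_physical_distance_kb(species, genetic_distance_cm):
    """Rough physical distance in kb for a genetic interval given in cM."""
    return KB_PER_CM[species] * genetic_distance_cm


# A marker 0.5 cM from a gene corresponds to very different walks in each genome.
for species in KB_PER_CM:
    print(species, estimated_physical_distance_kb(species, 0.5), "kb")
```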

Fluorescence in situ Hybridization (FISH)

1. Metaphase cell

2. Array on glass slide

3. Hybridize with labelled probe

4. Fluorescent counterstaining of chromosome

5. Visualization on fluorescence microscope

Marker-assisted Selection

MAS refers to the use of DNA markers that are tightly linked to target loci as a substitute for, or to assist, phenotypic screening.

Assumption: DNA markers can reliably predict phenotype

High-resolution Linkage Mapping

• Markers must be tightly linked to target loci.

• Ideally markers should be <5 cM from a gene or QTL

RELIABILITY FOR SELECTION

Marker A - 5 cM - QTL

Using marker A only: 1 - rA = ~95%

Marker A - 5 cM - QTL - 5 cM - Marker B

Using markers A and B: 1 - 2 rA rB = ~99.5%

• Using a pair of flanking markers can greatly improve reliability but increases time and cost
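The two reliability figures can be reproduced with a few lines of Python. The only assumption added here is that a map distance of 5 cM is treated as a recombination fraction r of about 0.05, which is reasonable for short distances; the function names are hypothetical.

```python
def reliability_single(r_a):
    """Reliability of selection with one marker: 1 - rA."""
    return 1 - r_a


def reliability_flanking(r_a, r_b):
    """Reliability of selection with two flanking markers: 1 - 2 rA rB."""
    return 1 - 2 * r_a * r_b


r = 0.05  # 5 cM treated as roughly 5% recombination
print(f"{reliability_single(r):.1%}")       # 95.0%
print(f"{reliability_flanking(r, r):.1%}")  # 99.5%
```

Selection with flanking markers fails only when recombination occurs on both sides of the QTL, which is why the error term is a product of two small fractions.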


Validation of Markers

Markers should be validated by testing their effectiveness in determining the target phenotype in independent populations and different genetic backgrounds, which is referred to as ‘marker validation’. In other words, marker validation involves testing the reliability of markers to predict phenotype. This indicates whether or not a marker could be used in routine screening for MAS.

Marker Conversion

There are two instances where markers may need to be converted into other types of markers: when there are problems of reproducibility (e.g. RAPDs) and when the marker technique is complicated, time-consuming or expensive (e.g. RFLPs or AFLPs). The problem of reproducibility may be overcome by the development of sequence characterised amplified regions (SCARs) or sequence-tagged sites (STSs) derived by cloning and sequencing specific RAPD markers.

[Figure: P1 x P2, F1 and F2 generations; large populations consisting of thousands of plants, scored as resistant or susceptible]

MARKER-ASSISTED SELECTION (MAS)

Method whereby phenotypic selection is based on DNA markers


• Simpler method compared to phenotypic screening

– Especially for traits with laborious screening

– May save time and resources

• Selection at seedling stage

– Important for traits such as grain quality

– Can select before transplanting in rice

• Increased reliability

– No environmental effects

– Can discriminate between homozygotes and heterozygotes and select single plants


• More accurate and efficient selection of specific genotypes

– May lead to accelerated variety development

• More efficient use of resources

– Especially field trials


Map Based Cloning

• Marker targeting

• Chromosome walking

• Chromosome landing

Bulked Segregant Analysis (BSA)

[Figure: Chromosome Walking, with markers A and B and the gene]

[Figure: Chromosome Landing, with markers A, B and C and the gene; intervals of 0.78 cM, 0.16 cM and 11.3 cM; labels B, D, E, F]

Target Gene Identification

[Figure: gene located among markers RFLP B, RFLP G, RFLP D, RFLP H and RFLP E]

Target Gene Validation

• High-resolution mapping to demonstrate cosegregation of the candidate clone with the phenotype

• Demonstration that the expression pattern of the gene is consistent with the phenotype, for example, that it is transcribed in the appropriate tissue(s) and/or induced by the appropriate stimulus

• Determination of the DNA sequence of the gene and its comparison with those in sequence databases to identify any homology with genes of known function

• Probing mutant lines with the cDNAs to identify alterations of the DNA or mRNA from the wild type

• Complementation of the mutant phenotype by transformation with the gene
