8/9/2019 Genetic polymorphism and sequence evolution of an alternatively spliced exon of the glial fibrillary acidic protein g
Genetic polymorphism and sequence evolution of an alternatively spliced exon of the glial fibrillary acidic protein gene, GFAP
Ripudaman Singh,a Anders L. Nielsen,a,b Marianne G. Johansen,a and Arne L. Jørgensen a,*
a Institute of Human Genetics, University of Aarhus, DK-8000 Aarhus C, Denmark
b Department of Molecular Biology, University of Aarhus, DK-8000 Aarhus C, Denmark
Received 5 September 2002; accepted 29 March 2003
Abstract
Isoform GFAPε of the human cytoskeletal protein GFAP carries, as the result of alternative splicing of exon 7a of GFAP, a novel
42-amino-acid-long C-terminal region with binding capacity for the presenilin proteins. Here we show that exon 7a is present in a variety
of mammals but absent from GFAP of chicken and fish. Comparison of the mouse and human GFAP exons showed an increased rate of
nonsynonymous nucleotide substitutions in exon 7a compared to the other exons. This resulted in 10 nonconservative and 2 conservative
amino acid substitutions and suggests that exon 7a has evolved under different functional constraints. Exons 7a of humans and higher
primates are 100% identical apart from alanine codon 426, which is conserved in only 9% of the human alleles, while 21 and 70% of the
alleles, respectively, have a valine or a threonine codon at that position. Threonine represents a potential phosphorylation site, and positive
selection of that effect could explain the high allele frequency.
© 2003 Elsevier Science (USA). All rights reserved.
Keywords: Alternative splicing; Polymorphism; Allele frequency; Selection; Evolution
Glial fibrillary acidic protein (GFAP) is the principal
intermediate filament (IF) protein of the mature astrocytes
of the central nervous system. It belongs to type 3 of the IF
protein family and has a characteristic monomeric structure
composed of a highly conserved central α-helical rod domain
flanked by nonhelical head and tail domains. The
monomers form homodimers and homotetramers or heterotetramers
with other IF proteins. Further multimerization
produces the intermediate fibers of the cytoskeleton. Thus,
GFAP provides structural stability to the astrocyte and may
take part in modulating its shape and motility. Regulatory
elements directing astrocyte-specific transcription have
been identified, and synthesis of GFAP is rapidly upregulated
in activated astrocytes. The cell-limited expression of
GFAP is the basis for the routine and widespread use of
the protein as an antigen marker specific for the astrocyte
[1–5].
The human GFAP is a 432-amino-acid-long polypeptide
of 55 kDa encoded by the nine exons of GFAP, which
extend over 10 kb on chromosome 17q21 [6–8]. GFAP is
phylogenetically old. Compared with mouse Gfap [9] the
nucleotide sequence and exon/intron organization of the
human gene are highly conserved and the polypeptide
shows more than 90% homology to the mouse and pig
GFAP and about 85% homology to GFAP of the goldfish
[6,8,10]. Accordingly, antimammalian GFAP antibodies
have been used successfully in comparative immunohistochemical
studies of astrocytes in brains from birds, reptiles,
and fish [11–15].
We have previously characterized a novel human GFAP
isoform, designated GFAPε [16]. This isoform results from
alternative splicing of a novel exon embedded in intron 7
and the use of a new polyadenylation signal present in this
exon, termed exon 7a. As a result, the tail region of the
classical isoform GFAPα, encoded by exons 8 and 9, is replaced by a
new tail region encoded by exon 7a. The generated isoform
Sequence data from this article have been deposited with the EMBL/
GenBank Data Libraries under Accession Nos. AY142187–AY142200.
* Corresponding author. Fax: +45-86123173.
E-mail address: [email protected] (A.L. Jørgensen).
Available online at www.sciencedirect.com
Genomics 82 (2003) 185–193 www.elsevier.com/locate/ygeno
0888-7543/03/$ - see front matter © 2003 Elsevier Science (USA). All rights reserved.
doi:10.1016/S0888-7543(03)00106-X
GFAPε has protein binding capacity for the presenilin proteins
in vitro [16]. In the present study we show that exon 7a
is present also in GFAP of higher primates, the pig, and the
mouse, but absent from GFAP of chicken, zebrafish, and
goldfish. Interspecies comparison showed that the coding
region of exon 7a has been under evolutionary constraints
different from those on the other exons of the gene, and we
discovered a high-frequency polymorphism in this exon
among humans. We will argue that exon 7a is mammalian
specific and propose that it may confer new and advantageous
functions to the GFAPε isoform.
Results
Species comparison of the nucleotide sequences
of exon 7a
The head and especially the highly conserved rod domains
of the IF proteins secure proper dimer and tetramer
formation and higher order polymerization, while the less
conserved tail domains of the IF proteins are available for
interaction with other cytosolic proteins [17]. Fig. 1A shows
the exon/intron organization of the 3′ end of human GFAP,
and the two mRNA splice forms GFAPα and GFAPε are
indicated. Exon 7a contains a functional polyadenylation
site and GFAPε is created by splicing of exon 7a directly
onto exon 7 [16]. This results in a tail domain of the isoform
GFAPε whose amino acid sequence is different from and
one amino acid shorter than the tail domain of GFAPα
(Fig. 1B).
To study whether the nucleotide sequence of exon 7a has
been conserved during evolution we obtained genomic
DNA from nonhuman primates, including pygmy chimpan-
zee (Pan paniscus), common chimpanzee (Pan troglodytes),
gorilla (Gorilla gorilla), orangutan (Pongo pygmaeus), and
baboon (Papio), and from the domestic pig (Sus scrofa
domesticus), the mouse (Mus musculus), the rat (Rattus
norvegicus), the chicken (Gallus gallus domesticus), the
goldfish (Carassius auratus), and the zebrafish (Danio rerio)
and used these DNAs to identify and to sequence the
coding region and some of the 3′ UTR of exon 7a of GFAP.
The primers used to PCR amplify and sequence exon 7a are
described in Table 4 and under Materials and methods.
We were able to identify exon 7a only in the mammalian
species. With respect to the nonmammalian species we
amplified and sequenced the entire intron 7 of GFAP. Intron
7 is about 2.3 kb long in the human and the mouse gene, but
only 88 and 82 bp in goldfish and zebrafish, respectively,
and 675 bp in chicken (Fig. 2). The nonmammalian intron 7
sequences contained no indications of the presence of exon
7a or other alternative splicing and polyadenylation signals
(for specific intron 7 sequence information the accession
numbers for zebrafish, goldfish, and chicken are given under
Materials and methods).
Fig. 3A shows the nucleotide sequences of the
coding regions of exon 7a identified in the species listed.
The human sequence represents 12 unrelated individuals
having identical sequences apart from a polymorphism at
codon 426 of which the most frequent codon is shown. The
sequence of the common chimpanzee represents four unre-
lated individuals whose exon 7a sequences were 100%
Fig. 1. Alternative splicing of human GFAP. (A) Exon/intron organization of the 3′ end of the gene and the corresponding two mRNA splice forms GFAPα
and GFAPε. Note polyadenylation signal pA in exon 7a. (B) Amino acid sequences of the tail domain of GFAPα and GFAPε. Sequences were obtained
from Nielsen et al. [16].
R. Singh et al. / Genomics 82 (2003) 185–193
identical, while the other sequences represent one individual
from each species.
The human exon 7a nucleotide sequence is 100% iden-
tical to the exon 7a sequences in the three most closely
related higher primates (pygmy chimpanzee, common
chimpanzee, gorilla) except for codon 426. This codon
encodes alanine in all the nonhuman species listed: in the
nonhuman higher primates the alanine codon is GCG, in the
baboon it reads GCA, and in the pig and the mouse it reads
GCC. Alanine at position 426 of the polypeptide, therefore,
appears to be conserved. In humans, codon 426 can be
either a threonine codon, ACG, shown in Fig. 3A, or a
valine codon, GTG, or the ancestral alanine codon GCG.
The threonine codon results from a G to A transition at the
first position of the GCG alanine codon and represents a
nonconservative amino acid substitution, while a C to T
transition at the second position creates the valine codon
and represents a conservative amino acid substitution. The
tyrosine codon TAT at position 406 is found only in hu-
mans, the chimpanzee, and the gorilla and most likely re-
sults from a C to T transition at the first position of the
histidine codon CAT present in the orangutan, the baboon,
and the pig. The mouse has a proline codon, CCG, at
position 406.
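The codon changes described in this paragraph can be checked mechanically. The following is a minimal sketch, not part of the original analysis: it builds the standard genetic code, then reports the codon position, base change, transition/transversion class, and amino acid change for two codons that differ at one base. The function name `describe_change` is illustrative only.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

PURINES = {"A", "G"}


def describe_change(ancestral, variant):
    """Describe the single-base difference between two codons.

    Returns (codon position 1-3, base change, transition/transversion,
    amino acid change in one-letter code).
    """
    diffs = [i for i in range(3) if ancestral[i] != variant[i]]
    if len(diffs) != 1:
        raise ValueError("codons must differ at exactly one position")
    i = diffs[0]
    a, b = ancestral[i], variant[i]
    # A transition stays within purines (A, G) or within pyrimidines (C, T).
    kind = "transition" if (a in PURINES) == (b in PURINES) else "transversion"
    return i + 1, f"{a}>{b}", kind, f"{CODE[ancestral]}>{CODE[variant]}"


# Codon 426: ancestral alanine GCG versus the two human variants.
print(describe_change("GCG", "ACG"))  # threonine variant
print(describe_change("GCG", "GTG"))  # valine variant
# Codon 406: histidine CAT versus tyrosine TAT.
print(describe_change("CAT", "TAT"))
```

The same dictionary confirms that GCG, GCA, and GCC all encode alanine, so the species differences at codon 426 outside humans are synonymous.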
In addition to the species-specific A in the third position
of the alanine codon 426, the baboon sequence contains the
proline codon CCA at position 428, shared only by the
mouse, while the other higher primates have the proline
codon CCG at this position. Thus, the pattern of sequence
deviations in exon 7a among the primates, including hu-
mans, is consistent with their evolutionary relatedness.
The pig sequence has accumulated only one nucleotide
change not shared by the other species, namely the neutral
T of the glycine codon GGT at position 400. The corresponding
glycine codon in the mouse reads GGC, while
humans and the nonhuman primates have the asparagine
codon AAT at that position. All other deviations of the pig
sequence from the human and the nonhuman primate se-
quences are shared by the mouse: the glutamic acid codon
GAA at position 397, the glutamine codon CAA at position
413, the alanine codon GCC at position 426, and the leucine
codon CTC at position 430.
Five codons of the mouse sequence encode amino acids
not shared by any of the other species at these positions:
glutamine codon CAA at position 401, proline codon CCT
at position 406, valine codon GTC at position 415, glutamic
acid codon GAA at position 423, and proline codon CCT at
position 431. But the mouse sequence contains no neutral
nucleotide deviation from the human sequence that is not
shared by, at least, the rat.
The rat sequence is unique, having experienced an insertion
of the dinucleotide GC between codons 420 and 421 (Fig. 3A).
The resulting shift in reading frame has changed the specificity
of codons 421, 422, and 423 and created a stop codon, TAA,
from the TA of codon 423 and the first A of codon 424. The
tail region of the rat GFAP, therefore, not only is truncated but
also contains four amino acids at the very C-terminus that are
not found in any of the other species.
Fig. 2. Species comparison of intron 7 of GFAP. Exon 7a is present only in intron 7 of the mammalian species and is flanked by direct repeats (arrows) in
the mouse gene. Numbers refer to lengths in base pairs. UTR, 3′ untranslated region of exon 7a, i.e., from stop codon to polyadenylation signal pA. Mouse
and rat intron 7 sequences were obtained from Refs. [9] and [18]. Accession numbers for determined sequences are given under Materials and methods.
The amino acid sequences encoded by exon 7a of the
different species are aligned in Fig. 3B. Threonine at posi-
tion 426 represents the most frequent of the 3 amino acid
variants (threonine, valine, alanine) of the human-specific
polymorphism at that position. Otherwise, the amino acid
sequences are identical among the higher primates except at
position 406, where the orangutan, instead of tyrosine,
shares histidine with the baboon and the pig. The amino acid
sequences diverged 30% between humans and the mouse,
i.e., amino acid substitutions at 12 of 41 positions. Ten of
these changes are nonconservative; only the changes of
glutamic acid to aspartic acid at position 397 and valine to
isoleucine at position 415 are conservative. By contrast, the
only amino acid substitution that has occurred in the corresponding
42-amino-acid-long tail region of the isoform
GFAPα is a conservative aspartic acid to glutamic acid
substitution at position 423 [16].
The coding region of exon 7a has accumulated a unique
pattern of nucleotide changes
We conducted a sequence comparison between all 10
exons of human and mouse GFAP. The numbers listed in
Table 1 show that synonymous substitutions are more fre-
quent than nonsynonymous ones in all exons except exon
7a, for which the pattern is the opposite, with 15 nonsynonymous
and 5 synonymous substitutions. Exons 8 and 9
together contain only 1 nonsynonymous and 7 synonymous
substitutions (5 in exon 8 and 2 in exon 9; Table 1). Table 1 also lists the numbers of
nonsynonymous and synonymous sites in exon 7a, exon 8,
and exon 9. Synonymous and nonsynonymous sites are
counted as follows: If the number of possible synonymous
changes at a particular position in a codon is i, then this site
is counted as i/3 synonymous and (3 − i)/3 nonsynonymous.
The numbers of synonymous and nonsynonymous
Fig. 3. Species comparison of the coding region of exon 7a. (A) Nucleotide sequences relative to the human sequence from codon 391 to stop codon TAG
at position 432, indicated by an asterisk. Codon 426, which is polymorphic in the human population, is marked by a dot. Note the GC insertion in the rat
sequence between codons 420 and 421. (B) Amino acid sequences derived from the nucleotide sequences in (A). Alanine at position 426, marked by a dot,
is conserved among the nonhuman species. In humans, this position is most frequently occupied by threonine, less frequently by valine, and only rarely by
the ancestral alanine. Note the truncated rat sequence due to the GC insertion indicated in (A). Asterisk corresponds to stop codon in (A). Abbreviations: C.
and P. chimpanzee, common and pygmy chimpanzee. Accession numbers for determined sequences are given under Materials and methods.
sites are counted in both the human and the mouse sequence
and the average is calculated. From these numbers we calculated
the frequency of nonsynonymous substitutions per
nonsynonymous site (KA) and the frequency of synonymous
substitutions per synonymous site (KS) and their ratios (Table 2).
More synonymous than nonsynonymous nucleotide
substitutions are expected to accumulate, over time, in a
coding sequence, and the tighter a functional constraint is,
the fewer nonsynonymous substitutions are allowed. Comparisons
between human and mouse genes have identified
the KA/KS ratios to be < 1, with an average of 0.2 [19,20];
in genes encoding highly conserved amino acid sequences
KS may exceed KA by more than 25 times [21]. Accordingly,
we found that KS exceeds KA by some 30 times in the
tail region of GFAPα, encoded by the two exons 8 and 9
(KA/KS = 0.0344). In exon 7a, the nonsynonymous substitution
rate is about 18 times higher than in the combined exons 8
and 9 (0.1819 vs 0.0103) and the synonymous substitution
rate is lower (0.1873 vs 0.2997). Thus, the KA/KS ratio of
exon 7a is 0.9716, which is close to the theoretical ratio of
1 expected for a sequence under no functional constraint. A
KA/KS ratio > 1 is normally regarded as a sign of positive
selection since nonsynonymous substitutions are far more
likely than synonymous substitutions to improve the function
of a protein [19,21].
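The site-counting rule stated above (a codon position with i possible synonymous changes counts as i/3 synonymous and (3 − i)/3 nonsynonymous) can be sketched as follows. This is an illustration under stated assumptions rather than the authors' program: it uses the standard genetic code, simply treats changes to or from stop codons as nonsynonymous (the text does not say how these were handled), and the function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def site_counts(codon):
    """Return (synonymous, nonsynonymous) site counts for one codon."""
    syn = Fraction(0)
    for pos in range(3):
        # i = number of the three possible changes at this position
        # that leave the encoded amino acid unchanged.
        i = 0
        for base in BASES:
            if base == codon[pos]:
                continue
            mutant = codon[:pos] + base + codon[pos + 1:]
            if CODE[mutant] == CODE[codon]:
                i += 1
        syn += Fraction(i, 3)
    return syn, 3 - syn


def sequence_site_counts(seq):
    """Sum the per-codon site counts over an in-frame coding sequence."""
    syn = Fraction(0)
    nonsyn = Fraction(0)
    for k in range(0, len(seq) - 2, 3):
        s, n = site_counts(seq[k:k + 3])
        syn += s
        nonsyn += n
    return syn, nonsyn


print(site_counts("GCG"))  # alanine: only the third position is synonymous
print(site_counts("TTA"))  # leucine: partial synonymy at positions 1 and 3
print(sequence_site_counts("GCGCCG"))  # codons 426-427 of the ancestral allele
```

Averaging the totals obtained for the human and the mouse sequence, as described in the text, would give per-exon site numbers of the kind listed in Table 1.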
Table 1 contains the numbers of CpG dinucleotides
present in the exons of the human and mouse GFAP. Seven
CpGs are present in exon 7a of the human gene but none in
exon 7a of the mouse gene. This discrepancy is unique to
exon 7a, as the numbers of CpGs in all the other exons of
human and mouse GFAP proved to be similar. We also
counted the numbers of CpGs in the intronic sequences,
presumably under no functional constraint, between exon 7
and exon 7a and found no difference between the human
and the mouse sequences (data not shown). Because of
spontaneous deamination of the methylated C-residue of
CpG dinucleotides, these dinucleotides tend to change to
TpG or CpA, especially for CpGs present in a sequence that
is no longer subject to any functional constraint. In this respect
it is interesting that the seven CpG dinucleotides present in
the human sequence do occur as TpG or CpA in the mouse
sequence, suggesting that the human sequence is under
different functional constraints.
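The CpG comparison described above amounts to locating every CG dinucleotide in one sequence and reading the dinucleotide at the same position in an aligned second sequence. The sketch below illustrates this; the two sequences are short made-up strings (NOT the real exon 7a sequences, which are not reproduced in this section), and the function names are hypothetical.

```python
def cpg_positions(seq):
    """Return the 0-based start positions of all CpG dinucleotides in seq."""
    seq = seq.upper()
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] == "CG"]


def cpg_fate(reference, other):
    """For each CpG in `reference`, report the aligned dinucleotide in `other`.

    Both sequences must be gap-free and of equal length (an assumption of
    this sketch; a real alignment with gaps would need extra handling).
    """
    if len(reference) != len(other):
        raise ValueError("sequences must be aligned and of equal length")
    other = other.upper()
    return {i: other[i:i + 2] for i in cpg_positions(reference)}


# Hypothetical toy alignment for illustration only.
toy_a = "ACGTTCGGACGA"
toy_b = "ATGTTCAGACGA"

print(len(cpg_positions(toy_a)), len(cpg_positions(toy_b)))
print(cpg_fate(toy_a, toy_b))
```

A CpG that appears as TG or CA in the other sequence is the pattern expected from deamination of a methylated cytosine on one or the other strand.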
Codon 426 is polymorphic in the human population
Our first sequenced exon 7a of human GFAP had thre-
onine instead of the evolutionarily conserved alanine codon
at position 426. Additional exon 7a sequences obtained
from DNA from 12 unrelated individuals confirmed that the
Table 1
Characteristics of the nucleotide changes in human and mouse GFAP
Exon  Species  Amino acids (n)  Syn. subst. (n)  Nonsyn. subst. (n)  CpG (n)  Nonsyn. sites (n)  Syn. sites (n)
1     Human    154a             48               25                  26
      Mouse    153                                                   27
2     Human    20               4                0                   1
      Mouse    20                                                    1
3     Human    32               7                5                   4
      Mouse    32                                                    2
4     Human    54               17               9                   9
      Mouse    54                                                    8
5     Human    51               17               2                   15
      Mouse    51                                                    16
6     Human    65               23               3                   14
      Mouse    65                                                    12
7     Human    14               5                0                   2
      Mouse    14                                                    1
7a    Human    41               5                15                  7        92 5/6             30 1/6
      Mouse    41                                                    0
8     Human    29               5                0                   3        66                 21
      Mouse    29                                                    3
9     Human    13               2                1                   0        31 2/3             7 1/3
      Mouse    14b                                                   2
Note. Abbreviations: Syn. and Nonsyn. subst., synonymous and nonsynonymous substitutions.
a Human exon 1 carries a duplication of alanine codon 9.
b The last valine codon is duplicated in the mouse gene.
Table 2
Exon 7a has a distinct nucleotide substitution profile
Region         Amino acids  KA      KS      KA/KS
Exon 7a        41           0.1819  0.1873  0.9716
Exons 8 and 9  42           0.0103  0.2997  0.0344
Note. KA, nonsynonymous substitutions per nonsynonymous site; KS,
synonymous substitutions per synonymous site; KA/KS, the ratio between
KA and KS.
human sequence deviates from the higher primate sequence
only at codon 426 and that the site is polymorphic with two
variant codons, threonine codon ACG and valine codon
GTG. We did not find the primate alanine codon GCG in
this sample, suggesting that the frequency of this ancestral
allele is less than 10%. The frequencies of the two variants
and possibly the ancestral allele were determined by genotyping
64 unrelated healthy individuals of Danish extraction
with respect to codon 426. In our screening assay we took
advantage of a HhaI recognition site, GCGC, created by the
alanine codon GCG at position 426 and the C in the first
position of codon 427, and another HhaI site 41 bp farther
downstream in the 3′ UTR of exon 7a (P1 and P2 in Fig.
4A, see also Materials and methods). A PCR product (342
bp long) including these restriction sites will be cut by HhaI
at codon 426 only if it contains alanine codon GCG (Fig.
4B). A PCR product that is not cut here will contain
either the threonine codon ACG or the valine codon GTG.
These codons differ from each other at positions 1 and 2,
which allowed us to distinguish between the two alleles by
a subsequent PCR assay with allele-specific primers
(Fig. 4C).
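The logic of the first screening step can be illustrated with a few lines of code. This is a sketch only: it joins each codon 426 variant to the proline codon CCG at position 427 (given under Materials and methods) and tests for the HhaI recognition sequence GCGC. The function name is hypothetical, and no other flanking sequence is modelled.

```python
HHAI_SITE = "GCGC"   # HhaI recognition sequence
CODON_427 = "CCG"    # proline codon following codon 426


def creates_hhai_site(codon_426):
    """True if codon 426 plus codon 427 contains the HhaI site GCGC."""
    return HHAI_SITE in codon_426 + CODON_427


for codon, amino_acid in [("GCG", "alanine"), ("ACG", "threonine"), ("GTG", "valine")]:
    print(codon, amino_acid, creates_hhai_site(codon))

# Positions (1-based) at which the two uncut variants differ from each other.
print([i + 1 for i in range(3) if "ACG"[i] != "GTG"[i]])
```

Only the ancestral GCG codon yields a cuttable site at P1, which is why the threonine and valine alleles need the second, allele-specific PCR step.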
Table 3 lists the observed genotypes and the
distribution of individuals with respect to these genotypes.
The frequencies of the three alleles containing threonine
codon ACG, valine codon GTG, or alanine codon GCG
were calculated from these figures. Assuming Hardy–Weinberg
equilibrium we calculated the expected numbers of the
genotypes, also listed in Table 3, and found no significant
deviation from the observed numbers (p 0.7, using Wilcoxon
nonparametric test). The frequency of the threonine-containing
allele was 0.70, making it by far the most frequent
allele in this population sample, followed by a frequency of
0.21 for the allele with the valine codon, while the frequency
of the ancestral allele with the alanine codon was
only 0.09.
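The allele frequencies and Hardy–Weinberg expectations quoted above follow directly from the observed genotype counts in Table 3. The sketch below recomputes them; the function names are illustrative, and the statistical test mentioned in the text is not reproduced here.

```python
# Observed genotype counts from Table 3 (64 individuals).
observed = {
    ("ACG", "ACG"): 35,
    ("ACG", "GTG"): 12,
    ("GTG", "GTG"): 6,
    ("ACG", "GCG"): 8,
    ("GTG", "GCG"): 3,
    ("GCG", "GCG"): 0,
}


def allele_frequencies(genotype_counts):
    """Count each allele once per chromosome and normalise to frequencies."""
    counts = {}
    for (a, b), n in genotype_counts.items():
        counts[a] = counts.get(a, 0) + n
        counts[b] = counts.get(b, 0) + n
    total = sum(counts.values())
    return {allele: c / total for allele, c in counts.items()}


def hardy_weinberg_expected(genotype_counts):
    """Expected genotype numbers: n*p^2 for homozygotes, n*2pq for heterozygotes."""
    freqs = allele_frequencies(genotype_counts)
    n = sum(genotype_counts.values())
    expected = {}
    for a, b in genotype_counts:
        p = freqs[a] * freqs[b]
        expected[(a, b)] = n * (p if a == b else 2 * p)
    return expected


freqs = allele_frequencies(observed)
expected = hardy_weinberg_expected(observed)
print({allele: round(f, 2) for allele, f in freqs.items()})
print({"/".join(g): round(e) for g, e in expected.items()})
```

Rounded to whole individuals, the expected numbers agree with the "Expected" column of Table 3.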
Discussion
We found that exon 7a is present in GFAP of humans,
higher primates, the pig, and rodents, but absent from GFAP
of zebrafish, goldfish, and chicken. Exon 7a may thus have
originated in the common ancestor of the mammals. In this
respect it is interesting that 10-bp-long direct repeats have been
identified in intron 7 of the mouse gene flanking a 1.4-kb
pyrimidine- and repeat-rich sequence that contains the entire
exon 7a, including the polyadenylation signal. Flanking
direct repeats are the signature of an insertion event and we
found the 3′ flanking repeat at the same position in the
human and the rat sequences, whereas the 5′ repeat could
not be identified (Fig. 2). The polypyrimidine tract just
upstream of exon 7a may be significant because polypyri-
midine tract-binding proteins have been found to regulate
tissue-specific alternative splicing [22].
We have previously shown that exon 7a is alternatively
spliced in-frame to exon 7 of human GFAP, creating a novel
isoform, termed GFAPε, with a new tail domain. This tail
domain was shown to have a different protein binding
capacity compared to the tail domain of isoform GFAPα
encoded by the evolutionarily conserved exons 8 and 9. A
comparison of the human and mouse sequences shows that
the coding part of exon 7a has accumulated more nucleotide
changes than have exons 8 and 9 (Table 1). Furthermore,
75% of these changes are nonsynonymous (15 of 20), which
is the opposite of the distribution found in the other exons of
GFAP. In fact, the number of nonsynonymous substitutions
per nonsynonymous site (KA) nearly equals the number of synonymous
substitutions per synonymous site (KS), which could
suggest that the sequence has accumulated nucleotide
changes at random and may have lost function (Table 2).
We will argue, however, that the nucleotide changes in exon
7a result from positive (adaptive) selection of a new function
conferred by exon 7a to isoform GFAPε.
Fig. 4. Genotyping human individuals for the polymorphic codon 426 of
exon 7a of GFAP. (A) Map (not drawn to scale) of positions of primers and
the polymorphic HhaI restriction sites P1 and P2 used to study the codon
426 polymorphism. HhaI recognizes the sequence 5′-GCGC and will cut at
P1 only if codon 426 is alanine codon GCG. Primer CHK1 will prime if
codon 426 is threonine codon ACG and primer CHK2 will prime if it is
valine codon GTG. Gray area represents coding region and white area
represents 3′ untranslated region (see also Materials and methods). (B)
HhaI cutting assay for codon 426 polymorphism. A 342-bp-long PCR
product including the coding region of GFAP exon 7a was amplified from
human genomic DNA, purified, and cut with HhaI. Lanes 2 and 3 represent
uncut and cut DNA, respectively, for a DNA sample with cutting at P1 on
one allele and P2 on the other allele. Lanes 4 and 5 represent uncut and cut
DNA, respectively, for a DNA sample with cutting at P2 on both alleles.
No DNA samples were detected with cutting at P1 on both alleles. Lane 1
contains a DNA size marker with the fragment sizes indicated to the left.
(C) PCR assay to distinguish between ACG and GTG codons at position
426. A PCR assay was employed using S2R as reverse primer in combination
with each of two new forward primers, CHK1 and CHK2 (Table 4),
in which the last two nucleotides at the 3′ end have specificity for either the
ACG allele (CHK1) or the GTG allele (CHK2). In lanes 1 to 6 PCR
fragments obtained from three different DNAs were analyzed by agarose
gel electrophoresis. A DNA size marker was loaded in lane 7, with the
fragment sizes indicated to the right.
Positive selection is often defined by a KA/KS ratio > 1.
But a cut-off level of 1 means that significant functional
changes in proteins will be missed, as illustrated by a num-
ber of adaptively evolving genes whose KA/KS values lie
between 1 and 0.6 [21]. Accordingly, a KA/KS value of
0.9716 found in exon 7a is not inconsistent with positive
selection.
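The KA and KS values discussed here can be recomputed from the substitution and site counts in Table 1. The sketch below is a hedged reconstruction, not the authors' stated procedure: it assumes a Jukes–Cantor correction for multiple hits, which this section does not mention, and it takes 92 5/6 (exon 7a) and 66 + 31 2/3 (exons 8 and 9) as the nonsynonymous site counts and 30 1/6 and 21 + 7 1/3 as the synonymous ones. For exons 8 and 9 it uses the per-exon synonymous substitution counts listed in Table 1 (5 and 2). Under those assumptions the results fall close to the values in Table 2.

```python
from fractions import Fraction
from math import log


def jukes_cantor(p):
    """Jukes-Cantor distance for an observed proportion p of differing sites.

    Assumption of this sketch: the text does not name the correction used.
    """
    return -0.75 * log(1 - 4 * p / 3)


def substitution_rate(substitutions, sites):
    """Substitutions per site, corrected for multiple hits."""
    return jukes_cantor(substitutions / float(sites))


# Counts taken from Table 1 (human versus mouse).
exon_7a = {
    "nonsyn_subst": 15,
    "syn_subst": 5,
    "nonsyn_sites": Fraction(92) + Fraction(5, 6),
    "syn_sites": Fraction(30) + Fraction(1, 6),
}
exons_8_9 = {
    "nonsyn_subst": 0 + 1,
    "syn_subst": 5 + 2,
    "nonsyn_sites": Fraction(66) + Fraction(31) + Fraction(2, 3),
    "syn_sites": Fraction(21) + Fraction(7) + Fraction(1, 3),
}


def ka_ks(region):
    """Return (KA, KS, KA/KS) for one region."""
    ka = substitution_rate(region["nonsyn_subst"], region["nonsyn_sites"])
    ks = substitution_rate(region["syn_subst"], region["syn_sites"])
    return ka, ks, ka / ks


for name, region in [("Exon 7a", exon_7a), ("Exons 8 and 9", exons_8_9)]:
    ka, ks, ratio = ka_ks(region)
    print(f"{name}: KA={ka:.4f} KS={ks:.4f} KA/KS={ratio:.4f}")
```

Using the uncorrected proportions instead (15/92.83 and 5/30.17 for exon 7a) gives a similar ratio, so the conclusion that KA and KS are nearly equal in exon 7a does not depend on the correction assumed here.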
All but 2 of the 12 amino acid substitutions that result
from the 15 nonsynonymous nucleotide changes in exon 7a
are nonconservative. Survival of these 10 nonconservative
substitutions is more likely to result from positive selection
than from loss of selection due to loss of function since all
but one are present in the primates and some of these are
shared by the pig. Note that every position at which the pig
amino acid sequence deviates from the mouse sequence is
shared with the primates. No amino acid substitution is found only in the pig
(Fig. 3B). Also the discrepancy in the numbers of CpG
dinucleotides in exon 7a of human and mouse GFAP (7 vs
0) is unique (Table 1). No such discrepancy was found in
the noncoding adjacent intronic sequences or in the other
exons. Together, these observations may indicate that exon
7a, with respect to sequence evolution since the split from
the mouse lineage, has been under a constraint that does not
include other exonic or noncoding sequences.
The human polymorphism at codon 426 in exon 7a may
elucidate this point (Fig. 4 and Table 3). In all mammals
studied codon 426 encodes alanine and in all nonhuman
higher primates the alanine codon is GCG. In the human
population, however, the frequency of this ancestral allele is
only 0.09. A variant allele carrying a valine codon, GTG, as
the result of a nonsynonymous C to T transition at the
second position of the alanine codon, is more frequent
(0.21). The most frequent allele (0.70) carries a threonine
codon, ACG, at position 426, created by a G to A transition
at the first position of the alanine codon. It is intriguing that
the only nucleotide differences between exon 7a of humans
and that of the nonhuman primates are two nonsynonymous
substitutions having occurred in codon 426 and which have
created two variant alleles more frequent than the ancestral
allele. In general, nonsynonymous substitutions are func-
tionally disadvantageous and selected against and more so
for nonconservative amino acid substitutions [19]. But positive
selection may work on nonsynonymous substitutions
in a sequence that has conferred a new function to a protein.
It may therefore be significant that by far the most frequent
allele carries a nonconservative alanine-to-threonine substitution.
GFAPε thereby acquires a potential phosphorylation
site that may have an advantageous functional effect. Positive
selection of this effect could explain the high frequency
of the human-specific threonine allele.
Materials and methods
PCR amplification and sequencing of the coding region of
exon 7a
PCR-based analyses were done on genomic DNA purified
from whole blood from humans and primates, brain
tissue from the pig, liver tissue from the chicken, and whole
organisms of zebrafish and goldfish. Table 4 lists the
primers used for PCR amplification and sequencing for each
species. The primer combinations used were as follows: humans,
PCR primers 3 and 7 (annealing temperature (Ta) 56°C),
sequencing primers 6 and 7; common chimpanzee, PCR
primers 2 and 5 (Ta 56°C), sequencing primers 3 and 7;
pygmy chimpanzee, PCR primers 1 and 5 (Ta 56°C),
sequencing primers 3 and 7; gorilla, PCR primers 2 and 4
(Ta 56°C), sequencing primers 6 and 7; orangutan, PCR primers
2 and 4 (Ta 56°C), sequencing primers 6 and 7; baboon,
PCR primers 2 and 4 (Ta 56°C), sequencing primers 6 and
7; domestic pig, PCR primers 8 and 9 (Ta 57°C), sequencing
primers 8 and 9; chicken, PCR primers 10 and 11 (Ta 55°C),
sequencing primers 10 and 11; goldfish, PCR primers 12
and 13 (Ta 56°C), sequencing primers 12 and 13; zebrafish,
PCR primers 12 and 13 (Ta 56°C), sequencing primers 12
and 13; mouse, PCR primers 16 and 17 (Ta 60°C), sequencing
primers 16 and 17; and rat, PCR primers 18 and 19 (Ta
60°C), sequencing primers 19 and 20. The amplification
program, using Taq DNA polymerase (Amersham Pharmacia
Biotech, Inc.), was as follows: an initial denaturation
step at 94°C for 2 min followed by 30 cycles of PCR (94°C
for 2 min, annealing at the temperatures indicated for 1 min,
and extension at 72°C for 1 min). The quality and quantity of
each PCR product were evaluated by electrophoresis of 1/10
of its volume in a 1.25% agarose gel along with a known
amount of a 100-bp DNA ladder. The PCR product was
purified from the rest of the amplification solution (45 µl)
using a GFX PCR DNA and Gel Band Purification Kit
(Amersham Pharmacia Biotech, Inc.), dissolved in 40 µl
of double-distilled water, and quantified again. DNA
sequencing of both strands was done by following the protocol
of the DYEnamic ET Terminator Cycle Sequencing Kit
(Amersham Pharmacia Biotech, Inc.).
Table 3
Genotype distribution of the polymorphic codon 426 of exon 7a of human GFAP
Genotype   Individuals (n)       Observed alleles (n) carrying
           Observed  Expected    ACG  GTG  GCG
ACG/ACG    35        32          70   0    0
ACG/GTG    12        19          12   12   0
GTG/GTG    6         3           0    12   0
ACG/GCG    8         8           8    0    8
GTG/GCG    3         2           0    3    3
GCG/GCG    0         0           0    0    0
Sum        64        64          90   27   11
Assay for codon 426 polymorphism
DNA samples collected from 64 unrelated healthy adults
of Danish extraction were PCR amplified using primers
SFP-2 and S2R and the protocol described above. The primers
define a 342-bp-long fragment that contains the coding
sequence of exon 7a and adjacent 3′ UTR sequences (Fig.
4A). The ancestral alanine codon GCG at position 426 and
the first C of the proline codon CCG at position 427 together
form the HhaI recognition site 5′-GCGC-3′ (P1 in Fig. 4A).
Another HhaI recognition site is located 41 bp farther
downstream in the 3′ UTR (P2 in Fig. 4A). Both HhaI sites
are polymorphic, and cutting at P1 is in linkage disequilib-
rium with absence of cutting at P2 and vice versa. Cutting
at P1 results in two fragments of 179 and 163 bp and cutting
at P2 produces two fragments of 220 and 122 bp (Fig. 4B).
With a combination of PCR amplification and HhaI digestion
it is possible to detect homozygosity and heterozygosity
for the presence or absence of the ancestral alanine codon at
position 426. One-fifth of the PCR product was cut by HhaI
under conditions recommended by the supplier (New En-
gland BioLabs, Inc.) and the restriction fragments were
visualized as bands by electrophoresis in an ethidium bro-
mide-stained 2% agarose gel. Among the 64 samples we
never observed a banding pattern consistent with HhaI cut-
ting at P1 on both alleles, i.e., homozygosity for the ances-
tral alanine codon GCG. Samples that showed a heterozy-
gous banding pattern (Fig. 4B, lane 3) had either ACG or
GTG on the other allele and were genotyped by sequencing.
Absence of the GCG alanine codon on both alleles produces
the HhaI banding pattern shown in Fig. 4B, lane 5. Lack of
HhaI cutting at P1 is due to either a G to A substitution at
position 1 or a C to T substitution at position 2 of the GCG
alanine codon and hence either an ACG threonine or a GTG
valine codon at position 426. To distinguish between these
two possibilities we employed a PCR assay using S2R as
reverse primer in combination with each of two new forward
primers, CHK1 and CHK2 (Fig. 4A and Table 4), in
which the last 2 nucleotides at the 3′ end have specificity for
either the ACG allele (CHK1) or the GTG allele (CHK2).
Each sample was tested in two corresponding PCRs, per-
formed essentially as mentioned above. Production of a
PCR fragment of 184 bp using CHK1 as forward primer and
absence of a PCR product using CHK2 as forward primer
indicated the presence of the ACG (threonine) allele; the
opposite result indicated the presence of the GTG (valine)
allele, while production of a PCR product with each of the
forward primers would indicate the presence of both the
ACG and the GTG allele in the sample tested (Figs. 4A and
4C).
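
The genotyping logic described above can be summarized in a short Python sketch. It is an illustration rather than part of the original method. The function names are hypothetical, and the sketch assumes complete digestion and that every allele not cut at P1 is cut at P2, as the reported linkage disequilibrium suggests. The reported fragment sizes imply a PCR product of 342 bp (179 + 163 = 220 + 122).

```python
# Illustrative sketch only; names are hypothetical, not from the original study.
from typing import List, Set

HHAI_SITE = "GCGC"          # HhaI recognition sequence
NEXT_BASE = "C"             # first base of proline codon 427 (CCG)
P1_FRAGMENTS = (179, 163)   # bp, reported for cutting at P1
P2_FRAGMENTS = (220, 122)   # bp, reported for cutting at P2


def cuts_at_p1(codon_426: str) -> bool:
    """True if codon 426 plus the first base of codon 427 forms an HhaI site."""
    return codon_426 + NEXT_BASE == HHAI_SITE


def expected_bands(allele_a: str, allele_b: str) -> List[int]:
    """Predict band sizes (bp) for a codon-426 genotype.

    Assumption: an allele that is not cut at P1 is cut at P2, and
    digestion is complete.
    """
    bands = set()
    for codon in (allele_a, allele_b):
        bands.update(P1_FRAGMENTS if cuts_at_p1(codon) else P2_FRAGMENTS)
    return sorted(bands, reverse=True)


def alleles_from_chk_pcr(chk1_product: bool, chk2_product: bool) -> Set[str]:
    """Interpret the allele-specific PCRs: CHK1 detects ACG, CHK2 detects GTG."""
    alleles = set()
    if chk1_product:
        alleles.add("ACG")
    if chk2_product:
        alleles.add("GTG")
    return alleles


print(expected_bands("GCG", "ACG"))       # heterozygous for the ancestral codon
print(expected_bands("ACG", "GTG"))       # no ancestral codon on either allele
print(alleles_from_chk_pcr(True, False))  # product with CHK1 only
```

Under these assumptions a heterozygote for the ancestral codon shows all four bands, whereas a sample without GCG on either allele shows only the 220- and 122-bp bands and must be resolved by the CHK1 and CHK2 reactions.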
Accession numbers.
The DNA sequences determined have the following accession
numbers: human exon 7a GTG polymorphism (AY142187),
human exon 7a GCG polymorphism (AY142188), human exon 7a
ACG polymorphism (AY142191), baboon exon 7a (AY142190),
common chimpanzee exon 7a (AY142192), pygmy chimpanzee
exon 7a (AY142189), gorilla exon 7a (AY142193), orangutan
exon 7a (AY142196), pig exon 7a (AY142199), rat exon 7a
(AY142198), mouse exon 7a (AY142200), chicken intron 7
(AY142197), goldfish intron 7 (AY142194), zebrafish
intron 7 (AY142195).

Table 4
Primer description
No.  Name        Orientation  Location        Sequence
1    FW-1        Forward      Exon 7          CTC TCC CTC TGC TTT CTT TC
2    SFP-1       Forward      Exon 7          CTG CTT TCT TTC AGG ATC AC
3    SFP-2       Forward      Intron 7        CTG CAG ATC CCT GAG CAA G
4    SRP-2       Reverse      UTR of exon 7a  CAG TTA CTC TGT ACC ACG TC
5    SRP-3       Reverse      UTR of exon 7a  GAA CTG AGT CAG CAC TGA G
6    S2F         Forward      Intron 7        GCC CTT CTG AGT GTT TTC TG
7    S2R         Reverse      UTR of exon 7a  CTG CAG TTC CTG GGA AAA TG
8    Pig-F       Forward      Intron 7        CTT CTC CAA TCT GCA GAT CC
9    PR1         Reverse      Exon 7a pA      ARC ATA AAR CTT TAT TCA CT
10   ZOO-E7      Forward      Exon 7          AGA ATC ACY RTT CCK GTR CAG A
11   ZOO-E8      Reverse      Exon 8          ACC TCT CCA TCM CGM RTC TCM AC
12   FISH-Ex7    Forward      Exon 7          CAG AAC TTC ACC AAC TTA CAG
13   FISH-Ex8    Reverse      Exon 8          CGG TTC GCA CAA CTA TGC TCC
14   CHK1        Forward      Exon 7a         CAC CAG ATT GTA AAT GGA AC
15   CHK2        Forward      Exon 7a         CAC CAG ATT GTA AAT GGA GT
16   Delta       Forward      Intron 7        TAT GCT AAA GGT TAG GTT GTA TTA AC
17   Delta       Reverse      Exon 7          TTA AAA TGA ACA GCA GGG AGC ATA A
18   R-f (int)   Forward      Intron 7        GGT CTG CAA GCC ATG AAC AA
19   R-f (exon)  Forward      Exon 7a         GGG GCA AAG CAC CAA AGA
20   R-Rev       Reverse      Intron 7a       CAA GCC GGG AAA AGT ACA CA
Note. The last 2 nucleotides of primers 14 and 15 are specific for threonine and valine, respectively, at codon 426.
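
The note to Table 4 on primers CHK1 and CHK2 can be checked directly from the listed sequences. The following Python sketch, with hypothetical function names, compares the last 2 nucleotides of each primer with positions 1 and 2 of the three codon 426 variants. It assumes that the 3′ end of each forward primer aligns with codon positions 1 and 2, which is consistent with the note but is not stated explicitly in the text.

```python
# Illustrative check of the Table 4 note; function names are hypothetical.

CHK1 = "CACCAGATTGTAAATGGAAC"  # primer 14 in Table 4, spaces removed
CHK2 = "CACCAGATTGTAAATGGAGT"  # primer 15 in Table 4, spaces removed
CODON_426_VARIANTS = {"GCG": "alanine", "ACG": "threonine", "GTG": "valine"}


def three_prime_mismatches(primer: str, codon: str) -> int:
    """Count mismatches between the last 2 primer nucleotides and codon
    positions 1 and 2 (assumed alignment of the primer 3' end)."""
    return sum(p != c for p, c in zip(primer[-2:], codon[:2]))


for name, primer in (("CHK1", CHK1), ("CHK2", CHK2)):
    for codon, amino_acid in CODON_426_VARIANTS.items():
        n = three_prime_mismatches(primer, codon)
        print(f"{name} vs {codon} ({amino_acid}): {n} mismatch(es)")
```

Under this assumption each primer matches only its own allele at both 3′ positions, with one mismatch against the ancestral GCG codon and two against the other variant.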
Acknowledgments
The Danish Medical Research Council (Ældreforskning
II Grant 9502112) supported this work. We thank Samir
Deeb (University of Washington, Seattle, WA, USA) for the
primate samples. The study was done in accordance with the
guidelines of the Aarhus County Research Ethical Committee.
References
[1] E. Fuchs, K. Weber, Intermediate filaments: structure, dynamics,
functions, and disease, Annu. Rev. Biochem. 63 (1994) 345 382.[2] L.F. Eng, R.S. Ghirnikar, Y.L. Lee, Glial fibrillary acidic protein:
GFAPthirty-one years (1969 2000), Neurochem. Res. 25 (2000)
1439 1451.
[3] F. Besnard, et al., Multiple interacting sites regulate astrocyte-specific
transcription of the human gene for glial fibrillary acidic protein,
J. Biol. Chem. 266 (1991) 1887718883.
[4] R. Kaneko, N. Sueoka, Tissue-specific versus cell type-specific ex-
pression of the glial fibrillary acidic protein, Proc. Natl. Acad. Sci.
USA 90 (1993) 4698 4702.
[5] R. Kaneko, N. Hagiwara, K. Leader, N. Sueoka, Glial-specific cAMP
response of the glial fibrillary acidic protein gene in the RT4 cell
lines, Proc. Natl. Acad. Sci. USA 91 (1994) 4529 4533.
[6] S.A. Reeves, L.J. Helman, A. Allison, M.A. Israel, Molecular cloning
and primary structure of human glial fibrillary acidic protein, Proc.
Natl. Acad. Sci. USA 86 (1989) 5178 5182.[7] E. Bongcam-Rudloff, et al., Human glial fibrillary acidic protein:
complementary DNA cloning, chromosome localization, and messen-
ger RNA expression in human glioma cell lines of various pheno-
types, Cancer Res. 51 (1991) 15531560.
[8] A. Isaacs, M. Baker, F. Wavrant-De Vrieze, M. Hutton, Determina-
tion of the gene structure of human GFAP and absence of coding
region mutations associated with frontotemporal dementia with par-
kinsonism linked to chromosome 17, Genomics 51 (1998) 152154.
[9] J.M. Balcarek, N.J. Cowan, Structure of the mouse glial fibrillary
acidic protein gene: implications for the evolution of the intermediate
filament multigene family, Nucleic Acids Res. 13 (1985) 55275543.
[10] I. Cohen, M. Schwartz, cDNA clones from fish optic nerve, Comp.
Biochem. Physiol. 104B (1993) 439 447.
[11] M. Kalman, A.D. Szekely, A. Csillag, Distribution of glial fibrillary
acidic protein-immunopositive structures in the brain of the domestic
chicken (Gallus domesticus), J. Comp. Neurol. 330 (1993) 221237.[12] M. Kalman, M.B. Pritz, Glial fibrillary acidic protein-immunoposi-
tive structures in the brain of a crocodilian, Caiman crocodilus, and
its bearing on the evolution of astroglia, J. Comp. Neurol. 431 (2001)
460 480.
[13] R.C. Marcus, S.S. Easter, Expression of glial fibrillary acidic protein
and its relation to tract formation in embryonic zebrafish (Danio
rerio), J. Comp. Neurol. 359 (1995) 365381.
[14] M. Kalman, Astroglial architecture of the carp (Cyprinus carpio)
brain as revealed by immunohistochemical staining against glial
fibrillary acidic protein (GFAP), Anat. Embryol. 198 (1998) 409
433.
[15] M. Kalman, R.M. Gould, GFAP-immunopositive structures in spiny
dogfish, Squalus acanthias, and little skate, Raia erinacea, brains:
differences have evolutionary implications, Anat. Embryol. 204(2001) 59 80.
[16] A.L. Nielsen, et al., A new spliceform of glial fibrillary acidic protein,
GFAP, interacts with the presenilin proteins, J. Biol. Chem. 277
(2002) 2998329991.
[17] E. Fuchs, D.W. Cleveland, A structural scaffolding of intermediate
filaments in health and disease, Science 279 (1998) 514 519.
[18] D.F. Condorelli, et al., Structural features of the rat GFAP gene and
identification of a novel alternative transcript, J. Neurosci. Res. 56
(1999) 219 228.
[19] D. Graur, Li, W-H., Fundamentals of Molecular Evolution, 2nd
edition, Sunderland, MA, Sinauer, 2000.
[20] W. Makalowski, M.S. Boguski, Evolutionary parameters of the tran-
scribed mammalian genome: an analysis of 2,820 orthologous rodent
and human sequences, Proc. Natl. Acad. Sci. USA 95 (1998) 9407
9412.
[21] D.A. Liberles, D.R. Schreiber, S. Govindarajan, S.G. Chamberlin,
S.A. Benner, The adaptive evolution database (TAED), Genome Biol.
2 (2001) 1 6.
[22] A.D. Polydorides, H.J. Okano, Y.Y.L. Yang, G. Stefani, R.B. Darnell,
A brain-enriched polypyrimidine tract-binding protein antagonizes
the ability of nova to regulate neuron-specific alternative splicing,
Proc. Natl. Acad. Sci. USA 97 (2000) 6350 6355.
R. Singh et al. / Genomics 82 (2003) 185-193