FULL LENGTH RESEARCH PAPER
Influence of certain forces on evolution of synonymous codon usage bias in certain species of three basal orders of aquatic insects
C. SELVA KUMAR1, RAHUL R. NAIR2, K. G. SIVARAMAKRISHNAN3, D. GANESH4,
S. JANARTHANAN1, M. ARUNACHALAM5, & T. SIVARUBAN6
1Department of Zoology, University of Madras, Chennai 600 025, Tamil Nadu, India, 2Department of Biotechnology,
Sri Paramakalyani Centre for Environmental Sciences, Manonmaniam Sundaranar University, Alwarkurichi 627412,
Tamil Nadu, India, 3Department of Zoology, Madras Christian College, Tambaram East, Chennai 600 059, Tamil Nadu,
India, 4Department of Plant Biotechnology, School of Biotechnology, Madurai Kamaraj University, Madurai 625021,
Tamil Nadu, India, 5Sri Paramakalyani Centre for Environmental Sciences, Manonmaniam Sundaranar University,
Alwarkurichi 627412, Tamil Nadu, India, and 6Department of Zoology, The American College, Madurai 625002,
Tamil Nadu, India
(Received 3 February 2012; revised 3 July 2012; accepted 3 July 2012)
Abstract
Forces that influence the evolution of synonymous codon usage bias are analyzed in six species of three basal orders of aquatic insects. The rationale behind choosing six species of aquatic insects (three from Ephemeroptera, one from Plecoptera, and two from Odonata) for the present analysis is based on phylogenetic position at the basal clades of the Order Insecta facilitating the understanding of the evolution of codon bias and of factors shaping codon usage patterns in primitive clades of insect lineages and their subtle differences in some of their ecological and environmental requirements in terms of habitat–microhabitat requirements, altitudinal preferences, temperature tolerance ranges, and consequent responses to climate change impacts. The present analysis focuses on open reading frames of the 13 protein-coding genes in the mitochondrial genome of six carefully chosen insect species to get a comprehensive picture of the evolutionary intricacies of codon bias. In all the six species, A and T contents are observed to be significantly higher than G and C, and are used roughly equally. Since transcription hypothesis on codon usage demands A richness and T poorness, it is quite likely that mutation pressure may be the key factor associated with synonymous codon usage (SCU) variations in these species because the mutation hypothesis predicts AT richness and GC poorness in the mitochondrial DNA. Thus, AT-biased mutation pressure seems to be an important factor in framing the SCU variation in all the selected species of aquatic insects, which in turn explains the predominance of A and T ending codons in these species. This study does not find any association between microhabitats and codon usage variations in the mitochondria of selected aquatic insects. However, this study has identified major forces, such as compositional constraints and mutation pressure, which shape patterns of codon usage in mitochondrial genes in the primitive clades of insect lineages.
Keywords: Aquatic insects, relative synonymous codon usage, Ephemeroptera, Plecoptera, Odonata, mitochondrial DNA
Introduction
When the genetic code was deciphered in the 1960s, it
became very clear that most amino acids are encoded
by multiple codons, typically differing only at the
third position of the codon, i.e. synonymous codons
(Sharp et al. 2010). There are a total of 64 codons,
with 61 of them coding for 20 different amino acids
and the remaining three serving as stop codons.
Synonymous codons are not used at equal frequencies,
either within or between organisms (Grantham et al.
1980; Liu et al. 2011; Sablok et al. 2011; Xu et al.
2011). The trends in synonymous codon usage (SCU)
ISSN 1940-1736 print/ISSN 1940-1744 online © 2012 Informa UK, Ltd.
DOI: 10.3109/19401736.2012.710203
Correspondence: K. G. Sivaramakrishnan, Department of Zoology, Madras Christian College, Tambaram East, Chennai 600 059, Tamil Nadu, India. Tel: +91 9940490259. Fax: +91-44-22352494/3309. E-mail: [email protected]
Mitochondrial DNA, December 2012; 23(6): 447–460
Mitochondrial DNA Downloaded from informahealthcare.com by IBI Circulation - Ashley Publications Ltd on 03/07/13. For personal use only.
differ considerably among species, and also among
genes from the same species (Hershberg and Petrov
2008; Plotkin and Kudla 2010). Synonymous codons
are considered to be integrated parts of the genetic
code for maintaining its functional integrity (Biro
2008). In most genomes, SCU exhibits ‘codon usage
biases’ or nonrandom usage of codons within a single
amino acid family toward some preferred codons.
Recently, weak selection of preferred codons (codon
usage bias) has been shown as a significant evolution-
ary force (Carlini et al. 2001). This phenomenon has
been studied widely in bacterial, fungal, and insect
genes (Ikemura 1982; Sharp and Cowe 1991;
Moriyama and Powell 1997), as it is an important
link between patterns of genomic expression
and protein evolution, which in turn has high
significance in constructing models for the estimation
of evolutionary rates and implications regarding
phylogenetic reconstructions (Sarmer and Sullivan
1989; Wall and Herback 2003).
Departure from the random usage of codons has
been considered from two main perspectives: (i) base
composition constraints of genomes that generate a
bias in degenerate positions of coding regions
(Bernardi and Bernardi 1986) and (ii) natural selection
favoring protein elongation rates, i.e. translational
efficiency (Bulmer 1991) and for lessening the errors in
translating mRNA transcripts, i.e. translational accu-
racy (Akashi 1994). In addition, synonymous codon
usage bias (SCUB) is often attributed to codon/
anticodon interaction (Kurland 1993), site-specific
codon biases (Smith and Smith 1996), time of
replication (Deschavanne and Filipski 1995), codon
context (Irwin et al. 1995), and evolutionary age
(Karlin et al. 1998). In the context of molecular
evolution, the selection–mutation theory of SCU plays
a critical role by elucidating the co-evolution of
nonrandom usage of codons and tRNA content in
the context of translation optimization (Rocha 2004).
Insects are a very biodiverse taxonomic group with
high AT content in their mitochondrial genome (Sun
et al. 2009), and the gene content of the insect
mitochondrial genome is usually highly conserved but
not without some exceptions (Zhang et al. 2008).
Though codon usage prefers AT richness, nucleotide
content and codon usage bias may vary among taxa
(Sun et al. 2009). A perusal of the published literature on
SCUB reveals a paucity of information on aquatic insect
genomes. The rationale behind choosing six species of
aquatic insects (three from Ephemeroptera, one from
Plecoptera, and two from Odonata) for the present
analysis is based on the phylogenetic position of these
taxa at the base of the Order Insecta. Examining these
basal taxa should facilitate a better understanding of
the evolution of codon bias and of factors shaping
codon usage patterns in primitive clades of insect
lineages. In addition, we can examine the impact of
subtle differences in some of the ecological and
environmental requirements of these taxa in terms
of habitat–microhabitat requirements, altitudinal
preferences, temperature tolerance ranges, and con-
sequent responses to climate change impacts (Table I)
on codon bias. The present analysis focuses on open
reading frames (ORFs) of the standard 13 protein-
coding genes in the mitochondrial genome of six
carefully chosen basal insect species to get a
comprehensive picture of the evolutionary intricacies
of codon bias.
Materials and methods
Sequence data
The sequences of 13 protein-coding genes in
mitochondria of six species of aquatic insects were
retrieved from the National Center for Biotechnology
Information (NCBI), and gene IDs are given in
Table II.
Table I. Systematic position, microhabitats, and influence of climate change in habitats of selected aquatic insects.
Species | Systematic position | Altitudinal, habitat, and thermal preference | Climate change impacts
Ephemera orientalis | Ephemeroptera | Larvae burrowing on sandy substratum in streams and rivers | Moderate range shift to higher elevations
Parafronurus youi | Ephemeroptera | Mid latitudinal preference; lotic in small streams; somewhat eurythermal | Moderate range shift to higher elevations
Siphlonurus immanis | Ephemeroptera | Mid latitudinal preference; lotic in small streams; somewhat eurythermal | Moderate range shift to higher elevations
Pteronarcys princeps | Plecoptera | Preference for high-altitude streams; cold stenothermal | Range shift to higher altitudes
Davidius lunatus | Odonata | Low altitude preference; inhabits warmer water bodies | Range extension to cooler areas
Euphaea formosa | Odonata | Low altitude preference; inhabits warmer water bodies | Range extension to cooler areas
Relative synonymous codon usage (RSCU)
To study the features of SCU variations by avoiding
the influence of amino acid composition, the RSCU
values of all sequences were calculated according to
the following equation (Sharp et al. 1986).
\[
\mathrm{RSCU} = \frac{\text{observed frequency of a codon}}{\text{expected frequency if all synonymous codons for that amino acid were used equally}}
\]
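The RSCU definition above can be illustrated with a minimal Python sketch. The function name `rscu` and its arguments are hypothetical and chosen only for illustration; the alanine codon counts are those listed for E. orientalis in Table V. This is not the software used in the study, only a sketch of the ratio under the assumption that the expected frequency is the family total divided by the number of synonymous codons.

```python
def rscu(codon_counts, synonymous_family):
    """Return the RSCU of each codon in one synonymous family.

    codon_counts: dict mapping codon -> observed count (hypothetical input format)
    synonymous_family: list of codons that encode the same amino acid
    """
    total = sum(codon_counts.get(codon, 0) for codon in synonymous_family)
    if total == 0:
        # The amino acid is absent, so no usage can be measured.
        return {codon: 0.0 for codon in synonymous_family}
    # Expected count if all synonymous codons were used equally.
    expected = total / len(synonymous_family)
    return {codon: codon_counts.get(codon, 0) / expected
            for codon in synonymous_family}


# Alanine codon counts for E. orientalis as listed in Table V.
ala_counts = {"GCT": 61, "GCG": 17, "GCC": 33, "GCA": 85}
ala_rscu = rscu(ala_counts, ["GCT", "GCG", "GCC", "GCA"])
for codon, value in ala_rscu.items():
    print(codon, round(value, 3))
```

With these counts the sketch gives 1.245, 0.347, 0.673, and 1.735 for GCT, GCG, GCC, and GCA, matching the RSCU column of Table V; the values of a four-codon family always sum to 4.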
The effective number of codons (NC) (Wright 1990)
This index is a simple measure of synonymous codon
bias. The values of effective number of codons
(ENC) vary from 20 (when only one codon is used for
each amino acid) to 61 (when codons are used
randomly). If the calculated ENC is greater than 61
(because codon usage is more evenly distributed than
expected), it is adjusted to 61. ENCs of all sequences
were calculated as per the equation
\[
\mathrm{ENC} = 2 + s + \frac{29}{s^{2} + (1 - s)^{2}} \quad \text{(Wright 1990)},
\]
where $s$ = GC3 (GC content at the third codon
position).
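As a quick check of the equation above, the following Python sketch evaluates it for a given GC3 value. The function name `enc_from_gc3` is hypothetical, and the cap at 61 simply follows the adjustment described in the text; this is an illustration, not the authors' implementation.

```python
def enc_from_gc3(gc3):
    """Evaluate ENC = 2 + s + 29 / (s^2 + (1 - s)^2) with s = GC3."""
    if not 0.0 <= gc3 <= 1.0:
        raise ValueError("GC3 must be a fraction between 0 and 1")
    enc = 2.0 + gc3 + 29.0 / (gc3 ** 2 + (1.0 - gc3) ** 2)
    # Values above 61 are adjusted to 61, as stated in the text.
    return min(enc, 61.0)


# Evaluate the equation across a few GC3 values.
for s in (0.0, 0.25, 0.5):
    print(s, round(enc_from_gc3(s), 2))
```

For s = 0.5 the denominator is 0.5, so the expression gives 2 + 0.5 + 58 = 60.5; for s = 0 it gives 31.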
Sequence analysis
All sequence analysis was carried out by using
MEGA version 4.0 (Molecular Evolutionary Genetic
Analysis) (Tamura et al. 2007).
Statistical methods
Correspondence analysis (COA). COA is extensively
used to analyze multidimensional data (Perriere and
Thioulouse 2002). COA is used to analyze RSCU
data to explore the different characteristics of SCU
Table II. Name of aquatic insects and genes analyzed in this study.
Species Gene Gene ID Length
E. orientalis ATP6 7804414 677
ATP8 7804413 162
COX1 7804411 1534
COX2 7804412 688
COX3 7804415 789
CYTB 7804421 1135
ND1 7804422 942
ND2 7804410 1018
ND3 7804416 354
ND4 7804418 1342
ND4L 7804419 297
ND5 7804417 1735
ND6 7804420 522
S. immanis ATP6 8774447 678
ATP8 8774446 162
COX1 8774444 1567
COX2 8774445 688
COX3 8774448 789
CYTB 8774454 1135
ND1 8774455 939
ND2 8774443 1017
ND3 8774449 354
ND4 8774451 1342
ND4L 8774452 297
ND5 8774450 1735
ND6 8774453 519
P. youi ATP6 6973033 675
ATP8 6973032 159
COX1 6973030 1536
COX2 6973031 688
COX3 6973034 788
CYTB 6973040 1135
ND1 6973041 951
ND2 6973029 1033
ND3 6973035 355
ND4 6973037 1346
ND4L 6973038 297
ND5 6973036 1734
ND6 6973039 510
P. princeps ATP6 2943519 678
ATP8 2943516 158
COX1 2943513 1532
COX2 2943510 688
COX3 2943521 789
CYTB 2943518 1137
ND1 2943517 951
ND2 2943512 1035
ND3 2943509 353
ND4 2943511 1341
ND4L 2943514 297
ND5 2943520 1736
ND6 2943515 525
D. lunatus ATP6 7804400 677
ATP8 7804399 162
COX1 7804397 1533
COX2 7804398 688
COX3 7804401 787
CYTB 7804407 1132
ND1 7804408 942
ND2 7804396 997
ND3 7804402 352
ND4 7804404 1343
ND4L 7804405 294
ND5 7804403 1729
ND6 7804406 519
E. formosa ATP6 9725966 675
ATP8 9725965 159
COX1 9725963 1548
COX2 9725964 688
COX3 9725967 787
CYTB 9725973 1134
ND1 9725974 951
ND2 9725975 990
ND3 9725968 354
ND4 9725970 1344
ND4L 9725971 294
ND5 9725969 1723
ND6 9725972 498
Table III. Identified nucleotide contents in complete coding regions of protein-coding mitogenes in six selected species.
Species Gene A T G C A3 T3 G3 C3 GC3 ENC
E. orientalis ATP6 0.299 0.380 0.112 0.208 0.568 0.474 0.035 0.151 0.160 39.8
ATP8 0.308 0.407 0.067 0.216 0.567 0.568 0.00 0.159 0.132 37.3
COX1 0.281 0.365 0.164 0.189 0.574 0.485 0.032 0.159 0.163 40.0
COX2 0.330 0.337 0.137 0.194 0.616 0.460 0.056 0.178 0.188 43.4
COX3 0.295 0.352 0.149 0.202 0.618 0.437 0.028 0.218 0.206 43.5
CYTB 0.301 0.372 0.132 0.193 0.665 0.446 0.032 0.173 0.175 38.5
ND1 0.290 0.436 0.103 0.169 0.579 0.506 0.075 0.115 0.153 38.0
ND2 0.279 0.443 0.104 0.172 0.546 0.550 0.04 0.180 0.179 46.3
ND3 0.267 0.471 0.176 0.085 0.500 0.600 0.147 0.010 0.119 32.0
ND4 0.252 0.508 0.158 0.080 0.493 0.607 0.169 0.000 0.122 32.9
ND4L 0.291 0.442 0.176 0.090 0.487 0.631 0.156 0.025 0.131 32.7
ND5 0.335 0.419 0.074 0.170 0.728 0.417 0.016 0.145 0.139 31.1
ND6 0.241 0.475 0.192 0.091 0.416 0.636 0.190 0.022 0.153 32.1
S. immanis ATP6 0.289 0.405 0.118 0.187 0.543 0.490 0.056 0.155 0.178 35.1
ATP8 0.333 0.382 0.067 0.216 0.578 0.523 0.026 0.190 0.170 30.6
COX1 0.260 0.379 0.175 0.184 0.482 0.527 0.093 0.142 0.192 38.7
COX2 0.331 0.343 0.139 0.185 0.622 0.466 0.069 0.157 0.179 42.9
COX3 0.278 0.367 0.162 0.191 0.556 0.480 0.089 0.173 0.210 42.2
CYTB 0.291 0.374 0.138 0.196 0.617 0.446 0.056 0.188 0.204 39.6
ND1 0.255 0.456 0.194 0.092 0.469 0.584 0.204 0.029 0.167 34.8
ND2 0.265 0.441 0.112 0.180 0.485 0.522 0.109 0.149 0.206 42.8
ND3 0.265 0.457 0.127 0.149 0.500 0.529 0.135 0.156 0.222 39.3
ND4 0.268 0.456 0.177 0.097 0.495 0.549 0.179 0.053 0.177 35.6
ND4L 0.298 0.422 0.177 0.101 0.506 0.563 0.177 0.058 0.175 36.5
ND5 0.298 0.422 0.177 0.101 0.506 0.563 0.177 0.058 0.175 36.5
ND6 0.310 0.450 0.07 0.167 0.620 0.500 0.008 0.160 0.145 38.5
P. youi ATP6 0.250 0.403 0.132 0.215 0.390 0.552 0.067 0.189 0.219 41.2
ATP8 0.270 0.428 0.088 0.214 0.500 0.561 0.111 0.171 0.212 39.9
COX1 0.266 0.363 0.167 0.204 0.501 0.496 0.069 0.182 0.208 41.3
COX2 0.297 0.344 0.154 0.206 0.497 0.487 0.093 0.215 0.245 47.8
COX3 0.274 0.352 0.155 0.219 0.542 0.425 0.051 0.270 0.267 45.5
CYTB 0.259 0.382 0.145 0.215 0.492 0.463 0.081 0.242 0.267 44.6
ND1 0.235 0.443 0.223 0.100 0.407 0.515 0.317 0.051 0.266 41.9
ND2 0.270 0.416 0.118 0.196 0.500 0.515 0.076 0.177 0.206 41.1
ND3 0.263 0.401 0.120 0.216 0.507 0.456 0.052 0.272 0.271 43.4
ND4 0.272 0.444 0.186 0.099 0.539 0.492 0.195 0.052 0.185 40.3
ND4L 0.259 0.444 0.199 0.098 0.437 0.469 0.310 0.086 0.296 46.9
ND5 0.293 0.412 0.188 0.108 0.496 0.529 0.213 0.063 0.206 40.8
ND6 0.265 0.435 0.092 0.208 0.446 0.521 0.058 0.219 0.231 42.6
P. princeps ATP6 0.310 0.415 0.100 0.176 0.609 0.495 0.033 0.142 0.145 38.4
ATP8 0.290 0.371 0.162 0.178 0.589 0.502 0.047 0.130 0.145 37.3
COX1 0.303 0.376 0.141 0.180 0.550 0.589 0.025 0.107 0.109 38.6
COX2 0.354 0.392 0.076 0.177 0.647 0.571 0.029 0.119 0.115 41.2
COX3 0.301 0.395 0.120 0.184 0.571 0.478 0.043 0.144 0.160 42.2
CYTB 0.275 0.378 0.147 0.200 0.583 0.502 0.017 0.170 0.160 38.5
ND1 0.302 0.404 0.113 0.181 0.671 0.455 0.025 0.158 0.154 34.8
ND2 0.287 0.443 0.172 0.098 0.536 0.556 0.175 0.027 0.147 36.1
ND3 0.274 0.464 0.169 0.093 0.525 0.561 0.179 0.024 0.148 37.6
ND4 0.276 0.495 0.152 0.077 0.606 0.529 0.182 0.012 0.133 35.1
ND4L 0.356 0.389 0.086 0.170 0.706 0.401 0.024 0.176 0.161 39.2
ND5 0.294 0.385 0.134 0.187 0.612 0.458 0.024 0.202 0.193 38.7
ND6 0.239 0.476 0.189 0.096 0.422 0.618 0.229 0.022 0.177 34.1
D. lunatus ATP6 0.342 0.326 0.128 0.203 0.662 0.302 0.090 0.218 0.253 41.4
ATP8 0.413 0.308 0.086 0.191 0.676 0.359 0.088 0.333 0.302 59.0
COX1 0.301 0.324 0.175 0.197 0.625 0.368 0.103 0.186 0.235 43.9
COX2 0.346 0.297 0.162 0.194 0.613 0.393 0.104 0.236 0.258 49.2
COX3 0.321 0.320 0.158 0.199 0.703 0.333 0.089 0.214 0.240 39.6
CYTB 0.330 0.351 0.135 0.182 0.719 0.412 0.057 0.169 0.183 38.8
ND1 0.214 0.469 0.196 0.120 0.307 0.606 0.278 0.108 0.278 41.9
ND2 0.358 0.334 0.111 0.196 0.708 0.315 0.068 0.265 0.256 37.9
ND3 0.347 0.333 0.113 0.206 0.734 0.305 0.048 0.242 0.231 36.3
ND4 0.225 0.468 0.191 0.114 0.375 0.570 0.256 0.091 0.251 41.0
ND4L 0.217 0.527 0.166 0.088 0.344 0.678 0.293 0.035 0.206 35.8
ND5 0.237 0.467 0.186 0.109 0.365 0.649 0.236 0.051 0.203 33.8
ND6 0.366 0.352 0.096 0.185 0.700 0.368 0.081 0.241 0.250 42.8
variations across the protein-coding genes in the
mitochondria of six chosen species of aquatic insects.
Correlation analysis. Correlation analysis was used to
analyze the relationship between nucleotide
compositions and SCU patterns. This analysis was
implemented based on the Pearson correlation
method. All statistical processes were carried out
with the statistical software PAST version 2.12 (Hammer
et al. 2001).
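The Pearson correlation used here can be sketched in plain Python as follows. The function name `pearson_r` is hypothetical, and the study itself used the PAST package; the example inputs are the T and T3 columns for the 13 genes of E. orientalis as listed in Table III.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation coefficient of two equal-length lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / math.sqrt(sxx * syy)


# T and T3 contents of the 13 E. orientalis genes (Table III, ATP6 to ND6).
t_content = [0.380, 0.407, 0.365, 0.337, 0.352, 0.372, 0.436,
             0.443, 0.471, 0.508, 0.442, 0.419, 0.475]
t3_content = [0.474, 0.568, 0.485, 0.460, 0.437, 0.446, 0.506,
              0.550, 0.600, 0.607, 0.631, 0.417, 0.636]
r_t_t3 = pearson_r(t_content, t3_content)
print(round(r_t_t3, 3))
```

The same function can be applied to any pair of columns in Table III to compare against the coefficients reported in Table IV.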
Results
Compositional properties of ORFs of 13 protein-coding
genes
Complex correlations were observed between A, T, G,
C and A3, T3, G3, C3, GC3 contents in all the six
species (Tables III and IV). In Ephemera orientalis,
significant correlations were found between A3, G3, C3
and A, T, G, C contents, but T3 was not correlated
with G. Remarkably, in E. orientalis, GC3 was
observed to be highly correlated with T content,
whereas in other species, individual nucleotide
contents exhibited no correlation with GC3. Con-
sidering Parafronurus youi, T3 showed no correlation
with individual nucleotide contents, and A3 was found
to be correlated only with T, G, C contents and it was
not correlated with A. In Siphlonurus immanis, T3 was
correlated only with T, whereas A3 and G3 were found
to have correlations with A, G, C contents. C3
exhibited high correlations with all individual nucleo-
tide contents except with A. In Pteronarcys princeps, G3
was highly correlated with all individual nucleotide
contents but T3 showed no correlations with A, T, G,
C. Higher correlations existed between A3 and T, A,
G, whereas C3 was correlated with T, G, C. In
Davidius lunatus and Euphaea formosa, individual
nucleotide contents were highly correlated with T3,
A3, G3, C3. These data suggest that compositional
constraints greatly influence codon usage variations in
all the six selected species, particularly in D. lunatus
and E. formosa.
Characteristics of overall RSCU
Overall codon usage pattern of all the six species is
summarized in Table V. Insect mitochondrial genomes
use AGA and AGG (arginine in the universal code) to
code for Ser, AUA (isoleucine in the universal code) to
code for methionine, UGA (termination codon in the
universal code) to code for tryptophan, but in certain
groups, AGG is either used for coding Lys instead of
Ser or absent (Abascal et al. 2006). Since insect
mitochondrial genomes are AT rich, all the amino
acids were found to use A and T ending codons
preferentially in all the chosen species. From the
overall RSCU values, it can be assumed that
compositional constraints played a role in shaping
codon usage variation across genes of all six species.
Codons with RSCU value greater than one are
preferred codons and those with RSCU value less than
0.66 are considered as rare codons. If the RSCU value
of any codon falls between 0.66 and 1, such codons
are termed as intermediate codons.
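The three RSCU classes described above can be expressed as a short Python sketch. The function name `classify_rscu` is hypothetical; the thresholds (greater than 1, less than 0.66, and the range in between) are those stated in the text, and the example values are the alanine RSCU values of E. orientalis from Table V.

```python
def classify_rscu(value):
    """Assign an RSCU value to the class used in the text."""
    if value > 1.0:
        return "preferred"
    if value < 0.66:
        return "rare"
    # Values from 0.66 up to and including 1 fall in the middle class.
    return "intermediate"


# Alanine RSCU values of E. orientalis from Table V.
ala_rscu = {"GCT": 1.245, "GCG": 0.347, "GCC": 0.673, "GCA": 1.735}
ala_classes = {codon: classify_rscu(v) for codon, v in ala_rscu.items()}
for codon, label in ala_classes.items():
    print(codon, label)
```

Under these thresholds GCT and GCA are preferred, GCC is intermediate, and GCG is rare in this family.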
Characteristics of strand-specific codon usage
Since mitochondrial DNA does not follow Chargaff’s
parity rule (Buehler 2006; Nikolaou and Almirantis
2006), codon usage analysis of the majority (J strand)
and the minority strand (N strand) of these insect
genomes would be appropriate to understand different
mutation biases acting on the genes of these two
strands (Table VI). Strand-specific analysis of codon
usage reveals that in the selected species of aquatic
insects, the third codon position of most of the amino
acids in majority strand-encoded genes is biased
Table III – continued
Species Gene A T G C A3 T3 G3 C3 GC3 ENC
E. formosa ATP6 0.370 0.330 0.111 0.188 0.770 0.323 0.044 0.177 0.179 36.3
ATP8 0.465 0.302 0.082 0.151 0.800 0.412 0.025 0.147 0.115 30.2
COX1 0.333 0.308 0.168 0.191 0.718 0.324 0.072 0.177 0.202 36.8
COX2 0.384 0.287 0.157 0.173 0.742 0.368 0.092 0.175 0.197 40.5
COX3 0.338 0.309 0.156 0.197 0.781 0.312 0.055 0.188 0.195 35.4
CYTB 0.352 0.325 0.138 0.185 0.807 0.320 0.073 0.169 0.194 33.7
ND1 0.212 0.483 0.192 0.113 0.318 0.674 0.229 0.063 0.209 33.4
ND2 0.405 0.318 0.112 0.165 0.767 0.310 0.091 0.220 0.228 35.8
ND3 0.396 0.322 0.113 0.170 0.915 0.261 0.012 0.185 0.154 28.3
ND4 0.181 0.539 0.178 0.102 0.220 0.748 0.233 0.042 0.188 30.2
ND4L 0.180 0.561 0.180 0.078 0.226 0.753 0.258 0.035 0.196 29.2
ND5 0.208 0.499 0.184 0.110 0.275 0.728 0.226 0.054 0.192 32.8
ND6 0.392 0.343 0.112 0.153 0.730 0.406 0.090 0.172 0.194 39.1
Note: All the nucleotide contents are in fraction of 1/100.
toward A and in minority strand-encoded genes, the
third codon position of most of the amino acids is
biased toward T, although some exceptions do occur.
Variations of ENC values and GC distributions at
silent sites suggested that, in addition to compositional
constraints, some other forces may also be involved in
shaping codon usage patterns.
ENC vs. GC3 plot
ENC vs. GC3 plots give a graphic display of codon
usage patterns across a number of genes. The axes of
this plot are independent of the data and effectively
demonstrate intraspecific and interspecific SCU
patterns (Wright 1990). The ENC vs. GC3 plot was
reported to be highly effective in interpreting patterns
of SCU variations if GC content of the genome
significantly differs from 0.50 (Wright 1990). Such
extreme GC content was encountered in all six
species of aquatic insects under analysis, so the plot
was used to ascertain the influence of two well-discussed
evolutionary forces, GC compositional constraints and
natural selection for translational accuracy. ENC and
GC3 values were calculated and plotted (Figure 1a–f)
for the six species in this study using the null
hypothesis that expected SCU pattern is influenced
only by GC compositional constraints. If a particular
gene is subjected to GC compositional constraints, it
will lie on or just below the expected GC3 curve. If the
SCU pattern of a gene is influenced by translational
selection, it will lie considerably below the GC3 curve.
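The comparison of each gene with the expected curve can be sketched in Python as below. The function names `expected_enc` and `enc_deviation` are hypothetical; the expected curve is assumed to be the equation ENC = 2 + s + 29/(s² + (1 − s)²) given in Materials and methods, and the three example genes use the GC3 and ENC values listed for E. orientalis in Table III. The text does not state how far below the curve counts as "considerably", so no cutoff is applied here.

```python
def expected_enc(gc3):
    """Expected ENC under GC3 compositional constraint alone (Wright 1990)."""
    return 2.0 + gc3 + 29.0 / (gc3 ** 2 + (1.0 - gc3) ** 2)


def enc_deviation(observed_enc, gc3):
    """Distance of a gene below (positive) or above (negative) the expected curve."""
    return expected_enc(gc3) - observed_enc


# (GC3, observed ENC) for three E. orientalis genes from Table III.
genes = {"ATP6": (0.160, 39.8), "COX1": (0.163, 40.0), "ND5": (0.139, 31.1)}
deviations = {name: enc_deviation(enc, gc3) for name, (gc3, enc) in genes.items()}
for name, value in deviations.items():
    print(name, round(value, 2))
```

A positive deviation means the gene lies below the curve; how large a deviation should be read as evidence of translational selection is a judgment the sketch leaves open.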
The grouping of the majority of genes on or below
the left-hand side of the expected GC curve (Figure 1)
indicates that in all six species of aquatic insects (three
in Ephemeroptera, one in Plecoptera, and two in
Odonata) extreme compositional constraints are
involved in shaping SCU pattern across genes.
Independent evolution of codon bias and tRNA
anticodons in insect mitochondrial genome minimizes
the possibilities of natural selection for translational
accuracy in shaping codon usage (Sun et al. 2009).
Since the mitochondrial genomes are AT rich and GC
poor in all the six species, A and T ending codons were
used roughly equally, and more frequently than G and
C ending codons. This would substantiate the role of
mutation pressure rather than the transcription hypothesis
of codon usage (Xia 1996) in framing SCU
patterns of genes in all six species of aquatic insects
(Figure 2).
Correspondence analysis. A multivariate statistical
analysis based on COA was used to examine the
variation in codon usage patterns among the 13
protein-coding genes of the six basal insect species in
this study. The complete coding region of each gene in
our study was represented as a 62-dimensional vector
(62 synonymous codons in insect mitochondria), and
Table IV. Pearson correlation analysis between A, T, G, C contents and A3, T3, G3, C3, GC3 contents in ORFs of 13 protein-coding genes of six chosen species.
Species | Content | T3 | A3 | G3 | C3 | GC3
E. orientalis | T | 0.787 | -0.644 | 0.748 | -0.845 | -0.734
E. orientalis | A | -0.733 | 0.845 | -0.743 | 0.664 | 0.293
E. orientalis | G | 0.482 | -0.643 | 0.758 | -0.574 | -0.047
E. orientalis | C | -0.784 | 0.704 | -0.950 | 0.949 | 0.629
P. youi | T | 0.441 | -0.394 | 0.603 | 0.652 | -0.108
P. youi | A | -0.080 | 0.638 | -0.263 | 0.119 | -0.332
P. youi | G | -0.259 | -0.137 | 0.785 | -0.685 | 0.221
P. youi | C | -0.069 | 0.175 | -0.932 | 0.931 | 0.002
S. immanis | T | 0.770 | -0.515 | 0.417 | -0.680 | 0.507
S. immanis | A | -0.508 | 0.793 | -0.604 | 0.505 | -0.042
S. immanis | G | 0.346 | -0.570 | 0.808 | -0.614 | 0.113
S. immanis | C | -0.810 | 0.565 | -0.795 | 0.957 | 0.455
P. princeps | T | 0.451 | -0.485 | 0.921 | -0.861 | 0.048
P. princeps | A | -0.468 | 0.809 | -0.658 | 0.528 | -0.357
P. princeps | G | 0.537 | -0.823 | 0.725 | -0.641 | 0.228
P. princeps | C | -0.512 | 0.524 | -0.965 | 0.934 | 0.032
D. lunatus | T | 0.946 | -0.913 | 0.937 | -0.904 | -0.305
D. lunatus | A | -0.881 | 0.911 | -0.910 | 0.956 | 0.348
D. lunatus | G | 0.700 | -0.775 | 0.742 | -0.807 | -0.261
D. lunatus | C | -0.984 | 0.945 | -0.960 | 0.888 | 0.254
E. formosa | T | 0.960 | -0.966 | 0.946 | -0.953 | 0.205
E. formosa | A | -0.881 | 0.938 | -0.930 | 0.887 | -0.396
E. formosa | G | 0.687 | -0.776 | 0.800 | -0.690 | 0.543
E. formosa | C | -0.947 | 0.902 | -0.892 | 0.917 | -0.053
Notes: Figures in bold and italics indicate the significance at p < 0.005. Figures in bold without italics indicate significance at p < 0.05.
Table V. Overall synonymous codon usage data of protein-coding mitogenes of E. orientalis, S. immanis, P. youi, P. princeps, D. lunatus, and E. formosa.
Orders: Ephemeroptera (E. orientalis, S. immanis, P. youi); Plecoptera (P. princeps); Odonata (D. lunatus, E. formosa).
AA | Codon | E. orientalis (N, RSCU) | S. immanis (N, RSCU) | P. youi (N, RSCU) | P. princeps (N, RSCU) | D. lunatus (N, RSCU) | E. formosa (N, RSCU)
A | GCT | 61, 1.245 | 121, 2.262 | 76, 1.512 | 138, 2.509 | 105, 1.850 | 109, 2.086
A | GCG | 17, 0.347 | 3, 0.056 | 6, 0.119 | 7, 0.127 | 14, 0.247 | 12, 0.230
A | GCC | 33, 0.673 | 29, 0.542 | 29, 0.577 | 27, 0.491 | 46, 0.811 | 43, 0.823
A | GCA | 85, 1.735 | 61, 1.140 | 90, 1.791 | 48, 0.873 | 62, 1.093 | 45, 0.861
C | TGT | 40, 1.778 | 36, 1.946 | 44, 1.833 | 39, 1.814 | 40, 1.860 | 32, 1.829
C | TGC | 5, 0.222 | 1, 0.054 | 4, 0.167 | 4, 0.186 | 3, 0.140 | 3, 0.171
D | GAT | 59, 1.513 | 50, 1.389 | 53, 1.359 | 59, 1.662 | 49, 1.400 | 48, 1.371
D | GAC | 19, 0.487 | 22, 0.611 | 25, 0.641 | 12, 0.338 | 21, 0.600 | 22, 0.629
E | GAG | 17, 0.442 | 12, 0.308 | 17, 0.410 | 16, 0.381 | 18, 0.429 | 15, 0.375
E | GAA | 60, 1.558 | 66, 1.692 | 66, 1.590 | 68, 1.619 | 66, 1.571 | 65, 1.625
F | TTT | 258, 1.491 | 237, 1.514 | 274, 1.579 | 256, 1.652 | 254, 1.568 | 242, 1.546
F | TTC | 88, 0.509 | 76, 0.486 | 73, 0.421 | 54, 0.348 | 70, 0.432 | 71, 0.454
G | GGT | 52, 0.870 | 104, 1.671 | 73, 1.222 | 68, 1.143 | 66, 1.043 | 85, 1.377
G | GGG | 53, 0.887 | 26, 0.418 | 45, 0.753 | 34, 0.571 | 56, 0.885 | 71, 1.150
G | GGC | 10, 0.167 | 8, 0.129 | 10, 0.167 | 14, 0.235 | 16, 0.253 | 21, 0.340
G | GGA | 124, 2.075 | 111, 1.783 | 111, 1.858 | 122, 2.050 | 115, 1.818 | 70, 1.134
H | CAC | 45, 1.084 | 29, 0.652 | 33, 0.868 | 17, 0.420 | 39, 0.876 | 33, 0.742
H | CAT | 38, 0.916 | 60, 1.348 | 43, 1.132 | 64, 1.580 | 50, 1.124 | 56, 1.258
I | ATT | 251, 1.579 | 255, 1.735 | 250, 1.667 | 266, 1.744 | 247, 1.698 | 274, 1.797
I | ATC | 67, 0.421 | 39, 0.265 | 50, 0.333 | 39, 0.256 | 44, 0.302 | 31, 0.203
K | AAA | 55, 1.264 | 46, 1.243 | 52, 1.182 | 42, 1.135 | 35, 0.959 | 47, 1.270
K | AAG | 32, 0.736 | 28, 0.757 | 36, 0.818 | 32, 0.865 | 38, 1.041 | 27, 0.730
L | CTA | 106, 1.963 | 79, 1.690 | 92, 1.968 | 69, 1.643 | 67, 1.207 | 62, 1.485
L | CTC | 11, 0.204 | 14, 0.299 | 7, 0.150 | 13, 0.310 | 31, 0.559 | 9, 0.216
L | CTG | 24, 0.444 | 4, 0.086 | 7, 0.150 | 5, 0.119 | 9, 0.162 | 7, 0.168
L | CTT | 75, 1.389 | 90, 1.925 | 81, 1.733 | 81, 1.929 | 115, 2.072 | 89, 2.132
L | TTA | 223, 1.416 | 383, 1.714 | 230, 1.438 | 428, 1.772 | 325, 1.548 | 413, 1.754
L | TTG | 92, 0.584 | 64, 0.286 | 90, 0.563 | 55, 0.228 | 95, 0.452 | 58, 0.246
M | ATG | 60, 0.480 | 35, 0.355 | 58, 0.448 | 32, 0.344 | 31, 0.343 | 37, 0.392
M | ATA | 190, 1.520 | 162, 1.645 | 201, 1.552 | 154, 1.656 | 150, 1.657 | 152, 1.608
N | AAC | 43, 0.577 | 24, 0.340 | 45, 0.556 | 22, 0.297 | 37, 0.525 | 32, 0.441
N | AAT | 106, 1.423 | 117, 1.660 | 117, 1.444 | 126, 1.703 | 104, 1.475 | 113, 1.559
P | CCT | 63, 1.800 | 91, 2.476 | 54, 1.565 | 75, 2.000 | 85, 2.179 | 92, 2.437
P | CCG | 5, 0.143 | 3, 0.082 | 1, 0.029 | 5, 0.133 | 8, 0.205 | 6, 0.159
P | CCC | 26, 0.743 | 13, 0.354 | 14, 0.406 | 24, 0.640 | 34, 0.872 | 18, 0.477
P | CCA | 46, 1.314 | 40, 1.088 | 69, 2.000 | 46, 1.227 | 29, 0.744 | 35, 0.927
Q | CAG | 15, 0.395 | 8, 0.216 | 15, 0.357 | 12, 0.316 | 10, 0.244 | 12, 0.300
Q | CAA | 61, 1.605 | 66, 1.784 | 69, 1.643 | 64, 1.684 | 72, 1.756 | 68, 1.700
R | CGA | 28, 1.806 | 23, 1.586 | 27, 1.895 | 36, 2.323 | 25, 1.695 | 25, 1.754
R | CGC | 6, 0.387 | 1, 0.069 | 0, 0.000 | 2, 0.129 | 7, 0.475 | 7, 0.491
R | CGG | 4, 0.258 | 5, 0.345 | 1, 0.070 | 4, 0.258 | 10, 0.678 | 4, 0.281
Table V – continued
Orders: Ephemeroptera (E. orientalis, S. immanis, P. youi); Plecoptera (P. princeps); Odonata (D. lunatus, E. formosa).
AA | Codon | E. orientalis (N, RSCU) | S. immanis (N, RSCU) | P. youi (N, RSCU) | P. princeps (N, RSCU) | D. lunatus (N, RSCU) | E. formosa (N, RSCU)
R | CGT | 24, 1.548 | 29, 2.000 | 29, 2.035 | 20, 1.290 | 17, 1.153 | 21, 1.474
S | AGC | 16, 0.376 | 8, 0.184 | 9, 0.231 | 9, 0.218 | 12, 0.287 | 11, 0.251
S | AGA | 73, 1.718 | 70, 1.614 | 65, 1.667 | 62, 1.503 | 65, 1.552 | 68, 1.550
S | TCA | 65, 1.529 | 68, 1.568 | 85, 2.179 | 81, 1.964 | 48, 1.146 | 56, 1.276
S | TCC | 31, 0.729 | 28, 0.646 | 15, 0.385 | 21, 0.509 | 38, 0.907 | 14, 0.319
S | TCG | 13, 0.306 | 5, 0.115 | 6, 0.154 | 6, 0.145 | 9, 0.215 | 8, 0.182
S | TCT | 96, 2.259 | 109, 2.513 | 92, 2.359 | 101, 2.448 | 102, 2.436 | 134, 3.054
S | AGG | 1, 0.024 | 1, 0.023 | 1, 0.026 | 0, 0.000 | 0, 0.000 | 1, 0.023
S | AGT | 45, 1.059 | 58, 1.337 | 39, 1.000 | 50, 1.212 | 61, 1.457 | 59, 1.345
T | ACA | 83, 1.660 | 86, 1.607 | 116, 2.231 | 78, 1.425 | 57, 1.101 | 84, 1.631
T | ACC | 34, 0.680 | 28, 0.523 | 29, 0.558 | 21, 0.384 | 48, 0.928 | 28, 0.544
T | ACG | 10, 0.200 | 3, 0.056 | 4, 0.077 | 8, 0.146 | 12, 0.232 | 2, 0.039
T | ACT | 73, 1.460 | 97, 1.813 | 59, 1.135 | 112, 2.046 | 90, 1.739 | 92, 1.786
V | GTC | 16, 0.279 | 4, 0.070 | 14, 0.225 | 19, 0.345 | 14, 0.243 | 9, 0.162
V | GTG | 21, 0.367 | 12, 0.211 | 17, 0.273 | 20, 0.364 | 25, 0.435 | 23, 0.414
V | GTT | 78, 1.362 | 100, 1.754 | 101, 1.622 | 91, 1.655 | 102, 1.774 | 89, 1.604
V | GTA | 114, 1.991 | 112, 1.965 | 117, 1.880 | 90, 1.636 | 89, 1.548 | 101, 1.820
W | TGA | 89, 1.780 | 83, 1.596 | 80, 1.584 | 98, 1.885 | 77, 1.540 | 82, 1.562
W | TGG | 11, 0.220 | 21, 0.404 | 21, 0.416 | 6, 0.115 | 23, 0.460 | 23, 0.438
Y | TAC | 50, 0.606 | 21, 0.266 | 41, 0.491 | 32, 0.408 | 35, 0.461 | 26, 0.313
Y | TAT | 115, 1.394 | 137, 1.734 | 126, 1.509 | 125, 1.592 | 117, 1.539 | 140, 1.687
Note: Figures in bold letters are preferred codons for corresponding amino acids.
Table VI. Strand-specific codon usage data (RSCU) of protein-coding mitogenes of E. orientalis, S. immanis, P. youi, P. princeps, D. lunatus, and E. formosa.

           Ephemeroptera                                   Plecoptera      Odonata
           E. orientalis   S. immanis      P. youi         P. princeps     D. lunatus      E. formosa
AA  Codon  J      N        J      N        J      N        J      N        J      N        J      N
A   GCT    1.794  3.077    1.896  2.432    1.671  2.173    2.490  2.551    0.769  2.182    0.716  3.104
A   GCG    0.000  0.154    0.119  0.432    0.082  0.543    0.000  0.406    0.215  0.606    0.119  0.119
A   GCC    0.824  0.051    1.067  0.378    1.151  0.198    0.689  0.058    0.800  0.424    0.627  0.478
A   GCA    1.382  0.718    0.919  0.757    1.096  1.086    0.821  0.986    2.215  0.788    2.537  0.299
C   TGT    1.833  2.000    1.818  1.833    1.846  1.867    1.538  1.933    1.333  1.939    1.455  1.946
C   TGC    0.167  0.000    0.182  0.167    0.154  0.133    0.462  0.067    0.667  0.061    0.545  0.054
D   GAT    1.120  2.000    1.234  1.652    1.265  1.714    1.510  2.000    1.283  2.000    0.980  2.000
D   GAC    0.880  0.000    0.766  0.348    0.735  0.286    0.490  0.000    0.717  0.000    1.020  0.000
E   GAG    0.042  0.733    0.122  0.774    0.208  0.722    0.040  0.882    0.163  0.929    0.143  0.963
E   GAA    1.958  1.267    1.878  1.226    1.792  1.278    1.960  1.118    1.837  1.071    1.857  1.037
F   TTT    1.281  1.884    1.405  1.780    1.374  1.860    1.420  1.955    1.182  1.812    1.245  1.862
F   TTC    0.719  0.116    0.595  0.220    0.626  0.140    0.580  0.045    0.818  0.188    0.755  0.138
G   GGT    1.007  2.566    1.028  1.864    1.086  0.980    1.099  1.208    0.541  1.283    0.511  2.113
G   GGG    0.280  0.604    1.083  1.243    0.371  1.647    0.254  1.042    0.541  1.321    0.692  0.830
G   GGC    0.196  0.038    0.417  0.233    0.344  0.118    0.366  0.042    0.120  0.226    0.060  0.302
G   GGA    2.517  0.792    1.472  0.660    2.199  1.255    2.282  1.708    2.797  1.170    2.737  0.755
H   CAC    0.800  0.105    0.899  0.200    1.000  0.353    0.515  0.000    1.265  0.267    1.032  0.143
H   CAT    1.200  1.895    1.101  1.800    1.000  1.647    1.485  2.000    0.735  1.733    0.968  1.857
I   ATT    1.668  1.888    1.725  1.957    1.602  1.929    1.660  1.919    1.427  1.886    1.530  1.940
I   ATC    0.332  0.112    0.275  0.043    0.398  0.071    0.340  0.081    0.573  0.114    0.470  0.060
K   AAA    1.647  0.900    1.455  1.122    1.097  0.857    1.556  0.737    1.667  0.606    1.544  0.516
K   AAG    0.353  1.100    0.545  0.878    0.903  1.143    0.444  1.263    0.333  1.394    0.456  1.484
L   CTA    1.855  0.381    1.493  1.455    1.170  1.412    1.694  1.333    2.395  0.814    3.017  0.174
L   CTC    0.289  0.381    0.269  0.000    0.617  0.235    0.333  0.167    0.255  0.068    0.237  0.000
L   CTG    0.096  0.000    0.179  0.121    0.106  0.471    0.083  0.333    0.382  0.610    0.136  0.174
L   CTT    1.759  3.238    2.060  2.424    2.106  1.882    1.889  2.167    0.968  2.508    0.610  3.652
L   TTA    1.891  1.569    1.861  1.650    1.709  1.427    1.974  1.592    1.776  1.149    1.853  1.071
L   TTG    0.109  0.431    0.139  0.350    0.291  0.573    0.026  0.408    0.224  0.851    0.147  0.929
M   ATG    0.208  0.527    0.289  0.500    0.188  0.518    0.204  0.538    0.303  0.824    0.234  1.014
M   ATA    1.792  1.473    1.711  1.500    1.813  1.482    1.796  1.462    1.697  1.176    1.766  0.986
N   AAC    0.500  0.041    0.632  0.080    0.682  0.286    0.384  0.122    0.727  0.280    0.822  0.036
N   AAT    1.500  1.959    1.368  1.920    1.318  1.714    1.616  1.878    1.273  1.720    1.178  1.964
P   CCT    2.255  3.135    2.211  3.135    2.069  2.500    1.636  3.000    1.476  2.703    1.165  2.743
P   CCG    0.073  0.108    0.211  0.000    0.241  0.100    0.145  0.100    0.117  0.216    0.039  0.000
P   CCC    0.400  0.216    0.561  0.216    0.966  0.600    0.836  0.100    0.816  0.541    0.311  0.686
P   CCA    1.273  0.541    1.018  0.649    0.724  0.800    1.382  0.800    1.592  0.541    2.485  0.571
Q   CAG    0.082  0.480    0.074  0.769    0.172  0.417    0.078  0.800    0.148  1.000    0.185  0.947
Q   CAA    1.918  1.520    1.926  1.231    1.828  1.583    1.922  1.200    1.852  1.000    1.815  1.053
R   CGA    2.421  0.000    2.378  0.600    1.895  1.333    2.769  1.565    2.421  0.833    3.176  0.000
Table VI – continued

           Ephemeroptera                                   Plecoptera      Odonata
           E. orientalis   S. immanis      P. youi         P. princeps     D. lunatus      E. formosa
AA  Codon  J      N        J      N        J      N        J      N        J      N        J      N
R   CGC    0.000  0.200    0.757  0.000    0.737  0.000    0.205  0.000    0.632  0.000    0.000  0.000
R   CGG    0.000  1.000    0.216  0.400    0.421  1.143    0.103  0.522    0.105  0.500    0.118  0.000
R   CGT    1.579  2.800    0.649  3.000    0.947  1.524    0.923  1.913    0.842  2.667    0.706  4.000
S   AGC    0.325  0.000    0.197  0.324    0.284  0.290    0.221  0.215    0.415  0.327    0.238  0.222
S   AGA    1.178  2.187    0.867  2.486    0.812  2.609    1.282  1.772    2.197  1.088    2.333  0.889
S   TCA    2.274  0.640    1.734  0.649    1.503  0.638    2.387  1.450    2.446  0.327    3.571  0.556
S   TCC    1.137  0.000    0.512  0.054    1.381  0.232    0.884  0.054    0.912  0.489    0.476  0.278
S   TCG    0.122  0.107    0.118  0.270    0.041  0.464    0.044  0.268    0.249  0.381    0.238  0.056
S   TCT    2.193  2.933    3.350  2.649    2.558  2.261    2.254  2.685    1.368  3.429    0.762  4.222
S   AGG    0.041  0.000    0.000  0.054    0.000  0.000    0.000  0.000    0.000  0.054    0.000  0.056
S   AGT    0.731  2.133    1.222  1.514    1.421  1.507    0.928  1.557    0.415  1.905    0.381  1.722
T   ACA    1.675  1.407    1.832  1.020    1.105  1.091    1.646  0.852    1.923  0.727    2.635  0.421
T   ACC    0.700  0.000    0.645  0.235    1.000  0.727    0.506  0.066    0.718  0.545    0.612  0.316
T   ACG    0.050  0.074    0.026  0.078    0.237  0.218    0.025  0.459    0.205  0.182    0.071  0.105
T   ACT    1.575  2.519    1.497  2.667    1.658  1.964    1.823  2.623    1.154  2.545    0.682  3.158
V   GTC    0.094  0.040    0.186  0.129    0.372  0.079    0.415  0.235    0.252  0.326    0.267  0.162
V   GTG    0.125  0.320    0.310  0.559    0.124  0.832    0.207  0.612    0.364  0.372    0.293  0.242
V   GTT    1.594  1.960    1.581  1.634    1.922  1.584    1.452  1.976    0.783  2.326    0.667  3.071
V   GTA    2.188  1.680    1.922  1.677    1.581  1.505    1.926  1.176    2.601  0.977    2.773  0.525
W   TGA    1.824  1.167    1.765  1.189    1.765  1.063    2.000  1.676    1.881  1.576    1.943  0.774
W   TGG    0.176  0.833    0.235  0.811    0.235  0.938    0.000  0.324    0.119  0.424    0.057  1.226
Y   TAC    0.519  0.000    0.488  0.125    0.683  0.200    0.707  0.080    0.889  0.333    0.791  0.173
Y   TAT    1.481  2.000    1.512  1.875    1.317  1.800    1.293  1.920    1.111  1.667    1.209  1.827

Notes: J and N denote majority and minority strands, respectively. Figures in bold letters are the preferred codons for the corresponding amino acid in each strand.
each dimension corresponds to the RSCU value of one
sense codon. A series of orthogonal axes was
generated to capture the trends responsible for
variation in codon usage. The origin denotes
the mean RSCU value of all genes with respect to
the two principal axes. The distance between two genes
in the plot reflects the dissimilarity of their RSCU
values. The results of this analysis for each
species are reported in Table VII.
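
As an illustration of the codon space described above, the following minimal Python sketch builds the RSCU vector of a gene under the invertebrate mitochondrial genetic code and measures how far apart two genes lie in that space. It is not the software used in this study. The function names `rscu` and `rscu_distance` are hypothetical, and plain Euclidean distance is only a rough stand-in for the distances seen after correspondence analysis, which is not reproduced here. The sketch also groups codons purely by amino acid, so leucine is treated as a single six-codon family, whereas Table VI appears to treat the CTN and TTR families separately.

```python
from collections import Counter
from math import sqrt

BASES = "TCAG"
# Invertebrate mitochondrial genetic code (NCBI translation table 5),
# listed in TCAG order for the first, second, and third codon position.
AA_TABLE5 = "FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIMMTTTTNNKKSSSSVVVVAAAADDEEGGGG"
CODON_TO_AA = {
    a + b + c: AA_TABLE5[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
SENSE_CODONS = sorted(c for c, aa in CODON_TO_AA.items() if aa != "*")


def rscu(cds):
    """Return {codon: RSCU} for every sense codon of one coding sequence."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    # Stop codons and codons with ambiguous bases are skipped.
    counts = Counter(c for c in codons if CODON_TO_AA.get(c, "*") != "*")
    values = {}
    for aa in set(CODON_TO_AA.values()) - {"*"}:
        family = [c for c in SENSE_CODONS if CODON_TO_AA[c] == aa]
        total = sum(counts[c] for c in family)
        for c in family:
            # Observed count divided by the count expected under equal usage.
            values[c] = counts[c] * len(family) / total if total else 0.0
    return values


def rscu_distance(cds_a, cds_b):
    """Euclidean distance between two genes in the RSCU codon space."""
    ra, rb = rscu(cds_a), rscu(cds_b)
    return sqrt(sum((ra[c] - rb[c]) ** 2 for c in SENSE_CODONS))


if __name__ == "__main__":
    toy = rscu("GCTGCTGCAGCC")  # toy sequence: four alanine codons
    print(toy["GCT"], toy["GCA"], toy["GCG"])
```

Each gene is thus a point with one coordinate per sense codon (62 under this code), and a gene with no occurrences of an amino acid simply receives zeros for that family.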
Discussion
The correlations obtained by Pearson correlation
analysis suggest that nucleotide compositional
constraints play a significant role in shaping the
patterns of codon usage variation, particularly in
D. lunatus and E. formosa. In these two species, higher
correlations are observed between all homogeneous
and heterogeneous nucleotide contents. Because insect
mitochondrial genomes are AT rich, A- or T-ending
codons are preferred for most amino acids
in all six basal insect species. The reason
for the abundance of A and T nucleotides in
insect mitogenomes has not been established so far, but
the transcription hypothesis of codon usage (Xia 1996) is
usually adopted to explain this phenomenon (Sun et al.
2009). In all six species, A and T contents are
much higher than G and C contents,
and A and T are used roughly equally (Table III). The ENC vs.
GC3 plot (Wright 1990) is widely used to study
the factors that determine
SCU variation across genes in both unicellular and
multicellular organisms (Banerjee et al. 2004; Liu et al.
2011; Zhang et al. 2011). In this study, a considerable
number of genes in all six species lie on or
below the expected curve in the GC-poor region, which
indicates that extreme
compositional constraints are likely to be a significant factor in
determining SCU variation. The high standard deviations
of both ENC and GC3 values also
support an effect of base compositional constraints
on the evolution of SCU patterns among these species.
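
The expected curve in an ENC vs. GC3 plot follows Wright (1990): when codon usage is shaped by GC3 composition alone, ENC = 2 + s + 29 / (s² + (1 − s)²), where s is the GC3 fraction. The short Python sketch below evaluates this curve and checks whether a gene lies on or below it. The function names and the example values are illustrative assumptions, not data from this study, and the formula was derived for the standard genetic code, so it is only an approximation for mitochondrial genes.

```python
def expected_enc(gc3):
    """Expected ENC under GC3 compositional constraint alone (Wright 1990).

    gc3 is the G+C fraction at synonymous third codon positions (0 to 1).
    """
    if not 0.0 <= gc3 <= 1.0:
        raise ValueError("gc3 must be a fraction between 0 and 1")
    return 2.0 + gc3 + 29.0 / (gc3 ** 2 + (1.0 - gc3) ** 2)


def on_or_below_curve(enc, gc3, tolerance=0.0):
    """True if an observed ENC value lies on or below the expected curve."""
    return enc <= expected_enc(gc3) + tolerance


if __name__ == "__main__":
    # Hypothetical gene with GC3 = 0.10 and an observed ENC of 35.
    print(expected_enc(0.10), on_or_below_curve(35.0, 0.10))
```

Genes close to the curve are consistent with composition-driven codon usage, whereas genes far below it may reflect additional influences.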
In D. lunatus and E. formosa, axis 1 shows
significantly higher correlations with all silent base
compositions, which emphasizes the significance of
base compositional constraints as a major force in the
evolution of codon bias in the mitogenomes of these
species. Axis 2 is correlated with ENC, but this
cannot be taken as an indicator of selection because
ATP8 alone has a high score on axis 2. In P. princeps,
A3 and T3 contents are correlated with the positions of genes
along axis 1, whereas axis 2 and axis 3 are not
correlated with any silent base content. It is therefore
reasonable to conclude that SCU
variation in P. princeps is contributed by A3 and
T3 contents. In P. youi, axis 1 does not
exhibit any correlation with silent base compositions,
but axis 2 shows significant correlations with G3, C3,
and total GC, whereas axis 3 is correlated with
T3 and GC3. SCU variation across the
mitochondrial genes of P. youi is therefore expected to be highly
influenced by GC3 and T3 contents. In S. immanis, axis
2 is correlated with ENC, but ATP8 alone
shows a higher score on this axis than on the other
axes. Comparatively higher correlations are observed
for axis 1 with G3 and C3 than with A3, whereas
axis 3 shows higher correlations with T3 and GC
contents. From this, it is likely that A3, T3,
and to some extent GC3 play a key role in the SCU
variation of genes in S. immanis. In E. orientalis,
correlations are observed between axis 1 and total GC
content, and between axis 2 and both T3 and GC3.
Variation in SCU in E. orientalis may therefore be attributed to total GC,
GC3, and T3 contents.

Figure 1. ENC vs. GC3 plots of six species of aquatic insects belonging to Ephemeroptera, Plecoptera, and Odonata.

Figure 2. Correspondence analysis of six species of aquatic insects belonging to Ephemeroptera, Plecoptera, and Odonata, showing
no grouping/separation of genes based on their ENC values.
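
The axis-by-axis comparisons above rest on Pearson correlation between the scores of the genes on a given axis and a base-composition variable such as A3 or GC3. The following Python sketch shows that calculation together with the associated t statistic on n − 2 degrees of freedom (n = 13 for the 13 protein-coding genes). The function names and the example numbers are hypothetical and do not reproduce the values in Table VII.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two sequences of equal length (at least 3)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for testing r against zero, with n - 2 degrees of freedom."""
    denom = 1.0 - r * r
    if denom <= 0.0:
        return math.copysign(math.inf, r)
    return r * math.sqrt((n - 2) / denom)


if __name__ == "__main__":
    # Hypothetical axis-1 scores and A3 fractions for five genes.
    axis1 = [1.0, 2.0, 3.0, 4.0, 5.0]
    a3 = [2.0, 1.0, 4.0, 3.0, 5.0]
    r = pearson_r(axis1, a3)
    print(r, t_statistic(r, len(axis1)))
```

The t statistic would then be compared with the critical value of Student's t distribution at the chosen significance level.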
Conclusion
This study does not find any association between
microhabitats and codon usage variations in the
mitogenes of the selected aquatic insects. However, it has
identified major forces, such as compositional
constraints and mutation pressure, which shape
patterns of codon usage in mitochondrial genes in
the primitive clades of insect lineages.
Acknowledgements
We thank an anonymous reviewer for valuable
suggestions for improving the analysis and writing.
We are grateful to Mr Li Hu, Department of
Entomology, China Agricultural University, China
for his critical inputs in the manuscript. We thank
Dr M. Muralidharan, Sri Paramakalyani Centre for
Environmental Sciences, Manonmaniam Sundaranar
University, India for his help in statistical analysis.
Declaration of interest: The first author would like
to thank the University Grants Commission, New Delhi,
India for providing the Dr D. S. Kothari Post Doctoral
Fellowship (No. F.4-2/2006 (BSR)/13-670/2012
(BSR)). K. G. Sivaramakrishnan thanks the University
Grants Commission, New Delhi, India for the award
of an Emeritus Fellowship (No. F.6-39/2011 (SA-II)).
The authors report no conflicts of interest. The
authors alone are responsible for the content and
writing of the paper.
References
Abascal F, Posada D, Knight RD, Zardoya R. 2006. Parallel
evolution of the genetic code in arthropod mitochondrial
genomes. PLoS Biol 4(5):711–718.
Akashi H. 1994. Synonymous codon usage in Drosophila melanogaster:
Natural selection and translational accuracy. Genetics 136:927–935.
Banerjee T, Basak S, Gupta SK, Ghosh TC. 2004. Evolutionary
forces in shaping the codon and amino acid usages in
Blochmannia floridanus. J Biomol Struct Dyn 22:13–23.
Bernardi G, Bernardi G. 1986. Compositional constraints and
genome evolution. J Mol Evol 24:1–11.
Biro JC. 2008. Does codon bias have an evolutionary origin? Theor
Biol Med Mod 5:1–15.
Buehler AG. 2006. Asymptotically increasing compliance of
genomes with Chargaff’s second parity rules through inversions
and inverted transpositions. PNAS 103:17828–17833.
Bulmer M. 1991. The selection-mutation-drift theory of synonymous
codon usage. Genetics 129:897–907.
Carlini DB, Chen Y, Stephan W. 2001. The relationship between
third-codon position nucleotide content, codon bias, mRNA
secondary structure and gene expression in the drosophilid
alcohol dehydrogenase genes Adh and Adhr. Genetics 159:
623–633.
Deschavanne I, Filipski J. 1995. Correlation of GC content with
replication timing and repair mechanisms in weakly expressed
E. coli genes. Nucleic Acids Res 23:1350–1353.
Grantham R, Gautier C, Gouy M, Mercier R, Pave A. 1980. Codon
catalog usage and the genome hypothesis. Nucleic Acids Res 8:
49–62.
Table VII. Pearson correlation analysis between COA axes, nucleotide contents, and effective number of codons (ENC).

Species        Axis            A3         T3         G3         C3         GC         GC3        ENC
E. orientalis  Ax1 (38.19%)     0.156      0.027      0.157      0.293     -0.585*    -0.428     -0.346
               Ax2 (13.33%)     0.397      0.607*     0.172     -0.388     -0.512     -0.668*    -0.500
               Ax3 (10.53%)     0.213      0.245      0.218      0.146      0.031      0.018     -0.035
P. youi        Ax1 (26.63%)    -0.133     -0.428      0.467     -0.395     -0.076      0.232      0.287
               Ax2 (23.69%)     0.156      0.230      0.804      0.706     -0.569*     0.123     -0.112
               Ax3 (14.34%)     0.278     -0.628*     0.157      0.472      0.261      0.628*     0.443
S. immanis     Ax1 (25.65%)     0.655*    -0.511     -0.876**   -0.793*     0.006      0.233      0.003
               Ax2 (19.39%)     0.097     -0.421     -0.055      0.212      0.154      0.312      0.638*
               Ax3 (14.31%)    -0.235      0.617*     0.224     -0.420      0.777*     0.212     -0.393
P. princeps    Ax1 (30.19%)     0.716*     0.554*     0.260      0.301      0.332     -0.053      0.167
               Ax2 (13.01%)     0.159      0.359     -0.131      0.240      0.159      0.238      0.073
               Ax3 (12.88%)    -0.141      0.142      0.128     -0.135      0.329      0.005     -0.454
D. lunatus     Ax1 (43.40%)    -0.927**    0.945**    0.935**   -0.943**   -0.339      0.370     -0.460
               Ax2 (15.97%)     0.221      0.174      0.202      0.160      0.539     -0.328      0.604*
               Ax3 (–)          –          –          –          –          –          –          –
E. formosa     Ax1 (55.75%)     0.992**    0.982**    0.961**    0.975**   -0.256      0.192     -0.398
               Ax2 (–)          –          –          –          –          –          –          –
               Ax3 (–)          –          –          –          –          –          –          –

Note: *Figures are significant at p < 0.05; **Figures are significant at p < 0.005.
Hammer Ø, Harper DAT, Ryan PD. 2001. PAST: Paleontological
statistics software package for education and data analysis.
Palaeontologia Electronica 4(1):1–9.
Hershberg R, Petrov DA. 2008. Selection on codon bias. Annu Rev
Genet 42:287–299.
Ikemura T. 1982. Correlation between the abundance of yeast
tRNAs and the occurrence of the respective codons in its protein
genes. J Mol Biol 158:573–579.
Irwin B, Heck JD, Hatfield GW. 1995. Codon pair utilization
biases influence translational elongation step times. J Biol Chem
270:22801–22806.
Karlin S, Mrazek J, Campbell AM. 1998. Codon usages in different
gene classes of the Escherichia coli genome. Mol Microbiol 29(6):
1341–1355.
Kurland CG. 1993. Major codon preference theme and variations.
Biochem Soc Trans 21:841–846.
Liu YS, Zhou JH, Chen HT, Maa L, Pejsak Z, Ding YZ, Zhang J.
2011. The characteristics of the synonymous codon usage in
enterovirus 71 virus and the effects of host on the virus in codon
usage pattern. Infect Genet Evol 11:1168–1173.
Moriyama EN, Powell JR. 1997. Codon usage bias and tRNA
abundance in Drosophila. J Mol Evol 45:514–523.
Nikolaou C, Almirantis Y. 2006. Deviations from Chargaff’s second
parity rule in organellar DNA - Insights into the evolution of
organellar genomes. Gene 381:34–41.
Perriere G, Thioulouse J. 2002. Use and misuse of correspondence
analysis in codon usage studies. Nucleic Acids Res 30:
4548–4555.
Plotkin JB, Kudla G. 2010. Synonymous but not the same: The
causes and consequences of codon bias. Nature Rev Genet 12:32–42.
Rocha EPC. 2004. Codon usage bias from tRNA’s point of view:
Redundancy, specialization, and efficient decoding for
translation optimization. Genome Res 14:2279–2286.
Sablok G, Nayak KC, Vazquez F, Tatarinova TV. 2011. Synonymous
codon usage, GC3, and evolutionary patterns across
plastomes of three pooid model species: Emerging grass genome
models for monocots. Mol Biotechnol 49:116–128.
Sarmer WT, Sullivan DT. 1989. A shift in the third-codon-position
nucleotide frequency in alcohol dehydrogenase genes in the
genus Drosophila. Mol Biol Evol 6:546–552.
Sharp PM, Cowe E. 1991. Synonymous codon usage in
Saccharomyces cerevisiae. Yeast 7:657–678.
Sharp PM, Tuohy TMF, Mosurski KR. 1986. Codon usage in yeast:
Cluster analysis clearly differentiates highly and lowly expressed
genes. Nucleic Acids Res 14:8207–8211.
Sharp PM, Emery LR, Zeng K. 2010. Forces that influence the
evolution of codon bias. Phil Trans R Soc B 365:1203–1212.
Smith MJ, Smith NH. 1996. Site-specific codon bias in bacteria.
Genetics 142:1037–1043.
Sun Z, Wan DG, Murphy RW, Ma L, Zhang XS, Huang DW. 2009.
Comparison of base composition and codon usage in insect
mitochondrial genomes. Genes Genomics 31(1):65–71.
Tamura K, Dudley J, Nei M, Kumar S. 2007. MEGA4: Molecular
evolutionary genetics analysis (MEGA) software version 4.0.
Mol Biol Evol 24(8):1596–1599.
Wall DP, Herback JT. 2003. Evolutionary patterns of codon usage in
the chloroplast gene rbcL. J Mol Evol 56(6):673–688.
Wright F. 1990. The “effective number of codons” used in a gene.
Gene 87:23–29.
Xia X. 1996. Maximizing transcription efficiency causes codon
usage bias. Genetics 144:1309–1320.
Xu C, Cai X, Chen Q, Zhou H, Cai Y, Ben A. 2011. Factors
affecting synonymous codon usage bias in chloroplast genome of
Oncidium Gower Ramsey. Evol Bioinform 7:271–278.
Zhang J, Zhou C, Gai Y, Song D, Zhou K. 2008. The complete
mitochondrial genome of Parafronurus youi (Insecta:
Ephemeroptera) and phylogenetic position of the Ephemeroptera. Gene
424:18–24.
Zhang J, Wang M, Liu WQ, Zhou JH, Chen HT, Ma LN, Ding YZ,
Gu YX, Liu YS. 2011. Analysis of codon usage and nucleotide
composition bias in polioviruses. Virol J 8:146.