bioinfo project

Upload: reis-fitzsimmons

Post on 13-Jan-2017

Page 1: Bioinfo Project

Understanding the role of intrinsic disorder in subunits of hemoglobin and the disease process of sickle cell anemia

Reis Fitzsimmons and Narmin Amin

Abstract

One of the most notorious and common genetic disorders is sickle cell anemia, in which

two recessive alleles must be inherited together for the destruction and altered morphology of red

blood cells to occur. This usually leads to reduced oxygen binding and curved, sickle-shaped

erythrocytes. The mutation responsible for this disease occurs in the 6th codon of βA-globin, a

protein responsible for binding oxygen in the blood. It changes a charged glutamic

acid to a hydrophobic valine residue, which disrupts the tertiary structure and stability of the

hemoglobin molecule. Intrinsic disorder in protein structure, by contrast, generally results from

low mean hydrophobicity and high net charge, leading to unstructured protein morphology.

This raises the question of whether intrinsic disorder has a role in the disease process of sickle cell disease.

GlobProt2 and FoldIndex were used to predict intrinsically disordered regions in all subunits of

hemoglobin: alpha, beta, delta, epsilon, zeta, and gamma (two of them). The protein sequences

for each subunit were retrieved from the UniprotKB database. Structural analysis was then

performed using the SWISS-MODEL Repository to check the accuracy of the disorder

predictors. Finally, Uniprot STRING was used to determine each hemoglobin subunit’s biochemical

interactome and protein partners, and their posttranslational modifications were analyzed.

These other properties were used to correlate the sickle cell mutation with intrinsic disorder and

determine any differences between the six different types of subunits of hemoglobin.

Additionally, other considerations were explored, such as the biochemical properties and

molecular mechanism of sickle cell, threading energy, comparisons between the hemoglobin

subunits, and how sickle cell anemia affects embryonic development of hemoglobin.

Introduction

Sickle cell anemia is an autosomal recessive genetic disease that is caused by the

“substitution of one amino acid in the hemoglobin molecule” (Roseff). The substitution results in

the sickle cell transformation of erythrocytes, which can no longer properly bind to

oxygen. Low oxygen levels can cause “occlusion of blood vessels, increased viscosity, and

inflammation” (Roseff). Sickle cell was the first genetic disorder to be “identified at the

molecular level” in 1957 (Pawliuk et al). The disease was traced to the substitution of

valine for glutamic acid in the sixth codon of human βA-globin. Homozygotes for sickle cell have

abnormal hemoglobin which “polymerizes in long fibers” when red blood cells lose their oxygen

supply (Pawliuk et al). This polymerization is a major factor in how the RBCs transform into sickle-

shaped, deformed discs. Although a single substitution might seem insignificant at first, the

mutation creates radical changes in the structure and function of the RBCs. When the glutamic

acid residue is replaced by valine, the position for a charged residue is replaced with a nonpolar

residue, which could “cause some disruption of the tertiary structure” (Arends et al). Arends

mentioned that when oxygen levels were measured in a heterozygote, oxygen levels tended to be

normal, but when the oxygen levels were compared to those of a recessive homozygote, there

was decreased affinity for oxygen from the disruption in the tertiary structure. Lowered oxygen

affinity accompanies the reshaping of the red blood cells into a new dysfunctional

morphology that impairs their ability to carry oxygen. The glutamic acid residue might be

influential because it is charged and reinforces the secondary structure of the hemoglobin.

However, when it is replaced by valine, the protein becomes more nonpolar, and the valine

might promote intrinsic disorder even though valine is normally an order-promoting residue. The objective

of this study is to better understand the role of intrinsic disorder in hemoglobin and how it affects the disease

process of sickle cell anemia. Other factors considered were posttranslational modifications and

biochemical interactions with other proteins.

Intrinsic disorder in each subunit of hemoglobin

Hemoglobin in Homo sapiens is made of several different subunits whose composition changes during

human development. In an adult, the hemoglobin protein is made of two

alpha subunits and two beta subunits. As mentioned before, the sickle cell mutation occurs near the beginning of the

beta subunit. Before alpha subunits are expressed, there are

“combinations of ζ- with ɛ- or γ- subunits to form embryonic hemoglobins” (Manning et

al). Their order of expression is determined by their relative positions on the chromosome, “i.e., ζ → α (2

copies) on chromosome 16 and ɛ → γ (2 copies) → δ → β on chromosome 11” (Manning et al).

During normal development, the embryo typically has ζ2γ2, ζ2ɛ2, or α2ɛ2, the fetus typically has α2γ2, and

the adult has either α2β2 or α2δ2. Since the protein is a tetramer built from

combinations of these six types of subunits, intrinsic disorder was predicted

for all of them. Protein sequences of each subunit were retrieved from

UniprotKB and analyzed for disorder using FoldIndex and GlobProt2. The gamma subunit has two

different versions, so both were considered. Although hemoglobin has more subunits such as mu

and theta, they were ignored because the majority of development of hemoglobin relies on the

combinations of the six main types: alpha, beta, gamma, delta, epsilon, and zeta. Clustal Omega

was used to run a multiple sequence alignment of all subunits, including both isoforms of the

gamma subunit (Fig. 1). The multiple sequence alignment revealed that all sequences share 37

identical positions and 56 similar positions, with every sequence between 142 and

147 amino acids long. The percent identity was 24.8%, which shows that the subunits of

hemoglobin share a fairly low level of sequence conservation. The phylogenetic tree showed that

the alpha and zeta subunits diverged first from the others (Fig. 2). This

would make sense because they bind to all of the other subunits throughout the development of a

human. Another divergence emerged in which the epsilon and gamma subunits separated from

the beta and the delta subunits. This probably reflects the fact that beta and delta subunits

bind to the alpha subunits in adulthood while the gamma and epsilon subunits bind to alpha or

zeta subunits during embryonic development. Finally, there was a divergence between the

epsilon and gamma subunits and eventually the divergence between the gamma subunit’s two

isoforms.

Fig 1. Multiple sequence alignment between all subunits of hemoglobin

Fig 2. Phylogenetic tree of all subunits of hemoglobin
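
The identity figure quoted above can be reproduced with a few lines of Python. The sketch below counts fully conserved, gap-free columns in an alignment and divides by the alignment length. The toy alignment is invented purely for illustration (it is not the hemoglobin alignment of Fig. 1), and the alignment length of 149 columns is an assumption inferred from 37 identical positions and 24.8% identity; Clustal Omega may compute its reported figures differently.

```python
def identical_columns(alignment):
    """Count alignment columns in which every sequence has the same residue."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("aligned sequences must all have the same length")
    # A column counts only if it has no gap and exactly one distinct residue.
    return sum(
        1 for column in zip(*alignment)
        if "-" not in column and len(set(column)) == 1
    )


def percent_identity(alignment):
    """Identical columns as a percentage of the total alignment length."""
    return 100.0 * identical_columns(alignment) / len(alignment[0])


# Hypothetical toy alignment, invented for illustration only.
toy_alignment = ["MVHLTPEEK", "MVHFTAEEK", "MV-LSPADK"]
toy_identity = percent_identity(toy_alignment)

# Assumption: 149 alignment columns, inferred from 37 / 0.248.
reported_identity = round(100.0 * 37 / 149, 1)
print(toy_identity, reported_identity)
```

Dividing by the full alignment length, gaps included, is one common convention; dividing by the shortest sequence length would give a slightly higher percentage.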

The beta subunit is most directly involved with the sickle cell mutation. After analyzing

the protein sequence using FoldIndex and GlobProt2, FoldIndex indicated that there are no

disordered regions in the sequence (right image of Fig. 3). However, GlobProt2 indicated that

there is one disordered region roughly in the middle of the protein (left image of Fig. 3). Because the

two predictors disagree, they do not reliably indicate an overall presence of disorder in the beta subunit.

Fig 3. Intrinsic disorder prediction of the beta subunit
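
To make the link between hydrophobicity, charge, and predicted disorder concrete, here is a minimal sketch of a FoldIndex-style score. It assumes the commonly cited form of the score, 2.785 * H - |R| - 1.151, where H is the mean Kyte-Doolittle hydrophobicity rescaled to the range 0 to 1 and R is the mean net charge; positive values suggest a folded segment and negative values a disordered one. The web tool also uses sliding windows, which this sketch omits. The 18-residue N-terminal fragment of mature beta-globin and the mature-chain numbering are assumptions made for illustration, and the result is not the output shown in Fig. 3.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydrophobicity(seq):
    """Mean hydropathy, rescaled from [-4.5, 4.5] to [0, 1]."""
    return sum((KYTE_DOOLITTLE[aa] + 4.5) / 9.0 for aa in seq) / len(seq)


def mean_net_charge(seq):
    """Absolute mean net charge, counting K/R as +1 and D/E as -1."""
    positive = sum(seq.count(aa) for aa in "KR")
    negative = sum(seq.count(aa) for aa in "DE")
    return abs(positive - negative) / len(seq)


def fold_index(seq):
    """Whole-sequence score; > 0 suggests folded, < 0 suggests disordered."""
    return 2.785 * mean_hydrophobicity(seq) - mean_net_charge(seq) - 1.151


# Assumed N-terminal fragment of mature beta-globin (illustration only).
wild_type_fragment = "VHLTPEEKSAVTALWGKV"
# Glutamic acid at mature position 6 (index 5) replaced by valine.
mutant_fragment = wild_type_fragment[:5] + "V" + wild_type_fragment[6:]

print(round(fold_index(wild_type_fragment), 3))
print(round(fold_index(mutant_fragment), 3))
```

On this short fragment the glutamic acid to valine change raises the mean hydrophobicity, so the score moves toward the folded side. A fuller analysis would need the complete UniprotKB sequences and the windowed calculation.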

The alpha subunit is a reasonable candidate for intrinsic

disorder analysis because two copies of it form the adult Homo sapiens tetramer with two beta

subunits. In this case, GlobProt2 predicted one disordered region in the same

position as in the beta subunit (left image of Fig. 4). FoldIndex once again showed no signs of

disordered regions (right image of Fig. 4).

Fig 4. Intrinsic disorder prediction of the alpha subunit

The delta subunit could possibly contain intrinsic disorder like its alpha and beta

counterparts, especially since it is still present in the adult human. GlobProt2 predicted one

disordered region similar to the one found in the previous subunits (left image of Fig. 5).

FoldIndex showed no disordered regions, consistent with the previous sequences

(right image of Fig. 5). The delta subunit appears to have the same region of intrinsic disorder

found in the alpha and beta subunits.

Fig 5. Intrinsic disorder prediction of the delta subunit

The gamma subunit consists of two isoforms, so there were two results for each predictor.

GlobProt2 showed similar results for both isoforms (left images of Fig. 6 and Fig. 7), while

FoldIndex showed no disordered regions for either isoform, like all previous results (right images of Fig. 6 and

Fig. 7). According to GlobProt2, the gamma subunit has two IDRs, one at the beginning and one

towards the end of the protein sequence. Interestingly, this is different from the alpha, beta, and

delta subunits. The gamma subunit is also only present during the embryonic and fetal stages of

hemoglobin development. Perhaps the disease process of sickle cell anemia is more sensitive during

these stages because the single codon mutation in HBB occurs in the sixth codon, towards the beginning of

the protein sequence, where the gamma subunit has an IDR.

Fig 6. Intrinsic disorder prediction of the first gamma subunit isoform

Fig 7. Intrinsic disorder prediction of the second gamma subunit isoform

The epsilon subunit is one of the subunits found in embryonic hemoglobin. According to

GlobProt2, it has one disordered region similar to the alpha, beta, and delta subunits (left image

of Fig. 8). Strangely, no globular domain structure was detected in the residues positioned before

the IDR. This might be a sign that the intrinsically disordered region has influence over

the region preceding it in the sequence. It also raises the question of whether the IDR could affect the

position of the sickle cell mutation. FoldIndex showed the same result as for all previous

subunits, with no disordered regions (right image of Fig. 8).

Fig 8. Intrinsic disorder prediction of the epsilon subunit

Finally, disorder was predicted for the zeta subunit, which is the subunit that the epsilon and gamma

subunits bind during embryonic development of the hemoglobin protein. GlobProt2 showed that

the zeta subunit has three disordered regions spread across the protein sequence (left image of

Fig. 9). The zeta subunit has the disordered regions of every previously mentioned subunit. It also has

the disordered region where the sickle cell mutation normally occurs, which was also found in both

gamma isoforms. The zeta subunit might have a central role in how the sickle cell mutation

takes effect during embryonic development. FoldIndex again predicted that the

subunit has no disordered regions, consistent with its results for the other subunits (right

image of Fig. 9).

Fig 9. Intrinsic disorder prediction of the zeta subunit

Structural analysis of intrinsic disorder

Although sequence-based prediction is considered the most reliable method of identifying intrinsically

disordered regions within a protein, predicting secondary and tertiary structure is also a useful

way to check the sequence predictions. The SWISS-MODEL Repository was

used to obtain models of the protein structure of each subunit of hemoglobin, and these models were

used to assess the reliability of the disorder predictors. When viewing protein structures, a

dark blue region indicates that the threading energy is low and that the residues are properly set

in their positions while red indicates that the threading energy is very high and that the region of

residues is considered entropic or unsettled in its environment. The structure prediction might not

be the most reliable method, but it can still provide an accurate 3-dimensional image of the

protein and represent the distribution of threading energy throughout the entire molecule. The 3-

dimensional conformation of the protein structure could possibly predict intrinsically disordered

regions because the structure’s “binding-folding thermodynamics and kinetics,” which are

important for the “efficiency of realizing biomolecular function,” can be deduced from its

“global energy landscape topology” (Chu et al). Therefore, the intrinsically disordered regions of

the hemoglobin subunits could be further analyzed by the levels of threading energy detected by

the SWISS-MODEL Repository protein structures since intrinsic disorder is characterized by

high thermodynamic energy and lack of defined structure.

The beta subunit showed one IDR around residues 48-60 (left image of Fig. 3). Although

SWISS-MODEL might not be an accurate predictor of intrinsic disorder, it still provides a

useful measure of the distribution of threading energy, which is essential to the biological

function and defined structure of the hemoglobin. The structure and sequence from Fig. 10

showed that the most prominent red regions are around residues 37-46, 63-73, 87-100, and 142-147.

This model of the beta subunit indicates that the disorder prediction might not have been that

accurate or that the correlation between intrinsic disorder and entropic residues might have flaws.

Another interesting observation is that the IDR is flanked by two red areas, suggesting that

intrinsic disorder might cause a lack of defined structure in surrounding regions.

Fig 10. SWISS-MODEL protein structure and threading energy of protein sequence of the beta subunit

The alpha subunit has one IDR around residues 48-60, similar to the beta subunit

(left image of Fig. 4). Fig. 11 has its most prominent red regions around residues 41-48,

58-67, 83-102, and 136-142. This time the disorder prediction seems to agree better with the structure

than it did for the beta subunit. However, the red regions still seem to flank the IDR rather than be

part of it, as in the beta subunit (Fig. 10). This continues to support the idea that an IDR

might cause a lack of defined structure or higher threading energy in surrounding regions within

the protein sequence.

Fig 11. SWISS-MODEL protein structure and threading energy of protein sequence of the alpha subunit
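
The comparison between the predicted IDR and the red regions for the beta and alpha subunits can be made explicit with simple interval arithmetic. This is only a sketch: the residue ranges are the approximate ones quoted in the text, and the helper names are hypothetical, not part of SWISS-MODEL or either disorder predictor.

```python
def overlap_length(a, b):
    """Number of residues shared by two inclusive residue ranges."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]) + 1)


def gap_length(a, b):
    """Residues lying strictly between two ranges (0 if they overlap or touch)."""
    return max(0, max(a[0], b[0]) - min(a[1], b[1]) - 1)


# Approximate ranges as reported in the text (inclusive residue positions).
predicted_idr = {"beta": (48, 60), "alpha": (48, 60)}
red_regions = {
    "beta": [(37, 46), (63, 73), (87, 100), (142, 147)],
    "alpha": [(41, 48), (58, 67), (83, 102), (136, 142)],
}


def summarize(subunit):
    """Return (red region, shared residues, gap to the IDR) for one subunit."""
    idr = predicted_idr[subunit]
    return [
        (region, overlap_length(idr, region), gap_length(idr, region))
        for region in red_regions[subunit]
    ]


for subunit in predicted_idr:
    for region, shared, gap in summarize(subunit):
        print(subunit, region, "shared:", shared, "gap:", gap)
```

With these numbers the beta IDR shares no residues with any red region but lies within one or two residues of the two nearest ones, while the alpha IDR overlaps its two neighbouring red regions by only a few residues, which matches the flanking pattern described above.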

The delta subunit showed one IDR consisting of residues 47-60 (left image of Fig. 5). It is

roughly the same region as in the alpha and beta subunits. SWISS-MODEL showed that the most

prominent red regions of the delta subunit’s sequence are around residues 37-46, 88-100,

and 143-147 (Fig. 12). The delta subunit has roughly the same red regions as the alpha and

beta subunits, except that the red regions are less prominent and that the region around 58-72 is

either violet or blue. This time there is only one prominently red region that is adjacent to the

IDR. The delta subunit appears to be less disordered and to have more defined tertiary structure than the

alpha and beta subunits. It might not even be that involved with the sickle cell mutation.

Fig 12. SWISS-MODEL protein structure and threading energy of protein sequence of the delta subunit

Both isoforms of the gamma subunit have intrinsic disorder predicted around residues 1-7 and

140-144 (left images of Fig. 6 and Fig. 7). The SWISS-MODEL protein structure for the first

isoform has a few prominent red regions, but most of them are short or blended with blue

regions. The most prominent regions are around residues 38-47, 64-72, 88-107 and 142-147

(Fig. 13). The SWISS-MODEL protein structure for the second isoform has many violet and

weak red regions, but its most prominent red regions are roughly 38-43, 93-98, and

145-147 (Fig. 14). Neither isoform of the gamma subunit has many prominent red regions,

and both seem to have long streaks of defined tertiary structure. Their highest energy levels are in

similar locations, although the first gamma subunit has much more pronounced red coloring and

much more energy in the region between 60 and 75. Interestingly, the intrinsically disordered

region at the start of the sequence for both isoforms was not reflected in the structure, but the IDR at

the very end of the sequence for both isoforms matched a red region closely. The sickle cell

mutation located at the start of the sequence of hemoglobin might be influential in causing high

threading energy at the end of the sequence. This could show how IDRs can influence other

IDRs even if they are at opposite ends of the protein. Compared to the alpha and beta subunits,

the first isoform of the gamma subunit seems quite similar in which regions are most

prominently red. The second gamma isoform might not share the exact residue positions for

prominent red areas, but both isoforms showed roughly similar colored regions, and the gamma 2

subunit has much less threading energy than the alpha, beta, and first gamma subunits. This

evidence also suggests that the gamma subunit may be versatile when binding to different types

of other subunits in embryonic development of hemoglobin.

Fig 13. SWISS-MODEL protein structure and threading energy of protein sequence of the first gamma subunit

Fig 14. SWISS-MODEL protein structure and threading energy of protein sequence of the second gamma subunit

The epsilon subunit has one IDR located around residues 44-60 (left image of Fig. 8). It

has similar intrinsic disorder to the alpha, beta, and delta subunits, but it lacks globular domain

structure in the first 40 to 45 residues of the sequence. In Fig. 15, the epsilon subunit has its most

prominent red regions around residues 38-46, 64-72, 90-107, and 142-146. Once again, the idea

that an IDR affects the threading energy of its surrounding regions is supported, as

noted for the alpha and beta subunits. Its most prominent red regions highlight its

tendency to resemble the alpha and beta subunits, as well as the first isoform of the gamma

subunit. Surprisingly, the first 40 to 45 residues of this sequence showed fairly stable structure, which

might indicate that GlobProt2 is not an accurate disorder predictor. However, lacking

globular domain structure might not necessarily mean that that part of the protein structure is

completely unordered.

Fig 15. SWISS-MODEL protein structure and threading energy of protein sequence of the epsilon subunit

At last, the zeta subunit has disordered regions around residues 1-8, 42-52

and 133-137 (left image of Fig. 9). Its disordered regions include the site of mutation for sickle

cell anemia, which is also covered in the gamma subunits, and reflect the disordered regions of all

other subunits (left images of Fig. 3 to 8). The strongest red regions in the zeta subunit are

roughly 40-47, 59-66, 84-102 and 133-142. Strangely, residues 1-8 were not highlighted by the

structure even though they contain the site of the sickle cell mutation, similar to the conclusion

about both isoforms of the gamma subunit. The other IDRs matched red regions, suggesting

that perhaps threading energy and lack of defined structure indicate not the site of the mutation,

but rather the most affected areas. Since the zeta subunit was mentioned to have a central role in

the embryonic development of the hemoglobin protein, the evidence points

towards the idea that the sickle cell mutation might have more influence

during the development of the embryo than during other life stages. Finally, the zeta subunit seems to

resemble the alpha, beta, epsilon and first gamma isoform subunits based on the areas of its

highest threading energy.

Fig 16. SWISS-MODEL protein structure and threading energy of the protein sequence of the zeta subunit
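
One way to put a number on how closely two subunits "resemble" each other in their red regions is a Jaccard index over residue positions. The sketch below uses the approximate ranges quoted in this section. It compares raw residue numbers rather than aligned positions, which is a simplifying assumption: the alpha and zeta chains are a few residues shorter than the beta-like chains, so equal numbers are not strictly equivalent positions.

```python
def residue_set(ranges):
    """Expand a list of inclusive (start, end) ranges into a set of positions."""
    residues = set()
    for start, end in ranges:
        residues.update(range(start, end + 1))
    return residues


def jaccard(ranges_a, ranges_b):
    """Shared positions divided by all positions covered by either subunit."""
    a, b = residue_set(ranges_a), residue_set(ranges_b)
    return len(a & b) / len(a | b)


# Most prominent red regions as reported in the text (approximate).
red_regions = {
    "beta": [(37, 46), (63, 73), (87, 100), (142, 147)],
    "alpha": [(41, 48), (58, 67), (83, 102), (136, 142)],
    "delta": [(37, 46), (88, 100), (143, 147)],
    "gamma1": [(38, 47), (64, 72), (88, 107), (142, 147)],
    "gamma2": [(38, 43), (93, 98), (145, 147)],
    "epsilon": [(38, 46), (64, 72), (90, 107), (142, 146)],
    "zeta": [(40, 47), (59, 66), (84, 102), (133, 142)],
}

similarity_to_zeta = {
    name: jaccard(ranges, red_regions["zeta"])
    for name, ranges in red_regions.items()
    if name != "zeta"
}

for name, score in sorted(
    similarity_to_zeta.items(), key=lambda item: item[1], reverse=True
):
    print(f"{name}: {score:.2f}")
```

A score of 1 would mean identical red regions and 0 no shared positions. Mapping each range onto the alignment columns of Fig. 1 before comparing would be a more careful version of the same idea.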

Posttranslational modifications of hemoglobin

Posttranslational modifications are enzymatic and covalent modifications of proteins after

the process of translation that serve several functions, such as providing the protein with a

specific function or targeting it for proteolytic cleavage. These include phosphorylation,

glycosylation, nitrosylation, ubiquitination, and others. Hemoglobin has a large number of

posttranslational modifications in the majority of its different subunits. The PTMs were taken

from the sequence annotations displayed in UniprotKB.

Fig 17. Posttranslational modifications of hemoglobin subunit beta

As shown in Fig. 17, the beta subunit of hemoglobin mainly has posttranslational

modifications at positions 2, 9, 10, 13, 18, 45, 51, 60, 67, 83, 88, 94, 121, and 145. The amino

acid valine at position 2 is an N-acetylated, glycosylated, and pyruvic acid iminylated residue.

The beta subunit is also glycosylated at positions 9, 18, 67, 121, and 145. There are also several

amino acids that are phosphorylated in this subunit, such as the serine residues at positions 10

and 45 and the threonine residues at positions 13, 51, and 88. The lysine residues at positions 60,

83, and 145 are N6-acetylated. Finally, the cysteine residue at position 94 is S-nitrosylated.

According to the GlobProt2 graph in Fig. 3, the beta subunit is disordered in the middle of the

amino acid sequence between residues 50 and 60. This might indicate a correlation between the

aforementioned posttranslational modifications at residues 51 and 60 and the disorder within the

corresponding region of the amino acid sequence. The sickle cell mutation occurs in the sixth

codon of the beta subunit. Since the beta subunit is the main hemoglobin subunit involved in

sickle cell anemia, the intrinsic disorder within the subunit might play a role in sickle cell

disease. The mutation might also be

correlated with the surrounding modified and glycosylated residues because the beginning of the

protein is the most modified region. It would not be surprising if the amino acid valine at

position 2 plays a crucial role due to its triple-modified condition. The mutation replaces

glutamic acid with valine, and the protein structure destabilizes due to the loss of

charge at that position. Perhaps the region at the beginning of the beta subunit is

prone to the molecular mechanism of the disease process because many of its residues are modified

and because the region is usually low in threading energy, indicating an otherwise normally stable

structure (Fig. 10).

Fig 18. Posttranslational modifications of hemoglobin subunit alpha

In Figure 18, the alpha subunit is phosphorylated at numerous sites, including serine

residues at positions 4, 36, 50, 103, 125, 132, and 139; threonine residues at positions 9, 109,

135, and 138; and a tyrosine residue at position 25. The second most frequent posttranslational

modification in this amino acid sequence is glycosylation, which is found at positions 8, 17, 41,

and 62. The lysine residues at positions 8, 12, 17, and 41 are also N6-succinylated. Finally, the

lysine residue at position 17 is N6-acetylated. According to the GlobProt2 graph in Fig. 4, the

alpha subunit is disordered in the middle of the amino acid sequence between residues 50 and 60.

Therefore, the posttranslational modification at position 50 could be correlated with the disorder

in this subunit. However, the IDR contains only one posttranslational modification, so no

correlation can be reliably deduced.

Fig 19. Posttranslational modifications of hemoglobin subunit delta

The delta subunit only has one posttranslational modification, which is a phosphorylated

serine residue at position 51 (Fig. 19). There is another posttranslational modification that occurs

in the Niigata variant of this subunit, which is an N-acetylated alanine residue at position 2.

According to the GlobProt2 graph in Fig. 5, the delta subunit is disordered in the middle of the

amino acid sequence between residues 50 and 60. Therefore, the posttranslational modification at

position 51 could be correlated with the disorder in this subunit, yet still no strong correlation is

present from the given evidence.

Fig 20. Posttranslational modifications of hemoglobin subunit gamma 1

The only posttranslational modification found in hemoglobin subunit gamma 1 is the N-

acetylation of glycine at position 2 (Fig. 20). According to the GlobProt2 graph in Fig. 6, the

gamma 1 subunit is disordered in the beginning of the amino acid sequence, between residues 1

and 5. Therefore, the posttranslational modification at position 2 could be correlated with the

disorder in this subunit. Once again, the correlation is not strong, although acetylation

has been shown to have an effect on protein stability.

Fig 21. Posttranslational modifications of hemoglobin subunit gamma 2

The only posttranslational modification in the gamma 2 subunit is N-acetylation of

glycine at position 2, similar to gamma subunit 1 (Figure 21). According to the GlobProt2 graph

in Fig. 7, the gamma 2 subunit is disordered in the beginning of the amino acid sequence,

between residues 1 and 5. Therefore, the N-acetylation of glycine at position 2 could be

correlated with the disorder in this subunit, since acetylation does play a role in protein structure

stability.

Fig 22. Posttranslational modifications of hemoglobin subunit zeta

The hemoglobin zeta subunit has only one posttranslational modification site, which is

the N-acetylated serine residue at position 2 (Fig. 22). Similar to the gamma subunits, the

disordered region of zeta is located in the beginning of the amino acid sequence, specifically

between residues 1 and 8 (left image of Fig. 9). Therefore, the N-acetylation of serine at position

2 could be correlated with the intrinsic disorder of the zeta subunit.

Fig 23. Posttranslational modifications of hemoglobin subunit epsilon

The hemoglobin epsilon subunit has three phosphorylated amino acid residues:

two serine residues at positions 45 and 51 and threonine at position 124. It also has two N6-

succinylated lysine residues at positions 18 and 60 (Fig. 23). Finally, the valine residue at

position 2 is N-acetylated. According to the GlobProt2 graph in Fig. 8, the epsilon subunit is

disordered in the middle of the amino acid sequence between residues 45 and 60. Therefore, the

posttranslational modifications at positions 45, 51, and 60 could be correlated with the disorder

in this subunit. Phosphorylation of the serine residues and N6-succinylation of the lysine

may result in changes in the protein structure and function, possibly contributing to the

intrinsic disorder.
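
The site-versus-region comparisons made throughout this section can be collected in one short script. This is a sketch under stated assumptions: the modified positions and IDR ranges are the ones quoted in the text, the lower bound of each N-terminal region is taken as residue 1 because residue numbering is 1-based, and the Niigata variant of the delta subunit is left out. Finding a modified residue inside a predicted IDR does not by itself establish a correlation.

```python
# Approximate GlobProt2 disordered regions as quoted in this section.
idr = {
    "beta": (50, 60),
    "alpha": (50, 60),
    "delta": (50, 60),
    "gamma1": (1, 5),
    "gamma2": (1, 5),
    "zeta": (1, 8),
    "epsilon": (45, 60),
}

# Modified positions listed in the text for each subunit (duplicates merged).
ptm_sites = {
    "beta": [2, 9, 10, 13, 18, 45, 51, 60, 67, 83, 88, 94, 121, 145],
    "alpha": [4, 8, 9, 12, 17, 25, 36, 41, 50, 62,
              103, 109, 125, 132, 135, 138, 139],
    "delta": [51],
    "gamma1": [2],
    "gamma2": [2],
    "zeta": [2],
    "epsilon": [2, 18, 45, 51, 60, 124],
}


def sites_in_region(sites, region):
    """Return the modified positions that fall inside an inclusive range."""
    start, end = region
    return [site for site in sites if start <= site <= end]


sites_in_idr = {
    subunit: sites_in_region(ptm_sites[subunit], idr[subunit])
    for subunit in idr
}

for subunit, sites in sites_in_idr.items():
    print(f"{subunit}: {len(sites)} of {len(ptm_sites[subunit])} sites in IDR {sites}")
```

Reporting the count next to the total number of modified sites makes it easier to see that, for the beta and alpha subunits, most modifications lie outside the predicted region.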

Biochemical interactions with protein partners

Hemoglobin’s subunits have multiple interactions with a wide variety of other proteins.

These protein interactomes were obtained through Uniprot STRING, a database which

builds functional protein association networks to help determine the function of the selected

protein. The interactomes are important because they could show the role of each subunit of

hemoglobin and how each one is associated with other proteins. Functional genomics could

provide a better answer towards how the disease process of sickle cell anemia can alter

hemoglobin’s function and the disease’s possible correlation with intrinsic disorder.

Fig 24. Biochemical interaction network of hemoglobin subunit beta

As shown in Fig 24, the beta subunit of hemoglobin interacts with approximately 17

different proteins, six of which are hemoglobin subunits HBA1 (hemoglobin alpha 1), HBA2

(hemoglobin alpha 2), HBZ (hemoglobin zeta), HBD (hemoglobin delta), HBE1 (hemoglobin

epsilon 1), and HBG2 (hemoglobin gamma G). The beta subunit also interacts with the alpha

hemoglobin stabilizing protein (AHSP), which is a protein that binds to the alpha hemoglobin to

prevent it from precipitating in vitro. Other proteins that interact with the beta subunit include

three different homologs of v-maf musculoaponeurotic fibrosarcoma oncogene, F (MAFF), K

(MAFK), and G (MAFG), all of which are proteins that act as transcriptional activators or

repressors that are involved in embryonic lens fiber cell development. There are also other

transcription factors that interact with HBB, such as Kruppel-like factor 1 (KLF1), which is a

DNA-binding protein that is involved in gene expression regulation, and nuclear factor

erythroid 2 (NFE2), which is involved in megakaryocyte production. Haptoglobin (HP) and

hemopexin are also important binding partners of HBB because they are responsible for

inhibiting the oxidative activity of low-affinity hemoglobin that is released by erythrocytes to

avoid oxidative damage. The Rh-associated glycoprotein (RHAG), another member of the HBB

interactome, is an ammonia transporter protein. Finally, HBB interacts with aquaporin 1 (AQP1),

which is a water channel found in the plasma membranes of certain regions of nephrons, and low

density lipoprotein receptor-related protein 1 (LRP1), which is a receptor that is responsible for

the process of receptor-mediated endocytosis. All of these interactions are possible due to the

diverse functionality of the beta subunit.

Since the beta subunit is most directly involved with sickle cell disease, its function

would be directly affected if the disease is inherited. Sickle cell disease is mainly characterized

by the sickle cell transformation of red blood cells when hemoglobin loses its defined tertiary

structure and lacks the ability to bind to oxygen. Therefore, a mutant beta subunit would likely have

weaker interactions with haptoglobin and hemopexin, since they are the primary proteins in

preventing oxidative damage. Low affinity for oxygen would then result in distortion of the red

blood cells and activation of the oxidative activity of low-affinity hemoglobin. Another factor is

HBB’s interaction with AHSP, because sickle cell could cause a lack of binding and defined

structure in HBB. The tetramer formed between alpha and beta subunits would then have a

weaker binding affinity and result in the molecular changes caused by the disease process. This

phenomenon would be enhanced by prominent intrinsic disorder in either alpha or beta subunits.

Although the beta subunit is only found in adult humans, it interacts with the three different

homologs of v-maf musculoaponeurotic fibrosarcoma oncogene, F (MAFF), K (MAFK), and G

(MAFG), which are crucial to embryonic lens fiber cell development. These protein partners indicate that

the sickle cell mutation could still have an effect on embryonic stem cells, even when found in a

subunit that is generally not present during embryonic or fetal development. HBB’s interaction

with the Kruppel-like factor and NFE2 suggests that sickle cell anemia might even have an

effect on the early development of megakaryocytes and their gene expression regulation.

The disease process of sickle cell could occur before or during the development of

megakaryocytes and greatly alter the properties of blood, including the functional morphology of

the red blood cells. Intrinsic disorder would only increase the chances of the disease process

occurring, since it is associated with high net charge and low hydrophobicity. HBB’s interactions

with Rh-associated glycoprotein, aquaporin 1, and LRP1 indicate that the sickle cell mutation could even

affect the transport and endocytosis of certain molecules, such as ammonia, water, and lipoproteins.

Sickle cell RBCs are not able to transport or bind to many molecules due to their altered shape

and loss of function. The beta subunit’s diverse functionality reflects how necessary it is

and how devastating the consequences of altered structure are when the sickle cell mutation is

present in carriers or homozygotes.

Fig 25. Biochemical interaction network of hemoglobin subunit alpha

Hemoglobin subunit alpha (HBA1) has only five interactions, three of which are with

other hemoglobin subunits, HBB, HBE1, and HBA2 (Figure 25). It also interacts with the same

two proteins found in the HBB interactome, AHSP and HP. This interactome indicates that the

sickle cell mutation can disrupt the functionality of the alpha subunit when the beta subunit is

abnormal. The disease could also affect binding affinity, permit oxidative damage, and

disrupt the overall structure of the tetramer needed for normal oxygen affinity, based on the

interactions with the other proteins mentioned above (Fig. 24).

Fig 26. Biochemical interaction network of hemoglobin subunit delta

The hemoglobin delta subunit interacts with four other hemoglobin subunits, HBA2,

HBG2, HBE1, and HBB. Like HBB, it also interacts with MAFK, MAFF, MAFG, NFE2, and

KLF1 (Fig. 26). In addition, it interacts with cytoglobin, which is a globin molecule that helps

prevent oxidative stress and scavenges reactive oxygen species and nitric oxide. The delta

subunit would most likely suffer the same effects of sickle cell as the beta subunit, based on the

previous model (Fig. 24). Cytoglobin, if altered, could lead to even greater oxidative stress

and permit worse damage to the erythrocytes.

Fig 27. Biochemical interaction network of hemoglobin subunit gamma 2

The hemoglobin subunits that HBG2 mainly interacts with are HBA2, HBB, HBE1, and

HBD. Similar to HBB, HBG2 also interacts with MAFG, MAFF, MAFK, and NFE2 (Fig. 27). It

also interacts with two transcription factors: jun proto-oncogene (JUN) and activating

transcription factor 2 (ATF2). However, no interactome was found for the first gamma subunit.

Since the gamma subunit is present during the fetal and embryonic developmental stages of the human

lifecycle, sickle cell disease would likely have a significant impact during this time period due

to the extensive interactions between the beta and gamma subunits. The interactions with JUN and ATF2 also

suggest that the sickle cell disease process could particularly affect gene expression and

activation of hemoglobin function.

Fig 28. Biochemical interaction network of hemoglobin subunit zeta

The fetal hemoglobin molecule zeta interacts with only two of the other hemoglobin

subunits, HBB and HBE1 (Fig. 28). It also interacts with JUND and forkhead box P3 (FOXP3),

which is a protein that regulates the development of regulatory T cells. Although the zeta subunit

seems to have a limited interactome, it still plays an important role in embryonic and fetal

development. Its interactions with HBB could be a sign that the sickle cell mutation

indirectly affects fetal development and the developing immune system, because the zeta subunit

interacts with a protein that regulates the development of regulatory T cells. Another factor that

might support this idea is that the zeta subunit has a few IDRs (Fig. 9). Intrinsic disorder of the

zeta hemoglobin would not only alter its function, but could possibly facilitate the molecular

mechanism of sickle cell if the mutation is present in HBB.

Fig 29. Biochemical interaction network of hemoglobin subunit epsilon

HBE1, the other fetal globin, interacts with hemoglobin subunits HBA1, HBA2, HBZ,

HBG2, HBD, and HBB. It also has several interactome members in common with HBB, which

are NFE2, AHSP, MAFG, MAFF, and MAFK (Fig. 29). The reason why it has such a diverse

interactome is that it binds to many other types of subunits and has a very strong relationship

with zeta. This interactome supports the idea that sickle cell anemia is significant in

fetal/embryonic development when HBB has the mutation. The epsilon subunit also interacts

with proteins needed for hemoglobin stability, oxygen affinity, transport of materials,

endocytosis, and other important functions. Intrinsic disorder could facilitate the progression

of sickle cell if the epsilon subunit has difficulty interacting with required protein partners.
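
The partner overlaps described for Figs. 25-29 can be tabulated with a short script. The following is a minimal sketch, not the method used in this study: the partner lists are transcribed by hand from the paragraphs above, the HBB list covers only the non-globin partners named in the text, and the gene symbols CYGB (cytoglobin) and HPX (hemopexin) are assumptions, since the text gives those proteins by name only.

```python
# Partner lists transcribed from the interactome descriptions in the text.
# CYGB (cytoglobin) is an assumed gene symbol; the text gives only the name.
INTERACTOMES = {
    "HBA1": {"HBB", "HBE1", "HBA2", "AHSP", "HP"},
    "HBD": {"HBA2", "HBG2", "HBE1", "HBB", "MAFK", "MAFF", "MAFG",
            "NFE2", "KLF1", "CYGB"},
    "HBG2": {"HBA2", "HBB", "HBE1", "HBD", "MAFG", "MAFF", "MAFK",
             "NFE2", "JUN", "ATF2"},
    "HBZ": {"HBB", "HBE1", "JUND", "FOXP3"},
    "HBE1": {"HBA1", "HBA2", "HBZ", "HBG2", "HBD", "HBB", "NFE2",
             "AHSP", "MAFG", "MAFF", "MAFK"},
}

# Non-globin HBB partners named in the text (partial list; HPX is an
# assumed symbol for hemopexin, and MAFG follows the "Like HBB" list
# given for the delta subunit).
HBB_NON_GLOBIN = {"AHSP", "HP", "HPX", "MAFF", "MAFG", "MAFK",
                  "KLF1", "NFE2", "RHAG", "AQP1", "LRP1"}


def shared_with_hbb(subunit):
    """Return the non-globin partners a subunit has in common with HBB."""
    return INTERACTOMES[subunit] & HBB_NON_GLOBIN


# Print one row per subunit: total partners, then partners shared with HBB.
for name in sorted(INTERACTOMES):
    shared = sorted(shared_with_hbb(name))
    print(f"{name}: {len(INTERACTOMES[name])} partners, "
          f"{len(shared)} shared with HBB: {', '.join(shared) or 'none'}")
```

With these transcribed lists, every subunit lists HBB itself as a partner, while the zeta subunit shares no non-globin partner with HBB. The counts depend entirely on how the figures were read, so they should be rechecked against the interactome data before being relied on.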

Conclusion

The findings of this study have shown that all of the hemoglobin subunits have some

level of intrinsic disorder, with the fetal subunits having more than the adult subunits. The

intrinsically disordered region of the beta hemoglobin subunit is nowhere near the mutation site

that causes sickle cell anemia. Therefore, it is difficult to assume that there is a correlation

between intrinsic disorder and sickle cell anemia. This study has also shown that most of the

hemoglobin subunits have many posttranslational modifications, especially the adult ones. These

posttranslational modifications were found in the intrinsically disordered regions, which might

indicate a correlation between them. It can be inferred that intrinsic disorder is a

significant property of the hemoglobin subunits, as it provides them with binding promiscuity, i.e.,

the ability of a protein to bind to many partners. Perhaps sickle cell disease is not caused by

intrinsic disorder, but instead by a lack of it, since the mutation changes glutamic acid to valine.
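
The effect of the glutamic acid-to-valine change on the two properties linked to intrinsic disorder, mean hydrophobicity and net charge, can be illustrated with a short calculation. The following is a sketch under stated assumptions rather than a reproduction of GlobProt2 or FoldIndex: it uses the Kyte-Doolittle hydropathy scale (the scale is our choice for illustration), treats histidine as neutral, and uses the first ten residues of the mature beta chain, numbered without the initiator methionine so that the substituted glutamic acid is residue 6.

```python
# Kyte-Doolittle hydropathy values (assumed scale for this illustration).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# Simplified side-chain charges near neutral pH; histidine counted as 0.
CHARGE = {"K": 1, "R": 1, "D": -1, "E": -1}


def mean_hydropathy(seq):
    """Average Kyte-Doolittle hydropathy over the sequence."""
    return sum(KD[aa] for aa in seq) / len(seq)


def net_charge(seq):
    """Sum of simplified side-chain charges over the sequence."""
    return sum(CHARGE.get(aa, 0) for aa in seq)


def substitute(seq, position, new_residue):
    """Replace the residue at a 1-based position."""
    return seq[:position - 1] + new_residue + seq[position:]


# First ten residues of the mature beta chain (no initiator methionine).
WILD_TYPE = "VHLTPEEKSA"
MUTANT = substitute(WILD_TYPE, 6, "V")  # glutamic acid 6 -> valine

for label, seq in (("wild type", WILD_TYPE), ("sickle", MUTANT)):
    print(f"{label}: {seq} mean hydropathy {mean_hydropathy(seq):.2f}, "
          f"net charge {net_charge(seq):+d}")
```

In this short window the substitution raises the mean hydropathy and removes one negative charge, which moves the segment away from the low-hydrophobicity, high-charge profile associated with disorder. A ten-residue window is far shorter than what disorder predictors normally average over, so the numbers are illustrative only.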

Discussion

Intrinsic disorder was shown to be present in all subunits of hemoglobin, according to

GlobProt2. The predictions were similar among the alpha, beta, and delta

subunits. Interestingly, the fetal subunits showed more regions of intrinsic disorder than the adult

hemoglobin, indicating that the fetus might be more prone to intrinsic disorder during

development. The gamma subunit has two IDRs at opposite ends, the epsilon subunit has one

roughly similar to that of the adult hemoglobin, and the zeta subunit has three IDRs located in

all of the regions where the other subunits have them. Although intrinsic disorder is present, HBB does

not have intrinsic disorder located near its sixth codon, where the sickle cell mutation occurs.

FoldIndex showed no intrinsic disorder in any of the subunits, which illustrates how much

disorder predictors can differ. In general, intrinsic disorder is still difficult to predict and the

development of highly accurate predictors is still in progress. The correlation between intrinsic

disorder and the sickle cell mutation is not strong, but remains plausible.

SWISS-MODEL was used to check the accuracy of the disorder predictors

through structural analysis and measurement of threading energy levels in certain protein

regions. For many of the subunits, the structural analysis agreed at best partially with the predicted IDRs,

which means that either the disorder predictors are not highly accurate or structural analysis from

simple sequence information does not guarantee accurate prediction of intrinsic disorder in the

protein sequence. A recurring theme was that many of the IDRs were not accurately

predicted, but were instead located between or adjacent to areas of high threading energy.

This might suggest that IDRs themselves do not have high threading

energy levels or unsettled residues, but rather influence adjacent regions of the protein to have

higher entropy or threading energy. There was also no definite correlation between the sickle cell

mutation in HBB and presence of high threading energy at the site of mutation. The alpha, beta,

gamma 1, epsilon, and zeta subunits all have roughly similar areas of high threading energy

around residues 38-48, 58-73, 83-107, and 133-147, while the gamma 2 and delta subunits have high

threading energy around residues 38-45, 85-100, and 140-147. The gamma 2 and delta subunits

have less prominent high threading energy regions compared to the other subunits. The

gamma subunit is the most interesting subunit in the structural analysis because its two forms

differ substantially in threading energy, and because both had the ending IDR

accurately predicted, but not the beginning IDR. This analysis suggests that the gamma subunit has

different forms and that intrinsic disorder and threading energy share a more complex

relationship than was expected. Another interesting phenomenon is the unclear correlation

between lack of globular domain structure and regions of high threading energy in the epsilon

subunit. A region that lacks globular domain structure is not necessarily

disordered, nor does it necessarily have high threading energy. The zeta subunit’s structural analysis

revealed that regions of high threading energy seem to represent more affected areas of the

protein rather than the actual site of mutation or intrinsic disorder. The final observation from the

structural analysis was that the sickle cell mutation probably has a greater influence on fetal

development than on human adulthood.
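
The comparison of high threading energy regions, and the check of whether the mutation site falls inside any of them, can be expressed as a simple interval calculation. This is a minimal sketch using only the approximate ranges quoted in this paragraph; the variable names are ours, and the mutation site is taken as position 6 as stated in the text.

```python
def overlaps(ranges_a, ranges_b):
    """Pairwise intersections of two lists of inclusive (start, end) ranges."""
    shared = []
    for start_a, end_a in ranges_a:
        for start_b, end_b in ranges_b:
            low, high = max(start_a, start_b), min(end_a, end_b)
            if low <= high:
                shared.append((low, high))
    return shared


def contains(ranges, position):
    """True if the position lies inside any inclusive range."""
    return any(start <= position <= end for start, end in ranges)


# Ranges quoted above for alpha, beta, gamma 1, epsilon, and zeta.
GROUP_A = [(38, 48), (58, 73), (83, 107), (133, 147)]
# Ranges quoted above for gamma 2 and delta.
GROUP_B = [(38, 45), (85, 100), (140, 147)]
# Position of the sickle cell substitution as given in the text.
MUTATION_SITE = 6

print("Shared high-energy regions:", overlaps(GROUP_A, GROUP_B))
print("Mutation site in group A regions:", contains(GROUP_A, MUTATION_SITE))
print("Mutation site in group B regions:", contains(GROUP_B, MUTATION_SITE))
```

Every gamma 2 and delta region lies within a region of the other group, and position 6 falls outside all of the quoted ranges, which matches the statement that no high threading energy was found at the mutation site. The same holds if the site is numbered 7 to include the initiator methionine.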

The posttranslational modifications appear to be abundant in most of the subunits of

hemoglobin, especially beta and alpha. Although intrinsic disorder and structural analysis could

not produce a clear association with the sickle cell mutation site, there is an especially large

number of PTMs surrounding the site of mutation in HBB. The valine at position 2 has three

different types of modification: N-acetylation, glycosylation, and pyruvic acid iminylation.

The modifications might have a profound effect on the sixth codon because they

are numerous and tend to have a significant impact on the protein’s structure and function. As noted earlier, sickle

cell involves a glutamic acid-to-valine substitution, which induces a lack of

defined tertiary structure and a decrease in the net charge of the entire hemoglobin. Therefore,

posttranslational modifications might be responsible for activating the mutation when it is

inherited. Although the first few residues of HBB tend to have no intrinsic disorder and low

threading energy, they might become altered by their natural abundance of chemical

modifications. The intrinsically disordered region in HBB has some association

with PTMs, although the region possesses only 2 modifications. The delta, both gamma, and zeta

subunits only have the aforementioned N-acetylation at position 2, although delta has another

modification in its IDR. This shows that posttranslational modifications are more abundant

in the adult subunits than in those that are predominant in the fetal and embryonic developmental

stages. The alpha and delta subunits also have weak associations between their

intrinsic disorder and posttranslational modifications. The gamma and zeta subunits have one

posttranslational modification in their beginning IDRs, showing a possible correlation between the

two properties of the hemoglobin proteins. The epsilon subunit has the best correlation between

intrinsic disorder and PTMs in that there were 3 modifications present in its IDR.

Phosphorylation and N6-succinylation of some of the residues would likely indicate some

presence of intrinsic disorder needed to execute the function of the protein.

Finally, the biochemical interactomes obtained using Uniprot STRING helped reveal

possible correlations between sickle cell disease, intrinsic disorder, and other properties of each

subunit of hemoglobin, and related them to the subunits’ overall functions based on the presence of necessary protein

partners. Of all subunits, beta was the most versatile, with 17 different protein partners. The beta

subunit interacts with every other subunit, showing its role as a widespread binding subunit

and suggesting that sickle cell anemia could have an indirect effect on all other types of hemoglobin so long

as the mutation is in HBB. Sickle cell could be a very debilitating disease due to the sheer importance of

HBB’s functionality. Some of its protein partners are responsible for making sure that it binds to

the alpha subunits during adulthood, transporting ammonia, preventing oxidative stress from low

affinity hemoglobin, regulating gene expression, promoting or reducing transcription activity,

producing megakaryocytes, and facilitating receptor-mediated endocytosis. An abnormal HBB

could result in disruption of the morphology of red blood cells, change the gene expression of

stem cells during embryonic development, lower binding affinity to oxygen and essential

proteins, and permit increased oxidative stress on erythrocytes and blood. These downstream

dysfunctions illustrate the domino effect of a single sickle cell mutation. Intrinsic disorder

in any of the subunits could magnify the effect of the disease if it is inherited. The other

hemoglobin proteins have less diverse interactomes than HBB, but share some of the same

important protein partners required for proper functioning. Delta interacts with cytoglobin, an

additional globin molecule that reduces oxidative stress. Sickle cell disease could create worse

damage to RBCs if it could alter the function of cytoglobin. The second gamma subunit

shares many interactions with beta, which reflects how dependent embryonic development of

hemoglobin might be on the universal nature of HBB’s interactome. This also shows more

sensitivity towards sickle cell disease. Two protein partners of HBG2, JUN and ATF2, also showed how sickle

cell could even affect transcription activity and gene expression regulation. The zeta subunit

provides more evidence of how sickle cell disease could cause further disruption during fetal

development, especially involving the development of regulatory T cells essential for the

immune system of the fetus. Finally, the epsilon subunit has a diverse interactome almost to the

extent of HBB and shares a strong interaction with HBZ. The epsilon hemoglobin could be

responsible for many side effects of sickle cell disease due to its diverse nature during embryonic

and fetal development.

Acknowledgements

We would like to thank Dr. Vladimir Uversky for editing and critiquing this paper. We

would not have completed it without his dedication to the study of intrinsically disordered proteins

and his lectures on the role of intrinsic disorder in other diseases.

Works cited

Arends, T., et al. "Haemoglobin North Shore‐Caracas β134 (H12) valine→ glutamic acid." FEBS

letters 80.2 (1977): 261-265. Web. 16 Mar. 2016.

Chu, Xiakun, et al. "Quantifying the topography of the intrinsic energy landscape of flexible

biomolecular recognition." Proceedings of the National Academy of Sciences 110.26

(2013): E2342-E2351. Web. 18 Mar. 2016.

Manning, Lois R. et al. “Human Embryonic, Fetal, and Adult Hemoglobins Have Different

Subunit Interface Strengths. Correlation with Lifespan in the Red Cell.” Protein Science :

A Publication of the Protein Society 16.8 (2007): 1641–1658. PMC. Web. 16 Mar. 2016.

Pawliuk, Robert, et al. "Correction of sickle cell disease in transgenic mouse models by gene

therapy." Science 294.5550 (2001): 2368-2371. Web. 15 Mar. 2016.

Roseff, S. D. "Sickle cell disease: a review." Immunohematology/American Red Cross 25.2

(2008): 67-74. Web. 15 Mar. 2016.