Comprehensive Summaries of Uppsala Dissertations from the Faculty of Medicine 1333 Strategies for Identification of Susceptibility Genes in Complex Autoimmune Diseases BY LUDMILA PROKUNINA ACTA UNIVERSITATIS UPSALIENSIS UPPSALA 2004




to the memory of my mother


List of Papers

I. Prokunina, L., Castillejo-López, C., Öberg, F., Gunnarsson, I., Berg, L., Magnusson, V., Brookes, A.J., Tentler, D., Kristjansdottir, H., Gröndal, G., Bolstad, A.I., Svenungsson, E., Lundberg, I., Sturfelt, G., Jönssen, A., Truedsson, L., Lima, G., Alcocer-Varela, J., Jonsson, R., Gyllensten, U., Harley, J.B., Alarcón-Segovia, D., Steinsson, K., and Alarcón-Riquelme, M.E. A regulatory polymorphism within the PDCD1 gene is associated with susceptibility to systemic lupus erythematosus in humans. Nat Genetics. 32:666-669, 2002.

II. Prokunina, L., Gunnarsson, I., Seligman, V.A., Sturfelt, G., Truedsson, L., Olson, J.L., Seldin, M.F., Criswell, L.A., Alarcón-Riquelme, M.E. Analysis of the SLE-associated PDCD1 (PD-1) polymorphism PD1.3 and lupus nephritis. Arthritis Rheum 50:327-328, 2004.

III. Prokunina, L., Padyukov, L., Bennet, A., De Faire, U., Wiman, B., Prince, J., Alfredsson, L., Klareskog, L., and Alarcón-Riquelme, M.E. Association of the PD1.3 A allele of the PDCD1 gene in patients with rheumatoid arthritis negative for rheumatoid factor and the shared epitope. Arthritis Rheum, 2004, in press.

IV. Prokunina, L., Öberg, F., Kozyrev, S., Gunnarsson, I., Rodriguez, E., Lima, G., Trollmo, C., Malmström, V., Alarcón-Segovia, D., Alarcón-Riquelme, M.E. Aberrant expression of the PD-1 and RUNX genes in activated CD4+ T cells in patients with systemic lupus erythematosus. Manuscript.

V. Prokunina, L., Zhang, L., Lima, G., Gunnarsson, I., Svenungsson, E., Sturfelt, G., Truedsson, L., Alarcón-Segovia, D., and Alarcón-Riquelme, M.E. Analysis of the TNFR2 polymorphism M196R in Swedish and Mexican patients with systemic lupus erythematosus and rheumatoid arthritis. Genes and Immunity, submitted.

Reprints were made with permission from publishers


Contents

List of Papers

Introduction
    Human disease – monogenic and complex
    Autoimmune disease
    Systemic lupus erythematosus (SLE) – a complex autoimmune disease
        Risk factors
        Genetic studies of human SLE
        Animal studies
    Rheumatoid arthritis (RA) – a complex autoimmune disease
        Risk factors
        Genetic studies of human RA
        Animal studies
    Finding genes for complex autoimmune diseases – The strategy
        The disease. Epidemiology and evidence for genetic factors.
        Markers for genetic studies
        Linkage analysis
        Fine mapping
        Mutations
        Mutation identification and genotyping
        The ‘post-p value stage’ or functional validation of mutations

Present investigation
    Research aim
        Paper I. A regulatory polymorphism within the PDCD1 gene is associated with susceptibility to systemic lupus erythematosus in humans.
        Paper II. Analysis of the SLE-associated PDCD1 (PD-1) polymorphism PD1.3 and lupus nephritis.
        Paper III. Association of the PD1.3 A allele of the PDCD1 gene in patients with rheumatoid arthritis negative for rheumatoid factor (RF) and the shared epitope (SE).
        Paper IV. Aberrant expression of the PD-1 and RUNX genes in activated CD4+ T cells in patients with systemic lupus erythematosus.
        Paper V. Analysis of the TNFR2 polymorphism M196R in Swedish and Mexican patients with systemic lupus erythematosus and rheumatoid arthritis.
        Concluding remarks

Acknowledgements

References


Abbreviations

ab       antibodies
ANA      antinuclear antibodies
AFBAC    affected family-based controls
bp       base pairs
ChIP     chromatin immunoprecipitation
cM       centimorgan
EMSA     electrophoretic mobility shift assay
FcGR     Fc gamma receptor
HLA      human leukocyte antigen
HRR      haplotype relative risk
IDDM     insulin-dependent diabetes mellitus
Ig       immunoglobulin
ITIM     immunoreceptor tyrosine-based inhibitory motif
IL       interleukin
LD       linkage disequilibrium
LOD      logarithm of the odds
MHC      major histocompatibility complex
mRNA     messenger RNA
NZB      New Zealand Black mouse
NZW      New Zealand White mouse
OMIM     Online Mendelian Inheritance in Man
PCR      polymerase chain reaction
PDCD1    programmed cell death 1 gene
PDT      pedigree disequilibrium test
RFLP     restriction fragment length polymorphism
RF       rheumatoid factor
RR       relative risk
SE       shared epitope
SNP      single nucleotide polymorphism
rSNP     regulatory SNP
SLE      systemic lupus erythematosus
RA       rheumatoid arthritis
TDT      transmission disequilibrium test
TNFR2    tumor necrosis factor receptor 2
Z        LOD score
θ        theta, recombination fraction


Introduction

Human disease – monogenic and complex

Human disease can be caused by many different factors. Genetic factors are mutations and susceptibility alleles that increase the risk of a disease for the individuals who carry them. Environmental factors are other, mostly unknown factors not directly associated with genetics: exposure to hazardous agents (contaminants or drugs), infections, UV and other radiation, diet, habits, etc.

If the same mutation, or a few mutations in the same gene, cause a disease in different patients, the disease is called a monogenic or mendelian disease. Phenylketonuria (PKU), Huntington disease (HD), Duchenne muscular dystrophy (DMD) and cystic fibrosis (CF) are examples of monogenic diseases. In the case of cystic fibrosis, as many as 500 different mutations in the CFTR gene are known to cause the disease 1.

Complex diseases represent a very broad group of common diseases and syndromes with unknown etiology. Examples of complex diseases are cardiovascular, psychiatric and autoimmune diseases, cancer, etc. A combination of multiple factors (both genetic and environmental) may be required for the development of any complex disease. Each of the contributing factors can have a weak impact by itself, but increases the risk of disease in combination with other factors. Low penetrance (absence of the disease in spite of all predisposing factors) and phenocopies (different factors causing the same disease) make complex diseases difficult to study. Genetic studies aiming to find susceptibility factors for complex diseases represent a great challenge, but will eventually help to provide better prevention, diagnostics and treatment for many patients.

Autoimmune disease

The immune system guards the host organism from foreign intruders. Normally the immune system can discriminate between self-antigens (components of the body such as its own nucleic acids and proteins) and foreign antigens (microorganisms and viruses). Lymphocytes reactive towards self-antigens are normally eliminated, and this state is known as self-tolerance. In autoimmune disease, however, the immune system cannot discriminate between self and foreign antigens or eliminate self-reacting lymphocytes, which leads to an immune reaction against the body's own components. The immune response results in the deposition of immune complexes, persistent inflammation and damage to different tissues and organs of the body.

Particular cells or multiple tissues can be affected as a result of an autoimmune reaction: multiple organs in systemic lupus erythematosus (SLE), insulin-producing cells of the pancreas in insulin-dependent diabetes mellitus (IDDM), the myelin surrounding and protecting nerves of the central nervous system in multiple sclerosis (MS), and joint cartilage in rheumatoid arthritis (RA).

Systemic lupus erythematosus (SLE) – a complex autoimmune disease

In SLE (OMIM 152700), the immune system produces autoantibodies towards more than 60 self-nuclear components (DNA, histones and other nuclear proteins) 2. As a result, multiple organs can be affected, but skin, joints and kidney are affected most often. SLE is defined by the revised criteria of the American College of Rheumatology, and the diagnosis is accepted when at least 4 of the 11 criteria are met 3 (Table 1). SLE is called a “great imitator” or a prototypic disease, as it shares features with many other diseases: rheumatoid arthritis, multiple sclerosis, psoriasis, asthma, etc. SLE is a chronic disease for which only symptomatic treatment is available, often as a heavy combination of steroids and immunosuppressive drugs with severe side effects. Figure 1 illustrates clinical features of the disease and its complications.


Figure 1. “Facts about lupus” From Issy’s Lupus Information Center homepage 4.

Risk factors

Gender, Ethnicity, Age and Environment

At fertile age the ratio of female to male SLE patients is 15:1, but if the disease starts after 50 years of age the ratio changes to 8:1, suggesting that, as in other autoimmune diseases, female hormones are the strongest risk factor. A meta-analysis of several studies addressing this question showed altered steroid metabolism in patients: decreased levels of testosterone and progesterone and increased levels of estradiol and prolactin in female patients, and increased levels of prolactin in males 5. The increased frequency of the disease in older males may be associated with age-dependent changes in their hormonal status. However, it is difficult to conclude whether the hormonal differences are a primary event and a susceptibility factor or a result of the immune response. It is also thought that, in general, late-onset SLE may be more environmentally induced and/or associated with age-dependent demethylation of DNA 6,7.

The disease can appear in any age group, but in females it is most frequent at fertile age and in males in childhood/puberty or after 60 years of age. The peak incidence in women has been reported at 41-47 years in white females and 34.5 years in blacks 8. A greater incidence of SLE is reported in non-Caucasian populations, being at least 6-fold higher in blacks (about 1:200 females) than in whites (about 1:1200 females) and 5-fold higher in Asians than in whites 8. These differences may also reflect genetic, socio-economic and environmental differences between population groups, as well as the efficiency of diagnostics and medical assistance. In general, the prevalence of SLE is estimated to be 30-60 patients per 100 000 individuals (0.03-0.06%) in Caucasians (Sweden 9, Iceland 10, UK 11 and US 8,12) and 0.15-0.5% in non-Caucasian and admixed populations 8,11.

Environment

It has been observed that some drugs (procainamide, hydralazine, minocycline and approximately 80 others 13), toxins, industrial contaminants, food supplements, UV light and infections can induce SLE 14,15. One of the suggested mechanisms of drug-induced SLE involves alteration of the methylation status of DNA. Hypomethylation of genomic DNA was found in patients with SLE and RA, correlating with disease activity 16. Artificially hypomethylated mammalian DNA could cause lupus-like or rheumatoid arthritis-like symptoms in mice, demonstrating its immunogenic properties 17,18. Hypomethylation of DNA can cause uncontrolled activation of genes, unresponsiveness to inhibition/activation signals and altered expression, leading to cell dysfunction 6,7,19,20. The hypomethylated DNA observed in the plasma of SLE and RA patients 21,22 may be their own DNA released from apoptotic and necrotic cells, or bacterial and viral DNA, which has a high CpG content and is not methylated. Hypomethylated DNA induces the interferon alpha pathway and activates SLE 23.

Genetic studies of human SLE

SLE cases often cluster in families, indicating that genetic factors (certain allelic variants of genes) can increase the risk for the disease. The degree of familial clustering,

λs = disease prevalence in siblings of affected individuals / disease prevalence in the general population,

for SLE is estimated to be 20-80, depending on the population 24,25. This is similar to or higher than for other autoimmune diseases 24.

It was also observed that 25-69% of monozygotic (MZ) co-twins of probands develop the disease, compared to 2-3% of dizygotic (DZ) co-twins 25-27. About 4-12% of first- and second-degree relatives of SLE patients develop the disease 25,26. Taken together, these data indicate that SLE has a significant genetic contribution that could be detected by appropriate methods.

Based on these observations, a number of groups initiated the collection of families with multiple cases of SLE. Genome screens were performed and several susceptibility loci were identified, both for the disease itself 28-30 and for several disease manifestations or related phenotypes 31-36. Regions of linkage were detected on almost every chromosome, suggesting a contribution of several genes, common or population-specific. On human chromosome 1 alone, six linkage regions with LOD scores over 1.5 were detected 37.

The multiple linkage regions found for SLE reflect the complexity of the disease, with many susceptibility genes of weak effect and/or a high background of false-positive results. The genes associated with the disease in most regions detected by linkage are yet to be identified.

Numerous studies have suggested the involvement in human SLE of HLA class II genes 38, genes of the complement system, such as C1q, C2 and C4A deficiency 39, interleukin 10 (IL10) 40,41 and other cytokines (IL-4, IL-6, TNF alpha), the Fc receptors II and III 42 and the apoptosis gene bcl-2 43,44.

Animal studies

Most animal studies of SLE have been performed in mice. The New Zealand Black (NZB) mouse was the first described spontaneous model of SLE, developing hemolytic autoimmunity. The New Zealand White (NZW) strain develops some level of autoantibodies and mild glomerulonephritis at a late age and is usually considered a normal strain. The hybrid of NZB and NZW, called (NZBxNZW)F1, develops severe SLE-like disease characterized by kidney involvement at an early age, suggesting not additive but epistatic interactions between genes of both strains. Studies of a new strain called New Zealand Mixed (NZM), which is a backcross (NZBxNZW)F1 x NZW, revealed the presence of at least three loci: Sle1, responsible for spontaneous loss of immunological tolerance to nuclear antigens 45; Sle2, which lowers the activation threshold of the B cell compartment and mediates polyclonal/polyreactive antibody production 46; and Sle3, mediating a deregulation of the T cell compartment that potentiates polyclonal IgG antibody production and decreases activation-induced cell death in CD4+ T cells 47. It was suggested that the NZW strain harbors at least four suppressor loci (Sles1-Sles4), responsible for the absence of the severe phenotype in NZW mice 48,49.

Lpr (lymphoproliferation, an autosomal recessive mutation) is a mutation of the Fas gene, which encodes a glycoprotein mediating apoptosis in T and B lymphocytes 50; it was originally identified in the MRL-lpr strain. On different genetic backgrounds, this mutation causes increased proliferation of immune cells, and it accelerates the development of the disease phenotype in MRL mice. MRL mice show mild antibody production. MRL-lpr mice rapidly develop a severe form of SLE-like disease with autoantibody production and glomerulonephritis and die very early of kidney failure 51, suggesting that the MRL background promotes autoimmunity.

The BXSB mouse strain carries an unidentified mutant gene on chromosome Y, designated Yaa, which accelerates disease development in males 52. BXSB mice have been shown to have three main loci responsible for SLE-like disease, all mapped to chromosome 1: Bxs1 and Bxs2, associated with nephritis, and Bxs3, associated with anti-DNA antibody production, which overlaps with loci mapped in the NZB/W strains 53.

A number of transgenic and knockout animal models that show an SLE-like disease have been produced for several genes. The mouse pd-1 gene was originally isolated from a T cell hybridoma and was thought to play a role in apoptosis 54. pd-1 is expressed in the thymus and spleen and contains the so-called immunoreceptor tyrosine-based inhibitory motif (ITIM) 55, which is conserved in the human gene as well 56,57. The pd-1 knockout spontaneously develops lupus-like proliferative glomerulonephritis and arthritis. This suggests a role for pd-1 in the maintenance of peripheral self-tolerance by serving as a negative regulator of immune responses 58,59.

Rheumatoid arthritis (RA) – a complex autoimmune disease

In rheumatoid arthritis (OMIM 180300), CD4+ T cells, B cells and macrophages infiltrate the normally acellular synovial fluid and can form lymphoid aggregates known as germinal centers. Rheumatoid arthritis affects the small joints of the hands and feet. More systemic manifestations, such as vasculitis and lung and heart complications, can also be associated with RA. Variations in the clinical picture of the disease in different patients suggest that RA is a syndrome or a combination of several diseases with similar outcomes 60.

Risk factors

Gender, Ethnicity, Age and Environment

As in other autoimmune diseases, female gender is a risk factor for the development of RA. The ratio of females to males among RA patients is about 2.5:1 61. The disease usually starts between 40 and 60 years of age, with an average age at onset of 51 years 61. The disease is equally common in whites and blacks, affecting about 0.7% of males and 0.15% of females 63. The only ethnic group with an increased prevalence of RA is native Americans, among whom 3-4% of men and 3-7% of women are affected by the disease 63.

Known risk factors for RA include smoking 64,65, obesity and blood transfusion 65. Epigenetic demethylation of DNA has been associated with both RA and SLE 16,21. The changes can be age-dependent 7 or induced by drugs 13.

The rheumatoid factor (RF) was found in the blood of RA patients as a factor “that binds to the Fc portion of immunoglobulins” and therefore might represent causative self-antibodies 66. About 65% 61 of all RA patients are RF positive and show more severe joint destruction and a worse disease prognosis 60.


Population prevalence
    SLE: 0.03-0.06% in Caucasians; 0.1-0.5% in non-Caucasians
    RA: 0.25-2% 67; 3-7% in native Americans 63

Prevalence and risk for siblings
    SLE: 4-12% 25,26, λs=20-80 24
    RA: 2-4%, λs=2-17 67

Prevalence and risk for a monozygotic co-twin of a proband
    SLE: 25-69% 25,27,68
    RA: 12-15%, λs=12-62 67

HLA
    SLE: Weak effect of B8, DR3, DQ2 alleles 2
    RA: 30-50% of total risk, λs=1.7-1.8 67; Shared Epitope: DRβ1 alleles 69

Female:male ratio
    SLE: 9:1 8
    RA: 2.5:1 61

Diagnostic criteria
    SLE: ACR criteria, 1982 3
        1. Malar rash
        2. Discoid rash
        3. Photosensitivity
        4. Oral ulcers
        5. Non-erosive arthritis
        6. Serositis: pleuritis or pericarditis
        7. Renal disorder: persistent proteinuria or cellular casts
        8. Neurological disorder: seizures or psychosis
        9. Hematological disorder: haemolytic anemia, leukopenia, lymphopenia or thrombocytopenia
        10. Immunological disorder: LE cells, anti-dsDNA, anti-Sm or false-positive serology for syphilis
        11. Antinuclear antibodies
        The diagnosis is based on the presence of any four or more factors.
    RA: ACR criteria, 1987 70
        1. Morning stiffness in and around joints lasting at least 1 hour before maximal improvement
        2. Soft tissue swelling (arthritis) of 3 or more joint areas observed by a physician
        3. Swelling (arthritis) of the proximal interphalangeal, metacarpophalangeal, or wrist joints
        4. Symmetric swelling (arthritis)
        5. Rheumatoid nodules
        6. The presence of rheumatoid factor
        7. Radiographic erosions and/or periarticular osteopenia in hand and/or wrist joints
        Criteria 1 through 4 must have been present for at least 6 weeks.
        The diagnosis is based on the presence of any four or more factors.

Table 1. SLE and RA
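The counting rules in Table 1 can be sketched in a few lines of Python. This is only an illustration of the "four or more" rule and of the six-week condition on RA criteria 1 through 4; it is not a diagnostic tool, and the criterion labels and function names are shorthand invented for this sketch.

```python
# Criterion labels are hypothetical shorthand for the ACR criteria in Table 1.
SLE_CRITERIA = frozenset({
    "malar_rash", "discoid_rash", "photosensitivity", "oral_ulcers",
    "non_erosive_arthritis", "serositis", "renal_disorder",
    "neurological_disorder", "hematological_disorder",
    "immunological_disorder", "antinuclear_antibodies",
})

RA_CRITERIA = (
    "morning_stiffness", "arthritis_3_or_more_joint_areas",
    "arthritis_of_hand_joints", "symmetric_arthritis",
    "rheumatoid_nodules", "rheumatoid_factor", "radiographic_changes",
)
RA_DURATION_DEPENDENT = frozenset(RA_CRITERIA[:4])  # criteria 1 through 4
MIN_CRITERIA = 4  # "any four or more factors"
MIN_WEEKS = 6     # required duration for RA criteria 1 through 4


def meets_sle_rule(present):
    """Return True if at least four of the 11 SLE criteria are present."""
    present = set(present)
    unknown = present - SLE_CRITERIA
    if unknown:
        raise ValueError(f"unknown SLE criteria: {sorted(unknown)}")
    return len(present) >= MIN_CRITERIA


def meets_ra_rule(weeks_present):
    """Return True if at least four of the 7 RA criteria count as present.

    `weeks_present` maps each observed criterion to its duration in weeks.
    Criteria 1 through 4 are counted only if they lasted at least six weeks.
    """
    unknown = set(weeks_present) - set(RA_CRITERIA)
    if unknown:
        raise ValueError(f"unknown RA criteria: {sorted(unknown)}")
    counted = 0
    for name, weeks in weeks_present.items():
        if name in RA_DURATION_DEPENDENT and weeks < MIN_WEEKS:
            continue  # present, but not yet for long enough
        counted += 1
    return counted >= MIN_CRITERIA


if __name__ == "__main__":
    print(meets_sle_rule({"malar_rash", "photosensitivity",
                          "serositis", "antinuclear_antibodies"}))
    print(meets_ra_rule({"morning_stiffness": 8, "symmetric_arthritis": 3,
                         "rheumatoid_nodules": 1, "rheumatoid_factor": 1}))
```

In the second call only three criteria are counted, because symmetric arthritis has been present for less than six weeks.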


Genetic studies of human RA

Compared to SLE, RA has a lower genetic contribution, as defined by the disease risk for sibs and monozygotic co-twins of probands 24,67,71. For SLE, the risk for sibs of patients is λs=20-80 24, while for RA λs=2-17 67, indicating a weaker genetic component in RA than in SLE.

The first notion about heritability of RA dates back to 1806 and belongs to William Heberden 72. Much later, in 1978, Stastny observed an association between RA and alleles of the Human Leukocyte Antigen (HLA) DRB1 locus, collectively named the Shared Epitope (SE) 69,73. It is calculated that the effect of the HLA region can account for 20-85% of all genetic risk for RA 61. The SE is present in about 72% of RA patients 61. The exact action of the SE is not known, but it is suggested that immunogenic peptides can bind to the SE with increased specificity. The HLA region, located at 6p21.3 and harbouring the SE, is still the only region significantly associated with RA 74-76 (Table 2). Many other regions with suggestive linkage were detected in genome scans (Table 2). For most of these regions the association was weak, not reaching the significance threshold for genome-wide analysis (2.2×10⁻⁵) 77, with λs = 1.0-1.2. With a disease allele frequency of 20% (common allele) and a relative risk (λs) of 1.3, a set of 1324 patients and controls is required to detect the association with a power of 80% and p=0.05, while fewer than 304 patients are needed to detect an association with a locus of stronger impact, such as the HLA locus (λs=1.7) 78. Thus, it is evident that unless large combined sets of patients and controls can be studied, the chances of detecting significant associations outside the HLA region are close to zero.
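How strongly the required sample size depends on the effect size can be illustrated with a generic normal-approximation calculation for comparing an allele frequency between cases and controls. This is a minimal sketch and not necessarily the method of reference 78. It assumes a multiplicative per-allele risk model, equal group sizes, a two-sided test and alleles treated as independent observations. The function names are hypothetical, and how the resulting count maps onto patients and controls is left open.

```python
from math import ceil, sqrt
from statistics import NormalDist


def case_allele_frequency(control_freq, allelic_risk):
    """Expected risk-allele frequency in cases.

    Assumes a multiplicative model in which each copy of the allele
    multiplies disease risk by `allelic_risk` (an assumption of this sketch).
    """
    weighted = control_freq * allelic_risk
    return weighted / (weighted + (1.0 - control_freq))


def observations_per_group(control_freq, allelic_risk, alpha=0.05, power=0.80):
    """Observations needed per group for a two-sided two-proportion z-test."""
    p0 = control_freq
    p1 = case_allele_frequency(p0, allelic_risk)
    p_bar = (p0 + p1) / 2.0
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    z_beta = NormalDist().inv_cdf(power)
    numerator = (z_alpha * sqrt(2.0 * p_bar * (1.0 - p_bar))
                 + z_beta * sqrt(p0 * (1.0 - p0) + p1 * (1.0 - p1))) ** 2
    return ceil(numerator / (p1 - p0) ** 2)


if __name__ == "__main__":
    for risk in (1.3, 1.7):
        n = observations_per_group(control_freq=0.20, allelic_risk=risk)
        print(f"risk {risk}: about {n} observations per group")
```

Under these assumptions the sketch gives counts of the same order as the figures quoted above, and roughly four times as many observations are needed for a risk of 1.3 as for 1.7. This illustrates the trend only and does not establish how the quoted figures were derived.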

Marker            Family set   Position, cM   P value       Relative risk, λs
D1S214            4            16.4           1 LOD 3.58
D1S253            4            16.4           LOD 3.77
D1S228            1            32.4           0.007
D1S1631           2            136.9          0.001         1.1
D1S235            2            254.6          0.003         1.15
D2S380            1            102.1          0.01
D2S377            1            228.2          0.013
D2S2354           1            235.0          0.005
D4S2361           2            93.5           0.014         1.05
D5S1462           2            105.3          0.018         1.18
D6S1959           2            34.2           0.0002        1.7
D6S265 (HLA)      2            44.4           5×10⁻¹¹       1.8
D6S1629 (HLA)     2            44.9           5×10⁻¹¹       1.8
D6S276 (HLA)      3                           3×10⁻⁶
D6S276 (HLA)      1                           2×10⁻⁵
D6S273 (HLA)      2            47.7           3.6×10⁻¹²     1.8
D6S291 (HLA)      2            49.5           9.9×10⁻⁸      1.82
D6S389 (HLA)      2            53.8           8×10⁻⁷        1.4
D6S2427           2            53.8           1.5×10⁻⁴      1.82
D6S1017           2            63.3           0.02          1.5
D6S2410           2            73.1           0.003         1.35
D6S434            3            109            0.0007
D6S1021           2            112.2          0.008         1.14
D8S264            2            0.7            0.011         1.23
D8S277            2            8.3            0.026         1.24
D8S1110           2            67.3           0.007         1.09
D8S556            4            116.8          LOD 3.33
D8S373            2            164.5          0.02          1.1
D9S1121           2            44.3           0.013         1.19
D10S1221          2            75.6           0.0006        1.18
D12S398           2            68.2           0.005         1.11
D13S170           1            65.4           0.015
D13S1315          1            105.2          0.004
D16S403           2            43.9           0.008         1.19
D17S1298          2            10.7           0.003         1.0
D18S474           1            71.3           0.012
D18S858           2            80.4           0.002         1.23
D18S68            1            94.4           0.02
D18S61            1            102.8          0.01
D18S469           1            109.0          0.02
D22S264           1            0-9            0.01
DXS998            1            183.3          0.019

Table 2. Results of linkage studies for RA performed in four sets of families: 1 - 114 sib-pairs, European Consortium Rheumatoid Arthritis Families 74; 2 - 512 multicase families (US) 75; 3 - 256 sib-pairs, UK 76; 4 - 41 multicase families, Japan 79. Significant associations: the HLA region at 6p21 was detected in three of the four studies and the 1p36 region was detected in one study.

Other known factors

The gene for tumor necrosis factor alpha (TNF alpha) is also located within the HLA region and may therefore be one of the factors outside of the “shared epitope” contributing to the HLA association with RA. Increased expression of TNF in macrophages, monocytes and lymphocytes is induced by infections, activation and toxins, and is found in many diseases including AIDS, cancer, diabetes, multiple sclerosis and RA. In RA, TNF alpha is produced in the synovium by activated macrophages. Anti-TNF alpha antibodies were found to be useful as treatment for many RA patients 80,81.

Several other genes were tested in case-control association studies, with results that often contradicted each other 82.

Animal studies

The main model for human RA is collagen-induced arthritis (CIA), where the disease in mice and rats is induced by injections of collagen of different types. Depending on the substance inducing the disease, there are several other models: oil-induced arthritis (OIA), pristane-induced arthritis (PIA) or arthritis induced with live pathogens. Genome scans in all of the models detected more than 40 quantitative trait loci harbouring potential disease genes 83. The impact of HLA alleles and haplotypes was confirmed in transgenic mice carrying the susceptibility or protective haplotypes 84. Linkage observed in homologous regions identified in animal models and RA sib-pair families (human locus 17q22) led to the identification of a narrow region of about 1 cM associated with the disease 85. Another example of success in animal models for RA is a study of the mouse strain SKG, which spontaneously develops the disease. It was found that the genetic defect leading to the disease in this strain was a mutation in the ZAP-70 molecule, resulting in aberrant signal transduction and positive selection of autoreactive T cells 86. The identification of a variant of the Ncf1 gene (encoding neutrophil cytosolic factor 1, a component of the NADPH oxidase complex), which regulates arthritis severity, was also a result of studies in animal models. The disease-related allele of Ncf1 was found to reduce the oxidative burst response, thus promoting activation of arthritogenic T cells 87.

Finding genes for complex autoimmune diseases – The strategy

”…The way to deal with an impossible task was to chop it down into a number of merely very difficult tasks, and break each one of them into a group of horribly hard tasks, and each one of them…”

Terry Pratchett, “Truckers”


Finding all of the genes for complex autoimmune diseases represents quite an impossible task. The strategy therefore consists of a number of “merely difficult tasks” - genetic and functional studies (discussed below and Figure 2).

Figure 2. A combination of strategies for the identification and validation of disease genes and mutations.

The disease. Epidemiology and evidence for genetic factors.

The first question to ask before starting the quest for disease genes is: “how genetic is the disease?” The answer helps to evaluate the chances of finding mutations that segregate in families and that may be common in sporadic patients. The parameter evaluating the proportion of the genetic component in a disease is called the risk ratio, or $\lambda_s$ 78,88:

$$\lambda_s = \frac{\text{disease prevalence in siblings of affected individuals}}{\text{disease prevalence in the general population}}$$

This parameter evaluates the relative risk of developing the disease for siblings of affected probands compared with that in the general (unrelated) population. For many common autoimmune diseases, $\lambda_s$ was found to be in the range of 10-20 or higher 24. The higher $\lambda_s$ is, the more significant the genetic component of the disease and the easier it should be to find the genes involved in its pathogenesis.
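As a minimal sketch of the risk-ratio definition above, the calculation can be written in a few lines of Python. The function name and the two prevalence values are hypothetical and serve only to illustrate the arithmetic; they are not estimates for any particular disease.

```python
def sibling_risk_ratio(prevalence_in_siblings: float,
                       prevalence_in_population: float) -> float:
    """Return the sibling risk ratio: sibling prevalence / population prevalence."""
    if not 0.0 < prevalence_in_population <= 1.0:
        raise ValueError("population prevalence must be in (0, 1]")
    if not 0.0 <= prevalence_in_siblings <= 1.0:
        raise ValueError("sibling prevalence must be in [0, 1]")
    return prevalence_in_siblings / prevalence_in_population


# Hypothetical prevalences, for illustration only:
# 2% among siblings of affected individuals, 0.1% in the general population.
lambda_s = sibling_risk_ratio(prevalence_in_siblings=0.02,
                              prevalence_in_population=0.001)
print(f"sibling risk ratio = {lambda_s:.1f}")
```

With these assumed inputs the ratio falls at the upper end of the 10-20 range mentioned in the text.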

Another approach to evaluating the “heritability” of the disease is to compare concordance rates between monozygotic (MZ) and dizygotic (DZ) twins raised in the same families or separated early in childhood (adopted). The concordance rate estimates the population probability that a co-twin will be affected given that his/her partner is affected. If the concordance rate for MZ twins is higher than for DZ twins, then the disease may have an identifiable genetic component.
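The twin comparison can be illustrated with a short sketch. It assumes the probandwise form of the concordance rate, 2C / (2C + D), where C is the number of concordant pairs and D the number of discordant pairs; this is one common estimator of the conditional probability described above, and the text does not state which estimator was intended. All pair counts are hypothetical.

```python
def probandwise_concordance(concordant_pairs: int, discordant_pairs: int) -> float:
    """Estimate P(co-twin affected | twin affected) as 2C / (2C + D)."""
    if concordant_pairs < 0 or discordant_pairs < 0:
        raise ValueError("pair counts must be non-negative")
    affected_in_concordant = 2 * concordant_pairs
    total_affected = affected_in_concordant + discordant_pairs
    if total_affected == 0:
        raise ValueError("at least one affected twin is required")
    return affected_in_concordant / total_affected


# Hypothetical twin-pair counts, for illustration only.
mz_rate = probandwise_concordance(concordant_pairs=12, discordant_pairs=38)
dz_rate = probandwise_concordance(concordant_pairs=3, discordant_pairs=97)

print(f"MZ concordance: {mz_rate:.3f}")
print(f"DZ concordance: {dz_rate:.3f}")
print("MZ higher than DZ:", mz_rate > dz_rate)
```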

Even though it is quite obvious, the same disease criteria and the correct disease diagnosis should be used throughout the study.

Markers for genetic studies

Two types of polymorphic markers are used for genetic studies: SNPs (single nucleotide polymorphisms) and microsatellite markers, which together with minisatellites belong to the VNTR family (variable number of tandem repeats) 89. The term single nucleotide polymorphism (SNP) refers to variation at the level of a single nucleotide. Following the sequencing of the human genome 90, it became clear that one of every 300-570 nucleotides 91,92 in human genomic DNA varies between individuals. VNTRs are more polymorphic than SNPs, but they are about 100 times less frequent in the human genome 90 and are therefore mostly used in the initial steps of genetic mapping, while SNPs are more frequently used in fine mapping and association studies. The SNP Consortium (TSC) 93 and the International Human Genome Sequencing Consortium 91,94 aim to identify as many SNP variations as possible through the re-sequencing of a large number of individuals from different populations, and nearly 6 million SNPs have already been identified 91.

Linkage analysis

Linkage studies require families with multiple cases of the disease. Linkage analysis allows estimation of the probability that two or more polymorphic markers (or marker(s) and a disease) are inherited together, meaning that they are physically close to each other on the chromosome. The significance of linkage is evaluated by the LOD score (Z), where LOD stands for the logarithm of the odds. Linkage is considered significant when Z>3.3 and suggestive when Z=2.5-3.3 77.
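The LOD score can be illustrated with a deliberately simplified sketch. It assumes the textbook two-point case with phase-known, fully informative meioses, where Z(θ) = log10[θ^k (1-θ)^(n-k) / 0.5^n] for k recombinants among n meioses. Real pedigree analyses rely on dedicated linkage programs; the counts below are hypothetical.

```python
import math


def lod_score(recombinants: int, meioses: int, theta: float) -> float:
    """Two-point LOD score for phase-known meioses at recombination fraction theta."""
    if not 0 <= recombinants <= meioses:
        raise ValueError("recombinants must be between 0 and meioses")
    if not 0.0 < theta <= 0.5:
        raise ValueError("theta must be in (0, 0.5]")
    non_recombinants = meioses - recombinants
    log_likelihood_linked = (recombinants * math.log10(theta)
                             + non_recombinants * math.log10(1.0 - theta))
    log_likelihood_unlinked = meioses * math.log10(0.5)
    return log_likelihood_linked - log_likelihood_unlinked


def classify_linkage(z: float) -> str:
    """Apply the thresholds quoted in the text (Z>3.3 significant, 2.5-3.3 suggestive)."""
    if z > 3.3:
        return "significant"
    if z >= 2.5:
        return "suggestive"
    return "not significant"


# Hypothetical data: 2 recombinants observed among 20 informative meioses.
thetas = [i / 100 for i in range(1, 51)]          # grid 0.01 ... 0.50
best_theta = max(thetas, key=lambda t: lod_score(2, 20, t))
z_max = lod_score(2, 20, best_theta)

print(f"maximum LOD score Z = {z_max:.2f} at theta = {best_theta:.2f}")
print("linkage is", classify_linkage(z_max))
```

The grid search is a design shortcut; for this simple likelihood the maximum lies at θ = k/n, so the grid only serves to show how Z varies with θ.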

In the case of mendelian diseases, where all affected individuals have mutations in the same gene, LOD scores can be very high, as was the case in Friedreich ataxia, where several markers on chromosome 9 demonstrated strong linkage with the disease (Z=97-99) 95. For complex diseases, where many different genes are involved and many of them have weak effects, the majority of significant LOD scores are between 3.3 and 6.0. In most linkage studies performed for SLE to date, in which nearly 100 multicase families were studied 28, only a few LOD scores were higher than Z=3.3. This may reflect the complexity of the disease, false-positive results or genetic heterogeneity. Studying isolated and inbred populations is one way to decrease the heterogeneity of the disease, so that most patients in the population will have the same mutation or a limited number of mutations, which will be easier to detect.

Fine mapping

“Fine mapping” generally means more detailed studies of the region where significant linkage was detected. Multicase families, single-case families and sporadic patients, as well as corresponding population controls, are used for fine mapping studies.

Microsatellite markers and SNPs are usually used for fine mapping. SNPs, as bi-allelic markers, can have a low polymorphism information content (PIC value). To increase the PIC value of a locus, haplotypes 96,97 can be derived; these are series of alleles of markers that reside on the same chromosome and are associated with each other (in linkage disequilibrium 98). Several programs can help to reconstruct haplotypes from the alleles of several markers using genotypes from families or from unrelated individuals. Some examples are MERLIN 99, PHASE 100 and GENEHUNTER 101; many other useful programs for linkage, linkage disequilibrium and association studies can be found at the Rockefeller University website 102.
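To make the point about PIC values concrete, the sketch below uses the standard definition PIC = 1 - Σ p_i² - Σ_{i<j} 2 p_i² p_j², where p_i are the allele (or haplotype) frequencies. The frequencies are hypothetical; the comparison shows why a locus described by several haplotypes can be more informative than a single bi-allelic SNP.

```python
from itertools import combinations


def polymorphism_information_content(frequencies):
    """PIC = 1 - sum(p_i^2) - sum over pairs i<j of 2 * p_i^2 * p_j^2."""
    if any(p < 0 for p in frequencies):
        raise ValueError("frequencies must be non-negative")
    if abs(sum(frequencies) - 1.0) > 1e-9:
        raise ValueError("frequencies must sum to 1")
    homozygosity = sum(p * p for p in frequencies)
    pair_term = sum(2 * (p * p) * (q * q)
                    for p, q in combinations(frequencies, 2))
    return 1.0 - homozygosity - pair_term


# Hypothetical frequencies, for illustration only.
pic_snp = polymorphism_information_content([0.5, 0.5])            # one bi-allelic SNP
pic_haplotypes = polymorphism_information_content([0.25] * 4)     # four haplotypes

print(f"PIC of a bi-allelic SNP (0.5/0.5): {pic_snp:.3f}")
print(f"PIC of four equally frequent haplotypes: {pic_haplotypes:.3f}")
```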

Association studies can be performed in families or in sporadic patients and controls. In family-based studies, the most commonly used method is the transmission disequilibrium test (TDT) 103, which evaluates the transmission of alleles from heterozygous parents to their affected child or children. For larger pedigrees, the pedigree disequilibrium test (PDT) 104 is used, since it can handle more than one affected individual within a family. The haplotype relative risk (HRR) test 105 and the affected family-based controls (AFBAC) test 106 can use information from parents who are either homozygous or heterozygous for the marker or SNP. The result is a comparison of allele frequencies in chromosomes transmitted to patients from their parents versus non-transmitted (control) chromosomes. Family-based studies use affected and control chromosomes obtained from the same individuals, which gives the best ‘matching-by-ethnicity’ approximation and helps to avoid ‘population stratification’ (a situation in which the patient and control sets are not matched by ethnicity and allele frequencies differ between ethnic groups). Association studies in sporadic patients and controls matched by ethnicity, age and sex are the most common type, and their significance is presented as a p-value from a χ² distribution.
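The transmission disequilibrium test described above can be sketched as follows. The sketch assumes the usual McNemar-type statistic, (b - c)² / (b + c), where b and c are the numbers of times heterozygous parents transmitted or did not transmit the allele of interest to an affected child; the statistic is compared with a χ² distribution with one degree of freedom. The counts are hypothetical.

```python
import math


def chi_square_p_value_1df(statistic: float) -> float:
    """Upper-tail p-value of a chi-square distribution with 1 degree of freedom."""
    return math.erfc(math.sqrt(statistic / 2.0))


def tdt(transmitted: int, not_transmitted: int):
    """Return (chi-square statistic, p-value) for the transmission disequilibrium test."""
    informative = transmitted + not_transmitted
    if transmitted < 0 or not_transmitted < 0 or informative == 0:
        raise ValueError("need at least one informative transmission")
    statistic = (transmitted - not_transmitted) ** 2 / informative
    return statistic, chi_square_p_value_1df(statistic)


# Hypothetical counts from heterozygous parents: allele transmitted 60 times,
# not transmitted 40 times.
tdt_statistic, tdt_p_value = tdt(transmitted=60, not_transmitted=40)
print(f"TDT chi-square = {tdt_statistic:.2f}, p = {tdt_p_value:.4f}")
```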

The risk ratio 107 indicates by how many fold the probability of having the disease is increased in individuals with a certain allele (genotype) compared with those carrying a different allele.
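For a case-control comparison, the test and the effect size can be sketched from a 2×2 table of allele counts. The sketch reports the Pearson χ² statistic together with the odds ratio, which is used here as a stand-in for the risk ratio under the assumption that the disease is rare; this substitution is an assumption of the example, not a statement from the text. All counts are hypothetical.

```python
import math


def allele_association(case_risk: int, case_other: int,
                       control_risk: int, control_other: int):
    """Return (chi-square, p-value, odds ratio) for a 2x2 table of allele counts."""
    a, b, c, d = case_risk, case_other, control_risk, control_other
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive in this simple sketch")
    n = a + b + c + d
    chi_square = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = math.erfc(math.sqrt(chi_square / 2.0))   # chi-square with 1 d.f.
    odds_ratio = (a * d) / (b * c)
    return chi_square, p_value, odds_ratio


# Hypothetical allele counts: 200 patients and 200 controls (400 chromosomes each).
chi_square, p_value, odds_ratio = allele_association(
    case_risk=120, case_other=280, control_risk=80, control_other=320)

print(f"chi-square = {chi_square:.2f}, p = {p_value:.4f}, odds ratio = {odds_ratio:.2f}")
```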

Replication of association studies in other populations and in independent (and larger) sets of patients and controls can help to support real associations and eliminate false positives.

Ancestral haplotype analysis

The idea of this approach is to find the smallest common ancestral fragment of the haplotype that is still associated with the disease in different patients. This has been successful in studies of monogenic diseases, but the approach has had limited application in the field of complex diseases. It assumes that all, or the majority, of the patients have the same causative mutation, located within the same causative haplotype. In fact, in complex diseases, patients even within the same family may have different genetic and environmental causes for the disease.

The size of the conserved fragment depends on the degree of linkage disequilibrium (LD) in the region 108-110. In practice, high LD means that some alleles of neighboring markers will always be detected together with each other or with a disease. LD, in turn, is a function of the distance between markers, the age of the markers, the movement of populations and the location of the markers in the genome (centromeric or telomeric, hot spots, etc.). LD over distances from 84 to more than 250 kb 111,112 has been described, meaning that chromosomal fragments of this size are inherited as intact haplotypes and that association with the disease will be detected by any marker within these blocks.
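The degree of LD between two bi-allelic markers is commonly summarized by D, |D'| and r². The sketch below computes these standard measures from the four haplotype frequencies; the frequencies are hypothetical, and the text itself does not specify which LD measure was used.

```python
def linkage_disequilibrium(freq_AB: float, freq_Ab: float,
                           freq_aB: float, freq_ab: float):
    """Return (D, |D'|, r^2) for two bi-allelic markers from haplotype frequencies."""
    if abs(freq_AB + freq_Ab + freq_aB + freq_ab - 1.0) > 1e-9:
        raise ValueError("haplotype frequencies must sum to 1")
    p_A = freq_AB + freq_Ab          # frequency of allele A at the first marker
    p_B = freq_AB + freq_aB          # frequency of allele B at the second marker
    if not (0.0 < p_A < 1.0 and 0.0 < p_B < 1.0):
        raise ValueError("both markers must be polymorphic")
    d = freq_AB - p_A * p_B
    if d >= 0:
        d_max = min(p_A * (1.0 - p_B), (1.0 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1.0 - p_A) * (1.0 - p_B))
    d_prime = abs(d) / d_max
    r_squared = d * d / (p_A * (1.0 - p_A) * p_B * (1.0 - p_B))
    return d, d_prime, r_squared


# Hypothetical haplotype frequencies, for illustration only.
d, d_prime, r_squared = linkage_disequilibrium(0.4, 0.1, 0.1, 0.4)
print(f"D = {d:.3f}, |D'| = {d_prime:.2f}, r^2 = {r_squared:.2f}")
```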

Candidate gene(s)

When a region with a significant LOD score has been found and fine mapping is done, there are three alternatives. First, many genes within the region can be potential candidates owing to a priori functional information. Alternatively, there may be no single apparent candidate gene, as many genes are still of unknown function. In the best-case scenario, just one or two functionally important genes become possible candidate genes.

Mutations

In general, pathological changes at the DNA level (mutations) can be classified into two main categories: structural and regulatory mutations.

Structural mutations

Structural mutations are deletions, insertions, duplications, translocations and substitutions of one amino acid by another amino acid or by a stop codon. These types of severe changes in genes may result in a monogenic disease (multiple mutations in the CFTR gene for cystic fibrosis 1) or in early and severe forms of complex diseases (mutations in the BRCA1 gene for breast cancer 113 or in the presenilin gene for Alzheimer’s disease 114).

Regulatory mutations

The majority of mutations of this type are so-called regulatory single nucleotide polymorphisms (rSNPs) 115:

-rSNPs altering binding sites for regulatory proteins

rSNPs can alter (modify, destroy or create) recognition and binding sites for regulatory proteins. As a result, binding can be increased or decreased, abrogated or induced, leading to changes in the level, timing and localization of gene expression. For instance, an rSNP in the promoter region of the gene encoding tumor necrosis factor alpha (TNF alpha) created a new binding site for the OCT-1 transcription factor that upregulated the expression of TNF alpha in monocytes, and thereby increased the risk for cerebral malaria in Africans 116.

-rSNPs affecting stability of the transcripts

rSNPs located in the 5’ and 3’ untranslated regions (UTRs) of genes can alter the binding of proteins that regulate the stability of newly synthesized RNA. The regulatory sequences in UTRs form secondary structures (hairpins), which may serve as regulatory elements as they are recognized and bound by regulatory proteins. These proteins protect mRNA from degradation by RNases 117,118. Specific adenylate- and uridylate-rich (AU-rich) elements (ARE) are often found in unstable mRNAs 119. rSNPs occurring in these elements can be associated with increased stability or decay of mRNA. Even silent exonic mutations can affect mRNA stability and the efficiency of mRNA translation through altered mRNA folding 120.

-rSNPs affecting mRNA splicing

Splicing of mRNA is regulated by the 5’ and 3’ splice sites and a branch site located in introns, as well as by intronic and exonic enhancers and silencers. Therefore, SNPs occurring within these sequences can create non-functional forms of proteins through exclusion or inclusion of exons and/or activation of alternative splice variants 121. Approximately 15% of mutations associated with human diseases affect mRNA splicing 122. Mutations in exonic splicing enhancers (ESE) have been associated with human diseases, for example spinal muscular atrophy (SMA) 123, Becker’s muscular dystrophy (BMD) 124, breast cancer 125 and cystic fibrosis 126,127. Inactivation of an exonic silencer by an rSNP within the CD45 gene, leading to inclusion of an alternative exon, was associated with susceptibility to multiple sclerosis 128. One of the mutations in the BRCA2 gene causing early-onset breast cancer has been associated with disruption of an intronic splice site, causing exon skipping and creating a deleterious allele of the gene 129. An rSNP found in the 3’-UTR of the CTLA-4 gene and associated with susceptibility to several autoimmune diseases was a cause of alternative splicing and expression of a non-functional variant of the protein that lacks its ligand-binding domain 130.

Mutation identification and genotyping

In the search for any kind of mutation, re-sequencing of full-size genes in DNA from patients and controls from defined populations is still the most direct way. However, this is an expensive and time-consuming process, and alternative methods of SNP detection have been developed. One such method is single-strand conformation polymorphism (SSCP) analysis, in which polymerase chain reaction (PCR) fragments amplified from the DNA of different individuals are denatured and separated on non-denaturing slab polyacrylamide gels or using capillary sequencers 131,132. DNA variants present in the amplicons result in different conformations, and hence different migration, of the single-stranded DNA through the gel. To identify the nature of the migration difference, the PCR fragments must be sequenced. A more recent and widely used method is denaturing high-performance liquid chromatography (DHPLC). PCR fragments are generated and analyzed by DHPLC, and fragments differing from the wild type are sequenced. This method has the advantage of allowing high-throughput analysis and semi-automation 133,134.

Comparative studies of conserved noncoding regions using the extensive sequence information from different species can provide interesting data about putative functional elements and pathways of gene regulation 135. Computer programs such as VISTA 136 or NIX 137, which allow large-scale sequence comparison, help to highlight regions of high evolutionary conservation for more detailed searches. However, 32-40% of functional sites present in humans or rodents are not functional in other species 138.

Once SNPs are detected, a large selection of genotyping methods is available. Restriction fragment length polymorphism (RFLP) analysis is the oldest and simplest method for genotyping 139, but it can only be used when the nucleotide change alters the recognition site of a restriction enzyme. Sequencing is also an option in many cases, especially where automated capillary high-throughput sequencing is available. The high demand for SNP genotyping has resulted in the development of many other new and efficient methods with high-throughput capabilities (Table 3). The choice of method will depend on the genotyping scale and the resources, as many methods require expensive equipment and reagents.

| Method | Refs |
|---|---|
| Matrix-assisted laser desorption/ionisation time-of-flight mass spectrometry (MALDI-TOF) | 140-142 |
| Dynamic allele-specific hybridization (DASH) | 143,144 |
| Pyrosequencing | 145 |
| Minisequencing | 146,147 |
| Invader | 148,149 |
| Rolling circle amplification | 150,151 |
| McSNP (melting curve analysis) | 152 |
| Multiplex automated primer extension analysis (MAPA) | 146 |
| MIP genotyping with Molecular Inversion Probes | 153 |
| Survivor Assay, a SNP detection method based on electrospray ionization mass spectrometry (ESI-MS) | 154 |
| Qbead system with fluorescent Qdot semiconductor nanocrystals | 155 |
| Template-directed dye-terminator incorporation with fluorescence quenching detection (FQ-TDI) | 156 |
| SNP genotyping using single-tube fluorescent bidirectional PCR | 157 |
| Multiplexed single-base extension (SBE) genotyping via end-labeled free-solution electrophoresis (ELFSE) | 158 |
| Molecular beacons and Real-Time PCR | 159 |
| Site-selective RNA scission | 160 |
| ‘ZipCodes’: oligonucleotide ligation assay (OLA) and flow cytometric analysis of fluorescent microspheres | 161 |
| Fluorescence polarization | 162 |
| Amplifluor (allele-specific amplification and universal energy-transfer-labeled primers) | 163 |

Table 3. Methods of SNP genotyping

The ‘post-p value stage’ or functional validation of mutations

Identification of splicing variants

Alleles of rSNPs can affect splicing by creating or abrogating splice sites or by modifying splicing enhancers and silencers, and this can be analyzed first with computational tools. Exonic splicing enhancers (ESE) are the most common splicing regulatory elements and are present in the majority of exons 164; the ESE Finder tool helps to identify these elements in genomic DNA 165. RESCUE-ESE, another tool, was used to screen for exonic enhancer elements within 4817 human genes, and ESE of 10 different types were identified through this approach 166. The NIX software also helps to predict exons and introns in a genomic sequence by means of several prediction algorithms, and the effect of a particular allelic change can also be evaluated 137. The presence of alternative transcripts should of course be confirmed by experimental analysis of RNA (cDNA) from individuals carrying the allele. This can be accomplished by northern blots or RT-PCR analysis of cDNA, followed by sequencing of the alternative fragments and correlation of the presence of the alternative transcripts in healthy and sick individuals with the alleles of the rSNP. The relative ratio of alternative transcripts can be measured by quantitative RT-PCR or microarrays.

The effect of rSNPs on splicing can be studied with the use of minigenes, in which exons suspected to be alternatively spliced, together with their adjacent intronic sequences, are cloned into expression vectors. The effect of particular allelic changes is evaluated by site-directed mutagenesis and transient expression of the minigenes in cell cultures 167,168. This approach is also used to study the activity of predicted splicing regulatory elements 166.

Studies of mRNA stability

rSNPs can result in altered stability of mRNA, which can be evaluated experimentally as described 169,170.

Prediction of transcription factor binding sites

The TransFac 171 and TFSearch 172 databases can help to analyze genomic sequences for the presence of binding sites of known transcription factors. Both alleles of the rSNP should be evaluated, as either allele can abrogate or create a binding site. CONSITE 173 is another tool that helps to compare sequences from two species and find conserved transcription factor binding sites present in both. Several centers provide access to many useful web-based programs available freely or upon registration 137,174. Computer predictions result in hundreds of potential binding sites, many of which will be false positives, and only a small proportion will be active at a certain time or in a certain tissue. Relevant predictions have to be evaluated individually and in the light of previous knowledge of the biology of the gene before experimental validation is initiated.
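The principle of evaluating both alleles of an rSNP against a binding-site model can be sketched with a small position weight matrix scan. Everything in this example is hypothetical: the 5-bp frequency matrix, the sequences and the SNP position are invented for illustration and are not taken from TransFac, TFSearch or any other database, and only the forward strand is scanned.

```python
import math

# Hypothetical position frequency matrix for a 5-bp motif (consensus GATAA).
CONSENSUS = "GATAA"
MATRIX = [{base: (0.85 if base == consensus_base else 0.05) for base in "ACGT"}
          for consensus_base in CONSENSUS]
BACKGROUND = 0.25  # assumed uniform background frequency for each base


def window_score(window: str) -> float:
    """Log-odds score (base 2) of one window against the matrix."""
    return sum(math.log2(MATRIX[i][base] / BACKGROUND)
               for i, base in enumerate(window))


def best_score_over_snp(sequence: str, snp_index: int) -> float:
    """Best score among forward-strand windows that overlap the SNP position."""
    width = len(MATRIX)
    first = max(0, snp_index - width + 1)
    last = min(snp_index, len(sequence) - width)
    return max(window_score(sequence[start:start + width])
               for start in range(first, last + 1))


# Hypothetical alleles differing at index 5 (T in one allele, C in the other).
allele_1 = "TTCGATAAGCT"
allele_2 = "TTCGACAAGCT"
score_1 = best_score_over_snp(allele_1, snp_index=5)
score_2 = best_score_over_snp(allele_2, snp_index=5)

print(f"allele 1 best score: {score_1:.2f}")
print(f"allele 2 best score: {score_2:.2f}")
print("predicted site weakened by allele 2:", score_2 < score_1)
```

A score difference of this kind is only a prediction and, as noted above, would still need experimental validation.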

Electrophoretic mobility shift assay (EMSA)

Predicted DNA–protein interactions can be validated with electrophoretic mobility shift assays (EMSAs, also called ‘gel shift’ or ‘band shift’ assays) 175. In this method, oligonucleotides 20-25 bp in length representing both alleles of a SNP are labeled radioactively or non-radioactively and incubated with nuclear extracts prepared from cells or tissues of interest. Protein(s) from the nuclear extract that bind to the oligonucleotides will retard their migration in a gel owing to the formation of DNA–protein complexes. Allele-specific binding suggests a potential functional effect of the SNP studied. If the addition of an excess of unlabelled oligonucleotide containing one allele, but not the other, decreases or eliminates the binding, and an unrelated oligonucleotide does not cause this effect, the specificity of the interaction is demonstrated. The addition of antibodies against the nuclear factor causes additional retardation of the migration of the DNA–protein complexes and demonstrates the involvement of that nuclear protein in the interaction. Some factors are expressed at very low levels in any cell type, making it difficult to detect a DNA–protein interaction when the protein in question represents only a tiny proportion of all proteins present in the nuclear extract. In this case, a recombinant form of the transcription factor or its DNA-binding domain can be used in an EMSA 176. Not all transcription factors and their binding sites are present in the databases, which is why an interaction can be detected even if no binding site was predicted 177.

DNase I footprinting

Footprinting of DNase I-hypersensitive sites using nuclear extracts can confirm transcription factor binding and show the position and number of sites occupied by the protein(s) 178. In this method, a radioactively labeled DNA fragment (usually PCR-amplified, up to 400-500 bp in length) is incubated with a nuclear extract and treated with the DNase I enzyme. The digestion products are resolved on polyacrylamide gels and analyzed: sites of DNA covered by the interacting proteins are protected from DNase I cleavage and leave ‘white spots’ (footprints), surrounded by a ladder of bands corresponding to digested (unprotected) DNA. Multiple binding sites of the same or different nuclear factors will result in several footprints. The exact position and identity of the nucleotides constituting the binding target can also be defined using this method.

If several proteins bind DNA at the same time to form a complex, EMSA will not be able to detect such an interaction, because not all proteins required for this complex will be able to bind to the short oligonucleotides used in EMSA. In cases like this, footprinting can give more information.

Reporter assays

The potential regulatory activity (promoter, enhancer or silencer) of a putative regulatory sequence, and the effect of a particular change caused by the alleles of an rSNP, can be studied using reporter gene assays 179. Fragments of DNA carrying alleles of the rSNP are cloned into a promoter-less vector or a vector carrying a weak promoter 180. Either of these vectors carries a ‘reporter’ gene, whose expression will depend on the inserted regulatory sequence. Such reporter genes are easy to monitor; those commonly used are the firefly-derived luciferase gene and the CAT (chloramphenicol acetyl transferase) gene. The different allelic constructs, together with a control plasmid used to normalize for transfection efficiency, are transiently transfected into appropriate cell lines. The level of reporter gene expression is evaluated for both constructs (that is, for both alleles of the rSNP) and compared with the expression of the ‘empty’ vector. An increase or decrease in reporter gene expression caused by one of the alleles will demonstrate that the allele affects the activity of the regulatory element in which it is found (promoter, enhancer or silencer). Some transcription factors may require partners or co-factors to be fully active in some cell types; for example, binding of the RUNX1 protein to DNA is increased in the presence of its partners CBF beta (40-fold) or Ets-1 (10-fold) 181. Plasmids carrying an interacting partner can be co-transfected with the allele-specific constructs. In this type of experiment, the choice of cell line is very important. Many genes are expressed only under certain conditions, for example upon cellular activation, making it important to select the right activating agent and to perform a preliminary analysis of the kinetics of activation for each cell type used.
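The normalization and comparison steps of a reporter assay can be sketched as follows. The sketch assumes that each transfection yields one reporter reading and one reading from the co-transfected control plasmid, and that activity is expressed relative to the ‘empty’ vector; the function names and all readings are hypothetical.

```python
def normalized_activity(reporter_signal: float, control_signal: float) -> float:
    """Reporter signal corrected for transfection efficiency (reporter / control)."""
    if control_signal <= 0:
        raise ValueError("control signal must be positive")
    return reporter_signal / control_signal


def fold_over_empty(construct, empty_vector) -> float:
    """Normalized activity of a construct relative to the empty vector."""
    return normalized_activity(*construct) / normalized_activity(*empty_vector)


# Hypothetical (reporter, control) readings in arbitrary light units.
empty_vector = (1000.0, 500.0)
allele_1_construct = (12000.0, 600.0)
allele_2_construct = (4500.0, 450.0)

fold_allele_1 = fold_over_empty(allele_1_construct, empty_vector)
fold_allele_2 = fold_over_empty(allele_2_construct, empty_vector)

print(f"allele 1: {fold_allele_1:.1f}-fold over empty vector")
print(f"allele 2: {fold_allele_2:.1f}-fold over empty vector")
print(f"allele 1 / allele 2 ratio: {fold_allele_1 / fold_allele_2:.1f}")
```

In practice each construct would be measured in replicate transfections and the difference between alleles tested statistically; the sketch shows only the arithmetic of one measurement.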

Chromatin immunoprecipitation

Protein–DNA interactions detected in vitro (by EMSA or footprinting) may not occur in vivo. Chromatin immunoprecipitation (ChIP) is the method of choice for studying whether a particular nuclear factor is indeed binding a particular DNA target under a particular condition. The availability of antibodies against the nuclear factor in question is one of the main requirements for success with this method. Formaldehyde crosslinking is used to make ‘snapshots’ of DNA–protein and protein–protein contacts at a given moment in living cells (cell cultures) or tissues. DNA bound to its interacting proteins is sonicated to produce fragments of 300–700 bp and is incubated with antibodies specific for the protein. The fragments of DNA (allele-specific fragments in the case of SNPs) bound to their interacting proteins are immunoprecipitated with the specific antibodies and can be detected by PCR. Real-time quantitative PCR also allows evaluation of the strength (degree) of the interaction by measuring the quantity of precipitated DNA and, indirectly, of the interacting complexes. To simulate the relevant disease conditions in vitro, cells can be treated with different agents. The combination of microarray chips and ChIP technology has already produced a new technique called ‘ChIP-on-chip’ 182. Genomic microarrays representing sequences of the whole genome, separate chromosomes or genomic fragments (for example, promoters or CpG islands) can be used for this type of analysis. The first examples of this hybrid technology, using arrays containing spotted PCR fragments of promoters, have been successful 183. Another example is the use of CpG island microarrays, often representing 5’ regions of active genes 184. One of the possible applications of this approach is the identification of novel gene targets for transcription factors in order to provide potential candidate genes for complex diseases. Most of the SNPs in the binding sites for transcription factors found in the target genes through this approach are expected to be regulatory.

Methylation studies

The addition of a methyl group to a cytosine (C) residue in CpG dinucleotides in the mammalian genome creates 5-methylcytosine (5-MeC). When deaminated, the two forms of cytosine produce different bases: cytosine is converted to uracil (which is found in RNA but not in DNA and is therefore recognized and removed), whereas 5-MeC is converted to thymine (T). During evolution, this deamination process has contributed to a gradual decrease in the frequency of CpG dinucleotides in the genome (the GC content of the human genome is about 41%, yet CpG dinucleotides are strongly under-represented, and most of them are methylated) 90,185. The only regions where CpGs are usually unmethylated, and therefore protected from mutations, are important regulatory regions called CpG islands 185. If a SNP occurs in a CpG dinucleotide where the cytosine is methylated (5-MeC), the allelic nucleotide change of either the C or the G base will abolish methylation at this site and alter the binding of methylation-sensitive regulatory proteins 186. Somatic hypermethylation of genes is associated with repression of gene transcription 187, whereas hypomethylation is associated with activation of gene transcription and is found in several examples of cancer and autoimmunity and in ageing 7,188. Bisulphite treatment allows one to distinguish C from 5-MeC. This treatment converts non-methylated Cs to uracil, which is read as T after PCR, and leaves the methylated C residues unchanged. DNA treated with bisulphite is then amplified by PCR, the fragments are cloned and 15-20 colonies for each fragment are sequenced to determine the proportion of methylation at each CpG site. The functional importance of the methylated allele can be established with EMSA using methylated oligonucleotides and/or by investigating the correlation between the level of methylation of a regulatory region containing the SNP and the level of expression of the corresponding gene 186.

The effect of methylation can also be evaluated in reporter assays in which allele-specific constructs are methylated in vitro with SssI methylase and used for transfection. If the DNA–protein interaction depends on methylation, the level of expression of the constructs will differ between the methylated and non-methylated forms 189.
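The step of determining the proportion of methylation at each CpG site from sequenced clones can be sketched as follows. The sketch assumes that all clone sequences are already aligned to the untreated reference on the same strand and have the same length; a C retained at a reference CpG is counted as methylated and a T as unmethylated. The reference and the four clone sequences are hypothetical, and fewer clones are used than the 15-20 mentioned in the text.

```python
def cpg_methylation(reference: str, clones):
    """Return {CpG position: fraction of informative clones retaining C}."""
    reference = reference.upper()
    cpg_positions = [i for i in range(len(reference) - 1)
                     if reference[i:i + 2] == "CG"]
    proportions = {}
    for position in cpg_positions:
        # Only clones aligned over the full reference length are used.
        calls = [clone[position].upper() for clone in clones
                 if len(clone) == len(reference)]
        methylated = calls.count("C")      # C protected from conversion
        unmethylated = calls.count("T")    # C converted and read as T
        informative = methylated + unmethylated
        if informative:
            proportions[position] = methylated / informative
    return proportions


# Hypothetical reference with CpG sites at positions 1 and 5 (0-based),
# and four bisulphite-converted clone sequences.
reference = "ACGTTCGA"
clones = ["ACGTTCGA", "ACGTTTGA", "ATGTTTGA", "ACGTTTGA"]

methylation = cpg_methylation(reference, clones)
for position, fraction in sorted(methylation.items()):
    print(f"CpG at position {position}: {fraction:.0%} methylated")
```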

Expression studies – microarrays and quantitative RT-PCR

Large-scale expression studies became possible with microarrays. The two main types of microarrays are oligonucleotide arrays (Affymetrix 190) and custom cDNA arrays 191. Affymetrix arrays contain a set of oligonucleotides representing several thousand genes from the human genome. cDNA arrays usually contain smaller and customized sets of genes (specific genes relevant for pathways or from certain chromosomal regions, etc.). These arrays are cheaper than oligonucleotide arrays, as they consist of spotted PCR fragments of sequence-verified cDNA clones. The effect of rSNPs in regulatory proteins (transcription factors) on the expression of their target genes is often impossible to predict, but it can be detected by analyzing the expression of thousands of genes at the same time using microarrays. Some of these genes will be directly regulated by the transcription factor through binding to its target sites, whereas other genes will be regulated indirectly through a chain of remote responses. Analysis software for microarrays can help to cluster genes according to their expression patterns, suggesting their involvement in certain pathways 190-196.

Quantitative RT-PCR (TaqMan, for example) is another approach to studying gene expression when only a few genes are of interest. Only a limited number of genes can be studied at the same time, but this allows a more individual and sensitive evaluation than microarrays 197.

Allelic imbalance

The functional effect of rSNPs on the expression of genes located on the same DNA molecule (cis-position) can be evaluated in assays measuring allelic imbalance. Whatever the nature of the alteration associated with the variants of a SNP (loss or gain of binding sites for regulatory factors, altered efficiency of this binding, or epigenetic change), this alteration might be detected by measuring the proportion of expression of both allelic variants of the gene. If there is no preferential expression of one allele over the other, both will be equally expressed, in a proportion of 50:50. Otherwise, this proportion will be shifted in favor of one of the alleles. A 10% deviation from equimolar expression of alleles is an arbitrary threshold for identification of allelic imbalance (where the proportion of the transcripts is at least 40:60). An extreme case of allelic imbalance is the monoallelic expression of imprinted genes, where one allele (maternal or paternal) is always repressed and the proportion of transcripts is therefore 100:0. Several recent studies have shown that SNPs in 18-54% of the analyzed genes (this wide range probably depends on the study, the number of individuals and the thresholds used) demonstrated deviation from equimolar expression of their alleles in at least one individual 198-200. Different methods based on detection of allele-specific mRNA expression can be used: microarrays, TaqMan RT-PCR, padlock probes 201. Identification of allelic imbalance may be the most direct method to prove the functional implication of a given rSNP in the regulation of a given gene and provides a first step towards an understanding of the role of that rSNP in disease susceptibility.
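The 40:60 threshold described above can be sketched in a few lines of Python. This is an illustration only: the function name, the use of raw transcript counts as input and the example counts are assumptions, not part of the cited assays.

```python
from fractions import Fraction


def is_allelic_imbalance(count_a, count_b, threshold=Fraction(1, 10)):
    """Return True if the share of allele A transcripts deviates from 50%
    by at least `threshold` (10% by default, i.e. 40:60 or more extreme)."""
    total = count_a + count_b
    if total == 0:
        raise ValueError("no transcripts counted")
    # Exact rational arithmetic avoids float rounding right at the 40:60 boundary.
    share_a = Fraction(count_a, total)
    return abs(share_a - Fraction(1, 2)) >= threshold


# Hypothetical transcript counts for the two alleles of one gene
print(is_allelic_imbalance(50, 50))   # equimolar expression
print(is_allelic_imbalance(40, 60))   # exactly at the threshold
print(is_allelic_imbalance(100, 0))   # monoallelic expression, as for imprinted genes
```

Exact fractions are used because a floating-point comparison of 0.5 - 0.4 against 0.1 can fall on the wrong side of the threshold.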


Present investigation

Research aim

The general aim of the project was to develop a strategy and find susceptibility genes for the complex autoimmune diseases systemic lupus erythematosus (SLE) and rheumatoid arthritis (RA).

Paper I. A regulatory polymorphism within the PDCD1 gene is associated with susceptibility to systemic lupus erythematosus in humans.

Linkage to the 2q37.3 region (marker D2S125, with a two-point LOD score of Z=4.24) was detected in a genome scan of multicase families from Iceland, Norway and Sweden 30. Additional studies of this region in the same set of families suggested that the most likely location of the locus associated with SLE (denoted the SLEB2 locus) is the interval between the marker D2S125 and the joint markers D2S2585/D2S2985 (multipoint LOD score of Z=6.03) 202.
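For readers less familiar with LOD scores, the sketch below shows the textbook definition Z(theta) = log10[L(theta)/L(0.5)] for the simplest case of phase-known meioses, and the conversion of a LOD score to odds in favor of linkage. It is an assumption-laden illustration: the function names and the family counts are hypothetical, and the scores reported here came from linkage analysis of real pedigrees, not from this simplified formula.

```python
import math


def lod_score(n_meioses, n_recombinants, theta):
    """Two-point LOD score Z(theta) = log10[L(theta) / L(0.5)] for
    phase-known meioses (a textbook simplification)."""
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie between 0 and 0.5")
    k, n = n_recombinants, n_meioses
    likelihood = (theta ** k) * ((1.0 - theta) ** (n - k))
    if likelihood == 0.0:
        return -math.inf  # linkage at this theta is excluded by the data
    return math.log10(likelihood / 0.5 ** n)


def lod_to_odds(z):
    """Convert a LOD score to odds in favor of linkage."""
    return 10.0 ** z


# Hypothetical family data: 10 informative meioses, no recombinants
print(lod_score(10, 0, 0.0))
# Odds corresponding to the reported two-point LOD score of Z=4.24
print(lod_to_odds(4.24))
```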

Aim: To study the locus SLEB2 on chromosome 2q37.3 where linkage with SLE has been detected.

Material: 2510 individuals as shown in Table 4.


Set      Structure of replication sets           Number of families   Number of individuals
Set I    Nordic multicase families               31                   105
Set II   Swedish singlecase families             66                   238
         Swedish sporadic patients                                    200
Set III  European-American multicase families    151                  849
Set IV   Mexican multicase families              25                   129
         Mexican singlecase families             86                   279
         Mexican sporadic patients                                    320
Set V    African-American multicase families     82                   390
Total                                                                 2510

Table 4. Structure of the sets
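As a quick consistency check, the individual counts in Table 4 can be summed and compared with the stated total. The snippet below is only a sketch; the data structure and variable names are illustrative, and the numbers are copied from the table.

```python
# (set, group, number of families or None for sporadic patients, number of individuals)
TABLE_4 = [
    ("Set I", "Nordic multicase families", 31, 105),
    ("Set II", "Swedish singlecase families", 66, 238),
    ("Set II", "Swedish sporadic patients", None, 200),
    ("Set III", "European-American multicase families", 151, 849),
    ("Set IV", "Mexican multicase families", 25, 129),
    ("Set IV", "Mexican singlecase families", 86, 279),
    ("Set IV", "Mexican sporadic patients", None, 320),
    ("Set V", "African-American multicase families", 82, 390),
]

total_individuals = sum(row[3] for row in TABLE_4)
total_families = sum(row[2] for row in TABLE_4 if row[2] is not None)

# The sum of individuals agrees with the total of 2510 given in the table.
print(total_individuals, total_families)
```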

Results and discussion

Physical and genetic mapping of the region

The SLEB2 locus is located in the 2q37.3 region at the very telomeric end of chromosome 2. At the time of the study (1999-2002), this region was poorly mapped and sequenced. We performed physical mapping of the region with BAC-clones available from all possible sources. We could find only one mouse BAC-clone, and no human BAC-clone, containing the PD-1 gene. We sequenced this mouse clone to identify the genes surrounding the PD-1 gene. SNPs from several other genes within the region were used for haplotype construction and linkage studies. The position of the PD-1 gene within the 2q37 locus, and in relation to some other genes and BAC-clones from this region, was confirmed by FISH analysis. The physical map of the region and the list of genes, including candidate genes, were therefore based on the consensus of our own data, information from databases and sequences from Celera, NCBI and Ensembl. Interestingly, even now this region is still not fully sequenced and assembled. The reason for this is the high repeat content and secondary structure, which we observed while sequencing and assembling the mouse BAC-clone.

At the time of the physical and genetic mapping of the region, the results of a mouse knock-out were published 59, supporting the idea that the PDCD1 (PD-1) gene could be the best candidate located in this region. The PD-1 gene encodes an immunoreceptor that carries an immunoreceptor tyrosine-based inhibitory motif (ITIM) 55 and down-regulates the activation of T and B cells 54,56,57.

We sequenced the entire gene in 10 unrelated individuals from Nordic multicase families (5 patients and 5 controls). The annotated sequence of the full PD-1 gene was deposited in GenBank with accession number AF363458. Several SNPs were found (Table 5).

SNP          Position in the gene                   Comments
             (bp from the translation start)
PD-1.1 A/G   -531   promoter                        Frequency <1% in Europeans
PD-1.2 A/G   6438   intron 2                        Frequency <1% in Europeans
PD-1.3 A/G   7146   intron 4                        Analyzed here
PD-1.4 A/G   7499   intron 4                        In full linkage disequilibrium with SNP PD-1.5
PD-1.9 A/G   7625   exon 5, Ala-Val                 Frequency <1% in Europeans
PD-1.5 C/T   7785   exon 5, Ala-Ala                 Analyzed here
PD-1.6 A/G   8738   3’ UTR                          Analyzed here

Table 5. Position and features of SNPs found in the PD-1 gene

The same SNPs, as well as a few more with frequencies below 1%, were reported later 203.

Haplotype analysis

Only three of the seven SNPs in the PD-1 gene were “key” SNPs, defining five haplotypes, which we studied in all of the families. It has previously been discussed that errors associated with computational reconstruction of haplotypes from marker alleles can have a more dramatic effect on association studies in the case of rare alleles and unique haplotypes than in the case of common haplotypes 204. Therefore, haplotypes were constructed empirically, based on family information, and only in the cases where it was possible to resolve them unambiguously. We observed that allele A of the SNP PD-1.3 was always present in the same haplotype, which was more often present in chromosomes transmitted to patients from their parents than in non-transmitted chromosomes. Linkage analysis performed with haplotypes of the PD-1 gene as alleles of a polymorphic marker demonstrated that the association of this haplotype could explain all of the linkage to the SLEB2 region: 10 families with this haplotype had a two-point LOD score of Z=3.29, while 20 other families had a LOD score of Z=-0.12.

Association studies

Using the material we had, consisting of multicase families, singlecase families and sporadic patients, we performed the TDT (transmission-disequilibrium test) and the PDT (pedigree-based transmission-disequilibrium test), a variant of the TDT applicable to multicase families. Both TDT and PDT count the events of transmission versus non-transmission of marker alleles from heterozygous parents to their affected children. In the case of rare alleles, such as allele A of the SNP PD-1.3 with an average population frequency of about 9%, only a few parents may be informative (heterozygous) and the results may be unreliable. In the case of more common alleles, the results of these methods are more robust and reliable. We therefore also used an alternative method, AFBAC (affected family-based controls), which also analyzes the transmission of alleles from parents to probands but uses information from all of the parents, not only the heterozygous ones. This method is similar to HRR (haplotype-relative risk), included in the Analyze package, but in AFBAC the results are conveniently presented as frequencies of alleles in the transmitted (“disease”) population compared to the non-transmitted (“control”) population of chromosomes. This makes it possible to compare the AFBAC results with other tests, for example case-control studies. This was done in the present investigation, where the population frequencies of alleles in healthy individuals were not available. We used the frequencies of alleles in non-transmitted chromosomes as the corresponding population controls and compared them to the frequencies of alleles in transmitted chromosomes and in sporadic patients. The results of the TDT, PDT and AFBAC tests in our study were compatible with each other, and a consistent association in all of the sets was observed only for allele A of the SNP PD-1.3.
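The counting principle behind the TDT can be illustrated with the standard McNemar-type statistic, (b - c)^2 / (b + c), where b and c are the numbers of heterozygous parents who did and did not transmit the allele. The sketch below is not the software used in this study; the function name and the counts are hypothetical and serve only to show why few informative parents give an unstable result.

```python
import math


def tdt(transmitted, not_transmitted):
    """McNemar-type TDT statistic for one allele, counted over heterozygous parents.
    Returns (chi-square with 1 degree of freedom, p-value)."""
    b, c = transmitted, not_transmitted
    if b + c == 0:
        raise ValueError("no informative (heterozygous) parents")
    chi2 = (b - c) ** 2 / (b + c)
    # Survival function of the chi-square distribution with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts: 30 heterozygous parents transmitted the allele, 15 did not
chi2, p_value = tdt(30, 15)
print(chi2, p_value)

# With a rare allele only a handful of parents are informative,
# so the same 2:1 transmission ratio carries much less evidence.
print(tdt(4, 2))
```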

Population studies showed very similar results for the frequency of the PD-1.3A allele in patients and controls from Northern Europe and in European-Americans. As expected, family-based studies (AFBAC, HRR, TDT, PDT) are more sensitive in detecting weak associations because the whole pool of chromosomes from parents (a generation older than the patients and belonging to the same ethnic group) is clearly divided into two groups: transmitted (disease chromosomes) and non-transmitted (control chromosomes). Case-control studies are less sensitive, as the frequency of the “disease allele” is always higher in a population of healthy individuals than in the non-transmitted chromosomes of parents of probands.

Population studies also demonstrated that allele A is a “European factor”, as it was not present in Native Americans or Asians and was rarely present in Blacks. The Spanish introduced this allele into the Mexican population. Interestingly, the allele also had a very low overall frequency in the Finnish population 205. Based on these results, we concluded that PD-1.3 was a risk factor only in Europeans and, to a lesser extent, in Mexicans, but not in African-Americans (Table 6).


Table 6. Association studies in the PD-1 gene

Functional studies

The SNP PD-1.3 is located in an enhancer-like structure with four imperfect repeats, containing binding sites for RUNX1, NFkB and E-box binding proteins, which are known to be expressed in hematopoietic cells. Analysis of the potential transcription factor binding sites demonstrated that the normal allele (G) of the SNP PD-1.3 could bind the RUNX1 (AML-1) transcription factor 206, while allele A abolished this binding. EMSA performed with nuclear extract from Jurkat cells (CD4+ T cells) confirmed this prediction. Moreover, the presence of antibodies against the RUNX1 protein, but not against an unrelated protein, caused a supershift, confirming the involvement of the RUNX1 protein in binding at the SNP PD-1.3 A/G site. The next questions to answer are therefore: which of the three RUNX factors theoretically binding this site are involved, how exactly do they regulate the expression of the PD-1 gene, and how does this lead to autoimmunity?

Is it expected and obligatory that the same mutation should be equally important for all disease cases in all populations? SLE, like other autoimmune diseases, is a complex syndrome, where combinations of different genetic and environmental factors can give the same clinical picture of disease. In practice, the diagnosis of SLE is accepted when any four of eleven diagnostic criteria are met 3. It would be very interesting to finally have a list of all possible genetic and environmental risk factors, where a combination of any, let’s say, four of them can give the same disease. I am sure that there will be a number of “vicious” combinations and that they will differ between populations and sets of patients.

Possibly, some of these risk factors for autoimmunity can be beneficial and protective against other diseases, such as cancer or myocardial infarction, and are therefore preserved in populations at sufficiently high frequencies.

Paper II. Analysis of the SLE-associated PDCD1 (PD-1) polymorphism PD1.3 and lupus nephritis.

We found that allele A of the SNP PD-1.3 located within the PD-1 gene (SLEB2 locus) is associated with SLE in Europeans 207 (Paper I).

Aim: To test for association of the mutation PD-1.3A in a subgroup of patients with lupus nephritis in two independent sets of patients: Swedish and European-American.

Sporadic SLE   All patients         Nephritis        Non-nephritis    χ2; p; OR (95% CI)*
Sweden         54/510 (0.11) **     27/154 (0.18)    27/356 (0.08)    10.2; 0.002; 2.6 (1.4-4.8)
US             48/448 (0.11)        9/82 (0.11)      39/366 (0.11)    not significant
Total          102/958 (0.11)       36/236 (0.15)    66/722 (0.09)    6.4; 0.01; 1.8 (1.1-2.8)

Table 7. Frequencies of PD1.3 A in female sporadic Swedish and US patients with SLE with and without lupus nephritis. * - Comparisons between nephritis and non-nephritis SLE groups; ** - n / chromosomes (frequency of allele A)
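The odds ratios and χ2 values in Table 7 can be recomputed from the allele counts with a standard 2x2 calculation. The sketch below is a hedged illustration, not the original analysis: the function names are invented here, and the use of a continuity (Yates) correction is an assumption. The corrected statistic happens to round to the tabulated values, which suggests, but does not confirm, that such a correction was applied.

```python
def two_by_two(a_nephritis, chrom_nephritis, a_non, chrom_non):
    """Build (a, b, c, d): allele A / other allele counts in nephritis
    and non-nephritis chromosomes, from 'n / chromosomes' entries."""
    return (a_nephritis, chrom_nephritis - a_nephritis,
            a_non, chrom_non - a_non)


def odds_ratio(a, b, c, d):
    """Cross-product odds ratio of a 2x2 table."""
    return (a * d) / (b * c)


def chi2_yates(a, b, c, d):
    """Chi-square statistic of a 2x2 table with continuity correction (assumed)."""
    n = a + b + c + d
    diff = max(0.0, abs(a * d - b * c) - n / 2.0)
    return n * diff ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Counts copied from Table 7
sweden = two_by_two(27, 154, 27, 356)
us = two_by_two(9, 82, 39, 366)
total = two_by_two(36, 236, 66, 722)

for name, table in (("Sweden", sweden), ("US", us), ("Total", total)):
    print(name, round(chi2_yates(*table), 1), round(odds_ratio(*table), 1))
```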

Results and discussion

An independent replication set

Our initial observation that Swedish SLE patients with nephritis often have allele A of the SNP PD-1.3 required confirmation in another independent set of patients. The replication set should be large enough, as only about 25-30% of patients have nephritis. In addition, the patients should be ethnically similar (Caucasians, at least) and the diagnosis must be made in a similar way. We therefore replicated the initial study of 255 Swedish patients in another set, consisting of 224 patients from the US (Table 7).


Only females were studied

The reason for this is that the frequency of nephritis is significantly higher in male patients than in females. We observed that the frequency of nephritis was 19% in European-American females, 30% in Swedish females, 50% in European-American males and 60% in Swedish males. SLE is also 10 times less common in males than in females, and in total we had only 47 males with SLE for whom information about nephritis status was available. We therefore decided not to mix together such clinically different sets of SLE patients (males and females).

Lupus nephritis

The diagnosis “lupus nephritis” was confirmed in most cases by biopsy. In the US set, only 19% of patients had nephritis compared with 30% of the Swedish patients (χ2 = 8.1, P = 0.005). The frequency of allele A of the PD-1.3 was higher in Swedish patients with lupus nephritis than in patients without it (18% versus 8%, p=0.002), while it was similar (11%) in European-American patients with or without nephritis. In the joint group of patients from Sweden and the US, the frequency of allele A in patients with nephritis compared to non-nephritis patients was 15% versus 9%, respectively (p=0.01, OR 1.8, 95% CI 1.1-2.8) (Table 7).

According to observations by Swedish clinicians, patients homozygous for allele A have particularly severe disease associated with nephritis of the highest grade and poor response to therapy (one of the few A/A patients died recently at the age of 47).

Population heterogeneity

Although we tried to make the sets of patients as similar as possible, this was difficult. We had information about family history (2-3 generations) for all of the Swedish patients included in this study. Importantly, we excluded all patients of non-Swedish origin at the beginning of the studies, prior to genotyping. In the European-American set of patients, such information was not available and patients were only classified as Caucasian (White), Black, Hispanic or Asian. Such a classification can still hide a high degree of heterogeneity. As it is clinically relevant, carefully designed genetic, functional and clinical studies of the correlation between allele A and nephritis should be continued.

Update for the association of allele A of the SNP PD-1.3 with SLE

Combined data from all of our data sets of genotyped SLE patients (Swedish and European-American, 1312 chromosomes) and controls of European descent (6808 chromosomes) showed an overall association with χ2 = 28.29 (OR 1.7, 95% CI 1.4-2.0, P < 0.0000001).


Paper III. Association of the PD1.3 A allele of the PDCD1 gene in patients with rheumatoid arthritis negative for rheumatoid factor (RF) and the shared epitope (SE).

We found that allele A of the SNP PD-1.3 located within the PD-1 gene (SLEB2 locus) is associated with SLE in Europeans 207 (Paper I). RA is an autoimmune disease often found in relatives of SLE patients. Rheumatoid factor (RF) 208 and the shared epitope (SE) 73 are well-known factors increasing the risk for RA.

Aim: To test for association of the mutation PD-1.3A in Swedish patients with rheumatoid arthritis and controls.

Material: 1175 patients with rheumatoid arthritis and 3404 controls. 876 patients (629 females and 247 males), for whom information about rheumatoid factor (RF) and the shared epitope (SE) was available, were divided into 4 groups (RF-/SE-; RF-/SE+; RF+/SE- and RF+/SE+).

Results and discussion

Controls

Large sets of patients and controls are required to be able to detect associations for risk factors with weak effects. It was estimated that a sample size of 1324 patients, compared to a similar group of controls, is required to detect a risk factor with a population frequency of 20%, RR=1.7 and p=0.05 78. Even larger sets are required to detect an effect of rarer risk factors (alleles).

Our control set was a combination of several groups: healthy population controls and patients with myocardial infarction. The latter was a large group of individuals from the same population (Sweden) and, in general, of the same age as or older than the RA patients. Frequencies of allele A did not differ between patients with infarctions and controls, so we decided to use this group of “non-RA patients” as an additional control group. The average frequency of allele A in all control groups was 7.3% in 3404 individuals.

RA Patients

We detected a tendency for association of allele A with RA when 3404 controls (frequency of allele A of 7.3%) were compared to 1175 RA patients (frequency of allele A of 8.5%; p=0.053, OR=1.18, 95% C.I. [0.99-1.41]).

Rheumatoid factor (RF) and the shared epitope (SE)

Rheumatoid factor (RF) and alleles of the shared epitope (SE) are well-known risk factors for development of rheumatoid arthritis (RA). RF was present in 65% and the SE was present in 72% of our patients. Nevertheless, 14% of patients were negative for both of these factors. We tested whether allele A of the SNP PD-1.3, which was already associated with SLE and lupus nephritis, could play a role in RA.


Within the 4 groups of patients stratified according to their RF and SE status, there was a similar proportion of females and males (71:29%) and a similar age of onset (on average, 51 years of age). Only in the group of patients negative for both RF and SE did the frequency of allele A (12.1%) differ from that in the controls (7.3%; p=0.0054, OR=1.75, 95% C.I. [1.15-2.65], Bonferroni corrected p=0.015) and in all other groups of patients (7.7%, p=0.023). It remains to be seen whether the “double-negative” RA patients differ from other RA patients in clinical manifestations, response to treatment, etc.
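The reported odds ratios can be approximately recovered from the rounded allele frequencies alone, since an allelic odds ratio is the ratio of the odds p/(1-p) in patients and controls. The following sketch assumes only the percentages quoted in the text (7.3%, 8.5% and 12.1%); the function name is illustrative, and small differences from an analysis of the raw counts are to be expected because of rounding.

```python
def odds_ratio_from_frequencies(freq_cases, freq_controls):
    """Allelic odds ratio computed from allele frequencies in cases and controls."""
    odds_cases = freq_cases / (1.0 - freq_cases)
    odds_controls = freq_controls / (1.0 - freq_controls)
    return odds_cases / odds_controls


# All RA patients (8.5%) versus controls (7.3%)
or_all_ra = odds_ratio_from_frequencies(0.085, 0.073)
# RF-/SE- patients (12.1%) versus controls (7.3%)
or_double_negative = odds_ratio_from_frequencies(0.121, 0.073)

print(round(or_all_ra, 2), round(or_double_negative, 2))
```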

Frequency of allele A in RF-/SE- RA patients and SLE patients

The frequency of allele A of PD-1.3 in patients negative for RF and SE (12.1%) is similar to that in European patients with SLE (11.6%).

Paper IV. Aberrant expression of the PD-1 and RUNX genes in activated CD4+ T cells in patients with systemic lupus erythematosus.

We found that allele A of the SNP PD-1.3 located within the PD-1 gene (SLEB2 locus) is associated with SLE in Europeans 207 (Paper I), with lupus nephritis 209 (Paper II) and with rheumatoid arthritis in a subgroup of patients negative for rheumatoid factor and the shared epitope (Paper III).

Aim: To perform functional studies on the PD-1.3A mutation and to study expression of the PD-1 and RUNX genes in normal human tissues, cell cultures and blood fractions from patients with SLE and controls.

Results and discussion

Expression of the PD-1 gene

In contrast to expression of the mouse PD-1 gene, which was found in activated T and B cells as well as monocytes, expression of the human PD-1 gene was restricted to CD4+ T cells and up-regulated upon cell activation (PMA/ionomycin treatment or anti-CD3 antibodies).

Expression of the PD-1 gene was increased in activated T cells and, particularly, in CD4+ cells of SLE patients. The increase of gene expression was observed in all patients, irrespective of their genotype for SNP PD-1.3 (Figure 3).


Figure 3. Expression of the PD-1 gene in blood fractions of patients and controls. Mean values are presented as lines.

Expression of the RUNX genes

Can an increase in PD-1 gene expression be associated with regulation by RUNX factors? We found that all three human RUNX genes are expressed in T cells, both CD4+ and CD8+, and can bind the regulatory region within the PD-1 gene. Expression of RUNX3 and RUNX2 genes was also increased in the activated T cells of patients compared to controls, while expression of the RUNX1 gene was not changed.

Functional effect of the PD-1.3 A allele

The functional effect of allele A was evaluated in a reporter assay: a construct with allele A was constitutively expressed and did not increase upon activation, while the construct with the G allele was expressed at low levels in resting conditions in Jurkat cells (CD4+) but increased 4-5 fold upon cellular activation (Figure 4).


Figure 4. Reporter assay in Jurkat (CD4+) cells and expression studies of T cells from patients and controls: expression of the PD-1 gene differed depending on the genotype of the SNP PD-1.3.

A similar effect was observed when the PD-1 gene expression in T cells of A/A patients was compared with G/G patients and controls. The PD-1 gene expression was already increased in non-activated T cells (as if the cells were activated) and did not increase much upon cell activation (Figure 4). The exact functional consequences of PD-1 overexpression are not clear and depend on the function of the PD-1 gene, which is still being discussed 210.

The loss, inactivation or decreased expression of RUNX genes has already been associated with different cancers. Based on our data, we propose that over-expression of the RUNX genes (at least RUNX2 and RUNX3) is associated with autoimmunity. The exact role of each of the RUNX genes, their regulation, interacting partners and regulation of their downstream targets must be studied further. The exact clinical picture of the disease may depend on the target gene whose regulation is disturbed by regulatory mutations, as in the case of SLE 207, RA 61,211 and psoriasis 212.

Paper V. Analysis of the TNFR2 polymorphism M196R in Swedish and Mexican patients with systemic lupus erythematosus and rheumatoid arthritis

The tumor necrosis factor receptor 2 (TNFR2) is one of the two cell surface receptors for TNF alpha, a pleiotropic cytokine. The TNFR2 gene is located within the locus 1p36, where linkage for both SLE and RA was detected. Allele R of the polymorphism M196R (methionine-arginine substitution), located in exon 6 of the TNFR2 gene, was found to be associated with SLE in one set of Japanese patients and controls. In all following studies (another Japanese set, and sets from Korea, the UK and Spain), the association was not replicated. For RA, association was reported for French and UK patients with a family history of the disease.

Aim: To test for association of the TNFR2 polymorphism M196R in patients with SLE and RA from Sweden and Mexico.

Material: 508 sporadic SLE patients, 107 single-case families with SLE (both parents and a patient), 292 RA patients and 487 controls from the corresponding populations, Sweden and Mexico.

Methods

Genotyping: PCR/RFLP (NlaIII), case-control association studies and AFBAC (affected family-based controls)

Results and discussion

Population frequencies in European controls, SLE and RA patients

Frequencies of allele R of the M196R polymorphism were similar in all European sets (Sweden, UK and Spain), both in controls (average 26.2%, varying between 23.4 and 28.3%) and in SLE patients (average 24.2%, varying between 22.5 and 30%). No association with SLE or RA was observed in patient-control sets or in single-case families.

Population frequencies in Asian controls, SLE and RA patients

Frequencies of allele R of the M196R polymorphism were similar in all Asian sets (Japan, Korea and Mexico), both in controls (average 12.3%, varying between 11.1 and 17.5%) and in SLE patients (average 15.4%, varying between 11.9 and 19%). No association with SLE or RA was observed in patient-control sets or in single-case families.

The only difference – between populations

Meta-analysis of all published data and data from the current study demonstrated that frequencies of allele R differed between European and Asian controls (χ2 84.2, P 2×10^-16) and between European and Asian SLE patients (χ2 27.9, P 2×10^-7).

These results can mean the following: 1) that the TNFR2 gene is not involved in the pathogenesis of SLE and RA, and positive associations were detected as a result of population stratification; or 2) that the polymorphism M196R is not associated with SLE and RA, but another SNP (or SNPs) within the same gene can be involved; or 3) that the SNP M196R is a causative mutation in only a very small proportion of patients, making it impossible to detect its effect by association studies; or 4) that the SNP M196R is associated with SLE and/or RA only in families with a family history of the disease.

Concluding remarks

Autoimmune diseases represent complex phenotypes with overlapping features. This means that they may have overlapping causes as well.

This study shows that allele A of the SNP PD-1.3 within the PDCD1 (PD1) gene, originally found to be involved in the pathogenesis of SLE, is also of importance for lupus nephritis and for a subgroup of rheumatoid arthritis patients negative for rheumatoid factor and the shared epitope. A recent study demonstrated that the same allele A is also associated with type I diabetes in Danish patients 203.

The functional effect of this mutation was associated with the loss of regulation of PD-1 gene expression by any of the three human RUNX transcription factors. Expression of the PD-1, RUNX2 and RUNX3 genes, but not RUNX1, differed between patients and controls in activated T cells. Supporting the importance of gene regulation by RUNX transcription factors 213, similar disruption of RUNX1 binding sites by SNPs was found in two other genes associated with rheumatoid arthritis 211 and psoriasis 212.

SNP PD-1.5, located within the PD-1 gene at a distance of 640 bp from the SNP PD-1.3, was recently associated with RA in Chinese patients 213. Future studies should aim to show the functional role of the PD-1 gene and each of the RUNX factors in different autoimmune diseases 214.


Acknowledgements

My warm thanks to:

- Marta Alarcon-Riquelme for believing in me, for such a great opportunity to work in this exciting project and for giving a hand and a word, when I needed.

- Ulf Gyllensten for always-valuable discussions.

- All former and current members of Lupus group: Anna-Karin I, Casimiro, Veronica, Renata, Ligun, Cecilia, Bo, Anna-Karin II, Sergey, Eduardo, Prasad, our former technicians – Paula, Susanne and Irja and all our former and current students.

- All other PCR-group people and particularly Per-Ivan for help with “all that Unix stuff” and Inger, Jenny and Ann-Sofi for help with “always-urgent-sequencing”.

- Our close neighbors and roommates in the lab – Ludmila, Lucia, Charlotte, HungChing, Fredrik, Magnus, Niklas, Annette.

- All other former and current groups-neighbors in MedGen, administrators – Mia, Ulla, Elisabeth and Tommy for making it nice and comfortable place to work.

- All collaborators and coauthors, patients and families from Stockholm, Lund, Norway, Iceland, Mexico and the US involved in these projects. Special thanks to people “from upstairs” - Fredrik and Marina.

- My early group in Stockholms University and particularly to Christina Thylen and Elisabeth Haggård-Ljungquist.

- All the friends not named personally here.

- My family – my father and my late mother, the first generation of geneticists in my family and my brothers with their families for always supporting me with whatever I was doing. Mike, for our time and our wonderful kids.

- My new family – Göran, Daniel, Erik, Kerstin, Gun and Carl-Einar.

- And absolutely the best girls in the world – my amazing Ksenia and Olga for making my life wonderful!


References

1. Consortium, C. F. G. A. Population variation of common cystic fibrosis mutations. The Cystic Fibrosis Genetic Analysis Consortium. Hum Mutat 4, 167-77 (1994).

2. Arnett, F. C. & Reveille, J. D. Genetics of systemic lupus erythematosus. Rheum Dis Clin North Am 18, 865-92 (1992).

3. Tan, E. M. et al. The 1982 revised criteria for the classification of systemic lupus erythematosus. Arthritis Rheum 25, 1271-7 (1982).

4. http://www.geocities.com/issymissy/facts.html.

5. McMurray, R. W. & May, W. Sex hormones and systemic lupus erythematosus: review and meta-analysis. Arthritis Rheum 48, 2100-10 (2003).

6. Golbus, J., Palella, T. D. & Richardson, B. C. Quantitative changes in T cell DNA methylation occur during differentiation and ageing. Eur J Immunol 20, 1869-72 (1990).

7. Richardson, B. C. Role of DNA methylation in the regulation of cell function: autoimmunity, aging and cancer. J Nutr 132, 2401S-2405S (2002).

8. Jimenez, S., Cervera, R., Font, J. & Ingelmo, M. The epidemiology of systemic lupus erythematosus. Clin Rev Allergy Immunol 25, 3-12 (2003).

9. Jonsson, H., Nived, O., Sturfelt, G. & Silman, A. Estimating the incidence of systemic lupus erythematosus in a defined population using multiple sources of retrieval. Br J Rheumatol 29, 185-8 (1990).

10. Gudmundsson, S. & Steinsson, K. Systemic lupus erythematosus in Iceland 1975 through 1984. A nationwide epidemiological study in an unselected population. J Rheumatol 17, 1162-7 (1990).

11. Johnson, A. E., Gordon, C., Palmer, R. G. & Bacon, P. A. The prevalence and incidence of systemic lupus erythematosus in Birmingham, England. Relationship to ethnicity and country of birth. Arthritis Rheum 38, 551-8 (1995).

12. Fessel, W. J. Systemic lupus erythematosus in the community. Incidence, prevalence, outcome, and first symptoms; the high prevalence in black women. Arch Intern Med 134, 1027-35 (1974).

13. Brogan, B. L. & Olsen, N. J. Drug-induced rheumatic syndromes. Curr Opin Rheumatol 15, 76-80 (2003).

14. Bigazzi, P. E. Autoimmunity and heavy metals. Lupus 3, 449-53 (1994).

15. Montanaro, A. & Bardana, E. J., Jr. Dietary amino acid-induced systemic lupus erythematosus. Rheum Dis Clin North Am 17, 323-32 (1991).

16. Richardson, B. et al. Evidence for impaired T cell DNA methylation in systemic lupus erythematosus and rheumatoid arthritis. Arthritis Rheum 33, 1665-73 (1990).


17. Quddus, J. et al. Treating activated CD4+ T cells with either of two distinct DNA methyltransferase inhibitors, 5-azacytidine or procainamide, is sufficient to cause a lupus-like disease in syngeneic mice. J Clin Invest 92, 38-53 (1993).

18. Richardson, B. C., Liebling, M. R. & Hudson, J. L. CD4+ cells treated with DNA methylation inhibitors induce autologous B cell differentiation. Clin Immunol Immunopathol 55, 368-81 (1990).

19. Attwood, J. T., Yung, R. L. & Richardson, B. C. DNA methylation and the regulation of gene transcription. Cell Mol Life Sci 59, 241-57 (2002).

20. Lu, Q. et al. Demethylation of ITGAL (CD11a) regulatory sequences in systemic lupus erythematosus. Arthritis Rheum 46, 1282-91 (2002).

21. Corvetta, A., Della Bitta, R., Luchetti, M. M. & Pomponio, G. 5-Methylcytosine content of DNA in blood, synovial mononuclear cells and synovial tissue from patients affected by autoimmune rheumatic diseases. J Chromatogr 566, 481-91 (1991).

22. Magnusson, M., Magnusson, S., Vallin, H., Ronnblom, L. & Alm, G. V. Importance of CpG dinucleotides in activation of natural IFN-alpha-producing cells by a lupus-related oligodeoxynucleotide. Scand J Immunol 54, 543-50 (2001).

23. Ronnblom, L. & Alm, G. V. Systemic lupus erythematosus and the type I interferon system. Arthritis Res Ther 5, 68-75 (2003).

24. Vyse, T. J. & Todd, J. A. Genetic analysis of autoimmune disease. Cell 85, 311-8 (1996).

25. Hochberg, M. C. The application of genetic epidemiology to systemic lupus erythematosus. J Rheumatol 14, 867-9 (1987).

26. Lawrence, J. S., Martins, C. L. & Drake, G. L. A family survey of lupus erythematosus. 1. Heritability. J Rheumatol 14, 913-21 (1987).

27. Deapen, D. et al. A revised estimate of twin concordance in systemic lupus erythematosus. Arthritis Rheum 35, 311-8 (1992).

28. Gaffney, P. M. et al. A genome-wide search for susceptibility genes in human systemic lupus erythematosus sib-pair families. Proc Natl Acad Sci U S A 95, 14875-9 (1998).

29. Moser, K. L. et al. Genome scan of human systemic lupus erythematosus: evidence for linkage on chromosome 1q in African-American pedigrees. Proc Natl Acad Sci U S A 95, 14869-74 (1998).

30. Lindqvist, A. K. et al. A susceptibility locus for human systemic lupus erythematosus (hSLE1) on chromosome 2q. J Autoimmun 14, 169-78 (2000).

31. Namjou, B. et al. Stratification of pedigrees multiplex for systemic lupus erythematosus and for self-reported rheumatoid arthritis detects a systemic lupus erythematosus susceptibility gene (SLER1) at 5p15.3. Arthritis Rheum 46, 2937-45 (2002).

32. Namjou, B. et al. Genome scan stratified by the presence of anti-double-stranded DNA (dsDNA) autoantibody in pedigrees multiplex for systemic lupus erythematosus (SLE) establishes linkages at 19p13.2 (SLED1) and 18q21.1 (SLED2). Genes Immun 3 Suppl 1, S35-41 (2002).

33. Sawalha, A. H. et al. Genetic linkage of systemic lupus erythematosus with chromosome 11q14 (SLEH1) in African-American families stratified by a nucleolar antinuclear antibody pattern. Genes Immun 3 Suppl 1, S31-4 (2002).

34. Kelly, J. A. et al. Evidence for a susceptibility gene (SLEH1) on chromosome 11q14 for systemic lupus erythematosus (SLE) families with hemolytic anemia. Proc Natl Acad Sci U S A 99, 11766-71 (2002).

35. Nath, S. K. et al. Evidence for a susceptibility gene, SLEV1, on chromosome 17p13 in families with vitiligo-related systemic lupus erythematosus. Am J Hum Genet 69, 1401-6 (2001).

36. Rao, S. et al. Linkage analysis of human systemic lupus erythematosus-related traits: a principal component approach. Arthritis Rheum 44, 2807-18 (2001).

37. Johanneson, B., Lima, G., von Salome, J., Alarcon-Segovia, D. & Alarcon-Riquelme, M. E. A major susceptibility locus for systemic lupus erythemathosus maps to chromosome 1q31. Am J Hum Genet 71, 1060-71 (2002).

38. Arnett, F. C., Olsen, M. L., Anderson, K. L. & Reveille, J. D. Molecular analysis of major histocompatibility complex alleles associated with the lupus anticoagulant. J Clin Invest 87, 1490-5 (1991).

39. Arnett, F. C. & Moulds, J. M. HLA class III molecules and autoimmune rheumatic diseases. Clin Exp Rheumatol 9, 289-96 (1991).

40. Llorente, L. et al. In vivo production of interleukin-10 by non-T cells in rheumatoid arthritis, Sjogren's syndrome, and systemic lupus erythematosus. A potential mechanism of B lymphocyte hyperactivity and autoimmunity. Arthritis Rheum 37, 1647-55 (1994).

41. Alarcon-Riquelme, M. E. et al. Genetic analysis of the contribution of IL10 to systemic lupus erythematosus. J Rheumatol 26, 2148-52 (1999).

42. Song, Y. W. et al. Abnormal distribution of Fc gamma receptor type IIa polymorphisms in Korean patients with systemic lupus erythematosus. Arthritis Rheum 41, 421-6 (1998).

43. Johansson, C. et al. Association analysis with microsatellite and SNP markers does not support the involvement of BCL-2 in systemic lupus erythematosus in Mexican and Swedish patients and their families. Genes Immun 1, 380-5 (2000).

44. Mehrian, R. et al. Synergistic effect between IL-10 and bcl-2 genotypes in determining susceptibility to systemic lupus erythematosus. Arthritis Rheum 41, 596-602 (1998).

45. Mohan, C., Alas, E., Morel, L., Yang, P. & Wakeland, E. K. Genetic dissection of SLE pathogenesis. Sle1 on murine chromosome 1 leads to a selective loss of tolerance to H2A/H2B/DNA subnucleosomes. J Clin Invest 101, 1362-72 (1998).

46. Mohan, C., Morel, L., Yang, P. & Wakeland, E. K. Genetic dissection of systemic lupus erythematosus pathogenesis: Sle2 on murine chromosome 4 leads to B cell hyperactivity. J Immunol 159, 454-65 (1997).

47. Mohan, C., Yu, Y., Morel, L., Yang, P. & Wakeland, E. K. Genetic dissection of Sle pathogenesis: Sle3 on murine chromosome 7 impacts T cell activation, differentiation, and cell death. J Immunol 162, 6492-502 (1999).

48. Morel, L., Tian, X. H., Croker, B. P. & Wakeland, E. K. Epistatic modifiers of autoimmunity in a murine model of lupus nephritis. Immunity 11, 131-9 (1999).

49. Wakeland, E. K., Liu, K., Graham, R. R. & Behrens, T. W. Delineating the genetic basis of systemic lupus erythematosus. Immunity 15, 397-408 (2001).

50. Watanabe-Fukunaga, R., Brannan, C. I., Copeland, N. G., Jenkins, N. A. & Nagata, S. Lymphoproliferation disorder in mice explained by defects in Fas antigen that mediates apoptosis. Nature 356, 314-7 (1992).

51. Cohen, P. L. & Eisenberg, R. A. Lpr and gld: single gene models of systemic autoimmunity and lymphoproliferative disease. Annu Rev Immunol 9, 243-69 (1991).

52. Murphy, E. D. & Roths, J. B. A Y chromosome associated factor in strain BXSB producing accelerated autoimmunity and lymphoproliferation. Arthritis Rheum 22, 1188-94 (1979).

53. Hogarth, M. B. et al. Multiple lupus susceptibility loci map to chromosome 1 in BXSB mice. J Immunol 161, 2753-61 (1998).

54. Ishida, Y., Agata, Y., Shibahara, K. & Honjo, T. Induced expression of PD-1, a novel member of the immunoglobulin gene superfamily, upon programmed cell death. Embo J 11, 3887-95 (1992).

55. Vivier, E. & Daeron, M. Immunoreceptor tyrosine-based inhibition motifs. Immunol Today 18, 286-91 (1997).

56. Finger, L. R. et al. The human PD-1 gene: complete cDNA, genomic organization, and developmentally regulated expression in B cell progenitors. Gene 197, 177-87 (1997).

57. Shinohara, T., Taniwaki, M., Ishida, Y., Kawaichi, M. & Honjo, T. Structure and chromosomal localization of the human PD-1 gene (PDCD1). Genomics 23, 704-6 (1994).

58. Nishimura, H., Minato, N., Nakano, T. & Honjo, T. Immunological studies on PD-1 deficient mice: implication of PD-1 as a negative regulator for B cell responses. Int Immunol 10, 1563-72 (1998).

59. Nishimura, H., Nose, M., Hiai, H., Minato, N. & Honjo, T. Development of lupus-like autoimmune diseases by disruption of the PD-1 gene encoding an ITIM motif-carrying immunoreceptor. Immunity 11, 141-51 (1999).

60. Weyand, C. M., Klimiuk, P. A. & Goronzy, J. J. Heterogeneity of rheumatoid arthritis: from phenotypes to genotypes. Springer Semin Immunopathol 20, 5-22 (1998).

61. Prokunina, L., Padyukov, L., MD, Bennet, A., de Faire, U., Wiman, B., Prince, J., Alfredsson, L., Klareskog, L., Alarcon-Riquelme, M. Association of the PD-1.3 A allele of the PDCD1 Gene in Patients with Rheumatoid Arthritis Negative for Rheumatoid Factor and the Shared Epitope. Arthr Rheum in press (2004).

62. Lawrence, R. C. et al. Estimates of the prevalence of arthritis and selected musculoskeletal disorders in the United States. Arthritis Rheum 41, 778-99 (1998).

63. Molokhia, M. & McKeigue, P. Risk for rheumatic disease in relation to ethnicity and admixture. Arthritis Res 2, 115-25 (2000).

64. Stolt, P. et al. Quantification of the influence of cigarette smoking on rheumatoid arthritis: results from a population based case-control study, using incident cases. Ann Rheum Dis 62, 835-41 (2003).

65. Symmons, D. P. et al. Blood transfusion, smoking, and obesity as risk factors for the development of rheumatoid arthritis: results from a primary care-based incident case-control study in Norfolk, England. Arthritis Rheum 40, 1955-61 (1997).

66. Firestein, G. S. Evolving concepts of rheumatoid arthritis. Nature 423, 356-61 (2003).

67. Seldin, M. F., Amos, C. I., Ward, R. & Gregersen, P. K. The genetics revolution and the assault on rheumatoid arthritis. Arthritis Rheum 42, 1071-9 (1999).

68. Block, S. R. Twin studies: genetic factors are important. Arthritis Rheum 36, 135-6 (1993).

69. Stastny, P. Association of the B-cell alloantigen DRw4 with rheumatoid arthritis. N Engl J Med 298, 869-71 (1978).

70. Arnett, F. C. et al. The American Rheumatism Association 1987 revised criteria for the classification of rheumatoid arthritis. Arthritis Rheum 31, 315-24 (1988).

71. Grant, S. F. et al. The inheritance of rheumatoid arthritis in Iceland. Arthritis Rheum 44, 2247-54 (2001).

72. John, S. & Worthington, J. Genetic epidemiology. Approaches to the genetic analysis of rheumatoid arthritis. Arthritis Res 3, 216-20 (2001).

73. Gregersen, P. K., Silver, J. & Winchester, R. J. The shared epitope hypothesis. An approach to understanding the molecular genetics of susceptibility to rheumatoid arthritis. Arthritis Rheum 30, 1205-13 (1987).

74. Cornelis, F. et al. New susceptibility locus for rheumatoid arthritis suggested by a genome-wide linkage study. Proc Natl Acad Sci U S A 95, 10746-50 (1998).

75. Jawaheer, D. et al. Screening the genome for rheumatoid arthritis susceptibility genes: a replication study and combined analysis of 512 multicase families. Arthritis Rheum 48, 906-16 (2003).

76. MacKay, K. et al. Whole-genome linkage analysis of rheumatoid arthritis susceptibility loci in 252 affected sibling pairs in the United Kingdom. Arthritis Rheum 46, 632-9 (2002).

77. Lander, E. & Kruglyak, L. Genetic dissection of complex traits: guidelines for interpreting and reporting linkage results. Nat Genet 11, 241-7 (1995).

78. Gregersen, P. K. Genetics of rheumatoid arthritis: confronting complexity. Arthritis Res 1, 37-44 (1999).

79. Shiozawa, S. et al. Identification of the gene loci that predispose to rheumatoid arthritis. Int Immunol 10, 1891-5 (1998).

80. Maini, R. N. et al. Anti-tumour necrosis factor specific antibody (infliximab) treatment provides insights into the pathophysiology of rheumatoid arthritis. Ann Rheum Dis 58 Suppl 1, I56-60 (1999).

81. Maini, R. et al. Infliximab (chimeric anti-tumour necrosis factor alpha monoclonal antibody) versus placebo in rheumatoid arthritis patients receiving concomitant methotrexate: a randomised phase III trial. ATTRACT Study Group. Lancet 354, 1932-9 (1999).

82. Barton, A. & Ollier, W. Genetic approaches to the investigation of rheumatoid arthritis. Curr Opin Rheumatol 14, 260-9 (2002).

83. Jirholt, J., Lindqvist, A. B. & Holmdahl, R. The genetics of rheumatoid arthritis and the need for animal models to find and understand the underlying genes. Arthritis Res 3, 87-97 (2001).

84. Bradley, D. S. et al. HLA-DQB1 polymorphism determines incidence, onset, and severity of collagen-induced arthritis in transgenic mice. Implications in human rheumatoid arthritis. J Clin Invest 100, 2227-34 (1997).

85. Barton, A. et al. High resolution linkage and association mapping identifies a novel rheumatoid arthritis susceptibility locus homologous to one linked to two rat models of inflammatory arthritis. Hum Mol Genet 10, 1901-6 (2001).

86. Sakaguchi, N. et al. Altered thymic T-cell selection due to a mutation of the ZAP-70 gene causes autoimmune arthritis in mice. Nature 426, 454-60 (2003).

87. Olofsson, P. et al. Positional identification of Ncf1 as a gene that regulates arthritis severity in rats. Nat Genet 33, 25-32 (2003).

88. Risch, N. Linkage strategies for genetically complex traits. I. Multilocus models. Am J Hum Genet 46, 222-8 (1990).

89. Jeffreys, A. J., Wilson, V. & Thein, S. L. Hypervariable 'minisatellite' regions in human DNA. Nature 314, 67-73 (1985).

90. Venter, J. C. et al. The sequence of the human genome. Science 291, 1304-51 (2001).

91. http://www.ncbi.nlm.nih.gov/SNP/snp_summary.cgi.

92. http://hgvbase.cgb.ki.se/cgi-bin/main.pl?page=snp_info.htm.

93. http://snp.cshl.org/.

94. http://www.ensembl.org/genome/central/.

95. Chamberlain, S. et al. Genetic recombination events which position the Friedreich ataxia locus proximal to the D9S15/D9S5 linkage group on chromosome 9q. Am J Hum Genet 52, 99-109 (1993).

96. Hoehe, M. R. Haplotypes and the systematic analysis of genetic variation in genes and genomes. Pharmacogenomics 4, 547-70 (2003).

97. Cardon, L. R. & Abecasis, G. R. Using haplotype blocks to map human complex trait loci. Trends Genet 19, 135-40 (2003).

98. Clark, A. G. Finding genes underlying risk of complex disease by linkage disequilibrium mapping. Curr Opin Genet Dev 13, 296-302 (2003).

99. Abecasis, G. R., Cherny, S. S., Cookson, W. O. & Cardon, L. R. Merlin--rapid analysis of dense genetic maps using sparse gene flow trees. Nat Genet 30, 97-101 (2002).

100. http://www.stat.washington.edu/stephens/phase.html.

101. Kruglyak, L., Daly, M. J., Reeve-Daly, M. P. & Lander, E. S. Parametric and nonparametric linkage analysis: a unified multipoint approach. Am J Hum Genet 58, 1347-63 (1996).

102. http://linkage.rockefeller.edu/soft/.

103. Spielman, R. S., McGinnis, R. E. & Ewens, W. J. Transmission test for linkage disequilibrium: the insulin gene region and insulin-dependent diabetes mellitus (IDDM). Am J Hum Genet 52, 506-16 (1993).

104. Martin, E. R., Monks, S. A., Warren, L. L. & Kaplan, N. L. A test for linkage and association in general pedigrees: the pedigree disequilibrium test. Am J Hum Genet 67, 146-54 (2000).

105. Ott, J. Statistical properties of the haplotype relative risk. Genet Epidemiol 6, 127-30 (1989).

106. Thomson, G. Mapping disease genes: family-based association studies. Am J Hum Genet 57, 487-98 (1995).

107. Lathrop, G. M. Estimating genotype relative risks. Tissue Antigens 22, 160-6 (1983).

108. Devlin, B. & Risch, N. A comparison of linkage disequilibrium measures for fine-scale mapping. Genomics 29, 311-22 (1995).

109. Devlin, B., Risch, N. & Roeder, K. Disequilibrium mapping: composite likelihood for pairwise disequilibrium. Genomics 36, 1-16 (1996).

110. Abecasis, G. R. & Cookson, W. O. GOLD--graphical overview of linkage disequilibrium. Bioinformatics 16, 182-3 (2000).

111. Daly, M. J., Rioux, J. D., Schaffner, S. F., Hudson, T. J. & Lander, E. S. High-resolution haplotype structure in the human genome. Nat Genet 29, 229-32 (2001).

112. Rioux, J. D. et al. Genetic variation in the 5q31 cytokine gene cluster confers susceptibility to Crohn disease. Nat Genet 29, 223-8 (2001).

113. Hall, J. M. et al. Linkage of early-onset familial breast cancer to chromosome 17q21. Science 250, 1684-9 (1990).

114. Sherrington, R. et al. Cloning of a gene bearing missense mutations in early-onset familial Alzheimer's disease. Nature 375, 754-60 (1995).

115. Prokunina, L., Alarcon-Riquelme, M.E. Regulatory SNPs in complex diseases and their functional validation. Expert Reviews in Molecular Medicine (2004).

116. Knight, J. C. et al. A polymorphism that affects OCT-1 binding to the TNF promoter region is associated with severe malaria. Nat Genet 22, 145-50 (1999).

117. Miller, G. M. & Madras, B. K. Polymorphisms in the 3'-untranslated region of human and monkey dopamine transporter genes affect reporter gene expression. Mol Psychiatry 7, 44-55 (2002).

118. Di Paola, R. et al. A variation in 3' UTR of hPTP1B increases specific gene expression and associates with insulin resistance. Am J Hum Genet 70, 806-12 (2002).

119. Bevilacqua, A., Ceriani, M. C., Capaccioli, S. & Nicolin, A. Post-transcriptional regulation of gene expression by degradation of messenger RNAs. J Cell Physiol 195, 356-72 (2003).

120. Duan, J. et al. Synonymous mutations in the human dopamine receptor D2 (DRD2) affect mRNA stability and synthesis of the receptor. Hum Mol Genet 12, 205-16 (2003).

121. Faustino, N. A. & Cooper, T. A. Pre-mRNA splicing and human disease. Genes Dev 17, 419-37 (2003).

122. Krawczak, M., Reiss, J. & Cooper, D. N. The mutational spectrum of single base-pair substitutions in mRNA splice junctions of human genes: causes and consequences. Hum Genet 90, 41-54 (1992).

123. Lorson, C. L., Hahnen, E., Androphy, E. J. & Wirth, B. A single nucleotide in the SMN gene regulates splicing and is responsible for spinal muscular atrophy. Proc Natl Acad Sci U S A 96, 6307-11 (1999).

124. Shiga, N. et al. Disruption of the splicing enhancer sequence within exon 27 of the dystrophin gene by a nonsense mutation induces partial skipping of the exon and is responsible for Becker muscular dystrophy. J Clin Invest 100, 2204-10 (1997).

125. Liu, H. X., Cartegni, L., Zhang, M. Q. & Krainer, A. R. A mechanism for exon skipping caused by nonsense or missense mutations in BRCA1 and other genes. Nat Genet 27, 55-8 (2001).

126. Pagani, F., Buratti, E., Stuani, C. & Baralle, F. E. Missense, nonsense, and neutral mutations define juxtaposed regulatory elements of splicing in cystic fibrosis transmembrane regulator exon 9. J Biol Chem 278, 26580-8 (2003).

127. Pagani, F. et al. New type of disease causing mutations: the example of the composite exonic regulatory elements of splicing in CFTR exon 12. Hum Mol Genet 12, 1111-20 (2003).

128. Lynch, K. W. & Weiss, A. A CD45 polymorphism associated with multiple sclerosis disrupts an exonic splicing silencer. J Biol Chem 276, 24341-7 (2001).

129. Pyne, M. T. et al. The BRCA2 genetic variant IVS7 + 2T-->G is a mutation. J Hum Genet 45, 351-7 (2000).

130. Ueda, H. et al. Association of the T-cell regulatory gene CTLA4 with susceptibility to autoimmune disease. Nature 423, 506-11 (2003).

131. Kozlowski, P. & Krzyzosiak, W. J. Combined SSCP/duplex analysis by capillary electrophoresis for more efficient mutation detection. Nucleic Acids Res 29, E71 (2001).

132. Kaczanowski, R., Trzeciak, L. & Kucharczyk, K. Multitemperature single-strand conformation polymorphism. Electrophoresis 22, 3539-45 (2001).

133. Ribas, G., Neville, M. J. & Campbell, R. D. Single-nucleotide polymorphism detection by denaturing high-performance liquid chromatography and direct sequencing in genes in the MHC class III region encoding novel cell surface molecules. Immunogenetics 53, 369-81 (2001).

134. Xiao, W., Stern, D., Jain, M., Huber, C. G. & Oefner, P. J. Multiplex capillary denaturing high-performance liquid chromatography with laser-induced fluorescence detection. Biotechniques 30, 1332-8 (2001).

135. Hardison, R. C., Oeltjen, J. & Miller, W. Long human-mouse sequence alignments reveal novel regulatory elements: a reason to sequence the mouse genome. Genome Res 7, 959-66 (1997).

136. Mayor, C. et al. VISTA : visualizing global DNA sequence alignments of arbitrary length. Bioinformatics 16, 1046-7 (2000).

137. http://www.hgmp.mrc.ac.uk/Bioinformatics/.

138. Dermitzakis, E. T. & Clark, A. G. Evolution of transcription factor binding sites in Mammalian gene regulatory regions: conservation and turnover. Mol Biol Evol 19, 1114-21 (2002).

139. Botstein, D., White, R. L., Skolnick, M. & Davis, R. W. Construction of a genetic linkage map in man using restriction fragment length polymorphisms. Am J Hum Genet 32, 314-31 (1980).

140. Stoerker, J. et al. Rapid genotyping by MALDI-monitored nuclease selection from probe libraries. Nat Biotechnol 18, 1213-6 (2000).

141. Ye, S., Liang, X., Yamamoto, Y. & Komiyama, M. Detection of single nucleotide polymorphisms by the combination of nuclease S1 and PNA. Nucleic Acids Res Suppl, 235-6 (2002).

142. Bray, M. S., Boerwinkle, E. & Doris, P. A. High-throughput multiplex SNP genotyping with MALDI-TOF mass spectrometry: practice, problems and promise. Hum Mutat 17, 296-304 (2001).

143. Prince, J. A. et al. Robust and accurate single nucleotide polymorphism genotyping by dynamic allele-specific hybridization (DASH): design criteria and assay validation. Genome Res 11, 152-62 (2001).

144. Jobs, M., Howell, W. M., Stromqvist, L., Mayr, T. & Brookes, A. J. DASH-2: flexible, low-cost, and high-throughput SNP genotyping by dynamic allele-specific hybridization on membrane arrays. Genome Res 13, 916-24 (2003).

145. Fakhrai-Rad, H., Pourmand, N. & Ronaghi, M. Pyrosequencing: an accurate detection platform for single nucleotide polymorphisms. Hum Mutat 19, 479-85 (2002).

146. Makridakis, N. M. & Reichardt, J. K. Multiplex automated primer extension analysis: simultaneous genotyping of several polymorphisms. Biotechniques 31, 1374-80 (2001).

147. Shapero, M. H., Leuther, K. K., Nguyen, A., Scott, M. & Jones, K. W. SNP genotyping by multiplexed solid-phase amplification and fluorescent minisequencing. Genome Res 11, 1926-34 (2001).

148. Mein, C. A. et al. Evaluation of single nucleotide polymorphism typing with invader on PCR amplicons and its automation. Genome Res 10, 330-43 (2000).

149. Hsu, T. M., Law, S. M., Duan, S., Neri, B. P. & Kwok, P. Y. Genotyping single-nucleotide polymorphisms by the invader assay with dual-color fluorescence polarization detection. Clin Chem 47, 1373-7 (2001).

150. Qi, X., Bakht, S., Devos, K. M., Gale, M. D. & Osbourn, A. L-RCA (ligation-rolling circle amplification): a general method for genotyping of single nucleotide polymorphisms (SNPs). Nucleic Acids Res 29, E116 (2001).

151. Faruqi, A. F. et al. High-throughput genotyping of single nucleotide polymorphisms with rolling circle amplification. BMC Genomics 2, 4 (2001).

152. Akey, J. M. et al. Melting curve analysis of SNPs (McSNP): a gel-free and inexpensive approach for SNP genotyping. Biotechniques 30, 358-62, 364, 366-7 (2001).

153. Hardenbol, P. et al. Multiplexed genotyping with sequence-tagged molecular inversion probes. Nat Biotechnol 21, 673-8 (2003).

154. Zhang, S., Van Pelt, C. K., Huang, X. & Schultz, G. A. Detection of single nucleotide polymorphisms using electrospray ionization mass spectrometry: validation of a one-well assay and quantitative pooling studies. J Mass Spectrom 37, 1039-50 (2002).

155. Xu, H. et al. Multiplexed SNP genotyping using the Qbead system: a quantum dot-encoded microsphere-based assay. Nucleic Acids Res 31, e43 (2003).

156. Xiao, M. & Kwok, P. Y. DNA analysis by fluorescence quenching detection. Genome Res 13, 932-9 (2003).

157. Waterfall, C. M. & Cobb, B. D. SNP genotyping using single-tube fluorescent bidirectional PCR. Biotechniques 33, 80, 82-4, 86 passim (2002).

158. Vreeland, W. N., Meagher, R. J. & Barron, A. E. Multiplexed, high-throughput genotyping by single-base extension and end-labeled free-solution electrophoresis. Anal Chem 74, 4328-33 (2002).

159. Mhlanga, M. M. & Malmberg, L. Using molecular beacons to detect single-nucleotide polymorphisms with real-time PCR. Methods 25, 463-71 (2001).

160. Kuzuya, A., Mizoguchi, R., Morisawa, F. & Komiyama, M. Novel approach for SNP genotyping based on site-selective RNA scission. Nucleic Acids Res Suppl, 129-30 (2002).

161. Iannone, M. A. et al. Multiplexed single nucleotide polymorphism genotyping by oligonucleotide ligation and flow cytometry. Cytometry 39, 131-40 (2000).

162. Chen, X., Levine, L. & Kwok, P. Y. Fluorescence polarization in homogeneous nucleic acid analysis. Genome Res 9, 492-8 (1999).

163. Myakishev, M. V., Khripin, Y., Hu, S. & Hamer, D. H. High-throughput SNP genotyping by allele-specific PCR with universal energy-transfer-labeled primers. Genome Res 11, 163-9 (2001).

164. Cartegni, L., Chew, S. L. & Krainer, A. R. Listening to silence and understanding nonsense: exonic mutations that affect splicing. Nat Rev Genet 3, 285-98 (2002).

165. http://exon.cshl.org/ESE/.

166. Fairbrother, W. G., Yeh, R. F., Sharp, P. A. & Burge, C. B. Predictive identification of exonic splicing enhancers in human genes. Science 297, 1007-13 (2002).

167. Graham, I. R., Hamshere, M. & Eperon, I. C. Alternative splicing of a human alpha-tropomyosin muscle-specific exon: identification of determining sequences. Mol Cell Biol 12, 3872-82 (1992).

168. Tacke, R. & Goridis, C. Alternative splicing in the neural cell adhesion molecule pre-mRNA: regulation of exon 18 skipping depends on the 5'-splice site. Genes Dev 5, 1416-29 (1991).

169. Cuadrado, A. et al. HuD binds to three AU-rich sequences in the 3'-UTR of neuroserpin mRNA and promotes the accumulation of neuroserpin mRNA and protein. Nucleic Acids Res 30, 2202-11 (2002).

170. Suzuki, A. et al. Functional haplotypes of PADI4, encoding citrullinating enzyme peptidylarginine deiminase 4, are associated with rheumatoid arthritis. Nat Genet 34, 395-402 (2003).

171. http://transfac.gbf.de/TRANSFAC/.

172. http://molsun1.cbrc.aist.go.jp/research/db/TFSEARCH.html.

173. http://mordor.cgb.ki.se/cgi-bin/CONSITE/consite.

174. http://www.123genomics.com/files/home.html.

175. Fried, M. & Crothers, D. M. Equilibria and kinetics of lac repressor-operator interactions by polyacrylamide gel electrophoresis. Nucleic Acids Res 9, 6505-25 (1981).

176. Kelly, S., Yotis, J., Macris, M. & Harley, V. Recombinant expression, purification and characterisation of the HMG domain of human SRY. Protein Pept Lett 10, 281-6 (2003).

177. Horikawa, Y. et al. Genetic variation in the gene encoding calpain-10 is associated with type 2 diabetes mellitus. Nat Genet 26, 163-75 (2000).

178. Galas, D. J. & Schmitz, A. DNAse footprinting: a simple method for the detection of protein-DNA binding specificity. Nucleic Acids Res 5, 3157-70 (1978).

179. Alam, J. & Cook, J. L. Reporter genes: application to the study of mammalian gene transcription. Anal Biochem 188, 245-54 (1990).

180. Nordeen, S. K. Luciferase reporter gene vectors for analysis of promoters and enhancers. Biotechniques 6, 454-8 (1988).

181. Gu, T. L., Goetz, T. L., Graves, B. J. & Speck, N. A. Auto-inhibition and partner proteins, core-binding factor beta (CBFbeta) and Ets-1, modulate DNA binding by CBFalpha2 (AML1). Mol Cell Biol 20, 91-103 (2000).

182. Kurdistani, S. K. & Grunstein, M. In vivo protein-protein and protein-DNA crosslinking for genomewide binding microarray. Methods 31, 90-5 (2003).

183. Ren, B. et al. E2F integrates cell cycle progression with DNA repair, replication, and G(2)/M checkpoints. Genes Dev 16, 245-56 (2002).

184. Weinmann, A. S. & Farnham, P. J. Identification of unknown target genes of human transcription factors using chromatin immunoprecipitation. Methods 26, 37-47 (2002).

185. Bird, A. P. DNA methylation and the frequency of CpG in animal DNA. Nucleic Acids Res 8, 1499-504 (1980).

186. Van Laere, A. S. et al. A regulatory mutation in IGF2 causes a major QTL effect on muscle growth in the pig. Nature 425, 832-6 (2003).

187. Meehan, R. et al. Transcriptional repression by methylation of CpG. J Cell Sci Suppl 16, 9-14 (1992).

188. Deng, G., Chen, A., Pong, E. & Kim, Y. S. Methylation in hMLH1 promoter interferes with its binding to transcription factor CBF and inhibits gene expression. Oncogene 20, 7120-7 (2001).

189. Murumagi, A., Vahamurto, P. & Peterson, P. Characterization of regulatory elements and methylation pattern of the autoimmune regulator (AIRE) promoter. J Biol Chem 278, 19784-90 (2003).

190. http://www.affymetrix.com/index.affx.

191. http://cmgm.stanford.edu/pbrown/.

192. Duggan, D. J., Bittner, M., Chen, Y., Meltzer, P. & Trent, J. M. Expression profiling using cDNA microarrays. Nat Genet 21, 10-4 (1999).

193. Churchill, G. A. Fundamentals of experimental design for cDNA microarrays. Nat Genet 32 Suppl, 490-5 (2002).

194. Quackenbush, J. Microarray data normalization and transformation. Nat Genet 32 Suppl, 496-501 (2002).

195. http://www.ncbi.nlm.nih.gov/About/primer/microarrays.html.

196. http://www.ebi.ac.uk/arrayexpress/.

197. Perkin-Elmer, C. PE Applied Biosystems ABI Prism 7700 Sequence Detection System: Relative Quantitation of Gene Expression. User Bulletin #2, 1-36 (1997).

198. Yan, H., Yuan, W., Velculescu, V. E., Vogelstein, B. & Kinzler, K. W. Allelic variation in human gene expression. Science 297, 1143 (2002).

199. Lo, H. S. et al. Allelic variation in gene expression is common in the human genome. Genome Res 13, 1855-62 (2003).

200. Pastinen, T. et al. A survey of genetic and epigenetic variation affecting human gene expression. Physiol Genomics (2003).

201. Nilsson, M., Barbany, G., Antson, D. O., Gertow, K. & Landegren, U. Enhanced detection and distinction of RNA by enzymatic probe ligation. Nat Biotechnol 18, 791-3 (2000).

202. Magnusson, V. et al. Fine mapping of the SLEB2 locus involved in susceptibility to systemic lupus erythematosus. Genomics 70, 307-14 (2000).

203. Nielsen, C., Hansen, D., Husby, S., Jacobsen, B. B. & Lillevang, S. T. Association of a putative regulatory polymorphism in the PD-1 gene with susceptibility to type 1 diabetes. Tissue Antigens 62, 492-7 (2003).

204. Tishkoff, S. A., Pakstis, A. J., Ruano, G. & Kidd, K. K. The accuracy of statistical methods for estimation of haplotype frequencies: an example from the CD4 locus. Am J Hum Genet 67, 518-22 (2000).

205. Haimila, K. et al. Genetic association of coeliac disease susceptibility to polymorphisms in the ICOS gene on chromosome 2q33. Genes Immun (2004).

206. Lutterbach, B. & Hiebert, S. W. Role of the transcription factor AML-1 in acute leukemia and hematopoietic differentiation. Gene 245, 223-35 (2000).

207. Prokunina, L. et al. A regulatory polymorphism in PDCD1 is associated with susceptibility to systemic lupus erythematosus in humans. Nat Genet 32, 666-9 (2002).

208. Bukhari, M. et al. Rheumatoid factor is the major predictor of increasing severity of radiographic erosions in rheumatoid arthritis: results from the Norfolk Arthritis Register Study, a large inception cohort. Arthritis Rheum 46, 906-12 (2002).

209. Prokunina, L. et al. The systemic lupus erythematosus-associated PDCD1 polymorphism PD1.3A in lupus nephritis. Arthritis Rheum 50, 327-8 (2004).

210. Tokuhiro, S. et al. An intronic SNP in a RUNX1 binding site of SLC22A4, encoding an organic cation transporter, is associated with rheumatoid arthritis. Nat Genet (2003).

211. Helms, C. et al. A putative RUNX1 binding site variant between SLC9A3R1 and NAT9 is associated with susceptibility to psoriasis. Nat Genet (2003).

212. Prokunina, L. & Alarcon-Riquelme, M. The genetics of systemic lupus erythematosus: knowledge from today and thoughts for tomorrow. Hum Mol Genet (2004).

213. Lin et al. Association of a programmed death 1 gene polymorphism with the development of rheumatoid arthritis, but not systemic lupus erythematosus. Arthritis Rheum 50, 770-5 (2004).

214. Alarcon-Riquelme, M. E. A RUNX trio with a taste for autoimmunity. Nat Genet 35, 299-300 (2003).


Acta Universitatis Upsaliensis
Comprehensive Summaries of Uppsala Dissertations from the Faculty of Medicine
Editor: The Dean of the Faculty of Medicine

Distribution: Uppsala University Library
Box 510, SE-751 20 Uppsala, Sweden
www.uu.se, [email protected]

ISSN 0282-7476
ISBN 91-554-5918-8

A doctoral dissertation from the Faculty of Medicine, Uppsala University, is usually a summary of a number of papers. A few copies of the complete dissertation are kept at major Swedish research libraries, while the summary alone is distributed internationally through the series Comprehensive Summaries of Uppsala Dissertations from the Faculty of Medicine. (Prior to October, 1985, the series was published under the title “Abstracts of Uppsala Dissertations from the Faculty of Medicine”.)