

Expression profiling of peripheral blood cells for early detection of breast cancer

Introduction

Early detection of breast cancer is key to successful treatment and patient survival. The existing methods to detect breast cancer in asymptomatic patients have limitations, and there is a need to develop more accurate and convenient methods. In a recently published study (Sharma et al. 2005), we demonstrated the potential use of gene expression profiling in peripheral blood cells (PBC) for early detection of breast cancer. However, that study was based on a limited sample size and on in-house manufactured macroarrays. The aim of the present study was to investigate whether the findings reported earlier could be reproduced using a larger sample size and a commercially available microarray platform.

Materials and methods

Whole blood was collected in PAX tubes from 64 females diagnosed with breast cancer and 76 females with no reported sign of the disease. Total RNA was extracted and gene expression analyses were conducted using high density oligonucleotide arrays (Agilent Technologies) containing 22,000 probes. Expression data were analyzed using several statistical approaches: Partial Least Squares Regression (PLSR) was used for model building, while a novel approach combining double and triple cross validation (CV) was used to identify stable and relevant predictive genes and to estimate their prediction efficiency. A number of studies have reported over-optimistic accuracy levels due to improper validation, where selection bias has not been taken into account (Ambroise and McLachlan, 2002). To avoid such selection bias and to obtain unbiased estimates of accuracy, a triple CV approach was required, since the gene selection procedure was itself based on an inner double CV routine (Figure 1). Based on the selected predictors, pathway analysis was conducted (PathwayAssist, Ariadne Genomics).
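The selection bias described above can be illustrated with a minimal sketch. It is not the authors' pipeline: a simple mean-difference filter and a nearest-centroid classifier stand in for the PLSR and jackknife-based gene selection, a single leave-one-out loop stands in for the triple CV, and the data are pure random noise with hypothetical dimensions (40 samples, 1000 features). All function names are illustrative assumptions. Because the labels carry no signal, an honest accuracy estimate should sit near chance, while selecting features on all samples before cross-validation typically yields an over-optimistic figure.

```python
import random

def select_features(X, y, k):
    """Rank features by absolute difference in class means; return top-k indices."""
    n_features = len(X[0])
    scores = []
    for j in range(n_features):
        a = [row[j] for row, label in zip(X, y) if label == 0]
        b = [row[j] for row, label in zip(X, y) if label == 1]
        scores.append((abs(sum(a) / len(a) - sum(b) / len(b)), j))
    scores.sort(reverse=True)
    return [j for _, j in scores[:k]]

def nearest_centroid_predict(X_train, y_train, x_test, features):
    """Classify x_test by the closer class centroid, using only `features`."""
    best_label, best_dist = None, float("inf")
    for label in (0, 1):
        rows = [row for row, l in zip(X_train, y_train) if l == label]
        centroid = [sum(r[j] for r in rows) / len(rows) for j in features]
        dist = sum((x_test[j] - c) ** 2 for j, c in zip(features, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label

def loo_accuracy(X, y, k, select_inside_fold):
    """Leave-one-out accuracy.

    select_inside_fold=True  -> proper: features chosen from training samples only.
    select_inside_fold=False -> improper: features chosen once from all samples.
    """
    global_features = None if select_inside_fold else select_features(X, y, k)
    correct = 0
    for i in range(len(X)):
        X_train = X[:i] + X[i + 1:]
        y_train = y[:i] + y[i + 1:]
        if select_inside_fold:
            features = select_features(X_train, y_train, k)
        else:
            features = global_features
        correct += nearest_centroid_predict(X_train, y_train, X[i], features) == y[i]
    return correct / len(X)

# Pure-noise data: no feature is truly related to the class label.
rng = random.Random(0)
n_samples, n_features, k = 40, 1000, 10
X = [[rng.gauss(0.0, 1.0) for _ in range(n_features)] for _ in range(n_samples)]
y = [i % 2 for i in range(n_samples)]

improper_acc = loo_accuracy(X, y, k, select_inside_fold=False)
proper_acc = loo_accuracy(X, y, k, select_inside_fold=True)
print(f"improper: {improper_acc:.2f}, proper: {proper_acc:.2f}")
```

The only difference between the two runs is where `select_features` is called. When the selection step itself relies on an inner cross-validation, as in the study, keeping it inside the outer loop is what adds the extra level of nesting.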

J. Aarøe (1) • T. Lindahl (2) • S. Sæbø (3) • P. Skaane (4) • S. Myhre (1) • T. Reiersen (1) • A. Lönneborg (2) • A-L. Børresen-Dale (1) • P. Sharma (2)

(1) Department of Genetics, The Norwegian Radium Hospital, N-0310 Oslo, Norway.
(2) DiaGenic ASA, Oslo, Norway.
(3) IKBM, University of Life Sciences, 1432 Ås, Norway.
(4) Department of Surgery, Ullevål University Hospital, Oslo, Norway.

Results and Discussion

We identified a set of 58 genes that correctly predicted the diagnostic class in 75% ± 7% of the samples, including several cases of early stage cancers (stage 0 and stage I). In addition to some gene families reported earlier such as ribosomal genes, the identified predictive genes also included some novel gene families. Pathway analysis identified a number of pathways based on the identified genes. These pathways are currently being investigated more thoroughly to reveal possible tumor-blood interactions.

In this study, factors known to significantly affect data quality, such as manufacturing lot, hybridization and labeling time, were randomized, as we expected less noise when using commercial arrays and protocols. Although the results show that diagnostic information relating to early stage breast cancer can still be mined from PBC, it is important that any test intended for breast cancer diagnosis has high prediction accuracy. We have recently conducted a pilot study using Applied Biosystems 44K microarrays, following a design that allows efficient normalization of the factors affecting data quality. The results show a significant improvement in prediction accuracy. We are now reanalyzing 130 blood samples using ABI microarrays in a carefully designed experiment.

The 97th AACR Annual Meeting, Washington DC, USA, 1-5 April 2006

# 125

Acknowledgement

The present study was supported by The National Programme for Research in Functional Genomics in Norway (FUGE) in the Research Council of Norway.

References

1. Sharma et al. (2005). Breast Cancer Research 7(5): R634-644.

2. Ambroise C, McLachlan GJ (2002). Proc. Natl. Acad. Sci. USA 99: 6562-6566.

Conclusions

• This larger study supports our previous finding that breast cancer affects gene expression patterns in PBC during early stages of disease development.

• A blood-based gene expression test can potentially be developed for early detection of breast cancer.

Ongoing and future studies

• Validate the relevance of the identified predictive genes by TaqMan RT-PCR.

• Try to understand the underlying biology causing the gene expression changes in blood cells of breast cancer patients.

• Develop a simple, objective and accurate gene expression based diagnostic tool for early detection of breast cancer.

Figure 1. Proper versus improper validation (diagram not reproduced).

Proper validation panel: Training samples, full variable set; variable selection using class info on training samples; cross-validation and model selection; classification of test samples. Models 1 to 139 are built; stable and common genes across segments are selected (jackknife significant genes); sample 140 is classified. Repeated for all 140 samples.

Improper validation panel: All samples, full variable set; variable selection using class info on all samples; then training samples and test samples are used for cross-validation, model selection and classification.