Neural Network Analysis of Flow Cytometry Immunophenotype Data

Mehrshad Mokhtaran, M.D.

IEEE TRANSACTIONS ON BIOMEDICAL ENGINEERING, VOL. 43, NO. 8, AUGUST 1996

Ravi Kothari,* Member, IEEE, Hernani Cualing, and Thiagarajan Balachander

Acute Leukemia

• Definition
  – Malignant event
  – Blasts replace the bone marrow
  – Clinical complications: anemia, infection, bleeding
  – Rapidly fatal
  – With appropriate therapy, the natural history can be markedly altered, and many patients can be cured.

Acute Leukemia

• Etiology:
  – Radiation
  – Oncogenic viruses
  – Genetic and congenital factors
  – Chemicals and drugs

Acute Leukemia

• Incidence:
  – Annual new cases (all leukemias): 8 to 10 per 100,000
  – Incidence has remained static over the past three decades
  – ALL: 11%, CLL: 29%, AML: 46%, CML: 14%
  – 3% of all cancers in the United States
  – ALL is the most common cancer in children (<15 y)
  – ALL is the second leading cause of death in children (<15 y)
  – ALL has two incidence peaks by age
  – AML incidence gradually increases with age
  – Half of AML cases occur in patients younger than 50 y

Acute Leukemia

• Pathophysiology:

Acute Leukemia

• Classification
  – Morphology
  – Cytochemistry
  – Cell-surface markers
  – Cytoplasmic markers
  – Cytogenetics
  – Oncogene expression

Acute Leukemia

• The most important distinction is between AML and ALL
  – Clinical behavior, prognosis, response to therapy
• AML (FAB)
  – M0, M1, M2, M3: increasing degree of differentiation
  – M4, M5: monocytic lineage
  – M6: erythroid cell lineage
  – M7: acute megakaryocytic leukemia
• ALL (FAB)
  – L1
  – L2
  – L3

Acute Leukemia

• Cell-surface markers:
  – AML
    • Normal immature myeloid cells and blast cells from most patients with AML: CD13, CD14, CD33, CD34
    • M6, M7: antigens restricted to red cell and platelet lineages
    • AML may express HLA-DR antigen
    • 10-20%: B- or T-cell lineage
  – ALL
    • 60% of ALL: CALLA (CD10) (early pre-B-cell differentiation state)
    • Pre-B-cell ALL (20%): CALLA-positive cells that have intracytoplasmic immunoglobulin
    • B-cell ALL (5%): immunoglobulin on the cell surface
    • T-cell ALL (20%): CD5, CD3, or CD2 (normal early T-cell)
    • Null-cell ALL (15%): fails to express CALLA or B- or T-cell markers
    • 25% of ALL: myeloid antigens

Acute Leukemia

• Cytogenetics and molecular biology:

Acute Leukemia

• Clinical manifestations:
  – Decreased normal marrow function:
    • Anemia: fatigue, pallor, headache, angina or heart failure
    • Thrombocytopenia: bleeding (petechiae, ecchymoses, bleeding gums, epistaxis)
    • Granulocytopenia (AML > ALL): infections (bacterial)
  – Invasion of normal organs by leukemic blasts (ALL > AML):
    • Enlargement of lymph nodes, liver, spleen
    • Bone pain
    • Skin (leukemia cutis)
    • Leukemic meningitis: headache, nausea
    • CNS (particularly in relapse): palsies and seizures
    • Testicular involvement (particularly in relapse)
    • Any soft tissue (AML > ALL): chloroma, myeloblastoma
  – Specific subtypes of leukemia:
    • M3: DIC (disseminated intravascular coagulation)

Acute Leukemia

• Laboratory manifestations:
  – CBC
  – Bone marrow aspiration and biopsy
  – PT (prothrombin time) and PTT (partial thromboplastin time)
  – LDH (lactate dehydrogenase)
  – …

Acute Leukemia

• Treatment:
  – Combination chemotherapy
  – Bone marrow transplantation
  – Stabilization:
    • Hematological
    • Metabolic
    • Psychological

• Introduction

• Data Collection

• Classifier Design

• Results

• Discussion

• Conclusion

Introduction

• Immunophenotype data
• Flow cytometry
• Lineage & differentiation
• ALL: immature (CALLA+), pre-B, mature-B, T-lymphoblastic
• Response to chemotherapy
• AML: M1, M2, …, M8
• No relevant prognosis

Data Collection

• Flow cytometry immunophenotype data of cases with leukemia or reactive bone marrow were collected retrospectively from a computerized archival database.
• Selection criteria:
  – Confirmed diagnosis
  – Complete flow cytometry antibody panel result
• Total cases: 170
  – 151 leukemia and 19 nonleukemia
  – 62 children and 89 adults
  – 81 males and 70 females

First Phase

• Lineage categories
• Categorize into:
  – Reactive
  – ALL
  – Remission
  – Mixed AML-ALL
  – AML

Second Phase

• Categorize the ALL cases into subcategories based on differentiation
• Categorize into:
  – Pre-B
  – CALLA+
  – T phenotype
• Not included: Mature-B (difficulty in obtaining sufficient data for meaningful interpretation)

Data

• Validation / training set size = 33-50%
• Only bone marrow phenotypes (most sensitive and specific)
• Excluded: peripheral blood and cerebrospinal fluid immunophenotypes
• Flow cytometry immunophenotype data:
  – Mean fluorescence intensity of a minimum of 10,000 cells analyzed using either a red- or green-fluorescence-tagged antibody

Data

• 27 standardized and most commonly used monoclonal antibodies with defined specificities.
• Not all of these are utilized for each case.
• Average of 15 antibodies for each case.
• At least ten antibodies are commonly used for acute leukemia as a standard practice.
• The antibody values, with a zero value if an antibody was not used, and an additional binary input denoting a past diagnosis of leukemia were used as input to a neural network classifier.
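The input encoding described on this slide can be sketched as follows. This is only an illustration: the slides do not list the 27 antibodies, so the panel names are placeholders, and placing the history flag last in the vector is an assumption.

```python
# Hypothetical panel: the slides say 27 antibodies but do not name them,
# so placeholder names are used here.
PANEL = [f"antibody_{i:02d}" for i in range(1, 28)]


def encode_case(measurements, prior_leukemia):
    """Build the 28-element input vector for one case.

    measurements: dict mapping antibody name -> mean fluorescence intensity,
                  containing only the antibodies actually run for this case.
    prior_leukemia: True if the patient has a past diagnosis of leukemia.
    """
    unknown = set(measurements) - set(PANEL)
    if unknown:
        raise ValueError(f"antibodies not in panel: {sorted(unknown)}")
    # Antibodies that were not used are encoded as zero, as the slide states.
    vector = [float(measurements.get(name, 0.0)) for name in PANEL]
    # Assumption: the binary history flag is appended as the last input.
    vector.append(1.0 if prior_leukemia else 0.0)
    return vector
```

A case with 15 measured antibodies would therefore yield a vector with 15 nonzero intensities, 12 zeros, and the flag.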

Classifier Design

• A feed-forward neural network
• Trained using the backpropagation algorithm

Classifier

• How many hidden layer neurons are needed for a particular task?
  – Having a large number of redundant weights leads to overfitting

Classifier

• Given a network with a certain number of inputs, hidden layer neurons, and outputs, how many training samples are needed to achieve good generalization?
• For accuracy of (1-ε):

p ≥ O(W/ε)

p: Number of training samples.

W: Total number of weights in the network.
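A rough worked example of this bound, ignoring the constant hidden in the O(·) notation. W = 1650 is the first-phase weight total reported in the Results slide and 170 is the total case count; ε = 0.1 is an assumed target chosen only for illustration.

```python
def samples_needed(n_weights, epsilon):
    """Order-of-magnitude estimate p ~ W / epsilon (constant factors ignored)."""
    if not 0.0 < epsilon < 1.0:
        raise ValueError("epsilon must be in (0, 1)")
    return n_weights / epsilon


W = 1650            # first-phase weight total reported in the deck
AVAILABLE = 170     # total cases reported in the deck
EPSILON = 0.1       # assumption: 90% target accuracy, for illustration only

estimate = samples_needed(W, EPSILON)
shortfall = estimate / AVAILABLE
print(f"estimated samples: {estimate:.0f}, about {shortfall:.0f}x the available cases")
```

Under these assumptions the estimate is on the order of 16,500 samples, far more than the 170 collected cases, which is why the following slides turn to methods that enlarge the data set or reduce the effective number of weights.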

Classifier

• Perturbation: Generate a large number of cases by introducing small variations in actual cases.
• Optimal Brain Damage: The weights whose removal least increases the error can be eliminated.
• Optimal Brain Surgeon: The sensitivity of an interconnection is expressed as the cumulative sum of the changes experienced by a weight during training.
• Weight Decay: Each weight has a tendency to decay to zero at a rate proportional to the magnitude of the weight.
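The perturbation idea can be sketched as below. The slides do not say how the variation is generated, so the Gaussian noise, its standard deviation, the number of copies, and the choice to leave unmeasured (zero) antibodies and the final binary flag untouched are all assumptions.

```python
import random


def perturb_cases(cases, copies=10, noise_sd=0.02, seed=0):
    """Generate jittered copies of (input_vector, label) pairs.

    Only measured (nonzero) antibody values are jittered. Zeros, which mark
    antibodies that were not used, and the last element (assumed to be the
    binary history flag) are copied unchanged.
    """
    rng = random.Random(seed)
    augmented = []
    for vector, label in cases:
        *antibodies, flag = vector
        for _ in range(copies):
            noisy = [
                max(0.0, v + rng.gauss(0.0, noise_sd)) if v != 0.0 else 0.0
                for v in antibodies
            ]
            augmented.append((noisy + [flag], label))
    return augmented
```

Each original case keeps its label, so the augmented set is larger but carries no new diagnostic information; it only discourages the network from depending on exact intensity values.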

Classifier

• Inputs: 27 + 1
• Hidden: 50, found by progressively increasing the number of hidden neurons until acceptable performance was achieved on the training data.
• Output:
  – First phase (based on lineage): 5
  – Second phase (based on differentiation): 3
• Learning rate (η): 0.1
• Weight decay coefficient (λ): 0.05
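A minimal sketch of one training step for a network of this size. The layer sizes, η, and λ come from the slide; the sigmoid units, squared-error loss, omission of bias terms, and the plain L2 form of the decay term are assumptions, since the slides do not give the exact formulation used by the authors.

```python
import math
import random

N_IN, N_HIDDEN, N_OUT = 28, 50, 5   # 27 antibodies + 1 flag; 5 lineage classes
ETA, LAMBDA = 0.1, 0.05             # learning rate and decay coefficient (slide)


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def init_weights(rows, cols, rng):
    """Small random weights; one row per receiving neuron."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def forward(x, w_hid, w_out):
    h = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in w_hid]
    y = [sigmoid(sum(w * hi for w, hi in zip(row, h))) for row in w_out]
    return h, y


def train_step(x, target, w_hid, w_out):
    """One backpropagation update with L2 weight decay (assumed form).

    Returns the mean squared error measured before the update.
    """
    h, y = forward(x, w_hid, w_out)
    # Output deltas for squared error with sigmoid units.
    d_out = [(yk - tk) * yk * (1.0 - yk) for yk, tk in zip(y, target)]
    # Hidden deltas, propagated back through the not-yet-updated output weights.
    d_hid = [
        hj * (1.0 - hj) * sum(d_out[k] * w_out[k][j] for k in range(len(w_out)))
        for j, hj in enumerate(h)
    ]
    for k, row in enumerate(w_out):
        for j in range(len(row)):
            row[j] -= ETA * (d_out[k] * h[j] + LAMBDA * row[j])
    for j, row in enumerate(w_hid):
        for i in range(len(row)):
            row[i] -= ETA * (d_hid[j] * x[i] + LAMBDA * row[i])
    return sum((yk - tk) ** 2 for yk, tk in zip(y, target)) / len(y)
```

The `LAMBDA * row[...]` term pulls every weight toward zero on each step, so connections that the error gradient does not keep reinforcing shrink over time.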

Results

• Mean error was acceptably low (0.0001) in both cases.
• First phase weights:
  – Total: 1650
  – Nonzero: 1106
  – Very small value (<0.1): 544
• Second phase weights:
  – Total: 1550
  – Nonzero: 446
  – Very small value (<0.1): 1104
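The reported totals can be checked with a little arithmetic. The sketch below assumes a fully connected network with one hidden layer and no bias weights, which reproduces both totals from the 28 inputs, 50 hidden neurons, and 5 or 3 outputs listed earlier. It also assumes that "<0.1" refers to the magnitude of a weight.

```python
def count_connections(n_in, n_hidden, n_out):
    """Weights in a fully connected one-hidden-layer network without biases."""
    return n_in * n_hidden + n_hidden * n_out


def small_weight_fraction(weights, threshold=0.1):
    """Fraction of weights whose magnitude is below the threshold."""
    flat = [w for row in weights for w in row]
    return sum(abs(w) < threshold for w in flat) / len(flat)


# Counts reported on the slide: (total, nonzero, very small).
reported = {"first phase": (1650, 1106, 544), "second phase": (1550, 446, 1104)}
for phase, (total, nonzero, small) in reported.items():
    assert nonzero + small == total
    print(f"{phase}: {small / total:.1%} of weights are below 0.1")
```

On the reported numbers, roughly a third of the first-phase weights and over 70% of the second-phase weights were driven close to zero.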

Fig. 2. Performance of the network for categorization into reactive and the lineage categories of leukemia (ALL, Remission, Mixed AML-ALL, and AML).

Fig. 3. Performance of the network for categorization of ALL cases into subcategories based on differentiation (Pre-B, CALLA+, and T Phenotype).

Results

• Generalization error:
  – First phase: 10.3%
  – Second phase: 10.0%
• Backpropagation without the complexity regularization term (weight decay):
  – Generalization performance was poor

Discussion

• Clustering-based methods fall into one of two categories:
  – Partitioning
  – Hierarchical

Discussion

• Partitioning:
  – e.g., k-means, fuzzy c-means clustering
  – Divide the inputs so that members of a cluster are close to each other and far away from other clusters
  – The shared specificity of some monoclonal antibodies makes this extremely difficult.
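For reference, a minimal k-means sketch of the partitioning approach named on this slide. It is a generic textbook version, not something taken from the paper; the Euclidean distance, random initialization from the data points, and the iteration cap are assumptions.

```python
import math
import random


def kmeans(points, k, iterations=100, seed=0):
    """Plain k-means; returns (cluster index per point, centroids)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    assignment = None
    for _ in range(iterations):
        # Assign every point to its nearest centroid.
        new_assignment = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        if new_assignment == assignment:
            break  # converged: no point changed cluster
        assignment = new_assignment
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignment, centroids
```

The method works when groups are compact and well separated in input space. When different leukemia categories react to the same antibodies, their points overlap and no centroid placement separates them cleanly, which is the difficulty the slide refers to.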

Discussion

• Hierarchical:
  – e.g., centroid sorting, linkage methods
  – Merge the two closest data points at each step, and repeat the process until there is only one cluster.
  – Have a better chance of succeeding due to the variability in immunophenotype data
  – A merging error made early on is propagated throughout.
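A minimal sketch of the merge loop behind hierarchical (agglomerative) clustering. Single linkage with Euclidean distance is used here as one example of the linkage methods the slide mentions; that choice is an assumption, not the method evaluated in the paper.

```python
import math


def single_linkage(points):
    """Agglomerative clustering; returns the merge history.

    Each history entry is (cluster_a, cluster_b, distance), where clusters
    are tuples of point indices. Merging stops when one cluster remains.
    """
    clusters = [(i,) for i in range(len(points))]
    history = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Single linkage: distance between the closest pair of members.
                d = min(
                    math.dist(points[i], points[j])
                    for i in clusters[a]
                    for j in clusters[b]
                )
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        history.append((clusters[a], clusters[b], d))
        merged = clusters[a] + clusters[b]
        clusters = [c for idx, c in enumerate(clusters) if idx not in (a, b)]
        clusters.append(merged)
    return history
```

Note that a merge is never undone: once two points are joined, every later step builds on that cluster, which is how an early merging error propagates through the rest of the hierarchy.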

Conclusion

• Off-line retraining
• Extract rules from trained networks