Unsupervised clustering in mRNA expression profiles D.K. Tasoulis, V.P. Plagianakos, and M.N. Vrahatis Computational Intelligence Laboratory (CILAB), Department of Mathematics, University of Patras, GR-26110 Patras, Greece University of Patras Artificial Intelligence Research Center (UPAIRC), University of Patras, GR-26110 Patras, Greece Computers in Biology and Medicine In Press, Corrected Proof, Available online 24 October 2005

Page 1: Unsupervised clustering in mRNA expression profiles

Unsupervised clustering in mRNA expression profiles

D.K. Tasoulis, V.P. Plagianakos, and M.N. Vrahatis

Computational Intelligence Laboratory (CILAB), Department of Mathematics, University of Patras, GR-26110 Patras, Greece

University of Patras Artificial Intelligence Research Center (UPAIRC), University of Patras, GR-26110 Patras, Greece

Computers in Biology and Medicine In Press, Corrected Proof, Available online 24 October 2005

Page 2: Unsupervised clustering in mRNA expression profiles

K-Windows Clustering

• Adaptation of K-means, originally proposed in 2002 by Vrahatis et al.

• Windowing technique improves speed and accuracy

• Tries to place a d-dimensional window (box) so that it contains all patterns belonging to a single cluster

Page 3: Unsupervised clustering in mRNA expression profiles

K-Windows – Basic Concepts

• Move windows to find cluster centers (fig a)
  1. Select k points as centers of d-dimensional windows of size a.
  2. The mean of the patterns inside each window becomes its new center.
  3. Repeat until the stopping criterion is met (based on the movement of the center).

• Enlarge windows to determine cluster edges (fig b)
  1. Enlarge one dimension by a specified percentage.
  2. Relocate the window as above.
  3. Keep the enlargement only if the increase in instances inside the window exceeds a threshold.
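The movement and enlargement steps can be sketched in a few lines of Python. This is a minimal illustration and not the authors' implementation. The function names are hypothetical, the window is stored as a center plus per-dimension half-widths, and the "increase in instances" is assumed to be measured as a relative increase. The slides do not specify any of these details.

```python
from statistics import fmean

def points_in_window(points, center, half_widths):
    """Return the points inside the axis-aligned window around `center`."""
    return [
        p for p in points
        if all(abs(x - c) <= w for x, c, w in zip(p, center, half_widths))
    ]

def move_window(points, center, half_widths, tol=1e-6, max_iter=100):
    """Recenter the window on the mean of its points until it stops moving."""
    center = tuple(center)
    for _ in range(max_iter):
        inside = points_in_window(points, center, half_widths)
        if not inside:
            break  # empty window: nothing to recenter on
        new_center = tuple(fmean(dim) for dim in zip(*inside))
        shift = max(abs(n - c) for n, c in zip(new_center, center))
        center = new_center
        if shift < tol:  # stopping criterion: the center barely moved
            break
    return center

def enlarge_window(points, center, half_widths, percent, threshold):
    """Try one enlargement per dimension; keep it only if enough points are gained.

    Assumption: the gain is the relative increase in the point count.
    """
    for d in range(len(half_widths)):
        before = len(points_in_window(points, center, half_widths))
        trial = list(half_widths)
        trial[d] *= 1.0 + percent
        trial_center = move_window(points, center, trial)
        after = len(points_in_window(points, trial_center, trial))
        if before and (after - before) / before > threshold:
            center, half_widths = trial_center, trial
    return center, half_widths

points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.2), (5.0, 5.0)]
center = move_window(points, center=(0.5, 0.5), half_widths=[1.0, 1.0])
```

Starting from `(0.5, 0.5)`, the window drifts toward the three nearby points and ignores the distant one at `(5.0, 5.0)`. A fuller version would repeat the enlargement for as long as it keeps being accepted.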

Page 4: Unsupervised clustering in mRNA expression profiles

Unsupervised K-Windows (UKW)

• Start with a sufficiently large number of windows
• Merge windows to determine the number of clusters automatically
• For each pair of overlapping windows, calculate the proportion of overlap for each window:
  a) Large overlap: considered the same cluster, W1 is deleted.
  b) Many points in common: considered the same cluster.
  c) Low overlap: considered two different clusters.
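The three-way decision above can be sketched as follows. This is an illustrative reading of the slide, not the published rule. The function names, the two threshold names with their values, and the choice of the minimum and mean of the per-window overlap fractions are all assumptions.

```python
def inside(point, window):
    """True if `point` lies in the box given by (lower corner, upper corner)."""
    lo, hi = window
    return all(l <= x <= h for x, l, h in zip(point, lo, hi))

def merge_decision(points, w1, w2, same_threshold=0.8, merge_threshold=0.1):
    """Classify a pair of windows as 'delete_first', 'merge', or 'separate'.

    Both thresholds are illustrative values, not taken from the paper.
    """
    in1 = {i for i, p in enumerate(points) if inside(p, w1)}
    in2 = {i for i, p in enumerate(points) if inside(p, w2)}
    shared = in1 & in2
    if not shared:
        return "separate"
    # Proportion of each window's points that fall in the overlap.
    frac1 = len(shared) / len(in1)
    frac2 = len(shared) / len(in2)
    if min(frac1, frac2) >= same_threshold:
        return "delete_first"  # case a: windows describe the same cluster
    if (frac1 + frac2) / 2 >= merge_threshold:
        return "merge"         # case b: many points in common
    return "separate"          # case c: low overlap

points = [(0, 0), (1, 1), (2, 2), (3, 3), (9, 9)]
w1 = ((-0.5, -0.5), (2.5, 2.5))
w2 = ((0.5, 0.5), (3.5, 3.5))
decision = merge_decision(points, w1, w2)
```

Here `w1` and `w2` share two of their three points each, so the pair falls into case b under the assumed thresholds.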

Page 5: Unsupervised clustering in mRNA expression profiles

Experimental Setup

• Leukemia dataset – well characterized
• Default UKW parameters used
• Supervised dimension reduction
  – Two previously published gene subsets and their union
• Unsupervised dimension reduction
  – Biclustering with UKW
  – PCA
  – PCA and UKW hybrid

Page 6: Unsupervised clustering in mRNA expression profiles

Supervised Feature Selection

• Use two gene subsets selected in previously published papers using supervised techniques.

• All algorithms did best on the combined set (the union of the two subsets); results below.

Page 7: Unsupervised clustering in mRNA expression profiles

Unsupervised Feature Selection (Biclustering Technique)

• Apply UKW to cluster the genes, then select the gene closest to each cluster center as that cluster's representative.

• Apply UKW to the samples, using only those representative genes (239).

• UKW accuracy: 93.6% (ALL) and 76% (AML)

• No results reported for other algorithms
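The gene-selection step can be illustrated with a short sketch. It assumes cluster labels for the genes are already available (from UKW or any other clustering), takes the cluster center to be the mean profile, and uses Euclidean distance. The function name and the toy data are hypothetical.

```python
from collections import defaultdict
from math import dist
from statistics import fmean

def representatives(profiles, labels):
    """Pick, for each cluster, the gene whose profile is closest to the centroid.

    profiles: gene name -> expression values across samples
    labels:   gene name -> cluster id (assumed to come from a prior clustering)
    """
    clusters = defaultdict(list)
    for gene, label in labels.items():
        clusters[label].append(gene)

    chosen = {}
    for label, genes in clusters.items():
        # Centroid = per-sample mean over the genes in this cluster.
        centroid = [fmean(col) for col in zip(*(profiles[g] for g in genes))]
        chosen[label] = min(genes, key=lambda g: dist(profiles[g], centroid))
    return chosen

profiles = {
    "g1": [1.0, 2.0], "g2": [1.2, 2.1], "g3": [2.0, 3.0],
    "g4": [8.0, 1.0], "g5": [8.5, 1.5], "g6": [9.0, 2.0],
}
labels = {"g1": 0, "g2": 0, "g3": 0, "g4": 1, "g5": 1, "g6": 1}
selected = representatives(profiles, labels)
```

The samples would then be clustered using only the columns for the selected genes.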

Page 8: Unsupervised clustering in mRNA expression profiles

Unsupervised Feature Selection (PCA Techniques)

• PCA and scree plot to reduce features
  – Poor performance
• Hybrid PCA and UKW method
  – Partition genes using UKW
  – Transform each partition using PCA
  – Select representative factors from each cluster
  – UKW accuracy: 97.87% (ALL) and 88% (AML)
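A rough sketch of the hybrid idea, using NumPy: run PCA separately on each gene partition and keep the leading factors from each. The partition labels are assumed to be given. Keeping the top `n_factors` components by variance is an assumption, since the slides do not say how the representative factors are chosen. The function name is hypothetical.

```python
import numpy as np

def partition_factors(X, gene_labels, n_factors=1):
    """Run PCA within each gene partition and stack the leading factors.

    X:           samples x genes expression matrix
    gene_labels: partition id per gene (assumed to come from clustering the genes)
    """
    blocks = []
    for label in np.unique(gene_labels):
        block = X[:, gene_labels == label]
        centered = block - block.mean(axis=0)
        # Rows of vt are the principal axes, ordered by explained variance.
        _, _, vt = np.linalg.svd(centered, full_matrices=False)
        blocks.append(centered @ vt[:n_factors].T)
    return np.hstack(blocks)

rng = np.random.default_rng(0)
X = rng.normal(size=(6, 5))              # toy data: 6 samples, 5 genes
gene_labels = np.array([0, 0, 0, 1, 1])  # two gene partitions
Z = partition_factors(X, gene_labels)    # 6 samples x 2 factors
```

The reduced matrix `Z` would then be passed to the sample-clustering step in place of the full expression matrix.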

Page 9: Unsupervised clustering in mRNA expression profiles

UKW Results Summary

Dataset                                ALL Accuracy   AML Accuracy
Published Gene Subsets (Supervised)    90%            100%
UKW Biclustering (Unsupervised)        93.6%          76%
PCA (Unsupervised)                     N/A            N/A
PCA-UKW Hybrid (Unsupervised)          97.87%         88%

Page 10: Unsupervised clustering in mRNA expression profiles
Page 11: Unsupervised clustering in mRNA expression profiles

• Default parameters
  – initial window size a = 5
  – enlargement threshold θe = 0.8
  – merging threshold θm = 0.1
  – coverage threshold θc = 0.2
  – variability threshold θv = 0.02
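For experimentation, the defaults listed above could be collected in a small configuration object. The class and field names are hypothetical. Only the numeric values come from the slide.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class UKWDefaults:
    """Default UKW parameters as listed on the slide (field names are illustrative)."""
    window_size: float = 5.0             # a
    enlargement_threshold: float = 0.8   # θe
    merging_threshold: float = 0.1       # θm
    coverage_threshold: float = 0.2      # θc
    variability_threshold: float = 0.02  # θv

defaults = UKWDefaults()
# Override a single value without mutating the shared defaults.
smaller_windows = replace(defaults, window_size=2.5)
```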
