a plot for visualizing multivariate data

A Plot for Visualizing Multivariate Data Rida E. A. Moustafa George Mason University ADM Group,AAL [email protected] [email protected]

Upload: arne

Post on 07-Feb-2016


Page 1: A Plot for Visualizing Multivariate Data

A Plot for Visualizing Multivariate Data

Rida E. A. Moustafa

George Mason University; ADM Group, AAL

[email protected]@aalcpas.com

Page 2: A Plot for Visualizing Multivariate Data

Talk Outline

• The theory of the MV-plot.
• Detecting linear structures with the MV-plot.
• Detecting nonlinear structures with the MV-plot.
• Comparisons with other methods and application to real data.

Page 3: A Plot for Visualizing Multivariate Data

MV-Plot Theory

Given an observation x = (x_1, x_2, …, x_d), we define m and v as follows:

$$ m = f(x) = \frac{1}{d}\sum_{j=1}^{d} |x_j| $$

$$ v = g(x, f(x)) = \sqrt{\frac{1}{d}\sum_{j=1}^{d} |x_j - f(x)|^2} $$

Computing m and v for every observation produces a vector of m values and a vector of v values.

What is the relationship between m and v?
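The definitions on this slide can be sketched in a few lines of Python. This is a minimal illustration rather than the author's implementation: the names `mv_point` and `mv_transform` are hypothetical, the toy data is made up, and the code assumes that m is the mean of the absolute coordinate values and that v is the root-mean-square deviation of the coordinates from m.

```python
import math

def mv_point(x):
    """Return (m, v) for one observation x, a sequence of d numbers."""
    d = len(x)
    # Assumed reading of the slide: m is the mean absolute coordinate value.
    m = sum(abs(xj) for xj in x) / d
    # Assumed reading of the slide: v is the RMS deviation of the coordinates from m.
    v = math.sqrt(sum((xj - m) ** 2 for xj in x) / d)
    return m, v

def mv_transform(data):
    """Map every observation to its (m, v) pair; returns two parallel lists."""
    pairs = [mv_point(x) for x in data]
    ms = [m for m, _ in pairs]
    vs = [v for _, v in pairs]
    return ms, vs

# Three toy observations in d = 3 dimensions (made-up values).
data = [(0.2, 0.4, 0.6), (0.5, 0.5, 0.5), (0.1, 0.9, 0.5)]
ms, vs = mv_transform(data)
for m, v in zip(ms, vs):
    print(f"m = {m:.4f}, v = {v:.4f}")
```

Plotting `vs` against `ms` as a scatter plot would give the MV-plot of the data set.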

Page 4: A Plot for Visualizing Multivariate Data

MV-Relationship in 2-d

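$$ m_i = \frac{1}{2}\sum_{j=1}^{2} |x_{ij}| = \frac{1}{2}\left(|x_{i1}| + |x_{i2}|\right) $$

$$ v_i = \sqrt{\frac{1}{2}\sum_{j=1}^{2} |x_{ij} - m_i|^2} = \frac{1}{2}\,|x_{i1} - x_{i2}| $$

• Normalizing the data to the range (0, 1) avoids the absolute value in computing m.
• Close to the principal components (PC) in 2-d.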
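A quick numerical check of the 2-d case may help. The sketch below assumes the same reading of the definitions (m as the mean absolute value, v as the root-mean-square deviation from m) and uses a hypothetical helper `mv_point`. For random data in (0, 1) it compares m with (x_1 + x_2)/2 and v with |x_1 - x_2|/2.

```python
import math
import random

def mv_point(x):
    """Return (m, v) for one observation under the assumed definitions."""
    d = len(x)
    m = sum(abs(xj) for xj in x) / d
    v = math.sqrt(sum((xj - m) ** 2 for xj in x) / d)
    return m, v

random.seed(0)  # fixed seed so the check is reproducible
points = [(random.random(), random.random()) for _ in range(1000)]

max_err_m = 0.0
max_err_v = 0.0
for x1, x2 in points:
    m, v = mv_point((x1, x2))
    # Compare against the closed forms for 2-d data in (0, 1).
    max_err_m = max(max_err_m, abs(m - (x1 + x2) / 2))
    max_err_v = max(max_err_v, abs(v - abs(x1 - x2) / 2))

print(max_err_m, max_err_v)  # both should be at floating-point rounding level
```

Up to scaling, (x_1 + x_2)/2 and |x_1 - x_2|/2 are the coordinates along the two diagonals of the plane, which is one way to read the slide's remark about principal components.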

Page 5: A Plot for Visualizing Multivariate Data

MV- detects linear structure(s)

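If the data is linear in the original space,

$$ x_{i2} = w_1 x_{i1} + w_0, $$

then

$$ m_i = \frac{(1 + w_1)\,x_{i1} + w_0}{2}, \qquad v_i = \frac{\left|(1 - w_1)\,x_{i1} - w_0\right|}{2}, $$

so that, if (1 - w_1) x_{i1} - w_0 has the same sign for all points,

$$ m_i = a_1 x_{i1} + a_0; \qquad v_i = b_1 x_{i1} + b_0. $$

It will be linear in the MV-space!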
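The claim can be illustrated for a single line in 2-d. The sketch below is an assumption-laden example, not the author's code: the line x_2 = w_1 x_1 + w_0 with w_1 = 0.5 and w_0 = 0.1 is made up, `mv_point` is a hypothetical helper, and the x_1 range is chosen so that x_1 > x_2 for every point (so the absolute value in v never changes sign). Under those assumptions the MV points should share one slope, here 1/3.

```python
import math

def mv_point(x):
    """Return (m, v): mean absolute value and RMS deviation from it (assumed definitions)."""
    d = len(x)
    m = sum(abs(xj) for xj in x) / d
    v = math.sqrt(sum((xj - m) ** 2 for xj in x) / d)
    return m, v

w1, w0 = 0.5, 0.1                              # hypothetical line x2 = w1 * x1 + w0
xs = [0.3 + 0.07 * k for k in range(11)]       # x1 from 0.3 to 1.0
line = [(x1, w1 * x1 + w0) for x1 in xs]       # here x1 > x2 for every point

mv = [mv_point(p) for p in line]
m0, v0 = mv[0]
# Slope from the first MV point to each later one; equal slopes mean collinear points.
slopes = [(v - v0) / (m - m0) for m, v in mv[1:]]
print(slopes)
```

If the x_1 range crossed the point where x_1 = x_2, the MV image would bend there, because v is an absolute value.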

Page 6: A Plot for Visualizing Multivariate Data

MV- detects linear structure(s)

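For a linear structure in d dimensions,

$$ x_{id} = \sum_{j=1}^{d-1} w_j x_{ij} + w_0, $$

the mean becomes

$$ m_i = \frac{1}{d}\left[\sum_{j=1}^{d-1} (1 + w_j)\,x_{ij} + w_0\right]. $$

[Equation: the slide's corresponding expression for v_i, which involves (d - 1), w_j and w_0, is garbled in the transcript.]

Both coordinates then take the linear form

$$ m_i = \sum_{j=1}^{d-1} a_j x_{ij} + a_0, \qquad v_i = \sum_{j=1}^{d-1} b_j x_{ij} + b_0. $$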
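The expression for m on this slide can be checked numerically. The sketch below covers m only and rests on stated assumptions: the linear structure is taken to be the hyperplane x_{id} = w_0 + sum_j w_j x_{ij}, the weights and intercept are made up, and the data is kept nonnegative so that the absolute values in m have no effect.

```python
import random

random.seed(1)
d = 5
w0 = 0.2                                       # made-up intercept
w = [random.random() for _ in range(d - 1)]    # made-up weights w_1 .. w_{d-1}

max_err = 0.0
for _ in range(200):
    head = [random.random() for _ in range(d - 1)]         # x_i1 .. x_i,d-1 in (0, 1)
    last = w0 + sum(wj * xj for wj, xj in zip(w, head))    # x_id on the hyperplane
    x = head + [last]
    m = sum(abs(xj) for xj in x) / d                       # m as mean absolute value
    # Linear form of m in the first d-1 coordinates.
    m_linear = (sum((1 + wj) * xj for wj, xj in zip(w, head)) + w0) / d
    max_err = max(max_err, abs(m - m_linear))

print(max_err)  # should be at floating-point rounding level
```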

Page 7: A Plot for Visualizing Multivariate Data

Detecting Linear structure(s) Example I

Page 8: A Plot for Visualizing Multivariate Data

Detecting Linear structure(s) Example II

Page 9: A Plot for Visualizing Multivariate Data

Detecting Linear structure(s) Example III

Page 10: A Plot for Visualizing Multivariate Data

Detecting nonlinear data with MV-plot

The MV-plot can detect nonlinear structure in the data set without any changes to the equations.

Page 11: A Plot for Visualizing Multivariate Data

Detecting nonlinear structure

Examples: the curves (x, cos(x)) and (x, sin(x)), each mapped to its (m, v) values, where v involves an absolute value.
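As a small illustration of nonlinear structure, the sketch below maps the 2-d curve (x, cos(x)) into MV-space using the same equations as for linear data. It is an assumed example, not taken from the slide: the x range [0, 1] is chosen so that both coordinates are nonnegative, `mv_point` is a hypothetical helper, and m and v follow the assumed definitions (mean absolute value, RMS deviation from it).

```python
import math

def mv_point(x):
    """Return (m, v) for one observation under the assumed definitions."""
    d = len(x)
    m = sum(abs(xj) for xj in x) / d
    v = math.sqrt(sum((xj - m) ** 2 for xj in x) / d)
    return m, v

xs = [k / 10 for k in range(11)]           # x from 0.0 to 1.0
curve = [(x, math.cos(x)) for x in xs]     # nonlinear 2-d data
mv = [mv_point(p) for p in curve]

# Slopes between consecutive MV points; a straight line would give equal slopes.
slopes = [(v2 - v1) / (m2 - m1) for (m1, v1), (m2, v2) in zip(mv, mv[1:])]
print(min(slopes), max(slopes))
```

The slopes differ widely, so the MV image is a curve rather than a line, which is how a nonlinear structure would stand out from a linear one in the plot.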

Page 12: A Plot for Visualizing Multivariate Data

Detecting Sphere(s)

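Case I:
• The sphere has radius R.
• The sphere center is the origin.

Taking m_i = \frac{1}{d}\sum_{j=1}^{d} x_{ij} (no absolute values):

$$ v_i^2 = \frac{1}{d}\sum_{j=1}^{d} (x_{ij} - m_i)^2 = \frac{1}{d}\sum_{j=1}^{d} x_{ij}^2 - m_i^2 $$

$$ v_i^2 + m_i^2 = \frac{R^2}{d}. $$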
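Case I can be checked numerically with random points on a sphere. The sketch below is an illustration under stated assumptions: the dimension and radius are made up, `random_sphere_point` is a hypothetical helper, m is taken as the signed mean of the coordinates (a sphere centered at the origin has negative coordinates, so absolute values are not used), and v^2 is the mean squared deviation from m. Under these assumptions v^2 + m^2 should equal R^2/d for every point.

```python
import math
import random

random.seed(2)
d, R = 6, 3.0     # made-up dimension and radius

def random_sphere_point(d, R):
    """Uniform random point on the sphere of radius R centered at the origin."""
    g = [random.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(gj * gj for gj in g))
    return [R * gj / norm for gj in g]

max_err = 0.0
for _ in range(500):
    x = random_sphere_point(d, R)
    m = sum(x) / d                                # signed mean (assumption)
    v2 = sum((xj - m) ** 2 for xj in x) / d       # v squared
    max_err = max(max_err, abs(v2 + m * m - R * R / d))

print(max_err)  # should be at floating-point rounding level
```

In MV-space the points therefore fall on a circle of radius R divided by the square root of d.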

Page 13: A Plot for Visualizing Multivariate Data

Detecting Sphere(s)

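Case II:
• The sphere has radius R.
• The sphere center x^c = (x_1^c, …, x_d^c) is not the origin.

$$ v_i^2 = \frac{1}{d}\sum_{j=1}^{d} \left( (x_{ij} - x_j^c) + (x_j^c - m_i) \right)^2 $$

[Equation: the slide expands this sum, using \sum_{j=1}^{d} (x_{ij} - x_j^c)^2 = R^2, into a relation between v_i^2, m_i^2 and R^2/d; the expanded terms are garbled in the transcript.]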

Page 14: A Plot for Visualizing Multivariate Data

Detecting Sphere(s)

Page 15: A Plot for Visualizing Multivariate Data

Application on Real Data

• Fisher's Iris data (150 × 4): 3 classes of 50 points each
• Process control data (600 × 60): 6 classes of 100 points each
• Pollen data (3,848 × 5) (Wegman's data): 2 classes (linear and nonlinear)

Page 16: A Plot for Visualizing Multivariate Data

Related Dimensional Reduction Methods

• Multidimensional Scaling
• Fisher Discriminant Analysis
• Principal Component Analysis

Page 17: A Plot for Visualizing Multivariate Data

Iris (R. A. Fisher) dataset: 150 cases in 4 dimensions

Page 18: A Plot for Visualizing Multivariate Data

Time series dataset: 600 cases in 60 dimensions

Page 19: A Plot for Visualizing Multivariate Data

Pollen dataset: 3,848 points in 5 dimensions (Wegman 2002)

Other methods:
• Require more storage and computing time.
• Even if they work, we expect poor results on this particular data.
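The storage remark can be made concrete with a rough count. The sketch below is an estimate under stated assumptions, not a measurement from the talk: it assumes the competing method keeps a full n × n matrix of 8-byte floats (as a pairwise distance matrix for multidimensional scaling would), while the MV-plot keeps one m and one v per point.

```python
n = 3848             # number of points in the Pollen data
bytes_per_float = 8  # assuming 64-bit floats

pairwise_bytes = n * n * bytes_per_float   # full n x n distance matrix
mv_bytes = 2 * n * bytes_per_float         # one m and one v per point

print(pairwise_bytes)              # 118456832 bytes, about 118 MB
print(mv_bytes)                    # 61568 bytes, about 62 KB
print(pairwise_bytes // mv_bytes)  # 1924
```

The ratio is n/2, so the gap widens linearly as the data set grows.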

Page 20: A Plot for Visualizing Multivariate Data

Pollen dataset

Linear and Nonlinear mixed structures.

Page 21: A Plot for Visualizing Multivariate Data

The linear structure in the Pollen data set

17 + 16 + 18 + 17 + 14 + 16 = 98 linear points, 3,750 nonlinear points
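The counts on this slide can be checked against the size of the Pollen data with a short calculation. The variable names are illustrative only.

```python
linear_counts = [17, 16, 18, 17, 14, 16]   # the six counts listed on the slide
linear_total = sum(linear_counts)
nonlinear_total = 3750

print(linear_total)                    # 98
print(linear_total + nonlinear_total)  # 3848, the number of points in the Pollen data
```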

Page 22: A Plot for Visualizing Multivariate Data

Summary

• The MV-algorithm can discover linear and nonlinear patterns at the same time.
• The MV-algorithm can discover symmetric data.
• The MV-algorithm deals with large multivariate data.