![Page 1: Visualization and Data Mining](https://reader036.vdocument.in/reader036/viewer/2022062411/568155f6550346895dc3bdd5/html5/thumbnails/1.jpg)
Visualization and Data Mining
![Page 2: Visualization and Data Mining](https://reader036.vdocument.in/reader036/viewer/2022062411/568155f6550346895dc3bdd5/html5/thumbnails/2.jpg)
Napoleon's Invasion of Russia, 1812
![Page 3: Visualization and Data Mining](https://reader036.vdocument.in/reader036/viewer/2022062411/568155f6550346895dc3bdd5/html5/thumbnails/3.jpg)
Marey, 1885
![Page 4: Visualization and Data Mining](https://reader036.vdocument.in/reader036/viewer/2022062411/568155f6550346895dc3bdd5/html5/thumbnails/4.jpg)
![Page 5: Visualization and Data Mining](https://reader036.vdocument.in/reader036/viewer/2022062411/568155f6550346895dc3bdd5/html5/thumbnails/5.jpg)
Snow’s Cholera Map, 1855
![Page 6: Visualization and Data Mining](https://reader036.vdocument.in/reader036/viewer/2022062411/568155f6550346895dc3bdd5/html5/thumbnails/6.jpg)
Asia at night
![Page 7: Visualization and Data Mining](https://reader036.vdocument.in/reader036/viewer/2022062411/568155f6550346895dc3bdd5/html5/thumbnails/7.jpg)
South and North Korea at night
• Seoul, South Korea
• North Korea: notice how dark it is
![Page 8: Visualization and Data Mining](https://reader036.vdocument.in/reader036/viewer/2022062411/568155f6550346895dc3bdd5/html5/thumbnails/8.jpg)
Visualization Role
• Support interactive exploration
• Help in result presentation
• Disadvantage: requires human eyes
• Can be misleading
![Page 9: Visualization and Data Mining](https://reader036.vdocument.in/reader036/viewer/2022062411/568155f6550346895dc3bdd5/html5/thumbnails/9.jpg)
Bad Visualization: Spreadsheet with misleading Y-axis

| Year | Sales |
|------|-------|
| 1999 | 2110 |
| 2000 | 2105 |
| 2001 | 2120 |
| 2002 | 2121 |
| 2003 | 2124 |

[Chart: Sales by year, 1999 to 2003, with the Y-axis running from 2095 to 2130]

Y-axis scale gives the WRONG impression of a big change
![Page 10: Visualization and Data Mining](https://reader036.vdocument.in/reader036/viewer/2022062411/568155f6550346895dc3bdd5/html5/thumbnails/10.jpg)
Better Visualization

| Year | Sales |
|------|-------|
| 1999 | 2110 |
| 2000 | 2105 |
| 2001 | 2120 |
| 2002 | 2121 |
| 2003 | 2124 |

[Chart: Sales by year, 1999 to 2003, with the Y-axis running from 0 to 3000]

A Y-axis that starts at 0 gives the correct impression of a small change
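The effect of the truncated axis can be checked with a little arithmetic. The sketch below is only an illustration: it uses the Sales figures from the two slides above and assumes the chart draws each bar from the bottom of the Y-axis, so the visible bar height is the value minus the axis baseline. The function name `relative_change` is made up for this example.

```python
# Sales figures taken from the table on the two slides above.
sales = {1999: 2110, 2000: 2105, 2001: 2120, 2002: 2121, 2003: 2124}


def relative_change(first, last, baseline=0.0):
    """Relative change in bar height when bars are drawn up from `baseline`."""
    first_height = first - baseline
    last_height = last - baseline
    return (last_height - first_height) / first_height


# Bars drawn from 0: the visual change equals the change in the data.
actual = relative_change(sales[1999], sales[2003])

# Bars drawn from 2095 (the bottom of the misleading axis).
truncated = relative_change(sales[1999], sales[2003], baseline=2095)

print(f"change in the data:       {actual:.1%}")
print(f"change seen on the chart: {truncated:.1%}")
print(f"exaggeration:             {truncated / actual:.0f}x")
```

With the axis starting at 2095, a change of well under 1% in the data is drawn as a bar that nearly doubles in height.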
![Page 11: Visualization and Data Mining](https://reader036.vdocument.in/reader036/viewer/2022062411/568155f6550346895dc3bdd5/html5/thumbnails/11.jpg)
Lie Factor = 14.8
(E.R. Tufte, “The Visual Display of Quantitative Information”, 2nd edition)
![Page 12: Visualization and Data Mining](https://reader036.vdocument.in/reader036/viewer/2022062411/568155f6550346895dc3bdd5/html5/thumbnails/12.jpg)
Lie Factor

$$
\text{Lie Factor} = \frac{\text{size of effect shown in graphic}}{\text{size of effect in data}}
= \frac{(5.3 - 0.6)/0.6}{(27.5 - 18.0)/18.0}
= \frac{7.833}{0.528} = 14.8
$$

Tufte requirement: 0.95 < Lie Factor < 1.05
(E.R. Tufte, “The Visual Display of Quantitative Information”, 2nd edition)
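The Lie Factor calculation can be sketched in a few lines of Python. The four numbers (0.6 and 5.3 for the graphic, 18.0 and 27.5 for the data) are the ones in the slide's formula. The function names are assumptions made for this illustration and do not come from Tufte's book.

```python
def effect_size(first, last):
    """Relative change from `first` to `last`."""
    return (last - first) / first


def lie_factor(graphic_first, graphic_last, data_first, data_last):
    """Size of effect shown in the graphic divided by size of effect in the data."""
    return effect_size(graphic_first, graphic_last) / effect_size(data_first, data_last)


def meets_tufte_requirement(factor):
    """Tufte's requirement as stated on the slide: 0.95 < Lie Factor < 1.05."""
    return 0.95 < factor < 1.05


example = lie_factor(0.6, 5.3, 18.0, 27.5)
print(f"Lie Factor = {example:.1f}")
print("meets requirement:", meets_tufte_requirement(example))
```

A graphic that represents the data faithfully has a Lie Factor of 1, because the effect it shows is the same size as the effect in the data.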
![Page 13: Visualization and Data Mining](https://reader036.vdocument.in/reader036/viewer/2022062411/568155f6550346895dc3bdd5/html5/thumbnails/13.jpg)
Tufte’s Principles of Graphical Excellence
• Give the viewer
  • the greatest number of ideas
  • in the shortest time
  • with the least ink
  • in the smallest space.
• Tell the truth about the data!
(E.R. Tufte, “The Visual Display of Quantitative Information”, 2nd edition)
![Page 14: Visualization and Data Mining](https://reader036.vdocument.in/reader036/viewer/2022062411/568155f6550346895dc3bdd5/html5/thumbnails/14.jpg)
Visualization Methods
• Visualizing in 1-D, 2-D and 3-D
  • well-known visualization methods
• Visualizing more dimensions
  • Parallel Coordinates
  • Other ideas
![Page 15: Visualization and Data Mining](https://reader036.vdocument.in/reader036/viewer/2022062411/568155f6550346895dc3bdd5/html5/thumbnails/15.jpg)
1-D (Univariate) Data Representations

[Figure: histogram]
[Figure: Tukey box plot marking low, high, the middle 50%, and the mean]
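Both 1-D representations reduce to simple summaries of a single variable. The sketch below computes the numbers behind a box plot (low, quartiles, high) and the bin counts behind a histogram. The `sample` values are made up for illustration and are not data from the slides. The quartiles use the default method of Python's `statistics.quantiles`, which is one of several common conventions.

```python
from statistics import quantiles


def five_number_summary(values):
    """Low, lower quartile, median, upper quartile, and high of a sample."""
    ordered = sorted(values)
    q1, median, q3 = quantiles(ordered, n=4)
    return {"low": ordered[0], "q1": q1, "median": median,
            "q3": q3, "high": ordered[-1]}


def histogram(values, bins, lo, hi):
    """Counts per equal-width bin over [lo, hi]; assumes lo <= value <= hi."""
    width = (hi - lo) / bins
    counts = [0] * bins
    for value in values:
        # A value equal to `hi` is placed in the last bin.
        index = min(int((value - lo) / width), bins - 1)
        counts[index] += 1
    return counts


# Hypothetical sample, for illustration only.
sample = [1, 2, 3, 4, 5, 6, 7, 8, 9]

summary = five_number_summary(sample)
print(summary)
print(histogram(sample, bins=4, lo=0, hi=20))
```

The box of the box plot spans `q1` to `q3`, which is the "middle 50%" labelled on the slide.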
![Page 16: Visualization and Data Mining](https://reader036.vdocument.in/reader036/viewer/2022062411/568155f6550346895dc3bdd5/html5/thumbnails/16.jpg)
2-D (Bivariate) Data
• Scatter plot, …

[Figure: scatter plot of price against mileage]
![Page 17: Visualization and Data Mining](https://reader036.vdocument.in/reader036/viewer/2022062411/568155f6550346895dc3bdd5/html5/thumbnails/17.jpg)
3-D Data (projection)

[Figure: 3-D projection; one axis is price]
![Page 18: Visualization and Data Mining](https://reader036.vdocument.in/reader036/viewer/2022062411/568155f6550346895dc3bdd5/html5/thumbnails/18.jpg)
3-D image (requires 3-D blue and red glasses)
Taken by Mars Rover Spirit, Jan 2004
![Page 19: Visualization and Data Mining](https://reader036.vdocument.in/reader036/viewer/2022062411/568155f6550346895dc3bdd5/html5/thumbnails/19.jpg)
Visualizing in 4+ Dimensions
• Scatterplots
• Parallel Coordinates
• Chernoff faces
• …
![Page 20: Visualization and Data Mining](https://reader036.vdocument.in/reader036/viewer/2022062411/568155f6550346895dc3bdd5/html5/thumbnails/20.jpg)
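Multiple Views

Give each variable its own display

|   | A | B | C | D | E |
|---|---|---|---|---|---|
| 1 | 4 | 1 | 8 | 3 | 5 |
| 2 | 6 | 3 | 4 | 2 | 1 |
| 3 | 5 | 7 | 2 | 4 | 3 |
| 4 | 2 | 6 | 3 | 1 | 5 |

[Figure: one display per variable (A to E), each showing records 1 to 4]

Problem: does not show correlations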
![Page 21: Visualization and Data Mining](https://reader036.vdocument.in/reader036/viewer/2022062411/568155f6550346895dc3bdd5/html5/thumbnails/21.jpg)
Scatterplot Matrix
Represent each possible pair of variables in their own 2-D scatterplot (car data)
Q: Useful for what? A: linear correlations (e.g. horsepower & weight)
Q: Misses what? A: multivariate effects
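Each panel of a scatterplot matrix shows one pair of variables, so the linear correlations it reveals can also be listed numerically. The sketch below computes the Pearson correlation for every pair. The car data is not reproduced in the slides, so it assumes the small A to E table from the Multiple Views slide as input. With only four records the coefficients are purely illustrative.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Columns of the A to E table from the Multiple Views slide.
columns = {
    "A": [4, 6, 5, 2],
    "B": [1, 3, 7, 6],
    "C": [8, 4, 2, 3],
    "D": [3, 2, 4, 1],
    "E": [5, 1, 3, 5],
}

# One coefficient per panel of the scatterplot matrix.
pairwise = {(a, b): pearson(columns[a], columns[b])
            for a, b in combinations(columns, 2)}

for (a, b), r in sorted(pairwise.items(), key=lambda item: -abs(item[1])):
    print(f"{a} vs {b}: r = {r:+.2f}")
```

Like the scatterplot matrix itself, this looks at two variables at a time, so it cannot show effects that only appear when three or more variables are considered together.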
![Page 22: Visualization and Data Mining](https://reader036.vdocument.in/reader036/viewer/2022062411/568155f6550346895dc3bdd5/html5/thumbnails/22.jpg)
Parallel Coordinates
• Encode variables along a horizontal row
• Vertical line specifies values

[Figure: a dataset in Cartesian coordinates]
[Figure: the same dataset in parallel coordinates]

Invented by Alfred Inselberg while at IBM, 1985
![Page 23: Visualization and Data Mining](https://reader036.vdocument.in/reader036/viewer/2022062411/568155f6550346895dc3bdd5/html5/thumbnails/23.jpg)
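Example: Visualizing Iris Data

| sepal length | sepal width | petal length | petal width |
|--------------|-------------|--------------|-------------|
| 5.1 | 3.5 | 1.4 | 0.2 |
| 4.9 | 3 | 1.4 | 0.2 |
| ... | ... | ... | ... |
| 5.9 | 3 | 5.1 | 1.8 |

[Figures: Iris setosa, Iris versicolor, Iris virginica]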
![Page 24: Visualization and Data Mining](https://reader036.vdocument.in/reader036/viewer/2022062411/568155f6550346895dc3bdd5/html5/thumbnails/24.jpg)
Flower Parts
• Petal, a non-reproductive part of the flower
• Sepal, a non-reproductive part of the flower
![Page 25: Visualization and Data Mining](https://reader036.vdocument.in/reader036/viewer/2022062411/568155f6550346895dc3bdd5/html5/thumbnails/25.jpg)
Parallel Coordinates

[Figure: one vertical axis, Sepal Length, with the value 5.1 marked]

Data row (sepal length, sepal width, petal length, petal width): 5.1, 3.5, 1.4, 0.2
![Page 26: Visualization and Data Mining](https://reader036.vdocument.in/reader036/viewer/2022062411/568155f6550346895dc3bdd5/html5/thumbnails/26.jpg)
Parallel Coordinates: 2-D

[Figure: two vertical axes, Sepal Length (5.1) and Sepal Width (3.5), joined by a line for the data row 5.1, 3.5, 1.4, 0.2]
![Page 27: Visualization and Data Mining](https://reader036.vdocument.in/reader036/viewer/2022062411/568155f6550346895dc3bdd5/html5/thumbnails/27.jpg)
Parallel Coordinates: 4-D

[Figure: four vertical axes, Sepal Length (5.1), Sepal Width (3.5), Petal Length (1.4), and Petal Width (0.2), joined by one line for the data row 5.1, 3.5, 1.4, 0.2]
![Page 28: Visualization and Data Mining](https://reader036.vdocument.in/reader036/viewer/2022062411/568155f6550346895dc3bdd5/html5/thumbnails/28.jpg)
Parallel Visualization of Iris data

[Figure: parallel-coordinates plot of the Iris data, with the example row (5.1, 3.5, 1.4, 0.2) marked]
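The step from a data row to a line in the plot can be sketched as follows. This is a minimal illustration, assuming each axis is scaled independently from the smallest to the largest value seen on it. It uses only the three Iris rows printed on the example slide, not the full dataset. The function names are chosen for this example.

```python
AXES = ["sepal length", "sepal width", "petal length", "petal width"]

# The three rows shown on the "Visualizing Iris Data" slide.
rows = [
    (5.1, 3.5, 1.4, 0.2),
    (4.9, 3.0, 1.4, 0.2),
    (5.9, 3.0, 5.1, 1.8),
]


def axis_ranges(rows):
    """(min, max) of each attribute, used to scale its vertical axis."""
    return [(min(column), max(column)) for column in zip(*rows)]


def to_polyline(row, ranges):
    """Vertices (axis index, height in [0, 1]) of the line for one data row."""
    points = []
    for axis_index, (value, (lo, hi)) in enumerate(zip(row, ranges)):
        # A constant attribute has no spread; place it mid-axis.
        height = 0.5 if hi == lo else (value - lo) / (hi - lo)
        points.append((axis_index, height))
    return points


ranges = axis_ranges(rows)
for row in rows:
    print(row, "->", to_polyline(row, ranges))
```

Drawing each list of vertices as a connected line, with axis `i` placed at horizontal position `i`, gives one line per flower.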
![Page 29: Visualization and Data Mining](https://reader036.vdocument.in/reader036/viewer/2022062411/568155f6550346895dc3bdd5/html5/thumbnails/29.jpg)
Parallel Visualization Summary
• Each data point is a line
• Similar points correspond to similar lines
• Lines crossing over correspond to negatively correlated attributes
• Interactive exploration and clustering
• Problems: order of axes, limit to ~20 dimensions
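The link between crossing lines and negative correlation can be made concrete. Between two adjacent axes drawn in the same direction, the lines of two data points cross exactly when the points are ordered one way on the left axis and the opposite way on the right axis. The sketch below counts such crossings. The two small datasets are hypothetical and chosen only to show the extreme cases.

```python
from itertools import combinations


def count_crossings(left, right):
    """Number of line pairs that cross between two adjacent parallel axes."""
    crossings = 0
    for i, j in combinations(range(len(left)), 2):
        # Opposite ordering on the two axes means the two lines cross.
        if (left[i] - left[j]) * (right[i] - right[j]) < 0:
            crossings += 1
    return crossings


# Hypothetical values on the left axis and two candidate right axes.
left = [1, 2, 3, 4]
negatively_correlated = [8, 6, 4, 2]
positively_correlated = [2, 4, 6, 8]

print(count_crossings(left, negatively_correlated))  # every pair of lines crosses
print(count_crossings(left, positively_correlated))  # no pair crosses
```

This also shows why the order of axes matters: crossings are only visible between neighbouring axes, so a relationship between two attributes placed far apart is hard to see.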
![Page 30: Visualization and Data Mining](https://reader036.vdocument.in/reader036/viewer/2022062411/568155f6550346895dc3bdd5/html5/thumbnails/30.jpg)
Chernoff Faces
Encode different variables’ values in characteristics of a human face

Cute applets:
http://www.cs.uchicago.edu/~wiseman/chernoff/
http://hesketh.com/schampeo/projects/Faces/chernoff.html
![Page 31: Visualization and Data Mining](https://reader036.vdocument.in/reader036/viewer/2022062411/568155f6550346895dc3bdd5/html5/thumbnails/31.jpg)
Interactive Face
![Page 32: Visualization and Data Mining](https://reader036.vdocument.in/reader036/viewer/2022062411/568155f6550346895dc3bdd5/html5/thumbnails/32.jpg)
Chernoff faces, example
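The encoding step behind a Chernoff face can be sketched as a mapping from variables to drawing parameters. Everything specific here is an assumption for illustration: the feature names, their numeric ranges, and the assignment of Iris attributes to features are hypothetical and are not taken from the slides or from Chernoff's original scheme. The data are the three Iris rows shown earlier in the deck.

```python
# Hypothetical facial features and the range each drawing parameter may take.
FEATURE_RANGES = {
    "face_width": (0.6, 1.0),
    "eye_size": (0.1, 0.3),
    "mouth_curvature": (-1.0, 1.0),  # -1 is a frown, +1 is a smile
    "nose_length": (0.1, 0.4),
}

# Iris rows: sepal length, sepal width, petal length, petal width.
rows = [
    (5.1, 3.5, 1.4, 0.2),
    (4.9, 3.0, 1.4, 0.2),
    (5.9, 3.0, 5.1, 1.8),
]


def face_parameters(row, data_ranges):
    """Map each variable of one record onto one facial feature."""
    params = {}
    for (feature, (f_lo, f_hi)), value, (d_lo, d_hi) in zip(
            FEATURE_RANGES.items(), row, data_ranges):
        # Position of the value within its observed range, from 0 to 1.
        t = 0.5 if d_hi == d_lo else (value - d_lo) / (d_hi - d_lo)
        params[feature] = f_lo + t * (f_hi - f_lo)
    return params


data_ranges = [(min(column), max(column)) for column in zip(*rows)]
for row in rows:
    print(row, "->", face_parameters(row, data_ranges))
```

Records with similar values then produce similar-looking faces, which is what lets a viewer group them at a glance.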
![Page 33: Visualization and Data Mining](https://reader036.vdocument.in/reader036/viewer/2022062411/568155f6550346895dc3bdd5/html5/thumbnails/33.jpg)
Visualization Summary
• Many methods
• Visualization is possible in more than 3-D
• Aim for graphical excellence