© S. Cha Fall/2002
CSIS
Fall of 2002
Prof. Sung-Hyuk Cha
School of Computer Science & Information Systems
Computer Vision
Artificial Intelligence
Perception
Lena & Computer vision
Machine Vision
Pattern Recognition Applications
Iris authentication
Face Recognition System
Each person has a different face.
Figure: a query face (?) is matched against the face DB.
Head Pose Recognition: left, straight, right, up
Complex Pattern Recognition Applications
Address: Sargur N. Srihari, 520 Lee Entrance STE 202, Amherst NY 14228-2583
f1  city name                   Amherst
f2  state abbr.                 NY
f3  5-digit ZIP Code            14228
f4  4-digit ZIP+4 add-on        2583
f5  primary number              520
f6  street name                 Lee Entrance
f7  secondary designator abbr.  STE
f8  secondary number            202
• Delivery point: 142282583
Speech Recognition System
Applications
Input devices: biomouse fingerprint scanner, digital camera, LCD pen tablet, microphone, vital sign monitor.
Each device produces a feature vector $x = (f^x_1, f^x_2, \ldots, f^x_d)$.
Measurements
Truth     Features (brightness, length)
Salmon1   (12, 16)
Salmon2   (11, 20)
Bass1     (7, 6)
Bass2     (3, 4)
Decision theory (cost)
Distributions and Errors
Figure: bass and salmon distributions split by a decision boundary; the two error regions are bass identified as salmon and salmon identified as bass.
Parametric Univariate Dichotomizer
          (a) length   (b) lightness   (c) width
Type I    9 %          7 %             5 %
Type II   39 %         27 %            26 %
Multivariate Analysis
Nearest Neighbor Classifier
Figure: salmon and bass samples plotted by length and brightness; is the query point (?) a salmon?
Nearest Neighbor Classifier
• Too slow for interactive use: users must wait for the output.
• Performance is evaluated by using a testing set.
Query: q = (4, 6)
Reference set R = { r1 = (5, 7), r2 = (3, 8), r3 = (10, 16), r4 = (12, 14), ..., rn = (14, 15) }
Testing set T = { t1 = (4, 6), t2 = (2, 5), t3 = (11, 17), t4 = (14, 12), ..., tn = (14, 17) }
Every vector in R and T carries a class label, Salmon or Bass; the classifier's outputs on T are compared with those labels.
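A minimal sketch of the nearest neighbor rule, assuming Euclidean distance (the slides do not name a distance) and using the four labeled vectors from the Measurements slide as the reference set. The names `REFERENCE_SET` and `nearest_neighbor` are illustrative only.

```python
import math

# Labeled (brightness, length) vectors from the Measurements slide.
REFERENCE_SET = [
    ((12, 16), "Salmon"),
    ((11, 20), "Salmon"),
    ((7, 6), "Bass"),
    ((3, 4), "Bass"),
]

def nearest_neighbor(query, reference_set):
    """Return the label of the reference vector closest to the query.

    Assumption: closeness is Euclidean distance.
    """
    _, best_label = min(
        reference_set, key=lambda item: math.dist(query, item[0])
    )
    return best_label

q = (4, 6)  # the query point shown on the slide
print(nearest_neighbor(q, REFERENCE_SET))
```

Every query scans the whole reference set, which is one way to see why the slide calls the method slow when the set is large.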
Machine Learning (Linear function)
Figure: salmon and bass samples plotted by length and brightness, separated by the line Y = aX + b.
Y > aX + b: ? = salmon
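The rule Y > aX + b can be sketched as a one-line classifier. The slope a = -1 and intercept b = 20 below are hypothetical values picked by hand so that the four vectors from the Measurements slide are separated; the slides do not give a or b, and no learning procedure is shown here.

```python
def classify_linear(x, y, a=-1.0, b=20.0):
    """Label a point Salmon when it lies above the line Y = aX + b.

    Assumption: a and b are hand-picked, not learned.
    """
    return "Salmon" if y > a * x + b else "Bass"

# (brightness, length) vectors from the Measurements slide.
samples = [
    ((12, 16), "Salmon"),
    ((11, 20), "Salmon"),
    ((7, 6), "Bass"),
    ((3, 4), "Bass"),
]
predictions = [classify_linear(x, y) for (x, y), _ in samples]
print(predictions)
```

Only a and b are needed at classification time, so no stored samples have to be searched.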
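Artificial Neural Network
The biological neuron: synapse, nucleus, axon, dendrites.
The artificial neuron: inputs x1(t), x2(t), ..., xn(t) are multiplied by weights w1, w2, ..., wn and summed (Σ) with w0 to give the activation a(t); the output y = f(a) is emitted as O(t+1).
$a(t) = w_0 + \sum_{i=1}^{n} w_i x_i(t), \quad y = f(a)$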
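A small sketch of the artificial neuron: a weighted sum of the inputs plus w0, passed through f. The slide leaves f unspecified; a sigmoid is assumed here, and the input and weight values are made up for illustration.

```python
import math

def neuron_output(inputs, weights, w0):
    """Compute y = f(a) with a = w0 + sum(w_i * x_i).

    Assumption: f is the logistic sigmoid.
    """
    a = w0 + sum(w * x for w, x in zip(weights, inputs))
    return 1.0 / (1.0 + math.exp(-a))

# Hypothetical inputs and weights; here a = 0.4 - 0.4 = 0.
y = neuron_output([1.0, 0.5], [0.4, -0.8], w0=0.0)
print(y)
```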
Machine Learning (Linear function)
Y > aX + b
• extremely fast.
• Performance is not as good as the NN classifier’s.
• No need to load the training data during the classification.
• Performance is evaluated by using a testing set.
Training set R = { r1 = (5, 7), r2 = (3, 8), r3 = (10, 16), r4 = (12, 14), ..., rn = (14, 15) }
Testing set T = { t1 = (4, 6), t2 = (2, 5), t3 = (11, 17), t4 = (14, 12), ..., tn = (14, 17) }
Non-Linear case
Figure: salmon and bass samples plotted by length and brightness; a single line Y > aX + b does not separate the two classes.
• NN is better.
• We will learn the artificial neural network, which is a non-linear function.
Human Brain
Artificial Neural Network
Fully connected, feed-forward, back-propagation multi-layer artificial neural network (ANN) (11-6-1).
Figure: features f1, f2, ..., f7 enter the network, which outputs the Class.
Purpose of Pattern Recognition
• Predict unseen future instances.
• Generalization.
• Inductive step.
Generalizability (statistical inference)
Figure: length vs. width scatter plots for the training set, the validating set, and the universe.
Inferential Statistics
1. Inferential statistics is inferring a conclusion about a population of interest from a sample.
   - need a procedure for sampling the population.
   - need a measure of reliability for the inference.
2. If the error rate in a random sample set is the same as in the universe, then the procedure is a sound inferential statistical procedure.
3. If the error rate in one random sample set is the same as in another random sample set, then the procedure is sound.
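Point 3 can be sketched by measuring one classifier's error rate on two labeled samples and comparing the results. Everything concrete below is an assumption for illustration: the linear rule, the contents of `sample_2`, and the idea of reporting the gap as a plain difference.

```python
def error_rate(classifier, labeled_sample):
    """Fraction of labeled vectors the classifier gets wrong."""
    wrong = sum(1 for vector, truth in labeled_sample if classifier(vector) != truth)
    return wrong / len(labeled_sample)

def linear_rule(vector):
    # Hypothetical rule: Salmon when length > -brightness + 20.
    brightness, length = vector
    return "Salmon" if length > -brightness + 20 else "Bass"

# sample_1 reuses the vectors from the Measurements slide.
sample_1 = [((12, 16), "Salmon"), ((11, 20), "Salmon"), ((7, 6), "Bass"), ((3, 4), "Bass")]
# sample_2 is made up for this sketch.
sample_2 = [((10, 16), "Salmon"), ((14, 15), "Salmon"), ((5, 7), "Bass"), ((9, 12), "Bass")]

e1 = error_rate(linear_rule, sample_1)
e2 = error_rate(linear_rule, sample_2)
print(e1, e2, abs(e1 - e2))
```

A small gap between the two error rates is the behavior the slide describes as sound; with samples this tiny the numbers are only a demonstration of the bookkeeping.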
Generalization
Figure: the universe plotted on axes δf1 and δf2.
Sampling & learning
Figure: Sample 1 plotted on axes δf1 and δf2.
Testing on another sample
Figure: Sample 2 plotted on axes δf1 and δf2.
Generalization
Figure: the universe plotted on axes δf1 and δf2.
Multiple classification
Figure: three classes (class 1, class 2, class 3) plotted on axes f1 and f2.
Classes = {class 1, 2, 3}
Template for PR Applications
1. Data acquisition:
   a. Recruit subjects.
   b. Modality interface (scanning, picturing, recording, etc.).
2. Feature Extraction:
   a. Raw data to feature vectors.
   b. Involves image/voice/signal processing techniques.
3. Training a classifier:
   a. Design a classifier (e.g., ANN).
   b. Enter the training (& validating) feature vector set(s).
4. Classification system:
   a. Embed the ANN engine in your actual program (Java/C).
   b. User interface for the final product.
Further Pattern Recognition
• Fast Nearest Neighbor Search Algorithms
• Decision Tree
• Statistical Pattern Recognition.
• Artificial Neural Network.
• Clustering
• etc.
http://www.csis.pace.edu/~scha/PR
Decision Tree
outlook   temperature  humidity  windy  play
sunny     hot          high      false  no
sunny     hot          high      true   no
overcast  hot          high      false  yes
rainy     mild         high      false  yes
rainy     cool         normal    false  yes
rainy     cool         normal    true   no
overcast  cool         normal    true   yes
sunny     mild         high      false  no
sunny     cool         normal    false  yes
rainy     mild         normal    false  yes
sunny     mild         normal    true   yes
overcast  mild         high      true   yes
overcast  hot          normal    false  yes
rainy     mild         high      true   no
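One common way to pick the root of a decision tree for this table is to choose the attribute with the largest information gain (the ID3-style criterion). The slide does not say which criterion is used, so this is an assumption; the sketch only scores the four attributes and does not build the tree.

```python
import math
from collections import Counter

ATTRIBUTES = ["outlook", "temperature", "humidity", "windy"]
ROWS = [line.split() for line in """
sunny hot high false no
sunny hot high true no
overcast hot high false yes
rainy mild high false yes
rainy cool normal false yes
rainy cool normal true no
overcast cool normal true yes
sunny mild high false no
sunny cool normal false yes
rainy mild normal false yes
sunny mild normal true yes
overcast mild high true yes
overcast hot normal false yes
rainy mild high true no
""".strip().splitlines()]

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def information_gain(rows, column):
    """Entropy of 'play' minus the weighted entropy after splitting on one column."""
    labels = [row[-1] for row in rows]
    remainder = 0.0
    for value in {row[column] for row in rows}:
        subset = [row[-1] for row in rows if row[column] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return entropy(labels) - remainder

gains = {name: information_gain(ROWS, i) for i, name in enumerate(ATTRIBUTES)}
best = max(gains, key=gains.get)
print(best, gains)
```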
Clustering
Figure: four ways of representing clusters of the items a to k.
(a), (b) the items grouped into clusters
(c) cluster membership table:
      1    2    3
  a   0.4  0.1  0.5
  b   0.1  0.8  0.1
  c   0.3  0.3  0.4
  ...
(d) tree with leaf order g a c i e d k b j f h
Terminology
Classification:
The process of assigning one of a limited set of alternative interpretations to (the generator of) a set of data. Often requires the steps of the computation of relative probabilities (or a quantity related to them) followed by the application of a decision rule. All classification processes can be evaluated in terms of "detection" and "mis-classification" rates.
• Computer Vision: Computer Vision is the subject area which deals with the automatic analysis of images for the purposes of quantification or system control (often mimicking tasks which humans find trivial). It is to be distinguished from "Image Processing", which deals only with the computational processes applied to images, including enhancement and compression, but does not deal with abstract representation for the purposes of reasoning and interpretation. Computer Vision can be seen as the inverse of Computer Graphics, though generally the representations and methods of that area are not of use in Computer Vision due to the incomplete and therefore ambiguous nature of images. This requires prior knowledge to be used in order to obtain robust scene interpretation.
• Machine Vision: Like "computer vision" but generally more closely associated with its use in robotics.
• Pattern Recognition: Pattern recognition is the process of assigning a pattern classification to a particular set of measurements, normally represented as a high dimensional vector. This is normally done within the context of "probability theory", whereby a particular set of assumptions regarding the expected statistical distribution of measurements is used to compute classification probabilities which can be used as the basis for a decision such as the "Bayes decision rule". There are several popular forms of classifier including "k-nearest neighbour", "Parzen windows", "mixture methods" and more recently "artificial neural networks".
• Images: An image is a two-dimensional spatial representation of a group of "objects" (or "scene") which exists in two or more dimensions. It is an intuitive way of presenting data for computer interfaces in the area of graphics, but in machine vision it may be defined as a continuous function of two variables defined within a bounded (generally rectangular) region.
• Histograms: A histogram is an array of non-negative integer counts from a set of data, which represents the frequency of occurrence of values within a set of non-overlapping regions.
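The histogram definition can be illustrated with a short sketch that counts values into equal-width, non-overlapping bins. The bin layout (four bins over 0 to 255) and the sample pixel values are assumptions chosen for the example.

```python
def histogram(values, num_bins, low, high):
    """Count how many values fall in each of num_bins equal-width bins on [low, high)."""
    counts = [0] * num_bins
    width = (high - low) / num_bins
    for v in values:
        if low <= v < high:
            # Clamp guards against floating-point rounding at the top edge.
            index = min(int((v - low) / width), num_bins - 1)
            counts[index] += 1
    return counts

pixels = [0, 10, 64, 100, 128, 130, 200, 255]  # made-up gray levels
counts = histogram(pixels, num_bins=4, low=0, high=256)
print(counts)
```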
Features & Class
dark  blob  hole  slant  width  skew  ht    pixel  hslope  nslope  pslope  vslope  class
int   int   int   real   int    real  int   int    int     int     int     int
.95   .49   .70   .71    .50    .10   .51   .92    .13     .47     .32     .21     S
.94   .49   .75   .70    .50    .11   .53   .84    .26     .54     .35     .18     S
.94   .49   .67   .74    .50    .10   .45   .85    .23     .48     .32     .22     S
.93   .72   .33   .47    .50    .21   .28   .30    .66     .60     .42     .10     S
.93   .74   .33   .48    .50    .22   .26   .30    .60     .59     .45     .10     S
.93   .79   .36   .54    .50    .18   .27   .32    .60     .59     .52     .09     S
.92   .30   .61   .66    .60    .11   .35   .49    .70     .71     .57     .10     B
.94   .42   .72   .66    .60    .11   .32   .49    .67     .74     .53     .10     B
.94   .40   .75   .67    .60    .12   .34   .49    .75     .70     .54     .11     B
.96   .30   .60   .59    .50    .10   .21   .30    .66     .60     .36     .10     B
.95   .32   .60   .59    .50    .09   .22   .30    .60     .59     .39     .10     B
.95   .30   .66   .60    .50    .10   .21   .32    .60     .59     .34     .09     B
Representation
Figure: three-dimensional plot with axes length, lightness and width, showing the points a, b and c.
a = (12,6,-5)
b = (16,9,10)
c = (19,7,-10)
Image Classification
Image Indexing & Retrieval
Query by Image Content
Figure: a query image (?) is compared with database images labeled "Acute myeloid leukemia".
Dissimilarity (distance) / Similarity
D(image 1, image 2) = ?
S(image 1, image 2) = ?
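The slide leaves D and S open. One possible pair of answers, assuming each image is summarized by a gray-level histogram, is the city-block (L1) distance for D and the normalized histogram intersection for S. Both choices and the two example histograms are assumptions of this sketch.

```python
def l1_distance(h1, h2):
    """Dissimilarity D: sum of absolute bin differences (0 means identical)."""
    return sum(abs(a - b) for a, b in zip(h1, h2))

def intersection_similarity(h1, h2):
    """Similarity S: overlap of the two histograms, normalized by the size of h1."""
    return sum(min(a, b) for a, b in zip(h1, h2)) / sum(h1)

# Hypothetical 4-bin histograms of two images with 8 pixels each.
hist_a = [2, 2, 2, 2]
hist_b = [4, 0, 2, 2]
print(l1_distance(hist_a, hist_b), intersection_similarity(hist_a, hist_b))
```

For retrieval, database images would be ranked by increasing D or decreasing S relative to the query histogram.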
Overview
• Image processing vs. computer vision
• Human vision & illusion.
• Basic Image Processing
• Machine Vision Applications.
• Histogram based Image Indexing & Retrieval.
Digital Image Processing vs. Computer Vision
• There is no clear distinction
• Image processing
  – Applications where humans are in the loop.
  – Humans supply the intelligence
  – Image Analysis - extracting quantitative info.
    • Size of a tumor
    • distance between objects
    • facial expression
  – Image restoration. Try to undo damage
    • needs a model of how the damage was made
  – Image enhancement. Try to improve the quality of an image
  – Image compression. How to convey the most information with the least amount of data
Digital Image Processing vs. Computer Vision
• Computer Vision
  – Take the human out of the loop
  – The computer supplies the intelligence
  – Where does the computer get its intelligence?
Human Vision
Cerebral Cortex
Human Vision
Monocular Visual Field: 160 deg (w) X 175 deg (h)
Binocular Visual Field: 200 deg (w) X 135 deg (h)
The Figure-Ground Problem
The Bunny/Duck illusion
Figure labels: Mouth, Mouth
More illusions: Squares or lines?
More illusions: How many colors?
More illusions: parallel line
More illusions
Photometry
Concerned with mechanisms for converting light energy into electrical energy.
World → Optics → Sensor → Signal → Digitizer → Digital Representation
Binary image
Figure: 7 × 7 grid of binary pixels indexed by i and j, each running from 1 to 7.
Grey image
Optics (Image Plane): Image L(x,y)
Video Camera: E(x,y), electrical video signal
A/D Converter and Sampler: I(i,j), Digital Image
Computer Memory: Grayscale Image Data, 22 34 22 0 18 ...
Color image
Optics (Image Plane): Image L(x,y)
Video Camera: E(x,y), electrical video signal
Red, Green and Blue Channel A/D Converters: R(i,j), G(i,j), B(i,j)
Computer Memory: Digital Image, 22 34 22 0 18 ...
HSL Color Space
Hue (color)
Saturation (white)
Lightness
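A brief sketch of converting an RGB pixel to hue, saturation and lightness using Python's standard `colorsys` module. Note that `colorsys.rgb_to_hls` returns the components in the order hue, lightness, saturation, each in the range 0 to 1; the wrapper name `rgb255_to_hsl` and the choice to report hue in degrees are assumptions of this example.

```python
import colorsys

def rgb255_to_hsl(r, g, b):
    """Convert 8-bit RGB to (hue in degrees, saturation, lightness)."""
    h, l, s = colorsys.rgb_to_hls(r / 255, g / 255, b / 255)
    return h * 360, s, l

red = rgb255_to_hsl(255, 0, 0)       # fully saturated red
gray = rgb255_to_hsl(128, 128, 128)  # no color, so saturation is zero
print(red, gray)
```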
Color
Contrast Stretching
Linear Stretching
Figure: transfer function mapping INPUT (0 to 255) to OUTPUT (0 to 255).
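Linear stretching can be sketched as mapping the darkest input level to 0 and the brightest to 255 along a straight line. Using the image's own minimum and maximum as the end points is an assumption; the slide only shows the input and output axes.

```python
def linear_stretch(pixels, out_min=0, out_max=255):
    """Map [min(pixels), max(pixels)] linearly onto [out_min, out_max]."""
    lo, hi = min(pixels), max(pixels)
    if hi == lo:
        # A flat image has no contrast to stretch.
        return [out_min] * len(pixels)
    return [
        round(out_min + (p - lo) * (out_max - out_min) / (hi - lo))
        for p in pixels
    ]

stretched = linear_stretch([50, 100, 150, 200])  # made-up gray levels
print(stretched)
```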
Histogram Equalization
Adjust peaks and plains
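A compact sketch of histogram equalization for a flat list of 8-bit gray levels, using the usual cumulative-histogram mapping. The slide does not give a formula, so the exact normalization below is an assumption.

```python
def equalize(pixels, levels=256):
    """Remap gray levels so the cumulative histogram becomes roughly linear."""
    counts = [0] * levels
    for p in pixels:
        counts[p] += 1
    cdf, running = [], 0
    for c in counts:
        running += c
        cdf.append(running)
    cdf_min = next(c for c in cdf if c > 0)  # cumulative count at the darkest used level
    total = len(pixels)
    if total == cdf_min:
        # Only one gray level is present; nothing to spread out.
        return list(pixels)
    return [
        round((cdf[p] - cdf_min) * (levels - 1) / (total - cdf_min))
        for p in pixels
    ]

equalized = equalize([100, 101, 102, 103])  # made-up, tightly clustered levels
print(equalized)
```

Levels that were crowded together in the input end up spread across the full 0 to 255 range.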
False Color
Warping
http://www.doctorwarp.com/index.php?ID=23&flx=world
Compression
Mosaics
Stereo
Stereo vision
Noise Removal
Figure labels: salt, pepper
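Salt (white) and pepper (black) noise is often removed with a median filter. The slide does not name a method, so the 3 × 3 median filter below is an assumed choice, and the small test image is made up. Border pixels are left unchanged for simplicity.

```python
from statistics import median

def median_filter_3x3(image):
    """Replace each interior pixel by the median of its 3 x 3 neighborhood."""
    rows, cols = len(image), len(image[0])
    out = [row[:] for row in image]  # copy; borders keep their original values
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            window = [image[r + dr][c + dc] for dr in (-1, 0, 1) for dc in (-1, 0, 1)]
            out[r][c] = median(window)
    return out

noisy = [
    [10, 10, 10, 10],
    [10, 255, 10, 10],  # 255 is a "salt" pixel
    [10, 10, 0, 10],    # 0 is a "pepper" pixel
    [10, 10, 10, 10],
]
cleaned = median_filter_3x3(noisy)
print(cleaned)
```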
Zooming
Important for size invariance
Rotation
Important for rotation invariance
Subtraction
Connected Components / Image Labeling
• Goal: To find clusters of pixels that are similar and connected to each other
• How it works:
  – Assign a value to each pixel
  – Define what similar values mean
    • e.g., 10 +/- 2
  – Determine if like pixels are connected
4-connected, 8-connected
Pixel values:
1 1 1 1 1 1
1 0 0 1 1 1
1 1 1 0 1 1
1 2 2 0 0 1
1 2 2 0 0 1
Labels, 4-connected (four components):
A A A A A A
A B B A A A
A A A C A A
A D D C C A
A D D C C A
Labels, 8-connected (three components):
A A A A A A
A B B A A A
A A A B A A
A C C B B A
A C C B B A
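The labeling on the two grids can be reproduced with a breadth-first flood fill. This is a sketch under one simplifying assumption: two pixels count as similar only when their values are equal, rather than within a tolerance such as 10 +/- 2. Labels are integers 0, 1, 2, ... in raster order instead of the letters A, B, C, ...

```python
from collections import deque

GRID = [
    [1, 1, 1, 1, 1, 1],
    [1, 0, 0, 1, 1, 1],
    [1, 1, 1, 0, 1, 1],
    [1, 2, 2, 0, 0, 1],
    [1, 2, 2, 0, 0, 1],
]

NEIGHBORS_4 = [(-1, 0), (1, 0), (0, -1), (0, 1)]
NEIGHBORS_8 = NEIGHBORS_4 + [(-1, -1), (-1, 1), (1, -1), (1, 1)]

def label_components(grid, neighbors):
    """Return (label grid, number of components) for the given connectivity."""
    rows, cols = len(grid), len(grid[0])
    labels = [[None] * cols for _ in range(rows)]
    next_label = 0
    for r in range(rows):
        for c in range(cols):
            if labels[r][c] is not None:
                continue
            # Start a new component and flood-fill it.
            labels[r][c] = next_label
            queue = deque([(r, c)])
            while queue:
                cr, cc = queue.popleft()
                for dr, dc in neighbors:
                    nr, nc = cr + dr, cc + dc
                    if (0 <= nr < rows and 0 <= nc < cols
                            and labels[nr][nc] is None
                            and grid[nr][nc] == grid[cr][cc]):
                        labels[nr][nc] = next_label
                        queue.append((nr, nc))
            next_label += 1
    return labels, next_label

labels_4, count_4 = label_components(GRID, NEIGHBORS_4)
labels_8, count_8 = label_components(GRID, NEIGHBORS_8)
print(count_4, count_8)
```

With 8-connectivity the two groups of 0 pixels touch diagonally and merge, which is why the component count drops by one.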
Segmentation
Edge Detection
convolution mask:
$$B = \begin{bmatrix} -1 & -1 & -1 \\ -1 & +8 & -1 \\ -1 & -1 & -1 \end{bmatrix}$$
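A sketch of applying the 3 × 3 mask (center weight +8, all eight neighbors -1) to a small image. The test image with a vertical step edge is made up, and border pixels are simply set to 0, which is an assumption about border handling. Because the mask is symmetric, sliding it without flipping gives the same result as a true convolution.

```python
MASK = [
    [-1, -1, -1],
    [-1, 8, -1],
    [-1, -1, -1],
]

def apply_mask(image, mask):
    """Apply a 3 x 3 mask at every interior pixel; border pixels are set to 0."""
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            out[r][c] = sum(
                mask[dr + 1][dc + 1] * image[r + dr][c + dc]
                for dr in (-1, 0, 1)
                for dc in (-1, 0, 1)
            )
    return out

# Hypothetical image: dark on the left, bright on the right.
image = [[0, 0, 10, 10, 10] for _ in range(4)]
edges = apply_mask(image, MASK)
print(edges[1])
```

The response is zero in flat regions and non-zero on either side of the step, which is what marks the edge.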
Handwriting
Each person writes differently.
Handwriting Analysis Taxonomy
Analysis of Handwriting
  Recognition: On-line, Off-line
  Examination: Writer Verification, Writer Identification
  Personality identification (Graphology)
  Writing types: Natural Writing, Forgery, Disguised Writing
The End
See U all next week.
Pattern Recognition