Introduction to Computer Vision - University of Minnesota (dept.me.umn.edu/courses/me5286/vision/VisionNotes/...)

Introduction to Computer Vision

RODNEY DOCKTER, PH.D.

MARCH 2018


Rodney Dockter (me)

• Ph.D. in Mechanical Engineering from the University of Minnesota

• Worked in Dr. Tim Kowalewski’s lab

• Medical robotics and machine learning

• Now working at Danfoss Power Solutions developing autonomous systems in agriculture and heavy machinery


Contact Information

• Office: ME 2110

• Email: [email protected]

• Office Hours: W, F 1:15 pm – 2:00 pm


Book References (optional)

• David A. Forsyth and Jean Ponce: Computer Vision: A Modern Approach. Prentice Hall, 2002.

• Peter Corke: Robotics, Vision and Control: Fundamental Algorithms in MATLAB. Springer, 2017.

• R.C. Gonzalez and R.E. Woods: Digital Image Processing. Prentice Hall, 2002.

• Richard Hartley and Andrew Zisserman: Multiple View Geometry in Computer Vision. Cambridge, 2004.

• Christopher M. Bishop: Pattern Recognition and Machine Learning. Springer, 2011.


Class Organization

• All computer vision from here on out!

• ~10 lectures, 3 assignments, 30% of total grade

• “Prerequisites”:

• Linear algebra

• Statistics

• Vector Calculus

• Coding!

• Course does not presume prior computer vision experience

• Emphasis on coding!

• Matlab will be required for all homework assignments


Class Organization Cont.

• Collaboration Policy:

- You are encouraged to discuss assignments with your peers. However, all written work and coding must be done individually. Individuals found submitting duplicate or substantially similar materials due to inappropriate collaboration may get an “F” in this class and may receive other serious consequences.

- In the real world no one will write your code for you!

• Libraries and Code Found Online:

- None of the Matlab libraries for computer vision may be used in assignments. You will be writing your own computer vision code from the ground up.

- Students found plagiarizing code found on the internet will fail the course (we know how to use Google too)


Course Syllabus (Vision Portion)

March 30 F 1 - Introduction to computer vision and image processing

April 04 W 2 - Image Formation Camera Fundamentals

April 06 F 3 - Digital Image Representation (Quiz #2 review)

April 11 W 4 - Spatial Domain + Computer Vision in Matlab

April 13 F QUIZ #2 – No Lecture

April 18 W 5 - Image Histograms

April 20 F 6 - Edge Detection

April 25 W 7 - Edge Detection Cont. – Canny and Sobel

April 27 F 8 - Interest Point Detection

May 02 W 9 - Line Detection. Hough Transform.

May 04 F 10 - The future. Deep Learning.


Outline

• What is computer vision? What is image processing?

• Basics of Computer Vision Processing

• Interesting examples of computer vision in the wild?

• Brief introduction to Matlab for computer vision use.


Part 1 - What is computer vision?

Allowing robots and machines to see the world around them.


What is computer vision?

Once robots can see the world, they can interact with it.


What is computer vision?

Machines can learn to perform tasks just like humans.


What is computer vision?

• Deals with the formation, analysis, and interpretation of images

• Integral to robotics and Artificial Intelligence (AI)

• Interdisciplinary subject area:

• Robotics

• Autonomous Vehicles

• Medical Applications

• Security

• Practical and Useful

• Challenging and continuously evolving


Difficulties in Computer Vision

• Images are ambiguous: dependent on perspective

• Images are affected by many factors:

• Sensor model

• Illumination (lighting)

• Shape of object(s)

• Color of object(s)

• Texture of object(s)

• No “universal solution” to vision and sensing

• Many theories and potential algorithms

• Gold standard: Human Vision!


Human Vision:


Difficulties in Computer Vision

• It is a many-to-one mapping:

• Different objects with different material and geometric properties, possibly under different lighting conditions, could lead to identical images.

• The same object, viewed from different perspectives or with different lighting, can result in vastly different images (under-constrained mapping).

• Information is lost in the transformation from the 3D world to a 2D image.
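To make the information loss concrete, here is a minimal sketch assuming an ideal pinhole camera with the optical center at the origin. The `project` function and the `focal_length` parameter are illustrative names, not part of the course material. Two different 3D points on the same ray through the camera center land on the same 2D image point, so depth cannot be recovered from one image alone.

```python
def project(point, focal_length=1.0):
    """Project a 3D point (X, Y, Z) in camera coordinates onto the image plane.

    Assumes an ideal pinhole camera: x = f * X / Z, y = f * Y / Z.
    """
    x, y, z = point
    if z <= 0:
        raise ValueError("point must be in front of the camera (Z > 0)")
    return (focal_length * x / z, focal_length * y / z)


# Two distinct 3D points that lie on the same ray through the camera center.
near_point = (1.0, 2.0, 4.0)
far_point = (2.0, 4.0, 8.0)

near_image = project(near_point)
far_image = project(far_point)

# Both points map to the same 2D location: the depth (Z) has been lost.
print(near_image, far_image)
```

Recovering the missing depth needs extra information, such as a second camera or a moving camera.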


Difficulties in Computer Vision

• It is computationally intensive.

• Usually requires high end processors or graphics cards to achieve real time performance.

• We still do not fully understand the recognition problem.


Nomenclature

• Many names for mostly similar things:

• Computer Vision: general, all-encompassing term

• Computational Vision: modeling of biological vision

• Image Processing: generally refers to static images (frame to frame). Building block for computer vision.

• Machine Vision: industrial or factory applications for inspection and measurement.


Related Fields

• Pattern Recognition

• Machine Learning

• Robotics (obviously)

• Medical Imaging

• Computer Graphics

• Human-Robot Interaction


Why study computer vision?

• Images and videos are everywhere

• Mobile phones, cheap cameras, real time streaming

• New applications all the time

• Face recognition

• 3D representations from pictures

• Automatic surveillance

• Driverless vehicles

• Shipping and warehouse management

• Various deep scientific mysteries

• Human vision system, etc.


Brief History of Computer Vision

• B.C. (Before Computers)

• Philosophy

• Optics

• Physics

• Neurophysiology

• Early Cameras (Late 1800s)

• Earliest way to represent the real world in a consistent manner

• Precursor to video and digital representations of the world around us


Brief History of Computer Vision

• Early Computer Vision Systems (1960s)

• Minsky at MIT

• Attempted to solve vision in a summer (lol)

• Empirical approaches

• Ad hoc, “bag of tricks”

• Image processing plus+

• Solutions tailored to each problem.

• Simplified worlds

• “Blocks worlds” then generalize

• Known perspectives, known illumination


Brief History of Computer Vision

• 1980s: improved computational power + robotics innovations

• 1990s: personal computing brings computer vision possibilities to researchers everywhere.

• 2000s: OpenCV, ROS, and Matlab provide generalized computer vision computing abilities to researchers

• 2010s: computer vision comes to the masses:

• Microsoft Kinect

• Leap Motion

• Google Tango

• Augmented Reality

• iPhone Face ID


Progress in Computer Vision

• First Generation: Military / Early Research

• Few systems, custom built. Cost: $1 Million+

• Users have PhDs

• Slow (1 hour per frame)

• Second Generation: Industrial / Medical

• Numerous systems (1000+). Cost: $10,000+

• Users have bachelor's degrees

• Specialized hardware

• Third Generation: Consumer

• 100,000+ systems worldwide. Cost: $100

• Users have little or no training

• Raspberry Pi + webcam = $50 = limitless potential


The Future

• Software:

• 1 solution for all vision problems

• Provide inexperienced users with an easy way to detect/track objects.

• Does it exist?

• Convolutional Neural Networks?

• TensorFlow?

• Hardware:

• Cheap cameras (cell phone)

• Depth cameras (Kinect, stereo cameras)

• Cheap, powerful, mobile processing (getting there)


Part 2 – Computer Vision Processing

• Processing Levels:

• Low Level

• Mid Level

• High Level

• Image Formulation


Three Processing Levels – Low Level

• Low Level Processing:

• Standard procedures to improve image quality and format

• No “intelligence”


Three Processing Levels - Mid Level

• Intermediate Processing:

• Extract features and characterize components

• Some “intelligence”


Three Processing Levels - Mid Level

• Representing small patches in an image

• Want to establish correspondence between points or regions in different images, so we need to mathematically describe the neighborhood around a point

• Sharp changes are important features! (Known as “edges”)

• Representing texture by giving some statistics of the different types of patches present in a region

• e.g. Tigers have lots of stripes, few spots

• Leopards have lots of spots, no stripes

• How can we mathematically describe this?


Three Processing Levels - Mid Level

• Filter Outputs

• Filters form a dot-product between a pattern and an image, while shifting the pattern across the image

• Strong Response -> image region matches a pattern

• e.g. derivatives measured by filtering with a kernel that looks like a big derivative (bright line next to a dark line would yield a large derivative)
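The shifted dot-product described above can be sketched in a few lines. This is a minimal illustration in plain Python, not the course's Matlab implementation; `filter_response`, the tiny test image, and the two-element derivative kernel are all assumptions made for the example. The kernel `[-1, 1]` responds strongly only where a dark pixel sits next to a bright one.

```python
def filter_response(image, kernel):
    """Slide the kernel across the image (no padding) and take a dot product at each shift."""
    k_rows, k_cols = len(kernel), len(kernel[0])
    out_rows = len(image) - k_rows + 1
    out_cols = len(image[0]) - k_cols + 1
    output = []
    for r in range(out_rows):
        row = []
        for c in range(out_cols):
            # Dot product between the kernel and the image patch under it.
            total = 0
            for i in range(k_rows):
                for j in range(k_cols):
                    total += kernel[i][j] * image[r + i][c + j]
            row.append(total)
        output.append(row)
    return output


# Dark region on the left, bright region on the right: one vertical edge.
image = [
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
]

# A kernel that "looks like" a horizontal derivative.
derivative_kernel = [[-1, 1]]

response = filter_response(image, derivative_kernel)
for row in response:
    print(row)
```

The response is zero in the flat regions and large at the dark-to-bright transition, which is the "strong response where the region matches the pattern" behavior.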


Three Processing Levels - Mid Level

• Many objects are distinguished by their texture

• Tigers, cheetahs, grass, trees…

• We represent texture with statistics from filter outputs

• For tigers, a stripes filter at a coarse scale responds strongly

• For cheetahs, a spot filter with a coarse scale responds strongly

• For grass, long narrow lines

• For the leaves of trees, extended spots

• Objects with different textures can be segmented

• The variation in textures is a cue to shape
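One way to picture "statistics from filter outputs" is a small descriptor built from average filter responses. The sketch below is an assumption-laden toy: it uses two simple derivative kernels as stand-ins for the stripe and spot filters mentioned above, and the names `mean_abs_response` and `texture_descriptor` are hypothetical. Two striped textures with different orientations produce clearly different descriptors.

```python
def mean_abs_response(image, kernel):
    """Average absolute filter response over every valid kernel position."""
    k_rows, k_cols = len(kernel), len(kernel[0])
    responses = []
    for r in range(len(image) - k_rows + 1):
        for c in range(len(image[0]) - k_cols + 1):
            total = sum(
                kernel[i][j] * image[r + i][c + j]
                for i in range(k_rows)
                for j in range(k_cols)
            )
            responses.append(abs(total))
    return sum(responses) / len(responses)


def texture_descriptor(image):
    """Describe a texture by (left-right change, up-down change) statistics."""
    horizontal_change = [[-1, 1]]    # responds to vertical stripes
    vertical_change = [[-1], [1]]    # responds to horizontal stripes
    return (
        mean_abs_response(image, horizontal_change),
        mean_abs_response(image, vertical_change),
    )


vertical_stripes = [[0, 1, 0, 1] for _ in range(4)]
horizontal_stripes = [[0] * 4, [1] * 4, [0] * 4, [1] * 4]

print(texture_descriptor(vertical_stripes))
print(texture_descriptor(horizontal_stripes))
```

Regions whose descriptors differ this much could be assigned to different segments.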


Three Processing Levels - Mid Level

• Geometry of multiple views (images)

• Where might an object appear in camera #2 (or #3, …) given its position in camera #1?

• Stereopsis

• What we know about the world from having 2 eyes (cameras)

• Depth and 3D information

• Structure from motion

• What we know about the world from having many eyes (a single eye, moving quickly)
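As a hedged illustration of how two cameras yield depth, the sketch below uses the standard relation for an idealized stereo pair, depth = focal length × baseline / disparity. It assumes two identical, parallel (rectified) cameras; the function name and the numeric values are made up for the example.

```python
def depth_from_disparity(focal_length_px, baseline_m, disparity_px):
    """Depth (meters) of a point seen by two parallel, rectified cameras.

    focal_length_px: focal length in pixels
    baseline_m:      distance between the two camera centers in meters
    disparity_px:    horizontal shift of the point between the two images, in pixels
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a point in front of both cameras")
    return focal_length_px * baseline_m / disparity_px


# Hypothetical rig: 800-pixel focal length, cameras 0.5 m apart.
near_depth = depth_from_disparity(800.0, 0.5, 200.0)  # large shift between views
far_depth = depth_from_disparity(800.0, 0.5, 40.0)    # small shift between views

print(near_depth, far_depth)
```

Nearby objects shift a lot between the two views and distant objects shift very little, which is why depth resolution degrades with distance.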


Three Processing Levels - Mid Level

• Finding coherent structures in an image to break it into smaller units

• Segmentation:

• Breaking images and videos into useful pieces

• e.g. finding image components that are coherent in appearance

• e.g. separating image foreground from background

• Tracking:

• Keeping track of a moving object through a sequence of views
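A very simple form of foreground/background separation is global intensity thresholding. The sketch below is only one possible approach and assumes a bright object on a dark background; `segment_foreground`, the sample image, and the threshold value are illustrative choices.

```python
def segment_foreground(image, threshold):
    """Return a binary mask: 1 where the pixel is brighter than the threshold, else 0."""
    return [[1 if pixel > threshold else 0 for pixel in row] for row in image]


# Hypothetical grayscale image: a bright 2x2 object on a dark background.
image = [
    [10, 12, 11, 10],
    [11, 200, 210, 12],
    [10, 205, 198, 11],
    [12, 10, 11, 10],
]

mask = segment_foreground(image, threshold=100)
foreground_pixels = sum(sum(row) for row in mask)

for row in mask:
    print(row)
print("foreground pixels:", foreground_pixels)
```

Real scenes rarely separate this cleanly, which is why segmentation usually combines several cues such as color, texture, and motion.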


Three Processing Levels – High Level

• High-Level Processing:

• Recognizing patterns, comparing to known models

• High “intelligence” (i.e., machine learning)


Three Processing Levels - High Level

• Relationship between object geometry and image geometry

• Model Based Vision

• Find the position and orientation of known objects (e.g. squares, circles)

• Smooth surfaces and outlines

• How the outline of a curved object is formed and what it looks like (contours)

• Aspect graphs:

• How the outline of a curved object changes as you view it from different directions

• Range data


Three Processing Levels - High Level

• Using classifiers and probability to match and recognize objects

• Templates and classifiers

• How to find objects that look the same from view to view with a classifier

• Relations

• Break up objects into big, simple parts, find the parts with a classifier, then identify relationships between parts to find the object

• Geometric templates from spatial relations

• Extend this trick so that templates are formed from relations between smaller parts

• Bordering on machine learning…
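The template idea can be illustrated with a nearest-template classifier. This is a deliberately minimal sketch, assuming small binary patches and a sum-of-squared-differences distance; the template set, labels, and function names are all hypothetical.

```python
def squared_distance(patch_a, patch_b):
    """Sum of squared pixel differences between two equally sized patches."""
    return sum(
        (a - b) ** 2
        for row_a, row_b in zip(patch_a, patch_b)
        for a, b in zip(row_a, row_b)
    )


def classify(patch, templates):
    """Return the label of the template closest to the patch."""
    return min(templates, key=lambda label: squared_distance(patch, templates[label]))


templates = {
    "vertical bar": [[0, 1, 0], [0, 1, 0], [0, 1, 0]],
    "horizontal bar": [[0, 0, 0], [1, 1, 1], [0, 0, 0]],
}

# A vertical bar with one corrupted pixel in the bottom-left corner.
noisy_patch = [[0, 1, 0], [0, 1, 0], [1, 1, 0]]

label = classify(noisy_patch, templates)
print(label)
```

Matching raw pixels like this breaks down under changes in pose, scale, or lighting, which is part of the motivation for learned classifiers.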


Image Formulation


Image Formulation


Simplistic View


The Physics of Imaging

• How images are formed:

• Cameras

• How a camera creates an image

• How to tell where the camera was

• Light

• How to measure light

• What light does at surfaces

• How the brightness values we see in cameras are determined

• Color

• The underlying mechanisms of color

• How to describe and measure color


The Physics of Imaging


The Physics of Imaging

More on this later …


Part 4 – Computer Vision in the wild

There are countless products that have an element of computer vision, and the number increases daily. Consumer, defense, and research.


Computer Vision in the wild

NASA Rover

PAR Systems: Vision Guided Robots


Computer Vision in the wild

Face Recognition for Security, User Interface

Apple Face ID


Computer Vision in the wild

Challenges in Face Recognition:

- Illumination

- Pose Variations

- Facial Expressions

- Facial Similarity


Computer Vision in the wild

3D Sensors: Microsoft, Intel, etc.

Time of Flight Depth

Gesture Recognition: Leap Motion, RealSense


Computer Vision in the wild

Optical Character Recognition (Handwritten Digits)

MNIST Data Set


Computer Vision in the wild

Traffic Monitoring

Augmented Driver Assist Systems


Computer Vision in the wild

Autonomous Vehicles


Computer Vision in the wild

Robotics Navigation


Computer Vision in the wild

Brain Imaging MRI

Vital Images Inc.


Computer Vision in the wild

• Types of images:

• Infra-red

• Ultra-violet

• Radio-waves

• Visible-light

• Radar

• Tomography

• Sonar

• Microscopy

• Magnetic Resonance


Computer Vision in the wild

Thermal Imaging:

Applications in defense, agriculture

Local company: Fluke Thermography


Computer Vision in the wild

Human Activity Recognition


Computer Vision in the wild

Video Surveillance and Tracking


Can computers match human perception?

• Not yet…

• Computer vision is still no match for human perception

• But computers are catching up in certain areas

• Classifying the ImageNet Database:

• 1.2 million images, 1000 categories

• Human: ~5.1% top-5 error rate

• Inception-v3 Convolutional Neural Network: 3.46% top-5 error rate

https://karpathy.github.io/2014/09/02/what-i-learned-from-competing-against-a-convnet-on-imagenet/


Conclusion

• Computer Vision is a challenging and exciting field

• Applied to many real world situations

• Tremendous progress in the last two decades

• There is still work to be done…
