s95 arial, bld, yw8, 37 points, 105% line spacing · synthesis or comparison of several research...

108
6.866 projects Proposals to us by today. We will ok them by Oct. 31. 3 possible project types: Original implementation of an existing algorithm. Rigorous evaluation of existing algorithm. Synthesis or comparison of several research papers. Proposals to us by today. We will ok them by Oct. 31. 3 possible project types: Original implementation of an existing algorithm. Rigorous evaluation of existing algorithm. Synthesis or comparison of several research papers.

Upload: others

Post on 14-May-2020

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: S95 Arial, Bld, YW8, 37 points, 105% line spacing · Synthesis or comparison of several research papers. Proposals to us by today. We will ok them by Oct. 31. 3 possible project types:

6.866 projects

Proposals to us by today. We will ok them by Oct. 31.

3 possible project types:Original implementation of an existing algorithm.Rigorous evaluation of existing algorithm.Synthesis or comparison of several research papers.

Proposals to us by today. We will ok them by Oct. 31.

3 possible project types:Original implementation of an existing algorithm.Rigorous evaluation of existing algorithm.Synthesis or comparison of several research papers.

Page 2: S95 Arial, Bld, YW8, 37 points, 105% line spacing · Synthesis or comparison of several research papers. Proposals to us by today. We will ok them by Oct. 31. 3 possible project types:
Page 3: S95 Arial, Bld, YW8, 37 points, 105% line spacing · Synthesis or comparison of several research papers. Proposals to us by today. We will ok them by Oct. 31. 3 possible project types:
Page 4: S95 Arial, Bld, YW8, 37 points, 105% line spacing · Synthesis or comparison of several research papers. Proposals to us by today. We will ok them by Oct. 31. 3 possible project types:
Page 5: S95 Arial, Bld, YW8, 37 points, 105% line spacing · Synthesis or comparison of several research papers. Proposals to us by today. We will ok them by Oct. 31. 3 possible project types:

Does anyone mind…If I use your photographed face for a simple face-detection demo program that we’ll run in class next time?

If you do mind, please let me know (before Thursday).

If I use your photographed face for a simple faceIf I use your photographed face for a simple face--detection demo program that we’ll run in class detection demo program that we’ll run in class next time?next time?

If you do mind, please let me know (before If you do mind, please let me know (before Thursday).Thursday).

Page 6: S95 Arial, Bld, YW8, 37 points, 105% line spacing · Synthesis or comparison of several research papers. Proposals to us by today. We will ok them by Oct. 31. 3 possible project types:

Today: Cameras looking at people

MIT 6.801/6.866Oct. 29, 2002

MIT 6.801/6.866MIT 6.801/6.866Oct. 29, 2002Oct. 29, 2002

A mini-application lecture: under controlled conditions (not general conditions), what human interaction applications can you build with the tools we’ve developed so far?To be compared with: more sophisticated detection, classification, and tracking tools that we’ll study over the rest of the course.

Page 7: S95 Arial, Bld, YW8, 37 points, 105% line spacing · Synthesis or comparison of several research papers. Proposals to us by today. We will ok them by Oct. 31. 3 possible project types:

Yesterday’s tomorrow

New York Worlds Fair, 1939New York Worlds Fair, 1939(Westinghouse Historical Collection)(Westinghouse Historical Collection)

ElektroElektro

SparkoSparko

Page 8: S95 Arial, Bld, YW8, 37 points, 105% line spacing · Synthesis or comparison of several research papers. Proposals to us by today. We will ok them by Oct. 31. 3 possible project types:

Computer vision still needs to become more robust

Pavlovic, Rehg, Cham, and Murphy, Intl. Conf. Computer Vision, 1999

Page 9: S95 Arial, Bld, YW8, 37 points, 105% line spacing · Synthesis or comparison of several research papers. Proposals to us by today. We will ok them by Oct. 31. 3 possible project types:

But we can fake it with clever system design

M. Krueger,“Artificial Reality”,

Addison-Wesley, 1983.

Page 10: S95 Arial, Bld, YW8, 37 points, 105% line spacing · Synthesis or comparison of several research papers. Proposals to us by today. We will ok them by Oct. 31. 3 possible project types:

From MERL and Mitsubishi Electric:

David Anderson, Paul Beardsley, Chris Dodge, William Freeman, Hiroshi Kage, Kazuo Kyuma, Darren Leigh, Neal McKenzie, Yasunari Miyake, Michal Roth, Ken-ichi Tanaka, Craig Weissman, William Yerazunis

From MERL and Mitsubishi Electric:From MERL and Mitsubishi Electric:

David Anderson, PaulDavid Anderson, Paul BeardsleyBeardsley, , Chris Dodge, William Freeman, Hiroshi Chris Dodge, William Freeman, Hiroshi KageKage, Kazuo , Kazuo KyumaKyuma, Darren Leigh, Neal , Darren Leigh, Neal McKenzie,McKenzie, YasunariYasunari Miyake, Miyake, MichalMichal Roth, Roth, KenKen--ichiichi Tanaka, Craig Tanaka, Craig WeissmanWeissman, , William William YerazunisYerazunis

Research at MERL on fast, low-cost vision systems

Page 11: S95 Arial, Bld, YW8, 37 points, 105% line spacing · Synthesis or comparison of several research papers. Proposals to us by today. We will ok them by Oct. 31. 3 possible project types:

Computer vision based interface

The hope: video input will give a more The hope: video input will give a more expressive, natural or engaging interface.expressive, natural or engaging interface.

Page 12: S95 Arial, Bld, YW8, 37 points, 105% line spacing · Synthesis or comparison of several research papers. Proposals to us by today. We will ok them by Oct. 31. 3 possible project types:

Existing interfaces devices are fast & low-cost.

Page 13: S95 Arial, Bld, YW8, 37 points, 105% line spacing · Synthesis or comparison of several research papers. Proposals to us by today. We will ok them by Oct. 31. 3 possible project types:

Applications make the vision easier.

Constraints simplify recognition--if you know where the tracks are, it’s easy to guess where the train is.

Page 14: S95 Arial, Bld, YW8, 37 points, 105% line spacing · Synthesis or comparison of several research papers. Proposals to us by today. We will ok them by Oct. 31. 3 possible project types:

There is a human in the loop.

Rich, immediate visual, audio feedback.The player can correct for algorithm imperfections.

Page 15: S95 Arial, Bld, YW8, 37 points, 105% line spacing · Synthesis or comparison of several research papers. Proposals to us by today. We will ok them by Oct. 31. 3 possible project types:

Computer vision algorithmsas ocean-going vessels

Page 16: S95 Arial, Bld, YW8, 37 points, 105% line spacing · Synthesis or comparison of several research papers. Proposals to us by today. We will ok them by Oct. 31. 3 possible project types:

Computer vision algorithmsas ocean-going vessels

thiswork

Page 17: S95 Arial, Bld, YW8, 37 points, 105% line spacing · Synthesis or comparison of several research papers. Proposals to us by today. We will ok them by Oct. 31. 3 possible project types:

1. Selected appliance: television

Page 18: S95 Arial, Bld, YW8, 37 points, 105% line spacing · Synthesis or comparison of several research papers. Proposals to us by today. We will ok them by Oct. 31. 3 possible project types:

television market

~1 billion television sets~1 billion television sets

Page 19: S95 Arial, Bld, YW8, 37 points, 105% line spacing · Synthesis or comparison of several research papers. Proposals to us by today. We will ok them by Oct. 31. 3 possible project types:

Survey“What high technology gadget has improved the “What high technology gadget has improved the

quality of your life the most?”quality of your life the most?”

What two things were mentioned most?What two things were mentioned most?

Page 20: S95 Arial, Bld, YW8, 37 points, 105% line spacing · Synthesis or comparison of several research papers. Proposals to us by today. We will ok them by Oct. 31. 3 possible project types:

Survey results“What high technology gadget has improved the “What high technology gadget has improved the

quality of your life the most?”quality of your life the most?”

Microwave ovens and TV remote controls Microwave ovens and TV remote controls ----Porter/Porter/NovelliNovelli survey, 1995survey, 1995

message: message: People value the ability to control a television from a People value the ability to control a television from a distance.distance.

Page 21: S95 Arial, Bld, YW8, 37 points, 105% line spacing · Synthesis or comparison of several research papers. Proposals to us by today. We will ok them by Oct. 31. 3 possible project types:

Control of television setfrom a distanceWired remote control.Wired remote control.

InfraInfra--red remote control.red remote control.

Voice control.Voice control.

Gesture control.Gesture control.

Page 22: S95 Arial, Bld, YW8, 37 points, 105% line spacing · Synthesis or comparison of several research papers. Proposals to us by today. We will ok them by Oct. 31. 3 possible project types:

Design constraints

From the user’s point of viewFrom the user’s point of view

From the computer’s point of viewFrom the computer’s point of view

Page 23: S95 Arial, Bld, YW8, 37 points, 105% line spacing · Synthesis or comparison of several research papers. Proposals to us by today. We will ok them by Oct. 31. 3 possible project types:

Complex commandsrequire complicated gestures?

From the user’s point of view:

“mute”“mute”

Page 24: S95 Arial, Bld, YW8, 37 points, 105% line spacing · Synthesis or comparison of several research papers. Proposals to us by today. We will ok them by Oct. 31. 3 possible project types:

Living room scene is difficultFrom the computer’s point of view:

How can the computer find the hand, and recognize its gesture, in this complicated, unpredictable visual scene?

Page 25: S95 Arial, Bld, YW8, 37 points, 105% line spacing · Synthesis or comparison of several research papers. Proposals to us by today. We will ok them by Oct. 31. 3 possible project types:

Our solution: exploit the visual feedback from the television

television

Volume

user

Page 26: S95 Arial, Bld, YW8, 37 points, 105% line spacing · Synthesis or comparison of several research papers. Proposals to us by today. We will ok them by Oct. 31. 3 possible project types:

hand recognition method:template matching

template image

Examine the squared difference between (a) pixel values in the hand template, and(b) pixel values in a square centered at each possible positionin the image.

Page 27: S95 Arial, Bld, YW8, 37 points, 105% line spacing · Synthesis or comparison of several research papers. Proposals to us by today. We will ok them by Oct. 31. 3 possible project types:

hand recognition method:normalized correlation

template image normalizedcorrelation

Page 28: S95 Arial, Bld, YW8, 37 points, 105% line spacing · Synthesis or comparison of several research papers. Proposals to us by today. We will ok them by Oct. 31. 3 possible project types:

Normalized correlation

( )( )bbaaba

rrrr

rr

⋅⋅

⋅ Where a and b arevectors from rasterized patches of the image and template

Page 29: S95 Arial, Bld, YW8, 37 points, 105% line spacing · Synthesis or comparison of several research papers. Proposals to us by today. We will ok them by Oct. 31. 3 possible project types:

Background removal

currentimage

runningaverage

next average

background removed

(1-α)

α

Page 30: S95 Arial, Bld, YW8, 37 points, 105% line spacing · Synthesis or comparison of several research papers. Proposals to us by today. We will ok them by Oct. 31. 3 possible project types:

Processing block diagram

Raw Video (RBG - 24 bit)

Image Representation

TemplateCreation

Correlation Position

Remove Background

Kalman Filter

Edit

On-screen Controls

Tracking Trigger Gesture

Remote Control

TV

Page 31: S95 Arial, Bld, YW8, 37 points, 105% line spacing · Synthesis or comparison of several research papers. Proposals to us by today. We will ok them by Oct. 31. 3 possible project types:

Prototype of television controlledby hand signals.

Page 32: S95 Arial, Bld, YW8, 37 points, 105% line spacing · Synthesis or comparison of several research papers. Proposals to us by today. We will ok them by Oct. 31. 3 possible project types:

TV screen overlay

Page 33: S95 Arial, Bld, YW8, 37 points, 105% line spacing · Synthesis or comparison of several research papers. Proposals to us by today. We will ok them by Oct. 31. 3 possible project types:

TV control

Page 34: S95 Arial, Bld, YW8, 37 points, 105% line spacing · Synthesis or comparison of several research papers. Proposals to us by today. We will ok them by Oct. 31. 3 possible project types:

Video

Page 35: S95 Arial, Bld, YW8, 37 points, 105% line spacing · Synthesis or comparison of several research papers. Proposals to us by today. We will ok them by Oct. 31. 3 possible project types:

Prototype limitations

Distance from camera: Distance from camera: 6 6 -- 10 feet.10 feet.

Field of view: Field of view: trigger gesture: 15 trigger gesture: 15 o o tracking: 25 tracking: 25 oo

Coupling to television is loose.Coupling to television is loose.

Two screens instead of one.Two screens instead of one.

Robustness during operation:Robustness during operation:no template adaptation to different users.no template adaptation to different users.

background removal may need variable contrast control.background removal may need variable contrast control.

Page 36: S95 Arial, Bld, YW8, 37 points, 105% line spacing · Synthesis or comparison of several research papers. Proposals to us by today. We will ok them by Oct. 31. 3 possible project types:

Product hardware requirements

Short termShort term•• cameracamera•• video digitizervideo digitizer•• computercomputer

Long termLong term•• TV’s / computers / browsers will have cameras TV’s / computers / browsers will have cameras

and powerful computers.and powerful computers.•• a software product.a software product.

Page 37: S95 Arial, Bld, YW8, 37 points, 105% line spacing · Synthesis or comparison of several research papers. Proposals to us by today. We will ok them by Oct. 31. 3 possible project types:

2. Simple gesture recognition method

Page 38: S95 Arial, Bld, YW8, 37 points, 105% line spacing · Synthesis or comparison of several research papers. Proposals to us by today. We will ok them by Oct. 31. 3 possible project types:

image

T

trainingset

signaturevector

recognition system

compare

Real-time hand gesture recognitionby orientation histograms

Page 39: S95 Arial, Bld, YW8, 37 points, 105% line spacing · Synthesis or comparison of several research papers. Proposals to us by today. We will ok them by Oct. 31. 3 possible project types:

Orientation measurements (bottom) are more robust to lighting changes than are pixel intensities (top)

Page 40: S95 Arial, Bld, YW8, 37 points, 105% line spacing · Synthesis or comparison of several research papers. Proposals to us by today. We will ok them by Oct. 31. 3 possible project types:

Orientation measurements (bottom) are more robust to lighting changes than are pixel intensities (top)

Page 41: S95 Arial, Bld, YW8, 37 points, 105% line spacing · Synthesis or comparison of several research papers. Proposals to us by today. We will ok them by Oct. 31. 3 possible project types:
Page 42: S95 Arial, Bld, YW8, 37 points, 105% line spacing · Synthesis or comparison of several research papers. Proposals to us by today. We will ok them by Oct. 31. 3 possible project types:

Images, orientation images, and orientation histograms for training set

Page 43: S95 Arial, Bld, YW8, 37 points, 105% line spacing · Synthesis or comparison of several research papers. Proposals to us by today. We will ok them by Oct. 31. 3 possible project types:

Test image, and distances from each of the training set orientation histograms (categorized correctly).

Page 44: S95 Arial, Bld, YW8, 37 points, 105% line spacing · Synthesis or comparison of several research papers. Proposals to us by today. We will ok them by Oct. 31. 3 possible project types:

Crane movements controlledby hand gestures

Page 45: S95 Arial, Bld, YW8, 37 points, 105% line spacing · Synthesis or comparison of several research papers. Proposals to us by today. We will ok them by Oct. 31. 3 possible project types:

Janken game

Page 46: S95 Arial, Bld, YW8, 37 points, 105% line spacing · Synthesis or comparison of several research papers. Proposals to us by today. We will ok them by Oct. 31. 3 possible project types:

video

Page 47: S95 Arial, Bld, YW8, 37 points, 105% line spacing · Synthesis or comparison of several research papers. Proposals to us by today. We will ok them by Oct. 31. 3 possible project types:
Page 48: S95 Arial, Bld, YW8, 37 points, 105% line spacing · Synthesis or comparison of several research papers. Proposals to us by today. We will ok them by Oct. 31. 3 possible project types:

Games add fun and purpose: “Get the sprite Games add fun and purpose: “Get the sprite through the golden rings.”through the golden rings.”

3. Computer vision for computer games.

Page 49: S95 Arial, Bld, YW8, 37 points, 105% line spacing · Synthesis or comparison of several research papers. Proposals to us by today. We will ok them by Oct. 31. 3 possible project types:

“Guests cared “Guests cared about the about the experience, experience, not the not the technology.”technology.”

Field test results from Disney’s VR Aladdin.

Page 50: S95 Arial, Bld, YW8, 37 points, 105% line spacing · Synthesis or comparison of several research papers. Proposals to us by today. We will ok them by Oct. 31. 3 possible project types:

Games selected for vision interface

Page 51: S95 Arial, Bld, YW8, 37 points, 105% line spacing · Synthesis or comparison of several research papers. Proposals to us by today. We will ok them by Oct. 31. 3 possible project types:

Image moments give a very coarse image summary.

Page 52: S95 Arial, Bld, YW8, 37 points, 105% line spacing · Synthesis or comparison of several research papers. Proposals to us by today. We will ok them by Oct. 31. 3 possible project types:

Hand images and equivalent rectangles having the same image moments

Page 53: S95 Arial, Bld, YW8, 37 points, 105% line spacing · Synthesis or comparison of several research papers. Proposals to us by today. We will ok them by Oct. 31. 3 possible project types:

Artificial Retina chip for detectionand low-level image processing.

Page 54: S95 Arial, Bld, YW8, 37 points, 105% line spacing · Synthesis or comparison of several research papers. Proposals to us by today. We will ok them by Oct. 31. 3 possible project types:

Artificial Retina chip

Page 55: S95 Arial, Bld, YW8, 37 points, 105% line spacing · Synthesis or comparison of several research papers. Proposals to us by today. We will ok them by Oct. 31. 3 possible project types:

Artificial Retina functions

Page 56: S95 Arial, Bld, YW8, 37 points, 105% line spacing · Synthesis or comparison of several research papers. Proposals to us by today. We will ok them by Oct. 31. 3 possible project types:

Fast image moment calculation with artificial retina chip

Processing time for image projections:

w/o AR chip: 10 msec

with AR chip: 0.3 msec

Processing time Processing time for image for image projections: projections:

w/o AR chip: w/o AR chip: 10 10 msecmsec

with AR chip: with AR chip: 0.3 0.3 msecmsec

Page 57: S95 Arial, Bld, YW8, 37 points, 105% line spacing · Synthesis or comparison of several research papers. Proposals to us by today. We will ok them by Oct. 31. 3 possible project types:

Hand gestureHand gesture--controlled robotcontrolled robot

Page 58: S95 Arial, Bld, YW8, 37 points, 105% line spacing · Synthesis or comparison of several research papers. Proposals to us by today. We will ok them by Oct. 31. 3 possible project types:

Game: Nights

Page 59: S95 Arial, Bld, YW8, 37 points, 105% line spacing · Synthesis or comparison of several research papers. Proposals to us by today. We will ok them by Oct. 31. 3 possible project types:

Moment-based pointing control

time 1

time 2Center-of-mass of absolute value of difference-image

Page 60: S95 Arial, Bld, YW8, 37 points, 105% line spacing · Synthesis or comparison of several research papers. Proposals to us by today. We will ok them by Oct. 31. 3 possible project types:

Line to difference-image center-of-mass determines flight direction.

Moment-based pointing control

Page 61: S95 Arial, Bld, YW8, 37 points, 105% line spacing · Synthesis or comparison of several research papers. Proposals to us by today. We will ok them by Oct. 31. 3 possible project types:

Game: Magic Carpet

Page 62: S95 Arial, Bld, YW8, 37 points, 105% line spacing · Synthesis or comparison of several research papers. Proposals to us by today. We will ok them by Oct. 31. 3 possible project types:

Magic carpet game--figure analysis by hierarchical image moments

Page 63: S95 Arial, Bld, YW8, 37 points, 105% line spacing · Synthesis or comparison of several research papers. Proposals to us by today. We will ok them by Oct. 31. 3 possible project types:

Game: Decathlete

Page 64: S95 Arial, Bld, YW8, 37 points, 105% line spacing · Synthesis or comparison of several research papers. Proposals to us by today. We will ok them by Oct. 31. 3 possible project types:

Optical-flow-based Decathlete figure motion analysis

Page 65: S95 Arial, Bld, YW8, 37 points, 105% line spacing · Synthesis or comparison of several research papers. Proposals to us by today. We will ok them by Oct. 31. 3 possible project types:

Decathlete 100m hurdles

Page 66: S95 Arial, Bld, YW8, 37 points, 105% line spacing · Synthesis or comparison of several research papers. Proposals to us by today. We will ok them by Oct. 31. 3 possible project types:

Decathlete javelin throw

Page 67: S95 Arial, Bld, YW8, 37 points, 105% line spacing · Synthesis or comparison of several research papers. Proposals to us by today. We will ok them by Oct. 31. 3 possible project types:

Decathlete javelin throw

Page 68: S95 Arial, Bld, YW8, 37 points, 105% line spacing · Synthesis or comparison of several research papers. Proposals to us by today. We will ok them by Oct. 31. 3 possible project types:

video

Page 69: S95 Arial, Bld, YW8, 37 points, 105% line spacing · Synthesis or comparison of several research papers. Proposals to us by today. We will ok them by Oct. 31. 3 possible project types:

Nintendo Game Boy CameraSeveral million sold (most of any digital camera). Imaging chip is Mitsubishi Electric’s “Artificial Retina” CMOS detector.

Several million sold (most of any Several million sold (most of any digital camera). Imaging chip is digital camera). Imaging chip is Mitsubishi Electric’s “Artificial Mitsubishi Electric’s “Artificial Retina” CMOS detector.Retina” CMOS detector.

Page 70: S95 Arial, Bld, YW8, 37 points, 105% line spacing · Synthesis or comparison of several research papers. Proposals to us by today. We will ok them by Oct. 31. 3 possible project types:

video

Page 71: S95 Arial, Bld, YW8, 37 points, 105% line spacing · Synthesis or comparison of several research papers. Proposals to us by today. We will ok them by Oct. 31. 3 possible project types:

Fast, simple algorithms and lowFast, simple algorithms and low--cost cost hardware are wellhardware are well--suited to interactive suited to interactive graphics applications.graphics applications.We followed this approach to make a We followed this approach to make a television controlled by hand gestures, television controlled by hand gestures, simple hand gesture recognition, and simple hand gesture recognition, and visionvision--based computer game interfaces.based computer game interfaces.

Summary

Page 72: S95 Arial, Bld, YW8, 37 points, 105% line spacing · Synthesis or comparison of several research papers. Proposals to us by today. We will ok them by Oct. 31. 3 possible project types:

To Trevor’s slides…

Page 73: S95 Arial, Bld, YW8, 37 points, 105% line spacing · Synthesis or comparison of several research papers. Proposals to us by today. We will ok them by Oct. 31. 3 possible project types:

Perceptive Context for Pervasive Computing

Trevor DarrellVision Interface GroupMIT AI Lab

Page 74: S95 Arial, Bld, YW8, 37 points, 105% line spacing · Synthesis or comparison of several research papers. Proposals to us by today. We will ok them by Oct. 31. 3 possible project types:

Perceptually Aware Displays

Camera associated with displayDisplay should respond to user

- font size- attentional load- passive acknowledgement

e.g., “Magic Mirror”, IntervalCompaq’s Smart KioskALIVE, MIT Media Lab

Camera

Display

Page 75: S95 Arial, Bld, YW8, 37 points, 105% line spacing · Synthesis or comparison of several research papers. Proposals to us by today. We will ok them by Oct. 31. 3 possible project types:

Example: A Face Responsive Display• Faces are natural interfaces!

- Ubiquitous, fast, expressive, general.- Want machines to generate and perceive faces.

• A Face Responsive Display...- Knows when it’s being observed- Recognizes returning observers- Tracks head pose- Robust to changing lighting, moving backgrounds…

Page 76: S95 Arial, Bld, YW8, 37 points, 105% line spacing · Synthesis or comparison of several research papers. Proposals to us by today. We will ok them by Oct. 31. 3 possible project types:

A Face Responsive Display

Tasks- Detection- Identification- Tracking

How? Exploit multiple visual modalities:- Shape- Color- Pattern

Page 77: S95 Arial, Bld, YW8, 37 points, 105% line spacing · Synthesis or comparison of several research papers. Proposals to us by today. We will ok them by Oct. 31. 3 possible project types:

Tasks and Visual Modalities

fine motion estimation / pose tracking

clothing histogram

coarse motion estimationtracking

face recognitionflesh huebiometricsidentification

face detectionskin classifiersilhouette classifierdetection

patterncolorshape

Page 78: S95 Arial, Bld, YW8, 37 points, 105% line spacing · Synthesis or comparison of several research papers. Proposals to us by today. We will ok them by Oct. 31. 3 possible project types:

Mode and Task Matrix

Appearance change

clothing histogramShape changetracking

face recognitionflesh huebiometricsidentification

face detectionskin classifiersilhouette classifierdetection

patterncolorshape

Page 79: S95 Arial, Bld, YW8, 37 points, 105% line spacing · Synthesis or comparison of several research papers. Proposals to us by today. We will ok them by Oct. 31. 3 possible project types:

Finding Features

2D Head / hands localization- contour analysis: mark extremal points (highest curvature or

distance from center of body) as hand features- use skin color model when region of hand or face is found (color

model is independent of flesh tone intensity)

Page 80: S95 Arial, Bld, YW8, 37 points, 105% line spacing · Synthesis or comparison of several research papers. Proposals to us by today. We will ok them by Oct. 31. 3 possible project types:

Flesh color tracking

• Often the simplest, fastest face detector!• Initialize region of hue space

[ Crowley, Coutaz, Berard, INRIA ]

Page 81: S95 Arial, Bld, YW8, 37 points, 105% line spacing · Synthesis or comparison of several research papers. Proposals to us by today. We will ok them by Oct. 31. 3 possible project types:

Color Processing

• Train two-class classifier with examples of skin and not skin

• Typical approaches: Gaussian, Neural Net, Nearest Neighbor

• Use features invariant to intensityLog color-opponent [Fleck et al.]

(log(r) - log(g), log(b) - log((r+g)/2) )Hue & Saturation

Page 82: S95 Arial, Bld, YW8, 37 points, 105% line spacing · Synthesis or comparison of several research papers. Proposals to us by today. We will ok them by Oct. 31. 3 possible project types:

Flesh color tracking

Can use Intel OpenCV lib’s CAMSHIFT algorithm for robust real-time tracking.

(open source impl. avail.!)

[ Bradsky, Intel ]

Page 83: S95 Arial, Bld, YW8, 37 points, 105% line spacing · Synthesis or comparison of several research papers. Proposals to us by today. We will ok them by Oct. 31. 3 possible project types:

Detection with multiple visual modes

Find head sized peaksin 2-D or 3-D.

Detect skin pigment in hue-based color space

Classify intensity vector corresponding to face class

Shape

Flesh ColorDetection

Face PatternDetection

Page 84: S95 Arial, Bld, YW8, 37 points, 105% line spacing · Synthesis or comparison of several research papers. Proposals to us by today. We will ok them by Oct. 31. 3 possible project types:

Common Detection Failure Modes

Fooled by head shaped peaks

Fooled by flesh colored objects

Misses out of plane rotation or expression

Shape

Flesh ColorDetection

Face PatternDetection

Page 85: S95 Arial, Bld, YW8, 37 points, 105% line spacing · Synthesis or comparison of several research papers. Proposals to us by today. We will ok them by Oct. 31. 3 possible project types:

Robust real-time performance

Integrated Face Detection Algorithm(temporally asynch.

voting scheme)

Shape

Flesh ColorDetection

Face PatternDetection

Page 86: S95 Arial, Bld, YW8, 37 points, 105% line spacing · Synthesis or comparison of several research papers. Proposals to us by today. We will ok them by Oct. 31. 3 possible project types:

Mode and Task Matrix

Appearance change

clothing histogramShape changetracking

face recognitionflesh huebiometricsidentification

face detectionskin classifiersilhouette classifierdetection

patterncolorshape

Page 87: S95 Arial, Bld, YW8, 37 points, 105% line spacing · Synthesis or comparison of several research papers. Proposals to us by today. We will ok them by Oct. 31. 3 possible project types:

A Key Technology: Video-Rate Stereo

• Two cameras −> stereo range estimation; disparity proportional to depth

• Depth makes tracking people easy- segmentation- shape characterization - pose tracking

• Real-time implementations becoming commercially available.

Page 88: S95 Arial, Bld, YW8, 37 points, 105% line spacing · Synthesis or comparison of several research papers. Proposals to us by today. We will ok them by Oct. 31. 3 possible project types:

Video-rate stereo

Computed disparity

Foreground pixels; grouped by local connectivity

Left and right images

Page 89: S95 Arial, Bld, YW8, 37 points, 105% line spacing · Synthesis or comparison of several research papers. Proposals to us by today. We will ok them by Oct. 31. 3 possible project types:

RGBZ input

Page 90: S95 Arial, Bld, YW8, 37 points, 105% line spacing · Synthesis or comparison of several research papers. Proposals to us by today. We will ok them by Oct. 31. 3 possible project types:

RGBZ input

Page 91: S95 Arial, Bld, YW8, 37 points, 105% line spacing · Synthesis or comparison of several research papers. Proposals to us by today. We will ok them by Oct. 31. 3 possible project types:

RGBZ input

Page 92: S95 Arial, Bld, YW8, 37 points, 105% line spacing · Synthesis or comparison of several research papers. Proposals to us by today. We will ok them by Oct. 31. 3 possible project types:

Video-Rate Stereo

• Multiple cameras −> stereo range estimation; disparity proportional to depth

• Real-time implementations becoming commercially available.

• Depth makes tracking people easier- segmentation- shape characterization - pose tracking

Page 93: S95 Arial, Bld, YW8, 37 points, 105% line spacing · Synthesis or comparison of several research papers. Proposals to us by today. We will ok them by Oct. 31. 3 possible project types:

Range feature for ID!

• Body shape characteristics -- e.g., height measure.• Normalize for motion/pose: median filter over time

• Near future: full vision-based kinematic estimation and tracking--active research topic in many labs.

Trevor

MikeGaile

Page 94: S95 Arial, Bld, YW8, 37 points, 105% line spacing · Synthesis or comparison of several research papers. Proposals to us by today. We will ok them by Oct. 31. 3 possible project types:

Color feature for ID! For long-term tracking / identification, measure color hue and saturation

values of hair and skin….

For same-day ID, use histogram of entire body / clothing

Gaile Mike Trevor

Page 95: S95 Arial, Bld, YW8, 37 points, 105% line spacing · Synthesis or comparison of several research papers. Proposals to us by today. We will ok them by Oct. 31. 3 possible project types:

Mode and Task Matrix

Appearance change

clothing histogramShape changetracking

face recognitionflesh huebiometricsidentification

face detectionskin classifiersilhouette classifierdetection

patterncolorshape

See lectures by Trevor later in the course

Page 96: S95 Arial, Bld, YW8, 37 points, 105% line spacing · Synthesis or comparison of several research papers. Proposals to us by today. We will ok them by Oct. 31. 3 possible project types:

Robust, Multi-modal Algorithm

Combine modules for detection:• Silhouette finds body• Color tracks extremities• Pattern discriminates head from hands.

Use each also to recognize returning people:• Face recognition • Biometrics (skeletal structure)• Hair and Skin hue• Clothing (intra-day.)

[ CVPR ‘98; T. Darrell, G. Gordon, M. Harville, J. Woodfill ]

Page 97: S95 Arial, Bld, YW8, 37 points, 105% line spacing · Synthesis or comparison of several research papers. Proposals to us by today. We will ok them by Oct. 31. 3 possible project types:

System Overview

Page 98: S95 Arial, Bld, YW8, 37 points, 105% line spacing · Synthesis or comparison of several research papers. Proposals to us by today. We will ok them by Oct. 31. 3 possible project types:

Classic Background Subtraction model

• Background is assumed to be mostly static• Each pixel is modeled as by a gaussian distribution in

YUV space• Model mean is usually updated using a recursive low-

pass filter

Given new image, generate silhouetteby marking those pixels that are significantlydifferent from the “background” value.

Page 99: S95 Arial, Bld, YW8, 37 points, 105% line spacing · Synthesis or comparison of several research papers. Proposals to us by today. We will ok them by Oct. 31. 3 possible project types:

Static Background Modeling Examples

[MIT Media Lab Pfinder / ALIVE System]

Page 100: S95 Arial, Bld, YW8, 37 points, 105% line spacing · Synthesis or comparison of several research papers. Proposals to us by today. We will ok them by Oct. 31. 3 possible project types:

Static Background Modeling Examples

[MIT Media Lab Pfinder / ALIVE System]

Page 101: S95 Arial, Bld, YW8, 37 points, 105% line spacing · Synthesis or comparison of several research papers. Proposals to us by today. We will ok them by Oct. 31. 3 possible project types:

Static Background Modeling Examples

[MIT Media Lab Pfinder / ALIVE System]

Page 102: S95 Arial, Bld, YW8, 37 points, 105% line spacing · Synthesis or comparison of several research papers. Proposals to us by today. We will ok them by Oct. 31. 3 possible project types:

The ALIVE System

VideoScreen

Camera

User

Autonomous Agents

Page 103: S95 Arial, Bld, YW8, 37 points, 105% line spacing · Synthesis or comparison of several research papers. Proposals to us by today. We will ok them by Oct. 31. 3 possible project types:

ALIVE • Real sensing for virtual world• Tightly coupled sensing-behavior-action• Vision routines: body/head/hand tracking

Kinematics /Rendering

Camera

Projector

Vision Behaviors / Goals

User Agents

[ Blumberg, Darrell, Maes, Pentland, Wren, … 1995 ]

Page 104: S95 Arial, Bld, YW8, 37 points, 105% line spacing · Synthesis or comparison of several research papers. Proposals to us by today. We will ok them by Oct. 31. 3 possible project types:

ALIVE system, MIT

http://vismod.www.media.mit.edu/cgi-bin/tr_pagemaker (TR 257)

Page 105: S95 Arial, Bld, YW8, 37 points, 105% line spacing · Synthesis or comparison of several research papers. Proposals to us by today. We will ok them by Oct. 31. 3 possible project types:

http://vismod.www.media.mit.edu/cgi-bin/tr_pagemaker (TR 257)

Page 106: S95 Arial, Bld, YW8, 37 points, 105% line spacing · Synthesis or comparison of several research papers. Proposals to us by today. We will ok them by Oct. 31. 3 possible project types:

A Face Responsive Display

Video Display

StereoCameras

Page 107: S95 Arial, Bld, YW8, 37 points, 105% line spacing · Synthesis or comparison of several research papers. Proposals to us by today. We will ok them by Oct. 31. 3 possible project types:

Vision-only Application:Interactive Video Effects

Page 108: S95 Arial, Bld, YW8, 37 points, 105% line spacing · Synthesis or comparison of several research papers. Proposals to us by today. We will ok them by Oct. 31. 3 possible project types:

end