computer visioncrcv.ucf.edu/courses/cap5415/fall2014/cvintroductionaugust2014.pdf · computer...

8/20/2014

Computer Vision

Mubarak Shah
shah@crcv.ucf.edu

Computer Vision
• The ability of computers to see.
• Image Understanding
• Machine Vision
• Robot Vision
• Image Analysis
• Video Understanding

A picture is worth a thousand words.

A word is worth a thousand pictures.

A HUNT

Image
• 2-D array of numbers (intensity values, gray levels)
• Gray levels: 0 (black) to 255 (white)
• Color image is three 2-D arrays of numbers: Red, Green, Blue
• Resolution (number of rows and columns): 128×128, 256×256, 512×512, 640×480
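The representation on this slide can be sketched in a few lines of plain Python. This is only an illustration: the helper names `make_gray_image`, `make_color_image` and `resolution` are made up for the example, and nested lists stand in for whatever array type a real vision library would use.

```python
def make_gray_image(rows, cols, value=0):
    """Return a rows x cols grayscale image filled with one gray level."""
    if not 0 <= value <= 255:
        raise ValueError("gray levels run from 0 (black) to 255 (white)")
    return [[value for _ in range(cols)] for _ in range(rows)]


def make_color_image(rows, cols, rgb=(0, 0, 0)):
    """Return a color image as three 2-D arrays: red, green and blue."""
    red, green, blue = (make_gray_image(rows, cols, v) for v in rgb)
    return {"red": red, "green": green, "blue": blue}


def resolution(image):
    """Return (number of rows, number of columns) of a 2-D array."""
    return len(image), len(image[0])


gray = make_gray_image(128, 128, value=255)           # all-white image
color = make_color_image(128, 128, rgb=(255, 0, 0))   # pure red image

print(resolution(gray))   # (128, 128)
print(color["red"][0][0], color["green"][0][0], color["blue"][0][0])  # 255 0 0
```

Each pixel of the color image is read by taking the same row and column from all three arrays.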

Image Formats
• TIF, PGM, PBM, GIF, JPEG, RAW

Video
• Sequence of frames
• 30 frames per second
• Formats: AVI, MPEG, QuickTime

Video Clip

Sequence of Images

Image Formation
• Light source
• Camera (extrinsic and intrinsic parameters)
• Scene (surface reflectance, surface shape)

Perspective Projection (Pin Hole)
[Figure: a world point (X,Y,Z) is projected onto the image plane; labels: image, y, Z, f]
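The equations from this slide did not survive extraction. A minimal sketch, assuming the standard pinhole relations x = fX/Z and y = fY/Z, with the camera at the origin looking along +Z and the image inversion ignored. The function name `perspective_project` is illustrative.

```python
def perspective_project(point, f):
    """Project a world point (X, Y, Z) to image coordinates (x, y).

    Assumes the standard pinhole equations x = f*X/Z and y = f*Y/Z,
    where f is the focal length and Z is the depth of the point.
    """
    X, Y, Z = point
    if Z <= 0:
        raise ValueError("point must be in front of the camera (Z > 0)")
    return (f * X / Z, f * Y / Z)


near = perspective_project((2.0, 1.0, 4.0), f=1.0)
far = perspective_project((2.0, 1.0, 8.0), f=1.0)
print(near)  # (0.5, 0.25)
print(far)   # (0.25, 0.125)
```

Doubling the depth halves the image coordinates, which is why distant objects appear smaller under perspective projection.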

Orthographic Projection
[Figure: a world point (X,Y,Z) is projected onto the image plane]
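A minimal sketch of orthographic projection, assuming the usual definition in which the depth coordinate is simply dropped (x = X, y = Y). The function name `orthographic_project` is illustrative.

```python
def orthographic_project(point):
    """Project a world point (X, Y, Z) by discarding its depth Z."""
    X, Y, _Z = point
    return (X, Y)


near = orthographic_project((2.0, 1.0, 4.0))
far = orthographic_project((2.0, 1.0, 8.0))
print(near)         # (2.0, 1.0)
print(near == far)  # True
```

Unlike the pinhole model, the image position here does not change with depth.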

Shape from X
• Recover 3-D shape from 2-D image(s)
• Stereo
• Motion
• Shading
• Texture
• Contours

Stereo

http://www.vision3d.com/stereo.html

Renault Stereo Pair

Depth Map

Shape from Shading

Lambertian Model

S=L, light source

(1, 0, 1) (-1, 1, 1) (-1,-1, 1)
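A rough sketch of the Lambertian model, assuming the three triples above are light-source directions (the slide does not label them) and using the standard form: intensity = albedo × max(0, n · l) for unit surface normal n and unit light direction l. The function names and the example normal are illustrative.

```python
import math


def normalize(v):
    """Scale a 3-vector to unit length."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)


def lambertian_intensity(normal, light, albedo=1.0):
    """Lambertian shading: albedo * max(0, n . l) with unit vectors n and l."""
    n = normalize(normal)
    l = normalize(light)
    cos_theta = sum(a * b for a, b in zip(n, l))
    return albedo * max(0.0, cos_theta)


lights = [(1, 0, 1), (-1, 1, 1), (-1, -1, 1)]  # assumed light directions
normal = (0, 0, 1)  # hypothetical surface patch facing the viewer

for light in lights:
    print(light, round(lambertian_intensity(normal, light), 3))
# (1, 0, 1) 0.707
# (-1, 1, 1) 0.577
# (-1, -1, 1) 0.577
```

The same patch gets a different brightness under each light, and that variation is the cue shape from shading uses to recover surface orientation.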

Shape from Texture

Visual Motion

Shape from Motion: Moving Light Display

Shape from Motion

Photosynth

Sequence Raw Optical flow

Video Clip & Mosaic

Applications of Computer Vision
• Face Recognition
• Object Recognition
• Video Surveillance and Monitoring
  • Object detection, tracking and behavior analysis
• Remote Sensing: UAVs
• Robotics
• Computer Graphics

Object Recognition

Finding People in Images
• Problem 1: Given an image I
• Question: Does I contain an image of a person?

“Yes” Instances

“No” Instances

Localize People (Human Detection)

Human Detection

Airplanes

Motorcycles

Face Recognition

Taming Wild Faces: Web‐Scale, Open‐Universe Face Recognition

CRCV | Center for Research in Computer Vision, University of Central Florida

What are wild faces?
• 350 million photos uploaded daily
• 100 hours of movies uploaded per hour
• 60,000 movies

Why are wild faces important?

Three Tasks of Face Recognition

Most Research Focus

Most Research Focus
• FERET: 200 people, 1400 faces
• AR: 100 people, 1400 faces
• Ext. Yale B: 38 people, 2400 faces
• PubFig83: 83 people, >830 faces
• YouTube Celebrities: 47 people, 1910 faces

Most Research Focus: Pair-Matching Task (Same / Not Same)
• PubFig: 200 people, 58,797 faces
• LFW: 5,749 people, 13,233 faces
• YouTube Faces: 1,595 people, 3,425 faces

Most Research Focus: Pair-Matching Task

Most Realistic Scenario: Open-Universe Face Identification

News Article: Label Important Figures

Social Network: Tag Facebook Friends

Barack Obama

Open-Universe Face Identification
Find Angelina Jolie and George Clooney

Open-Universe Face Identification
Gwyneth Paltrow, Robert Downey Jr.

Face Recognition in Movie Trailers via Mean Sequence Sparse Representation‐based Classification

CVPR 2013

Recognition – Qualitative
• Known: Paul Rudd, Steve Carell
• Known: Owen Wilson, Reese Witherspoon, Paul Rudd
• Known: Bruce Willis, Morgan Freeman
• Known: Tom Cruise, Cameron Diaz

FACIAL EXPRESSIONS

RAISE EYEBROWS, SMILE

Detecting Driver Alertness

Lipreading

Video Surveillance and Monitoring

Automated Surveillance System (Detection & Tracking)
• Object detection
• Object tracking
• Object categorization and classification
• Event or activity recognition

A. I Person Tracking

A. II Part Tracking

NONA: Project Overview
• Part of the WAS (Wide Area Surveillance) project executed by the Homeland Security Advanced Research Projects Agency (HSARPA)
• Current Sensor
  • 8 high-resolution cameras
  • Provide a 100-megapixel, 360° field of view
  • Frame rate: 5 frames per second
• Next Generation
  • 48 cameras with significantly higher resolution
  • Smaller size

[Figure: Current Sensor, Next Generation]

NONA System: Airport Sequence 1

UAV: Unmanned Aerial Vehicle

UAVs: Unmanned Aerial Vehicles (Drones)

Global Hawk

Predator

Microdrone

KINGFISHER AEROSTAT BALLOON

COCOA – System Flow
Inputs: Aerial Video, Telemetry*
• Ego Motion Compensation: Feature Based + Gradient Based
• Motion Detection: Accumulative Frame Differencing + Background Modeling + Object Segmentation
• Object Tracking: Kernel Tracking + Blob Tracking + Occlusion Handling
Stage outputs: Registered Images, Motion Detection, Tracks
• Event Detection & Indexing
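To make one of these stages concrete, here is a toy sketch of accumulative frame differencing. It is not the COCOA implementation. It assumes the frames are already registered (ego motion compensated) and are small grayscale 2-D lists. The function names, `threshold` and `min_hits` are illustrative choices.

```python
def frame_difference(prev, curr, threshold):
    """Binary mask: 1 where the gray level changed by more than threshold."""
    return [
        [1 if abs(c - p) > threshold else 0 for p, c in zip(prev_row, curr_row)]
        for prev_row, curr_row in zip(prev, curr)
    ]


def accumulative_difference(frames, threshold=20, min_hits=2):
    """Mark pixels that changed in at least min_hits consecutive-frame pairs."""
    rows, cols = len(frames[0]), len(frames[0][0])
    hits = [[0] * cols for _ in range(rows)]
    for prev, curr in zip(frames, frames[1:]):
        mask = frame_difference(prev, curr, threshold)
        for r in range(rows):
            for c in range(cols):
                hits[r][c] += mask[r][c]
    return [[1 if h >= min_hits else 0 for h in row] for row in hits]


# A bright pixel moving left to right across four dark 1x4 frames.
frames = [
    [[200, 0, 0, 0]],
    [[0, 200, 0, 0]],
    [[0, 0, 200, 0]],
    [[0, 0, 0, 200]],
]
print(accumulative_difference(frames))  # [[0, 1, 1, 0]]
```

Accumulating over several frame pairs suppresses pixels that change only once, which a single frame difference would report as motion.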

Registration Result ‐ I

Aerial Video - EO Mosaic

Alignment Mask

Registration Result ‐ II

Aerial Video - IR Mosaic

Alignment Mask

Detection Results

Tracking Results

Wide Area Surveillance

Tracking Results

Robot Vision (Unmanned Ground Vehicle)

Human Action Recognition

Events, Actions, Activities,

• Action

• Event

• Movement

• Activity

• Interaction

• Verb

• ….

Weizmann Action Dataset
• 10 actions
• 9 actors per action
• Bend, Jack, Wave2hands, Wave1hand, Jumpinplace, Sidestep, Hop, Walk, Skip, Run

KTH Data Set
• Six categories, 25 actors, 4 instances, 600 clips
• Boxing, HandClapping, HandWaving, Jogging, Running, Walking
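The clip count on this slide is the product of the three factors listed. A quick check, with variable names chosen only for this example:

```python
categories = 6   # action categories
actors = 25      # actors performing each action
instances = 4    # instances per actor and action

total_clips = categories * actors * instances
print(total_clips)  # 600
```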

UCF Sports Action Dataset
• 9 actions, 142 videos
• Bench Swing, Dive, Swing, Run, Kick, Lift, Ride, Golf Swing, Skate

IXMAS Multi‐view Data Set

• 13 action categories, 4 camera views, 10 actors, 3 instances.

View1 View2 View3 View4

CheckWatch

UCF YouTube Action Dataset (UCF‐11)
• Cycling, Diving, GolfSwinging, Riding
• Juggling, BasketballShooting, Swinging, TennisSwinging
• VolleyballSpiking, TrampolineJumping, WalkingDog

UCF-101
• BaseballPitch, BasketballShooting, Biking, BenchPress, Billiards
• Breaststroke, CleanandJerk, Drumming, Diving, Fencing
• GolfSwing, HighJump, HorseRiding, HorseRace, HulaHoop
• JavelinThrow, JugglingBalls, JumpRope, JumpingJack, Kayaking
• Lunges, MilitaryParade, Nunchucks, MixingBatter, PizzaTossing
• PlayingGuitar, PlayingPiano, PlayingViolin, PlayingTabla, PoleVault
• PommelHorse, PullUps, PushUps, Punch, RockClimbingIndoor
• RopeClimbing, Rowing, SkateBoarding, SalsaSpins, Skiing
• Skijet, SoccerJuggling, TaiChi, Swing, TennisSwing
• ThrowDiscus, TrampolineJumping, Walkingwithadog, VolleyballSpiking, YoYo

Microsoft Kinect sensor

• Data Captured using Microsoft Kinect sensor

• Approximately 50,000 gesture samples

RGB Camera

IR Camera 1 IR Camera 2

Gesture Lexicons

Gestures from Depth camera

Gestures from RGB camera

Diving Signals Referee Signals Nurse Gesture Music Notes

Discovered Primitives

Representative Motion Primitives (out of 136) for different batches

Left arm moving down

Left arm moving up

Left arm moving to body

Left hand moving forward

Left arm waving up

Left arm moving away from body

Right arm moving laterally up

Right arm moving away from body

Left shoulder moving to right

Left arm moving to right

Left shoulder moving down

Right arm moving up

Left arm moving up

Right arm moving down

Test Case 1: Torso motion adds noise (devel 01 – 10 gestures)

Instances of Successful Recognition

Instances of Failed Recognition

Test Case 2: Improvisations (devel 06 – 9 gestures)

Test Case 3: Subtle differences (devel 09 – 10 gestures)

High Density Crowded Scenes
• Political Rallies
• Religious Festivals
• Marathons
• High Density Moving Objects

Tracking in Crowds
• Average chip size: 14 × 22 pixels
• 492 frames
• Selected 199 athletes for tracking
• Successfully tracked 143 athletes

Results

Experiment – 1

Experiment – 3
• Average chip size: 14 × 17 pixels
• 453 frames
• Selected 50 athletes for tracking

Behaviors in Crowded Scenes
• Bottleneck
• Departure
• Lanes
• Arch/Rings
• Blockings

Proposed Method = 640, Ground truth = 634

Ground truth = 2322, Proposed Method = 2203
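One way to compare the two pairs of numbers above is relative error with respect to the ground truth. The slides do not say which error measure was used, so the metric and the helper name `relative_error` are assumptions made for this sketch.

```python
def relative_error(estimate, truth):
    """Absolute difference between estimate and truth, as a fraction of truth."""
    return abs(estimate - truth) / truth


pairs = [(640, 634), (2203, 2322)]  # (proposed method, ground truth)
for estimate, truth in pairs:
    print(f"{estimate} vs {truth}: {relative_error(estimate, truth):.2%}")
# 640 vs 634: 0.95%
# 2203 vs 2322: 5.12%
```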

Where Am I?

“Where Am I?” Problem:

Accurate Image Localization

Input: Mere visual information (images)
Output: Location in terms of λ (Lon.) and φ (Lat.): φ = 40.4419, λ = -79.9986

Qualitative Image Localization Results:
• 68.3%: The best match found
  • Match – Error: 6.4 m
  • Match – Error: 7.5 m
  • Match – Error: 62.7 m
• 13.2%: A correct match found, but not the best one
  • Match – Error: 307.7 m
  • Match – Error: 157.3 m
  • Match – Error: 71.3 m
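The errors above are distances in meters between a query location and a matched location. A rough sketch of how such a distance can be computed from two latitude/longitude pairs, using the haversine great-circle formula with a mean Earth radius of 6,371 km. The slides do not state which distance measure was used, so the formula, the constant and the "match" coordinates below are assumptions.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius in meters (assumed constant)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


query = (40.4419, -79.9986)  # the example coordinates from the slides
match = (40.4420, -79.9986)  # hypothetical match, 0.0001 degrees further north
print(round(haversine_m(*query, *match), 1))  # roughly 11 m
```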

Geospatial Trajectory Extraction

Visual Business Recognition
ACM Multimedia 2013

Computer Vision for Computer Graphics

Video Completion

Layer Based Video Composition

Results of Doll

Results of Mom-Daughter

Multimedia: Segmentation of Moving-Sounding Objects

Accepted in IEEE Transactions on Multimedia

Computer Vision Text Books

Computer Vision Researchers
• Late Azriel Rosenfeld (Maryland)
• Berthold Horn (MIT)
• Thomas Huang (UIUC)
• Ruzena Bajcsy
• Jake Aggarwal (UT Austin)
• Chris Brown (Rochester)
• Bob Haralick (CUNY)
• Olivier Faugeras (INRIA, France)
• Takeo Kanade (CMU)
• Sandy Pentland (MIT)
• Shree Nayar (Columbia)
• John Canny (Berkeley)
• Demetri Terzopoulos (UCLA)
• Ramesh Jain (UCI)
• Jitendra Malik (Berkeley)
• Andrew Zisserman (Oxford)
• Andrew Blake (Microsoft)
• Rama Chellappa
• Anil Jain
• David Forsyth
• Rick Szeliski (Microsoft)
• David Lowe
• Cordelia Schmid

Computer Vision Journals

Computer Vision Conferences
• IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
  • June 24-27, 2014
  • Greater Columbus Convention Center in Columbus, Ohio
• International Conference on Computer Vision (ICCV)
• European Conference on Computer Vision (ECCV)
• International Conference on Pattern Recognition (ICPR)
• Asian Conference on Computer Vision (ACCV)
• International Conference on Image Processing
