eecs 286 advanced topics in computer vision ming-hsuan yang

Post on 16-Dec-2015


EECS 286 Advanced Topics in Computer Vision

Ming-Hsuan Yang

Computer vision

• Holy grail – tell a story from an image

History

• “In the 1960s, almost no one realized that machine vision was difficult.” – David Marr, 1982

• Marvin Minsky asked Gerald Jay Sussman to “spend the summer linking a camera to a computer and getting the computer to describe what it saw” – Crevier, 1993

• 40+ years later, we are still working on this

1970s

1980s

1990s

• Face detection

• Particle filter

• Pfinder

• Normalized cut

2000s

• SIFT

– Mosaicing, panorama

– Object recognition

– Photo tourism, Photosynth

– Human detection

• Adaboost-based face detector

Frontiers in computer vision

• NSF-sponsored workshop at MIT CSAIL, August 21 to 24, 2011

– identify the future impact of computer vision on the economic, social, and security needs of the nation

– outline the scientific and technological challenges to address

– draft a roadmap to address those challenges and realize the benefits

• Read the current white papers

• Read the 1991 workshop final reports

Related topics

Conferences

• CVPR – Computer Vision and Pattern Recognition, since 1983

– Annual, held in the US

• ICCV – International Conference on Computer Vision, since 1987

– Every other year, alternating among 3 continents

• ECCV – European Conference on Computer Vision, since 1990

– Every other year, held in Europe

Conferences (cont’d)

• ACCV – Asian Conference on Computer Vision

• BMVC – British Machine Vision Conference

• ICPR – International Conference on Pattern Recognition

• SIGGRAPH

• NIPS – Neural Information Processing Systems

Conferences (cont’d)

• MICCAI – Medical Image Computing and Computer-Assisted Intervention

• ISBI – International Symposium on Biomedical Imaging

• FG – IEEE Conference on Automatic Face and Gesture Recognition

• ICCP, ICDR, ICVS, DAGM, CAIP, MVA, AAAI, IJCAI, ICML, ICRA, ICASSP, ICIP, SPIE, DCC, WACV, 3DPVT, ACM Multimedia, ICME, …

Journals

• PAMI – IEEE Transactions on Pattern Analysis and Machine Intelligence, since 1979 (impact factor: 5.96, #1 in all engineering and AI, top-ranked IEEE and CS journal)

• IJCV – International Journal of Computer Vision, since 1988 (impact factor: 5.36, #2 in all engineering and AI)

• CVIU – Computer Vision and Image Understanding, since 1972 (impact factor: 2.20)

Journals (cont’d)

• IVC – Image and Vision Computing

• IEEE Transactions on Medical Imaging

• TIP – IEEE Transactions on Image Processing

• MVA – Machine Vision and Applications

• PR – Pattern Recognition

• TM – IEEE Transactions on Multimedia

• …

Tools

• Google Scholar, CiteSeer

• h-index

• Software: Publish or Perish

• Disclaimer:

– h-index = significance?

– # of citations = significance?
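As a rough illustration of the h-index listed under Tools, the sketch below uses the usual definition: the largest h such that at least h papers have at least h citations each. The function name `h_index` and the citation counts are hypothetical and chosen only for this example; tools such as Google Scholar may count citations differently.

```python
def h_index(citations):
    """Return the largest h such that at least h papers have >= h citations."""
    ranked = sorted(citations, reverse=True)  # most-cited paper first
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank  # the top `rank` papers all have at least `rank` citations
        else:
            break
    return h


# Hypothetical citation counts, not real data.
example_counts = [10, 8, 5, 4, 3]
print(h_index(example_counts))  # 4
```

Note that a single paper with 1000 citations gives an h-index of 1 under this definition, which is one reason the disclaimer above questions whether the h-index equals significance.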

Challenging issues

• Large scale

• Unconstrained

• Real-time

• Robustness

• Recover from failure – graceful degradation

Recent topics

• Object detection, segmentation, recognition, categorization

• Deep learning

• Internet scale image search

• Video search

• 3D human pose estimation

• Computational photography

• Scene understanding

Some tools

• Prior

• Context

• Sparse representation

• Multiple instance learning

• Online learning

• Convex optimization

• Constraint

• Hashing

Prior

Torralba and Sinha ICCV 01

Prior

Heitz and Koller ECCV 08

Prior

He et al. CVPR 09

Jia CVPR 08

Scene understanding

Leibe et al. CVPR 07

Computational photography

Johnson and Adelson CVPR 09

Computational photography

• GelSight:

– http://www.mit.edu/~kimo/gelsight/

• Lytro:

– http://www.lytro.com/

Image and video search

• Google image search

– http://images.google.com/

• Videosurf

– http://www.videosurf.com/

Current state of the art

• You just saw examples of current systems.

– Many of these are less than 5 years old

• This is a very active research area, and rapidly changing

– Many new applications in the next 5 years

• To learn more about vision applications and companies

– David Lowe maintains an excellent overview of vision companies

– http://www.cs.ubc.ca/spider/lowe/vision.html

• Confluence of vision, graphics, learning, sensing and signal processing

Software and hardware

• Algorithms: processing images and videos

• Camera: acquiring images/videos

• Embedded system

Class mechanics

• Papers will be assigned weekly

• One student needs to present 2 or 3 papers in detail

• All students need to read and write critiques

• Presentation and discussion

Prerequisites

• Prerequisites: these are essential!

– Data structures

– A good working knowledge of MATLAB, C, and C++ programming

– Linear algebra

– Vector calculus

– EECS 274 Computer Vision

– EECS 274 Matrix Computation

Topics

• Low-level vision: feature, edge, texture, deblurring, visual saliency

• Mid-level vision: segmentation, superpixels

• High-level vision: object detection, object recognition, visual tracking, super resolution

• Learning algorithms: Markov random field, conditional random field, graphical model, belief propagation, active learning, multi-view learning

Textbooks and references

• Textbooks

– Computer Vision: A Modern Approach, David Forsyth and Jean Ponce

– Computer Vision: Algorithms and Applications, Richard Szeliski

– The Elements of Statistical Learning, Hastie, Tibshirani, Friedman

• Reference for background study:

– Introductory Techniques for 3-D Computer Vision, Emanuele Trucco and Alessandro Verri

– Multiple View Geometry in Computer Vision, Richard Hartley and Andrew Zisserman

– Robot Vision, Berthold Horn

– Learning OpenCV: Computer Vision with the OpenCV Library, Gary Bradski and Adrian Kaehler

• Reading assignments will be from the text and additional material that will be handed out or made available on the web page

• All lecture slides will be available on the course website

http://faculty.ucmerced.edu/mhyang/course/eecs286/index.htm

Grading

• 30% Critiques

• 10% Presentation

• 20% Midterm report

• 10% Final project presentation

• 30% Term project
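The grading weights above sum to 100%. As a small sketch of how they would combine into a course score, assuming each component is scored on a 0 to 100 scale (an assumption; the slides do not state the scale), the weighted sum can be computed as follows. The names `WEIGHTS` and `course_score` and the sample scores are hypothetical.

```python
# Weights taken from the grading slide; the dictionary keys are illustrative names.
WEIGHTS = {
    "critiques": 0.30,
    "presentation": 0.10,
    "midterm_report": 0.20,
    "final_project_presentation": 0.10,
    "term_project": 0.30,
}


def course_score(scores):
    """Weighted sum of component scores, each assumed to be on a 0-100 scale."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical component scores for one student.
sample = {
    "critiques": 90,
    "presentation": 80,
    "midterm_report": 70,
    "final_project_presentation": 100,
    "term_project": 85,
}
print(round(course_score(sample), 2))  # 84.5
```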

Term Project

• Open-ended project of your choosing

• Oral presentation

– Midterm presentation

– Final presentation and demo

• Publish your results
