understanding video with deep convolubonal neural networks · computer vision victoria dean, carl...

1

Computer Vision Victoria Dean, Carl Vondrick, Antonio Torralba (Electrical Engineering and Computer Science) Cisco Undergraduate Research and InnovaBon Scholar Understanding Video with Deep ConvoluBonal Neural Networks Neural Networks Unsupervised Learning for Frame PredicBon Current Frame PredicBon Models Current status Next Steps and References Given past frames, predict future frame • The previous frames are inputs to the predictor, which a6empts to generate the next frame x t+1 • Cost func<on computes how close predicted frame pixels are to actual next frame pixels Problem: Predictor tends to output mean frame Neural network computes a differen<able func<on of its input A convolu<onal neural network (CNN) applies the same func<on to each sec<on of the input (good for computer vision tasks) Useful for many tasks, including pedestrian detec<on for self-driving cars Figure from [3] Figures and defini<ons from [2] Photo from [1] • Try out different network architectures (number of layers, dimensions, etc.) and determine which is the most successful for predic<on • Experiment with different cost func<ons: do any yield predic<ons that are be6er than just mean? • Implement LSTM (Long Short Term Memory) network for learning greater context in videos References [1] h6p://www.nvidia.com/content/tegra/automo<ve/images/driver- assistance/pedestrian-detec<on-large.jpg [2] h6p://www.image-net.org/challenges/LSVRC/2012/supervision.pdf [3] h6p://arxiv.org/abs/1412.6604 [4] h6p://caffe.berkeleyvision.org/ [5] h6p://arxiv.org/abs/1311.2524 • Implementa<on of simple predic<on model in Caffe, a popular open-source library for CNNs [4] • Current model generates future frame given one past frame • Simplified problem to predic<on of cropped frame based on object detector bounding boxes [5] • Considering cost func<ons that will hopefully improve issue with outpu_ng mean frame Much successful work on images, but less done on videos (labeling videos is expensive)

Upload: others

Post on 24-Oct-2020

0 views

Category:

Documents

0 download

Report

Download

Embed Size (px):

TRANSCRIPT

ComputerVision

VictoriaDean,CarlVondrick,AntonioTorralba(ElectricalEngineeringandComputerScience)CiscoUndergraduateResearchandInnovaBonScholar

UnderstandingVideowithDeepConvoluBonalNeuralNetworks

NeuralNetworks

UnsupervisedLearningforFramePredicBon CurrentFramePredicBonModels

Currentstatus NextStepsandReferences

Givenpastframes,predictfutureframe•  Thepreviousframesareinputstothepredictor,

whicha6emptstogeneratethenextframext+1•  Costfunc

La industria lítica del yacimiento achelense de Torralba (Soria, … · 2020. 8. 13. · La industria lítica del yacimiento achelense de Torralba 43 Trab. Prehist., 72, N.º 1,

Fast and Compact Retrieval Methods in Computer Vision Part II A. Torralba, R. Fergus and Y. Weiss. Small Codes and Large Image Databases for Recognition

Yusuf Aytar, Carl Vondrick, Antonio Torralba Abstract ... · Yusuf Aytar, Carl Vondrick, Antonio Torralba Massachusetts Institute of Technology fyusuf,vondrick,[email protected]

854 IEEE TRANSACTIONS ON PATTERN ANALYSIS AND ...people.csail.mit.edu/torralba/publications/5papers/...Street, 32-D462, Cambridge, MA 02139. E-mail: [email protected].. K.P. Murphy

Conferencia José Manuel Torralba

Carl Vondrick, Antonio Torralba Adria Recasens*, Aditya ...vision.cs.utexas.edu/381V-spring2016/slides/goel-expt.pdfWhere are they looking? Adria Recasens*, Aditya Khosla*, Carl Vondrick,

massachusetts institute of technology — computer science and …web.mit.edu/torralba/www/extendedCVPR2004.pdf · 2004. 4. 27. · Boosting [25], [10] provides a simple way to sequen-tially

Accidental pinhole and pinspeck cameras: revealing the scene outside the picture A. Torralba and W. T. Freeman Proceedings of 25 IEEE Conference on Computer

Domain Adaptation - Department of Computer Science ... › ~castrejon › content › domain_adaptation_short.pdfLluis Castrejon, Yusuf Aytar, Carl Vondrick, Hamed Pirsiavash, Antonio

Sharing features for multi-class object detection Antonio Torralba, Kevin Murphy and Bill Freeman MIT Computer Science and Artificial Intelligence Laboratory

Bag-of-features models Many slides adapted from Fei-Fei Li, Rob Fergus, and Antonio Torralba

Memorability of Image Regions...Memorability of Image Regions Aditya Khosla Jianxiong Xiao Antonio Torralba Aude Oliva Massachusetts Institute of Technology fkhosla,xiao,torralba,[email protected]

Color and color constancy 6.869, MIT Bill Freeman Antonio Torralba Feb. 22, 2011

Specular reflections and the perception of shapelewicki/cp-s08/Fleming-Torralba-Adelson-04-JOV.… · Journal of Vision (2004) 4, 798-820 Fleming, Torralba, & Adelson 799 Indeed,

Generating the Future With Adversarial Transformers · Generating the Future with Adversarial Transformers Carl Vondrick and Antonio Torralba Massachusetts Institute of Technology

Learning visual biases from human imaginationvondrick/imagination/paper.pdf · Learning visual biases from human imagination Carl Vondrick Hamed Pirsiavashy Aude Oliva Antonio Torralba

Unbiased Look at Dataset Bias - People | MIT CSAILpeople.csail.mit.edu/torralba/publications/datasets_cvpr... · 2011-06-28 · Unbiased Look at Dataset Bias Antonio Torralba Massachusetts

UNBIASED LOOK AT DATASET BIAS - Computer Scienceweb.cs.ucdavis.edu/~yjlee/teaching/ecs289h-fall2014/Pres...UNBIASED LOOK AT DATASET BIAS Antonia Torralba Alexei A. Efros MIT CMU Presented

HOGgles: Visualizing Object Detection Featuresopenaccess.thecvf.com/...HOGgles_Visualizing_Object_2013_ICCV_p… · HOGgles: Visualizing Object Detection Features∗ Carl Vondrick,

Learning visual biases from human imaginationpapers.nips.cc/...learning-visual-biases-from-human-imagination.pdf · Learning visual biases from human imagination Carl Vondrick Hamed

Following Gaze in Video - openaccess.thecvf.comopenaccess.thecvf.com/.../Recasens_Following_Gaze... · Following Gaze in Video Adri`a Recasens Carl Vondrick Aditya Khosla Antonio

HOGgles: Visualizing Object Detection Features (to be appeared in ICCV 2013) Carl Vondrick Aditya Khosla Tomasz Malisiewicz and Antonio Torralba,MIT Presented

Proceedings of the IEEE 2010 Antonio Torralba, MIT Jenny Yuen, MIT Bryan C. Russell, MIT

Opportunities of Scale Computer Vision James Hays, Brown Many slides from James Hays, Alyosha Efros, and Derek Hoiem Graphic from Antonio Torralba

QDQ media: The Beauty Concept por Paz Torralba

Dirichlet Process Mixture Models: Application to Brain ...bglocker/pdfs/castro2016bnp.pdf · applied in computer vision for natural image segmentation (e.g. Torralba et al., 2006;

Quantifying Context Effects on Image Memorabilityfigrim.mit.edu/figrimPoster.pdf · Zoya Bylinskii, Phillip Isola, Antonio Torralba, Aude Oliva Computer Science and Artiﬁcial Intelligence

CS 4476 / 6476: Computer Visionhays/compvision2017/lectures/...Accidental Pinhole and Pinspeck Cameras Revealing the scene outside the picture. Antonio Torralba, William T. Freeman

HOGgles: Visualizing Object Detection Features...HOGgles: Visualizing Object Detection Features∗ Carl Vondrick, Aditya Khosla, Tomasz Malisiewicz, Antonio Torralba Massachusetts

Following Gaze in Video...Adri`a Recasens Carl Vondrick Aditya Khosla Antonio Torralba Massachusetts Institute of Technology {recasens, vondrick, khosla, torralba}@csail.mit.edu 0.8

Unbiased Look at Dataset Bias...Unbiased Look at Dataset Bias Antonio Torralba Massachusetts Institute of Technology [email protected] Alexei A. Efros Carnegie Mellon University

Active Appearance Models Computer examples A. Torralba T. F. Cootes, C.J. Taylor, G. J. Edwards M. B. Stegmann