Page 1: Large Scale Image Processing

Large Scale Image Processing with Hadoop

Brandyn White

Advisor: Prof. Larry Davis

Page 2: Large Scale Image Processing

Outline

• 'Big Data' in Computer Vision
• Map/Reduce and Computer Vision
• Map/Reduce Image Search
• Application: Screenshot Retrieval

Page 3: Large Scale Image Processing

'Big Data' in Vision

• Traditional Vision: Focus on the model
  o Pose Est.: 2D Image -> Virtual 3D model + Camera
      Under-constrained, slow, sensitive to noise
  o Object Recognition: SVM + features
      Breaks with many classes (e.g., every flickr tag)
• New Trend: Focus on the data
  o DB of images (w/ metadata) -> query image
  o Problem becomes similar image search
  o Transfer metadata from DB images to query image
  o KNN methods simple and scalable
      Clustering, hashing, metric learning
• NLP: rule-based models -> statistical models

Page 4: Large Scale Image Processing

Example: Image Search -> Metadata

Query Image

Page 5: Large Scale Image Processing

Example: Image Search -> Metadata

Query Image -> Retrieved Images (flickr)

Metadata attached to each retrieved image: Tags, Location (GPS), Title, Date, Groups, Comments, Owner, Views

Page 6: Large Scale Image Processing

Example: Image Search -> Metadata

Query Image -> Retrieved Images (flickr)

Metadata attached to each retrieved image: Tags, Location (GPS), Title, Date, Groups, Comments, Owner, Views

Output Metadata: Tags, Location (GPS)

Page 7: Large Scale Image Processing

Big Data in Vision: Pose Estimation

Goal: Given an image of a person, estimate 3D pose.

G. Shakhnarovich, P. Viola, and T. Darrell, "Fast pose estimation with parameter-sensitive hashing," October 2003.

Page 8: Large Scale Image Processing

Big Data in Vision: Scene Completion

Goal: Given an image and a selected region, fill the region with a plausible texture.

J. Hays and A. A. Efros, "Scene completion using millions of photographs," in SIGGRAPH '07: ACM SIGGRAPH 2007 papers. New York, NY, USA: ACM, 2007, pp. 4+.

Page 9: Large Scale Image Processing

Big Data in Vision: IM2GPS

Goal: Given an image, guess where in the world it was taken.

J. Hays and A. A. Efros, "Im2gps: estimating geographic information from a single image," Computer Vision and Pattern Recognition, IEEE Computer Society Conference on, vol. 0, pp. 1-8, 2008.

Page 10: Large Scale Image Processing

Big Data in Vision: Object Recognition

Goal: Given an image, select a noun that describes it.

A. Torralba, R. Fergus, and W. T. Freeman, "80 million tiny images: A large data set for nonparametric object and scene recognition," Pattern Analysis and Machine Intelligence, IEEE Transactions on, vol. 30, no. 11, pp. 1958-1970, May 2008

Page 11: Large Scale Image Processing

Big Data in Vision: Pixel Annotation

Goal: Given an image, annotate every pixel (e.g., building).

C. Liu, J. Yuen, and A. Torralba, "Nonparametric scene parsing: Label transfer via dense scene alignment," Computer Vision and Pattern Recognition, IEEE Computer Society Conference on, vol. 0, pp. 1972-1979, 2009.

Page 12: Large Scale Image Processing

Big Data in Vision: One Frame Motion

Goal: Given an image, estimate the pixel motion.

C. Liu, J. Yuen, A. Torralba, J. Sivic, and W. T. Freeman, "Sift flow: Dense correspondence across different scenes," in ECCV '08: Proceedings of the 10th European Conference on Computer Vision. Berlin, Heidelberg: Springer-Verlag, 2008, pp. 28-42.

Page 13: Large Scale Image Processing

Outline

• 'Big Data' in Computer Vision
• Map/Reduce and Computer Vision
• Map/Reduce Image Search
• Application: Screenshot Retrieval

Page 14: Large Scale Image Processing

Hadoop+CV: No Reducer

Example Maps
• Object Detection (e.g., cars, faces)
• Feature Computation (e.g., SIFT)
• Sliding Windows (given a region+image)

[Diagram: three independent Map tasks, no Reduce stage]
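A map-only job of this kind can be sketched in plain Python. The record layout, the function names, and the `compute_features` stand-in (a coarse intensity histogram rather than SIFT) are illustrative assumptions, not code from the talk.

```python
from typing import Iterable, Iterator, List, Tuple

# Hypothetical record layout: (image_id, flat list of 0-255 gray pixel values)
Record = Tuple[str, List[int]]


def compute_features(pixels: List[int], bins: int = 4) -> List[float]:
    """Stand-in for a real descriptor such as SIFT: a normalized intensity histogram."""
    hist = [0] * bins
    for p in pixels:
        hist[min(p * bins // 256, bins - 1)] += 1
    total = float(len(pixels)) or 1.0  # avoid dividing by zero on an empty image
    return [h / total for h in hist]


def mapper(records: Iterable[Record]) -> Iterator[Tuple[str, List[float]]]:
    """Each record is processed independently, so no reduce stage is needed."""
    for image_id, pixels in records:
        yield image_id, compute_features(pixels)


if __name__ == "__main__":
    images = [("a", [0, 10, 200, 255]), ("b", [128, 130, 131, 129])]
    for key, value in mapper(images):
        print(key, value)
```

Because no record depends on any other, the mapper output can be written straight to storage, which is what makes the no-reducer pattern cheap.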

Page 15: Large Scale Image Processing

Hadoop+CV: Model Creation

Map: Feature Computation
Reduce: Model Creation

Examples
• Classifiers (e.g., SVM, Bayes)
• Geometry Problems (e.g., RANSAC, SfM)

[Diagram: three Map tasks feeding one Reduce task]

Page 16: Large Scale Image Processing

Hadoop+CV: Expectation Maximization

Map: Fit data to model given parameters (E-Step)
Reduce: Compute new model parameters given data (M-Step)
Iterate until stopping conditions are met.

Examples
• Clustering (e.g., K-Means)
• Mixture Models (e.g., MoG)

[Diagram: input vectors Vec0, Vec1, Vec2 -> three Map tasks, each reading the current parameter estimate (in JAR or cache) -> one Reduce task]
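The iterate-until-converged pattern can be sketched with 1-D K-Means, simulating the map, shuffle, and reduce phases in a single process. The function names, the tolerance, and the iteration cap are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, List


def map_step(points: List[float], means: List[float]) -> Dict[int, List[float]]:
    """E-step: assign each point to its nearest mean (the grouping stands in for the shuffle)."""
    groups: Dict[int, List[float]] = defaultdict(list)
    for x in points:
        nearest = min(range(len(means)), key=lambda k: abs(x - means[k]))
        groups[nearest].append(x)
    return groups


def reduce_step(groups: Dict[int, List[float]], means: List[float]) -> List[float]:
    """M-step: recompute each mean; a cluster with no points keeps its old mean."""
    return [
        sum(groups[k]) / len(groups[k]) if groups.get(k) else means[k]
        for k in range(len(means))
    ]


def run_kmeans(points: List[float], means: List[float],
               tol: float = 1e-6, max_iters: int = 100) -> List[float]:
    """Driver loop: one map/reduce pass per iteration until the means stop moving."""
    for _ in range(max_iters):
        new_means = reduce_step(map_step(points, means), means)
        if max(abs(a - b) for a, b in zip(new_means, means)) < tol:
            return new_means
        means = new_means
    return means


if __name__ == "__main__":
    print(run_kmeans([1, 2, 3, 10, 11, 12], [0.0, 5.0]))
```

On a cluster, each pass of the loop would be a separate job, with the current means shipped to every mapper as the parameter estimate.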

Page 17: Large Scale Image Processing

Outline

• 'Big Data' in Computer Vision
• Map/Reduce and Computer Vision
• Map/Reduce Image Search
• Application: Screenshot Retrieval

Page 18: Large Scale Image Processing

Image Retrieval with Hadoop

• Analogies between image and text retrieval
  o Bag of Words -> Bag of Features
  o Document -> Image
  o Visual Word: Cluster of similar visual features
• Compute Local Image Features (e.g., SIFT)
• Cluster Features (i.e., create visual words)
• Find cluster medians
• Make Hamming Embeddings (compact feature) [1]
  o Efficient binary code (256 -> 8 Bytes per feature)
  o Hamming Distance
  o Benefit: Small size means more in memory
• Inverted Index

[1] H. Jegou, M. Douze, and C. Schmid, "Hamming embedding and weak geometric consistency for large scale image search," in ECCV '08: Proceedings of the 10th European Conference on Computer Vision. Berlin, Heidelberg: Springer-Verlag, 2008, pp. 304-317.
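The inverted index carries over directly from text retrieval: each visual word points to the images that contain it. A minimal sketch, assuming every image has already been reduced to a list of visual word ids (the function name and input layout are hypothetical):

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple


def build_inverted_index(
    image_words: Iterable[Tuple[str, List[int]]]
) -> Dict[int, List[str]]:
    """Map each visual word id to the sorted list of images that contain it."""
    index: Dict[int, Set[str]] = defaultdict(set)
    for image_id, words in image_words:
        for word in words:
            index[word].add(image_id)
    return {word: sorted(ids) for word, ids in index.items()}


if __name__ == "__main__":
    print(build_inverted_index([("img1", [3, 7, 3]), ("img2", [7])]))
```

At query time only the postings for the query's visual words are touched, so the cost scales with the number of matching features rather than the size of the database.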

Page 19: Large Scale Image Processing

Hadoop Job Workflow

[Diagram: job pipeline]

(Database Images)
-> Image Features (SURF 64D)
-> Remove Dupes (Curr./Prev.)
-> K-Means Clustering (Initial)
-> K-Means Clustering
-> Median Computation
-> Hamming Embedding

Page 20: Large Scale Image Processing

Hadoop Job Workflow: Image Features

[Diagram: (Database Images) -> Image Features (SURF 64D)]

Map In: (image_url, image_hash, image_data, image_tags)

Map Out: (image_hash, image_url, image_features)

Page 21: Large Scale Image Processing

Hadoop Job Workflow: Remove Dupes

Map In: [image_hash, image_url, image_features]
or
Map In: [image_hash] (for images already in the DB)

Map Out Key: image_hash
Map Out Val: image_features

Reduce Out: [image_hash, image_feature]

[Diagram: Image Features (SURF 64D) -> Remove Dupes (Curr./Prev.)]
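One way the dedupe job could work is sketched below, with the shuffle simulated by sorting on the key. The `IN_DB` marker and the rule "drop a hash if the DB already holds it, otherwise keep the first copy" are assumptions about the intended behavior, not details given on the slide.

```python
from itertools import groupby
from operator import itemgetter

IN_DB = None  # hypothetical marker emitted for hashes that are already in the DB


def mapper(record):
    """New images arrive as (hash, url, features); known images as (hash,)."""
    image_hash = record[0]
    features = record[2] if len(record) == 3 else IN_DB
    return image_hash, features


def reducer(image_hash, values):
    """Emit one copy per hash, and nothing if the hash is already in the DB."""
    values = list(values)
    if any(v is IN_DB for v in values):
        return
    yield image_hash, values[0]


def run(records):
    """Simulate map -> shuffle (sort by key) -> reduce in one process."""
    pairs = sorted((mapper(r) for r in records), key=itemgetter(0))
    out = []
    for image_hash, group in groupby(pairs, key=itemgetter(0)):
        out.extend(reducer(image_hash, (v for _, v in group)))
    return out


if __name__ == "__main__":
    records = [("h1", "u1", [1.0]), ("h1", "u2", [1.0]), ("h2", "u3", [2.0]), ("h2",)]
    print(run(records))
```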

Page 22: Large Scale Image Processing

Hadoop Job Workflow: K-Means (init)

Map In: [image_hash, image_feature]

Map Out Key: random [0,1]
Map Out Val: image_feature (extended by 1 dim to get count)

1 Reducer (outputs once per cluster)
Reduce Out: [cluster_num, cluster_mean]

[Diagram: Remove Dupes (Curr./Prev.) -> K-Means Clustering (Initial)]
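The slide does not say how the single reducer turns random keys into clusters. One possible reading, sketched here as an assumption, is that the key range [0, 1) is cut into k equal intervals and each interval is averaged, giving k means over random subsets of the data. The function name, the `seed` parameter, and the interval rule are all hypothetical.

```python
import random
from typing import Dict, List, Sequence


def init_means(features: Sequence[Sequence[float]], k: int, seed: int = 0) -> Dict[int, List[float]]:
    """Average k random subsets of the features to obtain initial cluster means."""
    rng = random.Random(seed)
    dim = len(features[0])
    # One running sum per cluster; the extra last slot holds the count.
    sums = [[0.0] * (dim + 1) for _ in range(k)]
    for feature in features:
        key = rng.random()                    # map output key in [0, 1)
        cluster = int(key * k)                # assumed rule: k equal key intervals
        extended = list(feature) + [1.0]      # feature extended by 1 dim (count)
        sums[cluster] = [a + b for a, b in zip(sums[cluster], extended)]
    # Reducer output: one mean per non-empty cluster.
    return {
        c: [s / total[-1] for s in total[:-1]]
        for c, total in enumerate(sums)
        if total[-1] > 0
    }


if __name__ == "__main__":
    print(init_means([[0.0, 1.0], [2.0, 3.0], [4.0, 5.0], [6.0, 7.0]], k=2))
```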

Page 23: Large Scale Image Processing

Hadoop Job Workflow: K-Means

File: cluster_means
Map In: [image_hash, image_feature]

Map Out Key: cluster_num (nearest cluster)
Map Out Val: image_feature (extended by 1 dim to get count)

Reduce Out: [cluster_num, cluster_mean]

[Diagram: K-Means Clustering (Initial) -> K-Means Clustering]
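The "extended by 1 dim" trick appends a constant 1 to each feature so that a plain vector sum carries the count along with it; the reducer then divides the summed feature by the summed count. A minimal single-process sketch follows, with all function names chosen for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple


def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_map(feature: Sequence[float], means: Dict[int, List[float]]) -> Tuple[int, List[float]]:
    """Key: nearest cluster. Value: the feature with a trailing 1 that acts as a count."""
    nearest = min(means, key=lambda c: sq_dist(feature, means[c]))
    return nearest, list(feature) + [1.0]


def kmeans_reduce(cluster_num: int, extended: List[List[float]]) -> Tuple[int, List[float]]:
    """Sum the extended vectors, then divide the feature part by the count slot."""
    total = [sum(col) for col in zip(*extended)]
    count = total[-1]
    return cluster_num, [s / count for s in total[:-1]]


def kmeans_iteration(features, means):
    """One map/shuffle/reduce pass; returns the updated means for non-empty clusters."""
    groups = defaultdict(list)
    for feature in features:
        cluster, extended = kmeans_map(feature, means)
        groups[cluster].append(extended)
    return dict(kmeans_reduce(c, vals) for c, vals in groups.items())


if __name__ == "__main__":
    feats = [[0, 0], [0, 2], [10, 10], [10, 12]]
    print(kmeans_iteration(feats, {0: [0, 0], 1: [10, 10]}))
```

Since the extended vectors are only ever summed, partial sums can be combined on each mapper before the shuffle, which keeps the data sent to the reducers small.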

Page 24: Large Scale Image Processing

Hadoop Job Workflow: Medians

File: cluster_means
Map In: [image_hash, image_feature]

Map Out Key: cluster_num (nearest cluster)
Map Out Val: image_feature

Reduce Out: [cluster_num, cluster_median]

[Diagram: K-Means Clustering -> Median Computation]
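A sketch of the median job, assuming the median is taken independently per dimension over all features assigned to a cluster (the slide does not spell this out). The function names are illustrative.

```python
from collections import defaultdict
from statistics import median
from typing import Dict, List, Sequence


def nearest_cluster(feature: Sequence[float], means: Dict[int, List[float]]) -> int:
    return min(means, key=lambda c: sum((x - m) ** 2 for x, m in zip(feature, means[c])))


def median_job(features, means) -> Dict[int, List[float]]:
    groups = defaultdict(list)
    for feature in features:
        # Map: key = nearest cluster, value = the raw feature (no count slot needed).
        groups[nearest_cluster(feature, means)].append(feature)
    # Reduce: per-dimension median over each cluster's features.
    return {c: [median(col) for col in zip(*vals)] for c, vals in groups.items()}


if __name__ == "__main__":
    means = {0: [0.0, 0.0], 1: [10.0, 10.0]}
    feats = [[0, 1], [1, 3], [2, 2], [9, 9], [11, 13]]
    print(median_job(feats, means))
```

Unlike the mean, a median cannot be assembled from partial sums, so each reducer here has to see every feature in its cluster.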

Page 25: Large Scale Image Processing

Hadoop Job Workflow: Ham. Emb.

File: cluster_means, cluster_medians
Map In: [image_hash, image_feature]

Map Out Key: cluster_num (nearest cluster)
Map Out Val: hamming_embedding

Reduce Out: [cluster_num, hamming_embedding]

[Diagram: Median Computation -> Hamming Embedding]
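A simplified sketch of the embedding step: each dimension contributes one bit, set when the value exceeds the cluster median, so a 64-D descriptor packs into 8 bytes. Jegou et al. apply a projection to the descriptor before thresholding; this sketch thresholds the raw dimensions, which is an assumption made to keep the example short.

```python
from typing import Sequence


def hamming_embedding(feature: Sequence[float], median_vec: Sequence[float]) -> bytes:
    """One bit per dimension: 1 if the value is above the cluster median for that dimension."""
    bits = 0
    for i, (x, m) in enumerate(zip(feature, median_vec)):
        if x > m:
            bits |= 1 << i
    n_bytes = (len(feature) + 7) // 8  # 64 dimensions -> 8 bytes
    return bits.to_bytes(n_bytes, "little")


def hamming_distance(a: bytes, b: bytes) -> int:
    """Number of differing bits between two equal-length codes."""
    return bin(int.from_bytes(a, "little") ^ int.from_bytes(b, "little")).count("1")


if __name__ == "__main__":
    code = hamming_embedding([1, 5, 1, 5], [3, 3, 3, 3])
    print(code, hamming_distance(code, b"\x0f"))
```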

Page 26: Large Scale Image Processing

Image Retrieval Overview: Query

(Query Image) -> Image Features (SURF 64D)

For each feature...
  o Find Nearest Cluster
  o Compute hamming embedding (using cluster median)
  o Vote (tf-idf) for a DB image if one of its features in that cluster has hamming dist < Thresh
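The voting step can be sketched as follows. The index layout (cluster id -> list of (image id, integer code)), the use of a plain idf weight per vote, and the function names are assumptions for illustration; the slide only states that votes are tf-idf weighted and gated by a Hamming threshold.

```python
import math
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical index layout: cluster id -> list of (image_id, hamming code as int)
Index = Dict[int, List[Tuple[str, int]]]


def idf_weights(index: Index, n_images: int) -> Dict[int, float]:
    """Rarer visual words (clusters) get larger weights."""
    return {
        c: math.log(n_images / len({img for img, _ in entries}))
        for c, entries in index.items()
    }


def query(query_features: List[Tuple[int, int]], index: Index,
          idf: Dict[int, float], thresh: int) -> List[Tuple[str, float]]:
    """query_features holds (nearest cluster, hamming code) for each query descriptor."""
    votes: Dict[str, float] = defaultdict(float)
    for cluster, code in query_features:
        for image_id, db_code in index.get(cluster, []):
            if bin(code ^ db_code).count("1") < thresh:
                votes[image_id] += idf[cluster]
    # Highest score first; ties broken by image id for a stable order.
    return sorted(votes.items(), key=lambda kv: (-kv[1], kv[0]))


if __name__ == "__main__":
    index = {0: [("a", 0b0000), ("b", 0b1111)], 1: [("a", 0b0011)]}
    idf = idf_weights(index, n_images=2)
    print(query([(0, 0b0001), (1, 0b0011)], index, idf, thresh=2))
```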

Page 27: Large Scale Image Processing

Outline

• 'Big Data' in Computer Vision
• Map/Reduce and Computer Vision
• Map/Reduce Image Search
• Application: Screenshot Retrieval

Page 28: Large Scale Image Processing

Current Work: PC Help Doc. Retrieval

• Goal: Take a screenshot and retrieve books and websites that provide relevant help documentation.

Tom Yeh, Brandyn White, Larry Davis, and Boris Katz

Page 29: Large Scale Image Processing

Outline

• 'Big Data' in Computer Vision
• Map/Reduce and Computer Vision
• Map/Reduce Image Search
• Application: Screenshot Retrieval

Page 30: Large Scale Image Processing

Conclusion

• Vision has 'Big Data' applications
• Many image search applications
• Common design patterns for M/R+Vision
• Hadoop is useful for image search

Page 31: Large Scale Image Processing

References

[1] P. Duygulu, K. Barnard, J. de Freitas, and D. Forsyth, "Object recognition as machine translation: Learning a lexicon for a fixed image vocabulary," in Computer Vision - ECCV 2002, ser. Lecture Notes in Computer Science, 2002, ch. 7, pp. 349-354.
[2] A. Makadia, V. Pavlovic, and S. Kumar, "A new baseline for image annotation," in ECCV '08: Proceedings of the 10th European Conference on Computer Vision. Berlin, Heidelberg: Springer-Verlag, 2008, pp. 316-329.
[3] M. Guillaumin, T. Mensink, J. Verbeek, and C. Schmid, "Tagprop: Discriminative metric learning in nearest neighbor models for image auto-annotation," in ICCV, 2009.
[4] A. Torralba, R. Fergus, and W. T. Freeman, "80 million tiny images: A large data set for nonparametric object and scene recognition," Pattern Analysis and Machine Intelligence, IEEE Transactions on, vol. 30, no. 11, pp. 1958-1970, May 2008.