![Page 1: Face Detection, Pose Estimation, and Landmark Localization in the Wild](https://reader034.vdocument.in/reader034/viewer/2022051623/568158bd550346895dc603b0/html5/thumbnails/1.jpg)
Face Detection, Pose Estimation, and Landmark Localization in the Wild
Xiangxin Zhu, Deva Ramanan
Dept. of Computer Science, University of California, Irvine
![Page 2: Face Detection, Pose Estimation, and Landmark Localization in the Wild](https://reader034.vdocument.in/reader034/viewer/2022051623/568158bd550346895dc603b0/html5/thumbnails/2.jpg)
Outline
• Introduction
• Model
• Inference
• Learning
• Experimental Results
• Conclusion
![Page 3: Face Detection, Pose Estimation, and Landmark Localization in the Wild](https://reader034.vdocument.in/reader034/viewer/2022051623/568158bd550346895dc603b0/html5/thumbnails/3.jpg)
Introduction
• A unified model for face detection, pose estimation, and landmark estimation.
• Based on a mixture of trees with a shared pool of parts.
• Uses global mixtures to capture topological changes due to viewpoint.
![Page 4: Face Detection, Pose Estimation, and Landmark Localization in the Wild](https://reader034.vdocument.in/reader034/viewer/2022051623/568158bd550346895dc603b0/html5/thumbnails/4.jpg)
Introduction
![Page 5: Face Detection, Pose Estimation, and Landmark Localization in the Wild](https://reader034.vdocument.in/reader034/viewer/2022051623/568158bd550346895dc603b0/html5/thumbnails/5.jpg)
mixture-of-trees model
![Page 6: Face Detection, Pose Estimation, and Landmark Localization in the Wild](https://reader034.vdocument.in/reader034/viewer/2022051623/568158bd550346895dc603b0/html5/thumbnails/6.jpg)
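Tree structured part model
• We write each tree $T_m = (V_m, E_m)$ as a linearly-parameterized, tree-structured pictorial structure, where $m$ indicates a mixture and $V_m \subseteq V$.
• $I$: image, and $l_i = (x_i, y_i)$: the pixel location of part $i$.
• We score a configuration of parts $L = \{l_i : i \in V\}$ as

$$S(I, L, m) = \mathrm{App}_m(I, L) + \mathrm{Shape}_m(L) + \alpha^m \qquad (1)$$

• $\alpha^m$: a scalar bias associated with viewpoint mixture $m$.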
![Page 7: Face Detection, Pose Estimation, and Landmark Localization in the Wild](https://reader034.vdocument.in/reader034/viewer/2022051623/568158bd550346895dc603b0/html5/thumbnails/7.jpg)
Tree structured part model
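• Eqn. 2 sums the appearance evidence for placing a template $w_i^m$ for part $i$, tuned for mixture $m$, at location $l_i$:

$$\mathrm{App}_m(I, L) = \sum_{i \in V_m} w_i^m \cdot \phi(I, l_i) \qquad (2)$$

• Eqn. 3 scores the mixture-specific spatial arrangement of parts $L$:

$$\mathrm{Shape}_m(L) = \sum_{ij \in E_m} a_{ij}^m\, dx^2 + b_{ij}^m\, dx + c_{ij}^m\, dy^2 + d_{ij}^m\, dy \qquad (3)$$

where $dx = x_i - x_j$ and $dy = y_i - y_j$ are the displacements of part $i$ relative to part $j$, and $\phi(I, l_i)$ is the feature vector (e.g., HoG) extracted at $l_i$.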
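The score of one configuration is the sum of an appearance term, a spring-like shape term over tree edges, and a per-mixture bias. The following is a minimal sketch of that sum, assuming the template responses $w_i^m \cdot \phi(I, l)$ have already been computed as one 2-D array per part; the function name, argument layout, and `[y, x]` indexing are illustrative choices, not the authors' code.

```python
import numpy as np

def score_configuration(responses, locations, edges, springs, bias):
    """Score S = App + Shape + bias for a single mixture (illustrative).

    responses: dict part -> 2-D array of precomputed template responses,
               indexed as [y, x] (assumption of this sketch).
    locations: dict part -> (x, y) pixel location l_i.
    edges:     list of (i, j) pairs, the edges of the tree.
    springs:   dict (i, j) -> (a, b, c, d) deformation parameters.
    bias:      scalar bias of this mixture.
    """
    # Appearance: add up the template response at each part's location.
    app = sum(float(responses[i][y, x]) for i, (x, y) in locations.items())

    # Shape: quadratic spring score on the displacement of part i relative to j.
    shape = 0.0
    for i, j in edges:
        a, b, c, d = springs[(i, j)]
        dx = locations[i][0] - locations[j][0]
        dy = locations[i][1] - locations[j][1]
        shape += a * dx**2 + b * dx + c * dy**2 + d * dy

    return app + shape + bias
```

With negative `a` and `c`, each spring term is a downward-opening parabola, so configurations far from the preferred displacement are penalized.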
![Page 8: Face Detection, Pose Estimation, and Landmark Localization in the Wild](https://reader034.vdocument.in/reader034/viewer/2022051623/568158bd550346895dc603b0/html5/thumbnails/8.jpg)
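Shape model
• The shape model can be rewritten as

$$\mathrm{Shape}_m(L) = -(L - \mu_m)^T \Lambda_m (L - \mu_m) + \text{constant} \qquad (4)$$

• $(\mu, \Lambda)$: re-parameterizations of the shape model $(a, b, c, d)$.
• $\Lambda_m$: a block sparse precision matrix, with non-zero entries corresponding to pairs of parts $i, j$ connected in $E_m$.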
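To see why a re-parameterization of $(a, b, c, d)$ exists, it helps to complete the square for a single spring term along $x$. This is a sketch for one edge, assuming $a < 0$; the symbol $\bar{d}$ for the rest displacement is notation introduced here, not taken from the slides.

```latex
\begin{aligned}
a\,dx^2 + b\,dx
  &= a\left(dx + \frac{b}{2a}\right)^2 - \frac{b^2}{4a} \\
  &= -(-a)\left(dx - \bar{d}\right)^2 + \text{constant},
  \qquad \bar{d} = -\frac{b}{2a}.
\end{aligned}
```

Here $-a$ plays the role of a spring rigidity and $\bar{d}$ the preferred displacement. The same step applies to $(c, d)$ along $y$, and stacking the per-edge terms gives a quadratic form in $L$ whose matrix is non-zero only for connected pairs of parts.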
![Page 9: Face Detection, Pose Estimation, and Landmark Localization in the Wild](https://reader034.vdocument.in/reader034/viewer/2022051623/568158bd550346895dc603b0/html5/thumbnails/9.jpg)
The mean shape and deformation modes
![Page 10: Face Detection, Pose Estimation, and Landmark Localization in the Wild](https://reader034.vdocument.in/reader034/viewer/2022051623/568158bd550346895dc603b0/html5/thumbnails/10.jpg)
Inference
• Inference corresponds to maximizing $S(I, L, m)$ in Eqn. 1 over $L$ and $m$:

$$S^*(I) = \max_m \left[ \max_L S(I, L, m) \right] \qquad (5)$$

• Since each mixture $T_m = (V_m, E_m)$ is a tree, the inner maximization can be done efficiently with dynamic programming.
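The dynamic program can be sketched as max-sum message passing from the leaves to the root, followed by backtracking. The version below is a minimal illustration, assuming each part has a small set of K candidate locations and that the spring scores are given as dense K x K tables; all names are hypothetical. It costs O(K²) per edge, and faster message passing for quadratic springs is not shown.

```python
import numpy as np

def best_configuration(unary, children, pairwise, root=0):
    """Max-sum dynamic programming on a tree (illustrative sketch).

    unary:    dict part -> 1-D array of appearance scores over K candidates.
    children: dict part -> list of child parts (tree rooted at `root`).
    pairwise: dict (child, parent) -> K x K array; entry [lc, lp] is the
              spring score for the child at lc and the parent at lp.
    Returns (best_score, dict part -> chosen candidate index).
    """
    argmax = {}

    def upward(i):
        # Best score of the subtree rooted at i, for each location of i.
        total = np.asarray(unary[i], dtype=float).copy()
        for c in children.get(i, []):
            m = upward(c)[:, None] + pairwise[(c, i)]  # rows: child, cols: parent
            argmax[c] = m.argmax(axis=0)               # best child loc per parent loc
            total += m.max(axis=0)
        return total

    root_scores = upward(root)
    best = {root: int(root_scores.argmax())}

    def downward(i):
        # Backtrack: read off each child's best location given its parent's.
        for c in children.get(i, []):
            best[c] = int(argmax[c][best[i]])
            downward(c)

    downward(root)
    return float(root_scores.max()), best


def detect(mixtures):
    """Outer max over mixtures; each entry is (bias, unary, children, pairwise)."""
    results = []
    for m, (bias, unary, children, pairwise) in enumerate(mixtures):
        score, locs = best_configuration(unary, children, pairwise)
        results.append((score + bias, m, locs))
    return max(results, key=lambda r: r[0])
```

Because every part has exactly one parent, each edge is visited once, which is what makes the inner maximization tractable; a graph with loops would not allow this simple two-pass scheme.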
![Page 11: Face Detection, Pose Estimation, and Landmark Localization in the Wild](https://reader034.vdocument.in/reader034/viewer/2022051623/568158bd550346895dc603b0/html5/thumbnails/11.jpg)
Learning
• Given labeled positive examples $\{I_n, L_n, m_n\}$ and negative examples $\{I_n\}$, we define a structured prediction objective function similar to the one proposed in [41]. To do so, let’s write $z_n = \{L_n, m_n\}$.
• Concatenating Eqn. 1's parameters into a single vector $\beta$, the score is linear in $\beta$:

$$S(I, z) = \beta \cdot \Phi(I, z) \qquad (6)$$
[41] Y. Yang and D. Ramanan. Articulated pose estimation using flexible mixtures of parts. In CVPR 2011.
![Page 12: Face Detection, Pose Estimation, and Landmark Localization in the Wild](https://reader034.vdocument.in/reader034/viewer/2022051623/568158bd550346895dc603b0/html5/thumbnails/12.jpg)
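Learning
• Now we can learn a model of the form:

$$\arg\min_{\beta,\ \xi_n \ge 0} \ \frac{1}{2}\, \beta \cdot \beta + C \sum_n \xi_n \qquad (7)$$
$$\text{s.t.} \quad \forall n \in \text{pos}: \ \beta \cdot \Phi(I_n, z_n) \ge 1 - \xi_n$$
$$\forall n \in \text{neg},\ \forall z: \ \beta \cdot \Phi(I_n, z) \le -1 + \xi_n$$
$$\forall k \in K: \ \beta_k \le 0$$

• The objective function penalizes violations of these constraints using slack variables $\xi_n$.
• We write $K$ for the indices of the quadratic spring terms $(a, c)$ in the parameter vector $\beta$; the constraint $\beta_k \le 0$ keeps those terms negative.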
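The objective can be read as a margin-based loss: positives should score at least +1 at their labeled configuration, negatives should score at most -1 at every configuration, and each slack variable measures the shortfall. The sketch below only evaluates that objective for a given parameter vector; it does not solve the optimization. It assumes feature vectors $\Phi(I, z)$ are precomputed and that each negative image comes with a finite list of candidate configurations, and all names are hypothetical.

```python
import numpy as np

def objective(beta, pos_feats, neg_feats, C, K):
    """Evaluate 0.5*beta.beta + C*sum(slack) and check beta_k <= 0 for k in K.

    beta:      parameter vector of length D.
    pos_feats: array (n_pos, D); row n is the feature vector of positive n
               at its labeled configuration.
    neg_feats: list of arrays (n_z, D); candidate feature vectors for each
               negative image (finite candidate set is an assumption).
    C:         slack penalty weight.
    K:         indices of the quadratic spring terms in beta.
    Returns (objective_value, sign_constraints_satisfied).
    """
    beta = np.asarray(beta, dtype=float)
    pos_feats = np.asarray(pos_feats, dtype=float)

    # Positives: slack is how far the score falls below +1.
    pos_slack = np.maximum(0.0, 1.0 - pos_feats @ beta)

    # Negatives: the constraint holds for all z, so the highest-scoring
    # candidate determines the slack.
    neg_slack = np.array(
        [max(0.0, 1.0 + float((np.asarray(F, dtype=float) @ beta).max()))
         for F in neg_feats]
    )

    feasible = bool(np.all(beta[list(K)] <= 0))
    value = 0.5 * float(beta @ beta) + C * (pos_slack.sum() + neg_slack.sum())
    return float(value), feasible
```

Taking the maximum over candidates for each negative mirrors how hard negatives are usually found: the highest-scoring false detection is the one that violates the margin most.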
![Page 13: Face Detection, Pose Estimation, and Landmark Localization in the Wild](https://reader034.vdocument.in/reader034/viewer/2022051623/568158bd550346895dc603b0/html5/thumbnails/13.jpg)
Experimental Results
![Page 14: Face Detection, Pose Estimation, and Landmark Localization in the Wild](https://reader034.vdocument.in/reader034/viewer/2022051623/568158bd550346895dc603b0/html5/thumbnails/14.jpg)
Dataset
• CMU MultiPIE
• Annotated Faces in-the-Wild (AFW), collected from Flickr images
![Page 15: Face Detection, Pose Estimation, and Landmark Localization in the Wild](https://reader034.vdocument.in/reader034/viewer/2022051623/568158bd550346895dc603b0/html5/thumbnails/15.jpg)
Dataset
![Page 16: Face Detection, Pose Estimation, and Landmark Localization in the Wild](https://reader034.vdocument.in/reader034/viewer/2022051623/568158bd550346895dc603b0/html5/thumbnails/16.jpg)
Sharing
• We explore 4 levels of sharing, denoting each model by the number of distinct templates it encodes.
▫ Share-99 (i.e., fully shared model)
▫ Share-146
▫ Share-622
▫ Independent-1050 (i.e., independent model)
![Page 17: Face Detection, Pose Estimation, and Landmark Localization in the Wild](https://reader034.vdocument.in/reader034/viewer/2022051623/568158bd550346895dc603b0/html5/thumbnails/17.jpg)
![Page 18: Face Detection, Pose Estimation, and Landmark Localization in the Wild](https://reader034.vdocument.in/reader034/viewer/2022051623/568158bd550346895dc603b0/html5/thumbnails/18.jpg)
In-house baselines
•We define Multi.HoG to be rigid, multiview HoG template detectors, trained on the same data as our models.
•We define Star Model to be equivalent to Share-99 but defined using a “star” connectivity graph, where all parts are directly connected to a root nose part.
![Page 19: Face Detection, Pose Estimation, and Landmark Localization in the Wild](https://reader034.vdocument.in/reader034/viewer/2022051623/568158bd550346895dc603b0/html5/thumbnails/19.jpg)
Face detection on AFW testset
[22] Z. Kalal, J. Matas, and K. Mikolajczyk. Weighted sampling for large-scale boosting. In BMVC 2008.
![Page 20: Face Detection, Pose Estimation, and Landmark Localization in the Wild](https://reader034.vdocument.in/reader034/viewer/2022051623/568158bd550346895dc603b0/html5/thumbnails/20.jpg)
Pose estimation
![Page 21: Face Detection, Pose Estimation, and Landmark Localization in the Wild](https://reader034.vdocument.in/reader034/viewer/2022051623/568158bd550346895dc603b0/html5/thumbnails/21.jpg)
Landmark localization
![Page 22: Face Detection, Pose Estimation, and Landmark Localization in the Wild](https://reader034.vdocument.in/reader034/viewer/2022051623/568158bd550346895dc603b0/html5/thumbnails/22.jpg)
Landmark localization
![Page 23: Face Detection, Pose Estimation, and Landmark Localization in the Wild](https://reader034.vdocument.in/reader034/viewer/2022051623/568158bd550346895dc603b0/html5/thumbnails/23.jpg)
AFW image
![Page 24: Face Detection, Pose Estimation, and Landmark Localization in the Wild](https://reader034.vdocument.in/reader034/viewer/2022051623/568158bd550346895dc603b0/html5/thumbnails/24.jpg)
Conclusion
•Our model outperforms state-of-the-art methods, including large-scale commercial systems, on all three tasks under both constrained and in-the-wild environments.
![Page 25: Face Detection, Pose Estimation, and Landmark Localization in the Wild](https://reader034.vdocument.in/reader034/viewer/2022051623/568158bd550346895dc603b0/html5/thumbnails/25.jpg)
Thanks for listening