![Page 1: Is there a simple Statistical Model of Generic Natural Images? David Mumford](https://reader036.vdocument.in/reader036/viewer/2022062803/5681469a550346895db3b344/html5/thumbnails/1.jpg)
MSRI Program: Mathematical, Computational and Statistical Aspects of Vision
Learning and Inference in Low and Mid Level Vision, Feb 21-25, 2005
Is there a simple Statistical Model of
Generic Natural Images?
David Mumford
![Page 2: Is there a simple Statistical Model of Generic Natural Images? David Mumford](https://reader036.vdocument.in/reader036/viewer/2022062803/5681469a550346895db3b344/html5/thumbnails/2.jpg)
Outline of talk
1. What are we trying to do: the role of modeling, the analogy with language.
2. The scaling property of images and its implications: model I
3. High kurtosis as the universal clue to discrete structure: models IIa and IIb
4. The ‘phonemes’ of images
5. Syntax: objects+parts, gestalt grouping, segmentation
6. PCFG’s, random branching trees, models IIIa and IIIb
7. Two implementations: Zhu et al., Sharon et al.
8. Parse graphs: the problem of context sensitivity.
![Page 3: Is there a simple Statistical Model of Generic Natural Images? David Mumford](https://reader036.vdocument.in/reader036/viewer/2022062803/5681469a550346895db3b344/html5/thumbnails/3.jpg)
Part 1: What is the role of a model?
• How much of the complexities of reality can be usefully abstracted in a precise mathematical model?
• Early examples: Ibn al-Haytham, Galileo
• Recent examples: Navier-Stokes, Vapnik and PAC learning
• The model and the algorithm are two different things. Don’t judge a model by one slow implementation!
Analogy with language (4 levels):
• Phonology ↔ Pixel statistics
• Syntax ↔ Grouping rules
• Semantics ↔ Object recognition
• Pragmatics ↔ Robotic applications
![Page 4: Is there a simple Statistical Model of Generic Natural Images? David Mumford](https://reader036.vdocument.in/reader036/viewer/2022062803/5681469a550346895db3b344/html5/thumbnails/4.jpg)
Part 2: Scaling properties of image statistics
“Renormalization fixed point” means that in a 2Nx2N image, the marginal stats on an NxN subimage or an averaged NxN image should be the same.
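In the continuum limit, if random means a sample from a probability measure μ on the space of Schwartz distributions $\mathcal{D}'(\mathbb{R}^2)$, then scale-invariant means
$T_{\sigma,*}(\mu) = \mu$, where $\langle T_\sigma f, \phi \rangle = \langle f, \sigma^{-2}\phi(\sigma^{-1}\,\cdot) \rangle$, i.e. $T_\sigma f(x) = f(\sigma x)$.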
![Page 5: Is there a simple Statistical Model of Generic Natural Images? David Mumford](https://reader036.vdocument.in/reader036/viewer/2022062803/5681469a550346895db3b344/html5/thumbnails/5.jpg)
Evidence of scaling in images: horizontal derivative of images from 2 different
databases at 5 dyadic scales
Vertical axis is log of frequency; dotted lines = 1 standard deviation
![Page 6: Is there a simple Statistical Model of Generic Natural Images? David Mumford](https://reader036.vdocument.in/reader036/viewer/2022062803/5681469a550346895db3b344/html5/thumbnails/6.jpg)
The sources of scaling
The distance from the camera or eye to the object is random. When a scene is viewed nearer, all objects enlarge; further away, they shrink (modulo perspective corrections). Note blueberry pickers in the image above.
![Page 7: Is there a simple Statistical Model of Generic Natural Images? David Mumford](https://reader036.vdocument.in/reader036/viewer/2022062803/5681469a550346895db3b344/html5/thumbnails/7.jpg)
A less obvious root of scaling
• On the left, a woman and dog on a dirt road in a rural scene; on the right, enlargement of a patch.
• Note the dog is the same size as the texture patches in the dirt road; and in the enlargement, windows and shrub branches retreat into ‘noise’.
• There is a continuum from objects to texture elements to noise of unresolved details
![Page 8: Is there a simple Statistical Model of Generic Natural Images? David Mumford](https://reader036.vdocument.in/reader036/viewer/2022062803/5681469a550346895db3b344/html5/thumbnails/8.jpg)
Gaussian colored noise, spectrum $1/f^2$
(looks like typical cumulus clouds). This image is not a measurable function!!
Left: $1/f^4$ noise – a true function. Right: white noise
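The scaling property can be checked numerically on synthetic noise. The sketch below is only an illustration, not the computation behind the slides: it filters white noise in the Fourier domain to get a power spectrum of roughly $1/f^{\text{exponent}}$, then compares the variance of the horizontal difference before and after 2x2 block averaging. The grid size, the number of levels and the function names are assumptions made for this example. For the $1/f^2$ spectrum the variance should stay roughly constant from level to level; for white noise it should drop by about a factor of 4 each time.

```python
import numpy as np

def power_law_noise(n, exponent, rng):
    """Gaussian noise on an n x n grid with power spectrum ~ 1/f**exponent."""
    fx = np.fft.fftfreq(n)
    f = np.hypot(fx[:, None], fx[None, :])
    f[0, 0] = 1.0                      # avoid dividing by zero at DC
    amplitude = f ** (-exponent / 2.0)
    amplitude[0, 0] = 0.0              # drop the mean (a crude IR cutoff)
    white = rng.standard_normal((n, n))
    return np.fft.ifft2(np.fft.fft2(white) * amplitude).real

def block_average(img):
    """Average non-overlapping 2x2 blocks, halving the resolution."""
    n = img.shape[0] // 2
    return img.reshape(n, 2, n, 2).mean(axis=(1, 3))

def derivative_variances(img, levels=4):
    """Variance of the horizontal difference at successive dyadic scales."""
    out = []
    for _ in range(levels):
        out.append(np.diff(img, axis=1).var())
        img = block_average(img)
    return np.array(out)

rng = np.random.default_rng(0)
pink = derivative_variances(power_law_noise(256, 2.0, rng))
white = derivative_variances(power_law_noise(256, 0.0, rng))
pink_ratios = pink[1:] / pink[:-1]      # expected to stay of order 1
white_ratios = white[1:] / white[:-1]   # expected to be close to 1/4
print("1/f^2 ratios:", pink_ratios)
print("white ratios:", white_ratios)
```

The discrete grid imposes its own UV cutoff (the pixel) and IR cutoff (the image size), so the ratios for the $1/f^2$ case are only approximately 1.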
![Page 9: Is there a simple Statistical Model of Generic Natural Images? David Mumford](https://reader036.vdocument.in/reader036/viewer/2022062803/5681469a550346895db3b344/html5/thumbnails/9.jpg)
Part 3: Another basic statistical property -- high kurtosis
• Essentially all real-valued signals from nature have kurtosis ($= \mu_4/\sigma^4$) greater than 3 (the Gaussian value).
• Explanation I: the signal is a mixture of Gaussians with multiple variances (‘heteroscedastic’ to Wall Streeters). Thus random mean-0 8x8 filter stats have kurtosis > 3, and this disappears if the images are locally contrast normalized.
• Explanation II: a Markov stochastic process with i.i.d. increments always has values $X_t$ with kurtosis ≥ 3, and if > 3, it has discrete jumps. (Such variables are called infinitely divisible.) Thus high kurtosis is a signal of discrete events/objects in nature.
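Explanation I can be illustrated with a few lines of simulation. This is a minimal sketch under stated assumptions: the local contrast is drawn from a log-normal distribution (a hypothetical choice, not taken from the slides), and "contrast normalization" is idealized as dividing by the true local standard deviation, which a real image pipeline would have to estimate.

```python
import numpy as np

def kurtosis(x):
    """Fourth central moment over squared variance (Gaussian value: 3)."""
    c = np.asarray(x, dtype=float)
    c = c - c.mean()
    return np.mean(c**4) / np.mean(c**2) ** 2

rng = np.random.default_rng(0)
n = 200_000
gauss = rng.standard_normal(n)
sigma = np.exp(0.5 * rng.standard_normal(n))   # hypothetical random local contrast
mixture = sigma * gauss                         # Gaussian with a random variance

k_gauss = kurtosis(gauss)
k_mixture = kurtosis(mixture)
k_normalized = kurtosis(mixture / sigma)        # idealized contrast normalization
print(k_gauss, k_mixture, k_normalized)
```

For a scale mixture the kurtosis is $3\,E[\sigma^4]/E[\sigma^2]^2$, which exceeds 3 whenever $\sigma$ is not constant; dividing the random scale back out returns the Gaussian value.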
![Page 10: Is there a simple Statistical Model of Generic Natural Images? David Mumford](https://reader036.vdocument.in/reader036/viewer/2022062803/5681469a550346895db3b344/html5/thumbnails/10.jpg)
Huge tails are not needed for a process to have jumps: generate a
stochastic process with gamma increments
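A gamma process is easy to simulate, and a short sketch makes the point about jumps concrete. The step size and path length below are arbitrary assumptions. Each increment is drawn from a Gamma distribution whose shape equals the time step; such a variable has only exponential tails, and its kurtosis is 3 + 6/shape, so it exceeds 3 at every time. The code measures how much of the total rise of the path is carried by the largest 5% of the steps.

```python
import numpy as np

rng = np.random.default_rng(0)
n_steps = 20_000
dt = 0.05                       # hypothetical time step; increment ~ Gamma(shape=dt, scale=1)
increments = rng.gamma(shape=dt, scale=1.0, size=n_steps)
path = np.cumsum(increments)    # the sample path X_t on the grid t = dt, 2*dt, ...

largest_first = np.sort(increments)[::-1]
top = n_steps // 20             # the largest 5% of the increments
share = largest_first[:top].sum() / increments.sum()
print(f"largest 5% of steps carry {share:.0%} of the total rise")
```

If the path were built from many comparable small steps, the largest 5% would carry only a little more than 5% of the rise; a much larger share indicates that the path moves mainly by discrete jumps.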
![Page 11: Is there a simple Statistical Model of Generic Natural Images? David Mumford](https://reader036.vdocument.in/reader036/viewer/2022062803/5681469a550346895db3b344/html5/thumbnails/11.jpg)
The Lévy-Khintchine theorem for images
• If a random ‘natural’ image I is a vector-valued infinitely divisible variable, then L-K applies, so:
$I = I_{\text{gauss}} + \sum_k I_k$, with $\{I_k\}$ a Poisson process sampled from the Lévy measure
The $I_k$ were called ‘textons’ by Julesz and are the elements of Marr’s ‘primal sketch’.
What can we say about these elementary constituents of images?
Edges, bars, blobs, corners, T-junctions?
Seeking the textons experimentally (2x2, 3x3 patches): Ann Lee, Huang, Zhu, Malik, …
![Page 12: Is there a simple Statistical Model of Generic Natural Images? David Mumford](https://reader036.vdocument.in/reader036/viewer/2022062803/5681469a550346895db3b344/html5/thumbnails/12.jpg)
Lévy-Khintchine leads to the next level of image modeling
• Random natural images have translation and scale-invariant statistics
• This means the primitive objects should be ‘random wavelets’:
$I(x,y) = \sum_k I_k(x,y)$, with $I_k(x,y) = \phi_k(e^{r_k}x - x_k,\ e^{r_k}y - y_k)$, where the $\phi_k$ are size+position normalized
• Must worry about UV and IR limits, but it works.
• A complication: occlusion. This makes images extremely non-Markovian, leads to the ‘Dead Leaves’ (Matheron, Serra) or ‘random collage’ model (Ann Lee):
$I(x,y) = \phi_k(e^{r_k}x - x_k,\ e^{r_k}y - y_k)$, where $k$ is the component covering $(x,y)$ closest to the observer
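Both models can be sketched with the same set of random seeds. The code below is an illustration under several assumptions that are not in the slides: the primitives are a radial blob (for the additive random wavelet image) and a flat disc (for dead leaves), sizes are written as a radius rather than as the log-scale used in the formulas above, radii follow a density proportional to $1/r^3$ truncated to [r_min, r_max] (the UV and IR limits), and the order of the seeds in the array stands in for distance from the observer.

```python
import numpy as np

def sample_seeds(n_seeds, size, r_min, r_max, rng):
    """Positions uniform in the image, radii with density proportional to 1/r**3."""
    x = rng.uniform(0, size, n_seeds)
    y = rng.uniform(0, size, n_seeds)
    u = rng.uniform(0, 1, n_seeds)
    a, b = r_min ** -2.0, r_max ** -2.0
    r = (a - u * (a - b)) ** -0.5       # inverse CDF of the truncated 1/r^3 density
    return x, y, r

def render(size, seeds, gray, occlude):
    """Sum the primitives (random wavelets) or let later ones hide earlier ones (dead leaves)."""
    yy, xx = np.mgrid[0:size, 0:size]
    img = np.zeros((size, size))
    for x, y, r, g in zip(*seeds, gray):
        d2 = ((xx - x) ** 2 + (yy - y) ** 2) / r**2
        if occlude:
            img[d2 <= 1.0] = g                    # disc painted over whatever lies behind it
        else:
            img += g * (1.0 - d2) * np.exp(-d2)   # hypothetical blob primitive, summed
    return img

rng = np.random.default_rng(0)
size, n_seeds = 128, 1500
seeds = sample_seeds(n_seeds, size, r_min=1.0, r_max=64.0, rng=rng)
gray = rng.uniform(-1.0, 1.0, n_seeds)            # random contrast of each primitive
wavelets = render(size, seeds, gray, occlude=False)
leaves = render(size, seeds, gray, occlude=True)
```

The only difference between the two images is the composition rule: addition keeps every primitive visible, while occlusion keeps only the one nearest the observer at each pixel.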
![Page 13: Is there a simple Statistical Model of Generic Natural Images? David Mumford](https://reader036.vdocument.in/reader036/viewer/2022062803/5681469a550346895db3b344/html5/thumbnails/13.jpg)
4 random wavelet images with different primitives
![Page 14: Is there a simple Statistical Model of Generic Natural Images? David Mumford](https://reader036.vdocument.in/reader036/viewer/2022062803/5681469a550346895db3b344/html5/thumbnails/14.jpg)
Part 4: In search of the primitives: equi-probable contours in the joint histograms of adjacent wavelet pairs
(filters from E.Simoncelli, calculation by J.Huang)
Bottom left: horizontally adjacent horizontal filters. The diagonal corner illustrates the likelihood of contour continuation; the rounder corners on the x- and y-axes are line endings.
Bottom right: horizontally adjacent vertical filters. The anti-diagonal elongation comes from bars giving a contrast reversal; the rounded corners on the axes come from edges making one filter respond but not the other.
These cannot be produced by products of an independent scalar factor and a Gaussian.
![Page 15: Is there a simple Statistical Model of Generic Natural Images? David Mumford](https://reader036.vdocument.in/reader036/viewer/2022062803/5681469a550346895db3b344/html5/thumbnails/15.jpg)
Image ‘phonemes’ obtained by k-means clustering on 8x8 image
patches (from J.Huang)
![Page 16: Is there a simple Statistical Model of Generic Natural Images? David Mumford](https://reader036.vdocument.in/reader036/viewer/2022062803/5681469a550346895db3b344/html5/thumbnails/16.jpg)
Ann Lee’s study of 3x3 patches
Take all 3x3 patches, normalize mean to 0, take those with top 20% contrast, normalize contrast: the result is a datapoint on $S^7$.
In this 7-sphere, perfect edges form a surface E. Plot the volume in tubular neighborhoods of E and the proportion of datapoints in them.
There is a huge concentration around E, with asymptotically infinite density.
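The preprocessing steps can be written out as a short sketch. Two assumptions are made here: the contrast of a patch is taken to be its Euclidean norm after mean subtraction (the original study may use a different contrast norm), and a synthetic random image stands in for real data. After mean subtraction a 3x3 patch lies in an 8-dimensional subspace of $\mathbb{R}^9$, so normalizing its contrast to 1 puts it on a 7-sphere.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def high_contrast_patches(img, keep=0.2):
    """Return the top `keep` fraction of 3x3 patches, mean-subtracted and unit-normalized."""
    patches = sliding_window_view(img, (3, 3)).reshape(-1, 9).astype(float)
    patches = patches - patches.mean(axis=1, keepdims=True)   # normalize mean to 0
    contrast = np.linalg.norm(patches, axis=1)                # Euclidean contrast (assumption)
    n_keep = int(keep * len(patches))
    top = np.argsort(contrast)[-n_keep:]                      # highest-contrast patches
    return patches[top] / contrast[top, None]                 # unit vectors: points on S^7

rng = np.random.default_rng(0)
img = rng.standard_normal((64, 64)).cumsum(axis=0).cumsum(axis=1)   # synthetic stand-in image
points = high_contrast_patches(img)
print(points.shape)
```

Measuring the concentration around the edge surface E would additionally need a parametrization of ideal edge patches, which is not attempted here.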
![Page 17: Is there a simple Statistical Model of Generic Natural Images? David Mumford](https://reader036.vdocument.in/reader036/viewer/2022062803/5681469a550346895db3b344/html5/thumbnails/17.jpg)
Part 5. Grouping laws and the syntax of images
• Our models are too homogeneous. Natural world scenes have more homogeneously colored/textured parts with discontinuities between them: this is segmentation.
• Is segmentation well-defined? YES if you let it be hierarchical and not one fixed domain decomposition, allowing for the scaling of images.
![Page 18: Is there a simple Statistical Model of Generic Natural Images? David Mumford](https://reader036.vdocument.in/reader036/viewer/2022062803/5681469a550346895db3b344/html5/thumbnails/18.jpg)
3 images from the Malik database, each with 3 human segmentations
The gestalt rules of grouping (Metzger, Wertheimer, Kanizsa, …)
![Page 19: Is there a simple Statistical Model of Generic Natural Images? David Mumford](https://reader036.vdocument.in/reader036/viewer/2022062803/5681469a550346895db3b344/html5/thumbnails/19.jpg)
Segmentation of images is sometimes obvious,
sometimes not: clutter is a consequence of scaling
The classic mandrill: it segments unambiguously into eyes, nose, fleshy cheeks, whiskers, fur
My own favorite for total clutter: an image of a log driver in the spring timber run. Logs do not form a consistent texture, background trees have contrast reversal, snow matches white water.
![Page 20: Is there a simple Statistical Model of Generic Natural Images? David Mumford](https://reader036.vdocument.in/reader036/viewer/2022062803/5681469a550346895db3b344/html5/thumbnails/20.jpg)
The gestalt rules of grouping (Metzger, Wertheimer, Kanizsa, …)
Elements of images are linked on the basis of:
• Proximity
• Similar color/texture (with proximity, these are the factors used in segmentation)
• Good continuation (this is studied as contour completion)
• Parallelism
• Symmetry
• Convexity
Reconstructed edges and objects can be amodal as well as modal:
![Page 21: Is there a simple Statistical Model of Generic Natural Images? David Mumford](https://reader036.vdocument.in/reader036/viewer/2022062803/5681469a550346895db3b344/html5/thumbnails/21.jpg)
Part 6: Random branching trees
• Start at the root. Each node decides to have k children with probability $p_k$, where $\sum_k p_k = 1$. Continue infinitely or until no more children.
• $\lambda = \sum_k k\,p_k$ is the expected # of children; if λ ≤ 1, then the tree is a.s. finite; if λ > 1, it is infinite with pos. prob.
• Can put labels, from some finite set L, on the nodes and make a labelled tree from a prob. distr. which assigns probabilities to a node with label $\ell$ having k children with labels $\ell_1, \dots, \ell_k$.
• This is identical to what linguists call PCFGs (= probabilistic context-free grammars). For them, L is the set of attributed phrases (e.g. ‘singular feminine noun phrases’) plus the lexicon (which can have no children) and the tree is assumed a.s. to be finite.
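The dichotomy between λ ≤ 1 and λ > 1 can be seen in a small simulation. This is a sketch with assumed offspring distributions and an assumed population cap: a tree whose generation size exceeds the cap is counted as infinite, since dying out from there is vanishingly unlikely in the supercritical case. For the supercritical example the extinction probability q solves q = 0.25 + 0.25q + 0.5q², whose smallest root is 1/2.

```python
import numpy as np

def survives(p, rng, cap=1000, max_generations=10_000):
    """Run one branching process; True if a generation ever exceeds `cap` nodes."""
    k = np.arange(len(p))
    z = 1                                    # generation 0: the root
    for _ in range(max_generations):
        if z == 0:
            return False                     # no more children: the tree is finite
        if z > cap:
            return True                      # treated as an infinite tree
        z = int(rng.choice(k, size=z, p=p).sum())   # children of the current generation
    return z > 0

def survival_rate(p, trials, rng):
    return float(np.mean([survives(p, rng) for _ in range(trials)]))

rng = np.random.default_rng(0)
subcritical = [0.5, 0.25, 0.25]      # p_0, p_1, p_2 with lambda = 0.75
supercritical = [0.25, 0.25, 0.5]    # lambda = 1.25
rate_sub = survival_rate(subcritical, 2000, rng)
rate_super = survival_rate(supercritical, 2000, rng)
print(rate_sub, rate_super)
```

The simulation tracks only generation sizes; attaching labels to the nodes, with child labels drawn according to the parent's label, turns the same process into a sampler for a PCFG.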
![Page 22: Is there a simple Statistical Model of Generic Natural Images? David Mumford](https://reader036.vdocument.in/reader036/viewer/2022062803/5681469a550346895db3b344/html5/thumbnails/22.jpg)
Graft random branching trees into the random wavelet model
1. Seed scale space with a Poisson process $(x_i, y_i, r_i)$ with density $C\,dx\,dy\,dr/r^3$.
2. Let each node grow a tree with growth rate $< (r/r_0)^2$.
3. Put primitives on seeds, e.g. face, tree, road. Each passes attributes to its children, e.g. eyes, trunk, car.
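The three steps can be sketched as follows. Everything specific in this example is an assumption made for illustration: the toy grammar and its probabilities, the truncation of the size range to [r_min, r_max], children being placed at half the parent's size near the parent, and the only attribute passed down being position and scale. If $r_0$ is read as the child's size, each node here has at most 3 children, which stays below $(r/r_0)^2 = 4$; that reading of the growth bound is itself an assumption.

```python
import numpy as np

# Hypothetical toy grammar: label -> list of (probability, child labels).
# Labels without a rule are terminals.
RULES = {
    "face": [(0.7, ("eye", "eye", "mouth")), (0.3, ("eye", "mouth"))],
    "tree": [(0.6, ("trunk",)), (0.4, ("trunk", "tree"))],
    "road": [(0.5, ("car",)), (0.5, ())],
}

def grow(label, x, y, r, parent, nodes, rng):
    """Append a node and, recursively, its children at half the parent's size."""
    index = len(nodes)
    nodes.append({"label": label, "x": x, "y": y, "r": r, "parent": parent})
    options = RULES.get(label)
    if options is None:
        return                                        # terminal: no children
    probs = [p for p, _ in options]
    children = options[rng.choice(len(options), p=probs)][1]
    for child in children:
        dx, dy = rng.uniform(-r / 2, r / 2, size=2)   # child inherits a nearby position
        grow(child, x + dx, y + dy, r / 2, index, nodes, rng)

def sample_scene(size, density, r_min, r_max, rng):
    """Poisson seeds with density C dx dy dr / r^3, each growing a labelled tree."""
    a, b = r_min ** -2.0, r_max ** -2.0
    expected = density * size**2 * (a - b) / 2.0      # integral of the density
    nodes = []
    for _ in range(rng.poisson(expected)):
        x, y = rng.uniform(0, size, size=2)
        r = (a - rng.uniform() * (a - b)) ** -0.5     # inverse CDF of 1/r^3
        label = str(rng.choice(list(RULES)))
        grow(label, x, y, r, None, nodes, rng)
    return nodes

rng = np.random.default_rng(1)
nodes = sample_scene(size=64, density=0.05, r_min=2.0, r_max=32.0, rng=rng)
print(len(nodes), "nodes;", sum(n["parent"] is None for n in nodes), "seeds")
```

The result is a flat list in which each node records the index of its parent, so a renderer could walk it seed by seed and draw each primitive at its position and scale.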
![Page 23: Is there a simple Statistical Model of Generic Natural Images? David Mumford](https://reader036.vdocument.in/reader036/viewer/2022062803/5681469a550346895db3b344/html5/thumbnails/23.jpg)
Part 7a: Zhu’s algorithms: top-down synthesis and bottom up tree construction
![Page 24: Is there a simple Statistical Model of Generic Natural Images? David Mumford](https://reader036.vdocument.in/reader036/viewer/2022062803/5681469a550346895db3b344/html5/thumbnails/24.jpg)
The nodes of the tree correspond to subsets of the image domain; edges show when one region is a subset of the other
![Page 25: Is there a simple Statistical Model of Generic Natural Images? David Mumford](https://reader036.vdocument.in/reader036/viewer/2022062803/5681469a550346895db3b344/html5/thumbnails/25.jpg)
Part 7b: AMG algorithm of Galun, Sharon, Basri and Brandt
The segmentation of a shell by progressive grouping, the set of colored regions at each level forming the nodes of the grid at that level and the grouping following weights obtained by aggregating statistics from below.
![Page 26: Is there a simple Statistical Model of Generic Natural Images? David Mumford](https://reader036.vdocument.in/reader036/viewer/2022062803/5681469a550346895db3b344/html5/thumbnails/26.jpg)
A second example of AMG
![Page 27: Is there a simple Statistical Model of Generic Natural Images? David Mumford](https://reader036.vdocument.in/reader036/viewer/2022062803/5681469a550346895db3b344/html5/thumbnails/27.jpg)
Part 8: The next frontier -- context sensitive grammars
Children of children need to share information, i.e. context. This can be done by giving more and more attributes to each node to pass down, but this gets absurd after a while. Here are two examples: a face and a sentence from a 2½-year-old.
![Page 28: Is there a simple Statistical Model of Generic Natural Images? David Mumford](https://reader036.vdocument.in/reader036/viewer/2022062803/5681469a550346895db3b344/html5/thumbnails/28.jpg)
‘Unification’ grammar, Shieber and Geman
Parsing the concept of a square. Parts of the square belong to more than one intermediate-level grouping. We no longer have a tree.
‘Compositional’ grammars (Geman-Bienenstock) are a beginning of a stochastic version of this construction.