image searches, abstraction, invariancecshalizi/350/lectures/04/lecture_04...abstract level: feature...

27
Image Searches, Abstraction, Invariance 36-350: Data Mining 2 September 2009 1

Upload: others

Post on 29-Aug-2020

0 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Image Searches, Abstraction, Invariancecshalizi/350/lectures/04/Lecture_04...Abstract level: feature vectors Concrete level: meaningful objects Text 1 Text 2 Text 3 W W v1 v2 v3 v4

Image Searches, Abstraction, Invariance

36-350: Data Mining2 September 2009

1

Page 2: Image Searches, Abstraction, Invariancecshalizi/350/lectures/04/Lecture_04...Abstract level: feature vectors Concrete level: meaningful objects Text 1 Text 2 Text 3 W W v1 v2 v3 v4

• Medical: x-rays, brain imaging, histology (“do these look like cancerous cells?”)

• Satellite imagery

• Fingerprints

• Finding illustrations for lectures...

2

Page 3: Image Searches, Abstraction, Invariancecshalizi/350/lectures/04/Lecture_04...Abstract level: feature vectors Concrete level: meaningful objects Text 1 Text 2 Text 3 W W v1 v2 v3 v4

Searching for Images by Searching for Text

• Assume there’s text accompanying the images (“annotation”)

• tags

• Search those text records with the query phrase

• Take images which appear close to the query phrase on highly-ranked records

• This how Google does it

3

Page 4: Image Searches, Abstraction, Invariancecshalizi/350/lectures/04/Lecture_04...Abstract level: feature vectors Concrete level: meaningful objects Text 1 Text 2 Text 3 W W v1 v2 v3 v4

Sometimes this works perfectly...

4

Page 5: Image Searches, Abstraction, Invariancecshalizi/350/lectures/04/Lecture_04...Abstract level: feature vectors Concrete level: meaningful objects Text 1 Text 2 Text 3 W W v1 v2 v3 v4

...and sometimes it doesn’t;

depends on the text!

5

Page 6: Image Searches, Abstraction, Invariancecshalizi/350/lectures/04/Lecture_04...Abstract level: feature vectors Concrete level: meaningful objects Text 1 Text 2 Text 3 W W v1 v2 v3 v4

Searching for images by representing images

• For text, we only cared about features, and only worked with feature vectors

• Define numerical features for images and everything carries over

• Abstraction

6

Page 7: Image Searches, Abstraction, Invariancecshalizi/350/lectures/04/Lecture_04...Abstract level: feature vectors Concrete level: meaningful objects Text 1 Text 2 Text 3 W W v1 v2 v3 v4

Abstraction

• Remove some of the details but keep others• Kept details = features

• Then act on abstracta

• Hopes:• Simplifies problem• Lets you treat many problems similarly

7

Page 8: Image Searches, Abstraction, Invariancecshalizi/350/lectures/04/Lecture_04...Abstract level: feature vectors Concrete level: meaningful objects Text 1 Text 2 Text 3 W W v1 v2 v3 v4

Abstract level: feature vectors

Concrete level: meaningful objects

Text 1 Text 2 Text 3

BoW

BoW

v1 v2 v3 v4 v5 v6

Similaritymatching

DimensionalityReduction Classification Clustering

BoW

BoW

Text 4 Text 5 Text 6

BoW

BoW

etc.

8

Page 9: Image Searches, Abstraction, Invariancecshalizi/350/lectures/04/Lecture_04...Abstract level: feature vectors Concrete level: meaningful objects Text 1 Text 2 Text 3 W W v1 v2 v3 v4

Concrete level: meaningful objects

Text 1 Text 2 Text 3

Topi

cs

v1 v2 v3 v4 v5 v6

Topi

csText 4 Text 5 Text 6

Topi

cs

Topi

cs

Topi

cs

Topi

cs

Abstract level: feature vectors

Similaritymatching

DimensionalityReduction Classification Clustering etc.

9

Page 10: Image Searches, Abstraction, Invariancecshalizi/350/lectures/04/Lecture_04...Abstract level: feature vectors Concrete level: meaningful objects Text 1 Text 2 Text 3 W W v1 v2 v3 v4

Concrete level: meaningful objects

Bitm

ap

v1 v2 v3 v4 v5 v6

Pic. 1 Pic. 2 Pic. 3 Pic. 4 Pic. 5 Pic.6

Bitm

ap

Bitm

ap

Bitm

ap

Bitm

ap

Bitm

ap

Abstract level: feature vectors

Similaritymatching

DimensionalityReduction Classification Clustering etc.

10

Page 11: Image Searches, Abstraction, Invariancecshalizi/350/lectures/04/Lecture_04...Abstract level: feature vectors Concrete level: meaningful objects Text 1 Text 2 Text 3 W W v1 v2 v3 v4

Concrete level: meaningful objects

v1 v2 v3 v4 v5 v6

Pic. 1 Pic. 2 Pic. 3 Pic. 4 Pic. 5 Pic.6

Bag

of c

olor

s

Bag

of c

olor

s

Bag

of c

olor

s

Bag

of c

olor

s

Bag

of c

olor

s

Bag

of c

olor

s

Abstract level: feature vectors

Similaritymatching

DimensionalityReduction Classification Clustering etc.

11

Page 12: Image Searches, Abstraction, Invariancecshalizi/350/lectures/04/Lecture_04...Abstract level: feature vectors Concrete level: meaningful objects Text 1 Text 2 Text 3 W W v1 v2 v3 v4

Concrete level: meaningful objects

Network 6

Mot

ifs

v1 v2 v3 v4 v5 v6

Network 1 Network 2 Network 3 Network 4 Network 5

Mot

ifs

Mot

ifs

Mot

ifs

Mot

ifs

Mot

ifs

Abstract level: feature vectors

Similaritymatching

DimensionalityReduction Classification Clustering etc.

12

Page 13: Image Searches, Abstraction, Invariancecshalizi/350/lectures/04/Lecture_04...Abstract level: feature vectors Concrete level: meaningful objects Text 1 Text 2 Text 3 W W v1 v2 v3 v4

• Need to find right (relevant) representation

• Representation = concrete/abstract interface• Go read The Sciences of the Artificial!

• Great methods at the abstract level generally fail if the representation is bad• missing what’s relevant• including what’s irrelevant• comparing apples to kangaroos

• both multicellular sexually-reproducing carbon-based lifeforms...

• A lot of your work will be designing representations

13

Page 14: Image Searches, Abstraction, Invariancecshalizi/350/lectures/04/Lecture_04...Abstract level: feature vectors Concrete level: meaningful objects Text 1 Text 2 Text 3 W W v1 v2 v3 v4

Concrete level: meaningful objects

Text 1 Text 2 Text 3

BoW

BoW

Topi

cs

Pic. 1 Pic. 2

SocialNetwork

v1 v2 v3 v4 v5 v6

Bag

of c

olor

s

Bitm

ap

Mot

ifs

Abstract level: feature vectors

Similaritymatching

DimensionalityReduction Classification Clustering etc.

14

Page 15: Image Searches, Abstraction, Invariancecshalizi/350/lectures/04/Lecture_04...Abstract level: feature vectors Concrete level: meaningful objects Text 1 Text 2 Text 3 W W v1 v2 v3 v4

flower1 flower2 flower3

tiger1 tiger2 tiger3

ocean1 ocean2 ocean315

Page 16: Image Searches, Abstraction, Invariancecshalizi/350/lectures/04/Lecture_04...Abstract level: feature vectors Concrete level: meaningful objects Text 1 Text 2 Text 3 W W v1 v2 v3 v4

Euclidean Distance of Images

• Image is MxN pixels, each with 3 color components, so a 3MN vector

• Euclidean distance possible, and OK for some kinds of noise-removal

• but hopeless even at grouping flower1 with flower2

• or slight changes in perspective, lighting...

16

Page 17: Image Searches, Abstraction, Invariancecshalizi/350/lectures/04/Lecture_04...Abstract level: feature vectors Concrete level: meaningful objects Text 1 Text 2 Text 3 W W v1 v2 v3 v4

Bag of Colors

• “If it works, try it some more”

• For each possible color, count how many pixels there are of that color

• Use Euclidean distance on color-count vectors

• Too many colors, so quantize them down to a manageable number (like stemming, or combining synonyms)

17

Page 18: Image Searches, Abstraction, Invariancecshalizi/350/lectures/04/Lecture_04...Abstract level: feature vectors Concrete level: meaningful objects Text 1 Text 2 Text 3 W W v1 v2 v3 v4

flower1

flower2

flower3

flower4

flower5

flower6

flower7

flower8

flower9

tiger1

tiger2

tiger3

tiger4

tiger5

tiger6

tiger7

tiger8

tiger9

ocean1

ocean2

ocean3

ocean4

ocean5

ocean6

ocean7

ocean7ocean6ocean5ocean4ocean3ocean2ocean1tiger9tiger8tiger7tiger6tiger5tiger4tiger3tiger2tiger1

flower9flower8flower7flower6flower5flower4flower3flower2flower1

−1.0 −0.5 0.0 0.5 1.0−1.0

−0.5

0.0

0.5

1.0

V1

V2

flower1

flower2flower3

flower4

flower5

flower6

flower7

flower8

flower9

tiger1tiger2

tiger3

tiger4

tiger5

tiger6

tiger7tiger8

tiger9

ocean1

ocean2

ocean3ocean4

ocean5 ocean6

ocean7

flower ocean tigerMultidimensional scaling

Distances between images MDS plot of images18

Page 19: Image Searches, Abstraction, Invariancecshalizi/350/lectures/04/Lecture_04...Abstract level: feature vectors Concrete level: meaningful objects Text 1 Text 2 Text 3 W W v1 v2 v3 v4

Representation and Invariance

• Invariances of a representation = how can we change the underlying object without changing the representation?

• What differences does the representation ignore?

19

Page 20: Image Searches, Abstraction, Invariancecshalizi/350/lectures/04/Lecture_04...Abstract level: feature vectors Concrete level: meaningful objects Text 1 Text 2 Text 3 W W v1 v2 v3 v4

Invariants of bags of words

• Punctuation and word order

• Universal words (exact count of “the”, “of”, “to”, ...), if using inverse document frequency

• Word-endings, if using stemming

• Grammar, context, word proximity ...

• “Send lawyers, guns and money” vs. “Sending the Guns’ lawyers for the money”

20

Page 21: Image Searches, Abstraction, Invariancecshalizi/350/lectures/04/Lecture_04...Abstract level: feature vectors Concrete level: meaningful objects Text 1 Text 2 Text 3 W W v1 v2 v3 v4

Invariants of bags of colors

• Small changes in orientation, pose, some rotations

• Small amounts of color noise or weird colors

• Texture

21

Page 22: Image Searches, Abstraction, Invariancecshalizi/350/lectures/04/Lecture_04...Abstract level: feature vectors Concrete level: meaningful objects Text 1 Text 2 Text 3 W W v1 v2 v3 v4

Same color counts, different textures

22

Page 23: Image Searches, Abstraction, Invariancecshalizi/350/lectures/04/Lecture_04...Abstract level: feature vectors Concrete level: meaningful objects Text 1 Text 2 Text 3 W W v1 v2 v3 v4

Non-invariants

• Lighting, shadows

• Occlusion, 3D effects

• Blurring

• There are good ways to deal with blur (from astronomy)

• but full vision is very, very hard

23

Page 24: Image Searches, Abstraction, Invariancecshalizi/350/lectures/04/Lecture_04...Abstract level: feature vectors Concrete level: meaningful objects Text 1 Text 2 Text 3 W W v1 v2 v3 v4

• Breaking an invariance is easy

• e.g., add features for textures

• or sub-divide the image and do color-counts on each part

• Adding invariances is hard

• often need to go back to scratch and chose a different representation

24

Page 25: Image Searches, Abstraction, Invariancecshalizi/350/lectures/04/Lecture_04...Abstract level: feature vectors Concrete level: meaningful objects Text 1 Text 2 Text 3 W W v1 v2 v3 v4

Similarity search

with real images

from the web

(“retrievr”, see notes)

25

Page 26: Image Searches, Abstraction, Invariancecshalizi/350/lectures/04/Lecture_04...Abstract level: feature vectors Concrete level: meaningful objects Text 1 Text 2 Text 3 W W v1 v2 v3 v4

26

Page 27: Image Searches, Abstraction, Invariancecshalizi/350/lectures/04/Lecture_04...Abstract level: feature vectors Concrete level: meaningful objects Text 1 Text 2 Text 3 W W v1 v2 v3 v4

• Typically works better with more restricted domains (actually pretty good for medical images)

27