
Post on 19-Dec-2015


Fast Computational Methods for Visually Guided Robots

Maryam Mahdaviani, Nando de Freitas, Bob Fraser and Firas Hamze Department of Computer Science, University of British Columbia, CANADA

We apply semi-supervised and active learning algorithms (Zhu et al., 2003) to interactive object recognition in visually guided robots.

These algorithms cost O(M^3), but we will show that the cost can be reduced to O(M). We will also reduce storage from O(M^2) to O(M).


Object recognition with semi-supervised data and simple color features


Aibo is able to identify objects in different settings.

Aibo can learn and classify several objects at the same time.


Semi-supervised Learning

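We have: input data x and two labels y_l (one point labeled $y_l = 1$, one labeled $y_l = 0$); every other point has an unknown label $y_u = ?$.

We want: a full labeling of the data.

Each pair of points $x_i$, $x_j$ is connected by an edge with weight

$$w_{ij} = e^{-\frac{1}{\sigma}\|x_i - x_j\|^2}$$

Error:

$$\text{Error} = \sum_{i=1}^{M} \sum_{j=1}^{M} w_{ij}\,(y_i - y_j)^2$$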
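As a rough illustration of the graph weights and the Error function, here is a minimal Python sketch. It assumes a Gaussian weight $w_{ij} = \exp(-\|x_i - x_j\|^2/\sigma)$; the bandwidth name `sigma`, the function names and the toy 1-D features are hypothetical and not taken from the slides.

```python
import math

def gaussian_weights(xs, sigma):
    """w_ij = exp(-||x_i - x_j||^2 / sigma) for every pair of feature vectors."""
    def sqdist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return [[math.exp(-sqdist(xi, xj) / sigma) for xj in xs] for xi in xs]

def labeling_error(w, y):
    """Error = sum_i sum_j w_ij (y_i - y_j)^2."""
    m = len(y)
    return sum(w[i][j] * (y[i] - y[j]) ** 2 for i in range(m) for j in range(m))

# Hypothetical 1-D "colour" features forming two tight clusters.
xs = [(0.0,), (0.1,), (1.0,), (1.1,)]
w = gaussian_weights(xs, sigma=0.1)
smooth = labeling_error(w, [1, 1, 0, 0])  # labels follow the clusters
rough = labeling_error(w, [1, 0, 1, 0])   # labels cut through the clusters
```

A labeling that respects the clusters gives a much smaller Error than one that splits them, which is why minimizing the Error propagates the few known labels to nearby points.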


Semi-supervised Learning Leads to a Linear System of Equations

Differentiating the Error function and equating it to zero gives the solution in terms of a linear system of equations (Zhu et al., 2003):

$$(D_{uu} - W_{uu})\, y_u = W_{ul}\, y_l$$

where W is the adjacency matrix, partitioned into labeled (l) and unlabeled (u) blocks, and D is the diagonal matrix of its row sums:

$$W = \begin{bmatrix} W_{ll} & W_{lu} \\ W_{ul} & W_{uu} \end{bmatrix}, \qquad D = \begin{bmatrix} \sum_{j}^{N} w_{1j} & & & 0 \\ & \sum_{j}^{N} w_{2j} & & \\ & & \ddots & \\ 0 & & & \sum_{j}^{N} w_{Nj} \end{bmatrix}$$
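To make the linear system $(D_{uu} - W_{uu})\, y_u = W_{ul}\, y_l$ concrete, the sketch below solves it directly on a tiny chain graph. The helper names (`solve_dense`, `harmonic_labels`) and the unit-weight graph are illustrative assumptions. The dense elimination is the O(M^3) route that the later slides set out to avoid.

```python
def solve_dense(a, b):
    """Gaussian elimination with partial pivoting; O(M^3) for an M x M system."""
    n = len(b)
    aug = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - s) / aug[r][r]
    return x

def harmonic_labels(w, labeled, y_l):
    """Solve (D_uu - W_uu) y_u = W_ul y_l for the unlabeled nodes."""
    n = len(w)
    unlabeled = [i for i in range(n) if i not in labeled]
    d = [sum(row) for row in w]  # row sums d_i = sum_j w_ij
    a = [[(d[i] if i == j else 0.0) - w[i][j] for j in unlabeled] for i in unlabeled]
    b = [sum(w[i][l] * yl for l, yl in zip(labeled, y_l)) for i in unlabeled]
    return unlabeled, solve_dense(a, b)

# Chain graph 0 - 1 - 2 - 3 with unit weights; node 0 is labeled 1, node 3 is labeled 0.
w = [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]]
nodes, y_u = harmonic_labels(w, labeled=[0, 3], y_l=[1.0, 0.0])
```

On this chain the two unlabeled nodes receive the values 2/3 and 1/3, so the labels interpolate smoothly between the two labeled end points.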


The big computational bottleneck is O(M^3)

Solving the linear system of equations

$$(D_{uu} - W_{uu})\, y_u = W_{ul}\, y_l$$

costs O(M^3), where M is a large number of unlabeled features.

What is large M?

1955: M = 20
1965: M = 200
1980: M = 2000
1995: M = 20000
2005: M = 200000

So over the course of 50 years M has increased by a factor of 10^4.

However, the speed of computers has increased by a factor of 10^12. Because the cost grows as M^3, a 10^12 speedup only buys a (10^12)^(1/3) = 10^4 increase in M. From this, the problematic O(M^3) bottleneck is evident.


From O(M^3) to O(M^2): Krylov Iterative Methods (MINRES)

• Using iterative methods (of which Krylov methods are well known to work best), the cost can be reduced to O(M^2) times the number of iterations. The expensive step in each iteration is the following matrix-vector multiplication:

$$v = (D_{uu} - W_{uu})\, q$$

• This matrix-vector multiplication can be written as two O(M^2) Gaussian kernel estimates:

$$d_i = \sum_{j=1}^{M} w_{ij}, \qquad g_i = \sum_{j=1}^{M} q_j\, w_{ij}$$

• These kernel estimates can be computed in O(M) operations using the Fast Gauss Transform.
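The sketch below shows one way the matrix-vector product $v = (D_{uu} - W_{uu})\, q$ might be evaluated as two kernel sums without ever storing W, and how a Krylov solver only needs that product. Two assumptions are worth flagging: the solver here is conjugate gradients rather than MINRES (chosen purely for brevity; both are Krylov methods driven by the same matvec), and the kernel, the `sigma` bandwidth and all function names are hypothetical.

```python
import math

def kernel(a, b, sigma):
    """Assumed Gaussian weight exp(-||a - b||^2 / sigma)."""
    return math.exp(-sum((p - q) ** 2 for p, q in zip(a, b)) / sigma)

def make_matvec(x_u, x_l, sigma):
    """Return q -> (D_uu - W_uu) q using O(M) storage (W is never formed)."""
    # d_i: kernel sum with unit weights over all points, labeled and unlabeled.
    d = [sum(kernel(xi, xj, sigma) for xj in x_u + x_l) for xi in x_u]
    def matvec(q):
        # g_i: kernel sum with weights q_j over the unlabeled points.
        g = [sum(kernel(xi, xj, sigma) * qj for xj, qj in zip(x_u, q)) for xi in x_u]
        return [di * qi - gi for di, qi, gi in zip(d, q, g)]
    return matvec

def conjugate_gradient(matvec, b, tol=1e-10, max_iter=200):
    """Plain CG for a symmetric positive definite operator given only as a matvec."""
    x = [0.0] * len(b)
    r = b[:]
    p = r[:]
    rs = sum(v * v for v in r)
    for _ in range(max_iter):
        if math.sqrt(rs) < tol:
            break
        ap = matvec(p)
        alpha = rs / sum(pi * ai for pi, ai in zip(p, ap))
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * ai for ri, ai in zip(r, ap)]
        rs_new = sum(v * v for v in r)
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return x

# Hypothetical 1-D features: two labeled points and four unlabeled ones.
x_l, y_l = [(0.0,), (1.0,)], [1.0, 0.0]
x_u = [(0.1,), (0.2,), (0.8,), (0.9,)]
sigma = 0.05
matvec = make_matvec(x_u, x_l, sigma)
b = [sum(kernel(xi, xj, sigma) * yj for xj, yj in zip(x_l, y_l)) for xi in x_u]  # W_ul y_l
y_u = conjugate_gradient(matvec, b)
```

Each call to `matvec` still costs O(M^2) here because the kernel sums are evaluated by brute force. Replacing those two sums with a fast Gauss transform is what would bring the per-iteration cost down to O(M).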


From O(M^2) to O(M): The Fast Gauss Transform

• Intuition:

* L. Greengard and V. Rokhlin, 1987

Storage requirement is also reduced from O(M^2) to O(M)!


Training in Real Time


Predicting Pixel Labels

• Once we have labels for M points in our training data, we use a classical kernel discriminant for N test pixels:

$$y_k = \frac{\sum_{i=1}^{M} w_{ik}\, y_i}{\sum_{i=1}^{M} w_{ik}}$$

• The cost is O(NM)!

• By applying the Fast Gauss Transform the cost can be reduced to O(N+M).
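The kernel discriminant $y_k = \sum_i w_{ik}\, y_i / \sum_i w_{ik}$ can be sketched in a few lines of Python. This is the brute-force O(NM) version; the Gaussian form of $w_{ik}$, the `sigma` value and the toy 2-D features are assumptions made for illustration.

```python
import math

def predict_labels(train_x, train_y, test_x, sigma):
    """y_k = sum_i w_ik y_i / sum_i w_ik with an assumed Gaussian w_ik; cost O(N*M)."""
    preds = []
    for xk in test_x:
        w = [math.exp(-sum((a - b) ** 2 for a, b in zip(xi, xk)) / sigma)
             for xi in train_x]
        preds.append(sum(wi * yi for wi, yi in zip(w, train_y)) / sum(w))
    return preds

# Hypothetical 2-D colour features: two labeled clusters and three test pixels.
train_x = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (0.9, 1.0)]
train_y = [1.0, 1.0, 0.0, 0.0]
test_x = [(0.05, 0.0), (0.95, 1.0), (0.5, 0.5)]
preds = predict_labels(train_x, train_y, test_x, sigma=0.1)
```

Test pixels close to one cluster get a prediction near that cluster's label, while the pixel halfway between the clusters gets a value near 0.5. Both the numerator and the denominator are Gaussian kernel sums, which is why the Fast Gauss Transform applies to them.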


Predicting Pixel Labels in Real Time


Active Learning

• Labeling data is an expensive process. We use active learning to choose automatically which pixels should be labeled.

• Active learning calls the semi-supervised learning subroutine at each iteration.


Active learning: asking the right questions

Aibo recognizes the ball without a problem.

Since the orange ring is close to the ball in colour space, Aibo gets confused and decides to prompt the user for labels.

• We want the robot to ask the right questions. The robot prompts the user for the labels that will improve its performance the most.
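The loop structure of active learning (call the semi-supervised subroutine, pick a point, ask for its label, repeat) might look like the sketch below. The selection rule is an assumption: the slides do not state the criterion, so this sketch simply queries the point whose predicted label is closest to 0.5 (uncertainty sampling) as a stand-in. The subroutine is a simple iterative label propagation, and `oracle` is a hypothetical stand-in for prompting the user.

```python
import math

def propagate(w, labels, iters=500):
    """Semi-supervised subroutine: sweep y_i <- sum_j w_ij y_j / sum_j w_ij over unlabeled i."""
    n = len(w)
    y = [labels.get(i, 0.5) for i in range(n)]
    for _ in range(iters):
        for i in range(n):
            if i not in labels:
                num = sum(w[i][j] * y[j] for j in range(n) if j != i)
                den = sum(w[i][j] for j in range(n) if j != i)
                y[i] = num / den
    return y

def active_learning(xs, oracle, labels, sigma, queries):
    """Repeatedly relabel the graph and query the most ambiguous unlabeled point."""
    w = [[math.exp(-(a - b) ** 2 / sigma) for b in xs] for a in xs]
    labels = dict(labels)
    for _ in range(queries):
        y = propagate(w, labels)
        candidates = [i for i in range(len(xs)) if i not in labels]
        if not candidates:
            break
        ask = min(candidates, key=lambda i: abs(y[i] - 0.5))  # assumed criterion
        labels[ask] = oracle(ask)                             # "prompt the user"
    return labels, propagate(w, labels)

# Hypothetical 1-D features; the point at 0.5 sits between the two labeled ends.
xs = [0.0, 0.1, 0.2, 0.5, 0.8, 0.9, 1.0]
truth = [1.0 if x < 0.5 else 0.0 for x in xs]
labels, y = active_learning(xs, oracle=lambda i: truth[i],
                            labels={0: 1.0, 6: 0.0}, sigma=0.05, queries=1)
```

With one query allowed, the sketch asks about the middle point, which is the one the current labeling is least sure about, much like the orange ring that is close to the ball in colour space.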


• We managed to reduce the computational cost from O(M^3) to O(M) and the storage requirement from O(M^2) to O(M).

• Currently we are using more sophisticated features (SIFT) and dual KD-tree recursion methods to deal with high dimensions.

• These methods can be applied to other problems such as SLAM, segmentation, ranking and Gaussian Processes.

Thank You!

Questions?


• One solution: Power Method

$$y_u^{(t+1)} = D_{uu}^{-1}\left[\,W_{uu}\, y_u^{(t)} + W_{ul}\, y_l\,\right]$$

• O(N^2) per iteration

• But it might take TOO MANY iterations to converge
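A minimal sketch of the power-method update $y_u^{(t+1)} = D_{uu}^{-1}[W_{uu}\, y_u^{(t)} + W_{ul}\, y_l]$ follows, run on a small unit-weight chain graph. The function name, the zero starting guess and the graph are illustrative assumptions.

```python
def power_method(w, labeled, y_l, iters):
    """Iterate y_u <- D_uu^{-1} (W_uu y_u + W_ul y_l); each sweep costs O(N^2)."""
    n = len(w)
    fixed = dict(zip(labeled, y_l))
    unlabeled = [i for i in range(n) if i not in fixed]
    d = [sum(row) for row in w]          # diagonal of D: row sums of W
    y_u = {i: 0.0 for i in unlabeled}    # assumed starting guess
    for _ in range(iters):
        y_u = {
            i: (sum(w[i][j] * y_u[j] for j in unlabeled)
                + sum(w[i][l] * fixed[l] for l in labeled)) / d[i]
            for i in unlabeled
        }
    return [y_u[i] for i in unlabeled]

# Chain graph 0 - 1 - 2 - 3 with unit weights; node 0 is labeled 1, node 3 is labeled 0.
w = [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]]
few = power_method(w, labeled=[0, 3], y_l=[1.0, 0.0], iters=3)
many = power_method(w, labeled=[0, 3], y_l=[1.0, 0.0], iters=60)
```

After 3 sweeps the estimate is still visibly off from the exact answer (2/3, 1/3), and it only settles after many more sweeps. On larger, more strongly coupled graphs the convergence is slower still, which is the drawback noted above.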


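The Fast Gauss Transform - Reduction of Complexity

Straightforward (nested loop) evaluation of the kernel sum

$$g_j = \sum_{i=1}^{N} w_{ij}\, q_i^{(t)}, \qquad j = 1, \ldots, M$$

costs O(MN):

    for j = 1, ..., M
        g_j = 0
        for i = 1, ..., N
            g_j = g_j + w_ij q_i
        end
    end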


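The Fast Gauss Transform - Reduction of Complexity

With a series expansion truncated to p << N terms about a center x_*, the nested loop separates into two independent loops:

    for m = 0, ..., p-1
        c_m = 0
        for i = 1, ..., N
            c_m = c_m + a_m(x_i - x_*) q_i
        end
    end

    for j = 1, ..., N
        g_j = 0
        for m = 0, ..., p-1
            g_j = g_j + c_m f_m(y_j - x_*)
        end
    end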
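The two loop structures can be compared numerically with a small 1-D sketch. The specific choices here are assumptions, since the slides do not define $a_m$ and $f_m$: this sketch uses the Hermite expansion of the Gaussian, with $a_m(x) = (x/\sqrt{\sigma})^m / m!$ and $f_m(y) = h_m(y/\sqrt{\sigma})$, where $h_m(t) = H_m(t)\, e^{-t^2}$. It also uses a single expansion center, whereas the full Fast Gauss Transform subdivides space into boxes.

```python
import math
import random

def naive_gauss_sum(xs, q, ys, sigma):
    """Nested loop: g_j = sum_i q_i exp(-(y_j - x_i)^2 / sigma); cost O(N*M)."""
    return [sum(qi * math.exp(-(y - x) ** 2 / sigma) for x, qi in zip(xs, q))
            for y in ys]

def hermite_functions(t, p):
    """h_m(t) = H_m(t) exp(-t^2) for m = 0..p-1 via the three-term recurrence."""
    h = [math.exp(-t * t), 2.0 * t * math.exp(-t * t)]
    for m in range(1, p - 1):
        h.append(2.0 * t * h[m] - 2.0 * m * h[m - 1])
    return h[:p]

def factored_gauss_sum(xs, q, ys, sigma, center, p):
    """Single-center Hermite expansion; cost O(p*N + p*M) instead of O(N*M)."""
    scale = math.sqrt(sigma)
    # Loop 1 (sources): c_m = sum_i a_m(x_i - x*) q_i
    c = [0.0] * p
    for x, qi in zip(xs, q):
        s, term = (x - center) / scale, qi
        for m in range(p):
            c[m] += term            # term == q_i * s^m / m!
            term *= s / (m + 1)
    # Loop 2 (targets): g_j = sum_m c_m f_m(y_j - x*)
    g = []
    for y in ys:
        h = hermite_functions((y - center) / scale, p)
        g.append(sum(cm * hm for cm, hm in zip(c, h)))
    return g

rng = random.Random(0)
xs = [rng.random() for _ in range(50)]   # source points in [0, 1]
q = [rng.random() for _ in range(50)]    # source weights
ys = [rng.random() for _ in range(40)]   # target points in [0, 1]
exact = naive_gauss_sum(xs, q, ys, sigma=1.0)
approx = factored_gauss_sum(xs, q, ys, sigma=1.0, center=0.5, p=16)
max_err = max(abs(a - b) for a, b in zip(exact, approx))
```

The sources are visited once to build the p coefficients and the targets once to evaluate them, so no source-target pair is ever touched directly. The truncation error shrinks quickly with p as long as the sources lie close to the center relative to the kernel width.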


Training Set Time Comparison

Computational time (seconds)

M       Naïve       MINRES      MINRES-FGT
60      0.521703    0.119514    0.312126
120     4.23732     0.250050    0.518589
240     78.7864     0.729464    0.791181
480     501.46      2.56246     1.165930
960     --------    63.9487     2.02537
1920    --------    497.59      3.97674


Test Set Time Comparison

Computational time (seconds)

N       Naive        FGT
260     0.0036083    0.035944
520     0.128507     0.113086
1040    0.458446     0.178275
2080    1.69306      0.321210
4160    6.62728      0.682747
8320    20.56953     0.858313


Krylov Subspace Methods: MINRES Algorithm

For t = 1, 2, 3, ...

    $v = (D_{uu} - W_{uu})\, q^{(t)}$
    orthogonalize $v$ against $q^{(t)}$ and $q^{(t-1)}$, then normalize it to obtain $q^{(t+1)}$
    find $c$ to minimize $\|\tilde{H}^{(t)} c - \|b\|\, e_1\|$
    and set $y_u^{(t)} = Q^{(t)} c$

The cost can be reduced to O(M^2) times the number of iterations.