Training Kinect. Mihai Budiu, Microsoft Research, Silicon Valley. UCSD CNS 2012 Research Review, February 8, 2012. Label body parts in depth map.
Training Kinect
Mihai Budiu
Microsoft Research, Silicon Valley
UCSD CNS 2012 RESEARCH REVIEW February 8, 2012
2
Label body parts in depth map
Parallelizing the Training of the Kinect Body Parts Labeling Algorithm
Mihai Budiu, Jamie Shotton, Derek G. Murray, and Mark Finocchio
Big Learning: Algorithms, Systems and Tools for Learning at Scale, Sierra Nevada, Spain, December 16-17, 2011
3
Solution: Learn from Data
Classifier
Training examples
Machine learning
4
Big data
• 1M training examples
• 300,000 pixels/image
• 100,000 features
• <2^20 tree nodes/tree
• 31 body parts
• 3 trees
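To get a feel for these counts, they can simply be multiplied out. The sketch below is back-of-the-envelope arithmetic on the slide's figures only; the variable names are illustrative, and reading the node bound as 2^20 is an assumption about a superscript lost in the transcript.

```python
# Back-of-the-envelope scale of the training problem, using the slide's counts.
images = 1_000_000             # training examples
pixels_per_image = 300_000
features = 100_000
trees = 3
max_nodes_per_tree = 2 ** 20   # assumption: the slide's bound read as 2^20

total_pixels = images * pixels_per_image
# Upper bound if every feature were evaluated once on every pixel
pixel_feature_pairs = total_pixels * features
max_nodes = trees * max_nodes_per_tree

print(f"{total_pixels:.1e} pixels")
print(f"{pixel_feature_pairs:.1e} pixel-feature pairs")
print(f"{max_nodes:,} tree nodes at most")
```

Even as a loose upper bound, the product shows why a single machine is not a comfortable fit for this workload.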
Dryad
DryadLINQ
Decision forest inference
Classifier
Execution
Application
Data-Parallel Computation
5
Storage
Language
Parallel Databases
Map-Reduce
GFS, BigTable
Cosmos, Azure, HPC
Dryad
DryadLINQ; Sawzall, FlumeJava
Hadoop
HDFS, S3
Pig, Hive; SQL; ≈SQL; LINQ; Sawzall, Java
6
Dryad = 2-D Piping
• Unix Pipes: 1-D
grep | sed | sort | awk | perl
• Dryad: 2-D
grep^1000 | sed^500 | sort^1000 | awk^500 | perl^50
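The 2-D idea (each pipeline stage replicated many times, with data redistributed between stages) can be sketched in a few lines. This is a toy illustration, not Dryad's API: `run_stage`, `repartition`, and the two stage functions are hypothetical names, and threads on one machine stand in for processes on a cluster.

```python
from concurrent.futures import ThreadPoolExecutor

def run_stage(func, partitions, width):
    """Run func on every partition, using up to `width` parallel workers."""
    with ThreadPoolExecutor(max_workers=width) as pool:
        return list(pool.map(func, partitions))

def repartition(partitions, n):
    """Redistribute all records round-robin into n partitions."""
    records = [r for p in partitions for r in p]
    return [records[i::n] for i in range(n)]

# Hypothetical stages standing in for grep and sed
def grep_stage(lines):
    return [line for line in lines if "kinect" in line]

def sed_stage(lines):
    return [line.replace("kinect", "Kinect") for line in lines]

lines = ["kinect depth map", "unrelated", "train kinect", "kinect forest"]
parts = repartition([lines], 4)           # like grep^4
parts = run_stage(grep_stage, parts, 4)
parts = repartition(parts, 2)             # like sed^2
parts = run_stage(sed_stage, parts, 2)
result = sorted(r for p in parts for r in p)
print(result)
```

The width of each stage is independent of its neighbors, which is what makes the pipeline two-dimensional rather than a single chain.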
7
Virtualized 2-D Pipelines
• 2D DAG
• multi-machine
• virtualized
12
Fault Tolerance
13
LINQ
Dryad
=> DryadLINQ
14
LINQ = .NET + Queries
Collection<T> collection;
bool IsLegal(Key k);
string Hash(Key k);

var results = from c in collection
              where IsLegal(c.key)
              select new { hash = Hash(c.key), c.value };
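For readers who do not use C#, the query above filters a collection and projects each surviving element to a (hash, value) pair. A rough Python analogue follows; the `Record` type, the predicate, and the hash function are all hypothetical stand-ins, since the slide only declares `IsLegal` and `Hash` without bodies.

```python
from dataclasses import dataclass
import hashlib

@dataclass
class Record:
    key: str
    value: int

def is_legal(key: str) -> bool:
    # Hypothetical predicate: accept only purely alphabetic keys
    return key.isalpha()

def hash_key(key: str) -> str:
    # Hypothetical hash: SHA-256 hex digest of the key
    return hashlib.sha256(key.encode("utf-8")).hexdigest()

collection = [Record("alice", 1), Record("b0b", 2), Record("carol", 3)]

# where IsLegal(c.key) select { Hash(c.key), c.value }
results = [(hash_key(c.key), c.value) for c in collection if is_legal(c.key)]
```

The comprehension plays the role of the `where` and `select` clauses; nothing here is distributed yet.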
15
DryadLINQ Data Model
Partition
Collection
.NET objects
16
Collection<T> collection;
bool IsLegal(Key k);
string Hash(Key k);

var results = from c in collection
              where IsLegal(c.key)
              select new { hash = Hash(c.key), c.value };
DryadLINQ = LINQ + Dryad
[Diagram: the C# query over collection is turned into a query plan (Dryad job) plus C# vertex code; the job consumes the data and produces results]
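The split between a query plan and vertex code can be illustrated with a small sketch: the query is turned into a function that processes one partition, and the plan runs one copy of it per partition. `compile_query` is a hypothetical helper, not part of DryadLINQ, and a thread pool stands in for cluster machines.

```python
from concurrent.futures import ThreadPoolExecutor

def compile_query(predicate, projection):
    """Turn a where/select query into vertex code that handles one partition."""
    def vertex(partition):
        return [projection(c) for c in partition if predicate(c)]
    return vertex

# A collection stored as three partitions of (key, value) pairs (toy data)
partitions = [
    [("a", 1), ("b2", 2)],
    [("c", 3)],
    [("d4", 4), ("e", 5)],
]

vertex = compile_query(
    predicate=lambda c: c[0].isalpha(),
    projection=lambda c: (c[0].upper(), c[1]),
)

# The "query plan" here is a single stage: one vertex per partition
with ThreadPoolExecutor(max_workers=len(partitions)) as pool:
    result_partitions = list(pool.map(vertex, partitions))

results = [row for part in result_partitions for row in part]
```

The output stays partitioned until it is explicitly flattened, mirroring the partitioned collection model on the previous slide.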
17
Kinect Training Pipeline
20x
18
Partial tree, Images, Features
split
New partial tree
Query plan for one tree layer
Parallelize on:
• Features
• Images
• Tree nodes
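One common way to parallelize a tree layer along these three axes is to build label histograms per partition of images, sum them, and then choose a split for every frontier node. The sketch below follows that pattern under simplifying assumptions (a single fixed threshold per feature, toy data, weighted entropy as the score); it is not the authors' actual query plan, and all names are illustrative.

```python
from collections import Counter
from math import log2

THRESHOLD = 0.5  # assumption: one fixed threshold per feature, for brevity

def partial_histograms(pixels):
    """Map step on one partition: count labels per (node, feature, side)."""
    hist = Counter()
    for node, label, features in pixels:
        for f, value in enumerate(features):
            side = value >= THRESHOLD
            hist[(node, f, side, label)] += 1
    return hist

def entropy(counts):
    total = sum(counts)
    return -sum(c / total * log2(c / total) for c in counts if c)

def best_splits(hist):
    """For every frontier node, pick the feature with the lowest weighted entropy."""
    by_split = {}
    for (node, f, side, label), n in hist.items():
        by_split.setdefault((node, f), {}).setdefault(side, []).append(n)
    best = {}
    for (node, f), sides in by_split.items():
        total = sum(sum(c) for c in sides.values())
        score = sum(sum(c) / total * entropy(c) for c in sides.values())
        if node not in best or score < best[node][1]:
            best[node] = (f, score)
    return {node: f for node, (f, _) in best.items()}

# Two partitions of (node id, body-part label, feature values); toy data
partitions = [
    [(0, "hand", [0.9, 0.2]), (0, "head", [0.1, 0.3])],
    [(0, "hand", [0.8, 0.7]), (0, "head", [0.2, 0.6])],
]
# Reduce step: histograms from different partitions simply add up
merged = sum((partial_histograms(p) for p in partitions), Counter())
splits = best_splits(merged)
print(splits)
```

Because the histograms are additive, the map step can be spread across images, while the scoring loop can be spread across features and tree nodes.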
19
High cluster utilization
[Plot axes: Time, Machine]
20
CONCLUSIONS
21
Huge Commercial Success
22
Tremendous Interest from Developers
23
Consumer Technologies Push The Envelope
Price: 6000$
Price: 150$
24
Unique Opportunity for Technology Transfer
25
I can finally explain to my son what I do for a living…
26
BACKUP SLIDES
27
Training efficiency
[Plot: core × hours / image (y-axis, ticks 0.05 to 0.15) vs. number of training images (x-axis, log scale); series: 8 core machine, 1000 core cluster]
28
Cluster usage for one tree
[Plot: Machine (235) vs. Time (s); phases labeled Preprocess, 1 through 19 (with a failed attempt at 19), Normalize Tree; 14400 processes]
18.3 hours, 137.2 CPU days, 107421 processes, 29.56 TB data, average parallelism = 140
29
DryadLINQ Language Summary
Where
Select
GroupBy
OrderBy
Aggregate
Join
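Each of these operators has a familiar counterpart in most languages. As a rough orientation, the sketch below shows Python standard-library equivalents on toy data; these are analogues for illustration, not the DryadLINQ API, and the sample records are made up.

```python
from functools import reduce
from itertools import groupby

people = [("ann", "sf", 30), ("bob", "ny", 25), ("cat", "sf", 41)]
cities = [("sf", "California"), ("ny", "New York")]

# Where: keep elements that satisfy a predicate
where = [p for p in people if p[2] > 26]

# Select: project each element
select = [p[0] for p in people]

# GroupBy: itertools.groupby needs its input sorted by the grouping key
by_city = sorted(people, key=lambda p: p[1])
group_by = {k: [p[0] for p in g] for k, g in groupby(by_city, key=lambda p: p[1])}

# OrderBy: sort by a key
order_by = sorted(people, key=lambda p: p[2])

# Aggregate: fold the collection into a single value
aggregate = reduce(lambda acc, p: acc + p[2], people, 0)

# Join: match elements of two collections on a shared key
join = [(p[0], c[1]) for p in people for c in cities if p[1] == c[0]]
```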