Petuum - A New Platform
TRANSCRIPT
PETUUM: A New Platform for Distributed Machine Learning on Big Data
Seminar on Big Data Architectures for Machine Learning and Data Mining
Outline

Introduction: the need, current problems
Motivation: why build a new framework?! scalability
Components & Approach: key concepts (data-parallelism, model-parallelism)
System Design: internals - schematic view (scheduler, workers)
YARN compatibility: problems & solutions proposed
Performance: comparison with different platforms
Introduction

Machine Learning is becoming a primary mechanism for extracting information from data.
ML methods need to scale beyond a single machine: Flickr, Instagram and Facebook host 10s of billions of images, and it is highly inefficient to feed such big data sequentially, in a batch fashion, to an ML algorithm.
Despite the rapid development of many new ML models and algorithms aimed at scalable application, adoption of these technologies remains generally unseen in the data mining, NLP, vision, and other application communities.
The difficult migration from an academic implementation (small desktop PCs or lab clusters) to a big, less predictable platform (a cloud or a corporate cluster) prevents ML models and algorithms from being widely applied.
Motivation

Find a systematic way to efficiently apply a wide spectrum of advanced ML programs to industrial-scale problems, using Big Models (up to 100s of billions of parameters) on Big Data (up to terabytes or petabytes).
Modern parallelization strategies employ fine-grained operations, making it difficult to find a universal platform applicable to a wide range of ML programs at scale.
Problems with other platforms
Hadoop:
o the simplicity of MapReduce makes it difficult to exploit ML properties (e.g., error tolerance)
o performance on many ML programs has been surpassed by alternatives
Spark:
o does not offer the fine-grained scheduling of computation and communication needed for fast and correct execution of advanced ML algorithms
Fig. 1: The scale of Big ML efforts in recent literature. A key goal of Petuum is to enable large ML models to be run on fewer resources, even relative to highly-specialized implementations. Examples shown:
o LDA with 20m unique words and 10k topics (220b sparse parameters) on a cluster with a total of 256 cores and 1TB memory.
o MF on a 20m-by-20k matrix with rank 400 (8b parameters) on a cluster with a total of 128 cores and 1TB memory.
o CNN with 1b parameters on a cluster with a total of 1024 CPU cores and 2TB memory; GPUs not required.
Components & Approach

Formalized ML algorithms as iterative-convergent programs:
o stochastic gradient descent
o MCMC for determining point estimates in latent variable models
o coordinate descent, variational methods for graphical models
o proximal optimization for structured sparsity problems, and others
Identified the properties shared across all these algorithms.
The key lies in recognizing a clear dichotomy (division) between DATA and MODEL.
This inspired a bimodal approach to parallelism: data-parallel and model-parallel distribution and execution of a big ML program over a cluster of machines.
This approach exploits the unique statistical nature of ML algorithms, mainly three properties:
Error tolerance: iterative-convergent algorithms are robust against limited errors in intermediate calculations.
Dynamic structural dependency: changing correlation strengths between model parameters can be exploited for efficient parallelization.
Non-uniform convergence: the number of steps required for a parameter to converge can be highly skewed across parameters.

Fig. 2: Key properties of ML algorithms: (a) Non-uniform convergence; (b) Error-tolerant convergence; (c) Dependency structures amongst variables.
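As an illustration of how non-uniform convergence can be exploited, below is a minimal sketch (assumed names, not Petuum code) of priority-based selection: parameters whose values changed most in recent iterations are scheduled more often, while near-converged parameters are only occasionally revisited.

import numpy as np

def prioritized_indices(recent_change, k, floor=1e-6):
    # Sample k parameter indices with probability proportional to each
    # parameter's recent change; the small floor keeps converged
    # parameters from being starved entirely.
    scores = np.abs(recent_change) + floor
    return np.random.choice(len(scores), size=k, replace=False,
                            p=scores / scores.sum())

# Parameters 0 and 2 changed the most recently, so they are picked most often.
print(prioritized_indices(np.array([0.5, 0.01, 0.3, 0.0, 0.02]), k=2))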
Iterative-Convergent ML Algorithm: Given data D and a loss L (i.e., a fitness function such as the likelihood), a typical ML problem can be grounded as executing the following update equation iteratively, until the model state (i.e., parameters and/or latent variables) A reaches some stopping criterion:

A^(t) = F(A^(t-1), Δ_L(A^(t-1), D))

The update function Δ_L() (which improves the loss L) performs computation on the data D and the model state A, and outputs intermediate results to be aggregated by F().
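A minimal sketch of this template (illustrative, not Petuum's API), instantiated with gradient descent on least squares: delta() plays the role of Δ_L(), and F() folds its intermediate result into the new model state A.

import numpy as np

def delta(A, D, lr=0.1):
    # Delta_L: compute an update from the data and the current state
    X, y = D
    return -lr * X.T @ (X @ A - y) / len(y)

def F(A, update):
    # F: aggregate the intermediate result into the new state
    return A + update

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = X @ np.array([1.0, -2.0, 0.5])
A = np.zeros(3)
for t in range(500):           # iterate until a stopping criterion
    A = F(A, delta(A, (X, y)))
print(A)                       # converges to [1.0, -2.0, 0.5]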
Data-parallelism:
o the data D is partitioned and assigned to computational workers;
o we assume that the update function Δ() can be applied to each of these data subsets independently;
o the updates Δ() are aggregated via summation;
o each parallel worker contributes equally.

Model-parallelism:
o the model A is partitioned and assigned to workers;
o unlike data-parallelism, each update Δ() takes a scheduling function S_p^(t-1)() which restricts it to operate on a subset of the model parameters;
o the model parameters are not independent of each other;
o each parallel worker contributes equally.

Fig. 3: Petuum data-parallelism. Fig. 4: Petuum model-parallelism.
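A minimal sketch (assumed names, not Petuum code) contrasting the two modes on a toy additive model: the data-parallel step sums independent updates from data shards, while the model-parallel step lets each worker touch only the parameter subset chosen by a scheduling function.

import numpy as np

def data_parallel_step(A, data_shards, delta):
    # A^(t) = F(A^(t-1), sum_p Delta(A^(t-1), D_p)): updates are additive
    return A + sum(delta(A, D_p) for D_p in data_shards)

def model_parallel_step(A, num_workers, schedule, delta_on):
    # each worker p updates only the subset S_p(A) of parameters
    A = A.copy()
    for p in range(num_workers):
        idx = schedule(A, p)
        A[idx] += delta_on(A, idx)
    return A

# Toy usage: move A toward a target vector.
target = np.arange(4.0)
delta = lambda A, D_p: 0.1 * (D_p - A) / 2           # per-shard update
A = data_parallel_step(np.zeros(4), [target, target], delta)
schedule = lambda A, p: np.arange(4)[p::2]           # worker p gets every 2nd param
A = model_parallel_step(A, 2, schedule, lambda A, idx: 0.1 * (target[idx] - A[idx]))
print(A)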
System Design

Fig. 5: Petuum scheduler, workers, parameter servers.

Petuum goal: allow users to easily implement data-parallel and model-parallel ML algorithms!

Parameter Server (PS):
o enables data-parallelism
o provides global read/write access to model parameters
o exposes a small API of three functions, including PS.get()

Scheduler:
o enables model-parallelism: lets users control which model parameters are updated by which worker machines
o a scheduling function schedule() outputs a set of parameters for each worker

Workers:
o each worker receives the parameters to be updated from schedule(), and then runs parallel update functions push()
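To make the division of labor concrete, here is a toy sketch of the scheduler/worker/parameter-server loop (assumed class and function names; only schedule(), push(), PS.get() and PS.inc() mirror names from the slides):

import numpy as np

class ToyPS:
    # stand-in for the parameter server: global read/write parameter access
    def __init__(self, n):
        self.A = np.zeros(n)
    def get(self, idx):            # like PS.get()
        return self.A[idx]
    def inc(self, idx, delta):     # like PS.inc(): additive write
        self.A[idx] += delta

def schedule(ps, num_workers):
    # scheduler: decide which parameters each worker may update
    return np.array_split(np.arange(len(ps.A)), num_workers)

def push(ps, idx, target):
    # worker: compute an update for its assigned parameters and send it
    ps.inc(idx, 0.5 * (target[idx] - ps.get(idx)))

ps, target = ToyPS(8), np.arange(8.0)
for t in range(20):                # iterate-until-convergence loop
    for idx in schedule(ps, num_workers=2):
        push(ps, idx, target)
print(ps.A)                        # approaches [0, 1, ..., 7]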
The three major components:

Bösen:
o parameter server for data-parallel Big Data AI & ML
o uses a consistency model which allows asynchronous-like performance that outperforms bulk synchronous execution, yet does not sacrifice ML algorithm correctness

STRADS:
o scheduler for model-parallel, high-speed AI & ML
o performs fine-grained scheduling of ML update operations, prioritizing computation on the parts of the ML program that need it most, and avoiding unsafe parallel operations that could hurt performance

Poseidon:
o distributed GPU deep learning framework built on Caffe and Bösen
o enjoys speedups from the communication and bandwidth management features of Bösen
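The consistency idea behind Bösen can be sketched as a bounded-staleness condition (a conceptual illustration, not Bösen's implementation): a fast worker may compute on slightly stale parameters, but must block once it gets too far ahead of the slowest worker.

def must_wait(worker_clock, slowest_clock, staleness=3):
    # SSP-style rule: block if this worker is more than `staleness`
    # clock ticks ahead of the slowest worker in the cluster.
    return worker_clock - slowest_clock > staleness

# With staleness 3, a worker at clock 10 blocks while the slowest
# worker is still at clock 6, and proceeds once it reaches clock 7.
print(must_wait(10, 6))   # True
print(must_wait(10, 7))   # False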
YARN Compatibility

Problem:
o many industrial and academic clusters run Hadoop MapReduce
o however, programs written for stand-alone clusters are not compatible with YARN/HDFS, and vice versa: applications written for YARN/HDFS are not compatible with stand-alone clusters

Solution: provide common libraries that work on either Hadoop or non-Hadoop clusters:
o a YARN launcher that will deploy any Petuum application (including user-written ones) onto a Hadoop cluster
o a data access library for HDFS access, which allows users to write stream code that works on both HDFS files and the local filesystem (see the sketch after this list)
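A hedged sketch of the idea behind such a data access library (open_stream() is a hypothetical helper, not Petuum's actual API): the consumer streams lines identically whether the path is local or an hdfs:// URI.

import os, tempfile

def open_stream(path):
    # dispatch on the URI scheme so consumer code is storage-agnostic
    if path.startswith("hdfs://"):
        # here one would delegate to an HDFS client library (assumption)
        raise NotImplementedError("plug in an HDFS client here")
    return open(path, "r")         # plain local filesystem

# The consuming loop never needs to know where the data lives.
tmp = tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False)
tmp.write("a 1\nb 2\n"); tmp.close()
with open_stream(tmp.name) as f:
    for line in f:
        print(line.split())
os.remove(tmp.name)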
Performance

Fig. 6: Petuum relative speedup vs. popular platforms (larger is better). Across ML programs, Petuum is at least 2-10 times faster than popular implementations.

Fig. 7: Matrix Factorization convergence time, Petuum vs. Spark. Petuum is the fastest and most memory-efficient platform, and could handle Big MF models at high rank under the given hardware budget.
Example

Fig. 8: Petuum program structure.
Fig. 9: Petuum DML (Distance Metric Learning) data-parallel pseudocode.
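Since Fig. 9 itself is not reproduced in this transcript, here is a hedged sketch of what data-parallel DML pseudocode looks like in spirit (illustrative names, not the paper's code): each worker computes a gradient of the Mahalanobis metric on its local shard of similar/dissimilar pairs, and the resulting updates are additive, so the parameter server can aggregate them by summation.

import numpy as np

def dml_push(L, pairs, labels, lr=0.01):
    # one worker's update on its local data shard: pull similar pairs
    # together and push dissimilar pairs apart under the metric L
    grad = np.zeros_like(L)
    for (x, y), similar in zip(pairs, labels):
        d = (x - y).reshape(-1, 1)
        grad += d @ d.T if similar else -(d @ d.T)
    return -lr * grad              # additive update, suitable for PS.inc()

rng = np.random.default_rng(0)
L = np.eye(2)                      # the distance metric being learned
pairs = [(rng.normal(size=2), rng.normal(size=2)) for _ in range(4)]
labels = [True, False, True, False]
L += dml_push(L, pairs, labels)    # server aggregates workers' updates by summation
print(L)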
References

Eric P. Xing, Qirong Ho, Wei Dai, Jin Kyu Kim, Jinliang Wei, Seunghak Lee, Xun Zheng, Pengtao Xie, Abhimanu Kumar, Yaoliang Yu. "Petuum: A New Platform for Distributed Machine Learning on Big Data." IEEE Transactions on Big Data, June 2015.
https://github.com/petuum
http://www.simplilearn.com/how-facebook-is-using-big-data-article
Thank You!