Systems Support for Graphical Learning
Ken Birman
CS6410 Fall 2014
9/18/2014
CS5412 Spring 2014 (Cloud Computing: Birman)
Graphical models and applications
Artificial intelligence and machine learning are the core technology in many modern cloud settings
    Support for social networking mechanisms
    Creating product placement recommendations
    Understanding the flow of “influence” within communities
Graphical processing can also matter in systems
    Understand what to cache and what not to cache
    Learning common patterns to optimize
What makes this hard?
Prior generation of solutions was too general
    Programming languages can do anything, but they aren’t at all specialized for graph-structured data
    Database systems are awesome for tabular data but much less optimized for graphical data
There is also an issue of scale
    We’re good at what can be done on one computer
    But a company like Facebook has billions of users and their infrastructure runs on massive data centers
Today’s papers
TAO paper (I’ll start with this) gives a sense of the challenge Facebook confronts
    Like an entire distributed operating system
    But the whole role of the solution is to manage graphical data and support queries against it
    Massive loads and surreal scale
Things to notice?
    How does the architecture of the solution reflect the special environment in which it runs?
    How did they identify and optimize the critical paths?
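To make "manage graphical data and support queries against it" concrete, here is a minimal sketch of a typed-edge store with a read-through cache in front of it. It is not TAO's actual interface: the class names (`GraphStore`, `CachedGraph`) and methods (`add_edge`, `neighbors`) are hypothetical, and a single in-memory dictionary stands in for both the storage tier and the cache tier.

```python
from collections import defaultdict


class GraphStore:
    """Toy in-memory store of typed, timestamped edges (hypothetical API)."""

    def __init__(self):
        # (source id, edge type) -> list of (timestamp, destination id)
        self._edges = defaultdict(list)

    def add_edge(self, src, edge_type, dst, timestamp):
        self._edges[(src, edge_type)].append((timestamp, dst))

    def neighbors(self, src, edge_type, limit=10):
        """Return up to `limit` destinations, newest first."""
        edges = sorted(self._edges[(src, edge_type)], reverse=True)
        return [dst for _, dst in edges[:limit]]


class CachedGraph:
    """Read-through cache in front of the store; writes invalidate."""

    def __init__(self, store):
        self._store = store
        self._cache = {}
        self.hits = 0
        self.misses = 0

    def neighbors(self, src, edge_type, limit=10):
        key = (src, edge_type, limit)
        if key in self._cache:
            self.hits += 1
            return self._cache[key]
        self.misses += 1
        result = self._store.neighbors(src, edge_type, limit)
        self._cache[key] = result
        return result

    def add_edge(self, src, edge_type, dst, timestamp):
        self._store.add_edge(src, edge_type, dst, timestamp)
        # Drop every cached list for this (src, edge_type), whatever its limit.
        stale = [k for k in self._cache if k[:2] == (src, edge_type)]
        for k in stale:
            del self._cache[k]
```

The critical path in this sketch is the read: a repeated `neighbors` call is served from the cache, and only a write to the same source and edge type forces the next read back to the store.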
Dryad/LINQ
Here we see two concepts combined
    At Microsoft, LINQ has become very popular
    It embeds a kind of query processing into C# code
Dryad takes this one step further
    Given a LINQ expression, Dryad can run it on a distributed “computing engine” of their own design
    Idea is to obtain massive parallelism
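The idea of running one query in parallel over partitioned data can be sketched roughly as follows. This is an illustration in Python rather than C#, and it is not Dryad's engine: a thread pool stands in for the cluster, and the function names (`partition`, `run_query`, `distributed_query`) and the sample record fields (`name`, `age`) are assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(records, n):
    """Split records into n roughly equal partitions (round-robin)."""
    return [records[i::n] for i in range(n)]


def run_query(part):
    """Per-partition stage: filter, then project (like where/select)."""
    return [r["name"] for r in part if r["age"] >= 30]


def distributed_query(records, workers=3):
    """Run the same query on every partition, then merge the outputs."""
    parts = partition(records, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_results = list(pool.map(run_query, parts))
    # Merge stage: concatenate the per-partition results.
    return [name for part in partial_results for name in part]
```

Because the filter and projection touch one record at a time, each partition can be processed independently and only the final merge needs to see all partial results. Note that round-robin partitioning does not preserve the input order.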
Basic architecture of Dryad
Execution of a LINQ expression
A join, done in two ways
A join, done in two ways
MapReduce in Dryad/LINQ
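The slide's figure did not survive in this transcript. As a stand-in, one common way to express MapReduce with query-style operators is as a flat-map, followed by a group-by, followed by a per-group aggregate. The sketch below shows that composition in Python; the helper names (`select_many`, `group_by`, `map_reduce`, `word_count`) are hypothetical and do not reproduce the slide's content.

```python
from collections import defaultdict


def select_many(items, fn):
    """Flat-map: apply fn to each item and concatenate the results."""
    return [out for item in items for out in fn(item)]


def group_by(pairs):
    """Group (key, value) pairs by key, keeping first-seen key order."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return list(groups.items())


def map_reduce(items, mapper, reducer):
    """Compose the three stages: map, group, reduce."""
    pairs = select_many(items, mapper)   # map phase
    groups = group_by(pairs)             # shuffle / group phase
    return {key: reducer(key, values) for key, values in groups}  # reduce phase


def word_count(lines):
    """Classic example: count case-insensitive word occurrences."""
    return map_reduce(
        lines,
        mapper=lambda line: [(w.lower(), 1) for w in line.split()],
        reducer=lambda word, ones: sum(ones),
    )
```

Written this way, MapReduce is just one particular query plan, which is why a system that can execute general query expressions can also cover it.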
Other major systems in this space
Check out http://en.wikipedia.org/wiki/Graph_database
    They list 50 or so graphical databases and processing systems
Some popular ones in research settings are Pregel (from Google), GraphLab (CMU) and Vowpal Wabbit (“Fast Learning”) (Yahoo)
Takeaways
Computer systems need to be responsive to
    Styles of use (what our “customers” are doing)
    Common patterns of load (optimize for this case)
In today’s major cloud computing settings, graphical data and graphical learning solutions are becoming a highly dominant form of load and focus
    Computer systems need to evolve to track this need