SYSTEMS SUPPORT FOR GRAPHICAL LEARNING
Ken Birman, CS6410 Fall 2014, 9/18/2014




CS5412 Spring 2014 (Cloud Computing: Birman)


Graphical models and applications

Artificial intelligence and machine learning are the core technology in many modern cloud settings
  - Support for social networking mechanisms
  - Creating product placement recommendations
  - Understanding the flow of “influence” within communities

Graphical processing can also matter in systems
  - Understand what to cache and what not to cache
  - Learning common patterns to optimize


What makes this hard?

Prior generation of solutions was too general
  - Programming languages can do anything, but they aren’t at all specialized for graph-structured data
  - Database systems are awesome for tabular data but much less optimized for graphical data

There is also an issue of scale
  - We’re good at what can be done on one computer
  - But a company like Facebook has over a billion users, and its infrastructure runs on massive data centers


Today’s papers

TAO paper (I’ll start with this) gives a sense of the challenge Facebook confronts
  - Like an entire distributed operating system
  - But the whole role of the solution is to manage graphical data and support queries against it
  - Massive loads and surreal scale

Things to notice?
  - How does the architecture of the solution reflect the special environment in which it runs?
  - How did they identify and optimize the critical paths?
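
To make "manage graphical data and support queries against it" concrete, here is a minimal sketch of a store of typed objects and typed, timestamped associations. It loosely follows the objects-and-associations data model the TAO paper describes, but the class name `TinyGraphStore` and its methods are hypothetical illustrations, not Facebook's API, and the sketch ignores everything that makes TAO hard (caching tiers, sharding, replication).

```python
from collections import defaultdict


class TinyGraphStore:
    """Toy in-memory graph store (hypothetical; for illustration only)."""

    def __init__(self):
        self.objects = {}                # object id -> (object type, data dict)
        self.assocs = defaultdict(dict)  # (id1, assoc type) -> {id2: (time, data)}

    def obj_add(self, oid, otype, **data):
        self.objects[oid] = (otype, data)

    def assoc_add(self, id1, atype, id2, time, **data):
        # One edge per (id1, atype, id2); re-adding overwrites the old edge.
        self.assocs[(id1, atype)][id2] = (time, data)

    def assoc_count(self, id1, atype):
        return len(self.assocs.get((id1, atype), {}))

    def assoc_range(self, id1, atype, pos, limit):
        """Return up to `limit` target ids, newest first, starting at offset `pos`."""
        edges = self.assocs.get((id1, atype), {})
        newest_first = sorted(edges, key=lambda id2: edges[id2][0], reverse=True)
        return newest_first[pos:pos + limit]


store = TinyGraphStore()
for uid, name in [(1, "alice"), (2, "bob"), (3, "carol")]:
    store.obj_add(uid, "user", name=name)
store.assoc_add(1, "friend", 2, time=100)
store.assoc_add(1, "friend", 3, time=200)
latest_friend = store.assoc_range(1, "friend", pos=0, limit=1)
```

Queries such as "the newest few edges of one type from one node" are the kind of read a social feed issues constantly, which is one reason to ask which paths a system like this must optimize.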


Dryad/LINQ

Here we see two concepts combined
  - At Microsoft, LINQ has become very popular
  - It embeds a kind of query processing into C# code

Dryad takes this one step further
  - Given a LINQ expression, Dryad can run it on a distributed “computing engine” of their own design
  - Idea is to obtain massive parallelism
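
The idea of running one declarative query over many partitions can be sketched in a few lines. This is an illustrative Python stand-in, not Dryad or LINQ: a thread pool plays the role of the cluster, and the function names (`partition`, `local_stage`, `run_query`) and the page-view data are assumptions made up for the example. The query is a filter, a projection and a grouped count, evaluated per partition and then merged.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def partition(records, n):
    """Split records into n partitions, round-robin."""
    return [records[i::n] for i in range(n)]


def local_stage(part):
    """Per-partition work: filter, project, and partially aggregate."""
    counts = Counter()
    for _user, page in part:
        if page != "/health":   # the "where" clause
            counts[page] += 1   # group by page, count
    return counts


def run_query(records, workers=4):
    """Run the local stage on every partition in parallel, then merge."""
    parts = partition(records, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(local_stage, parts))
    total = Counter()
    for partial in partials:
        total.update(partial)   # adds counts key by key
    return dict(total)


views = [("a", "/home"), ("b", "/home"), ("c", "/health"), ("a", "/about")]
page_counts = run_query(views, workers=2)
```

The partial aggregation inside each partition is the design choice worth noticing: only small per-partition summaries cross the merge step, not the raw records.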


Basic architecture of Dryad


Execution of a LINQ expression


A join, done in two ways


A join, done in two ways
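
The figures for these two slides did not survive, so which two join plans they showed is unknown. As an illustration only, the sketch below contrasts two plans that distributed query engines commonly choose between: hash-partitioning both inputs by key, and broadcasting a small input to every partition of a large one. All function names here are hypothetical.

```python
from collections import defaultdict


def hash_join(left, right):
    """Join two lists of (key, value) pairs using a hash table built on `right`."""
    table = defaultdict(list)
    for key, value in right:
        table[key].append(value)
    return [(k, lv, rv) for k, lv in left for rv in table.get(k, [])]


def partitioned_join(left, right, n):
    """Plan 1: hash-partition both inputs by key, then join matching partitions."""
    left_parts = [[] for _ in range(n)]
    right_parts = [[] for _ in range(n)]
    for k, v in left:
        left_parts[hash(k) % n].append((k, v))
    for k, v in right:
        right_parts[hash(k) % n].append((k, v))
    out = []
    for lp, rp in zip(left_parts, right_parts):
        out.extend(hash_join(lp, rp))   # each pair could run on its own machine
    return out


def broadcast_join(left_parts, small_right):
    """Plan 2: keep the large input where it is; copy the small input everywhere."""
    out = []
    for lp in left_parts:
        out.extend(hash_join(lp, small_right))
    return out


left = [(1, "a"), (2, "b"), (3, "c")]
right = [(1, "x"), (3, "y"), (3, "z")]
plan1 = sorted(partitioned_join(left, right, n=2))
plan2 = sorted(broadcast_join([left[:2], left[2:]], right))
```

Both plans return the same rows. They differ in what moves over the network: plan 1 reshuffles both inputs, while plan 2 moves only the small input, once per partition.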


MapReduce in Dryad/LINQ
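
The figure for this slide is not preserved. One common way to express MapReduce with query-style operators is as three steps: a flat map, a group-by, and a per-group reduce. The Python sketch below shows that composition on a single machine; `map_reduce` and `word_count` are illustrative names, and the assumption is that the slide showed something along these lines.

```python
from collections import defaultdict


def map_reduce(records, mapper, key_of, reducer):
    # Stage 1 (flat map): each record yields zero or more intermediate items.
    mapped = [item for record in records for item in mapper(record)]
    # Stage 2 (group by): collect intermediate items under their key.
    groups = defaultdict(list)
    for item in mapped:
        groups[key_of(item)].append(item)
    # Stage 3 (reduce): turn each (key, items) group into one result.
    return [reducer(key, items) for key, items in groups.items()]


def word_count(lines):
    pairs = map_reduce(
        lines,
        mapper=lambda line: line.split(),
        key_of=lambda word: word,
        reducer=lambda word, items: (word, len(items)),
    )
    return dict(pairs)


counts = word_count(["a b a", "b c"])
```

Written this way, MapReduce is just one query among many that the engine can plan, rather than a separate programming model.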


Other major systems in this space

Check out http://en.wikipedia.org/wiki/Graph_database
  - They list 50 or so graphical databases and processing systems

Some popular ones in research settings are Pregel (from Google), GraphLab (CMU) and Vowpal Wabbit (“Fast Learning”) (Yahoo)


Takeaways

Computer systems need to be responsive to
  - Styles of use (what our “customers” are doing)
  - Common patterns of load (optimize for this case)

In today’s major cloud computing settings, graphical data and graphical learning solutions are becoming a dominant form of load and focus

Computer systems need to evolve to track this need