
Big Data Management and Systems Design

Dr. Weikuan Yu

Associate Professor

Department of Computer Science

Florida State University

Oct 15, 2015 – CS5935 - S2

Life as a Graduate Student

· Course work
  – Degrees vs. skill sets
· Research
  – Goals vs. means
· Career development
  – Presentation and teaching skills
  – Communication and social skills
  – Internships
  – Networking

Oct 15, 2015 – CS5935 - S3

Big Data Challenge

Sources: 1: http://www.infobarrel.com/Evolution; 2: http://visual.ly/big-data-explosion?utm_source=visually_embed

Where is Data Coming From?

· 3.0 billion Internet users
· 1.35 billion Facebook users
· 550 million tweets per day
· 72 hours of video uploaded per minute
· 4.7 billion Google searches per day
· And so many others

From the dawn of civilization through 2003, the world generated 5 exabytes of data; now 5 exabytes are generated every 2 days. By 2015, the total had reached 7,910 exabytes.

Oct 15, 2015 – CS5935 - S4

Big Data Ecosystem

[Layered stack diagram, from hardware up to applications:]

· Hardware: disks, networks, processors (accelerators)
· Infrastructure: OS; VMs, containers, public and private clouds
· Storage: HDFS, HBase
· Run-Time: MapReduce/Hadoop, Spark, Storm, RAMCloud, MemCached
· Applications and Services: Mahout, Giraph, Pregel, R; Hive, Pig, Shark, Flume

The stack turns data into insight while managing complexity at each layer.

Oct 15, 2015 – CS5935 - S5

Research Strategies?

· Descriptive research
  – Examine and collect facts to document patterns
· Discovery-oriented research
  – Inductive reasoning from patterns to general discoveries
· Engineering-based research
  – Use existing techniques and theories to create a technology or tool
· Hypothesis-driven research
  – Make a hypothesis, then test it using deductive reasoning

Oct 15, 2015 – CS5935 - S6

Overview of Hadoop MapReduce

[Diagram: input splits feed MapTasks; each MapTask writes a Map Output File (MOF). ReduceTasks shuffle and merge the MOFs in memory, run the reduce phase, and write results back to the DFS. The JobTracker assigns both MapTasks and ReduceTasks.]
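To make the map/shuffle/reduce dataflow concrete, here is the canonical Hadoop WordCount job (a standard example, not taken from these slides): each MapTask emits <word, 1> pairs into its MOF, and each ReduceTask sums the counts for the words routed to it.

```java
import java.io.IOException;
import java.util.StringTokenizer;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

  // Each MapTask runs over one input split, one line at a time.
  public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    @Override
    public void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, ONE); // spilled to the task's Map Output File (MOF)
      }
    }
  }

  // Each ReduceTask shuffles/merges the MOF partitions for its keys, then reduces.
  public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    private final IntWritable result = new IntWritable();

    @Override
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable v : values) sum += v.get();
      result.set(sum);
      context.write(key, result); // final output goes back to the DFS
    }
  }

  public static void main(String[] args) throws Exception {
    Job job = Job.getInstance(new Configuration(), "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
```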

Oct 15, 2015 – CS5935 - S7

Small Job Starvation within Hadoop

[Chart: standalone execution time (sec) and normalized execution time (slowdown) across groups of jobs; slowdowns range from 1.5× up to 52×, with small jobs starved the most.]

Oct 15, 2015 – CS5935 - S8

Hadoop Fair Scheduler

· The most widely used Hadoop scheduler

· Designed to provide fairness among concurrently running jobs (a simplified sketch of its max-min slot sharing follows the diagram below)

· Tasks occupy slots until completion or failure

[Timeline diagram: jobs J-1, J-2, and J-3 share 5 map slots (Slot-M1 to Slot-M5) and 3 reduce slots (Slot-R1 to Slot-R3) as they arrive; each reduce slot stays occupied through both shuffle and reduce until its task finishes.]
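As an illustration of the fair-sharing policy (a minimal sketch, not the actual Fair Scheduler code; the class and method names are hypothetical), the following computes an approximate max-min fair allocation of a pool of slots among jobs with different task demands:

```java
import java.util.Arrays;

// Hypothetical helper: max-min fair allocation of task slots among jobs,
// the policy Hadoop Fair Scheduler applies to map and reduce slots.
public class MaxMinFairShare {

  // demands[i] = number of runnable tasks of job i; returns slots granted per job.
  static int[] allocate(int totalSlots, int[] demands) {
    int n = demands.length;
    int[] grant = new int[n];
    int remaining = totalSlots;
    boolean progress = true;
    // Repeatedly hand each unsatisfied job an equal share of what is left;
    // jobs with small demands are capped, and the surplus is redistributed.
    while (remaining > 0 && progress) {
      progress = false;
      int unsatisfied = 0;
      for (int i = 0; i < n; i++) if (grant[i] < demands[i]) unsatisfied++;
      if (unsatisfied == 0) break;
      int share = Math.max(1, remaining / unsatisfied);
      for (int i = 0; i < n && remaining > 0; i++) {
        int extra = Math.min(share, Math.min(demands[i] - grant[i], remaining));
        if (extra > 0) { grant[i] += extra; remaining -= extra; progress = true; }
      }
    }
    return grant;
  }

  public static void main(String[] args) {
    // 5 map slots shared by three jobs demanding 10, 2, and 4 tasks.
    System.out.println(Arrays.toString(allocate(5, new int[]{10, 2, 4})));
  }
}
```

Note the asymmetry the slides point out: this policy reassigns map slots frequently because map tasks are short, but a reduce slot granted this way is then held until the reduce task ends.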

Oct 15, 2015 – CS5935 - S9

Objective: How to Achieve both Efficiency and Fairness?

· How to correct the non-preemptive nature of reduce tasks for flexible and dynamic allocation of reduce slots?
  – Existing schedulers are not aware of this behavior: once a reduce task is launched, it stays with its reduce slot until the end.

· How to better schedule two different types of tasks?
  – Hadoop schedules both map and reduce tasks with a similar max-min fair-sharing policy, without paying attention to their relationship within a job.
  – Map and reduce slots need to be shared dynamically and proportionally.

Oct 15, 2015 – CS5935 - S10

Preemptive ReduceTasks

· Different from the Linux command "kill -STOP $PID"

· A lightweight, work-conserving preemption mechanism
  – Provides any-time preemption with negligible performance impact
  – Allows a reduce task to resume from where it was preempted
  – Preserves previous computation and I/O

Oct 15, 2015 – CS5935 - S11

Preemption During Shuffle Phase

· Only merge the in-memory segments, keeping the on-disk segments untouched

[Diagram: before preemption, ReduceTask R1 holds fetched segments in its heap on the TaskTracker; on preemption, the in-memory segments are merged into a single on-disk segment recorded in an index; after resume, R1 retrieves the merged segment and continues shuffling.]
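A minimal sketch of this idea (all names are hypothetical; this is not the authors' implementation): on preemption, only the in-memory segments are merged into one on-disk segment and indexed, so the fetched data survives and memory is released.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch of work-conserving preemption during the shuffle phase.
class ShufflePreemption {
  static class Segment { byte[] data; }          // one fetched map output

  List<Segment> inMemorySegments = new ArrayList<>();
  List<String>  onDiskSegments   = new ArrayList<>(); // paths, left untouched

  // On preemption: merge only the in-memory segments into a single on-disk
  // segment and record it in the index, so no fetched data is lost.
  void preempt() {
    String merged = mergeToDisk(inMemorySegments); // one sequential write
    onDiskSegments.add(merged);                    // index entry for resume
    inMemorySegments.clear();                      // memory can now be released
  }

  // On resume: reload the index and continue shuffling where the task left
  // off; the merged spill is simply one more on-disk input to the final merge.
  void resume() {
    // re-open onDiskSegments and fetch the remaining map outputs
  }

  private String mergeToDisk(List<Segment> segs) {
    // merge-sort segs and write them as one spill file; return its path
    return "/local/spill-" + System.nanoTime();
  }
}
```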

Oct 15, 2015 – CS5935 - S12

Preemption During Reduce Phase

· Preemption occurs at the boundary of intermediate <key, value> pairs.

· Records the current offset of each segment and the minimum priority queue (MPQ).

[Diagram: before preemption, ReduceTask R1 on the TaskTracker merges segments through an MPQ; on preemption, the per-segment offsets and the MPQ are flushed to the DFS as an index; after resume, R1 retrieves the index and continues from the recorded offsets.]
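Again as a hedged sketch (hypothetical names, not the actual code): the state that must survive a reduce-phase preemption is small — the per-segment read offsets plus the contents of the MPQ, captured at a <key, value> boundary and flushed to the DFS.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

// Illustrative sketch: checkpointing a ReduceTask in the reduce phase at a
// <key, value> boundary so it can resume without redoing earlier work.
class ReducePhaseCheckpoint {
  // One cursor per merged segment feeding the reduce function.
  static class SegmentCursor { String path; long offset; }

  // State that must survive preemption: per-segment offsets plus the minimum
  // priority queue (MPQ) that drives the multi-way merge of segments.
  static class Checkpoint {
    List<SegmentCursor> cursors = new ArrayList<>();
    List<String> mpqSnapshot = new ArrayList<>(); // contents of the MPQ
  }

  Checkpoint preempt(List<SegmentCursor> cursors, PriorityQueue<String> mpq) {
    Checkpoint cp = new Checkpoint();
    cp.cursors.addAll(cursors);      // current read offset of each segment
    cp.mpqSnapshot.addAll(mpq);      // pending keys in the merge
    flushToDFS(cp);                  // persist so the task can resume anywhere
    return cp;
  }

  void resume(Checkpoint cp, PriorityQueue<String> mpq) {
    mpq.clear();
    mpq.addAll(cp.mpqSnapshot);      // rebuild the merge queue
    // then seek each segment to its recorded offset and continue reducing
  }

  private void flushToDFS(Checkpoint cp) { /* write cp as a small index file */ }
}
```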

Oct 15, 2015 – CS5935 - S13

Evaluation of Preemptive ReduceTask

[Chart: job execution time (sec) when a ReduceTask is preempted at completion ratios from 10% to 90%, comparing work-conserving preemption, killing-based preemption, and no preemption (baseline). Work-conserving preemption shows negligible overhead relative to the baseline.]

Oct 15, 2015 – CS5935 - S14

Fast Completion Scheduler

· Strategy
  – Find a reduce task to preempt and select another to launch, balancing the utilization of reduce slots
  – Decisions to make:
    · Which reduce task to preempt, and how many times?
    · Which one to launch, and on which slot/node?
    · How to avoid starvation and achieve locality?

· New progress metrics of a job:
  – Remaining shuffle time
  – Remaining data (<k,v> pairs) to be reduced

Oct 15, 2015 – CS5935 - S15

FCS Algorithms

· Preemption algorithm
  – Select a reduce task from a job with a longer remaining time and more remaining data
  – Task slackness: the number of times a task has been preempted
  – Avoid starvation: do not preempt a reduce task that already has a large slackness

· Launching algorithm
  – Select another reduce task from the job with the least remaining time and remaining data
  – Delay a reduce task to maximize data locality
  – Avoid aggressive delay: set a threshold based on the cluster size

· Both selection rules are sketched below.
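A compact sketch of these two decisions (class and field names are hypothetical, and the two progress metrics are combined into one score only for brevity; the real scheduler also handles locality delay and slot placement):

```java
import java.util.Comparator;
import java.util.List;

// Sketch of the two FCS decisions described on the slide.
class FastCompletionScheduler {
  static class ReduceTaskInfo {
    double remainingShuffleTime;  // progress metric 1
    long   remainingData;         // progress metric 2: <k,v> pairs left
    int    timesPreempted;        // "slackness"
  }

  static final int MAX_SLACKNESS = 3; // starvation guard (illustrative value)

  // Preempt: pick the running task with the most remaining work, but never
  // one that has already been preempted too many times.
  ReduceTaskInfo pickVictim(List<ReduceTaskInfo> running) {
    return running.stream()
        .filter(t -> t.timesPreempted < MAX_SLACKNESS)
        .max(Comparator.comparingDouble(
            (ReduceTaskInfo t) -> t.remainingShuffleTime + t.remainingData))
        .orElse(null);
  }

  // Launch: pick the pending task with the least remaining work, so small
  // jobs finish quickly and vacate the slot soon.
  ReduceTaskInfo pickToLaunch(List<ReduceTaskInfo> pending) {
    return pending.stream()
        .min(Comparator.comparingDouble(
            (ReduceTaskInfo t) -> t.remainingShuffleTime + t.remainingData))
        .orElse(null);
  }
}
```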

Oct 15, 2015 – CS5935 - S16

Results for Map-heavy Workload

[Chart: average execution time (sec, log scale) of FCS vs. HFS across 10 groups of jobs, with per-group speedup annotations of 1.9, 2.4, 1.9, 2.3, 1.6, 1.9, 1.1, 2.2, and 0.79.]

· FCS reduces average execution time by 31% (171 jobs).

· It significantly speeds up small jobs at a small cost to big jobs.

Oct 15, 2015 – CS5935 - S17

Average ReduceTask Wait Time

[Chart: average ReduceTask wait time (sec, log scale) of FCS vs. HFS across 10 groups of jobs, with annotated values including 19.5, 12.4, 22, 21, 32.2, 1.2, 24.5, 0.5, 0.8, and 27.2.]

· Small jobs benefit from significantly shortened reduce wait times.

· Wait times are reduced by 22× for the jobs in the first 6 groups.

Fairness Evaluation: Maximum Slowdown

[Chart: maximum slowdown (0 to 20) across 10 groups of jobs under the Fair Completion Scheduler.]

· Nearly uniform maximum slowdown across all groups of jobs.

· FCS improves fairness by 66.7% on average.

Oct 15, 2015 – CS5935 - S19

Summary for Coordinated Scheduling

· Identified fairness and efficiency issues caused by the lack of scheduling coordination.

· Introduced Preemptive ReduceTasks for efficient preemption of reduce tasks in long-running jobs.

· Designed and implemented the Fast Completion Scheduler for fast execution of small jobs and better job fairness.

Oct 15, 2015 – CS5935 - S20

Broad Research Interests

Big Data

· System Design and Management
  – Fast data movement
  – Efficient job management
  – Multi-purpose framework
· High Performance Computing
  – Parallel computing models
  – Scalable I/O & communication
  – Computation & I/O optimization
· Data Analytics
  – K-mer indexing for sequence fingerprinting and alignment
  – Scalable image processing
  – Fast community detection
· Security and Reliability
  – Analytics logging and recovery
  – Cloud security
  – Storage security

Oct 15, 2015 – CS5935 - S21

Resources and Capabilities

· Software: Unstructured Data Accelerator (UDA)
  – Accelerator for big data analytics
  – Transferred to Mellanox
· In-house big data platform
  – 22 nodes; InfiniBand and 10GigE
  – SSDs and GPGPUs (Phi and Kepler)
  – Donations from Mellanox, Solarflare, and NVIDIA

Oct 15, 2015 – CS5935 - S22

Sponsors, Contractors, and Collaborators

· Current sponsors and contractors
  – NSF: two active grants on big data analytics, storage, and network systems
  – LLNL: burst-buffer-based storage systems
· Past sponsors and contractors
  – NASA: one grant for I/O optimization of climate applications
  – DOE labs: many contracts for high-performance computing
  – Industry: contracts from Intel, Mellanox, NVIDIA, and Scitor
  – Alabama: Innovation Award; Auburn IGP for TigerCloud
· Collaborators
  – IBM, Intel, Mellanox, Scitor, Solarflare, AMD
  – LBNL, ORNL, LLNL, SNL, GSFC, LPS
  – Illinois Tech, Clemson, College of William & Mary, NJIT, Georgia Tech, Auburn (OIT, Physics, Biology, SFWS)

Oct 15, 2015 – CS5935 - S23

Research Directions

· Main thrusts
  – Big data analytics and systems design
  – Network and data privacy and security
  – Interdisciplinary data-driven computational research

· Key: solve challenging problems with novel strategies…

· Collaborations
  – Students (systems/network oriented, interdisciplinary)
  – Faculty (on and off campus)
  – National laboratories
  – Industry: IBM, Intel, and many more

Oct 15, 2015 – CS5935 - S24

Student Cultivation and Team Building

· Numerous internships
  – ORNL, Sandia, LANL, LLNL, IBM
· Awards and honors
  – First Prize of ACM Grand Finals, SC11 Fellowship ($5,000)
  – Outstanding students: 2011-2014
· Alumni (Ph.D. listed)
  – Yuan Tian – ORNL
  – Xinyu Que – IBM T.J. Watson
  – Yandong Wang – IBM Watson
  – Zhuo Liu – Yahoo!
  – Cong Xu – Intel
  – Bin Wang – Arista Networks
· Current team
  – 4 Ph.D. students
