Application-driven Energy-efficient Architecture Explorations for Big Data. Authors: Xiaoyan Gu, Rui Hou, Ke Zhang, Lixin Zhang, Weiping Wang (Institute of Computing Technology, Chinese Academy of Sciences). Reviewed by: Siddharth Bhave (University of Washington, Tacoma)


Post on 26-Dec-2015


Page 1

Application-driven Energy-efficient Architecture Explorations for Big Data

Authors: Xiaoyan Gu, Rui Hou, Ke Zhang, Lixin Zhang, Weiping Wang (Institute of Computing Technology, Chinese Academy of Sciences)

Reviewed by: Siddharth Bhave (University of Washington, Tacoma)

Page 2

Big Data

What is Big Data?

Problems with Big Data: energy consumption; velocity (operation latency and throughput); volume (storage capacity); variety

Managing Big Data problems: storage technologies; partitioning; multithreading; parallel processing; efficient architecture; Hadoop, MapReduce, Mahout; finding the bottleneck

Page 3

Introduction

Big data management at the architecture level

Two architectures: a Xeon-based cluster and an Atom-based (micro-server) cluster

Comparison based on: energy consumption and execution time

Page 4

Motivation

Ever-increasing data volumes

Energy and time trade-off between Xeon-based and Atom-based clusters

Compression/decompression is a bottleneck

Stateless data processing

Page 5

Mastiff

Mastiff: the targeted application for performance analysis

Big data processing engine

Columnar store policy

Compression ratio by data size

System         3 GB    100 GB    500 GB
Mastiff        0.54    0.53      0.518
Hadoop HDFS    0.72    0.71      0.7
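To make the ratios above concrete, the following minimal sketch converts them into stored size. It assumes the ratio means compressed size divided by original size, which the slide does not define; the function and variable names are illustrative only.

```python
# Compression ratios copied from the slide, keyed by input size in GB.
RATIOS = {
    "Mastiff": {3: 0.54, 100: 0.53, 500: 0.518},
    "Hadoop HDFS": {3: 0.72, 100: 0.71, 500: 0.7},
}


def stored_size_gb(original_gb, ratio):
    """Stored size, ASSUMING ratio = compressed size / original size."""
    return original_gb * ratio


def relative_saving(system, baseline, size_gb):
    """Fraction of the baseline's stored size that `system` avoids."""
    return 1.0 - RATIOS[system][size_gb] / RATIOS[baseline][size_gb]


if __name__ == "__main__":
    for size_gb in (3, 100, 500):
        mastiff = stored_size_gb(size_gb, RATIOS["Mastiff"][size_gb])
        hdfs = stored_size_gb(size_gb, RATIOS["Hadoop HDFS"][size_gb])
        saving = relative_saving("Mastiff", "Hadoop HDFS", size_gb)
        print(f"{size_gb} GB: Mastiff {mastiff:.2f} GB, HDFS {hdfs:.2f} GB, "
              f"saving {saving:.1%}")
```

Under that assumption, 500 GB of input would occupy about 259 GB in Mastiff versus about 350 GB in Hadoop HDFS.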

Page 6

Mastiff workflow (figure)

Page 7

Methodology

TPC-H benchmark (queries and concurrent data modifications)

1 TB of verification data

Two cases: data load and data query

Power measured with a Fluke NORMA 4000 power analyzer

Average and median results are reported

Page 8

Power and Performance Evaluation

Case          Time on Atom cluster (30 nodes)   Time on Xeon cluster (30 nodes)   Time on Xeon cluster (15 nodes)
Data load     3.435 hours                       1.543 hours                       3.242 hours
Data query    5.877 hours                       2.724 hours                       5.564 hours

Three cases for time and energy consumption:

31 nodes: Atom cluster (1 master node)

31 nodes: Xeon cluster (1 master node)

16 nodes: Xeon cluster (1 master node)
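The execution times reported on this slide can be compared directly. The sketch below only divides the reported numbers; the dictionary keys and the function name are illustrative, not from the paper.

```python
# Execution times in hours, copied from the slide.
TIMES_H = {
    "data load": {"atom30": 3.435, "xeon30": 1.543, "xeon15": 3.242},
    "data query": {"atom30": 5.877, "xeon30": 2.724, "xeon15": 5.564},
}


def slowdown(case, cluster, reference):
    """How many times longer `cluster` takes than `reference` for `case`."""
    return TIMES_H[case][cluster] / TIMES_H[case][reference]


if __name__ == "__main__":
    for case in TIMES_H:
        print(case,
              f"Atom30/Xeon30 = {slowdown(case, 'atom30', 'xeon30'):.2f}",
              f"Atom30/Xeon15 = {slowdown(case, 'atom30', 'xeon15'):.2f}")
```

On these numbers the 30-node Atom cluster takes roughly 2.2 times as long as the 30-node Xeon cluster and about 6% longer than the 15-node Xeon cluster, for both data load and data query.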

Page 9

Power and Performance Evaluation (cont'd)

Energy consumption: 30-node Atom cluster vs. 30-node Xeon cluster

Page 10

Power and Performance Evaluation (cont'd)

Energy consumption: 30-node Atom cluster vs. 15-node Xeon cluster
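Since energy is average power multiplied by execution time, the reported data-load times imply how much less power the Atom cluster has to draw before it wins on energy. The sketch below works out that break-even ratio; it uses no measured power values (the transcript gives none), and the function names are illustrative.

```python
def energy_kwh(avg_power_kw, hours):
    """Energy as average power times execution time."""
    return avg_power_kw * hours


def break_even_power_ratio(t_atom_h, t_xeon_h):
    """Largest P_atom / P_xeon at which the Atom cluster still uses no more
    energy than the Xeon cluster: P_atom * t_atom <= P_xeon * t_xeon."""
    return t_xeon_h / t_atom_h


if __name__ == "__main__":
    # Data-load times in hours, as reported in the evaluation.
    t_atom30, t_xeon30, t_xeon15 = 3.435, 1.543, 3.242
    print(f"vs 30-node Xeon: {break_even_power_ratio(t_atom30, t_xeon30):.3f}")
    print(f"vs 15-node Xeon: {break_even_power_ratio(t_atom30, t_xeon15):.3f}")
```

For data load, the Atom cluster uses less energy than the 30-node Xeon cluster only if its average power is below about 45% of the Xeon cluster's, and less than the 15-node Xeon cluster if it is below about 94%.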

Page 11

Power and Performance Evaluation (cont'd)

Time breakdown in the Map phase

Page 12

Power and Performance Evaluation (cont'd)

Time breakdown in the Reduce phase

Page 13

Findings

The Atom platform is more power-efficient.

Data compression and decompression occupy a significant percentage of execution time.

Compression and decompression can be done in a software-pipelined fashion, i.e., by interleaving multiple tasks.

Page 14

Propositions

Heterogeneous architecture

Accelerators to perform data compression/decompression

Multiple interleaved compression/decompression
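The last proposition can be approximated in software. The sketch below splits a buffer into chunks and compresses them as independent, overlapping tasks in a thread pool. It is a software stand-in for the idea, not the authors' accelerator design; `zlib` stands in for whatever codec Mastiff uses (the slides do not name one), and the chunk size and worker count are arbitrary assumptions.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor


def split_chunks(data, chunk_size):
    """Split a bytes object into fixed-size chunks (last one may be shorter)."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def compress_interleaved(data, chunk_size=64 * 1024, workers=4):
    """Compress chunks as independent tasks that can overlap in time.

    Executor.map yields results in input order, so chunk order is preserved.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(zlib.compress, split_chunks(data, chunk_size)))


def decompress_interleaved(chunks, workers=4):
    """Decompress the chunks concurrently and join them in order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return b"".join(pool.map(zlib.decompress, chunks))


if __name__ == "__main__":
    payload = b"big data column values " * 50_000
    packed = compress_interleaved(payload)
    assert decompress_interleaved(packed) == payload
    print(len(payload), "->", sum(len(c) for c in packed), "bytes")
```

Compressing each chunk independently keeps the tasks free of shared state, which is what lets them be interleaved; the cost is a somewhat worse ratio than compressing the whole buffer as one stream.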

Page 15

Off-chip and On-chip Accelerators

Page 16

Multiple Interleaved Tasks

Page 17

Strengths

A much-needed, innovative concept

Well organized

Detailed description of energy and time investigation

Already implemented propositions

Page 18

Weaknesses

Not enough power meters to monitor all nodes

Two assumptions: the power of every network router is divided evenly among the nodes; the energy consumption of each node is similar

Results obtained on Hadoop are generalized, even though they might not hold for every application.

The implementation of the propositions is vague.
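The two measurement assumptions listed on this slide amount to a simple extrapolation, sketched below. All numbers are hypothetical placeholders rather than measurements from the paper, and the function names are illustrative.

```python
def per_node_power_w(metered_node_w, router_w, nodes_per_router):
    """Node power plus an even share of its router's power (assumption 1)."""
    return metered_node_w + router_w / nodes_per_router


def cluster_energy_kwh(metered_nodes_w, router_w, nodes_per_router,
                       total_nodes, hours):
    """Extrapolate cluster energy from a few metered nodes.

    Assumption 2: unmetered nodes draw the mean power of the metered ones.
    """
    mean_node_w = sum(metered_nodes_w) / len(metered_nodes_w)
    node_w = per_node_power_w(mean_node_w, router_w, nodes_per_router)
    return node_w * total_nodes * hours / 1000.0


if __name__ == "__main__":
    # HYPOTHETICAL readings in watts; not measurements from the paper.
    readings_w = [30.0, 32.0, 34.0]
    print(cluster_energy_kwh(readings_w, router_w=40.0, nodes_per_router=8,
                             total_nodes=31, hours=2.0))
```

Any node whose real draw differs from the metered mean, such as a master node with a different load, makes this estimate drift, which is why the reviewer lists the assumptions as a weakness.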

Page 19

FAWN: A Fast Array of Wimpy Nodes

Authors: David G. Andersen, Jason Franklin, Michael Kaminsky, Amar Phanishayee, Lawrence Tan, Vijay Vasudevan (Carnegie Mellon University)

Page 20

Introduction

High-performance, energy-efficient system for storage

Large number of small, low-performance (hence "wimpy") nodes with moderate amounts of local storage

Two parts: FAWN-DS (data store) and FAWN-KV (key-value store)

Motivation: traditional architectures consume too much power; I/O is a bottleneck because of the limitations of current storage

Page 21

Features

Pairs low-power embedded nodes with flash storage

FAWN-DS is the back end, consisting of a large number of nodes

Each node has some RAM and flash

FAWN-KV is a consistent, replicated, highly available, high-performance key-value storage system

Page 22

FAWN Architecture

Page 23

Efficient Data Streaming with On-chip Accelerators: Opportunities and Challenges

Authors: Rui Hou, Lixin Zhang, Michael C. Huang, Kun Wang, Hubertus Franke, Yi Ge, Xiaotao Chang (University of Rochester)

Page 24

Motivation

Transistor density keeps increasing

Many cores are integrated on a single die

Advantages of an on-chip accelerator over one attached as a PCI device

Page 25

On-Chip Accelerator Architecture

Page 26

Features

Three types of accelerators: crypto accelerators, decompression accelerators, network offload accelerators

Some common characteristics of the data streams in the three accelerators

Optimize the power and performance of the accelerators

Page 27

Thank You