Meeting application performance needs: scaling up versus scaling out

Post on 15-Aug-2015

Category:

Technology

Meeting app performance needs – scaling up/scaling out 1

Do you know how to eat an elephant? 2

¨  One bite at a time! 

¨  Divide and Conquer.

A practical problem 3

¨  Coca Cola needs to analyze consumer sentiment on the Diet Coke brand across popular social networks
¤  What type of machine would they need?
¤  Will all the data even fit on the biggest, most expensive machine you can buy today?

The Need for Speed 4

¨  High Performance Architectures need more and more resources as demand grows

¨  Methods of adding more resources for a particular application fall into two categories:
¤  Scale up (vertical) VERSUS scale out (horizontal)
¤  Get a bigger machine VERSUS add more small machines

Scale Up (scale vertically) 5

¨  Get a bigger machine
¨  Add resources to a single node in a system
¤  typically by adding CPUs or memory to a single computer
¨  Vertical scaling of existing systems
¤  enables effective virtualization
¤  provides more resources for the hosted set of operating system and application modules to share
¨  Taking advantage of such resources in a single computer can also be called "scaling up"
¤  such as increasing the number of Apache daemon processes currently running

Scale Out (scale horizontally) 6

¨  Add more nodes to a collection of machines
¤  such as adding a new computer to a distributed software application
¤  An example might be scaling out from one Web server system to three
¨  Large numbers of low-cost "commodity" systems
¤  as computer prices drop and
¤  performance continues to increase
¨  Hundreds or thousands of small computers configured in a cluster obtain aggregate computing power that often exceeds that of a single traditional RISC-processor-based scientific computer
¨  Scaling out is fueled by the availability of high-performance interconnects (e.g., Myrinet and InfiniBand)
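The "one Web server to three" example above can be sketched as a round-robin dispatcher. This is a minimal illustration only: the class `RoundRobinPool` and the server names `web1` to `web3` are hypothetical, and a real deployment would normally use a dedicated load balancer rather than application code like this.

```python
from itertools import cycle


class RoundRobinPool:
    """Hypothetical dispatcher that hands requests to servers in turn."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("at least one server is required")
        self._servers = list(servers)
        self._next = cycle(self._servers)

    def add_server(self, server):
        # Scaling out: a new node joins the pool and the rotation restarts.
        self._servers.append(server)
        self._next = cycle(self._servers)

    def route(self, request):
        # Return the server chosen for this request.
        return next(self._next)


pool = RoundRobinPool(["web1"])   # start with a single Web server
pool.add_server("web2")           # scale out to three
pool.add_server("web3")
assignments = [pool.route(f"GET /page/{i}") for i in range(4)]
```

With three servers in the pool, consecutive requests are spread across all of them instead of queuing on one machine.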

Trade-offs 7

¨  A larger number of computers means
¤  increased management complexity
¤  a more complex programming model
¤  throughput and latency between nodes become concerns
¤  some applications do not lend themselves to a distributed computing model
¨  Configuring an existing idle system has always been less expensive than buying, installing, and configuring a new one, regardless of the model.

Scale Up versus Scale Out 8

Choosing between Scale up/Scale Out 9

¨  Scale up:
¤  You have a hard limit: the size of the machine on which you are running
¨  Scale out:
¤  Not limited to the capacity of a single unit
¤  Combine the power of multiple machines into a single pool

Scale Up versus Scale Out 10

¨  In concept:
¤  In both cases we break a sequential piece of logic into smaller pieces that can be executed in parallel.
¨  In practice:
¤  The two models are fairly different from an implementation and performance perspective.
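The idea of breaking a sequential piece of logic into smaller pieces can be sketched as follows. The function names and the sum-of-squares workload are hypothetical and chosen only for illustration. A thread pool is used here for brevity; in standard CPython builds threads will not speed up CPU-bound pure Python work, so the sketch shows the decomposition, not a performance gain.

```python
from concurrent.futures import ThreadPoolExecutor


def sequential_sum_of_squares(numbers):
    # The original sequential logic: one loop over all the data.
    return sum(n * n for n in numbers)


def split(numbers, pieces):
    # Cut the input into at most `pieces` contiguous chunks.
    size = max(1, -(-len(numbers) // pieces))  # ceiling division
    return [numbers[i:i + size] for i in range(0, len(numbers), size)]


def parallel_sum_of_squares(numbers, pieces=4):
    # Each chunk is processed independently, then the partial
    # results are combined into the final answer.
    chunks = split(numbers, pieces)
    with ThreadPoolExecutor(max_workers=pieces) as pool:
        partials = list(pool.map(sequential_sum_of_squares, chunks))
    return sum(partials)


data = list(range(1000))
same_result = parallel_sum_of_squares(data) == sequential_sum_of_squares(data)
```

The same split, process, and combine structure applies whether the chunks go to cores in one machine or to nodes in a cluster; what changes is how the chunks and partial results travel.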

Scale Up versus Scale Out 11

¨  Scale up:
¤  Concurrent programming on multi-core machines is often done through multi-threading and in-process message passing.
¤  Single large multi-core machines are best utilized in the context of a single application through concurrent programming.
¨  Scale out:
¤  Distributed programming does something similar by distributing jobs across machines over the network.
¤  Patterns used are: MapReduce (Google, 2004), Master/Worker, Tuple Spaces, Blackboard
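A minimal sketch of the Master/Worker pattern using multi-threading and in-process message passing, as named on this slide. The word-counting task and the names `master`, `worker`, and `STOP` are hypothetical. In a scale-out version the two queues would be replaced by network messaging between machines.

```python
import queue
import threading

STOP = object()  # sentinel message telling a worker to exit


def worker(tasks, results):
    # Workers pull messages from the task queue until told to stop.
    while True:
        item = tasks.get()
        if item is STOP:
            break
        results.put((item, len(item.split())))


def master(lines, num_workers=3):
    # The master hands out work, waits for the workers, and
    # gathers their results into one dictionary.
    tasks, results = queue.Queue(), queue.Queue()
    threads = [
        threading.Thread(target=worker, args=(tasks, results))
        for _ in range(num_workers)
    ]
    for t in threads:
        t.start()
    for line in lines:
        tasks.put(line)
    for _ in threads:
        tasks.put(STOP)  # one stop message per worker
    for t in threads:
        t.join()
    counts = {}
    while not results.empty():
        line, words = results.get()
        counts[line] = words
    return counts


word_counts = master(["diet coke is great", "scale out"])
```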

Scale Up versus Scale Out 12

¨  Scale up:
¤  Existence of a shared address space
¤  Data sharing and message passing can be done simply by passing a reference.
¨  Scale out:
¤  Lack of a shared address space
¤  Makes sharing, passing, or updating data significantly more complex
¤  You deal with passing copies of the data, which involves additional network, serialization, and de-serialization overhead
¤  Once you cross the boundaries of a single process you need to deal with partial failure and consistency
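The difference between passing a reference and passing a copy can be sketched in a few lines. The record contents and function names are hypothetical, and `pickle` stands in for whatever wire format a real distributed system would use; no network is involved here, only the serialization round trip.

```python
import pickle


def update_in_process(record):
    # Shared address space: the callee mutates the caller's object.
    record["mentions"] += 1


def update_remote(payload: bytes) -> bytes:
    # No shared address space: the callee only ever sees a copy,
    # rebuilt from bytes, and must send bytes back.
    record = pickle.loads(payload)
    record["mentions"] += 1
    return pickle.dumps(record)


shared = {"brand": "Diet Coke", "mentions": 10}
update_in_process(shared)            # caller sees the change directly

original = {"brand": "Diet Coke", "mentions": 10}
reply = update_remote(pickle.dumps(original))
updated = pickle.loads(reply)        # a new object; `original` is unchanged
```

After the "remote" call the caller holds two versions of the record and has to decide which one is current, which is where the consistency concerns on this slide begin.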

Why Scale Out 13

¨  Cost/Performance Flexibility:
¤  Optimize cost/performance by selecting the optimal configuration setup at any time
¤  If your system is designed for scale-up only, then you are pretty much locked into a certain minimum price driven by the hardware that you are using.
¤  In a competitive situation, the lack of flexibility could actually kill your business

Why Scale Out 14

¨  Continuous Availability/Redundancy:
¤  Failure is inevitable.
¤  One big system is a single point of failure
¤  The recovery process could be long
¤  Extended down-time is needed to restore one big machine
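A rough worked example of why redundancy helps. It assumes, and this assumption is not from the slides, that nodes fail independently and that the service stays up as long as at least one node is up. The 99% per-node figure is hypothetical.

```python
def service_availability(node_availability, nodes):
    # Probability that at least one of `nodes` independent nodes is up:
    # 1 minus the probability that all of them are down at once.
    return 1.0 - (1.0 - node_availability) ** nodes


one_big_machine = service_availability(0.99, 1)
three_small_machines = service_availability(0.99, 3)
```

Under these assumptions three machines that are each up 99% of the time are jointly unavailable only when all three fail together. Real failures are often correlated (shared power, network, or software bugs), so this is an upper bound rather than a prediction.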

Why Scale Out 15

¨  Continuous Upgrades:
¤  Building an application as one big unit makes it harder or even impossible to add or change pieces of code individually without bringing the entire system down.
¤  Better to decouple your application into concrete sets of services that can be maintained independently.

Why Scale Out 16

¨  Geographical Distribution:
¤  There are cases where an application needs to be spread across data centers or geographical locations to handle disaster recovery scenarios or to reduce geographical latency.
¤  In these cases the application has to be distributed; putting it in a single box won't work.

Scaling out is non-trivial 17

¨  Scale-out apps need a rewrite, as the programming model is different
¨  Scale-out gains are not linear
¤  network overhead, transactions, and replication are added to operations that were previously done just by passing object references
¨  Beyond a few obvious cases, choosing between scale up and scale out is fairly hard
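One way to see why scale-out gains are not linear is a toy cost model. The model below is an assumption added for illustration, not something from the slides: it is Amdahl's law with an extra coordination cost that grows with the number of nodes, and the default parameter values are hypothetical.

```python
def speedup(nodes, serial_fraction=0.05, overhead_per_node=0.002):
    # Run time on one node is normalized to 1.0.
    # serial_fraction: share of the work that cannot be parallelized.
    # overhead_per_node: network/replication cost added per extra node.
    parallel_fraction = 1.0 - serial_fraction
    time_on_n = (
        serial_fraction
        + parallel_fraction / nodes
        + overhead_per_node * (nodes - 1)
    )
    return 1.0 / time_on_n


curve = {n: speedup(n) for n in (1, 2, 4, 8, 16, 32, 64)}
```

In this model each added node helps less than the previous one, and past some cluster size the coordination cost outweighs the parallel gain, so the speedup starts to fall.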

Further reading 18

¨  Dean, Jeff and Ghemawat, Sanjay. MapReduce: Simplified Data Processing on Large Clusters.
¤  http://research.google.com/archive/mapreduce.html
¨  Open Source Implementation of MapReduce
¤  http://hadoop.apache.org/
