web search for a planet: the google cluster architecture

Post on 22-Feb-2016


Web Search for a Planet: The Google Cluster Architecture

Eugenio De Hoyos

6175 Computer Science Seminar

October 4, 2011

2

introduction

3

introduction

“… a single query on Google reads hundreds of megabytes of data and consumes tens of billions of CPU cycles…”

IO: 500 MB @ 20 MB/s → 25 sec

CPU: 10 × 10^9 cycles @ 2 GHz → 5 sec
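The two estimates on this slide are simple divisions, and a short sketch makes the arithmetic explicit. The variable names and the `per_machine_seconds` helper are illustrative assumptions, not part of the talk; the helper assumes the work splits perfectly evenly across machines, which is an idealization.

```python
# Back-of-the-envelope cost of one query on a single machine,
# using the figures quoted on the slide.
data_read_mb = 500       # data read per query, in MB
disk_rate_mb_s = 20      # disk throughput, in MB/s
cpu_cycles = 10e9        # CPU cycles consumed per query
clock_hz = 2e9           # 2 GHz clock

io_seconds = data_read_mb / disk_rate_mb_s
cpu_seconds = cpu_cycles / clock_hz


def per_machine_seconds(total_seconds, machines):
    """Ideal wall-clock time if the work splits evenly (an assumption)."""
    return total_seconds / machines


print(io_seconds, cpu_seconds)  # 25.0 5.0
```

Under that idealized even split, spreading the 25 seconds of IO over 100 machines would leave 0.25 seconds per machine, which is the motivation for partitioning the work.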


5

outline

A Single Query

Philosophy

Power

Index Hardware

Index Memory

Conclusion

6

http://www.googlefalle.com

a single query

7

a single query

[Figure: a hardware load balancer in front of several Google Web Servers]

8

[Figure: a Google Web Server talking to the index servers and the document servers. Each group is divided into shards, and each shard is served by a pool of PCs. The steps of the query are numbered 1 to 4.]
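The flow in this figure can be sketched as a scatter-gather: the web server sends the query to every index shard, merges the partial hit lists, then asks the document shards for the titles of the best hits. Everything below is a toy illustration under stated assumptions: the shard contents, the scores, and the function names are hypothetical, and the four numbered steps of the figure are not recoverable from the transcript, so the ordering used here is a guess.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical index shards: term -> list of (doc_id, score) hits.
INDEX_SHARDS = [
    {"cluster": [(1, 0.9), (4, 0.4)]},
    {"cluster": [(7, 0.7)], "planet": [(8, 0.5)]},
]

# Hypothetical document shards: doc_id -> title.
DOC_SHARDS = [
    {1: "Cluster basics", 4: "Racks"},
    {7: "Cluster power", 8: "Planets"},
]


def search_shard(shard, term):
    """Return this shard's hits for the term (empty list if none)."""
    return shard.get(term, [])


def lookup_title(doc_id):
    """Find the document shard holding doc_id and return its title."""
    for shard in DOC_SHARDS:
        if doc_id in shard:
            return shard[doc_id]
    return None


def handle_query(term, top_k=2):
    """Scatter the term to all index shards, merge by score, fetch titles."""
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(lambda s: search_shard(s, term), INDEX_SHARDS))
    hits = sorted(
        (hit for part in partials for hit in part),
        key=lambda hit: hit[1],
        reverse=True,
    )
    return [(doc_id, lookup_title(doc_id)) for doc_id, _ in hits[:top_k]]


print(handle_query("cluster"))  # [(1, 'Cluster basics'), (7, 'Cluster power')]
```

Because each shard answers independently, adding shards or replicas increases capacity without changing the web server's logic.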

9


11

philosophy

[Figure: Service A, Service B, Service C]

12

philosophy

Rack of commodity PCs: 176 CPUs, 176 GB RAM, 7 TB disk, $278,000

High-end multiprocessor server: 8 CPUs, 64 GB RAM, 8 TB disk, $758,000
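The price/performance argument behind these two configurations can be checked with a few divisions. The sketch below uses only the numbers on the slide; the dictionary keys, the neutral labels `many_cpu_config` and `few_cpu_config`, and the `cost_per_unit` helper are illustrative assumptions.

```python
# The two configurations from the slide (labels are hypothetical).
many_cpu_config = {"cpus": 176, "ram_gb": 176, "storage_tb": 7, "price_usd": 278_000}
few_cpu_config = {"cpus": 8, "ram_gb": 64, "storage_tb": 8, "price_usd": 758_000}


def cost_per_unit(config, key):
    """Dollars paid per unit of the given resource."""
    return config["price_usd"] / config[key]


for resource in ("cpus", "ram_gb", "storage_tb"):
    many = cost_per_unit(many_cpu_config, resource)
    few = cost_per_unit(few_cpu_config, resource)
    print(f"{resource}: ${many:,.0f} vs ${few:,.0f} per unit")
```

Per CPU, the 176-CPU configuration costs about $1,580 against $94,750 for the 8-CPU one, roughly a factor of 60, and it is also cheaper per GB of RAM and per TB of storage.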

13

14


16

the power problem

[Figure: breakdown by component: CPU, power, RAM/board, HD]

17

“A Google data center, circa 2000. Note the fan on the floor to cool servers.” (Credit: Stephen Shankland-CNET News.com/Jeff Dean-Google)

18

their observation

[Figure: cost versus scale, with curves for equipment and for power & cooling]

19

are their numbers right?

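\[
\frac{\text{Performance}}{\text{Cost}} \approx \frac{\text{Instructions}}{\text{Amortization} + (\text{Power} + \text{Cooling})}
\]

\[
\approx \frac{\text{Instructions}}{\$7{,}700 + \$1{,}500} = \frac{\text{Instructions}}{\$7{,}700 + \$1{,}300 + \$200}
\]

Cost of inefficiency

Min. Cost Requires $20,000

Amortization

Min. Amortization Requires $1,500

Operating Costs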
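To see how much weight power and cooling carry in this ratio, the slide's figures can be plugged in directly. This is a minimal sketch: the function name and the instruction count of 1.0 (an arbitrary unit of work) are assumptions, and the dollar amounts are taken as the slide states them without a stated time period.

```python
amortization_usd = 7_700     # equipment amortization, from the slide
power_cooling_usd = 1_500    # power plus cooling, from the slide


def performance_per_cost(instructions, amortization, power_cooling):
    """Instructions delivered per dollar of amortization plus power and cooling."""
    return instructions / (amortization + power_cooling)


total_usd = amortization_usd + power_cooling_usd
power_share = power_cooling_usd / total_usd

baseline = performance_per_cost(1.0, amortization_usd, power_cooling_usd)
doubled_power = performance_per_cost(1.0, amortization_usd, 2 * power_cooling_usd)

print(total_usd)                          # 9200
print(round(power_share, 3))              # 0.163
print(round(doubled_power / baseline, 2)) # 0.86
```

With these numbers, power and cooling are about 16% of the denominator, so doubling them would cut performance per dollar by roughly 14%.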


21

hardware

[Figure: index server components: CPU, RAM, hard drive]

22

hardware

[Figure: instruction pipeline timing, one column per clock cycle, comparing a short pipeline (Pentium III) with a long pipeline (Pentium IV)]
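One way to read the short-versus-long pipeline comparison is through a toy cycle count in which every branch misprediction flushes the pipeline and forces a refill. The model below is a deliberate simplification and an assumption of this sketch, not the talk's analysis: it ignores clock speed, stalls, and superscalar issue, and the depths of 5 and 10 only loosely follow the stage numbers visible in the figure.

```python
def cycles_to_run(instructions, depth, mispredictions):
    """Cycles for an ideal scalar pipeline with full flushes on mispredictions.

    (depth - 1) cycles fill the pipeline, then one instruction retires per
    cycle; each misprediction costs another (depth - 1) cycles to refill.
    """
    refill = depth - 1
    return refill + instructions + mispredictions * refill


short_pipeline = cycles_to_run(instructions=1000, depth=5, mispredictions=50)
long_pipeline = cycles_to_run(instructions=1000, depth=10, mispredictions=50)

print(short_pipeline, long_pipeline)  # 1204 1459
```

In this model the deeper pipeline needs more cycles for the same branch-heavy work, so it only wins if its higher clock rate more than covers the extra refill cycles.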


24

hardware

[Figure: pipeline diagrams contrasting thread level parallelism with instruction level parallelism]

25

hardware

simultaneous multithreading (SMT)

[Figure: pipeline diagram for a single CPU with L1 and L2 caches]

26

hardware

chip multiprocessor (CMP)

[Figure: pipeline diagrams for two CPUs, each with its own L1 cache, and one L2 cache]


28

memory & scalability

Unpredictable memory access

Large cache lines, prefetch helps

Memory bandwidth OK

[Figure: CPU, cache, and RAM, labeled with line length and cache length]
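The benefit of longer cache lines depends on the access pattern, which a cold-miss count can illustrate. The sketch below is a toy model under stated assumptions: an unbounded cache (so only first-touch misses are counted), 4-byte elements, and hypothetical access patterns chosen for illustration rather than measured from any index server.

```python
def cold_misses(addresses, line_bytes):
    """Count first-touch misses: one per distinct cache line referenced."""
    seen_lines = set()
    misses = 0
    for address in addresses:
        line = address // line_bytes
        if line not in seen_lines:
            seen_lines.add(line)
            misses += 1
    return misses


# Sequential scan of a 4 KB block in 4-byte steps (good spatial locality).
sequential = range(0, 4096, 4)
# Sixteen accesses spaced 4 KB apart (no spatial locality).
scattered = range(0, 16 * 4096, 4096)

for line_bytes in (32, 64, 128):
    print(line_bytes, cold_misses(sequential, line_bytes), cold_misses(scattered, line_bytes))
# 32 128 16
# 64 64 16
# 128 32 16
```

Doubling the line length halves the misses for the sequential scan but does nothing for the scattered accesses, which is why longer lines pay off only when the data touched next sits close to the data just read.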


30

conclusion

Cluster architecture is ideal and least expensive

Maximize throughput

Software Reliability

31

conclusion

[Figure: Service A, Service B, Service C]

32

a discussion question…

HDMI Monitor, USB Keyboard, 700 MHz ARM11, 128 MB RAM, OpenGL ES 2.0, 1080p -- David Braben, UK game developer

33

questions?
