02LecSp11MapReduce
TRANSCRIPT
CS 61C: Great Ideas in Computer Architecture (Machine Structures)
Instructors: Randy H. Katz, David A. Patterson
http://inst.eecs.Berkeley.edu/~cs61c/sp11
Review
CS 61C: Learn 6 great ideas in computer architecture to enable high-performance programming via parallelism, not just learn C
1. Layers of Representation/Interpretation
2. Moore's Law
3. Principle of Locality/Memory Hierarchy
4. Parallelism
5. Performance Measurement and Improvement
6. Dependability via Redundancy
Post-PC Era: parallel processing, smartphones to WSC
WSC SW must cope with failures, varying load, and varying HW latency and bandwidth
WSC HW is sensitive to cost and energy efficiency
New-School Machine Structures (It's a bit more complicated!)
Parallel Requests: assigned to computer, e.g., search "Katz"
Parallel Threads: assigned to core, e.g., lookup, ads
Parallel Instructions: >1 instruction @ one time, e.g., 5 pipelined instructions
Parallel Data: >1 data item @ one time, e.g., add of 4 pairs of words
Hardware descriptions: all gates @ one time
[Figure: the levels of machine structures, from warehouse scale computer and smartphone down through software and hardware: computer (cores, memory/cache, main memory, input/output), core (instruction unit(s) and functional unit(s) computing A0+B0, A1+B1, A2+B2, A3+B3 in parallel), and logic gates; markers show where today's lecture fits in harnessing parallelism to achieve high performance.]
Agenda
Request and Data Level Parallelism
Administrivia + 61C in the News + Internship Workshop + The secret to getting good grades at Berkeley
MapReduce
MapReduce Examples
Technology Break
Costs in Warehouse Scale Computer (if time permits)
Request-Level Parallelism (RLP)
Hundreds or thousands of requests per second
Not your laptop or cell phone, but popular Internet services like Google Search
Such requests are largely independent
Mostly involve read-only databases
Little read-write (aka producer-consumer) sharing
Rarely involve read-write data sharing or synchronization across requests
Computation easily partitioned within a request and across different requests
Google Query-Serving Architecture
Anatomy of a Web Search
Google "Randy H. Katz"
Direct request to closest Google Warehouse Scale Computer
Front-end load balancer directs request to one of many arrays (clusters of servers) within the WSC
Within an array, select one of many Google Web Servers (GWS) to handle the request and compose the response pages
GWS communicates with Index Servers to find documents that contain the search words, "Randy" and "Katz"; uses location of search as well
Return document list with associated relevance score
Anatomy of a Web Search
In parallel:
Ad system: books by Katz at Amazon.com
Images of Randy Katz
Use docids (document IDs) to access indexed documents
Compose the page
Result document extracts (with keyword in context) ordered by relevance score
Sponsored links (along the top) and advertisements (along the sides)
Anatomy of a Web Search
Implementation strategy:
Randomly distribute the entries
Make many copies of data (aka replicas)
Load balance requests across replicas
Redundant copies of indices and documents
Breaks up hot spots, e.g., Justin Bieber
Increases opportunities for request-level parallelism
Makes the system more tolerant of failures
Data-Level Parallelism (DLP)
Two kinds:
Lots of data in memory that can be operated on in parallel (e.g., adding together 2 arrays)
Lots of data on many disks that can be operated on in parallel (e.g., searching for documents)
The March 1 lecture and 3rd project do Data-Level Parallelism (DLP) in memory
Today's lecture and 1st project do DLP across 1000s of servers and disks using MapReduce
Problem Trying To Solve
How to process large amounts of raw data (crawled documents, request logs, ...) every day to compute derived data (inverted indices, page popularity, ...) when the computation is conceptually simple but the input data is large and distributed across 100s to 1000s of servers, so that it finishes in reasonable time?
Challenge: parallelize the computation, distribute the data, and tolerate faults without obscuring the simple computation with complex code to deal with these issues
Jeffrey Dean and Sanjay Ghemawat, "MapReduce: Simplified Data Processing on Large Clusters," Communications of the ACM, Jan 2008.
MapReduce Solution
Apply Map function to user-supplied records of key/value pairs
Compute set of intermediate key/value pairs
Apply Reduce operation to all values that share the same key in order to combine the derived data properly
User supplies Map and Reduce operations in a functional model, so the library can parallelize, using re-execution for fault tolerance
Data-Parallel Divide and Conquer (MapReduce Processing)
Map: slice data into "shards" or "splits", distribute these to workers, compute sub-problem solutions
map(in_key, in_value) -> list(out_key, intermediate_value)
Processes input key/value pair
Produces set of intermediate pairs
Reduce: collect and combine sub-problem solutions
reduce(out_key, list(intermediate_value)) -> list(out_value)
Combines all intermediate values for a particular key
Produces a set of merged output values (usually just one)
Fun to use: focus on problem, let MapReduce library deal with messy details
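To make those signatures concrete, here is a minimal single-process sketch of the same flow in Python; the driver and its names are illustrative, not Google's actual library:

  from collections import defaultdict

  def run_mapreduce(inputs, map_fn, reduce_fn):
      # Map phase: each (in_key, in_value) record yields intermediate pairs
      intermediate = defaultdict(list)
      for in_key, in_value in inputs:
          for out_key, inter_value in map_fn(in_key, in_value):
              intermediate[out_key].append(inter_value)
      # "Shuffle": the dictionary has grouped all values sharing a key
      # Reduce phase: combine the value list for each key
      return {key: reduce_fn(key, values)
              for key, values in intermediate.items()}

A real implementation distributes the map calls across thousands of workers and partitions the intermediate keys across R reduce tasks; the in-memory dictionary above stands in for the shuffle.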
MapReduce Execution
Fine-granularity tasks: many more map tasks than machines
2,000 servers => 200,000 Map tasks, 5,000 Reduce tasks
MapReduce Popularity at Google
                                 Aug-04   Mar-06    Sep-07     Sep-09
Number of MapReduce jobs         29,000   171,000   2,217,000  3,467,000
Average completion time (secs)   634      874       395        475
Server years used                217      2,002     11,081     25,562
Input data read (TB)             3,288    52,254    403,152    544,130
Intermediate data (TB)           758      6,743     34,774     90,120
Output data written (TB)         193      2,970     14,018     57,520
Average number of servers/job    157      268       394        488
Agenda
Request and Data Level Parallelism
Administrivia + The secret to getting good grades at Berkeley
MapReduce
MapReduce Examples
Technology Break
Costs in Warehouse Scale Computer (if time permits)
This Week
Discussions and labs will be held this week
Switching sections: if you find another 61C student willing to swap discussion AND lab, talk to your TAs
Partners (only project 3 and extra credit): OK if partners mix sections but have the same TA
First homework assignment due this Sunday, January 23rd, by 11:59:59 PM
There is a reading assignment as well on the course page
Course Organization
Grading:
Participation and Altruism (5%)
Homework (5%)
Labs (20%)
Projects (40%):
1. Data Parallelism (Map-Reduce on Amazon EC2)
2. Computer Instruction Set Simulator (C)
3. Performance Tuning of a Parallel Application/Matrix Multiply using cache blocking, SIMD, MIMD (OpenMP, due with partner)
4. Computer Processor Design (Logisim)
Extra Credit: Matrix Multiply Competition, anything goes
Midterm (10%): 6-9 PM Tuesday March 8
Final (20%): 11:30-2:30 PM Monday May 9
EECS Grading Policy
http://www.eecs.berkeley.edu/Policies/ugrad.grading.shtml
"A typical GPA for courses in the lower division is 2.7. This GPA would result, for example, from 17% A's, 50% B's, 20% C's, 10% D's, and 3% F's. A class whose GPA falls outside the range 2.5 - 2.9 should be considered atypical."
Fall 2010: GPA 2.81; 26% A's, 47% B's, 17% C's, 3% D's, 6% F's
Job/Intern Interviews: They grill you with technical questions, so it's what you say, not your GPA (new 61C gives you good stuff to say)

        Fall   Spring
2010    2.81   2.81
2009    2.71   2.81
2008    2.95   2.74
2007    2.67   2.76
Late Policy
Assignments due Sundays at 11:59:59 PM
Late homeworks not accepted (100% penalty)
Late projects get a 20% penalty, accepted up to Tuesdays at 11:59:59 PM; no credit if more than 48 hours late
No slip days in 61C
Used by Dan Garcia and a few faculty to cope with 100s of students who often procrastinate without having to hear the excuses, but not widespread in EECS courses
More late assignments if everyone has no-cost options; better to learn now how to cope with real deadlines
Policy on Assignments and Independent Work
With the exception of laboratories and assignments that explicitly permit you to work in groups, all homeworks and projects are to be YOUR work and your work ALONE.
You are encouraged to discuss your assignments with other students, and extra credit will be assigned to students who help others, particularly by answering questions on the Google Group, but we expect that what you hand in is yours.
It is NOT acceptable to copy solutions from other students.
It is NOT acceptable to copy (or start your) solutions from the Web.
We have tools and methods, developed over many years, for detecting this. You WILL be caught, and the penalties WILL be severe.
At the minimum a ZERO for the assignment, possibly an F in the course, and a letter to your university record documenting the incident of cheating.
(We caught people last semester!)
YOUR BRAIN ON COMPUTERS: Hooked on Gadgets, and Paying a Mental Price
NY Times, June 7, 2010, by Matt Richtel
SAN FRANCISCO -- When one of the most important
e-mail messages of his life landed in his in-box a few
years ago, Kord Campbell overlooked it.
Not just for a day or two, but 12 days. He finally saw
it while sifting through old messages: a big company
wanted to buy his Internet start-up.
''I stood up from my desk and said, 'Oh my God, oh
my God, oh my God,' '' Mr. Campbell said. ''It's kind
of hard to miss an e-mail like that, but I did.''
The message had slipped by him amid an electronic
flood: two computer screens alive with e-mail,
instant messages, online chats, a Web browser and
the computer code he was writing.
While he managed to salvage the $1.3 million deal
after apologizing to his suitor, Mr. Campbell
continues to struggle with the effects of the deluge
of data. Even after he unplugs, he craves the
stimulation he gets from his electronic gadgets. He
forgets things like dinner plans, and he has trouble
focusing on his family. His wife, Brenda, complains,
''It seems like he can no longer be fully in the
moment.''
This is your brain on computers.
Scientists say juggling e-mail, phone calls and other
incoming information can change how people think and
behave. They say our ability to focus is being undermined
by bursts of information.
These play to a primitive impulse to respond to immediate
opportunities and threats. The stimulation provokes
excitement -- a dopamine squirt -- that researchers say can
be addictive. In its absence, people feel bored.
The resulting distractions can have deadly consequences, as
when cellphone-wielding drivers and train engineers cause
wrecks. And for millions of people like Mr. Campbell, these
urges can inflict nicks and cuts on creativity and deep
thought, interrupting work and family life.
While many people say multitasking makes them more
productive, research shows otherwise. Heavy multitaskers
actually have more trouble focusing and shutting out
irrelevant information, scientists say, and they experience
more stress.
And scientists are discovering that even after the
multitasking ends, fractured thinking and lack of focus
persist. In other words, this is also your brain off computers.
The Rules (and we really mean it!)
Architecture of a Lecture
[Figure: attention level vs. time in minutes (0-80). Attention is full except for dips during Administrivia (~20-25 min), the Tech Break (~50-53 min), and "And in conclusion" (~78-80 min).]
61C in the News
IEEE Spectrum Top 11 Innovations of the Decade
[Figure: magazine spread with several of the innovations marked "61C".]
The Secret to Getting Good Grades
Grad student said he finally figured it out (Mike Dahlin, now Professor at UT Texas)
My question: What is the secret?
Do the assigned reading the night before, so that you get more value from lecture
Fall 61C comment on end-of-semester survey: "I wish I had followed Professor Patterson's advice and did the reading before each lecture."
MapReduce Processing Example: Count Word Occurrences
Pseudocode: simple case of assuming just 1 word per document

map(String input_key, String input_value):
  // input_key: document name
  // input_value: document contents
  for each word w in input_value:
    EmitIntermediate(w, "1"); // Produce count of words

reduce(String output_key, Iterator intermediate_values):
  // output_key: a word
  // intermediate_values: a list of counts
  int result = 0;
  for each v in intermediate_values:
    result += ParseInt(v); // get integer from key-value pair
  Emit(AsString(result));
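The same logic as runnable Python, a self-contained sketch (counts are ints rather than the strings the pseudocode parses; the tiny grouping loop stands in for the shuffle):

  from collections import defaultdict

  def wc_map(doc_name, contents):
      # EmitIntermediate(word, 1) for every word in the document body
      return [(word, 1) for word in contents.split()]

  def wc_reduce(word, counts):
      # Sum the list of counts emitted for this word
      return sum(counts)

  docs = [("doc1", "that that is"), ("doc2", "is that that")]
  grouped = defaultdict(list)
  for name, body in docs:
      for word, n in wc_map(name, body):
          grouped[word].append(n)
  print({w: wc_reduce(w, ns) for w, ns in grouped.items()})
  # {'that': 4, 'is': 2}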
MapReduce Processing
[Figure: MapReduce execution overview; the shuffle phase moves intermediate key/value pairs from map workers to reduce workers.]
MapReduce Processing
1. MapReduce first splits the input files into M splits, then starts many copies of the program on servers.
MapReduce Processing
2. One copy, the master, is special. The rest are workers. The master picks idle workers and assigns each one of M map tasks or one of R reduce tasks.
MapReduce Processing
3. A map worker reads the input split. It parses key/value pairs of the input data and passes each pair to the user-defined map function. (The intermediate key/value pairs produced by the map function are buffered in memory.)
MapReduce Processing
4. Periodically, the buffered pairs are written to local disk, partitioned into R regions by the partitioning function.
MapReduce Processing
5. When a reduce worker has read all intermediate data for its partition, it sorts it by the intermediate keys so that all occurrences of the same key are grouped together. (The sorting is needed because typically many different keys map to the same reduce task.)
MapReduce Processing
6. The reduce worker iterates over the sorted intermediate data and, for each unique intermediate key, passes the key and the corresponding set of intermediate values to the user-defined reduce function. Its output is appended to the final output file for this reduce partition.
MapReduce Processing
7. When all map tasks and reduce tasks have been completed, the master wakes up the user program. The MapReduce call in the user program returns back to user code. The output of MapReduce is in R output files (one per reduce task, with file names specified by the user); often these are passed into another MapReduce job.
MapReduce Processing Time Line
Master assigns map + reduce tasks to worker servers
As soon as a map task finishes, the worker server can be assigned a new map or reduce task
Data shuffle begins as soon as a given Map finishes
Reduce task begins as soon as all data shuffles finish
To tolerate faults, reassign a task if a worker server dies
Show MapReduce Job Running
~41 minutes total
~29 minutes for Map tasks & Shuffle tasks
~12 minutes for Reduce tasks
1707 worker servers used
Map (Green) tasks read 0.8 TB, write 0.5 TB
Shuffle (Red) tasks read 0.5 TB, write 0.5 TB
Reduce (Blue) tasks read 0.5 TB, write 0.5 TB
Another Example: Word Index (How Often Does a Word Appear?)

Distribute input splits: "that that is" | "is that that" | "is not is not" | "is that it it is"
Map 1: is 1, that 2
Map 2: is 1, that 2
Map 3: is 2, not 2
Map 4: is 2, it 2, that 1
Shuffle:
Reduce 1 receives: is 1,1,2,2; it 2
Reduce 2 receives: that 2,2,1; not 2
Reduce 1: is 6; it 2
Reduce 2: not 2; that 5
Collect: is 6; it 2; not 2; that 5
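A sketch of how the shuffle decides which reduce task gets each word: each intermediate key is assigned a reduce task by a partitioning function, conventionally hash(key) mod R. The CRC32 hash here is illustrative (exactly which words land on which reducer depends on the hash), not the slide's actual partitioner:

  from collections import defaultdict
  import zlib

  R = 2  # number of reduce tasks

  def partition(key):
      # Stable hash of the key, mod the number of reduce tasks.
      # (Python's built-in hash() is randomized per process, so use CRC32.)
      return zlib.crc32(key.encode()) % R

  map_outputs = [("is", 1), ("that", 2), ("is", 1), ("that", 2),
                 ("is", 2), ("not", 2), ("is", 2), ("it", 2), ("that", 1)]

  reduce_inputs = defaultdict(lambda: defaultdict(list))
  for key, value in map_outputs:
      reduce_inputs[partition(key)][key].append(value)

  for task in range(R):
      totals = {k: sum(v) for k, v in sorted(reduce_inputs[task].items())}
      print(f"Reduce {task}: {totals}")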
MapReduce Failure Handling
On worker failure:
Detect failure via periodic heartbeats
Re-execute completed and in-progress map tasks
Re-execute in-progress reduce tasks
Task completion committed through master
Master failure:
Could handle, but don't yet (master failure unlikely)
Robust: lost 1,600 of 1,800 machines once, but finished fine
MapReduce Redundant Execution
Slow workers significantly lengthen completion time
Other jobs consuming resources on machine
Bad disks with soft errors transfer data very slowly
Weird things: processor caches disabled (!!)
Solution: near the end of a phase, spawn backup copies of tasks (see the sketch below)
Whichever one finishes first "wins"
Effect: dramatically shortens job completion time
3% more resources, large tasks 30% faster
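A minimal sketch of the backup-task idea using Python's standard concurrency library: race a duplicate copy of a straggling task against the original and take whichever finishes first. The task function and timings are made up for illustration:

  import concurrent.futures
  import random
  import time

  def task(copy_id):
      # Simulate a task whose speed varies by worker (e.g., a bad disk)
      time.sleep(random.uniform(0.1, 2.0))
      return copy_id

  # Near the end of a phase, race a backup copy against the original:
  with concurrent.futures.ThreadPoolExecutor() as pool:
      futures = [pool.submit(task, "original"), pool.submit(task, "backup")]
      done, not_done = concurrent.futures.wait(
          futures, return_when=concurrent.futures.FIRST_COMPLETED)
      print("winner:", next(iter(done)).result())
      for f in not_done:
          f.cancel()  # best effort; a real system would kill the task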
Agenda
Request and Data Level Parallelism
Administrivia + The secret to getting good grades at Berkeley
MapReduce
MapReduce Examples
Technology Break
Costs in Warehouse Scale Computer (if time permits)
Design Goals of a WSC
Unique to warehouse scale:
Ample parallelism:
Batch apps: large number of independent data sets with independent processing. Also known as Data-Level Parallelism
Scale and its opportunities/problems:
Relatively small number of these make design cost expensive and difficult to amortize
But price breaks are possible from purchases of very large numbers of commodity servers
Must also prepare for high component failures
Operational costs count, not just the cost of equipment purchases
WSC Case Study: Server Provisioning

WSC Power Capacity                 8.00 MW
Power Usage Effectiveness (PUE)    1.45
IT Equipment Power Share           0.67   => 5.36 MW
Power/Cooling Infrastructure       0.33   => 2.64 MW
IT Equipment Measured Peak (W)     145.00
Assume Average Power @ 0.8 Peak    116.00
# of Servers                       46,207 (round to 46,000)
# of Servers per Rack              40     => # of Racks: 1,150
Top of Rack (TOR) Switches         1,150
# of TOR Switches per L2 Switch    16     => # of L2 Switches: 72
# of L2 Switches per L3 Switch     24     => # of L3 Switches: 3

[Figure: network hierarchy from the Internet through L3 switches, L2 switches, and top-of-rack switches down to racks of servers.]
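The provisioning numbers follow mechanically from the power budget; a quick re-computation of the table above in Python:

  wsc_power_w = 8.00e6               # WSC power capacity
  it_share = 0.67                    # IT equipment power share
  avg_server_w = 145.0 * 0.8         # average power at 0.8 of 145 W peak

  it_power_w = wsc_power_w * it_share
  print(f"IT power: {it_power_w/1e6:.2f} MW")         # 5.36 MW
  print(f"Servers:  {it_power_w/avg_server_w:,.0f}")  # ~46,207

  servers = 46_000                   # rounded for provisioning
  racks = servers // 40              # 40 servers per rack -> 1,150
  l2 = -(-racks // 16)               # 16 TOR switches per L2 -> 72
  l3 = -(-l2 // 24)                  # 24 L2 switches per L3 -> 3
  print(racks, l2, l3)               # 1150 72 3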
Cost of WSC
US accounting practice separates purchase price and operational costs:
Capital Expenditure (CAPEX) is the cost to buy equipment (e.g., buy servers)
Operational Expenditure (OPEX) is the cost to run the equipment (e.g., pay for electricity used)
WSC Case Study: Capital Expenditure (CapEx)
Facility cost and total IT cost look about the same:
Facility Cost        $88,000,000
Total Server Cost    $66,700,000
Total Network Cost   $12,810,000
Total Cost           $167,510,000
However, replace servers every 3 years, networking gear every 4 years, and the facility every 10 years
Cost of WSC
US accounting practice allows converting Capital Expenditure (CAPEX) into Operational Expenditure (OPEX) by amortizing costs over a time period:
Servers: 3 years
Networking gear: 4 years
Facility: 10 years
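A sketch of the resulting monthly amortization, dividing each CapEx bucket by its lifetime (straight-line, no cost of money; the slide's own monthly figures on the next table run somewhat higher, presumably because they include financing):

  capex = {  # (cost in $, amortization period in years)
      "servers":  (66_700_000, 3),
      "network":  (12_810_000, 4),
      "facility": (88_000_000, 10),
  }

  for name, (cost, years) in capex.items():
      monthly = cost / (years * 12)
      print(f"{name:8s} ${monthly:,.0f}/month")
  # servers  $1,852,778/month   (slide: $2,000,000)
  # network  $266,875/month     (slide: $295,000)
  # facility $733,333/month     (slide: $625,000 + $140,000)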
WSC Case Study: Operational Expense (OpEx)

                  Years   Amortized        Monthly      % of
                  Amort.  Capital Expense  Cost         Monthly
Server            3       $66,700,000      $2,000,000   55%
Network           4       $12,530,000      $295,000     8%
Facility                  $88,000,000
  Pwr & Cooling   10      $72,160,000      $625,000     17%
  Other           10      $15,840,000      $140,000     4%
Amortized Cost                             $3,060,000
Power (8 MW)      $0.07/kWh                $475,000     13%
People (3)                                 $85,000      2%
Total Monthly                              $3,620,000   100%

Monthly power costs: $475k for electricity; $625k + $140k to amortize facility power distribution and cooling; 60% is amortized power distribution and cooling
How Much Does a Watt Cost in a WSC?
8 MW facility
Amortized facility, including power distribution and cooling, is $625k + $140k = $765k per month
Monthly power usage = $475k
Watt-year = ($765k + $475k) x 12 / 8M = $1.86, or about $2 per year
To save a watt, if you spend more than $2 a year, you lose money
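The same arithmetic, spelled out:

  monthly_amortized = 765_000   # facility power distribution + cooling ($/month)
  monthly_power = 475_000       # electricity ($/month)
  facility_watts = 8_000_000    # 8 MW

  cost_per_watt_year = (monthly_amortized + monthly_power) * 12 / facility_watts
  print(f"${cost_per_watt_year:.2f} per watt-year")  # $1.86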
Replace Rotating Disks with Flash?
Flash is non-volatile semiconductor memory
Costs about $20/GB, capacity about 10 GB
Power about 0.01 watts
Disk is non-volatile rotating storage
Costs about $0.1/GB, capacity about 1000 GB
Power about 10 watts
Should you replace disk with flash to save money?
A (red): No: CapEx costs are 100:1 of OpEx savings!
B (orange): Don't have enough information to answer the question
C (green): Yes: Return investment in a single year!
Replace Rotating Disks with Flash?
Flash is non-volatile semiconductor memory
Costs about $20/GB, capacity about 10 GB
Power about 0.01 watts
Disk is non-volatile rotating storage
Costs about $0.1/GB, capacity about 1000 GB
Power about 10 watts
Should you replace disk with flash to save money?
Answer: A (red): No: CapEx costs are 100:1 of OpEx savings!
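A back-of-envelope check using the numbers on the slide and the ~$2 per watt-year figure from earlier; the exact ratio depends on how much flash you buy and over what period, but either way the CapEx dwarfs the energy savings:

  disk_gb, flash_gb = 1000, 1000        # replace 1000 GB of disk capacity
  disk_cost = 0.1 * disk_gb             # $100
  flash_cost = 20.0 * flash_gb          # $20,000
  extra_capex = flash_cost - disk_cost  # $19,900

  watts_saved = 10 - 0.01               # ~10 W per disk replaced
  opex_saved_per_year = watts_saved * 2 # at ~$2 per watt-year => ~$20/yr

  print(f"extra CapEx ${extra_capex:,.0f} vs OpEx savings "
        f"${opex_saved_per_year:,.0f}/yr -> "
        f"{extra_capex/opex_saved_per_year:,.0f}:1")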
WSC Case Study: Operational Expense (OpEx)
$3.8M / 46,000 servers = ~$80 per month per server in revenue to break even
~$80 / 720 hours per month = $0.11 per hour
So how does Amazon EC2 make money???

                  Years   Amortized        Monthly      % of
                  Amort.  Capital Expense  Cost         Monthly
Server            3       $66,700,000      $2,000,000   55%
Network           4       $12,530,000      $295,000     8%
Facility                  $88,000,000
  Pwr & Cooling   10      $72,160,000      $625,000     17%
  Other           10      $15,840,000      $140,000     4%
Amortized Cost                             $3,060,000
Power (8 MW)      $0.07/kWh                $475,000     13%
People (3)                                 $85,000      2%
Total Monthly                              $3,620,000   100%
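The break-even arithmetic, for checking (using the slide's rounded $3.8M monthly figure):

  monthly_opex = 3_800_000   # total monthly cost (slide rounds the $3.62M up)
  servers = 46_000
  hours_per_month = 720      # 30 days x 24 hours

  per_server_month = monthly_opex / servers            # ~$82.6/month
  per_server_hour = per_server_month / hours_per_month
  print(f"${per_server_month:.0f}/month, "
        f"${per_server_hour:.2f}/hour")                # ~$80, ~$0.11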
January 2011 AWS Instances & Prices
Closest computer in the WSC example is Standard Extra Large
@ $0.11/hr cost, Amazon EC2 can make money! Even if used only 50% of the time

Instance                           Per Hour  Ratio to  Compute  Virtual  Compute    Memory  Disk   Address
                                             Small     Units    Cores    Unit/Core  (GB)    (GB)
Standard Small                     $0.085    1.0       1.0      1        1.00       1.7     160    32 bit
Standard Large                     $0.340    4.0       4.0      2        2.00       7.5     850    64 bit
Standard Extra Large               $0.680    8.0       8.0      4        2.00       15.0    1690   64 bit
High-Memory Extra Large            $0.500    5.9       6.5      2        3.25       17.1    420    64 bit
High-Memory Double Extra Large     $1.000    11.8      13.0     4        3.25       34.2    850    64 bit
High-Memory Quadruple Extra Large  $2.000    23.5      26.0     8        3.25       68.4    1690   64 bit
High-CPU Medium                    $0.170    2.0       5.0      2        2.50       1.7     350    32 bit
High-CPU Extra Large               $0.680    8.0       20.0     8        2.50       7.0     1690   64 bit
Cluster Quadruple Extra Large      $1.600    18.8      33.5     8        4.20       23.0    1690   64 bit
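A quick sanity check of that claim against the price list (assuming the WSC break-even cost of ~$0.11 per server-hour applies to a Standard Extra Large instance):

  cost_per_hour = 0.11          # WSC break-even cost per server-hour
  price_per_hour = 0.680        # Standard Extra Large list price
  utilization = 0.50            # sold only half the time

  revenue = price_per_hour * utilization   # $0.34 effective per hour
  print(f"effective revenue ${revenue:.2f}/hr vs cost "
        f"${cost_per_hour:.2f}/hr, margin "
        f"{revenue / cost_per_hour:.1f}x")  # ~3.1x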
Summary
Request-Level Parallelism:
High request volume, each request largely independent of others
Use replication for better request throughput and availability
MapReduce Data Parallelism:
Divide large data set into pieces for independent parallel processing
Combine and process intermediate results to obtain final result
WSC CapEx vs. OpEx:
Servers dominate cost
Spend more on power distribution and cooling infrastructure than on monthly electricity costs
Economies of scale mean a WSC can sell computing as a utility