![Page 1: CS224 Fall 2011 Chapter 1 Computer Organization CS224 Fall 2011 “Welcome to my course” Will Sawyer With thanks to M.J. Irwin, D. Patterson, and J. Hennessy](https://reader034.vdocument.in/reader034/viewer/2022052200/56649e035503460f94aeea44/html5/thumbnails/1.jpg)
CS224 Fall 2011 Chapter 1
Computer Organization
CS224
Fall 2011
“Welcome to my course”
Will Sawyer
With thanks to M.J. Irwin, D. Patterson, and J. Hennessy for some lecture slide contents
CS224 Course Contents
Overview of computer technologies, instruction set architecture (ISA), ISA design considerations, RISC vs. CISC, assembly and machine language, translation and program start-up. Computer arithmetic, arithmetic logic unit, floating-point numbers and their arithmetic implementations. Processor design, data path and control implementation, pipelining, hazards, pipelined processor design, hazard detection and forwarding, branch prediction and exception handling. Memory hierarchy, principles, structure, and performance of caches, virtual memory, segmentation and paging. I/O devices, I/O performance, interfacing I/O. Intro to multiprocessors, multicores, and cluster computing.
From the Bilkent University Catalog: “Course Descriptions”
CS224 Policies
Everything is on the Web site:
found @ CS Dept > Course Home Pages > CS224
http://www.cs.bilkent.edu.tr/~will/courses/CS224/
Numerical average will be calculated from:
• 4-5 homeworks 10%
• X pop quizzes 10%
• 2 projects 30%
• Midterm 20%
• Final exam 30%
TO PASS, you must:
• have exam average >= 35% (weighted average)
• have overall course performance that is passing
CS224 Introduction
• This course is all about how computers work
• But what do we mean by a computer?
– Different types: desktop, servers, embedded devices
– Different uses: automobiles, graphics, finance, genomics…
– Different manufacturers: Intel, Apple, IBM, Microsoft, Sun…
– Different underlying technologies and different costs
• Best way to learn:
– Focus on a specific instance and learn how it works
– While learning general principles and historical perspectives
Why learn this stuff?
• You want to call yourself a “computer engineer”
• You want to build software people use (need performance)
• You need to make a purchasing decision or offer “expert” advice
• Both hardware and software affect performance:
– Algorithm determines number of source-level statements
– Language/Compiler/Architecture determine number of machine instructions (Chapters 2 and 3)
– Processor/Memory determine how fast instructions are executed (Chapters 4 and 5)
– I/O and Number_of_Cores determine overall system performance (Chapters 6 and 7)
Organization of a Computer
• Five classic components of a computer – input, output, memory, datapath, and control
datapath + control = processor
What is a computer?
• Components:
– input (mouse, keyboard, camera, microphone...)
– output (display, printer, speakers...)
– memory (caches, DRAM, SRAM, hard disk drives, Flash...)
– network (both input and output)
• Our primary focus: the processor (datapath and control)
– implemented using billions of transistors
– impossible to understand by looking at each transistor
– we need... abstraction!
An abstraction omits unneeded detail and helps us cope with complexity.
How do computers work?
• Each of the following abstracts everything below it:
– Applications software
– Systems software
– Assembly Language
– Machine Language
– Architectural Approaches: Caches, Virtual Memory, Pipelining
– Sequential logic, finite state machines
– Combinational logic, arithmetic circuits
– Boolean logic, 1s and 0s
– Transistors used to build logic gates (e.g. CMOS)
– Semiconductors/Silicon used to build transistors
– Properties of atoms, electrons, and quantum dynamics
• Notice how abstraction hides the detail of lower levels, yet gives a useful view for a given purpose
Computer Architecture
[Figure: computer architecture in context. Labels: Application; Operating System; Compiler; Firmware; Instruction Set Architecture; Instr. Set Proc.; I/O system; Implementation; Logic Design; Circuit Design; Layout]
The Instruction Set: a Critical Interface
[Figure: the instruction set architecture as the interface between software and hardware]
Instruction Set Architecture
• A very important abstraction
– interface between hardware and low-level software
– standardizes instructions, machine language bit patterns, etc.
– advantage: different implementations of the same architecture
– disadvantage: sometimes prevents using new innovations
• Common instruction set architectures:
– IA-64, IA-32, PowerPC, MIPS, SPARC, ARM, and others
– All are multi-sourced, with different implementations for the same ISA
Instruction Set Architecture (ISA)
• ISA, or simply architecture: the abstract interface between hardware and the lowest level of software that encompasses all the information necessary to write a machine language program, including instructions, registers, memory access, IO, …
• ISA includes
– Organization of storage
– Data types
– Encoding and representing instructions
– Instruction Set (i.e. opcodes)
– Modes of addressing data items/instructions
– Program visible exception handling
• ISA together with OS interface specifies the requirements for binary compatibility across implementations (ABI: application binary interface)
Case Study: MIPS ISA
• Instruction Categories
– Load/Store
– Computational
– Jump and Branch
– Floating Point
– Memory Management
– Special
• Registers: R0 - R31, PC, HI, LO
• 3 Instruction Formats, all 32 bits wide
– OP | rs | rt | rd | sa | funct
– OP | rs | rt | immediate
– OP | jump target
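The three formats above can be illustrated by pulling the fields out of a 32-bit word. This is a minimal sketch: the slide names the fields but not their widths, so the widths used here (6/5/5/5/5/6 bits, a 16-bit immediate, a 26-bit jump target) are taken from the standard MIPS32 layout, and the function name `decode` is purely illustrative.

```python
def decode(word):
    """Split a 32-bit MIPS instruction word into the fields of all three formats.

    Field widths are assumed from the standard MIPS32 encoding, not from the slide.
    """
    word &= 0xFFFFFFFF
    return {
        "op": (word >> 26) & 0x3F,      # 6-bit opcode, present in every format
        "rs": (word >> 21) & 0x1F,      # 5-bit source register
        "rt": (word >> 16) & 0x1F,      # 5-bit source/target register
        "rd": (word >> 11) & 0x1F,      # 5-bit destination register (first format)
        "sa": (word >> 6) & 0x1F,       # 5-bit shift amount (first format)
        "funct": word & 0x3F,           # 6-bit function code (first format)
        "immediate": word & 0xFFFF,     # 16-bit immediate (second format)
        "target": word & 0x3FFFFFF,     # 26-bit jump target (third format)
    }


# add $2, $4, $2 is 000000 00100 00010 00010 00000 100000 in binary.
fields = decode(0x00821020)
print(fields)
```

Which fields are meaningful depends on the opcode; a real decoder would look at `op` first and then interpret only the fields of the matching format.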
Computer Organization: Logic Designer's View
ISA Level
FUs & Interconnect
• Capabilities & Performance Characteristics of Principal Functional Units (e.g., Registers, ALU, Shifters, Logic Units, ...)
• Ways in which these components are interconnected
• Information flows between components
• Logic and means by which such information flow is controlled
• Choreography of FUs to realize the ISA
• Register Transfer Level (RTL) Description
Function Units in a Computer
Classes of Computers
• Desktop computers: Designed to deliver good performance to a single user at low cost usually executing 3rd party software, usually incorporating a graphics display, a keyboard, and a mouse
• Servers: Used to run larger programs for multiple, simultaneous users typically accessed only via a network and that places a greater emphasis on dependability and (often) security
• Supercomputers: A high performance, high cost class of servers with hundreds to thousands of processors, terabytes of memory and petabytes of storage that are used for high-end scientific and engineering applications
• Embedded computers (processors): A computer inside another device, used for running one predetermined application
Digital Cell Phone--Front Side (Nokia 8260)
Growth in Embedded Processor Sales (embedded growth >> desktop growth !!!)
Where else are embedded processors found?
Embedded Processor Characteristics
The largest class of computers spanning the widest range of applications and performance
• Often have minimum performance requirements.
• Often have stringent limitations on cost.
• Often have stringent limitations on power consumption.
• Often have low tolerance for failure.
In all these ways, embedded processors are very different than supercomputers, servers, or desktops/laptops
Below the Program
• System software
– Operating system – supervising program that interfaces the user’s program with the hardware (e.g., Linux, MacOS, Windows)
• Handles basic input and output operations
• Allocates storage and memory
• Provides for protected sharing among multiple applications
– Compiler – translates programs written in a high-level language (e.g., C, Java) into instructions that the hardware can execute
Systems software
Applications software
Hardware
Below the Program
• High-level language program (in C)
    void swap(int v[], int k)
    {
        int temp;
        temp = v[k];
        v[k] = v[k+1];
        v[k+1] = temp;
    }
  ↓ C compiler (one-to-many)
• Assembly language program (for MIPS)
    swap: sll $2, $5, 2
          add $2, $4, $2
          lw  $15, 0($2)
          lw  $16, 4($2)
          sw  $16, 0($2)
          sw  $15, 4($2)
          jr  $31
  ↓ assembler (one-to-one)
• Machine (object, binary) code (for MIPS)
    000000 00000 00101 0001000010000000
    000000 00100 00010 0001000000100000
    . . .
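The one-to-one assembler step can be sketched by encoding the `swap` instructions by hand. This is only an illustration, not a real assembler: the helper names `encode_r`, `encode_i` and `bits` are hypothetical, and the opcode and function codes (`lw` = 0x23, `sw` = 0x2B, `add` funct 0x20, `sll` funct 0x00, `jr` funct 0x08) come from the standard MIPS32 encoding rather than from the slide.

```python
def encode_r(rs, rt, rd, sa, funct, op=0):
    """Encode a register-format instruction: op | rs | rt | rd | sa | funct."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (sa << 6) | funct


def encode_i(op, rs, rt, imm):
    """Encode an immediate-format instruction: op | rs | rt | 16-bit immediate."""
    return (op << 26) | (rs << 21) | (rt << 16) | (imm & 0xFFFF)


def bits(word):
    """Format a word in the same grouping the slide uses: 6, 5, 5, 16 bits."""
    b = format(word, "032b")
    return " ".join([b[0:6], b[6:11], b[11:16], b[16:32]])


# Each assembly line maps to exactly one 32-bit machine word.
program = [
    ("sll $2, $5, 2",  encode_r(rs=0, rt=5, rd=2, sa=2, funct=0x00)),
    ("add $2, $4, $2", encode_r(rs=4, rt=2, rd=2, sa=0, funct=0x20)),
    ("lw  $15, 0($2)", encode_i(op=0x23, rs=2, rt=15, imm=0)),
    ("lw  $16, 4($2)", encode_i(op=0x23, rs=2, rt=16, imm=4)),
    ("sw  $16, 0($2)", encode_i(op=0x2B, rs=2, rt=16, imm=0)),
    ("sw  $15, 4($2)", encode_i(op=0x2B, rs=2, rt=15, imm=4)),
    ("jr  $31",        encode_r(rs=31, rt=0, rd=0, sa=0, funct=0x08)),
]

machine_code = [bits(word) for _, word in program]
for (text, _), encoded in zip(program, machine_code):
    print(f"{text:<16} {encoded}")
```

The first two encoded words reproduce the two binary lines shown on the slide, which is a quick sanity check on the field layout.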
Advantages of HLLs
• Higher-level languages (HLLs)
– Allow the programmer to think in a more natural language, tailored for the intended use (Fortran for scientific computation, Cobol for business programming, Lisp for symbol manipulation, Java for web programming, …)
– Improve programmer productivity – more understandable code that is easier to debug and validate
– Improve program maintainability
– Allow programs to be independent of the computer on which they are developed (compilers and assemblers can translate high-level language programs to the binary instructions of any machine)
– Emergence of optimizing compilers that produce very efficient assembly code optimized for the target machine
• As a result, very little programming is done today at the assembler level.
Compiler Basics
• High-level languages
– Programmers do not think in 0s and 1s
– Languages can also be specific to target applications, such as Cobol (business) or Fortran (scientific)
– Applications are more concise → fewer bugs
– Programs can be independent of the system on which they are developed
• Compilers convert source code to object code
• Libraries simplify common tasks
Levels of Representation
High Level Language Program
    temp = v[k];
    v[k] = v[k+1];
    v[k+1] = temp;
  ↓ Compiler
Assembly Language Program
    lw $15, 0($2)
    lw $16, 4($2)
    sw $16, 0($2)
    sw $15, 4($2)
  ↓ Assembler
Machine Language Program
    0000 1001 1100 0110 1010 1111 0101 1000
    1010 1111 0101 1000 0000 1001 1100 0110
    1100 0110 1010 1111 0101 1000 0000 1001
    0101 1000 0000 1001 1100 0110 1010 1111
  ↓ Machine Interpretation
Control Signal Specification
    ALUOP[0:3] <= InstReg[9:11] & MASK [i.e. high/low on control lines]
Execution Cycle
• Instruction Fetch: Obtain instruction from program storage
• Instruction Decode: Determine required actions and instruction size
• Operand Fetch: Locate and obtain operand data
• Execute: Compute result value or status
• Result Store: Deposit results in storage for later use
• Next Instruction: Determine successor instruction
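The six stages of the execution cycle can be sketched as a loop over a toy machine. Everything here is hypothetical and invented for illustration: the tuple-based instruction encoding, the `add`/`sub`/`halt` opcodes, and the register names are not MIPS and do not come from the slides.

```python
def run(program, registers):
    """Run a toy program, one pass through the execution cycle per instruction."""
    pc = 0  # program counter: index of the next instruction
    while pc < len(program):
        instruction = program[pc]            # 1. instruction fetch
        op, dst, src_a, src_b = instruction  # 2. instruction decode
        if op == "halt":
            break
        a = registers[src_a]                 # 3. operand fetch
        b = registers[src_b]
        if op == "add":                      # 4. execute
            result = a + b
        elif op == "sub":
            result = a - b
        else:
            raise ValueError(f"unknown opcode: {op}")
        registers[dst] = result              # 5. result store
        pc += 1                              # 6. next instruction (sequential here)
    return registers


toy_program = [
    ("add", "r2", "r0", "r1"),   # r2 = r0 + r1
    ("sub", "r3", "r2", "r0"),   # r3 = r2 - r0
    ("halt", None, None, None),
]
final = run(toy_program, {"r0": 5, "r1": 7, "r2": 0, "r3": 0})
print(final)
```

In this sketch the successor is always the next instruction; a branch or jump would instead overwrite `pc` in step 6.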
AMD’s Barcelona Multicore Chip
http://www.techwarelabs.com/reviews/processors/barcelona/
[Die photo labels: Core 1, Core 2, Core 3, Core 4; Northbridge; four 512KB L2 caches; 2MB shared L3 cache]
Four out-of-order cores on one chip
1.9 GHz clock rate
65nm technology
Three levels of caches (L1, L2, L3) on chip
Integrated Northbridge
Magnetic Storage
Source: Quantum Corp
Disk capacity increasing 60%/year for common form factor
Communication
• The Information Age
• The Internet’s changes to communication are unlike any past medium (printing press, radio, television)
– Estimated 400 million users in 2005
– An astounding 1.2 billion wireless users!!
[Chart: Residential Internet Subscribers, in millions. 2000: 145; 2001: 160; 2002: 180; 2003: 205; 2004: 240; 2005: 270. Source: Ovum]
Moore’s Law
[Figure: Dual Core Itanium with 1.7B transistors, annotated with feature size & die size. Courtesy, Intel ®]
In 1965, Gordon Moore (who went on to co-found Intel) predicted that the number of transistors that can be integrated on a single chip would double every year; in 1975 he revised the rate to about every two years
Moore’s Law for CPUs and DRAMs
From: “Facing the Hot Chips Challenge Again”, Bill Holt, Intel, presented at Hot Chips 17, 2005.
Technology Scaling Road Map
| Year | 2004 | 2006 | 2008 | 2010 | 2012 |
|---|---|---|---|---|---|
| Feature size (nm) | 90 | 65 | 45 | 32 | 22 |
| Integration capacity (billions of transistors) | 2 | 4 | 6 | 16 | 32 |

• Fun facts about 45nm transistors
– 30 million can fit on the head of a pin
– You could fit more than 2,000 across the width of a human hair
– If car prices had fallen at the same rate as the price of a single transistor has since 1968, a new car today would cost about 1 cent
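The road map's integration capacity can be compared with a plain "double every two years" projection. This is a rough sketch: the function name `projected_capacity` and the choice of 2004 (2 billion transistors) as the starting point are assumptions made for illustration; the road map values are copied from the table.

```python
# Integration capacity in billions of transistors, as listed in the road map.
roadmap = {2004: 2, 2006: 4, 2008: 6, 2010: 16, 2012: 32}


def projected_capacity(year, base_year=2004, base=2.0, doubling_period=2.0):
    """Capacity predicted by doubling every `doubling_period` years from `base`."""
    return base * 2 ** ((year - base_year) / doubling_period)


projection = {year: projected_capacity(year) for year in roadmap}

for year in sorted(roadmap):
    print(f"{year}: road map {roadmap[year]:>2} BT, doubling rule {projection[year]:>4.0f} BT")
```

The two series agree at every point except 2008, where the road map lists 6 rather than the 8 that strict doubling would give.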
Semiconductors
• 50 year old industry
– Still has continuous improvements
– New generation every 2-3 years
• 30% reduction in dimension → 50% reduction in area
• 30% reduction in delay → 50% speed increase
• Current generation: reduces cost and increases performance
– Processors are fabricated on ingots cut into wafers, which are then etched to create transistors
– Wafers are then diced to form chips, some of which have defects
– Yield is the measurement of the good chips
• Next generation: larger with more functions
– Each generation is an incremental improvement
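The per-generation scaling figures can be checked with a few lines of arithmetic. This is a small sketch under the assumption that "30% reduction" means multiplying by 0.7; the feature sizes are the ones listed in the technology road map (90, 65, 45, 32, 22 nm), and the variable names are illustrative.

```python
shrink = 0.7  # assumed linear scale factor for a 30% reduction per generation

# Area scales with the square of the linear dimension.
area_ratio = shrink ** 2   # about 0.49, i.e. roughly half the area

# If delay also scales by 0.7, speed scales by its reciprocal.
speedup = 1 / shrink       # about 1.43

# Ratio between successive road map feature sizes.
feature_sizes_nm = [90, 65, 45, 32, 22]
step_ratios = [new / old for old, new in zip(feature_sizes_nm, feature_sizes_nm[1:])]

print(f"area ratio per generation: {area_ratio:.2f}")
print(f"speedup per generation:    {speedup:.2f}")
print("feature size ratios:", [round(r, 2) for r in step_ratios])
```

Under this reading the area figure comes out almost exactly at 50%, while the speed gain is closer to 43%, so the slide's "50% speed increase" is best taken as a rounded figure.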
Semiconductor Manufacturing Process for Silicon ICs
Main driver: device scaling ...
From: “Facing the Hot Chips Challenge Again”, Bill Holt, Intel, presented at Hot Chips 17, 2005.
But What Happened to Clock Rates?
Clock rates hit a “power wall”
Hitting the Power Wall
“For the P6, success criteria included performance above a certain level and failure criteria included power dissipation above some threshold.”
Bob Colwell, Pentium Chronicles
Processor performance growth flattens!
The Latest Revolution: Multicores
The power challenge has forced a change in the design of microprocessors
-- Since 2002 the rate of improvement in the response time of programs on desktop computers has slowed from a factor of 1.5 per year to less than a factor of 1.2 per year
-- In 2011 all desktop and server companies are shipping microprocessors with multiple processors – cores – per chip

| Product | AMD Barcelona | Intel Nehalem | IBM Power 6 | Sun Niagara 2 |
|---|---|---|---|---|
| Cores per chip | 4 | 4 | 2 | 8 |
| Clock rate | 2.5 GHz | ~2.5 GHz? | 4.7 GHz | 1.4 GHz |
| Power | 120 W | ~100 W? | ~100 W? | 94 W |

The plan is to double the number of cores per chip per generation (about every two years)
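To see what the slowdown from 1.5x to 1.2x per year means, the two rates can be compounded over a few years. This is a minimal sketch; the five-year horizon and the function name `cumulative_speedup` are arbitrary choices for illustration.

```python
def cumulative_speedup(rate_per_year, years):
    """Total improvement after compounding a yearly improvement factor."""
    return rate_per_year ** years


years = 5  # arbitrary horizon chosen for illustration
before_2002 = cumulative_speedup(1.5, years)
after_2002 = cumulative_speedup(1.2, years)

print(f"1.5x per year over {years} years: {before_2002:.2f}x")
print(f"1.2x per year over {years} years: {after_2002:.2f}x")
```

Over five years the older rate compounds to about 7.6x while the newer one gives only about 2.5x, which is the gap that multicore designs are meant to close.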
Workloads and Benchmarks
• Benchmarks – a set of programs that form a “workload” specifically chosen to measure performance. With standard inputs, the benchmarks are run and execution time is measured.
• SPEC (System Performance Evaluation Cooperative) creates standard sets of benchmarks starting with SPEC89. The latest is SPEC CPU2006 which consists of 12 integer benchmarks (CINT2006) and 17 floating-point benchmarks (CFP2006).
www.spec.org
• There are also benchmark collections for power workloads (SPECpower_ssj2008), for email workloads (SPECmail2008), for multimedia workloads (mediabench), …
2002 SPEC Benchmarks

| Integer benchmark | Description | FP benchmark | Description |
|---|---|---|---|
| gzip | compression | wupwise | Quantum chromodynamics |
| vpr | FPGA place & route | swim | Shallow water model |
| gcc | GNU C compiler | mgrid | Multigrid solver in 3D fields |
| mcf | Combinatorial optimization | applu | Parabolic/elliptic PDE |
| crafty | Chess program | mesa | 3D graphics library |
| parser | Word processing program | galgel | Computational fluid dynamics |
| eon | Computer visualization | art | Image recognition (NN) |
| perlbmk | perl application | equake | Seismic wave propagation simulation |
| gap | Group theory interpreter | facerec | Facial image recognition |
| vortex | Object oriented database | ammp | Computational chemistry |
| bzip2 | compression | lucas | Primality testing |
| twolf | Circuit place & route | fma3d | Crash simulation FEM |
| | | sixtrack | Nuclear physics accel |
| | | apsi | Pollutant distribution |
SPEC CINT2006 on Barcelona (2.5 GHz)

| Name | IC × 10^9 | CPI | ExTime | RefTime | SPEC ratio |
|---|---|---|---|---|---|
| perl | 2,118 | 0.75 | 637 | 9,770 | 15.3 |
| bzip2 | 2,389 | 0.85 | 817 | 9,650 | 11.8 |
| gcc | 1,050 | 1.72 | 724 | 8,050 | 11.1 |
| mcf | 336 | 10.00 | 1,345 | 9,120 | 6.8 |
| go | 1,658 | 1.09 | 721 | 10,490 | 14.6 |
| hmmer | 2,783 | 0.80 | 890 | 9,330 | 10.5 |
| sjeng | 2,176 | 0.96 | 837 | 12,100 | 14.5 |
| libquantum | 1,623 | 1.61 | 1,047 | 20,720 | 19.8 |
| h264avc | 3,102 | 0.80 | 993 | 22,130 | 22.3 |
| omnetpp | 587 | 2.94 | 690 | 6,250 | 9.1 |
| astar | 1,082 | 1.79 | 773 | 7,020 | 9.1 |
| xalancbmk | 1,058 | 2.70 | 1,143 | 6,900 | 6.0 |
| Geometric Mean | | | | | 11.7 |
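The columns of this table are related by two simple formulas, which the following sketch checks against the listed values: execution time = instruction count × CPI × clock cycle time, and SPEC ratio = reference time / execution time. It assumes that times are in seconds, that the 2.5 GHz clock gives a 0.4 ns cycle, and that the perl instruction count is 2,118 × 10^9; the function names are illustrative.

```python
CLOCK_RATE_HZ = 2.5e9
CYCLE_TIME_S = 1 / CLOCK_RATE_HZ  # 0.4 ns per cycle

# (name, instruction count in billions, CPI, ExTime, RefTime, listed SPEC ratio)
rows = [
    ("perl",       2118, 0.75,   637,  9770, 15.3),
    ("bzip2",      2389, 0.85,   817,  9650, 11.8),
    ("gcc",        1050, 1.72,   724,  8050, 11.1),
    ("mcf",         336, 10.00, 1345,  9120,  6.8),
    ("go",         1658, 1.09,   721, 10490, 14.6),
    ("hmmer",      2783, 0.80,   890,  9330, 10.5),
    ("sjeng",      2176, 0.96,   837, 12100, 14.5),
    ("libquantum", 1623, 1.61,  1047, 20720, 19.8),
    ("h264avc",    3102, 0.80,   993, 22130, 22.3),
    ("omnetpp",     587, 2.94,   690,  6250,  9.1),
    ("astar",      1082, 1.79,   773,  7020,  9.1),
    ("xalancbmk",  1058, 2.70,  1143,  6900,  6.0),
]


def execution_time(ic_billions, cpi, cycle_time_s=CYCLE_TIME_S):
    """CPU time in seconds = instruction count x CPI x clock cycle time."""
    return ic_billions * 1e9 * cpi * cycle_time_s


def spec_ratio(reference_time, measured_time):
    """SPEC ratio = reference time / measured execution time (bigger is faster)."""
    return reference_time / measured_time


computed = [
    (name, execution_time(ic, cpi), spec_ratio(ref, ex_time))
    for name, ic, cpi, ex_time, ref, _ in rows
]

for name, time_s, ratio in computed:
    print(f"{name:<11} computed time {time_s:8.1f} s, computed ratio {ratio:5.2f}")
```

The computed values land within about 1% of the listed execution times and within 0.1 of the listed ratios; the small differences are consistent with rounding in the published CPI figures.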
Comparing & Summarizing Performance
• Reference time is the execution time on a reference computer
• Guiding principle in reporting performance measurements is reproducibility – list everything another experimenter would need to duplicate the experiment (version of the operating system, compiler settings, input set used, specific computer configuration (clock rate, cache sizes and speed, memory size and speed, etc)).
How do we summarize the performance for a benchmark set with a single number?
First the execution times are normalized giving the “SPEC ratio” of reference time to measured execution time (bigger is faster)
The SPEC ratios are then “averaged” using the geometric mean (GM)
$$\mathrm{GM} = \sqrt[n]{\prod_{i=1}^{n} \text{SPEC ratio}_i}$$
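As a worked example of the geometric mean, the sketch below applies it to the twelve SPEC CINT2006 ratios listed for Barcelona. The function name `geometric_mean` is illustrative, and computing through logarithms is just one convenient choice; multiplying the ratios and taking the n-th root gives the same result.

```python
import math

# SPEC ratios from the CINT2006 table for the Barcelona processor.
spec_ratios = [15.3, 11.8, 11.1, 6.8, 14.6, 10.5, 14.5, 19.8, 22.3, 9.1, 9.1, 6.0]


def geometric_mean(values):
    """n-th root of the product of n positive values, computed via logarithms."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


gm = geometric_mean(spec_ratios)
arithmetic = sum(spec_ratios) / len(spec_ratios)

print(f"geometric mean:  {gm:.1f}")
print(f"arithmetic mean: {arithmetic:.1f}")
```

The geometric mean comes out at the 11.7 shown in the table. It is lower than the arithmetic mean because it is less dominated by the few very high ratios.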
Other Performance Metrics
Power consumption – especially used in the embedded market where battery life is important. For power-limited applications, the most important metric is energy efficiency
CS224: Course Content
Computer Architecture and Engineering
Instruction Set Design Computer Organization
Interfaces Hardware Components
Compiler/System View Logic Designer’s View
“Building Architect” “Construction Engineer”
CS224: So what's in it for me?
• In-depth understanding of the inner workings of modern computers, their evolution, and trade-offs present at the hardware/software boundary.
– Insight into fast/slow operations that are easy/hard to implement in hardware
• Experience with the design process in the context of a large complex (hardware) design.
– Functional Spec --> Control & Datapath --> Simulation --> Physical implementation
– Modern CAD tools
• Designer's "Conceptual" toolbox
Conceptual tool box
• Evaluation Techniques
• Levels of Translation (e.g. Compilation, Assembly)
• Hierarchy (e.g. registers, cache, memory, disk)
• Pipelining and Parallelism
• Static / Dynamic Scheduling
• Indirection and Address Translation
• Timing, Clocking, and Latching
• CAD Programs, Hardware Description Languages, Simulation
• Physical Building Blocks (e.g. CLA)
• Understanding Technology Trends