cs252/kubiatowicz lec 1.1 8/25/03 cs252 graduate computer architecture lecture 1 review of...
Post on 20-Dec-2015
217 views
TRANSCRIPT
![Page 1: CS252/Kubiatowicz Lec 1.1 8/25/03 CS252 Graduate Computer Architecture Lecture 1 Review of Technology Trends and Cost/Performance August 25, 2003 Prof](https://reader030.vdocument.in/reader030/viewer/2022032800/56649d4b5503460f94a2963e/html5/thumbnails/1.jpg)
CS252/KubiatowiczLec 1.1
8/25/03
CS252Graduate Computer Architecture
Lecture 1
Review of Technology Trends and Cost/Performance
August 25, 2003
Prof. John Kubiatowicz
http://www.cs.berkeley.edu/~kubitron/cs252-F03
![Page 2: CS252/Kubiatowicz Lec 1.1 8/25/03 CS252 Graduate Computer Architecture Lecture 1 Review of Technology Trends and Cost/Performance August 25, 2003 Prof](https://reader030.vdocument.in/reader030/viewer/2022032800/56649d4b5503460f94a2963e/html5/thumbnails/2.jpg)
CS252/KubiatowiczLec 1.2
8/25/03
Original
Big Fishes Eating Little Fishes
![Page 3: CS252/Kubiatowicz Lec 1.1 8/25/03 CS252 Graduate Computer Architecture Lecture 1 Review of Technology Trends and Cost/Performance August 25, 2003 Prof](https://reader030.vdocument.in/reader030/viewer/2022032800/56649d4b5503460f94a2963e/html5/thumbnails/3.jpg)
CS252/KubiatowiczLec 1.3
8/25/03
1988 Computer Food Chain
PCWork-stationMini-
computer
Mainframe
Mini-supercomputer
Supercomputer
Massively Parallel
Processors
![Page 4: CS252/Kubiatowicz Lec 1.1 8/25/03 CS252 Graduate Computer Architecture Lecture 1 Review of Technology Trends and Cost/Performance August 25, 2003 Prof](https://reader030.vdocument.in/reader030/viewer/2022032800/56649d4b5503460f94a2963e/html5/thumbnails/4.jpg)
CS252/KubiatowiczLec 1.4
8/25/03
1998 Computer Food Chain
PCWork-station
Mainframe
Supercomputer
Mini-supercomputerMassively Parallel
Processors
Mini-computer
Now who is eating whom?
Server
![Page 5: CS252/Kubiatowicz Lec 1.1 8/25/03 CS252 Graduate Computer Architecture Lecture 1 Review of Technology Trends and Cost/Performance August 25, 2003 Prof](https://reader030.vdocument.in/reader030/viewer/2022032800/56649d4b5503460f94a2963e/html5/thumbnails/5.jpg)
CS252/KubiatowiczLec 1.5
8/25/03
Why Such Change in 10 years?
• Performance– Technology Advances
» CMOS VLSI dominates older technologies (TTL, ECL) in cost AND performance
– Computer architecture advances improves low-end » RISC, superscalar, RAID, …
• Price: Lower costs due to …– Simpler development
» CMOS VLSI: smaller systems, fewer components– Higher volumes
» CMOS VLSI : same dev. cost 10,000 vs. 10,000,000 units – Lower margins by class of computer, due to fewer services
• Function– Rise of networking/local interconnection technology
![Page 6: CS252/Kubiatowicz Lec 1.1 8/25/03 CS252 Graduate Computer Architecture Lecture 1 Review of Technology Trends and Cost/Performance August 25, 2003 Prof](https://reader030.vdocument.in/reader030/viewer/2022032800/56649d4b5503460f94a2963e/html5/thumbnails/6.jpg)
CS252/KubiatowiczLec 1.6
8/25/03
Amazing Underlying Technology Change
• “Cramming More Components onto Integrated Circuits”
– Gordon Moore, Electronics, 1965
![Page 7: CS252/Kubiatowicz Lec 1.1 8/25/03 CS252 Graduate Computer Architecture Lecture 1 Review of Technology Trends and Cost/Performance August 25, 2003 Prof](https://reader030.vdocument.in/reader030/viewer/2022032800/56649d4b5503460f94a2963e/html5/thumbnails/7.jpg)
CS252/KubiatowiczLec 1.7
8/25/03
Year
Tra
nsis
tors
1000
10000
100000
1000000
10000000
100000000
1970 1975 1980 1985 1990 1995 2000
i80386
i4004
i8080
Pentium
i80486
i80286
i8086
Technology Trends: Microprocessor Capacity
CMOS improvements:• Die size: 2X every 3 yrs• Line width: halve / 7 yrs
Pentium 4: 55 millionAlpha 21264: 15 millionPentium Pro: 5.5 millionPowerPC 620: 6.9 millionAlpha 21164: 9.3 millionSparc Ultra: 5.2 million
Moore’s Law
![Page 8: CS252/Kubiatowicz Lec 1.1 8/25/03 CS252 Graduate Computer Architecture Lecture 1 Review of Technology Trends and Cost/Performance August 25, 2003 Prof](https://reader030.vdocument.in/reader030/viewer/2022032800/56649d4b5503460f94a2963e/html5/thumbnails/8.jpg)
CS252/KubiatowiczLec 1.8
8/25/03
size
Year
Bit
s
1000
10000
100000
1000000
10000000
100000000
1000000000
1970 1975 1980 1985 1990 1995 2000
Memory Capacity (Single Chip DRAM)
year size(Mb) cyc time1980 0.0625 250 ns1983 0.25 220 ns1986 1 190 ns1989 4 165 ns1992 16 145 ns1996 64 120 ns2000 256 100 ns2003 1024 60 ns
![Page 9: CS252/Kubiatowicz Lec 1.1 8/25/03 CS252 Graduate Computer Architecture Lecture 1 Review of Technology Trends and Cost/Performance August 25, 2003 Prof](https://reader030.vdocument.in/reader030/viewer/2022032800/56649d4b5503460f94a2963e/html5/thumbnails/9.jpg)
CS252/KubiatowiczLec 1.9
8/25/03
Technology dramatic change• Processor
– logic capacity: about 30% per year– clock rate: about 20% per year
• Memory– DRAM capacity: about 60% per year (4x every 3
years)– Memory speed: about 10% per year– Cost per bit: improves about 25% per year
• Disk– capacity: about 60% per year– Total use of data: 100% per 9 months!
• Network Bandwidth– Bandwidth increasing more than 100% per year!
![Page 10: CS252/Kubiatowicz Lec 1.1 8/25/03 CS252 Graduate Computer Architecture Lecture 1 Review of Technology Trends and Cost/Performance August 25, 2003 Prof](https://reader030.vdocument.in/reader030/viewer/2022032800/56649d4b5503460f94a2963e/html5/thumbnails/10.jpg)
CS252/KubiatowiczLec 1.10
8/25/03
Computers in the News: New IBM Transistor
• Announced 12/10/02• 6nm gate length!!!• Details: Still to be announced
![Page 11: CS252/Kubiatowicz Lec 1.1 8/25/03 CS252 Graduate Computer Architecture Lecture 1 Review of Technology Trends and Cost/Performance August 25, 2003 Prof](https://reader030.vdocument.in/reader030/viewer/2022032800/56649d4b5503460f94a2963e/html5/thumbnails/11.jpg)
CS252/KubiatowiczLec 1.11
8/25/03
Processor PerformanceTrends
Microprocessors
Minicomputers
Mainframes
Supercomputers
Year
0.1
1
10
100
1000
1965 1970 1975 1980 1985 1990 1995 2000
![Page 12: CS252/Kubiatowicz Lec 1.1 8/25/03 CS252 Graduate Computer Architecture Lecture 1 Review of Technology Trends and Cost/Performance August 25, 2003 Prof](https://reader030.vdocument.in/reader030/viewer/2022032800/56649d4b5503460f94a2963e/html5/thumbnails/12.jpg)
CS252/KubiatowiczLec 1.12
8/25/03
0
200
400
600
800
1000
1200
87 88 89 90 91 92 93 94 95 96 97
DEC A
lpha
21164/6
00
DEC A
lpha
5/5
00
DEC A
lpha
5/3
00
DEC A
lpha
4/2
66
IBM
PO
WER 1
00
DEC A
XP/
500
HP
9000/7
50
Sun
-4/2
60
IBM
RS
/6000
MIP
S M
/120
MIP
S M
/2000
Processor Performance(1.35X before, 1.55X now)
1.54X/yr
![Page 13: CS252/Kubiatowicz Lec 1.1 8/25/03 CS252 Graduate Computer Architecture Lecture 1 Review of Technology Trends and Cost/Performance August 25, 2003 Prof](https://reader030.vdocument.in/reader030/viewer/2022032800/56649d4b5503460f94a2963e/html5/thumbnails/13.jpg)
CS252/KubiatowiczLec 1.13
8/25/03
Computer Architecture Is …
the attributes of a [computing] system as seen by the programmer, i.e., the conceptual structure and functional behavior, as distinct from the organization of the data flows and controls the logic design, and the physical implementation.
Amdahl, Blaaw, and Brooks, 1964
SOFTWARESOFTWARE
![Page 14: CS252/Kubiatowicz Lec 1.1 8/25/03 CS252 Graduate Computer Architecture Lecture 1 Review of Technology Trends and Cost/Performance August 25, 2003 Prof](https://reader030.vdocument.in/reader030/viewer/2022032800/56649d4b5503460f94a2963e/html5/thumbnails/14.jpg)
CS252/KubiatowiczLec 1.14
8/25/03
Computer Architecture’s Changing Definition
• 1950s to 1960s: Computer Architecture Course: Computer Arithmetic
• 1970s to mid 1980s: Computer Architecture Course: Instruction Set Design, especially ISA appropriate for compilers
• 1990s: Computer Architecture Course:Design of CPU, memory system, I/O system, Multiprocessors, Networks
• 2010s: Computer Architecture Course: Self adapting systems? Self organizing structures?DNA Systems/Quantum Computing?
![Page 15: CS252/Kubiatowicz Lec 1.1 8/25/03 CS252 Graduate Computer Architecture Lecture 1 Review of Technology Trends and Cost/Performance August 25, 2003 Prof](https://reader030.vdocument.in/reader030/viewer/2022032800/56649d4b5503460f94a2963e/html5/thumbnails/15.jpg)
CS252/KubiatowiczLec 1.15
8/25/03
Instruction Set Architecture (ISA)
instruction set
software
hardware
![Page 16: CS252/Kubiatowicz Lec 1.1 8/25/03 CS252 Graduate Computer Architecture Lecture 1 Review of Technology Trends and Cost/Performance August 25, 2003 Prof](https://reader030.vdocument.in/reader030/viewer/2022032800/56649d4b5503460f94a2963e/html5/thumbnails/16.jpg)
CS252/KubiatowiczLec 1.16
8/25/03
Evolution of Instruction Sets
Single Accumulator (EDSAC 1950)
Accumulator + Index Registers(Manchester Mark I, IBM 700 series 1953)
Separation of Programming Model from Implementation
High-level Language Based Concept of a Family(B5000 1963) (IBM 360 1964)
General Purpose Register Machines
Complex Instruction Sets Load/Store Architecture
RISC
(Vax, Intel 432 1977-80) (CDC 6600, Cray 1 1963-76)
(Mips,Sparc,HP-PA,IBM RS6000, . . .1987)
![Page 17: CS252/Kubiatowicz Lec 1.1 8/25/03 CS252 Graduate Computer Architecture Lecture 1 Review of Technology Trends and Cost/Performance August 25, 2003 Prof](https://reader030.vdocument.in/reader030/viewer/2022032800/56649d4b5503460f94a2963e/html5/thumbnails/17.jpg)
CS252/KubiatowiczLec 1.17
8/25/03
Interface Design
A good interface:
• Lasts through many implementations (portability, compatibility)
• Is used in many differeny ways (generality)
• Provides convenient functionality to higher levels
• Permits an efficient implementation at lower levels
Interfaceimp 1
imp 2
imp 3
use
use
use
time
![Page 18: CS252/Kubiatowicz Lec 1.1 8/25/03 CS252 Graduate Computer Architecture Lecture 1 Review of Technology Trends and Cost/Performance August 25, 2003 Prof](https://reader030.vdocument.in/reader030/viewer/2022032800/56649d4b5503460f94a2963e/html5/thumbnails/18.jpg)
CS252/KubiatowiczLec 1.18
8/25/03
Virtualization:One of the lessons of RISC
• Integrated Systems Approach – What really matters is the functioning of the complete system,
I.e. hardware, runtime system, compiler, and operating system– In networking, this is called the “End to End argument”– Programmers care about high-level languages, debuggers,
source-level object-oriented programming
• Computer architecture is not just about transistors, individual instructions, or particular implementations
• Original RISC projects replaced complex instructions with a compiler + simple instructions
• Logical Extension => Genetically adaptive runtime systems enhanced by dynamic compilation running on reconfigurable hardware? Perhaps.
![Page 19: CS252/Kubiatowicz Lec 1.1 8/25/03 CS252 Graduate Computer Architecture Lecture 1 Review of Technology Trends and Cost/Performance August 25, 2003 Prof](https://reader030.vdocument.in/reader030/viewer/2022032800/56649d4b5503460f94a2963e/html5/thumbnails/19.jpg)
CS252/KubiatowiczLec 1.19
8/25/03
Computer Architecture Topics
Instruction Set Architecture
Pipelining, Hazard Resolution,Superscalar, Reordering, Prediction, Speculation,Vector, Dynamic Compilation
Addressing,Protection,Exception Handling
L1 Cache
L2 Cache
DRAM
Disks, WORM, Tape
Coherence,Bandwidth,Latency
Emerging TechnologiesInterleavingBus protocols
RAID
VLSI
Input/Output and Storage
MemoryHierarchy
Pipelining and Instruction Level Parallelism
NetworkCommunication
Oth
er
Pro
cessors
![Page 20: CS252/Kubiatowicz Lec 1.1 8/25/03 CS252 Graduate Computer Architecture Lecture 1 Review of Technology Trends and Cost/Performance August 25, 2003 Prof](https://reader030.vdocument.in/reader030/viewer/2022032800/56649d4b5503460f94a2963e/html5/thumbnails/20.jpg)
CS252/KubiatowiczLec 1.20
8/25/03
Sample Organization: It’s all about
communication
Proc
CachesBusses
Memory
I/O Devices:
Controllers
adapters
DisksDisplaysKeyboards
Networks
Pentium III Chipset
![Page 21: CS252/Kubiatowicz Lec 1.1 8/25/03 CS252 Graduate Computer Architecture Lecture 1 Review of Technology Trends and Cost/Performance August 25, 2003 Prof](https://reader030.vdocument.in/reader030/viewer/2022032800/56649d4b5503460f94a2963e/html5/thumbnails/21.jpg)
CS252/KubiatowiczLec 1.21
8/25/03
Computer Architecture Topics
M
Interconnection NetworkS
PMPMPMP° ° °
Topologies,Routing,Bandwidth,Latency,Reliability
Network Interfaces
Shared Memory,Message Passing,Data Parallelism
Processor-Memory-Switch
MultiprocessorsNetworks and Interconnections
![Page 22: CS252/Kubiatowicz Lec 1.1 8/25/03 CS252 Graduate Computer Architecture Lecture 1 Review of Technology Trends and Cost/Performance August 25, 2003 Prof](https://reader030.vdocument.in/reader030/viewer/2022032800/56649d4b5503460f94a2963e/html5/thumbnails/22.jpg)
CS252/KubiatowiczLec 1.22
8/25/03
CS 252 Course FocusUnderstanding the design techniques, machine structures,
technology factors, evaluation methods that will determine the form of computers in 21st Century
Technology ProgrammingLanguages
OperatingSystems History
Applications Interface Design(ISA)
Measurement & Evaluation
Parallelism
Computer Architecture:• Instruction Set Design• Organization• Hardware/Software Boundary
Compilers
![Page 23: CS252/Kubiatowicz Lec 1.1 8/25/03 CS252 Graduate Computer Architecture Lecture 1 Review of Technology Trends and Cost/Performance August 25, 2003 Prof](https://reader030.vdocument.in/reader030/viewer/2022032800/56649d4b5503460f94a2963e/html5/thumbnails/23.jpg)
CS252/KubiatowiczLec 1.23
8/25/03
Topic CoverageTextbook: Hennessy and Patterson, Computer Architecture: A Quantitative Approach, 3rd Ed., 2002.
Research Papers -- Handed out in class• 1.5 weeks Review: Fundamentals of Computer Architecture (Ch.
1), Instruction Set Architecture (Ch. 2), Pipelining (Ch. 3)• 2.5 weeks: Pipelining, Interrupts, and Instructional Level
Parallelism (Ch. 4), Vector Processors (Appendix B).
• 1.5 weeks: Dynamic Compilation. Data Speculation (papers). Complexity, design via genetic algorithms
• 1 week: Memory Hierarchy (Chapter 5)• 1.5 weeks: Fault Tolerance, Input/Output and Storage (Ch. 6)• 1.5 weeks: Networks and Interconnection Technology (Ch. 7)• 1.5 weeks: Multiprocessors (Ch. 8 + Research papers + Culler
book draft Chapter 1) • 1 week: Quantum Computing, DNA Computing
![Page 24: CS252/Kubiatowicz Lec 1.1 8/25/03 CS252 Graduate Computer Architecture Lecture 1 Review of Technology Trends and Cost/Performance August 25, 2003 Prof](https://reader030.vdocument.in/reader030/viewer/2022032800/56649d4b5503460f94a2963e/html5/thumbnails/24.jpg)
CS252/KubiatowiczLec 1.24
8/25/03
CS252: InformationInstructor:Prof John D. Kubiatowicz
Office: 673 Soda Hall, 643-6817 kubitron@cs
Office Hours: Wed 3:30 - 5:00 or by appt.
(Contact Veronique Richards, 642-4334, nicou@cs,
676 Soda)
T. A: TBA
Class: Mon/Wed, 1:00 - 2:30pm 310 Soda Hall
Text: Computer Architecture: A Quantitative Approach, Third Edition (2002)
Web page: http://www.cs/~kubitron/courses/cs252-F03/
Lectures available online <11:30AM day of lecture
Newsgroup: ucb.class.cs252
Email: [email protected]
![Page 25: CS252/Kubiatowicz Lec 1.1 8/25/03 CS252 Graduate Computer Architecture Lecture 1 Review of Technology Trends and Cost/Performance August 25, 2003 Prof](https://reader030.vdocument.in/reader030/viewer/2022032800/56649d4b5503460f94a2963e/html5/thumbnails/25.jpg)
CS252/KubiatowiczLec 1.25
8/25/03
Lecture style
• 1-Minute Review • 20-Minute Lecture/Discussion• 5- Minute Administrative Matters• 25-Minute Lecture/Discussion• 5-Minute Break (water, stretch)• 25-Minute Lecture/Discussion• Instructor will come to class early & stay after
to answer questions
Attention
Time
20 min. Break“In Conclusion, ...”
![Page 26: CS252/Kubiatowicz Lec 1.1 8/25/03 CS252 Graduate Computer Architecture Lecture 1 Review of Technology Trends and Cost/Performance August 25, 2003 Prof](https://reader030.vdocument.in/reader030/viewer/2022032800/56649d4b5503460f94a2963e/html5/thumbnails/26.jpg)
CS252/KubiatowiczLec 1.26
8/25/03
Grading• 10% Homeworks (work in pairs)• 40% Examinations (2 Midterms)• 40% Research Project (work in pairs)
– Transition from undergrad to grad student– Berkeley wants you to succeed, but you need to show
initiative– pick topic– meet 3 times with faculty/TA to see progress– give oral presentation– give poster session– written report like conference paper– 3 weeks work full time for 2 people– Opportunity to do “research in the small” to help make
transition from good student to research colleague
• 10% Class Participation
![Page 27: CS252/Kubiatowicz Lec 1.1 8/25/03 CS252 Graduate Computer Architecture Lecture 1 Review of Technology Trends and Cost/Performance August 25, 2003 Prof](https://reader030.vdocument.in/reader030/viewer/2022032800/56649d4b5503460f94a2963e/html5/thumbnails/27.jpg)
CS252/KubiatowiczLec 1.27
8/25/03
Quizes
• Reduce the pressure of taking quizes– Only 2 Graded Quizes:
Tentative: Wed Oct 13th and Wed Dec 1st – Our goal: test knowledge vs. speed writing– 3 hrs to take 1.5-hr test (5:30-8:30 PM, TBA location)– Both mid-term quizes can bring summary sheet
» Transfer ideas from book to paper
– Last chance Q&A: during class time day of exam
• Students/Staff meet over free pizza/drinks at La Vals: Wed Oct 13th (8:30 PM) and Wed Dec 1st (8:30 PM)
![Page 28: CS252/Kubiatowicz Lec 1.1 8/25/03 CS252 Graduate Computer Architecture Lecture 1 Review of Technology Trends and Cost/Performance August 25, 2003 Prof](https://reader030.vdocument.in/reader030/viewer/2022032800/56649d4b5503460f94a2963e/html5/thumbnails/28.jpg)
CS252/KubiatowiczLec 1.28
8/25/03
Research Paper Reading• As graduate students, you are now
researchers.• Most information of importance to you will be
in research papers.• Ability to rapidly scan and understand
research papers is key to your success.
• So: you will read lots of papers in this course!– Quick 1 paragraph summaries will be due in class– Important supplement to book.– Will discuss papers in class
• Papers will be scanned and on web page.
![Page 29: CS252/Kubiatowicz Lec 1.1 8/25/03 CS252 Graduate Computer Architecture Lecture 1 Review of Technology Trends and Cost/Performance August 25, 2003 Prof](https://reader030.vdocument.in/reader030/viewer/2022032800/56649d4b5503460f94a2963e/html5/thumbnails/29.jpg)
CS252/KubiatowiczLec 1.29
8/25/03
More Course Info• Everything is on the course Web page:
www.cs.berkeley.edu/~kubitron/courses/cs252-F03
• Notes:– Not sure what the state of textbooks at Student Center.– The course Web page includes a pointer to last term’s 152
home page. The “handout” page includes pointers to old 152 quizes.
• Schedule:– 2 Graded Quizes: Mon Oct 13th and Mon Dec 1st – Veteran’s Day: Friday Nov 5th – Thanksgiving Vacation: Thur Nov 27th - Sun Nov 28th – Oral Presentations: Tue/Wed Dec 9/10th – 252 Last lecture: Fri Dec 3rd – 252 Poster Session: ???– Project Papers/URLs due: Fri Dec 12th
• Project Suggestions: TBA
![Page 30: CS252/Kubiatowicz Lec 1.1 8/25/03 CS252 Graduate Computer Architecture Lecture 1 Review of Technology Trends and Cost/Performance August 25, 2003 Prof](https://reader030.vdocument.in/reader030/viewer/2022032800/56649d4b5503460f94a2963e/html5/thumbnails/30.jpg)
CS252/KubiatowiczLec 1.30
8/25/03
Related Courses
CS 152CS 152 CS 252CS 252 CS 258CS 258
CS 250CS 250
How to build itImplementation details
Why, Analysis,Evaluation
Parallel Architectures,Languages, Systems
Integrated Circuit Technologyfrom a computer-organization viewpoint
Strong
Prerequisite
Basic knowledge of theorganization of a computeris assumed!
![Page 31: CS252/Kubiatowicz Lec 1.1 8/25/03 CS252 Graduate Computer Architecture Lecture 1 Review of Technology Trends and Cost/Performance August 25, 2003 Prof](https://reader030.vdocument.in/reader030/viewer/2022032800/56649d4b5503460f94a2963e/html5/thumbnails/31.jpg)
CS252/KubiatowiczLec 1.31
8/25/03
Coping with CS 252
• Too many students with too varied background?
– Next Wednesday - Prequisite exam
• Limiting Number of Students– First priority is CS/ EECS grad students taking prelims– Second priority is N-th year CS/ EECS grad students
(breadth)– Third priority is College of Engineering grad students– Fourth priority is CS/EECS undergraduate seniors
(Note: 1 graduate course unit = 2 undergraduate course units)
– All other categories
• If not this semester, 252 is offered regularly
![Page 32: CS252/Kubiatowicz Lec 1.1 8/25/03 CS252 Graduate Computer Architecture Lecture 1 Review of Technology Trends and Cost/Performance August 25, 2003 Prof](https://reader030.vdocument.in/reader030/viewer/2022032800/56649d4b5503460f94a2963e/html5/thumbnails/32.jpg)
CS252/KubiatowiczLec 1.32
8/25/03
Coping with CS 252• Students with too varied background?
– In past, CS grad students took written prelim exams on undergraduate material in hardware, software, and theory
– 1st 5 weeks reviewed background, helped 252, 262, 270– Prelims were dropped => some unprepared for CS 252?
• In class exam on Wednesday September 3rd – Doesn’t affect grade, only admission into class– 2 grades: Admitted or audit/take CS 152 1st– Improve your experience if recapture common background
• Review: Chapters 1-3, CS 152 home page, maybe “Computer Organization and Design (COD)2/e”
– Chapters 1 to 8 of COD if never took prerequisite– If took a class, be sure COD Chapters 2, 6, 7 are familiar– Copies in Bechtel Library on 2-hour reserve– Last exam on previous-year’s web site
(~kubitron/courses/cs252-F00)
![Page 33: CS252/Kubiatowicz Lec 1.1 8/25/03 CS252 Graduate Computer Architecture Lecture 1 Review of Technology Trends and Cost/Performance August 25, 2003 Prof](https://reader030.vdocument.in/reader030/viewer/2022032800/56649d4b5503460f94a2963e/html5/thumbnails/33.jpg)
CS252/KubiatowiczLec 1.33
8/25/03
Building Hardwarethat Computes
![Page 34: CS252/Kubiatowicz Lec 1.1 8/25/03 CS252 Graduate Computer Architecture Lecture 1 Review of Technology Trends and Cost/Performance August 25, 2003 Prof](https://reader030.vdocument.in/reader030/viewer/2022032800/56649d4b5503460f94a2963e/html5/thumbnails/34.jpg)
CS252/KubiatowiczLec 1.34
8/25/03
Finite State Machines:• System state is explicit in representation• Transitions between states represented as
arrows with inputs on arcs.• Output may be either part of state or on arcs
Alpha/
0
Delta/
2
Beta/
10
1
1
0
0
1
“Mod 3 Machine”
Input (MSB first)
0 1 0 1 00 1 2 2
1
106
Mod 3
1
1
1 1
0
![Page 35: CS252/Kubiatowicz Lec 1.1 8/25/03 CS252 Graduate Computer Architecture Lecture 1 Review of Technology Trends and Cost/Performance August 25, 2003 Prof](https://reader030.vdocument.in/reader030/viewer/2022032800/56649d4b5503460f94a2963e/html5/thumbnails/35.jpg)
CS252/KubiatowiczLec 1.35
8/25/03
“M
eale
y M
ach
ine”“M
oore
Mach
ine”
Implementation as Combinational logic +
LatchAlpha/
0
Delta/
2
Beta/
10/0
1/0
1/1
0/10/0
1/1
Latc
h
Com
bin
ati
on
al
Log
ic
I nput Stateold Statenew Div
000
000110
001001
001
111
000110
010010
011
![Page 36: CS252/Kubiatowicz Lec 1.1 8/25/03 CS252 Graduate Computer Architecture Lecture 1 Review of Technology Trends and Cost/Performance August 25, 2003 Prof](https://reader030.vdocument.in/reader030/viewer/2022032800/56649d4b5503460f94a2963e/html5/thumbnails/36.jpg)
CS252/KubiatowiczLec 1.36
8/25/03
Microprogrammed Controllers• State machine in which part of state is a “micro-pc”.
– Explicit circuitry for incrementing or changing PC
• Includes a ROM with “microinstructions”.– Controlled logic implements at least branches and jumps
RO
M(In
stru
ctio
ns)
Addr
BranchPC
+ 1
MUX
Next Address
Control
0: forw 35 xxx1: b_no_obstacles 0002: back 10 xxx3: rotate 90 xxx4: goto 001
Instruction Branch
Com
bin
ati
on
al Log
ic/
Con
trolled
Mach
ineS
tate
w/ A
dd
ress
![Page 37: CS252/Kubiatowicz Lec 1.1 8/25/03 CS252 Graduate Computer Architecture Lecture 1 Review of Technology Trends and Cost/Performance August 25, 2003 Prof](https://reader030.vdocument.in/reader030/viewer/2022032800/56649d4b5503460f94a2963e/html5/thumbnails/37.jpg)
CS252/KubiatowiczLec 1.37
8/25/03
Execution Cycle
Instruction
Fetch
Instruction
Decode
Operand
Fetch
Execute
Result
Store
Next
Instruction
Obtain instruction from program storage
Determine required actions and instruction size
Locate and obtain operand data
Compute result value or status
Deposit results in storage for later use
Determine successor instruction
![Page 38: CS252/Kubiatowicz Lec 1.1 8/25/03 CS252 Graduate Computer Architecture Lecture 1 Review of Technology Trends and Cost/Performance August 25, 2003 Prof](https://reader030.vdocument.in/reader030/viewer/2022032800/56649d4b5503460f94a2963e/html5/thumbnails/38.jpg)
CS252/KubiatowiczLec 1.38
8/25/03
What’s a Clock Cycle?
• Old days: 10 levels of gates• Today: determined by numerous time-
of-flight issues + gate delays– clock propagation, wire lengths, drivers
Latchor
register
combinationallogic
![Page 39: CS252/Kubiatowicz Lec 1.1 8/25/03 CS252 Graduate Computer Architecture Lecture 1 Review of Technology Trends and Cost/Performance August 25, 2003 Prof](https://reader030.vdocument.in/reader030/viewer/2022032800/56649d4b5503460f94a2963e/html5/thumbnails/39.jpg)
CS252/KubiatowiczLec 1.39
8/25/03
Pipelined Instruction Interpretation
Instruction Register
Operand Registers
Instruction Address
Result Registers
Next Instruction
Instruction Fetch
Decode &Operand Fetch
Execute
Store Results
NIIF
DE
W
NIIF
DE
W
NIIF
DE
W
NIIF
DE
W
NIIF
DE
W
Time
Registers or Mem
![Page 40: CS252/Kubiatowicz Lec 1.1 8/25/03 CS252 Graduate Computer Architecture Lecture 1 Review of Technology Trends and Cost/Performance August 25, 2003 Prof](https://reader030.vdocument.in/reader030/viewer/2022032800/56649d4b5503460f94a2963e/html5/thumbnails/40.jpg)
CS252/KubiatowiczLec 1.40
8/25/03
Sequential Laundry
• Sequential laundry takes 6 hours for 4 loads• If they learned pipelining, how long would laundry take?
A
B
C
D
30 40 2030 40 2030 40 2030 40 20
6 PM 7 8 9 10 11 Midnight
Task
Order
Time
![Page 41: CS252/Kubiatowicz Lec 1.1 8/25/03 CS252 Graduate Computer Architecture Lecture 1 Review of Technology Trends and Cost/Performance August 25, 2003 Prof](https://reader030.vdocument.in/reader030/viewer/2022032800/56649d4b5503460f94a2963e/html5/thumbnails/41.jpg)
CS252/KubiatowiczLec 1.41
8/25/03
Pipelined LaundryStart work ASAP
• Pipelined laundry takes 3.5 hours for 4 loads
A
B
C
D
6 PM 7 8 9 10 11 Midnight
Task
Order
Time
30 40 40 40 40 20
![Page 42: CS252/Kubiatowicz Lec 1.1 8/25/03 CS252 Graduate Computer Architecture Lecture 1 Review of Technology Trends and Cost/Performance August 25, 2003 Prof](https://reader030.vdocument.in/reader030/viewer/2022032800/56649d4b5503460f94a2963e/html5/thumbnails/42.jpg)
CS252/KubiatowiczLec 1.42
8/25/03
Pipelining Lessons• Pipelining doesn’t help
latency of single task, it helps throughput of entire workload
• Pipeline rate limited by slowest pipeline stage
• Multiple tasks operating simultaneously
• Potential speedup = Number pipe stages
• Unbalanced lengths of pipe stages reduces speedup
• Time to “fill” pipeline and time to “drain” it reduces speedup
A
B
C
D
6 PM 7 8 9
Task
Order
Time
30 40 40 40 40 20
![Page 43: CS252/Kubiatowicz Lec 1.1 8/25/03 CS252 Graduate Computer Architecture Lecture 1 Review of Technology Trends and Cost/Performance August 25, 2003 Prof](https://reader030.vdocument.in/reader030/viewer/2022032800/56649d4b5503460f94a2963e/html5/thumbnails/43.jpg)
CS252/KubiatowiczLec 1.43
8/25/03
The Process of Design
Design
Analysis
Architecture is an iterative process:• Searching the space of possible designs• At all levels of computer systems
Creativity
Good IdeasGood Ideas
Mediocre IdeasBad Ideas
Cost /PerformanceAnalysis
![Page 44: CS252/Kubiatowicz Lec 1.1 8/25/03 CS252 Graduate Computer Architecture Lecture 1 Review of Technology Trends and Cost/Performance August 25, 2003 Prof](https://reader030.vdocument.in/reader030/viewer/2022032800/56649d4b5503460f94a2963e/html5/thumbnails/44.jpg)
CS252/KubiatowiczLec 1.44
8/25/03
Measurement Tools
• Benchmarks, Traces, Mixes• Hardware: Cost, delay, area, power
estimation• Simulation (many levels)
– ISA, RT, Gate, Circuit
• Queuing Theory• Rules of Thumb• Fundamental “Laws”/Principles
![Page 45: CS252/Kubiatowicz Lec 1.1 8/25/03 CS252 Graduate Computer Architecture Lecture 1 Review of Technology Trends and Cost/Performance August 25, 2003 Prof](https://reader030.vdocument.in/reader030/viewer/2022032800/56649d4b5503460f94a2963e/html5/thumbnails/45.jpg)
CS252/KubiatowiczLec 1.45
8/25/03
The Bottom Line: Performance (and Cost)
• Time to run the task (ExTime)– Execution time, response time, latency
• Tasks per day, hour, week, sec, ns … (Performance)
– Throughput, bandwidth
Plane
Boeing 747
BAD/Sud Concodre
Speed
610 mph
1350 mph
DC to Paris
6.5 hours
3 hours
Passengers
470
132
Throughput (pmph)
286,700
178,200
![Page 46: CS252/Kubiatowicz Lec 1.1 8/25/03 CS252 Graduate Computer Architecture Lecture 1 Review of Technology Trends and Cost/Performance August 25, 2003 Prof](https://reader030.vdocument.in/reader030/viewer/2022032800/56649d4b5503460f94a2963e/html5/thumbnails/46.jpg)
CS252/KubiatowiczLec 1.46
8/25/03
Performance(X) Execution_time(Y)
n = =
Performance(Y) Execution_time(Y)
Definitions•Performance is in units of things per sec
– bigger is better
•If we are primarily concerned with response time–performance(x) = 1
execution_time(x)
" X is n times faster than Y" means
![Page 47: CS252/Kubiatowicz Lec 1.1 8/25/03 CS252 Graduate Computer Architecture Lecture 1 Review of Technology Trends and Cost/Performance August 25, 2003 Prof](https://reader030.vdocument.in/reader030/viewer/2022032800/56649d4b5503460f94a2963e/html5/thumbnails/47.jpg)
CS252/KubiatowiczLec 1.47
8/25/03
Amdahl’s Law
enhanced
enhancedenhanced
new
oldoverall
Speedup
Fraction Fraction
1
ExTimeExTime
Speedup
1
Best you could ever hope to do:
enhancedmaximum Fraction - 1
1 Speedup
enhanced
enhancedenhancedoldnew Speedup
FractionFraction ExTime ExTime 1
![Page 48: CS252/Kubiatowicz Lec 1.1 8/25/03 CS252 Graduate Computer Architecture Lecture 1 Review of Technology Trends and Cost/Performance August 25, 2003 Prof](https://reader030.vdocument.in/reader030/viewer/2022032800/56649d4b5503460f94a2963e/html5/thumbnails/48.jpg)
CS252/KubiatowiczLec 1.48
8/25/03
Metrics of Performance
Compiler
Programming Language
Application
DatapathControl
TransistorsWiresPins
ISA
Function Units
(millions) of Instructions per second: MIPS(millions) of (FP) operations per second: MFLOP/s
Cycles per second (clock rate)
Megabytes per second
Answers per monthOperations per second
![Page 49: CS252/Kubiatowicz Lec 1.1 8/25/03 CS252 Graduate Computer Architecture Lecture 1 Review of Technology Trends and Cost/Performance August 25, 2003 Prof](https://reader030.vdocument.in/reader030/viewer/2022032800/56649d4b5503460f94a2963e/html5/thumbnails/49.jpg)
CS252/KubiatowiczLec 1.49
8/25/03
Computer Performance
CPU time = Seconds = Instructions x Cycles x Seconds
Program Program Instruction Cycle
CPU time = Seconds = Instructions x Cycles x Seconds
Program Program Instruction Cycle
Inst Count CPI Clock RateProgram X
Compiler X (X)
Inst. Set. X X
Organization X X
Technology X
inst count
CPI
Cycle time
![Page 50: CS252/Kubiatowicz Lec 1.1 8/25/03 CS252 Graduate Computer Architecture Lecture 1 Review of Technology Trends and Cost/Performance August 25, 2003 Prof](https://reader030.vdocument.in/reader030/viewer/2022032800/56649d4b5503460f94a2963e/html5/thumbnails/50.jpg)
CS252/KubiatowiczLec 1.50
8/25/03
Cycles Per Instruction(Throughput)
“Instruction Frequency”
CPI = (CPU Time * Clock Rate) / Instruction Count = Cycles / Instruction Count
“Average Cycles per Instruction”
j
n
jj I CPI TimeCycle time CPU
1
Count nInstructio
I F where F CPI CPI j
j
n
jjj
1
![Page 51: CS252/Kubiatowicz Lec 1.1 8/25/03 CS252 Graduate Computer Architecture Lecture 1 Review of Technology Trends and Cost/Performance August 25, 2003 Prof](https://reader030.vdocument.in/reader030/viewer/2022032800/56649d4b5503460f94a2963e/html5/thumbnails/51.jpg)
CS252/KubiatowiczLec 1.51
8/25/03
Example: Calculating CPI bottom up
Typical Mix of instruction typesin program
Base Machine (Reg / Reg)
Op Freq Cycles CPI(i) (% Time)
ALU 50% 1 .5 (33%)
Load 20% 2 .4 (27%)
Store 10% 2 .2 (13%)
Branch 20% 2 .4 (27%)
1.5
![Page 52: CS252/Kubiatowicz Lec 1.1 8/25/03 CS252 Graduate Computer Architecture Lecture 1 Review of Technology Trends and Cost/Performance August 25, 2003 Prof](https://reader030.vdocument.in/reader030/viewer/2022032800/56649d4b5503460f94a2963e/html5/thumbnails/52.jpg)
CS252/KubiatowiczLec 1.52
8/25/03
Example: Branch Stall Impact
• Assume CPI = 1.0 ignoring branches (ideal)• Assume solution was stalling for 3 cycles• If 30% branch, Stall 3 cycles on 30%
Op Freq Cycles CPI(i) (% Time)Other 70% 1 .7 (37%)Branch30% 4 1.2 (63%)
new CPI = 1.9
• New machine is 1/1.9 = 0.52 times faster (i.e. slow!)
![Page 53: CS252/Kubiatowicz Lec 1.1 8/25/03 CS252 Graduate Computer Architecture Lecture 1 Review of Technology Trends and Cost/Performance August 25, 2003 Prof](https://reader030.vdocument.in/reader030/viewer/2022032800/56649d4b5503460f94a2963e/html5/thumbnails/53.jpg)
CS252/KubiatowiczLec 1.53
8/25/03
Speed Up Equation for Pipelining
pipelined
dunpipeline
TimeCycle
TimeCycle
CPI stall Pipeline CPI Idealdepth Pipeline CPI Ideal
Speedup
pipelined
dunpipeline
TimeCycle
TimeCycle
CPI stall Pipeline 1depth Pipeline
Speedup
Instper cycles Stall Average CPI Ideal CPIpipelined
For simple RISC pipeline, CPI = 1:
![Page 54: CS252/Kubiatowicz Lec 1.1 8/25/03 CS252 Graduate Computer Architecture Lecture 1 Review of Technology Trends and Cost/Performance August 25, 2003 Prof](https://reader030.vdocument.in/reader030/viewer/2022032800/56649d4b5503460f94a2963e/html5/thumbnails/54.jpg)
CS252/KubiatowiczLec 1.54
8/25/03
SPEC: System Performance Evaluation Cooperative
• First Round 1989– 10 programs yielding a single number (“SPECmarks”)
• Second Round 1992– SPECInt92 (6 integer programs) and SPECfp92 (14 floating point
programs)» Compiler Flags unlimited. March 93 of DEC 4000 Model 610:
spice: unix.c:/def=(sysv,has_bcopy,”bcopy(a,b,c)=memcpy(b,a,c)”wave5: /ali=(all,dcom=nat)/ag=a/ur=4/ur=200nasa7: /norecu/ag=a/ur=4/ur2=200/lc=blas
• Third Round 1995– new set of programs: SPECint95 (8 integer programs) and
SPECfp95 (10 floating point) – “benchmarks useful for 3 years”– Single flag setting for all programs: SPECint_base95,
SPECfp_base95
• Fourth Round 2000: 26 apps– analysis and simulation programs– Compression: bzip2, gzip, – Integrated circuit layout, ray tracing, lots of others
![Page 55: CS252/Kubiatowicz Lec 1.1 8/25/03 CS252 Graduate Computer Architecture Lecture 1 Review of Technology Trends and Cost/Performance August 25, 2003 Prof](https://reader030.vdocument.in/reader030/viewer/2022032800/56649d4b5503460f94a2963e/html5/thumbnails/55.jpg)
CS252/KubiatowiczLec 1.55
8/25/03
How to Summarize Performance
• Arithmetic mean (weighted arithmetic mean) tracks execution time:
(Ti)/n or (Wi*Ti)• Harmonic mean (weighted harmonic mean) of
rates (e.g., MFLOPS) tracks execution time: n/(1/Ri) or n/(Wi/Ri)
• Normalized execution time is handy for scaling performance (e.g., X times faster than SPARCstation 10)
• But do not take the arithmetic mean of normalized execution time, use the geometric mean:
( Tj / Nj )1/n
![Page 56: CS252/Kubiatowicz Lec 1.1 8/25/03 CS252 Graduate Computer Architecture Lecture 1 Review of Technology Trends and Cost/Performance August 25, 2003 Prof](https://reader030.vdocument.in/reader030/viewer/2022032800/56649d4b5503460f94a2963e/html5/thumbnails/56.jpg)
CS252/KubiatowiczLec 1.56
8/25/03
SPEC First Round• One program: 99% of time in single line of
code• New front-end compiler could improve
dramatically
Benchmark
SP
EC
Pe
rf
0
100
200
300
400
500
600
700
800
gcc
epre
sso
spic
e
doduc
nasa
7 li
eqnto
tt
matr
ix300
fpppp
tom
catv
![Page 57: CS252/Kubiatowicz Lec 1.1 8/25/03 CS252 Graduate Computer Architecture Lecture 1 Review of Technology Trends and Cost/Performance August 25, 2003 Prof](https://reader030.vdocument.in/reader030/viewer/2022032800/56649d4b5503460f94a2963e/html5/thumbnails/57.jpg)
CS252/KubiatowiczLec 1.57
8/25/03
Performance Evaluation• “For better or worse, benchmarks shape a field”• Good products created when have:
– Good benchmarks– Good ways to summarize performance
• Given sales is a function in part of performance relative to competition, investment in improving product as reported by performance summary
• If benchmarks/summary inadequate, then choose between improving product for real programs vs. improving product to get more sales;Sales almost always wins!
• Execution time is the measure of computer performance!
![Page 58: CS252/Kubiatowicz Lec 1.1 8/25/03 CS252 Graduate Computer Architecture Lecture 1 Review of Technology Trends and Cost/Performance August 25, 2003 Prof](https://reader030.vdocument.in/reader030/viewer/2022032800/56649d4b5503460f94a2963e/html5/thumbnails/58.jpg)
CS252/KubiatowiczLec 1.58
8/25/03
Integrated Circuits Costs
Die Cost goes roughly with die area4
Test_Die Die_Area 2
Wafer_diam
Die_Area
2m/2)(Wafer_dia wafer per Dies
Die_area sityDefect_Den
1 dWafer_yiel YieldDie
yieldtest Finalcost Packaging cost Testingcost Die
cost IC
yield Die Wafer per DiescostWafer
cost Die
![Page 59: CS252/Kubiatowicz Lec 1.1 8/25/03 CS252 Graduate Computer Architecture Lecture 1 Review of Technology Trends and Cost/Performance August 25, 2003 Prof](https://reader030.vdocument.in/reader030/viewer/2022032800/56649d4b5503460f94a2963e/html5/thumbnails/59.jpg)
CS252/KubiatowiczLec 1.59
8/25/03
Real World Examples
Chip Metal Line Wafer Defect Area Dies/ Yield Die Cost layers width cost /cm2 mm2 wafer
386DX 2 0.90 $900 1.0 43 360 71% $4
486DX2 3 0.80 $1200 1.0 81 181 54% $12
PowerPC 601 4 0.80 $1700 1.3 121 115 28% $53
HP PA 7100 3 0.80 $1300 1.0 196 66 27% $73
DEC Alpha 3 0.70 $1500 1.2 234 53 19% $149
SuperSPARC 3 0.70 $1700 1.6 256 48 13% $272
Pentium 3 0.80 $1500 1.5 296 40 9% $417
– From "Estimating IC Manufacturing Costs,” by Linley Gwennap, Microprocessor Report, August 2, 1993, p. 15
![Page 60: CS252/Kubiatowicz Lec 1.1 8/25/03 CS252 Graduate Computer Architecture Lecture 1 Review of Technology Trends and Cost/Performance August 25, 2003 Prof](https://reader030.vdocument.in/reader030/viewer/2022032800/56649d4b5503460f94a2963e/html5/thumbnails/60.jpg)
CS252/KubiatowiczLec 1.60
8/25/03
Summary, #1• Designing to Last through Trends
Capacity Speed
Logic 2x in 3 years 2x in 3 years
SPEC RATING: 2x in 1.5 years
DRAM 4x in 3 years 2x in 10 years
Disk 4x in 3 years 2x in 10 years
• 6yrs to graduate => 16X CPU speed, DRAM/Disk size
• Time to run the task– Execution time, response time, latency
• Tasks per day, hour, week, sec, ns, …– Throughput, bandwidth
• “X is n times faster than Y” means ExTime(Y) Performance(X)
--------- = --------------
ExTime(X) Performance(Y)
![Page 61: CS252/Kubiatowicz Lec 1.1 8/25/03 CS252 Graduate Computer Architecture Lecture 1 Review of Technology Trends and Cost/Performance August 25, 2003 Prof](https://reader030.vdocument.in/reader030/viewer/2022032800/56649d4b5503460f94a2963e/html5/thumbnails/61.jpg)
CS252/KubiatowiczLec 1.61
8/25/03
Summary, #2
• Amdahl’s Law:
• CPI Law:
• Execution time is the REAL measure of computer performance!
• Good products created when have:– Good benchmarks, good ways to summarize
performance• Die Cost goes roughly with die area4
Speedupoverall =ExTimeold
ExTimenew
=
1
(1 - Fractionenhanced) + Fractionenhanced
Speedupenhanced
CPU time = Seconds = Instructions x Cycles x Seconds
Program Program Instruction Cycle
CPU time = Seconds = Instructions x Cycles x Seconds
Program Program Instruction Cycle