computer systems architecture ipcrowley/cse/560/l2-8-31-2009.pdf · 2009-08-31 · • 2002: 480m...
Computer Systems
Architecture I
CSE 560M
Lecture 2
Prof. Patrick Crowley
Plan for Today
• Questions
• Administrivia
• Class background
• Today’s discussion
• Assignment
Administrivia
• My office hours:
– No one “good” time
– By appointment, times available each day
• Shakir’s office hours
– M&W, 5:30pm-6:30pm
– Bryan 422
2009 Class Background
[Chart: share of students (0% to 100%) reporting much, some, or no background in Architecture, Digital Design, and VHDL]
Introduction
“Speed is not everything but it’s kilometers
ahead of whatever is in second place”
Ed McCreight
The Dragon Computer System
Xerox PARC September, 1984
Computer Design: Make the
Common Case Fast
Amdahl’s law (speedup, S):

$$S = \frac{\text{Performance for task with enhancement}}{\text{Performance for task without enhancement}}$$

$$S = \frac{\text{Exec. time without enhancement}}{\text{Exec. time with enhancement}}$$
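The two ratios for S give the same number, because performance is the reciprocal of execution time. A minimal Python sketch with hypothetical timings follows. The `amdahl` helper uses the standard textbook form of Amdahl's law (fraction enhanced and speedup of the enhanced part), which the slide itself does not spell out, so treat it as an assumption added for illustration.

```python
def speedup(time_without: float, time_with: float) -> float:
    """Speedup S = execution time without enhancement / time with it."""
    return time_without / time_with


def amdahl(fraction_enhanced: float, speedup_enhanced: float) -> float:
    """Overall speedup when only part of the task is sped up.

    Standard textbook form (not taken from the slide):
    S = 1 / ((1 - f) + f / s)
    """
    return 1.0 / ((1.0 - fraction_enhanced) + fraction_enhanced / speedup_enhanced)


# Hypothetical numbers: a task drops from 10 s to 8 s.
print(speedup(10.0, 8.0))  # 1.25

# Hypothetical: 40% of the run time is made 10x faster.
print(round(amdahl(0.4, 10.0), 4))  # 1.5625
```

Even a 10x improvement to 40% of the task yields well under 2x overall, which is why the common case is the one worth making fast.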
Outline
• Types of computer systems
• Technology trends
• Explaining processor performance improvements
• Performance evaluation
• Fallacies and Pitfalls
Classes of Computer Systems
• Desktop
– Intel IA-32, AMD, IBM PowerPC
• Servers
– Intel IA-32, Intel IA-64, Sun SPARC, AMD
• Embedded
– MIPS, ARM, NEC, Motorola
Classes of Computer Systems
• Consider one metric: worldwide unit sales
• 1980: 724K PCs
• 1986: 9M PCs
• 2002: 130M PCs
• 2002: 3500 Intel Itanium processors
• 2002: 36M game consoles
• 2002: 480M mobile phones
• 2004
– 178M PCs
– 600M mobile phones
• 2006
– 230M PCs
– 960M mobile phones
• 2008
– 299M PCs
– 1.2B mobile phones
[Chart: PC and mobile phone unit sales in millions (0 to 1200) for 2004, 2006, and 2008]
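To compare the two markets on the same footing, the 2004 to 2008 figures from this slide can be turned into compound annual growth rates. This is a rough sketch: the `cagr` helper and the dictionary layout are illustrative choices, and only the sales numbers come from the slide.

```python
# Worldwide unit sales in millions, copied from the slide (1.2B = 1200M).
sales = {
    2004: {"pc": 178, "mobile": 600},
    2006: {"pc": 230, "mobile": 960},
    2008: {"pc": 299, "mobile": 1200},
}


def cagr(start: float, end: float, years: int) -> float:
    """Compound annual growth rate between two values."""
    return (end / start) ** (1.0 / years) - 1.0


for device in ("pc", "mobile"):
    rate = cagr(sales[2004][device], sales[2008][device], 2008 - 2004)
    print(f"{device}: {rate:.1%} per year")

# Mobile phones sold per PC sold, year by year.
for year, row in sales.items():
    print(year, round(row["mobile"] / row["pc"], 2))
```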
Technology Trends
• CPU/Microprocessor
– Transistor count increases 55% per year
– Performance improvement has traditionally been better than that
• Memory
– Density increases, bandwidth increases, access time is stagnant (although new memory architectures help)
• I/O
– Disk density: 100% per year!
– Disk access time: 30% in 10 years
– Networks: periodic order-of-magnitude increases
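The rates above are easier to compare once converted to a common form. The sketch below turns an annual growth rate into a doubling time, and a multi-year gain into an annual rate. It assumes "30% in 10 years" means a cumulative 30% improvement in disk access time; the helper names are illustrative.

```python
import math


def doubling_time_years(annual_growth: float) -> float:
    """Years needed to double at a constant annual growth rate."""
    return math.log(2.0) / math.log(1.0 + annual_growth)


def annualized_rate(total_gain: float, years: float) -> float:
    """Constant annual rate that compounds to total_gain over the period."""
    return (1.0 + total_gain) ** (1.0 / years) - 1.0


print(f"transistor count doubles every {doubling_time_years(0.55):.2f} years")
print(f"disk density doubles every {doubling_time_years(1.00):.2f} years")
# Assumption: a cumulative 30% improvement over 10 years.
print(f"disk access time improves {annualized_rate(0.30, 10):.1%} per year")
```

Under that assumption, disk access time improves by only a few percent per year while density doubles annually, which is the imbalance the slide points at.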
Computer Generations
| Generation | Date | Technology |
|---|---|---|
| 1 | 1950-1959 | Vacuum Tubes |
| 2 | 1960-1968 | Transistors |
| 3 | 1969-1977 | Integrated Circuit |
| 4 | 1978-1999 | LSI, VLSI |
| 5 | 2000-20xx | VLSI … |
Processor-Memory Perf. Gap
Explaining Processor Improvements
• Technology
– Faster clock
– More transistors
• Architecture
– Extensive pipelining
– More transistors enable new functionality
  • Multiple functional units
  • Superscalar execution
  • Out-of-order execution
– On-chip caches, TLBs
– Instruction fetch units, branch prediction
– Multiple cores, thread contexts
– Greater on-chip integration
Clock Rate

| Chip | Date | Clock Freq. (MHz) | Clock Period (ns) |
|---|---|---|---|
| Intel 8086 | 1978 | 4.77 | 200 |
| Intel 386 | 1985 | 40.00 | 25 |
| DEC Alpha (v1) | 1990 | 100.00 | 10 |
| DEC Alpha (v2) | 1994 | 300.00 | 3.33 |
| Intel P4 | 2002 | 2,000.00 | 0.50 |
| Intel Xeon L7455 | 2008 | 2,130.00 | 0.46 |
| Time for signal to travel 1 cm on-chip | | | ~1.00 |
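The period column is the reciprocal of the frequency column: a clock at f MHz has a period of 1000 / f nanoseconds. A short sketch that recomputes the column from the listed frequencies is below. Some results differ slightly from the slide, which appears to round (for example, 4.77 MHz gives roughly 210 ns rather than 200 ns). The `period_ns` name is illustrative.

```python
def period_ns(freq_mhz: float) -> float:
    """Clock period in nanoseconds for a frequency given in MHz."""
    return 1000.0 / freq_mhz


# (chip, clock frequency in MHz), copied from the table.
chips = [
    ("Intel 8086", 4.77),
    ("Intel 386", 40.0),
    ("DEC Alpha (v1)", 100.0),
    ("DEC Alpha (v2)", 300.0),
    ("Intel P4", 2000.0),
    ("Intel Xeon L7455", 2130.0),
]

SIGNAL_NS_PER_CM = 1.0  # approximate on-chip figure quoted on the slide

for name, mhz in chips:
    period = period_ns(mhz)
    # Clock cycles that elapse while a signal travels 1 cm on-chip.
    cycles_per_cm = SIGNAL_NS_PER_CM / period
    print(f"{name}: {period:.2f} ns, {cycles_per_cm:.2f} cycles per cm")
```

At the 2002 and 2008 frequencies the period is shorter than the 1 cm signal time, so a signal can no longer cross 1 cm of chip within one cycle.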
Intel x86 Progression

Clock rate stagnates, cores increase

| Chip | Date | Transistor Count | Speed (MHz) |
|---|---|---|---|
| 4004 | Nov-71 | 2300 | 0.108 |
| 8008 | Apr-72 | 3500 | 0.2 |
| 8080 | Apr-74 | 6000 | 2 |
| 8086 | Jun-78 | 29000 | 10 |
| 8088 | Jun-79 | 29000 | 10 |
| 286 | Feb-82 | 134000 | 12.5 |
| 386 | Oct-85 | 275000 | 33 |
| 486 | Apr-89 | 1.2M | 50 |
| Pentium | Mar-93 | 3.1M | 66 |
| Pentium Pro | Mar-95 | 5.5M | 166 |
| Pentium II | Jul-97 | 7.5M | 300 |
| Pentium III | Feb-99 | 9.5M | 1200 |
| Pentium 4 | Nov-00 | 42M | 1800 |
| P4-HT | Nov-02 | 188M | 3060 |
| Pentium D | May-05 | 169M | 2800 |
| Core 2 Duo | Jul-06 | 291M | 3000 |
| Xeon L7455 | Jul-08 | 1900M | 2130 |
| Xeon E5450 | Jan-09 | 731M | 2530 |
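The "clock rate stagnates" observation can be checked directly from three rows of the table. The sketch below computes average annual growth in transistor count over the whole span, and in clock speed before and after the P4-HT. The helper names and the choice of rows are illustrative, not part of the lecture.

```python
def years_between(start, end):
    """Elapsed years between two (year, month) pairs."""
    (y0, m0), (y1, m1) = start, end
    return (y1 - y0) + (m1 - m0) / 12.0


def annual_growth(start_value, end_value, years):
    """Constant annual growth rate that links the two values."""
    return (end_value / start_value) ** (1.0 / years) - 1.0


# Rows copied from the table: (chip, (year, month), transistors, MHz).
i4004 = ("4004", (1971, 11), 2_300, 0.108)
p4_ht = ("P4-HT", (2002, 11), 188_000_000, 3060)
xeon_l7455 = ("Xeon L7455", (2008, 7), 1_900_000_000, 2130)

span = years_between(i4004[1], xeon_l7455[1])
print(f"transistors: {annual_growth(i4004[2], xeon_l7455[2], span):.0%} per year")

early = years_between(i4004[1], p4_ht[1])
late = years_between(p4_ht[1], xeon_l7455[1])
print(f"clock 1971-2002: {annual_growth(i4004[3], p4_ht[3], early):.0%} per year")
print(f"clock 2002-2008: {annual_growth(p4_ht[3], xeon_l7455[3], late):.0%} per year")
```

The last figure comes out negative: between these two rows the clock rate fell while the transistor count kept climbing.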
Performance Evaluation Basics
• Performance is inversely proportional to execution time
• Elapsed time includes:
– User + system; I/O; memory accesses
• CPU time includes:
– User + system CPU (no I/O)
• CPU execution time for a single program execution:
– Cycles Per Instruction (CPI)

$$\text{CPU time} = \text{Instruction count} \times \text{CPI} \times \text{Clock cycle time}$$
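A small worked instance of the CPU time equation, as a sketch with made-up inputs (the instruction count, CPI, and clock rate below are hypothetical, not measurements from the lecture):

```python
def cpu_time_seconds(instruction_count: float, cpi: float,
                     clock_cycle_time_s: float) -> float:
    """CPU time = instruction count x CPI x clock cycle time."""
    return instruction_count * cpi * clock_cycle_time_s


# Hypothetical program: 2 billion instructions, CPI of 1.5,
# on a 2 GHz clock (cycle time 0.5 ns).
t = cpu_time_seconds(2e9, 1.5, 0.5e-9)
print(f"{t:.2f} s")  # 1.50 s

# Halving CPI at the same clock rate halves CPU time.
print(f"{cpu_time_seconds(2e9, 0.75, 0.5e-9):.2f} s")  # 0.75 s
```

The three factors are independent knobs: compilers and the ISA mostly affect instruction count, the microarchitecture mostly affects CPI, and the technology and circuit design mostly affect cycle time.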
Components of CPI
• Ideal CPI = 1
• Classes of instructions
– RISC machines: ALU, control flow, f.p., load-store
– CISC machines: string instructions
• We will discuss “contributions to CPI from”:
– Memory hierarchy
– Branches (misprediction)
– Pipeline hazards
Components of CPI
$$\text{CPI} = \sum_{i=1}^{n} \frac{IC_i}{\text{Instruction Count}} \times CPI_i$$
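Read as a weighted average, the sum says: take each instruction class i, weight its CPI by the fraction of executed instructions that belong to that class, and add the results. A minimal sketch with a hypothetical instruction mix (the counts and per-class CPIs are invented for illustration):

```python
def weighted_cpi(mix: dict) -> float:
    """Overall CPI from a mix of {class: (instruction count, class CPI)}."""
    total = sum(count for count, _ in mix.values())
    return sum(count / total * cpi for count, cpi in mix.values())


# Hypothetical mix: counts in millions of instructions, CPI per class.
mix = {
    "alu": (50, 1.0),
    "load-store": (30, 2.0),
    "control flow": (20, 3.0),
}

print(round(weighted_cpi(mix), 3))  # 1.7
```

Here half of the instructions run at the ideal CPI of 1, yet the overall CPI is 1.7 because the less frequent classes are more expensive.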
Measuring/Modeling CPU
Performance
• Hardware counters on a real CPU
• Instrumented execution of programs running on a real system
– Binary re-writing
– Debugger
• Instruction set simulation or interpretation
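To make the last option concrete, here is a toy instruction set interpreter that counts executed instructions by class and derives a CPI from assumed per-class cycle costs. Everything about it is hypothetical: the four opcodes, the register names, and the cycle costs are invented for illustration and do not model any real machine.

```python
from collections import Counter

# Assumed cycles per instruction class for the toy machine (hypothetical).
CYCLES = {"alu": 1, "load-store": 2, "control": 3}


def run(program, regs=None, memory=None):
    """Interpret a toy program and count executed instructions by class."""
    regs = dict(regs or {})
    memory = dict(memory or {})
    counts = Counter()
    pc = 0
    while pc < len(program):
        op, *args = program[pc]
        pc += 1
        if op == "addi":        # addi rd, rs, imm
            rd, rs, imm = args
            regs[rd] = regs.get(rs, 0) + imm
            counts["alu"] += 1
        elif op == "load":      # load rd, addr
            rd, addr = args
            regs[rd] = memory.get(addr, 0)
            counts["load-store"] += 1
        elif op == "store":     # store rs, addr
            rs, addr = args
            memory[addr] = regs.get(rs, 0)
            counts["load-store"] += 1
        elif op == "bnez":      # bnez rs, target: branch if rs != 0
            rs, target = args
            if regs.get(rs, 0) != 0:
                pc = target
            counts["control"] += 1
        else:
            raise ValueError(f"unknown opcode: {op}")
    return regs, memory, counts


def cpi(counts):
    """Average cycles per executed instruction under the assumed costs."""
    total = sum(counts.values())
    cycles = sum(CYCLES[cls] * n for cls, n in counts.items())
    return cycles / total


# Count down from 3: the loop body (addi, bnez) runs three times.
program = [
    ("addi", "r1", "r0", 3),   # 0: r1 = 3
    ("addi", "r1", "r1", -1),  # 1: r1 -= 1
    ("bnez", "r1", 1),         # 2: loop back to 1 while r1 != 0
    ("store", "r1", 0),        # 3: memory[0] = r1
]
regs, memory, counts = run(program)
print(dict(counts), cpi(counts))
```

A real simulator would also model the pipeline and memory hierarchy to produce cycle counts; this sketch only shows how dynamic instruction counts fall out of interpretation.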
Benchmarks
• Desktop
– SPEC, integer and floating-point
– Commercial workloads: SYSmark, Winstone
• Servers
– SPECweb
– TPC-A, B, C
• Embedded
– EEMBC
– Other application-specific suites
Important Future Topics
• Computer Architecture Methodology
• Pipelining
• Locality
Fallacy
“The relative performance of two processors
with the same instruction set architecture
(ISA) can be judged by clock rate or by the
performance of a single benchmark suite.”
Intel P4 (1.7GHz) vs P3 (1GHz)
Pitfall
“Neglecting the cost of software in either evaluating a system or examining cost-performance.”
Assignment
• Readings
– Wednesday
  • H&P: App. B.1-B.7
  • V&L: Ch. 2
  • Turner & Zar VHDL concepts tutorial
– Monday
  • Labor Day, no class meeting