Computer Architecture 2012 – Introduction (lec1)
Computer Architecture (“MAMAS”, 234267)
Spring 2012
Lecturer: Dan Tsafrir
Reception: Mon 18:30, Taub 611
12/3/2012
Presentation based on slides by David Patterson, Avi Mendelson, Lihu Rappoport, and Adi Yoaz
General Info
Grade
• 20% Exercise (mandatory, binding)
• 80% Final exam
Textbook
• “Computer Architecture: A Quantitative Approach” (4th Edition) by Hennessy & Patterson
Other course information
• Course web site: http://webcourse.cs.technion.ac.il/234267/Spring2012
• Lectures will be uploaded to the web a day before the class
Computer System Structure
Classical Motherboard Diagram
[Figure: block diagram of a classical motherboard. Labeled components: CPU, cache, CPU bus, North Bridge, memory controller, on-board graphics, IOMMU, memory bus, DDR2 or DDR3 channel 1 and channel 2, external graphics card, PCI express 2.0, South Bridge, I/O controller, PCI, PCI express ×1, USB controller, SATA controller, hard disk, DVD drive, floppy drive, LAN adapter (LAN), sound card, speakers, parallel port, serial port, keyboard, mouse.]
More to the “north” = closer to the CPU = faster
Intel Core 2
Northbridge = MCH = memory controller hub
Southbridge = ICH = I/O controller hub
Notice bandwidths
65 to 45 nm
Intel Nehalem Core i3 i5 i7
For high-end i-Series chips, Northbridge functionality moved onto the processor (=> made faster)
45 to 32 nm
Intel Sandy Bridge Core i3 i5 i7
The trend continues
32 to 22 nm
Course Focus
Start from CPU (= processor)
• Instruction set, performance
• Pipeline, hazards
• Branch prediction
• Out-of-order execution
Move on to Memory Hierarchy
• Caching
• Main memory
• Virtual memory
Move on to PC Architecture
• Motherboard & chipset, DRAM, I/O, disk, peripherals
End with some Advanced Topics
The Processor
Architecture vs. Microarchitecture
Architecture := the processor features as seen by its user = interface
• Instruction set, number of registers, addressing modes, …
Microarchitecture := the manner by which the processor is implemented = implementation details
• Cache size and structure, number of execution units, …
Note: different processors with different u-archs can support the same arch
• Example: Intel Pentium-IV vs. Intel Core2 Duo
We will address both
Why Should We Care?
Abstractions enhance productivity, so:
• If we know the arch (= interface), why should we care about the u-arch (= internals)?
• Same goes for the arch: it is just details for a programmer of a high-level language
Abstractions only work so long as what’s below works
• The taxi story: http://vimeo.com/11478146 (4:50-6:00)
Recent Processor Trends
Source: http://www.scidacreview.org/0904/html/multicore.html
Well-Known Moore’s Law
Graph taken from: http://www.intel.com/technology/mooreslaw/index.htm
The Story in a Nutshell
[Figure: plot of transistors (1000s), clock speed (MHz), power (W), and instructions/cycle (ILP)]
Took the Industry by Surprise
Dire Implications: Performance
Dire Implications: Sales
Dire Implications: Sales
Dire Implications: Programmers
Supercomputing: “Top 500 list”
Dire Implications: Supercomputing
Processor Performance
Metrics: IC, CPI, IPC
CPUs work according to a clock signal
• Clock cycle: measured in nanoseconds ($10^{-9}$ of a second)
• Clock frequency = 1/|clock cycle|: in GHz ($10^{9}$ cycles/sec)
Instruction Count (IC)
• Total number of instructions executed in the program
Cycles Per Instruction (CPI)
• Average #cycles per instruction (in a given program)
• IPC (= 1/CPI): instructions per cycle. Can be > 1; see the “Story in a Nutshell” slide
$$\mathrm{CPI} = \frac{\#\text{cycles required to execute the program}}{\mathrm{IC}}$$
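The definitions above can be sketched in a few lines of Python. This is only an illustration: the function names and the sample numbers are assumptions made for the example, not measurements from the slides.

```python
def cpi(total_cycles, instruction_count):
    """Average cycles per instruction for one program run."""
    if instruction_count <= 0:
        raise ValueError("instruction_count must be positive")
    return total_cycles / instruction_count


def ipc(total_cycles, instruction_count):
    """Instructions per cycle, i.e. 1 / CPI."""
    if total_cycles <= 0:
        raise ValueError("total_cycles must be positive")
    return instruction_count / total_cycles


def clock_frequency_ghz(clock_cycle_ns):
    """Clock frequency in GHz for a clock cycle given in nanoseconds."""
    return 1.0 / clock_cycle_ns


if __name__ == "__main__":
    # Hypothetical run: 3 million cycles for 2 million instructions.
    print("CPI:", cpi(3_000_000, 2_000_000))
    print("IPC:", ipc(3_000_000, 2_000_000))
    # A 0.5 ns clock cycle corresponds to a 2 GHz clock.
    print("GHz:", clock_frequency_ghz(0.5))
```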
Minimizing Execution Time
CPU Time = time required to execute a program
$$\text{CPU Time} = \mathrm{IC} \times \mathrm{CPI} \times \text{clock cycle}$$
Our goal: minimize CPU Time (by reducing any of the components above)
• Minimize clock cycle: increase GHz (processor design)
• Minimize CPI: u-arch (e.g.: more execution units)
• Minimize IC: arch + u-arch (e.g.: SSE™)
  • SSE = Streaming SIMD Extensions (Intel)
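As a rough sketch of the CPU Time product, the snippet below multiplies the three components and shows that shrinking any one of them shrinks the total. The function name and all numbers are hypothetical, chosen only to make the arithmetic easy to follow.

```python
def cpu_time_seconds(instruction_count, cpi, clock_cycle_ns):
    """CPU Time = IC x CPI x clock cycle, with the cycle given in nanoseconds."""
    return instruction_count * cpi * clock_cycle_ns * 1e-9


if __name__ == "__main__":
    # Hypothetical baseline: 2 billion instructions, CPI of 1.5, 0.5 ns cycle.
    baseline = cpu_time_seconds(2_000_000_000, 1.5, 0.5)
    # Each variant improves exactly one component of the product.
    faster_clock = cpu_time_seconds(2_000_000_000, 1.5, 0.25)  # shorter cycle
    lower_cpi = cpu_time_seconds(2_000_000_000, 1.0, 0.5)      # better u-arch
    lower_ic = cpu_time_seconds(1_000_000_000, 1.5, 0.5)       # fewer instructions
    for name, t in [("baseline", baseline), ("faster clock", faster_clock),
                    ("lower CPI", lower_cpi), ("lower IC", lower_ic)]:
        print(f"{name}: {t:.3f} s")
```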
Alternative Way to Calculate CPI
$IC_i$ = #times an instruction of type $i$ is executed in the program
IC = #instructions executed in the program:
$$IC = \sum_{i=1}^{n} IC_i$$
$F_i$ = relative frequency of type-$i$ instructions = $IC_i / IC$
$CPI_i$ = #cycles to execute a type-$i$ instruction, e.g.: $CPI_{add} = 1$, $CPI_{mul} = 3$
#cycles required to execute the program:
$$\#cyc = \sum_{i=1}^{n} CPI_i \cdot IC_i$$
CPI:
$$CPI = \frac{\#cyc}{IC} = \frac{\sum_{i=1}^{n} CPI_i \cdot IC_i}{IC} = \sum_{i=1}^{n} CPI_i \cdot \frac{IC_i}{IC} = \sum_{i=1}^{n} CPI_i \cdot F_i$$
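A small Python sketch of the per-type calculation, computed both from raw counts and from relative frequencies. The instruction mix (700 adds, 300 muls) is a made-up example; only the per-type costs of 1 and 3 cycles come from the slide.

```python
def cpi_from_counts(counts, cpi_per_type):
    """CPI = (sum of CPI_i * IC_i) / IC."""
    ic = sum(counts.values())
    cycles = sum(cpi_per_type[t] * n for t, n in counts.items())
    return cycles / ic


def cpi_from_frequencies(counts, cpi_per_type):
    """CPI = sum of CPI_i * F_i, where F_i = IC_i / IC."""
    ic = sum(counts.values())
    return sum(cpi_per_type[t] * (n / ic) for t, n in counts.items())


if __name__ == "__main__":
    counts = {"add": 700, "mul": 300}   # hypothetical instruction mix
    cpi_per_type = {"add": 1, "mul": 3}  # per-type costs from the slide
    # Both forms give approximately 1.6 cycles per instruction.
    print(cpi_from_counts(counts, cpi_per_type))
    print(cpi_from_frequencies(counts, cpi_per_type))
```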
Performance Evaluation: How?
No simple answer
Performance depends on
• Application
• Input
Mathematical analysis
• Typically impossible
What to do?
Benchmarks
Use benchmarks & measure how long it takes
• Use real applications (=> no absolute answers)
Preferably standardized benchmarks (+ input), e.g.,
• SPEC INT: integer apps
  • Compression, C compiler, Perl, text-processing, …
• SPEC FP: floating point apps (mostly scientific)
• TPC benchmarks: measure transaction throughput (DB)
• SPEC JBB: models wholesale company (Java server, DB)
Sometimes you see FLOPS (“peak” or “sustained”)
• Supercomputers (top500 list), against LINPACK
Evaluating Performance
Use a performance simulator to evaluate the performance of a new feature / algorithm
• Models the u-arch in great detail
• Run 100s of representative applications
Produce the performance S-curve
• Sort the applications according to the IPC increase
• Baseline (0%) is the processor without the new feature
[Figure: two S-curves of per-application IPC change relative to the baseline. Bad S-curve: negative outliers and positive outliers. Good S-curve: small negative outliers and positive outliers.]
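The S-curve recipe above (percent IPC change per application, sorted) might be sketched as follows. The application names and IPC values are invented for illustration and do not come from any simulator.

```python
def s_curve(baseline_ipc, new_ipc):
    """Return (app, percent IPC change) pairs sorted from worst to best.

    baseline_ipc: IPC per application without the new feature (the 0% line).
    new_ipc:      IPC per application with the new feature enabled.
    """
    changes = [
        (app, 100.0 * (new_ipc[app] - base) / base)
        for app, base in baseline_ipc.items()
    ]
    return sorted(changes, key=lambda pair: pair[1])


if __name__ == "__main__":
    # Hypothetical simulator results for three applications.
    baseline = {"app_a": 1.0, "app_b": 2.0, "app_c": 1.0}
    with_feature = {"app_a": 1.1, "app_b": 1.9, "app_c": 1.0}
    for app, change in s_curve(baseline, with_feature):
        print(f"{app}: {change:+.1f}%")
```

Plotting the sorted values left to right gives the S shape; the entries at the two ends are the negative and positive outliers.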
Amdahl’s Law
Suppose we accelerate the computation such that
• P = proportion of computation we make faster
• S = speedup experienced by the proportion we improved
For example
• If an improvement can speed up 40% of the computation => P = 0.4
• If the improvement makes that portion run twice as fast => S = 2
Then overall speedup =
$$\frac{1}{(1 - P) + \frac{P}{S}}$$
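A minimal sketch of the formula in Python, evaluated on the slide's own numbers (P = 0.4, S = 2). The function name is an assumption made for this example.

```python
def amdahl_speedup(p, s):
    """Overall speedup when a fraction p of the work is made s times faster."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if s <= 0:
        raise ValueError("s must be positive")
    return 1.0 / ((1.0 - p) + p / s)


if __name__ == "__main__":
    # Slide example: 40% of the computation runs twice as fast.
    # 1 / (0.6 + 0.2) = 1.25, up to floating-point rounding.
    print(amdahl_speedup(0.4, 2))
```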
Amdahl’s Law - Example
FP operations improved to run 2x faster
• S = 2, but…
• P = 0.1: the improvement only affects 10% of the program
Speedup:
$$\frac{1}{(1 - P) + \frac{P}{S}} = \frac{1}{(1 - 0.1) + \frac{0.1}{2}} = \frac{1}{0.95} \approx 1.053$$
Conclusion
• Better to make the common case fast…
Amdahl’s Law – Parallelism
When parallelizing a program
• P = proportion of program that can be made parallel
• 1 - P = inherently serial
• N = number of processing elements (say, cores)
Speedup:
$$\frac{1}{(1 - P) + \frac{P}{N}}$$
Serial component imposes a hard limit:
$$\lim_{N \to \infty} \frac{1}{(1 - P) + \frac{P}{N}} = \frac{1}{1 - P}$$
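To see the hard limit numerically, here is a short Python sketch that evaluates the parallel speedup for growing N and compares it with 1/(1 - P). The value P = 0.9 and the function names are assumptions made for this illustration.

```python
def parallel_speedup(p, n):
    """Amdahl speedup when a fraction p of the program runs on n cores."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return 1.0 / ((1.0 - p) + p / n)


def speedup_limit(p):
    """Upper bound on the speedup as n grows without bound: 1 / (1 - p)."""
    return float("inf") if p == 1.0 else 1.0 / (1.0 - p)


if __name__ == "__main__":
    p = 0.9  # hypothetical: 90% of the program can be parallelized
    for n in (1, 2, 16, 1024):
        print(f"N={n}: speedup {parallel_speedup(p, n):.2f}")
    # With 10% serial work, no core count can push the speedup past about 10.
    print(f"limit: {speedup_limit(p):.2f}")
```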
Instruction Set Design
The ISA is what the user & compiler see
The HW implements the ISA
[Figure: the instruction set as the interface between software and hardware]
Considerations in ISA Design
Instruction size
• Long instructions take more time to fetch from memory
• Longer instructions require a larger memory
  • Important for small (embedded) devices, e.g., cell phones
Number of instructions (IC)
• Reduce IC => reduce runtime (at a given CPI & frequency)
Virtues of instruction simplicity
• Simpler HW allows for: higher frequency & lower power
• Optimization can be applied better to simpler code
• Cheaper HW
Basing Design Decisions on Workload
Immediate argument’s size in bits (histogram)
• 1% of data values > 16 bits
• Having 16 bits is likely good enough
[Figure: histogram over the number of immediate data bits (0 to 15), y-axis from 0% to 30%, with two series: Int. Avg. and FP Avg.]
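A workload study of this kind could be sketched as below: count how many magnitude bits each immediate needs, then check what fraction fits a candidate field width. The sample immediates are invented, and ignoring the sign bit is a simplifying assumption of this sketch rather than something the slide specifies.

```python
from collections import Counter


def immediate_bits(value):
    """Number of bits needed for the magnitude of an immediate (sign ignored)."""
    return abs(value).bit_length()


def bits_histogram(immediates):
    """Map each bit width to the fraction of immediates that need exactly that width."""
    counts = Counter(immediate_bits(v) for v in immediates)
    total = len(immediates)
    return {bits: counts[bits] / total for bits in sorted(counts)}


def fraction_fitting(immediates, width):
    """Fraction of immediates whose magnitude fits in `width` bits."""
    fits = sum(1 for v in immediates if immediate_bits(v) <= width)
    return fits / len(immediates)


if __name__ == "__main__":
    sample = [0, 1, 4, -4, 255, 1000, 70000, 8]  # made-up immediates
    print(bits_histogram(sample))
    print("fits in 16 bits:", fraction_fitting(sample, 16))
```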
CISC Processors
CISC - Complex Instruction Set Computer
• Example: x86
The idea: a high-level machine language
• Back when people programmed in assembly, CISC was supposedly easier
Characteristics
• Many instruction types, with many addressing modes
• Some of the instructions are complex
  • Execute complex tasks
  • Require many cycles
• ALU operations directly on memory (e.g., arr[j] = arr[i]+n)
  • Registers not used (and, accordingly, only a few registers exist)
• Variable-length instructions
  • Common instructions get short codes => saves code length
But it Turns Out…
Rank  Instruction              % of total executed
1     load                     22%
2     conditional branch       20%
3     compare                  16%
4     store                    12%
5     add                      8%
6     and                      6%
7     sub                      5%
8     move register-register   4%
9     call                     1%
10    return                   1%
      Total                    96%
Simple instructions dominate instruction frequency
CISC Drawbacks
Complex instructions and complex addressing modes
• complicate the processor
• slow down the simple, common instructions
• contradict “Make the Common Case Fast”
Compilers don’t use complex instructions / indexing methods
Variable-length instructions are a real pain in the neck
• Difficult to decode a few instructions in parallel
  • As long as an instruction is not decoded, its length is unknown
    • It is unknown where the instruction ends
    • It is unknown where the next instruction starts
• An instruction may span more than a single cache line
• An instruction may span more than a single page
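The decoding problem can be illustrated with a toy encoding. In the sketch below, the first byte of each variable-length instruction holds its total length; this format is purely hypothetical and much simpler than x86. Finding where instruction k starts requires walking through instructions 0..k-1, whereas with fixed-length instructions every start address is known up front.

```python
def starts_variable_length(code):
    """Find instruction start offsets in a toy variable-length encoding.

    Assumed toy format: the first byte of each instruction is its total
    length in bytes. Each start depends on the previous instruction's
    length, so the scan is inherently sequential.
    """
    starts = []
    pc = 0
    while pc < len(code):
        starts.append(pc)
        length = code[pc]
        if length == 0:
            raise ValueError("zero-length instruction in toy encoding")
        pc += length
    return starts


def starts_fixed_length(code, width=4):
    """With fixed-length instructions, all start offsets are known immediately."""
    return list(range(0, len(code), width))


if __name__ == "__main__":
    variable_code = bytes([2, 0xAA, 3, 0xBB, 0xCC, 1])  # lengths 2, 3, 1
    print(starts_variable_length(variable_code))
    print(starts_fixed_length(bytes(12)))
```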
RISC Processors
RISC - Reduced Instruction Set Computer
The idea: simple instructions enable fast hardware
Characteristics
• A small instruction set, with only a few instruction formats
• Simple instructions
  • Execute simple tasks
  • Most of them require a single cycle (with pipeline)
• A few indexing methods
• ALU operations on registers only
  • Memory is accessed using Load and Store instructions only
  • Many orthogonal registers
  • Three-address machine: Add dst, src1, src2
• Fixed-length instructions
Examples: MIPS™, SPARC™, Alpha™, Power™
RISC Processors (Cont.)
Simple arch => simple u-arch
• Room for larger on-die caches
• Smaller => faster
• Easier to design & validate (=> cheaper to manufacture)
• Shorter time-to-market
• More general-purpose registers (=> fewer memory refs)
Compiler can be smarter
• Better pipeline usage
• Better register allocation
Existing RISC processors are not “pure” RISC
• e.g., they support division, which takes many cycles
Compilers and ISA
Ease of compilation
• Orthogonality:
  • no special registers
  • few special cases
  • all operand modes available with any data type or instruction type
• Regularity:
  • no overloading for the meanings of instruction fields
• Streamlined:
  • resource needs easily determined
Register assignment is critical too
• Easier if lots of registers
Still, CISC Is Dominant
x86 (CISC) dominates the processor market
Legacy
• A vast amount of existing software
• Intel, AMD, Microsoft benefit
• But they spend a lot of money to compensate for the disadvantages
CISC arch internally emulates RISC
• Starting at Pentium II and K6, x86 processors translate CISC instructions into RISC-like operations internally
• Inside, the core looks much like that of a RISC processor
Software Specific Extensions
Extend arch to accelerate execution of specific apps
Example: SSE™ – Streaming SIMD Extensions
• 128-bit packed (vector) / scalar single-precision FP (4×32)
• Introduced on Pentium® III in ’99
• 8 new 128-bit registers (XMM0 – XMM7)
• Accelerates graphics, video, scientific calculations, …
[Figure: packed vs. scalar addition on 128-bit registers]
• Packed: (x0, x1, x2, x3) + (y0, y1, y2, y3) = (x0+y0, x1+y1, x2+y2, x3+y3)
• Scalar: (x0, x1, x2, x3) + (y0, y1, y2, y3) = (x0+y0, y1, y2, y3)
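The difference between the packed and scalar forms can be mimicked with plain Python lists standing in for the four 32-bit lanes of a 128-bit register. This is only a behavioral sketch of the slide's figure, not real SSE code, and the function names are invented.

```python
def add_packed(x, y):
    """Packed add: all four lanes are added in one operation."""
    if len(x) != 4 or len(y) != 4:
        raise ValueError("expected four lanes per register")
    return [a + b for a, b in zip(x, y)]


def add_scalar(x, y):
    """Scalar add as drawn on the slide: only lane 0 is added,
    the remaining lanes are taken from y unchanged."""
    if len(x) != 4 or len(y) != 4:
        raise ValueError("expected four lanes per register")
    return [x[0] + y[0]] + list(y[1:])


if __name__ == "__main__":
    x = [1.0, 2.0, 3.0, 4.0]
    y = [10.0, 20.0, 30.0, 40.0]
    print("packed:", add_packed(x, y))
    print("scalar:", add_scalar(x, y))
```

One packed operation does the work of four scalar additions, which is how such extensions reduce IC.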
BACKUP
Compatibility
Backward compatibility (HW responsibility)
• When buying new hardware, it can run existing software:
  • i5 can run SW written for Core2 Duo, Pentium4, PentiumM, Pentium III, Pentium II, Pentium, 486, 386, 286
BTW:
Forward compatibility (SW responsibility)
• For example: MS Word 2003 can open MS Word 2010 doc
• Commonly supports one or two generations behind
Architecture-independent SW
• Run SW on top of a VM that does JIT (just-in-time compilation): JVM for Java and CLR for .NET
• Interpreted languages: Perl, Python