computer architecture challenges shriniwas gadage
![Page 1: Computer Architecture Challenges Shriniwas Gadage](https://reader030.vdocument.in/reader030/viewer/2022032707/56649e4a5503460f94b3ebed/html5/thumbnails/1.jpg)
Computer Architecture Challenges
Shriniwas Gadage
![Page 2: Computer Architecture Challenges Shriniwas Gadage](https://reader030.vdocument.in/reader030/viewer/2022032707/56649e4a5503460f94b3ebed/html5/thumbnails/2.jpg)
Computer Architecture
• Computer architects design computer systems
– Processors: Intel Pentium 4, IBM PowerPC
– Also: memory systems, interconnections, ???
[Figure: a Pentium processor with cache memory, connected to memory (DRAM) and a bridge to I/O (disk, Ethernet card)]
![Page 3: Computer Architecture Challenges Shriniwas Gadage](https://reader030.vdocument.in/reader030/viewer/2022032707/56649e4a5503460f94b3ebed/html5/thumbnails/3.jpg)
Computer Architecture
• If the Intel Pentium 4 has a faster clock speed than the IBM Power4, does it execute your programs faster?
[Figure: timeline of clock ticks and completing instructions for Case 1 and Case 2]
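Clock speed alone does not settle the question. A minimal sketch, assuming the usual relation time = instructions × cycles per instruction / clock rate; the two machines and all numbers are made up for illustration and are not measurements of a Pentium 4 or Power4:

```python
def execution_time(instructions, cycles_per_instruction, clock_hz):
    """Seconds needed to run a program: instructions * CPI / clock rate."""
    return instructions * cycles_per_instruction / clock_hz

# Hypothetical machines running the same 1-billion-instruction program.
PROGRAM = 1_000_000_000
fast_clock = execution_time(PROGRAM, cycles_per_instruction=2.0, clock_hz=3.0e9)
slow_clock = execution_time(PROGRAM, cycles_per_instruction=1.0, clock_hz=2.0e9)

print(f"3 GHz, 2.0 cycles/instr: {fast_clock:.3f} s")
print(f"2 GHz, 1.0 cycles/instr: {slow_clock:.3f} s")
```

With these assumed values the machine with the slower clock finishes first, because it completes an instruction on every tick.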
![Page 4: Computer Architecture Challenges Shriniwas Gadage](https://reader030.vdocument.in/reader030/viewer/2022032707/56649e4a5503460f94b3ebed/html5/thumbnails/4.jpg)
Microarchitecture
• Micro-Architects design processors
• Goals for processors:
– Faster!!!!
– Higher bandwidth communication with memory system
– Backward-compatible with previous models
• How do we make processors faster?
– Faster clocks (>2 GHz)
– Do more work (execute instructions) at same time
![Page 5: Computer Architecture Challenges Shriniwas Gadage](https://reader030.vdocument.in/reader030/viewer/2022032707/56649e4a5503460f94b3ebed/html5/thumbnails/5.jpg)
A Typical Microprocessor
[Figure: block diagram with Branch Predictor, L1 Instr Cache, Decode & Rename, Issue Logic, Register File, four ALUs, L1 Data Cache, and L2 Cache]
![Page 6: Computer Architecture Challenges Shriniwas Gadage](https://reader030.vdocument.in/reader030/viewer/2022032707/56649e4a5503460f94b3ebed/html5/thumbnails/6.jpg)
Multiprocessor Architecture
• Computer systems with multiple processors
[Figure: several nodes joined by an interconnection network; each node contains a Pentium processor with cache, memory (DRAM), and a bridge to I/O (disk, Ethernet card)]
![Page 7: Computer Architecture Challenges Shriniwas Gadage](https://reader030.vdocument.in/reader030/viewer/2022032707/56649e4a5503460f94b3ebed/html5/thumbnails/7.jpg)
Multiprocessor Architecture
• How do we make processors work together?
– Exploit parallelism in applications
• Example: web server
– Each processor handles different requests
– Processors communicate occasionally to synch up
• Some challenges:
– Interconnection network design
– Protocols for communicating and sharing data
– Scalability
– Reliability
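The web server example can be sketched in software. This is only an illustration: Python threads stand in for processors (CPython threads do not give true parallel speedup for compute-bound work), and the names `serve` and `handle_request` are hypothetical. Each request is handled independently, and the workers synchronize only when updating a shared counter:

```python
import threading
from concurrent.futures import ThreadPoolExecutor

def serve(num_requests, num_workers=4):
    """Spread requests over a pool of workers; return responses and a shared count."""
    served = 0
    lock = threading.Lock()  # the occasional "synch up" point

    def handle_request(request_id):
        nonlocal served
        response = f"response to request {request_id}"  # independent work
        with lock:                                       # shared-data update
            served += 1
        return response

    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        responses = list(pool.map(handle_request, range(num_requests)))
    return responses, served

responses, served = serve(100)
print(len(responses), served)
```

The lock is the software analogue of the communication protocol listed among the challenges: the more often workers need it, the less they gain from running side by side.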
![Page 8: Computer Architecture Challenges Shriniwas Gadage](https://reader030.vdocument.in/reader030/viewer/2022032707/56649e4a5503460f94b3ebed/html5/thumbnails/8.jpg)
Computer Architecture
• To a large extent, computer architecture determines:
• the number of instructions used to execute a program
• the time each instruction takes to execute
• the idle cycles when no work gets done
• the number of instructions that can execute in parallel
![Page 9: Computer Architecture Challenges Shriniwas Gadage](https://reader030.vdocument.in/reader030/viewer/2022032707/56649e4a5503460f94b3ebed/html5/thumbnails/9.jpg)
Technology Trends
• We design an architecture for a given technology
• Technology parameters:
– Number of transistors on a chip
– Transistor speed
– Amount of memory
– Memory speed
– Bandwidth between components
– Power usage
– Applications to be run on system
• All of these change dramatically over time!
[Figure: plot of a technology parameter versus time]
![Page 10: Computer Architecture Challenges Shriniwas Gadage](https://reader030.vdocument.in/reader030/viewer/2022032707/56649e4a5503460f94b3ebed/html5/thumbnails/10.jpg)
Technology Trends
• Performance was the ultimate metric
• Transistors were a limiting factor
• As on-chip transistors became available in the 90s, more functionality and complex circuitry were added to boost performance
![Page 11: Computer Architecture Challenges Shriniwas Gadage](https://reader030.vdocument.in/reader030/viewer/2022032707/56649e4a5503460f94b3ebed/html5/thumbnails/11.jpg)
Technology Trends – A Few Examples
• Number of transistors
– Doubles every 18 months (Moore’s Law)
• Memory size
– 1992: We bought extra 512 Kbytes for desktop
– 2002: Desktop came with 512 Mbytes
– 2012: Desktop comes with 2 GB
• Power usage
– Pentium 4 can draw 50 amps of current and burn 50 W
• Important applications
– Word processing, spreadsheets, multimedia, web surfing
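The growth rates behind these examples are easy to check. A small sketch, assuming steady exponential growth between the quoted data points (the helper names are hypothetical):

```python
import math

def growth_factor(years, doubling_months):
    """How many times larger a quantity gets if it doubles every `doubling_months`."""
    return 2 ** (years * 12 / doubling_months)

def doubling_time_months(start, end, years):
    """Doubling period implied by growing from `start` to `end` over `years`."""
    return years * 12 / math.log2(end / start)

# Transistor count over a decade at an 18-month doubling rate.
print(f"{growth_factor(10, 18):.0f}x transistors in 10 years")

# Memory sizes in Mbytes: 0.5 (1992), 512 (2002), 2048 (2012).
print(f"1992-2002: {doubling_time_months(0.5, 512, 10):.0f} months per doubling")
print(f"2002-2012: {doubling_time_months(512, 2048, 10):.0f} months per doubling")
```

Under that assumption, the quoted desktop memory sizes doubled about once a year in the first decade and only once every five years in the second.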
![Page 12: Computer Architecture Challenges Shriniwas Gadage](https://reader030.vdocument.in/reader030/viewer/2022032707/56649e4a5503460f94b3ebed/html5/thumbnails/12.jpg)
Technology Trends – Good or Bad?
• Pessimist: trends make designs obsolete
– But now I have to re-think everything I’d already solved!
• Optimist/Architect: trends offer opportunities
– What can we do with a billion transistors?
• Good design ideas → Bad design ideas
• E.g., It was a good idea to scale up processor sizes
– But, it now uses too much power and is too complex
![Page 13: Computer Architecture Challenges Shriniwas Gadage](https://reader030.vdocument.in/reader030/viewer/2022032707/56649e4a5503460f94b3ebed/html5/thumbnails/13.jpg)
What To Do With a Billion Transistors?
A. Make the processors bigger
[Figure: a chip containing one large processor]
B. Make more little processors
[Figure: a chip containing several small processors]
![Page 14: Computer Architecture Challenges Shriniwas Gadage](https://reader030.vdocument.in/reader030/viewer/2022032707/56649e4a5503460f94b3ebed/html5/thumbnails/14.jpg)
Hitting the Wall: Architecture Challenges
• Single core performance
• Memory
• Complexity
• Power and temperature-efficient designs
![Page 15: Computer Architecture Challenges Shriniwas Gadage](https://reader030.vdocument.in/reader030/viewer/2022032707/56649e4a5503460f94b3ebed/html5/thumbnails/15.jpg)
Hitting the Wall: Architecture Challenges
• Functionalities in multi-core chips
• Simplifying the programmer’s task
• Efficient interconnects
• Designs tolerant of errors
![Page 16: Computer Architecture Challenges Shriniwas Gadage](https://reader030.vdocument.in/reader030/viewer/2022032707/56649e4a5503460f94b3ebed/html5/thumbnails/16.jpg)
Hitting the Power Wall
Power is as important a metric today as performance
![Page 17: Computer Architecture Challenges Shriniwas Gadage](https://reader030.vdocument.in/reader030/viewer/2022032707/56649e4a5503460f94b3ebed/html5/thumbnails/17.jpg)
The Advent of Multi-Core Chips
• In the past, performance magically increased by 50% every year
• In the future, this improvement will be only ~20% every year … unless … the application is multi-threaded!
[Figure: multi-core chip layout showing cores and cache banks]
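The gap between 50% and 20% per year compounds quickly. A short sketch of the arithmetic, using only the two rates quoted above (the function name is hypothetical):

```python
def cumulative_speedup(annual_gain, years):
    """Total speedup after `years` of compounding `annual_gain` (0.5 means 50%)."""
    return (1 + annual_gain) ** years

for years in (1, 5, 10):
    past = cumulative_speedup(0.50, years)
    future = cumulative_speedup(0.20, years)
    print(f"after {years:2d} years: 50%/yr -> {past:6.1f}x, 20%/yr -> {future:5.1f}x")
```

After five years the two rates give roughly 7.6x and 2.5x respectively, which is the gap a multi-threaded application has to close by using more cores.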
![Page 18: Computer Architecture Challenges Shriniwas Gadage](https://reader030.vdocument.in/reader030/viewer/2022032707/56649e4a5503460f94b3ebed/html5/thumbnails/18.jpg)
Interconnects as a Bottleneck
• In the past, on-chip data transmission on wires cost almost nothing
• Interconnect speed and power have been improving, but not at the same rate as transistor speeds
• Hence, relative to computation, communication is much more expensive
• In the near future, it will take 100 cycles to travel across the chip
• 50% of chip power can be attributed to interconnects
![Page 19: Computer Architecture Challenges Shriniwas Gadage](https://reader030.vdocument.in/reader030/viewer/2022032707/56649e4a5503460f94b3ebed/html5/thumbnails/19.jpg)
Interconnects in Multi-Core Chips
[Figure: CPU 1, CPU 2, and CPU 3 with L1 caches, an L2 cache with L2 controllers, and copies of block A]
![Page 20: Computer Architecture Challenges Shriniwas Gadage](https://reader030.vdocument.in/reader030/viewer/2022032707/56649e4a5503460f94b3ebed/html5/thumbnails/20.jpg)
Not all Wires are Created Equal

| | B-Wires | L-Wires | W-Wires | PW-Wires |
|---|---|---|---|---|
| Relative latency | 1x | 0.5x | 1.6x | 3.2x |
| Relative area | 1x | 4x | 0.5x | 0.5x |
| Dynamic power (W/m) | 2.65α | 1.46α | 2.9α | 0.87α |
| Static power (W/m) | 1.02 | 0.57 | 1.16 | 0.31 |
![Page 21: Computer Architecture Challenges Shriniwas Gadage](https://reader030.vdocument.in/reader030/viewer/2022032707/56649e4a5503460f94b3ebed/html5/thumbnails/21.jpg)
Data Transfers have Varying Needs
• Example of a cache coherence transaction: Read exclusive request for a shared block
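Because the messages in such a transaction differ in urgency, they need not all travel on the same kind of wire. The sketch below encodes the wire table and applies a hypothetical policy (fastest wire for latency-critical messages, lowest dynamic power otherwise). The policy and the name `pick_wire` are assumptions for illustration, not the mapping proposed in the slides:

```python
# Values copied from the wire table. "dynamic" is the coefficient that
# multiplies the table's activity-dependent factor, in W/m.
WIRES = {
    "B-Wires":  {"latency": 1.0, "area": 1.0, "dynamic": 2.65, "static": 1.02},
    "L-Wires":  {"latency": 0.5, "area": 4.0, "dynamic": 1.46, "static": 0.57},
    "W-Wires":  {"latency": 1.6, "area": 0.5, "dynamic": 2.90, "static": 1.16},
    "PW-Wires": {"latency": 3.2, "area": 0.5, "dynamic": 0.87, "static": 0.31},
}

def pick_wire(latency_critical):
    """Hypothetical policy: fastest wire if critical, else lowest dynamic power."""
    metric = "latency" if latency_critical else "dynamic"
    return min(WIRES, key=lambda name: WIRES[name][metric])

print("critical message     ->", pick_wire(latency_critical=True))
print("non-critical message ->", pick_wire(latency_critical=False))
```

The trade-off shows up in the other columns: the fastest wires cost four times the area, and the lowest-power wires have more than three times the latency of the baseline.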
![Page 22: Computer Architecture Challenges Shriniwas Gadage](https://reader030.vdocument.in/reader030/viewer/2022032707/56649e4a5503460f94b3ebed/html5/thumbnails/22.jpg)
Other Interconnect Choices
• Optical interconnects: speed of light, cost in converting between optical and electrical domains
• 3D chips: reduces communication distances, low cost for vertical signal transmission, increase in power density
![Page 23: Computer Architecture Challenges Shriniwas Gadage](https://reader030.vdocument.in/reader030/viewer/2022032707/56649e4a5503460f94b3ebed/html5/thumbnails/23.jpg)
3D Layouts
[Figure: three two-die layouts (Die 0, Die 1) of clusters and cache banks: (a) Arch-1 (cache-on-cluster), (b) Arch-2 (cluster-on-cluster), (c) Arch-3 (staggered). Legend: cluster, cache bank, intra-die horizontal wire, inter-die vertical wire]
![Page 24: Computer Architecture Challenges Shriniwas Gadage](https://reader030.vdocument.in/reader030/viewer/2022032707/56649e4a5503460f94b3ebed/html5/thumbnails/24.jpg)
Upcoming Architecture Challenges
• Improving single core performance
• Functionalities in multi-core chips
• Simplifying the programmer’s task
• Efficient interconnects
• Power and temperature-efficient designs
• Designs tolerant of errors
Clustered architectures: relatively low complexity, scalable solution, easily handles multiple threads
![Page 25: Computer Architecture Challenges Shriniwas Gadage](https://reader030.vdocument.in/reader030/viewer/2022032707/56649e4a5503460f94b3ebed/html5/thumbnails/25.jpg)
Upcoming Architecture Challenges
• Improving single core performance
• Functionalities in multi-core chips
• Simplifying the programmer’s task
• Efficient interconnects
• Power and temperature-efficient designs
• Designs tolerant of errors
Heterogeneous perf/power
Cores that execute the OS
Cores that verify results
![Page 26: Computer Architecture Challenges Shriniwas Gadage](https://reader030.vdocument.in/reader030/viewer/2022032707/56649e4a5503460f94b3ebed/html5/thumbnails/26.jpg)
Upcoming Architecture Challenges
• Improving single core performance
• Functionalities in multi-core chips
• Simplifying the programmer’s task
• Efficient interconnects
• Power and temperature-efficient designs
• Designs tolerant of errors
Hardware to support transactional memory
![Page 27: Computer Architecture Challenges Shriniwas Gadage](https://reader030.vdocument.in/reader030/viewer/2022032707/56649e4a5503460f94b3ebed/html5/thumbnails/27.jpg)
Upcoming Architecture Challenges
• Improving single core performance
• Functionalities in multi-core chips
• Simplifying the programmer’s task
• Efficient interconnects
• Power and temperature-efficient designs
• Designs tolerant of errors
Faults are caused by high energy particles that deposit enough charge to toggle bits
Variations in conditions may cause a circuit to not produce its result in time
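A single toggled bit can be caught with a parity bit. The sketch below models a particle strike as one bit flip and checks the stored parity. Parity is a standard detection technique chosen here for illustration; the slides do not say which mechanism is meant, and the helper names are hypothetical:

```python
def parity(word):
    """Even-parity bit of an integer: 1 if it has an odd number of 1 bits."""
    return bin(word).count("1") % 2

def flip_bit(word, position):
    """Model a particle strike by toggling one bit of the stored word."""
    return word ^ (1 << position)

stored = 0b1011_0010
stored_parity = parity(stored)  # computed when the word is written

corrupted = flip_bit(stored, position=5)
fault_detected = parity(corrupted) != stored_parity
print(f"{stored:08b} -> {corrupted:08b}, fault detected: {fault_detected}")
```

Parity only detects an odd number of flipped bits and cannot correct them, so it is the simplest point on a spectrum that extends to error-correcting codes and redundant execution.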
![Page 28: Computer Architecture Challenges Shriniwas Gadage](https://reader030.vdocument.in/reader030/viewer/2022032707/56649e4a5503460f94b3ebed/html5/thumbnails/28.jpg)
Ways to Evaluate New Architectures
[Figure: Simulating, Modeling, Building]
Tradeoff between three desired features
![Page 29: Computer Architecture Challenges Shriniwas Gadage](https://reader030.vdocument.in/reader030/viewer/2022032707/56649e4a5503460f94b3ebed/html5/thumbnails/29.jpg)
Building
• Construct a hardware prototype
• Advantages
+ Way cool to show off hardware
+ Runs quickly
• Disadvantages
– Takes a long time to build
– Expensive
– Not flexible
Generally too labor intensive for research studies
![Page 30: Computer Architecture Challenges Shriniwas Gadage](https://reader030.vdocument.in/reader030/viewer/2022032707/56649e4a5503460f94b3ebed/html5/thumbnails/30.jpg)
Modeling
• Mathematically model the system
– Use probabilities and/or queuing models
• Advantages
+ Very flexible
+ Very quick to develop
+ Runs quickly
• Disadvantages
– Cannot capture effects of system details
– Architects are skeptical of models
Generally OK for back of the envelope estimates
mem time = hit time + miss rate * penalty
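The memory-time model above fits in a one-line function. A minimal sketch; the 1-cycle hit time and 100-cycle penalty are assumed values for illustration, not figures from the slides:

```python
def mem_time(hit_time, miss_rate, penalty):
    """Average memory access time: hit time + miss rate * miss penalty."""
    return hit_time + miss_rate * penalty

# Hypothetical cache: 1-cycle hit, 100-cycle miss penalty.
for miss_rate in (0.01, 0.05, 0.10):
    print(f"miss rate {miss_rate:.0%}: {mem_time(1, miss_rate, 100):.1f} cycles")
```

Even this small model shows why miss rate matters: with these assumed values, going from a 1% to a 10% miss rate raises the average access time from 2 to 11 cycles.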
![Page 31: Computer Architecture Challenges Shriniwas Gadage](https://reader030.vdocument.in/reader030/viewer/2022032707/56649e4a5503460f94b3ebed/html5/thumbnails/31.jpg)
Simulating
• Write a program that mimics system behavior
• Advantages
+ Very flexible
+ Relatively quick to develop
• Disadvantages
– Runs slowly (e.g., 30,000 times slower than hardware)
Method of choice for most architectural research
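As a toy instance of a program that mimics system behavior, the sketch below simulates a direct-mapped cache over an address trace and counts hits and misses. The cache geometry, the trace, and the function name are assumptions for illustration; research simulators model far more detail:

```python
def simulate_cache(addresses, num_lines=4, line_bytes=16):
    """Count hits and misses for a direct-mapped cache over an address trace."""
    lines = [None] * num_lines          # tag currently held by each cache line
    hits = misses = 0
    for address in addresses:
        block = address // line_bytes   # which memory block is touched
        index = block % num_lines       # the one line this block may occupy
        tag = block // num_lines        # identifies the block within that line
        if lines[index] == tag:
            hits += 1
        else:
            misses += 1
            lines[index] = tag          # fetch the block, evicting the old one
    return hits, misses

# Hypothetical trace: walk a 128-byte array twice, 4 bytes at a time.
trace = [addr for _ in range(2) for addr in range(0, 128, 4)]
hits, misses = simulate_cache(trace)
print(f"hits={hits} misses={misses} miss rate={misses / len(trace):.2%}")
```

Rerunning with `num_lines=8` lets the whole array fit in the cache, so the second pass hits every time. Trying a design change by editing one parameter is the flexibility that makes simulation attractive.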
![Page 32: Computer Architecture Challenges Shriniwas Gadage](https://reader030.vdocument.in/reader030/viewer/2022032707/56649e4a5503460f94b3ebed/html5/thumbnails/32.jpg)
Simulation Challenges
[Figure: arrows connecting the Application and the System description to the Simulator, and the Simulator to Performance results]
Tough problems associated with each arrow!
![Page 33: Computer Architecture Challenges Shriniwas Gadage](https://reader030.vdocument.in/reader030/viewer/2022032707/56649e4a5503460f94b3ebed/html5/thumbnails/33.jpg)
Describing Simulated System
• How detailed must our simulator be?
• Model every transistor in the processor?
– Would take too long
• Abstract away details of processor organization?
– Could miss important effects of processor features
– Could reach the wrong conclusion
• Need balance
– Model in detail only where necessary
– E.g., Model memory system in detail, but abstract disks
![Page 34: Computer Architecture Challenges Shriniwas Gadage](https://reader030.vdocument.in/reader030/viewer/2022032707/56649e4a5503460f94b3ebed/html5/thumbnails/34.jpg)
Why Architects Need Friends
• Architecture is considered both computer engineering and computer science
• Architects interact with other areas
– Circuit design (Electrical Engineering)
– Transmission lines (EE)
– Power (EE, Mechanical Engineering)
– Compilers (Comp Sci)
– Operating systems (CS)
– Networking (EE, CS)
– Databases (CS)
– Queuing theory (CS, EE, Industrial Engineering)
![Page 35: Computer Architecture Challenges Shriniwas Gadage](https://reader030.vdocument.in/reader030/viewer/2022032707/56649e4a5503460f94b3ebed/html5/thumbnails/35.jpg)
How Architecture Relates to Other Areas
[Figure: Computer Architecture in relation to Application Software; Operating Systems, Compilers, Networking Software; and Circuits, Wires, Network Hardware]
• Besides these interactions, also global issues!
– Power, system verification, performance analysis, etc.
![Page 36: Computer Architecture Challenges Shriniwas Gadage](https://reader030.vdocument.in/reader030/viewer/2022032707/56649e4a5503460f94b3ebed/html5/thumbnails/36.jpg)
How Architecture Relates to Hardware (EE)
• Architecture should enable efficient hardware design
– Avoid huge hardware structures
– Avoid cross-chip wires
![Page 37: Computer Architecture Challenges Shriniwas Gadage](https://reader030.vdocument.in/reader030/viewer/2022032707/56649e4a5503460f94b3ebed/html5/thumbnails/37.jpg)
How Architecture Relates to System Software
• Architecture should support system software
– Provide good target for compiler
– Support important OS features (such as synchronization)
![Page 38: Computer Architecture Challenges Shriniwas Gadage](https://reader030.vdocument.in/reader030/viewer/2022032707/56649e4a5503460f94b3ebed/html5/thumbnails/38.jpg)
How Architecture Relates to User Software
• Architecture should efficiently run important apps
• Intel added MMX hardware to support media apps
• Sun & IBM design multiprocessors for commercial apps