Lecture 25: Wrap-Up
• Mid-term-II stats: high 91, mean 73.12
• Qs 1-3: half the class got 25/25
• Q 4: only one student got 25/25; almost no one mentioned that we’ll need a mechanism to determine exclusivity
• Q 5: highest was 22/30; very few mentioned that allowing blocks to move would complicate search
Example Solutions
[Figure: eight cores, CPU 0 through CPU 7, each with its own L1D and L1I caches]
Tetris?!
Non-Uniform Cache Access (NUCA)
• Many open problems in NUCA and D-NUCA:
  – How should search happen?
  – Allocation/replacement/migration policies
  – Managing bandwidth/latency on the network
  – Prefetch mechanisms
  – Selective replication of blocks
  – Efficient write-throughs
  – Power/performance trade-offs
• P.S. We have simulators, etc., to help model such caches in case anyone is interested
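The interaction between search and migration can be seen in a toy model. The sketch below is not the course simulator; the class name `DNucaSet`, the bank latencies, and the policies (search the nearest bank first, move a block one bank closer on a hit, insert into the farthest bank on a miss) are all assumptions chosen for illustration. Summing bank latencies along the search path is also a simplification.

```python
from dataclasses import dataclass, field


@dataclass
class DNucaSet:
    """Toy model of one D-NUCA bank set; bank 0 is closest to the core."""
    bank_latency: list                  # hypothetical access cycles per bank
    banks: list = field(init=False)     # tag held by each bank, or None

    def __post_init__(self):
        self.banks = [None] * len(self.bank_latency)

    def access(self, tag):
        """Incremental search, nearest bank first. Returns (hit, cycles)."""
        cycles = 0
        for i, held in enumerate(self.banks):
            cycles += self.bank_latency[i]
            if held == tag:
                if i > 0:
                    # Migration: swap the block one bank closer to the core.
                    self.banks[i - 1], self.banks[i] = held, self.banks[i - 1]
                return True, cycles
        # Miss: every bank was searched; insert into the farthest bank,
        # evicting whatever was there.
        self.banks[-1] = tag
        return False, cycles


s = DNucaSet([2, 4, 6])
for _ in range(4):
    print(s.access("A"), s.banks)
```

Because a block can sit in any bank, each lookup may have to probe several banks before it hits or gives up, which is why letting blocks move complicates search.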
Shameless Plug
• CS 7810: Advanced Architecture
• Lectures based on seminal (and still relevant) papers
• Not much work, apart from class project (in teams)
• Class project can involve as little as 1 week’s worth of concentrated effort…
• … or, enough to get a paper out of it
  – you WILL work on novel problems
  – lots of help from me/other students with the simulator
3-D
• Imagine a similar problem in 3D
[Figure: 3D stack of layers, each a grid of blocks labeled C and P]
Must schedule threads to manage temperature
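One simple way to picture temperature-aware scheduling is a greedy pairing of the hottest threads with the coolest cores. The sketch below is only an illustration under stated assumptions: the function name `schedule_by_temperature`, the core and thread names, and the numeric temperature and heat values are all hypothetical, and it ignores heat flowing between vertically stacked neighbors, which a real 3D scheduler would have to model.

```python
def schedule_by_temperature(core_temps, thread_heat):
    """Greedy sketch: pair the hottest threads with the coolest cores.

    core_temps:  {core_name: current temperature}      (hypothetical values)
    thread_heat: {thread_name: estimated heat output}  (hypothetical values)
    Returns {thread_name: core_name}. If there are more threads than
    cores, the coolest-running extra threads are left unscheduled.
    """
    cool_first = sorted(core_temps, key=core_temps.get)
    hot_first = sorted(thread_heat, key=thread_heat.get, reverse=True)
    return dict(zip(hot_first, cool_first))


placement = schedule_by_temperature(
    {"c0": 70, "c1": 55, "c2": 62},
    {"t0": 5, "t1": 9},
)
print(placement)
```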
Single Thread Performance
• To improve single-thread performance, can even schedule a single thread’s instructions across cores
  – large window of in-flight instructions to mine high ILP
  – requires high levels of speculation (power-hungry!)
  – any solutions?
[Figure: 3D stack of layers, each a grid of blocks labeled C and P]
Heterogeneous CMPs (Alpha EVx and Cell)
[Figure: chip combining out-of-order (o-o-o) cores with in-order (in-o) cores]
NASCAR Applied to CPUs !?!
Source: Eric Rotenberg (NCSU)
Runahead Execution
Single thread in a baseline architecture
Single thread executing in tandem with a helper thread
Reliability
[Figure: threads P1 and C2 on SMT core 1, threads P2 and C1 on SMT core 2; annotations "For power" and "For performance"]