Caching
Chapter 7
Memory Hierarchy
Levels, nearest the CPU first: CPU, L1, L2 Cache, DRAM
• Speed: fastest (nearest the CPU) to slowest (DRAM)
• Size: smallest (nearest the CPU) to largest (DRAM)
• Cost/bit: highest (nearest the CPU) to lowest (DRAM)
• Tech: L1 is SRAM (logic), L2 Cache is SRAM (logic), DRAM is DRAM (capacitors)
Two design decisions
• What shall we put in the cache?
• How shall we organize the cache to
– find things quickly
– hold the most important data
– freezer or backpack…
What to put in cache?
Try to apply a similar problem's solution
• Can we predict what data we will use?
– Instead of predicting branch direction, predict next memory address request
– Like branch prediction, use previous behavior
• Keep a prediction for every load?
– Fetch stage for load is *TOO LATE*
• Keep a prediction per memory address?
– Given address, guess next likely address
– Too many choices: table too large or fits too few
Program Characteristics
Find out more about programs
• Temporal Locality
– If you use one item, you are likely to use it again soon
• Spatial Locality
– If you use one item, you are likely to use its neighbors soon
Locality
• Programs tend to exhibit spatial & temporal locality. Just a fact of life.
• How can we use this knowledge of program behavior to design a cache?
What does that mean?!?
• 1. Design cache that takes advantage of spatial & temporal locality
• 2. When you program, place data together that is used together to increase spatial & temporal locality
– Java: difficult to do
– C: more control over data placement
• Note: Caches exploit locality. Programs have varying degrees of locality. Caches do not have locality!
Cache Design
• Temporal Locality
– When we obtain the data, store it in the cache.
• Spatial Locality
– Transfer large block of contiguous data to get item's neighbors.
– Block (Line): Amount of data transferred for a single miss (data plus neighbors)
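To make the block idea concrete, here is a minimal sketch (not from the slides) that counts misses for a sequential scan at several block sizes. It assumes an idealized cache that never evicts, and the function name `count_misses` and the access pattern are hypothetical, chosen only for illustration.

```python
def count_misses(byte_addresses, block_size_bytes):
    """Count misses for an idealized cache that never evicts anything.

    A miss brings in the whole block containing the address, so later
    accesses to neighbors inside that block hit.
    """
    cached_blocks = set()
    misses = 0
    for address in byte_addresses:
        block_address = address // block_size_bytes
        if block_address not in cached_blocks:
            misses += 1
            cached_blocks.add(block_address)
    return misses


# Hypothetical access pattern: read 16 consecutive 4-byte words.
sequential_words = [4 * i for i in range(16)]

for block_size in (4, 8, 16):
    print(block_size, count_misses(sequential_words, block_size))
```

With one-word blocks every access misses. Larger blocks turn accesses to neighbors into hits, which is the spatial-locality payoff described above.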
Where do we put data?
• Searching whole cache takes time & power
• Direct-mapped
– Limit each piece of data to one possible position
• Search is quick and simple
What is our “key” for lookup?
• Tools are sorted by tool-type
• Books are sorted by subject (Dewey-Decimal)
• Old LISP machine sorted by data type
• Modern machines have no information – can only sort by address
Direct-Mapped
[Figure: a four-entry cache (Index 00, 01, 10, 11) drawn beside Memory. The memory words are labeled with byte addresses 000000, 000100, 010000, 010100, 100000, 100100, 110000, 110100. Each box corresponds to one word (4 bytes). Later builds mark one block (line).]
Draw on the board!!! Show what addresses go where
Direct-Mapped cache
Block (Line) size = 2 words or 8 bytes
Byte Address: 0b100100100
• Where do we look in the cache?
– BlockAddress mod #slots
– BlockAddress & (#slots-1) (the same result when #slots is a power of two)
• How do we know if it is there?
– We need a tag & valid bit
• Where is it within the block?
[Figure: four-entry cache (Index 00, 01, 10, 11) with Valid, Tag, and Data columns; the address is labeled with its Tag and Index fields. After the access, the matching entry holds Valid = 1, Tag = 1001, Data = M[292-295] M[288-291].]
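The lookup rule can be sketched in a few lines. This is an illustrative sketch only: the function name `split_address` is hypothetical, and the four-slot cache size is an assumption read off the figure's indices (00 to 11). The block size of 2 words and the 4-byte word come from the slides.

```python
def split_address(address, num_slots=4, words_per_block=2, bytes_per_word=4):
    """Split a byte address into (tag, index, block_offset, byte_offset).

    Assumes every size is a power of two, so mod and mask agree.
    """
    byte_offset = address % bytes_per_word
    word_address = address // bytes_per_word
    block_offset = word_address % words_per_block
    block_address = word_address // words_per_block
    index = block_address & (num_slots - 1)  # same as block_address % num_slots
    tag = block_address // num_slots         # all of the upper bits
    return tag, index, block_offset, byte_offset


tag, index, block_offset, byte_offset = split_address(0b100100100)
print(bin(tag), index, block_offset, byte_offset)
```

For byte address 0b100100100 (decimal 292) this gives tag 0b1001, index 0, block offset 1, byte offset 0, which matches the tag 1001 and the M[292-295] word shown on the slide.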
Splitting the Address
[Figure: Direct-Mapped Cache with Valid, Tag, and Data columns and Index 00, 01, 10, 11; all valid bits are 0.]
Address 0b1010001 is split, from most significant to least significant bits, into: Tag, Index, Block Offset, Byte Offset
Definitions
• Byte Offset: Which byte within word
• Block Offset: Which word within block
• Set: Group of blocks checked each access
• Index: Which set within cache?
• Tag: Is this the right one? (All of the upper bits)
Definitions
• Block (Line): unit of data transfer, in bytes/words
• Hit: data found in this cache
• Miss: data not found in this cache
– Send request to lower level
• Hit time / Access time
– Time to access this cache: look for item, return data
• Miss Penalty
– Time to receive block from lower level
– Not always constant
Example 1 – Direct-Mapped
Block size = 2 words
Direct-Mapped Cache with four entries (Index 00, 01, 10, 11), each with Valid, Tag, and Data; all valid bits start at 0.
Address fields (7 bits): Tag (2 bits), Index (2 bits), Block Offset (1 bit), Byte Offset (2 bits)

Reference Stream   Tag  Index  Hit/Miss  Block loaded
0b1001000          10   01     M         M[76-79] M[72-75]
0b0010100          00   10     M         M[20-23] M[16-19]
0b0111000          01   11     M         M[60-63] M[56-59]
0b0010000          00   10     H
0b0010100          00   10     H
0b0100100          01   00     M         M[36-39] M[32-35]

Final cache contents:
Index  Valid  Tag  Data
00     1      01   M[36-39] M[32-35]
01     1      10   M[76-79] M[72-75]
10     1      00   M[20-23] M[16-19]
11     1      01   M[60-63] M[56-59]

Miss Rate: 4 / 6 = 67%
Hit Rate: 2 / 6 = 33%
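The Example 1 trace can be replayed with a short simulation. This is a minimal sketch under the example's stated parameters (four entries, 8-byte blocks). The function name `simulate_direct_mapped` is hypothetical, and the sketch tracks only valid bits and tags, not the data.

```python
def simulate_direct_mapped(addresses, num_slots=4, block_size_bytes=8):
    """Return a list of 'H'/'M' outcomes for a direct-mapped cache."""
    valid = [False] * num_slots
    tags = [None] * num_slots
    outcomes = []
    for address in addresses:
        block_address = address // block_size_bytes
        index = block_address % num_slots   # the one slot this block may use
        tag = block_address // num_slots    # remaining upper bits
        if valid[index] and tags[index] == tag:
            outcomes.append("H")
        else:
            outcomes.append("M")
            valid[index] = True             # fill the slot on a miss
            tags[index] = tag
    return outcomes


reference_stream = [0b1001000, 0b0010100, 0b0111000,
                    0b0010000, 0b0010100, 0b0100100]
outcomes = simulate_direct_mapped(reference_stream)
miss_rate = outcomes.count("M") / len(outcomes)
print(outcomes, f"miss rate = {miss_rate:.0%}")
```

The outcomes come out as M, M, M, H, H, M, giving the 4 / 6 miss rate shown on the slide.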
Implementation
[Figure: direct-mapped cache datapath for Byte Address 0b100100100, with Valid, Tag, and Data columns and Index 00, 01, 10, 11. The address is split into Tag, Index, Block offset, and Byte Offset. A comparator (=) checks the stored tag against the address tag to produce Hit?, and the Block offset drives a MUX that selects the Data word.]
Example 2
• You are implementing a 64-Kbyte cache
• The block size (line size) is 16 bytes.
• Each word is 4 bytes, address 32 bits
• How many bits is the block offset?
– 16 / 4 = 4 words -> 2 bits
• How many bits is the index?
– 64*1024 / 16 = 4096 -> 12 bits
• How many bits is the tag?
– 32 - (2 + 12 + 2) = 16 bits (address bits minus block offset, index, and byte offset)
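The same arithmetic can be checked in code. This is a small sketch that assumes a direct-mapped cache whose sizes are all powers of two; the function name `field_widths` and the dictionary keys are hypothetical.

```python
def field_widths(cache_bytes, block_bytes, word_bytes, address_bits):
    """Bit widths of each address field for a direct-mapped cache.

    For a power of two n, (n - 1).bit_length() equals log2(n).
    """
    byte_offset_bits = (word_bytes - 1).bit_length()
    block_offset_bits = (block_bytes // word_bytes - 1).bit_length()
    index_bits = (cache_bytes // block_bytes - 1).bit_length()
    tag_bits = address_bits - (block_offset_bits + index_bits + byte_offset_bits)
    return {
        "byte_offset": byte_offset_bits,
        "block_offset": block_offset_bits,
        "index": index_bits,
        "tag": tag_bits,
    }


widths = field_widths(cache_bytes=64 * 1024, block_bytes=16,
                      word_bytes=4, address_bits=32)
print(widths)
```

For the 64-Kbyte cache above this reports 2 byte-offset bits, 2 block-offset bits, 12 index bits, and 16 tag bits.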
How caches work
• Classic abstraction
• Each level of hierarchy has no knowledge of the configuration of lower level
[Figure: two views of the hierarchy. L1 cache's perspective: "Me" is L1, and the L2 Cache plus DRAM together are "Memory". L2 cache's perspective: "Me" is the L2 Cache, and DRAM is "Memory".]
Memory operation at any level
[Figure: "Me" sends an Address to the Cache, which sits in front of Memory; Data is returned to "Me". The arrows are numbered 1 to 5 to match the steps.]
1. Cache receives request
2. Look for item in cache
Hit: return data
Miss:
3. Request memory
4. Receive data
5. Update cache and return data
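The numbered steps can be sketched as a chain of levels, each of which knows only that something below it answers `read_block`. This is an illustrative sketch, not the slides' design. The class names `MainMemory` and `CacheLevel` are hypothetical, and the caches here are unbounded (no eviction) to keep the control flow visible.

```python
class MainMemory:
    """Lowest level: always has the data (modeled as a dict of block -> value)."""

    def __init__(self, contents):
        self.contents = contents

    def read_block(self, block_address):
        return self.contents[block_address]


class CacheLevel:
    """One cache level; it only knows that a lower level exists."""

    def __init__(self, lower_level):
        self.lower_level = lower_level
        self.blocks = {}
        self.hits = 0
        self.misses = 0

    def read_block(self, block_address):
        # Steps 1-2: receive the request and look for the item.
        if block_address in self.blocks:
            self.hits += 1
            return self.blocks[block_address]               # hit: return data
        self.misses += 1
        data = self.lower_level.read_block(block_address)   # steps 3-4
        self.blocks[block_address] = data                   # step 5: update cache
        return data                                         # step 5: return data


memory = MainMemory({n: f"block {n}" for n in range(8)})
l2 = CacheLevel(memory)
l1 = CacheLevel(l2)
l1.read_block(3)  # misses in L1 and L2, served by memory
l1.read_block(3)  # hits in L1; L2 is never asked
print(l1.hits, l1.misses, l2.hits, l2.misses)
```

Because `CacheLevel` treats whatever is below it as plain "memory", L1 and L2 use identical code, which mirrors the abstraction that each level has no knowledge of the configuration of the level below.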
Timing
[Figure: the same Cache / Memory diagram, annotated with two intervals: Access Time and Miss Penalty.]
1. Cache receives request
2. Look for item in cache
Hit: return data
Miss: request memory, receive block, update cache, return data
• Access Time: the lookup in this cache
• Miss Penalty: requesting memory, receiving the block, and updating the cache
Performance
• Hit: latency = access time
• Miss: latency = access time + miss penalty
• Goal: minimize misses!!!
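As a rough illustration of why misses dominate, the sketch below applies the two latency rules to the hit/miss pattern from Example 1. The cycle counts are hypothetical, not from the slides, and the miss penalty is treated as constant even though it is not always constant in practice.

```python
def access_latency(outcome, access_time, miss_penalty):
    """Latency of one access: a hit costs the access time, a miss adds the penalty."""
    if outcome == "H":
        return access_time
    return access_time + miss_penalty


# Hit/miss pattern from Example 1; the timing values are made up for illustration.
outcomes = ["M", "M", "M", "H", "H", "M"]
ACCESS_TIME = 1    # hypothetical cycles
MISS_PENALTY = 10  # hypothetical cycles, assumed constant

latencies = [access_latency(o, ACCESS_TIME, MISS_PENALTY) for o in outcomes]
average_latency = sum(latencies) / len(latencies)
print(latencies, round(average_latency, 2))
```

With these assumed numbers the four misses account for 44 of the 46 total cycles, so lowering the miss count improves latency far more than shaving the access time.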