
Page 1: Cache

CSIT 301 (Blum) 1

Cache

Based in part on Chapter 9 in Computer Architecture (Nicholas Carter)

Page 2: Cache

CSIT 301 (Blum) 2

Bad Direct Mapping Scenario Recalled

• With a direct-mapped cache, the loop involves memory locations that share the same cache address. With a set-associative cache, the loop involves memory locations that share the same set of cache addresses.

• It is thus possible with a set-associative cache that each of these memory locations is cached to a different member of the set, so the iterations can proceed without repeated cache misses (a sketch follows below).
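As a rough illustration (not from the slides), the following Python sketch simulates a loop that alternates between two memory blocks whose addresses collide; the cache sizes and the trace are made-up values. With one way per set the blocks evict each other on every iteration, while a 2-way set holds both.

```python
# Hypothetical illustration: two blocks that collide in a direct-mapped cache
# can coexist in one set of a 2-way set-associative cache.

def count_misses(trace, num_sets, ways):
    """Count misses for a tiny cache with LRU replacement within each set."""
    sets = [[] for _ in range(num_sets)]    # each set holds up to `ways` tags
    misses = 0
    for block in trace:
        index, tag = block % num_sets, block // num_sets
        s = sets[index]
        if tag in s:
            s.remove(tag)                   # refresh its position in the LRU order
            s.append(tag)
        else:
            misses += 1
            if len(s) == ways:
                s.pop(0)                    # evict the least recently used tag
            s.append(tag)
    return misses

trace = [0, 8] * 10                         # loop touching two colliding blocks
print(count_misses(trace, num_sets=8, ways=1))   # 20 misses: direct mapping thrashes
print(count_misses(trace, num_sets=4, ways=2))   # 2 misses: both blocks fit in one set
```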

Page 3: Cache

CSIT 301 (Blum) 3

The Problem with Fully Associative Cache

• All of those comparators are made of transistors. They take up room on the die. And any space lost to comparators has to be taken away from the data array.
– After all, we’re talking about thousands of comparators.

• ASSOCIATIVITY LOWERS CAPACITY!

Page 4: Cache

CSIT 301 (Blum) 4

Set-Associative Caches: The Compromise

• For example, instead of having the 1000-to-1 mapping we had with direct mapping, we could elect to have an 8000-to-8 mapping.

• That is, a given memory location can be cached into any of 8 cache locations, but the set of memory locations sharing those cache locations has also gone up by a factor of 8.

• This would be called an 8-way set associative cache.

Page 5: Cache

CSIT 301 (Blum) 5

A Happy Medium

• 4- or 8-way set associative provides enough flexibility to allow one (under most circumstances) to cache the necessary memory locations to get the desired effects of caching for an iterative procedure.
– I.e., it minimizes cache misses.

• But it only requires 4 or 8 comparators instead of the thousands required for fully associative caches.

Page 6: Cache

CSIT 301 (Blum) 6

Set-Associative Cache

• Again the memory address is broken into three parts (a sketch of the split follows below).
– One part determines the position in the line.
– One part determines, this time, a set of cache addresses.
– The last part is compared to what is stored in the tags of the set of cache locations.
– Etc.
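A minimal sketch of that three-way split, assuming (for illustration only) 64-byte lines and 128 sets; these parameters are not from the slides.

```python
# Hypothetical parameters: 64-byte cache lines, 128 sets.
LINE_SIZE = 64                               # bytes per line
NUM_SETS  = 128                              # sets in the cache

OFFSET_BITS = LINE_SIZE.bit_length() - 1     # 6 bits: position within the line
INDEX_BITS  = NUM_SETS.bit_length() - 1      # 7 bits: which set of cache addresses

def split_address(addr):
    offset = addr & (LINE_SIZE - 1)                   # position in the line
    index  = (addr >> OFFSET_BITS) & (NUM_SETS - 1)   # selects a set
    tag    = addr >> (OFFSET_BITS + INDEX_BITS)       # compared against stored tags
    return tag, index, offset

tag, index, offset = split_address(0x1234ABCD)
print(hex(tag), hex(index), hex(offset))     # 0x91a5 0x2f 0xd
```

On a lookup, every tag stored in the selected set is compared against the address's tag field, which is why an N-way cache needs N comparators.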

Page 7: Cache

CSIT 301 (Blum) 7

PCGuide.com comparison table

To which we add that full associativity has an adverse effect on capacity.

Page 8: Cache

CSIT 301 (Blum) 8

Cache Misses

• When a cache miss occurs, several factors have to be considered. For example,
– We want the new memory location written into the cache, but where?
– Can we continue attempting other cache interactions, or should we wait?
– What if the cached data has been modified?
– Should we do anything with the data we are taking out of the cache?

Page 9: Cache

CSIT 301 (Blum) 9

Replacement Policy

• Upon a cache miss, the memory that was not found in cache will be written to cache, but where?
– In Direct Mapping, there is no choice: it can only be written to the cache address it is mapped to.
– In Associative and Set-Associative caches, there is a choice of what to replace.

Page 10: Cache

CSIT 301 (Blum) 10

Replacement Policy (Cont.)

• Least Recently Used (LRU)
– Track the order in which the items in cache were used; replace the line that is last in that order, i.e., the least recently used (a sketch follows below).
– This is best in keeping with the locality-of-reference notion behind cache, but it requires a fair amount of overhead.

• This can be too much overhead even in a set-associative cache, where there may be only eight places under consideration.
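A minimal sketch of LRU bookkeeping for a single set, assuming an 8-way set; the list order stands in for the hardware's age tracking.

```python
# Hypothetical 8-way set with LRU replacement; list position encodes recency.
class LRUSet:
    def __init__(self, ways=8):
        self.ways = ways
        self.tags = []                  # index 0 = least recently used

    def access(self, tag):
        """Return True on a hit; on a miss, evict the LRU tag if the set is full."""
        if tag in self.tags:
            self.tags.remove(tag)       # move the tag to the most-recently-used end
            self.tags.append(tag)
            return True
        if len(self.tags) == self.ways:
            self.tags.pop(0)            # evict the least recently used line
        self.tags.append(tag)
        return False

s = LRUSet()
print([s.access(t) for t in [1, 2, 1, 3]])   # [False, False, True, False]
```

Even with only eight ways, the hardware must maintain a full recency ordering for every set, which is the overhead referred to above.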

Page 11: Cache

CSIT 301 (Blum) 11

Replacement Policy (Cont.)

• Least Frequently Used (LFU)
– Similar to the above: track how often each item in cache is used, and replace the item with the lowest frequency.

• Not-Most-Recently-Used
– Another approach is to choose a line at random, except that one protects the line (from the set) that has been used most recently (a sketch follows below).
– Less overhead.
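A sketch of the not-most-recently-used idea, with the same caveat that the details are illustrative: only the most recently used way is remembered, and the victim is chosen at random from the rest.

```python
import random

# Hypothetical NMRU choice: protect the most recently used way and pick any
# other way at random, so only a single index needs to be tracked per set.
def nmru_victim(ways, mru_way):
    candidates = [w for w in range(ways) if w != mru_way]
    return random.choice(candidates)

print(nmru_victim(ways=8, mru_way=3))   # any way except way 3
```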

Page 12: Cache

CSIT 301 (Blum) 12

Blocking or Non-Blocking Cache

• Replacement requires interacting with a slower type of memory (a lower level of cache or main memory). Do we allow the processor to continue to access cache during this procedure or not?

• This is the distinction between blocking and non-blocking cache (a rough sketch follows below).
– In blocking, all cache transactions must wait until the cache has been updated.
– In non-blocking, other cache transactions are possible.
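As a loose illustration of the difference (this toy model is an assumption, not how a real controller is organized), the sketch below classifies a short stream of accesses while one line's miss is still being filled.

```python
# Hypothetical contrast: a blocking cache stalls every access until the
# outstanding miss is filled; a non-blocking cache lets accesses to other
# lines proceed while the miss is pending.
def service(accesses, blocking, pending_line=7):
    """pending_line is a line whose miss is currently being filled from memory."""
    proceeds, waits = [], []
    for line in accesses:
        if blocking or line == pending_line:
            waits.append(line)       # must wait for the fill to complete
        else:
            proceeds.append(line)    # access to another line can proceed
    return proceeds, waits

print(service([1, 7, 2], blocking=True))    # ([], [1, 7, 2]): everything waits
print(service([1, 7, 2], blocking=False))   # ([1, 2], [7]): only the missing line waits
```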

Page 13: Cache

CSIT 301 (Blum) 13

Cache Write Policy

• The data cache may not only be read but may also be written to. But cache is just standing in as a handy representative for main memory; that is really where one wants to write, and writing to main memory is relatively slow, just as reading from it is.

• The rules about when one does this writing to memory are called one’s write policy.

• One reason for separating data cache and instruction cache is that the instruction cache does not require a write policy.
– Recall that this separation is known as the Harvard cache.

Page 14: Cache

CSIT 301 (Blum) 14

Write-Back Cache

• Because writing to memory is slow, in Write-Back Cache, a.k.a. “copy back” cache, one waits until the line of cache is being replaced to write any values back to memory.
– Main memory and cache are inconsistent, but the cache value will always be used.
– In such a case the memory is said to be “stale.”

Page 15: Cache

CSIT 301 (Blum) 15

Dirty Bit

• Since writing back to main memory is slow, one only wants to do it if necessary, that is, if some part of the line has been updated.

• Each line of cache has a “dirty bit” which tells the cache controller whether or not the line has been updated since it was last replaced.
– Only if the dirty bit is flipped does one need to write back (a sketch follows below).
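A minimal write-back sketch under these assumptions: a single cache line with a dirty bit, and a plain dictionary standing in for main memory.

```python
# Hypothetical write-back line: stores go only to the cache and set the dirty
# bit; memory is updated only when a dirty line is evicted.
memory = {}

class WriteBackLine:
    def __init__(self, tag, data):
        self.tag, self.data, self.dirty = tag, data, False

    def write(self, data):
        self.data = data
        self.dirty = True            # memory is now stale

    def evict(self):
        if self.dirty:               # write back only if the line was modified
            memory[self.tag] = self.data

line = WriteBackLine(tag=0x12, data=0)
line.write(42)
line.evict()
print(memory)                        # {18: 42} -- written back only on eviction
```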

Page 16: Cache

CSIT 301 (Blum) 16

Pros and Cons of Write Back

• Pro: Write Back takes advantage of the locality of reference concept. If the line of cache is written to, it’s likely to be written to again soon (before it is replaced).

• Con: When one writes back to main memory, one must write the entire line.

Page 17: Cache

CSIT 301 (Blum) 17

Write-Through Cache

• With Write-Through Cache, one writes the value back to memory every time a cache line is updated (a sketch follows below).
– Con: Effectively a write-through cache is being used as a cache (fast stand-in for memory) only for purposes of reading and not for writing.
– Pro: When one writes, one is only writing a byte instead of a line. That’s not much of an advantage given the efficiency of burst/page reading/writing when the cache interacts with memory.
– Pro: Integrity: cache and memory always agree.
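For contrast, a write-through sketch under the same assumptions: every store goes to both the cache line and the dictionary standing in for memory, so no dirty bit is needed.

```python
# Hypothetical write-through line: every write also updates memory immediately,
# so cache and memory always agree and eviction requires no write-back.
memory = {}

class WriteThroughLine:
    def __init__(self, tag, data):
        self.tag, self.data = tag, data

    def write(self, data):
        self.data = data
        memory[self.tag] = data      # the slow write to memory happens on every store

line = WriteThroughLine(tag=0x12, data=0)
line.write(42)
print(memory)                        # {18: 42} -- consistent before any eviction
```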

Page 18: Cache

CSIT 301 (Blum) 18

Comparing Policies

• Write back is more efficient.

• Write through maintains integrity.

• Integrity is not so much an issue at the SRAM-DRAM interface in the memory hierarchy since both are volatile.
– This issue is more important at the next lower interface (main memory/virtual memory), as virtual memory is non-volatile.

Page 19: Cache

CSIT 301 (Blum) 19

Victim Cache

• Other than writing modified data back to memory, what do we do with the data that is being replaced?

• One answer is: nothing.

• Another possibility is to store it in a buffer that is faster than the next lower level, effectively introducing another small level of cache. This is known as the victim cache or victim buffer (a sketch follows below).

• Monitoring the victim cache can lead to improved replacement policies.
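A sketch of the victim-buffer idea; the four-entry size and FIFO behavior are assumptions for illustration.

```python
from collections import deque

# Hypothetical victim buffer: a tiny FIFO holding the last few evicted lines,
# so a quick re-reference can be satisfied without going to the next level.
victim_buffer = deque(maxlen=4)          # (tag, data) pairs for evicted lines

def on_evict(tag, data):
    victim_buffer.append((tag, data))

def lookup_after_miss(tag):
    """Check the victim buffer before paying the full next-level penalty."""
    for t, data in victim_buffer:
        if t == tag:
            return data                  # recovered from the victim cache
    return None                          # must go to the next lower level

on_evict(0x2F, "evicted line")
print(lookup_after_miss(0x2F))           # evicted line
```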

Page 20: Cache

CSIT 301 (Blum) 20

References

• Computer Architecture, Nicholas Carter
• http://www.simmtester.com/page/news/showpubnews.asp?num=101
• http://www.pcguide.com/ref/mbsys/cache/
• http://www.howstuffworks.com/cache.htm/printable
• http://slcentral.com/articles/00/10/cache/print.php
• http://en.wikipedia.org/wiki/Cache