Post on 18-Dec-2015


On-Chip Cache Analysis

A Parameterized Cache Implementation for a System-on-Chip RISC CPU

Presentation Outline

Informal Introduction
Underpin Design – xr16
Cache Design Issues
Implementation Details
Results & Conclusion
Future Work
Questions

Informal Introduction

Field-Programmable Gate Arrays (FPGAs)
Verilog HDL
System-on-Chip (SoC)
Reduced Instruction Set Computer (RISC)
Caches
Project Theme

Underpin Design – xr16

Classical pipelined RISC
Big-endian, von Neumann architecture
Sixteen 16-bit registers
Forty-two 16-bit instructions
Result forwarding, branch annulment, interlocked instructions

Underpin Design – xr16 (cont’d)

Internal and external buses (CPU-clocked)
Pipelined memory interface
Single-cycle read, 3-cycle write
DMA and interrupt handling support
Ported compiler and assembler
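The single-cycle-read, 3-cycle-write timing above can be sketched as a toy cycle-accounting model. This is purely illustrative; the constants come from the slide, while the function name and the sequence representation are invented for the example.

```python
# Toy cycle-accounting model of the xr16 memory interface described above:
# reads complete in one cycle, writes take three (per the slide).
READ_CYCLES, WRITE_CYCLES = 1, 3

def total_cycles(ops):
    """ops is a sequence of 'R'/'W' memory operations; returns total cycles."""
    return sum(READ_CYCLES if op == 'R' else WRITE_CYCLES for op in ops)
```

For example, a read-read-write sequence costs 1 + 1 + 3 = 5 cycles; minimizing writes (or hiding them behind the pipeline) is what makes the cache's write policy matter.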

Underpin Design – xr16 (cont’d)

Block Diagram

Underpin Design – xr16 (cont’d)

Datapath

Underpin Design – xr16 (cont’d)

Memory Preferences

Underpin Design – xr16 (cont’d)

RAM Interface

Cache Design Issues

Cache Size *
Line Size
Fetch Algorithm
Placement Policy *
Replacement Policy *
Split vs. Unified Cache

(* = configurable parameter in this implementation)

Cache Design Issues (cont’d)

Write-Back Strategy *
Write-Allocate Policy *
Blocking vs. Non-Blocking
Pipelined Transactions
Virtually Addressed Caches
Multilevel Caches

Cache Design Issues (cont’d)

Cache Size: 32 – 256K data bits
Placement Policy: Direct Mapped, Set Associative, Fully Associative
Replacement Policy: FIFO, Random*
Write Back Strategy: Write Back, Write Through
Write Allocate Policy: Write Allocate, Write No Allocate
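For a direct-mapped configuration, the placement decision reduces to slicing the address into tag, index, and offset fields. The sketch below is a minimal illustration; the geometry (16 lines of 4 words) is an assumption for the example, not a parameter taken from the slides.

```python
# Illustrative address breakdown for a direct-mapped cache.
# LINE_WORDS and NUM_LINES are assumed values, chosen only for this example.
LINE_WORDS = 4    # words per cache line
NUM_LINES = 16    # lines in the cache

def split_address(addr):
    """Split a word address into (tag, index, offset) fields."""
    offset = addr % LINE_WORDS                    # position within the line
    index = (addr // LINE_WORDS) % NUM_LINES      # which line the word maps to
    tag = addr // (LINE_WORDS * NUM_LINES)        # disambiguates aliased addresses
    return tag, index, offset
```

With these sizes, address 295 splits into tag 4, index 9, offset 3; any two addresses with the same index but different tags compete for the same line, which is what forces the replacement cases discussed later.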

Implementation Details

Configurable Parameters

Cache Size
Placement Strategy
Write Back Policy
Write Allocate Policy
Replacement Policy
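One way to picture these five configurable parameters is as a single configuration record. The field names and defaults below are invented for illustration; they are not the actual Verilog parameter names used in the implementation.

```python
# Hedged sketch of the implementation's configurable parameters as a record.
# Field names and defaults are illustrative, not the real Verilog parameters.
from dataclasses import dataclass

@dataclass
class CacheConfig:
    size_bits: int = 2048            # total data bits; slides give a 32 - 256K range
    placement: str = "direct"        # "direct", "set_assoc", or "fully_assoc"
    write_back: bool = True          # write-back (True) vs. write-through (False)
    write_allocate: bool = True      # allocate on a write miss or not
    replacement: str = "random"      # "fifo" or "random", per the slides
```

Grouping the parameters this way mirrors how a parameterized Verilog module would take them: one set of compile-time knobs that together select which of the hit/miss cases below can actually occur.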

Implementation Details (cont’d)

1. Miss Read, Replacement NOT Required
Let the memory operation complete and place the fetched data from memory in the cache.

Implementation Details (cont’d)

2. Miss Read, Replacement Required
Initiate a write memory operation and write back the set to be replaced. Then initiate a read operation for the desired data.

Implementation Details (cont’d)

3. Miss Write, No Allocate
Let the memory operation complete and do nothing else.

Implementation Details (cont’d)

4. Miss Write, Allocate, Write-Through
Let the memory operation complete and place the new data in the cache.

Implementation Details (cont’d)

5. Miss Write, Allocate, Write-Back, Replacement NOT Required
Cancel the memory operation, update only the cache, and mark the data dirty.

Implementation Details (cont’d)

6. Miss Write, Allocate, Write-Back, Replacement Required
Instead of writing the data that caused the write miss, write back the set that is to be replaced, then update the cache with the data that caused the miss.

Implementation Details (cont’d)

7. Hit Read
Cancel the memory operation and provide the data for either an instruction fetch or a data load instruction.

Implementation Details (cont’d)

8. Hit Write, Write-Through
Let the memory operation complete, then update the cache when it completes.

Implementation Details (cont’d)

9. Hit Write, Write-Back
Cancel the memory operation and update the cache.

Implementation Details (cont’d)

1. Read Hit
2. Write Hit
3. Read Miss (replacement)
4. Read Miss (no replacement)
5. Write Miss (replacement)
6. Write Miss (no replacement)
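The nine cases above can be sketched as a small behavioral model. This is a minimal Python illustration assuming a direct-mapped, word-addressed cache with one word per line; all class and method names are invented for the example, and "cancel the memory operation" is modeled simply as not touching backing memory.

```python
# Behavioral sketch of the nine hit/miss cases, for a direct-mapped cache
# with one word per line. Names are illustrative, not from the xr16 sources.
class Line:
    def __init__(self):
        self.valid = False
        self.dirty = False
        self.tag = None
        self.data = 0

class SimpleCache:
    def __init__(self, memory, n_lines=8, write_back=True, write_allocate=True):
        self.mem = memory                  # backing store: a list of words
        self.n = n_lines
        self.write_back = write_back
        self.write_allocate = write_allocate
        self.lines = [Line() for _ in range(n_lines)]

    def _line(self, addr):
        return self.lines[addr % self.n], addr // self.n

    def read(self, addr):
        line, tag = self._line(addr)
        if line.valid and line.tag == tag:       # case 7: hit read, serve from cache
            return line.data
        if line.valid and line.dirty:            # case 2: write back the victim first
            self.mem[line.tag * self.n + addr % self.n] = line.data
        line.valid, line.dirty = True, False     # case 1: fill line from memory
        line.tag, line.data = tag, self.mem[addr]
        return line.data

    def write(self, addr, value):
        line, tag = self._line(addr)
        if line.valid and line.tag == tag:       # write hit
            if self.write_back:                  # case 9: cache only, mark dirty
                line.data, line.dirty = value, True
            else:                                # case 8: memory completes, then cache
                self.mem[addr] = value
                line.data = value
            return
        if not self.write_allocate:              # case 3: memory only, nothing else
            self.mem[addr] = value
            return
        if not self.write_back:                  # case 4: memory completes, fill cache
            self.mem[addr] = value
            line.valid, line.dirty = True, False
            line.tag, line.data = tag, value
            return
        if line.valid and line.dirty:            # case 6: write back victim first
            self.mem[line.tag * self.n + addr % self.n] = line.data
        line.valid, line.dirty = True, True      # case 5: cache only, mark dirty
        line.tag, line.data = tag, value
```

A short trace shows the write-back behavior: after a write hit, backing memory still holds the stale value until a conflicting access evicts the dirty line.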

Results & Conclusion

Proof of Concept
Rigid Design Parameters
R&D Options
Architecture Innovation

Future Work

LRU Implementation
Victim Cache Buffer
Split Caches
Level 2 Cache
Pipeline Enrichment
Multiprocessor Support

Questions
