Transcript
Page 1: Computer Architecture Foundations for Graduate Level Students

Computer Architecture Foundations for Graduate Level

Students

Page 2: Computer Architecture Foundations for Graduate Level Students

Basic Paradigm

HDMM CPU

cache

Page 3: Computer Architecture Foundations for Graduate Level Students

Transfers of data

HDMM CPU

cache

Page 4: Computer Architecture Foundations for Graduate Level Students

Transfers of data

HDMM CPU

cache

Page 5: Computer Architecture Foundations for Graduate Level Students

CPU needs particular data

HDMM CPU

cache

Page 6: Computer Architecture Foundations for Graduate Level Students

If required data is found in cache

HDMM CPU

cache

Page 7: Computer Architecture Foundations for Graduate Level Students

When required data is not in cache

HDMM CPU

cache

Page 8: Computer Architecture Foundations for Graduate Level Students

CPU ultimately gets data from cache

HDMM CPU

cache

Page 9: Computer Architecture Foundations for Graduate Level Students

If data is not in cache and in MM

HDMM CPU

cache

Page 10: Computer Architecture Foundations for Graduate Level Students

From HD to MM to cache

HDMM CPU

cache

Page 11: Computer Architecture Foundations for Graduate Level Students

If MM is full

HDMM CPU

cache

Page 12: Computer Architecture Foundations for Graduate Level Students

If cache is full

HDMM CPU

cache

Page 13: Computer Architecture Foundations for Graduate Level Students

Swapping between MM and cache

HDMM CPU

cache

Page 14: Computer Architecture Foundations for Graduate Level Students

Access Time

• If every memory reference to cache required transfer of one word between MM and cache, no increase in speed is achieved. In fact, speed will drop because apart from MM access, there is additional access to cache

• Suppose reference is repeated n times, and after the first reference, location is always found in the cache

Page 15: Computer Architecture Foundations for Graduate Level Students

Cache Hit Ratio

• The probability that a word will be found in the cache

• Depends upon the program and the size and organization of the cache

h = Number of times required word found in cacheTotal number of references

h: hit ratio

Page 16: Computer Architecture Foundations for Graduate Level Students

Access Time

mca thtt )1(

ta = Average access timetc = Cache access time(1-h) = miss ratiotm = Memory access time

Page 17: Computer Architecture Foundations for Graduate Level Students

Fetch Mechanisms

• Demand Fetch– Fetch a block from memory when it is needed

and is not in the cache

• Prefetch– Fetch block/s from memory before they are

requested

• Selective Fetch– Not always fetching blocks, dependent on

some defined criterion; blocks are stored in MM rather than the cache

Page 18: Computer Architecture Foundations for Graduate Level Students

Data in cache should be replaced with data from MM

• Blocks (a group of memory addresses) are transferred from MM to cache

• Cache has a limited capacity (page frame)

MM

cache

Page 19: Computer Architecture Foundations for Graduate Level Students

Replacement Algorithms

• When the word being requested by the CPU is not in the cache, it needs to be transferred from MM. (or it can also be from secondary memory to MM)

• A page fault occurs when a page or a block is not in the cache (or MM in the case of secondary memory)

• Replacement algorithms determine which page/block to remove or overwrite

Page 20: Computer Architecture Foundations for Graduate Level Students

Characteristics

• Usage based or Non-usage based– Usage based : the choice of page/block to

replace is dependent on the how many times each page/block has been referenced

– Non-usage based : Use some other criteria for replacement

Page 21: Computer Architecture Foundations for Graduate Level Students

Assumptions

• For a given page size, we only need to consider the page/block number.

• If we have a reference (hit) to a page p, then any immediately succeeding references to p does not cause a page fault

• The size of memory/cache is represented as the number of pages it is capable of holding (page frame )

Page 22: Computer Architecture Foundations for Graduate Level Students

Example

Consider the following address sequence calls:

0110 0432 0101 0612 0102 0103 0104

0101 0611 0102 0103 0302

which, at 100 bytes per page, can be reduced to the following access string:

1 4 1 6 1 6 1 3

This sequence of page requests is called a reference string.

Page 23: Computer Architecture Foundations for Graduate Level Students

Replacement Policies

• Random replacement algorithm

• First-in first-out replacement

• Optimal Algorithm

• Least recently used algorithm

• Least Frequently Used

• Most Frequently Used

Page 24: Computer Architecture Foundations for Graduate Level Students

Random Replacement

• A page is chosen randomly at page fault time

• There is no relationship between the pages or their use.

• Choice is done by a random number generator.

Page 25: Computer Architecture Foundations for Graduate Level Students

FIFO

• Memory treated as a queue– When a page comes in, it is inserted at the tail– When a page is removed, the entry at the

head of the queue gets deleted

• Easy to understand and program

• Performance is not consistently good; dependent on reference string

Page 26: Computer Architecture Foundations for Graduate Level Students

FIFO Example

Consider the following reference string:7 0 1 2 0 3 0 4 2

With a page frame of 3

* * * * * * * *

7 0 1 2 2 3 0 4 27 0 1 1 2 3 0 4

7 0 0 1 2 3 0

An * indicates a miss (the page requested by the CPU is not in the cache or in MM)

Page 27: Computer Architecture Foundations for Graduate Level Students

FIFO Example #2

Consider the following reference string:1 2 3 4 1 2 5 1 2 3 4 5

With a page frame of 3

* * * * * * * * * 1 2 3 4 1 2 5 5 5 3 4 4 1 2 3 4 1 2 2 2 5 3 3

1 2 3 4 1 1 1 2 5 5

We have 9 page faultsTry performing this FIFO with a page frame of 4

Page 28: Computer Architecture Foundations for Graduate Level Students

Belady’s Anomaly

• An increase in page frame does not necessarily mean a decrease in page faults

• More formally, Belady’s anomaly reflects the fact that, for some page-replacement algorithms, the page fault rate may increase as the number of allocated frames increases

Page 29: Computer Architecture Foundations for Graduate Level Students

Optimal Algorithm

• The page that will not be used for the longest period of time is replaced

• Guarantees the lowest page fault rate for a fixed number of frames

• Difficult to implement because it requires future knowledge of the reference string

Page 30: Computer Architecture Foundations for Graduate Level Students

Optimal Algorithm Example

Consider the following reference string:7 0 1 2 0 3 0 4 2With a page frame of 3

We look ahead and see that 7 is the page which will not be used again, so we replace 7; we also note that after our first hit we should not replace 0 immediately, but rather 1 because 1 will not be referenced any more (2 will be referenced last.)

* * * * * *7 0 1 2 2 3 3 4 4

7 0 1 1 2 2 3 37 0 0 0 0 2 2

Page 31: Computer Architecture Foundations for Graduate Level Students

Least Recently Used

• Approximates the optimal algorithm

• Replaces the page that has not been used for the longest period of time

• When all page frames have been used up and every time there is a page hit, the referenced page is placed at the tail to indicate it has been recently accessed

Page 32: Computer Architecture Foundations for Graduate Level Students

LRU Example

Consider the following reference string:7 0 1 2 0 3 0 4 0 3 0 2

With a page frame of 3

* * * * * * *7 0 1 2 0 3 0 4 0 3 0 2 7 0 1 2 0 3 0 4 0 3 0

7 0 1 2 2 3 3 4 4 3

We have 7 page faultsTry performing this LRU with a page frame of 4

Page 33: Computer Architecture Foundations for Graduate Level Students

Least Frequently Used

• Counts the number of references made to each page; when page is accessed, counter is incremented by one

• Page with smallest count is replaced• FIFO is used to resolve a tie• Rationale: Page with the bigger counter is an

actively used page• Problem

– Page initially actively may never be used again– Solved by using a decaying counter

Page 34: Computer Architecture Foundations for Graduate Level Students

LFU Example

Consider the following reference string:7 0 1 2 0 3 0 4 0 3 0 2

With a page frame of 3

* * * * * * *71 01 11 21 21 31 31 41 41 41 41 21

71 01 11 11 21 21 31 31 32 32 32

71 01 02 02 03 03 04 04 05 05

We have 7 page faultsTry performing this LFU with a page frame of 4

Page 35: Computer Architecture Foundations for Graduate Level Students

Most Frequently Used

• Opposite of LFU• Replace page with the highest count• Tie is resolved using FIFO• Based on the argument that the page with

smallest count has just been probably brought in and is yet to be used

• Both LFU and MFU are not common and implementation is expensive.

Page 36: Computer Architecture Foundations for Graduate Level Students

The Central Processing Unit

• The operating hub and heart of every computer system

• Composed of– Control Unit– Datapath

• Each component inside the CPU has a specific role in executing a command

• Communicates with other components of the system

Page 37: Computer Architecture Foundations for Graduate Level Students

Control Unit (CU)

• Regulates all activities inside the machine

• Serves as “nerve center” that sends control signals to other units and senses their status

• Connected to all components in the CPU as well as main memory

Page 38: Computer Architecture Foundations for Graduate Level Students

How The CU Is Connected

ControlUnit

Registers

ALU

CPU

Main Memory

Page 39: Computer Architecture Foundations for Graduate Level Students

Inside the CPU: The Datapath

Registers

ALU

Page 40: Computer Architecture Foundations for Graduate Level Students

Registers

• Components used for data storage (can be read from or written to)

• High speed memory locations used to store important information during CPU operations

• Two types– Special– General-purpose

Page 41: Computer Architecture Foundations for Graduate Level Students

Special Registers

• Registers used for specific purposes

• Used heavily during execution of CPU instructions

Page 42: Computer Architecture Foundations for Graduate Level Students

General Purpose Registers

• CPU registers used as “scratch pad” during execution of machine-level instructions

• Number varies between processors

Page 43: Computer Architecture Foundations for Graduate Level Students

Arithmetic Logic Unit (ALU)

• Performs all mathematical and logical operations within the CPU

• Operands not in the CPU would have to be retrieved from main memory

Page 44: Computer Architecture Foundations for Graduate Level Students

CPU-Memory Coordination• Bus - a group of

wires that connect separate components

• Types of bus:– Control bus (control

signals)– Address bus

(address information)

– Data bus (instruction/data)

Page 45: Computer Architecture Foundations for Graduate Level Students

CPU-Memory Coordination

• The different busses facilitate communication between the CPU and main memory

• Actions of the two components are highly-synchronized to ensure efficient and timely execution of instructions

Page 46: Computer Architecture Foundations for Graduate Level Students

CPU Operations

• Instructions do not reside in the CPU, they have to be fetched from memory

• Each machine level instruction is broken down into a logical sequence of smaller steps

Page 47: Computer Architecture Foundations for Graduate Level Students

CPU Operations

• Instructions are carried out by performing one or more of the following functions in some pre-specified sequence– Retrieving data from main memory– Putting data to main memory– Register data transfer– ALU operation

Page 48: Computer Architecture Foundations for Graduate Level Students

How An Instruction is Processed

• Instruction is retrieved from memory

• Analyze what the instruction is and how to execute it

• Operands/parameters (if any) are fetched from main memory

• Instruction is executed

• Results are stored (CPU or MM)

• Prepare for next instruction

Page 49: Computer Architecture Foundations for Graduate Level Students

Instruction Processing Example

• Fetch instruction from memory

• Decode it (turns out to be an ADD)

• Get the two numbers to add from MM

• Perform the addition

• Where will it be stored?

• Prepare for next instruction

Page 50: Computer Architecture Foundations for Graduate Level Students

Processing Data in Clusters

• Information is organized into groups of fixed-size data that can be stored and retrieved in a single, basic operation

• Each group of n bits is referred to as a word of information

• Access to each word requires a distinct name (location/address)

• Can also refer to a characteristic of other components (i.e. size of bus)

Page 51: Computer Architecture Foundations for Graduate Level Students

Word Length

• Size of a word specified in bits known as word length

• Possible benefits of a large word length:– Faster processing (more data and/or

instructions can be fetched at a time)– Greater numeric precision– More powerful instructions (e.g. instructions

can have more operands)

Page 52: Computer Architecture Foundations for Graduate Level Students

Machine Language

• Composed of– Instruction– Data (instruction parameters)

• Instructions and data are represented by a stream of 1s and 0s

• Cumbersome to deal with when preparing programs; programmers use hexadecimal numbers

• In some computers, both instruction and data are stored in a single memory location

Page 53: Computer Architecture Foundations for Graduate Level Students

Assembly: An Improvement on Machine Language

• Symbols, called mnemonics, are used to represent instructions

• Sample instruction–

• Advantage:– Easier recall of instructions

• Disadvantage:– Need to convert mnemonics and hexadecimal

numbers back to binary

add (105),(8)instruction

Instruction parameters

Page 54: Computer Architecture Foundations for Graduate Level Students

How Programs Are Loaded Into Main Memory

• Programs are loaded as binary numbers• Assumptions:

– An instruction is represented by a 2-digit hexadecimal number (e.g. add by 1A, mov by A0)

– World length of instruction parameters: 3 hex digits (12 bits)

100 101 102 103 104 105

A01040F2 1A105008 A00F2200 01000000 00000009 …

100

1010 0000 0001 0000 0100 0000 1111 0010

Instruction Instruction parameters

Page 55: Computer Architecture Foundations for Graduate Level Students

Executing Multiple Programs

• Programs share processor time

• Time slicing• Supported by

modern CPUs


Top Related