EE457 Midterm Exam (~24%)


Page 1

EE457 MT - Fall 2020 1 / 14 © Copyright 2020 Gandhi Puvvada

EE457 Midterm Exam (~24%)

Open-book Open-notes Exam just for the Fall 2020; Verilog Guides are not needed and are not allowed.

Smart phones, tablets, and all kinds of computing/Internet devices are allowed for writing your exam and for communicating with your proctor. You should not be communicating with anyone other than your proctor during the entire period of the exam.

This is a Crowdmark exam. Please do not write on margins or on the backside.

Fall 2020

Instructor: Gandhi Puvvada

Thursday 10/22/2020 (A 3 hour exam) nominally 05:30 PM - 08:30 PM (180 min) on Zoom

Viterbi School of Engineering
University of Southern California

Ques#   Topic                   Page#        Time      Points   Score
1       Lab 7 Part 1 modified   2-5          70        150
2       FIFO                    6-8          25        62
3       9-stage pipeline        9-11         45        64
4       Virtual Memory          12           15        37
5       Cache                   13           20        47
Total                           Cover+11+1   175 min   360
Perfect Score                                          330

I have previously read the Viterbi Code of Integrity and other related material at the site https://viterbischool.usc.edu/academic-integrity/ and I will abide by these rules of conduct. I will neither seek help from others nor offer help to others in my exams.

_____________________________ <== Student’s signature

Cover page

Page 2

1 ( 26 + 30 + 38 + 56 = 150 points) 70 min. Lab 7 Part 1 modified

This design is derived from Lab 7 Part1 (the three element adder)

To make students think afresh and design during the exam, we changed the design quite a bit, but there is no big complexity in this design. Here EX1 has an Adder/Multiplier with an A/M control input (0 = Add, 1 = Multiply). All sources and destinations (including the product (result) of multiplication) are 16-bit in size. Multiplication takes 2 clocks in EX1, so you need to generate and utilize STALL_M (M for Multiply). $0 is writable and is not a special register here.

EX2 has a regular Adder with a bypass path around it, so that two out of the four instructions below can bypass the adder in EX2. Instructions have either two (X, Y) or three (X, Y, Z) source registers. A2 = Add 2; A3 = Add 3; M = Multiply; MA = Multiply and Add

Instruction            Operation                    Opcode (MA M A3 A2)   32-bit instruction in hex (R = Dest; X, Y, Z = Sources)
NOP                                                 0 0 0 0               00000000
A2 $R, $X, $Y;         ($R) <= ($X)+($Y)            0 0 0 1               1000RXY0
A3 $R, $X, $Y, $Z;     ($R) <= ($X)+($Y)+($Z)       0 0 1 0               2000RXYZ
M $R, $X, $Y;          ($R) <= ($X)*($Y)            0 1 0 0               4000RXY0
MA $R, $X, $Y, $Z;     ($R) <= ($X)*($Y)+($Z)       1 0 0 0               8000RXYZ
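As a reading aid for the encoding table above, here is a minimal Python sketch of how a 32-bit instruction word could be split into its fields. The function name `decode` and the returned dictionary layout are hypothetical and not part of the exam; the only facts taken from the table are the one-hot opcode in the top hex digit and the R, X, Y, Z register numbers in the four lowest hex digits.

```python
# Hypothetical helper: the names below are illustrative, not from the exam.
# One-hot opcode in the top hex digit, ordered MA M A3 A2 as in the table.
OPCODES = {0x0: "NOP", 0x1: "A2", 0x2: "A3", 0x4: "M", 0x8: "MA"}


def decode(instr: int) -> dict:
    """Split a 32-bit instruction word (format oooo 0 0 0 R X Y Z in hex)."""
    if not 0 <= instr <= 0xFFFFFFFF:
        raise ValueError("instruction must fit in 32 bits")
    opcode = (instr >> 28) & 0xF          # top hex digit
    name = OPCODES.get(opcode)
    if name is None:
        raise ValueError(f"opcode {opcode:04b} is not in the table")
    return {
        "name": name,
        "R": (instr >> 12) & 0xF,         # destination register number
        "X": (instr >> 8) & 0xF,          # first source
        "Y": (instr >> 4) & 0xF,          # second source
        "Z": instr & 0xF,                 # third source (A3 and MA only)
        "uses_z": name in ("A3", "MA"),   # three-source instructions
    }


# Example: A3 $3, $1, $2, $4 is encoded as 0x20003124.
fields = decode(0x20003124)
print(fields["name"], fields["R"], fields["X"], fields["Y"], fields["Z"])
```

For the two-source instructions (A2 and M) the lowest hex digit is 0 in the table, so the `Z` value returned for them carries no meaning; `uses_z` is included to make that explicit.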

A 2-source instruction has its result ready in EX2 itself and can help its junior in EX1, whereas a 3-source instruction can help its juniors from the WB stage only. Unlike in the Lab 7 Part 1, there is no overflow here, so there is no converting to a bubble here.

Read the incomplete block diagram on the next page thoroughly. There are 20 loose-ends to connect to. You may need a few gates sometimes (for example to generate SKIP) to complete this page.

1.1. In our 5-stage early branch pipeline, if ISA declares one branch delay slot, then a successful branch in ID stage ____________ (does/doesn’t) flush its junior in IF stage. ____________ (Hence/Even then) you _____________ (need/do not need) a WBFF on IF/ID stage register.

When we switch on power, the random instruction in the ID stage is converted to a bubble due to _______________________________________________________________ ____________________________________________________________________________ ____________________________________________________________________________. In the modified Lab 7 design here, we ____________ (do/don’t) need a WBFF on IF/ID stage register to convert the power-on random instruction in the ID stage. Explain: ___________________________________________________________________________________________ ____________________________________________________________________________ ____________________________________________________________________________

________ (Like/Unlike) the X and Y operands, which need two pairs of forwarding muxes in EX1, the Z operand _____________ (needs/doesn’t need) a pair of forwarding muxes in EX1. Explain: _____________________________________________________________________ ____________________________________________________________________________

Points: 10, 8, 8

Q1P2 Page total 26 pts

Page 3

[Figure: incomplete block diagram of the modified Lab 7 Part 1 pipeline (IF, ID, EX1, EX2, WB). The drawing itself is not recoverable from this transcript; only its labels and task list survive.]

Blocks labeled in the figure: PC, INS-MEM, IF/ID stage register, Reg. File (read addresses XA, YA, ZA; read data XD, YD, ZD; write port RA, RD, R-Write), Comp Station in ID Stage (P=Q comparators), Adder/Multiplier in EX1 (inputs A, B; output S; control A/M), Adder in EX2 (inputs A, B; output S), forwarding muxes X1Mux, Y1Mux, Z1Mux, X2Mux, Y2Mux, Z2Mux, the R1Mux, a D flip-flop with CLK and CLR, and EN inputs on the pipeline registers. The WB stage carries Result Address, Result Data, and Result Write.

Signals labeled in the figure: ID_XA, ID_YA, ID_ZA, ID_RA, EX1_RA, EX2_RA, WB_RA, WB_RD, WB_WRITE, EX1_Write, EX2_WRITE, ID_XMEX1, ID_YMEX1, ID_ZMEX1, ID_XMEX2, ID_YMEX2, ID_ZMEX2, FX1M, FY1M, FX2M, FY2M, FZ1M, FZ2M, STALL_D, STALL_DB, STALL_M, STALL_MB, SKIP, A/M, the opcode bits MA, M, A3, A2 in each stage, ID_MA, EX1_MA, EX2_MA, XD+YD, XD+YD+ZD.

Legend: ID_XMEX1 = ID_XA Matched with EX1_RA

Tasks listed on the figure:
1. Complete all missing connections marked in dotted lines. Also complete the RA (Result Address) connection in ID stage (ID_RA).
2. Complete all five enable (EN) controls on the pipeline registers (including PC).
3. Complete the forwarding paths into the six forwarding muxes.
4. Generate STALL_M (Stall for Multiply).
5. Generate SKIP (Bypass) control signal.
6. Generate A/M (Add/Multiply) signal.
7. Draw needed logic on a separate page for generating STALL_D (Stall Dependency), FX1M, FX2M, FZ1M, FZ2M.

30 pts

Q1P3 Page total 30 pts

Page 4

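[Figure: answer areas for the control logic, with small sketches of X1Mux (select FX1M), X2Mux (select FX2M), Z1Mux (select FZ1M), and Z2Mux (select FZ2M), each with inputs 0 and 1, and labels ZD, EN, and XD+YD near the Z muxes. The drawing is not recoverable from this transcript.]

STALL_D: Dependency STALL in the ID stage. Draw gates. (8 pts)

Draw gates for each of the following (4x6+6 = 30 pts):
FX1M: X1Mux select signal
FX2M: X2Mux select signal
FZ1M: Z1Mux select signal
FZ2M: Z2Mux select signal

Q1P4 Page total 38 pts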

Page 5

Complete the Time-Space diagram below. If there are excess rows in the table, just leave them blank. If there aren’t enough rows to complete the given sequence, stop at the last row. Write "Bubble" or draw a bubble when needed.

In what clocks (write CC numbers) did STALL_D go active? ______________________________
In what subsequent clocks did STALL_D go inactive? ______________________________

In what clocks (write CC numbers) did STALL_M go active? ______________________________
In what subsequent clocks did STALL_M go inactive? ______________________________

STALL_D and STALL_M: Is it OK if both of them go active together? _______ (OK, not OK).
STALL_M is never active for more than one clock at a stretch. ______ (T / F)
If it is OK for STALL_D and STALL_M to go active together, STALL_D will remain active for one clock beyond the active clock(s) of the STALL_M. __________ (T / F / NA) (NA = Not Applicable).

If Mr. Bruin forgot to make the register file an IFRF (Internally Forwarding Register File), he would realize his mistake when he executes the above sequence of 5 instructions, as they would produce wrong results. ____ (T / F). Explain: _______________________________________________ ____________________________________________________________________________ ____________________________________________________________________________

(W = Wasteful, H = Harmful) Note: Harmful means it produces wrong results.
Forwarding from a NOP is ____ (W/H). Forwarding to a NOP is ____ (W/H). Stalling a NOP is ____ (W/H). Spurious stalls are ____ (W/H).

If your VLSI engineer asks you (for her layout convenience), you would be able to redesign to (circle right choices) (i) move Z2Mux to EX1 (ii) move Z1Mux to EX2 (iii) neither (iv) any of the two choices
Explain: _____________________________________________________________________ ____________________________________________________________________________ ____________________________________________________________________________

Though this design did not talk about ICM (Instruction Cache Miss), what would you do if there is an ICM signal coming out of the INS-MEM component in the IF stage? State in words, do not alter the diagram. ______________________________________________________________________ ____________________________________________________________________________

[Time-Space diagram table not reproduced in this transcript. CC # = Clock Cycle #]

Points: 16, 4x3=12, 3+3=6, 3+3=6, 4, 2+4=6, 6

Q1P5 Page total 56 pts

Page 6

2 ( 27 + 18 + 17 = 62 points) 25 min. FIFO

2.1 Assume that the Producer, the Consumer and the FIFO between them below are 3 separate chips.

It’s a bad idea to move the WP (Write Pointer) and the RP (Read Pointer) out of the FIFO into the Producer and the consumer respectively as shown above. Let us limit our discussion here to a single-clock FIFO using (n+1)-bit pointer method. Assume that the (WP-RP) subtraction is done in the FIFO. You use the ________ (L/U) (L = lower n bits, U = upper n bits) of the (n+1) bit pointers to index the array in the FIFO.

For a 16-location FIFO case, the bad plan requires the producer chip and the consumer chip to increase each of its output pins by ____ . It also causes the FIFO chip to increase its input pins by ______.

Suppose we wish to increase the FIFO size from 16 (= 2^4) locations to 16K (= 2^14) locations.

In the right practice, the Producer and the Consumer chips will each have an increase in its pin count by _______ (0/10/11/14/15/other) pins and the FIFO chip itself will have an increase in its pin count by _______ (0/10/11/14/15/20/22/28/30/other) pins. In the suggested bad plan, the Producer and the Consumer chips will each have an increase in its pin count by _______ (0/10/11/14/15/other) pins and the FIFO chip itself will have an increase in its pin count by _______ (0/10/11/14/15/20/22/28/30/other) pins.

2.2 A 32-location FIFO can be empty or FULL or anything in-between. So the total different values of populated depth are ________ (32/33/neither). For this FIFO, if you use the n-bit pointer method, you would use _______ (4/5/6/7)-bit counters for the WP and the RP and the same size subtracter doing mod____ (16/32/64) subtraction to calculate populated depth. On the other hand, if you use the (n+1)-bit pointer method, you would use _______ (4/5/6/7)-bit counters for the WP and the RP and the same size subtracter doing mod____ (16/32/64) subtraction to calculate populated depth.

2.3 It is ___________________ (wasteful/harmful/useful) to use gray-code pointers in a single-clock FIFO. It is ___________________ (wasteful/harmful/useful) to synchronize WP as WPS and further as WPSS (similarly synchronize RP as RPS and further as RPSS) in a single-clock FIFO.

2.4 Synchronizing Flip-flops, RPS and RPSS, are clocked by the ___________ (WCLK / RCLK).

2.5 Because of the delay in conveying the pointers in a 2-clock FIFO, the receiver (the consumer) sometimes may not have the latest value of the populated depth. It may have a depth value ___________ (lower / higher) than the actual value. Similarly, the sender (the producer) sometimes may not have the latest value of the populated depth. It may have a value ___________ (lower / higher) than the actual value. This is a ___________ (safe-side / unsafe-side) error, as it causes ___________ (A/B), where A = delay in conveying the data, B = error in conveying the data.

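[Figure: two block diagrams, each with a Producer, a FIFO, and a Consumer. In the one labeled "Right practice", WP and RP are inside the FIFO. In the one labeled "Bad suggestion", WP is in the Producer and RP is in the Consumer.]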

Blank area (can be used for rough work)

Points: 2+1=3, 2+2=4, 2+2=4, 7, 1, 2, 2, 2+2=4

Q2P6 Page total 27 pts

Page 7

2.6 A 4-location FIFO has a 2-bit WP (WP[1:0]) and a 2-bit RP (RP[1:0]). The difference DIFF[1:0] can only generate _____ (3/4/5) values whereas the actual populated Depth has _____ (3/4/5) values. Complete as many of the following four boxes of circuits as possible using, of course, different pairs (combinations) of threshold values for DIFF[1:0] that you can think of. I have completed box #1 already!

2.6.1 A 4K-location FIFO has a 12-bit WP (WP[11:0]) and a 12-bit RP (RP[11:0]). The difference DIFF[11:0] can only generate _________ (4095/4096/4097) values whereas the actual populated Depth has _________ (4095/4096/4097) values.

Complete as many of the following four boxes of circuits as possible using, of course, different pairs of threshold values for DIFF that you can think of. Decide which two bits of the 12-bit DIFF you wish to use here.

Boxes for 2.6. Each box holds two gate circuits with inputs DIFF[1] and DIFF[0], one producing RAE and one producing RAF:

Box #1
assign RAE = (DIFF[1:0] == 2'b01);
assign RAF = (DIFF[1:0] == 2'b10);

Box #2
assign RAE = (DIFF[1:0] == 2'b____);
assign RAF = (DIFF[1:0] == 2'b____);

Box #3
assign RAE = (DIFF[1:0] == 2'b____);
assign RAF = (DIFF[1:0] == 2'b____);

Box #4
assign RAE = (DIFF[1:0] == 2'b____);
assign RAF = (DIFF[1:0] == 2'b____);

Boxes for 2.6.1. Each box holds two gate circuits whose two DIFF inputs are left for you to choose:

Box #1
RAE: DIFF[ ] DIFF[ ]
RAF: DIFF[ ] DIFF[ ]

Box #2
RAE: DIFF[ ] DIFF[ ]
RAF: DIFF[ ] DIFF[ ]

Box #3
RAE: DIFF[ ] DIFF[ ]
RAF: DIFF[ ] DIFF[ ]

Box #4
RAE: DIFF[ ] DIFF[ ]
RAF: DIFF[ ] DIFF[ ]

Blank area (can be used for rough work)

Points: 6, 2, 10

Q2P7 Page total 18 pts

Page 8

2.7 The _______ (WP/RP) is never behind the _______ (WP/RP). The depth (populated depth) ________________________ (can be/ can never be) negative! If (WP==RP), it is an ambiguous situation in a ____________________ (n-bit pointer / (n+1)-bit pointer)-based design, but in the other design, (WP==RP) indicates that the FIFO is ______________.

2.8 For an 8-location 2-clock FIFO using 4-bit pointers for WP and RP, find the depth (populated depth) and write it in decimal. Also state if the depth is legal or illegal for the 8-location FIFO.

[Figure: six circular pointer diagrams. Each circle has 16 positions numbered 0 to 15 and labeled with the 4-bit codes 0000 through 1111, with arrows marking WP and RP. The pointer positions are not recoverable from this transcript.]

For each of the six diagrams:

Depth = (WP - RP) mod-16 = __________ It is a __________ (legal/illegal) depth.
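The mod-16 subtraction used for the depth in 2.8 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the sample pointer values are hypothetical and are not taken from the exam's diagrams, and the legality check simply assumes that a populated depth cannot exceed the number of locations in the FIFO.

```python
# Illustrative sketch; names and sample values are assumptions, not exam data.
def populated_depth(wp: int, rp: int, ptr_bits: int = 4) -> int:
    """Depth = (WP - RP) mod 2**ptr_bits, as in the formula for 2.8."""
    return (wp - rp) % (1 << ptr_bits)


def is_legal_depth(depth: int, locations: int = 8) -> bool:
    """Assumption: a depth is legal when it lies between 0 and the FIFO size."""
    return 0 <= depth <= locations


# Sample pointer values chosen only for illustration.
for wp, rp in [(0b0010, 0b1100), (0b1100, 0b0010)]:
    d = populated_depth(wp, rp)
    print(f"WP={wp:04b} RP={rp:04b} depth={d} legal={is_legal_depth(d)}")
```

Python's `%` operator returns a non-negative result for a positive modulus, so the wrap-around case where WP is numerically smaller than RP needs no special handling.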

Points: 5, 8+4=12

Q2P8 Page total 17 pts

Page 9

3 ( 16 + 21 + 27 = 64 points) 45 min. 9-stage Pipeline

Reproduced below is a part of the solution to the Q#4.2 from Spring 2020 Final exam that you went through.

3.1 Implement the "remedy" stated above in gates below.

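[Figure: excerpt of the Spring 2020 Final exam Q#4.2 solution, followed by an answer area for gates. Surviving labels: Reg, FU, FU_Br, and forwarding mux pairs #1, #2, #3, and #4. The drawing and the "remedy" text are not recoverable from this transcript.]

Points: 10+6=16

Q3P9 Page total 16 pts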

Page 10

3.2 Pair #1 in EX stage can not be removed. Explain briefly. _____________________________ ___________________________________________________________________________ ___________________________________________________________________________ ___________________________________________________________________________

3.3 The IFRF (Internally Forwarding Register File) design does not change (circle your choices)
(a) whether you declare branch delay slots or not
(b) whether you declare load-word delay slots or not
(c) whether you implement an early branch or a late branch
(d) whether you implement a 5-stage or a 7-stage or a 9-stage pipeline

3.4 Using an IFRF in a single-cycle CPU is (circle your choices) (a) wasteful (b) harmful (c) useful Explain: ____________________________________________________________________________________________________________________________________________________________________________________________________________________________

3.5 Please find another copy of the 9-stage pipeline on the next page. We added three WBFFs to the three stage registers (IF1/IF2, IF2/IF3, IF3/ID) to facilitate flushing of the three juniors, J1, J2, and J3 (juniors of a successful branch). But now we are told that this processor’s ISA has declared one branch delay slot and two load-word delay slots. Decide if you need to remove any of the three WBFFs. Remember that the WBFFs have two purposes. Our WBFFs are cleared on power-on reset. Use a few gates as needed and complete the design associated with the WBFFs.

In this design, some of the following four (circle as appropriate) (i) HDU, (ii) HDU_Br, (iii) FU_Br, (iv) FU are simplified because of (circle as appropriate) (a) one branch delay slot declaration (b) two load-word delay slots declaration (c) both (d) neither

In the 4 squares next to the four units, HDU, HDU_Br, FU_Br, and FU, please write the number of 5-bit comparators in those units. Please do not count zero checkers.

Blank area

Points: 5, 4, 4, 8

Q3P10 Page total 21 pts

Page 11

[Figure: 9-stage pipelined version of the early-branch design. The drawing is not recoverable from this transcript; only its labels survive.]

Stages: IF1, IF2, IF3, ID, EX, MEM1, MEM2, MEM3, WB.

Blocks labeled in the figure: PC, PC control, Instr. TLB, I.Cache TagRam Access + TagCheck, I.Cache DataRam Access, Reg, Data TLB, D.Cache TagRam Access, D.Cache DataRam Access, TagCheck, HDU, HDU_Br, FU, FU_Br, BRANCH, BR1, Zero, forwarding mux pairs #1 to #4, the three juniors J#1, J#2, J#3, and EN inputs on the stage registers.

15 pts for WBFFs

12 pts for Comparison units

Q3P11 Page total 27 pts

Page 12

4 ( 37 points) 15 min. Virtual Memory

4.1 ____________ (TLB/L1 Cache) entries are singular. Multi-word blocks (like 4 or 8 words per block) are common in a _____________ (TLB/L1 Cache). In the case of a ____________ (TLB/L1 Cache), with a 4-word (16-byte) block, the DATA RAM’s height (depth) is ______ (1/4/8/16/other) times that of the TAG RAM. Presence bit is like a _____________ (Valid/Dirty) bit. A _________ (TLB/PT) (PT = Page Table) is indexed whereas a _________ (TLB/PT) is searched.

The total number of VPNs of all Active or Ready to Run or Blocked processes is ____________ (equal to/more than/less than) the total number of the PPFNs holding pages for them in the main memory.

4.2 If we increase the number of levels of a multi-level page table, we will have _______ (A/B/C/D), where A = an advantage only, B = a disadvantage only, C = both, D = neither. TLB helps to _________ (X/Y/Z) where X = overcome the disadvantage, Y = enhance the advantage, Z = neither.

4.3 A small part of the upper part of a _____________(VPN/PPFN/Page Offset) is sometimes ignored to limit ___________ (A/B/C) where A = the amount of Virtual Address Space allocated to a process, B = the number of Physical Page Frames needed, C = both. This results in saving physical memory consumption to _______ (X/Y/Z) where X = hold pages of the process, Y = hold the Page Table of the process, Z = both.

4.4 PTBR contains a _______ (VA/PA). MMU sends out a ______ (VA/PA) to the PT to lookup VPN to PPFN translation. Any address going to the Physical Memory has to be a Physical Address. ____ (T / F)

4.5 You had gone through ee457_MT_Sp2013_VM_Ques_sol.pdf. Extract:

Here, I have filled the sorted table of the first 8 distinct VPNs on the side with a different set of VPNs. Find how many of the A, B, C, D tables were built by the OS by then.

A-level: __________
B-level: __________
C-level: __________
D-level: __________

P Q R S T
1 3 6 7 A
1 3 6 8 A
1 4 6 8 A
1 4 6 8 B
2 4 6 8 B
2 4 6 8 C
2 5 6 8 C
2 5 6 9 C

PQRST on the side represents a 20-bit (5-digit hex) VPN in a 4-level page table with upper 4 bits (P) indexing the A-level table, next 8 bits (QR) indexing the B-level tables, next 4 bits (S) indexing the C-level tables, and the last 4 bits (T) indexing the D-level tables.
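The field split described above can be sketched as a small Python helper. The function name `split_vpn` and the dictionary keys are hypothetical; the bit widths (4, 8, 4, 4) are the ones stated in the paragraph above. The sketch only separates the index fields and makes no claim about how many tables the OS has built.

```python
# Hypothetical helper illustrating the PQRST field split; not part of the exam.
def split_vpn(vpn: int) -> dict:
    """Split a 20-bit VPN into the four page-table index fields."""
    if not 0 <= vpn < (1 << 20):
        raise ValueError("VPN must be a 20-bit value")
    return {
        "P": (vpn >> 16) & 0xF,    # upper 4 bits: index into the A-level table
        "QR": (vpn >> 8) & 0xFF,   # next 8 bits: index into a B-level table
        "S": (vpn >> 4) & 0xF,     # next 4 bits: index into a C-level table
        "T": vpn & 0xF,            # last 4 bits: index into a D-level table
    }


# The first VPN in the sorted table is 1367A (hex).
fields = split_vpn(0x1367A)
print({name: hex(value) for name, value in fields.items()})
```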

Blank area

Points: 10, 3, 3+3=6, 3+3=6, 4*2+4=12

Q4P12 Page total 37 pts

Page 13

5 ( 27 + 10 + 10 = 47 points) 20 min. Cache and Main Memory Organization

A 128-bit data (D127-D0) 32-bit (logical) address byte-addressable processor (address pins: A31-A4, /BE15-/BE0) has its cache and MM organized as shown below. Fill-in the 9 boxes and 5 blanks.

5.1 Block size (based on degree of lower-order interleaving of the MM) = _______ Words = ______ Bytes
Degree of set associativity based on number of TAG RAMs = _______ blocks/set.
Size of one segregation (say Block 0’s segregation): ________________________________
Total Cache size = ________ K Bytes; The processor address space = _______ G Bytes.

5.2 Please divide the address below into appropriate fields and name the fields.

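[Figure: cache and main memory organization. The drawing is not recoverable from this transcript; only its labels survive.]

Blocks labeled in the figure: CPU (processor, D127-D0), a 128-bit bus between the CPU and the Cache, Cache Data RAM with segregations labeled Block 0’s, Block 1’s, ... Block 6’s (byte-wide banks D7-0 through D127-120), one of the TAG RAMs (total 7 such TAG RAMs) with Addr, Data-in, Data-out, Valid, a comparator (Comp, "=") producing Hit, two XCVRs, and a 2-way lower-order interleaved MM (Main Memory, here populated by SRAMs) with byte-wide banks D7-0 through D127-120 on a 128-bit bus.

Boxes and blanks to fill in on the figure:
Size of one TAG RAM: ? (512 x _____)
Size of one Comparator: ?
Size of one Byte-wide bank in KB in cache: ? KB
Size of one Byte-wide bank in MB in MM: ? MB
Address connections marked "?" and "A[ : ]" on the TAG RAM, the cache Data RAM, the MM, and the XCVR enables.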

Here, transceivers (XCVRs) are used for bus conversion.
Number of XCVRs used: ______ Size of XCVRs used: __________-bit
A memory module bus of __________ bits was converted to the CPU bus of _________ bits using these XCVRs.
Address bit(s) _______ and its/their _______ (write a number) combinations are used in enabling one XCVR at a time.

A31 A30 A29 A28 A27 A26 A25 A24 A23 A22 A21 A20 A19 A18 A17 A16 A15 A14 A13 A12 A11 A10 A9 A8 A7 A6 A5 A4 A3 A2 A1 A0

It is not difficult to get an A in EE457. You need to aspire for it, work for it, and seek help from the 457 teaching team on whatever you do not understand. We are eager to help you. The final topics, exceptions, branch prediction, out-of-order execution, chip multi-threading, chip multiprocessing, cache coherency, locks and mutual exclusion are interesting and challenging too. They are the focus of 70% of the final exam. Best wishes! - Gandhi, Kartik, Zequ, Gengyu, Kautuk, and Bala

Points: 6+3=9, 18, 10, 10

Q5P13 Page total 47 pts

Page 14

Non-grading page, Don’t submit

Blank page for rough work