ECE 550 Final Exam – Page 1 of 13
ECE 550 – Fundamentals of Computer Systems and Engineering
Section 01 – Fall 2016 – Final Exam
Name: ____________________________________________________ NetID: _____________
READ THIS:
This is a closed-book, closed-internet, closed-peer, calculator-free exam. You may use ONE 8.5x11” two-sided
sheet of notes; additional notes are not permitted. If you are caught violating these rules by the
teaching staff, you will receive a -100 on the exam and will have an academic integrity violation filed
against you.
The last page is a reference sheet you may find useful. You may tear it from the exam, and it’s yours to
keep.
Unless otherwise specified, the computing platform assumed on all questions is 32-bit MIPS.
Please sign the honor pledge below to affirm that you understand the rules of this exam period. Your exam will
not be graded if you do not sign the honor pledge.
“I have neither given nor received unauthorized aid on this test or assignment”
Signature: ___________________________ Date: __________
Question    Max points    Score
1           20
2           16
3           18
4           18
5           12
6           15
7           20
8           20
9           21
10          20
11          20
E.C.        (+10)
TOTAL       200
Question 1 [20]: Match each of the following definitions with the appropriate vocabulary word.
(a) Virtual memory address translation data structure.
(b) A class of ISAs characterized by complex instructions, such as x86.
(c) A piece of logic which selects between two inputs, based on the
value of a third input.
(d) A piece of logic which translates from binary representation to
one-hot representation.
(e) When an instruction exhibits unusual circumstances, requiring
the OS’s attention.
(f) The idea that a program will often re-access the same data it just
accessed.
(g) A memory technology which requires periodic refreshing,
because the charge stored in the capacitor slowly leaks away.
(h) An enhancement to pipeline CPU design that reduces the
performance penalty for conditional instructions.
(i) Multiple hard disks combined for performance
and/or reliability.
Choices:
A. ALU    B. Branch prediction    C. CISC    D. Decoder    E. DRAM    F. Exception
G. Flush    H. Hard Disk    I. Interrupt    J. Multi-cycle    K. Mux    L. Page Table
M. Pipeline    N. RAID    O. Return Address Stack    P. RISC    Q. Single-cycle
R. Spatial Locality    S. SRAM    T. Stall    U. Temporal Locality    V. XOR-gate
W. This option is only here to scare the four students who can't read Chinese
Question 2 [16]: For each of the networking tasks below, identify which protocol is used to achieve it on the
modern internet. All choices will be used at least once; some will be used twice.
(a) Routing a packet from point to point, getting closer to the
destination with each hop.
(b) Transporting frames of data between connected systems
on a single network.
(c) Providing configurable addresses to network interfaces.
(d) Providing hard-coded, non-configurable unique IDs to
network interfaces.
(e) Regulating flow of traffic in response to network conditions.
(f) Translating between human-friendly names and IP addresses.
(g) Acknowledging received data and re-transmitting sent data that didn’t get acknowledged.
(h) A Protocol to Transport HyperText.
Choices A. Ethernet B. IP C. TCP D. DNS E. HTTP
Question 3 [18]: Briefly answer these questions pertaining to operating systems.
(a) In the ext2 file system, why do inodes contain a mix of direct, single-indirect, double-indirect, and
triple-indirect block pointers? Why not just use direct block pointers?
(b) What is one reason why a process might be removed from the “run queue” of the operating system’s
process scheduler?
(c) Describe the process of booting a PC from initial power-on to the point where the kernel is loaded.
Question 4 [18]: Convert the following C code to MIPS assembly language. Assume standard MIPS calling
conventions, and that this code is in a void function that requires no stack frame.
        C code                  Equivalent MIPS code    Comment
        int a[10];              (not shown)             // assume &a[0] is in $4
(a) [1] int n = 9;                                      // assume n is $5
(b) [1] int* p = a+3;                                   // assume p is $6
(c) [1] int r = p[0];                                   // assume r is $7
(d) [1] int s = n + r;                                  // assume s is $8
(e) [1] p[1] = s;
(f) [3] if (n<r) {n=0;}                                 // can use $9 for a temp value
                                                        // can be multiple instructions
(g) [1] return;
Question 5 [12]: Consider the following hardware which implements a finite state machine. This state machine
has 3 states, and uses a 1-hot encoding. It has one input.
Draw the edges of the state transition diagram for this FSM:
[State nodes provided in the diagram: A, B, C]
Question 6 [15]: You are a CPU engineer and have a choice between the two cache designs given below.
                      Design A                      Design B
L1 layout             64kB, 2-way set associative   128kB, 4-way set associative
L1 access latency     1 cycle                       3 cycles
L1 miss rate          20%                           5%
L2 layout             1 MB, 4-way set associative   4 MB, 8-way set associative
L2 access latency     10 cycles                     16 cycles
L2 miss rate          5%                            2%
Main memory latency   200 cycles                    200 cycles
(a) [6] What is the effective access time for each?
(b) [4] After further study, your team has found a design tradeoff. It turns out that Design B takes up so
much die space that the CPU’s branch predictor would have to be simplified to compensate. The CPU
you’re building has a long pipeline, and branch mispredictions incur a 10-cycle penalty. With the full
branch predictor that’s possible with Design A, just 5% of branches will be mispredicted. The
compromised branch predictor needed in Design B will have a 10% misprediction rate. Based on this,
what will the average time penalty for branching be in Design A vs. Design B?
(c) [5] Assume that 25% of instructions are memory loads/stores and thus have the average execution time
found in part (a). Further assume that 10% of instructions are branches and thus have the average
execution time found in part (b). All other instructions finish in exactly 1 cycle. Which design yields
better average-case performance? Show your work.
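One way to organize the arithmetic for parts (a) through (c) is sketched below in Python. It rests on two assumptions the question does not state: that each L2 miss rate is a local rate (the fraction of L1 misses that also miss in L2), and that a branch costs 1 base cycle plus its average misprediction penalty. The function names are illustrative only.

```python
def effective_access_time(l1_latency, l1_miss, l2_latency, l2_miss, mem_latency):
    """Average memory access time in cycles for a two-level cache hierarchy.

    Assumes l2_miss is a local miss rate (misses per L2 access).
    """
    return l1_latency + l1_miss * (l2_latency + l2_miss * mem_latency)


def branch_penalty(mispredict_rate, penalty_cycles=10):
    """Average extra cycles per branch caused by mispredictions."""
    return mispredict_rate * penalty_cycles


def average_cycles(mem_time, branch_pen, mem_frac=0.25, branch_frac=0.10):
    """Average cycles per instruction for the instruction mix in part (c).

    Assumes a branch takes 1 base cycle plus its average penalty.
    """
    other_frac = 1.0 - mem_frac - branch_frac
    branch_time = 1.0 + branch_pen
    return mem_frac * mem_time + branch_frac * branch_time + other_frac * 1.0


# Part (a): effective access time
eat_a = effective_access_time(1, 0.20, 10, 0.05, 200)
eat_b = effective_access_time(3, 0.05, 16, 0.02, 200)

# Part (b): average branch penalty
pen_a = branch_penalty(0.05)
pen_b = branch_penalty(0.10)

# Part (c): average cycles per instruction
cpi_a = average_cycles(eat_a, pen_a)
cpi_b = average_cycles(eat_b, pen_b)

print(f"Design A: EAT={eat_a:.2f}, branch penalty={pen_a:.2f}, avg cycles={cpi_a:.2f}")
print(f"Design B: EAT={eat_b:.2f}, branch penalty={pen_b:.2f}, avg cycles={cpi_b:.2f}")
```

If the branch time in part (c) is instead taken to be the penalty alone, both averages drop by the same 0.10 cycles, so the comparison between the designs is unchanged.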
Question 7 [20]: Assume a 16-bit machine with a 2-way set associative 8kB cache with 16-byte blocks. Fill in the
attributes below. HINT: Use the powers of two table on the reference sheet to simplify the math.
(a) [1] How many frames does it have? ______________
(b) [1] How many sets does it have? ______________
(c) [1] Number of bits for the offset field of a memory address: ______________
(d) [1] Number of bits for the index field of a memory address: ______________
(e) [1] Number of bits for the tag field of a memory address: ______________
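The relationships among parts (a) through (e) can be sketched in Python as follows. This is a minimal sketch that assumes byte addressing and power-of-two sizes; the function name `cache_geometry` and its parameter names are hypothetical.

```python
def cache_geometry(address_bits, cache_bytes, block_bytes, ways):
    """Derive frame/set counts and address field widths for a set-associative cache.

    Assumes byte addressing and that every size is a power of two.
    """
    frames = cache_bytes // block_bytes           # one frame holds one block
    sets = frames // ways                         # each set holds `ways` frames
    offset_bits = block_bytes.bit_length() - 1    # log2(block size)
    index_bits = sets.bit_length() - 1            # log2(number of sets)
    tag_bits = address_bits - index_bits - offset_bits
    return {
        "frames": frames,
        "sets": sets,
        "offset_bits": offset_bits,
        "index_bits": index_bits,
        "tag_bits": tag_bits,
    }


# The cache from this question: 16-bit addresses, 8kB, 16-byte blocks, 2-way
geometry = cache_geometry(address_bits=16, cache_bytes=8 * 1024, block_bytes=16, ways=2)
for name, value in geometry.items():
    print(f"{name}: {value}")
```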
[15] The table below shows a memory access pattern, and set indexes are computed for you. Assuming an LRU
replacement policy and that all frames are initially invalid, indicate if the access is a cache hit or a miss. For
misses, indicate if the miss is compulsory (due to initial boot-up) or conflict (misses due to the limited
associativity of the cache). Capacity misses will not be encountered.
Address (hex)   Tag   Index   Offset   Hit/miss?
(f) 0001 ☐Hit ☐Miss (compulsory) ☐Miss (conflict)
(g) 1003 ☐Hit ☐Miss (compulsory) ☐Miss (conflict)
(h) 2005 ☐Hit ☐Miss (compulsory) ☐Miss (conflict)
(i) 0023 ☐Hit ☐Miss (compulsory) ☐Miss (conflict)
(j) 1001 ☐Hit ☐Miss (compulsory) ☐Miss (conflict)
(k) 0005 ☐Hit ☐Miss (compulsory) ☐Miss (conflict)
(l) 0023 ☐Hit ☐Miss (compulsory) ☐Miss (conflict)
(m) 2021 ☐Hit ☐Miss (compulsory) ☐Miss (conflict)
(n) 1003 ☐Hit ☐Miss (compulsory) ☐Miss (conflict)
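The access pattern above can be checked with a small LRU simulation, sketched here in Python. It assumes the field widths implied by the cache in this question (4 offset bits, 8 index bits, the remaining 4 bits as tag) and classifies a miss as compulsory when the block has never been loaded before, and as conflict otherwise, since the question rules out capacity misses. All names are illustrative.

```python
from collections import OrderedDict

OFFSET_BITS = 4   # 16-byte blocks
INDEX_BITS = 8    # 256 sets
WAYS = 2          # 2-way set associative


def split_address(addr):
    """Split a 16-bit address into (tag, index, offset)."""
    offset = addr & ((1 << OFFSET_BITS) - 1)
    index = (addr >> OFFSET_BITS) & ((1 << INDEX_BITS) - 1)
    tag = addr >> (OFFSET_BITS + INDEX_BITS)
    return tag, index, offset


def simulate(addresses):
    """Return 'hit', 'miss (compulsory)', or 'miss (conflict)' for each access."""
    sets = {}             # index -> OrderedDict of tags, least recently used first
    seen_blocks = set()   # every (tag, index) block that has ever been loaded
    results = []
    for addr in addresses:
        tag, index, _ = split_address(addr)
        lines = sets.setdefault(index, OrderedDict())
        if tag in lines:
            lines.move_to_end(tag)            # mark as most recently used
            results.append("hit")
            continue
        block = (tag, index)
        kind = "conflict" if block in seen_blocks else "compulsory"
        results.append(f"miss ({kind})")
        seen_blocks.add(block)
        if len(lines) == WAYS:
            lines.popitem(last=False)         # evict the least recently used tag
        lines[tag] = True
    return results


trace = [0x0001, 0x1003, 0x2005, 0x0023, 0x1001, 0x0005, 0x0023, 0x2021, 0x1003]
outcomes = simulate(trace)
for addr, outcome in zip(trace, outcomes):
    tag, index, offset = split_address(addr)
    print(f"{addr:04X}  tag={tag:X} index={index:02X} offset={offset:X}  {outcome}")
```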
Question 8 [20]: When I moved into my office here at Duke, I inherited an awesome old laptop from the
previous tenant. It’s a Pentium III (a 32-bit CPU) with 128MB of RAM. The page size on an x86 machine is 4kB.
Let’s determine some characteristics about this thing. The first problem is done for you. Write large numeric
answers in the form of 2^n rather than trying to figure out what that corresponds to (e.g., don’t write out 2048;
write 2^11 instead). Remember 2^10=1k, 2^20=1M, and 2^30=1G.
(a) Is it absurdly heavy? Is it thicker than the textbook for this course?
Yes and yes.
(b) [4] How many virtual pages are there per process?
(c) [4] How many bits are needed to represent the page offset?
(d) [4] How many bits are needed to represent the virtual page number?
(e) [4] How many bits are needed to represent a physical address?
(f) [4] How many bits are needed to represent the physical page number?
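The quantities in parts (b) through (f) all follow from three numbers in the question: 32-bit virtual addresses, 4kB pages, and 128MB of RAM. The Python sketch below shows the relationships. It assumes the physical address only needs to cover the installed RAM, which is one reading of part (e); the helper `log2_exact` and the variable names are hypothetical.

```python
def log2_exact(value):
    """Return n such that 2**n == value; value must be a power of two."""
    assert value > 0 and value & (value - 1) == 0, "expected a power of two"
    return value.bit_length() - 1


virtual_address_bits = 32          # Pentium III is a 32-bit CPU
page_bytes = 4 * 1024              # 4kB pages
ram_bytes = 128 * 1024 * 1024      # 128MB of RAM

offset_bits = log2_exact(page_bytes)                  # bits to address a byte within a page
vpn_bits = virtual_address_bits - offset_bits         # remaining virtual address bits
virtual_pages_exponent = vpn_bits                     # number of virtual pages is 2**vpn_bits
physical_address_bits = log2_exact(ram_bytes)         # assumption: sized to installed RAM
ppn_bits = physical_address_bits - offset_bits        # physical page number width

print(f"virtual pages per process: 2^{virtual_pages_exponent}")
print(f"page offset bits:          {offset_bits}")
print(f"virtual page number bits:  {vpn_bits}")
print(f"physical address bits:     {physical_address_bits}")
print(f"physical page number bits: {ppn_bits}")
```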
Question 9 [21]: I/O.
(a) [7] Describe the difference between polling and interrupts.
(b) [7] What is DMA and how does it enhance throughput beyond having interrupts alone?
(c) [7] For a hard drive, why is it better to do few large I/Os than many small I/Os?
Question 10 [20]: Pipelining. Assume the 5-stage pipeline we’ve used in class (F, D, X, M, W). Assume the
pipeline forwards/bypasses operands whenever possible and stalls only when needed to satisfy a RAW
dependence that can’t be forwarded/bypassed. Assume all loads and stores hit in the 1-cycle data cache.
Complete a pipeline diagram for the first 7 cycles of the code shown, and assume that the first instruction is in
the Fetch (F) stage in cycle 1, as shown.
                        1   2   3   4   5   6   7
addi $1, $1, 4          F
andi $2, $2, 0x20
lw   $3, 0($1)
sub  $4, $4, $3
move $6, $5
Question 10 [20]: The year is 2001. You need to run an astrophysics simulation which has a very complex control
flow pattern (lots of branches). Your choices are the two x86 chips of the era: the AMD Athlon (which had a
10-stage pipeline and ran at 1GHz) and the Intel Pentium IV (which had a 20-stage pipeline and was clocked at
1.3GHz). Why might the Athlon give better performance despite having a slower clock rate?
Question 11 [20]: List four attributes or features that a modern x86 CPU has that the classic MIPS CPU doesn’t.
(a)
(b)
(c)
(d)
Extra credit [10]: Write assembly code that will swap the values of two registers (i.e., if the registers started with
values 19 and 42, they’d end with values 42 and 19 respectively). However, you MUST NEVER MODIFY
ANYTHING EXCEPT THE REGISTERS IN QUESTION. So you can’t change other registers or write to memory.
5 points will be awarded for a MIPS solution swapping registers $1 and $2, and 10 points will be awarded for an
x86 solution swapping registers eax and ebx.
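One well-known technique that fits the constraint is the XOR swap, sketched below in Python. The sketch models the two registers as 32-bit unsigned values; each assignment corresponds to a single register-to-register XOR instruction, so no third register or memory location is touched. It is only one possible approach (the x86 list on the reference sheet also includes XCHG), and the function name is illustrative.

```python
MASK32 = 0xFFFFFFFF  # model 32-bit registers


def xor_swap(a, b):
    """Swap two 32-bit values using only XOR on the two values themselves."""
    a = (a ^ b) & MASK32   # a now holds (original a) XOR (original b)
    b = (a ^ b) & MASK32   # b now holds original a
    a = (a ^ b) & MASK32   # a now holds original b
    return a, b


swapped = xor_swap(19, 42)
print(swapped)
```

Note that this models two distinct registers; applying the same three XORs with both operands naming the same register would zero it.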
ECE/CS 250 EXAM REFERENCE SHEET (You may keep or discard this sheet)
Powers of two
n     2^n
0     1
1     2
2     4
3     8
4     16
5     32
6     64
7     128
8     256
9     512
10    1,024
11    2,048
12    4,096
13    8,192
14    16,384
15    32,768
16    65,536
MIPS info
The Intel x86 instruction set
AAA CMOVE CVTPS2DQ FCMOVU FNOP GS JNGE MFENCE MULSS PCMPISTRM PMULLD PUNPCKLDQ SETC STOSB
AAD CMOVG CVTPS2PD FCOM FNSAVE HADDPD JNL MINPD MWAIT PEXTRB PMULLW PUNPCKLQDQ SETE STOSD
AAM CMOVGE CVTPS2PI FCOM2 FNSETPM HADDPS JNLE MINPS NEG PEXTRD PMULUDQ PUNPCKLWD SETG STOSW
AAS CMOVL CVTSD2SI FCOMI FNSTCW HINT_NOP JNO MINSD NOP PEXTRQ POP PUSH SETGE STR
ADC CMOVLE CVTSD2SS FCOMIP FNSTENV HLT JNP MINSS NOT PEXTRW POPA PUSHA SETL SUB
ADD CMOVNA CVTSI2SD FCOMP FNSTSW HSUBPD JNS MONITOR OR PHADDD POPAD PUSHAD SETLE SUBPD
ADDPD CMOVNAE CVTSI2SS FCOMP3 FPATAN HSUBPS JNZ MOV ORPD PHADDSW POPCNT PUSHF SETNA SUBPS
ADDPS CMOVNB CVTSS2SD FCOMP5 FPREM ICEBP JO MOVAPD ORPS PHADDW POPF PUSHFD SETNAE SUBSD
ADDSD CMOVNBE CVTSS2SI FCOMPP FPREM1 IDIV JP MOVAPS OUT PHMINPOSUW POPFD PXOR SETNB SUBSS
ADDSS CMOVNC CVTTPD2DQ FCOS FPTAN IMUL JPE MOVBE OUTS PHSUBD POR RCL SETNBE SYSENTER
ADDSUBPD CMOVNE CVTTPD2PI FDECSTP FRNDINT IN JPO MOVD OUTSB PHSUBSW PREFETCHNTA RCPPS SETNC SYSEXIT
ADDSUBPS CMOVNG CVTTPS2DQ FDIV FRSTOR INC JS MOVDDUP OUTSD PHSUBW PREFETCHT0 RCPSS SETNE TEST
ADX CMOVNGE CVTTPS2PI FDIVP FS INS JZ MOVDQ2Q OUTSW PINSRB PREFETCHT1 RCR SETNG UCOMISD
AMX CMOVNL CVTTSD2SI FDIVR FSAVE INSB LAHF MOVDQA PABSB PINSRD PREFETCHT2 RDMSR SETNGE UCOMISS
AND CMOVNLE CVTTSS2SI FDIVRP FSCALE INSD LAR MOVDQU PABSD PINSRQ PSADBW RDPMC SETNL UD
ANDNPD CMOVNO CWD FFREE FSIN INSERTPS LDDQU MOVHLPS PABSW PINSRW PSHUFB RDTSC SETNLE UD2
ANDNPS CMOVNP CWDE FFREEP FSINCOS INSW LDMXCSR MOVHPD PACKSSDW PMADDUBSW PSHUFD RDTSCP SETNO UNPCKHPD
ANDPD CMOVNS DAA FIADD FSQRT INT LDS MOVHPS PACKSSWB PMADDWD PSHUFHW REP SETNP UNPCKHPS
ANDPS CMOVNZ DAS FICOM FST INT1 LEA MOVLHPS PACKUSDW PMAXSB PSHUFLW REPE SETNS UNPCKLPD
ARPL CMOVO DEC FICOMP FSTCW INTO LEAVE MOVLPD PACKUSWB PMAXSD PSHUFW REPNE SETNZ UNPCKLPS
BLENDPD CMOVP DIV FIDIV FSTENV INVD LES MOVLPS PADDB PMAXSW PSIGNB REPNZ SETO VERR
BLENDPS CMOVPE DIVPD FIDIVR FSTP INVEPT LFENCE MOVMSKPD PADDD PMAXUB PSIGND REPZ SETP VERW
BLENDVPD CMOVPO DIVPS FILD FSTP1 INVLPG LFS MOVMSKPS PADDQ PMAXUD PSIGNW RETF SETPE VMCALL
BLENDVPS CMOVS DIVSD FIMUL FSTP8 INVVPID LGDT MOVNTDQ PADDSB PMAXUW PSLLD RETN SETPO VMCLEAR
BOUND CMOVZ DIVSS FINCSTP FSTP9 IRET LGS MOVNTDQA PADDSW PMINSB PSLLDQ ROL SETS VMLAUNCH
BSF CMP DPPD FINIT FSTSW IRETD LIDT MOVNTI PADDUSB PMINSD PSLLQ ROR SETZ VMPTRLD
BSR CMPPD DPPS FIST FSUB JA LLDT MOVNTPD PADDUSW PMINSW PSLLW ROUNDPD SFENCE VMPTRST
BSWAP CMPPS DS FISTP FSUBP JAE LMSW MOVNTPS PADDW PMINUB PSRAD ROUNDPS SGDT VMREAD
BT CMPS EMMS FISTTP FSUBR JB LOCK MOVNTQ PALIGNR PMINUD PSRAW ROUNDSD SHL VMRESUME
BTC CMPSB ENTER FISUB FSUBRP JBE LODS MOVQ PAND PMINUW PSRLD ROUNDSS SHLD VMWRITE
BTR CMPSD ES FISUBR FTST JC LODSB MOVQ2DQ PANDN PMOVMSKB PSRLDQ RSM SHR VMXOFF
BTS CMPSS EXTRACTPS FLD FUCOM JCXZ LODSD MOVS PAUSE PMOVSXBD PSRLQ RSQRTPS SHRD VMXON
CALL CMPSW F2XM1 FLD1 FUCOMI JE LODSW MOVSB PAVGB PMOVSXBQ PSRLW RSQRTSS SHUFPD WAIT
CALLF CMPXCHG FABS FLDCW FUCOMIP JECXZ LOOP MOVSD PAVGW PMOVSXBW PSUBB SAHF SHUFPS WBINVD
CBW CMPXCHG8B FADD FLDENV FUCOMP JG LOOPE MOVSHDUP PBLENDVB PMOVSXDQ PSUBD SAL SIDT WRMSR
CDQ COMISD FADDP FLDL2E FUCOMPP JGE LOOPNE MOVSLDUP PBLENDW PMOVSXWD PSUBQ SALC SLDT XADD
CLC COMISS FBLD FLDL2T FWAIT JL LOOPNZ MOVSS PCMPEQB PMOVSXWQ PSUBSB SAR SMSW XCHG
CLD CPUID FBSTP FLDLG2 FXAM JLE LOOPZ MOVSW PCMPEQD PMOVZXBD PSUBSW SBB SQRTPD XGETBV
CLFLUSH CRC32 FCHS FLDLN2 FXCH JMP LSL MOVSX PCMPEQQ PMOVZXBQ PSUBUSB SCAS SQRTPS XLAT
CLI CS FCLEX FLDPI FXCH4 JMPF LSS MOVUPD PCMPEQW PMOVZXBW PSUBUSW SCASB SQRTSD XLATB
CLTS CVTDQ2PD FCMOVB FLDZ FXCH7 JNA LTR MOVUPS PCMPESTRI PMOVZXDQ PSUBW SCASD SQRTSS XOR
CMC CVTDQ2PS FCMOVBE FMUL FXRSTOR JNAE MASKMOVDQU MOVZX PCMPESTRM PMOVZXWD PTEST SCASW SS XORPD
CMOVA CVTPD2DQ FCMOVE FMULP FXSAVE JNB MASKMOVQ MPSADBW PCMPGTB PMOVZXWQ PUNPCKHBW SETA STC XORPS
CMOVAE CVTPD2PI FCMOVNB FNCLEX FXTRACT JNBE MAXPD MUL PCMPGTD PMULDQ PUNPCKHDQ SETAE STD XRSTOR
CMOVB CVTPD2PS FCMOVNBE FNDISI FYL2X JNC MAXPS MULPD PCMPGTQ PMULHRSW PUNPCKHQDQ SETALC STI XSAVE
CMOVBE CVTPI2PD FCMOVNE FNENI FYL2XP1 JNE MAXSD MULPS PCMPGTW PMULHUW PUNPCKHWD SETB STMXCSR XSETBV
CMOVC CVTPI2PS FCMOVNU FNINIT GETSEC JNG MAXSS MULSD PCMPISTRI PMULHW PUNPCKLBW SETBE STOS