corrrrr unit iv

Upload: subathra-devi-mourougane

Post on 02-Jun-2018


  • 8/10/2019 COrrrrr Unit IV


    Mailam Engineering College
    (Approved by AICTE, New Delhi, Affiliated to Anna University, Chennai
    & Accredited by National Board of Accreditation (NBA), New Delhi)
    Mailam (Po), Villupuram (Dt). Pin: 604 304

    DEPARTMENT OF COMPUTER APPLICATIONS
    Computer Organization MC9211

    Part A

    1. What do you mean by pipelining? (Jan 2012)

    A pipeline may be visualized as a collection of segments, called pipe stages, through which binary information flows. Each segment performs partial processing as dictated by the task. The result obtained in each segment is transferred to the next segment in the pipeline. The final result is obtained after the data passes through all the segments.

    2. Explain latency and throughput.

    Latency: Each instruction takes a certain amount of time to complete. This is called latency. It is the time difference between when an instruction is issued and when it is completed.

    Throughput: The number of instructions completed in a given time is called throughput.

    3. What are the major characteristics of a pipeline?

    Pipelining cannot be applied to a single task, as it works by splitting multiple tasks into a number of subtasks and operating on them simultaneously.

    The speedup or efficiency achieved by using pipelining depends on the number of pipe stages and the number of available tasks that can be subdivided.

    4. Define control word. (May 2012)

    The combination of control steps used for the generation of control signals is a control word. A control word is a word whose individual bits represent the various control signals.

    5. What are the


    Data hazards.

    Instruction hazards.

    Structural hazards.

    8. What are hazards?

    A hazard is also called a hurdle. It is a situation that prevents the next instruction in the instruction stream from executing during its designated clock cycle. A hazard introduces a stall.

    9. What is meant by data hazards?

    A data hazard is any condition in which either the source or the destination operands of an instruction are not available at the time expected in the pipeline. As a result, some operation has to be delayed, and the pipeline stalls.

    10. What is meant by instruction hazards?

    The pipeline may be stalled because of a delay in the availability of an instruction. For example, this may be the result of a miss in the cache, requiring the instruction to be fetched from the main memory. Such hazards are called instruction hazards or control hazards.

    11. What is meant by structural hazards?

    A structural hazard is the situation in which two instructions require the use of a given hardware resource at the same time. The most common case in which this hazard may arise is access to memory.

    12. What do you mean by out-of-order execution? Is it desirable?

    In a pipelined processor in which several instructions are in process concurrently, it is possible for instructions to finish out of sequence: one instruction finishes before another that was issued earlier. As far as the main computation is concerned, no hazards will occur, but if an interrupt occurs it creates a problem.

    13. List out various branching techniques used in micro program control unit.

    Bit-ORing

    Using Conditional Variable

    Wide Branch Addressing

    14. What is micro programming and micro programmed control unit?

    Microprogramming is a method of control unit design in which the control signal selection and sequencing information are stored in ROMs or RAMs called the control store or control memory.

    A micro programmed control unit is a general approach used for the implementation of a control unit. Here control signals are generated by a program similar to machine language programs.

    15. Define the term hardwired control. (Jan 2012)

    It is a control unit that uses fixed logic circuits to interpret instructions and generate control signals from them. The fixed logic circuit block includes a combinational circuit that generates the required control outputs for decoding and encoding functions.

    16. What is the necessity of grouping signals?

    It is used to reduce the number of bits in the microinstruction.

    Prepared ByMrs. V.Rekha AP / MCA


    It is used to overcome the drawback that assigning an individual bit to each control signal results in long microinstructions.

    17. Define job sequencing.

    It is a process of scheduling tasks that are awaiting initiation in order to avoid collisions and achieve high throughput.

    18. Write control signals for storing a word in memory.

    R1out, MARin

    R2out, MDRin, Write

    MDRoutE, WMFC

    19. What are the problems faced in instruction pipeline?

    Resource conflicts

    Data dependency

    Branch difficulties

    20. What is register renaming?

    When a temporary register assumes the role of the permanent register whose data it is holding and is given the same name, the technique is called register renaming.

    21. How data hazard can be pre


    All operations and data transfers within the processor take place within time periods defined by the processor clock.

    27. Define multiphase clocking.

    When edge-triggered flip-flops are not used, two or more clock signals may be needed to guarantee proper transfer of data. This is known as multiphase clocking.

    28. What are the three steps required for the memory read operation?

    R1out, MARin, Read

    MDRinE, WMFC

    MDRout, R2in

    29. What are the actions required for executing a complete instruction?

    Fetch the instruction

    Fetch the first operand (the contents of the memory location pointed to by R3)

    Perform the addition

    Load the result into R1

    30. Define register file.

    A three-bus structure is used to connect the registers and the ALU of a processor. All general-purpose registers are combined into a single block called the register file.

    31. Define control store.

    The microroutines for all instructions in the instruction set of a computer are stored in a special memory called the control store.

    32. Define


    The instruction fetch unit executes the branch instruction concurrently with the execution of other instructions. This technique is referred to as branch folding.

    39. Define branch delay slot.

    When execution of I2 is completed and a branch is to be made, the processor must discard I3 and fetch the instruction at the branch target. The location following a branch instruction is called a branch delay slot.

    40. What is delayed branching?

    A technique called delayed branching can minimize the penalty incurred as a result of conditional branch instructions. The idea is simple. The instructions in the delay slots are always fetched.

    41. Define static branch prediction.

    With either of these schemes, the branch prediction decision is always the same every time a given instruction is executed. Any approach that has this characteristic is called static branch prediction.

    42. Define dynamic branch prediction.

    An approach in which the prediction decision may change depending on execution history is called dynamic branch prediction.

    43. Define multiple-issue.

    A more aggressive approach is to equip the processor with multiple processing units to handle several instructions in parallel in each processor stage. With this arrangement, several instructions start execution in the same clock cycle, and the processor is said to use multiple-issue.

    44. Define commitment unit.

    When out-of-order execution is allowed, a special control unit is needed to guarantee in-order commitment. This is called the commitment unit.

    45. Explain deadlock.

    A deadlock is a situation that can arise when two units, A and B, use a shared resource. Suppose that unit B cannot complete its task until unit A completes its task. At the same time, unit B has been assigned a resource that unit A needs. If this happens, neither unit can complete its task. Unit A is waiting for the resource it needs, which is being held by unit B. At the same time, unit B is waiting for unit A to finish before it can release that resource.

    46. Define superscalar operation.

    Superscalar describes a microprocessor design that makes it possible for more than one instruction to be executed during a single clock cycle. In a superscalar design, the processor or the instruction compiler is able to determine whether an instruction can be carried out independently of other sequential instructions, or whether it has a dependency on another instruction and must be executed in sequence with it.

    47. List out the disa


    The branch instruction processing.

    48. What information determines the control signals? (Dec 2011)

    Instruction opcode is fetched

    2nd half of instruction is fetched with I/O address

    Contents of AC written out to device over data bus

    49. Differentiate precise and imprecise exceptions. (Dec 2011)

    A machine is said to support precise exceptions when it guarantees that all instructions before the instruction causing the exception are executed and retired without being affected by the exception, and that no instruction after the faulting instruction changes the state of the machine before the exception is handled. Any machine that does not give such a guarantee is said to have imprecise exceptions.

    Precise exceptions are a desirable attribute because they help the programmer reason about the logic of the program, especially when debugging in the presence of an exception. Moreover, imprecise exceptions can make the behavior of even a single-threaded program with the same input non-deterministic.

    50. List the techniques used for o


    1. Fetch the contents of the memory location pointed to by the PC. The contents of this location are interpreted as an instruction to be executed. Hence, they are loaded into the IR.

    IR ← [[PC]]

    2. Assuming that the memory is byte addressable, increment the contents of the PC by 4, that is,

    PC ← [PC] + 4

    3. Carry out the actions specified by the instruction in the IR.
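    As a rough illustration of these three steps, the following Python sketch runs a toy fetch-execute loop. The instruction names (loadi, add, halt), the tuple encoding and the dictionary standing in for memory are hypothetical and chosen only for this example; a 4-byte instruction word is assumed.

```python
def run(memory, pc=0):
    """Repeat the three basic steps until a 'halt' instruction is reached."""
    registers = {"R0": 0, "R1": 0}
    while True:
        ir = memory[pc]          # step 1: IR <- [[PC]]
        pc = pc + 4              # step 2: PC <- [PC] + 4
        op = ir[0]               # step 3: carry out the instruction held in IR
        if op == "halt":
            return registers, pc
        if op == "loadi":        # hypothetical: load an immediate value
            _, reg, value = ir
            registers[reg] = value
        elif op == "add":        # hypothetical: dst <- dst + src
            _, dst, src = ir
            registers[dst] += registers[src]


program = {
    0: ("loadi", "R0", 5),
    4: ("loadi", "R1", 7),
    8: ("add", "R0", "R1"),
    12: ("halt",),
}
registers, final_pc = run(program)
print(registers, final_pc)  # {'R0': 12, 'R1': 7} 16
```

    Note that the PC already points past the current instruction while that instruction is being executed, which is why the final PC is 16 rather than 12.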

    Where an instruction occupies more than one word, steps 1 and 2 must be repeated as many times as necessary to fetch the complete instruction. These two steps are usually referred to as the fetch phase; step 3 constitutes the execution phase. In the single-bus organization, the arithmetic and logic unit (ALU) and all the registers are interconnected via a single common bus. This bus is internal to the processor and should not be confused with the external bus that connects the processor to the memory and I/O devices.

    The data and address lines of the external memory bus are connected to the internal processor bus via the memory data register, MDR, and the memory address register, MAR, respectively. Register MDR has two inputs and two outputs. Data may be loaded into MDR either from the memory bus or from the internal processor bus. The data stored in MDR may be placed on either bus.

    The input of MAR is connected to the internal bus, and its output is connected to the external bus. The control lines of the memory bus are connected to the instruction decoder and control logic block. This unit is responsible for issuing the signals that control the operation of all the units inside the processor and for interacting with the memory bus.

    The number and use of the processor registers R0 through R(n - 1) vary considerably from one processor to another. Registers may be provided for general-purpose use by the programmer. Some may be dedicated as special-purpose registers, such as index registers or stack pointers.

    Three registers, Y, Z, and TEMP, have not been mentioned before. These registers are transparent to the programmer, that is, the programmer need not be concerned with them because they are never referenced explicitly by any instruction.

    The multiplexer MUX selects either the output of register Y or a constant value 4 to be provided as input A of the ALU. The constant 4 is used to increment the contents of the program counter. The two possible values of the MUX control input Select are referred to as Select4 and SelectY, for selecting the constant 4 or register Y, respectively.

    As instruction execution progresses, data are transferred from one register to another, often passing through the ALU to perform some arithmetic or logic operation. The instruction decoder and control logic unit is responsible for implementing the actions specified by the instruction loaded in the IR register.

    The decoder generates the control signals needed to select the registers involved and direct the transfer of data. The registers, the ALU, and the interconnecting bus are collectively referred to as the datapath.


    Single bus organization of the data path inside a processor

    An instruction can be executed by performing one or more of the following operations in some specified sequence:

    Transfer a word of data from one processor register to another or to the ALU.

    Perform an arithmetic or a logic operation and store the result in a processor register.

    Fetch the contents of a given memory location and load them into a processor register.

    Store a word of data from a processor register into a given memory location.

    Register Transfers:

    Instruction execution involves a sequence of steps in which data are transferred from one register to another. For each register, two control signals are used to place the contents of that register on the bus or to load the data on the bus into the register. The input and output of register Ri are connected to the bus via switches controlled by the signals Riin and Riout, respectively. When Riin is set to 1, the data on the bus are loaded into Ri. Similarly, when Riout is set to 1, the contents of register Ri are placed on the bus. While Riout is equal to 0, the bus can be used for transferring data from other registers. Suppose that we wish to transfer the contents of register R1 to register R4. This can be accomplished as follows:

    Enable the output of register R1 by setting R1out to 1. This places the contents of R1 on the processor bus.

    Enable the input of register R4 by setting R4in to 1. This loads data from the processor bus into register R4.


    Input and output gating for the registers

    All operations and data transfers within the processor take place within time periods defined by the processor clock. The control signals that govern a particular transfer are asserted at the start of the clock cycle.

    Performing Arithmetic And Logical Operation:

    The ALU is a combinational circuit that has no internal storage. It performs arithmetic and logic operations on the two operands applied to its A and B inputs. One of the operands is the output of the multiplexer MUX and the other operand is obtained directly from the bus. The result produced by the ALU is stored temporarily in register Z. Therefore, a sequence of operations to add the contents of register R1 to those of register R2 and store the result in register R3 is

    1. R1out, Yin

    2. R2out, SelectY, Add, Zin

    3. Zout, R3in
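    The three control steps for this addition can be traced with a small Python sketch. It is only a model under simple assumptions: the registers are held in a dictionary, the single bus is a local variable, and timing within a clock cycle is ignored.

```python
def add_r1_r2_into_r3(regs):
    """Apply the three control steps to a copy of the register values."""
    regs = dict(regs)
    # Step 1: R1out, Yin -> R1 drives the bus, Y latches it
    bus = regs["R1"]
    regs["Y"] = bus
    # Step 2: R2out, SelectY, Add, Zin -> ALU adds Y (input A) and the bus (input B)
    bus = regs["R2"]
    regs["Z"] = regs["Y"] + bus
    # Step 3: Zout, R3in -> Z drives the bus, R3 latches it
    bus = regs["Z"]
    regs["R3"] = bus
    return regs


result = add_r1_r2_into_r3({"R1": 10, "R2": 32, "R3": 0, "Y": 0, "Z": 0})
print(result["R3"])  # 42
```

    Because only one value can be on the single bus at a time, registers Y and Z are needed to hold one operand and the result across steps.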

    Fetching a Word from Memory:

    The connection for register MDR has four control signals: MDRin and MDRout control the connection to the internal bus, and MDRinE and MDRoutE control the connection to the external bus. The circuit is easily modified to provide the additional connections.


    Input and output gating for one register bit.

    Connections and control signals for register MDR

    Example:

    MAR ← [R1]

    Start a Read operation on the memory bus

    Wait for the MFC response from the memory

    Load MDR from the memory bus

    R2 ← [MDR]

    Storing a Word in Memory:

    Writing a word into a memory location follows a similar procedure. The desired address is loaded into MAR. Then, the data to be written are loaded into MDR, and a Write command is issued. Hence, executing the instruction Move R2,(R1) requires the following sequence:

    1. R1out, MARin

    2. R2out, MDRin, Write

    3. MDRoutE, WMFC

    As in the case of the read operation, the Write control signal causes the memory bus interface hardware to issue a Write command on the memory bus. The processor remains in step 3 until the memory operation is completed and an MFC response is received.
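    The read and write sequences can be compared side by side in a short Python sketch. This is a simplified model: memory is a dictionary, and the wait for MFC is assumed to complete immediately, so WMFC appears only as a comment.

```python
def memory_read(regs, memory):
    """Move (R1),R2: load R2 from the location whose address is in R1."""
    regs = dict(regs)
    regs["MAR"] = regs["R1"]              # step 1: R1out, MARin, Read
    regs["MDR"] = memory[regs["MAR"]]     # step 2: MDRinE, WMFC
    regs["R2"] = regs["MDR"]              # step 3: MDRout, R2in
    return regs


def memory_write(regs, memory):
    """Move R2,(R1): store R2 at the location whose address is in R1."""
    regs = dict(regs)
    memory = dict(memory)
    regs["MAR"] = regs["R1"]              # step 1: R1out, MARin
    regs["MDR"] = regs["R2"]              # step 2: R2out, MDRin, Write
    memory[regs["MAR"]] = regs["MDR"]     # step 3: MDRoutE, WMFC
    return regs, memory


print(memory_read({"R1": 100, "R2": 0}, {100: 55})["R2"])   # 55
print(memory_write({"R1": 100, "R2": 7}, {})[1])            # {100: 7}
```

    In both directions the address travels through MAR and the data through MDR; only the direction of the MDR transfer differs.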

    2. List and explain the steps in


    The updated value is moved from register Z back into the PC during step 2, while waiting for the memory to respond. In step 3, the word fetched from the memory is loaded into the IR.

    Steps 1 through 3 constitute the instruction fetch phase, which is the same for all instructions. The instruction decoding circuit interprets the contents of the IR at the beginning of step 4. This enables the control circuitry to activate the control signals for steps 4 through 7, which constitute the execution phase. The contents of register R3 are transferred to the MAR in step 4, and a memory read operation is initiated.

    Then the contents of R1 are transferred to register Y in step


    Thus, if N = 0 the processor returns to step 1 immediately after step 4. If N = 1, step 5 is performed to load a new value into the PC, thus performing the branch operation.

    3. Discuss multiple bus organization.

    All general-purpose registers are combined into a single block called the register file. The register file is said to have three ports.

    There are two outputs, allowing the contents of two different registers to be accessed simultaneously and have their contents placed on buses A and B. The third port allows the data on bus C to be loaded into a third register during the same clock cycle.

    Buses A and B are used to transfer the source operands to the A and B inputs of the ALU, where an arithmetic or logic operation may be performed. The result is transferred to the destination over bus C. If needed, the ALU may simply pass one of its two input operands unmodified to bus C.

    The ALU control signals for such an operation are R=A or R=B. A second feature is the introduction of the Incrementer unit, which is used to increment the PC by 4. Using the Incrementer eliminates the need to add 4 to the PC using the main ALU, as was done in the single-bus organization.


    Consider the three-operand instruction Add R4,R5,R6

    In step 1, the contents of the PC are passed through the ALU, using the R=B control signal, and loaded into the MAR to start a memory read operation. At the same time the PC is incremented by 4. Note that the value loaded into MAR is the original contents of the PC. The incremented value is loaded into the PC at the end of the clock cycle and will not affect the contents of MAR.

    In step 2, the processor waits for MFC and loads the data received into MDR, then transfers them to IR in step 3.

    Finally, the execution phase of the instruction requires only one control step to complete, step 4. By providing more paths for data transfer, a significant reduction in the number of clock cycles needed to execute an instruction is achieved.
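    The single execution step is possible because the register file can be read twice and written once in the same clock cycle. A minimal Python sketch of that idea follows; the class name RegisterFile and its method clock_cycle are hypothetical, and the ALU is passed in as an ordinary function.

```python
class RegisterFile:
    """Toy three-port register file: two read ports (buses A, B), one write port (bus C)."""

    def __init__(self, names):
        self.values = {name: 0 for name in names}

    def clock_cycle(self, src_a, src_b, dst, alu_op):
        bus_a = self.values[src_a]      # first output port drives bus A
        bus_b = self.values[src_b]      # second output port drives bus B
        bus_c = alu_op(bus_a, bus_b)    # ALU result is placed on bus C
        self.values[dst] = bus_c        # third port loads bus C into dst
        return bus_c


rf = RegisterFile(["R4", "R5", "R6"])
rf.values.update({"R4": 3, "R5": 9})
rf.clock_cycle("R4", "R5", "R6", lambda a, b: a + b)   # Add R4,R5,R6
print(rf.values["R6"])  # 12
```

    Passing lambda a, b: b as the ALU function models the R=B control signal, where one operand is passed unmodified to bus C.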

    4. Explain hardwired control with the block diagram, micro programmed control and micro instruction. (May 2012, Dec 2011 and 2013)

    The processor must have some means of generating the control signals needed in the proper sequence. Computer designers use a wide variety of techniques to solve this problem. The approaches used fall into one of two categories:

    Hardwired control

    Micro programmed control

    The required control signals are determined by the following information:

    Contents of the control step counter

    Contents of the instruction register

    Contents of the condition code flags

    External input signals, such as MFC and interrupt requests

    The decoder/encoder block is a combinational circuit that generates the required control outputs, depending on the state of all its inputs. The decoding and encoding functions can be separated. For any instruction loaded in the IR, one of the output lines INS1 through INSm is set to 1, and all other lines are set to 0.

    The input signals to the encoder block are combined to generate the individual control signals Yin, PCout, Add, End, and so on. As an example, consider how the encoder generates the Zin control signal for this processor organization. The circuit implements the logic function

    Zin = T1 + T6·ADD + T4·BR + ...

    so the signal is asserted during time slot T1 for all instructions, during T6 for an Add instruction, during T4 for an unconditional branch instruction, and so on. A similar circuit generates the End control signal from its own logic function.
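    The encoder logic for this one signal can be written as a boolean function. The Python sketch below includes only the three terms named in the text (T1 for every instruction, T6 for Add, T4 for an unconditional branch); the instruction labels "ADD" and "BR" and the function name z_in are illustrative, and a real encoder would have further terms.

```python
def z_in(step, instruction):
    """Zin = T1 + T6.ADD + T4.BR, restricted to the terms named in the text."""
    t1 = step == 1
    t4 = step == 4
    t6 = step == 6
    add = instruction == "ADD"
    br = instruction == "BR"
    return t1 or (t6 and add) or (t4 and br)


for step in range(1, 8):
    print(step, z_in(step, "ADD"), z_in(step, "BR"))
```

    Each OR term corresponds to one time slot in which some instruction needs the ALU result latched into register Z.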

    The End signal starts a new instruction fetch cycle by resetting the control step counter to its starting value. Another control signal is called RUN. When set to 1, RUN causes the counter to be incremented by one at the end of every clock cycle. When RUN is equal to 0, the counter stops counting.
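    The behaviour of the control step counter under RUN and End can be sketched as follows. Two points are assumptions made for the sketch and not stated in the text: the starting value is taken to be 1, and End is given priority over RUN when both are asserted.

```python
class ControlStepCounter:
    """Toy control step counter driven by the RUN and End signals."""

    def __init__(self, start=1):
        self.start = start
        self.step = start

    def clock(self, run, end):
        """Advance to the state at the end of one clock cycle and return the step."""
        if end:                 # End resets the counter to its starting value
            self.step = self.start
        elif run:               # RUN = 1: increment by one
            self.step += 1
        return self.step        # RUN = 0: the counter holds its value


counter = ControlStepCounter()
trace = [
    counter.clock(run=True, end=False),
    counter.clock(run=False, end=False),   # e.g. waiting for the memory
    counter.clock(run=True, end=False),
    counter.clock(run=True, end=True),
]
print(trace)  # [2, 2, 3, 1]
```

    Holding RUN at 0 is how the processor can remain in one step while it waits, for example for an MFC response.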

    The control hardware can be viewed as a state machine that changes from one state to another in every clock cycle, depending on the contents of the instruction register, the condition codes, and the external inputs. The outputs of the state machine are the control signals. The sequence of operations carried out by this machine is determined by the wiring of the logic elements, hence the name "hardwired." A controller that uses this approach can operate at high speed. However, it has little flexibility, and the complexity of the instruction set it can implement is limited.


    A Complete Processor:

    This structure has an instruction unit that fetches instructions from an instruction cache, or from the main memory when the desired instructions are not already in the cache. It has separate processing units to deal with integer data and floating-point data. A data cache is inserted between these units and the main memory. Using separate caches for instructions and data is common practice in many processors today.

    Micro programmed control (May 2012)

    An alternative scheme to hardwired control is called micro programmed control, in which control signals are generated by a program similar to machine language programs.

    A control word (CW) is a word whose individual bits represent the various control signals. Each of the control steps in the control sequence of an instruction defines a unique combination of 1s and 0s in the CW.


    In the CWs corresponding to the 7 steps, SelectY is represented by Select = 0 and Select4 by Select = 1. A sequence of CWs corresponding to the control sequence of a machine instruction constitutes the microroutine for that instruction, and the individual control words in this microroutine are referred to as microinstructions.

    The microroutines for all instructions in the instruction set of a computer are stored in a special memory called the control store. The control unit can generate the control signals for any instruction by sequentially reading the CWs of the corresponding microroutine from the control store. This suggests organizing the control unit around the control store.

    To read the control words sequentially from the control store, a microprogram counter (μPC) is used. Every time a new instruction is loaded into the IR, the output of the block labeled "starting address generator" is loaded into the μPC.

    In microprogrammed control, an alternative approach is to use conditional branch microinstructions. In addition to the branch address, these microinstructions specify which of the external inputs, condition codes, or, possibly, bits of the instruction register should be checked as a condition for branching to take place.


    The instruction Branch<0 may now be implemented by a microroutine. After loading this instruction into IR, a branch microinstruction transfers control to the corresponding microroutine, which is assumed to start at location 25 in the control store. This address is the output of the starting address generator block. The microinstruction at location 25 tests the N bit of the condition codes. If this bit is equal to 0, a branch takes place to location 0 to fetch a new machine instruction. Otherwise, the microinstruction at location 26 is executed to put the branch target address into register Z. The microinstruction in location 27 loads this address into the PC.
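    The flow through this microroutine can be traced with a short Python sketch. The control store locations 25, 26 and 27 and the return to location 0 follow the description above; computing the branch target as PC plus an offset is an assumption made for the example, as are the function and variable names.

```python
def branch_if_negative(n_flag, pc, offset):
    """Trace the microroutine at locations 25-27; return (visited locations, new PC)."""
    regs = {"PC": pc, "Z": 0}
    upc = 25                      # supplied by the starting address generator
    visited = []
    while upc != 0:               # location 0 starts the next instruction fetch
        visited.append(upc)
        if upc == 25:             # test the N bit of the condition codes
            upc = 0 if n_flag == 0 else 26
        elif upc == 26:           # put the branch target address into Z
            regs["Z"] = regs["PC"] + offset
            upc = 27
        elif upc == 27:           # load the target address into the PC
            regs["PC"] = regs["Z"]
            upc = 0
    return visited, regs["PC"]


print(branch_if_negative(0, 100, 40))  # ([25], 100)
print(branch_if_negative(1, 100, 40))  # ([25, 26, 27], 140)
```

    When N = 0 only one microinstruction of the routine is executed and the PC is left unchanged, so the branch is not taken.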

    Microinstructions:

    Horizontal and vertical organizations represent the two organizational extremes in microprogrammed control. Many intermediate schemes are also possible, in which the degree of encoding is a design parameter. The layout is a horizontal organization because it groups only mutually exclusive microoperations in the same fields. As a result, it does not limit in any way the processor's ability to perform various microoperations in parallel.

    Highly encoded schemes that use compact codes to specify only a small number of control functions in each microinstruction are referred to as a vertical organization. On the other hand, the minimally encoded scheme, in which many resources can be controlled with a single microinstruction, is called a horizontal organization.

    The horizontal approach is useful when a higher operating speed is desired and when the machine structure allows parallel use of resources. The vertical approach results in considerably slower operating speeds because more microinstructions are needed to perform the desired control functions.


    5. Explain in detail the implementation of pipeline with a neat diagram. (Jan 2012)

    In computer architecture, pipelining means executing machine instructions concurrently. Pipelining is used in modern computers to achieve high performance. The speed of execution of programs is influenced by many factors. One way to improve performance is to use faster circuit technology to build the processor and the main memory. Another possibility is to arrange the hardware so that more than one operation can be performed at the same time. In this way, the number of operations performed per second is increased even though the elapsed time needed to perform any one operation is not changed.

    Pipelining is a particularly effective way of organizing concurrent activity in a computer system. The basic idea is very simple. It is frequently encountered in manufacturing plants, where pipelining is commonly known as an assembly-line operation. The processor executes a program by fetching and executing instructions, one after the other. Let Fi and Ei refer to the fetch and execute steps for instruction Ii. Execution of a program consists of a sequence of fetch and execute steps.

    Now consider a computer that has two separate hardware units, one for fetching instructions and another for executing them. The instruction fetched by the fetch unit is deposited in an intermediate storage buffer, B1. This buffer is needed to enable the execution unit to execute the instruction while the fetch unit is fetching the next instruction. The results of execution are deposited in the destination location specified by the instruction. The data operated on by the instructions are inside the block labeled "Execution unit".

    The computer is controlled by a clock whose period is such that the fetch and execute steps of any instruction can each be completed in one clock cycle. Operation of the computer proceeds as follows. In the first clock cycle, the fetch unit fetches an instruction I1 (step F1) and stores it in buffer B1 at the end of the clock cycle. In the second clock cycle, the instruction fetch unit proceeds with the fetch operation for instruction I2 (step F2). Meanwhile, the execution unit performs the operation specified by instruction I1, which is available to it in buffer B1 (step E1). By the end of the second clock cycle, the execution of instruction I1 is completed and instruction I2 is available. Instruction I2 is stored in B1, replacing I1, which is no longer needed. Step E2 is performed by the execution unit during the third clock cycle, while instruction I3 is being fetched by the fetch unit. In this manner, both the fetch and execute units are kept busy all the time.

    F Fetch: read the instruction from the memory.

    D Decode: decode the instruction and fetch the source operand(s).

    E Execute: perform the operation specified by the instruction.

    W Write: store the result in the destination location.
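    How these four stages overlap for successive instructions can be shown with a small Python sketch that builds a timing chart. It assumes an ideal pipeline with no stalls, one clock cycle per stage; the function name timing_chart is illustrative.

```python
def timing_chart(num_instructions, stages=("F", "D", "E", "W")):
    """Return one row per instruction; column c holds the stage active in cycle c + 1."""
    total_cycles = num_instructions + len(stages) - 1
    rows = []
    for i in range(num_instructions):
        row = ["."] * total_cycles
        for s, name in enumerate(stages):
            row[i + s] = f"{name}{i + 1}"   # instruction i+1 enters stage s in cycle i+s+1
        rows.append(row)
    return rows


for row in timing_chart(3):
    print(" ".join(f"{cell:>2}" for cell in row))
```

    Reading any column from top to bottom shows which instruction occupies each stage in that clock cycle; once the pipeline is full, every stage is busy.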

    Role of Cache Memory:

    Each stage in a pipeline is expected to complete its operation in one clock cycle. Hence, the clock period should be sufficiently long to complete the task being performed in any stage. Pipelining is most effective in improving performance if the tasks being performed in different stages require about the same amount of time.

    Pipeline Performance:

    The pipelined processor completes the processing of one instruction in each clock cycle, which means that the rate of instruction processing is four times that of sequential operation. The potential increase in performance resulting from pipelining is proportional to the number of pipeline stages.
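    The figure of four times can be checked with a short calculation. The Python sketch below assumes an ideal k-stage pipeline with one cycle per stage and no stalls, so that n instructions need n × k cycles sequentially and k + n - 1 cycles when pipelined; the function names are illustrative.

```python
def cycles_sequential(num_instructions, num_stages):
    """Each instruction passes through all stages before the next one starts."""
    return num_instructions * num_stages


def cycles_pipelined(num_instructions, num_stages):
    """The first instruction fills the pipeline; each later one completes one cycle after the previous."""
    return num_stages + num_instructions - 1


def speedup(num_instructions, num_stages):
    return cycles_sequential(num_instructions, num_stages) / cycles_pipelined(num_instructions, num_stages)


for n in (4, 100, 10000):
    print(n, round(speedup(n, 4), 3))
```

    The speedup stays below the number of stages and approaches it only for long instruction sequences, which is why the factor of four is described as a potential increase.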


    6. What is a data hazard? How will you o


    Operand forwarding:

    The data hazard just described arises because one instruction, instruction I2, is waiting for data to be written in the register file. However, these data are available at the output of the ALU once the Execute stage completes step E1. Hence, the delay can be reduced, or possibly eliminated, if we arrange for the result of instruction I1 to be forwarded directly for use in step E2.

    Consider the processor datapath involving the ALU and the register file. This arrangement is similar to the three-bus structure, except that registers SRC1, SRC2, and RSLT have been added. These registers constitute interstage buffers needed for pipelined operation. Registers SRC1 and SRC2 are part of buffer B2 and RSLT is part of B3. The data forwarding mechanism is provided by the blue connection lines. The two multiplexers connected at the inputs to the ALU allow the data on the destination bus to be selected instead of the contents of either the SRC1 or SRC2 register. When the instructions are executed in this datapath, the operations performed in each clock cycle are as follows. After decoding instruction I2 and detecting the data dependency, a decision is made to use data forwarding. The operand not involved in the dependency, register R2, is read and loaded in register SRC1 in clock cycle 3. In the next clock cycle, the product produced by instruction I1 is available in register RSLT, and because of the forwarding connection, it can be used in step E2. Hence, execution of I2 proceeds without interruption.

    Handling data hazards in software:

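The dependency can also be handled in software: the compiler inserts NOP (no-operation) instructions between the two dependent instructions, so that the result of I1 has been written before I2 reads it.

I1: Mul R2,R3,R4
    NOP
    NOP
I2: Add R5,R4,R6

Side effects: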

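The data dependencies encountered in the preceding examples are explicit and easily detected, because the register involved is named as the destination in instruction I1 and as a source in I2. Sometimes an instruction changes the contents of a register other than the one named as the destination. An instruction that uses an autoincrement or autodecrement addressing mode is one example; an instruction that implicitly sets the condition code flags is another. Such side effects create dependencies that are harder to detect.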
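The software handling of the Mul/Add dependency shown earlier (two NOPs between the dependent instructions) can be sketched as a small compiler-style pass. This is only an illustration: the `Instr` record, the `insert_nops` helper, and the required gap of two instructions are assumptions chosen to match the example, not a description of any real compiler.

```python
from typing import List, NamedTuple, Tuple


class Instr(NamedTuple):
    text: str                  # assembly text, for display only
    sources: Tuple[str, ...]   # registers read by the instruction
    dest: str                  # register written ("" if none)


NOP = Instr("NOP", (), "")


def insert_nops(program: List[Instr], gap: int = 2) -> List[Instr]:
    """Return a copy of `program` in which at least `gap` instructions
    separate any write of a register from a later read of it."""
    out: List[Instr] = []
    for instr in program:
        needed = 0
        # Look back over the most recent `gap` instructions already emitted.
        for distance, prev in enumerate(reversed(out[-gap:]), start=1):
            if prev.dest and prev.dest in instr.sources:
                needed = max(needed, gap - distance + 1)
        out.extend([NOP] * needed)
        out.append(instr)
    return out


if __name__ == "__main__":
    program = [
        Instr("Mul R2,R3,R4", ("R2", "R3"), "R4"),
        Instr("Add R5,R4,R6", ("R5", "R4"), "R6"),
    ]
    for instr in insert_nops(program):
        print(instr.text)
```

If an independent instruction already sits between the two dependent ones, the pass inserts only one NOP, which is why reordering useful instructions into those slots is preferable to padding.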

Classification of data dependent hazards:

Data dependent hazards can be classified into three types according to the pattern of data updates. Consider two instructions I1 and I2, with I1 occurring before I2 in program order. Let D(i) denote the domain of instruction Ii (the locations it reads) and R(i) its range (the locations it writes).

I. Read After Write (RAW) (flow dependence hazard) ( R(1) ∩ D(2) ≠ ∅ )

A RAW data hazard refers to a situation where I2 refers to a result of I1 that has not yet been calculated or retrieved.

II. Write After Read (WAR) (anti-dependence hazard) ( D(1) ∩ R(2) ≠ ∅ )

A write after read (WAR) data hazard represents a problem with concurrent execution: I2 may write a location before I1 has read the old value from it.

III. Write After Write (WAW) (output dependence hazard) ( R(1) ∩ R(2) ≠ ∅ )

A write after write (WAW) data hazard may occur in a concurrent execution environment: if I2 writes a location before I1 does, the location is left holding the wrong final value.
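The three hazard classes amount to three set-intersection tests: what I1 writes against what I2 reads (RAW), what I1 reads against what I2 writes (WAR), and what both write (WAW). A minimal sketch follows; the function name and the way registers are passed in are hypothetical.

```python
from typing import Iterable, List


def classify_hazards(
    reads1: Iterable[str],
    writes1: Iterable[str],
    reads2: Iterable[str],
    writes2: Iterable[str],
) -> List[str]:
    """Return the hazard types between I1 and I2, where I1 precedes I2
    in program order."""
    hazards = []
    if set(writes1) & set(reads2):
        hazards.append("RAW")   # I2 reads what I1 writes
    if set(reads1) & set(writes2):
        hazards.append("WAR")   # I2 writes what I1 reads
    if set(writes1) & set(writes2):
        hazards.append("WAW")   # both write the same location
    return hazards


if __name__ == "__main__":
    # I1: Mul R2,R3,R4 (writes R4); I2: Add R5,R4,R6 (reads R4)
    print(classify_hazards({"R2", "R3"}, {"R4"}, {"R5", "R4"}, {"R6"}))
```

A pair of instructions can exhibit more than one hazard type at once, which is why the function returns a list.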

7. Discuss Instruction hazards. [Jan 2012, 2013]

Pipelined execution of instructions reduces execution time and improves performance, provided the pipeline is supplied with a steady stream of instructions. Whenever this stream is interrupted, the pipeline stalls, as happens in the case of a cache miss. A branch instruction may also cause the pipeline to stall. The effect of branch instructions and the techniques that can be used for mitigating their impact are discussed below, first for unconditional branches and then for conditional branches.


Unconditional branches:

Consider a sequence of instructions being executed in a two-stage pipeline. Instructions I1 to I3 are stored at successive memory addresses, and I2 is a branch instruction. Let the branch target be instruction Ik. In clock cycle 3, the fetch operation for instruction I3 is in progress at the same time that the branch instruction is being decoded and the target address computed. In clock cycle 4, the processor must discard I3, which has been incorrectly fetched, and fetch instruction Ik. In the meantime, the hardware unit responsible for the Execute (E) step must be told to do nothing during that clock period. Thus, the pipeline is stalled for one clock cycle.
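The cycle-by-cycle behaviour described for the two-stage pipeline can be reproduced with a small trace. The sketch below is a simplified model, not a description of real hardware: it assumes one Fetch and one Execute stage, resolves a branch in its Execute cycle, and uses a hypothetical `branch_targets` mapping from a branch instruction to its target.

```python
from typing import Dict, List, Optional, Tuple


def trace_two_stage(
    program: List[str], branch_targets: Dict[str, str]
) -> List[Tuple[int, Optional[str], Optional[str]]]:
    """Return (cycle, instruction in Fetch, instruction in Execute) tuples.

    `program` lists instruction names in memory order. Only forward
    branches are expected; a backward branch would loop forever here.
    """
    index = {name: i for i, name in enumerate(program)}
    trace = []
    pc = 0
    in_execute: Optional[str] = None
    cycle = 1
    while pc < len(program) or in_execute is not None:
        fetched = program[pc] if pc < len(program) else None
        trace.append((cycle, fetched, in_execute))
        if in_execute in branch_targets:
            # Branch resolved: the instruction fetched in this cycle is
            # discarded, so Execute idles in the next cycle.
            pc = index[branch_targets[in_execute]]
            in_execute = None
        else:
            in_execute = fetched
            if fetched is not None:
                pc += 1
        cycle += 1
    return trace


if __name__ == "__main__":
    for row in trace_two_stage(["I1", "I2", "I3", "Ik"], {"I2": "Ik"}):
        print(row)
```

In the trace, I3 is fetched in cycle 3 and never executed, and the Execute stage is idle in cycle 4 while Ik is fetched, which is the one-cycle stall described above.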

Either a cache miss or a branch instruction stalls the pipeline for one or more clock cycles. To reduce the effect of these interruptions, many processors employ sophisticated fetch units that can fetch instructions before they are needed and put them in a queue. Typically, the instruction queue can store several instructions. A separate unit, which we call the dispatch unit, takes instructions from the front of the queue and sends them to the execution unit. This leads to an organization in which the fetch unit and the instruction queue sit in front of the dispatch and execution units. The dispatch unit also performs the decoding function.

To be effective, the fetch unit must have sufficient decoding and processing capability to recognize and execute branch instructions. It attempts to keep the instruction queue filled at all times to reduce the impact of occasional delays when fetching instructions. If there is a delay in fetching instructions because of a branch or a cache miss, the dispatch unit continues to issue instructions from the instruction queue. Conversely, when the pipeline stalls and the dispatch unit cannot issue instructions, the fetch unit continues to fetch instructions and add them to the queue.

Consider how the queue length changes and how it affects the relationship between different pipeline stages. Suppose that instruction I1 introduces a 2-cycle stall. Since space is available in the queue, the fetch unit continues to fetch instructions and the queue length rises to 3 in clock cycle 6. Instruction I5 is a branch instruction. Instructions I1, I2, I3, I4 and Ik complete execution in successive clock cycles. Hence, the branch instruction does not increase the overall execution time, because the fetch unit has executed the branch concurrently with the execution of other instructions. This technique is referred to as branch folding.

The instruction queue mitigates the impact of branch instructions on performance through the process of branch folding, and it has a similar effect on stalls caused by cache misses. The effectiveness of this technique is enhanced when the instruction fetch unit is able to read more than one instruction at a time from the instruction cache, since this lets the queue refill quickly after a delay.

Conditional branches and branch prediction:

I. Delayed branching

The processor fetches the next instructions before it determines whether the current instruction is a branch instruction. In delayed branching, the instructions in the locations following the branch (the branch delay slots) are always fetched and executed, so the compiler tries to place useful instructions there.

II. Branch Prediction (Static)

Another technique for reducing the branch penalty associated with conditional branches is to attempt to predict whether or not a particular branch will be taken.

III. Dynamic Branch Prediction

The idea is that the processor hardware assesses the likelihood of a given branch being taken by keeping track of branch decisions every time that instruction is executed.
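One common way of keeping track of branch decisions is a 2-bit saturating counter per branch. The notes do not name a specific scheme, so the sketch below is an assumption for illustration; the class name, the initial state, and the example branch address are all hypothetical.

```python
from typing import Dict, Iterable


class TwoBitPredictor:
    """States 0-1 predict 'not taken'; states 2-3 predict 'taken'."""

    def __init__(self) -> None:
        # Maps a branch instruction address to its counter state (0..3).
        self.counters: Dict[int, int] = {}

    def predict(self, address: int) -> bool:
        # Unseen branches start in state 1 (weakly not taken), an assumption.
        return self.counters.get(address, 1) >= 2

    def update(self, address: int, taken: bool) -> None:
        state = self.counters.get(address, 1)
        self.counters[address] = min(state + 1, 3) if taken else max(state - 1, 0)


def count_mispredictions(outcomes: Iterable[bool], address: int = 0x40) -> int:
    predictor = TwoBitPredictor()
    misses = 0
    for taken in outcomes:
        if predictor.predict(address) != taken:
            misses += 1
        predictor.update(address, taken)
    return misses


if __name__ == "__main__":
    # A loop branch taken three times, then not taken, executed twice.
    pattern = [True, True, True, False] * 2
    print(count_mispredictions(pattern))
```

Because the counter needs two consecutive wrong outcomes to flip its prediction, a loop-closing branch is mispredicted only once per loop exit after the counter has warmed up.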

8. Explain Datapath and control considerations

The three-bus structure is suitable for pipelined execution, with a slight modification to support a 4-stage pipeline. There are separate instruction and data caches that use separate address and data connections to the processor. This requires two versions of the MAR register: IMAR for accessing the instruction cache and DMAR for accessing the data cache.

The PC is connected directly to the IMAR, so that the contents of the PC can be transferred to IMAR at the same time that an independent ALU operation is taking place. The data address in DMAR can be obtained directly from the register file or from the ALU, to support the register indirect and indexed addressing modes. Separate MDR registers are provided for read and write operations. Data can be transferred directly between these registers and the register file during load and store operations without the need to pass through the ALU.


Buffer registers have been introduced at the inputs and output of the ALU. These are registers SRC1, SRC2, and RSLT. Forwarding connections may be added if desired. The instruction register has been replaced with an instruction queue, which is loaded from the instruction cache. The output of the instruction decoder is connected to the control signal pipeline. This pipeline holds the control signals in buffers B2 and B3.

The following operations can be performed independently in the processor:

Reading an instruction from the instruction cache
Incrementing the PC
Decoding an instruction
Reading from or writing into the data cache
Reading the contents of up to two registers from the register file
Writing into one register in the register file
Performing an ALU operation

9. Discuss about Superscalar Operation.

Pipelining makes it possible to execute instructions concurrently. Several instructions are present in the pipeline at the same time, but they are in different stages of their execution. While one instruction is performing an ALU operation, another instruction is being decoded and yet another is being fetched from the memory. Instructions enter the pipeline in strict program order.

The maximum throughput of such a pipelined processor is one instruction per clock cycle. Processors equipped with multiple execution units, which can handle several instructions in parallel in each stage, are capable of achieving an instruction execution throughput of more than one instruction per cycle. They are known as superscalar processors. Many modern high-performance processors use this approach.


In a superscalar processor, the detrimental effect on performance of various hazards becomes even more pronounced. The compiler can avoid many hazards through judicious selection and ordering of instructions. For example, the compiler should strive to interleave floating-point and integer instructions. This would enable the dispatch unit to keep both the integer and floating-point units busy most of the time. In general, high performance is achieved if the compiler is able to arrange program instructions to take maximum advantage of the available hardware units.
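The benefit of interleaving integer and floating-point instructions can be seen in a toy dispatch model. The sketch assumes a dispatch unit that issues in program order, at most two instructions per cycle, and only when the two go to different units; it ignores data dependencies and execution latency, and the function name is hypothetical.

```python
from typing import Sequence


def dispatch_cycles(kinds: Sequence[str]) -> int:
    """Count the cycles needed to dispatch `kinds` ('int' or 'fp') in order.

    Two adjacent instructions are issued together only if they use
    different execution units.
    """
    cycles = 0
    i = 0
    while i < len(kinds):
        if i + 1 < len(kinds) and kinds[i] != kinds[i + 1]:
            i += 2   # one integer and one floating-point instruction
        else:
            i += 1   # the next instruction needs the same unit
        cycles += 1
    return cycles


if __name__ == "__main__":
    print(dispatch_cycles(["int", "int", "fp", "fp"]))   # grouped by type
    print(dispatch_cycles(["int", "fp", "int", "fp"]))   # interleaved
```

Under these assumptions the interleaved ordering dispatches in fewer cycles than the grouped ordering, which is the effect the compiler is trying to achieve.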

Out-of-order execution:

Instructions are dispatched in the same order as they appear in the program. However, their execution may be completed out of order. This can cause problems, for example when there are dependencies among instructions or when an exception occurs.

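To guarantee a consistent state when exceptions occur, the results of the execution of instructions must be written into the destination locations strictly in program order. In the example considered, this means we must delay step W2 (the write step of I2) until cycle 6, so that it does not precede the write of the earlier instruction I1. In turn, the integer execution unit must retain the result of instruction I2, and hence it cannot accept instruction I4 until cycle 6. If an exception occurs during an instruction, all subsequent instructions that may have been partially executed are discarded. This is called a precise exception.

It is easier to provide precise exceptions in the case of external interrupts. When an external interrupt is received, the dispatch unit stops reading new instructions from the instruction queue, the instructions remaining in the queue are discarded, and all instructions whose execution is pending continue to completion. At this point, the processor and all its registers are in a consistent state, and interrupt processing can begin.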
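The rule that results are written strictly in program order can be expressed as a running maximum over the cycles in which each instruction would otherwise be ready to write. The cycle numbers in the example below are illustrative assumptions, not figures taken from the notes, and the function name is hypothetical.

```python
from typing import List, Sequence


def in_order_write_cycles(ready_cycles: Sequence[int]) -> List[int]:
    """Given the cycle in which each instruction (in program order) is
    ready to write its result, return the cycle in which it may actually
    write without overtaking any earlier instruction."""
    writes = []
    latest = 0
    for ready in ready_cycles:
        latest = max(latest, ready)   # never write before an earlier instruction
        writes.append(latest)
    return writes


if __name__ == "__main__":
    # Assumed example: I1 and I3 are slow floating-point operations,
    # I2 and I4 are fast integer operations.
    print(in_order_write_cycles([6, 4, 7, 5]))
```

In this assumed example the second instruction is ready in cycle 4 but must hold its result until cycle 6, which is why its execution unit cannot accept new work in the meantime.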


10. Difference between microprogrammed and hardwired control. [Jan 2012]

Hardwired control is a control mechanism that generates control signals by using an appropriate finite state machine (FSM). Microprogrammed control is a control mechanism that generates control signals by using a memory called the control storage (CS), which contains the control signals. Although microprogrammed control seems to be advantageous for CISC machines, since CISC requires systematic development of sophisticated control signals, there is no intrinsic difference between these two control mechanisms.

The pair of "microinstruction register" and "control storage address register" can be regarded as a "state register" for the hardwired control. Note that the control storage can be regarded as a kind of combinational logic circuit. We can assign any 0, 1 values to each output corresponding to each address, which can be regarded as the input for a combinational logic circuit. This is a truth table.
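The view of the control storage as a truth table can be sketched as a lookup from an address to a control word whose bits are the individual control signals. The signal names and the contents of the table below are hypothetical, chosen only to illustrate the idea.

```python
from typing import Dict

# Hypothetical control signals, one per bit of the control word (MSB first).
SIGNALS = ("PC_out", "MAR_in", "Read", "IR_in")

# Hypothetical control storage: address -> control word.
CONTROL_STORE: Dict[int, int] = {
    0: 0b1100,   # PC_out, MAR_in
    1: 0b0010,   # Read
    2: 0b0001,   # IR_in
}


def control_signals(address: int) -> Dict[str, int]:
    """Decode the control word stored at `address` into named signals."""
    word = CONTROL_STORE[address]
    width = len(SIGNALS)
    return {
        name: (word >> (width - 1 - position)) & 1
        for position, name in enumerate(SIGNALS)
    }


if __name__ == "__main__":
    for address in sorted(CONTROL_STORE):
        print(address, control_signals(address))
```

Each row of the table fixes the output bits for one input address, exactly as a row of a truth table does for a combinational circuit; changing the control behaviour means changing the table contents rather than the wiring.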

Microprogrammed control is not always necessary to implement CISC machines. Hardwired control can also be used for implementing sophisticated CISC machines.

Hardwired systems are made to perform in a set manner, implemented with logic, switches, etc. between any input and output in the system. Once the manner in which the control is executed has been built into the hardware, it cannot be changed without modifying the circuitry.

Microprogrammed systems are centered around a computer of some sort, often a microcontroller in small systems, that controls the system using a program. Input is sent to the computer, and the program determines what should be done with the input to come up with an output. So the processor sits between the input and the output, rather than there being a direct link between the input and output.

The versatility of the microprogrammed system far exceeds that of the hardwired system. The systems can also be considerably smaller. A complex microcontroller can be quite a bit smaller than a collection of logic and switches providing the same functionality.

11. What is branch penalty? Explain how branch penalty is reduced. [Dec 2011]

A branch instruction loads the processor's program counter with a new non-sequential value. Consequently, all the instructions whose execution was started before the branch was taken are suddenly redundant, and the pipeline has to be refilled with instructions following the branch target address. The cost of executing an operation that causes a non-sequential flow of control is known as the branch penalty.

The handling of instructions that modify the flow of control is concerned with reducing or even eliminating the bubble in the RISC's pipeline caused when a branch is taken; that is, with ways of reducing the branch penalty. Some of the techniques involve limiting the damage done by a branch, and some techniques attempt to predict the outcome of a branch before it has been executed.

Several instructions modify the flow of control; for example, the unconditional branch, the conditional branch, the subroutine call, and the subroutine return. Internally generated traps and exceptions and externally generated interrupts also modify the flow of control. Subroutine calls and returns are not normally regarded as branch operations from the computer architect's point of view, but they have similar characteristics from the computer designer's point of view; that is, they also incur a branch penalty. The unconditional branch is always taken and forces execution to continue at the target address. An unconditional branch is equivalent to the high-level language go to, and its outcome is known at compile time.

Reduce branch penalty:

The outcome of a conditional branch is determined by the state of one or more flag bits in the processor's condition code register and is therefore not known until runtime. The conditional branch may or may not be taken. When a branch is not taken, the outcome is sometimes called in line, because the instruction immediately following the branch is executed next. A subroutine call is a type of unconditional branch that saves the return address. Similarly, a subroutine return is an unconditional branch that fetches the target address from a register or the stack. Some computers support conditional subroutine calls and returns.

The branch penalty can be reduced by the following steps:

1. Predict branch/jump instructions and the branch direction (taken or not taken)
2. Predict the branch/jump target address (for taken branches)
3. Speculatively execute instructions along the predicted path
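The effect of prediction on the branch penalty can be estimated with a standard back-of-the-envelope formula: the average cycles per instruction rise above the ideal value by the fraction of branches, times the fraction of those that are mispredicted, times the penalty in cycles. The numbers used below are made-up inputs for illustration, and the function name is hypothetical.

```python
def effective_cpi(
    branch_fraction: float,
    misprediction_rate: float,
    penalty_cycles: float,
    base_cpi: float = 1.0,
) -> float:
    """Average cycles per instruction when only mispredicted branches
    pay the branch penalty."""
    return base_cpi + branch_fraction * misprediction_rate * penalty_cycles


if __name__ == "__main__":
    # Assumed workload: 20% branches, 2-cycle penalty.
    no_prediction = effective_cpi(0.20, 1.00, 2)   # every branch pays the penalty
    with_prediction = effective_cpi(0.20, 0.10, 2)  # 90% predicted correctly
    print(no_prediction, with_prediction)
```

With these assumed inputs, correct prediction of most branches removes most of the extra cycles, which is the motivation for the three steps listed above.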

    Anna #ni


P. What is branch penalty? Explain how branch penalty is reduced. Ref. No.: 11
