Super Scalar Issue Despatch

Upload: rohit

Post on 08-Jan-2016


TRANSCRIPT

  • CSL718: Superscalar Processors
    Issue and Despatch
    23rd Jan, 2006

  • Early proposals/prototypes (timeline 1982-1989)
    Organisations: IBM, DEC, Stanford U, Kyushu U
    Projects: Cheetah, America project (4), Multititan project (2), Match (2), Torch (4), SIMP (4), DSNS (4)
    Timeline marker: the term "superscalar"

  • Commercial superscalars: RISCs
    - Intel: 960KA/KB, 960CA (3), 1989
    - IBM: Power 1 RS/6000 (4), 1990
    - HP: PA7000, PA7100 (2), 1992
    - SUN: SPARC, SuperSparc (3), 1992
    - DEC: Alpha 21064 (2), 1992
    - Motorola: MC88100, MC88110 (2), 1993
    - Motorola: PowerPC 601/603 (3), 1993
    - MIPS: R4000, R8000 (4), 1994

  • Commercial superscalars: CISCs
    - Intel: 80486, Pentium (2), 1993
    - Motorola: MC68040, MC68060 (2), 1993
    - Gmicro: Gmicro/100p, Gmicro 500 (2), 1993
    - AMD: K5 (2), 4 RISC instr, 1995
    - CYRIX: M1 (2), 1995

  • Tasks of superscalar processing
    - Parallel decoding and issue
    - Parallel instruction execution
    - Preserving the sequential consistency of instruction execution and exception processing

  • Superscalar decode and issue
    Figure: scalar issue vs. superscalar issue. In both cases: I-cache → instruction buffer → decode & issue (pipeline stages IF, D/I).

  • Parallel Decoding
    - Fetch multiple instructions into the instruction buffer
    - Decode multiple instructions in parallel (instruction window)
    - Possibly check dependencies among these as well as with the instructions already under execution
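The dependency check named on the parallel decoding slide can be illustrated with a minimal Python sketch. It only looks for read-after-write dependencies, and the `Instr` record and `raw_dependencies` function are hypothetical names introduced here, not part of the slides.

```python
from typing import NamedTuple, Tuple


class Instr(NamedTuple):
    """Hypothetical decoded instruction: a name, one destination, some sources."""
    name: str
    dest: str
    srcs: Tuple[str, ...]


def raw_dependencies(window, in_flight=()):
    """Map each window instruction to the producers it must wait for (RAW only).

    `in_flight` holds instructions already under execution, oldest first;
    `window` holds the instructions being decoded, in program order.
    """
    last_writer = {}                      # register -> most recent pending producer
    for older in in_flight:
        last_writer[older.dest] = older.name
    deps = {}
    for ins in window:
        deps[ins.name] = {last_writer[r] for r in ins.srcs if r in last_writer}
        last_writer[ins.dest] = ins.name  # later readers depend on this instruction
    return deps


window = [
    Instr("a", "r1", ("r2", "r3")),
    Instr("b", "r4", ("r1", "r5")),   # reads r1 produced by a
    Instr("c", "r6", ("r7", "r8")),   # reads r7 produced by in-flight x
]
in_flight = [Instr("x", "r7", ("r0",))]
print(raw_dependencies(window, in_flight))
```

In hardware the comparisons are done in parallel by comparators; the sequential loop here only shows which register pairs have to be compared.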

  • Pre-decoding
    - Do partial decoding while instructions are being loaded into the I-cache
    - Decoded information is appended to the instruction
    - This includes instruction class, resources required, etc.

    Figure: second-level cache or main memory → (N bits/cycle) → pre-decode unit → (N + n bits/cycle) → I-cache

  • Number of Pre-decode bits
    Processor: no. of pre-decode bits
    - PA 7200 (1995): 5
    - PA 8000 (1996): 5
    - PowerPC 620 (1996): 7
    - UltraSparc (1995): 4
    - HAL PM1 (1995): 4
    - AMD K5 (1995): 5 (per byte)
    - R 10000 (1996): 4

  • Issue vs Dispatch
    - Blocking issue: decode and issue to the execution unit (EU). Instructions may be blocked due to data dependency.
    - Non-blocking issue: decode and issue to a buffer, then dispatch from the buffer to the EU. Instructions are not blocked due to data dependency.

  • Blocking Issue
    Figure: instruction buffer (issue window) → decode, check & issue → EUs

  • Non-blocking (shelved) Issue
    Figure: instruction buffer → decode & issue → reservation stations → dependency checking/dispatch → EUs

  • Handling of Issue Blockages
    - Preserving issue order: in-order or out of order
    - Alignment of instruction issue: aligned or unaligned

  • Issue Order
    Figure: an issue window of instructions to be issued (labelled a to e) and the instructions issued, shown for two cases: issue in strict program order, and out-of-order issue (example: MC 88110, PowerPC 601).
    Legend: independent instruction, dependent instruction, issued instruction.
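The two issue orders can be contrasted with a small Python sketch. It assumes a simplified model in which each instruction in the window is already classified as independent or dependent; the function names and the `dependent` set are illustrative, not taken from the slides.

```python
def issue_in_order(window, is_independent):
    """Strict program order: stop at the first dependent instruction."""
    issued = []
    for ins in window:
        if not is_independent(ins):
            break                 # everything behind a blocked instruction waits
        issued.append(ins)
    return issued


def issue_out_of_order(window, is_independent):
    """Out-of-order issue: a dependent instruction does not block later ones."""
    return [ins for ins in window if is_independent(ins)]


window = ["a", "b", "c", "d"]     # program order, oldest first
dependent = {"b"}                 # assumed: b waits for an earlier result


def independent(ins):
    return ins not in dependent


print(issue_in_order(window, independent))      # ['a']
print(issue_out_of_order(window, independent))  # ['a', 'c', 'd']
```

With in-order issue, one dependent instruction holds back c and d even though they could execute; out-of-order issue lets them pass.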

  • Alignment
    Figure: aligned issue (fixed window) vs. unaligned issue (gliding window). For each of cycles 1 to 3 the figure shows the instructions checked and the instructions issued; with the fixed window the next window is taken only after the current one is used up, while the gliding window is refilled every cycle.
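A minimal Python simulation can show why alignment matters. It assumes in-order issue within the window and a hypothetical `ready_at` list giving the first cycle in which each instruction may issue; neither the function nor the numbers come from the slides.

```python
def simulate_issue(ready_at, width, aligned):
    """Return, per cycle, the indices of the instructions issued in that cycle.

    aligned=True : fixed window, refilled only once it is completely empty.
    aligned=False: gliding window, topped up to `width` entries every cycle.
    """
    n = len(ready_at)
    next_fetch = 0            # index of the next instruction not yet in the window
    window = []
    schedule = []
    cycle = 0
    while next_fetch < n or window:
        cycle += 1
        if aligned:
            if not window:    # take the next fixed group of `width` instructions
                window = list(range(next_fetch, min(next_fetch + width, n)))
                next_fetch += len(window)
        else:
            while len(window) < width and next_fetch < n:
                window.append(next_fetch)
                next_fetch += 1
        issued = []
        while window and ready_at[window[0]] <= cycle:   # in-order issue
            issued.append(window.pop(0))
        schedule.append(issued)
    return schedule


ready_at = [1, 2, 1, 1, 1]    # instruction 1 cannot issue before cycle 2
print(simulate_issue(ready_at, width=2, aligned=True))   # [[0], [1], [2, 3], [4]]
print(simulate_issue(ready_at, width=2, aligned=False))  # [[0], [1, 2], [3, 4]]
```

In this example the fixed window needs four cycles and the gliding window three, because the gliding window pulls instruction 2 in as soon as instruction 0 has left.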

  • Design choices in instruction issue
    - Coping with false data dependencies: no, or register renaming
    - Coping with unresolved control dependencies: wait, or speculative
    - Use of shelving: blocking, or shelved
    - Handling of issue blockages
    - Issue rate (2-6)

  • Frequently used issue policies in scalar processors
    - Traditional scalar issue: i386, MC68030, R3000, Sparc
    - Traditional scalar issue with shelving: CDC 6600
    - Traditional scalar issue with shelving and renaming: IBM 360/91
    - Traditional scalar issue with spec. execution: i486, MC68040, R4000, MicroSparc

  • Frequently used issue policies in superscalar processors (speculative execution in all)
    - Straightforward superscalar issue (aligned or unaligned)
    - Straightforward superscalar issue with shelving
    - Straightforward superscalar issue with renaming
    - Advanced superscalar issue (renaming + shelving)
    Processors listed: Pentium, PowerPC601, PA7100, SuperSparc, Alpha21164, MC68060, PA7200, UltraSparc, MC88110, R8000, PowerPC602, R10000, PentiumPro, PowerPC602, PA8000, Sparc64, Am29000, K5

  • Frequently used issue policies
    - Traditional scalar issue
    - Traditional scalar issue with spec. execution
    - Straightforward superscalar issue (aligned or unaligned)
    - Advanced superscalar issue

  • Design Space of Shelving
    - Scope of shelving: partial or full
    - Layout of shelving buffers
    - Operand fetch policy
    - Instruction dispatch scheme

  • Layout of Shelving Buffers
    - Type of the shelving buffers: stand-alone (RS), or combined with renaming and reordering
    - Number of shelving buffer entries: individual 2-4, group 6-16, central 20, total 15-40
    - Number of read and write ports: depends on the no. of EUs connected

  • Reservation Stations (RS)
    Figure: individual RSs, group RSs, and a central RS, each feeding EUs

  • Combined Buffer (for Shelving, Renaming, Reordering)
    Figure: from decode/issue → DRIS → EUs
    DRIS: Deferred scheduling, Register renaming and Instruction Shelving

  • Operand Fetch Policies
    - Issue bound fetch
    - Dispatch bound fetch

  • Issue bound operand fetch (with single register file)
    Figure: decode/issue, RF, EUs; instruction and data paths

  • Dispatch bound operand fetch (with single register file)
    Figure: decode/issue, EUs

  • Issue bound operand fetch (with multiple register files)
    Figure: decode/issue, two RFs, EUs; instruction and data paths

  • Dispatch bound operand fetch (with multiple register files)
    Figure: decode/issue, EUs

  • Updating RFs and RSs
    Figure: decode/issue, two RFs, EUs; instruction and data paths

  • Instruction dispatch scheme
    - Dispatch policy
    - Dispatch rate: single instr/cycle (individual RS), or multiple instr/cycle (group or central RS)
    - Checking operand availability
    - Treatment of empty RS

  • Dispatch policy
    - Selection rule: rule for identifying instructions which are ready for execution (data dependency check)
    - Arbitration rule: rule for choosing one out of several ready instructions (earlier instruction has priority)
    - Dispatch order
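The selection and arbitration rules can be sketched in Python as two small functions. The reservation-station entry layout (`name`, `age`, `srcs`) and the `reg_valid` map are assumptions made for this illustration only.

```python
def select_ready(rs_entries, reg_valid):
    """Selection rule: an entry is ready when all of its source registers are valid."""
    return [e for e in rs_entries if all(reg_valid[r] for r in e["srcs"])]


def arbitrate_oldest_first(ready, max_dispatch=1):
    """Arbitration rule: the earlier instruction (smaller age) has priority."""
    return sorted(ready, key=lambda e: e["age"])[:max_dispatch]


# Hypothetical RS contents; age 0 is the earliest instruction in program order.
rs_entries = [
    {"name": "i1", "age": 0, "srcs": (1,)},
    {"name": "i2", "age": 1, "srcs": (2,)},
    {"name": "i3", "age": 2, "srcs": (3,)},
]
reg_valid = {1: False, 2: True, 3: True}   # r1 is still being computed

ready = select_ready(rs_entries, reg_valid)
print([e["name"] for e in arbitrate_oldest_first(ready)])                  # ['i2']
print([e["name"] for e in arbitrate_oldest_first(ready, max_dispatch=2)])  # ['i2', 'i3']
```

The `max_dispatch` parameter stands in for the dispatch rate: 1 for single instr/cycle, more for group or central RSs.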

  • Dispatch order
    - In-order
    - Partially out of order
    - Out of order

  • Checking availability of operands
    - Direct check of score-board bits (usual for dispatch bound operand fetch): control flow approach
    - Check of explicit status bits in RS (usual for issue bound operand fetch): data flow approach
    (control flow / data flow: Flynn's terminology)

  • Score-board
    Figure: register file with a data status bit alongside each register
    Introduced with CDC6600

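  • Checking in dispatch bound fetch
    Figure: decoded instruction → reservation station → EU, with the register file alongside.
    - Reservation station entry: OC (opcode), Rs1, Rs2, Rd
    - On issue: Rs1, Rs2, Rd are passed to the register file; reset V bit of Rd
    - On dispatch: check V bits of sources; the register file supplies Os1, Os2 (operand values)
    - On completion: the EU returns result, Rd; update Rd, set V bit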
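A minimal Python sketch of the dispatch bound scheme follows. The class and method names are hypothetical, and the model assumes no renaming, at most one in-flight writer per register, and a destination that differs from both sources.

```python
class DispatchBoundMachine:
    """Sketch only: the RS holds register numbers, operands are read at dispatch."""

    def __init__(self, num_regs):
        self.rf = [0] * num_regs       # register values
        self.v = [True] * num_regs     # V (valid) bit per register
        self.rs = []                   # entries: (oc, rs1, rs2, rd)

    def issue(self, oc, rs1, rs2, rd):
        self.rs.append((oc, rs1, rs2, rd))
        self.v[rd] = False             # reset V bit of Rd

    def dispatch(self):
        """Dispatch the oldest entry whose source V bits are both set."""
        for i, (oc, rs1, rs2, rd) in enumerate(self.rs):
            if self.v[rs1] and self.v[rs2]:
                del self.rs[i]
                # Operand values are fetched only now, at dispatch time.
                return oc, self.rf[rs1], self.rf[rs2], rd
        return None

    def complete(self, rd, result):
        self.rf[rd] = result           # update Rd
        self.v[rd] = True              # set V bit


m = DispatchBoundMachine(4)
m.rf[1], m.rf[2] = 5, 7
m.issue("add", 1, 2, 3)                # r3 = r1 + r2
m.issue("mul", 3, 1, 0)                # r0 = r3 * r1, must wait for r3
print(m.dispatch())                    # ('add', 5, 7, 3)
print(m.dispatch())                    # None: V bit of r3 is still reset
m.complete(3, 12)
print(m.dispatch())                    # ('mul', 12, 5, 0)
```

The V bits play the role of the score-board here: readiness is decided by looking at the register file, not at the RS entry itself.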

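  • Checking in issue bound fetch
    Figure: decoded instruction → reservation station → EU, with the register file alongside.
    - On issue: Rs1, Rs2, Rd are passed to the register file, which supplies Os1, Os2 (operand values); reset V bit of Rd
    - Reservation station entry: OC, Os1/Is1, Vs1, Os2/Is2, Vs2, Rd
    - On dispatch: check Vs1, Vs2; send OC, Os1, Os2, Rd to the EU
    - On completion: the EU returns result, Rd; update Rd and set V bit in the register file; associative update of Is1, Is2 with Rd, set Vs bits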
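The issue bound scheme can be sketched the same way in Python. The sketch assumes that Is1/Is2 hold the number of the register whose value is still missing, that there is no renaming, and that each register has at most one in-flight writer; all names are hypothetical.

```python
class IssueBoundMachine:
    """Sketch only: operands (or source identifiers) are captured in the RS at issue."""

    def __init__(self, num_regs):
        self.rf = [0] * num_regs
        self.v = [True] * num_regs
        self.rs = []                   # entries: dicts with oc, f1, vs1, f2, vs2, rd

    def _fetch(self, r):
        # Valid register: copy the operand value (Os). Otherwise keep its number (Is).
        return (self.rf[r], True) if self.v[r] else (r, False)

    def issue(self, oc, rs1, rs2, rd):
        f1, vs1 = self._fetch(rs1)
        f2, vs2 = self._fetch(rs2)
        self.rs.append({"oc": oc, "f1": f1, "vs1": vs1,
                        "f2": f2, "vs2": vs2, "rd": rd})
        self.v[rd] = False             # reset V bit of Rd

    def dispatch(self):
        """Dispatch the oldest entry whose Vs1 and Vs2 bits are both set."""
        for i, e in enumerate(self.rs):
            if e["vs1"] and e["vs2"]:
                del self.rs[i]
                return e["oc"], e["f1"], e["f2"], e["rd"]
        return None

    def complete(self, rd, result):
        self.rf[rd] = result           # update Rd, set V bit
        self.v[rd] = True
        for e in self.rs:              # associative update of waiting entries
            for f, vs in (("f1", "vs1"), ("f2", "vs2")):
                if not e[vs] and e[f] == rd:
                    e[f], e[vs] = result, True


m = IssueBoundMachine(4)
m.rf[1], m.rf[2] = 5, 7
m.issue("add", 1, 2, 3)                # both operands captured at issue
m.issue("mul", 3, 1, 0)                # r3 not valid: entry keeps identifier 3
print(m.dispatch())                    # ('add', 5, 7, 3)
print(m.dispatch())                    # None: Vs1 of mul is not set
m.complete(3, 12)                      # result forwarded into the waiting entry
print(m.dispatch())                    # ('mul', 12, 5, 0)
```

Unlike the dispatch bound case, readiness is decided from the status bits stored in the RS entry, and the register file is not read again at dispatch.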

  • Treatment of an empty RS
    - Straightforward approach: at least one cycle stay in RS
    - Bypassing RS if empty
    Processors listed: Nx586, Sparc64, PowerPC 604

  • Approaches in dispatching
    - Straightforward: in order, single instr/cycle, individual RSs. Examples: Power1, PPC603, Nx586, Am29000
    - Enhanced: partially out of order, single instr/cycle, individual RSs. Examples: Power2, PPC604, 620
    - Advanced: out of order, multiple instr/cycle, group/central RSs. Examples: PM1, PentiumPro, PA8000, R10000