SMT and CMP architecture

Upload: dskhari

Post on 19-Oct-2015

DESCRIPTION

This presentation describes the SMT and CMP architectures and highlights the differences between them.

TRANSCRIPT

  • Performance, Energy and Thermal Considerations of SMT and CMP Architectures
    Yingmin Li, David Brooks, Zhigang Hu, Kevin Skadron
    Dept. of Computer Science, University of Virginia
    Division of Engineering and Applied Sciences, Harvard University
    IBM T.J. Watson Research Center

    * 2005, Yingmin Li

    Motivation
    - Future trends call for multi-core and multi-threaded architectures
    - Which is better: lots of tiny speed demons or fewer brainiacs?
    - Which is more valuable: more L2 or additional cores?
    - The performance, power, and thermal properties of multi-core vs. multi-threaded architectures are not well understood

    Scope of this Study
    - Equal-area comparison between SMT and CMP extensions of an Apple G5-like core
    - Note: 1MB of L2 is roughly equal in area to one G5-like core
    [Diagram labels: Single-threaded; SMT; Single-threaded CMP]

    Outline
    - Modeling / model validation
    - SMT vs. CMP performance, power and thermal analysis (without dynamic thermal management, DTM)
    - SMT vs. CMP performance, power and thermal analysis (with DTM)
    - Conclusions and future work

    Performance sensitivity with different L2 sizes
    - CMP L2 size = SMT L2 size - 1MB (equal-area comparison)

    [Chart: relative performance change compared to ST baseline (y-axis) vs. L2 size of the SMT configuration (x-axis); series: SMT, CMP]

    L2 size (SMT)    SMT             CMP
    1.5M             0.3364798079    0.026457119
    1.75M            0.3810228237    0.0735237442
    2M               0.4294488365    0.2215576404
    2.25M            0.4621670173    0.4938353442
    2.5M             0.5225182996    0.6199915056
    2.75M            0.5302003877    0.6795919424
    3M               0.5443473787    0.7341772393
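The chart data above suggests that CMP overtakes SMT once enough L2 is available. As a rough illustration only, the sketch below linearly interpolates between the transcribed data points to estimate where the two curves cross. The pairing of rows with L2 sizes (1.5M to 3M) and the helper name `crossover` are assumptions made for this example, not part of the original slides.

```python
# Relative performance change vs. the ST baseline, transcribed from the chart data.
# Assumption: rows correspond to SMT L2 sizes of 1.5 MB .. 3 MB in 0.25 MB steps;
# the CMP configuration at each point has 1 MB less L2 (equal area).
smt_l2_mb = [1.5, 1.75, 2.0, 2.25, 2.5, 2.75, 3.0]
smt_gain = [0.3364798079, 0.3810228237, 0.4294488365, 0.4621670173,
            0.5225182996, 0.5302003877, 0.5443473787]
cmp_gain = [0.026457119, 0.0735237442, 0.2215576404, 0.4938353442,
            0.6199915056, 0.6795919424, 0.7341772393]


def crossover(xs, a, b):
    """Return the x at which series b first overtakes series a.

    Uses linear interpolation between adjacent samples; returns None
    if b never moves from below a to at-or-above a.
    """
    for i in range(1, len(xs)):
        d0 = b[i - 1] - a[i - 1]  # gap before (negative while b is below a)
        d1 = b[i] - a[i]          # gap after
        if d0 < 0 <= d1:
            t = -d0 / (d1 - d0)   # fraction of the interval where the gap is zero
            return xs[i - 1] + t * (xs[i] - xs[i - 1])
    return None


crossover_l2 = crossover(smt_l2_mb, smt_gain, cmp_gain)
print(f"CMP overtakes SMT at roughly {crossover_l2:.2f} MB of SMT-equivalent L2")
```

With these numbers the estimated crossover falls between the 2M and 2.25M points; the interpolation is only a reading aid, since the underlying curves are not necessarily linear between samples.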

    Modeling and Validation
    - Performance: Turandot with SMT and CMP augmentations, validated against a Power4 pre-RTL model
    - Power: PowerTimer with SMT and CMP augmentations, validated against CPAM power data extracted from circuits
    - Temperature: HotSpot from UVA integrated with Turandot/PowerTimer, validated with test chips at UVA

    Turandot/PowerTimer Simulation Framework
    - Supports SMT/CMP
    - Runs on AIX/PowerPC and Linux/Intel platforms
    - PowerTimer is based on CPAM data extracted from circuits
    - See the Micro02 tutorial by Zhigang Hu and David Brooks for details

    HotSpot temperature model
    - Models all parts along both primary and secondary heat transfer paths
    - At arbitrary granularities
    - Fast and accurate
    - Essentially a lumped thermal R-C network
    [Diagram labels: fin-to-air convection thermal resistor; silicon die; thermal interface material; heat spreader; heat sink; to interconnect layer thermal model]
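To make the "lumped thermal R-C network" idea concrete, here is a minimal two-node sketch: a die node connected through a thermal resistance to a heat-sink node, which is in turn connected to ambient through a convection resistance. This is not HotSpot's actual model or API; the function name, the two-node topology, and every parameter value are illustrative assumptions.

```python
def simulate_two_node_rc(power_w, r_die_sink, r_sink_amb, c_die, c_sink,
                         t_amb_c, dt_s, steps):
    """Forward-Euler integration of a two-node thermal RC chain.

    C_die  * dT_die/dt  = P - (T_die - T_sink) / R_die_sink
    C_sink * dT_sink/dt = (T_die - T_sink) / R_die_sink - (T_sink - T_amb) / R_sink_amb

    Resistances are in K/W, capacitances in J/K, temperatures in Celsius.
    Returns the (die, sink) temperatures after `steps` time steps.
    """
    t_die = t_sink = t_amb_c
    for _ in range(steps):
        q_die_to_sink = (t_die - t_sink) / r_die_sink   # heat flow die -> sink (W)
        q_sink_to_amb = (t_sink - t_amb_c) / r_sink_amb  # heat flow sink -> air (W)
        t_die += dt_s * (power_w - q_die_to_sink) / c_die
        t_sink += dt_s * (q_die_to_sink - q_sink_to_amb) / c_sink
    return t_die, t_sink


# Hypothetical values chosen only so the example settles quickly.
die_c, sink_c = simulate_two_node_rc(
    power_w=30.0, r_die_sink=0.5, r_sink_amb=1.0,
    c_die=0.1, c_sink=5.0, t_amb_c=45.0, dt_s=0.005, steps=20000)

# Steady state follows from the resistances alone:
# T_sink = T_amb + P * R_sink_amb, T_die = T_sink + P * R_die_sink.
print(f"die {die_c:.2f} C, sink {sink_c:.2f} C")
```

The small die capacitance and large sink capacitance mimic the general pattern that silicon responds quickly while the package responds slowly; a real model would use many more nodes per layer and per functional unit.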

    Peak Temperature of the Hottest Spot for SMT and CMP
    Three heat-up mechanisms:
    - Unit self-heating, determined by the power density of the unit
    - Global heating through the TIM (thermal interface material) and spreader
    - Lateral thermal coupling between neighboring units

    [Chart: peak temperature of the hottest spot per configuration]

    Configuration                 Temperature (Celsius)
    ST                            81.7253333333
    ST (area enlarged)            73.3286666667
    SMT                           88.9642424242
    SMT (only activity factor)    88.2481818182
    CMP                           90.438125
    CMP (one core rotated)        88.0106060606

    Heat Flow of Global Heat-up
    [Diagram: package stack with the two heat-flow paths]
    - Primary path: silicon bulk, thermal interface material, heat spreader, heat sink
    - Secondary path: interconnect layers, C4 pads and underfill, ceramic substrate, CBGA joint, printed-circuit board

    Illustration (global heat-up of CMP vs. local heat-up of SMT)

    [Charts: temperature and power density of three units, CMP vs. SMT]

    Unit         Temperature (C)      Power density (W/cm^2)
                 CMP      SMT         CMP               SMT
    IFU_B1       66.36    58.84       46.6255048152     47.0833333333
    LSU_cache    75.88    73.86       60.6728993869     73.5680941603
    FXU_reg      95.36    94.42       162.5349721707    203.0258948024
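The point of this illustration can be checked directly against the chart values: for each unit shown, SMT has the higher power density (local heat-up), yet CMP runs hotter (global heat-up). The short sketch below recomputes that comparison. The dictionary layout and the helper name `compare` are assumptions made for the example; the numbers are transcribed from the slide's chart data, with CMP listed first in each pair.

```python
# (CMP, SMT) pairs per unit, transcribed from the slide's chart data.
temperature_c = {
    "IFU_B1": (66.36, 58.84),
    "LSU_cache": (75.88, 73.86),
    "FXU_reg": (95.36, 94.42),
}
power_density_w_cm2 = {
    "IFU_B1": (46.6255048152, 47.0833333333),
    "LSU_cache": (60.6728993869, 73.5680941603),
    "FXU_reg": (162.5349721707, 203.0258948024),
}


def compare(unit):
    """Return how much hotter CMP is and how much denser SMT's power is for one unit."""
    cmp_t, smt_t = temperature_c[unit]
    cmp_pd, smt_pd = power_density_w_cm2[unit]
    return {
        "unit": unit,
        "cmp_minus_smt_temp_c": cmp_t - smt_t,
        "smt_over_cmp_power_density": smt_pd / cmp_pd,
    }


rows = [compare(unit) for unit in temperature_c]
for row in rows:
    print(f"{row['unit']:<10} CMP hotter by {row['cmp_minus_smt_temp_c']:.2f} C, "
          f"SMT power density x{row['smt_over_cmp_power_density']:.2f}")
```

Self-heating alone would predict the opposite ordering of temperatures, which is why the slide attributes CMP's higher temperatures to global heat-up of the chip.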

    Temperature Trend with Technology Evolution
    - The effect of SMT's increased utilization becomes muted
    - The L2 cache tends to be much cooler than the core
    - Exponential temperature dependence of leakage
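The exponential temperature dependence of leakage creates a feedback loop: higher temperature raises leakage power, which raises temperature further. A minimal sketch of that loop follows, assuming a commonly used empirical form, leakage = P_ref * exp(beta * (T - T_ref)), and a single lumped thermal resistance to ambient. The function name, the model form, and all parameter values are assumptions for illustration and are not taken from the slides.

```python
import math


def settle_temperature(p_dyn_w, p_leak_ref_w, beta_per_k, t_ref_c,
                       r_k_per_w, t_amb_c, tol=1e-9, max_iter=1000):
    """Fixed-point iteration for T = T_amb + R * (P_dyn + P_leak(T)).

    P_leak(T) = p_leak_ref_w * exp(beta_per_k * (T - t_ref_c)) is an assumed
    empirical leakage model. Raises RuntimeError if the iteration does not
    settle, which in this toy model corresponds to thermal runaway.
    """
    temp = t_amb_c + r_k_per_w * p_dyn_w  # start from the dynamic-power-only estimate
    for _ in range(max_iter):
        p_leak = p_leak_ref_w * math.exp(beta_per_k * (temp - t_ref_c))
        new_temp = t_amb_c + r_k_per_w * (p_dyn_w + p_leak)
        if abs(new_temp - temp) < tol:
            return new_temp
        temp = new_temp
    raise RuntimeError("temperature did not converge (possible thermal runaway)")


# Hypothetical operating point.
params = dict(p_dyn_w=20.0, p_leak_ref_w=5.0, beta_per_k=0.017,
              t_ref_c=85.0, r_k_per_w=1.0, t_amb_c=45.0)
with_feedback = settle_temperature(**params)
without_leakage = params["t_amb_c"] + params["r_k_per_w"] * params["p_dyn_w"]
print(f"dynamic power only: {without_leakage:.2f} C, with leakage feedback: {with_feedback:.2f} C")
```

Because leakage grows with temperature, any design that is already hotter pays a larger leakage penalty, which is one plausible reading of why temperature differences between configurations can widen at smaller technology nodes.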

    [Chart: average temperature difference between CMP and SMT (y-axis) vs. technology node in nm (x-axis)]

    Technology (nm)    Normal case     L2 leakage radically reduced    No temperature effect on leakage
    130                1.3121212121    1.3121212121                    0.8427272727
    90                 2.0490909091    2.2975757576                    1.5575757576
    70                 4.5427272727    5.0239393939                    2.2827272727

    DVS10

    DVS20

    Relative change compared with baseline without DTM

    000000

    000000

    000000

    000000

    No DTM

    Fetch throttling

    Rename throttling

    Register file throttling

    DVS10

    DVS20

    Relative change compared with baseline without DTM

    000000

    000000

    000000

    000000

    No DTM

    Fetch throttling

    Rename throttling

    Register file throttling

    DVS10

    DVS20

    Relative change compared with baseline without DTM

    000000

    000000

    000000

    000000

    1.10

    1.21

    1.09

    1.9

    2.17

    2.37

    2.06

    1.94

    2.07

    No DTM

    Fetch throttling

    Rename throttling

    Register file throttling

    DVS10

    DVS20

    Relative change compared with baseline without DTM

    0.01786486750.75624259400

    0.38187316540.517830196100

    0.3576194869-0.135751392800

    0.3337914922-0.507899067400

    0.3103817099-0.719798995500

    13.2

    6.09

    2.54

    SMT with 2MB L2

    SMT with 3MB L2

    CMP with 1MB L2

    CMP with 2MB L2

    Relative change compared with baseline ST with 2MB L2

    0.58623483260.642188694500

    0.53182854620.551840738800

    0.0060573715-0.018108230400

    -0.3392527921-0.37873041800

    -0.5660417735-0.606905867300

    1.35

    2.33

    3.72

    SMT with 2MB L2

    SMT with 3MB L2

    CMP with 1MB L2

    CMP with 2MB L2

    Relative change compared with baseline ST with 2MB L2

    00

    00

    00

    00

    00

    00

    00

    SMT

    CMP

    L2 size (SMT)

    Relative performance change compared to ST baseline

    * 2005, Yingmin Li

    SMT vs. CMP performance and power efficiency analysis (without DTM)
    SMT is superior for memory-bound (high L2 cache miss rate) benchmarks, while CMP is superior for non-memory-bound benchmarks
    Panels: compute-bound, memory-bound

    Chart1 (compute-bound panel): relative change compared to ST baseline

    Metric            2-way SMT   dual-core CMP
    IPC                  0.2590          0.8743
    Power                0.3798          0.9344
    Energy               0.0775          0.0133
    Energy-delay        -0.1393         -0.4464
    Energy-delay^2      -0.2989         -0.6868

    Chart2 (memory-bound panel): relative change compared to ST baseline

    Metric            2-way SMT   dual-core CMP
    IPC                  0.4294          0.2216
    Power                0.4586          0.7084
    Energy               0.0043          0.5542
    Energy-delay        -0.2337          0.9224
    Energy-delay^2      -0.3498          1.8990

    The impact of changing L2 size: examples
    MCF+MCF: stays memory bound when L2 size changes
    MCF+VPR: changes from memory bound to non-memory bound when L2 size changes

    Chart3 (mcf+mcf): relative change compared with baseline ST with 2MB L2

    Metric            SMT 2MB L2   SMT 3MB L2   CMP 1MB L2   CMP 2MB L2
    IPC                   0.0179       0.7562      -0.4999       0.0806
    Power                 0.3819       0.5178       0.7719       0.7621
    Energy                0.3576      -0.1358       2.5433       0.6307
    Energy-delay          0.3338      -0.5079       6.0855       0.5091
    Energy-delay^2        0.3104      -0.7198      13.1688       0.3966
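The five metrics charted in this deck are not independent. As a minimal sketch, assuming a fixed instruction count and clock frequency (so delay scales as 1/IPC) and energy = power × delay, the energy, energy-delay, and energy-delay^2 changes follow from the IPC and power changes. The function name `derived_metrics` is hypothetical and not taken from the slides. Under these assumptions the mcf+mcf figures for CMP with 1MB L2 (IPC -0.4999, power +0.7719) reproduce the charted energy, energy-delay, and energy-delay^2 values of about 2.54, 6.09, and 13.2; averaged or mixed-workload numbers (for example mcf+vpr) need not match exactly.

```python
def derived_metrics(ipc_change, power_change):
    """Derive energy-related relative changes from IPC and power changes.

    Inputs and outputs are relative changes versus a baseline
    (e.g. -0.5 means 50% lower than the baseline).

    Assumptions (not stated in the slides): the instruction count and
    clock frequency are fixed, so delay is proportional to 1 / IPC.
    """
    ipc_ratio = 1.0 + ipc_change
    power_ratio = 1.0 + power_change

    delay_ratio = 1.0 / ipc_ratio             # run time relative to baseline
    energy_ratio = power_ratio * delay_ratio  # energy = power * delay

    return {
        "energy": energy_ratio - 1.0,
        "energy_delay": energy_ratio * delay_ratio - 1.0,
        "energy_delay2": energy_ratio * delay_ratio ** 2 - 1.0,
    }


if __name__ == "__main__":
    # mcf+mcf on CMP with 1MB L2, relative to ST with 2MB L2
    metrics = derived_metrics(ipc_change=-0.4999230965,
                              power_change=0.7719167433)
    for name, value in metrics.items():
        print(f"{name}: {value:+.2f}")
```

The compounding is the point of the example: halving IPC doubles delay, so an energy penalty of about 2.5x grows to roughly 6x and 13x once delay is weighted once and twice.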

    Chart4 (mcf+vpr): relative change compared with baseline ST with 2MB L2

    Metric            SMT 2MB L2   SMT 3MB L2   CMP 1MB L2   CMP 2MB L2
    IPC                   0.5862       0.6422       0.1700       0.8952
    Power                 0.5318       0.5518       0.6586       0.8111
    Energy                0.0061      -0.0181       1.3503       0.0754
    Energy-delay         -0.3393      -0.3787       2.3303      -0.3614
    Energy-delay^2       -0.5660      -0.6069       3.7189      -0.6208

    SMT vs. CMP performance with DTM
    Localized DTM methods favor SMT, while global DTM methods favor CMP
    Global techniques: global DVS, fetch throttling
    Local techniques: rename throttling, register file throttling (ideal)
    Panels: compute-bound, memory-bound

    Chart5: relative change compared to ST baseline without DTM

    Series  No DTM        Global fetch throttling  Local renaming throttling  Register file throttling
    SMT     0.2589831965  0.0307409731             0.0621114086               0.1058176769
    CMP     0.874261004   0.5907973889             0.5345847812               0.5854375314
    ST      (baseline)    -0.0696602014            -0.0963204523              -0.0799915609


    Chart6: relative change compared to ST baseline without DTM

    Series  No DTM        Global fetch throttling  Local renaming throttling  Register file throttling
    SMT     0.4294488365  0.2437838056             0.2625055936               0.3279462201
    CMP     0.2215576404  0.1557347431             0.1338163491               0.151027027
    ST      (baseline)    -0.0256217382            -0.035526005               -0.0297370176


    SMT energy efficiency with DTM
    Localized methods can yield a better energy-delay product than global methods in some cases.
    Chart panels: compute-bound, memory-bound

    Chart7: SMT, relative change compared with baseline without DTM
    (row labels inferred from the order of metrics in the accompanying spreadsheet)

    Metric          No DTM         Fetch throttling  Rename throttling  Register file throttling
    Power           0.3798405427   0.2165823777      0.2682100476       0.2795910969
    Energy          0.0775310879   0.1609879503      0.2778129246       0.1733499403
    Energy delay    -0.1393164584  0.1766716056      0.3905340122       0.1270176583
    Energy delay^2  -0.2989460528  0.2549865166      0.6191814985       0.1271092222
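The slide above compares configurations by energy-delay product, and the charts report each metric as a relative change against a baseline. The following is a minimal sketch of how such figures could be derived, not the authors' tooling. It assumes the usual definitions (energy = power x delay, energy-delay = energy x delay, energy-delay^2 = energy x delay^2) and that "relative change" means (value - baseline) / baseline. The `Run` class, the function names, and the numbers in the usage lines are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Run:
    """One simulated configuration (hypothetical container, not from the slides)."""
    power_watts: float    # average power over the run
    delay_seconds: float  # execution time for a fixed amount of work

    @property
    def energy(self) -> float:
        # Energy = average power x execution time.
        return self.power_watts * self.delay_seconds

    @property
    def energy_delay(self) -> float:
        # Energy-delay product: weights energy and performance equally.
        return self.energy * self.delay_seconds

    @property
    def energy_delay2(self) -> float:
        # Energy-delay^2 product: weights performance more heavily.
        return self.energy * self.delay_seconds ** 2


def relative_change(value: float, baseline: float) -> float:
    """Assumed definition: fractional change with respect to the baseline."""
    return (value - baseline) / baseline


def compare(run: Run, baseline: Run) -> dict:
    """Relative change of each metric; negative means lower than the baseline."""
    return {
        "power": relative_change(run.power_watts, baseline.power_watts),
        "energy": relative_change(run.energy, baseline.energy),
        "energy_delay": relative_change(run.energy_delay, baseline.energy_delay),
        "energy_delay2": relative_change(run.energy_delay2, baseline.energy_delay2),
    }


# Made-up numbers for illustration only; they are not taken from the charts.
baseline = Run(power_watts=10.0, delay_seconds=2.0)
candidate = Run(power_watts=12.0, delay_seconds=1.5)
result = compare(candidate, baseline)
```

With these made-up inputs the candidate draws more power but finishes sooner, so its energy, energy-delay, and energy-delay^2 all come out lower than the baseline. A throttling technique that stretches execution time has the opposite effect, and the penalty grows with each extra factor of delay.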


    Chart8: SMT, relative change compared with baseline without DTM
    (row labels inferred from the order of metrics in the accompanying spreadsheet)

    Metric          No DTM         Fetch throttling  Rename throttling  Register file throttling
    Power           0.4585592372   0.3179811073      0.374105905        0.395400602
    Energy          0.0043272309   0.0796923277      0.2121871545       0.0723822197
    Energy delay    -0.2336726869  0.0880846647      0.2376049533       -0.1328624067
    Energy delay^2  -0.3498275106  0.3661498016      0.4765056837       -0.2620823634


    0.080569