SMT and CMP architecture
DESCRIPTION
This presentation introduces the SMT and CMP architectures and highlights the differences between them.
TRANSCRIPT
Performance, Energy and Thermal Considerations of SMT and CMP architectures
Yingmin Li, David Brooks, Zhigang Hu, Kevin Skadron
Dept. of Computer Science, University of Virginia
Division of Engineering and Applied Sciences, Harvard University
IBM T.J. Watson Research Center
* 2005, Yingmin Li
Motivation
Future trend calls for multi-core and multi-thread architectures
Which is better: lots of tiny speed demons or fewer brainiacs?
Which is more valuable, more L2 or additional cores?
Performance, power, and thermal properties of multi-core vs. multi-thread architectures are not well understood
Scope of this Study
Equal-area comparison between SMT and CMP extensions of an Apple G5-like core
Note: 1MB of L2 is roughly equal in area to one G5-like core
[Figure: chip configurations compared: single-threaded, SMT, single-threaded CMP]
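The equal-area rule in the note can be sketched as a small helper. This is only an illustration of the arithmetic, assuming the slide's rule of thumb (1MB of L2 is about one core) and ignoring any area overhead of SMT itself; the function and constant names are hypothetical.

```python
# Rule of thumb from the slide: 1 MB of L2 occupies roughly the area of one core.
MB_PER_CORE_AREA = 1.0  # assumption taken from the note, not a measured value


def cmp_l2_for_equal_area(smt_l2_mb, extra_cores=1):
    """Return the L2 size (MB) left for a CMP that trades L2 area for cores.

    smt_l2_mb:   L2 size of the single-core SMT chip being matched.
    extra_cores: cores added beyond the first (1 for a dual-core CMP).
    """
    cmp_l2_mb = smt_l2_mb - extra_cores * MB_PER_CORE_AREA
    if cmp_l2_mb < 0:
        raise ValueError("not enough L2 area to trade for the extra cores")
    return cmp_l2_mb


for smt_l2 in (2.0, 3.0):
    print(f"SMT with {smt_l2:.0f}MB L2 vs. dual-core CMP with "
          f"{cmp_l2_for_equal_area(smt_l2):.0f}MB L2")
```

Under this rule, an SMT chip with 2MB of L2 is paired with a dual-core CMP with 1MB of L2.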
Outline
Modeling / Model Validation
SMT vs. CMP performance, power and thermal analysis (without DTM)
SMT vs. CMP performance, power and thermal analysis (with DTM)
Conclusions and future work
Performance sensitivity with different L2 size
CMP L2 size = SMT L2 size - 1MB
[Chart: relative performance change compared to ST baseline, by L2 size (SMT)]
L2 size (SMT)   SMT     CMP
1.5M            0.336   0.026
1.75M           0.381   0.074
2M              0.429   0.222
2.25M           0.462   0.494
2.5M            0.523   0.620
2.75M           0.530   0.680
3M              0.544   0.734
Modeling and Validation
Performance: Turandot with SMT and CMP augmentations, validated against Power4 pre-RTL model
Power: PowerTimer with SMT and CMP augmentations, validated against CPAM power data extracted from circuit
Temperature: Hotspot from UVA integrated with Turandot/PowerTimer, validated with test chips at UVA
Turandot/PowerTimer Simulation Framework
Supports SMT/CMP
Runs on AIX/PowerPC and Linux/Intel platforms
PowerTimer based on CPAM data, extracted from circuits
See Micro02 tutorial by Zhigang Hu and David Brooks for details
Hotspot temperature model
Models all parts along both primary and secondary heat transfer paths
At arbitrary granularities
Fast and accurate
Essentially a lumped thermal R-C network
[Figure: thermal R-C network of the package; labels: silicon die, thermal interface material, heat spreader, heat sink, fin-to-air convection thermal resistor, to interconnect layer thermal model]
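To make the "lumped thermal R-C network" idea concrete, here is a minimal single-node sketch. It is not Hotspot's model, which connects many such nodes across the die and package; it only shows how one thermal resistance and one thermal capacitance turn a power input into a temperature curve. All names and numeric values are hypothetical.

```python
def simulate_rc_node(power_w, r_th, c_th, t_ambient, dt, steps):
    """Forward-Euler integration of C * dT/dt = P - (T - T_ambient) / R.

    power_w:   heat dissipated in the node (W)
    r_th:      thermal resistance to ambient (K/W)
    c_th:      thermal capacitance of the node (J/K)
    Returns the temperature trace, starting at ambient.
    """
    temp = t_ambient
    trace = [temp]
    for _ in range(steps):
        heat_out = (temp - t_ambient) / r_th      # heat leaving through R
        temp += dt * (power_w - heat_out) / c_th  # net heat stored in C
        trace.append(temp)
    return trace


def steady_state(power_w, r_th, t_ambient):
    """Temperature once the capacitor is charged: T = T_ambient + P * R."""
    return t_ambient + power_w * r_th


# Illustrative values only: time constant R * C = 1 s, simulated for 20 s.
trace = simulate_rc_node(power_w=10.0, r_th=2.0, c_th=0.5,
                         t_ambient=45.0, dt=0.01, steps=2000)
print(f"final {trace[-1]:.2f} C, steady state {steady_state(10.0, 2.0, 45.0):.2f} C")
```

The capacitance only affects how quickly the node heats up; the steady-state temperature depends on power and resistance alone.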
Peak Temperature of the Hottest Spot for SMT and CMP
3 heat-up mechanisms
Unit self heating determined by the power density of the unit
Global heating through TIM (thermal interface material) and spreader
Lateral thermal coupling between neighboring units
[Chart: peak temperature of the hottest spot]
Configuration                Temperature (Celsius)
ST                           81.7
ST (area enlarged)           73.3
SMT                          89.0
SMT (only activity factor)   88.2
CMP                          90.4
CMP (one core rotated)       88.0
Heat Flow of Global Heat-up
[Figure: package cross-section with heat-flow paths. Layers as labeled: heat sink, heat spreader, thermal interface material, silicon bulk, interconnect layers, C4 pads and underfill, ceramic substrate, CBGA joint, printed-circuit board. Paths marked: primary path, secondary path.]
Illustration (global heat-up of CMP vs. local heat-up of SMT)
[Charts: temperature and power density of selected units, CMP vs. SMT]
Unit        Temperature (C)     Power density (W/cm^2)
            CMP      SMT        CMP      SMT
IFU_B1      66.36    58.84      46.6     47.1
LSU_cache   75.88    73.86      60.7     73.6
FXU_reg     95.36    94.42      162.5    203.0

Layer temperatures (K) for the same units, from the chart data sheet (device / interface material / spreader):
Unit        CMP                        SMT
IFU_B1      339.51 / 337.28 / 321.00   331.99 / 330.40 / 318.60
LSU_cache   349.03 / 345.79 / 321.68   347.01 / 343.72 / 319.31
FXU_reg     368.51 / 363.03 / 322.35   367.57 / 361.85 / 319.95
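One way to read the global vs. local heat-up contrast is to split a unit's temperature into the spreader temperature beneath it and the rise above the spreader. The sketch below does this for FXU_reg using the device and spreader temperatures (in kelvin) from the chart's data sheet. Using the spreader temperature as a proxy for global heat-up is a simplifying assumption of this example, not a method stated on the slide, and the names are hypothetical.

```python
# FXU_reg temperatures in kelvin, as read from the chart's data sheet.
FXU_REG_K = {
    "CMP": {"device": 368.51, "spreader": 322.35},
    "SMT": {"device": 367.57, "spreader": 319.95},
}


def heating_breakdown(layers_k):
    """Split a unit's temperature into spreader level and rise above it."""
    return {
        "spreader": layers_k["spreader"],                         # proxy for global heat-up
        "local_rise": layers_k["device"] - layers_k["spreader"],  # proxy for local heat-up
    }


cmp_part = heating_breakdown(FXU_REG_K["CMP"])
smt_part = heating_breakdown(FXU_REG_K["SMT"])

for name, part in (("CMP", cmp_part), ("SMT", smt_part)):
    print(f"{name}: spreader {part['spreader']:.2f} K, "
          f"rise above spreader {part['local_rise']:.2f} K")
```

For this unit the SMT chip shows the larger rise above the spreader, while the CMP chip has the hotter spreader, which matches the slide's framing of local heat-up for SMT and global heat-up for CMP.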
Temperature Trend with technology evolution
Increased utilization of SMT becomes muted
L2 cache tends to be much cooler than the core
Exponential temperature dependence of leakage
[Chart: average temperature difference between CMP and SMT, by technology (nm)]
Technology (nm)   Normal case   L2 leakage radically reduced   No temperature effect on leakage
130               1.31          1.31                           0.84
90                2.05          2.30                           1.56
70                4.54          5.02                           2.28
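The exponential temperature dependence of leakage can be illustrated with a simple scaling formula. The slide does not give the leakage model used in the simulations, so the functional form, the reference temperature, and the coefficient `beta` below are all assumptions chosen only for illustration.

```python
import math


def leakage_power(p_leak_ref, temp_c, t_ref_c=85.0, beta=0.02):
    """Leakage power scaled exponentially with temperature.

    P_leak(T) = P_ref * exp(beta * (T - T_ref))
    p_leak_ref: leakage power at the reference temperature (W)
    beta:       hypothetical sensitivity per degree; not taken from the slides
    """
    return p_leak_ref * math.exp(beta * (temp_c - t_ref_c))


# Each additional 10 degrees multiplies leakage by the same factor,
# so the absolute increase keeps growing as the chip gets hotter.
for temp in (85.0, 95.0, 105.0):
    print(f"{temp:.0f} C: {leakage_power(1.0, temp):.3f} W")
```

Because hotter units leak more and leakage in turn adds heat, a temperature gap between two designs tends to widen as leakage grows.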
SMT vs. CMP performance and power efficiency analysis (without DTM)
SMT is superior for memory-bound (high L2 cache miss rate) benchmarks, while CMP is superior for non-memory-bound benchmarks
[Chart panels: Compute-bound, Memory-bound]
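The efficiency comparison is reported as relative changes in IPC, power, energy, energy-delay, and energy-delay^2 against the ST baseline. The sketch below shows how the last three follow from the first two, assuming a fixed instruction count and clock frequency so that delay scales as 1/IPC. The function name and input values are hypothetical. Results reported as averages over benchmark pairs will not match this arithmetic exactly when it is applied to averaged IPC and power.

```python
def relative_metrics(ipc_change, power_change):
    """Relative change in energy, energy-delay, and energy-delay^2.

    ipc_change, power_change: relative changes vs. the baseline
    (0.25 means +25%). Assumes fixed work and fixed frequency.
    """
    delay = 1.0 / (1.0 + ipc_change)       # time for the same work
    energy = (1.0 + power_change) * delay  # energy = power * time
    return {
        "energy": energy - 1.0,
        "energy_delay": energy * delay - 1.0,
        "energy_delay2": energy * delay ** 2 - 1.0,
    }


# Hypothetical design point: +25% IPC for +40% power.
for name, change in relative_metrics(0.25, 0.40).items():
    print(f"{name}: {change:+.3f}")
```

A design can cost more energy yet still improve energy-delay and energy-delay^2, because those metrics weight the performance gain more heavily.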
[Chart: relative change compared to ST baseline (SMT-low and CMP-low series in the chart data)]
Metric           2-way SMT   dual-core CMP
IPC              0.259       0.874
POWER            0.380       0.934
ENERGY           0.078       0.013
ENERGY DELAY     -0.139      -0.446
ENERGY DELAY^2   -0.299      -0.687
0.76212768220.51783019610.38187316540.77191674330.81107092720.55184073880.53182854620.6586422793
0.6307394889-0.13575139280.35761948692.54328850380.0753941638-0.01810823040.00605737151.3502576369
0.509147894-0.50789906740.33379149226.0854872099-0.3614426746-0.378730418-0.33925279212.3302605562
0.3966224412-0.71979899550.310381709913.1687951593-0.6208316247-0.6069058673-0.56604177353.7189019612
CMPSMT
0.0264571190.3364798079
0.07352374420.3810228237
0.22155764040.4294488365
0.49383534420.4621670173
0.61999150560.5225182996
0.67959194240.5302003877
0.73417723930.5443473787
IPC1.5M
POWER1.75M
ENERGY2M
ENERGY DELAY2.25M
ENERGY DELAY^22.5M
2.75M
3M
Sheet1
Normal case
L2 leakage radically reduced
No Temperature effect on Leakge
Technology (nm)
Average temperature difference between CMP and SMT
Sheet2
Temperature (Celsius)
Sheet3
2-way SMT
dual-core CMP
Relative change compared to ST baseline
2-way SMT
dual-core CMP
Relative change compared to ST baseline
No DTM
Global fetch throttling
Local renaming throttling
Register file throttling
Relative change compared to ST baseline without DTM
No DTM
Global fetch throttling
Local renaming throttling
Register file throttling
Relative change compared to ST baseline without DTM
-0.0545799581-0.0390064596-0.0360948816-0.095716911-0.1087500114
0.02506602060.08906567280.0636536881-0.0589310101-0.0566065961
0.12139632310.25798160460.1885986013-0.0185160950.0029792386
0.23878798150.48369469820.34666520760.02595916460.0712814762
Fetch throttling
Rename throttling
Register file throttling
DVS10
DVS20
Relative change compared with baseline without DTM
-0.0188401584-0.0133399996-0.0122237317-0.0012915749-0.0063309131
0.00833237330.03034882850.02176455320.01231223310.0130886515
0.0389932170.08251137110.06096026490.02677661880.0342944582
0.07362170770.14487410540.10621857540.04215762010.0574549696
Fetch throttling
Rename throttling
Register file throttling
DVS10
DVS20
Relative change compared with baseline without DTM
No DTM
Fetch throttling
Rename throttling
Register file throttling
Relative change compared with baseline without DTM
No DTM
Fetch throttling
Rename throttling
Register file throttling
Relative change compared with baseline without DTM
No DTM
Fetch throttling
Rename throttling
Register file throttling
Relative change compared with baseline without DTM
1.10
1.21
1.09
1.9
2.17
2.37
2.06
1.94
2.07
No DTM
Fetch throttling
Rename throttling
Register file throttling
Relative change compared with baseline without DTM
0.01786486750.756242594
0.38187316540.5178301961
0.3576194869-0.1357513928
0.3337914922-0.5078990674
0.3103817099-0.7197989955
13.2
6.09
2.54
SMT with 2MB L2
SMT with 3MB L2
CMP with 1MB L2
CMP with 2MB L2
Relative change compared with baseline ST with 2MB L2
0.58623483260.6421886945
0.53182854620.5518407388
0.0060573715-0.0181082304
-0.3392527921-0.378730418
-0.5660417735-0.6069058673
1.35
2.33
3.72
SMT with 2MB L2
SMT with 3MB L2
CMP with 1MB L2
CMP with 2MB L2
Relative change compared with baseline ST with 2MB L2
SMT
CMP
L2 size (SMT)
Relative performance change compared to ST baseline
Chart2
0.42944883650.2215576404
0.45855923720.7084071996
0.00432723090.5541567833
-0.23367268690.9224480756
-0.34982751061.8990115696
2-way SMT
dual-core CMP
Relative change compared to ST baseline
Sheet1
354.8753333333346.4786666667362.1142424242361.3981818182363.588125361.1606060606
81.725333333373.328666666788.964242424288.248181818290.43812588.0106060606
ST
ST (area enlarged)
SMT
SMT(only activity factor)
CMP
CMP (one core rotated)
3.24490112547.541836901913.93308078421.31212121211.31212121210.8427272727
26440.484848484826440.484848484826440.48484848482.04909090912.29757575761.5575757576
0.10930971650.22193412670.34510404434.54272727275.02393939392.2827272727
1309070
SMT
CMP
ST
smt-lcmp-lst-l
b0.25898319650.8742610040.42944883650.2215576404
g0.03074097310.5907973889-0.06966020140.24378380560.1557347431-0.0256217382
l0.06211140860.5345847812-0.09632045230.26250559360.1338163491-0.035526005
IPCr0.10581767690.5854375314-0.07999156090.32794622010.151027027-0.0297370176
POWERd0.14461626230.6707720816-0.041161340.31707656450.1695501129-0.0149913657
ENERGYd200.09808611030.5956772038-0.05695520950.26906444170.1461591668-0.0205857998
ENERGY DELAY
ENERGY DELAY^2
SMT-lowSMT-high
BGLRDVS10DVS20
0.25898319650.03074097310.06211140860.10581767690.14461626230.09808611030.42944883650.24378380560.26250559360.32794622010.31707656450.2690644417
0.37984054270.21658237770.26821004760.27959109690.04009992680.00580745040.45855923720.31798110730.3741059050.3954006020.14030016330.1043433999
0.07753108790.16098795030.27781292460.1733499403-0.1190658026-0.11351602610.00432723090.07969232770.21218715450.0723822197-0.159194344-0.1540390618
-0.13931645840.17667160560.39053401220.1270176583-0.227085279-0.1848850453-0.23367268690.08808466470.2376049533-0.1328624067-0.3003622446-0.2604098608
-0.29894605280.25498651660.61918149850.1271092222-0.3012049553-0.2232194076-0.34982751060.36614980160.4765056837-0.2620823634-0.3375556991-0.2553640208
CMP-lowCMP-high
0.8742610040.59079738890.53458478120.58543753140.67077208160.59567720380.22155764040.15573474310.13381634910.1510270270.16955011290.1461591668
0.93437493740.80333938180.85444872970.86304534070.36062999620.31573597770.70840719960.79909784020.82123169480.82417034540.49881250240.4731907887
0.01333889090.17932682030.30600609230.2405303675-0.2115738687-0.20502705250.55415678330.68595736210.75124529860.70840888380.43859875950.4429925023
-0.446380657-0.16361808080.0131603537-0.101961117-0.5205315035-0.49310309480.92244807561.09546877081.20687310571.0879726060.86423114790.908241176
-0.6867610944-0.3668360075-0.1523275257-0.3047110638-0.6956662191-0.66070249011.89901156962.16521117412.36541010582.06325204451.93666166922.0730124142
mcf+mcfmcf+vpr
CMP2MSMT3MSMT2MCMP1MCMP2MSMT3MSMT2MCMP1M
0.08056970120.7562425940.0178648675-0.49992309650.89516590760.64218869450.58623483260.1699922658
0.76212768220.51783019610.38187316540.77191674330.81107092720.55184073880.53182854620.6586422793
0.6307394889-0.13575139280.35761948692.54328850380.0753941638-0.01810823040.00605737151.3502576369
0.509147894-0.50789906740.33379149226.0854872099-0.3614426746-0.378730418-0.33925279212.3302605562
0.3966224412-0.71979899550.310381709913.1687951593-0.6208316247-0.6069058673-0.56604177353.7189019612
CMPSMT
0.0264571190.3364798079
0.07352374420.3810228237
0.22155764040.4294488365
0.49383534420.4621670173
0.61999150560.5225182996
0.67959194240.5302003877
0.73417723930.5443473787
IPC1.5M
POWER1.75M
ENERGY2M
ENERGY DELAY2.25M
ENERGY DELAY^22.5M
2.75M
3M
Sheet1
Normal case
L2 leakage radically reduced
No Temperature effect on Leakge
Technology (nm)
Average temperature difference between CMP and SMT
Sheet2
Temperature (Celsius)
Sheet3
2-way SMT
dual-core CMP
Relative change compared to ST baseline
2-way SMT
dual-core CMP
Relative change compared to ST baseline
No DTM
Global fetch throttling
Local renaming throttling
Register file throttling
Relative change compared to ST baseline without DTM
No DTM
Global fetch throttling
Local renaming throttling
Register file throttling
Relative change compared to ST baseline without DTM
-0.0545799581-0.0390064596-0.0360948816-0.095716911-0.1087500114
0.02506602060.08906567280.0636536881-0.0589310101-0.0566065961
0.12139632310.25798160460.1885986013-0.0185160950.0029792386
0.23878798150.48369469820.34666520760.02595916460.0712814762
Fetch throttling
Rename throttling
Register file throttling
DVS10
DVS20
Relative change compared with baseline without DTM
-0.0188401584-0.0133399996-0.0122237317-0.0012915749-0.0063309131
0.00833237330.03034882850.02176455320.01231223310.0130886515
0.0389932170.08251137110.06096026490.02677661880.0342944582
0.07362170770.14487410540.10621857540.04215762010.0574549696
Fetch throttling
Rename throttling
Register file throttling
DVS10
DVS20
Relative change compared with baseline without DTM
No DTM
Fetch throttling
Rename throttling
Register file throttling
Relative change compared with baseline without DTM
No DTM
Fetch throttling
Rename throttling
Register file throttling
Relative change compared with baseline without DTM
No DTM
Fetch throttling
Rename throttling
Register file throttling
Relative change compared with baseline without DTM
1.10
1.21
1.09
1.9
2.17
2.37
2.06
1.94
2.07
No DTM
Fetch throttling
Rename throttling
Register file throttling
Relative change compared with baseline without DTM
0.01786486750.756242594
0.38187316540.5178301961
0.3576194869-0.1357513928
0.3337914922-0.5078990674
0.3103817099-0.7197989955
13.2
6.09
2.54
SMT with 2MB L2
SMT with 3MB L2
CMP with 1MB L2
CMP with 2MB L2
Relative change compared with baseline ST with 2MB L2
0.58623483260.6421886945
0.53182854620.5518407388
0.0060573715-0.0181082304
-0.3392527921-0.378730418
-0.5660417735-0.6069058673
1.35
2.33
3.72
SMT with 2MB L2
SMT with 3MB L2
CMP with 1MB L2
CMP with 2MB L2
Relative change compared with baseline ST with 2MB L2
SMT
CMP
L2 size (SMT)
Relative performance change compared to ST baseline
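The SMT vs. CMP comparison is reported as relative change against the single-threaded (ST) baseline for IPC, power, energy, energy-delay, and energy-delay^2. The following is a minimal sketch of how such relative changes can be derived from aggregate IPC and average power. It assumes a fixed instruction count and clock frequency for all configurations. The names `Run`, `metrics`, and `relative_change` are illustrative and do not come from the slides, and the authors' exact averaging over benchmarks is not reproduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Run:
    """One simulated configuration (hypothetical container, not from the slides)."""
    ipc: float    # aggregate instructions per cycle over all threads
    power: float  # average power in watts


def metrics(run, instructions=1.0, freq_hz=1.0):
    """Derive the five metrics, assuming a fixed instruction count and clock."""
    delay = instructions / (run.ipc * freq_hz)  # execution time
    energy = run.power * delay                  # energy = power x time
    return {
        "IPC": run.ipc,
        "POWER": run.power,
        "ENERGY": energy,
        "ENERGY DELAY": energy * delay,
        "ENERGY DELAY^2": energy * delay ** 2,
    }


def relative_change(config, baseline):
    """Relative change of each metric versus the baseline (0.25 means +25%)."""
    m, b = metrics(config), metrics(baseline)
    return {name: m[name] / b[name] - 1.0 for name in m}


if __name__ == "__main__":
    st = Run(ipc=1.0, power=10.0)        # made-up baseline numbers
    candidate = Run(ipc=2.0, power=15.0) # made-up SMT or CMP numbers
    for name, change in relative_change(candidate, st).items():
        print(f"{name}: {change:+.4f}")
```

For IPC a positive change is an improvement, while for power, energy, energy-delay, and energy-delay^2 a negative change is the improvement. The delay-weighted metrics reward a configuration that spends more power only if it finishes proportionally sooner.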
* 2005, Yingmin Li
The impact of changing L2 size: Examples
MCF+MCF: stays memory bound when L2 size changes.
MCF+VPR: changes from memory bound to non-memory bound when L2 size changes.
[Figure: two bar charts, one for mcf+mcf and one for mcf+vpr, with series SMT with 2MB L2, SMT with 3MB L2, CMP with 1MB L2, and CMP with 2MB L2. Y-axis: relative change compared with baseline ST with 2MB L2.]
* 2005, Yingmin Li
SMT vs. CMP performance with DTM
Localized DTM methods favor SMT, while global DTM methods favor CMP.
Global techniques: global DVS, fetch throttling
Local techniques: rename throttling, register file throttling (ideal)
[Figure: two bar charts, one for compute-bound and one for memory-bound workloads, showing ST, SMT, and CMP under no DTM, global fetch throttling, local renaming throttling, and register file throttling. Y-axis: relative change compared to ST baseline without DTM.]
0.23878798150.48369469820.34666520760.02595916460.0712814762
Fetch throttling
Rename throttling
Register file throttling
DVS10
DVS20
Relative change compared with baseline without DTM
-0.0188401584-0.0133399996-0.0122237317-0.0012915749-0.0063309131
0.00833237330.03034882850.02176455320.01231223310.0130886515
0.0389932170.08251137110.06096026490.02677661880.0342944582
0.07362170770.14487410540.10621857540.04215762010.0574549696
Fetch throttling
Rename throttling
Register file throttling
DVS10
DVS20
Relative change compared with baseline without DTM
No DTM
Fetch throttling
Rename throttling
Register file throttling
Relative change compared with baseline without DTM
No DTM
Fetch throttling
Rename throttling
Register file throttling
Relative change compared with baseline without DTM
No DTM
Fetch throttling
Rename throttling
Register file throttling
Relative change compared with baseline without DTM
1.10
1.21
1.09
1.9
2.17
2.37
2.06
1.94
2.07
No DTM
Fetch throttling
Rename throttling
Register file throttling
Relative change compared with baseline without DTM
0.01786486750.756242594
0.38187316540.5178301961
0.3576194869-0.1357513928
0.3337914922-0.5078990674
0.3103817099-0.7197989955
13.2
6.09
2.54
SMT with 2MB L2
SMT with 3MB L2
CMP with 1MB L2
CMP with 2MB L2
Relative change compared with baseline ST with 2MB L2
0.58623483260.6421886945
0.53182854620.5518407388
0.0060573715-0.0181082304
-0.3392527921-0.378730418
-0.5660417735-0.6069058673
1.35
2.33
3.72
SMT with 2MB L2
SMT with 3MB L2
CMP with 1MB L2
CMP with 2MB L2
Relative change compared with baseline ST with 2MB L2
SMT
CMP
L2 size (SMT)
Relative performance change compared to ST baseline
* 2005, Yingmin Li
SMT energy efficiency with DTM
Localized DTM methods can, in some cases, achieve a better energy-delay product than the global method.
Chart panels: Compute-bound, Memory-bound
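The energy-delay comparison on this slide can be made concrete with a small sketch. It assumes a fixed instruction count and a fixed clock frequency, so that delay scales as 1/IPC; this assumption does not hold under voltage/frequency scaling (the DVS cases). The function name `relative_metrics` and its parameters are hypothetical, not taken from the slides, and the slide's own figures are averages over benchmarks, so they need not match this per-run arithmetic exactly.

```python
def relative_metrics(ipc_change, power_change):
    """Relative change in energy, energy-delay and energy-delay^2.

    Inputs are relative changes versus a baseline (0.25 means +25%).
    Assumes fixed instruction count and clock, so delay ~ 1 / IPC.
    """
    delay = 1.0 / (1.0 + ipc_change)   # execution time ratio vs. baseline
    power = 1.0 + power_change         # average power ratio vs. baseline
    energy = power * delay             # E = P * D
    energy_delay = energy * delay      # ED = P * D^2
    energy_delay2 = energy_delay * delay  # ED^2 = P * D^3
    return {
        "energy": energy - 1.0,
        "energy_delay": energy_delay - 1.0,
        "energy_delay2": energy_delay2 - 1.0,
    }


if __name__ == "__main__":
    # Hypothetical input: IPC doubles while power also doubles.
    print(relative_metrics(1.0, 1.0))
```

Under these assumptions, a technique that costs extra power can still lower ED and ED^2 if it shortens execution time enough, because delay enters those metrics squared and cubed.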
[Chart7. Y-axis: relative change compared with baseline without DTM. Columns: No DTM, Fetch throttling, Rename throttling, Register file throttling.]
0.3798405427  0.2165823777  0.2682100476  0.2795910969
0.0775310879  0.1609879503  0.2778129246  0.1733499403
-0.1393164584  0.1766716056  0.3905340122  0.1270176583
-0.2989460528  0.2549865166  0.6191814985  0.1271092222
[Chart8. Y-axis: relative change compared with baseline without DTM. Columns: No DTM, Fetch throttling, Rename throttling, Register file throttling.]
0.4585592372  0.3179811073  0.374105905  0.395400602
0.0043272309  0.0796923277  0.2121871545  0.0723822197
-0.2336726869  0.0880846647  0.2376049533  -0.1328624067
-0.3498275106  0.3661498016  0.4765056837  -0.2620823634