ex p lo rin g b e n e fits a n d d e s ig n s o f o p tic ... · c ro s s -c h ip le t a s s e m b...
TRANSCRIPT
Exploring Benefits and Designs of Optically Connected
Disintegrated Processor Architecture
Yan Pan, Yigit Demir, Nikos Hardavellas, John Kim!, Gokhan Memik
""#$%&'()*+,'-+%./*+01'2+'*-%3-45'*24+6%
"5)-2+/-7%89:%3$;%
!%#$%&'()*+,'-+%<;8$=%
&)'>'/-7%</*')%
! %%?/@5)@/-%%! %%A#&B%;*C04+'C+D*'%! %%B/1'*%#/,()*42/-%! %%#/-CED24/-%
F8.&$%GHIH%J4-%C/->:%F4+0%?8#KA%LMN%.4O/2%P)*Q)5'EE)2% GRIS%
Motivation
! Transistor density grows exponentially ! But, processors are physically constrained
– Low yield, bandwidth wall, power wall – Dark silicon: we can build dense devices we cannot
afford to power ! Optically-Connected Disintegrated Processor
(OCDP) – Divide (impractical) monolithic processor into chiplets – Improves yield – Breaks the bandwidth wall – Breaks the power wall
• Spread out chiplets, cheaper cooling
Motivation
! %%?/@5)@/-%%! %%A#&B%;*C04+'C+D*'%! %%B/1'*%#/,()*42/-%! %%#/-CED24/-%
F8.&$%GHIH%J4-%C/->:%F4+0%?8#KA%LMN%.4O/2%P)*Q)5'EE)2% MRIS%
Motivation
! Advantage of nanophotonics – Latency – Bandwidth density
! Using nanophotonics for inter-chip interconnect – Reduced memory latency – Increased off-chip bandwidth – Increased total chip area – Increased power budget
! Analytical model* for performance estimation * N. Hardavellas et al., Tech Report NWU-EECS-10-05, Mar. 2010.
Motivation
! %%?/@5)@/-%%! %%A#&B%;*C04+'C+D*'%! %%B/1'*%#/,()*42/-%! %%#/-CED24/-%
F8.&$%GHIH%J4-%C/->:%F4+0%?8#KA%LMN%.4O/2%P)*Q)5'EE)2% LRIS%
Memory Latency
!"#$%&'$(
)*+(
,(
,*-(
,*.(
,*/(
,*+(
)( 0( ,)( ,0( -)( -0( 1)(
23$$45
3(
6$789:(;"<$'=:(>'#?(
Motivation
! %%?/@5)@/-%%! %%A#&B%;*C04+'C+D*'%! %%B/1'*%#/,()*42/-%! %%#/-CED24/-%
F8.&$%GHIH%J4-%C/->:%F4+0%?8#KA%LMN%.4O/2%P)*Q)5'EE)2% TRIS%
Off-chip Bandwidth
!"#$%&'$(
)*+(
,(
,*-(
,*.(
,*/(
,*+(
)( 0)( ,))( ,0)( -))( -0)(
23$$45
3(
@AB=C&3(!"'4D&4<C(>E!F#?(
Motivation
! %%?/@5)@/-%%! %%A#&B%;*C04+'C+D*'%! %%B/1'*%#/,()*42/-%! %%#/-CED24/-%
F8.&$%GHIH%J4-%C/->:%F4+0%?8#KA%LMN%.4O/2%P)*Q)5'EE)2% URIS%
Scaling Power, Chip Area
!"#$%&'$()*+(
,(
,*-(
,*.(
,*/(
,*+(
)( -))( .))( /))( +))( ,)))( ,-))( ,.))(
23$$45
3(
G8<"%(H&$(I9$"(>77-?(
2="%"J%$(K8D$9(!54L$<M(.N(@AB=C&3(!O(
2="%"J%$(K8D$9(!54L$<M-N(@AB=C&3(!O(
2="%"J%$(K8D$9(!54L$<M(,N(@AB=C&3(!O(
P&N$4(K8D$9(!54L$<M(,N(@AB=C&3(!O(
Motivation
! %%?/@5)@/-%%! %%A#&B%;*C04+'C+D*'%! %%B/1'*%#/,()*42/-%! %%#/-CED24/-%
F8.&$%GHIH%J4-%C/->:%F4+0%?8#KA%LMN%.4O/2%P)*Q)5'EE)2% VRIS%
Motivation
! Performance impact – Reduced memory latency " minimal – Improved off-chip bandwidth " small – Total chip area " small – Power budget " big
! Power budget scalability is critical – Spread out chiplets – Cheaper cooling
! Optically-Connected Disintegrated Processor (OCDP)
Motivation
! %%?/@5)@/-%%! %%A#&B%;*C04+'C+D*'%! %%B/1'*%#/,()*42/-%! %%#/-CED24/-%
F8.&$%GHIH%J4-%C/->:%F4+0%?8#KA%LMN%.4O/2%P)*Q)5'EE)2% WRIS%
Off-chip Optical Channels
!"#$%&"' ()#&*"'+,-.. /%-)"0"#&-1+2)$$3
/&#*4+53$1.C
2&'&*-1+8"9$0:&3$ ;<=+3>?*@ ;<ABC* A;:@+
()#&*+D&E$% ;<A+3>?F@ ;<CGC* AH;:@
! Optical fiber is low-loss, high speed – Enables further spreading out chiplets – BW density was a challenge
*
* J. Cardenas et al., Optics Express 2009
Motivation
! %%?/@5)@/-%%! %%A#&B%;*C04+'C+D*'%! %%B/1'*%#/,()*42/-%! %%#/-CED24/-%
F8.&$%GHIH%J4-%C/->:%F4+0%?8#KA%LMN%.4O/2%P)*Q)5'EE)2% SRIS%
Dense Off-chip Coupling
! Dense optical fiber array [Lee et al., OSA/OFC/NFOEC 2010]
! <1dB loss, 8 Tbps/mm demonstrated
Motivation
! %%?/@5)@/-%%! %%A#&B%;*C04+'C+D*'%! %%B/1'*%#/,()*42/-%! %%#/-CED24/-%
F8.&$%GHIH%J4-%C/->:%F4+0%?8#KA%LMN%.4O/2%P)*Q)5'EE)2% IHRIS%
OCDP Design Considerations
! Inter-chiplet optical channel technology – Optic fiber for low loss
! Inter-chiplet optical channel organization – Point-to-point [Koka et al., ISCA 2010]
– Minimize waveguide and coupler loss ! On-chip topology
– Scalable chiplet size
! On-chip / off-chip bandwidth interfacing – Distributed BW, seamless integration
Motivation
! %%?/@5)@/-%%! %%A#&B%;*C04+'C+D*'%! %%B/1'*%#/,()*42/-%! %%#/-CED24/-%
F8.&$%GHIH%J4-%C/->:%F4+0%?8#KA%LMN%.4O/2%P)*Q)5'EE)2% IIRIS%
OCDP Architecture
Chiplet 1 Chiplet 0src
Chiplet 3
Chiplet 2
Chiplet 4
Cross-chiplet assemblies share an optical bus, forming optical crossbars (FlexiShare)
Chiplet 0
Chiplet 3
Laser Source
couplers
Optical fiber
Electrical cluster
dst
OCDP Arch.
! %%?/@5)@/-%%! %%A#&B%;*C04+'C+D*'%! %%B/1'*%#/,()*42/-%! %%#/-CED24/-%
F8.&$%GHIH%J4-%C/->:%F4+0%?8#KA%LMN%.4O/2%P)*Q)5'EE)2% IGRIS%
Firefly On-chip Topology
! Firefly on-chip topology [Pan et al., ISCA 2009] – Flexible chiplet sizing, optical on-chip communication
! FlexiShare optical crossbars [Pan et al., HPCA 2010] – Flexible bandwidth provisioning – Light-weight optical arbitration needed, proposed
Chiplet 0
C0R3PPPP
C0R0
PPPP
C0R2
PPPP
C0R1PPPP
C2R0PPPP
C3R0PPPP
C1R0PPPP
C0
C1
C2
C3
C0R3PPPP
C0R0
PPPP
C0R2
PPPP
C0R1PPPP
C0
... ...
C2R0PPPP
C3R0PPPP
C1R0PPPP
C1
C2
C3... ...
... ...
A0
A1
A2
A3
FlexiShare
CH0
CH1
CHM-1!...
R0
R1
Rk-1
R0
R1
Rk-1
......
...
......
...
out
out
out
in
in
in
OCDP Arch.
! %%?/@5)@/-%%! %%A#&B%;*C04+'C+D*'%! %%B/1'*%#/,()*42/-%! %%#/-CED24/-%
F8.&$%GHIH%J4-%C/->:%F4+0%?8#KA%LMN%.4O/2%P)*Q)5'EE)2% IMRIS%
Extending across chiplets
! Distributed bandwidth across chiplets ! Flexible inter-chiplet bandwidth provisioning ! Minimal number of couplers ! Seamless on-chip/off-chip interfacing
Chiplet 0Chiplet 1
OCDP Arch.
! %%?/@5)@/-%%! %%A#&B%;*C04+'C+D*'%! %%B/1'*%#/,()*42/-%! %%#/-CED24/-%
F8.&$%GHIH%J4-%C/->:%F4+0%?8#KA%LMN%.4O/2%P)*Q)5'EE)2% ILRIS%
Technology Assumptions
! Moderate DWDM (16-way)
Parameter Loss Parameter ValueCoupler 1 dB Detector Sensitivity 0.01 mWSplitters 1 dB DWDM 16 !Non-linear 1 dB fiber coupler loss 0.1Modulator Insertion 0.1 dB fiber loss 2.00E-06 dB/cmWaveguide 0.3 dB/cm ring heating power 40 uW/ringRing Through 0.001 dB Modulation Power 80 fJ/bitFilter Drop 1.5 dB Demodulation Power 40 fJ/bitPhotoDetector 0.1 dB
Power Eval.
! %%?/@5)@/-%%! %%A#&B%;*C04+'C+D*'%! %%B/1'*%#/,()*42/-%! %%#/-CED24/-%
F8.&$%GHIH%J4-%C/->:%F4+0%?8#KA%LMN%.4O/2%P)*Q)5'EE)2% ITRIS%
Optical Power (320-core)
! 5-chiplet OCDP vs. single-chip topologies ! Total number of optical channels (wavelengths)
held constant.
!
"
#!
#"
$!
$"
%&'()*"+,-./0123
4.51607)*&893 4.51607)*&8:3 401;.<-=51
>?2=0)%/2.,=0)8?@@)*AB3
!
"!
#!!
#"!
$!!
$"!
%&'()*"+,-./0123
4.51607)*&893 4.51607)*&8:3 401;.<-=51
>-?@
A=BC
A
>?2=0)D@EF15)?6)G.BH)G1A?B=2?5A
Power Eval.
! %%?/@5)@/-%%! %%A#&B%;*C04+'C+D*'%! %%B/1'*%#/,()*42/-%! %%#/-CED24/-%
F8.&$%GHIH%J4-%C/->:%F4+0%?8#KA%LMN%.4O/2%P)*Q)5'EE)2% IURIS%
Per-Core Network Static Power
! ~ 30% power reduction compared to the best alternative.
!
"!
#!
$!
%!
&!
'!
()*+,-&./0123456
718493:,-);%6 718493:,-);<6 734=1>0?84
@A5?3,>5?51/,+AB48,+48,)A84,-CD6
Power Eval.
! %%?/@5)@/-%%! %%A#&B%;*C04+'C+D*'%! %%B/1'*%#/,()*42/-%! %%#/-CED24/-%
F8.&$%GHIH%J4-%C/->:%F4+0%?8#KA%LMN%.4O/2%P)*Q)5'EE)2% IVRIS%
Scaling Up
! OCDP limits the total on-chip waveguide length ! Better optical scalability
!"#!#"$!$"%!%"&!&"
'()*+*",-%$!.
'()*+*",-#$/!.
0123456+(7/,-#$/!.
053819:;23+-#$/!.
'()*+*<,-$%!&.
'()*+*#=,-&%"$.
>?@;5A'B@1C;5A7?DDA+EF.
Power Eval.
! %%?/@5)@/-%%! %%A#&B%;*C04+'C+D*'%! %%B/1'*%#/,()*42/-%! %%#/-CED24/-%
F8.&$%GHIH%J4-%C/->:%F4+0%?8#KA%LMN%.4O/2%P)*Q)5'EE)2% IWRIS%
Scaling Up
! OCDP shows very good power scalability. ! Single-chip is impractical for 1280-core
processor
!
!"
!""
!"""
!""""
#$%&'&()*+,"-
#$%&'&()*!,."-
/012345'$6.)*!,."-
/427089:12'*!,."-
#$%&'&;)*,+"<-
#$%&'&!=)*<+(,-
>?@:4A8@:@0BA&?C21A&21A$?12A'DE-
Power Eval.
! %%?/@5)@/-%%! %%A#&B%;*C04+'C+D*'%! %%B/1'*%#/,()*42/-%! %%#/-CED24/-%
F8.&$%GHIH%J4-%C/->:%F4+0%?8#KA%LMN%.4O/2%P)*Q)5'EE)2% ISRIS%
Conclusion
! OCDP leverages – Low latency / high bandwidth density – Low loss optic fibers
! Power scalability is critical – Minimize optical loss on the path
! Seamless on-chip / off-chip interfacing – Firefly intra-chiplet (distributed off-chiplet BW) – Point-to-point (Dragonfly) inter-chiplet
! Performance evaluation needed ! Chiplet composition to be explored
Conclusion
GQIRS(T@UV(XD'2@/-2Y%
! %%?/@5)@/-%%! %%A#&B%;*C04+'C+D*'%! %%B/1'*%#/,()*42/-%! %%#/-CED24/-%
F8.&$%GHIH%J4-%C/->:%F4+0%?8#KA%LMN%.4O/2%P)*Q)5'EE)2% GIRIS%
On-chip Optical Channel
! Silicon photonics with DWDM
(Offchip)Laser
Source
Waveguide
Resonant Modulators
Electrical Signal
Electrical Signal
FiltersPhoto
Detector
Motivation
(off-chip)