![Page 1: Approximating Computation and Data for Energy Efficiencyretis.sssup.it/iwes/technical/pagliari.pdf · Approximating Computation and Data for Energy Efficiency 1st IWES September20th](https://reader033.vdocument.in/reader033/viewer/2022042308/5ed4368aee2f292b645b3648/html5/thumbnails/1.jpg)
Approximating Computation and Data for Energy Efficiency
1st IWES, September 20th, 2016, Pisa, Italy
EDA Group, Politecnico di Torino
Torino, Italy
Daniele Jahier Pagliari
Outline
§ Error Tolerance and Approximate Computing
§ Our View
§ AC in Processing
§ AC in Interconnects
§ AC in Actuators (OLED Displays)
§ Conclusions
Error Tolerance
§ Many emerging applications are error tolerant (or resilient).
Sources of application error tolerance:
§ Noisy inputs (e.g. sensor data)
§ Multiple "golden" outputs
§ Human perception (e.g. multimedia)
§ Algorithm features
[Figure: a noisy input next to a noisy output, and an original image next to the same image with errors in the 3 LSBs.]
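As a rough illustration of the "errors in 3 LSBs" example, the sketch below overwrites the three least significant bits of an 8-bit pixel with arbitrary noise bits and checks the worst-case deviation. It is not taken from the talk; the function name `corrupt_lsbs` and the 8-bit pixel width are assumptions made for the example.

```python
def corrupt_lsbs(pixel, noise, k=3):
    """Replace the k least significant bits of an 8-bit pixel with noise bits."""
    mask = (1 << k) - 1
    return (pixel & ~mask & 0xFF) | (noise & mask)


# Exhaustively try every 8-bit pixel value against every 3-bit noise pattern.
worst_case = max(
    abs(pixel - corrupt_lsbs(pixel, noise))
    for pixel in range(256)
    for noise in range(8)
)
relative = worst_case / 255  # fraction of the full 8-bit scale
```

The worst-case deviation is 7 levels out of 255, i.e. below 3% of full scale, which is consistent with the corrupted image being hard to tell apart from the original.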
Approximate Computing
ESs Design Challenges:
• Performance demands: "smart" systems, Internet of Things
• Energy budget: battery-operated, energy-autonomous
Approximate Computing (AC):
Trade off energy consumption and output quality by leveraging the application's error tolerance.
Abstraction Levels:
• Software level
• Processor level
• Architecture level
• Logic level
Classic AC:
• Design-time approximations (fixed error).
Recent Trend:
• Runtime-reconfigurable error.
Issues:
• What about system-level?
• What about automation?
Motivation
§ Embedded System:
§ ES Energy Breakdown:
[Figures: embedded system diagram with Environment and Processing blocks, and energy breakdown charts. Credits: EETimes; Stanley-Marbell et al., DAC'16.]
In Summary:
§ Sensors, actuators and interconnects are relevant contributors to energy consumption.
§ The breakdown is strongly system-dependent.
§ Approach: approximation as a system-level design knob!
AC in Processing: RPR
Idea: [Shanbhag et al., ISLPED'99]
§ Voltage Over-Scaling (VOS) on the original circuit (MDSP)
§ Error Control Block (EC-Block) to mitigate the effect of timing errors.
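[Figure: block diagram with input IN, the Original Circuit (MDSP) with output yM, and the Error Control block, made of an Estimator (RPR) with output yr and a Decision Block with output y. Two histograms of number of paths vs. arrival time [s], with Tclk marked: without VOS, and with VOS, where some paths violate Tclk.]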
EC Block Structure:
§ Estimator of the error-free output
§ Decision block to select between MDSP and Estimator outputs
Estimator Implementation:
§ Reduced Precision Replica (RPR)
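The EC block can be sketched in a few lines. The example below is only an illustration under stated assumptions: the MDSP is modeled as a small integer FIR tap sum, the replica as the same computation on inputs with their LSBs dropped, and the decision block as a threshold comparison between the two outputs. The slides only say that the decision block selects between the MDSP and estimator outputs, so the threshold rule, the function names and all numeric values here are assumptions.

```python
def fir_output(samples, coeffs):
    """Full-precision datapath (stands in for the MDSP)."""
    return sum(c * x for c, x in zip(coeffs, samples))


def truncate(value, dropped_bits):
    """Drop the LSBs of a non-negative integer (reduced precision)."""
    return (value >> dropped_bits) << dropped_bits


def rpr_estimate(samples, coeffs, dropped_bits):
    """Reduced Precision Replica: same function, fewer input bits."""
    return fir_output([truncate(x, dropped_bits) for x in samples], coeffs)


def decision_block(y_m, y_r, threshold):
    """Keep the MDSP output unless it is implausibly far from the estimate."""
    return y_m if abs(y_m - y_r) <= threshold else y_r


samples = [100, 120, 90, 110]
coeffs = [1, 2, 2, 1]
dropped_bits = 4

exact = fir_output(samples, coeffs)
y_r = rpr_estimate(samples, coeffs, dropped_bits)

# Largest possible estimation error: each input loses less than 2**dropped_bits.
threshold = (2 ** dropped_bits - 1) * sum(abs(c) for c in coeffs)

# Model a timing error under VOS as a flipped high-order bit of the MDSP output.
faulty = exact ^ (1 << 9)

y_ok = decision_block(exact, y_r, threshold)       # error-free case
y_fixed = decision_block(faulty, y_r, threshold)   # timing-error case
```

In the error-free case the full-precision result passes through unchanged. When the high-order bit is corrupted, the output falls back to the replica value, so the large timing error is replaced by a small precision error.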
AC in Processing: Our Contribution
Classic RPR has limitations:
§ Replica design and error estimation require knowledge of the functionality (design-specific).
§ Uses simplified and unrealistic assumptions (e.g. on input statistics).
Proposed Framework:
§ Automatically add RPR to an existing gate-level netlist of a datapath circuit.
Features:
§ Functionality-agnostic.
§ Simulation-based.
§ Integrated with state-of-the-art tools for synthesis and simulation.
[Figure: flow of the proposed framework. Labeled elements: MDSP Netlist, Representative Input Set, Quality Evaluation ($\sigma_s^2 / (\sigma_{y_r}^2 + \sigma_\eta^2)$), RPR Automation Engine, EDA Tools (e.g. DC), RPR Netlist.]
AC in Processing: Results
Setup:
§ 45nm library from STM.
§ OpenCores designs, realistic quality constraints.
Generality:
§ Successfully applied RPR to previously untested designs (CORDIC, SRU).
§ Comparable savings w.r.t. the ad-hoc approach on FIR and FFT.

| Benchmark | Tot. Power Saving [%] | Area Ovr. [%] |
|---|---|---|
| FIR Filter | 44.96 | 82.39 |
| FFT Butterfly | 49.66 | 133.20 |
| RM-CORDIC | 42.05 | 127.64 |
| SRU | 47.91 | 143.32 |

Benefits of the simulation-based approach:
§ Different input stimuli cause different error rates on the MDSP, at the same V_VOS.
§ Consequently, a larger/smaller replica can be used to obtain the same quality.
§ Strong impact of inputs on the obtainable power savings.
[Figure: RPR power saving vs. voltage for a FIR filter, with different input stimuli (same quality constraint). Annotation: >20% difference!]
AC in Interconnects: Motivation
Serial buses:
§ De-facto standard for interfacing sensors, actuators and I/O controllers
§ Higher frequencies, no jitter issue, reduced crosstalk
§ Lower costs (fewer pins and easier wiring)
§ SPI, I2C, MIPI, etc.
Motivation:
§ PCB traces have large capacitive loads that have not scaled as transistors have!
§ Transmitting one 12-bit sample costs far more energy than executing one instruction! [1][2]
§ Tens of serial connections in a system!
Error Tolerant Bus Traces:
§ Sensor ICs / multimedia actuators (audio DAC, displays)
§ Long "idle" (roughly constant) phases.
§ Short "bursty" (fast and large variations) phases.
§ Example: Lena image (red channel)
(Most) information is conveyed by the bursty phases!
[1] P. Stanley-Marbell and M. Rinard. Value-deviation-bounded serial data encoding for energy-efficient approximate communication. 2015.
[2] N. Ickes et al. A 10-pJ/instruction, 4-MIPS micropower DSP for sensor applications. 2008.
AC in Interconnects: ST0/ADE
Two Encodings with common Principles:
§ Exploit idle phases for power saving!
§ Avoid redundancy (introduces large overheads in serial buses)
§ Allow runtime reconfiguration of the accepted error.
§ Simple implementation (CODEC HW overheads must not offset gains).
Serial T0 (ST0):
§ Selectively transmit the correct datum or a special 0-Transitions (0-T) pattern (interpreted as "repeat previous datum").
§ |b(t) – b(t')| > Th → Send correct data
§ |b(t) – b(t')| ≤ Th → Send 0-T pattern.
Approximate Differential Encoding (ADE):
§ Based on bitwise Differential Encoding (DE): B(t) = b(t) ⊕ b(t-1)
§ Enhanced with LSB-saturation to reduce transitions also during bursty phases
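The two rules above can be sketched as follows. This is a minimal model under stated assumptions, not the actual CODEC: `b(t')` is taken to be the last datum actually sent on the bus, the 0-T pattern is represented by a hypothetical `REPEAT` marker rather than a real bit pattern, and only the accurate DE base of ADE is shown (LSB-saturation is left out because the slides do not detail it).

```python
REPEAT = None  # hypothetical stand-in for the special 0-Transitions pattern


def st0_encode(samples, threshold):
    """Send a sample only if it differs from the last sent one by more than threshold."""
    encoded, last_sent = [], None
    for b in samples:
        if last_sent is not None and abs(b - last_sent) <= threshold:
            encoded.append(REPEAT)  # receiver will repeat the previous datum
        else:
            encoded.append(b)       # send the correct datum
            last_sent = b
    return encoded


def st0_decode(symbols):
    """Rebuild the trace, repeating the previous datum on every REPEAT marker."""
    decoded, last = [], None
    for s in symbols:
        if s is not REPEAT:
            last = s
        decoded.append(last)
    return decoded


def de_encode(samples):
    """Bitwise differential encoding: B(t) = b(t) XOR b(t-1), with b(-1) = 0."""
    encoded, prev = [], 0
    for b in samples:
        encoded.append(b ^ prev)
        prev = b
    return encoded


def de_decode(codes):
    """Invert the XOR chain to recover the original samples exactly."""
    decoded, prev = [], 0
    for code in codes:
        prev ^= code
        decoded.append(prev)
    return decoded


trace = [100, 101, 99, 150, 151, 100]
received = st0_decode(st0_encode(trace, threshold=2))
max_error = max(abs(a - b) for a, b in zip(trace, received))
```

With this interpretation the ST0 reconstruction error never exceeds the threshold, which can be changed at runtime. Accurate DE is lossless, and a constant input encodes to all-zero words, which cause no transitions on the serial line.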
AC in Interconnects: Results
[Figure: three plots of TC Reduction [%] vs. Error [%] for ADE, LSBS, RAKE, ST0 and DE. Panels: Images [Kodak Database], ECG [Physionet Database], Accelerometer [PSR Database]. Annotations: ≃+40%, ≃+18%, ≃+50%.]
Comparison:
§ Rake [Stanley-Marbell, DAC'16]
§ LSBS and Accurate DE
Results:
§ ADE and ST0 are both superior to the state of the art.
§ ST0 is better for "strong burstiness", ADE is superior for more random data.
AC in Actuators: OLED Displays
OLED Displays:
§ Brighter and better viewing angles w.r.t. LCDs
§ Thinner and/or flexible panels
OLEDs are emissive:
§ Power strongly depends on pixel luminance and (secondarily) color
§ Power optimization can be achieved with an image transformation! (≠ LCD)
Motivation:
§ Transformations for general images must preserve contrast while reducing power consumption.
§ Existing solutions are computationally intensive.
§ Power overhead?
§ Real-time applicability?
Claim:
§ Similar transformations can be obtained by much simpler (approximate) computations.
3rd Order Polynomial Fit:
§ Transform images according to a 3rd order polynomial function of the input luminance (YUV space)
§ Polynomial evaluation vs. histogram processing, etc.
§ Simpler and fewer operations (ADD, MULT)
Approach:
1. Offline Training Phase (Computationally Intensive): extraction of function parameters
2. Online Transformation (Linear Complexity): determine and apply image-specific transformation
[Figure: flow diagram linking Image DB, Quality Constraints and Params. to the two phases. Plot of output luminance (Y_t) vs. input luminance (Y): data from Lee et al., TIP 2010, and 3rd-order polynomial fit.]
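The online step can be sketched as a per-pixel polynomial evaluation on the luminance plane. The code below is only an illustration: the coefficients in `COEFFS` are hypothetical values picked to give a monotonic, dimming curve and are not the trained parameters from the talk, and luminance is assumed to be normalized to [0, 1].

```python
def poly3(y, coeffs):
    """Evaluate c3*y^3 + c2*y^2 + c1*y + c0 with Horner's rule (3 MULT, 3 ADD)."""
    c3, c2, c1, c0 = coeffs
    return ((c3 * y + c2) * y + c1) * y + c0


def transform_luminance(y_plane, coeffs):
    """Apply the polynomial to every luminance sample, clamping to [0, 1]."""
    return [
        [min(1.0, max(0.0, poly3(y, coeffs))) for y in row]
        for row in y_plane
    ]


def mean(plane):
    """Average value over a 2D list of samples."""
    values = [v for row in plane for v in row]
    return sum(values) / len(values)


# Hypothetical coefficients (c3, c2, c1, c0), for illustration only.
COEFFS = (0.4, -0.9, 1.1, 0.0)

y_plane = [
    [0.0, 0.25, 0.5],
    [0.5, 0.75, 1.0],
]
y_out = transform_luminance(y_plane, COEFFS)
```

Each pixel costs a fixed number of multiplications and additions, so the work grows linearly with the number of pixels, in line with the "linear complexity" noted for the online phase. With these example coefficients the mean luminance of `y_out` is lower than that of `y_plane`, while the ordering of pixel intensities is preserved.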
AC in Actuators: Results
§ Comparable quality at iso-savings w.r.t. the state of the art
§ Visually and quantitatively:
§ Much lower complexity! (SW or HW)
§ 10x faster than Lee et al.
§ Minimal power overhead for HW implementation.
[Figure: example images in three columns: Original, Lee et al., Proposed.]
[Figure annotations on the transformed images: MSSIM 0.692, MSSIM 0.799, MSSIM 0.860, MSSIM 0.691.]
[Figure annotations on the transformed images: Saving 61.6 %, Saving 59.5 %, Saving 55.1 %, Saving 69.4 %.]
Conclusions
§ Exploring the energy versus quality tradeoff can be interesting at system level:
§ The computation part is not always the one to blame.
§ Automation aspects are key to the widespread diffusion of these design techniques.
Open Issues / Future Work:
§ AC in memories?
§ AC in sensors (and/or ADCs)?
§ How to combine AC techniques in different parts of the system to maximize total power savings?
§ (e.g., encoding + RPR)
THANK YOU!