STATISTICAL COMPUTING THE ALTERNATIVE ROAD TO LOW ENERGY Jan M. Rabaey Donald O. Pederson Distinguished Prof. University of California at Berkeley DAC 2009 With major contributions from Doug Jones, Subhasish Mitra, and Naresh Shanbhag All research sponsored by Gigascale Systems Research Center (GSRC)

Post on 06-Aug-2020


Page 1

STATISTICAL COMPUTING THE ALTERNATIVE ROAD TO LOW ENERGY

Jan M. Rabaey Donald O. Pederson Distinguished Prof.

University of California at Berkeley

DAC 2009

With major contributions from Doug Jones, Subhasish Mitra, and Naresh Shanbhag

All research sponsored by Gigascale Systems Research Center (GSRC)

Page 2

It Is All About Energy …

Further progress in all aspects of the future information technology platform requires a continuing increase in energy efficiency

[Figure labels: The Compute Cloud; Mobiles]

Page 3

But … We are running out of options

Waste has been largely eliminated (…)

[Figure: normalized energy (log scale, 0.001 to 1) vs. VDD (V, 0 to 1.2); annotations: 0.3V, 12x]

Minimum energy point set by leakage

Technology scaling may not help much anymore

[Figure: EOP (fJ, 0.03 to 0.12) vs. technology node (nm, 20 to 90)]

Process variations and random upsets dictate noise and timing margins

Page 4

The ways out …

New devices that lower the minimum energy point

Example: NEMS Relay Logic (King, Alon)

Others: TFETs, IGFETs

Probably more than a decade out

Cut the margins in a major way and absorb the consequences

Possible Today!

[Figure: robustness vs. efficiency; points: Conventional, Statistical Computing]

Page 5

First Step: Better-than-Worst-Case Computing

Example: RAZOR (T. Austin et al., Michigan)

Scale voltage more than is allowable and deal with the circumstances (through error-trapping and correction)

[Figure: “razorized pipeline” stage. Main flip-flop (FF: D, Q, clk), shadow latch clocked by clk_del, and an error comparator that produces Error_L]

Reduced margins lead to throughput uncertainty, but operation remains functionally deterministic
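
The detect-and-replay idea can be illustrated with a toy simulation. This is a minimal sketch under stated assumptions, not the Michigan design: the function name `razor_pipeline`, the per-cycle error probability, the always-correct shadow latch, and the one-cycle recovery penalty are all hypothetical choices made for illustration.

```python
import random

def razor_pipeline(values, error_prob, seed=0):
    """Toy model of one Razor-style stage: detect a timing error, then replay."""
    rng = random.Random(seed)
    outputs, cycles = [], 0
    for v in values:
        cycles += 1
        # Main flip-flop samples on clk; with probability error_prob the logic
        # has not settled yet and a wrong value is captured (assumed model).
        main_ff = v ^ 1 if rng.random() < error_prob else v
        # Shadow latch samples on the delayed clock; assumed always correct.
        shadow = v
        if main_ff != shadow:      # error comparator fires
            cycles += 1            # one-cycle recovery penalty (assumption)
            main_ff = shadow       # restore the correct value
        outputs.append(main_ff)
    return outputs, cycles

values = list(range(100))
outputs, cycles = razor_pipeline(values, error_prob=0.2)
```

In this model the outputs always equal the inputs (functionally deterministic), while the cycle count depends on how many errors were trapped (throughput uncertainty).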

Page 6

The Opportunity: Functional Non-Determinism

[Figure: efficiency vs. required accuracy (both axes marked − to +); regions: Redundancy/Overdesign, App. Domain Solution, Statistical Computing]

Page 7

Statistical Computing

Statistical Performance Metrics

Statistical Model of Implementation Platform

Statistical Computation

Inputs: deterministic or stochastic variables

Outputs: stochastic variables with guaranteed properties (mean, distribution, bounds)

Implementation adds randomness (errors)

System designed such that output metrics are accomplished in spite of randomness of implementation

Requires error models to help design the compensation techniques

Examples: synthesis, classification, modeling, search, recognition

Page 8

Not to be confused with …

Probabilistic algorithms: Algorithms that have an element of randomness

Given deterministic inputs and implementation, outputs are random variables

May lead to better performance (search, optimization, polynomial factoring)

Examples: simulated annealing, genetic algorithms

No specific benefits related to nanoscale computing

Page 9

Not to be Confused With … Probabilistic Boolean Networks

All signals in logic network considered as stochastic variables.

Noise added into the process. Each logic network is essentially a stochastic process, producing stochastic variables at the output

Soft data is turned into Boolean variables with error probability at decision points (e.g. latch with sharp timing edges)

[Figure labels: In, Out]

Equivalent to discrete communication known as Binary Symmetric Channel (BSC)

Studied extensively in Information Theory (Von Neumann, Winograd, Hajek)

Coding can be applied, but large overhead and latency
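
To make the overhead point concrete, here is a small sketch of a Binary Symmetric Channel protected by a 3x repetition code. The crossover probability of 0.1, the repetition factor, and all function names are assumptions chosen for illustration; the slides do not prescribe a particular code.

```python
import random

def bsc(bits, p, rng):
    """Binary symmetric channel: flip each bit independently with probability p."""
    return [b ^ int(rng.random() < p) for b in bits]

def encode_repetition(bits, n=3):
    """Repeat every bit n times (n-fold overhead in bits sent)."""
    return [b for b in bits for _ in range(n)]

def decode_repetition(coded, n=3):
    """Majority vote over each group of n received bits."""
    return [int(sum(coded[i:i + n]) > n // 2) for i in range(0, len(coded), n)]

def bit_error_rate(sent, received):
    return sum(s != r for s, r in zip(sent, received)) / len(sent)

rng = random.Random(0)
message = [rng.randrange(2) for _ in range(20000)]
uncoded_ber = bit_error_rate(message, bsc(message, 0.1, rng))
coded = bsc(encode_repetition(message), 0.1, rng)
coded_ber = bit_error_rate(message, decode_repetition(coded))
```

The coded stream is three times as long as the message, which is the kind of overhead the slide refers to; the decoder also has to wait for a full group of bits before deciding, which adds latency.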

Page 10

Statistical Computing … What it is!

Computational engines that, given the properties and statistics of the input signals and the physical implementation, ensure that the outputs fall within the desired specifications

Basic Tools:

• Algorithmic resilience

• Estimation

• Detection

Page 11

Example: Error-Resilient System Architecture (ERSA)

RMS: Recognition, Mining, Synthesis

Emerging killer applications: cognition, vision, genomics

Large data sets, highly parallel

Core algorithms

Probabilistic belief propagation, K-means clustering, Bayesian networks

[S. Mitra et al., Stanford University]

Cognitive resilience

“Acceptable” results OK

Algorithmic resilience

Low order bit-errors – minimal effects

Intolerant to control and higher order bit-errors
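
The difference between low-order and high-order bit errors can be seen on a very small example. The sketch below flips a single bit in one of 100 hypothetical 16-bit samples and measures how far the sample mean moves; the data values and helper names are illustrative assumptions, not taken from the ERSA work.

```python
def flip_bit(value, bit):
    """Return value with the given bit position inverted."""
    return value ^ (1 << bit)

def mean(xs):
    return sum(xs) / len(xs)

samples = list(range(1000, 1100))      # 100 hypothetical 16-bit data values
reference = mean(samples)

def mean_error_after_flip(bit):
    """Corrupt one sample at the given bit position and report the shift in the mean."""
    corrupted = samples.copy()
    corrupted[0] = flip_bit(corrupted[0], bit)
    return abs(mean(corrupted) - reference)

low_order_error = mean_error_after_flip(0)     # LSB flip: 1000 -> 1001
high_order_error = mean_error_after_flip(15)   # MSB flip: 1000 -> 33768
```

The low-order flip barely moves the result, while the high-order flip shifts it by hundreds, which is why high-order (and control) errors need separate protection.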

Page 12

RMS Workload Model

[Figure: RMS workload model. Main thread: Setup, Work Assignment, Work Queue, Barrier, Data Reduction, Convergence Test, repeated over Iterations. Worker threads (three shown): Calculate.]

Page 13

ERSA Vision: Asymmetric Reliability

[Figure: ERSA architecture. Relaxed Reliability Cores (RRCs), each with an L1 cache, and L2 cache banks share an interconnect with the Super Reliable Core (SRC, supervisor), which has its own L1 cache. Legends: reliable vs. unreliable; OS visible vs. sequestered from OS.]

Relaxed Reliability Cores: Specification

• Inexpensive & Unreliable

– Without expensive error detection

• Worker Threads

– Make up most of the workload

• Reliable parts

– Memory Bound Check

– Restart

Super Reliable Core: Specification

• Highly Reliable (Expensive)

– Proper Error Protection

• Executes Main Thread

– Assign Worker Threads

– Reduction

• Supervise RRCs

– Timeout check

Page 14

RMS on ERSA

[Figure: RMS workload mapped onto ERSA. Main Thread (SRC): Setup, Work Assignment, Barrier with “Basic” check (Work, Memory bounds, Timeout), Data Reduction, Convergence Test, repeated over Iterations. Worker thread + bounds check (RRC): Calculate.]

Simplistic ERSA inadequate

Convergence filtering heuristics

Convergence damping

= Estimation
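
The division of labor on this slide can be sketched in a few lines. This is a hedged illustration, not the Stanford implementation: it uses a 1-D K-means loop as the RMS kernel, models an unreliable worker by occasionally adding a huge value to a partial sum, and lets the reliable side apply a feasibility bounds check, a restart budget, and a damped update. All names (`rrc_worker`, `bounds_ok`, `ersa_kmeans`), the error model, and the parameter values are assumptions; a fixed iteration count stands in for the convergence test.

```python
import random

def rrc_worker(chunk, centers, rng, error_prob):
    """Unreliable worker: per-cluster partial (sums, counts) for one chunk."""
    sums = [0.0] * len(centers)
    counts = [0] * len(centers)
    for x in chunk:
        j = min(range(len(centers)), key=lambda c: abs(x - centers[c]))
        sums[j] += x
        counts[j] += 1
    if rng.random() < error_prob:               # hypothetical injected error
        sums[rng.randrange(len(sums))] += 1e9
    return sums, counts

def bounds_ok(sums, counts, lo, hi, tol=1e-6):
    """Cheap check on the reliable core: is each partial sum feasible?"""
    return all(lo * n - tol <= s <= hi * n + tol for s, n in zip(sums, counts))

def ersa_kmeans(data, centers, n_workers=4, error_prob=0.3,
                iterations=20, damping=0.5, max_restarts=10, seed=1):
    rng = random.Random(seed)
    lo, hi = min(data), max(data)
    chunks = [data[i::n_workers] for i in range(n_workers)]   # work assignment
    centers = list(centers)
    for _ in range(iterations):
        total_s = [0.0] * len(centers)
        total_n = [0] * len(centers)
        for chunk in chunks:
            for _attempt in range(max_restarts):
                sums, counts = rrc_worker(chunk, centers, rng, error_prob)
                if bounds_ok(sums, counts, lo, hi):
                    break                       # result accepted
            else:
                continue                        # skip this chunk for now
            for j in range(len(centers)):       # data reduction
                total_s[j] += sums[j]
                total_n[j] += counts[j]
        for j in range(len(centers)):
            if total_n[j]:
                target = total_s[j] / total_n[j]
                centers[j] += damping * (target - centers[j])   # damped update
    return centers

data = [0.0, 1.0, 2.0] * 10 + [9.0, 10.0, 11.0] * 10
final_centers = ersa_kmeans(data, [3.0, 7.0])
```

The damping factor limits how far any single iteration can move the centers, so an erroneous result that slips past the bounds check does less damage than it would with an undamped update.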

Page 15

ERSA Prototyping

[Figure: prototype stack. Labels: Application Program; OS + Many-Core Runtime; MISP [Hankins ISCA 06]; Emulation Firmware; Hardware; Error Injection (Virtualization). Cores: one SRC and seven RRCs.]

Error model: Random register bits flipped per RRC at random intervals (consistent with mean error rate)
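
One way to realize such an error model in software is sketched below. The slide only says that flips occur at random intervals consistent with a mean error rate; drawing exponentially distributed gaps (a Poisson process) is an assumption made here, as are the function name `inject_bit_flips`, the 32-bit register width, and the 16-entry register file.

```python
import random

def inject_bit_flips(registers, mean_rate, duration, width=32, seed=0):
    """Flip random register bits at random times; returns a log of (time, reg, bit).

    mean_rate is in errors per second, duration in seconds. Inter-arrival
    times are exponential (assumption), so the long-run rate equals mean_rate.
    """
    rng = random.Random(seed)
    flips = []
    if mean_rate <= 0:
        return flips
    t = rng.expovariate(mean_rate)
    while t < duration:
        reg = rng.randrange(len(registers))     # which register is hit
        bit = rng.randrange(width)              # which bit is flipped
        registers[reg] ^= 1 << bit
        flips.append((t, reg, bit))
        t += rng.expovariate(mean_rate)
    return flips

register_file = [0] * 16                        # hypothetical RRC register file
flip_log = inject_bit_flips(register_file, mean_rate=1000.0, duration=1.0)
```

Keeping the log of injected flips makes it possible to replay a run or to correlate output degradation with specific errors.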

Page 16

ERSA Results

[Figure, output quality: Bayesian Network Inference. Error % (probability dist.), 0 to 30, vs. Errors / RRC / sec, 0 to 20K. Curves: No ERSA, Naïve ERSA, Optimized ERSA.]

[Figure, output quality: LDPC Decoding. Successful Decoding (%), 0 to 100, vs. Errors / RRC / sec, 0 to 25K. Curves: No ERSA, Naïve ERSA, Optimized ERSA.]

[Figures, execution time (two panels): Normalized Execution Time, 0 to 4, vs. Errors / RRC / sec, 0 to 30K. Curves: No ERSA, Naïve ERSA, Optimized ERSA.]

Page 17

Algorithmic Noise-Tolerance (ANT)

Combining estimation and detection

• Main Block designed for average case

– Makes intermittent errors (reduced margins)

• Estimator approximates Main Block output

• Detector compares and replaces

• Assumes algorithmic knowledge for designing efficient estimators

[Courtesy: Shanbhag et al., UIUC]
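
The three blocks listed above can be sketched for an 8-tap dot product. This is an illustrative model under stated assumptions, not the UIUC design: the main block is exact except for an occasional flip of bit 17 (standing in for a timing error in a high-order bit), the estimator uses only the top 4 bits of each 8-bit operand, and the detector threshold is set just above the estimator's worst-case error of 8 × 7425 = 59400.

```python
import random

THRESHOLD = 60000      # just above the worst-case estimator error (59400)
ERROR_BIT = 17         # hypothetical position of the bit hit by a timing error

def exact_dot(x, h):
    return sum(a * b for a, b in zip(x, h))

def main_block(x, h, rng, p_err):
    """Full-precision block that occasionally makes a large error."""
    y = exact_dot(x, h)
    if rng.random() < p_err:
        y ^= 1 << ERROR_BIT
    return y

def estimator(x, h):
    """Low-cost, error-free approximation using the top 4 bits of each operand."""
    return sum((a >> 4) * (b >> 4) for a, b in zip(x, h)) << 8

def ant(y_main, y_est, threshold=THRESHOLD):
    """Detector: keep the main output unless it disagrees strongly with the estimate."""
    return y_est if abs(y_main - y_est) > threshold else y_main

rng = random.Random(0)
raw_errors, ant_errors = [], []
for _ in range(2000):
    x = [rng.randrange(256) for _ in range(8)]
    h = [rng.randrange(256) for _ in range(8)]
    y_true = exact_dot(x, h)
    y_main = main_block(x, h, rng, p_err=0.1)
    y_out = ant(y_main, estimator(x, h))
    raw_errors.append(abs(y_main - y_true))
    ant_errors.append(abs(y_out - y_true))
```

With these numbers an error-free main output is never replaced, and every injected error is caught, so the residual error is bounded by the estimator's own inaccuracy rather than by the size of the hardware error.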

Page 18

Results: ANT Motion Estimation

2.5X energy savings

PSNR variance reduction: 7X

[Figure: peak SNR (PSNR) for ideal, conventional, and ANT implementations; annotation: PSNR increase]

Page 19

Using Estimation Only

“Sensor Networks on a Chip (SNOC)”

Stochastic Model

Y_i = θ + η_i   (Y_i: observations; θ: quantity to estimate; η_i: noise)

Estimation Theory

[Figure label: Computational cores]

Requires

Efficient and robust estimators

Favorable error-statistics, e.g. independent and identical distributions

[Shanbhag, Jones et al, UIUC]
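
A small numerical sketch shows why a robust estimator matters when several error-prone cores each report the true value plus noise. The numbers here (nine cores, unit-variance Gaussian noise, a 10% chance of a large additive error of 1000) and the choice of the median as the robust estimator are assumptions for illustration, not parameters from the SNOC work.

```python
import random
import statistics

def observe(true_value, n_cores, rng, p_err=0.1, big_error=1000.0):
    """Each core reports the true value plus small noise and, rarely, a large error."""
    observations = []
    for _ in range(n_cores):
        y = true_value + rng.gauss(0.0, 1.0)
        if rng.random() < p_err:                # independent, identically distributed errors
            y += big_error
        observations.append(y)
    return observations

rng = random.Random(0)
true_value = 42.0
trials = 500
mean_err = 0.0
median_err = 0.0
for _ in range(trials):
    obs = observe(true_value, 9, rng)
    mean_err += abs(statistics.mean(obs) - true_value) / trials
    median_err += abs(statistics.median(obs) - true_value) / trials
```

The sample mean is pulled far off by a single large error, whereas the median stays close to the true value as long as fewer than half of the cores fail in the same trial.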

Page 20

SNOC-based PN-Code Acquisition

Common wireless CDMA receiver kernel

Polyphase decomposition

800X (better performance), 300X (reduced performance variation), 40% (energy savings)

[Figure: probability of detection; annotations: 800X, 300X, 40% energy savings]

Page 21

Statistical Computing: Quo Vadis?

So far: pretty much ad-hoc

The quest for a generalized strategy

Input descriptions that capture intended statistical behavior

(GP) statistical processors with known error models

Algorithm optimization and software generation (a.k.a. compilers) so that intended behavior is obtained

Page 22

Statistical Processors – Thinking aloud

Reliable simple CPU:

Calibration: Collect statistics (Vdd,f) for cores and interconnect

Statistics selection: Application dependent (QoS); (Vdd,f) for ‘good’ statistics

Application-dependent reconfiguration and adaptation

Energy-efficient unreliable IP Cores

Perform the majority of computation

Intermittent errors

Page 23

General-Purpose Statistical Computing

Soft NMR (N-way Modular Redundancy)

Soft voter: combines multiple observations with observed error profiles and multiple hypotheses to provide the output that minimizes error; does not need algorithmic information

Challenges:

Generation of error profiles

Hypothesis synthesis

[Courtesy: Shanbhag, Kim, UIUC]
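
A soft voter can be sketched as a maximum-likelihood choice among candidate outputs, given each module's error profile. The version below is an assumption-laden illustration rather than the UIUC design: error profiles are additive-error probability tables, hypotheses are synthesized by subtracting every known error value from every observation, and a small floor probability stands in for errors outside the profile.

```python
import math
from collections import Counter

def majority_vote(observations):
    """Conventional NMR voter: pick the most frequent observation."""
    return Counter(observations).most_common(1)[0][0]

def soft_vote(observations, error_pmfs, floor=1e-12):
    """Pick the hypothesis that makes the observed values most likely.

    error_pmfs[i] maps an additive error e to P(module i outputs truth + e).
    """
    hypotheses = {y - e for y, pmf in zip(observations, error_pmfs) for e in pmf}

    def log_likelihood(h):
        return sum(math.log(pmf.get(y - h, floor))
                   for y, pmf in zip(observations, error_pmfs))

    return max(sorted(hypotheses), key=log_likelihood)

# Hypothetical profile: a module is either correct or off by exactly +8.
profile = {0: 0.6, 8: 0.4}
observations = [13, 13, 5]
hard = majority_vote(observations)                 # two modules agree on 13
soft = soft_vote(observations, [profile] * 3)      # profile says 13 is likely 5 + 8
```

In this example two modules make the same error, so the majority voter is wrong, while the soft voter uses the error profile to conclude that 5 is the most plausible true value. The voter itself needs no knowledge of what the modules compute, only of how they fail.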

Page 24

Error Profiles (pdf)

[Figure: error rate (log scale, 0.00% to 100.00%) vs. supply voltage (0.6 to 2); curves: random, bzip, ammp]

Example: Errors resulting from Voltage Over-Scaling (VOS)

Can be obtained from simulation or on-chip test

Kogge-Stone Adder with realistic patterns

[Courtesy: T. Austin, Umich]

Page 25

Example: Soft Multiplier

• Error Statistics

– VOS: 16-bit RCA with 66% of Vdd-crit

• N=3: 10X in Psys, 3X in Pe

• N=7: 800X in Psys at Pe = 0.2

[Figure annotations: 800x, 3x]

Page 26

Major Takeaways

Energy rules

Reductions in energy/op not quickly forthcoming

Statistical computing allows for major reduction in margins and eliminates over-design

Initial prototypes show very promising potential

The Million $ Proposal: “General-purpose statistical computing” (or is this an oxymoron?)