introduction to embedded systems research: power, energy
TRANSCRIPT
Introduction to Embedded Systems Research:Power, Energy, and Temperature
Robert Dick
[email protected] of Electrical Engineering and Computer Science
University of Michigan
Fovea
Lens
Cornea
Variable Resolution
and Position Sampling
Change-Adaptive
Signalling
Spatial State
Cache
(Occipital
Place Area)
Analasys, e.g.,
Classification
Adequate
decision confidence?
Sampling Guidance
Y
N
Long-Term Memory
Decision /
Result
(b) Iterative, multi-round human vision system.
Image Signal
Processor
Application Processor
(CPU and/or GPU)
CloudDecision /
Result
(a) Conventional machine vision pipeline.
Image Sensor: typically
homogeneous RGGB or RCCC.
Demosaicing, binning,
denoising, gamma
correction, and compression.
Hardware Feature
Extraction Accelerator
or
Feature extraction on
raw captured data.
Runs CNN, LSTM or
other analysis algorithm.
May drop computation on less
important data, but already payed
Image Signal Processor transfer cost.
May render decision or (at high
energy cost) do feature extraction
and defer decision to cloud.
Minimalistic
Image Pre-Processor
Application Processor
(CPU and/or GPU)
CloudDecision /
Result
Image Sensor: capture only the
most important
data for decision accuracy.
Efficient gamma
correction
and binning.
Decide based on features.
Very high energy cost for
wireless data transfer.
Scene Cache
Capture ControllerHardware Feature
Extraction Accelerator
or
Adequate
decision confidence?
or
Y
N
Captures most relevant
and rapidly changing data.
Learns important sample locations
from prior rounds.
Maintains state built from
prior still-relevant samples.
Determine and
transmit
relevant data.
Decide based on features.
Very high energy cost for
wireless data transfer.
Issue commands to
capture important data.
(c) Goal: multi-round, energy-efficient, low-latency
continuous learning machine vision.
Feature extraction on sparse captured
data with similar distribution to processed data.
Continuously learn features and important
data based on prior captures.
1.1040
2.1041
385
2.1039
704
1.1039
36
2.1040
1734
0.1039
1
3.1040
4
4.1039
642
3.1039
4.1040
409
3.1041
396665
5.1042
108612774
5.1040
337
4.1041
10644
6.1044
117
6.1039
609
6.1045
164
6.1040
841434 723938
6.1042
209
5.1041
4 12 164
5.1045
529417
5.1044
140105 88
5.1039
154 10551 90677
7.1039
1248
7.1047
2106
7.1040
29773362 241966 22903106
6.1041
3119 1936 40
6.1047
4128 253
8.1050
4
8.1042
9.1050
24784
8.1039
89632
8.1044
2840
8.1040
1088
9.1039
10.1040
4
11.1039
957
10.1050
144
11.1040
156
10.1041
32 16
10.1047
87780
10.1045
2152
10.1039
1165
12.1040
145
13.1052
2404
12.1042
24
12.1045
33008
12.1044
8217
12.1041
8
12.1039
135427
14.1049
113
14.1040
229
13.1050
132
13.1042
74433
13.1041
17187
13.1040
2715
13.1039
170059 90
15.1040
16
15.1050
242 1225
14.1050
237
14.1042
6200
14.1041
4
14.1044
720
14.1052
20
14.1039
84 36939
16.1040
88
15.1049
6
15.1039
56
17.1054
2919
16.1050
129
16.1041
36
16.1047
49154
16.1045
2632
16.1049
27
16.1052
16
16.1039
222734
18.1048
133
18.1039
3439
18.1040
241
17.1050
16
18.1049
36 339224
17.1042
172832
17.1041
49620 448
17.1040
1376493 2883
17.1045
72
17.1044
1073
17.1049
55441826
17.1048
124 3648290
17.1052
3547
17.1039
2484110477445
17.1047
24
19.1040
72
19.1039
6 88 60631617
18.1041
28 76
18.1047
3305
18.1054
13744
20.1040
109
20.1039
269
19.1052
8
19.1047
11712
19.1049
10
19.1054
7520
21.1039
82
20.1049
5
20.1047
4896
20.1054
864
22.1040
4
22.1050
23.1040
4
22.1039
144
24.1058
3389
23.1050
76
24.1040
4
23.1042
17528
23.1041
4
23.1054
24
23.1044
6234
23.1058
261
23.1049
4
23.1052
2944
23.1039
3069658
25.1040
80
24.1050
4
24.1039
58
25.1039
26.1040
4
27.1039
489
26.1050
4
27.1040
4
26.1047
3808
26.1045
2248
26.1058
113 80
26.1049
11
26.1039
66
26.1054
840
28.1055
84266 1542
27.1042
1229
27.1041
29619
27.1058
12
27.1049
3984
27.1048
35337
29.1040
262
29.1056
164
29.1039
742
28.1050
4 2464
28.1042
2137912
29.1055
1128
28.1041
4 2633 4
28.1040
2716132 691
28.1058
24 84 32
28.1049
36176 3
28.1048
1192 48
28.1039
365 110957 24475
29.105029.104229.104129.104729.104529.105829.104929.105229.1054
2
3
4
5
6
7
8
0 1 2 3 4 5 6 7 8
Pow
er
(mW
)
Time (s)
35 40 45 50 55 60 65 70 75 80 85 90
-8 -6 -4 -2 0 2 4 6 8
-8
-6
-4
-2
0
2
4
6
8
35 40 45 50 55 60 65 70 75 80 85 90
Temperature (°C)
Position (mm)
Temperature (°C)
Deadlines and announcementsPower and temperature definitions and fundamentals
Thermal analysisPower models for embedded systems
Outline
1. Deadlines and announcements
2. Power and temperature definitions and fundamentals
3. Thermal analysis
4. Power models for embedded systems
2 R. Dick EECS 507
Deadlines and announcementsPower and temperature definitions and fundamentals
Thermal analysisPower models for embedded systems
Deadlines and Announcements I
From now on, deadlines and announcements will come at the start of lectureslides.
23 February: J. Polastre, R. Szewczyk, A. Mainwaring, D. Culler, andJ. Anderson, “Analysis of wireless sensor networks for habitat monitoring,”in Wireless Sensor Networks, C. S. Raghavendra, K. M. Sivalingam, andT. Znati, Eds. Springer US, 2004, ch. 18, pp. 399–423.
I was quite ill recently. I’m catching up on feedback/evaluations.
2 March: S. Roundy, P. K. Wright, and J. Rabaey, “A study of low levelvibrations as a power source for wireless sensor nodes,” ComputerCommunications, vol. 26, pp. 1131–1144, Oct. 2003.
4 March: Project checkpoint 1.
11 March: Midterm exam.
3 R. Dick EECS 507
Deadlines and announcementsPower and temperature definitions and fundamentals
Thermal analysisPower models for embedded systems
Deadlines and Announcements II
30 March: Project checkpoint 2.
21 Apr: Project deadline.
10:30am–12:30pm 29 Apr: Final exam.
4 R. Dick EECS 507
Deadlines and announcementsPower and temperature definitions and fundamentals
Thermal analysisPower models for embedded systems
Presentations feedback
Private feedback on presentations in office hours?
5 R. Dick EECS 507
Deadlines and announcementsPower and temperature definitions and fundamentals
Thermal analysisPower models for embedded systems
Context
Finish C. L. Liu and J. W. Layland, “Scheduling algorithms formultiprogramming in a hard-real-time environment,” J. of the ACM, vol. 20,no. 1, pp. 46–61, Jan. 1973.
Brief lecture on RTOSs and CPS.
E. A. Lee, “The past, present and future of cyber-physical systems: A focuson models,” Sensors, Feb. 2015.
Lecture on power, energy, and temperature.
L. Zhang, B. Tiwana, Z. Qian, Z. Wang, R. P. Dick, Z. M. Mao, andL. Yang, “Accurate online power estimation and automatic battery behaviorbased power model generation for smartphones,” in Proc. Int. Conf.Hardware/Software Codesign and System Synthesis, Oct. 2010, pp. 105–114
.
6 R. Dick EECS 507
Deadlines and announcementsPower and temperature definitions and fundamentals
Thermal analysisPower models for embedded systems
Outline
1. Deadlines and announcements
2. Power and temperature definitions and fundamentals
3. Thermal analysis
4. Power models for embedded systems
7 R. Dick EECS 507
Deadlines and announcementsPower and temperature definitions and fundamentals
Thermal analysisPower models for embedded systems
Definitions
Temperature: Average kinetic energy of particle.
Heat flow: Transfer of this energy.
Heat always flows from regions of higher temperature to regions of lowertemperature.
Particles move.
What happens to a moving particle in a lattice?
8 R. Dick EECS 507
Deadlines and announcementsPower and temperature definitions and fundamentals
Thermal analysisPower models for embedded systems
Acoustic phonons
Lattice structure.
Transverse and longitudinal waves.
Electron–phonon interactions.
Effect of carrier energy increasing beyond optic phonon energy?
9 R. Dick EECS 507
Deadlines and announcementsPower and temperature definitions and fundamentals
Thermal analysisPower models for embedded systems
Optic phonons
Only occur in lattices with more than one atom per unit cell.
Optic phonons out of phase from primitive cell to primitive cell.
Positive and negative ions swing against each other.
Low group velocity.
Interact with electrons.
10 R. Dick EECS 507
Deadlines and announcementsPower and temperature definitions and fundamentals
Thermal analysisPower models for embedded systems
Nanostructure heat transfer
Boundary scattering.
Quantum effects when phonon spectra of materials do not match.
11 R. Dick EECS 507
Deadlines and announcementsPower and temperature definitions and fundamentals
Thermal analysisPower models for embedded systems
Why do wires get hot?
Scattering of electrons due to destructive interference with waves in thelattice.
What are these waves?
What happens to the energy of these electrons?
What happens when wires start very, very cool?
What is electrical resistance?
What is thermal resistance?
12 R. Dick EECS 507
Deadlines and announcementsPower and temperature definitions and fundamentals
Thermal analysisPower models for embedded systems
Why do transistors get hot?
Scattering of electrons due to destructive interference with waves in thelattice.
Where do these waves come from?
Where do the electrons come from?
Intrinsic carriers.
Dopants.
What happens as the semiconductor heats up?
Carrier concentration increases.
Carrier mobility decreases.
Threshold voltage decreases.
13 R. Dick EECS 507
Deadlines and announcementsPower and temperature definitions and fundamentals
Thermal analysisPower models for embedded systems
Power consumption trends
Initial optimization at transistor level.
Further research-driven gains at this level difficult.
Research moved to higher levels, e.g., RTL.
Trade area for performance and performance for power.
Clock frequency gains linear.
Voltage scaling VDD2 – important.
14 R. Dick EECS 507
Deadlines and announcementsPower and temperature definitions and fundamentals
Thermal analysisPower models for embedded systems
Power consumption in synchronous CMOS
P = PSWITCH + PSHORT + PLEAK
PSWITCH = C · VDD2 · f · A
† PSHORT =b
12(VDD − 2 · VT )3 · f · A · t
PLEAK = VDD · (ISUB + IGATE + IJUNCTION + IGIDL)
C : total switched capacitance VDD : high voltage
f : switching frequency A : switching activity
b : MOS transistor gain VT : threshold voltage
t : rise/fall time of inputs
† PSHORT usually ≤ 10% of PSWITCH
Smaller as VDD → VT
15 R. Dick EECS 507
Deadlines and announcementsPower and temperature definitions and fundamentals
Thermal analysisPower models for embedded systems
Adiabatic charging
Voltage step function implies E = CVCAP2/2.
Instead, vary voltage to hold current constant: E = CVCAP2 · RC/t.
Lower energy if T > 2RC .
Impractical when leakage significant.
16 R. Dick EECS 507
Deadlines and announcementsPower and temperature definitions and fundamentals
Thermal analysisPower models for embedded systems
Wiring power consumption
In the past, transistor power � wiring power.
Process scaling ⇒ ratio changing.
Conventional CAD tools neglect wiring power.
17 R. Dick EECS 507
Deadlines and announcementsPower and temperature definitions and fundamentals
Thermal analysisPower models for embedded systems
Leakage
B
DSG
n+ n+
Gate Leakage Subthreshold Leakage
Junction LeakageGIDL Leakage
Punchthrough Leakage
18 R. Dick EECS 507
Deadlines and announcementsPower and temperature definitions and fundamentals
Thermal analysisPower models for embedded systems
Subthreshold leakage current
Isubthreshold = AsW
LvT
2
(1− e
−VDSvT
)e
(VGS−Vth)
nvT ,
where As is a technology-dependent constant,
Vth is the threshold voltage,
L and W are the device effective channel length and width,
VGS is the gate-to-source voltage,
n is the subthreshold swing coefficient for the transistor,
VDS is the drain-to-source voltage, and
vT is the thermal voltage.
A. Chandrakasan, W. Bowhill, and F. Fox, Design of High-Performance MicroprocessorCircuits. IEEE Press, 2001
19 R. Dick EECS 507
Deadlines and announcementsPower and temperature definitions and fundamentals
Thermal analysisPower models for embedded systems
Simplified subthreshold leakage current
VDS � vT and vT = kTq . q is the charge of an electron. Therefore, equation
can be simplified to
Isubthreshold = AsW
L
(kT
q
)2
eq(VGS−Vth)
nkT (1)
20 R. Dick EECS 507
Deadlines and announcementsPower and temperature definitions and fundamentals
Thermal analysisPower models for embedded systems
Exponential?
20 40 60 80 100 1200.5
1
1.5
2
2.5
3
3.5
4
4.5
5
5.5
Temperature (Co)
Nor
mal
ized
leak
age
valu
e
C7552 HSPICEC7552 Linear ModelC7552 PWL3SRAM HSPICESRAM Linear ModelSRAM PWL3
21 R. Dick EECS 507
Deadlines and announcementsPower and temperature definitions and fundamentals
Thermal analysisPower models for embedded systems
Piece-wise linear error
PWL1 PWL2 PWL3 PWL4 PWL5 PWL10 PWL150
1
2
3
4
5
6
Piece−wise linear leakage model name
Leak
age
mod
el e
rror
(%
)
C7552 Worst
2Mx32 SRAM Worst
C7552 Avg.
2Mx32 SRAM Avg.
Y. Liu, R. P. Dick, L. Shang, and H. Yang, “Accurate temperature-dependentintegrated circuit leakage power estimation is easy,” in Proc. Design,Automation & Test in Europe Conf., Mar. 2007, pp. 1526–1531
22 R. Dick EECS 507
Deadlines and announcementsPower and temperature definitions and fundamentals
Thermal analysisPower models for embedded systems
Gate leakage
Caused by tunneling between gate and other terminals.
Igate = WLAJ
(Toxr
Tox
)ntVgVaux
T 2ox
e−BTox (a−b|Vox |)(1+c|Vox |)
where AJ ,B, a, b, and c are technology-dependent constants,
nt is a fitting parameter with a default value of one,
Vox is the voltage across gate dielectric,
Tox is gate dielectric thickness,
Toxr is the reference oxide thickness,
Vaux is an auxiliary function that approximates the density of tunnelingcarriers and available states, and
Vg is the gate voltage.
K. M. Cao, W. C. Lee, W. Liu, X. Jin, P. Su, S. K. H. Fung, J. X. An, B. Yu, and C. Hu,“BSIM4 gate leakage model including source-drain partition,” in IEDM Technology Dig.,Dec. 2000, pp. 815–818
23 R. Dick EECS 507
Deadlines and announcementsPower and temperature definitions and fundamentals
Thermal analysisPower models for embedded systems
Temperature-aware leakage estimation
power estimation at reference temperature(using PrimerPower, HSPICE, etc.)
leakage power dynamic power
chip-package thermal analysis
leakage power analysis
until leakage power
& temperatureprofiles converge
detailed IC power profile
detailed IC thermal profile
24 R. Dick EECS 507
Deadlines and announcementsPower and temperature definitions and fundamentals
Thermal analysisPower models for embedded systems
Power consumption conclusions
Voltage scaling is currently the most promising low-level power-reductionmethod: V 2 dependence.
As VDD reduced, VT must also be reduced.
Sub-threshold leakage becomes significant.
What happens if PLEAK > PSWITCH?
25 R. Dick EECS 507
Deadlines and announcementsPower and temperature definitions and fundamentals
Thermal analysisPower models for embedded systems
Outline
1. Deadlines and announcements
2. Power and temperature definitions and fundamentals
3. Thermal analysis
4. Power models for embedded systems
26 R. Dick EECS 507
Deadlines and announcementsPower and temperature definitions and fundamentals
Thermal analysisPower models for embedded systems
R(C) model
Partition into 3-D elements (diagram 2-D for simplicity)Thermal resistance ↔ Resistance
Heat flow ↔ CurrentFor dynamic: Heat capacity ↔ Capacitance
27 R. Dick EECS 507
Deadlines and announcementsPower and temperature definitions and fundamentals
Thermal analysisPower models for embedded systems
Problem definition
CdT(t)
dt= AT(t) + PU(t)
A is the thermal conductivity matrix
Steady-state: Initial temperature and C unnecessary
Dynamic: Transient temperature analysis, must also consider heatcapacity
28 R. Dick EECS 507
Deadlines and announcementsPower and temperature definitions and fundamentals
Thermal analysisPower models for embedded systems
Thermal analysis infrastructure overview
29 R. Dick EECS 507
Thermal analysis infrastructure overview
Thermal analysis infrastructure overview
Thermal analysis infrastructure overview
Deadlines and announcementsPower and temperature definitions and fundamentals
Thermal analysisPower models for embedded systems
Thermal analysis infrastructure overview
31 R. Dick EECS 507
Deadlines and announcementsPower and temperature definitions and fundamentals
Thermal analysisPower models for embedded systems
Steady-state thermal analysis
Basis: Multigrid analysis
Fast, multi-resolution relaxation method for matrix solving.
1 Iterative solver (relaxation) on fine grid.
2 Coarsen and propagate residual upward.
3 Iterative solver for error at coarser level.
4 Correct fine-grained solution based on coarse-grained error.
5 Iterative solver for error at fine level.
Main challenge: Too slow for repeated use on large structures,especially 3-D chip-package modeling.
Observation: Steepness of thermal gradients vary across IC.
32 R. Dick EECS 507
Deadlines and announcementsPower and temperature definitions and fundamentals
Thermal analysisPower models for embedded systems
Neighbor temperature difference histogram
100
101
102
0
2000
4000
6000
8000
10000
12000N
umbe
r of
ele
men
ts
Spatial adaptation can improve performance w.o. loss of accuracy.
33 R. Dick EECS 507
Deadlines and announcementsPower and temperature definitions and fundamentals
Thermal analysisPower models for embedded systems
Hybrid oct-tree
Reduce element count by merging when∆T < ε
Conventional oct-tree inefficient forchip-package model
Anisotropic thermal gradients
We generalize to hybrid oct-tree
Arbitrary partitioning on each axis
34 R. Dick EECS 507
Deadlines and announcementsPower and temperature definitions and fundamentals
Thermal analysisPower models for embedded systems
Hybrid oct-tree
1 2
3 4
7 8
6
3 4
4
2
8
1 2
10 4
7
6
4
4
2
1 2
4
713
6
4
4
2
13
9
11 12
109
11 12
9
9
15
15 16
16
1414
0
3 4 5 6 7 81 2
11 129 10 13 14
15 16
Level 1Level 2Level 3
1313
1414
35 R. Dick EECS 507
Deadlines and announcementsPower and temperature definitions and fundamentals
Thermal analysisPower models for embedded systems
Outline
1. Deadlines and announcements
2. Power and temperature definitions and fundamentals
3. Thermal analysis
4. Power models for embedded systems
36 R. Dick EECS 507
Deadlines and announcementsPower and temperature definitions and fundamentals
Thermal analysisPower models for embedded systems
General case
Many components.
Each may have many power management/activity states.
System-wide power consumption depends on the specific combination ofcomponent states.
How many samples?
10 components.
5 states, each.
510 ' 10-million system-wide states.
How to get enough samples to characterize?
37 R. Dick EECS 507
Deadlines and announcementsPower and temperature definitions and fundamentals
Thermal analysisPower models for embedded systems
Independence assumption for embedded system powermodeling
What if a component’s power consumption were mostly independent of thepower management/activity states of other components?
How many samples?
10 components.
5 states, each.
50 samples of interest.
The assumption is often correct.
When it is not, can treat the two interdependent components as a singlecomponent.
38 R. Dick EECS 507
Deadlines and announcementsPower and temperature definitions and fundamentals
Thermal analysisPower models for embedded systems
Practical embedded system power estimation
1 For each component.1 Put all other components in lowest power state.2 Measure component power consumption in each state.
2 Can manually use measurements to build expression for system-widepower consumption.
3 Also works for incomplete sampling by using linear regression to find therelationship between each state variable and the system-wide powerconsumption.
39 R. Dick EECS 507
Deadlines and announcementsPower and temperature definitions and fundamentals
Thermal analysisPower models for embedded systems
Applying the power model
Estimate/measure the proportion of time each component spends in eachstate.
Sum the products of time proportions and component–state powerconsumptions to get system-wide average power consumption.
This is often inaccurate for instantaneous power consumption.
Not good for power supply provisioning or thermal design.
Often very accurate over timescales of minutes.
Good for battery lifespan estimation.
40 R. Dick EECS 507