Static Memory Management for Efficient Mobile Sensing Applications
![Page 1: Static Memory Management for Efficient Mobile Sensing Applications](https://reader031.vdocument.in/reader031/viewer/2022030318/5a65849a7f8b9af13a8b4d07/html5/thumbnails/1.jpg)
University of Iowa | Mobile Sensing Laboratory
Static Memory Management for Efficient Mobile Sensing Applications
EMSOFT 2015
Farley Lai, Daniel Schmidt, Octav Chipara
Department of Computer Science
![Page 2: Static Memory Management for Efficient Mobile Sensing Applications](https://reader031.vdocument.in/reader031/viewer/2022030318/5a65849a7f8b9af13a8b4d07/html5/thumbnails/2.jpg)
• A class of applications that process continuous input data streams and may produce continuous output streams
– real-time processing
– efficient resource management
Emerging Mobile Sensing Applications
[Figure: example speaker identification pipeline with blocks Speech Recording, VAD, Feature Extraction, Speaker Identifier, Speaker Models, and HTTP Upload]
Introduction
Sensing Stream Processing
![Page 3: Static Memory Management for Efficient Mobile Sensing Applications](https://reader031.vdocument.in/reader031/viewer/2022030318/5a65849a7f8b9af13a8b4d07/html5/thumbnails/3.jpg)
• Workload: stream operations on frames of samples
– e.g., windowing, splitting, or appending
– stream operations tend to be memory-intensive
• Goal: implement stream operations efficiently
– reduce memory footprint
– reduce number of memory accesses
• Challenges:
– handle complex interactions between components
– avoid unnecessary memory copies
– enable data sharing between components
The Memory Management Challenge
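To make the cost of copies concrete, here is a minimal sketch of windowing implemented with views instead of copies. The function name `sliding_windows`, the frame contents, and the window sizes are hypothetical and chosen only for illustration; this is not the mechanism used in the talk, just one way to see what "avoiding unnecessary memory copies" can mean for a stream operation.

```python
from array import array


def sliding_windows(frame, size, hop):
    """Yield overlapping windows over `frame` as zero-copy memoryview slices."""
    view = memoryview(frame)
    for start in range(0, len(view) - size + 1, hop):
        # Slicing a memoryview creates a new view, not a copy of the samples.
        yield view[start:start + size]


# Hypothetical frame of 8 samples; windows of 4 samples with a hop of 2.
frame = array("h", range(8))
windows = list(sliding_windows(frame, size=4, hop=2))

# Writing through the underlying buffer is visible in every window that
# covers the sample, which shows the windows share storage with the frame.
frame[2] = 100
```

Because every window is a view, the overlapping samples are stored once, which reduces both the footprint and the number of memory writes compared with copying each window.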
![Page 4: Static Memory Management for Efficient Mobile Sensing Applications](https://reader031.vdocument.in/reader031/viewer/2022030318/5a65849a7f8b9af13a8b4d07/html5/thumbnails/4.jpg)
• Dynamic memory management
– specialized data structures to implement memory management
• e.g., SigSeg [Girod, et al. 2008] – linked list of buffered samples
– a level of indirection in accessing streaming data
• Static memory management
– no runtime overhead
– requires precise knowledge of the variable live ranges
• difficult to achieve in complex applications
• must be time-efficient to be included in compilers
Approaches to Memory Management
[Girod2008] L. Girod, Y. Mei, R. Newton, S. Rost, A. Thiagarajan, H. Balakrishnan, and S. Madden, “XStream: a Signal-Oriented Data Stream Management System,” in ICDE, 2008.
![Page 5: Static Memory Management for Efficient Mobile Sensing Applications](https://reader031.vdocument.in/reader031/viewer/2022030318/5a65849a7f8b9af13a8b4d07/html5/thumbnails/5.jpg)
• Application model
• Static analysis
• Memory layout
• Evaluation
• Conclusions
Outline
![Page 6: Static Memory Management for Efficient Mobile Sensing Applications](https://reader031.vdocument.in/reader031/viewer/2022030318/5a65849a7f8b9af13a8b4d07/html5/thumbnails/6.jpg)
• StreamIt – synchronous data flow (SDF) language
– application = graph of filters connected with FIFO channels
• limited memory operations: pop(), peek(), and push()
• known consumption and production rates
A Model for Stream Applications
[Figure: Filter::work() reads its INPUT channel with pop and peek and writes its OUTPUT channel with push]
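The filter model can be sketched in a few lines. The `Channel` class and the `moving_average_work` function below are hypothetical stand-ins written in Python, not StreamIt code; they only illustrate how a work function with fixed rates (here peek 3, pop 1, push 1) touches its FIFO channels.

```python
from collections import deque


class Channel:
    """FIFO channel between two filters (hypothetical helper for this sketch)."""

    def __init__(self, items=()):
        self._items = deque(items)

    def push(self, value):
        self._items.append(value)

    def pop(self):
        return self._items.popleft()

    def peek(self, index):
        # Read the element `index` positions from the head without removing it.
        return self._items[index]

    def __len__(self):
        return len(self._items)


def moving_average_work(inp, out, taps=3):
    """One firing of a filter with rates peek=taps, pop=1, push=1."""
    total = sum(inp.peek(i) for i in range(taps))
    out.push(total / taps)
    inp.pop()


inp = Channel([3.0, 6.0, 9.0, 12.0])
out = Channel()
# The filter may fire while at least `taps` elements are available to peek.
while len(inp) >= 3:
    moving_average_work(inp, out)
```

Since the rates are constants, the number of elements read and written per firing is known before the program runs, which is what makes a static analysis of the channels possible.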
![Page 7: Static Memory Management for Efficient Mobile Sensing Applications](https://reader031.vdocument.in/reader031/viewer/2022030318/5a65849a7f8b9af13a8b4d07/html5/thumbnails/7.jpg)
• StreamIt – synchronous data flow language
– applications are constructed hierarchically
• pipeline of streams
• split and joins (splitter and joiner)
– pass-by-value semantics
• a naïve implementation would incur a significant number of copies
A Model for Stream Applications
[Figure: example stream graph with Source, Duplicate splitter, LPF1, LPF2, Round-Robin joiner, Subtract, and Sink]
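A rough way to see the cost of pass-by-value semantics is to count the element copies made by a naïve duplicate splitter and round-robin joiner. The functions below are hypothetical illustrations in Python (identity filters stand in for LPF1 and LPF2); they are not how StreamIt or ESMS implement these components.

```python
def duplicate_split(frame, branches):
    """Naive duplicate splitter: every branch receives its own copy."""
    copies = [list(frame) for _ in range(branches)]
    return copies, len(frame) * branches  # second value: elements copied


def round_robin_join(inputs):
    """Naive round-robin joiner: interleave one element from each input."""
    joined = []
    for group in zip(*inputs):
        joined.extend(group)
    return joined, len(joined)  # second value: elements copied


frame = [1, 2, 3, 4]
branch_inputs, split_copies = duplicate_split(frame, branches=2)
# Identity filters stand in for LPF1 and LPF2 to keep the example short.
joined, join_copies = round_robin_join(branch_inputs)
total_copies = split_copies + join_copies
```

In this toy run a 4-element frame causes 16 element copies even though no filter changes any value, which is the kind of overhead the splitter and joiner introduce on their own.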
![Page 8: Static Memory Management for Efficient Mobile Sensing Applications](https://reader031.vdocument.in/reader031/viewer/2022030318/5a65849a7f8b9af13a8b4d07/html5/thumbnails/8.jpg)
• SDFs may be executed in a cyclo-static schedule
– the complete memory behavior of the program may be observed within one execution of the schedule
• Our solution: static analysis + memory layout
Insight
[Figure: example stream graph with Source, Duplicate splitter, LPF1, LPF2, Round-Robin joiner, Subtract, and Sink, annotated with its schedule]
INIT PHASE: Source,3 DUP,3 LPF1,1 LPF2,1 RR,1 Sub,1 Sink
STEADY PHASE: Source,1 DUP,1 LPF1,1 LPF2,1 RR,1 Sub,1 Sink
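The insight can be checked by simulating the schedule and tracking channel occupancy. The sketch below is an assumption-laden illustration: the slides give only the firing counts, so all push, pop, and peek rates are hypothetical (the low-pass filters are assumed to peek 3 elements, and Sink is assumed to fire once per phase). It reads the schedule as an init phase in which Source and DUP fire 3 times, followed by a steady phase in which every component fires once.

```python
# Channels of the example graph as (producer, consumer) -> rates.
# All rates are assumptions made for this sketch; the slides do not state them.
CHANNELS = {
    ("Source", "DUP"): {"push": 1, "pop": 1, "peek": 1},
    ("DUP", "LPF1"): {"push": 1, "pop": 1, "peek": 3},
    ("DUP", "LPF2"): {"push": 1, "pop": 1, "peek": 3},
    ("LPF1", "RR"): {"push": 1, "pop": 1, "peek": 1},
    ("LPF2", "RR"): {"push": 1, "pop": 1, "peek": 1},
    ("RR", "Sub"): {"push": 2, "pop": 2, "peek": 2},
    ("Sub", "Sink"): {"push": 1, "pop": 1, "peek": 1},
}

INIT = [("Source", 3), ("DUP", 3), ("LPF1", 1), ("LPF2", 1),
        ("RR", 1), ("Sub", 1), ("Sink", 1)]
STEADY = [("Source", 1), ("DUP", 1), ("LPF1", 1), ("LPF2", 1),
          ("RR", 1), ("Sub", 1), ("Sink", 1)]


def fire(actor, tokens, peak):
    """Fire `actor` once, updating channel occupancy and peak occupancy."""
    for (src, dst), rate in CHANNELS.items():
        if dst == actor:
            assert tokens[(src, dst)] >= rate["peek"], f"{actor} cannot fire"
            tokens[(src, dst)] -= rate["pop"]
    for (src, dst), rate in CHANNELS.items():
        if src == actor:
            tokens[(src, dst)] += rate["push"]
            peak[(src, dst)] = max(peak[(src, dst)], tokens[(src, dst)])


def run(schedule, tokens, peak):
    """Execute every (actor, count) entry of a schedule in order."""
    for actor, count in schedule:
        for _ in range(count):
            fire(actor, tokens, peak)


tokens = {edge: 0 for edge in CHANNELS}
peak = {edge: 0 for edge in CHANNELS}
run(INIT, tokens, peak)
after_init = dict(tokens)
run(STEADY, tokens, peak)
after_steady = dict(tokens)
```

Under these assumed rates the occupancy after the steady phase equals the occupancy after the init phase, so repeating the steady phase never changes the picture, and `peak` bounds the buffer each channel needs.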
![Page 9: Static Memory Management for Efficient Mobile Sensing Applications](https://reader031.vdocument.in/reader031/viewer/2022030318/5a65849a7f8b9af13a8b4d07/html5/thumbnails/9.jpg)
• Location Sharing
– an output element is pushed from an unmodified input element
– each I/O element is associated with a pop/push index
• Temporal Sharing
– an output element reuses the input element storage
– each I/O element is associated with a live range [i, j]
• Builds on abstract interpretation
– build a Control-Flow Graph (CFG) for each filter
– abstract interpretation of memory operations
Component Analysis
![Page 10: Static Memory Management for Efficient Mobile Sensing Applications](https://reader031.vdocument.in/reader031/viewer/2022030318/5a65849a7f8b9af13a8b4d07/html5/thumbnails/10.jpg)
• Abstract interpretation of memory operations
– memory counter (MC) – relative order of operation
– indexes of current push (out) and pop (in)
– live range for each input (LIN) and output (LOUT) element
• Indexes and live ranges represented as intervals
• Subset of rules for determining live ranges:
Component Analysis
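$$\frac{(MC,\ out,\ \mathit{LOUT})}{\mathit{LOUT}[out] \leftarrow \mathit{LOUT}[out] \sqcup MC,\quad out{+}{+},\quad MC{+}{+}}\ \text{push}$$

$$\frac{(MC,\ in,\ \mathit{LIN})}{\mathit{LIN}[in] \leftarrow \mathit{LIN}[in] \sqcup MC,\quad in{+}{+},\quad MC{+}{+}}\ \text{pop}$$

$$\frac{(MC_1,\ in_1,\ out_1) \qquad (MC_2,\ in_2,\ out_2)}{(MC = \max(MC_1, MC_2),\ in = in_1 \sqcup in_2,\ out = out_1 \sqcup out_2)}\ \text{join}$$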
![Page 11: Static Memory Management for Efficient Mobile Sensing Applications](https://reader031.vdocument.in/reader031/viewer/2022030318/5a65849a7f8b9af13a8b4d07/html5/thumbnails/11.jpg)
Example of Component Analysis
Step 1: pop
RULE (pop): (MC, LIN, in) → LIN[in] ⊔ MC, in++, MC++
STATE before: MC = 0, in = [0,0], out = [0,0]
STATE after: MC = 1, in = [1,1], out = [0,0]
Update: LIN[0] = LIN[0] ⊔ [0,0]
Live ranges: LIN[0] = [0,0], LOUT[0] = ∅, LOUT[1] = ∅
[Figure: CFG of the example filter]
![Page 12: Static Memory Management for Efficient Mobile Sensing Applications](https://reader031.vdocument.in/reader031/viewer/2022030318/5a65849a7f8b9af13a8b4d07/html5/thumbnails/12.jpg)
Example of Component Analysis
Step 2: push
RULE (push): (MC, LOUT, out) → LOUT[out] ⊔ MC, out++, MC++
STATE before: MC = 1, in = [1,1], out = [0,0]
STATE after: MC = 2, in = [1,1], out = [1,1]
Update: LOUT[0] = LOUT[0] ⊔ [1,1]
Live ranges: LIN[0] = [0,0], LOUT[0] = [1,1], LOUT[1] = ∅
[Figure: CFG of the example filter]
![Page 13: Static Memory Management for Efficient Mobile Sensing Applications](https://reader031.vdocument.in/reader031/viewer/2022030318/5a65849a7f8b9af13a8b4d07/html5/thumbnails/13.jpg)
Example of Component Analysis
Step 3: join
RULE (join): (MC1, in1, out1), (MC2, in2, out2) → (MC = max(MC1, MC2), in = in1 ⊔ in2, out = out1 ⊔ out2)
STATE of first branch: MC = 1, in = [1,1], out = [0,0]
STATE of second branch: MC = 2, in = [1,1], out = [1,1]
STATE after join: MC = 2, in = [1,1], out = [0,1]
Live ranges: LIN[0] = [0,0], LOUT[0] = [1,1], LOUT[1] = ∅
[Figure: CFG of the example filter]
![Page 14: Static Memory Management for Efficient Mobile Sensing Applications](https://reader031.vdocument.in/reader031/viewer/2022030318/5a65849a7f8b9af13a8b4d07/html5/thumbnails/14.jpg)
Example of Component Analysis
Step 4: push
RULE (push): (MC, LOUT, out) → LOUT[out] ⊔ MC, out++, MC++
STATE before: MC = 2, in = [1,1], out = [0,1]
STATE after: MC = 3, in = [1,1], out = [1,2]
Update: LOUT[0,1] = LOUT[0,1] ⊔ [2,2]
Live ranges: LIN[0] = [0,0], LOUT[0] = [1,1], LOUT[0,1] = [2,2]
[Figure: CFG of the example filter]
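The pop, push, and join rules and the example states above can be replayed with a small interpreter. This is a minimal sketch under stated assumptions: intervals are Python tuples, `None` stands for ∅, the LIN and LOUT maps are keyed by index intervals, and the filter body (`pop; if (...) push; push`) is inferred from the example states rather than given in the slides. All function names are hypothetical.

```python
def join(a, b):
    """Interval join (⊔): smallest interval covering both; None is empty (∅)."""
    if a is None:
        return b
    if b is None:
        return a
    return (min(a[0], b[0]), max(a[1], b[1]))


def shift(interval, delta=1):
    """Increment both bounds of an index interval (the `++` in the rules)."""
    return (interval[0] + delta, interval[1] + delta)


def pop(state, lin):
    mc, inp, out = state
    lin[inp] = join(lin.get(inp), (mc, mc))  # LIN[in] ⊔ MC
    return (mc + 1, shift(inp), out)         # in++, MC++


def push(state, lout):
    mc, inp, out = state
    lout[out] = join(lout.get(out), (mc, mc))  # LOUT[out] ⊔ MC
    return (mc + 1, inp, shift(out))           # out++, MC++


def join_states(s1, s2):
    """Merge the states of two CFG branches."""
    return (max(s1[0], s2[0]), join(s1[1], s2[1]), join(s1[2], s2[2]))


# Assumed filter body, inferred from the example states: pop; if (...) push; push
lin, lout = {}, {}
s0 = (0, (0, 0), (0, 0))          # MC = 0, in = [0,0], out = [0,0]
s1 = pop(s0, lin)
s2 = push(s1, lout)               # branch where the conditional push runs
s3 = join_states(s1, s2)          # merge with the branch that skips it
s4 = push(s3, lout)
```

After the join, the output index is the interval [0,1] because the analysis no longer knows whether the conditional push happened, so the final push records its live range against that whole interval.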
![Page 15: Static Memory Management for Efficient Mobile Sensing Applications](https://reader031.vdocument.in/reader031/viewer/2022030318/5a65849a7f8b9af13a8b4d07/html5/thumbnails/15.jpg)
• Component analysis constructs a memory fragment
– captures live ranges for temporal reuse
– captures location sharing edges
• Whole program analysis constructs a memory graph
– stitches together memory fragments
– simulates the schedule to
• connect location sharing edges into paths and
• extend live ranges with the phase number and invocation index
• Our approach:
– analysis is precise when there is no input dependency
– otherwise, it is a sound approximation
Whole Program Analysis
![Page 16: Static Memory Management for Efficient Mobile Sensing Applications](https://reader031.vdocument.in/reader031/viewer/2022030318/5a65849a7f8b9af13a8b4d07/html5/thumbnails/16.jpg)
• Empirical insights
– split-joins that only manipulate location-shared elements can be eliminated
– a filter usually can reuse its input memory
• Heuristic approaches to resolving temporal reuse conflicts
Memory Layout
[Figure: memory layouts of components A and B in three cases: No conflict, Append on Conflict (AoC), and Insert-in-Place (IP); legend: A, B, other comps, A memory, B memory]
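To illustrate how live ranges drive temporal reuse, the sketch below assigns offsets with a generic first-fit heuristic: two buffers may share addresses only if their live ranges do not overlap. This is neither AoC nor IP as presented in the talk; it is a plain first-fit stand-in, and the buffer names, sizes, and live ranges are hypothetical.

```python
def overlaps(a, b):
    """True if closed intervals a = (lo, hi) and b = (lo, hi) intersect."""
    return a[0] <= b[1] and b[0] <= a[1]


def first_fit_layout(buffers):
    """Assign an offset to each buffer so that buffers whose live ranges
    overlap never share addresses. `buffers` maps name -> (size, live_range)."""
    placed = {}  # name -> (offset, size, live_range)
    for name, (size, live) in buffers.items():
        # Address ranges already taken by buffers that are live at the same time.
        busy = sorted((off, off + sz) for off, sz, lv in placed.values()
                      if overlaps(live, lv))
        offset = 0
        for start, end in busy:
            if offset + size <= start:
                break  # the gap before this busy range is large enough
            offset = max(offset, end)
        placed[name] = (offset, size, live)
    return {name: off for name, (off, _, _) in placed.items()}


# Hypothetical buffers: (size in elements, live range in memory-counter steps).
buffers = {
    "A": (4, (0, 3)),
    "B": (4, (2, 5)),   # overlaps A in time, so it needs separate storage
    "C": (4, (4, 7)),   # A is dead by step 4, so C can reuse A's storage
}
layout = first_fit_layout(buffers)
footprint = max(layout[n] + buffers[n][0] for n in buffers)
```

Here the three 4-element buffers fit in 8 elements instead of 12 because C reuses the storage of A once A's live range has ended.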
![Page 17: Static Memory Management for Efficient Mobile Sensing Applications](https://reader031.vdocument.in/reader031/viewer/2022030318/5a65849a7f8b9af13a8b4d07/html5/thumbnails/17.jpg)
• Intel x86_64 on Mac OS X 10.10.3
– 3GHz Intel Xeon CPU E5-1680 v2
– 32KB L1 instruction + 32KB L1 data caches
– 256KB L2 + 25MB L3 caches
• StreamIt Compiler
– baseline default settings without optimizations
– enabled cache optimizations with –cacheopt
– gcc -O3 to compile generated C/C++ code
• 11 micro benchmarks from StreamIt
• 3 macro benchmarks from real mobile sensing applications (MSAs)
– BeepBeep [Peng, C., et al. 2007]
– MFCC and Crowd [Xu, C., et al. 2013]
Experimental Setup
Evaluation
![Page 18: Static Memory Management for Efficient Mobile Sensing Applications](https://reader031.vdocument.in/reader031/viewer/2022030318/5a65849a7f8b9af13a8b4d07/html5/thumbnails/18.jpg)
– ESMS reduces both channel buffer sizes and the number of memory operations from splitters, joiners, and reordering filters
Memory Usage on Intel x86_64
45% to 96% reductions
73% reduction on average
![Page 19: Static Memory Management for Efficient Mobile Sensing Applications](https://reader031.vdocument.in/reader031/viewer/2022030318/5a65849a7f8b9af13a8b4d07/html5/thumbnails/19.jpg)
– Compared with baseline StreamIt
– The average speedups of AA, AoC, and IP are 3, 3.1, and 3, while the average speedup of CacheOpt is merely 1.07.
– ESMS improves performance by eliminating unnecessary memory operations and reducing cache/memory references.
Speedup on Intel x86_64
![Page 20: Static Memory Management for Efficient Mobile Sensing Applications](https://reader031.vdocument.in/reader031/viewer/2022030318/5a65849a7f8b9af13a8b4d07/html5/thumbnails/20.jpg)
• Static memory management is effective for stream languages
– whole program memory behaviors may be characterized
– both location and temporal sharing opportunities are exploited
– performance improvement due to fewer memory operations and references
• ESMS provides significant performance improvements
– 45% to 96% data size reduction
– 73% code size reduction
– 3X speedup
Conclusions
![Page 21: Static Memory Management for Efficient Mobile Sensing Applications](https://reader031.vdocument.in/reader031/viewer/2022030318/5a65849a7f8b9af13a8b4d07/html5/thumbnails/21.jpg)
• National Science Foundation (NeTs grant #1144664)
• Carver Foundation (grant #14-43555)
Acknowledgements
CSense Toolkit
![Page 22: Static Memory Management for Efficient Mobile Sensing Applications](https://reader031.vdocument.in/reader031/viewer/2022030318/5a65849a7f8b9af13a8b4d07/html5/thumbnails/22.jpg)
Questions?
Thank You