![Page 1: DRASTIC: Dynamically Reconfigurable Architecture Systems ... · Filter peripheral FIR Filter IP core PRR ICAP port EPA constraints 'n' filters: DMA DPR core M S ICAP core frequency](https://reader034.vdocument.in/reader034/viewer/2022043012/5fac15df5bac12738515f412/html5/thumbnails/1.jpg)
DRASTIC: Dynamically Reconfigurable Architecture Systems for Time-varying
Image Constraints
Marios S. Pattichis
image and video Processing and Communications Laboratory (ivPCL)
Department of Electrical and Computer Engineering
University of New Mexico, Albuquerque, New Mexico
ivpcl.org
Based on collaborative research with Dr. Y. Jiang, Dr. D. Llamocca, Dr. A. Panayides, and Mr. C. Carranza.
Acknowledgment: This material is based upon work supported by the National Science Foundation under NSF AWD CNS1422031.
![Page 2: DRASTIC: Dynamically Reconfigurable Architecture Systems ... · Filter peripheral FIR Filter IP core PRR ICAP port EPA constraints 'n' filters: DMA DPR core M S ICAP core frequency](https://reader034.vdocument.in/reader034/viewer/2022043012/5fac15df5bac12738515f412/html5/thumbnails/2.jpg)
Talk Outline
• Motivation
• Related work
• Video communication examples
• Video analysis examples
• Discrete Periodic Radon Transform (DPRT)
• Conclusion
![Page 3: DRASTIC: Dynamically Reconfigurable Architecture Systems ... · Filter peripheral FIR Filter IP core PRR ICAP port EPA constraints 'n' filters: DMA DPR core M S ICAP core frequency](https://reader034.vdocument.in/reader034/viewer/2022043012/5fac15df5bac12738515f412/html5/thumbnails/3.jpg)
Motivation: Video Compression
HDTV video bandwidth requirements:
• Has interlaced and progressive modes.
• 720p: progressive, 1280x720 pixels, 60 frames per second. Raw BW (24 bits/pixel): 1.3 Gbps
• 1080i: interlaced encoding, 1920x1080 pixels, 25 frames per second. Raw BW (24 bits/pixel): 1.2 Gbps
• 1080p: progressive, 1920x1080 pixels, 59.94 frames per second. Raw BW (24 bits/pixel): 2.98 Gbps
Ultra High Definition (hypothetical frame rates):
• 4K UHD: 3840x2160 (2160p, 16:9): 24 bits/pixel @ 30 fps: 5.56 Gbps. 4096x2048 (4K x 2K), 4096x2160 (1.9:1), 4096x2304 (16:9), 4096x3072: 24 bits/pixel @ 120 fps: 33.75 Gbps.
• 8K UHD: 7680x4320 (4320p): 24 bits/pixel @ 30 fps: 22.24 Gbps. 8192x4096, 8192x4320: 24 bits/pixel @ 120 fps: 94.92 Gbps.
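The raw bandwidth figures above follow from width x height x bits per pixel x frame rate. The sketch below is only an illustration of that arithmetic; the function name `raw_bitrate_bps` is made up for this example. It also prints the result divided by 2**30, because the UHD figures on the slide appear to match that divisor while the HDTV figures match a divisor of 10**9.

```python
def raw_bitrate_bps(width, height, fps, bits_per_pixel=24):
    """Uncompressed video bit rate in bits per second."""
    return width * height * bits_per_pixel * fps


GIGA = 10**9  # decimal prefix
GIBI = 2**30  # binary prefix

# (width, height, frames per second), taken from the slide
formats = {
    "720p @ 60": (1280, 720, 60),
    "1080p @ 59.94": (1920, 1080, 59.94),
    "4K UHD @ 30": (3840, 2160, 30),
    "8192x4320 @ 120": (8192, 4320, 120),
}

for name, (w, h, fps) in formats.items():
    bps = raw_bitrate_bps(w, h, fps)
    print(f"{name}: {bps / GIGA:.2f} (per 10**9), {bps / GIBI:.2f} (per 2**30)")
```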
![Page 4: DRASTIC: Dynamically Reconfigurable Architecture Systems ... · Filter peripheral FIR Filter IP core PRR ICAP port EPA constraints 'n' filters: DMA DPR core M S ICAP core frequency](https://reader034.vdocument.in/reader034/viewer/2022043012/5fac15df5bac12738515f412/html5/thumbnails/4.jpg)
Mobile Comm. Networks Data Transfer Rates
| Type | Theoretical Transfer Rates | Typical Transfer Rates |
|---|---|---|
| 2G-GSM (early 1990s) | 9.6 – 115 kbps | About 10 kbps |
| 2.5G-GPRS (2001) | 9.6 – 171.2 kbps | Between 30-50 kbps |
| 2.5G-EDGE (2003) | 9.6 – 384 kbps | Between 75-135 kbps |
| 3G-UMTS (2001) | 144 kbps – 2 Mbps | Between 220-384 kbps |
| 3.5G-HSPA (Rel. 7) (HSDPA, Rel. 5, 2005) (HSUPA, Rel. 6, 2008) | DL: 14 Mbps, UL: 5.8 Mbps | DL: 1-4 Mbps, UL: 500 kbps - 2 Mbps |
| 3.5G-Mobile WiMAX (IEEE 802.16e, 2005) | DL: 46 Mbps, UL: 5.6 Mbps | As for 3.5G-HSPA |
| 4G-LTE-Advanced (Rel. 10, Oct. 2010) | DL: 1 Gbps, UL: 100 Mbps | N/A: See below |
| 4G-WirelessMAN-Advanced (IEEE 802.16m, Oct. 2010) | DL: 1 Gbps, UL: 100 Mbps | N/A: See below |
Refer to slides 159-188 from “4G …” from http://www.4gamericas.org/
![Page 5: DRASTIC: Dynamically Reconfigurable Architecture Systems ... · Filter peripheral FIR Filter IP core PRR ICAP port EPA constraints 'n' filters: DMA DPR core M S ICAP core frequency](https://reader034.vdocument.in/reader034/viewer/2022043012/5fac15df5bac12738515f412/html5/thumbnails/5.jpg)
Mobile Communication Networks Evolution
![Page 6: DRASTIC: Dynamically Reconfigurable Architecture Systems ... · Filter peripheral FIR Filter IP core PRR ICAP port EPA constraints 'n' filters: DMA DPR core M S ICAP core frequency](https://reader034.vdocument.in/reader034/viewer/2022043012/5fac15df5bac12738515f412/html5/thumbnails/6.jpg)
Video Compression Rates
3G UMTS (2001) max typical = 384 kbps
• HDTV 720p: 1.3 Gbps / 384 kbps = 3,549
• UHD 8192x4320 (hyp): 94.92 Gbps / 384 kbps = 251,058
4G-LTE-Advanced @ max theoretical upload = 100 Mbps
• HDTV 720p: 1.3 Gbps / 100 Mbps = 13.3
• UHD 8192x4320 (hyp): 94.92 Gbps / 100 Mbps = 972
CR for HEVC studies using ultrasound videos:
• 720x576 (4CIF) (8 bits/pixel, yuv420 @ 25 fps) = 81.1 Mbps
• HEVC encoding using QP 36 and the x265 ultra-fast profile: PSNR: 32 dB, 364 kbps, compression ratio = 223
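The required compression ratio is the raw bit rate divided by the channel (or encoded) bit rate. The following sketch reproduces several of the ratios above; `compression_ratio` is a name chosen for illustration. The use of binary prefixes (1 Gbps = 1024 Mbps = 1024 x 1024 kbps) for the 720p and LTE-Advanced rows is an assumption, made because it matches the slide's 3,549, 13.3, and 972; the HEVC ratio matches decimal prefixes.

```python
def compression_ratio(raw_bps, target_bps):
    """How many times the raw stream must shrink to fit the target rate."""
    return raw_bps / target_bps


K2, M2, G2 = 2**10, 2**20, 2**30   # binary prefixes (assumption, see lead-in)
K10, M10 = 10**3, 10**6            # decimal prefixes

cr_720p_3g = compression_ratio(1.3 * G2, 384 * K2)
cr_720p_lte = compression_ratio(1.3 * G2, 100 * M2)
cr_uhd_lte = compression_ratio(94.92 * G2, 100 * M2)
cr_hevc = compression_ratio(81.1 * M10, 364 * K10)

print(int(cr_720p_3g), round(cr_720p_lte, 1), round(cr_uhd_lte), round(cr_hevc))
```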
![Page 7: DRASTIC: Dynamically Reconfigurable Architecture Systems ... · Filter peripheral FIR Filter IP core PRR ICAP port EPA constraints 'n' filters: DMA DPR core M S ICAP core frequency](https://reader034.vdocument.in/reader034/viewer/2022043012/5fac15df5bac12738515f412/html5/thumbnails/7.jpg)
Still Image Criteria for Medical Video
| Score | Plaque Motion | Stenosis | Plaque Morphology |
|---|---|---|---|
| 5 | plaque(s) motion(s) in transmitted video identifiable as in original | degree of stenosis in transmitted video determined as in original | plaque morphology in transmitted video is the same as in original |
| 4 | plaque(s) motion(s) in transmitted video has artefacts that do not compromise diagnosis | enough clinical data to determine degree of stenosis | Some artefacts are seen that do not compromise morphology visualization |
| 3 | plaque(s) motion(s) artefacts that can compromise diagnosis | clinical data only allow approximation of degree of stenosis | Artefacts may compromise morphology visualization |
| 2 | plaque(s) motion(s) artefacts that significantly limit diagnosis | very limited ability to estimate degree of stenosis | Significantly limit diagnosis |
| 1 | Not visible | not determinable | Not visible |
Assessed by humans for atherosclerotic plaque ultrasound.
![Page 8: DRASTIC: Dynamically Reconfigurable Architecture Systems ... · Filter peripheral FIR Filter IP core PRR ICAP port EPA constraints 'n' filters: DMA DPR core M S ICAP core frequency](https://reader034.vdocument.in/reader034/viewer/2022043012/5fac15df5bac12738515f412/html5/thumbnails/8.jpg)
Motivation: Multi-objective Opt
Provide Constraints for better control on Image and Video Compression
Power
Bitrate
Quality
![Page 9: DRASTIC: Dynamically Reconfigurable Architecture Systems ... · Filter peripheral FIR Filter IP core PRR ICAP port EPA constraints 'n' filters: DMA DPR core M S ICAP core frequency](https://reader034.vdocument.in/reader034/viewer/2022043012/5fac15df5bac12738515f412/html5/thumbnails/9.jpg)
Motivation: Multi-obj (more)
• Adaptive accuracy based on changes in the video
• Real-time video communications performance
• Fast image processing with limited computational resources (e.g., scalable DPRT)
• Large datasets with limited resources
![Page 10: DRASTIC: Dynamically Reconfigurable Architecture Systems ... · Filter peripheral FIR Filter IP core PRR ICAP port EPA constraints 'n' filters: DMA DPR core M S ICAP core frequency](https://reader034.vdocument.in/reader034/viewer/2022043012/5fac15df5bac12738515f412/html5/thumbnails/10.jpg)
Video Processing System
[Block diagram. Labelled elements:
• DYNAMIC DIGITAL SYSTEM: VIDEO INPUT, STATIC HW components, DYNAMIC HW components, VIDEO OUTPUT, EXTERNAL CONSTRAINTS
• CONTROL BLOCK: MODEL BUILDING, Pareto Front Prediction Model, CONSTRAINTS, Realization Selector, Parameters, Parameter history, Multi-objective history, Inputs Measurements, Outputs Measurements, Estimated objectives
• MEMORY: Pareto-optimal HW realizations and estimated objectives, stored as n sets of bitstreams + frequency (hardware realizations Module 1 ... Module n with Objectives 1 ... Objectives n)]
![Page 11: DRASTIC: Dynamically Reconfigurable Architecture Systems ... · Filter peripheral FIR Filter IP core PRR ICAP port EPA constraints 'n' filters: DMA DPR core M S ICAP core frequency](https://reader034.vdocument.in/reader034/viewer/2022043012/5fac15df5bac12738515f412/html5/thumbnails/11.jpg)
Modes for Video Comm.
| Mode | Constraint Optimization Formulation |
|---|---|
| Max Im. Qual. | $\max Q$ subj. to: $(BPS \le BPS_{\max}) \;\&\; (DP \le DP_{\max})$ |
| Min Bitrate | $\min BPS$ subj. to: $(Q \ge Q_{\min}) \;\&\; (DP \le DP_{\max})$ |
| Min Dyn. Power | $\min DP$ subj. to: $(BPS \le BPS_{\max}) \;\&\; (Q \ge Q_{\min})$ |
| Typical Mode | $\max\; \alpha \cdot Q - \beta \cdot BPS - \gamma \cdot DP$ subj. to: $(Q \ge Q_{\min}) \;\&\; (DP \le DP_{\max}) \;\&\; (BPS \le BPS_{\max})$ |

We have the following objectives and bounds:
• DP: Dynamic Power, max avail. = $DP_{\max}$
• Q: Image Quality, min. acceptable = $Q_{\min}$
• BPS: Bits Per Sample, max avail. = $BPS_{\max}$
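As an illustration of how the four modes could be evaluated over a finite set of candidate realizations, here is a minimal sketch. The `Realization` class, the `select` function, the mode names, and the sample values are all hypothetical; the slides do not specify an implementation. Each mode filters the candidates by the active bounds and then optimizes its objective.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Realization:
    name: str
    q: float    # image quality (higher is better)
    bps: float  # bits per sample
    dp: float   # dynamic power


def select(realizations, mode, q_min=None, bps_max=None, dp_max=None,
           alpha=1.0, beta=1.0, gamma=1.0):
    """Return the best feasible realization for the mode, or None."""
    def feasible(r):
        return ((q_min is None or r.q >= q_min)
                and (bps_max is None or r.bps <= bps_max)
                and (dp_max is None or r.dp <= dp_max))

    # Every mode is written as a score to maximize.
    score = {
        "max_quality": lambda r: r.q,
        "min_bitrate": lambda r: -r.bps,
        "min_power": lambda r: -r.dp,
        "typical": lambda r: alpha * r.q - beta * r.bps - gamma * r.dp,
    }[mode]
    candidates = [r for r in realizations if feasible(r)]
    return max(candidates, key=score) if candidates else None


# Hypothetical sample realizations, for illustration only.
SAMPLES = [
    Realization("A", q=30.0, bps=0.5, dp=0.10),
    Realization("B", q=35.0, bps=1.0, dp=0.20),
    Realization("C", q=40.0, bps=2.0, dp=0.40),
]

print(select(SAMPLES, "max_quality", bps_max=1.5, dp_max=0.3).name)
```

Returning `None` when no candidate is feasible leaves it to the caller to decide whether to relax a bound or keep the current configuration.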
![Page 12: DRASTIC: Dynamically Reconfigurable Architecture Systems ... · Filter peripheral FIR Filter IP core PRR ICAP port EPA constraints 'n' filters: DMA DPR core M S ICAP core frequency](https://reader034.vdocument.in/reader034/viewer/2022043012/5fac15df5bac12738515f412/html5/thumbnails/12.jpg)
DRASTIC DCT: System
We have software/hardware control implementing DRASTIC modes using:
• SW: Adjustable Quantization Table (QF only)
• HW: Variable Zonal Coding
• HW: Adjustable Bitwidth of the DCT coefficients
![Page 13: DRASTIC: Dynamically Reconfigurable Architecture Systems ... · Filter peripheral FIR Filter IP core PRR ICAP port EPA constraints 'n' filters: DMA DPR core M S ICAP core frequency](https://reader034.vdocument.in/reader034/viewer/2022043012/5fac15df5bac12738515f412/html5/thumbnails/13.jpg)
DRASTIC DCT: DCT HW
Variable Zonal Coding:
• Fine control providing 8 configurations
• Output will be extended to 16 bits
• The full implementation is shown here; removing the red-highlighted regions gives the zonal=7 hardware mode.
[Figure: DCT hardware datapath, labelled "1D Filter (Y1)". Inputs p0 to p7 are 8-bit signed values in [-128, 127]; the datapath consists of sign-extension (ext) stages, add/subtract stages, and trim stages, with a ping-pong transpose memory between the two passes. Signal widths labelled in the figure include 8, 9, 11, 12, 14, 18, and 21 bits.]
![Page 14: DRASTIC: Dynamically Reconfigurable Architecture Systems ... · Filter peripheral FIR Filter IP core PRR ICAP port EPA constraints 'n' filters: DMA DPR core M S ICAP core frequency](https://reader034.vdocument.in/reader034/viewer/2022043012/5fac15df5bac12738515f412/html5/thumbnails/14.jpg)
DRASTIC DCT: Pareto Front
• QF: 5:5:100
• Zonal: 1:1:8
• Bitwidth: 2:1:9
Configurations = 1280
Pareto Optimal = 841
Pareto-optimal configurations are shown in red (based on median results over the LIVE image database).
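The configuration count follows from the three parameter ranges (20 x 8 x 8 = 1280). The sketch below enumerates them and shows a generic non-dominated filter of the kind that could extract a Pareto front from measured objectives. The `dominates` and `pareto_front` helpers and the sample points are assumptions for illustration; the measured objective values behind the 841 Pareto-optimal points are not part of this transcript.

```python
from itertools import product

qf_values = range(5, 101, 5)    # QF: 5:5:100
zonal_values = range(1, 9)      # Zonal: 1:1:8
bitwidth_values = range(2, 10)  # Bitwidth: 2:1:9

configs = list(product(qf_values, zonal_values, bitwidth_values))
print(len(configs))  # number of (QF, zonal, bitwidth) configurations


def dominates(a, b):
    """True if a is no worse than b in every objective and better in one.

    Objectives are tuples in which every entry is to be minimized, for
    example (dynamic power, bits per sample, -quality).
    """
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def pareto_front(points):
    """Keep the points that no other point dominates (O(n^2) scan)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical two-objective example.
print(pareto_front([(1, 5), (2, 2), (3, 3), (5, 1)]))
```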
![Page 15: DRASTIC: Dynamically Reconfigurable Architecture Systems ... · Filter peripheral FIR Filter IP core PRR ICAP port EPA constraints 'n' filters: DMA DPR core M S ICAP core frequency](https://reader034.vdocument.in/reader034/viewer/2022043012/5fac15df5bac12738515f412/html5/thumbnails/15.jpg)
DRASTIC DCT: Image Database
![Page 16: DRASTIC: Dynamically Reconfigurable Architecture Systems ... · Filter peripheral FIR Filter IP core PRR ICAP port EPA constraints 'n' filters: DMA DPR core M S ICAP core frequency](https://reader034.vdocument.in/reader034/viewer/2022043012/5fac15df5bac12738515f412/html5/thumbnails/16.jpg)
DRASTIC DCT: Mode Transition
![Page 17: DRASTIC: Dynamically Reconfigurable Architecture Systems ... · Filter peripheral FIR Filter IP core PRR ICAP port EPA constraints 'n' filters: DMA DPR core M S ICAP core frequency](https://reader034.vdocument.in/reader034/viewer/2022043012/5fac15df5bac12738515f412/html5/thumbnails/17.jpg)
Ongoing Work: HEVC
HEVC builds on the H.264/AVC video coding standard. H.264/AVC was initially developed during 1999-2003 and was further extended with scalable video coding (SVC) and multi-view video coding (MVC) during 2003-2009.
A formal joint Call for Proposals (CfP) on HEVC was issued by VCEG and MPEG in January 2010.
Motivation:
8K-UHD (8192 × 4320)
4K (4096 x 3072)
![Page 18: DRASTIC: Dynamically Reconfigurable Architecture Systems ... · Filter peripheral FIR Filter IP core PRR ICAP port EPA constraints 'n' filters: DMA DPR core M S ICAP core frequency](https://reader034.vdocument.in/reader034/viewer/2022043012/5fac15df5bac12738515f412/html5/thumbnails/18.jpg)
4. Ongoing Work: Intra-Prediction
![Page 19: DRASTIC: Dynamically Reconfigurable Architecture Systems ... · Filter peripheral FIR Filter IP core PRR ICAP port EPA constraints 'n' filters: DMA DPR core M S ICAP core frequency](https://reader034.vdocument.in/reader034/viewer/2022043012/5fac15df5bac12738515f412/html5/thumbnails/19.jpg)
HEVC Intra-Pred (Config)
![Page 20: DRASTIC: Dynamically Reconfigurable Architecture Systems ... · Filter peripheral FIR Filter IP core PRR ICAP port EPA constraints 'n' filters: DMA DPR core M S ICAP core frequency](https://reader034.vdocument.in/reader034/viewer/2022043012/5fac15df5bac12738515f412/html5/thumbnails/20.jpg)
HEVC Intra-Pred (Space)
![Page 21: DRASTIC: Dynamically Reconfigurable Architecture Systems ... · Filter peripheral FIR Filter IP core PRR ICAP port EPA constraints 'n' filters: DMA DPR core M S ICAP core frequency](https://reader034.vdocument.in/reader034/viewer/2022043012/5fac15df5bac12738515f412/html5/thumbnails/21.jpg)
Adaptive Real-time HEVC
This is an emergency video example.
• Adaptation in the maximum video quality mode, subject to real-time encoding (frame rate ≥ 25 fps) and the available 3G bandwidths.
• Bandwidth constraint changes from (1) BW ≤ 250 kbps to (2) BW ≤ 384 kbps (max 3G upload speed).
• Lower bandwidth: PSNR of 29.7 dB (33.88 fps @ 226 kbps).
• Second bandwidth: PSNR of 31.9 dB (29.29 fps @ 363.53 kbps).
• Encoding delay was 0.426 seconds with 6 core [email protected] GHz (WPP streams / pool / frames: 18 / 6 / 2 for x265 encoding).
![Page 22: DRASTIC: Dynamically Reconfigurable Architecture Systems ... · Filter peripheral FIR Filter IP core PRR ICAP port EPA constraints 'n' filters: DMA DPR core M S ICAP core frequency](https://reader034.vdocument.in/reader034/viewer/2022043012/5fac15df5bac12738515f412/html5/thumbnails/22.jpg)
4. Video Image Analysis
[Figure with panels (a), (b), (c).
• System block diagram on the FPGA: PPC/µBlaze processor, PLB bus, System ACE with CF card, memory, Ethernet MAC, and a filter peripheral containing the FIR Filter IP core in the PRR, iFIFO/oFIFO interface, DPR DMA core (M/S), ICAP core and ICAP port, frequency & PR control (clkfx, PR_done), and EPA constraints.
• Timelines (t1, t2, ..., t8, ...) showing which row/column filter occupies the PRR for each frame: one case keeps ROW 1 / COL 1 throughout, the other switches among ROW 1 to ROW 3 and COL 1 to COL 3.
• Bitstream storage: 'n' filters (Filter 1 ... Filter n, each a Row i / Col i pair) require '2n' bitstreams in memory; 'm' filterbanks (Filterbank 1 ... Filterbank m) require '2n*m' bitstreams in memory.]
![Page 23: DRASTIC: Dynamically Reconfigurable Architecture Systems ... · Filter peripheral FIR Filter IP core PRR ICAP port EPA constraints 'n' filters: DMA DPR core M S ICAP core frequency](https://reader034.vdocument.in/reader034/viewer/2022043012/5fac15df5bac12738515f412/html5/thumbnails/23.jpg)
Video Analysis: Filterbanks
[Figure with panels (a), (b), (c).
• 2D FIR filter: a row filter followed by a column filter. A: input image, B: user constraints, C: output frames. Parameters: N, NH, OB, NXr, NXc. Row i and Col i are loaded via DPR, and/or the frequency is changed through frequency control.
• DYNAMIC MANAGER: generates Pareto-optimal point POi (1 ≤ i ≤ n) and loads it into the 2D FIR filter. Flow:
  1. Consider new constraints based on image type, user input, or output.
  2. EPA constraint generated? If no, keep monitoring.
  3. If yes, look for a PO point that satisfies the EPA constraints.
  4. PO point exists? If no, return to step 1.
  5. If yes, load the <row i, col i> bitstreams into the 2D FIR filter via DPR, and/or modify the frequency i.
• Pareto Optimal Point POi in memory (panel c): Energy, Performance, Accuracy, pointer to bitstream row i, pointer to bitstream col i, Frequency.]
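The dynamic manager flow above (check for an EPA constraint, look for a Pareto-optimal point that satisfies it, then reconfigure) can be sketched as follows. This is a minimal illustration under stated assumptions: `POPoint`, `EPAConstraints`, and `dynamic_manager_step` are hypothetical names, bitstream pointers are represented by strings, and the first feasible point is returned because the slide does not say how ties are broken. The energy and PSNR values reuse the three accuracy levels from the adaptive accuracy slide; the performance, frequency, and file names are placeholders.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class POPoint:
    energy_mj: float      # energy per frame
    performance: float    # placeholder unit, e.g. frames per second
    accuracy_db: float    # PSNR
    row_bitstream: str    # stands in for "pointer to bitstream row i"
    col_bitstream: str    # stands in for "pointer to bitstream col i"
    frequency_mhz: float


@dataclass(frozen=True)
class EPAConstraints:
    max_energy_mj: Optional[float] = None
    min_performance: Optional[float] = None
    min_accuracy_db: Optional[float] = None


def satisfies(p: POPoint, c: EPAConstraints) -> bool:
    return ((c.max_energy_mj is None or p.energy_mj <= c.max_energy_mj)
            and (c.min_performance is None or p.performance >= c.min_performance)
            and (c.min_accuracy_db is None or p.accuracy_db >= c.min_accuracy_db))


def dynamic_manager_step(current: POPoint, points: Sequence[POPoint],
                         constraints: Optional[EPAConstraints]) -> POPoint:
    """One pass through the flowchart; returns the active PO point."""
    if constraints is None:          # no EPA constraint generated
        return current
    for p in points:                 # look for a PO point that satisfies it
        if satisfies(p, constraints):
            # A real system would load the row/col bitstreams via DPR
            # and/or change the clock frequency at this point.
            return p
    return current                   # no PO point exists: keep running as is


PO_POINTS = [
    POPoint(0.214, 30.0, 49.58, "row_low.bit", "col_low.bit", 100.0),
    POPoint(0.281, 30.0, 62.33, "row_med.bit", "col_med.bit", 100.0),
    POPoint(0.346, 30.0, 65.82, "row_high.bit", "col_high.bit", 100.0),
]
```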
![Page 24: DRASTIC: Dynamically Reconfigurable Architecture Systems ... · Filter peripheral FIR Filter IP core PRR ICAP port EPA constraints 'n' filters: DMA DPR core M S ICAP core frequency](https://reader034.vdocument.in/reader034/viewer/2022043012/5fac15df5bac12738515f412/html5/thumbnails/24.jpg)
Autom. Analysis: Adaptive Acc.
[Figure with panels (a) to (d).
• Plots over frames 0 to 300: difference of sum of absolute values vs. frame # (with a threshold line), and PSNR (dB) vs. frame #. Frames marked in the plots include # 30, 31, 90, 150, 185, 191, 215, 245, 260, and 270. Reported PSNR statistics: avg=49.65 (std=0.0977), avg=61.14 (std=0.1425), avg=23.56 (std=0.18), avg=66.18 (std=0.5568), avg=88.45 (std=0.0475).
• Energy per frame (mJ) vs. PSNR (dB) for the filter configurations N=20, NH=10, OB=8; N=32, NH=16, OB=16; N=24, NH=16, OB=16; N=8, NH=12, OB=8; N=24, NH=10, OB=16. Constraint values shown on energy per frame and accuracy (PSNR) include 0.3 mJ, 45 dB, and 65 dB.
• State diagram with three states: LOW ACCURACY (49.58 dB, 0.214 mJ), MEDIUM ACCURACY (62.33 dB, 0.281 mJ), HIGH ACCURACY (65.82 dB, 0.346 mJ). Transitions are triggered by runs of frames relative to the threshold T=0.01: 30 frames below, 10 frames above, 15 frames below, and 15 frames above (from START). If the conditions do not hold, stay in the current state.]
![Page 25: DRASTIC: Dynamically Reconfigurable Architecture Systems ... · Filter peripheral FIR Filter IP core PRR ICAP port EPA constraints 'n' filters: DMA DPR core M S ICAP core frequency](https://reader034.vdocument.in/reader034/viewer/2022043012/5fac15df5bac12738515f412/html5/thumbnails/25.jpg)
DPRT: Prime Directions
[Figure: grid of image points (marked X) with axes i and j, showing the prime directions (1,0), (1,1), (1,2), (1,3), (1,4), (1,5), (1,6), and (0,1). A second grid illustrates the direction (1,2).]
![Page 26: DRASTIC: Dynamically Reconfigurable Architecture Systems ... · Filter peripheral FIR Filter IP core PRR ICAP port EPA constraints 'n' filters: DMA DPR core M S ICAP core frequency](https://reader034.vdocument.in/reader034/viewer/2022043012/5fac15df5bac12738515f412/html5/thumbnails/26.jpg)
DPRT: New Architecture
[Figure with panels (a) to (g), for a 7 x 7 image f(i,j), written f0,0 ... f6,6.
• The image is held in a register array; seven 7-operand adder trees (adder tree_0 ... adder tree_6) sum each column to produce the projection R0(0), ..., R0(6).
• Circular left shifts CLS(1), ..., CLS(6) are applied to the rows (row registers D0 ... D6 are updated from the shifted positions). After one shift step the columns read f0,0 f1,1 ... f6,6; f0,1 f1,2 ... f6,0; and so on, and the same adder trees produce R1(0), ..., R1(6).
• Panel (g): a single projection value is the sum of seven samples,
$R(p,q) = \sum_{i=0}^{6} f\left(i, \langle q - i\,p \rangle_7\right)$.]
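The shift-and-add idea in the architecture above can be sketched in software: sum the columns, circularly left shift row i by i positions, and repeat. The code below is only an illustration, not the hardware design. It assumes the common DPRT definition R(m, d) = Σ_i f(i, ⟨d + m·i⟩_N) for 0 ≤ m < N plus R(N, d) = Σ_j f(d, j), with N prime; this matches the shifted columns shown in the figure, but the sign convention for the direction index may differ from the one used in panel (g). The function names are hypothetical.

```python
def dprt_direct(f):
    """DPRT from the definition (assumed convention, see lead-in)."""
    n = len(f)
    r = [[sum(f[i][(d + m * i) % n] for i in range(n)) for d in range(n)]
         for m in range(n)]
    r.append([sum(f[d]) for d in range(n)])  # last direction: row sums
    return r


def dprt_shift_add(f):
    """Same result using only column sums and circular row shifts."""
    n = len(f)
    rows = [list(row) for row in f]
    r = []
    for _ in range(n):
        # One n-operand sum per column gives one projection.
        r.append([sum(rows[i][d] for i in range(n)) for d in range(n)])
        # Circular left shift of row i by i positions.
        rows = [row[i:] + row[:i] for i, row in enumerate(rows)]
    r.append([sum(row) for row in f])
    return r


# Small 7 x 7 test image with arbitrary integer values.
N = 7
image = [[(3 * i + 5 * j + i * j) % 11 for j in range(N)] for i in range(N)]
projections = dprt_shift_add(image)
print(len(projections), len(projections[0]))  # N + 1 directions, N bins each
```

Every projection sums to the same total (the sum of all pixels), which gives a quick sanity check on either routine.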
![Page 27: DRASTIC: Dynamically Reconfigurable Architecture Systems ... · Filter peripheral FIR Filter IP core PRR ICAP port EPA constraints 'n' filters: DMA DPR core M S ICAP core frequency](https://reader034.vdocument.in/reader034/viewer/2022043012/5fac15df5bac12738515f412/html5/thumbnails/27.jpg)
Recent Work: Scalable FDPRT
[Figure: an N x N image divided into blocks of height H (Block 0, ..., Block K-1) that the hardware processes sequentially.]
It allows effective implementations based on different constraints on the hardware resources and image sizes (powers of 2 + prime).
Split by 2: 2N cycles to load + N for transpose.
![Page 28: DRASTIC: Dynamically Reconfigurable Architecture Systems ... · Filter peripheral FIR Filter IP core PRR ICAP port EPA constraints 'n' filters: DMA DPR core M S ICAP core frequency](https://reader034.vdocument.in/reader034/viewer/2022043012/5fac15df5bac12738515f412/html5/thumbnails/28.jpg)
Recent Work: Fast 2D Circular Convolution
| Cyclic convolution using | Computational Complexity |
|---|---|
| Definition | $O(N^4)$ |
| Discrete Fourier Transform | $O(N^2 \log_2 N + N^2)$ |
| FDPRT | $O(N + \lceil \log_2 N \rceil + N^2)$ |
| Scalable FDPRT | $O(\lceil N/2^h \rceil \cdot N + 2N + h + N^2)$ |
![Page 29: DRASTIC: Dynamically Reconfigurable Architecture Systems ... · Filter peripheral FIR Filter IP core PRR ICAP port EPA constraints 'n' filters: DMA DPR core M S ICAP core frequency](https://reader034.vdocument.in/reader034/viewer/2022043012/5fac15df5bac12738515f412/html5/thumbnails/29.jpg)
DPRT: Pareto-Optimal HW
[Figure: log-log plot of clock cycles (500 to 5,000,000) vs. resource usage in 1-bit D flip-flops (4,000 to 400,000 and beyond) for image size 251 x 251. Series: Serial, Systolic, SFDPRT (H=2, ..., 251), FDPRT, and the Pareto front (optimal). Labelled points, as (resources, cycles):
• (4056, 15939504)
• (516096, 63253)
• (1135022, 511)
• H=2: (6275, 32638)
• H=84: (385285, 1607)
• H=113: (517813, 1897)
• H=⌈251/2⌉: (567762, 1396)
• H=251: (1135022, 1266)
Two of the labelled points are marked "pareto optimal".]
![Page 30: DRASTIC: Dynamically Reconfigurable Architecture Systems ... · Filter peripheral FIR Filter IP core PRR ICAP port EPA constraints 'n' filters: DMA DPR core M S ICAP core frequency](https://reader034.vdocument.in/reader034/viewer/2022043012/5fac15df5bac12738515f412/html5/thumbnails/30.jpg)
Conclusion
• Example Applications Demonstrate Promise
• Approach can handle joint software-hardware optimization as well as software-only optimization
• Current research focused on automatic constraint generation and real-time Pareto-front estimation without the need to pre-compute over a training set