[doi 10.1145%2f196244.196367] gupta, pravil; chen, chih-tung; desouza-batista, j. c.; parker, --...
Experience with Image Compression Chip Design using Unified System Construction Tools

Pravil Gupta, Chih-Tung Chen, J. C. DeSouza-Batista†, and Alice C. Parker
Department of Electrical Engineering-Systems
University of Southern California
EEB 300, MC 2562
Los Angeles, CA 90089-2562, USA

Abstract: This paper describes the use of Unified System Construction tools under development at the University of Southern California. The goal of the project is to automate the construction of heterogeneous application-specific systems. Key elements of the USC system include multiprocessor synthesis, multi-chip datapath synthesis, memory-intensive synthesis and multi-chip partitioning. The tools were applied to the design of an image compression chip set, and results of using these tools are reported here. Our results are comparable to manual designs reported in the literature.

1 Introduction

Communications, entertainment and other electronic systems are in widespread use. These systems are generally multi-chip, heterogeneous and application-specific. Chip-level synthesis tools are invaluable for the rapid production of such systems, and such tools are becoming available for general use. System-level tools can also be used to significantly increase a designer's ability to meet a schedule along with a set of performance and cost constraints, but only a few of these tools have been available in the past.

The Unified System Construction (USC) project at the University of Southern California involves the production of an integrated set of system-level tools for synthesizing multi-chip, heterogeneous, application-specific systems which meet cost, performance and power constraints. This paper presents the use of these system-level tools to perform a multi-chip design exercise, a JPEG image compression chip set. The focus of the USC project is on real-time systems such as entertainment and communication technologies, but does not exclude other applications requiring specialized system design. A block diagram of the system is shown in Figure 1.

This work was supported by the Advanced Research Projects Agency and monitored by the Federal Bureau of Investigation under Contract No. JFBI90092.
† Supported by Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq), Brazil.

The user of the USC software first selects a style for the system. Styles currently supported include

• heterogeneous multiprocessors consisting of processors interconnected with point-to-point connections operating in a nonpipelined fashion,
• nonpipelined processors in a ring,
• nonpipelined processors connected by a bus,
• pipelined processors with point-to-point connections,
• multiple custom VLSI chips communicating asynchronously,
• multiple custom VLSI chips communicating synchronously with a common clock, and
• memory-intensive modules consisting of a custom VLSI chip and a separate memory chip.

Many other styles of systems are currently under development. Once a style is selected, specialized tools are invoked to complete the design process. Ultimately, any custom VLSI chips which must be synthesized are then processed by the ADAM high-level synthesis system, which produces a cell netlist. This netlist is input to the Cascade Design Automation ChipCrafter Silicon Compiler, and a chip layout is produced.

The following sections give an overview of each major style of design, in the order each was applied to the compression example. The remaining sections describe the image compression system to be designed and detail various design activities conducted using the USC tools.

2 Synthesis of Memory-Intensive Systems

A subset of the USC tools performs automatic synthesis of memory-intensive application-specific systems, with emphasis on hierarchical storage architecture design. The storage architecture is closely connected to the datapath of the system, and isolating its synthesis from datapath synthesis may not result in an efficient solution. Therefore the design of the datapath and storage architecture
31ST ACM/IEEE Design Automation Conference ®
Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage,
the ACM copyright notice and the title of the publication and its date appear, and notice is given that copying it is by permission of the Association for
Computing Machinery. To copy otherwise, or to republish, requires a fee and/or specific permission. © 1994 ACM 0-89791-653-0/94/0006 3.50
Figure 1: Block diagram of the Unified System Construction (USC) Project
are coordinated in the USC tool set. SMASH [9] (Fig-
ure 2), the tool set for memory-intensive design, is used
for systems designed for specific applications, where the
memory-access pattern is not only relatively fixed but
also known beforehand. This mostly deterministic ac-
cess characteristic helps us in being more specific, hence
more efficient in our designs.
The original MIMOLA system was the first system to
make tradeoffs in the use of multiport memories [14].
Lippens et al. [13] describe techniques to perform au-
tomatic memory allocation and address allocation for
high speed applications. They synthesize memory af-
ter the design of arithmetic units, datapath schedul-
ing and allocation. IMEC's CATHEDRAL-II compiles
multi-dimensional data structures into distributed dual-
port register files and single-port SRAMs. They use a
polyhedral-based model for high-level memory manage-
ment for linear, piecewise linear and data dependent sig-
nal indexing [7]. SMASH, however, combines the tradeoffs
in datapath as well as storage architecture as explained
below.
SMASH performs high-level synthesis of an integrated
system consisting of a datapath and a two-level memory
hierarchy, from a given behavioral specification and with
constraints on cost and performance. The two memory
levels are
1. On-chip foreground memory. This consists of two
subparts: datapath memory to store the intermedi-
ate or temporary variables in the datapath, and I/O
buffers to temporarily store the inputs and outputs
to the datapath chip, allowing fast access and off-chip
storage of the data.
2. Off-chip background memory. In our model this is
the bulk storage required for the inputs and outputs.
All the I/O data values from/to the external world
are stored here.
[Figure 2 diagram: a behavioral description (VHDL architecture and process) is input to SMASH. SMASH schedules datapath operations, I/O buffer reads/writes, and datapath memory reads/writes, and performs module allocation for the datapath memory. It then performs (a) I/O transfer scheduling between off-chip memory and I/O buffers and (b) module allocation for I/O buffers, followed by (a) I/O transfer scheduling between the external world and off-chip memory and (b) module allocation for off-chip memory.]
Figure 2: SMASH: Synthesis of Memory-intensive Application Specific Hardware
The synthesis is performed in the following two steps:
First, datapath synthesis with operation scheduling is per-
formed combined with scheduling of local data transfers
to/from memory. As a result of this scheduling, con-
straints are placed on the memory structure. During the
second step, the storage hierarchy design is completed,
which includes determining the data transfers between
different levels of memory hierarchy and completing syn-
thesis of the storage structures. We ensure that each step
of the stepwise construction of the system takes into ac-
count the next step by looking ahead so that the next
step is not overly constrained. Global design parameters,
like the memory bandwidth and timing constraints, are
considered when constructing the partial design in each
step, tying the whole synthesis process together.
[Figure 3 diagram; recoverable stages: process transformations, scheduling/allocation for communicating processes, interconnect/controller synthesis, silicon compiler.]
Figure 3: A flow chart of the synthesis system for asynchronous multi-chip designs
3 Synthesis of Asynchronous Multi-chip Systems
In practice, we find that many DSP and other ASIC de-
signs consist of multiple concurrent and interacting pro-
cesses. Though high-level synthesis has received enormous
attention over the years, most approaches were concen-
trated on synthesizing single process (one thread of con-
trol) designs. Synthesizing a design with multiple concur-
rent processes poses many new challenges. For example,
since the processes interact with each other, the synthe-
sis tool has to solve all the timing constraints imposed
by one process on another concurrently [18]. Further-
more, resource allocation for each process on a chip can-
not be done without taking into account the area versus
performance characteristics of each process since the to-
tal resources taken by all processes on a chip are limited
by the chip package. The goals of this research are to
provide an integrated system for synthesizing multi-chip
designs with multiple concurrent processes as well as to
speed up the redesign of multi-chip systems. Figure 3
shows a flow chart which illustrates the approach used
in this synthesis system. First, the multi-process speci-
fication is translated from VHDL to a synthesizable rep-
resentation called the Design Data Structure (DDS) [3].
The next step is to perform a number of process trans-
formations in order to trade off among hardware sharing,
control complexity, communication overhead and cost. A
process-level chip partitioner, ProPart [4], is then used to
find new cost-effective chip boundaries according to the
up-to-date packaging library. In addition, the partitioner
will distribute chip resources to the processes according
to their performance-versus-area characteristics and de-
termine the interconnection structure as well. Next, a
concurrent approach for multiple-process synthesis is used
to synthesize each process into its own datapath and con-
trol path. The objective is to meet the timing, area and
performance constraints as well as to synchronize the com-
munication among the processes. Finally, we use a hy-
brid symbolic/numeric simulation to verify the functional
and timing correctness of the RTL implementation [5]. The
RTL implementation is submitted to the ADAM system
to obtain the final chip layout.
ProPart was used in the experiment of a JPEG image
compression system to be described later in this paper.
Unlike most of the previous behavioral partitioning ap-
proaches, which focus on partitioning design behaviors at
the operation level into a number of synchronized chips,
ProPart tries to partition a set of sequential and/or con-
current behaviors into custom chips. There are several
advantages to process-level partitioning. For example,
there are far fewer objects at the process level than those
at the operation level, which allows us to utilize much
more comprehensive techniques like mixed integer-linear
programming and at the same time to take into account
more partitioning issues, like chip package selection and
chip resource distribution.
SpecPart [19] is the first system-level behavioral parti-
tioning work which elevates the objects to be partitioned
to a higher level of abstraction (such as processes and pro-
cedures), and uses a group migration technique similar to
the Kernighan-Lin algorithm for partitioning. A compre-
hensive survey of other behavioral partitioning approaches
at the operation level has been done by Vahid [20].
4 Multiprocessor Synthesis
Multiprocessors of various styles can be synthesized us-
ing the SOS (Synthesis of Systems) [17] set of tools. The
input to SOS is a specification in the form of a task flow
graph. SOS decides on the number and types of proces-
sors to be used, the interconnections between them, and
the schedule of execution of tasks onto processors.
Related work in the subject of system synthesis in-
cludes graph-based theoretical approaches [1], an analytical
modeling approach [10], and a mathematical programming
formulation [6]. The approach used by SOS considers a
more general case than the ones studied previously, that is,
scheduling and allocation of tasks related by a precedence
[Figure 4 diagram: the inputs are the task flow graph and the processor style library; SOS performs scheduling and allocation (multiprocessor system synthesis); the outputs are the multiprocessor system structure, the task execution schedule, and the mapping of subtasks to processors.]
Figure 4: Overall operation of the SOS tools
graph and with non-zero communication costs over a set of heterogeneous processors connected by an interconnection network.

SOS tools use mixed integer-linear programming (MILP) to model the synthesis problem. Some of the tools can generate the MILP constraints automatically, and others currently rely on constraints written by the user. The tools rely on a branch-and-bound solver called BOZO [11]. Figure 4 shows the overall operation of the SOS tools.
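
The kind of decision SOS makes (how many processors of which type to use under a timing constraint) can be illustrated with a much simpler stand-in. The sketch below is not the SOS MILP formulation and does not use BOZO: it enumerates processor counts exhaustively, assumes identical and independent tasks, and ignores precedence and communication costs. The function name `min_cost_allocation` and the two-entry processor library are hypothetical and exist only for illustration.

```python
from itertools import product

# Hypothetical processor library: per-unit cost and time to run one task.
LIBRARY = {
    "fast": {"cost": 10, "time": 2},
    "slow": {"cost": 4, "time": 5},
}


def min_cost_allocation(num_tasks, deadline, library):
    """Return (cost, counts) for the cheapest processor mix that can run
    num_tasks identical, independent tasks within the deadline, or None.

    A processor with per-task time t can complete deadline // t tasks,
    so a mix is feasible when the summed capacities cover all tasks.
    """
    names = sorted(library)
    best = None
    # Exhaustive search: 0..num_tasks processors of each type.
    for counts in product(range(num_tasks + 1), repeat=len(names)):
        capacity = sum(
            n * (deadline // library[name]["time"])
            for n, name in zip(counts, names)
        )
        if capacity < num_tasks:
            continue  # this mix cannot finish all tasks in time
        cost = sum(n * library[name]["cost"] for n, name in zip(counts, names))
        if best is None or cost < best[0]:
            best = (cost, dict(zip(names, counts)))
    return best


if __name__ == "__main__":
    for deadline in (10, 5, 4):
        print(deadline, min_cost_allocation(8, deadline, LIBRARY))
```

With this made-up library, tightening the deadline from 10 to 4 time units moves the cheapest solution from four slow processors to four fast ones, which is the same cost/performance tradeoff the MILP-based tools explore on real task graphs.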

5 JPEG Image Compression Design

Due to the bandwidth constraints imposed by both still and video image transmission, data compression is a key function. As a result, we chose to focus our design activity on a standard for still image compression, JPEG [21], and to eventually expand the example to cover MPEG standards as well.

Design   Functional area   Bandwidth (words/cycle)      Functional    Execution time
number   (10^6 µm²)        and buffer size (words)      resources     (cycles)
1        30                5 1 2 6 3                    3+, 4-, 12*   8
2        20                3 6 5 3                      3+, 3-, 8*    13
3        18                3 4 4 3                      3+, 3-, 7*    15
4        10                3 3 3 1                      2+, 2-, 4*    20
5        6                 2 3 3 3                      2+, 2-, 2*    26

Functional area and bandwidth are design input constraints; the remaining columns are SMASH output.
Table 2: 1D-DCT design parameters from SMASH

Design Activities and Results

We began with the synthesis of the DCT (Discrete Cosine Transform) function. The 2D-DCT was decomposed into repeated row-column 1D-DCTs prior to the application of the system-level tools. The 1D-DCT macro was synthesized first and used to construct a 2D-DCT, clearly a bottom-up step. SMASH was used to generate five schedules for a 1D-DCT macro from a behavioral VHDL description of the DCT described in the referenced article [8]. The module library used is shown in Table 1. These datapath schedules with varying cost and performance are shown in Table 2. SMASH also determined buffer-size and bandwidth requirements, as shown in Table 2.
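
The row-column decomposition mentioned above can be checked numerically. The following is a minimal sketch, assuming an orthonormal DCT-II evaluated directly from its definition; the exact DCT variant and the fast algorithm of reference [8] are not reproduced here, and the function names are illustrative only. It applies a 1D-DCT to every row, then to every column of the row results, and compares against the direct double-sum 2D definition.

```python
import math


def dct_1d(x):
    """Orthonormal DCT-II of a sequence, evaluated directly (O(N^2))."""
    n = len(x)
    out = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        acc = sum(
            x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
            for i in range(n)
        )
        out.append(scale * acc)
    return out


def dct_2d_row_column(block):
    """2D-DCT built from 1D-DCTs: rows first, then columns."""
    rows = [dct_1d(row) for row in block]               # row-wise pass
    cols = [dct_1d(list(col)) for col in zip(*rows)]    # column-wise pass
    return [list(r) for r in zip(*cols)]                # transpose back


def dct_2d_direct(block):
    """2D-DCT from the double-sum definition, used only as a reference."""
    n = len(block)

    def c(k):
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    return [
        [
            c(u) * c(v) * sum(
                block[i][j]
                * math.cos(math.pi * (2 * i + 1) * u / (2 * n))
                * math.cos(math.pi * (2 * j + 1) * v / (2 * n))
                for i in range(n)
                for j in range(n)
            )
            for v in range(n)
        ]
        for u in range(n)
    ]


if __name__ == "__main__":
    block = [[(8 * i + j) % 17 for j in range(8)] for i in range(8)]
    a = dct_2d_row_column(block)
    b = dct_2d_direct(block)
    print(max(abs(a[u][v] - b[u][v]) for u in range(8) for v in range(8)))
```

For an 8×8 block the row-column form needs sixteen 1D transforms, eight per pass, which is why a single 1D-DCT macro is enough to build the 2D-DCT.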

The 1D-DCT schedules were then processed by the ADAM tool MABAL [12] to generate the RTL datapath netlists. These netlists were analyzed to obtain the area characteristic of the datapath, as shown in Table 3. The area for functional units, multiplexers and registers was determined from the netlists, and wiring area was estimated manually using a rule of thumb which we observed in our earlier experiments [16]. (We did not use our wiring area estimation tools here due to lack of time.)

Module name   Area (µm²)   Delay (ns)
Multiplier    2398490      55
Adder         81558        30
Subtractor    81558        30
Comparator    81558        30
Dist./Join    0            0
Register      96624        3

Table 1: Module library used by SMASH
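
As a quick cross-check of the tables, the functional area of each schedule can be recomputed as the sum of unit counts times module areas. The sketch below uses the adder, subtractor and multiplier areas from Table 1 and the functional resources from Table 2 as they appear in this text; the dictionary layout and the function name are assumptions made for illustration.

```python
# Module areas in square micrometres, as listed in Table 1.
MODULE_AREA_UM2 = {"+": 81558, "-": 81558, "*": 2398490}

# Functional resources per design, as listed in Table 2
# (adders "+", subtractors "-", multipliers "*").
DESIGNS = {
    1: {"+": 3, "-": 4, "*": 12},
    2: {"+": 3, "-": 3, "*": 8},
    3: {"+": 3, "-": 3, "*": 7},
    4: {"+": 2, "-": 2, "*": 4},
    5: {"+": 2, "-": 2, "*": 2},
}


def functional_area(resources):
    """Sum count * module area; result in units of 10^6 square micrometres."""
    total_um2 = sum(n * MODULE_AREA_UM2[op] for op, n in resources.items())
    return total_um2 / 1e6


if __name__ == "__main__":
    for design, resources in DESIGNS.items():
        print(design, round(functional_area(resources), 2))
```

With the resource counts as printed, designs 1, 2, 4 and 5 reproduce the functional areas 29.35, 19.68, 9.92 and 5.12 listed in Table 3, and all five stay within the area constraints of Table 2. Design 3 evaluates to 17.28 rather than the tabulated 17.20.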

Figure 5 shows the input JPEG specification, the design flow used for this experiment and the output of our system. It is important to note that the design flow has some bottom-up portions, which represent the flow between the application of each tool, each of which operates in essentially a top-down fashion. Thus the design flow is both top-down and bottom-up.

All areas in 10^6 µm².

Design   Functional   Interconnect area                   Total
         area A       Muxes and       Wiring              area
                      registers B     C = 2(A+B)
1        29.35        3.74            66.18               99.27
2        19.68        3.87            47.09               70.63
3        17.20        4.05            42.50               63.74
4        9.92         3.67            27.18               40.77
5        5.12         4.12            18.48               27.73

Table 3: 1D-DCT RTL designs from MABAL
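
The wiring column of Table 3 follows the rule of thumb C = 2(A+B) given in its header, so the total area is A + B + C = 3(A+B). A minimal sketch of that arithmetic, with function names chosen here for illustration:

```python
def wiring_area(functional, mux_reg):
    """Rule-of-thumb wiring estimate from the Table 3 header: C = 2(A + B)."""
    return 2.0 * (functional + mux_reg)


def total_area(functional, mux_reg):
    """Total area A + B + C, which equals 3(A + B) under the rule above."""
    return functional + mux_reg + wiring_area(functional, mux_reg)


if __name__ == "__main__":
    # (A, B) pairs from Table 3, in units of 10^6 square micrometres.
    for a, b in [(29.35, 3.74), (19.68, 3.87), (17.20, 4.05),
                 (9.92, 3.67), (5.12, 4.12)]:
        print(a, b, round(wiring_area(a, b), 2), round(total_area(a, b), 2))
```

Recomputing from the two-decimal A and B values matches the tabulated wiring and total areas to within 0.02, the small differences being consistent with rounding of A and B.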

For multi-chip partitioning of the compression system we estimated the performance and silicon area of all the parts in the system. A 2D-DCT architecture consisting of two 1D-DCT modules and an 8×8 frame buffer was selected, as shown in the literature [8]. The worst-case
[Figure 5 diagram; recoverable elements: JPEG system specification (source image data to compressed image data); decomposition of the 2D-DCT into two 1D-DCTs; 1D-DCT VHDL description; SMASH and MABAL; five 1D-DCT RTL designs; 2D-DCT estimation; ProPart; partitioned 3-chip implementation; SOS; three 2D-DCT multiprocessor architectures; layout generation using ChipCrafter; 2D-DCT layout (Chip 1).]
Figure 5: Design flow for still image compression system example
data path delay was used to compute the performance for [...]k. The quantizer performance and silicon area were also estimated, and the parameters used are comparable to those reported in the literature [8]. We used parameters from an existing chip for the Huffman coding [15].

After estimating the performance and silicon area of all the parts in the compression system, it was partitioned by ProPart. The data used for each function in the system is shown in Table 4.

(a) Process characteristics

Processes: dct, quan, enco, deco, dequ, idct

Estimated area/delay points:
217096/480    159819/780    146024/900
100106/1200   74004/1560
19036/89 (fixed)
12200/100 (fixed)
12200/100 (fixed)
19036/89 (fixed)
217096/480    159819/780    146024/900
100106/1200   74004/1560

(b) Package library

Package   Area     Pins   Cost
k100      84174    260    1214
k209      175528   304    4240
k271      227306   344    6849
k464      389800   452    18923

Note: 1. Area is in 10^3 µm².
      2. Delay is in ns.
      3. Cost is a function of area and pin capacities.

Table 4: Parameters used by ProPart

The results produced by ProPart are shown in Figure 5. ProPart selected the 2D-DCT design which uses the second 1D-DCT design produced by SMASH. Note that ProPart placed the DCT and IDCT on separate dies, and lumped the remaining functions on a single die.

Finally, we generated the layouts of the 1D-DCT macro and 2D-DCT chip using Cascade Design Automation's ChipCrafter (Figures 6 and 7), and analyzed the area distribution (Table 5).

Figure 6: Layout of 1D-DCT module

A comparison of our chip set with others [2] is shown
Figure 7: Layout of 2D-DCT chip

Area (10^6 µm²)

                                           Interconnect
Layout    Total    Functional  Controller  Muxes+Registers  Wiring
1D-DCT    83.72    19.83       1.03        4.67             58.19
2D-DCT    209.94   39.66       2.06*       13.91**          154.31

* controller for the frame-buffer not included
** a 64 word on-chip RAM included.

Table 5: Area analysis for the layouts

in Table 6. Since we obtained the Huffman coding chip parameters from another source, they are only compared here to show that the parameters we are using are comparable to those in the literature. The die we did design, the DCT, has a somewhat larger die size than the industrial chips, but the performance was comparable. The technologies used by the industrial chips were not mentioned in [2], so we were not able to determine whether our 1.2 micron CMOS technology was inherently larger than the industrial technologies.

Area (mm²)

                   Ours           Bellcore      LSI
2D DCT/IDCT        11.8 x 17.7    10.7 x 10.2   9.5 x 9.5
Quant./dequant.    (4.4 x 4.4)    9.0 x 9.0     9.9 x 9.9
Encoder            (3.5 x 3.5)    6.6 x 6.2     7.4 x 7.4
Decoder            (3.5 x 3.5)    7.5 x 8.4     7.4 x 7.4

Table 6: Chip-set parameters

Architecture tradeoff study

To search the design space for a wider range of implementations, we also applied the SOS multiprocessor synthesis tool to the first stage of the 2D-DCT task flow graph shown in Figure 8. In this graph, tasks T1...T8 are 1D-DCTs which operate row-wise on the 8×8 array of pixels. Task T9 is a join-distribute operator, which indicates the second set of 1D-DCTs cannot start until the first are completed. Tasks T10...T17 operate column-wise on the results of tasks T1...T8. We assumed a macro-pipeline execution between the set of tasks T1...T8 and T10...T17. While tasks T1...T8 are being performed for frame i+1, T10...T17 are being performed on the previous result of tasks T1...T8.
Figure 8: Task flow graph for SOS
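
The task flow graph described in the text (eight row-wise 1D-DCTs, a join-distribute task, eight column-wise 1D-DCTs) can be written down as a small successor map. The sketch below is one possible encoding, not the SOS input format, which is not given in the paper; it also runs a standard topological sort (Kahn's algorithm) to confirm that T9 separates the two sets of tasks.

```python
from collections import deque


def build_task_graph():
    """Successor lists for the 2D-DCT task flow graph described in the text."""
    first = [f"T{i}" for i in range(1, 9)]      # row-wise 1D-DCTs
    second = [f"T{i}" for i in range(10, 18)]   # column-wise 1D-DCTs
    succ = {t: ["T9"] for t in first}           # every row task feeds the join
    succ["T9"] = list(second)                   # the join distributes to all column tasks
    succ.update({t: [] for t in second})
    return succ


def topological_order(succ):
    """Kahn's algorithm; raises ValueError if the graph contains a cycle."""
    indegree = {t: 0 for t in succ}
    for outs in succ.values():
        for t in outs:
            indegree[t] += 1
    ready = deque(t for t in succ if indegree[t] == 0)
    order = []
    while ready:
        task = ready.popleft()
        order.append(task)
        for nxt in succ[task]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)
    if len(order) != len(succ):
        raise ValueError("task graph contains a cycle")
    return order


if __name__ == "__main__":
    print(topological_order(build_task_graph()))
```

Any valid order places T1...T8 before T9 and T10...T17 after it, which is the precedence constraint the join-distribute operator expresses.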

The design space was searched for various performance constraints with the objective of minimizing the cost for the target architecture shown in Figure 9. Cost/performance parameters predicted from the RTL netlists for the 1D-DCT implementations (Table 3) were input to SOS, so that it could choose from all three 1D-DCT implementations.
Figure 9: Target architecture for SOS

We considered three possible execution-time constraints, where the execution time is defined as the time to compute T1...T8. All processors send results to the buffer memory over a common 8-pixel wide bus. The sets of processors found by SOS for various timing constraints are shown in Table 7. These results clearly show the cost/performance tradeoff at the architectural level.
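
The pixel-rate column of Table 7 is consistent with dividing the 64 pixels of an 8×8 frame by the time per frame. The paper does not state this formula explicitly, so the sketch below is an inference from the tabulated values, and the function name is illustrative.

```python
PIXELS_PER_FRAME = 8 * 8  # one 8x8 block of pixels


def pixel_rate_mpixels_per_s(frame_time_ns):
    """Pixel rate in 10^6 pixel/s for a given time per 8x8 frame in ns."""
    # pixels per nanosecond times 1e9 gives pixel/s; dividing by 1e6 gives Mpixel/s
    return PIXELS_PER_FRAME / frame_time_ns * 1e3


if __name__ == "__main__":
    for time_ns in (6350, 3200, 950):
        print(time_ns, round(pixel_rate_mpixels_per_s(time_ns), 2))
```

The three frame times in Table 7 (6350 ns, 3200 ns and 950 ns) give 10.08, 20.00 and 67.37 million pixels per second under this assumption, matching the table.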

6 Conclusions

The entire design exercise took less than a week of time for three graduate students familiar with all the tools, and able to write VHDL descriptions rapidly. We found during the course of our tool usage that the tools tended to be used in a bottom-up fashion. We also found out that our software was not designed to take advantage of already designed macros like the 1D-DCT, so coding changes were

         Input to SOS            Outputs from SOS
Design   Execution time          Processors*          Cost         Time per 8×8   Pixel rate
number   constraint (ns)                              (10^6 µm²)   frame (ns)     (10^6 pixel/s)
1        6400                    4 P5                 110.90       6350           10.08
2        3200                    2 P1, 2 P5           255.08       3200           20.00
3        950                     5 P1, 2 P2, 1 P3     1402.93      950            67.37

* P1...P5 are the five 1D-DCT designs from SMASH.

Table 7: 2D-DCT implementations from SOS

required in order to obtain the final layouts from the RTL datapaths. It was clear from the exercise that many more dimensions of the design space could have been searched by our tools, given more time. For example, SOS produced a variety of architectures for 2D-DCT that can be used to meet different design requirements. Our estimation tools, which were not used, will provide valuable information early for our tools when used in actual design situations.
References
[1] S. H. Bokhari. Assignment Problems in Parallel and
    Distributed Computing. Kluwer Academic Publishers, 1987.
[2] C. F. Chang and B. J. Sheu. A Multi-Chip Module
    Design for Portable Video Compression Systems. In
    IEEE Multi-Chip Module Conf., pages 39-44, 1993.
[3] C. T. Chen and A. C. Parker. VHDL2DDS: A VHDL
    Language to DDS Data Structure Translator. Tech.
    Rep. CEng 91-21, Dept. of EE-Systems, Univ. of Southern
    California, July 1991.
[4] C. T. Chen and A. C. Parker. ProPart: A Process-Level
    Behavioral Partitioner. Tech. Rep. CEng 93-38,
    Dept. of EE-Systems, Univ. of Southern California,
    Sept. 1993.
[5] C. T. Chen and A. C. Parker. A Hybrid Numeric/Symbolic
    Program for Checking Functional and Timing
    Compatibility of Synthesized Designs. In 7th Intl.
    Symp. on High-Level Synthesis, May 1994.
[6] W. W. Chu, L. J. Holloway, M.-T. Lan, and K. Efe.
    Task allocation in distributed data processing.
    Computer, 13(11):57-69, Nov. 1980.
[7] F. Franssen, F. Balasa, M. Swaaij, F. Catthoor, and
    H. De Man. Modeling multidimensional data and
    control flow. IEEE Trans. on VLSI Systems,
    1(3):319-327, 1993.
[8] H. Fujiwara, M. L. Liou, M. T. Sun, K. M. Yang,
    M. M. Maruyama, K. Shomura, and K. Ohyama. An
    All-ASIC Implementation of a Low Bit-Rate Video
    Codec. IEEE Trans. on Circuits and Systems for
    Video Technology, 2(2):123-134, June 1992.
[9] P. Gupta and A. C. Parker. SMASH: A Program
    for Scheduling Memory-Intensive Application Specific
    Hardware. In 7th Intl. Symp. on High-Level
    Synthesis, May 1994.
[10] E. K. Haddad. Optimal Load Allocation for Parallel
    and Distributed Processing. Tech. Rep. TR 89-12,
    Department of Computer Science, Virginia Polytechnic
    Institute and State Univ., April 1989.
[11] L. J. Hafer and E. Hutchings. Bringing up Bozo.
    Tech. Rep. CMPT TR 90-2, School of Computing
    Science, Simon Fraser Univ., Burnaby, B.C., V5A
    1S6, Mar. 1990.
[12] K. Kucukcakar and A. C. Parker. Data Path Tradeoffs
    using MABAL. In 27th Design Automation Conf.,
    June 1990.
[13] P. E. R. Lippens, J. L. van Meerbergen, A. van der
    Werf, W. F. J. Verhaegh, and B. T. McSweeney. Memory
    Synthesis for High Speed DSP Applications. In
    Proc. of the IEEE Custom Integrated Circuits Conf.,
    May 1991.
[14] P. Marwedel. The MIMOLA Design System: Detailed
    Description of the Software System. In 16th
    Design Automation Conf., June 1979.
[15] H. Park and V. K. Prasanna. Area Efficient VLSI
    Architectures for Huffman Coding. In Int. Conf. on
    Acoustics, Speech and Signal Processing, 1993.
[16] A. C. Parker, P. Gupta, and A. Hussain. The Effects
    of Physical Design Characteristics on the Area-Performance
    Tradeoff Curve. In 28th Design Automation
    Conf., June 1991.
[17] S. Prakash and A. Parker. SOS: Synthesis of
    Application-Specific Heterogeneous Multiprocessor
    Systems. Journal of Parallel and Distributed Computing,
    16:338-351, Dec. 1992.
[18] A. Takach and W. Wolf. Scheduling Constraint Generation
    for Communicating Processes. Tech. Rep.
    SRC Pub C93069, Princeton Univ., Feb. 1993.
[19] F. Vahid and D. D. Gajski. Specification Partitioning
    for System Design. In 29th Design Automation Conf.,
    June 1992.
[20] F. Vahid. A Survey of Behavioral-Level Partitioning
    Systems. Tech. Rep. TR ICS 91-71, UC Irvine, 1991.
[21] G. K. Wallace. The JPEG Still Picture Compression
    Standard. Communications of the ACM, 34(4):31-44,
    April 1991.