

[IEEE 2006 14th Annual IEEE Symposium on Field-Programmable Custom Computing Machines - Napa, CA (2006.4.24-2006.4.24)]

DSynth: A Pipeline Synthesis Environment for FPGAs

Michael Wirthlin and Welson Sun

Department of Electrical and Computer Engineering

Brigham Young University, Provo, UT

ABSTRACT

A synthesis environment called DSynth has been created for synthesizing high-performance pipelined circuits for FPGAs from synchronous data flow specifications. The goal of this work is to generate the minimum size circuit that meets the throughput constraint of the data flow model. To achieve this constraint efficiently, this approach relies heavily upon a library of pre-characterized pipelined circuit modules. In addition, resource sharing is used extensively to reduce the overall hardware cost.

1. INTRODUCTION

A high-level datapath synthesis environment called DSynth was created to address the unique requirements and constraints of FPGA-based architectures. The goals of the DSynth synthesis environment are as follows. First, DSynth synthesizes pipelined FPGA circuits that meet a user throughput constraint. Second, DSynth attempts to maximize hardware efficiency by selecting circuit modules from a rich library of pre-characterized FPGA circuits. Third, DSynth employs resource sharing [4] to maximize the efficiency of the allocated circuit modules.

One of the most important aspects of this project is the use of characterized pipelined circuit modules. A large library of pre-placed, high-performance circuit modules allows the synthesis tool to find the most efficient circuit module for each operation within a computation. While the use of a large, well-characterized library will improve the quality of synthesized designs, it significantly complicates the design exploration process. This paper will summarize the design methodology used within DSynth and introduce the dynamic module selection and communication API between the synthesis environment and the library.

2. METHODOLOGY OVERVIEW

The goal of the DSynth synthesis environment is to generate efficient pipelined FPGA circuits from data flow specifications that meet a minimum user-specified throughput constraint. A summary of the design methodology used by DSynth is shown in Figure 1. The design flow can be organized broadly into the specification phase, module matching phase, architectural exploration phase, datapath generation phase, and technology mapping.

2.1 SDF Specification

High-level behavior is specified within DSynth as untimed synchronous data flow (SDF) models [3]. SDF has been shown to be an effective specification format for many digital signal processing and other stream-based computations. The use of untimed data flow is essential in this design methodology, as artificial timing constraints or relationships in most high-level specifications severely limit the design space.

Figure 1: Summary of the DSynth Methodology. (Flow diagram; block labels: High-level Language, Language Translation, Ptolemy II, SDF Model, Synthesis Constraints, Module Library, Module Matching, Scheduling, Module Selection, Resource Sharing, Datapath Generation, VHDL Code Generation, VHDL RTL Synthesis, FPGA Tech Mapping, FPGA Bitstream.)

As shown in Figure 1, SDF behavior is currently specified using the Berkeley Ptolemy II modeling environment. Ptolemy provides a number of modeling aids that simplify this modeling process. These aids include support for fixed-point arithmetic, a rich set of arithmetic and signal processing actors, graphical user interfaces for viewing data, and a robust simulation environment. A fixed-point signal processing library has been created to simplify the specification of fixed-point signal processing algorithms.

The use of Ptolemy II, however, is not required, and other specification formats could be used to create the SDF model used by the DSynth environment. For example, high-level language compilers or translators could be used to convert a text specification in a high-level language into an SDF model. Alternatively, models defined in Matlab Simulink could be converted into an SDF model.

In addition to an untimed SDF model, the synthesis process requires global timing constraints. These timing constraints, expressed as a throughput in samples per second, are specified outside of the SDF model. This allows a single SDF model to be synthesized for any given throughput constraint. The architecture chosen by the synthesis process depends heavily on this timing constraint: architectures synthesized for high-throughput systems require much more hardware and higher-speed circuit modules than architectures synthesized with low-throughput constraints.
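As a rough illustration of why the throughput constraint shapes the architecture, the sketch below converts a throughput into the number of clock cycles available per sample. It assumes a single fixed clock frequency, which the paper does not state; the function name and the numbers are hypothetical and are not part of DSynth.

```python
def cycles_per_sample(clock_hz: int, throughput_sps: int) -> int:
    """Whole clock cycles available between consecutive input samples."""
    if clock_hz <= 0 or throughput_sps <= 0:
        raise ValueError("clock and throughput must be positive")
    if throughput_sps > clock_hz:
        # More than one sample per cycle would need parallel datapaths,
        # which this sketch does not model.
        raise ValueError("throughput exceeds one sample per clock cycle")
    return clock_hz // throughput_sps


# Hypothetical numbers: the same 100 MHz clock under two throughput constraints.
HIGH = cycles_per_sample(100_000_000, 100_000_000)  # one cycle per sample
LOW = cycles_per_sample(100_000_000, 25_000_000)    # four cycles per sample
print(HIGH, LOW)
```

With one cycle per sample every operator needs its own hardware, whereas four cycles per sample leave room for up to four operators to take turns on one module.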

2.2 Module Matching

The next step in the synthesis methodology is module matching. Module matching searches through the circuit library to identify a set of compatible circuits for each operator or signal processing element in the model. The cost of each circuit module is estimated using the operation-specific parameters of the model operation, such as bit-widths and rounding modes. These cost estimates are essential for the architecture exploration process.

A key element of this methodology is the presence of a well-characterized circuit module library. This library, described in [4], contains a large number of implementation alternatives for each supported operation type as well as common signal processing functions. Each circuit element is characterized for operating speed, pipelining parameters, and area costs. The use of a rich library allows the architectural exploration tool to identify a wide range of alternatives in search of the optimal circuit architecture.
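Module matching can be pictured as a filter over the characterized library. The sketch below is only an illustration under assumed data: the `Module` record, its field names, the clock-speed filter, and every library entry are hypothetical and do not reflect the actual DSynth library or its API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Module:
    name: str
    op: str          # operation type implemented, e.g. "mul"
    max_width: int   # widest operand supported, in bits
    latency: int     # pipeline depth in cycles
    fmax_mhz: float  # characterized operating speed
    area: int        # area cost in arbitrary units


def match_modules(library, op, width, clock_mhz):
    """Return the library modules compatible with an operation, cheapest first."""
    hits = [
        m for m in library
        if m.op == op and m.max_width >= width and m.fmax_mhz >= clock_mhz
    ]
    return sorted(hits, key=lambda m: m.area)


# Hypothetical library entries, invented for this example.
LIBRARY = [
    Module("mul_lut_18", "mul", 18, 3, 150.0, 190),
    Module("mul_dsp_18", "mul", 18, 2, 200.0, 40),
    Module("mul_lut_8", "mul", 8, 1, 180.0, 45),
    Module("add_rc_32", "add", 32, 1, 160.0, 16),
]

# Candidates for a 16-bit multiply at a 140 MHz clock.
for m in match_modules(LIBRARY, "mul", 16, 140.0):
    print(m.name, m.area, m.latency)
```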

2.3 Architecture Exploration

The most important phase of this process is the architecture exploration. This phase will explore a very large design space to find the lowest cost circuit that meets the given throughput constraint. As shown in Figure 1, three interrelated synthesis techniques are used during this exploration process. These techniques include pipeline scheduling, module selection, and resource sharing.

During pipeline scheduling, the start time is determined for each operation in the SDF model. Pipeline scheduling uses modulo scheduling to allow overlapping execution of operations in different iterations of the computation. Further, pipeline scheduling in DSynth supports the ability to schedule multi-cycle, pipelined circuit modules.
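The idea behind modulo scheduling can be sketched with a small greedy scheduler. This is a minimal illustration, not DSynth's scheduler: it assumes an acyclic operation graph, fully pipelined modules that accept one new operation per cycle, and a fixed number of instances per module type. All names and the example graph are hypothetical.

```python
def modulo_schedule(ops, preds, latency, unit_of, units, ii):
    """Greedy modulo scheduling of an acyclic operation graph.

    ops      -- operation names in topological order
    preds    -- {op: [predecessor ops]}
    latency  -- {op: pipeline depth in cycles of its module}
    unit_of  -- {op: module type}
    units    -- {module type: number of instances available}
    ii       -- initiation interval (cycles between successive iterations)
    """
    start = {}
    issued = {}  # (module type, slot mod ii) -> operations issued in that slot
    for op in ops:
        # An operation may start once all of its inputs have left their pipelines.
        earliest = max((start[p] + latency[p] for p in preds.get(op, [])), default=0)
        kind = unit_of[op]
        # Trying ii consecutive cycles covers every slot of the reservation table.
        for t in range(earliest, earliest + ii):
            if issued.get((kind, t % ii), 0) < units[kind]:
                issued[(kind, t % ii)] = issued.get((kind, t % ii), 0) + 1
                start[op] = t
                break
        else:
            raise RuntimeError(f"no free {kind} slot for {op} at II={ii}")
    return start


# Hypothetical graph: two 3-cycle multiplies feeding a 1-cycle add.
OPS = ["m1", "m2", "a1"]
PREDS = {"a1": ["m1", "m2"]}
LATENCY = {"m1": 3, "m2": 3, "a1": 1}
UNIT_OF = {"m1": "mul", "m2": "mul", "a1": "add"}
UNITS = {"mul": 1, "add": 1}

SCHEDULE = modulo_schedule(OPS, PREDS, LATENCY, UNIT_OF, UNITS, ii=2)
print(SCHEDULE)
```

With an initiation interval of 2, both multiplies fit on one multiplier by issuing in alternate cycles. With an interval of 1 the same call raises an error, because a single multiplier cannot start two operations per iteration.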

Module selection is the process of selecting circuit module resources for each operator in the SDF specification. Module selection has a significant impact on the overall quality and size of the design. However, module selection is a search problem within an exponential search space. Heuristics must be used to reduce the size of the module selection search space and identify the lowest cost modules appropriate for the given model and throughput constraint.

Resource sharing is the process of identifying compatible operators within the SDF model that can be shared on the same hardware resource. Sharing a resource among multiple operators is an important architectural synthesis step for reducing the size of the circuit. Resource sharing, however, depends heavily on module selection, as two operators cannot be shared unless they are implemented by the same module type.
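One simple way to picture resource sharing is as a binding of scheduled operators to module instances. The sketch below assumes fully pipelined modules and a modulo schedule with a known initiation interval; the function, the module names, and the example schedule are hypothetical and are not taken from DSynth.

```python
from collections import defaultdict


def bind_operators(start, module_of, ii):
    """Assign each operator to an instance of its selected module type.

    Operators with the same module type that issue in different slots
    (start time modulo ii) can share one instance. Operators that collide
    in a slot are pushed onto additional instances.
    """
    used = defaultdict(int)  # (module type, slot) -> instances already taken
    binding = {}
    for op in sorted(start):
        key = (module_of[op], start[op] % ii)
        binding[op] = (module_of[op], used[key])
        used[key] += 1
    return binding


# Hypothetical schedule: three multiplies and one add, initiation interval 2.
START = {"m1": 0, "m2": 1, "m3": 2, "a1": 4}
MODULE_OF = {"m1": "mul18", "m2": "mul18", "m3": "mul18", "a1": "add32"}

BINDING = bind_operators(START, MODULE_OF, ii=2)
print(BINDING)
```

Here `m1` and `m2` share multiplier instance 0 because they issue in different slots, while `m3` collides with `m1` in slot 0 and needs a second multiplier.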

Pipeline scheduling, module selection, and resource sharing are closely interrelated. Architectural synthesis strategies must be developed to organize the execution of these tasks while limiting the design search time. Two approaches for integrating these techniques are described in [4].
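The interaction between module selection and resource sharing can be seen in a toy exhaustive search. The cost model below is an assumption made for illustration only: it charges ceil(n / II) instances for n operators mapped to the same module type, and it ignores multiplexer, register, and control overhead. It is not one of the approaches described in [4]. The search visits every combination, which is why its running time grows exponentially with the number of operators.

```python
from itertools import product
from math import ceil


def best_selection(ops, candidates, area, ii):
    """Exhaustively choose one module per operator, minimizing shared area.

    candidates -- {op: [module types able to implement it]}
    area       -- {module type: area cost of one instance}
    ii         -- initiation interval; one instance serves up to ii operators
    """
    best = None
    for choice in product(*(candidates[o] for o in ops)):
        counts = {}
        for m in choice:
            counts[m] = counts.get(m, 0) + 1
        cost = sum(ceil(n / ii) * area[m] for m, n in counts.items())
        if best is None or cost < best[0]:
            best = (cost, dict(zip(ops, choice)))
    return best


# Hypothetical example: an 8-bit and a 16-bit multiply.
OPS = ["x8", "y16"]
CANDIDATES = {"x8": ["mul8", "mul18"], "y16": ["mul18"]}
AREA = {"mul8": 45, "mul18": 120}

SHARED = best_selection(OPS, CANDIDATES, AREA, ii=2)
UNSHARED = best_selection(OPS, CANDIDATES, AREA, ii=1)
print(SHARED, UNSHARED)
```

At an initiation interval of 2, mapping both operators to the larger multiplier is cheapest because they can share one instance. At an interval of 1 no sharing is possible, so the small multiplier becomes the better choice for the 8-bit operator.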

2.4 Datapath Generation

Once architectural exploration has completed, circuit modules have been chosen for each operator and the operators have been scheduled within a global pipeline schedule. The next phase involves the synthesis of the datapath for implementing the chosen circuit architecture. The purpose of this step is to generate a synthesizable RTL description of the architecture selected in the previous step.

The first step of datapath generation is to generate a netlist for each circuit module chosen during architectural synthesis. This may involve the invocation of a third-party module generation tool or the instancing of a parameterized VHDL entity. Second, the circuit modules chosen during module selection are instanced within the top-level design. Third, communication structures are created and instanced for steering data (multiplexing) and storing intermediate data (registers and memories). Fourth, a control unit is synthesized for sequencing the circuit modules and communication structures.
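The steering and sequencing steps can be pictured with a small table-driven sketch. It assumes the control unit is a counter that cycles through the slots of a modulo schedule, which the paper does not specify; the function names, the binding, and the schedule below are hypothetical.

```python
def control_table(binding, start, ii):
    """For each slot of the schedule, list which operator each instance serves."""
    table = [{} for _ in range(ii)]
    for op, instance in binding.items():
        slot = start[op] % ii
        if instance in table[slot]:
            raise ValueError(f"{instance} is double-booked in slot {slot}")
        table[slot][instance] = op
    return table


def mux_inputs(binding):
    """Number of operators steered onto each instance (its input multiplexer size)."""
    sizes = {}
    for instance in binding.values():
        sizes[instance] = sizes.get(instance, 0) + 1
    return sizes


# Hypothetical result of architecture exploration with initiation interval 2.
START = {"m1": 0, "m2": 1, "m3": 2, "a1": 4}
BINDING = {
    "m1": ("mul18", 0),
    "m2": ("mul18", 0),
    "m3": ("mul18", 1),
    "a1": ("add32", 0),
}

TABLE = control_table(BINDING, START, ii=2)
SIZES = mux_inputs(BINDING)
print(TABLE)
print(SIZES)
```

Each row of the table corresponds to one state of the slot counter and tells the multiplexers which operator's data to route to each instance. An instance that serves only one operator needs no multiplexer.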

2.5 Technology Mapping

The final step in this high-level synthesis methodology is device-specific technology mapping to generate a device bitstream. After the RTL VHDL has been generated in the previous step, the circuit is synthesized into FPGA primitives using traditional RTL synthesis tools. Vendor-specific mapping tools are used to translate the structural netlist of the synthesized circuit into a bitstream.

3. CONCLUSIONS

A high-level synthesis environment called DSynth has been created for generating high-quality circuit structures from untimed dataflow models. A variety of circuit architectures can be synthesized from the same behavioral specification by changing the throughput constraint. High-quality circuit architectures are identified by integrating pipeline scheduling, module selection, and resource sharing. This methodology allows high-quality FPGA circuits to be easily created for many signal processing algorithms.

4. REFERENCES

[1] P. Bellows and B. L. Hutchings. JHDL - an HDL for reconfigurable systems. In J. M. Arnold and K. L. Pocek, editors, Proceedings of IEEE Workshop on FPGAs for Custom Computing Machines, pages 175–184, Napa, CA, Apr. 1998.

[2] A. Koch and N. Kasprzyk. Module generators driving the compilation for adaptive computing systems. In Proceedings of the Tenth Annual IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM), pages 293–294, April 2002.

[3] E. A. Lee and D. G. Messerschmitt. Synchronous data flow. Proceedings of the IEEE, 75(9):1235–1245, September 1987.

[4] W. Sun, M. Wirthlin, and S. Neuendorffer. Combining module selection and resource sharing for efficient FPGA pipeline synthesis. In Proceedings of the ACM/SIGDA International Symposium on Field Programmable Gate Arrays, February 2006. To be published.

14th Annual IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM'06) 0-7695-2661-6/06 $20.00 © 2006