
  • INSIDE

    FPGA-Based Video Analysis Spotlights Clown Fish

    Innovating ECG Algorithms with Xilinx System Generator

    How to Leverage APU Inside Virtex-5 FXT’s PowerPC 440 Core


    Issue 5, April 2009

    EMBEDDED SOLUTIONS FOR PROGRAMMABLE LOGIC DESIGNS

    Embedded magazine



  • Welcome to this Embedded Systems special supplement of Xcell Journal.

    Every year for the last eight or so years, Xilinx has seen a steady increase in the number of embedded-software designers and DSP algorithm developers using our FPGAs to create new innovations. This rise has its genesis in a few significant factors. First of all, Xilinx has been and continues to be committed to leveraging the latest silicon process technologies and their doubling of transistor counts, in keeping with Moore’s Law. The wealth of transistors has allowed us to add advanced processors and ever more DSP slices into our FPGAs, along with programmable logic cells.

    The doubling also means that customers can add soft processors like our MicroBlaze™ to their Xilinx designs, and combine them with a growing number of functions that otherwise would have occupied separate chips or other sections of the PCB. This integration reduces the overall bill of materials while delivering a leap in overall product performance, a system power savings and a reduction in the overall physical size of the end products (in turn saving more money).

    The next reason is complexity. Today, many systems you want to implement simply outpace the complexity of the devices you previously targeted or the silicon you can afford to design yourself. For a growing number of these advanced systems, a single DSP just won’t cut it, a series of DSPs is too cumbersome and the price points and time-to-market risks can’t justify an ASIC or ASSP. FPGAs simply are the best choice. As Vin Ratford, our senior vice president of worldwide marketing, puts it, “Once FPGAs were in embedded systems; today they are embedded systems.”

    All this said, today, if you are an embedded-systems programmer or a DSP algorithm developer who isn’t familiar with HDLs, programming an FPGA can seem intimidating. However, over the last eight years Xilinx has steadily acquired a better understanding of the methodologies, software and needs of embedded-software developers and DSP algorithm developers. Through software company acquisitions, internal software and IP development, and partnerships with third-party vendors, we have made great strides in making it easy for FPGA newbies and those unfamiliar with HDLs to create advanced embedded systems with our devices. Mind you, these flows still have lots of room for improvement, but today, embedded-software designers and DSP algorithm developers are making remarkable embedded systems with our FPGAs.

    To show you what I mean, in this issue we present three feature articles that demonstrate how your colleagues are using Xilinx FPGAs to continually rewrite what really is the state of the art in embedded-system design.

    Our first article, from Xilinx partner Impulse Accelerated Technologies, describes how Impulse, Xilinx and other partners are making it much easier for embedded-software engineers to leverage an emerging class of tools, commonly called ESL tools, to use the Xilinx MicroBlaze soft processor and program their designs into Xilinx FPGAs with minimal understanding of HDLs. In fact, Impulse CEO David Pellerin tells how, in less than 20 hours of C coding and debugging, the company created a demo for an advanced video analytics system that identifies a subject (in this case a clown fish) and tracks it as it moves to different areas of the screen.

    Our second article, which first appeared in issue 65 of Xcell Journal, is a fascinating piece in which a grad student at the University of Hawaii at Manoa and his advising professor describe how they implemented an advanced algorithm in a Xilinx FPGA to research next-generation electrocardiogram (ECG) systems. While neither is an HDL expert, they were able to use the Xilinx System Generator tool to successfully implement their algorithm for what could, quite literally, be a life-saving system.

    Xilinx, Inc.
    2100 Logic Drive
    San Jose, CA 95124-3400
    Phone: 408-559-7778
    FAX: 408-879-4780

    © 2009 Xilinx, Inc. All rights reserved. XILINX, the Xilinx Logo, and other designated brands included herein are trademarks of Xilinx, Inc. All other trademarks are the property of their respective owners.

    The articles, information, and other materials included in this issue are provided solely for the convenience of our readers. Xilinx makes no warranties, express, implied, statutory, or otherwise, and accepts no liability with respect to any such articles, information, or other materials or their use, and any use thereof is solely at the risk of the user. Any person or entity using such information in any way releases and waives any claim it might have against Xilinx for any loss, damage, or expense caused thereby.

    PUBLISHER Mike Santarini [email protected]

    EDITOR Jacqueline Damian

    ART DIRECTOR Scott Blair

    DESIGN/PRODUCTION Teie, Gelwicks & Associates 1-800-493-5551

    ADVERTISING SALES Dan [email protected]

    INTERNATIONAL Melissa Zhang, Asia [email protected]

    Christelle Moraga, Europe/Middle East/[email protected]

    Yumi Homura, [email protected]

    SUBSCRIPTIONS All Inquiries www.xcellpublications.com

    REPRINT ORDERS 1-800-493-5551

    Embedded magazine

    www.xilinx.com/xcell/

    LETTER FROM THE PUBLISHER

  • Our third story is derived from an article that first appeared in Xcell issue 66. In this fascinating how-to piece, a design team from Missing Link Electronics describes how easy it is to use the APU inside the PowerPC® 440 processor in the Virtex®-5 FXT devices to implement in FPGAs advanced programming once thought the domain of standalone processors or state-of-the-art ASSPs only.

    These are just a few of the tales demonstrating how rapidly FPGA technology is advancing and how a growing number of you are leveraging these versatile devices to create next-generation innovations. In fact, Xilinx is stepping up development efforts even more to help you work faster and more intuitively. In February, we announced our Virtex-6 and Spartan-6 Targeted Design Platforms.

    Targeted design platforms are not just state-of-the-art FPGA silicon; they also include the tools, IP, reference designs and methodologies you need to quickly create innovations in your targeted application area. We are adding layers of automation, IP socketization and reference designs for multiple applications, and tailoring tool flows to best suit specific designer skill sets. You’ll hear more about our targeted design platforms over the coming months and years.

    It’s amazing to see what you have created so far. And as targeted design platforms mature, I can’t wait to see what you’ll come up with next. I encourage you to share your experience and wisdom with your colleagues and fellow Xcell Journal readers to further spur creative ideas and future innovations. Feel free to contact me if you have a contribution or even if you just want to chat.

    Mike Santarini
    Publisher

    [email protected]
    (408) 626-5981

    Send in the Clown Fish: Implementing Video Analysis in Xilinx FPGAs

    Exploring and Prototyping Designs for Biomedical Applications

    Extend the PowerPC Instruction Set for Complex-Number Arithmetic

    EMBEDDED MAGAZINE ISSUE 5, APRIL 2009

    XCELLENCE IN EMBEDDED SYSTEMS DESIGN

  • Send in the Clown Fish: Implementing Video Analysis in Xilinx FPGAs

    High-level design methods combine with Xilinx Video Starter Kit to enable rapid prototyping of FPGA-based object-recognition system.

    by David Pellerin, CEO, Impulse Accelerated Technologies, [email protected]

    As more electronic devices in an increasing number of application areas employ video cameras, system designers are moving to the next evolutionary step in video system technology by adding intelligence or analysis capabilities to help identify objects and people. Advanced production machinery in a factory, for example, may use multiple video analysis systems to monitor machine parts and instantly identify failing components. In such applications, video systems may monitor materials as they move through an assembly line, identifying those that don’t meet standards. Surveillance systems may also use advanced video analysis to tag suspicious objects or persons, tracking their movements in concert with a network of other cameras.

    Companies and organizations are deploying the first generation of these video analysis systems today in inspection systems, manned or unmanned vehicles, video monitoring devices and automotive safety systems. Designers of these first-generation systems typically implement video-processing algorithms in software using DSP devices, microprocessors or multicore processors. But as designers move to next-generation video applications that are much more fluid and intelligent, they are finding that DSP and standard processors can’t accommodate new requirements for video resolutions, frame rates and algorithm complexity.

  • Digital signal processors certainly do have benefits for video applications, including software programmability using the C language, relatively high clock rates and optimized libraries that allow for quick development and testing. But DSPs are limited in the number of instructions they can perform in parallel. They also have a fixed number of multiply/accumulators, fixed instruction word sizes and limited I/O.

    FPGAs, on the other hand, benefit from an arbitrary number of data paths and operations, up to the limit of the device capacity. Larger FPGA devices are capable of performing hundreds or even thousands of multiply operations simultaneously, on data of varying widths.

    Because of their advantages in compute-intensive video processing, FPGAs (or combinations of FPGAs and DSPs) have become the favored choice for designers of the most advanced video systems.

    At one time, FPGAs were intimidating for algorithm developers untrained in hardware design methods. Programming an FPGA to implement a complex video algorithm required that designers have hardware design skills and a grasp of hardware description languages. But over the last several years, Xilinx and a number of third-party software providers, including Impulse, have created higher-level flows that allow algorithm developers to use FPGAs for even the most complex designs, without requiring substantial hardware design skills.

    Thanks to new tools and methods that make it possible to easily use FPGAs for advanced video analysis, Impulse Accelerated Technologies designed a moderately complex, high-definition video-processing application in a matter of days. We used a combination of higher-level design tools, Xilinx video development hardware and Xilinx video reference design examples. This demonstration project, which we called Find the Clown Fish, serves as a model for other, more-complex projects requiring fast bring-up with minimal design risk.

    Advanced Video Analysis Requires FPGAs

    Complex video-processing applications are often purpose-built. For example, a machine vision technology used in an automated inspection system may require a very specific sequence of video filtering and control logic to identify objects moving down an assembly line. Such a system might determine whether a specific item in a production process, be it a potato chip or a silicon wafer, should be rejected and diverted off the line. In an automotive application, it might be necessary to identify and analyze specific types of objects (road signs, for example) in near real-time.

    In the past there have been significant barriers for software programmers tasked with moving such algorithms out of traditional processors and into FPGAs. Hardware design methods and languages are very different from those used in software. The level of abstraction for FPGAs, using hardware description languages, is much lower than in software design. FPGA development tools have matured in recent years, however, providing software algorithm designers with more-productive methods of prototyping, deploying and maintaining complex FPGA-based algorithms.

    Video system designers can develop and deploy their applications in FPGAs by combining multiple high-level methods of design, using a range of available tools and intellectual-property (IP) blocks. Since no one tool or design method is ideal for all aspects of a complex video application, it is best to select the most productive methods for creating different parts of a given video-processing product.

    The development hardware is also an important consideration. Well-tested hardware platforms and reference designs will greatly accelerate the design and debugging of complex FPGA-based systems.

    Three categories of tools in particular have helped to speed software-to-hardware conversion. Two of them are based on popular software programming languages and environments, while the third makes it easier to manage the increased complexity of FPGA-based systems.

    Library-Based Tools Speed Development

    MATLAB® and Simulink®, produced by The MathWorks, are popular tools for the development of complex algorithms in many domains. For DSP and video-processing algorithms in particular, they provide a robust set of library elements that designers can arrange and interconnect to form a simulation model.

    Tools such as Xilinx System Generator™ extend this capability. System Generator allows designers to take a subset of these elements and use the tool to automatically convert the elements into efficient FPGA hardware.

    For example, the developer of a machine vision application could use Simulink in combination with System Generator to quickly design and simulate a pipeline of predefined filters, then deploy the resulting algorithm into the FPGA along with other components for I/O and control.

    Xilinx System Generator is a highly productive method for creating such applications, because it includes a wide variety of preoptimized components for such things as FIR filters and two-dimensional kernel convolutions. There are limits, however, to what designers can accomplish using libraries of predefined and only nominally configurable filters.

    C-to-FPGA Accelerates Software Conversion

    For increased design flexibility, designers can also use C-to-hardware tools such as Impulse CoDeveloper to describe, debug and deploy filters of almost unlimited complexity. Such design methods are particularly useful for filtering applications and algorithms that don’t fit into existing, predefined blocks. These applications include optical-flow algorithms for face recognition or object detection, inspection systems and image-classification algorithms such as support vector machines, among others.
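    To give a feel for the kind of filter a designer might write directly in C, here is a minimal sketch of a 3 x 3 kernel convolution over a grayscale image. It is plain standard C, not Impulse C; the function name `convolve3x3`, the row-major 8-bit image layout and the integer normalization are assumptions made for this illustration and do not come from the article or from any Xilinx or Impulse library.

```c
#include <stdint.h>

/* Convolve one interior pixel (x, y) of a row-major 8-bit grayscale image
 * with a 3 x 3 integer kernel, divide by `norm`, and clamp to 0..255.
 * The caller must keep (x, y) at least one pixel away from the border.
 * All names here are hypothetical and for illustration only. */
static uint8_t convolve3x3(const uint8_t *img, int width, int x, int y,
                           const int kernel[9], int norm)
{
    int acc = 0;

    for (int ky = 0; ky < 3; ky++) {
        for (int kx = 0; kx < 3; kx++) {
            /* Nine independent multiply-accumulates: on an FPGA these
             * could all be evaluated in parallel. */
            acc += kernel[ky * 3 + kx] *
                   img[(y + ky - 1) * width + (x + kx - 1)];
        }
    }

    acc /= norm;
    if (acc < 0)
        acc = 0;
    if (acc > 255)
        acc = 255;
    return (uint8_t)acc;
}
```

    Because the kernel is an ordinary argument, the same routine can smooth (all ones, `norm` of 9) or pass pixels through unchanged (a single 1 in the center, `norm` of 1), which is the sort of flexibility that fixed library blocks only partly offer.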

    C-to-FPGA methods are particularly appealing for software algorithm developers, who are accustomed to using source-level debuggers and other C-language tools for rapid, iterative design of complex systems. Designers can use C not only to express the functionality of the application itself, but also to create a simulated application, for example by tapping into open-source user interface components and widely available image-processing software libraries.

    A secondary benefit is that C-language methods enable designers to employ embedded processors within the FPGA itself for iterative hardware/software partitioning and in-system debugging. When designers introduce embedded processors into the application, they can use the C language both for the software running on the processor and to describe processor-attached hardware accelerators.

    Platform Studio Enables System Integration

    The Xilinx ISE®, or Integrated Software Environment, includes Platform Studio, a tool that allows users to assemble and interconnect Xilinx-provided, third-party and custom IP to form a complete system on their target FPGA device. Xilinx and its development-board partners provide board support packages that extend Platform Studio and greatly simplify the creation of complex systems. Reference designs for specific boards such as the Xilinx Video Starter Kit also speed development.

    The Platform Studio integrated development environment contains a wide variety of embedded programming tools, intellectual-property cores, software libraries, wizards and design generators to enable fast creation and bring-up of custom FPGA-based embedded platforms. These tools represent a unified embedded development environment supporting PowerPC® hard-processor cores and MicroBlaze™ soft-processor applications.

    Finding the Clown Fish in HD Video Streams

    To demonstrate how to use these tools and methods effectively in an advanced video application, we decided to create an image-filtering design on a highly constrained schedule of just two weeks, for showing at the Consumer Electronics Show in Las Vegas. The requirements of our demonstration were somewhat flexible, but we wanted it to perform a moderately complex image analysis, such as object detection, and support multiple resolutions up to and including 720p and 1080i. The system should process either DVI or HDMI source inputs, and support pixel-rate processing of video data at 60 frames per second.

    After considering more-common algorithms such as real-time edge detection and filtering, we decided to try something a bit more eye-catching and fun. We decided to “Find the Clown Fish.”

    More specifically, what we set out to do was monitor a video stream and look for particular patterns of orange, black and gray (the distinctive stripes of a clown fish) and then create a spotlight effect that would follow the fish around as it moved through the scene, thereby emulating the type of object tracking that a machine vision system might perform. The goal was to have a demonstration that would work well with the source video, a clip featuring a clown fish, or perhaps even with a live camera aimed at a fish tank.
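    The article does not publish the detection filter itself, so the following is only a sketch of how a search for "patterns of orange, black and gray" might be expressed in standard C. The color thresholds, the `stripe_color` type and both function names are hypothetical and chosen purely for illustration; a real design would tune the ranges against test frames.

```c
#include <stdint.h>

typedef enum { COLOR_OTHER, COLOR_ORANGE, COLOR_BLACK, COLOR_GRAY } stripe_color;

/* Classify one RGB pixel into a coarse color class.
 * The numeric thresholds are illustrative assumptions, not measured values. */
static stripe_color classify_pixel(uint8_t r, uint8_t g, uint8_t b)
{
    uint8_t max = r > g ? (r > b ? r : b) : (g > b ? g : b);
    uint8_t min = r < g ? (r < b ? r : b) : (g < b ? g : b);

    if (max < 50)
        return COLOR_BLACK;                      /* all channels dark */
    if (r > 180 && g > 80 && g < 170 && b < 80)
        return COLOR_ORANGE;                     /* strong red, some green */
    if (max - min < 30 && min >= 90)
        return COLOR_GRAY;                       /* channels nearly equal */
    return COLOR_OTHER;
}

/* Return 1 if a window of classified pixels contains a run of orange
 * followed directly by a run of black and then gray; run lengths are
 * ignored. Any other color in between restarts the search. */
static int has_stripe_pattern(const stripe_color *win, int n)
{
    static const stripe_color want[3] = { COLOR_ORANGE, COLOR_BLACK, COLOR_GRAY };
    int stage = 0;

    for (int i = 0; i < n; i++) {
        if (win[i] == want[stage]) {
            if (++stage == 3)
                return 1;
        } else if (stage > 0 && win[i] != want[stage - 1]) {
            /* Sequence broken: restart, letting this pixel begin a new match. */
            stage = (win[i] == COLOR_ORANGE) ? 1 : 0;
        }
    }
    return 0;
}
```

    Splitting classification from pattern matching keeps each comparison narrow, and every range test in `classify_pixel` is independent of the others, so a C-to-hardware compiler is free to evaluate them side by side.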

    For expediency, we decided to start with an existing DVI pass-through filter reference design provided by Xilinx with the Video Starter Kit. This reference design includes a few relatively simple filters including gamma in, gamma out and a software-configurable 2-D FIR filter. An excellent starting point for any streaming-video application, this reference design also demonstrates the use of Xilinx tools including Platform Studio and System Generator.

    Within a few hours of receiving the Video Starter Kit from Xilinx, we had a baseline Platform Studio project, the DVI pass-through example, built and verified through the complete tool flow. This reference design served to verify that we had a good, reliable setup for video input and display, in this case a laptop computer with a DVI output interface and an HD-compatible video monitor. We then began coding additional video-filtering components using C and the streaming functions provided with our Impulse CoDeveloper and Impulse C compiler.

    As seen in the block diagram of the complete video-processing system (Figure 1), a MicroBlaze processor serves as an embedded controller. We inserted the Impulse C detection filter into the DVI video data stream using video signal bus wrappers automatically generated by the Impulse tools.

    [Figure 1: block diagram with the blocks DVI In, Gamma In, 2D FIR Filter, Object Detection, Gamma Out, DVI Out and Xilinx MicroBlaze Processor]

    Figure 1 – Impulse Accelerated Technologies implemented this complete video-filtering and object-detection design in a single Spartan-3A FPGA device.

  • Using C for prototyping dramatically sped up the development process. Over the course of a few days, we employed and debugged many image-filtering techniques involving a variety of algorithm-partitioning strategies. For software debugging purposes, we used sequences of still images (scenes from a video that has an animated clown fish) as test inputs. We interspersed compile and debug sessions using Microsoft Visual Studio with occasional runs through the C-to-hardware compiler in order to evaluate the likely resource usage and determine the pipeline throughput of the various filtering strategies. Only on rare occasions did we synthesize the resulting RTL and route the design to create hardware. The ability to work at a higher level while iteratively improving the detection algorithm dramatically accelerated the design process.

    In fact, we performed all of the algorithm debugging using software-based methods, either through examination of the generated BMP-format test image files or using the Visual Studio source-level debugger. At no point did we use an HDL simulator to validate the generated HDL. We used the Xilinx ChipScope™ debugger at one point to observe the video stream inputs and determine the correct handling of vertical and horizontal sync, but otherwise we found no need to perform hardware-level debugging.

    Tips, Techniques and Tricks for Programmers

    If you are a C programmer experienced with traditional processors, you will need to learn a few new concepts and employ certain coding techniques to obtain the best results when targeting an FPGA.

    First, you should use fixed-width, reduced-size integers when possible. For example, when counting scan lines in a frame, there is no benefit to using a standard 16-bit or 32-bit C-language integer data type. Instead, select a data type with just enough bits to represent the maximum value for the counter. C-to-FPGA tools include additional, nonstandard data types for exactly this purpose, as shown below in the code that determines whether a pixel lies inside the spotlight:

        co_uint12 diffx;  // Horizontal offset of the current pixel
        co_uint12 diffy;  // Vertical offset of the current pixel
        co_int24 diffx_sq, diffy_sq, diffsum;
        . . .
        // Calculate if this pixel is in the spotlight
        diffx = ABS(x_position - spotlight_x);
        diffy = ABS(y_position - spotlight_y);

        diffx_sq = diffx * diffx;
        diffy_sq = diffy * diffy;
        diffsum = diffx_sq + diffy_sq;

        if (de_out) {  // de_out indicates visible pixels
            if (spotlight_on != 0 && x_position != 0
                && diffsum < spotlight_size) {
                r_out = r_in;  // Pass through
                g_out = g_in;
                b_out = b_in;
            }
            else {
                r_out = (r_in >> 1);  // Dim
                g_out = (g_in >> 1);
                b_out = (b_in >> 1);
            }
        }

    You should also consider refactoring the use of variables and arrays to allow efficient pipelining. Modern C-to-FPGA compilers can schedule parallel operations efficiently, and can take advantage of such FPGA features as dual-port RAM. But there are many cases in which the actual requirements of a given algorithm (the range of input values expected, or input combinations that are known to be impossible) may allow for alternative coding methods that can deliver higher performance.

    Also, write or refactor your C code with parallel operations in mind. For example, a common optimization technique in C programming is to reduce the total number of calculations by placing certain operations within control statements, such as if-then-else. In an FPGA, however, it may be more optimal to precalculate values prior to such a control statement, because the FPGA can perform those precalculations in parallel with other statements.

    Certain operations, such as very wide or complex comparisons using relational operators, may be easy to code in C using macros, but may result in more logic than expected due to data type promotion. Casts and other coding methods can help you reduce the size of the generated logic and allow for faster clock speeds.

    Software programmers can quickly learn these coding techniques, and others like them, by using iterative methods and by paying attention to compiler messages. We employed all of the above techniques when creating our clown fish video analysis demonstration application.

    – David Pellerin

    Optimizing for Pipeline Performance

    A critical aspect of this application, and others like it, is the need for the algorithm to operate on the streaming-video data at pixel rate, meaning the design must process and generate pixels as quickly as they arrive in the input video stream. For our demonstration example, the required steps and computations for each pixel included:

    • Unpacking the pixels to obtain the R, G and B values as well as the vertical and horizontal sync and data-enable signals.

    • Doing 5 x 5 prefiltering to perform smoothing and other operations.

    • Storing and shifting incoming pixel values for subsequent pattern recognition.

    • Performing a series of comparisons of saved pixels against specific ranges and patterns of colors to identify a clown fish stripe.

    • Calculating the current and new locations of the spotlight, and moving the spotlight location toward the identified target at a visible rate.

    • Calculating the diameter and shape of the spotlight, using simple geometry and a frame counter to create a smooth and steady spotlight effect.

    • Filtering pixels by increasing or decreasing the color intensity according to whether a given pixel is within the spotlight radius.
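    The first and last steps in this list can be tried out on a desktop compiler before any hardware is involved. The sketch below restates them in standard C with `<stdint.h>` types in place of the Impulse C `co_uint12` and `co_int24` types used in the sidebar. The `rgb_t` struct, the 0xRRGGBB packing order and both function names are assumptions for illustration; the article does not describe the actual pixel word layout.

```c
#include <stdint.h>

typedef struct { uint8_t r, g, b; } rgb_t;

/* Unpack a 24-bit pixel word. The 0xRRGGBB layout is an assumption;
 * sync and data-enable bits are left out of this sketch. */
static rgb_t unpack_rgb(uint32_t word)
{
    rgb_t p;
    p.r = (uint8_t)((word >> 16) & 0xFF);
    p.g = (uint8_t)((word >> 8) & 0xFF);
    p.b = (uint8_t)(word & 0xFF);
    return p;
}

/* Pass the pixel through if it lies inside the spotlight, otherwise
 * halve its intensity. radius_sq is the squared radius, so no square
 * root is needed (the sidebar compares a squared distance the same way). */
static rgb_t spotlight_pixel(rgb_t in, int32_t x, int32_t y,
                             int32_t center_x, int32_t center_y,
                             int32_t radius_sq)
{
    int32_t dx = x - center_x;
    int32_t dy = y - center_y;
    int32_t dist_sq = dx * dx + dy * dy;

    /* Precalculate the dimmed value before the decision, as the sidebar
     * suggests, so it can be computed in parallel with the comparison. */
    rgb_t dim;
    dim.r = (uint8_t)(in.r >> 1);
    dim.g = (uint8_t)(in.g >> 1);
    dim.b = (uint8_t)(in.b >> 1);

    return (dist_sq < radius_sq) ? in : dim;
}
```

    Squaring the signed differences removes the need for an `ABS` macro, and computing `dim` unconditionally turns the if-else into a simple select between two ready values.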

    A sample image of the highlighted target (the clown fish) is shown in Figure 2.

    To meet the pixel-rate requirement, the design must perform all of these operations for each pixel in the HD video stream at a rate of one pixel for every clock cycle. When processing 720p video at 60 frames/s, this means that the above functions, representing approximately 100 lines of C code in a single automatically pipelined loop, must be performed more than 55 million times each second. We would combine this detection filter with the other filtering components in the system to create the complete application. If we sum up all of the fundamental operations required by all the components (including the two gamma filters, the 2-D FIR filter and the detection and spotlight filter), the design must be able to perform close to 2 billion integer operations per second. This sounds like a lot, but in fact, real-world video-processing algorithms may require many times that much real-time computing.
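    The figures in this paragraph are easy to check. The sketch below assumes that 720p means 1280 x 720 active pixels and ignores blanking intervals (the full 720p60 pixel clock, with blanking, is 74.25 MHz); the function names are hypothetical.

```c
#include <stdint.h>

/* Active pixels that must be processed per second for a video mode. */
static uint64_t pixel_rate(uint32_t width, uint32_t height, uint32_t fps)
{
    return (uint64_t)width * height * fps;
}

/* Whole operations available per pixel for a given operation budget. */
static uint64_t ops_per_pixel(uint64_t ops_per_second,
                              uint64_t pixels_per_second)
{
    return ops_per_second / pixels_per_second;
}
```

    With these definitions, `pixel_rate(1280, 720, 60)` is 55,296,000, which matches "more than 55 million times each second". Dividing the quoted total of roughly 2 billion operations per second by that rate gives about 36 integer operations per pixel across all of the filters combined.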

    Creating an efficient pipelined implementation of a complex algorithm is never trivial, but using preoptimized components in combination with C-to-hardware programming has obvious productivity benefits over lower-level, HDL-based methods. Because the algorithm remained expressed in a cycle-independent manner, it took only a small amount of effort to repipeline and reoptimize it after making fundamental changes to the code, for example after adding an entirely new set of computations to handle frame-to-frame behaviors such as expanding and shrinking the spotlight when the clown fish swam in and out of the scenes. Because the C compiler is capable of automatically scheduling operations within a pipelined loop, C programmers are able to focus their energy on higher, system-level design decisions such as whether to create multiple parallel FPGA processes to solve a complex problem.

    We brought up our demo project and gotit ready to go with less than 20 hours ofactual C coding and debugging. We laterspent additional time optimizing the algo-rithm and adding features, such as a spot-light fade-out effect and support forarbitrary video resolutions.

    The complete application includes multi-ple pipelined filter modules as well as anembedded MicroBlaze processor that we canuse to configure parts of the video process-ing. For example, we can control these filtermodules at run-time to modify the bright-ness levels or perform sharpening or smooth-ing operations prior to the object-detection

    and highlighting filter. This is an excellentexample of a hybrid hardware/softwareapplication that you can implement in a sin-gle Xilinx Spartan® FPGA device.

    Overall, we created our Find the ClownFish project in under two weeks, using acombination of available development toolsand methods. The use of C language fordescribing and implementing the detectionalgorithm greatly reduced the time it took todesign and debug, allowing us to exploremany alternative approaches to the algo-rithm, simulate using C-language test fix-tures and ultimately try them out in realhardware using the Xilinx Video Starter Kit.

    While this example may be only a demonstration, it does suggest a wide variety of other, more-complex video-processing applications that you could develop using the Video Starter Kit. It also shows that software-oriented methods of design can enable significantly faster deployment of machine vision systems.


    Figure 2 – Test image shows the spotlight effect for a detected clown fish.


  • by Ashish Shukla
    Graduate Student
    University of Hawaii at Manoa
    [email protected]

    Luca Macchiarulo
    Assistant Professor, Department of Electrical Engineering
    University of Hawaii at Manoa
    [email protected]

    Many physicians use electrocardiogram (ECG) machines to monitor the electrical activity of the heart and the heart condition in general. But today, there is a lengthy delay in the time it takes to transfer data from ECG monitoring machines to trained physicians.

    Here at the University of Hawaii at Manoa, we are researching ways to transfer, in real time, preprocessed ECG data from the patient’s heart to physicians. As a first step toward this goal, we implemented two variants of a well-known software detection algorithm for ECG in hardware and explored the design choices using Xilinx® FPGA tools and hardware.

    Many biological instrumentation-based designs require designers to combine filtering stages and customized logic such as finite state machines in the same system. But now it is easier for researchers to design filters in the Xilinx System Generator environment by simply connecting the various blocks from the Xilinx blockset (included with the System Generator blockset library) instead of creating these modules from scratch using a hardware description language like Verilog or VHDL.

    Using System Generator to create most of the blocks allows you to concentrate on the critical parts of the design and delegate the details of the implementation of those standard modules to the tool. You can then import your custom logic modules into the System Generator environment; the tool will integrate those custom logic blocks with the rest of the design.


    Exploring and Prototyping Designs for Biomedical Applications

    Researchers at the University of Hawaii at Manoa have implemented ECG analysis algorithms with Xilinx System Generator.


  • Automated ECG System Analysis

    A heart’s natural pacemaker, composed of special self-exciting cells, generates and propagates a polarization and depolarization electrical signal that regulates the proper contraction of the heart muscle. This electrical activity is recorded as an electrocardiogram by a machine called an electrocardiograph, which provides physicians with a wealth of information on heartbeats and the heart’s condition in general [1].

    The QRS complex, a wave structure that corresponds to the depolarization of the ventricles and has a spiked shape in the ECG, is often the most telling waveform found in an ECG signal. The morphology, duration and amplitude of the QRS complex provide significant information to physicians diagnosing various arrhythmias and other cardiac ailments.

    Health professionals can best judge the state of a patient’s heart by monitoring the heart’s activity (and the QRS complex) during stress or physical activity. Therefore, health professionals often connect ECG monitors to their patients for many hours (generally 24 hours at a time).

    Currently, most health professionals use Holter System monitors for this purpose. But Holter System monitors have an inherent battery power and processing power constraint that limits their functionality to data acquisition systems.

    A Holter System’s heart monitors record the patient’s ECG data to flash memory or to a tape attached to the system. A technician then removes the drive from the Holter System and sends it to the lab, where technicians analyze the data. Once the lab technicians complete their analysis, they send the ECG report to the physician. Of course, this means that there is a delay between hooking up the monitor to the patient and getting the ECG data to the patient’s physician, thus delaying treatment, possibly with tragic consequences.

    Filtering Stage

    The ECG recording is very sensitive to even the smallest body movement or noise, such as electrical muscle (myographic) signals and the system’s own electrical power-line interference. Because noise contamination is an inherent problem in ECG monitoring and we cannot completely remove it, we had to come up with a way to suppress noise contamination. We achieved this by employing filtering during the preprocessing stage (as in [4], [5]).

    To do this, we first pass an ECG signal through an infinite impulse response (IIR) low-pass filter, which suppresses the high-frequency noise in the signal. Then the signal passes through an IIR high-pass filter, which attenuates the P and T waves in the signal and suppresses the DC offset present in the signal.

    The low-pass and high-pass filters together form a bandpass filter. The system then feeds the output of the high-pass filter to a finite impulse response (FIR) derivative filter, which further emphasizes the QRS complex. The QRS complex typically presents a more pronounced slope variation than the other signal features, which makes it somewhat easier to identify [1].

    The FIR derivative filter step also helps further reduce the noise content of the signal. After filtering, we employ a squaring stage that carries out nonlinear amplification of the QRS complex and makes all data points positive.
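The derivative and squaring steps can be sketched in a few lines of Python. This is a minimal software model, not the System Generator design: the article does not list its filter coefficients, so the five-point derivative taps below are assumed from Pan and Tompkins [4], and the function names are hypothetical.

```python
# Five-point derivative taps assumed from Pan and Tompkins [4]; the article
# does not give the coefficients it used. Ordered newest sample first.
DERIVATIVE_TAPS = (1, 2, 0, -2, -1)


def derivative_filter(samples):
    """FIR derivative: y[n] = (x[n] + 2x[n-1] - 2x[n-3] - x[n-4]) / 8."""
    history = [0] * len(DERIVATIVE_TAPS)  # x[n], x[n-1], ..., x[n-4]
    output = []
    for x in samples:
        history = [x] + history[:-1]      # shift the newest sample in
        acc = sum(tap * value for tap, value in zip(DERIVATIVE_TAPS, history))
        output.append(acc / 8)
    return output


def squaring_stage(samples):
    """Point-by-point squaring: large slopes grow, and no output is negative."""
    return [value * value for value in samples]
```

Feeding a unit-slope ramp through `derivative_filter` gives 1.0 once the four-sample history has filled, which is a quick way to check the tap order.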

    Finally, we use the output of the squaring stage as the input to a moving window integrator. Each sample output from this filter is an average of the previous 32 values. We implemented all these filtering stages directly in the Simulink® environment using the System Generator tool. Then we tried out two different approaches for the next stage, the moving window integrator.

    The first approach involved the connection of a number of register, delay and adder blocks to implement the filter as a

    Here at the University of Hawaii at Manoa, we are researching how to create ECG analysis systems that will reduce the amount of time it takes to transfer data from the patient’s heart to the physician by carrying out the analysis in real time. We hope to achieve this by integrating the ECG analysis system into portable ECG monitors, which physicians can use directly and thus eliminate the technician translation/analysis step.

    To aid us in this effort, we have implemented real-time ECG analysis on a Xilinx Spartan™ XC3S500 device in the Spartan-3E Starter Kit, which allows the system to perform this analysis much faster than previous software-based methods. FPGAs offer many advantages for this and other applications because they support rapid prototyping, are less expensive than ASICs at low volume and are quickly reprogrammable.

    Furthermore, their fast and efficient testing option, combined with System Generator software, makes them a great choice for algorithm exploration as well as hardware implementation and prototyping.

    Implementation of the Algorithm in Hardware

    In our research, we are implementing a well-known algorithm by Tompkins and Hamilton for QRS detection in hardware ([2], [3], [4], [5]). To help us with this effort, we took advantage of the fast prototyping features included in the Xilinx toolset.

    In our system, the processing of the ECG signal begins with a number of filtering stages through which a signal must pass before the system accurately detects a QRS complex. We divided the design into two main stages. The first stage, or preprocessing stage, comprises four linear filters and one nonlinear filter. The second stage, or peak detection stage, identifies the peak signal in the QRS complex and applies decision rules to qualify a feature as a QRS complex.


    direct Form I structure. With this approach, however, we ended up using 31 adders and an equally large number of delay or register blocks.

    We investigated further and later came up with a more resource-efficient structure that uses the block RAM resources in the Spartan-3 FPGA. This final structure uses a small RAM of 32 40-bit words and just two adders and is also very efficient in timing. Figure 1 shows a typical ECG input, with results of the various filtering operations.
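One way to get a 32-sample moving average from a small memory and two adders is to keep a running sum, adding the newest sample and subtracting the one that leaves the window. The Python model below is a sketch of that idea under stated assumptions: the article does not describe the internals of its RAM-based structure, so the class name, the circular addressing and the integer divide are all illustrative.

```python
class MovingWindowIntegrator:
    """Running average over the last `length` samples: one add, one subtract per sample."""

    def __init__(self, length=32):
        self.length = length
        self.ram = [0] * length   # models the small sample RAM (32 words here)
        self.addr = 0             # address holding the oldest stored sample
        self.total = 0            # running sum of everything currently in the RAM

    def step(self, sample):
        oldest = self.ram[self.addr]
        self.total = self.total - oldest + sample   # the two adder operations
        self.ram[self.addr] = sample                # newest sample overwrites the oldest
        self.addr = (self.addr + 1) % self.length
        # Dividing by 32 is a 5-bit right shift in hardware; inputs are
        # non-negative after the squaring stage, so floor division matches.
        return self.total // self.length
```

Compared with a direct structure that sums all 32 taps every sample, the per-sample work here does not grow with the window length.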

    Peak Detection Stage

    In the peak detection stage, we are trying to find the peak in the output of the moving window integrator. The peak detection process depends on the calculation of the threshold value, and we can locate a peak among the sample values that are greater than the threshold value. But the algorithm does not report a peak until a sample appears in the falling slope of the moving window integrator signal with a value less than half of the peak value.
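The half-peak confirmation rule can be modeled as a small loop. The sketch below assumes a fixed threshold for simplicity (the article updates the threshold after every detection), and the function name and return format are hypothetical.

```python
def detect_peaks(samples, threshold):
    """Return (index, value) for each peak, reported once the signal drops below half the peak."""
    peaks = []
    candidate = None  # (index, value) of the largest sample seen since crossing the threshold
    for index, value in enumerate(samples):
        if candidate is None:
            if value > threshold:
                candidate = (index, value)      # start tracking a possible peak
        elif value > candidate[1]:
            candidate = (index, value)          # still climbing: remember the new maximum
        elif value < candidate[1] / 2:
            peaks.append(candidate)             # falling slope confirmed: report the peak
            candidate = None
    return peaks
```

Because a peak is only reported after the signal has fallen to less than half its maximum, a small dip followed by a further rise is treated as part of the same peak rather than as two detections.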

    We use this method because it reduces the number of false peak detections caused by noise. Once the algorithm locates the maximum point of the QRS signal in the moving window integrator output, we search for a peak in an appropriately delayed copy of the output of the high-pass filter and use it as a fiducial point for the position of the QRS complex.

    Essentially, we are finding the peak of the QRS complex in the output of the bandpass signal instead of the original signal, because the original signal is highly contaminated by noise. We achieved this using a memory module, which records the previous 60 samples from the high-pass filter and sends a fixed interval of samples containing the QRS complex as an input to the window module. The window module finds the maximum QRS signal.

    [Figure: state machine diagram; surviving labels: Prefetch, Counter >400, Counter =1h2, Sample]
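The memory and window modules can be pictured as a fixed-depth history buffer plus a maximum search over a slice of it. The Python sketch below is an assumption-laden model: only the 60-sample depth comes from the article, while the class, the `start_back` and `length` parameters and the return format are hypothetical.

```python
from collections import deque


class DelayMemory:
    """Keeps the most recent high-pass filter outputs (60 in the article)."""

    def __init__(self, depth=60):
        self.samples = deque(maxlen=depth)  # oldest samples fall off automatically

    def push(self, sample):
        self.samples.append(sample)

    def window(self, start_back, length):
        """Return `length` samples beginning `start_back` samples before the newest one."""
        data = list(self.samples)
        first = len(data) - 1 - start_back
        if first < 0:
            raise ValueError("not enough history stored yet")
        return data[first:first + length]


def window_maximum(window):
    """Return (offset, value) of the largest sample in the window: the fiducial point."""
    offset = max(range(len(window)), key=lambda i: window[i])
    return offset, window[offset]
```

In use, the peak detector would choose `start_back` so that the slice covers the delayed QRS complex, and `window_maximum` would then give the position of the fiducial point inside that slice.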

    We based the threshold for the next peak detection process in the moving window integrator output on the median of the eight previously detected peaks.

    To compute the running median, the system maintains an updated, sorted list of these peaks. A median calculation FSM (called the median state machine) supervises the computation and communicates with a separate peak detection FSM (called the main state machine; see Figure 2). The main state machine finds the peak in the moving window output (as discussed); then the median state machine sets up the threshold value for the next detection.
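The sorted-list bookkeeping behind the running median can be sketched as follows. This is a software illustration only: the class name is hypothetical, averaging the two middle values for an even count is an assumption, and the article does not say how the threshold is derived from the median, so that step is left out.

```python
from bisect import insort
from collections import deque


class PeakMedian:
    """Running median over the most recent `history` detected peak values."""

    def __init__(self, history=8):
        self.history = history
        self.recent = deque()     # peaks in arrival order, oldest first
        self.ordered = []         # the same peaks, kept sorted

    def add_peak(self, value):
        if len(self.recent) == self.history:
            oldest = self.recent.popleft()
            self.ordered.remove(oldest)      # drop the oldest peak from the sorted list
        self.recent.append(value)
        insort(self.ordered, value)          # insert the new peak in sorted position

    def median(self):
        if not self.ordered:
            raise ValueError("no peaks recorded yet")
        middle = len(self.ordered) // 2
        if len(self.ordered) % 2:
            return self.ordered[middle]
        # Even count: average the two middle values (an assumption of this sketch).
        return (self.ordered[middle - 1] + self.ordered[middle]) / 2
```

Keeping the list sorted on every insertion means the median is always available by direct indexing, which mirrors the idea of a small sorted RAM updated by a dedicated state machine.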

    We implemented the entire peak detection stage in VHDL. It contains the memory, window, main and median state machines and the median RAM modules. Later, we imported these VHDL modules into the System Generator environment as black boxes and simulated the design (Figure 3).

    Programming and Testing the Hardware Implementation

    We programmed the FPGA through the JTAG port. To test the design, we used System Generator’s JTAG co-simulation feature, which allowed us to pass data from the Simulink environment through the USB cable to the board (the Xilinx Spartan-3E Starter Kit) containing the downloaded design.

    The design implemented in the FPGA processes the data and sends the output back to the MATLAB® environment so that we can compare it to the simulation output. This feature was very convenient and helped us establish the accuracy of the design implementation in hardware. It also allowed us to run tests on complete sets of benchmark ECG data (from www.physionet.org) in a fraction of the time we previously needed to test the software version of the algorithm.

    Overall, we found System Generator to be very useful for implementing designs for biomedical instrumentation applications. The tool provides easy filter design options and simplifies design implementation by accepting imported custom modules as black boxes. It also provides a fast and efficient way to test a design on hardware, with easy interfacing options between the board and the user’s workstation.

    Citations

    [1] B-U. Kohler, C. Hennig, and R. Orglmeister, “The principles of software QRS detection,” IEEE Engineering in Medicine and Biology Magazine, vol. 21, no. 1, pp. 42-57, January 2002.

    [2] A. Shukla and L. Macchiarulo, “FPGA based ECG Analysis System,” Proceedings of the Sixth IASTED International Conference on Biomedical Engineering, Austria, pp. 68-72, February 2008.

    [3] A. Shukla, “Hardware Implementation of real-time ECG analysis algorithms,” M.S. thesis, University of Hawaii at Manoa, Honolulu, Hawaii, 2008.

    [4] J. Pan and W. Tompkins, “A real-time QRS detection algorithm,” IEEE Transactions on Biomedical Engineering, vol. BME-32, no. 3, pp. 230-236, March 1985.

    [5] P. Hamilton and W. Tompkins, “Quantitative Investigation of QRS Detection Rules Using MIT/BIH Arrhythmia Database,” IEEE Transactions on Biomedical Engineering, vol. BME-33, no. 12, pp. 1157-1165, December 1986.

    Figure 3 – Complete design in the System Generator environment. Blocks shown: Low-Pass Filter, High-Pass Filter, Derivative Filter, Squaring Stage, Moving Window Integrator, Memory Module, Window Module, Main FSM, Median FSM, Median RAM.


  • by Endric Schubert, Felix Eckstein, Joachim Foerster, Lorenz Kolb and Udo Kebschull
    Missing Link Electronics
    [email protected]

    Embedded processing systems face a serious technical challenge: how to achieve system upgradability over the product's lengthy life cycle? With rapidly changing standards, designers need a way to quickly modify their designs even after they have deployed their product in the field. But in most cases, it is not sufficient (or even possible) to update only software; many applications also require an increase in compute performance. Yet designing a system with “spare” compute power for later use is neither economical nor technically feasible, since many technology changes are simply not foreseeable.

    One solution is to upgrade the compute platform together with the software in such a way that the upgraded system provides sufficient computation power for the additional software processing load. If you build a system with a Xilinx® Virtex®-5 FXT device, you can give your design additional computation power by adding special-purpose compute operations to the PowerPC® processor's Auxiliary Processing Unit (APU).

    At Missing Link Electronics, a company working on reconfigurable platforms to link heterogeneously connected embedded systems, we believe the PowerPC processor APU inside the Xilinx Virtex-5 FXT devices is a little gem. It provides embedded-systems designers with the same optimization powers traditionally available only to the “big guys” who build their own custom ASSP devices. Given what’s now available to Xilinx users, we think everyone should tap the APU to optimize their designs.


    Extend the PowerPC Instruction Set for Complex-Number Arithmetic

    Using the APU inside a Xilinx Virtex-5 FXT can ease the design of field-upgradable systems.


  • Our project involved extending the instruction set of the PowerPC processor to handle complex-number multiplications. Complex numbers, which are defined as a natural extension of real numbers by means of computation with an imaginary unit, are used in many engineering applications. In multimedia design, for example, complex-number multiplication is useful for the decoding of streaming-media data.

    Basics of Extending Instruction Set via the APU

    When designers want to optimize an embedded system, they typically do so by looking for ways to extend the instruction set of the microprocessor at the heart of their design. Traditionally, this is the best option when the complexity of the embedded system lies in the software portion of the design. You could also simply put new functionality in the design by adding dedicated hardware blocks.

    However, you’ll likely find that extending the instruction set holds some great advantages that complement hardware changes, and that it is somewhat easier for designers to implement. For example, by extending the instructions, you can optimize your design in finer granularity. Also, extending the instruction set typically does not interfere with memory access; thus, it has the potential to optimize the system’s overall performance.

    Even though individuals, companies and academic researchers have published papers on how to do it, extending an instruction set may seem like a black art to anyone new to this technique. But in reality, it isn’t that complex. Let’s examine how you can optimize your Virtex-5 FXT design by making some fairly simple additions to the PowerPC processor’s instruction set via the APU interface.

    In general, to extend the instruction set of an embedded microprocessor, you need to understand that you are modifying both the software and the hardware. First, you are adding hardware blocks to your system to perform specialized computations. These computations execute in parallel in the FPGA fabric rather than sequentially in software. In Xilinx-speak,


    Step-by-Step Guide to Using the APU

    Here we present detailed information on how the engineers at Missing Link Electronics generated the necessary files for our example design, and show how to use these files to reproduce the results on the Xilinx ML507 Evaluation Platform, which contains a Xilinx Virtex-5 XC5VFX70T device. We also show how to use this design as a starting point for your own APU-enhanced FPGA design.

    Step 1: Build Your Coprocessor

    Theoretically, you can build almost any coprocessor as long as it fits into your FPGA, but keep in mind that a user-defined instruction (UDI) can transport two 32-bit operands and one 32-bit result per cycle. Our coprocessor for complex-number multiplication is implemented in file src/cmplxmul.vhd.

    Step 2: Build the FCM Wrapper

    To be area-efficient, your coprocessor may need a multicycle behavior similar to ours. Therefore, you will need a state machine to implement a simple handshake protocol between the coprocessor and the Auxiliary Processing Unit (APU). In our example, we did this inside the wrapper “fcmcmul,” which we implemented in file src/fcmcmul.vhd.

    Inside the wrapper fcmcmul, we instantiated the complex-number multiplication hardware block cmplxmul, which becomes the Fabric Coprocessing Module (FCM). Thus, fcmcmul provides the interface we need to connect it to the APU. You can find a detailed description of those interface signals in Xilinx document ug200, starting at page 188. The important detail is the timing diagram for “Non-Autonomous Instructions with Early Confirm Back-to-Back” on page 216, which shows the protocol between the APU and the FCM.

    Step 3: Connect the FCM with the APU

    In general, you can connect your FCM to the APU in two ways: by using the Xilinx Platform Studio (XPS) graphical user interface, or by editing the .mhs file. We have found that when cutting and pasting a portion of an existing design into a new one, it is easiest to edit the .mhs file. So for this example, we connect the FCM/wrapper and the APU in file syn/apu/system.mhs.

    We suggest that you do the same. Just copy the section from “BEGIN fcmcmul” to “END” from our example and paste it into your .mhs file.

    To make it all work in XPS, you must also provide a set of files in a predefined file/directory structure. In our example, we have called the wrapper block fcmcmul; therefore, the file/directory structure looks like this:

    syn/apu/pcores/fcmcmul/data/fcmcmul_v2_1_0.mpd

    syn/apu/pcores/fcmcmul/data/fcmcmul_v2_1_0.pao

    syn/apu/pcores/fcmcmul/hdl/vhdl/fcmcmul.vhd

    syn/apu/pcores/fcmcmul/hdl/vhdl/cmplxmul.vhd

    The .mpd file contains the port declarations of the FCM. The .pao file provides the names of the blocks and files associated with the FCM, and XPS finds the VHDL source files for the coprocessor and the wrapper in the hdl/vhdl directory.

    You should replicate and adjust this tree as needed for your own APU-enhanced FPGA design.
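When adapting the tree to a different core name, it can help to generate the expected paths and compare them with what is on disk. The helper below is a hypothetical convenience script, not part of XPS or the example design; it only reproduces the naming pattern shown in the listing above, including the `v2_1_0` suffix copied from those file names.

```python
from pathlib import PurePosixPath


def pcore_files(core, vhdl_sources, root="syn/apu/pcores", suffix="v2_1_0"):
    """List the files expected for a custom core, following the article's directory tree."""
    base = PurePosixPath(root) / core
    files = [
        base / "data" / f"{core}_{suffix}.mpd",   # port declarations of the FCM
        base / "data" / f"{core}_{suffix}.pao",   # names of the blocks and source files
    ]
    # VHDL sources for the wrapper and the coprocessor live under hdl/vhdl.
    files += [base / "hdl" / "vhdl" / name for name in vhdl_sources]
    return [str(path) for path in files]
```

Calling `pcore_files("fcmcmul", ["fcmcmul.vhd", "cmplxmul.vhd"])` returns the four paths listed in the article; substituting your own core name and source files gives the tree to create.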

    Step 4: Hardware Simulation

    We have provided the necessary files to test the APU example using ModelSim. As a prerequisite, and only if you have not done this yet, you must generate and compile the Xilinx simulation library. You can do this from the XPS menu XPS → Simulation → Compile Simulation Library. Then generate all RTL simulation files for the entire design from the XPS menu XPS → Simulation → Generate Simulation.

    Next, run the RTL simulation to verify your APU design, in particular the handshake protocol between the APU, the wrapper and your coprocessor.

    The simulation shows the two possibilities for the APU delivering the operands in either one or two cycles (as explained in ug200, on page 216). Look for the signals FCMAPUDONE and FCMAPURESULTVALID.

    Step 5: Software Testing

    For the complex-number multiplication, we have written a small standalone program, syn/apu/aputest/aputest.c, that demonstrates the use of the APU and our coprocessor from a software point of view.

    This program configures the APU and defines the UDI. Then it runs a loop to compute the complex-number multiplication using our hardware coprocessor, compares it against the result of a software-only complex-number multiplication and provides a performance analysis.

    You must configure the PowerPC APU before it can function properly, in one of two ways: You can either click in XPS and enter the initialization values for certain control registers of the APU, or you can configure the APU directly from the software program that uses the APU. We feel the latter option is more explicit and robust.

    In our C source code file, you will find descriptive C macros and function calls to properly initialize the APU. Feel free to copy-and-paste them into your program as needed.

    Within the loop, we do complex-number multiplication first using the UDI and then using the software macro ComplexMult. We use our routines Start_Time and Stop_Time for performance analysis. The three calls to UDI1FCM_GPR_GPR_GPR implement the three-cycle hardware complex-number multiplication. We define the C macro UDI1FCM_GPR_GPR_GPR in file syn/apu/ppc440_0/include/xpseudo_asm_gcc.h, which is a Xilinx EDK-generated file. We implement the C macro UDI1FCM_GPR_GPR_GPR via assembler mnemonic udi1fcm. Because Xilinx has patched the assembler, this udi1fcm mnemonic, though obviously not part of the original PowerPC 440 processor instruction set, is already a proper instruction the APU can handle.

    In our test case, aputest is the XPS software project that we compiled, assembled, linked and downloaded into the Virtex-5 FXT block RAM for execution by the PowerPC processor.

    Step 6: Generate the FPGA Configuration

    You can generate the FPGA configuration bit file from the XPS menu XPS → Hardware → Generate Bit Stream. To save you some time, we have included a bit file for the Xilinx ML507 Development Platform. You can find it in Syn/apu/implementation/download.bit.

    Step 7: Execute the Example Design

    Download the FPGA configuration bit file, start the XPS debugger XMD (UART settings are 115200-8-N-1) and view the example design.

    The run-time it reports is 4,717 cycles for the software-only design and 1,936 cycles for the UDI hardware-accelerated complex-number multiplication. Thus, the acceleration is approximately 2.4 times, with the complex-number multiplication running with no timing optimizations at 50 MHz. Of course, if we were to use pipelining and increase parallelism, the coprocessor could run much faster, increasing the overall acceleration to five to ten times.
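The acceleration figure follows directly from the two cycle counts. The short calculation below only restates the article's numbers; the variable names are illustrative.

```python
SOFTWARE_CYCLES = 4717      # software-only run, as reported above
ACCELERATED_CYCLES = 1936   # UDI hardware-accelerated run, as reported above

speedup = SOFTWARE_CYCLES / ACCELERATED_CYCLES
cycles_saved = SOFTWARE_CYCLES - ACCELERATED_CYCLES

print(f"speedup: {speedup:.2f}x, cycles saved: {cycles_saved}")
```

The ratio comes out a little above 2.4, matching the "approximately 2.4 times" quoted in the text.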

    Figure 1 – EDK processor system block diagram

    Figure 2 – Complex-number multiplication coprocessor was so easy to design, you can do it on graph paper.


    these hardware blocks are called Fabric Coprocessing Modules, or FCMs. You can write FCMs in VHDL or Verilog, and they will end up in the FPGA fabric of the Virtex-5 FXT device. You can connect one or more FCMs to the PowerPC processor APU interface.

    The next step is to adjust your software code to make use of those additional instructions. You have two options (assuming you are programming in C language). The first is to change the C compiler to automatically exploit cases where the use of the additional instructions would be beneficial. We’ll leave this option to the academics and certain folks working on ASSPs.

    The second, and more elegant, option is not to touch the compiler but instead to use so-called compiler-known functions. This means that manually, in our software code, we’ll call a C macro or a C function that makes use of these additional instructions.

    Using either option, we must adjust the assembler so that it supports the new instructions. Fortunately, Xilinx includes a PowerPC compiler and assembler with the Embedded Development Kit (EDK) that already support these additional instructions.

    When the PowerPC encounters these new instructions, it quickly detects that they are not part of its own original instruction set and defers the handling of them to the APU. Xilinx has configured the APU to decode these instructions, provide the appropriate FCM with the operand data and then let the FCM perform the computation.

    If this is properly done, the software requires fewer instructions when running. Therefore, we can get more compute power out of our design without increasing the CPU clock frequency (which may cause other headaches).

    The key reason to use the APU rather than connecting hardware blocks to the microprocessor via the PLB bus is the superior bandwidth and lower latency between the PowerPC processor and the APU/FCM. Another advantage lies in the fact that the APU is independent of the CPU-to-peripheral interface and therefore does not add an extra load to the PLB bus, which the system needs for fast peripheral access.

    The APU provides various ways of interfacing between the PowerPC and the FCM. We can use a load-store method or the user-defined instruction (UDI) method. Chapter 12 of the Xilinx User Guide ug200 offers detailed descriptions of these techniques (http://www.xilinx.com/support/documentation/user_guides/ug200.pdf).

    In our example we’ll deploy the UDI method, because it provides the most control over the system, enabling the highest performance. The example design is available for download from our Web site, at http://www.missinglinkelectronics.com/support/.

    Example Design Description

    By adding a UDI, we have extended the PowerPC processor’s instruction set to perform complex-number multiplications, a handy optimization for many multimedia decoding systems. The EDK diagram (see sidebar, Figure 1) shows the overall design, including how we connected the complex-number multiplier FCM to the PowerPC processor via the APU, and how software can make use of it.

    We picked complex-number multiplication as an example because of its wide applicability in decoding streaming-media data, and because it clearly demonstrates how to make use of the APU by adding a special-purpose instruction.

    Complex-number multiplication is defined as multiplying two complex numbers, each having a real value and an imaginary value ($a_R + j a_I$, where $j \cdot j = -1$):

    $$(a_R + j a_I)(b_R + j b_I) = (a_R b_R - a_I b_I) + j\,(a_I b_R + a_R b_I)$$
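As a quick software reference for the formula, the function below computes the real and imaginary parts from the four input components. It is a plain Python sketch with a hypothetical name, useful mainly for checking hardware results against known values.

```python
def complex_multiply(a_re, a_im, b_re, b_im):
    """Multiply (a_re + j*a_im) by (b_re + j*b_im); returns (real, imaginary)."""
    real = a_re * b_re - a_im * b_im   # real part: a_R*b_R - a_I*b_I
    imag = a_im * b_re + a_re * b_im   # imaginary part: a_I*b_R + a_R*b_I
    return real, imag
```

The result can be cross-checked against Python's built-in `complex` type, which implements the same definition.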

    For efficiency, the complex-number multiplication hardware block (cmplxmul) performs the multiplication in three stages. This saves hardware resources by using only two multipliers and two adders in this multicycle implementation. Figure 2, also in the sidebar, shows a block diagram (in sketch form) of the complex-number multiplication FCM.
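To see how four multiplications can fit on two multipliers, it helps to write out one possible three-step schedule. The Python model below is an assumption: the article does not spell out which operations happen in which stage, so this is only one schedule that stays within two multiplies and one add or subtract per step, and the function name is hypothetical.

```python
def multicycle_complex_multiply(a_re, a_im, b_re, b_im):
    """Model a three-step schedule that never needs more than two multiplies per step."""
    multiplies_per_stage = []

    # Stage 1: both multipliers produce the two terms of the real part.
    p1, p2 = a_re * b_re, a_im * b_im
    multiplies_per_stage.append(2)

    # Stage 2: subtract to get the real part; reuse both multipliers
    # for the two terms of the imaginary part.
    real = p1 - p2
    p3, p4 = a_im * b_re, a_re * b_im
    multiplies_per_stage.append(2)

    # Stage 3: add to get the imaginary part; no multiplier is needed.
    imag = p3 + p4
    multiplies_per_stage.append(0)

    return (real, imag), multiplies_per_stage
```

A fully parallel design would need four multipliers to finish in one step; spreading the work over three steps trades latency for area, which is the multicycle trade-off the text describes.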

    As the VHDL code in cmplxmul.vhd demonstrates, the FCM performs the complex-number multiplication in three clock cycles. File fcmcmul.vhd provides the FCM/APU interface wrapper to connect our FCM to the APU. As we show in our step-by-step procedure (see sidebar), you can use this wrapper as a template to connect your own FCM to the APU when using the UDI method (the load-store method requires a different interconnect).

    We synthesized our design with Xilinx EDK/XPS 10.1.02 using Xilinx ISE® 10.1.02. We simulated and tested the design with ModelSim 6.3d SE.

    The PowerPC processor APU that resides within the Xilinx Virtex-5 FXT devices allows embedded engineers to accelerate their systems in a very efficient way, by adding special-purpose user-defined instructions for hardware acceleration and coprocessing. Using the example design described here as a starting point will show you that mastering the APU is straightforward, and can give your designs a major performance boost without the use of special tools.



  • © Copyright 2009 Xilinx, Inc. XILINX, the Xilinx logo, Virtex, and Spartan, are trademarks of Xilinx in the United States and other countries. All other trademarks are the property of their respective owners.



    /ColorImageDict > /JPEG2000ColorACSImageDict > /JPEG2000ColorImageDict > /AntiAliasGrayImages false /CropGrayImages true /GrayImageMinResolution 300 /GrayImageMinResolutionPolicy /OK /DownsampleGrayImages true /GrayImageDownsampleType /Bicubic /GrayImageResolution 300 /GrayImageDepth -1 /GrayImageMinDownsampleDepth 2 /GrayImageDownsampleThreshold 1.50000 /EncodeGrayImages true /GrayImageFilter /DCTEncode /AutoFilterGrayImages true /GrayImageAutoFilterStrategy /JPEG /GrayACSImageDict > /GrayImageDict > /JPEG2000GrayACSImageDict > /JPEG2000GrayImageDict > /AntiAliasMonoImages false /CropMonoImages true /MonoImageMinResolution 1200 /MonoImageMinResolutionPolicy /OK /DownsampleMonoImages true /MonoImageDownsampleType /Bicubic /MonoImageResolution 1200 /MonoImageDepth -1 /MonoImageDownsampleThreshold 1.50000 /EncodeMonoImages true /MonoImageFilter /CCITTFaxEncode /MonoImageDict > /AllowPSXObjects false /CheckCompliance [ /None ] /PDFX1aCheck false /PDFX3Check false /PDFXCompliantPDFOnly false /PDFXNoTrimBoxError true /PDFXTrimBoxToMediaBoxOffset [ 0.00000 0.00000 0.00000 0.00000 ] /PDFXSetBleedBoxToMediaBox true /PDFXBleedBoxToTrimBoxOffset [ 0.00000 0.00000 0.00000 0.00000 ] /PDFXOutputIntentProfile () /PDFXOutputConditionIdentifier () /PDFXOutputCondition () /PDFXRegistryName () /PDFXTrapped /False

    /Description > /Namespace [ (Adobe) (Common) (1.0) ] /OtherNamespaces [ > /FormElements false /GenerateStructure false /IncludeBookmarks false /IncludeHyperlinks false /IncludeInteractive false /IncludeLayers false /IncludeProfiles false /MultimediaHandling /UseObjectSettings /Namespace [ (Adobe) (CreativeSuite) (2.0) ] /PDFXOutputIntentProfileSelector /DocumentCMYK /PreserveEditing true /UntaggedCMYKHandling /LeaveUntagged /UntaggedRGBHandling /UseDocumentProfile /UseDocumentBleed false >> ]>> setdistillerparams> setpagedevice

    /ColorImageDict > /JPEG2000ColorACSImageDict > /JPEG2000ColorImageDict > /AntiAliasGrayImages false /CropGrayImages true /GrayImageMinResolution 300 /GrayImageMinResolutionPolicy /OK /DownsampleGrayImages true /GrayImageDownsampleType /Bicubic /GrayImageResolution 300 /GrayImageDepth -1 /GrayImageMinDownsampleDepth 2 /GrayImageDownsampleThreshold 1.50000 /EncodeGrayImages true /GrayImageFilter /DCTEncode /AutoFilterGrayImages true /GrayImageAutoFilterStrategy /JPEG /GrayACSImageDict > /GrayImageDict > /JPEG2000GrayACSImageDict > /JPEG2000GrayImageDict > /AntiAliasMonoImages false /CropMonoImages true /MonoImageMinResolution 1200 /MonoImageMinResolutionPolicy /OK /DownsampleMonoImages true /MonoImageDownsampleType /Bicubic /MonoImageResolution 1200 /MonoImageDepth -1 /MonoImageDownsampleThreshold 1.50000 /EncodeMonoImages true /MonoImageFilter /CCITTFaxEncode /MonoImageDict > /AllowPSXObjects false /CheckCompliance [ /None ] /PDFX1aCheck false /PDFX3Check false /PDFXCompliantPDFOnly false /PDFXNoTrimBoxError true /PDFXTrimBoxToMediaBoxOffset [ 0.00000 0.00000 0.00000 0.00000 ] /PDFXSetBleedBoxToMediaBox true /PDFXBleedBoxToTrimBoxOffset [ 0.00000 0.00000 0.00000 0.00000 ] /PDFXOutputIntentProfile () /PDFXOutputConditionIdentifier () /PDFXOutputCondition () /PDFXRegistryName () /PDFXTrapped /False

    /Description > /Namespace [ (Adobe) (Common) (1.0) ] /OtherNamespaces [ > /FormElements false /GenerateStructure false /IncludeBookmarks false /IncludeHyperlinks false /IncludeInteractive false /IncludeLayers false /IncludeProfiles false /MultimediaHandling /UseObjectSettings /Namespace [ (Adobe) (CreativeSuite) (2.0) ] /PDFXOutputIntentProfileSelector /DocumentCMYK /PreserveEditing true /UntaggedCMYKHandling /LeaveUntagged /UntaggedRGBHandling /UseDocumentProfile /UseDocumentBleed false >> ]>> setdistillerparams> setpagedevice

    /ColorImageDict > /JPEG2000ColorACSImageDict > /JPEG2000ColorImageDict > /AntiAliasGrayImages false /CropGrayImages true /GrayImageMinResolution 300 /GrayImageMinResolutionPolicy /OK /DownsampleGrayImages true /GrayImageDownsampleType /Bicubic /GrayImageResolution 300 /GrayImageDepth -1 /GrayImageMinDownsampleDepth 2 /GrayImageDownsampleThreshold 1.50000 /EncodeGrayImages true /GrayImageFilter /DCTEncode /AutoFilterGrayImages true /GrayImageAutoFilterStrategy /JPEG /GrayACSImageDict > /GrayImageDict > /JPEG2000GrayACSImageDict > /JPEG2000GrayImageDict > /AntiAliasMonoImages false /CropMonoImages true /MonoImageMinResolution 1200 /MonoImageMinResolutionPolicy /OK /DownsampleMonoImages true /MonoImageDownsampleType /Bicubic /MonoImageResolution 1200 /MonoImageDepth -1 /MonoImageDownsampleThreshold 1.50000 /EncodeMonoImages true /MonoImageFilter /CCITTFaxEncode /MonoImageDict > /AllowPSXObjects false /CheckCompliance [ /None ] /PDFX1aCheck false /PDFX3Check false /PDFXCompliantPDFOnly false /PDFXNoTrimBoxError true /PDFXTrimBoxToMediaBoxOffset [ 0.00000 0.00000 0.00000 0.00000 ] /PDFXSetBleedBoxToMediaBox true /PDFXBleedBoxToTrimBoxOffset [ 0.00000 0.00000 0.00000 0.00000 ] /PDFXOutputIntentProfile () /PDFXOutputConditionIdentifier () /PDFXOutputCondition () /PDFXRegistryName () /PDFXTrapped /False

    /Description > /Namespace [ (Adobe) (Common) (1.0) ] /OtherNamespaces [ > /FormElements false /GenerateStructure false /IncludeBookmarks false /IncludeHyperlinks false /IncludeInteractive false /IncludeLayers false /IncludeProfiles false /MultimediaHandling /UseObjectSettings /Namespace [ (Adobe) (CreativeSuite) (2.0) ] /PDFXOutputIntentProfileSelector /DocumentCMYK /PreserveEditing true /UntaggedCMYKHandling /LeaveUntagged /UntaggedRGBHandling /UseDocumentProfile /UseDocumentBleed false >> ]>> setdistillerparams> setpagedevice

    /ColorImageDict > /JPEG2000ColorACSImageDict > /JPEG2000ColorImageDict > /AntiAliasGrayImages false /CropGrayImages true /GrayImageMinResolution 300 /GrayImageMinResolutionPolicy /OK /DownsampleGrayImages true /GrayImageDownsampleType /Bicubic /GrayImageResolution 300 /GrayImageDepth -1 /GrayImageMinDownsampleDepth 2 /GrayImageDownsampleThreshold 1.50000 /EncodeGrayImages true /GrayImageFilter /DCTEncode /AutoFilterGrayImages true /GrayImageAutoFilterStrategy /JPEG /GrayACSImageDict > /GrayImageDict > /JPEG2000GrayACSImageDict > /JPEG2000GrayImageDict > /AntiAliasMonoImages false /CropMonoImages true /MonoImageMinResolution 1200 /MonoImageMinResolutionPolicy /OK /DownsampleMonoImages true /MonoImageDownsampleType /Bicubic /MonoImageResolution 1200 /MonoImageDepth -1 /MonoImageDownsampleThreshold 1.50000 /EncodeMonoImages true /MonoImageFilter /CCITTFaxEncode /MonoImageDict > /AllowPSXObjects false /CheckCompliance [ /None ] /PDFX1aCheck false /PDFX3Check false /PDFXCompliantPDFOnly false /PDFXNoTrimBoxError true /PDFXTrimBoxToMediaBoxOffset [ 0.00000 0.00000 0.00000 0.00000 ] /PDFXSetBleedBoxToMediaBox true /PDFXBleedBoxToTrimBoxOffset [ 0.00000 0.00000 0.00000 0.00000 ] /PDFXOutputIntentProfile () /PDFXOutputConditionIdentifier () /PDFXOutputCondition () /PDFXRegistryName () /PDFXTrapped /False

    /Description > /Namespace [ (Adobe) (Common) (1.0) ] /OtherNamespaces [ > /FormElements false /GenerateStructure false /IncludeBookmarks false /IncludeHyperlinks false /IncludeInteractive false /IncludeLayers false /IncludeProfiles false /MultimediaHandling /UseObjectSettings /Namespace [ (Adobe) (CreativeSuite) (2.0) ] /PDFXOutputIntentProfileSelector /DocumentCMYK /PreserveEditing true /UntaggedCMYKHandling /LeaveUntagged /UntaggedRGBHandling /UseDocumentProfile /UseDocumentBleed false >> ]>> setdistillerparams> setpagedevice

    /ColorImageDict > /JPEG2000ColorACSImageDict > /JPEG2000ColorImageDict > /AntiAliasGrayImages false /CropGrayImages true /GrayImageMinResolution 300 /GrayImageMinResolutionPolicy /OK /DownsampleGrayImages true /GrayImageDownsampleType /Bicubic /GrayImageResolution 300 /GrayImageDepth -1 /GrayImageMinDownsampleDepth 2 /GrayImageDownsampleThreshold 1.50000 /EncodeGrayImages true /GrayImageFilter /DCTEncode /AutoFilterGrayImages true /GrayImageAutoFilterStrategy /JPEG /GrayACSImageDict > /GrayImageDict > /JPEG2000GrayACSImageDict > /JPEG2000GrayImageDict > /AntiAliasMonoImages false /CropMonoImages true /MonoImageMinResolution 1200 /MonoImageMinResolutionPolicy /OK /DownsampleMonoImages true /MonoImageDownsampleType /Bicubic /MonoImageResolution 1200 /MonoImageDepth -1 /MonoImageDownsampleThreshold 1.50000 /EncodeMonoImages true /MonoImageFilter /CCITTFaxEncode /MonoImageDict > /AllowPSXObjects false /CheckCompliance [ /None ] /PDFX1aCheck false /PDFX3Check false /PDFXCompliantPDFOnly false /PDFXNoTrimBoxError true /PDFXTrimBoxToMediaBoxOffset [ 0.00000 0.00000 0.00000 0.00000 ] /PDFXSetBleedBoxToMediaBox true /PDFXBleedBoxToTrimBoxOffset [ 0.00000 0.00000 0.00000 0.00000 ] /PDFXOutputIntentProfile () /PDFXOutputConditionIdentifier () /PDFXOutputCondition () /PDFXRegistryName () /PDFXTrapped /False

    /Description > /Namespace [ (Adobe) (Common) (1.0) ] /OtherNamespaces [ > /FormElements false /GenerateStructure false /IncludeBookmarks false /IncludeHyperlinks false /IncludeInteractive false /IncludeLayers false /IncludeProfiles false /MultimediaHandling /UseObjectSettings /Namespace [ (Adobe) (CreativeSuite) (2.0) ] /PDFXOutputIntentProfileSelector /DocumentCMYK /PreserveEditing true /UntaggedCMYKHandling /LeaveUntagged /UntaggedRGBHandling /UseDocumentProfile /UseDocumentBleed false >> ]>> setdistillerparams> setpagedevice

    /ColorImageDict > /JPEG2000ColorACSImageDict > /JPEG2000ColorImageDict > /AntiAliasGrayImages false /CropGrayImages true /GrayImageMinResolution 300 /GrayImageMinResolutionPolicy /OK /DownsampleGrayImages true /GrayImageDownsampleType /Bicubic /GrayImageResolution 300 /GrayImageDepth -1 /GrayImageMinDownsampleDepth 2 /GrayImageDownsampleThreshold 1.50000 /EncodeGrayImages true /GrayImageFilter /DCTEncode /AutoFilterGrayImages true /GrayImageAutoFilterStrategy /JPEG /GrayACSImageDict > /GrayImageDict > /JPEG2000GrayACSImageDict > /JPEG2000GrayImageDict > /AntiAliasMonoImages false /CropMonoImages true /MonoImageMinResolution 1200 /MonoImageMinResolutionPolicy /OK /DownsampleMonoImages true /MonoImageDownsampleType /Bicubic /MonoImageResolution 1200 /MonoImageDepth -1 /MonoImageDownsampleThreshold 1.50000 /EncodeMonoImages true /MonoImageFilter /CCITTFaxEncode /MonoImageDict > /AllowPSXObjects false /CheckCompliance [ /None ] /PDFX1aCheck false /PDFX3Check false /PDFXCompliantPDFOnly false /PDFXNoTrimBoxError true /PDFXTrimBoxToMediaBoxOffset [ 0.00000 0.00000 0.00000 0.00000 ] /PDFXSetBleedBoxToMediaBox true /PDFXBleedBoxToTrimBoxOffset [ 0.00000 0.00000 0.00000 0.00000 ] /PDFXOutputIntentProfile () /PDFXOutputConditionIdentifier () /PDFXOutputCondition () /PDFXRegistryName () /PDFXTrapped /False

    /Description > /Namespace [ (Adobe) (Common) (1.0) ] /OtherNamespaces [ > /FormElements false /GenerateStructure false /IncludeBookmarks false /IncludeHyperlinks false /IncludeInteractive false /IncludeLayers false /IncludeProfiles false /MultimediaHandling /UseObjectSettings /Namespace [ (Adobe) (CreativeSuite) (2.0) ] /PDFXOutputIntentProfileSelector /DocumentCMYK /PreserveEditing true /UntaggedCMYKHandling /LeaveUntagged /UntaggedRGBHandling /UseDocumentProfile /UseDocumentBleed false >> ]>> setdistillerparams> setpagedevice

    /ColorImageDict > /JPEG2000ColorACSImageDict > /JPEG2000ColorImageDict > /AntiAliasGrayImages false /CropGrayImages true /GrayImageMinResolution 300 /GrayImageMinResolutionPolicy /OK /DownsampleGrayImages true /GrayImageDownsampleType /Bicubic /GrayImageResolution 300 /GrayImageDepth -1 /GrayImageMinDownsampleDepth 2 /GrayImageDownsampleThreshold 1.50000 /EncodeGrayImages true /GrayImageFilter /DCTEncode /AutoFilterGrayImages true /GrayImageAutoFilterStrategy /JPEG /GrayACSImageDict > /GrayImageDict > /JPEG2000GrayACSImageDict > /JPEG2000GrayImageDict > /AntiAliasMonoImages false /CropMonoImages true /MonoImageMinResolution 1200 /MonoImageMinResolutionPolicy /OK /DownsampleMonoImages true /MonoImageDownsampleType /Bicubic /MonoImageResolution 1200 /MonoImageDepth -1 /MonoImageDownsampleThreshold 1.50000 /EncodeMonoImages true /MonoImageFilter /CCITTFaxEncode /MonoImageDict > /AllowPSXObjects false /CheckCompliance [ /None ] /PDFX1aCheck false /PDFX3Check false /PDFXCompliantPDFOnly false /PDFXNoTrimBoxError true /PDFXTrimBoxToMediaBoxOffset [ 0.00000 0.00000 0.00000 0.00000 ] /PDFXSetBleedBoxToMediaBox true /PDFXBleedBoxToTrimBoxOffset [ 0.00000 0.00000 0.00000 0.00000 ] /PDFXOutputIntentProfile () /PDFXOutputConditionIdentifier () /PDFXOutputCondition () /PDFXRegistryName () /PDFXTrapped /False

    /Description > /Namespace [ (Adobe) (Common) (1.0) ] /OtherNamespaces [ > /FormElements false /GenerateStructure false /IncludeBookmarks false /IncludeHyperlinks false /IncludeInteractive false /IncludeLayers false /IncludeProfiles false /MultimediaHandling /UseObjectSettings /Namespace [ (Adobe) (CreativeSuite) (2.0) ] /PDFXOutputIntentProfileSelector /DocumentCMYK /PreserveEditing true /UntaggedCMYKHandling /LeaveUntagged /UntaggedRGBHandling /UseDocumentProfile /UseDocumentBleed false >> ]>> setdistillerparams> setpagedevice