unit ii : cpld & fpga architecture & applications

Upload: narasimha-murthy-yayavaram

Post on 04-Jun-2018

  • 8/13/2019 UNIT II : CPLD & FPGA ARCHITECTURE & APPLICATIONS

    Dr.Y.Narasimha Murthy [email protected]

    UNIT II : CPLD & FPGA ARCHITECTURE & APPLICATIONS

    INTRODUCTION : The Xilinx Programmable Gate Array, known as a Logic Cell Array (LCA),

    is a high-density CMOS IC that combines user programmability with the flexibility of a gate

    array architecture and the economy and testability of standard products. Xilinx reprogrammable

    architectures are used because of their flexibility, low prices for small quantities, testability and

    short development time. Most design changes can be implemented by reprogramming the LCAs.

    Thus, the use of LCAs allows the design to go directly from schematic capture to a production

    board. The programmable logic blocks in the Xilinx family of FPGAs are called Configurable

    Logic Blocks (CLBs). The Xilinx architecture uses CLBs, I/O blocks, a switch matrix and an

    external memory chip to realize a logic function. It uses external memory to store the

    interconnection information. Therefore, the device can be reprogrammed by simply changing the

    configuration data stored in the memory.

    XILINX Logic Cell Array : The Logic Cell Array is the architecture that XILINX introduced in

    1985 for its FPGA devices; it is a proprietary XILINX architecture. The XILINX LCA architecture

    consists of three major components:

    (i). Configurable Logic Blocks (CLBs), (ii). Input/Output Blocks (IOBs) and

    (iii). Programmable Interconnect.

    In addition, configuration memory is used to hold the configuration program bits which control

    the configuration of the CLBs, IOBs and interconnect.

    This LCA architecture consists of an interior matrix of logic blocks and a surrounding ring of I/O

    interface blocks. Interconnect resources occupy the channels between the rows and columns of

    logic blocks and between the logic blocks and I/O blocks. Like a microprocessor, the LCA is a

    program-driven logic device. The functions of the LCA's configurable logic blocks and I/O

    blocks and their interconnection are controlled by a configuration program stored in an on-chip

    memory. The configuration program is loaded automatically from an external memory on power-

    up or on command, or is programmed by a microprocessor as part of system initialization.

    As shown in the diagram below, the configuration memory consists of a distributed array of static

    memory cells. During configuration, the cell is written through the data line, and it is read through

    the same data line during a readback operation.

    During normal operation the pass transistor is off and continuous configuration control is

    provided. There are five methods for loading configuration program data into configuration

    memory: two methods load the data serially and three methods load the data in a

    byte-wide parallel manner.

    The LCA performance is determined by the speed of the logic, storage elements and programmable

    interconnect. LCA performance is specified by the maximum toggle rate for a logic block

    storage element configured as a toggle flip-flop. For typical applications, system clock rates are

    one-third to one-half the maximum flip-flop toggle rate.
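
    The rule of thumb above can be turned into a quick estimate. The following Python sketch is illustrative only: the function name and the 70 MHz toggle rate are assumptions chosen for the example, not figures taken from a data sheet.

```python
def system_clock_range(max_toggle_mhz):
    """Estimate the typical system clock range from the flip-flop toggle rate.

    Applies the one-third to one-half rule of thumb quoted in the text.
    """
    return max_toggle_mhz / 3, max_toggle_mhz / 2


# Hypothetical device with a 70 MHz maximum toggle rate (assumed value).
low, high = system_clock_range(70.0)
print(f"Typical system clock: {low:.1f} MHz to {high:.1f} MHz")
```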

    The core of the LCA is a matrix of identical Configurable Logic Blocks (CLBs). Each CLB contains

    programmable combinational logic and storage registers. The combinational logic section of

    the block is capable of implementing any Boolean function of its input variables. The registers

    can be loaded from the combinational logic or directly from a CLB input. The register outputs can

    be inputs to the combinational logic via an internal feedback path.

    The periphery of the Logic Cell Array is made up of user programmable input/output blocks

    (IOBs). Each block can be programmed independently to be an input, an output or a bi-directional

    pin with three-state control. Inputs can be programmed to recognize either TTL or CMOS

    thresholds. Each IOB also includes flip-flops that can be used to buffer inputs and outputs.

    The flexibility of the LCA is due to resources that permit program control of the interconnection

    of any two points on the chip. The LCA interconnection resources include a two-layer metal

    network of lines that run horizontally and vertically in the rows and columns between the CLBs.

    Programmable switches connect the inputs and outputs of IOBs and CLBs to the nearest metal

    lines. Cross point switches and interchanges at the intersections of rows and columns allow

    signals to be switched from one path to another. Long lines run the entire length or breadth of the

    chip, bypassing interchanges to provide distribution of critical signals with minimum delay or

    skew.

    Configurable Logic Block (CLB) : The core of the FPGA is a matrix of identical Configurable

    Logic Blocks (CLBs). Each CLB contains a combinational logic array, program-controlled data

    multiplexers, and flip-flops. The CLB also contains RAM memory cells and can be programmed

    to realize any function of five variables or any two functions of four variables. The functions are

    stored in the truth table form, so the number of gates required to realize the functions is not

    important. In the figure below, each trapezoidal block represents a multiplexer, which can be

    programmed to select one of its inputs. The block diagram of the CLB is shown below.

    The array of CLBs provides the functional elements from which the user's logic is constructed.

    The logic blocks are arranged in a matrix within the perimeter of IOBs. For example, the

    XC3020A has 64 such blocks arranged in 8 rows and 8 columns. The development system is used

    to compile the configuration data which is to be loaded into the internal configuration memory to

    define the operation and interconnection of each block. User definition of CLBs and their

    interconnecting networks may be done by automatic translation from a schematic-capture logic

    diagram or optionally by installing library or user macros. Each CLB has a combinatorial logic

    section, two flip-flops, and an internal control section. There are five logic inputs (A, B, C, D

    and E), a common clock input (K), an asynchronous direct RESET input (RD), and an enable

    clock (EC). All may be driven from the interconnect resources adjacent to the blocks. Each CLB

    also has two outputs (X and Y) which may drive interconnect networks. Data input for either

    flip-flop within a CLB is supplied from the function F or G outputs of the combinatorial logic, or

    the block input, DI. Both flip-flops in each CLB share the asynchronous RD which, when

    enabled and High, is dominant over clocked inputs. All flip-flops are reset by the active-Low

    chip input, RESET, or during the configuration process. The flip-flops share the enable clock

    (EC) which, when Low, recirculates the flip-flops' present states and inhibits response to the

    data-in or combinatorial function inputs on a CLB. The user may enable these control inputs and

    select their sources. The user may also select the clock net input (K), as well as its active sense

    within each CLB. This programmable inversion eliminates the need to route both phases of a

    clock signal throughout the device.
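
    The priority of the flip-flop controls described above can be summarized in a small behavioral model. This is a minimal Python sketch under stated assumptions: the class and method names are hypothetical, timing is ignored, and the asynchronous RD input is simply evaluated at each step rather than modeled as a true asynchronous event.

```python
class ClbFlipFlop:
    """Behavioral model of one CLB flip-flop (illustrative, not timing-accurate)."""

    def __init__(self):
        self.q = 0  # flip-flops are reset during configuration

    def step(self, d, ec=1, rd=0):
        """Apply one active clock edge and return the new state.

        rd: asynchronous direct reset; when High it overrides clocked inputs.
        ec: enable clock; when Low the present state is recirculated.
        """
        if rd:
            self.q = 0       # RD is dominant
        elif ec:
            self.q = d       # normal clocked load
        return self.q        # EC Low: state unchanged


ff = ClbFlipFlop()
print(ff.step(d=1))        # loads 1
print(ff.step(d=0, ec=0))  # EC Low: holds 1
print(ff.step(d=1, rd=1))  # RD High: forced to 0
```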

    The combinatorial-logic portion of the CLB uses a 32 by 1 look-up table to implement Boolean

    functions. Variables selected from the five logic inputs and two internal block flip-flops are used

    as table address inputs. The combinatorial propagation delay through the network is independent

    of the logic function generated and is spike free for single input variable changes. The partial

    functions of six or seven variables are implemented using the input variable (E) to dynamically

    select between two functions of four different variables. For the two functions of four variables

    each, the independent results (F and G) may be used as data inputs to either flip-flop or either

    logic block output. For the single function of five variables and merged functions of six or seven

    variables, the F and G outputs are identical. Symmetry of the F and G functions and the flip-flops

    allows the interchange of CLB outputs to optimize routing efficiencies of the networks

    interconnecting the CLBs and IOBs.
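
    The three uses of the 32 by 1 look-up table can be illustrated with a short Python sketch. It is a simplified model, not the device's actual bit layout: the helper names, the example functions and the polarity of the E select (E = 0 picks F, E = 1 picks G) are assumptions made for illustration.

```python
def build_table(func, n_inputs):
    """Return the truth table of func as a list of 2**n_inputs bits."""
    table = []
    for address in range(2 ** n_inputs):
        bits = [(address >> i) & 1 for i in range(n_inputs)]
        table.append(int(bool(func(*bits))))
    return table


def read_table(table, *inputs):
    """Look up the stored bit; the inputs form the memory address."""
    address = sum(bit << i for i, bit in enumerate(inputs))
    return table[address]


# One function of five variables uses all 32 bits (F and G are identical).
parity5 = build_table(lambda a, b, c, d, e: a ^ b ^ c ^ d ^ e, 5)

# Two independent functions of four variables use 16 bits each.
f_table = build_table(lambda a, b, c, d: a & b & c & d, 4)
g_table = build_table(lambda a, b, c, d: a | b | c | d, 4)


def e_select(e, f_inputs, g_inputs):
    """Input E dynamically selects between the two 4-variable tables.

    The select polarity is an assumption of this sketch.
    """
    if e:
        return read_table(g_table, *g_inputs)
    return read_table(f_table, *f_inputs)


print(len(parity5), len(f_table) + len(g_table))
print(read_table(parity5, 1, 0, 1, 1, 0))
print(e_select(0, (1, 1, 1, 1), (0, 0, 0, 0)))
```

    Because the function is stored as a truth table, the lookup cost is the same whatever the function, which is why the propagation delay does not depend on the logic implemented.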

    Input/Output Blocks (IOBs):

    The periphery of the Logic Cell Array is made up of user programmable input/output blocks

    (IOBs). Each block can be programmed independently to be an input, an output or a bi-directional

    pin with three-state control. Each user-configurable IOB thus provides an interface between the

    external package pin of the device and the internal user logic. The IOB includes both registered

    and direct input paths. Each IOB also provides a programmable 3-state output buffer, which may be

    driven by a registered or direct output signal. Configuration options allow each IOB an

    inversion, a controlled slew rate and a high impedance pull-up. Each input circuit also provides

    input clamping diodes to provide electrostatic protection, and circuits to inhibit latch-up

    produced by input currents.

    The IOB also includes input and output storage elements and I/O options selected by

    configuration memory cells. A choice of two clocks is available on each die edge. The polarity of

    each clock line (not each flip-flop or latch) is programmable. A clock line that triggers the flip-

    flop on the rising edge is an active Low Latch Enable (Latch transparent) signal and vice versa.

    Passive pull-up can only be enabled on inputs, not on outputs. All user inputs are programmed

    for TTL or CMOS thresholds.

    The input-buffer portion of each IOB provides threshold detection to translate external signals

    applied to the package pin to internal logic levels. The global input-buffer threshold of the IOBs

    can be programmed to be compatible with either TTL or CMOS levels. The buffered input signal

    drives the data input of a storage element, which may be configured as either a flip-flop or a

    latch. The clocking polarity (rising/falling edge-triggered flip-flop, High/Low transparent latch)

    is programmable for each of the two clock lines on each of the four die edges. Note that a clock

    line driving a rising edge-triggered flip-flop makes any latch driven by the same line on the same

    edge Low-level transparent and vice versa (falling edge, High transparent). All Xilinx primitives

    in the supported schematic-entry packages, however, are positive edge-triggered flip-flops or

    High transparent latches. When one clock line must drive flip-flops as well as latches, it is

    necessary to compensate for the difference in clocking polarities with an additional inverter

    either in the flip-flop clock input or the latch-enable input. I/O storage elements are reset during

    configuration or by the active-Low chip RESET input. Both direct input (from IOB pin I) and

    registered input (from IOB pin Q) signals are available for interconnect.

    Programmable-interconnection resources in the Field Programmable Gate Array provide routing

    paths to connect inputs and outputs of the IOBs and CLBs into logic networks. Interconnections

    between blocks are composed of a two-layer grid of metal segments. Specially designed pass

    transistors, each controlled by a configuration bit, form programmable interconnect points (PIPs)

    and switching matrices used to implement the necessary connections between selected metal

    segments and block pins.
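
    The role of the configuration bits in routing can be sketched in Python: each PIP joins two metal segments only when its bit is set, and a net is the set of segments reachable through conducting PIPs. The segment names and the PIP list below are hypothetical; a real device has a fixed, much larger set of PIPs.

```python
def connected_segments(pips, config_bits, start):
    """Return every segment electrically connected to `start`.

    pips:        list of (segment_a, segment_b) pairs, one per PIP
    config_bits: list of 0/1 values, one per PIP (1 = pass transistor on)
    """
    adjacency = {}
    for (seg_a, seg_b), bit in zip(pips, config_bits):
        if bit:
            adjacency.setdefault(seg_a, set()).add(seg_b)
            adjacency.setdefault(seg_b, set()).add(seg_a)

    reached, stack = {start}, [start]
    while stack:
        segment = stack.pop()
        for neighbor in adjacency.get(segment, ()):
            if neighbor not in reached:
                reached.add(neighbor)
                stack.append(neighbor)
    return reached


# Hypothetical PIPs between a CLB output, three metal segments and a CLB input.
pips = [("clb_x", "h1"), ("h1", "v2"), ("v2", "clb_b"), ("h1", "v3")]

print(sorted(connected_segments(pips, [1, 1, 1, 0], "clb_x")))
print(sorted(connected_segments(pips, [0, 0, 0, 0], "clb_x")))
```

    With all bits cleared nothing is connected, which corresponds to an unprogrammed device whose switches are all non-conducting.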

    The figure below is an example of a routed net. The development system provides automatic

    routing of these interconnections. Interactive routing is also available for design optimization.

    The inputs of the CLBs or IOBs are multiplexers which can be programmed to select an input

    network from the adjacent interconnect segments. Since the switch connections to block inputs

    are unidirectional, as are block outputs, they are usable only for block input connection and not

    for routing. The figure below illustrates routing access to logic block input variables, control inputs

    and block outputs.

    Three types of metal resources are provided to fulfill various network interconnect

    requirements:

    (i). General Purpose Interconnect

    (ii). Direct Interconnect

    (iii). Long lines (multiplexed busses and wide AND gates)

    General Purpose Interconnect

    It consists of a grid of five horizontal and five vertical metal segments located between the rows

    and columns of logic and IOBs. Each segment is the height or width of a logic block. Switching

    matrices join the ends of these segments and allow programmed interconnections between the

    metal grid segments of adjoining rows and columns. The switches of an unprogrammed device

    are all non-conducting. The connections through the switch matrix may be established by

    automatic routing or by selecting the desired pairs of matrix pins to be connected or

    disconnected.

    Special buffers within the general interconnect areas provide periodic signal isolation and

    restoration for improved performance of lengthy nets. The interconnect buffers are available to

    propagate signals in either direction on a given general interconnect segment. These bidirectional

    (bidi) buffers are found adjacent to the switching matrices, above and to the right. The other PIPs

    adjacent to the matrices provide access to or from Long lines. The development system

    automatically defines the buffer direction based on the location of the interconnection network

    source. The delay calculator of the development system automatically calculates and displays the

    block, interconnect and buffer delays for any selected path. It can also generate a simulation

    netlist with a worst-case delay model.

    Direct Interconnect

    Direct interconnect provides the most efficient implementation of networks between adjacent

    CLBs or I/O Blocks. Signals routed from block to block using the direct interconnect exhibit

    minimum interconnect propagation delay and use no general interconnect resources.

    For each CLB, the X output may be connected directly to the B input of the CLB immediately to

    its right and to the C input of the CLB to its left. The Y output can use direct interconnect to

    drive the D input of the block immediately above and the A input of the block below. Direct

    interconnect should be used to maximize the speed of high-performance portions of logic. Where

    logic blocks are adjacent to IOBs, direct connect is provided alternately to the IOB inputs (I) and

    outputs (O) on all four edges of the die. The right edge provides additional direct connects from

    CLB outputs to adjacent IOBs.
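
    The direct interconnect pattern described above can be written down as a small lookup. The Python sketch below assumes an 8 by 8 array (as in the XC3020A) with row 0 at the top; the function name and coordinate convention are illustrative, and the IOB connections at the die edges are not modeled.

```python
def direct_interconnect(row, col, rows=8, cols=8):
    """List the CLB inputs reachable by direct interconnect from CLB (row, col).

    Each entry is (source_output, dest_row, dest_col, dest_input).
    Row 0 is assumed to be the top row. Connections that would leave the
    array are dropped; at the die edges the device connects to IOBs instead.
    """
    candidates = [
        ("X", row, col + 1, "B"),  # X drives B of the CLB to the right
        ("X", row, col - 1, "C"),  # X drives C of the CLB to the left
        ("Y", row - 1, col, "D"),  # Y drives D of the CLB above
        ("Y", row + 1, col, "A"),  # Y drives A of the CLB below
    ]
    return [c for c in candidates if 0 <= c[1] < rows and 0 <= c[2] < cols]


for link in direct_interconnect(3, 3):
    print(link)
```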

    Long lines

    The Long lines bypass the switch matrices and are intended primarily for signals that must travel

    a long distance, or must have minimum skew among multiple destinations. Long lines run

    vertically and horizontally the height or width of the interconnect area. Each interconnection

    column has three vertical Long lines, and each interconnection row has two horizontal Long

    lines.

    Vertical and Horizontal Long Lines

    Two additional Long lines are located adjacent to the outer sets of switching matrices. Long lines

    can be driven by a logic block or IOB output on a column-by-column basis. This capability

    provides a common low-skew control or clock line within each column of logic blocks. Isolation

    buffers are provided at each input to a Long line and are enabled automatically by the

    development system when a connection is made.

    Technology Mapping for FPGA :

    An FPGA consists of a regular array of logic blocks that implement combinational and

    sequential logic functions and a user programmable routing network that provides connections

    between the logic blocks. In conventional ASIC implementation technologies such as Mask

    Programmed Gate Arrays (MPGAs) and Standard Cells, the connections between logic blocks

    are implemented by metallization at a fabrication facility. In an FPGA, the connections are

    implemented in the field using the user programmable routing network. This reduces

    manufacturing turn-around times drastically from weeks to minutes and reduces prototype

    costs.

    The limitations, however, are the density and performance penalties associated with user programmable

    routing. The programmable connections, which consist of metal wire segments connected by

    programmable switches, occupy greater area and incur greater delay than simple metal wires. To

    reduce the density penalty, FPGA architectures employ highly functional logic blocks such as

    lookup tables that reduce the total number of logic blocks and hence the number of

    programmable connections needed to implement a given application. These complex logic

    blocks also reduce the performance penalty by reducing the number of logic blocks and

    programmable connections on the critical paths in the circuit.

    The high functionality of FPGA logic blocks presents new challenges for logic synthesis.

    Technology mapping addresses them for FPGAs that use lookup tables to implement

    combinational logic. Technology mapping is the process of transforming a technology

    independent Boolean network into a technology dependent network. For example, a K-input

    lookup table (LUT) is a digital memory that can implement any Boolean function of K variables.

    The K inputs are used to address a 2^K by 1 bit memory that stores the truth table of the Boolean

    function. Lookup tables have been reported to be an area-efficient method of implementing

    combinational functions, and the delays of LUT based FPGAs to be low when compared

    to the delays of FPGAs using other types of logic blocks. The goal of technology mapping is

    to reduce area, delay or a combination of both.

    Technology mapping is the logic synthesis task that is directly concerned with selecting the

    circuit elements used to implement the optimized circuit. Previous approaches to technology

    mapping have focused on using circuit elements from a limited set of simple gates. However,

    such approaches are inappropriate for complex logic blocks where each logic block can

    implement a large number of functions. A K-input lookup table can implement 2^(2^K) different

    functions. For values of K greater than 3, the number of different functions becomes too large

    for conventional technology mapping. Therefore, new approaches to technology mapping are

    required for LUT based FPGAs.
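
    The growth in the number of functions is easy to check numerically. A K-input LUT stores 2^K bits, and each bit pattern is a distinct function, so the count is 2^(2^K). The short Python sketch below simply tabulates this; the function name is illustrative.

```python
def lut_function_count(k):
    """Number of distinct Boolean functions of k variables: 2**(2**k)."""
    return 2 ** (2 ** k)


for k in range(2, 6):
    print(f"K = {k}: {2 ** k} memory bits, {lut_function_count(k)} functions")
```

    For K = 3 there are 256 functions, for K = 4 there are 65,536, and for K = 5 the count already exceeds four billion, which is why a cell library that enumerates every function is impractical beyond small K.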

    Library-Based Technology Mapping : In library based mapping, gates or components are

    selected from a technology library to implement a circuit. Hence it is also referred to as library

    binding. So, this method generates a technology mapping for a given Boolean network using a

    characterized cell library with the objective of cost optimization or delay optimization. Standard

    Cells and Mask Programmed Gate Arrays implement combinational functions using a limited

    set of simple gates. For such ASIC technologies library-based technology mapping is very

    useful.

    In this methodology, the set of available circuit elements is represented as a library of functions

    and the construction of the optimized circuit is divided into three sub problems:

    (i). Decomposition, (ii). Matching and (iii). Covering.

    The original network is first decomposed into a canonical representation that uses limited fan-in

    NAND nodes. This decomposition guarantees that there will be no nodes in the network that are

    too large to be implemented by any library element, provided the library includes NAND gates

    that reach the fan-in limit.

    After decomposition, the network is partitioned into a forest of trees. The optimal sub circuit

    covering each tree is constructed and finally the circuit covering the entire network is assembled

    from these sub circuits. To form the forest of trees, the decomposed network is partitioned at

    fan-out nodes into a set of single output sub networks.

    Each of these sub networks is either a tree or a leaf DAG (Directed Acyclic Graph). A leaf DAG

    is a multi input single output DAG where only the input nodes have fan out greater than one.

    Each leaf DAG is converted into a tree by creating a unique instance of every input node for

    each of its multiple fan-out edges. The optimal circuit implementing each tree is constructed

    using a dynamic programming traversal that proceeds from the leaf nodes to the root node.
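
    The partitioning step described above (cut the network at fan-out nodes, then give each tree its own copy of a shared leaf) can be sketched in Python. The network representation, function name and example netlist are assumptions made for illustration; names that do not appear as keys are treated as primary inputs.

```python
def partition_into_trees(fanins, outputs):
    """Split a DAG into single-output trees rooted at fan-out nodes and outputs.

    fanins:  dict mapping each gate to the list of its fan-in names
    outputs: set of gates that are primary outputs
    Returns {root: (gates_in_tree, leaves)}; a leaf that feeds the tree
    through several edges appears once per edge, i.e. it is duplicated.
    """
    fanout = {}
    for sources in fanins.values():
        for src in sources:
            fanout[src] = fanout.get(src, 0) + 1

    roots = [g for g in fanins if g in outputs or fanout.get(g, 0) > 1]

    trees = {}
    for root in roots:
        gates, leaves, stack = [], [], [root]
        while stack:
            gate = stack.pop()
            gates.append(gate)
            for src in fanins[gate]:
                if src in fanins and src not in roots:
                    stack.append(src)   # internal gate of the same tree
                else:
                    leaves.append(src)  # primary input or root of another tree
        trees[root] = (gates, leaves)
    return trees


# Hypothetical network: g1 fans out to g2 and g3, which both feed "out".
network = {
    "g1": ["a", "b"],
    "g2": ["g1", "c"],
    "g3": ["g1", "d"],
    "out": ["g2", "g3"],
}
for root, (gates, leaves) in partition_into_trees(network, {"out"}).items():
    print(root, sorted(gates), sorted(leaves))
```

    In this example g1 becomes the root of its own tree, and the tree rooted at "out" sees g1 as two separate leaf instances, one per fan-out edge.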

    For every node in the tree an optimal circuit implementing the sub tree extending from the node

    to the leaf nodes is constructed. This circuit consists of a library element that matches a sub

    function rooted at the node and previously constructed circuits implementing its inputs. The cost

    of the circuit is calculated from the cost of the matched library element and the cost of the

    circuits implementing its inputs.

    To find the lowest cost circuit, the DAGON technology mapper first finds all library elements that match sub

    functions rooted at the node. The cost of the circuit using each of these candidate library

    elements is then calculated and the lowest cost circuit is retained. The set of library elements

    is found by searching through the library and using tree matching to determine if each library

    element matches a sub function rooted at the node.

    As an example, let us consider the library shown in figure (a) below and the circuit shown in

    figure (b). The circuit elements are standard cells and their costs are given in terms of the area of

    the cells. The costs of the INV, NAND-2 and AOI-21 cells are 2, 3 and 4 respectively. In figure

    (b), the only library element matching at node E is the NAND-2, and the cost of the optimal

    circuit implementing node E is therefore 3. At node C the only matching library element is also

    the NAND-2. The cost of the NAND-2 is 3 and the cost of the optimal circuit implementing its

    input E is also 3. Therefore, the cumulative cost of the optimal circuit implementing node C is 6.

    Finally, the algorithm will reach node A. For node A there are two matching library elements:

    the INV as used in figure (b) and the AOI-21 as used in figure (c). The circuit constructed using

    the INV matching A includes a NAND-2 implementing node B, a NAND-2 implementing node

    C, an INV implementing node D and a NAND-2 implementing node E. The cumulative cost of

    this circuit is 13. The circuit constructed using the AOI-21 matching A includes a NAND-2

    implementing node E. The cumulative cost of this circuit is 7. The circuit using the AOI-21 is

    therefore the optimal circuit implementing node A.
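
    The cost calculation in this example can be reproduced with a small dynamic-programming sketch in Python. Since the figures are not available here, the circuit topology below is reconstructed from the text and should be read as an assumption: E = NAND(p, q), C = NAND(E, r), D = INV(s), B = NAND(C, D), A = INV(B), with the AOI-21 covering nodes A, B, C and D at once. The candidate matches are listed by hand rather than found by tree matching, and the names p, q, r and s are hypothetical primary inputs.

```python
CELL_COST = {"INV": 2, "NAND-2": 3, "AOI-21": 4}

# For each node: the library cells that match there and the nodes that
# become the inputs of that cell. Names absent from the dict are primary inputs.
MATCHES = {
    "E": [("NAND-2", ["p", "q"])],
    "C": [("NAND-2", ["E", "r"])],
    "D": [("INV", ["s"])],
    "B": [("NAND-2", ["C", "D"])],
    "A": [("INV", ["B"]), ("AOI-21", ["E", "r", "s"])],
}


def best_cover(node, memo=None):
    """Return (lowest cumulative cost, chosen cell) for the sub tree at node."""
    if memo is None:
        memo = {}
    if node not in MATCHES:
        return 0, None  # primary input: nothing to implement
    if node not in memo:
        best = None
        for cell, inputs in MATCHES[node]:
            cost = CELL_COST[cell] + sum(best_cover(i, memo)[0] for i in inputs)
            if best is None or cost < best[0]:
                best = (cost, cell)
        memo[node] = best
    return memo[node]


for name in ["E", "C", "B", "A"]:
    print(name, best_cover(name))
```

    Under these assumptions the sketch gives costs of 3 for E and 6 for C, and at node A it prefers the AOI-21 (cost 7) over the INV (cost 2 plus 11 for node B, i.e. 13), in agreement with the figures quoted in the text.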

    The major obstacle to applying library-based technology mapping to LUT circuits is the large

    number of different functions that a K-input LUT can implement. The function implemented by

    a K-input LUT is determined by the values stored in its 2^K memory bits. Since each bit can

    independently be either 0 or 1, there are 2^(2^K) different Boolean functions of K variables.

    For values of K greater than 3, the library required to represent a K-input LUT becomes very

    large. The size of the library can be reduced by noting that some patterns are equivalent after a

    permutation of inputs. The inversion of outputs or inputs, which is trivially accomplished with

    a LUT, can also produce equivalent patterns.

    Another alternative is to use a partial library tuned to take advantage of the network structure

    likely to be produced by technology independent logic optimization. The limitation of this

    approach is that it precludes some opportunities for optimization of the final circuit.

    LUT-based Technology Mapping:

    The limitations of earlier technology mapping approaches paved the way for the development

    of technology mapping that deals specifically with LUT circuits. The first LUT based technology

    mappers appeared in the 1990s and were later improved to optimize the delay of LUT circuits

    by minimizing the number of levels of LUTs in the final circuit.

    In LUT based FPGAs (for example, XILINX FPGAs) the building blocks are LUTs and flip-flops.

    In a LUT based FPGA chip, the basic programmable logic block is a K-input lookup

    table (K-LUT), which can implement any Boolean function of up to K variables. Technology

    mapping in LUT based FPGA designs covers a general Boolean network using K-LUTs to

    obtain a functionally equivalent K-LUT network. The main objectives in LUT mapping are

    (i). Cost optimal mapping, i.e. minimizing the number of LUTs and minimizing the number of

    CLBs

    (ii). Delay optimal mapping, i.e. minimizing the number of LUT levels and minimizing the

    delays (including routing delays)

    (iii). Maximizing the routability of the mapping schemes.

    LUT based technology mapping can be implemented using two types of algorithms:

    (a). The area algorithm and (b). The delay algorithm

    The Area Algorithm :

    A circuit can be implemented by a given FPGA only if the number of logic blocks in the circuit

    does not exceed the available number of logic blocks and the required connections between the

    logic blocks do not exceed the capacity of the routing network. The area algorithm minimizes

    In figure (a) the shaded OR node is not decomposed and 5 levels of LUTs are required to

    implement the network. However, if the OR node is decomposed into the two nodes shown in

    figure (b), then only 4 levels of LUTs are required.

    The delay algorithm, like the area algorithm, first partitions the original network into a forest of

    trees, maps each tree separately into a circuit of K-input LUTs and then assembles the circuit

    implementing the entire network from the circuits implementing the trees. The trees are mapped

    in a breadth-first order proceeding from the primary inputs toward the primary outputs. This

    ensures that when each tree is mapped, the trees implementing its leaf nodes have already

    been mapped.

    The overall strategy employed by the delay algorithm is to minimize the number of levels of

    LUTs by minimizing the depth of every path in the final circuit. This can result in a circuit that

    contains a large number of LUTs.
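
    The quantity that the delay algorithm minimizes, the number of LUT levels, is simply the length of the longest input-to-output path counted in LUTs. The Python sketch below computes it for two hypothetical mappings of a 4-input OR onto 2-input LUTs; it illustrates how the choice of decomposition changes the depth and is not the delay algorithm itself.

```python
def lut_levels(fanins, node):
    """Number of LUTs on the longest path from any primary input to `node`.

    fanins maps each LUT to its input names; other names are primary inputs.
    """
    if node not in fanins:
        return 0
    return 1 + max(lut_levels(fanins, src) for src in fanins[node])


# Chain decomposition: ((a OR b) OR c) OR d
chain = {"n1": ["a", "b"], "n2": ["n1", "c"], "out": ["n2", "d"]}

# Balanced decomposition: (a OR b) OR (c OR d)
balanced = {"n1": ["a", "b"], "n2": ["c", "d"], "out": ["n1", "n2"]}

print("chain:", lut_levels(chain, "out"))
print("balanced:", lut_levels(balanced, "out"))
```

    Both mappings use three LUTs, but the balanced one needs two levels instead of three, which mirrors the effect of decomposing the OR node in the example above.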

    MULTIPLEXER BASED TECHNOLOGY MAPPING:

    Multiplexer based technology mapping is used for ACTEL FPGAs, whose logic block

    architectures are MUX based. Recent Xilinx devices such as VIRTEX 6 are LUT based but also

    provide dedicated multiplexers in their logic blocks. In Actel FPGAs, the multiplexers are

    small and well suited to the objectives of area

    optimization and minimum delay.

    Circuits usually contain a large number of multiplexers (MUXes). This is mainly true for circuits

    that are automatically synthesized from high-level descriptions. MUXes exist in the data-paths of

    circuits, where they are used to route operands to operators. Also, the control logic is frequently

    specified as a CASE statement in HDL descriptions. MUXes arise as a result of a direct

    translation of CASE statements in HDLs into a logic-level description. Cell libraries also contain

    various choices of MUXes. Cell implementations make use of the fact that a pass gate

    implementation of a MUX is both faster and smaller. In the case of MUX-based FPGAs like

    Actel, MUXes are naturally present in the virtual library. Thus, a method for mapping

    MUXes in the unmapped network to those in the library is desirable.

    Multiplexer synthesis is significant mainly because multiplexer tree circuits suit FPGAs

    such as the ACT family from Actel, where the basic building block

    consists of multiplexers. Each basic building block of the ACT family allows the

    implementation of a multiplexer (a) and, in the case of the ACT 1 family, implementation of

    three hierarchical multiplexers (b), which is denoted by act0. The ACT 2 family allows only a

    restricted realization of three hierarchical multiplexers, as can be seen in Fig. (b).

    Basic building block of the ACT family: (a) ACT 1 family; (b) ACT 2 family.

    The main objective of MUX based technology mapping is to describe a combinational

    circuit in terms of Boolean equations and realize it using the minimum number of basic blocks of

    the target MUX based architecture while minimizing the delay on the critical path.

    In this algorithm, an appropriate base function, a library of cells and a set of pattern graphs are

    selected. As an example, let us select a 2-to-1 multiplexer as the base function.

    The above figure shows two MUX structures, STRUCT and STRUCT1. Four pattern graphs are

    constructed for STRUCT1, as shown in the figure below. If a function is realizable by one STRUCT1

    block, it uses either all the multiplexers, two of them or just one. These pattern graphs are in one-to-one

    correspondence with these possibilities. So, only a very small set of patterns is needed to capture all

    functions realizable by one STRUCT1 block. From the figure it is clear that the pattern

    graph uses all the multiplexers.

    The introduction of the OR gate at the select input of MUX3 increases the number of functions

    realized by the block. From an algorithmic point of view it creates some problems, but a

    modification of the algorithm is considered to accommodate the OR structure.
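
    To make the structure concrete, the Python sketch below models a block of three 2-to-1 multiplexers with an OR gate on the select input of the third one, and realizes a few functions by tying inputs to constants or variables. Because the figure is not available here, the exact wiring and the pin names (a0, a1, sa, b0, b1, sb, s0, s1) are assumptions, chosen to resemble an ACT 1 style module.

```python
from itertools import product


def mux2(sel, d0, d1):
    """2-to-1 multiplexer: returns d1 when sel is 1, otherwise d0."""
    return d1 if sel else d0


def struct1(a0, a1, sa, b0, b1, sb, s0, s1):
    """Three 2-to-1 multiplexers with an OR gate on the select of MUX3."""
    mux1_out = mux2(sa, a0, a1)
    mux2_out = mux2(sb, b0, b1)
    return mux2(s0 | s1, mux1_out, mux2_out)


def and2(x, y):
    # Effectively one multiplexer: MUX3 selects 0 or y under control of x.
    return struct1(0, 0, 0, y, y, 0, x, 0)


def xor2(x, y):
    # All three multiplexers: MUX1 passes y, MUX2 produces NOT y.
    return struct1(0, 1, y, 1, 0, y, x, 0)


def or3(x, y, z):
    # The OR gate on the select input absorbs one extra variable.
    return struct1(z, z, 0, 1, 1, 0, x, y)


# Exhaustive check of the three realizations against their definitions.
for x, y, z in product((0, 1), repeat=3):
    assert and2(x, y) == (x & y)
    assert xor2(x, y) == (x ^ y)
    assert or3(x, y, z) == (x | y | z)
print("all truth tables match")
```

    The three realizations correspond to the cases mentioned above: a function may need only one of the multiplexers or all three, and the OR gate lets a single block cover functions such as a 3-input OR that the plain three-multiplexer tree would not realize in the same way.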

    The advantage of MUX based technology mapping is that it generates optimal mappings, which

    are often much better than those produced by conventional heuristic techniques. Moderately

    large circuits can be mapped optimally in a small amount of time. Very large circuits can be

    mapped near-optimally by partitioning the circuits and mapping each partition individually.

    ---------xxx--------------
