XC4000E Field Programmable Gate Array Family



  • ©1995 Xilinx Inc. For the latest revision of the specifications, see the Xilinx WEBLINX at http://www.xilinx.com.

    September 1, 1995 (Version 1.03) Preliminary Product Specifications

    Features

    • Third Generation Field-Programmable Gate Arrays
      - Select-RAM™ memory: on-chip ultra-fast RAM with
        - synchronous write option
        - dual-port RAM option
      - Fully PCI compliant
      - Abundant flip-flops
      - Flexible function generators
      - Dedicated high-speed carry-propagation circuit
      - Wide edge decoders (four per edge)
      - Hierarchy of interconnect lines
      - Internal 3-state bus capability
      - Eight global low-skew clock or signal distribution networks
    • Flexible Array Architecture
      - Programmable logic blocks and I/O blocks
      - Programmable interconnects and wide decoders
    • Sub-micron CMOS Process
      - High-speed logic and interconnect
      - Low power consumption
    • Systems-Oriented Features
      - IEEE 1149.1-compatible boundary scan logic support
      - Programmable output slew rate (2 modes)
      - Programmable input pull-up or pull-down resistors
      - 12-mA sink current per output
      - 24-mA sink current per output pair
    • Configured by Loading Binary File
      - Unlimited reprogrammability
      - Six programming modes
      - Readback capability
    • Backward Compatible with XC4000 Family
    • XACTstep Development System runs on ’386/’486/Pentium-type PC, Sun-4, and Hewlett-Packard 700 series
      - Interfaces to popular design environments including VIEWlogic, Mentor Graphics and OrCAD
      - Fully automatic partitioning, placement and routing
      - Interactive design editor for design optimization
      - Unified Libraries, including 288 soft macros and 34 Relationally Placed Macros (RPMs)
      - RAM/ROM compiler

    Introduction

    The XC4000E family of high-performance, high-density Field Programmable Gate Arrays (FPGAs) provides the benefits of custom CMOS VLSI, while avoiding the initial cost, time delay, and inherent risk of a conventional masked gate array.

    The result of eleven years of FPGA design experience and feedback from thousands of customers, the XC4000E family combines architectural versatility, on-chip Select-RAM memory with edge-triggered and dual-port modes, increased speed, abundant routing resources, and new, sophisticated software to achieve fully automated implementation of complex, high-performance designs.

    Table 1: XC4000E Family of Field Programmable Gate Arrays

    Device                        XC4003E  XC4005E  XC4006E  XC4008E  XC4010E  XC4013E  XC4020E  XC4025E
    Approximate Gate Count          3,000    5,000    6,000    8,000   10,000   13,000   20,000   25,000
    CLB Matrix                    10 x 10  14 x 14  16 x 16  18 x 18  20 x 20  24 x 24  28 x 28  32 x 32
    Number of CLBs                    100      196      256      324      400      576      784    1,024
    Number of Flip-Flops              360      616      768      936    1,120    1,536    2,016    2,560
    Max Decode Inputs per Side         30       42       48       54       60       72       84       96
    Max RAM Bits                    3,200    6,272    8,192   10,368   12,800   18,432   25,088   32,768
    Number of IOBs                     80      112      128      144      160      192      224      256
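    The resource counts in Table 1 scale regularly with the CLB matrix size. The sketch below reproduces them for an n x n array. The per-CLB and per-IOB figures (two flip-flops each, one 16x2 RAM per CLB) come from the text of this document; the factors of 8 IOBs and 3 decode inputs per row are inferred from the table values and should be read as assumptions, as should the function name.

```python
# Sketch: derive Table 1 resource counts from the CLB matrix size n (n x n array).
# The 8*n and 3*n relations are fitted to Table 1, not published formulas.

def xc4000e_resources(n: int) -> dict:
    clbs = n * n                  # CLB matrix is n x n
    iobs = 8 * n                  # inferred: 80 IOBs for 10 x 10, 256 for 32 x 32
    return {
        "clbs": clbs,
        "iobs": iobs,
        "flip_flops": 2 * clbs + 2 * iobs,     # two per CLB plus two per IOB
        "max_decode_inputs_per_side": 3 * n,   # inferred: 30 for n = 10
        "max_ram_bits": 32 * clbs,             # each CLB holds a 16x2 (32-bit) RAM
    }


if __name__ == "__main__":
    for device, n in [("XC4003E", 10), ("XC4013E", 24), ("XC4025E", 32)]:
        print(device, xc4000e_resources(n))
```

    A helper of this kind is mainly useful as a consistency check when transcribing the table.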


    The XC4000E family has 8 members, ranging in density from 3,000 to 25,000 gates, as shown in Table 1.

    Description

    XC4000E-family devices are implemented with a regular, flexible, programmable architecture of Configurable Logic Blocks (CLBs), interconnected by a powerful hierarchy of versatile routing resources, and surrounded by a perimeter of programmable Input/Output Blocks (IOBs). They have generous routing resources to accommodate the most complex interconnect patterns.

    The devices are customized by loading configuration data into the internal memory cells. The FPGA can either actively read its configuration data from an external serial or byte-parallel PROM (master modes), or the configuration data can be written into the FPGA (slave and peripheral modes).

    The XC4000E family is supported by powerful and sophisticated software, covering every aspect of design from schematic or behavioral entry, floorplanning, simulation, automatic block placement and routing of interconnects, to the creation, downloading, and readback of the configuration bit stream.

    Because Xilinx FPGAs can be reprogrammed an unlimited number of times, they can be used in innovative designs where hardware is changed dynamically, or where hardware must be adapted to different user applications. FPGAs are ideal for shortening the design and development cycle, but they also offer a cost-effective solution for production rates well beyond 5000 systems per month. For fastest time-to-high-volume, a design can first be implemented in the XC4000E, then migrated to one of Xilinx’ 100%-compatible HardWire mask-programmed devices.

    Table 2 shows density and performance for a few common circuit functions that can be implemented in XC4000E devices.

    Taking Advantage of Reconfiguration

    FPGA devices can be reconfigured to change logic function while resident in the system. This capability gives the system designer a new degree of freedom not available with any other type of logic.

    Hardware can be changed as easily as software. Design updates or modifications are easy, and can be made to products already in the field. An FPGA can even be reconfigured dynamically to perform different functions at different times.

    Reconfigurable logic can be used to implement system self-diagnostics, create systems capable of being reconfigured for different environments or operations, or implement dual-purpose hardware for a given application. As an added benefit, use of reconfigurable FPGA devices simplifies hardware design and debugging and shortens product time-to-market.

    XC4000E Compared to XC4000

    Any XC4000E device is pin-out and bitstream compatible with the corresponding XC4000 device. An existing XC4000 bitstream can be used to program an XC4000E device. However, since the XC4000E includes many new features, an XC4000E bitstream cannot always be loaded into an XC4000 device.

    Table 2: Density and Performance for Several Common Circuit Functions

    Design Class  Function                                                   CLBs Used  -3 Speed  -2 Speed  Units
    Memory        32 x 16 bit FIFO (simultaneous read/write)                        48        61            MHz
                  32 x 16 bit FIFO (MUXed read/write)                               32        61            MHz
                  256 x 8 Single Port                                               72        66            MHz
    Logic         16 bit Loadable Counter                                            8        70            MHz
                  16 bit Up/Down Counter                                             8        70            MHz
                  16 bit Pre-Scaled Counter                                          8       154            MHz
                  24 bit Accumulator                                                13        58            MHz
                  16 bit Address Decoder, XC4005E (pin-to-pin, edge decode)          0      12.5            ns
                  16 bit Address Decoder (internal decode)                           3       4.7            ns
                  9 bit Parity Checker                                               1       3.6            ns
                  9 bit Shift Register (with enable)                                 5       170            MHz


    For those readers already familiar with the XC4000 family of Xilinx Field Programmable Gate Arrays, the major new features in the XC4000E family are listed in this section. The biggest advantages of switching to an XC4000E device are the significantly increased system speed and the new architectural features, particularly Select-RAM memory.

    Increased System Speed

    Delays in FPGA-based designs are layout dependent. As a rule of thumb, the system clock rate should not exceed one third to one half of the specified toggle rate. Critical portions of a design, such as shift registers and simple counters, can run faster, at approximately two thirds of the specified toggle rate.
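    The rule of thumb above is simple arithmetic on the specified toggle rate. The following sketch applies it; the function name, the dictionary keys, and the 150 MHz figure in the usage line are illustrative assumptions, not datasheet parameters.

```python
# Sketch of the clock-rate rule of thumb. Input: the specified toggle rate in MHz.

def clock_rate_guidelines(toggle_rate_mhz: float) -> dict:
    return {
        # General logic: stay below one third to one half of the toggle rate.
        "system_clock_limit_low_mhz": toggle_rate_mhz / 3,
        "system_clock_limit_high_mhz": toggle_rate_mhz / 2,
        # Shift registers and simple counters: about two thirds of the toggle rate.
        "critical_path_mhz": toggle_rate_mhz * 2 / 3,
    }


if __name__ == "__main__":
    # 150 MHz is a hypothetical toggle rate chosen only to make the arithmetic easy.
    print(clock_rate_guidelines(150.0))
```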

    The XC4000E family can run at synchronous system clock rates of up to 70 MHz and internal performance in excess of 150 MHz. This increase in performance over the previous families stems from improvements in both device processing and system architecture. XC4000E-family devices use a deep sub-micron triple-layer metal process. In addition, many architectural improvements have been made, as described below.

    PCI Compliance

    XC4000E-3 and faster speed grades are fully PCI compliant. The XC4000E offers a one-chip PCI solution.

    Carry Logic

    The speed of the carry logic chain has increased dramatically. Some parameters, such as the delay on the carry chain through a single CLB (TBYP), have improved by as much as 50% from XC4000 values.

    Select-RAM Memory: Edge-Triggered, Synchronous RAM Modes

    The RAM in any CLB can be changed to synchronous, edge-triggered, write operation. In this mode, the internal write operation is controlled by the same clock that drives the flip-flops. The clock polarity is programmable for the RAM (both F and G function generators together), but is independent of the chosen flip-flop polarity. Address, Data, and WE inputs are latched by this rising or falling clock edge, and a short internal write pulse is generated shortly after the clock edge. This self-timed write operation is thus effectively edge-triggered.

    The read operation is not affected by this change to an edge-triggered write.

    Dual-Port RAM

    A separate option converts the 16x2 RAM in any CLB into a 16x1 dual-port RAM with simultaneous Read/Write. In this mode, any operation that writes into the F-RAM automatically also writes into the G-RAM, using the F address. The G-address can only read from the G-RAM; it cannot be used to write into the G-RAM.

    The CLB can thus be used as an asymmetrical dual-port RAM, with F being the read address for the F-RAM and the write address for both F- and G-RAM, while G is the read address for the G-RAM. Note that F and G can still be independent read addresses, as they are in XC4000. The two RAMs together have one read/write port using the F address, and one read-only port using the G address.

    The function generators in each CLB can be configured as either level-sensitive (asynchronous) single-port RAM, edge-triggered (synchronous) single-port RAM, edge-triggered (synchronous) dual-port RAM or as combinatorial logic.
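    The asymmetrical dual-port behavior described above can be illustrated with a small behavioral model. This is a sketch only: it ignores clocking and timing, and the class and method names are hypothetical.

```python
# Behavioral sketch of one CLB in dual-port mode (no timing, no clock edges).

class DualPortRam16x1:
    def __init__(self):
        self.f_ram = [0] * 16   # read/write port, addressed by F
        self.g_ram = [0] * 16   # read-only port, addressed by G

    def write(self, f_addr: int, bit: int) -> None:
        # A write through the F address updates both arrays at the same location,
        # so the two arrays always hold identical contents.
        self.f_ram[f_addr] = bit & 1
        self.g_ram[f_addr] = bit & 1

    def read(self, f_addr: int, g_addr: int) -> tuple:
        # F and G remain independent read addresses.
        return self.f_ram[f_addr], self.g_ram[g_addr]


if __name__ == "__main__":
    ram = DualPortRam16x1()
    ram.write(5, 1)
    print(ram.read(5, 0), ram.read(0, 5))
```

    Because every write reaches both arrays, the G port always sees the data written through the F port, which is what lets the pair act as one 16x1 memory with two read addresses.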

    Configurable RAM Content

    The RAM content can now be configured, so that the RAM starts up with user-defined data.

    H Function Generator

    In the XC4000E, the H function generator is more versatile. Its inputs can come not only from the F and G function generators but also from up to three control input lines. The H function generator can be totally or partially independent of the other two function generators.

    IOB Clock Enable

    The two flip-flops in each IOB have a common clock enable input, which through configuration can be activated individually for the input or output flip-flop or both. This clock enable operates exactly like the EC pin on the XC4000 CLB. This new feature makes the IOBs more versatile, and avoids the need for clock gating.

    Output Drivers

    The output pull-up structure defaults to a TTL-like totem-pole. This driver is an n-channel pull-up transistor, pulling to a voltage one threshold below Vcc, just like the XC4000 outputs. Alternatively, the XC4000E can be globally configured with CMOS outputs, with p-channel pull-up transistors pulling to Vcc. Also, the configurable pull-up resistor in XC4000E is a p-channel transistor that pulls to Vcc, whereas in the XC4000 it is an n-channel transistor that pulls to a voltage one threshold below Vcc.

    Input Thresholds

    The input thresholds can be globally configured for either TTL (1.2 V threshold) or CMOS (2.5 V threshold), just like XC2000 and XC3000 inputs. Note that the two global adjustments of input threshold and output level are independent of each other.

    Global Signal Access to Logic

    There is additional access from global clocks to the F and G function generator inputs.

    Configuration Pin Pull-Up Resistors

    During configuration, the three mode pins, M0, M1, and M2, have weak pull-up resistors. For the most popular configuration mode, Slave Serial, the mode pins can thus be left unconnected.


    For user mode, the three mode inputs can individually be configured with or without weak pull-up or pull-down resistors.

    The PROGRAM input pin has a permanent weak pull-up.

    Soft Startup

    Like the XC3000A, XC4000E devices have “Soft Startup.” When the configuration process is finished and the device starts up in user mode, the first activation of the outputs is automatically slew-rate limited. This feature avoids the potential ground bounce when all outputs are turned on simultaneously. Immediately after start-up, the slew rate of the individual outputs is, as in the XC4000 family, determined by the individual configuration option.

    XC4000 and XC4000A Compatibility

    Existing XC4000 bitstreams can be used to configure an XC4000E device. Although they are pin-for-pin compatible, XC4000A bitstreams must be recompiled for use with the XC4000E, due to improved routing resources.

    Detailed Functional Description

    The XC4000E family devices achieve high speed through advanced semiconductor technology and improved architecture. The XC4000E supports system clock rates of up to 70 MHz and internal performance in excess of 150 MHz. Compared to older Xilinx FPGA families, the XC4000E family is more powerful. It offers on-chip edge-triggered and dual-port RAM, clock enables on I/O flip-flops, and wide-input decoders. It is more versatile in many applications, especially those involving RAM. Design cycles are faster due to a combination of increased routing resources and more sophisticated software.

    Table 3: CLB Count of Selected XC4000E Soft Macros

    7400 Equivalents           CLBs
    ’138                          5
    ’139                          2
    ’147                          5
    ’148                          6
    ’150                          5
    ’151                          3
    ’152                          3
    ’153                          2
    ’154                         16
    ’157                          2
    ’158                          2
    ’160                          5
    ’161                          6
    ’162                          8
    ’163                          8
    ’164                          4
    ’165s                         9
    ’166                          5
    ’168                          7
    ’174                          3
    ’194                          5
    ’195                          3
    ’280                          3
    ’283                          8
    ’298                          2
    ’352                          2
    ’390                          3
    ’518                          3
    ’521                          3

    Barrel Shifters            CLBs
    brlshft4                      4
    brlshft8                     13

    Multiplexers               CLBs
    m2-1e                         1
    m4-1e                         1
    m8-1e                         3
    m16-1e                        5

    4-Bit Counters             CLBs
    cd4cd                         3
    cd4cle                        5
    cd4rle                        6
    cb4ce                         3
    cb4cle                        6
    cb4re                         5

    8- and 16-Bit Counters     CLBs
    cb8ce                         6
    cb8re                        10
    cc16ce                        9
    cc16cle                       9
    cc16cled                     21

    Registers                  CLBs
    rd4r                          2
    rd8r                          4
    rd16r                         8

    Shift Registers            CLBs
    sr8ce                         4
    sr16re                        8

    Decoders                   CLBs
    d2-4e                         2
    d3-8e                         4
    d4-16e                       16

    Identity Comparators       CLBs
    comp4                         1
    comp8                         2
    comp16                        5

    Magnitude Comparators      CLBs
    compm4                        4
    compm8                        9
    compm16                      20

    RAMs                       CLBs
    ram16x4                       2
    ram16x4s                      2
    ram16x4d                      4

    Explanation of RAM nomenclature

    s = single-port edge-triggered
    d = dual-port edge-triggered
    no extension = level-sensitive

    Explanation of counter nomenclature

    cb = binary counter
    cd = BCD counter
    cc = cascadable binary counter
    d = bidirectional
    l = loadable
    x = cascadable
    e = clock enable
    r = synchronous reset
    c = asynchronous clear
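    The counter nomenclature in Table 3 is regular enough to decode mechanically. The sketch below splits a counter macro name into its type prefix, bit width, and option letters. The function name and the returned dictionary layout are assumptions made for illustration; only the letter meanings come from the table.

```python
import re

# Letter meanings taken from the counter nomenclature in Table 3.
COUNTER_TYPES = {
    "cb": "binary counter",
    "cd": "BCD counter",
    "cc": "cascadable binary counter",
}
OPTION_LETTERS = {
    "d": "bidirectional",
    "l": "loadable",
    "x": "cascadable",
    "e": "clock enable",
    "r": "synchronous reset",
    "c": "asynchronous clear",
}


def decode_counter_name(name: str) -> dict:
    """Split a name such as 'cb8re' into type, width, and options."""
    match = re.fullmatch(r"(cb|cd|cc)(\d+)([a-z]*)", name.lower())
    if match is None:
        raise ValueError(f"not a counter macro name: {name!r}")
    prefix, width, suffix = match.groups()
    unknown = [ch for ch in suffix if ch not in OPTION_LETTERS]
    if unknown:
        raise ValueError(f"unknown option letters in {name!r}: {unknown}")
    return {
        "type": COUNTER_TYPES[prefix],
        "bits": int(width),
        "options": [OPTION_LETTERS[ch] for ch in suffix],
    }


if __name__ == "__main__":
    for macro in ("cb8re", "cc16cled"):
        print(macro, decode_counter_name(macro))
```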


    Basic Building Blocks

    Xilinx high-density user-programmable gate arrays include three major configurable elements: configurable logic blocks (CLBs), input/output blocks (IOBs), and interconnections.

    • CLBs provide the functional elements for constructing the user’s logic.
    • IOBs provide the interface between the package pins and internal signal lines.
    • Programmable interconnect resources provide routing paths to connect the inputs and outputs of the CLBs and IOBs onto the appropriate networks.

    Three other types of circuits are also available:

    • 3-State buffers (TBUFs) driving horizontal Longlines are associated with each CLB.
    • Wide edge decoders are available around the periphery of each device.
    • An on-chip oscillator is provided.

    The functionality of each circuit block is customized during configuration by programming internal static memory cells. The values stored in these memory cells determine the logic functions and interconnections implemented in the FPGA.

    Each of these available circuits is described in this section.

    Figure 1: Simplified Block Diagram of XC4000E CLB (RAM and Carry Logic functions not shown)

    [Block diagram X6460: function generator G (logic function of G1-G4, output G’), function generator F (logic function of F1-F4, output F’), and function generator H (logic function of F’, G’, and H1, output H’); control inputs C1-C4 mapped to H1, DIN/H2, SR/H0, and EC; two D flip-flops with S/R control and EC, clocked by K (CLOCK), driving outputs YQ and XQ; combinatorial outputs Y and X; bypass paths; multiplexers controlled by the configuration program.]


    Configurable Logic Blocks (CLBs)

    Configurable Logic Blocks implement most of the logic in an FPGA. The principal CLB elements are shown in Figure 1. The number of CLBs needed to implement selected soft macros is shown in Table 3.

    Two 4-input function generators (F and G) offer unrestricted versatility. Most combinatorial logic functions need four or fewer inputs. However, a third function generator (H) is provided. The H function generator has three inputs. Up to two of these inputs can be the outputs of F and G; the remaining input(s) come from outside the CLB. The CLB can therefore implement certain functions of up to nine variables, like parity check or expandable-identity comparison of two sets of four inputs.

    Each CLB contains two flip-flops that can be used to store the function generator outputs. However, the flip-flops and function generators can also be used independently. DIN can be used as a direct input to either of the two flip-flops. H1 can drive the other flip-flop through the H function generator. Function generator outputs can also be accessed from outside the CLB, using two outputs independent of the flip-flop outputs. This versatility increases logic density and simplifies routing.

    Thirteen CLB inputs and four CLB outputs provide access to the function generators and flip-flops. These inputs and outputs connect to the programmable interconnect resources outside the block.

    Function Generators

    Four independent inputs are provided to each of two function generators (F1 - F4 and G1 - G4). These function generators, whose outputs are labeled F’ and G’, are each capable of implementing any arbitrarily defined Boolean function of four inputs. The function generators are implemented as memory look-up tables. The propagation delay is therefore independent of the function implemented.

    A third function generator, labeled H’, can implement any Boolean function of its three inputs. Two of these inputs can optionally be the F’ and G’ function generator outputs. Alternatively, one or both of these inputs can come from outside the CLB (H2, H0). The third input must come from outside the block (H1).

    Signals from the function generators can exit the CLB on two outputs. F’ or H’ can be connected to the X output. G’ or H’ can be connected to the Y output.
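    To make the look-up-table idea concrete, the sketch below models a function generator as a precomputed truth table and combines F, G, and H into the nine-variable parity check mentioned earlier. It is a software illustration only; the helper names (make_lut, parity9) are hypothetical and nothing here reflects the actual configuration bitstream format.

```python
# Sketch: function generators as look-up tables, combined into a 9-input parity check.

def make_lut(func, n_inputs):
    """Precompute the truth table of func and return a table-lookup function."""
    table = [
        func(*[(index >> pos) & 1 for pos in range(n_inputs)]) & 1
        for index in range(2 ** n_inputs)
    ]

    def lut(*bits):
        # The inputs form the address into the table; lookup cost does not
        # depend on which Boolean function was stored.
        index = sum((bit & 1) << pos for pos, bit in enumerate(bits))
        return table[index]

    return lut


F = make_lut(lambda a, b, c, d: a ^ b ^ c ^ d, 4)   # parity of F1-F4
G = make_lut(lambda a, b, c, d: a ^ b ^ c ^ d, 4)   # parity of G1-G4
H = make_lut(lambda f, g, h1: f ^ g ^ h1, 3)        # combines F', G', and H1


def parity9(f_inputs, g_inputs, h1):
    """Parity of nine bits using two 4-input tables and one 3-input table."""
    return H(F(*f_inputs), G(*g_inputs), h1)


if __name__ == "__main__":
    print(parity9([1, 0, 1, 1], [0, 0, 1, 0], 1))
```

    The same structure covers any function that can be decomposed into two independent 4-input functions whose results are merged with one external signal.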

    A CLB can be used to implement any of the following functions:

    • any function of up to four variables, plus any second function of up to four unrelated variables, plus any third function of up to three unrelated variables (see footnote 1)
    • any single function of five variables
    • any function of four variables together with some functions of six variables
    • some functions of up to nine variables

    Implementing wide functions in a single block reduces both the number of blocks required and the delay in the signal path, achieving both increased density and speed.

    The versatility of the CLB function generators significantly improves system speed. In addition, the design-software tools can deal with each function generator independently. This flexibility improves cell usage.

    Flip-Flops

    The CLB can pass the combinatorial output(s) to the interconnect network, but can also store the combinatorial results or other incoming data in one or two storage elements, and connect their outputs to the interconnect network as well.

    The two storage elements in the CLB are edge-triggered D-type flip-flops with common clock (K) and clock enable (EC) inputs. Flip-flop functionality is described in Table 4.

    Table 4: CLB Flip-Flop Functionality (no optional inversions used)

    Mode               K      EC     SR     D      Q
    Power-Up or GSR    X      X      X      X      SR
    Flip-Flop          X      X      1      X      SR
                       X      0      0*     X      Q
                       __/    1*     0*     D      D

    LEGEND:
    X      Don’t care
    __/    Rising edge
    SR     Set or Reset value specified with INIT property. Reset is default.
    0*     Input is Low or unconnected (default value)
    1*     Input is High or unconnected (default value)

    Clock Input

    Each flip-flop can be triggered on either the rising or falling clock edge. The clock pin is shared by both flip-flops; however, the clock is independently invertible for the two flip-flops. Any inverter placed on the clock input is absorbed into the CLB.

    1. When three separate functions are generated, one of the function outputs must be captured in a flip-flop internal to the CLB. Only two unregistered function generator outputs are available from the CLB.


    Clock Enable

    The clock enable signal (EC) is active High. If unconnected, it defaults to the active state. EC can be left unconnected for either or both flip-flops; therefore, the control is independent. However, the input is shared by both flip-flops in a CLB. EC is not invertible within the CLB.

    Set/Reset

    An asynchronous flip-flop input (SR) can be configured as either set or reset. This configuration option determines the state in which the flip-flops become operational after configuration. It also determines the effect of a Global Set/Reset pulse during normal operation, and the effect of a pulse on the SR pin of the CLB. All three set/reset functions for any single flip-flop are controlled by the same data bit.

    The set/reset state can be independently specified for each flip-flop. This input can also be disabled for either flip-flop.

    The set/reset state is specified by using the INIT attribute or by placing the appropriate set or reset flip-flop primitive.

    SR is active high and can affect both flip-flops. It is not invertible within the CLB.
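    The behavior summarized in Table 4, together with the clock enable and set/reset descriptions, can be captured in a short behavioral model. This is a sketch for rising-edge clocking with no optional inversions; the class name, method name, and argument defaults are assumptions chosen to mirror the default (unconnected) values in the table.

```python
# Behavioral sketch of one CLB flip-flop as described by Table 4.

class ClbFlipFlop:
    def __init__(self, init: int = 0):
        self.init = init & 1    # SR value: 0 = reset (the default), 1 = set
        self.q = self.init      # state after power-up or GSR
        self._last_k = 0        # previous clock level, used to detect edges

    def step(self, k: int, d: int, ec: int = 1, sr: int = 0) -> int:
        """Apply one set of input levels and return Q.

        ec defaults to 1 and sr to 0, matching the unconnected defaults.
        """
        if sr:
            # Asynchronous set/reset: Q takes the configured SR value.
            self.q = self.init
        elif k and not self._last_k and ec:
            # Rising clock edge with the clock enabled: capture D.
            self.q = d & 1
        # Otherwise Q holds its value.
        self._last_k = k
        return self.q


if __name__ == "__main__":
    ff = ClbFlipFlop()
    print([ff.step(k, d) for k, d in [(0, 1), (1, 1), (0, 0), (1, 0)]])
```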

    Global Set/Reset

    A separate Global Set/Reset line (not shown in Figure 1) sets or clears each register during power-up, reconfiguration, or when a dedicated Reset net is driven active. This global net (GSR) does not compete with other routing resources.

    GSR can be driven from any package pin as a global reset input. To use this global net, place an input pad and input buffer in the schematic or VHDL code, driving the GSR pin of the STARTUP symbol. A specific pin location can be assigned to this input, just as for any other user-programmable pad. An inverter can optionally be inserted after the input buffer to invert the sense of the Set/Reset signal.

    GSR can also be driven from any internal node.

    Data Inputs and Outputs

    The source of a flip-flop data input is programmable. It is driven by any of the functions F’, G’, and H’, or by the Direct In (DIN) block input. The flip-flops drive the XQ and YQ CLB outputs.

    The XQ and YQ outputs are also used by the place and route software to form a fast bypass through the XC4000E CLB. A two-to-one multiplexer selects between a flip-flop output and either the DIN or EC input. This bypass is used by the automated router to repower internal signals.

    Control Signals

    Multiplexers in the CLB map the four control inputs (C1 - C4 in Figure 1) into the four internal control signals (H1, DIN/H2, SR/H0, and EC). Any of these inputs can drive any of the four internal control signals.

    When the memory function is disabled, the four inputs are:

    • EC: Enable Clock
    • SR/H0: Asynchronous Set/Reset or H function generator Input 0
    • DIN/H2: Direct In or H function generator Input 2
    • H1: H function generator Input 1

    When the memory function is enabled, the four inputs are:

    • EC: Enable Clock
    • WE: Write Enable
    • D0: Data Input to F and/or G function generator
    • D1: Data input to G function generator (16x1 and 16x2 modes) or 5th Address bit (32x1 mode)

    Using FPGA Flip-Flops

    When a function generator drives a flip-flop in a CLB, the combinatorial propagation delay overlaps completely with the setup time of the flip-flop. The set-up time is specified between the function generator inputs and the clock input. This represents a performance advantage over competing technologies, where combinatorial delays must be added to the flip-flop setup time.

    The abundance of flip-flops in the XC4000E-family devices invites pipelined designs. This is a powerful way of increasing performance by breaking the function into smaller subfunctions and executing them in parallel, passing on the results through pipeline flip-flops. This method should be seriously considered wherever total performance is more important than simple through-delay.

    In the XC4000E family, the flip-flops can be used as registers or shift registers without blocking the function generators from performing a different, perhaps unrelated task. This ability increases the functional density of the devices.

    Using Function Generators as RAM

    The XC4000E family devices are the first programmable logic devices with edge-triggered (synchronous) and dual-port RAM accessible to the user. Edge-triggered RAM simplifies system timing. Dual-port RAM doubles the effective throughput of FIFO applications. These features can be individually programmed in any XC4000E CLB.

    Optional modes for each CLB make the memory look-up tables in the F’ and G’ function generators usable as an array of Read/Write memory cells. Available modes are level-sensitive (similar to the XC4000/A/H families), edge-triggered, and dual-port edge-triggered. Depending on the selected mode, a single CLB can be configured as either a 16x2, 32x1, or 16x1 bit array.


    RAM Configuration Options

    The function generators in any CLB can be configured as RAM arrays in the following sizes:

    • Two 16x1 RAMs: two data inputs and two data outputs with identical or, if preferred, different addressing for each RAM
    • One 32x1 RAM: one data input and one data output

    One F or G function generator can be configured as a 16x1 RAM while the other function generators are used to implement any function of up to 5 inputs.

    Additionally, the XC4000E RAM may have either of two timing modes:

    • Edge-Triggered (Synchronous): data written by the designated edge of the CLB clock. WE acts as a true clock enable.
    • Level-Sensitive: an external WE signal must be supplied asynchronously.

    The selected timing mode applies to both function generators within a CLB when both are configured as RAM.

    The number of read ports is also programmable:

    • Single Port: each function generator has a read port and a write port
    • Dual Port: both function generators are configured as a single 16x1 dual-port RAM with one write port and two read ports. Simultaneous read and write operations to the same or different addresses are supported.

    Supported CLB memory configurations and timing modes for single- and dual-port modes are shown in Table 5.

    RAM configuration options are selected by placing the appropriate library symbol.

    Table 5: Supported RAM Modes

                   16x1   16x2   32x1   Edge-Triggered Timing   Level-Sensitive Timing
    Single-Port     X      X      X               X                        X
    Dual-Port       X                             X

    RAM Inputs and Outputs

    The F1-F4 and G1-G4 inputs to the function generators act as address lines, selecting a particular memory cell in each look-up table.

    The functionality of the CLB control signals changes when the function generators are configured as RAM. The DIN/H2, H1, and SR/H0 lines become the two data inputs (D0, D1) and the Write Enable (WE) input for the 16x2 memory. When the 32x1 configuration is selected, D1 acts as the fifth address bit and D0 is the data input.

    The contents of the memory cell(s) being addressed are available at the F’ and G’ function-generator outputs. They can exit the CLB through its X and Y outputs, or can be captured in the CLB flip-flop(s).

    Configuring the CLB function generators as Read/Write memory does not affect the functionality of the other portions of the CLB, with the exception of the redefinition of the control signals. The H’ function generator can be used to implement Boolean functions of F’, G’, and D1, and the D flip-flops can latch the F’, G’, H’, or D0 signals.
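    One way to picture the 32x1 configuration is as two 16x1 arrays that share the same four address lines, with D1 choosing between them. The model below is a sketch under stated assumptions: the text says only that D1 acts as the fifth address bit, so the choice that D1 = 0 selects the F array and D1 = 1 the G array is arbitrary, and the class and method names are hypothetical.

```python
# Behavioral sketch of a CLB configured as a 32x1 RAM (no timing modeled).

class ClbRam32x1:
    def __init__(self):
        self.f_ram = [0] * 16
        self.g_ram = [0] * 16

    def write(self, addr4: int, d1: int, d0: int) -> None:
        # Assumption: D1 = 0 targets the F array, D1 = 1 the G array.
        target = self.g_ram if d1 else self.f_ram
        target[addr4] = d0 & 1

    def read(self, addr4: int, d1: int) -> int:
        # Both arrays are read at the same 4-bit address; the H' function
        # then acts as a 2-to-1 selector of F' and G' controlled by D1.
        f_out = self.f_ram[addr4]
        g_out = self.g_ram[addr4]
        return g_out if d1 else f_out


if __name__ == "__main__":
    ram = ClbRam32x1()
    ram.write(3, 1, 1)
    print(ram.read(3, 0), ram.read(3, 1))
```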

    Single-Port Edge-Triggered Mode

    Edge-triggered RAM simplifies timing requirements. The XC4000E edge-triggered RAM timing operates like writing to a data register. Data and address are presented. The register is enabled for writing by a logic High on the write enable input, WE. Then a rising or falling clock edge loads the data into the register, as shown in Figure 2.

    Careful timing relationships between address, data, and write enable signals are not required, and the external write enable pulse becomes a simple clock enable. The rising edge of WCLK latches the address, input data, and WE signals. An internal write pulse is generated that performs the write. See Figure 3 and Figure 4 for block diagrams of a CLB configured as 16x2 and 32x1 edge-triggered, single-port RAM.
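    The register-like write behavior described above can be sketched as follows. The model treats WE as a clock enable and performs the write on the active WCLK edge, while reads are unclocked. It abstracts away the internal write pulse and all timing parameters, and the class and method names are hypothetical.

```python
# Behavioral sketch of a 16x1 edge-triggered single-port RAM.

class EdgeTriggeredRam16x1:
    def __init__(self, rising_edge: bool = True):
        self.cells = [0] * 16
        self.rising_edge = rising_edge   # active WCLK edge; rising is the default
        self._last_wclk = 0

    def clock(self, wclk: int, we: int, addr: int, data: int) -> None:
        """Present WCLK, WE, address, and data levels to the RAM."""
        previous, current = self._last_wclk, wclk & 1
        self._last_wclk = current
        wanted = (0, 1) if self.rising_edge else (1, 0)
        # Address, data, and WE are sampled on the active edge only;
        # WE behaves as a clock enable for the write.
        if (previous, current) == wanted and we:
            self.cells[addr] = data & 1

    def read(self, addr: int) -> int:
        # Reads are not clocked: the output simply follows the address.
        return self.cells[addr]


if __name__ == "__main__":
    ram = EdgeTriggeredRam16x1()
    ram.clock(wclk=1, we=1, addr=2, data=1)
    print(ram.read(2))
```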

    Figure 2: Edge-Triggered RAM Write Timing

    [Timing diagram X6461: waveforms for WCLK (K), WE, ADDRESS, DATA IN, and DATA OUT (OLD, then NEW), annotated with the timing parameters TWPS, TWSS, TWHS, TASS, TAHS, TDSS, TDHS, TWOS, and TILO.]


    Figure 3: 16x2 (or 16x1) Edge-Triggered Single-Port RAM

    [Block diagram X5789. Inputs C1-C4 (WE, D1, D0, EC), G1-G4, F1-F4, and CLK. Each function generator has a 1-of-16 write decoder, a 16-latch array with DIN, a write pulse with latch enable, and a read address multiplexer. Outputs: G’ and F’.]

    Figure 4: 32x1 Edge-Triggered Single-Port RAM (F and G addresses are identical)

    [Block diagram X5788. Same elements as Figure 3, with the additional H’ output.]


    Read data is not clocked. It appears asynchronously at the function generator output a certain time after the address inputs have settled (TILO for 16x2 and 16x1, TIHO for 32x1).

    The relationships between CLB pins and RAM inputs and outputs for single-port, edge-triggered mode are shown in Table 6.

    The Write Clock input (WCLK) can be configured as active on either the rising edge (default) or the falling edge. It uses the same CLB pin (K) used to clock the CLB flip-flops, but it can be independently inverted. Consequently, the RAM output can optionally be registered within the same CLB either by the same clock edge as the RAM, or by the opposite edge of this clock. The sense of WCLK applies to both function generators in the CLB when both are configured as RAM.

    The WE pin is active-High and is not invertible within the CLB.

    The pulse following the active edge of WCLK has a maximum pulsewidth requirement, due to the construction of the RAM. The specification is on the order of milliseconds and should not be a serious restriction; however, it should not be forgotten.
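    The single-port edge-triggered behavior described in this section can be summarized in a short Python sketch. It is a behavioral illustration only, with hypothetical names; it models the default rising-edge WCLK and ignores all setup, hold, and pulse-width parameters.

```python
class EdgeTriggeredRam16x1:
    """Behavioral sketch of one 16x1 edge-triggered single-port RAM."""

    def __init__(self):
        self.cells = [0] * 16
        self._prev_wclk = 0

    def apply(self, wclk, we, addr, data):
        """Present new input levels; a write occurs only on a rising WCLK edge."""
        rising_edge = self._prev_wclk == 0 and wclk == 1
        if rising_edge and we:  # WE acts as a simple clock enable
            self.cells[addr & 0xF] = data & 1
        self._prev_wclk = wclk

    def read(self, addr):
        """Read data is not clocked: the output follows the address."""
        return self.cells[addr & 0xF]


ram = EdgeTriggeredRam16x1()
ram.apply(wclk=0, we=1, addr=9, data=1)  # present address, data, and WE
ram.apply(wclk=1, we=1, addr=9, data=1)  # rising edge: cell 9 is written
ram.apply(wclk=1, we=1, addr=4, data=1)  # inputs move after the edge: no write
```

    Because only the clock edge triggers the write, changing the address or data while WCLK stays High leaves the other cells untouched.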

    Table 6: Single-Port Edge-Triggered RAM Signals

    RAM Signal        CLB Pin           Function
    D                 D0 or D1          Data In
    A[3:0]            F1-F4 or G1-G4    Address
    WE                WE                Write Enable
    WCLK              K                 Clock
    SPO (Data Out)    F’ or G’          Single Port Out (Data Out)

    Dual-Port Edge-Triggered Mode

    In dual-port mode, both the F and G function generators are used to create a single 16x1 RAM array with one write port and two read ports. The resulting RAM array can be read and written simultaneously at two independent addresses. Simultaneous read and write operations at the same address are also supported.

    Dual-port mode always has edge-triggered write timing, as shown in Figure 2 on page 8.

    Figure 5 shows a simple model of an XC4000E CLB configured as dual-port RAM. One address port, labeled A[3:0], supplies both the read and write address for the F function generator. This function generator behaves the same as a 16x1 single-port edge-triggered RAM array. The RAM output, Single Port Out (SPO), appears at the F function generator output.

    The other address port, labeled DPRA[3:0] for Dual Port Read Address, supplies the read address for the G function generator. The write address for the G function generator, however, comes from the address A[3:0]. The output from this 16x1 RAM array, Dual Port Out (DPO), appears at the G function generator output.

    Therefore, by using A[3:0] for the write address and DPRA[3:0] for the read address, and reading only the DPO output, a FIFO that can read and write simultaneously is easily generated. Simultaneous access doubles the effective throughput of the FIFO.
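    The FIFO arrangement described above can be sketched in Python as follows. This is a one-bit-wide behavioral illustration with hypothetical class names; the pointer and occupancy logic is an assumption about how the surrounding design might be built and is not taken from the data sheet.

```python
class DualPortRam16x1:
    """One write port (A) and two read ports (SPO at A, DPO at DPRA)."""

    def __init__(self):
        self.cells = [0] * 16

    def clock(self, we, a, d):
        if we:  # write on the active WCLK edge
            self.cells[a & 0xF] = d & 1

    def spo(self, a):
        return self.cells[a & 0xF]

    def dpo(self, dpra):
        return self.cells[dpra & 0xF]


class Fifo16x1:
    """Assumed FIFO wrapper: A[3:0] is the write pointer, DPRA[3:0] the read pointer."""

    def __init__(self):
        self.ram = DualPortRam16x1()
        self.wr = self.rd = self.count = 0

    def cycle(self, push=False, data=0, pop=False):
        """One WCLK cycle; returns the popped bit, or None if nothing was read."""
        do_pop = bool(pop) and self.count > 0
        do_push = bool(push) and self.count < 16
        out = self.ram.dpo(self.rd) if do_pop else None  # read before the edge
        self.ram.clock(do_push, self.wr, data)           # write on the edge
        self.wr = (self.wr + int(do_push)) % 16
        self.rd = (self.rd + int(do_pop)) % 16
        self.count += int(do_push) - int(do_pop)
        return out


fifo = Fifo16x1()
fifo.cycle(push=True, data=1)
fifo.cycle(push=True, data=0)
first = fifo.cycle(push=True, data=1, pop=True)  # read and write in one cycle
```

    The third cycle reads the oldest entry and writes a new one at the same time, which is the simultaneous access the text refers to.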

    The relationships between CLB pins and RAM inputs and outputs for dual-port, edge-triggered mode are shown in Table 7. See Figure 6 for a block diagram of a CLB configured in this mode.

    The pulse following the active edge of WCLK has a maximum pulsewidth requirement, due to the construction of the RAM. The specification is on the order of milliseconds and should not be a serious restriction; however, it should not be forgotten.

    Table 7: Dual-Port Edge-Triggered RAM Signals

    RAM Signal    CLB Pin    Function
    D             D0         Data In
    A[3:0]        F1-F4      Read Address for F, Write Address for F and G
    DPRA[3:0]     G1-G4      Read Address for G
    WE            WE         Write Enable
    WCLK          K          Clock
    SPO           F’         Single Port Out
    DPO           G’         Dual Port Out

    Single-Port Level-Sensitive Timing Mode

    Note: Edge-triggered mode is recommended for all new designs. Level-sensitive mode is still supported for XC4000E backward-compatibility with the XC4000 family.

    Level-sensitive RAM timing is simple in concept but can be complicated in execution. Data and address signals are presented, then a positive pulse on the write enable (WE) performs a write into the RAM at the designated address. As indicated by the “level-sensitive” label, this RAM acts like a latch. During the WE High pulse, changing the data lines results in new data written to the old address. Changing the address lines while WE is High results in spurious data written to the new address, and possibly to other addresses as well, because the address lines inevitably do not all change simultaneously.


    Figure 5: XC4000E Dual-Port RAM, Simple Model

    [Block diagram X6217. RAM16X1D primitive with inputs WE, D, A[3:0], DPRA[3:0], and WCLK. The F function generator (WE, D, AR[3:0], AW[3:0]) drives SPO (Single Port Out) and Registered SPO; the G function generator (WE, D, AR[3:0], AW[3:0]) drives DPO (Dual Port Out) and Registered DPO.]

    Figure 6: 16x1 Edge-Triggered Dual-Port RAM

    [Block diagram X5790. Inputs C1-C4 (WE, D1, D0, EC), G1-G4, F1-F4, and CLK. Each function generator has a 1-of-16 write decoder, a 16-latch array with DIN, a write pulse with latch enable, and a read address multiplexer. Outputs: G’ and F’.]


    The user must generate a carefully timed WE signal. The delay on the WE signal and the address lines must be carefully verified to ensure that WE does not become active until after the address lines have settled, and that WE goes inactive before the address lines change again. The data must be stable before and after the falling edge of WE.

    In practical terms, WE is usually generated by a 2X clock. If a 2X clock is not available, the falling edge of the system clock can be used. However, there are inherent risks in this approach, since the WE pulse must be guaranteed inactive before the next rising edge of the system clock. Several application notes are available from Xilinx that discuss the design of level-sensitive RAMs. These application notes include XAPP031, “Using the XC4000 RAM Capability,” and XAPP042, “High-Speed RAM Design in XC4000.” However, the edge-triggered RAM available within the XC4000E is superior to level-sensitive RAM for nearly every application.
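    The hazard that makes the WE timing so critical can be shown with a small Python sketch. It is a behavioral illustration with hypothetical names: the RAM is treated as a latch that writes whenever WE is High, and the intermediate address value is an assumed example of address bits that do not switch simultaneously.

```python
class LevelSensitiveRam16x1:
    """Behavioral sketch: while WE is High, the addressed cell follows the data."""

    def __init__(self):
        self.cells = [0] * 16

    def apply(self, we, addr, data):
        if we:  # level-sensitive: no clock edge is involved
            self.cells[addr & 0xF] = data & 1


ram = LevelSensitiveRam16x1()
ram.apply(we=1, addr=0b0111, data=1)  # intended write to address 7
# The address moves from 7 to 8 while WE is still High. Assume bit 3
# switches slightly before bits 2..0, so 0b1111 appears briefly.
ram.apply(we=1, addr=0b1111, data=1)  # transient address: spurious write
ram.apply(we=1, addr=0b1000, data=1)  # new address: second spurious write
ram.apply(we=0, addr=0b1000, data=1)  # WE finally goes Low
written = [addr for addr, bit in enumerate(ram.cells) if bit]
```

    One intended write has modified three cells (7, 8, and 15), which is why WE must go inactive before the address lines change.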

    Figure 7 shows the write timing for level-sensitive, single-port RAM.

    Figure 8 and Figure 9 show block diagrams of a CLB configured as 16x2 and 32x1 level-sensitive, single-port RAM.

    The relationships between CLB pins and RAM inputs and outputs for single-port level-sensitive mode are shown in Table 8.

    Table 8: Single-Port Level-Sensitive RAM Signals

    RAM Signal    CLB Pin           Function
    D             D0 or D1          Data In
    A[3:0]        F1-F4 or G1-G4    Address
    WE            WE                Write Enable
    O             F’ or G’          Data Out

    Figure 7: Level-Sensitive RAM Write Timing

    [Timing diagram X6462. Signals: ADDRESS, WRITE ENABLE, DATA IN (REQUIRED). Parameters: TWC, TAS, TWP, TDS, TDH, TAH.]


    Figure 8: 16x2 (or 16x1) Level-Sensitive Single-Port RAM

    [Block diagram X5786. Inputs C1-C4 (WE, D1, D0, EC), G1-G4, and F1-F4. Each function generator has a 1-of-16 write decoder with enable, a 16-latch array with DIN, and a read address multiplexer. Outputs: G’ and F’.]

    Figure 9: 32x1 Level-Sensitive Single-Port RAM (F and G addresses are identical)

    [Block diagram X5787. Same elements as Figure 8, with inputs C1-C4 labeled WE, D1/A5, D0, EC and the additional H’ output.]


    Initializing RAM at Configuration

    Both RAM and ROM implementations of the XC4000E devices are initialized at power-up. The initial contents are defined via an INIT attribute or property attached to the RAM or ROM symbol, as described in the schematic library guide.

    If not defined, all RAM contents are initialized to zeros by default.

    RAM initialization occurs only during configuration. The RAM content is not affected by Global Set/Reset.

    Advantages of On-Chip and Edge-Triggered RAM

    The on-chip RAM is extremely fast. The read access time is the same as the logic delay. The write access time is slightly slower. Both access times are much faster than any off-chip solution.

    Edge-triggered RAM, also called synchronous RAM, is a feature never before available in a Field Programmable Gate Array. The simplicity of designing with edge-triggered RAM, combined with greatly improved system speeds from the elimination of the 2X clock, adds up to a significant improvement over existing devices with on-chip RAM.

    Two application notes are available from Xilinx that discuss RAM in the XC4000E: “XC4000E Edge-Triggered and Dual-Port RAM Capability,” and “Implementing FIFOs in XC4000E RAM.”

    Fast Carry Logic

    Each CLB F and G function generator contains dedicated arithmetic logic for the fast generation of carry and borrow signals. This extra output is passed on to the next CLB function generator above or below. The carry chain is independent of normal routing resources.

    Dedicated fast carry logic greatly increases the efficiency and performance of adders, subtracters, accumulators, comparators and counters.

    The two 4-input function generators can be configured as a 2-bit adder with built-in hidden carry that can be expanded to any length. This dedicated carry circuitry is so fast and efficient that conventional speed-up methods like carry generate/propagate are meaningless even at the 16-bit level, and of marginal benefit at the 32-bit level.
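    The idea of a 2-bit adder per CLB that expands to any length through the carry chain can be sketched in Python. This is a purely functional illustration with hypothetical function names; it says nothing about the actual carry circuit or its timing, and the assignment of bit 0 to F and bit 1 to G is an assumption.

```python
def full_add(a, b, cin):
    """One-bit full adder: returns (sum, carry_out)."""
    s = a ^ b ^ cin
    cout = (a & b) | (cin & (a ^ b))
    return s, cout


def clb_add2(a2, b2, cin):
    """One CLB as a 2-bit adder slice (assumed: F adds bit 0, G adds bit 1)."""
    s0, c0 = full_add(a2 & 1, b2 & 1, cin)
    s1, c1 = full_add((a2 >> 1) & 1, (b2 >> 1) & 1, c0)
    return s0 | (s1 << 1), c1  # the carry out feeds the next CLB in the column


def add(a, b, width):
    """Chain width/2 CLB slices through the carry path."""
    if width % 2:
        raise ValueError("width must be a multiple of 2 bits")
    total, carry = 0, 0
    for bit in range(0, width, 2):
        s, carry = clb_add2((a >> bit) & 3, (b >> bit) & 3, carry)
        total |= s << bit
    return total, carry
```

    For example, `add(0xFFFF, 1, 16)` ripples a carry through all eight slices and returns a zero sum with the carry out set.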

    The fast-carry logic opens the door to many new applications involving arithmetic operation, where the previous generations of FPGAs were not fast enough or too inefficient. High-speed address offset calculations in microprocessor or graphics systems, and high-speed addition in digital signal processing are two typical applications.

    This fast carry logic is one of the more significant features of the XC4000E family, speeding up arithmetic and counting into the 70 MHz range.

    The fast carry logic can be accessed by placing special library symbols, or by using Xilinx Relationally Placed Macros (RPMs) that already include these symbols.

    Figure 10 shows the fast carry logic in one XC4000E CLB.

    Figure 10: Fast Carry Logic in XC4000E CLB

    [Block diagram X5373. The logic function of G1-G4 (output G’) and the logic function of F1-F4 (output F’) each have an associated carry logic block. Labels: A0, B0, A1, B1, CIN 1, CIN 2, COUT, SUM 0, SUM 1, M.]


    Figure 11: Simplified Block Diagram of XC4000E IOB

    Input/Output Blocks (IOBs)

    User-configurable input/output blocks (IOBs) provide the interface between external package pins and the internal logic. Each IOB controls one package pin and can be defined for input, output, or bidirectional signals.

    Figure 11 shows a simplified block diagram of the XC4000E IOB. A more complete diagram can be found in Figure 20 on page 24, in the Boundary Scan section.

    Input Signals

    Two paths, labeled I1 and I2 in Figure 11, bring input signals into the array. Inputs also connect to an input register that can be programmed as either an edge-triggered flip-flop or a level-sensitive transparent-Low latch. The choice is made by placing the appropriate primitive from the symbol library.

    The inputs can be globally configured for either TTL (1.2 V) or CMOS (2.5 V) thresholds. The two global adjustments of input threshold and output level are independent of each other. There is a slight hysteresis of about 300 mV.

    Registered Inputs

    The I1 and I2 signals that exit the block can each carry either the direct or registered input signal.

    The input and output storage elements in each IOB have a common clock enable input, which through configuration can be activated individually for the input or output flip-flop or both. This clock enable operates exactly like the EC pin on the XC4000E CLB. It cannot be inverted within the IOB.

    The storage element behavior is shown in Table 9.

    [Figure 11 block diagram, X6463. Elements: pad, input buffer, output buffer, slew rate control, passive pull-up/pull-down, delay, input flip-flop/latch, and output flip-flop. Signals: I1, I2, Out, OE, Input Clock, Output Clock, Clock Enable.]

    Table 9: Input Register Functionality

    Mode               Clock    Clock Enable    D    Q
    Power-Up or GSR    X        X               X    SR
    Flip-Flop          __/      1*              D    D
    Latch              1        1*              D    Q
                       0        1*              D    D
    Both               X        0               X    Q

    LEGEND:
    X      Don’t care
    __/    Rising edge
    SR     Set or Reset value specified with INIT property. Reset is default.
    0*     Input is Low or unconnected (default value)
    1*     Input is High or unconnected (default value)

    Optional Delay Guarantees Zero Hold Time

    The data input to the register can optionally be delayed by several nanoseconds. With the delay enabled, the setup time of the input flip-flop is increased so that normal clock routing does not result in a positive hold-time requirement. A positive hold time can lead to unreliable, temperature- or processing-dependent operation.

    The input flip-flop setup time is defined between the data measured at the device I/O pin and the clock input at the IOB. Any routing delay from the clock pad to the clock pin of the IOB must, therefore, be subtracted from this setup time to arrive at the real setup time requirement relative to the device pins. A short specified setup time might, therefore, result in a negative setup time at the device pins, i.e., a positive hold-time requirement.

    When the delay is inserted on the data line, more clock delay can be tolerated without causing a positive hold-time requirement. This delay eliminates the possibility of a data hold-time requirement at the external pin. The delay is therefore inserted as the default. For faster input register setup time, with non-zero hold, attach a NODELAY attribute or property to the flip-flop.
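    The reasoning behind the optional input delay can be worked through numerically. The Python sketch below is only an illustration: the function name is hypothetical and all delay values are made-up integers in nanoseconds, not XC4000E specifications. It assumes a simple model in which clock routing delay shifts the sampling window later and the data delay shifts it back.

```python
def pin_timing(setup_ns, hold_ns, clock_delay_ns, data_delay_ns=0):
    """Translate setup/hold at the IOB clock input to requirements at the pins.

    setup_ns, hold_ns: requirement relative to the clock input at the IOB
    clock_delay_ns:    routing delay from the clock pad to the IOB clock pin
    data_delay_ns:     optional delay inserted on the data path
    """
    setup_at_pins = setup_ns - clock_delay_ns + data_delay_ns
    hold_at_pins = hold_ns + clock_delay_ns - data_delay_ns
    return setup_at_pins, hold_at_pins


# Hypothetical numbers: 2 ns setup, 0 ns hold, 3 ns of clock routing delay.
without_delay = pin_timing(2, 0, 3)                # NODELAY case
with_delay = pin_timing(2, 0, 3, data_delay_ns=4)  # default, delay inserted
```

    With these assumed values, `without_delay` is `(-1, 3)`, a negative setup time and a positive hold time at the pins, while `with_delay` is `(3, -1)`, a longer setup time but no positive hold-time requirement.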


    Output Signals

    Output signals can be optionally inverted within the IOB, and can pass directly to the pad or be stored in an edge-triggered flip-flop. The functionality of this flip-flop is shown in Table 10.

    An output enable signal can be used to place the output buffer in a high-impedance state, implementing 3-state outputs or bidirectional I/O. Under configuration control, the output (OUT) and output enable (OE) signals can be inverted. The polarity of these signals is independently configured for each IOB.

    The 4 mA maximum output current specification of many FPGAs often forces the user to add external buffers, which are especially cumbersome on bidirectional I/O lines. The XC4000E family solves many of these problems by providing a guaranteed output sink current of 12 mA. Two adjacent outputs can be interconnected externally to sink up to 24 mA. The FPGA can thus drive short buses on a printed circuit board.

    By default, the output pull-up structure is configured as a TTL-like totem-pole. This driver is an n-channel pull-up transistor, pulling to a voltage one threshold below Vcc. Alternatively, the output can be configured as a CMOS driver, with a p-channel pull-up transistor pulling to Vcc. This option applies to every output on the device.

    An output can be configured as open-collector by placing an OBUFT symbol in a schematic or VHDL code, then tying the 3-state pin (T) to the input pin (I).

    Table 10: Output Flip-Flop Functionality (no optional inversions used)

    Mode               Clock    Clock Enable    OE    D    Q
    Power-Up or GSR    X        X               0*    X    SR
    Flip-Flop          X        0               0*    X    Q
                       __/      1*              0*    D    D
                       X        X               1     X    Z

    LEGEND:
    X      Don’t care
    __/    Rising edge
    SR     Set or Reset value specified with INIT property. Reset is default.
    0*     Input is Low or unconnected (default value)
    1*     Input is High or unconnected (default value)
    Z      3-State
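    The rows of Table 10 can be restated as a small behavioral sketch in Python. The class and argument names are hypothetical. The table does not say whether the flip-flop still captures data while the output is 3-stated, so the sketch assumes that it does.

```python
class OutputFlipFlop:
    """Behavioral sketch of the IOB output flip-flop per Table 10."""

    def __init__(self, init=0):
        self._init = init & 1  # SR value from the INIT property; reset is default
        self.q = self._init
        self._prev_clk = 0

    def gsr(self):
        """Power-up or Global Set/Reset: Q takes the SR value."""
        self.q = self._init

    def step(self, clk, d, ce=1, oe=0):
        """Apply input levels and return the pad value: 0, 1, or 'Z'."""
        if self._prev_clk == 0 and clk == 1 and ce:  # rising edge with CE High
            self.q = d & 1
        self._prev_clk = clk
        # Following the table as printed, OE = 1 places the output in 3-state.
        return "Z" if oe else self.q


ff = OutputFlipFlop()
trace = [
    ff.step(clk=0, d=1),        # no edge: Q keeps the reset value
    ff.step(clk=1, d=1),        # rising edge, CE High: Q = D
    ff.step(clk=0, d=0),
    ff.step(clk=1, d=0, ce=0),  # CE Low: Q holds
    ff.step(clk=0, d=0, oe=1),  # 3-state row
]
```

    The `ce=1` and `oe=0` defaults mirror the 1* and 0* entries, which denote the values the inputs take when left unconnected.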

    Output Slew Rate

    The slew rate of each output buffer is by default reduced, to minimize power bus transients when switching non-critical signals. For critical signals, attach a FAST attribute or property to the output buffer or flip-flop.

    For XC4000E devices, maximum total capacitive load for simultaneous fast mode switching in the same direction is 200 pF per Power/Ground pin pair. For slew-rate limited outputs this total is two times larger. This maximum capacitive load should not be exceeded, as it can result in ground bounce of greater than 1.5 V amplitude and more than 5 ns duration. This level of ground bounce may cause undesired transient behavior on an output, or in the internal logic. This restriction is common to all high-speed digital ICs, and is not particular to Xilinx or the XC4000E family.
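    The capacitive-load limit lends itself to a quick budget calculation. The following Python sketch uses the 200 pF figure from the text; the function name, the per-output load, and the number of Power/Ground pin pairs are hypothetical, and the sketch assumes the switching outputs are spread evenly across the pin pairs.

```python
FAST_LIMIT_PF = 200                 # fast mode, per Power/Ground pin pair
SLOW_LIMIT_PF = 2 * FAST_LIMIT_PF   # slew-rate limited: two times larger


def max_simultaneous_outputs(load_per_output_pf, pin_pairs, fast=True):
    """Largest number of outputs that may switch in the same direction at once."""
    limit_pf = FAST_LIMIT_PF if fast else SLOW_LIMIT_PF
    return (limit_pf * pin_pairs) // load_per_output_pf


# Hypothetical board: 50 pF per output, package with 8 Power/Ground pin pairs.
fast_outputs = max_simultaneous_outputs(50, 8)              # 32
slow_outputs = max_simultaneous_outputs(50, 8, fast=False)  # 64
```

    Leaving non-critical outputs in the default slew-rate limited mode doubles the number that can safely switch together in this estimate.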

    The XC4000E family has a feature called “Soft Startup,” designed to avoid potential ground bounce when all outputs are turned on simultaneously at the end of configuration. When the configuration process is finished and the device starts up in user mode, the first activation of the outputs is automatically slew-rate limited. Immediately following the first activation of the I/O, the slew rate of the individual outputs is determined by the individual configuration option for each IOB.

    Global Three-State

    A separate Global 3-State line (not shown in Figure 11) forces all FPGA outputs to the high-impedance state, unless boundary scan is enabled and is executing an EXTEST instruction. This global net (GTS) does not compete with other routing resources.

    GTS can be driven from any package pin as a global 3-state input. To use this global net, place an input pad and input buffer in the schematic or VHDL code, driving the GTS pin of the STARTUP symbol. A specific pin location can be assigned to this input just as for any other user-programmable pad. An inverter can optionally be inserted after the input buffer to invert the sense of the Global 3-State signal.

    GTS can also be driven from any internal node.

    Other IOB Options

    There are a number of other programmable options in the IOB.

    Pull-up and Pull-down Resistors

    Programmable pull-up and pull-down resistors are useful for tying unused pins to Vcc or Ground to minimize power consumption. The configurable pull-up resistor is a p-channel transistor that pulls to Vcc. The configurable pull-down resistor is an n-channel transistor that pulls to Ground.

    The value of these resistors is 50 kΩ to 100 kΩ. This high value makes them unsuitable as wired-AND pull-up resistors.


    The pull-up resistors for most user-programmable IOBs are active during the configuration process. See the “Pin Functions During Configuration” table for a list of pins with pull-ups active before and during configuration.

    After configuration, voltage levels of unused pads, bonded or unbonded, must be valid logic levels. Therefore, by default, unused pads are configured with the internal pull-up resistor. Alternatively, they can be individually configured with the pull-down resistor, or as a driven output, or to be driven by an external source.

    Independent Clocks

    Separate clock signals are provided for the input and output registers. The clock can be independently inverted for each flip-flop within the IOB, generating either falling-edge or rising-edge triggered flip-flops. The clock inputs for each IOB are independent.

    Global Set/Reset

    As with the CLB registers, the Global Set/Reset signal (GSR) can be used to set or clear the input and output registers, depending on the value of the INIT attribute or property. The two flip-flops can be individually configured to set or clear on reset and after configuration. Other than the global GSR net, no user-controlled set/reset signal is available to the I/O flip-flops. The choice of set or clear applies to both the initial state of the flip-flop and the response to the Global Set/Reset pulse.

    JTAG Support

    Embedded logic attached to the IOBs contains test structures compatible with IEEE Standard 1149.1 for boundary-scan testing, permitting easy chip and board-level testing. More information is provided in the Boundary Scan section later in this Data Sheet.

    Programmable Interconnect

    All internal connections are composed of metal segments with programmable switching points and switching matrices to implement the desired routing. A structured, hierarchical matrix of routing resources is provided to achieve efficient automated routing.

    Four Types of Interconnect

    There are four main types of interconnect. Three are distinguished by the relative length of their segments: single-length lines, double-length lines, and Longlines. In addition, eight global buffers drive fast, low-skew nets most often used for clocks or global control signals.

    Single-length lines and double-length lines are connected by way of programmable switch matrices.

    Programmable Switch Matrices

    The single-length lines are a grid of horizontal and vertical lines that intersect at a switch matrix between each block. Figure 12 illustrates the single-length interconnect lines surrounding one CLB in the array, and the switch matrices connecting those lines. Each switch matrix consists of programmable n-channel pass transistors used to establish connections between the single-length lines (Figure 13).

    For example, a signal entering on the right side of the switch matrix can be routed to a single-length line on the top, left, or bottom sides, or any combination thereof, if multiple branches are required.
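    The routing choice in this example can be expressed as a small Python sketch. Figure 13 is labeled with six pass transistors per switch matrix interconnect point; the sketch assumes these correspond to the six possible pairings of the four sides. The names are hypothetical and the model ignores delay.

```python
from itertools import combinations

SIDES = ("top", "right", "bottom", "left")

# One pass transistor per pair of sides: 4 choose 2 = 6 per interconnect point.
PASS_TRANSISTORS = list(combinations(SIDES, 2))


def route(entry, exits):
    """Return the pass transistors to turn on so `entry` reaches every side in `exits`."""
    if entry not in SIDES or any(side not in SIDES for side in exits):
        raise ValueError("unknown side")
    return [
        tuple(sorted((entry, side), key=SIDES.index))
        for side in exits
        if side != entry
    ]


# A signal entering on the right side, branched to the top, left, and bottom.
enabled = route("right", ["top", "left", "bottom"])
```

    Each additional branch turns on one more pass transistor at the same interconnect point.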

    Figure 12: Typical CLB Connections to Adjacent Single-Length Lines

    Figure 13: Programmable Switch Matrix

    Single-Length Lines

    Single-length lines provide the greatest interconnect flexibility and offer fast routing between adjacent blocks. These lines connect the switching matrices that are located at every intersection of a row and a column of CLBs. However, they incur a delay whenever they go through a switching matrix.

    [Figure 12 diagram, X3242. A CLB with pins F1-F4, G1-G4, C1-C4, K, X, Y, XQ, and YQ, surrounded by four switch matrices.]

    [Figure 13 diagram, X3244. Six pass transistors per switch matrix interconnect point.]


    Single-length lines are normally used to conduct signals within a localized area and to provide the branching for nets with fanout greater than one.

    Double-Length Lines

    The double-length lines consist of a grid of metal segments twice as long as the single-length lines: they run past two CLBs before entering a switch matrix. Double-length lines are grouped in pairs with the switch matrices staggered, so that each line goes through a switch matrix at every other CLB location in that row or column (see Figure 14).

    They provide faster signal routing over intermediate distances, while retaining routing flexibility.

    Figure 14: Double-Length Lines

    Longlines

    Longlines form a grid of metal interconnect segments that run the entire length or width of the array. A Longline net has negligible delay variations. Longlines are intended for high fan-out, time-critical signal nets, or nets that are distributed over long distances.

    Two horizontal Longlines per CLB can be driven by 3-state or open-drain drivers. They can therefore implement unidirectional or bidirectional buses, wide multiplexers, or wired-AND functions. (See the Three-State Buffers section for more details.)

    Each Longline has a programmable splitter switch at its center that can separate the line into two independent routing channels, each running half the width or height of the array.

    [Figure 14 diagram, X3245. Four CLBs and the switch matrices between them.]

    Each horizontal Longline driven by TBUFs has a pull-up resistor at each end. To activate these resistors, place a PULLUP symbol and attach it to the Longline net. The software automatically activates one or both pull-ups as appropriate.

    There is also a weak keeper at each end of each horizontal Longline. This circuit prevents undefined floating levels.

    Global Nets and Buffers

    Additional vertical Longlines are driven by special global buffers, designed to distribute clocks and other high fanout control signals throughout the array with minimal skew. More dedicated global resources are provided than in most available programmable logic devices. Four primary global nets offer the shortest delay and negligible skew. Four secondary global nets have slightly longer delay and slightly more skew due to heavier loading, but offer greater flexibility when used to drive non-clock CLB inputs.

    The primary global buffers must be driven by the dedicated pads. The secondary global buffers may be sourced by either dedicated pads or internal nets.

    Each CLB column has four dedicated vertical Longlines. Each of these lines has access to a particular primary global net, or to any of the secondary global nets, as shown in Figure 15. Each corner of the device has one primary input and one secondary input.

    The user must specify these global nets for all timing-sensitive global signal distribution. To use a global net, place a BUFGP (primary buffer), BUFGS (secondary buffer), or BUFG (either primary or secondary buffer) element in a schematic or VHDL code.

    Figure 15: XC4000E Global Net Distribution

    [Diagram X1027. Labels: PRIMARY GLOBAL NETS, SECONDARY GLOBAL NETS.]


    Connections Between Lines

    Communication between Longlines and single-length lines is controlled by programmable interconnect points at the line intersections. Double-length lines do not connect to other lines.

    CLB Routing Connections

    CLB inputs and outputs are distributed on all four sides of the block, providing maximum routing flexibility. In general, the entire architecture is very symmetrical and regular. It is well suited to established placement and routing algorithms developed for conventional mask-programmed gate-array design. Inputs, outputs, and function generators can freely swap positions within a CLB to avoid routing congestion during the placement and routing operation.

    Connecting to Single-Length Lines

    The function generator and control inputs to the CLB (F1-F4, G1-G4, and C1-C4) can be driven from any adjacent single-length line segment. Figure 12 shows typical CLB connections to the adjacent single-length lines. (Note: The number of routing channels shown in Figure 12 is for illustrative purposes only.) The CLB clock input (K) can be driven from half of the adjacent single-length lines. Each CLB output can drive several of the single-length lines, with connections to both the horizontal and vertical Longlines.

    Connecting to Double-Length Lines

    As with single-length lines, all the CLB inputs except K can be driven from any adjacent double-length line. Each CLB output can drive nearby double-length lines in both the vertical and horizontal planes.

    Connecting to Longlines

    CLB inputs can be driven from a subset of the adjacent Longlines (see Figure 16). CLB outputs are routed to the Longlines via 3-state buffers or the single-length interconnect lines.

    Three-State Buffers

    A pair of 3-state buffers is associated with each CLB in the array. These 3-state buffers can be used to drive signals onto the nearest horizontal Longlines above and below the block. They can therefore be used to implement multiplexed or bidirectional buses on the horizontal Longlines, saving logic resources. Programmable pull-up resistors attached to both ends of these Longlines help to implement a wide wired-AND function.

    The 3-state buffer input can be driven from any X, Y, XQ, or YQ output of the neighboring CLB, or from nearby single-length lines. The buffer enable can come from nearby vertical single-length or Longlines. The enable is an active-High 3-state, or an active-Low enable, as shown in Table 11.

    Another 3-state buffer with similar access is located near each I/O block along the right and left edges of the array.

    Figure 16: Longline Routing Resources with Typical CLB Connections

    Special Longlines running along the perimeter of the array can be used to wire-AND signals coming from nearby IOBs or from internal Longlines. These Longlines form the wide edge decoders discussed in the next section, Wide Edge Decoders.

    Table 11: Three-State Buffer Functionality

    IN    T    OUT
    X     1    Z
    IN    0    IN

    Three-State Buffer Modes

    There are three modes in which the 3-state buffers can be configured:
    • Standard 3-state buffer
    • Wired-AND with input on the I pin
    • Wired OR-AND

    Standard 3-State Buffer

    All three pins are used. Place the library element BUFT. Tie the input to the I pin and the output to the O pin. The T pin is an active-High 3-state or an active-Low enable.

    Wired-AND with Input on the I Pin

    The buffer can be used as a Wired-AND. Use the WAND1 library symbol, which is essentially an open-drain buffer. (WAND4, WAND8, and WAND16 are also available. See the XACT Libraries Guide for further information.)

    The T pin is internally tied to the I pin. Tie the input to the I pin and the output to the O pin. Tie the outputs of all the WAND1s together and attach a PULLUP symbol.

    [Figure 16 diagram, X6464. A CLB with pins F1-F4, G1-G4, C1-C4, K, X, Y, XQ, and YQ, with adjacent Longlines including those labeled “Global” Long Lines.]

    Wired OR-AND

    The buffer can be configured as a Wired OR-AND. A High level on either input turns off the output. Use the WOR2AND library symbol, which is essentially an open-drain 2-input OR gate.

    The two input pins are functionally equivalent. Attach the two inputs to the I0 and I1 pins and tie the output to the O pin. Tie the outputs of all the WOR2ANDs together and attach a PULLUP symbol.

    Three-State Buffer Examples

    An example showing how to use the 3-state buffers to implement a wired-AND function is shown in Figure 17. When all the buffer inputs are High, the pull-up resistor(s) provide the High output.

    An example showing how to use the 3-state buffers to implement a multiplexer is shown in Figure 18. The selection is accomplished by the buffer 3-state signal.

    Pay particular attention to the polarity of the T pin when using these buffers in a design. Active-High T is identical to an active-Low output enable.
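    The three buffer modes and the T-pin polarity can be checked with a short behavioral sketch in Python. The lower-case function names echo the library symbols but are hypothetical Python helpers, not library calls, and the line-resolution function is an assumed simplification of a Longline with a pull-up.

```python
def buft(i, t):
    """Standard 3-state buffer (Table 11): T = 1 gives 'Z', T = 0 passes I."""
    return "Z" if t else i


def wand1(i):
    """Open-drain buffer: T is tied to I, so the output is only ever 0 or 'Z'."""
    return buft(i, t=i)


def wor2and(i0, i1):
    """Open-drain 2-input OR: a High on either input turns the output off."""
    return wand1(i0 | i1)


def longline(drivers, pullup=True):
    """Resolve one shared line from the values its buffers place on it."""
    driven = [value for value in drivers if value != "Z"]
    if not driven:
        return 1 if pullup else "Z"  # the pull-up supplies the High level
    if len(set(driven)) > 1:
        raise ValueError("bus contention")
    return driven[0]


# Wired-AND as in Figure 17: Z = DA . DB . (DC + DD) . (DE + DF)
z_and = longline([wand1(1), wand1(1), wor2and(0, 1), wor2and(1, 0)])

# Multiplexer as in Figure 18: only the selected buffer has T Low.
z_mux = longline([buft(0, t=1), buft(1, t=0), buft(0, t=1)])
```

    Both results are 1 for these inputs; forcing any WAND1 input Low pulls the wired-AND line to 0.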

    Wide Edge Decoders

    Dedicated circuitry boosts the performance of wide decoding functions. When the address or data field is wider than the function generator inputs, FPGAs need multi-level decoding and are thus slower than PALs. XC4000E-family CLBs have nine inputs. Any decoder of up to nine inputs is, therefore, compact and fast. However, there is also a need for much wider decoders, especially for address decoding in large microprocessor systems.

    An XC4000E FPGA has four programmable decoders located on each edge of the device. The inputs to each decoder are any of the I1 signals on that edge plus one local interconnect per CLB row or column. Each decoder generates a High output (resistor pull-up) when the AND condition of the selected inputs, or their complements, is true. This is analogous to the AND term in typical PAL devices.

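    Figure 17: Open-Drain Buffers Implement a Wired-AND Function

    [Figure 17 diagram: two WAND1 and two WOR2AND buffers with inputs DA to DF drive a common line with PULLUP resistors; Z = DA • DB • (DC + DD) • (DE + DF). X6465]

    Figure 18: 3-State Buffers Implement a Multiplexer

    [Figure 18 diagram: 3-state buffers with data inputs DA, DB, DC, DN and select signals A, B, C, N drive a common line with a ~100 kΩ "Weak Keeper"; Z = DA • A + DB • B + DC • C + DN • N. X6466]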


    Each of these wired-AND gates is capable of accepting up to 42 inputs on the XC4005E and 72 on the XC4013E. These decoders may also be split in two when a large number of narrower decoders are required, for a maximum of 32 decoders per device.

    The decoder outputs can drive CLB inputs, so they can be combined with other logic to form a PAL-like AND/OR structure. The decoder outputs can also be routed directly to the chip outputs. For fastest speed, the output should be on the same chip edge as the decoder. Very large PALs can be emulated by ORing the decoder outputs in a CLB. This decoding feature covers what has long been considered a weakness of older FPGAs. Users often resorted to external PALs for simple but fast decoding functions. Now, the dedicated decoders in the XC4000E can implement these functions quickly and efficiently.

    Figure 19 shows an example of edge decoding. Each row or column of CLBs provides up to three variables or their complements.

    To use the wide edge decoders, place one or more of the WAND library symbols (WAND1, WAND4, WAND8, WAND16). Attach a DECODE attribute or property to each WAND symbol. Tie the outputs together and attach a PULLUP symbol.
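
    As a rough illustration of the decoder arithmetic and function described above, the following Python sketch models one edge decoder as an AND of selected signals or their complements. The function names and the example address are hypothetical. The CLB array sizes used in the checks (14 x 14 for the XC4005E and 24 x 24 for the XC4013E) are assumptions taken from the family's product table, not from this section.

```python
def edge_decoder(signals, term):
    """Return 1 when every selected input matches its required polarity.

    signals: dict of signal name -> 0 or 1
    term:    dict of signal name -> True (use signal) or False (use complement)
    """
    return int(all(bool(signals[name]) == want for name, want in term.items()))


def address_term(address, width):
    """Build the AND term that matches one address on lines A0..A(width-1)."""
    return {f"A{bit}": bool((address >> bit) & 1) for bit in range(width)}


def max_decoder_inputs(clbs_per_edge):
    """Up to three variables per CLB row or column along the edge."""
    return 3 * clbs_per_edge


def max_decoders_per_device(per_edge=4, edges=4, split=2):
    """Four decoders per edge, four edges, each decoder optionally split in two."""
    return per_edge * edges * split


if __name__ == "__main__":
    term = address_term(0x2A5, 12)                       # hypothetical 12-bit address
    bus = {f"A{b}": (0x2A5 >> b) & 1 for b in range(12)}
    print(edge_decoder(bus, term))                       # matching address
    print(max_decoder_inputs(14), max_decoder_inputs(24))
    print(max_decoders_per_device())
```

    With the assumed array sizes, three inputs per row or column reproduces the 42-input and 72-input limits quoted above, and four decoders on each of four edges, each split in two, gives the 32-decoder maximum.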

    Figure 19: Edge Decoding Example

    Oscillator

    The XC4000E devices include an internal oscillator. This oscillator is used to clock the power-on time-out, for configuration memory clearing, and as the source of CCLK in Master configuration modes. The oscillator runs at a nominal 8 MHz and varies with process, Vcc, and temperature. The output frequency falls between 4 and 10 MHz.

    [Figure 19 diagram: IOB inputs A and B (.I1) and an interconnect signal C feed the edge decoder lines, each of which forms an AND term of A, B and C or their complements. X2627]

    The oscillator output is optionally available after configuration. Any two of four resynchronized taps of a built-in ripple divider are also available. These taps are at the fourth, ninth, fourteenth and nineteenth bits of the divider. Therefore, if the primary oscillator output is running at the nominal 8 MHz, the user has access to an 8 MHz clock, plus any two of 500 kHz, 16 kHz, 490 Hz and 15 Hz.

    If only an approximate clock frequency is desired, these signals can be accessed by placing the OSC4 library element in a schematic or in VHDL code. If the OSC4 symbol is not placed, the oscillator is automatically disabled after configuration.
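
    The tap frequencies quoted above follow from dividing the oscillator frequency by a power of two. The short Python sketch below assumes that the tap at bit n divides by 2^n, which is consistent with the 500 kHz figure for the fourth bit; the function name is illustrative only.

```python
def divider_taps(f_osc_hz, bits=(4, 9, 14, 19)):
    """Frequency at each ripple-divider tap, assuming bit n divides by 2**n."""
    return {bit: f_osc_hz / 2 ** bit for bit in bits}


if __name__ == "__main__":
    # Nominal 8 MHz, plus the 4 MHz and 10 MHz limits of the stated range.
    for f_osc in (4e6, 8e6, 10e6):
        taps = divider_taps(f_osc)
        print(f"{f_osc / 1e6:g} MHz:",
              ", ".join(f"bit {b}: {f:.4g} Hz" for b, f in taps.items()))
```

    At the nominal 8 MHz the exact values are 500 kHz, 15.625 kHz, about 488 Hz and about 15.3 Hz, which the text rounds to 500 kHz, 16 kHz, 490 Hz and 15 Hz. Because the oscillator itself can fall anywhere between 4 and 10 MHz, all taps scale by the same factor.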

    Development System

    The powerful features of the XC4000E device families require an equally powerful, yet easy-to-use set of development tools. Xilinx provides an enhanced version of the Xilinx Automatic CAE Tools (XACTstep) optimized for the XC4000E families.

    As with other logic technologies, the basic methodology for XC4000E FPGA design consists of three interrelated steps: design entry, implementation, and verification. Popular generic tools such as VIEWlogic Systems’ PROSeries are used for entry and simulation, but architecture-specific tools are needed for implementation.

    All XC4000E development system software is integrated under the Xilinx Design Manager (XDM), providing designers with a common user interface regardless of their choice of entry and verification tools. XDM simplifies the selection of command-line options with pull-down menus and on-line help text. Application programs ranging from schematic capture to Partitioning, Placement, and Routing (PPR) can be accessed from XDM, while the program-command sequence is generated and stored for documentation prior to execution. The XMake command, a design compilation utility, automates the entire implementation process, automatically retrieving the design’s input files and performing all the steps needed to create configuration and report files.

    Several advanced features of the XACTstep system facilitate XC4000E FPGA design. The MemGen utility, a memory compiler, implements on-chip RAM within an XC4000E FPGA. Relationally Placed Macros (RPMs), schematic-based macros with relative location constraints to guide their placement within the FPGA, help ensure an optimized implementation for common logic functions. XACT-Performance, a feature of the Partition, Place, and Route (PPR) implementation program, allows designers to enter their exact performance requirements during design entry, at the schematic level.


    Design Entry

    Designs can be entered graphically, using schematic-capture software, or in any of several text-based formats. For example, Boolean equations, state-machine descriptions, and high-level design languages are supported.

    Xilinx and third-party CAE vendors have developed library and interface products compatible with a wide variety of design-entry and simulation environments. A standard interface-file specification, XNF (Xilinx Netlist File), is provided to simplify file transfers into and out of the XACTstep development system.

    Xilinx offers XACTstep development system interfaces to the following design environments:
    • VIEWlogic Systems (VIEWDraw, VIEWSim, VIEWSynthesis, PROSeries)
    • Mentor Graphics V7 and V8 (NETED, Quicksim, Design Architect, Quicksim II, Exemplar)
    • Cadence (Composer, Concept, Verilog)
    • OrCAD (SDT, VST)
    • Synopsys (Design Compiler, FPGA Compiler)
    • Xilinx-ABEL
    • X-BLOX

    Many other environments are supported by third-party vendors. Currently, more than 100 packages are supported.

    The schematic library for the XC4000E FPGA reflects the wide variety of logic functions that can be implemented in these versatile devices. The library contains over 400 primitives and macros, ranging from 2-input AND gates to 16-bit accumulators, and including arithmetic functions, comparators, counters, data registers, decoders, encoders, I/O functions, latches, Boolean functions, RAM and ROM memory blocks, multiplexers, shift registers, and barrel shifters.

    Designing with macros is as easy as designing with standard SSI/MSI functions. So-called “soft macros” contain detailed descriptions of common logic functions, but do not contain any partitioning or routing information. The performance of these macros depends, therefore, on how the PPR software processes the design. Relationally Placed Macros (RPMs), on the other hand, do contain predetermined partitioning and relative placement information, resulting in an optimized implementation for these functions. Users can create their own library elements, either soft macros or RPMs, based on the macros and primitives of the standard library.

    X-BLOX is a graphics-based high-level description language (HDL) that allows designers to use a schematic editor to enter designs as a set of generic modules. The X-BLOX compiler optimizes the modules for the target device architecture, automatically choosing the appropriate architectural resources for each function.

    The XACTstep design environment supports hierarchical design entry, with top-level drawings defining the major functional blocks, and lower-level descriptions defining the logic in each block. The implementation tools automatically combine the hierarchical elements of a design. Different hierarchical elements can be specified with different design entry tools, allowing the use of the most convenient entry method for each portion of the design.

    Design Implementation

    The design implementation tools satisfy the requirement for an automated design process.

    Logic partitioning, block placement and signal routing, encompassing the design implementation process, are performed by the Partition, Place, and Route program (PPR). The partitioner takes the logic from the entered design and maps the logic into the architectural resources of the FPGA (such as the logic blocks, I/O blocks, 3-state buffers, and edge decoders). The placer then determines the best locations for the blocks, depending on their connectivity and the required performance. The router finally connects the placed blocks together.

    The PPR program includes XACT-Performance, a feature that allows designers to specify the timing requirements along entire paths during design entry. Timing path analysis routines in PPR then recognize and accommodate the user-specified requirements. Timing requirements can be entered on the schematic in a form directly relating to the system requirements. For example, the targeted minimum clock frequency or the maximum allowable delay on the data path between two registers can be specified. So, while the timing of each individual net is not predictable, it does not need to be. The overall performance of the system along entire signal paths is automatically tailored to match user-generated specifications.

    The PPR algorithms result in the fully automatic implementation of most designs. However, for demanding applications, the user may exercise various degrees of control over the automated implementation process. The implementation of highly-structured designs can greatly benefit from the basic floorplanning techniques familiar to designers of large gate arrays. User-designated partitioning, placement, and routing information can be specified as part of the design entry process. Alternatively, the XACT-Floorplanner is proving to be an excellent tool for achieving maximum density and performance for difficult designs.

    The automated implementation tools are complemented by the XACT Design Editor (XDE), an interactive graphics-based editor that displays a model of the actual logic and routing resources of the FPGA. XDE can be used to directly view the results achieved by the automated tools. Modifications can be made using XDE. XDE also performs checks for logic connectivity and possible design-rule violations.


    Design Verification

    The high development cost associated with common mask-programmed gate arrays necessitates extensive simulation to verify a design. Due to the custom nature of masked gate arrays, mistakes or last-minute design changes cannot be tolerated. A gate-array designer must simulate and test all logic and timing using simulation software. Simulation describes what happens in a system under worst-case situations. However, simulation is tedious and slow, and simulation vectors must be generated. A few seconds of system time can take weeks to simulate.

    Programmable gate array users, however, can use in-circuit debugging techniques in addition to simulation. Because Xilinx devices are reprogrammable, designs can be verified in the system in real time without the need for extensive simulation vectors.

    The XACTstep development system supports both simulation and in-circuit debugging techniques. For simulation, the system extracts the post-layout timing information from the design database. This data can then be sent to the simulator to verify timing-critical portions of the design. Back-annotation, the process of mapping the timing information back into the signal names and symbols of the schematic, eases the debugging effort.

    For in-circuit debugging, XACTstep includes a serial download and readback cable called XChecker. XChecker connects the device in the system to the PC or workstation through an RS232 serial port. The engineer can download a design or a design revision into the system for testing. The designer can also single-step the logic, read the contents of the numerous flip-flops on the device and observe internal logic levels. Simple modifications can be downloaded into the system in a matter of minutes.

    The XACTstep system also includes XDelay, a static timing analyzer. XDelay examines a design’s logic and timing to calculate the performance along signal paths, identify possible race conditions, and detect setup and hold-time violations. Timing analyzers do not require that the user generate input stimulus patterns or test vectors.

    Boundary Scan

    The ‘bed of nails’ has been the traditional method of testing electronic assemblies. This approach has become less appropriate, due to closer pin spacing and more sophisticated assembly methods like surface-mount technology and multi-layer boards. The IEEE Boundary Scan standard 1149.1 was developed to facilitate board-level testing of electronic assemblies. Design and test engineers can embed a standard test logic structure in their device to achieve high fault coverage for I/O and internal logic. This structure is easily implemented with a four-pin interface on any Boundary Scan-compatible IC. IEEE 1149.1-compatible devices may be serial daisy-chained together, connected in parallel, or a combination of the two.

    The XC4000E family implements IEEE 1149.1-compatible BYPASS, PRELOAD/SAMPLE and EXTEST Boundary-Scan instructions. When the Boundary-Scan configuration option is selected, three normal user I/O pins become dedicated inputs for these functions. Another user output pin becomes the dedicated boundary scan output. The details of how to enable this circuitry are covered later in this section.

    By exercising these input signals, the user can serially load commands and data into these devices to control the driving of their outputs and to examine their inputs. This method is an improvement over bed-of-nails testing. It avoids the need to over-drive device outputs, and it reduces the user interface to four pins. An optional fifth pin, a reset for the control logic, is described in the standard but is not implemented in the Xilinx part.

    The dedicated on-chip logic implementing the IEEE 1149.1 functions includes a 16-state state machine, an instruction register and a number of data registers. The functional details can be found in the IEEE 1149.1 specification and are also discussed in Xilinx document XAPP 017: "Boundary Scan in XC4000 Devices."

    Figure 20 shows a simplified block diagram of the XC4000E Input/Output Block with boundary scan implemented.

    Figure 21 is a diagram of the XC4000E boundary scan logic. It includes three bits of Data Register per IOB, the IEEE 1149.1 Test Access Port controller, and the Instruction Register with decodes.

    It is also possible to configure the XC4000E through the boundary scan logic. See Configuration Through the Boundary Scan Pins on page 35.


    Data Registers

    The primary data register is the boundary-scan register. For each IOB pin in the FPGA, it includes three bits for In, Out and 3-State Control. Non-IOB pins have appropriate partial bit population for In or Out only. PROGRAM, CCLK and DONE are not included in the boundary scan register. Each EXTEST CAPTURE-DR state captures all In, Out, and 3-state pins.

    The data register also includes the following non-pin bits: TDO.T, and TDO.O, which are always bits 0 and 1 of the data register, respectively, and BSCANT.UPD, which is always the last bit of the data register. These three boundary scan bits are special-purpose Xilinx test signals.

    The other standard data register is the single flip-flop BYPASS register. It synchronizes data being passed through the FPGA to the next downstream boundary-scan device.

    The FPGA provides two additional data registers that can be specified using the BSCAN macro. The FPGA provides two user pins (BSCAN.SEL1 and BSCAN.SEL2) which are the decodes of two user instructions. For these instructions, two corresponding pins (BSCAN.TDO1 and BSCAN.TDO2) allow user scan data to be shifted out on TDO. The data register clock (BSCAN.DRCK) is available for control of test logic which the user may wish to implement with CLBs. The NAND of TCK and RUN-TEST-IDLE is also provided (BSCAN.IDLE).

    Figure 20: Block Diagram of XC4000E IOB with Boundary Scan (some details not shown)

    [Figure 20 diagram: the IOB input path (PAD, INPUT buffer, DELAY, flip-flop/latch, Input Clock IK, Input Data 1 I1, Input Data 2 I2), the output path (Output Data O, Output Clock OK, Clock Enable, flip-flop, OUTPUT buffer, 3-State TS, TS/OE, SLEW RATE, PULLUP, PULLDOWN, GLOBAL S/R), and the boundary scan capture and update points for I, O, Q and TS, selected by EXTEST. X5792]


    Figure 21: XC4000E Boundary Scan Logic

    [Figure 21 diagram: a ring of IOBs forming the data register between TDI and TDO, the BYPASS REGISTER, the INSTRUCTION REGISTER, and an output MUX driving TDO; each IOB contributes capture and update stages for IOB.I, IOB.O (or IOB.Q) and IOB.T, controlled by SHIFT/CAPTURE, CLOCK DATA REGISTER, UPDATE and EXTEST, with DATA IN and DATAOUT chaining the stages. X1523]


    Instruction Set

    The XC4000E boundary scan instruction set also includes instructions to configure the device and read back the configuration data. The instruction set is coded as shown in Table 12.
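
    For readers who prefer a lookup over a table, the instruction coding of Table 12 can be written as a small Python dictionary. This is a transcription sketch only: the names `BSCAN_INSTRUCTIONS` and `decode_instruction` are invented here, and `None` marks cells that the table leaves empty or shows as a dash.

```python
# (I2, I1, I0) -> (test selected, TDO source, I/O data source)
BSCAN_INSTRUCTIONS = {
    (0, 0, 0): ("EXTEST", "DR", "DR"),
    (0, 0, 1): ("SAMPLE/PRELOAD", "DR", "Pin/Logic"),
    (0, 1, 0): ("USER 1", "BSCAN.TDO1", "User Logic"),
    (0, 1, 1): ("USER 2", "BSCAN.TDO2", "User Logic"),
    (1, 0, 0): ("READBACK", "Readback Data", "Pin/Logic"),
    (1, 0, 1): ("CONFIGURE", "DOUT", "Disabled"),
    (1, 1, 0): ("Reserved", None, None),
    (1, 1, 1): ("BYPASS", "Bypass Register", None),
}


def decode_instruction(i2, i1, i0):
    """Return the Table 12 row for a 3-bit instruction code."""
    return BSCAN_INSTRUCTIONS[(i2, i1, i0)]


if __name__ == "__main__":
    for code in sorted(BSCAN_INSTRUCTIONS):
        print(code, decode_instruction(*code))
```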

    Table 12: Boundary Scan Instructions

    Instruction    Test               TDO                I/O Data
    I2  I1  I0     Selected           Source             Source
    0   0   0      EXTEST             DR                 DR
    0   0   1      SAMPLE/PRELOAD     DR                 Pin/Logic
    0   1   0      USER 1             BSCAN.TDO1         User Logic
    0   1   1      USER 2             BSCAN.TDO2         User Logic
    1   0   0      READBACK           Readback Data      Pin/Logic
    1   0   1      CONFIGURE          DOUT               Disabled
    1   1   0      Reserved           —                  —
    1   1   1      BYPASS             Bypass Register

    Bit Sequence

    The bit sequence within each IOB is: In, Out, 3-State. From a cavity-up view of the chip (as shown in XDE), starting in the upper right chip corner, the Boundary-Scan data-register bits are ordered as shown in Figure 22.

    BSDL (Boundary Scan Description Language) files for the XC4000E devices are available on the Xilinx BBS.

    Including Boundary Scan in a Schematic

    If boundary scan is only to be used during configuration, no special schematic elements need be included in the schematic or VHDL code. In this case, the special boundary scan pins TDI, TMS, TCK and TDO can be used for user functions after configuration.

    To indicate that boundary scan remains enabled after configuration, place the BSCAN library symbol and connect the TDI, TMS, TCK and TDO pad symbols to the appropriate pins.

    Even if the boundary scan symbol is used in a schematic, the input pins TMS, TCK, and TDI can still be used as inputs to be routed to internal logic. Care must be taken not to force the chip into an undesired boundary scan state by inadvertently applying boundary scan input patterns to these pins. The simplest way to avoid this is to keep TMS High, and then apply whatever signal is desired to TDI and TCK.

    Avoiding Inadvertent Boundary Scan Activation

    If TMS or TCK is used as user I/O, care must be taken to ensure that at least one of these pins is held constant during configuration. In some applications, a situation may occur where TMS or TCK is driven during configuration. This may cause the device to go into boundary scan mode and disrupt the configuration process.

    To prevent activation of boundary scan during configuration, you can do either of the following:
    • TMS: Tie it High to put the device in a benign RESET state
    • TCK: Tie it High or Low; don't toggle this clock input.

    For more information regarding Boundary Scan, refer to XAPP 017.001, “Boundary Scan in XC4000E Devices.”
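
    Why holding TMS High is benign can be seen from the 16-state controller defined by IEEE 1149.1. The Python sketch below encodes the standard's state transitions; it is a generic model of the standard state diagram, not a description of the Xilinx circuit, and the names `TAP_NEXT`, `step` and `run` are illustrative.

```python
# state -> (next state when TMS = 0, next state when TMS = 1), sampled on rising TCK
TAP_NEXT = {
    "Test-Logic-Reset": ("Run-Test/Idle", "Test-Logic-Reset"),
    "Run-Test/Idle":    ("Run-Test/Idle", "Select-DR-Scan"),
    "Select-DR-Scan":   ("Capture-DR", "Select-IR-Scan"),
    "Capture-DR":       ("Shift-DR", "Exit1-DR"),
    "Shift-DR":         ("Shift-DR", "Exit1-DR"),
    "Exit1-DR":         ("Pause-DR", "Update-DR"),
    "Pause-DR":         ("Pause-DR", "Exit2-DR"),
    "Exit2-DR":         ("Shift-DR", "Update-DR"),
    "Update-DR":        ("Run-Test/Idle", "Select-DR-Scan"),
    "Select-IR-Scan":   ("Capture-IR", "Test-Logic-Reset"),
    "Capture-IR":       ("Shift-IR", "Exit1-IR"),
    "Shift-IR":         ("Shift-IR", "Exit1-IR"),
    "Exit1-IR":         ("Pause-IR", "Update-IR"),
    "Pause-IR":         ("Pause-IR", "Exit2-IR"),
    "Exit2-IR":         ("Shift-IR", "Update-IR"),
    "Update-IR":        ("Run-Test/Idle", "Select-DR-Scan"),
}


def step(state, tms):
    """Advance the controller by one TCK rising edge."""
    return TAP_NEXT[state][tms]


def run(state, tms_bits):
    """Apply a sequence of TMS values, one per TCK rising edge."""
    for tms in tms_bits:
        state = step(state, tms)
    return state


if __name__ == "__main__":
    # With TMS held High, every state reaches Test-Logic-Reset within five clocks.
    for start in TAP_NEXT:
        print(f"{start:17s} -> {run(start, [1] * 5)}")
```

    With TMS tied High the controller reaches Test-Logic-Reset and stays there no matter how TCK toggles. With TCK held constant no transitions occur at all, which is why either measure is sufficient.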

    Figure 22: Boundary Scan Bit Sequence

    [Figure 22 diagram, bit order from the TDO end to the TDI end: Bit 0 (TDO end) TDO.T; Bit 1 TDO.O; Bit 2 onward: top-edge IOBs (right to left); left-edge IOBs (top to bottom); MD1.T, MD1.O, MD1.I, MD0.I, MD2.I; bottom-edge IOBs (left to right); right-edge IOBs (bottom to top); BSCANT.UPD (TDI end). X6075]


    Configuration

    Configuration is the process of loading design-specific programming data into one or more FPGAs to define the functional operation of the internal blocks and their interconnections. This is somewhat like loading the command registers of a programmable peripheral chip. The XC4000E family uses about 350 bits of configuration data per CLB and its associated interconnects. Each configuration bit defines the state of a static memory cell that controls either a function look-up table bit, a multiplexer input, or an interconnect pass transistor. The XACTstep development system translates the design into a netlist file. It automatically partitions, places and routes the logic and generates the configuration data in PROM format.

    Special Purpose Pins

    Three configuration mode pins (M2, M1, M0) are sampled prior to configuration to determine the configuration mode. After configuration, these pins can be used as auxiliary connections. M2 and M0 can be used as inputs, and M1 can be used as an output. The XACTstep development system does not use these resources unless they are explicitly specified in the design entry. This is done by placing a special pad symbol called MD2, MD1, or MD0 instead of the input or output pad symbol.

    In the XC4000E, the mode pins have weak pull-up resistors during configuration. With all three mode pins High, Slave Serial mode is selected, which is the most popular configuration mode. Therefore, for the most common configuration mode, the mode pins can be left unconnected. (Note, however, that the internal pull-up resistor value can be as high as 100 kΩ.) After configuration, these pins can individually have weak pull-up or pull-down resistors, as specified in the design entry.

    These dedicated nets are located in the lower left chip corner and are near the readback nets. This location allows convenient routing if compatibility with the XC2000 and XC3000 family conventions of M0/RT, M1/RD is desired.

    Configuration Modes

    The XC4000E families have six configuration modes, selected by a 3-bit input code applied to the M2, M1, and M0 inputs. There are three self-loading Master modes, two Peripheral modes, and the Slave Serial mode, which is used primarily for daisy-chained devices. The coding for mode selection is shown in Table 13.

    A detailed description of each configuration mode is included later in this data sheet. During configuration, some of the I/O pins are used temporarily for the configuration process. All pins used during configuration are shown in the “Pin Functions During Configuration” table later in this data sheet.
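
    The mode coding of Table 13 can also be expressed as a lookup, which makes it easy to check, for example, that unconnected mode pins (all pulled High) select Slave Serial mode. The sketch below is a plain transcription of the table; the names `CONFIG_MODES` and `config_mode` are invented for illustration.

```python
# (M2, M1, M0) -> (mode, CCLK direction, data format)
CONFIG_MODES = {
    (0, 0, 0): ("Master Serial", "output", "Bit-Serial"),
    (1, 1, 1): ("Slave Serial", "input", "Bit-Serial"),
    (1, 0, 0): ("Master Parallel Up", "output", "Byte-Wide"),
    (1, 1, 0): ("Master Parallel Down", "output", "Byte-Wide"),
    (0, 1, 1): ("Peripheral Synchronous", "input", "Byte-Wide"),
    (1, 0, 1): ("Peripheral Asynchronous", "output", "Byte-Wide"),
}


def config_mode(m2, m1, m0):
    """Return the Table 13 entry for the sampled mode pins; other codes are reserved."""
    return CONFIG_MODES.get((m2, m1, m0), ("Reserved", None, None))


if __name__ == "__main__":
    print(config_mode(1, 1, 1))  # mode pins left unconnected, pulled High
    print(config_mode(0, 1, 0))  # a reserved code
```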

    Table 13: Configuration Modes

    Mode                       M2  M1  M0   CCLK     Data
    Master Serial              0   0   0    output   Bit-Serial
    Slave Serial               1   1   1    input    Bit-Serial
    Master Parallel Up         1   0   0    output   Byte-Wide, increment from 00000
    Master Parallel Down       1   1   0    output   Byte-Wide, decrement from 3FFFF
    Peripheral Synchronous*    0   1   1    input    Byte-Wide
    Peripheral Asynchronous    1   0   1    output   Byte-Wide
    Reserved                   0   1   0    —        —
    Reserved                   0   0   1    —        —

    *Peripheral Synchronous can be considered Slave Parallel

    Master Modes

    The three Master modes use an internal oscillator to generate a Configuration Clock (CCLK) for driving potential slave devices. They also generate address and timing for external PROM(s) containing the configuration data.

    Master Parallel (Up or Down) modes generate the CCLK signal and PROM addresses and receive byte parallel data. The data is internally serialized into the FPGA data-frame format. The up and down selection generates starting addresses at either zero or 3FFFF, for compatibility with different microprocessor addressing conventions. The Master Serial mode generates CCLK and receives the configuration data in serial form from a Xilinx serial-configuration PROM.

    Peripheral Modes

    The two Peripheral modes accept byte-wide data from a bus. A READY/BUSY status is available as a handshake signal. In the asynchronous mode, the internal oscillator generates a CCLK burst signal that serializes the byte-wide data. In the synchronous mode, an externally supplied clock input to CCLK serializes the data.
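
    The address sequence of the two Master Parallel modes described above can be sketched as a simple generator. This is an illustrative model only: it assumes one PROM byte per address step and an 18-bit address range ending at 3FFFF, and the name `prom_addresses` is hypothetical.

```python
from itertools import islice

ADDRESS_MAX = 0x3FFFF  # highest address named in the text (18 bits)


def prom_addresses(direction):
    """Yield PROM byte addresses for Master Parallel Up or Down mode."""
    if direction == "up":
        addr, delta = 0x00000, 1
    elif direction == "down":
        addr, delta = ADDRESS_MAX, -1
    else:
        raise ValueError("direction must be 'up' or 'down'")
    while 0 <= addr <= ADDRESS_MAX:
        yield addr
        addr += delta


if __name__ == "__main__":
    print([f"{a:05X}" for a in islice(prom_addresses("up"), 3)])
    print([f"{a:05X}" for a in islice(prom_addresses("down"), 3)])
```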


    Slave Serial Mode

    In Slave Serial mode, the FPGA receives serial configuration data on the rising edge of CCLK and, after loading its configuration, passes additional data out, resynchronized on the next falling edge of CCLK.

    Multiple slave devices with identical configurations can be wired with parallel DIN inputs. In this way, multiple devices can be configured simultaneously.

    Multiple devices with different configurations can be connected together in a “daisy chain,” DOUT to DIN, and a single combined bitstream used to configure the chain of slave devices. See the Daisy Chained Devices section for further information on this configuration option.
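
    The daisy-chain idea can be illustrated with a deliberately simplified Python model in which each device keeps the first part of the stream it receives and passes the rest out on DOUT. The real bitstream format (header, length count, frame structure) is not modeled, and the function name and device lengths are hypothetical.

```python
def load_chain(bitstream, device_lengths):
    """Split one combined bitstream across a DOUT-to-DIN chain of devices.

    Each device keeps its own configuration bits and then passes all
    additional data downstream. Returns (per-device data, leftover bits).
    """
    loaded = []
    remaining = list(bitstream)
    for length in device_lengths:
        loaded.append(remaining[:length])   # this device's configuration
        remaining = remaining[length:]      # passed on through DOUT
    return loaded, remaining


if __name__ == "__main__":
    combined = [1, 0, 1, 1, 0, 0, 1, 0, 1]          # toy data, not a real bitstream
    per_device, leftover = load_chain(combined, [4, 3, 2])
    print(per_device, leftover)
```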

    Setting CCLK Frequency

    CCLK can be generated in either of two frequencies. In the default slow mode, the frequency ranges from 0.5 MHz to 1.25 MHz. In fast CCLK mode, the frequency ranges from 4 MHz to 10 MHz. The frequency is sel