
ECP4 High-Speed IO

A Lattice Semiconductor White Paper

LatticeECP4 High Speed IO:

Salvaging High Speed Serial Interfaces


November 2011

Lattice Semiconductor 5555 Northeast Moore Ct.

Hillsboro, Oregon 97124 USA Telephone: (503) 268-8000

www.latticesemi.com




Introduction

The appetite for bandwidth is insatiable. In the communications industry it manifests

itself through the increasing number of smart phones and tablets, causing a bottleneck

in the current backhaul network. In the video transport industry, system requirements

demand higher speeds to support high definition video, ensuring a quality customer

experience. Given this, it is clear that “more is better” when it comes to the features and

capabilities of silicon devices designed to address these markets. Additionally, ever

evolving specifications challenge the designer to provide solutions that are flexible and

extensible. Programmable platforms such as FPGAs help address these challenges.

Not only must FPGAs retain their flexibility to support a wide variety of protocols, they

must also keep pace with the amount of raw data to be transferred in and out of the

programmable fabric. To do so, device IO must support high speed data converters,

memory controllers and the implementation of high speed serial protocols to provide a

viable solution.

This white paper examines the LatticeECP4 mid-range FPGA family and its ability to

address the aforementioned challenges, while maintaining the low cost and low power

of the previous generation LatticeECP family, enabling a unique, next-generation

networking solution.

High Speed IO (GIGA sysIO)

Increased bandwidth, data integrity and decreased board-level real estate continue to

drive the need for serial IO interface support. As a result, many of these serial links

have transitioned to Double Data Rate (DDR) and self-clocked interfaces. These types of

interfaces have tight timing requirements, making them relatively difficult to support

using general purpose IO within an FPGA. LatticeECP4 mid-range FPGAs have been

specifically architected with high speed IO (GIGA sysIO) logic to provide designers the

tools they need to implement a variety of DDR and self-clocked solutions. Additionally,

whereas self-clocked gigabit interfaces have traditionally used embedded SERDES

channels, they can now be supported with the LatticeECP4’s GIGA sysIO logic. This



not only reduces overall power consumption, it frees up SERDES channels for other

applications. This section outlines some of the LatticeECP4’s GIGA sysIO features.

Data Eye Monitor

Timing budgets are critical with high speed serial links. With DDR interfaces, there are

two bits per clock period, making receive synchronization more challenging.

LatticeECP4 GIGA sysIO provides a dynamic data window optimization scheme without

having to stop or reinitiate the transmit data flow through the receive IO buffers. This

feature is essentially an edge monitor that allows dynamic clock-to-data phase

realignment and can be used for many high speed DDR applications. Figure 1 is a

functional block diagram of the LatticeECP4 Data Eye Monitor Circuit.

Figure 1 - Data Eye Monitor Block Diagram

The Data Eye Monitor logic monitors multiple delayed versions of the serial input data

and measures the data transitions of the incoming serial data against the associated

receive clock. The resulting measurements define a valid window within which data can

be synchronized to a local system clock. If either side of the data window falls too close

to a clock transition, the “MINUS” or “PLUS” control signal will be asserted. The

assertion of these signals indicates the receive clock requires either advancement or

additional delay in order to retain a clean sampling window. For added flexibility, the

size of the measured receive data window is configurable, allowing it to be used for a

wide variety of DDR-type interfaces.
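The decision made by the Data Eye Monitor can be illustrated with a small behavioural model. This is only a sketch, not the LatticeECP4 circuit: the function name `eye_monitor_flags`, the representation of data transitions as picosecond offsets from the sampling clock edge, and the mapping of MINUS to the early side and PLUS to the late side are all assumptions made for illustration.

```python
def eye_monitor_flags(edge_offsets_ps, window_ps):
    """Model one measurement interval of a data eye monitor.

    edge_offsets_ps: observed data-transition times relative to the
        sampling clock edge, in picoseconds (negative = before the edge).
    window_ps: configurable half-width of the region around the clock
        edge that must stay free of data transitions.
    Returns the tuple (minus, plus).
    """
    # Assumption: MINUS flags a transition just before the clock edge and
    # PLUS flags one just after it. The white paper does not state which
    # flag corresponds to which side of the window.
    minus = any(-window_ps <= t < 0 for t in edge_offsets_ps)
    plus = any(0 <= t <= window_ps for t in edge_offsets_ps)
    return minus, plus


# Transitions far from the clock edge: neither flag is asserted.
print(eye_monitor_flags([-460, 470], window_ps=100))
# A transition 60 ps before the clock edge: MINUS is asserted.
print(eye_monitor_flags([-60, 880], window_ps=100))
```

Widening `window_ps` in this model corresponds to the configurable data window size mentioned above: a larger window reports a problem earlier, at the cost of more frequent adjustments.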



Input FIFO

After sampling valid DDR data, the positive and negative edge data needs to cross

clock domains between the external synchronizing signal (e.g. DQS for DDR memory

controllers) and the internal system clock. In DDR3 memory controllers there is typically

a DQS control signal for every byte of data. The LatticeECP4 GIGA sysIO provides an

optional input data FIFO and control block for every DQS group to ensure read data is

accurately conveyed between clocks. When enabled, the input FIFO also provides a

form of read leveling for cases where DDR read data reaches the FPGA with

slightly different arrival times. By removing the need to deal with timing uncertainties at

the GIGA sysIO inputs, the designer can be confident the receive data will be valid

under a variety of operating conditions.

DDR Gearbox

Gearing data is required in both receive (IDDR) and transmit (ODDR) directions. In the

receive direction, data is converted to the final system bus width before being presented

to the FPGA logic for processing. Likewise, in the transmit direction, system bus data is

converted by the gearbox to the DDR serial form in order to be transmitted.

The gearing modes supported by the LatticeECP4 GIGA sysIO are x1 / x2 / x4 / x5 /

x7_1, which will support a wide variety of DDR interfaces. The 7_1 mode enables

LatticeECP4 FPGAs to support applications using video camera links that cannot use

any of the other gearing modes.
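The effect of gearing can be shown with a short software model. It is a sketch under stated assumptions rather than a description of the GIGA sysIO hardware: the function names are hypothetical, the first received bit is placed in position 0 of each word, and word alignment (finding where a word starts) is ignored.

```python
def gear_down(bits, ratio):
    """Receive direction: group a serial bit stream into parallel words.

    Trailing bits that do not fill a complete word are dropped, as a
    real gearbox would still be waiting for the rest of that word.
    """
    usable = len(bits) - len(bits) % ratio
    return [tuple(bits[i:i + ratio]) for i in range(0, usable, ratio)]


def gear_up(words):
    """Transmit direction: flatten parallel words back into a serial stream."""
    return [bit for word in words for bit in word]


def parallel_rate_mbps(serial_rate_mbps, ratio):
    """Word rate seen by the fabric for a given serial rate and gearing ratio."""
    return serial_rate_mbps / ratio


serial = [1, 0, 1, 1, 0, 0, 1,  0, 1, 0, 1, 0, 1, 0]
words = gear_down(serial, 7)          # two 7-bit words for the fabric
assert gear_up(words) == serial       # gearing is lossless
print(words, parallel_rate_mbps(700, 7))
```

In this model a 7:1 link running at an example rate of 700 Mbps per pin presents words to the fabric at 100 MHz, which is the reason gearing is needed: the fabric processes wide words at a fraction of the pin rate.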

ISI Circuit

Intersymbol interference (ISI) occurs on a PC board mainly because of non-linearities

associated with the data channel and symbol flight time. The key symptom of ISI is the

distortion caused by one symbol interfering with a subsequent symbol. With the

increasing operating speeds of high speed serial and DDR interfaces, and the

inevitably band-limited channels (traces) that signals traverse, ISI reduction is

becoming more necessary than ever.



LatticeECP4 GIGA sysIO provides an optional ISI reduction circuit that resides on the

output channels just before the final output buffer. The ISI logic maintains a local signal

history and, based upon a configurable “stretch” value, will stretch the leading edge of a

data transition in order to reduce symbol blurring (Figure 2).

Figure 2 - ISI Stretching

Clock Recovery Unit

Many serial interfaces employ self-clocking, which requires a Clock Recovery Unit (CRU)

to extract an accurate receive clock on the receive side of the interface. One such protocol is SGMII.

Historically, SGMII, operating at 1.25Gbps, would require the use of an embedded

SERDES channel. Now, with the LatticeECP4, serial interfaces up to 1.25Gbps can be

supported using a soft CRU, freeing up SERDES channels and lowering the

application’s power budget.

Figure 3 provides a functional block diagram of the Clock Recovery Unit found within

the LatticeECP4.



Figure 3 - Clock Recovery Unit Block Diagram

As shown in Figure 3, the input reference clock is passed through a PLL to the Clock Alignment

block. The Clock Alignment block receives updated phase adjustment status,

which is used to either advance or retard the recovered clock that is fed to the DDR

Monitor block. The Tracking Logic is a soft solution, implemented in FPGA gates, that uses the phase

measurements from the DDR Monitor block to generate the adjustment settings. The

Tracking Logic dynamically monitors the phase difference between the receive data and

clock, maintaining a valid clock-to-data phase relationship, even when the incoming receive

data is wandering due to Process, Voltage, and Temperature variation. Soft IP is

specifically designed to dynamically monitor and control a variety of applications that

require a CRU. The number of CRUs within each ECP4 device is shown in Table 1.

ECP4 Device    Max. # of CRUs
ECP4-30/50     18
ECP4-95/130    32
ECP4-190       36
ECP4-250       40

Table 1 - Clock Recovery Units Per LatticeECP4 Device
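The feedback loop formed by the DDR Monitor, Tracking Logic and Clock Alignment blocks can be sketched as a simple bang-bang tracker. This is an illustrative model, not the Lattice soft IP: the function name `track`, the three measurement labels, the 16-step phase range, the wrap-around behaviour and the convention that a lower setting means an earlier clock are all assumptions.

```python
def track(measurements, taps=16, start=0):
    """Step a phase setting up or down in response to monitor readings.

    measurements: iterable of "early", "late" or "ok", one per
        measurement interval (hypothetical labels).
    taps: number of selectable clock phases (hypothetical value).
    Returns the phase setting chosen after each measurement.
    """
    setting = start
    history = []
    for reading in measurements:
        if reading == "late":       # recovered clock lags the data: advance it
            setting = (setting - 1) % taps
        elif reading == "early":    # recovered clock leads the data: retard it
            setting = (setting + 1) % taps
        # "ok" leaves the setting unchanged
        history.append(setting)
    return history


print(track(["late", "late", "ok", "early"]))
```

Moving by one step per measurement keeps the recovered clock following slow wander, such as that caused by voltage and temperature drift, without reacting violently to a single noisy reading.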

Margin Control

Margin Control is typically used for DDR type interfaces, allowing designers to detect

the clock to receive data timing relationship during the initial hardware integration stage.

This logic is used to establish a board-level timing margin value. As a result of the

measurement, the device interface’s timing “sweet spot” is found. Figure 4 provides a

high-level example of a valid data window relative to a clock.

Figure 4 - Margin Control Timing Window

Using the Margin measurement, the proper input data delay tap can be selected to set

the timing margin, providing a high confidence level before going to final production.

Figure 5 provides a block diagram of the LatticeECP4 Margin Control Logic.

Figure 5 - Margin Control Block Diagram
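One way to turn a margin measurement into a delay tap choice is to sweep every tap, record whether data was received without error, and pick the centre of the widest passing region. The sketch below assumes that procedure; the function name `best_delay_tap` and the pass/fail list format are hypothetical, and the white paper does not describe the exact selection method used by the Margin Control logic.

```python
def best_delay_tap(results):
    """Pick the delay tap in the middle of the longest error-free run.

    results: list of booleans, one per input delay tap in sweep order
        (True = data was sampled correctly at that tap).
    Returns the chosen tap index, or None if no tap passed.
    """
    best_start, best_len = None, 0
    run_start = None
    # A trailing False acts as a sentinel so the final run is closed.
    for i, ok in enumerate(list(results) + [False]):
        if ok and run_start is None:
            run_start = i
        elif not ok and run_start is not None:
            if i - run_start > best_len:
                best_start, best_len = run_start, i - run_start
            run_start = None
    if best_start is None:
        return None
    return best_start + (best_len - 1) // 2


sweep = [False, False, True, True, True, True, True, False]
print(best_delay_tap(sweep))  # taps 2..6 pass, so the centre tap is chosen
```

Choosing the centre rather than the first passing tap leaves roughly equal margin on both sides of the sampling point, which is the "sweet spot" described above.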



The number of Margin Control units is also a function of the LatticeECP4 device.

Table 2 shows the number of Margin Control Units per LatticeECP4 device.

ECP4 Device    Max. # of Margin Control Units
ECP4-30/50     18
ECP4-95/130    32
ECP4-190       36
ECP4-250       40

Table 2 - Margin Control Units Per LatticeECP4 Device

GIGA sysIO Applications

The functional blocks described in the previous section have been designed as building

blocks to support a wide variety of high speed serial interfaces. Using these blocks with

additional IO and user logic features, a vast array of interface solutions can be built.

This section outlines a few high speed interface applications and how they could use

the LatticeECP4 GIGA sysIO features.

DDR Memory Support

Many networking designs require temporary data storage elements that possess low latency and configurable bus widths. With their advanced high speed logic, LatticeECP4 FPGAs provide support for a wide variety of applications, such as packet buffering, video data

buffering and processor bulk data storage. Traditional memory interfaces such as DDR1 and DDR2 are supported at data rates up to 266 and 800Mbps, respectively. The advanced DDR3

IO logic supports dynamic Clock and Data alignment logic at data rates up to 1066Mbps.

Figure 6 provides a high level diagram of a DDR3 interface that uses several of the

advanced logic elements defined above.



Figure 6 - DDR3 Interface Functional Diagram

As depicted in Figure 6, the LatticeECP4 provides the data and DQS control logic required to support

DDR3 interfaces. The DQS signals function as input and output signals based on

whether a data Read or data Write transaction is occurring. When a Read transaction is

occurring, the DQS signals are sampled and internally delayed in order to be properly

aligned with the read data and to account for any internal delay routing within the FPGA

IO logic. For Write transactions, the DQS signal originates within the ODDR logic

specifically designed to create the required ninety degree phase offset, ensuring the

rising edge of the DQS signal falls mid-bit relative to the associated Transmit data.
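To see why a ninety degree offset places the DQS edge in the middle of a data bit, it helps to work through the numbers at the 1066Mbps DDR3 rate quoted above. The helper below is a back-of-the-envelope sketch; the function name `ddr_timing` is hypothetical and the figures are ideal values that ignore jitter, skew and board effects.

```python
def ddr_timing(data_rate_mbps):
    """Return (clock_mhz, bit_time_ps, quarter_period_ps) for a DDR link.

    A DDR interface transfers two bits per clock period, so the clock
    runs at half the per-pin data rate.
    """
    clock_mhz = data_rate_mbps / 2
    bit_time_ps = 1e6 / data_rate_mbps        # width of one data bit
    quarter_period_ps = 1e6 / clock_mhz / 4   # 90 degrees of the clock
    return clock_mhz, bit_time_ps, quarter_period_ps


clock_mhz, bit_ps, quarter_ps = ddr_timing(1066)
print(f"clock {clock_mhz:.0f} MHz, bit {bit_ps:.0f} ps, 90 deg {quarter_ps:.0f} ps")
```

At 1066Mbps the clock is 533 MHz, a bit lasts roughly 938 ps, and a quarter of the clock period is roughly 469 ps, which is exactly half a bit time. The whole timing budget for setup and hold on each side of the DQS edge is therefore under half a nanosecond.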

As indicated earlier, all Read data is sent through a Margin Control block in order to

define the proper timing margin input. The data is then sent through a small

resynchronizing FIFO and finally to the IDDR logic, which synchronously converts the

receive data to the internal data bus format. Also shown is the Data Eye monitor that

works in conjunction with the Soft IP logic to define and maintain a valid relationship

between the System clock and incoming receive data.

The above diagram is just one example of several potential DDR solutions. The

advanced IO capabilities of the LatticeECP4 can support a wide variety of DDR



interfaces, including those that possess the most stringent DDR interfacing

requirements.

SPI4.2 Support

As with earlier LatticeECP FPGA families, the LatticeECP4 FPGA family continues to

support the SPI4.2 chip interconnect protocol. With the LatticeECP4’s innovative high

speed IO architecture, performance enhancements are more readily achievable. The

LatticeECP4 family supports both SPI4.2 Static and Dynamic alignment modes of

operation up to 840Mbps and 1Gbps, respectively, and, with individual clock to data

alignment logic, locking to the incoming receive data is implemented efficiently. As with

other DDR interfaces, additional soft IP can be implemented to dynamically monitor the

clock to data phase relationship to maintain their specific timing budget.

Figure 7 is a high level functional diagram of the SPI4.2 interface using LatticeECP4

GIGA sysIO.

Figure 7 - SPI4.2 Interface Functional Diagram

Similar to the DDR3 interface, when the LatticeECP4 GIGA sysIO is configured for

SPI4.2 mode, the reference clock is used to create a phase-aligned System clock,

whose relationship to incoming data can be dynamically maintained using a



combination of the Data Eye Monitor and soft IP logic. The Receive data also uses

many of the same functional blocks as the DDR3 interface implementation. With the

receive timing margin set, and data successfully converted to the internal System clock

domain, the input DDR logic converts the incoming data stream to a wider parallel bus

operating at a much lower data rate for internal processing. In the transmit direction,

the logic can use either the recovered System clock or an independent clock source.

The output DDR logic takes the incoming transmit data bus and converts it to the

required sixteen bit output format. The DDR logic also creates the TDCLK synchronous

to the TDAT bus, ensuring the dynamic timing requirements for the SPI4.2 transmit

interface are maintained.

DAC/ADC Interface Support

The LatticeECP4 family of mid-range FPGAs is specifically designed to provide low cost and low power wireless Remote Radio Head (RRH) solutions. On the RRH card the LatticeECP4

fabric contains the digital logic and the high speed IO is equipped to interface with many modern differential DAC/ADC devices. The downlink and uplink DAC/ADC interfaces support

up to 1.4Gsps and 700Msps, respectively.

Figure 8 provides an example of a DAC/ADC interface pair using a sixteen bit downlink

and twelve bit uplink interface bus pair.



Figure 8 - DAC/ADC Interface Functional Diagram

The ADC interface assumes an uplink clock that is centered relative to the twelve bit

data bus. The downlink data bus and associated clock are all processed within the

LatticeECP4’s IO logic. As shown in the above example, a minimal amount of logic is

required to support this type of DDR interface, producing a highly cost-effective solution

that requires little design and verification time.

Clock Recovery Support

A variety of existing serial data protocols support some form of encoding scheme

whereby the receive clock must be recovered from the incoming data stream. For such

protocols, a sufficient number of data transitions must occur to ensure a reliable clock

can be recovered. The LatticeECP4 high speed IO logic meets the challenge of clock

recovery using innovative Clock Recovery Unit (CRU) logic that monitors the incoming

data relative to a reference clock.



Figure 9 is an example of a CDR interface that can be created using the LatticeECP4’s

high-speed logic.

Figure 9 - CDR Functional Interface Diagram

The CRU IO logic shown in Figure 9 can support clock recovery of up to 1.25Gbps, using much of the same logic as

with other DDR solutions. The incoming serial data can use the optional Margin Test

logic to establish the receive timing budget. The data is then delayed and clock domain

crossed to the internal system clock domain. The input DDR logic converts the data to

a 10-bit parallel bus, which interfaces to the customer’s PCS logic. As shown in the

diagram, the PCS logic can then use one of the LatticeECP4’s Tri-Speed Ethernet

MACO Communications Engines, if desired. In the transmit direction, parallel data is

synchronously converted to the outbound serial data rate for transmission.
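The rates seen by the PCS logic follow directly from the line rate and the 10-bit bus width. The sketch below works through them for SGMII; the function name `sgmii_rates` is hypothetical, and the 8b/10b line coding used to derive the payload rate is background knowledge about SGMII rather than something stated in this paper.

```python
def sgmii_rates(line_rate_mbps=1250, bus_width_bits=10):
    """Return (parallel_clock_mhz, payload_mbps) for a 10-bit PCS interface.

    Assumes 8b/10b coding, in which every 10 line bits carry 8 data bits.
    """
    parallel_clock_mhz = line_rate_mbps / bus_width_bits
    payload_mbps = line_rate_mbps * 8 / 10
    return parallel_clock_mhz, payload_mbps


print(sgmii_rates())
```

For a 1.25Gbps link this gives a 125 MHz parallel clock and 1000 Mbps of payload, so the fabric-side PCS logic only has to run at a modest clock rate even though the pin toggles at gigabit speed.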

The number of CRUs within the LatticeECP4 family is a function of device size.

Because the logic reuses much of the same IO logic as other interfaces, there is an

ample number of CRU circuits provided in each device, allowing a “best fit” customer

solution when sizing LatticeECP4 devices for this type of application.

Video Interface

The LatticeECP4 provides an enhanced set of high-speed IO features that support a

variety of LCD display and sensor interfaces. Depending upon the protocol being

supported, the IO gearing logic provides 2:1, 4:1, 7:1, 8:1, and 10:1 modes of operation, at



data rates of up to 800Mbps. For all the gearing modes, internal data monitoring can be

used to ensure clock to data timing integrity. Optionally, margin testing can be used to

ensure the design’s timing budget is satisfactorily met.

Summary

System designers will continue to be under pressure to work their “magic” to produce

higher performance systems while maintaining lower build and operational costs. And

these pressures are expected to increase, with high speed I/O being just one of the

attributes that the LatticeECP4 offers to help meet these challenges.

FPGAs have historically played key roles in system designs, but are now pushing

forward to new levels of performance while assisting in lowering overall system build

and operational costs. Feature rich, low cost FPGAs enable fast time to market and time

to revenue, and the flexibility and performance to accommodate evolving standards.

Systems/design engineers are now equipped with an enhanced set of IO capabilities

that enable them to provide high-speed solutions without a lengthy design process.
