1 ECP4 High-Speed IO
A Lattice Semiconductor White Paper
LatticeECP4 High Speed IO:
Salvaging High Speed Serial Interfaces
A Lattice Semiconductor White Paper
November 2011
Lattice Semiconductor 5555 Northeast Moore Ct.
Hillsboro, Oregon 97124 USA Telephone: (503) 268-8000
www.latticesemi.com
Introduction
The appetite for bandwidth is insatiable. In the communications industry it manifests
itself through the increasing number of smart phones and tablets, causing a bottleneck
in the current backhaul network. In the video transport industry, system requirements
demand higher speeds to support high definition video, ensuring a quality customer
experience. Given this, it is clear that “more is better” when it comes to the features and
capabilities of silicon devices designed to address these markets. Additionally, ever
evolving specifications challenge the designer to provide solutions that are flexible and
extensible. Programmable platforms such as FPGAs help address these challenges.
Not only must FPGAs retain their flexibility to support a wide variety of protocols, they
must also keep pace with the amount of raw data to be transferred in and out of the
programmable fabric. To do so, device IO must support high speed data converters,
memory controllers and the implementation of high speed serial protocols to provide a
viable solution.
This white paper examines the LatticeECP4 mid-range FPGA family and its ability to
address the aforementioned challenges, while maintaining the low cost and low power
of the previous generation LatticeECP family, enabling a unique, next-generation
networking solution.
High Speed IO (GIGA sysIO)
Increased bandwidth, data integrity and decreased board-level real estate continue to
drive the need for serial IO interface support. As a result, many of these serial links
have transitioned to Double Data Rate (DDR) and self-clocked interfaces. These types of
interfaces have tight timing requirements, making them relatively difficult to support
using general purpose IO within an FPGA. LatticeECP4 mid-range FPGAs have been
specifically architected with high speed IO (GIGA sysIO) logic to provide designers the
tools they need to implement a variety of DDR and self-clocked solutions. Additionally,
whereas self-clocked gigabit interfaces have traditionally used embedded SERDES
channels, they can now be supported with the LatticeECP4’s GIGA sysIO logic. This
not only reduces overall power consumption, it frees up SERDES channels for other
applications. This section outlines some of the LatticeECP4’s GIGA sysIO features.
Data Eye Monitor
Timing budgets are critical with high speed serial links. With DDR interfaces, there are
two bits per clock period, making receive synchronization more challenging.
LatticeECP4 GIGA sysIO provides a dynamic data window optimization scheme that does not
require stopping or reinitiating the transmit data flow through the receive IO buffers. This
feature is essentially an edge monitor that allows dynamic realignment of the
clock-to-receive-data phase, and it can be used for many high speed DDR applications.
Figure 1 is a functional block diagram of the LatticeECP4 Data Eye Monitor Circuit.
Figure 1 - Data Eye Monitor Block Diagram
The Data Eye Monitor logic monitors multiple delayed versions of the serial input data
and measures the data transitions of the incoming serial data against the associated
receive clock. The resulting measurements define a valid window within which data can
be synchronized to a local system clock. If either side of the data window falls too close
to a clock transition, the “MINUS” or “PLUS” control signal will be asserted. The
assertion of these signals indicates the receive clock requires either advancement or
additional delay in order to retain a clean sampling window. For added flexibility, the
size of the measured receive data window is configurable, allowing it to be used for a
wide variety of DDR-type interfaces.
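As a rough behavioural illustration of the MINUS and PLUS flags described above, the Python sketch below models a single measurement. It is not the LatticeECP4 circuit. The function name, the picosecond units, the guard_ps parameter (standing in for the configurable window size) and the mapping of MINUS to the earlier window edge and PLUS to the later edge are all assumptions made for illustration.

```python
def eye_monitor_flags(left_edge_ps, right_edge_ps, sample_ps, guard_ps):
    """Return (minus, plus) flags for one data-window measurement.

    left_edge_ps, right_edge_ps: measured boundaries of the valid data window.
    sample_ps: position of the sampling clock edge on the same time axis.
    guard_ps: minimum clearance required on each side (hypothetical parameter).
    """
    # Assumption: MINUS flags too little clearance to the earlier edge,
    # PLUS flags too little clearance to the later edge.
    minus = (sample_ps - left_edge_ps) < guard_ps
    plus = (right_edge_ps - sample_ps) < guard_ps
    return minus, plus


if __name__ == "__main__":
    # A 900 ps window sampled near its left edge raises only MINUS.
    print(eye_monitor_flags(0, 900, 100, 200))
```

In a real design the two flags would feed whatever logic advances or delays the receive clock; here they are simply returned to the caller.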
Input FIFO
After sampling valid DDR data, the positive and negative edge data needs to cross
clock domains between the external synchronizing signal (e.g. DQS for DDR memory
controllers) and the internal system clock. In DDR3 memory controllers there is typically
a DQS control signal for every byte of data. The LatticeECP4 GIGA sysIO provides an
optional input data FIFO and control block for every DQS group to ensure read data is
accurately conveyed between clock domains. When enabled, the input FIFO also provides a
form of read leveling, absorbing the slightly different arrival times of DDR read data
at the FPGA. By removing the need to deal with timing uncertainties at
the GIGA sysIO inputs, the designer can be confident the receive data will be valid
under a variety of operating conditions.
DDR Gearbox
Gearing data is required in both receive (IDDR) and transmit (ODDR) directions. In the
receive direction, data is converted to the final system bus width before being presented
to the FPGA logic for processing. Likewise, in the transmit direction, system bus data is
converted by the gearbox to the DDR serial form in order to be transmitted.
The gearing modes supported by the LatticeECP4 GIGA sysIO are x1 / x2 / x4 / x5 /
x7_1, which will support a wide variety of DDR interfaces. The 7_1 mode enables
LatticeECP4 FPGAs to support applications using video camera links that cannot use
any of the other gearing modes.
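The gearing idea can be illustrated with a small bit-level model in Python. This is a sketch under stated assumptions, not a description of the GIGA sysIO hardware: the function names are hypothetical, bit ordering (rising-edge bit first, first serial bit first in each word) is an assumption, and word alignment is ignored.

```python
def ddr_capture(rising_bits, falling_bits):
    """Interleave bits sampled on rising and falling clock edges
    (assumed order: rising-edge bit first)."""
    stream = []
    for rising, falling in zip(rising_bits, falling_bits):
        stream.extend([rising, falling])
    return stream


def gear_down(serial_bits, ratio):
    """Receive direction: group the serial stream into ratio-bit parallel
    words. An incomplete trailing word is dropped."""
    usable = len(serial_bits) - len(serial_bits) % ratio
    return [serial_bits[i:i + ratio] for i in range(0, usable, ratio)]


def gear_up(words):
    """Transmit direction: flatten parallel words back into a serial stream."""
    return [bit for word in words for bit in word]


if __name__ == "__main__":
    stream = ddr_capture([1, 0, 1, 1], [0, 0, 1, 0])
    print(gear_down(stream, 4))
```

With a ratio of 7, the same gear_down call produces the seven-bit words used by 7:1 camera links, which is why that mode cannot be built from the power-of-two ratios.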
ISI Circuit
Intersymbol interference (ISI) occurs on a PC board due mainly to non-linearities
associated with the data channel and symbol flight time. The key symptom of ISI is the
distortion caused by one symbol interfering with a subsequent symbol. With the
increasing operating speeds of high speed serial and DDR interfaces, and the
inevitably band-limited channels (traces) that signals traverse, ISI reduction is
becoming more necessary than ever.
LatticeECP4 GIGA sysIO provides an optional ISI reduction circuit that resides on the
output channels just before the final output buffer. The ISI logic maintains a local signal
history and, based upon a configurable “stretch” value, will stretch the leading edge of a
data transition in order to reduce symbol blurring (Figure 2).
Figure 2 - ISI Stretching
Clock Recovery Unit
Many serial interfaces employ self-clocking, which requires a Clock Recovery Unit (CRU)
to extract an accurate receive clock on the receive side of the interface. One such
protocol is SGMII. Historically, SGMII, operating at 1.25Gbps, would require the use of
an embedded SERDES channel. Now, with the LatticeECP4, serial interfaces up to 1.25Gbps
can be supported using a soft CRU, freeing up SERDES channels and lowering the
application's power budget.
Figure 3 provides a functional block diagram of the Clock Recovery Unit found within
the LatticeECP4.
Figure 3 - Clock Recovery Unit Block Diagram
As shown in Figure 3, the input reference clock is passed through a PLL to the Clock
Alignment block. The Clock Alignment block receives updated phase adjustment status,
which it uses to either advance or retard the recovered clock that is fed to the DDR
Monitor block. The Tracking Logic is implemented in soft FPGA logic and uses the phase
measurements from the DDR Monitor block to generate the adjustment settings. The
Tracking Logic dynamically monitors the phase difference between the receive data and
clock, maintaining a valid clock-to-data phase relationship even when the incoming
receive data is wandering due to Process, Voltage, and Temperature variation. Soft IP is
specifically designed to dynamically monitor and control a variety of applications that
require a CRU. The number of CRUs within each ECP4 device is shown in Table 1.
ECP4 Device      Max. # of CRUs
ECP4-30/50       18
ECP4-95/130      32
ECP4-190         36
ECP4-250         40
Table 1 - Clock Recovery Units Per LatticeECP4 Device
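The advance/retard behaviour of the Tracking Logic described for Figure 3 can be approximated by a simple one-step-per-update tracking loop. The Python sketch below is only an illustration of that general technique. The function name, the 16 phase steps per unit interval, and the convention that a positive error moves the tap upward are assumptions, not details of the Lattice soft IP.

```python
def track_phase(data_phase_samples, steps_per_ui=16, start_tap=0):
    """Follow a drifting data phase with one-tap adjustments.

    data_phase_samples: measured data phase per update, in tap units.
    steps_per_ui: number of phase taps in one unit interval (assumed value).
    Returns the selected tap after each update.
    """
    half = steps_per_ui // 2
    tap = start_tap
    history = []
    for data_phase in data_phase_samples:
        # Signed phase error, wrapped into the range [-half, half).
        error = (data_phase - tap + half) % steps_per_ui - half
        if error > 0:
            tap = (tap + 1) % steps_per_ui  # move one tap later
        elif error < 0:
            tap = (tap - 1) % steps_per_ui  # move one tap earlier
        history.append(tap)
    return history


if __name__ == "__main__":
    print(track_phase([3, 3, 3, 3, 3]))
```

Moving only one tap per update keeps the recovered clock from jumping, at the cost of needing several updates to follow a large phase step.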
Margin Control
Margin Control is typically used for DDR type interfaces, allowing designers to detect
the clock to receive data timing relationship during the initial hardware integration stage.
This logic is used to establish a board-level timing margin value. As a result of the
measurement, the device interface’s timing “sweet spot” is found. Figure 4 provides a
high-level example of a valid data window relative to a clock.
Figure 4 - Margin Control Timing Window
Using the Margin measurement, the proper input data delay tap can be selected to set
the timing margin, providing a high confidence level before going to final production.
Figure 5 provides a block diagram of the LatticeECP4 Margin Control Logic.
Figure 5 - Margin Control Block Diagram
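One common way to turn a margin measurement into a delay tap setting is to sweep every tap, record which taps capture data without error, and pick the center of the widest passing run. The Python sketch below shows that approach. It is an assumed procedure for illustration, not the documented LatticeECP4 Margin Control algorithm, and the function name and pass/fail input format are hypothetical.

```python
def find_sweet_spot(tap_results):
    """Pick the delay tap at the center of the widest passing window.

    tap_results: one boolean per delay tap, True if data was captured
    without error at that tap.
    Returns (center_tap, margin_in_taps), or None if no tap passed.
    """
    best_start, best_len = None, 0
    run_start = None
    # A trailing False acts as a sentinel that closes the final run.
    for index, passed in enumerate(list(tap_results) + [False]):
        if passed and run_start is None:
            run_start = index
        elif not passed and run_start is not None:
            run_len = index - run_start
            if run_len > best_len:
                best_start, best_len = run_start, run_len
            run_start = None
    if best_start is None:
        return None
    center = best_start + (best_len - 1) // 2
    last = best_start + best_len - 1
    margin = min(center - best_start, last - center)
    return center, margin


if __name__ == "__main__":
    sweep = [False, False, True, True, True, True, True, False]
    print(find_sweet_spot(sweep))
```

The returned margin is the number of taps of slack on the tighter side of the window, which is the figure a designer would compare against the expected board-level variation.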
The number of Margin Control units is also a function of the LatticeECP4 device.
Table 2 shows the number of Margin Control Units per LatticeECP4 device.
ECP4 Device      Max. # of Margin Control Units
ECP4-30/50       18
ECP4-95/130      32
ECP4-190         36
ECP4-250         40
Table 2 - Margin Control Units Per LatticeECP4 Device
GIGA sysIO Applications
The functional blocks described in the previous section have been designed as building
blocks to support a wide variety of high speed serial interfaces. Using these blocks with
additional IO and user logic features, a vast array of interface solutions can be built.
This section outlines a few high speed interface applications and how they could use
the LatticeECP4 GIGA sysIO features.
DDR Memory Support
Many networking designs require temporary data storage elements that possess low latency and configurable bus widths. With their advanced high speed logic, LatticeECP4 FPGAs provide support for a wide variety of applications, such as packet buffering, video data
buffering and processor bulk data storage. Traditional memory interfaces such as DDR1 and DDR2 are supported at data rates up to 266 and 800Mbps, respectively. The advanced DDR3
IO logic supports dynamic Clock and Data alignment logic at data rates up to 1066Mbps.
Figure 6 provides a high level diagram of a DDR3 interface that uses several of the
advanced logic elements defined above.
Figure 6 - DDR3 Interface Functional Diagram
As depicted in Figure 6, the LatticeECP4 provides the data and DQS control logic
required to support DDR3 interfaces. The DQS signals function as inputs or outputs
depending on whether a data Read or data Write transaction is occurring. During a Read
transaction, the DQS signals are sampled and internally delayed in order to be properly
aligned with the read data and to account for any internal routing delay within the FPGA
IO logic. For Write transactions, the DQS signal originates within the ODDR logic,
which is specifically designed to create the required ninety degree phase offset,
ensuring the rising edge of the DQS signal is mid-bit relative to the associated
Transmit data.
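To see why a ninety degree offset places the DQS edge mid-bit, note that a DDR interface carries two bits per clock period, so a quarter of the clock period equals half of a bit time. The short Python calculation below works this out. The function name is hypothetical, and the sketch assumes an ideal clock with no skew or jitter.

```python
def dqs_write_offset_ps(data_rate_mbps):
    """Return the 90 degree DQS offset in picoseconds for a DDR data rate.

    Assumes two bits per clock period and an ideal clock.
    """
    bit_time_ps = 1e6 / data_rate_mbps   # one bit time in ps
    clock_period_ps = 2 * bit_time_ps    # DDR: two bits per clock period
    return clock_period_ps / 4           # 90 degrees of the clock period


if __name__ == "__main__":
    # 1066Mbps is the DDR3 rate quoted in this section.
    print(round(dqs_write_offset_ps(1066), 1))
```

At 1066Mbps the bit time is roughly 938 ps, so the offset comes to about 469 ps, which is the middle of the bit.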
As indicated earlier, all Read data is sent through a Margin Control block in order to
define the proper timing margin input. The data is then sent through a small
resynchronizing FIFO and finally to the IDDR logic, which synchronously converts the
receive data to the internal data bus format. Also shown is the Data Eye monitor that
works in conjunction with the Soft IP logic to define and maintain a valid relationship
between the System clock and incoming receive data.
The above diagram is just one example of several potential DDR solutions. The
advanced IO capabilities of the LatticeECP4 can support a wide variety of DDR
interfaces, including those that possess the most stringent DDR interfacing
requirements.
SPI4.2 Support
As with earlier LatticeECP FPGA families, the LatticeECP4 FPGA family continues to
support the SPI4.2 chip interconnect protocol. With the LatticeECP4’s innovative high
speed IO architecture, performance enhancements are more readily achievable. The
LatticeECP4 family supports both SPI4.2 Static and Dynamic alignment modes of
operation up to 840Mbps and 1Gbps, respectively, and, with individual clock to data
alignment logic, locking to the incoming receive data is implemented efficiently. As with
other DDR interfaces, additional soft IP can be implemented to dynamically monitor the
clock to data phase relationship to maintain their specific timing budget.
Figure 7 is a high level functional diagram of the SPI4.2 interface using LatticeECP4
GIGA sysIO.
Figure 7 - SPI4.2 Interface Functional Diagram
Similar to the DDR3 interface, when the LatticeECP4 GIGA sysIO is configured for
SPI4.2 mode, the reference clock is used to create a phase-aligned System clock,
whose relationship to incoming data can be dynamically maintained using a
combination of the Data Eye Monitor and soft IP logic. The Receive data also uses
many of the same functional blocks as the DDR3 interface implementation. With the
receive timing margin set, and data successfully converted to the internal System clock
domain, the input DDR logic converts the incoming data stream to a wider parallel bus
operating at a much lower rate for internal processing. In the transmit direction,
the logic can use either the recovered System clock or an independent clock source.
The output DDR logic takes the incoming transmit data bus and converts it to the
required sixteen bit output format. The DDR logic also creates the TDCLK synchronous
to the TDAT bus, ensuring the dynamic timing requirements for the SPI4.2 transmit
interface are maintained.
DAC/ADC Interface Support
The LatticeECP4 family of mid-range FPGAs is specifically designed to provide low cost and low power wireless Remote Radio Head (RRH) solutions. On the RRH card the LatticeECP4
fabric contains the digital logic and the high speed IO is equipped to interface with many modern differential DAC/ADC devices. The downlink and uplink DAC/ADC interfaces support
up to 1.4Gsps and 700Msps, respectively.
Figure 8 provides an example of a DAC/ADC interface pair using a sixteen bit downlink
and twelve bit uplink interface bus pair.
Figure 8 - DAC/ADC Interface Functional Diagram
The ADC interface assumes an uplink clock that is centered relative to the twelve bit
data bus. The downlink data bus and associated clock are all processed within the
LatticeECP4’s IO logic. As shown in the above example, a minimal amount of logic is
required to support this type of DDR interface, producing a highly cost-effective solution
that requires little design and verification time.
Clock Recovery Support
A variety of existing serial data protocols support some form of encoding scheme
whereby the receive clock must be recovered from the incoming data stream. For such
protocols, a sufficient number of data transitions must occur to ensure a reliable clock
can be recovered. The LatticeECP4 high speed IO logic meets the challenge of clock
recovery using innovative Clock Recovery Unit (CRU) logic that monitors the incoming
data relative to a reference clock.
Figure 9 is an example of a CDR interface that can be created using the LatticeECP4’s
high-speed logic.
Figure 9 - CDR Functional Interface Diagram
The CRU IO logic shown in
Figure 9 can support clock recovery of up to 1.25Gbps, using much of the same logic as
with other DDR solutions. The incoming serial data can use the optional Margin Test
logic to establish the receive timing budget. The data is then delayed and clock domain
crossed to the internal system clock domain. The input DDR logic converts the data to
a 10-bit parallel bus, which interfaces to the customer’s PCS logic. As shown in the
diagram, the PCS logic can then use one of the LatticeECP4’s Tri-Speed Ethernet
MACO Communications Engines, if desired. In the transmit direction, parallel data is
synchronously converted to the outbound serial data rate for transmission.
The number of CRUs within the LatticeECP4 family is a function of device size.
Because the logic reuses much of the same IO logic as other interfaces, ample CRU
circuits are provided in each device, allowing a “best fit” customer
solution when sizing LatticeECP4 devices for this type of application.
Video Interface
The LatticeECP4 provides an enhanced set of high-speed IO features that support a
variety of LCD display and sensor interfaces. Depending upon the exact protocol
supported, the IO gearing logic supports 2:1, 4:1, 7:1, 8:1, and 10:1 modes of operation, at
data rates of up to 800Mbps. For all the gearing modes, internal data monitoring can be
used to ensure clock to data timing integrity. Optionally, margin testing can be used to
ensure the design’s timing budget is satisfactorily met.
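The practical effect of each gearing mode is the rate at which parallel words reach the FPGA fabric. Assuming an N:1 mode means N serial bits per parallel word on each IO pin, the word rate is simply the line rate divided by N. The Python sketch below tabulates this for the ratios listed above at 800Mbps. The function name is hypothetical and the result is a nominal figure that ignores any clocking overhead.

```python
def fabric_rate_mhz(line_rate_mbps, gear_ratio):
    """Nominal parallel word rate in MHz for one IO pin.

    Assumes gear_ratio serial bits are packed into each parallel word.
    """
    return line_rate_mbps / gear_ratio


if __name__ == "__main__":
    for ratio in (2, 4, 7, 8, 10):
        rate = fabric_rate_mhz(800, ratio)
        print(f"{ratio}:1 at 800Mbps gives about {rate:.2f} MHz in the fabric")
```

Higher ratios trade a wider internal bus for a slower fabric clock, which is usually what makes timing closure manageable at these line rates.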
Summary
System designers will continue to be under pressure to work their “magic” to produce
higher performance systems while maintaining lower build and operational costs.
These pressures are expected to increase, and high speed I/O is just one of the
attributes that the LatticeECP4 offers to help meet these challenges.
FPGAs have historically played key roles in system designs, but are now pushing
forward to new levels of performance while assisting in lowering overall system build
and operational costs. Feature rich, low cost FPGAs enable fast time to market and time
to revenue, and the flexibility and performance to accommodate evolving standards.
Systems/design engineers are now equipped with an enhanced set of IO capabilities
that enable them to provide high-speed solutions without a lengthy design process.