

Ultra High Speed Digital Down Converter Design for Virtex-6 FPGAs

Joachim Meyer∗, Simon Menzel∗, Michael Dreschmann∗, Rene Schmogrow†, David Hillerkuss†, Wolfgang Freude†, Juerg Leuthold† and Juergen Becker∗

∗Institute for Information Processing Technologies
†Institute of Photonics and Quantum Electronics

Karlsruhe Institute of Technology, Karlsruhe, Germany
Email: [email protected]

Abstract: This paper describes the design and optimization of an ultra-high speed Digital Down Converter (DDC) for realization in FPGAs. After explaining the general structure of the Digital Down Converter, we describe in detail how to implement such a design in order to process a digital, massively parallelized signal. The optimizations necessary to achieve an efficient implementation in state-of-the-art FPGAs are explained, and a case study of an FPGA-optimized Digital Down Converter design suitable for OFDMA systems is presented. The key components of this DDC are highly parallelized half-band filters which are optimized for Virtex-6 FPGAs and enable the design to decimate a 6 bit wide input signal with a sample rate of 25 GS/s into a 16 bit signal at 1.5625 GS/s while achieving an attenuation of around 35 dB. The results include the resource consumption of the DDC for a Virtex-6 XC6VHX380T FPGA as well as the filter response to a chirp test signal.

Index Terms: FPGA; DDC; OFDMA; FIR; Decimator

I. INTRODUCTION

Orthogonal Frequency Division Multiplexing (OFDM) has been deployed in many wireless communication systems for years. More recently, research groups all over the world have started to investigate OFDM for its benefits in optical communication systems, compare [1]. The advantages of Orthogonal Frequency Division Multiple Access (OFDMA) in particular seem promising for the optical access networks of the future [2].

OFDMA systems can increase the flexibility of bandwidth allocation by forming independent channels from subsets of subcarriers. An advantage of this approach is that a receiver can discard all subcarriers it is not interested in before it processes the signal. As a result, successive signal processing stages benefit from relaxed data rate requirements and therefore reduced costs. The necessary preprocessing can be implemented either with analogue components or digitally by a Digital Down Converter (DDC). The advantages of a digital implementation include the absence of component tolerances and of deterioration.

Combining state-of-the-art Analogue-to-Digital Converters (ADCs) with state-of-the-art Field Programmable Gate Arrays (FPGAs) makes it possible to process optically received OFDM signals directly. This paper explains how to optimize the structure and the low-level implementation of a DDC design in order to process optically received OFDM signals with a sample rate of up to 25 GS/s.

II. RELATED WORK

Digital Down Converters can play an important role in OFDM systems, and therefore many research groups deal with the efficient design of DDCs for such systems, compare [3] and [4]. A general DDC implementation with up-to-date FPGAs is described in [5]. However, all of these digital down converters are designed to handle sample rates in the range of several tens of megasamples per second. Research results for digital down converters which are able to process data rates in the two-digit GS/s domain are rather rare. The reason for this is the complex and resource-intensive design of appropriate ultra-high speed low-pass filters, which have to process multiple samples per clock cycle since their own operating frequency is much lower than the sample frequency.

In [6], de Dinechin et al. demonstrated an approach for a 20 GS/s ultra-high speed FIR filter based on two hardware implementations of the (i)FFT algorithm. While achieving good results, they needed almost all resources provided by the biggest high-end Stratix IV FPGA of the GX family (100% of the DSP resources, 100% of the M9K resources, and 92% of the logic resources). It is therefore evident that this approach cannot be used in a DDC, since no resources would be left for the remaining components of a DDC, e.g. numerically controlled oscillators. Integrating more than one such filter into a single FPGA is out of the question.

III. IMPLEMENTATION

Digital Down Converters (DDCs) are used in electrical communication systems to translate a real signal centered at an intermediate frequency into a complex signal centered around baseband. To achieve this, a DDC consists of three stages. Stage one is formed by Numerically Controlled Oscillators (NCOs) and digital mixers, which multiply the received signal with the output of the NCO in order to shift the signal spectrum to baseband. In stage two, digital low-pass filters remove frequencies higher than the highest frequency of the signal of interest. Finally, the last stage usually reduces the sample rate in order to relax the processing speed requirements for the successive signal processing components, compare figure 1.

[Figure: an NCO supplies cos and -sin to two mixers; each mixer output passes through a Low Pass Filter and a Down Sampling block, yielding the I and Q outputs.]

Fig. 1. Block diagram of a Digital Down Converter (DDC)
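The three stages can be sketched in a few lines of NumPy. This is a minimal serial floating-point model, not the parallel fixed-point design described in this paper. The function name `ddc`, the Hamming-window low-pass filter and all numeric values are assumptions chosen purely for illustration.

```python
import numpy as np

def ddc(x, f_if, fs, taps, decim):
    """Mix a real signal to baseband, low-pass filter it, and decimate."""
    n = np.arange(len(x))
    phase = 2.0 * np.pi * f_if / fs * n
    # Stage 1: the NCO outputs cos and -sin, which the mixers multiply with the input.
    i_mixed = x * np.cos(phase)
    q_mixed = x * -np.sin(phase)
    # Stage 2: low-pass filters remove the mixing image far above the band of interest.
    i_filt = np.convolve(i_mixed, taps, mode="same")
    q_filt = np.convolve(q_mixed, taps, mode="same")
    # Stage 3: keep only every decim-th sample.
    return i_filt[::decim] + 1j * q_filt[::decim]

# Illustrative numbers (assumptions): 1 kHz sample rate, 200 Hz intermediate
# frequency, and a test tone 5 Hz above the intermediate frequency.
fs, f_if = 1000.0, 200.0
n = np.arange(4000)
x = np.cos(2.0 * np.pi * (f_if + 5.0) * n / fs)
taps = np.hamming(101)
taps /= taps.sum()                      # unity gain at DC
y = ddc(x, f_if, fs, taps, decim=4)     # complex baseband at fs / 4 = 250 Hz
```

In this model the test tone ends up as a complex exponential at 5 Hz in the decimated output, which is the frequency translation a DDC is meant to perform.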

A. Digital Filter and Decimation

A filter that processes data in parallel to achieve very high data rates has to be realized as a polyphase filter. Cascaded Integrator-Comb (CIC) filters are usually a good choice because of their linear phase response and their resource-efficient, multiplier-free implementation, but their recursive structure inhibits an efficient polyphase implementation. A parallel implementation would require a large number of additional register chains to store intermediate feedback terms. Since this would result in unacceptable resource consumption, we decided to use a polyphase FIR filter approach for the DDC.

When designing the filter, it is possible to exploit the fact that the signal will be downsampled after the filter anyway. By moving this decimation into the filter, the processing paths that lead to samples which would be discarded can be removed. The analysis of different multistage polyphase FIR filter decimators led us to a four-stage half-band filter decimator structure. Half-band filters provide a linear phase response and can be implemented very efficiently [7]. On the one hand, the half-band filters of the first stages can be implemented in a resource-friendly way with larger ripple and less complexity, since their passband-to-sampling-frequency ratio is relatively small. On the other hand, since half of the coefficients are zero and the other half are symmetric, even high-order half-band filters require only a small number of resource-intensive multiplications.
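The effect of moving the decimation into the filter can be illustrated with a small NumPy sketch. It compares filtering at the full rate followed by downsampling against a two-branch polyphase form that only computes the surviving outputs. The 7-tap half-band kernel `[-1, 0, 9, 16, 9, 0, -1] / 32` is a textbook example chosen as an assumption; it is not one of the filters of this design, and the function names are hypothetical.

```python
import numpy as np

def decimate_by_2_direct(x, h):
    """Reference: filter at the full rate, then discard every second output."""
    return np.convolve(x, h)[::2]

def decimate_by_2_polyphase(x, h):
    """Compute only the outputs that survive the decimation."""
    h_even, h_odd = h[0::2], h[1::2]
    x_even = x[0::2]
    # The odd input phase is delayed by one output sample.
    x_odd = np.concatenate(([0.0], x[1::2]))
    y_even = np.convolve(x_even, h_even)
    y_odd = np.convolve(x_odd, h_odd)
    y = np.zeros(max(len(y_even), len(y_odd)))
    y[:len(y_even)] += y_even
    y[:len(y_odd)] += y_odd
    return y

# Illustrative half-band kernel (an assumption, not taken from the paper).
h = np.array([-1.0, 0.0, 9.0, 16.0, 9.0, 0.0, -1.0]) / 32.0
x = np.random.default_rng(0).standard_normal(64)

direct = decimate_by_2_direct(x, h)
poly = decimate_by_2_polyphase(x, h)
common = min(len(direct), len(poly))
same_result = np.allclose(direct[:common], poly[:common])
odd_branch_taps = np.count_nonzero(h[1::2])   # only the centre tap remains
```

For a half-band kernel one of the two branches collapses to a single scaled delay, and the other branch is symmetric, which is why so few multiplications are needed.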

B. Mixing Stage

For a DDC that works completely in parallel at ultra-high speed, the Numerically Controlled Oscillators (NCOs) in the mixing stage have to be parallelized in addition to the filters. In this configuration an NCO does not accumulate the phase difference between two consecutive samples of the target frequency, but that phase difference multiplied by the number of collaborating NCOs. By starting every NCO with an aligned but different phase offset, the NCOs together generate a frequency higher than their own clock frequency.
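The phase bookkeeping of such a parallel NCO bank can be checked with integer phase accumulators. The sketch below is an illustrative model under assumed parameters (a 32 bit accumulator, hypothetical function names); it shows that interleaving the outputs of the slow lanes reproduces the phase sequence of one NCO running at the full sample rate.

```python
ACC_BITS = 32                      # phase accumulator width (an assumption)
MASK = (1 << ACC_BITS) - 1

def single_nco_phases(step, count):
    """Phase sequence of one NCO clocked at the full sample rate."""
    return [(n * step) & MASK for n in range(count)]

def parallel_nco_phases(step, lanes, clocks):
    """Interleaved phase sequence of `lanes` NCOs clocked at 1/lanes of the sample rate."""
    acc = [(k * step) & MASK for k in range(lanes)]   # aligned but different start offsets
    lane_step = (lanes * step) & MASK                 # each lane advances by lanes * step
    out = []
    for _ in range(clocks):
        out.extend(acc)                               # one clock cycle yields `lanes` samples
        acc = [(a + lane_step) & MASK for a in acc]
    return out

step = 1288490189                  # roughly 0.3 cycles per sample, chosen arbitrarily
parallel = parallel_nco_phases(step, lanes=128, clocks=4)
serial = single_nco_phases(step, count=128 * 4)
```

Each accumulator value would then address a sine/cosine lookup table or a similar phase-to-amplitude converter, which is omitted here.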

C. Efficient usage of FPGA resources

To implement a DDC efficiently in modern Virtex-5 or Virtex-6 FPGAs, the operations have to be mapped efficiently onto the appropriate elements of the FPGA, especially those of the multistage polyphase FIR filter decimator. One optimization is the proper use of ternary adders. In modern Virtex FPGAs, the 6-input LUT architecture can implement an addition of three numbers using the same amount of logic resources as a simple two-input addition [8]. This is very useful for the adder trees of digital filters, which sum up all the partial results. Additionally, such ternary adders can be used to build the low-complexity half-band filters of the first stages without multiplications and without using the native Digital Signal Processing (DSP48E1) elements, saving those for the more complex half-band filters of the later stages.
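The benefit for adder trees can be made concrete by counting tree levels. The following sketch is a purely arithmetic model with a hypothetical helper `adder_tree`; it says nothing about timing or LUT usage and only counts how many adder levels are needed when each adder takes two or three operands.

```python
def adder_tree(values, fan_in):
    """Sum `values` with adders of the given fan-in; return (sum, number of levels)."""
    levels = 0
    while len(values) > 1:
        # One tree level: every adder combines up to `fan_in` neighbouring operands.
        values = [sum(values[i:i + fan_in]) for i in range(0, len(values), fan_in)]
        levels += 1
    return values[0], levels

partial_results = list(range(1, 10))          # nine partial results, values arbitrary
total_binary, levels_binary = adder_tree(partial_results, fan_in=2)
total_ternary, levels_ternary = adder_tree(partial_results, fan_in=3)
```

For nine operands the two-input tree needs four levels and the three-input tree needs two, so fewer pipeline registers are required for the same sum.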

Fig. 2. Block diagram for a 19th-order FIR filter

Another critical optimization is the proper use of the DSP48E1 slices in modern FPGAs [8]. These slices contain a 25x18 bit multiplier as well as a pre-adder, a post-adder and several registers, which are very useful for achieving high clock rates via pipelining. The coefficients of FIR filters with a linear phase response (e.g. halfband filters) are symmetric, so every coefficient occurs twice in the system (with the exception of the middle coefficient for an odd number of coefficients). The pre-adders can therefore be used to combine the two samples that are multiplied by the same coefficient, while the post-adders can be used to implement the first stage of the adder tree. Figure 2 shows an example implementation of an ordinary 19th-order FIR filter. The filter can be implemented by a combination of DSPs and their pre-/post-adders (add mul and add mul add), ternary adders (tern add) and flip-flops (reg bal) for pipelining the filter.
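The pre-adder idea can be expressed as a folded FIR computation: the two samples that share a coefficient are added first and multiplied once. The NumPy sketch below is a behavioural illustration only. The function names are hypothetical, and the 20 symmetric coefficients (a 19th-order filter) come from a Hamming window as a stand-in, not from the filter in figure 2.

```python
import numpy as np

def fir_direct(x, h):
    """Reference FIR: one multiplication per tap."""
    return np.convolve(x, h)[:len(x)]

def fir_folded(x, h):
    """Symmetric FIR: pre-add the two samples sharing a coefficient, then multiply once."""
    taps = len(h)
    assert np.allclose(h, h[::-1]), "coefficients must be symmetric"
    padded = np.concatenate((np.zeros(taps - 1), x))   # zero history before x[0]
    y = np.zeros(len(x))
    multiplies = 0
    for k in range(taps // 2):
        newer = padded[taps - 1 - k: taps - 1 - k + len(x)]   # x[n - k]
        older = padded[k: k + len(x)]                         # x[n - (taps - 1 - k)]
        y += h[k] * (newer + older)                           # pre-add, then multiply
        multiplies += 1
    if taps % 2 == 1:                                         # unpaired middle tap
        mid = taps // 2
        y += h[mid] * padded[taps - 1 - mid: taps - 1 - mid + len(x)]
        multiplies += 1
    return y, multiplies

h = np.hamming(20)                 # 19th-order stand-in with symmetric coefficients
h /= h.sum()
x = np.random.default_rng(1).standard_normal(200)
y_folded, multiplies = fir_folded(x, h)
y_direct = fir_direct(x, h)
```

With 20 symmetric coefficients the folded form needs 10 multiplications per output instead of 20, which corresponds to the role of the pre-adder described above.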

IV. EXPERIMENTS AND RESULTS

In our case study, we want to implement at least two full DDCs in a single Virtex-6 XC6VHX380T FPGA. The input data arrives at 25 GS/s from the analogue-to-digital converter (e.g. the Vega ADC 30 from Micram). The system clock of the FPGA was chosen to be 195.3125 MHz, so 128 samples have to be processed per clock cycle. A DDC should decimate by a factor of 16, from 25 GS/s to 1.5625 GS/s, with a minimum stop-band attenuation of 30 dB.
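These figures follow from simple arithmetic, reproduced here as a check. The per-stage breakdown assumes that each of the four half-band stages halves the rate, as described in Section III; the variable names are illustrative.

```python
from fractions import Fraction

SAMPLE_RATE = Fraction(25_000_000_000)      # 25 GS/s from the ADC
SYSTEM_CLOCK = Fraction(195_312_500)        # 195.3125 MHz FPGA system clock
STAGES = 4                                  # four decimate-by-two half-band stages

samples_per_clock = SAMPLE_RATE / SYSTEM_CLOCK
decimation = 2 ** STAGES
output_rate = SAMPLE_RATE / decimation
# Parallel samples per clock cycle at the input and after each stage.
lanes_per_stage = [int(samples_per_clock / 2 ** s) for s in range(STAGES + 1)]
```

Exact fractions are used so that the division by the non-integer megahertz clock value does not introduce rounding.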

Fig. 3. Block diagram of the implemented ultra-high speed DDC

The complete DDC design consists of an NCO which is implemented 128 times in parallel, an in-phase path and a quadrature-phase path. Each path contains 128 multipliers, which multiply the NCO outputs with the input signal, and a 4-stage polyphase half-band filter, compare figure 3.

The order of each halfband filter in each stage was determined with the methods described in [7]. For the filters of the first two stages, the attenuation of low-complexity 3-tap halfband filters is sufficient. Such filters can be implemented very efficiently with ternary adders and therefore use only basic logic blocks of the FPGA, compare figure 4.

Fig. 4. The low-complexity 3-tap halfband filters can be implemented resource-efficiently by the use of a ternary adder
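A 3-tap half-band decimator of this kind can be modelled with one three-operand addition and shifts. The kernel `[1, 2, 1] / 4` used below is the simplest half-band kernel and is an assumption; the paper does not list the coefficients of the first two stages. The function name and the final right shift for scaling are illustrative as well.

```python
def halfband3_decimate(x):
    """Decimate integers by two with the kernel [1, 2, 1] / 4 (assumed coefficients)."""
    out = []
    for m in range((len(x) - 1) // 2):
        a, b, c = x[2 * m], x[2 * m + 1], x[2 * m + 2]
        # One three-operand addition; the factor 2 is a left shift, the
        # division by 4 a right shift, so no multiplier is involved.
        out.append((a + (b << 1) + c) >> 2)
    return out

dc_input = [20] * 8                       # lowest frequency: passes unchanged
nyquist_input = [20, -20] * 4             # highest frequency: cancelled completely
dc_output = halfband3_decimate(dc_input)
nyquist_output = halfband3_decimate(nyquist_input)
```

A hardware implementation could equally keep the two extra bits instead of shifting them away, which would be one way to account for growth from a 6 bit input towards a wider output word.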

For the third stage, a 7-tap filter is required in order to fulfill the attenuation requirements. When scaling is used to normalize the middle coefficient to one, such a filter can be implemented with two DSP48E1 slices and a few registers, compare figure 5. All coefficients have a resolution of six bits.

Fig. 5. By exploiting the pre-adder and post-adder of DSP48E1 slices, the 7-tap halfband filter can be implemented without using additional adder trees

The fourth and last stage implements the most complex halfband filter. This is necessary since its transition band should be as narrow as possible. In our design we created a filter of order 30 using the equiripple method [9]. The coefficients are stored with a resolution of 8 bits; the structure of the FPGA implementation is shown in figure 6.

Fig. 6. The 30th-order halfband filter of the last stage is the most complex filter in the design
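The multiplier demand of the two DSP-based stages can be estimated from the half-band structure alone. The sketch below is a plausibility check under stated assumptions: both filters have a middle coefficient scaled to one (stated for the third stage, assumed here for the fourth), every pair of equal coefficients shares one DSP slice through the pre-adder, and the mixers and the first two stages use no DSP slices. The function names are hypothetical.

```python
def halfband_multiplies(order):
    """Multiplications per output of a symmetric half-band filter of order 4k + 2.

    Every second coefficient around the centre is zero, the centre tap is
    assumed to be scaled to one, and equal coefficients share one multiplier.
    """
    assert order % 4 == 2, "expects orders such as 2, 6, 30"
    return (order // 2 + 1) // 2

def dsp_estimate(outputs_per_clock, order, paths=2):
    """DSP slices for one stage: parallel outputs x multiplies x (I and Q paths)."""
    return outputs_per_clock * halfband_multiplies(order) * paths

stage3 = dsp_estimate(outputs_per_clock=16, order=6)    # 7-tap filter, 32 -> 16 lanes
stage4 = dsp_estimate(outputs_per_clock=8, order=30)    # 31-tap filter, 16 -> 8 lanes
total = stage3 + stage4
```

Under these assumptions the estimate is 192 slices, which agrees with the DSP48E1 figure reported in Table I. The paper does not spell out this accounting, so the agreement should be read as a consistency check rather than as the authors' own derivation.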

In order to verify and test the filter, we ran simulations using Modelsim, an HDL simulator from Mentor Graphics, in combination with Matlab. A chirp signal from 0 to 12.5 GHz was generated in Matlab and exported to Modelsim, which was used to simulate our decimator. Afterwards, the results were imported into Matlab for verification. Figure 7 shows the filter response to the chirp signal. The stop-bands of the different filter stages are easy to identify. The stop-band attenuation of the complete filter is around 35 dB.

Figures 8, 9, 10 and 11 show the spectra of the individual halfband filters. The passband and the stopband are highlighted for each implementation. It is important that the filters provide good attenuation in the highlighted stopbands.


Fig. 7. The filter response of the 4-stage polyphase halfband filter to a chirp signal from 0 to 12.5 GHz, co-simulated using Matlab and Modelsim.
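The principle of a chirp-based check can be mimicked in software. The sketch below is a simplified stand-in for the Matlab/Modelsim co-simulation, not a reproduction of it: it sweeps a chirp from 0 to half the sample rate through a single illustrative 7-tap half-band kernel (`[-1, 0, 9, 16, 9, 0, -1] / 32`, an assumption and not a filter of this design) and compares the output level at the start and at the end of the sweep.

```python
import numpy as np

N = 4096
n = np.arange(N)
# Linear chirp whose instantaneous frequency rises from 0 to 0.5 cycles per sample.
chirp = np.cos(np.pi * n**2 / (2.0 * N))

# Illustrative half-band kernel (an assumption, not taken from the paper).
h = np.array([-1.0, 0.0, 9.0, 16.0, 9.0, 0.0, -1.0]) / 32.0
response = np.convolve(chirp, h, mode="valid")   # "valid" avoids edge transients

def rms(v):
    """Root mean square level of a signal segment."""
    return np.sqrt(np.mean(v**2))

tenth = len(response) // 10
passband_level = rms(response[:tenth])       # start of the sweep, low frequencies
stopband_level = rms(response[-tenth:])      # end of the sweep, near half the sample rate
attenuation_db = 20.0 * np.log10(passband_level / stopband_level)
```

Because time maps linearly to frequency in such a sweep, the envelope of the output traces the magnitude response of the filter, which is why the stop-bands of the individual stages become visible in a plot like figure 7.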

[Plot: magnitude / dB (normalized to 0 dB) over frequency / GHz (normalized to 16 GHz)]

Fig. 8. Spectrum of the 2nd-order halfband filter deployed in the first stage.

[Plot, panel title HB_2 8:4: magnitude / dB (normalized to 0 dB) over frequency / GHz (normalized to 16 GHz)]

Fig. 9. Spectrum of the 2nd-order halfband filter used in stage 2. The filter has the same structure as in the first stage but uses different coefficients.

While it is very easy to fulfill the attenuation requirement for the first filter stages, it gets harder with every stage. The frequencies are normalized to 16 GHz in order to illustrate the decimation rate of 16 for the combined filter.

[Plot, panel title HB_6 4:2: magnitude / dB (normalized to 0 dB) over frequency / GHz (normalized to 16 GHz)]

Fig. 10. Spectrum of the 6th-order halfband filter used in stage three.

[Plot, panel title HB_30 2:1: magnitude / dB (normalized to 0 dB) over frequency / GHz (normalized to 16 GHz)]

Fig. 11. Spectrum of the high-complexity 30th-order halfband filter deployed in the last stage.

Table I compares the resource consumption of one complete digital down converter to the resources available in a Virtex-6 XC6VHX380T device. The utilisation of DSP48E1 slices (22 %) and ordinary slices (23 %), which are the most critical resources for digital filters, is well balanced. This balance avoids a situation in which a shortage of one specific resource type prevents further digital down converters from being implemented in the same FPGA. We were able to place and route three complete DDCs (including in-phase and quadrature-phase paths) in a single Virtex-6 XC6VHX380T device while still meeting the timing constraints. Although in theory enough resources remain to implement a fourth DDC, such a design is not routable.

TABLE I
RESOURCE UTILISATION FOR A COMPLETE DIGITAL DOWN CONVERTER

Resource Type    One DDC Design    XC6VHX380T    %
LUT              29 424            239 040       12 %
Register         40 654            478 080       8 %
Slices           13 853            59 760        23 %
DSP48E1          192               864           22 %

V. CONCLUSION

In this work we describe the implementation of ultra-high speed digital down converters on modern Field Programmable Gate Arrays in order to enable their deployment in OFDMA-based communication systems. We explain how to efficiently build a massively parallelized design using multistage polyphase halfband filters, and we show in detail how to optimize the critical components for up-to-date FPGA technology.

While most other systems are capable of handling data rates of several MS/s, our results show that a design like the one described in this work can be used to build systems capable of processing data rates of up to 25 GS/s. Additionally, the developed digital down converter is designed so that up to three of them can be implemented in one single device.

ACKNOWLEDGMENT

The authors acknowledge support from the BMBF joint project CONDOR, the EU research project ACCORDANCE, and the Xilinx University Program (XUP).

REFERENCES

[1] D. Hillerkuss, T. Schellinger, R. Schmogrow, M. Winter, T. Vallaitis, R. Bonk, A. Marculescu, J. Li, M. Dreschmann, J. Meyer, S. Ben Ezra, N. Narkiss, B. Nebendahl, F. Parmigiani, P. Petropoulos, B. Resan, K. Weingarten, T. Ellermeyer, J. Lutz, M. Moller, M. Huebner, J. Becker, C. Koos, W. Freude, and J. Leuthold, “Single source optical OFDM transmitter and optical FFT receiver demonstrated at line rates of 5.4 and 10.8 Tbit/s,” in Optical Fiber Communication (OFC), collocated National Fiber Optic Engineers Conference, 2010 Conference on (OFC/NFOEC), 2010, pp. 1-3.

[2] K. Kanonakis, I. Tomkos, T. Pfeiffer, J. Prat, and P. Kourtessis, “ACCORDANCE: A novel OFDMA-PON paradigm for ultra-high capacity converged wireline-wireless access networks,” in Transparent Optical Networks (ICTON), 2010 12th International Conference on, June 27-July 1, 2010, pp. 1-4.

[3] Yih-Min Chen and I-Yuan Kuo, “Design of lowpass filter for digital down converter in OFDM receivers,” in Wireless Networks, Communications and Mobile Computing, 2005 International Conference on, June 2005, vol. 2, pp. 1094-1099.

[4] Jian Sun, Nan Cen, and Dongfeng Yuan, “Implementation of a 2x2 MIMO-OFDM real-time system on DSP/FPGA platform,” in Communications and Mobile Computing (CMC), 2011 Third International Conference on, April 2011, pp. 441-444.

[5] Wu Changrui, Kong Chao, Xie Shigen, and Cai Huizhi, “Design and FPGA implementation of flexible and efficiency digital down converter,” in Signal Processing (ICSP), 2010 IEEE 10th International Conference on, Oct. 2010, pp. 438-441.

[6] F. de Dinechin, H. Takeugming, and J.-M. Tanguy, “A 128-tap complex FIR filter processing 20 giga-samples/s in a single FPGA,” in Signals, Systems and Computers (ASILOMAR), 2010 Conference Record of the Forty Fourth Asilomar Conference on, Nov. 2010, pp. 841-844.

[7] D. Goodman and M. Carey, “Nine digital filters for decimation and interpolation,” Acoustics, Speech and Signal Processing, IEEE Transactions on, vol. 25, no. 2, pp. 121-126, Apr. 1977.

[8] Xilinx, Virtex-6 FPGA DSP48E1 Slice User Guide, UG369 (v1.3), February 2011.

[9] J. McClellan and T. Parks, “A unified approach to the design of optimum FIR linear-phase digital filters,” Circuit Theory, IEEE Transactions on, vol. 20, no. 6, pp. 697-701, Nov. 1973.