Electronics Engineering Department of Electrical Engineering, Linköping University, 2016
Modeling in Simulink and Synthesis of Digital Pre-Distortion for WLAN Power
Amplifiers on a Coarse-Grained Reconfigurable Fabric
Muhammad Safdar
Linköping University SE-581 83 Linköping, Sweden
Copyright 2016 Muhammad Safdar.
Master of Science Thesis in Electrical Engineering
Muhammad Safdar
LiTH-ISY-EX--16/4997--SE
Supervisors:
Ted Johansson
ISY, Linköping University
Shuo Li
KTH, Stockholm
Examiner:
Mark Vesterbacka
ISY, Linköping University
Abstract
High data rates are in great demand nowadays in most communication systems, such as
audio/video broadcasting, cable networks, and wireless networks. They can be achieved using
Orthogonal Frequency Division Multiplexing (OFDM), which is a bandwidth-efficient method.
However, the major drawback of the OFDM technique is its high Peak-to-Average Power Ratio
(PAPR). Due to this high PAPR, the amplified signal is distorted if its peaks are not controlled.
This thesis investigates a PAPR reduction technique called the Fourier Projection Algorithm
(FPA). In the thesis, the FPA is designed to reduce the PAPR in OFDM systems and thereby
avoid clipping. The results show that the efficiency of the system depends on the throughput,
the complexity, and the Tone Rate Loss (TRL) of the system. The simulations are first carried
out in the SIMULINK and MATLAB environments, and the design is later synthesized on a
coarse-grained reconfigurable fabric platform.
CONTENTS
Abstract
Contents
Acknowledgments
List of Figures
Glossary of Terms
1 Introduction
1.1 Motivation
1.2 Purpose
1.3 Problem Statement
1.4 Research Limitations
2 Theory
2.1 The Power Amplifier (PA)
2.2 RF Transmitter
2.3 Nonlinearity
2.4 AM-AM and AM-PM Distortions
2.5 Orthogonal Frequency Division Multiplexing (OFDM)
2.5.1 Multi-Carrier Modulation
2.5.2 Orthogonality in OFDM
2.5.3 Mathematical Definition of PAPR
2.5.4 Advantages and disadvantages of OFDM System
2.6 Non-Linearity of OFDM Signals
2.7 Compensation Techniques for Non-Linear Distortions
2.7.1 Power Back-off Method
2.7.2 Amplifier Linearization Methods
2.7.2.1 Feed-forward Linearizer
2.7.2.2 Feedback Linearizer
2.7.2.3 Pre-Distortion Linearizer
2.7.3 PAPR Reduction Techniques
2.9 Coarse-Grained Reconfigurable Architectures (CGRAs)
2.9.1 Introduction
2.9.2 Dynamically Reconfigurable Resource Array (DRRA)
2.9.2.1 Data-Path Unit (DPU)
2.9.2.2 Register File (RFile)
2.9.2.3 Sequencer
2.9.3 VESYLA
3 Implementation
3.1 Fourier Projection Algorithm
3.1.1 The POCS Algorithm
3.1.2 Flow graph of FPA
3.2 Implementation of FPA in SIMULINK
3.3 Implementation of Radix-2 FFT and IFFT on DRRA fabric
3.4 Implementation of FPA on DRRA fabric
4 Results
4.1 Number of Iterations versus Number of unused tones
4.2 Throughput versus Number of unused tones
4.3 Throughput versus Number of Iterations
4.4 Discussions
5 Conclusion and Future work
6 References
Acknowledgments
First of all, I am thankful to Prof. Ahmed Hemani for giving me the opportunity to work
with his team at KTH Royal Institute of Technology, Stockholm. His encouragement, positive
feedback, and guidance were always excellent. He was always available to help whenever
I wanted to discuss any problem I was facing during this thesis. I would like to thank my
supervisor Dr. Shuo Li at KTH, who guided and assisted me throughout my thesis and made it
possible for me to understand the FPA algorithm. I learned a lot from him during this period. It
was a great experience to move back to Stockholm and do my thesis at KTH.
I would like to thank my examiner Prof. Mark Vesterbacka and supervisor Prof. Ted Johansson
at Linköping University for their support and guidance. I am very grateful to Ted Johansson for
his valuable insights during the selection of this thesis. I would like to thank my parents and
other family members for their unconditional support and love. Finally, thanks to all my friends
in Linköping and Stockholm for their company and the wonderful time I had with them.
List of Figures
2_1 Block diagram of a basic wireless communication system
2_2 A schematic of RF transmitter
2_3 Response of a nonlinear amplifier with two-tone test signal
2_4 Third order Intercept Point (IP3)
2_5 AM-AM characteristics
2_6 AM-PM characteristics
2_7 Spectrum of multi-Carrier modulation
2_8 Multi-Carrier Modulator
2_9 (a) Traditional multi-carrier technique (b) Orthogonal multi-carrier Technique
2_10 Frequency Spectrum of OFDM signal
2_11 IBO and OBO
2_12 Feed-forward Linearizer
2_13 Feedback Linearizer
2_14 PD’s response opposite to a PA’s response in magnitude and phase
2_15 DRRA single cell
2_16 Mapping example for DRRA cell
3_1 Decomposition of POCS algorithm
3_2 Flow graph of FPA
3_3 Fourier Projection Algorithm from SIMULINK
3_4 Inside view of the training stage
3_5 Inside view of the FPA
3_6 Counter logic
3_7 Clipping block
3_8 Single butterfly operation
3_9 Implementation of Radix-2 FFT single butterfly on DRRA
3_10 Implementation of FPA on DRRA fabric
4_1 Number of iterations versus Number of unused tones
4_2 Number of iterations versus Number of unused tones for varying PAPR levels
4_3 Throughput versus Number of unused tones
4_4 Throughput versus Number of iterations
Glossary of Terms
ACE Active Constellation Extension
AFE Analog Front End
AGU Address Generation Units
ASICs Application Specific Integrated Circuits
BER Bit Error Rate
CF Crest Factor
CGIs Coarse Grain Instructions
CGRA Coarse-Grained Reconfigurable Architecture
DAC Digital to Analog Converter
DFT Discrete Fourier Transform
DPU Data-Path Unit
DRRA Dynamically Reconfigurable Resource Array
FFT Fast Fourier Transform
FPA Fourier Projection Algorithm
FPGAs Field-Programmable Gate Arrays
IBO Input Back-Off
IFFT Inverse Fast Fourier Transform
IM Inter-Modulation
IM3 Third-order Inter-Modulation
IP3 Third-Order Intercept Point
ISI Inter-Symbol Interference
MCM Multi-Carrier Modulation
OBO Output Back-Off
OFDM Orthogonal Frequency Division Multiplexing
P1dB 1-dB Compression Point
PA Power Amplifier
PAPR Peak-to-Average Power Ratio
PD Pre-Distortion
POCS Projection Onto Convex Sets
PTS Partial Transmit Sequences
RF Radio Frequency
RFile Register File
SLM Selective Level Mapping
SNR Signal-to-Noise Ratio
TR Tone Reservation
TRL Tone Rate Loss
1 INTRODUCTION
1.1 Motivation
The increasing demand for high-bandwidth data communication has been the main reason for
the adoption of multi-carrier systems and Orthogonal Frequency Division Multiplexing (OFDM)
in recent communication systems. Despite the many benefits of OFDM systems, the high
Peak-to-Average Power Ratio (PAPR) of OFDM signals complicates the design of the Digital
to Analog Converter (DAC) and the Analog Front End (AFE).
The large fluctuations in the OFDM signal’s power level demand DACs with a higher dynamic
range, which consume more power, and the signal is distorted due to the limited linearity of
the Power Amplifier (PA). This distortion causes spectral re-growth of the signal, which in turn
causes unwanted interference with the adjacent channels. Thus, reducing the PAPR reduces the
power dissipation and lowers the cost of the DAC and AFE.
There are different ways to deal with a high PAPR. Many methods have been presented in the
literature for linearizing PAs [1], and many techniques have been introduced for reducing the
PAPR of OFDM systems [2]. This thesis mainly focuses on PAPR reduction techniques, but an
overview of some linearization techniques is also presented and compared with the PAPR
reduction techniques. System simulations are carried out in the SIMULINK and MATLAB
environments, and the design is later synthesized on the coarse-grained reconfigurable
configuration embedded system technology platform, CREST.
1.2 Purpose
This thesis uses a technique called the Fourier Projection Algorithm (FPA) to reduce the PAPR
in OFDM communication systems. The purpose of this thesis is to model the FPA in the
SIMULINK and MATLAB environments and then synthesize it on the CREST fabric, which is a
Coarse-Grained Reconfigurable Architecture (CGRA) developed by KTH Royal Institute of
Technology in Sweden together with IIT Delhi and IISc Bangalore in India [3]. CREST provides
better flexibility than Application Specific Integrated Circuits (ASICs) and better performance
than Field-Programmable Gate Arrays (FPGAs). The SIMULINK and MATLAB models and the
synthesized design serve as a reference design for refining the CREST fabric.
1.3 Problem Statement
The output of an OFDM system is the superposition of multiple subcarriers. When these
subcarriers add in phase, the instantaneous output power can be much higher than the average
power of the OFDM signal. This is known as high PAPR, which is the most serious problem in
OFDM systems. A high PAPR requires power amplifiers with a high dynamic range, which are
very expensive. Since the dynamic range of practical power amplifiers is limited, the PAPR of
the OFDM signal should be reduced. The FPA is an algorithm for reducing the PAPR of an OFDM
system. In order to evaluate the FPA implementation, we first compute the PAPR of the original
signal; then, after the signal has been processed by the implemented FPA, we compute the
resulting PAPR and compare it to the original signal’s PAPR. The questions to be answered are:
How can the FPA be implemented on CGRAs? How is the implemented FPA evaluated? How
effective is the implemented FPA at reducing the PAPR?
1.4 Research Limitations
The thesis focuses on the FPA technique. The implemented FPA targets OFDM systems, since
they normally produce high-PAPR signals that can easily exceed the dynamic range of the
power amplifier. The FPA is used to reduce the peak values such that the signal stays within
the dynamic range of the power amplifier. The FPA is implemented on a CGRA because KTH and
Catena’s joint project CREST II requires a CGRA implementation of the algorithm.
2 THEORY
Nowadays, the Orthogonal Frequency Division Multiplexing (OFDM) technique is preferred
over the traditional Multi-Carrier Modulation (MCM) technique due to its useful properties,
such as high data rates. This chapter describes the basic concepts of the Power Amplifier (PA),
the RF transmitter, the third-order Inter-Modulation distortions (IM3) generated by the PA, and
the PA’s characteristic curve. Later, the non-linearity problems in the PA due to the high
Peak-to-Average Power Ratio (PAPR) of OFDM systems are discussed in detail.
2.1 The Power Amplifier (PA)
The PA plays a key role in modern communication systems. It is mainly present in the
transmitter to increase the power level of the signal before it is sent to the antenna. The
amplification of the signal is very important; otherwise the signal arrives at the receiver with
a poor Signal-to-Noise Ratio (SNR) and is highly corrupted. The PA is an essential component of
the transmitter, but it also causes many problems in the signal.
The important parameters of the PA are its output power, gain, distortion, and efficiency. The
gain of the PA should be as high as possible, while its distortion should be kept as small as
possible to get the best results.
The PA’s efficiency is another important factor that should be considered while designing
the transmitter, as the devices containing PAs are usually battery driven, e.g. a mobile
telephone. A PA with lower efficiency dissipates more heat and hence consumes more power and
drains the battery faster. The PA has a linear response under small-signal conditions and
becomes increasingly non-linear as the signal level increases.
Operating the PA at higher efficiency drives it into its non-linear region due to its limited
linear range, which produces unwanted frequencies causing spectral re-growth of the signal
and in-band distortions. The spectral re-growth causes undesirable adjacent channel
interference, while the in-band distortions degrade the Bit Error Rate (BER). Thus, there is a
trade-off between efficiency and interference. In practice, channels that are adjacent in
frequency may have sidebands that interfere with each other; this is called adjacent channel
interference. It is regulated by the regulatory authorities and must be tightly controlled to
avoid violating the rules. One way of avoiding adjacent channel interference is to run the PA
in back-off mode, but this reduces the efficiency of the amplifier.
Due to this non-linear behavior of the PA, the input peak power is restricted to a certain
level; otherwise the output is distorted. Before we progress to the PAPR reduction of OFDM
systems, it is a good idea to understand the block diagram of the transmitter and the
nonlinearities of the PA, which are discussed in the next two sections.
2.2 RF Transmitter
An overall wireless system can be represented by Figure 2_1. The digital baseband signals to
be transmitted are converted to analog, up-converted, and finally transmitted via an antenna
through the channel. The received signals are down-converted back to baseband and then
converted back to the digital domain.
Figure 2_1: Block diagram of a basic wireless communication system (transmit chain: Source Encoder, D/A, RF Tx, Tx Antenna; Channel; receive chain: Rx Antenna, RF Rx, A/D, Source Decoder)
The PA and the mixer contribute most of the non-linear impairments in the transmitter. They are
usually located just before the antenna in the transmitter of a communication system, as shown
in Figure 2_2. Mixers add phase noise, non-linearity, and spurious frequencies, while the PA
adds non-linearity to the signal. The non-linear characteristics of the PA are a problem for
OFDM systems because of their high PAPR.
Figure 2_2: A schematic of RF transmitter (blocks: Modulator, Mixer with Local Oscillator, PA, Antenna)
2.3 Nonlinearity
In a perfectly linear system, the principle of superposition holds. Suppose that K maps the
input x to an output y, then
y = K(x). (2.1)
For two inputs x1 and x2 we can write:
y1 = K(x1), (2.2)
y2 = K(x2), (2.3)
and a linear system must satisfy, for any constants a and b:
K(ax1 + bx2) = aK(x1) + bK(x2) = ay1 + by2. (2.4)
In a non-linear system, the superposition property no longer holds. In the frequency domain,
the output of a linear system contains only the tones present at its input, while a non-linear
system generates extra tones such as harmonics.
Electronic devices are never perfectly linear. This non-linear behavior is usually undesirable
and is characterized by a single-tone test and a two-tone test.
In a single-tone test, a single tone f is applied to the input of a non-linear PA. In addition to
the original tone, tones at integer multiples of f (2f, 3f, and so on) appear at the output of the
amplifier. These tones are called harmonics.
In a two-tone test, two closely spaced tones f1 and f2 are applied to the input of a non-linear
device. The output then contains Inter-Modulation (IM) products of the form nf1±mf2, as shown
in Figure 2_3, where m and n are positive integers and m+n is known as the order of the
distortion. The third-order Inter-Modulation products (IM3), 2f1–f2 and 2f2–f1, fall close to
the original tones; they cause an undesired spectral re-growth at the output and hence
interference with the adjacent channels. The original tones, i.e. f1 and f2, also appear
amplified at the output of the PA. The PA generates further distortion tones besides the IM3
products, which are located far away from the main tones. These tones are less harmful and can
easily be filtered out.
Figure 2_3: Response of a nonlinear amplifier with two-tone test signal (Pout versus frequency; DC zone: f2–f1, 2f2–2f1; fundamental zone: f1, f2, 2f1–f2, 2f2–f1, 3f1–2f2, 3f2–2f1; second harmonic zone: 2f1, 2f2, f1+f2; third harmonic zone: 3f1, 3f2, 2f1+f2, 2f2+f1)
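
The two-tone test can be reproduced numerically. The following is a minimal sketch, assuming a memoryless third-order amplifier model y = a1*x + a3*x^3; the coefficients, tone frequencies, and sample rate are hypothetical values chosen for illustration and are not taken from the thesis.

```python
import numpy as np

fs = 1024                 # sample rate in Hz (hypothetical); 1 s of data gives 1 Hz bins
t = np.arange(fs) / fs
f1, f2 = 100, 110         # two closely spaced test tones in Hz (hypothetical)

x = np.cos(2 * np.pi * f1 * t) + np.cos(2 * np.pi * f2 * t)

# Assumed memoryless third-order PA model: y = a1*x + a3*x^3
a1, a3 = 10.0, -0.5
y = a1 * x + a3 * x**3

# Single-sided amplitude spectrum; bin k corresponds to k Hz
spectrum = np.abs(np.fft.rfft(y)) / (fs / 2)
tones = [int(k) for k in np.flatnonzero(spectrum > 1e-6)]

print(tones)                                      # frequencies present at the output
print(spectrum[2 * f1 - f2], spectrum[2 * f2 - f1])  # IM3 amplitudes next to f1 and f2
```

With a purely cubic non-linearity, the output contains the two fundamentals, the IM3 products 2f1–f2 and 2f2–f1 right next to them, and the third-harmonic zone tones. Even-order terms would be needed in the model to populate the DC and second harmonic zones.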
Figure 2_4 shows the plot of Pout versus Pin and IM3 versus Pin, where Pin is the input power
and Pout is the output power. On a logarithmic (dBm) scale, the slope of the fundamental Pout
is 1 while the slope of IM3 is 3. If these lines are extrapolated, they intersect at a point
called the Third-Order Intercept Point (IP3). IP3 is used for approximating the linear region of
the PA. If IP3 is higher, there is less distortion at higher power levels, and vice versa.
Figure 2_4: Third order intercept point (IP3) (Pout (dBm) versus Pin (dBm); fundamental line with slope 1 and IM3 line with slope 3 intersecting at IP3, with IIP3 and OIP3 on the axes; P1dB,in and P1dB,out mark the 1 dB compression point)
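
Because the two lines have known slopes, the intercept point can be extrapolated from a single measurement taken in the linear region. The sketch below assumes ideal slopes of exactly 1 and 3 on a dB scale; the function name and the measurement values are hypothetical.

```python
def intercept_point(p_in_dbm, p_fund_dbm, p_im3_dbm):
    """Extrapolate the slope-1 and slope-3 lines to their intersection.

    Fundamental line: P_out = P_in + gain_db
    IM3 line:         P_im3 = 3 * P_in + offset_db
    """
    gain_db = p_fund_dbm - p_in_dbm
    offset_db = p_im3_dbm - 3.0 * p_in_dbm
    iip3_dbm = (gain_db - offset_db) / 2.0   # input power where both lines meet
    oip3_dbm = iip3_dbm + gain_db            # corresponding output power
    return iip3_dbm, oip3_dbm

# Hypothetical measurement: -20 dBm in, 0 dBm fundamental out, -60 dBm IM3 out
iip3, oip3 = intercept_point(-20.0, 0.0, -60.0)
print(iip3, oip3)   # 10.0 30.0
```

The same result follows from the shortcut OIP3 = Pfund + (Pfund – PIM3)/2, which is simply the intersection of the two lines written in terms of the measured output powers.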
Figure 2_4 also shows another way of characterizing the PA, namely the 1-dB compression point
(P1dB). It is defined as the point where the actual output of the device is 1 dB below the
ideal linear output.
2.4 AM-AM and AM-PM Distortions
The PA’s performance is power limited, i.e. beyond a certain input power its output becomes
non-linear, and the impact on the signal grows with every further increase of the power level.
The non-linear characteristics of the PA can be represented using AM-AM and AM-PM plots.
AM stands for amplitude modulation and PM stands for phase modulation. AM-AM represents the
amplitude distortion and AM-PM represents the phase distortion of the output as a function of
the input amplitude. There are three different operation regions in the PA’s curves:
- Linear region, where the output follows the input
- Saturation region, where the output reaches its maximum level
- Non-linear region, also known as the compression region, where the PA’s output decreases
  with every increase of the input power level
These regions can be seen in Figures 2_5 and 2_6 [4].
Figure 2_5: AM-AM characteristics [4] (OBO (dB) versus IBO (dB), showing the linear, saturation, and compression regions)
Figure 2_6: AM-PM characteristics [4] (phase (degrees) versus IBO (dB))
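
To make the shape of such curves concrete, the sketch below evaluates Saleh's memoryless amplifier model. This model is an assumption made here for illustration only; the thesis does not state which PA model underlies its figures. The parameter values are the ones commonly quoted for Saleh's travelling-wave tube example.

```python
import numpy as np

# Commonly quoted Saleh model parameters (assumed, not taken from the thesis)
ALPHA_A, BETA_A = 2.1587, 1.1517    # AM-AM parameters
ALPHA_P, BETA_P = 4.0033, 9.1040    # AM-PM parameters

def saleh_am_am(r):
    """Output amplitude as a function of input amplitude r."""
    return ALPHA_A * r / (1.0 + BETA_A * r**2)

def saleh_am_pm(r):
    """Phase shift in radians as a function of input amplitude r."""
    return ALPHA_P * r**2 / (1.0 + BETA_P * r**2)

r = np.linspace(0.0, 3.0, 3001)     # normalized input amplitude
out = saleh_am_am(r)
phase = saleh_am_pm(r)

r_sat = r[np.argmax(out)]           # input amplitude at which the output saturates
print(r_sat, out.max())             # saturation point
print(out[-1])                      # output beyond saturation is lower again
```

For small r the output is approximately ALPHA_A * r (linear region), it peaks near r = 1/sqrt(BETA_A) (saturation), and it falls off for larger inputs (compression), which mirrors the three regions listed above.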
2.5 Orthogonal Frequency Division Multiplexing (OFDM)
In the next sub-sections, the basic concepts of Multi-Carrier Modulation (MCM) and Orthogonal
Frequency Division Multiplexing (OFDM), the mathematical definition of PAPR, and the
advantages and disadvantages of OFDM systems are discussed.
2.5.1 Multi-Carrier Modulation (MCM)
In this scheme, the transmitted data stream is divided into multiple data streams, which are
sent over sub-channels of equal bandwidth. Each of these bit streams has a low bit rate. Each
subcarrier is independently modulated using narrow-band signals [5].
In a single-carrier modulation scheme, the system does not utilize the whole bandwidth
efficiently. Therefore, the idea of MCM was presented in the mid-1960s [6]. In a classical
parallel transmission system, the total signal bandwidth of width W is divided into a number of
non-overlapping equidistant sub-channels Ch. 1 to Ch. N with identical bandwidths. Unlike in
single-carrier modulation, each sub-channel in MCM is modulated using a separate sub-carrier,
as shown in Figure 2_7. These modulated signals are then added to get the desired signal for
transmission. Guard bands are inserted in order to avoid spectral overlap of the sub-channels
[6].
Figure 2_7: Spectrum of multi-Carrier modulation
The basic principle of generating MCM is shown in Figure 2_8. The input data M is divided into
n messages, which are then modulated by different sub-carriers.
Figure 2_8: Multi-Carrier Modulator
The modulated carriers are summed up for transmission. At the receiver end, the reverse of the
transmitter operation is performed. The modulated sub-carriers are separated by filters before
being demodulated to retrieve the sent messages [7].
2.5.2 Orthogonality in OFDM
OFDM is a special case of the MCM technique. In OFDM systems, the subcarriers are orthogonal
to each other and the information is sent on parallel overlapping subcarriers, unlike MCM where
it is sent on non-overlapping subcarriers. Thus, bandwidth is saved by using overlapping
orthogonal sub-carriers. The other advantage of orthogonality is that it causes less
interference from the neighboring carriers [5]. It can be seen in Figure 2_9 [6] that by using
the OFDM technique, almost 50% of the bandwidth can be saved. This saved bandwidth can then be
utilized for sending more information.
Figure 2_9: (a) Traditional multi-carrier technique (b) Orthogonal multi-carrier Technique [6]
The frequency spectrum of the OFDM signal is shown in Figure 2_10. The spectrum of a single
sub-carrier is a sinc function in the frequency domain. All the sub-carriers in this spectrum
are orthogonal to each other: the peak of each sub-carrier coincides with the nulls of all the
others, hence they do not interfere with each other. Accurate carrier synchronization is very
important in OFDM systems; without it, the adjacent sub-channels interfere with each other.
Figure 2_10: Frequency Spectrum of OFDM signal [5]
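
The orthogonality of the sub-carriers, and its sensitivity to a frequency offset, can be checked with a few lines of code. This is a minimal sketch assuming complex-exponential sub-carriers sampled at N points per symbol; the symbol length and the sub-carrier indices are hypothetical.

```python
import numpy as np

N = 64                       # samples per OFDM symbol (hypothetical)
n = np.arange(N)

def subcarrier(k):
    """Complex sub-carrier with k cycles per symbol period."""
    return np.exp(2j * np.pi * k * n / N)

# Normalized inner products over one symbol (np.vdot conjugates its first argument)
same = np.vdot(subcarrier(3), subcarrier(3)) / N        # a carrier with itself
different = np.vdot(subcarrier(3), subcarrier(4)) / N   # two distinct carriers
off_grid = np.vdot(subcarrier(3), subcarrier(4.3)) / N  # neighbor with a frequency offset

print(abs(same), abs(different), abs(off_grid))
```

The inner product of two distinct sub-carriers vanishes, while a carrier that has drifted off the sub-carrier grid leaks into its neighbor, which is why accurate carrier synchronization matters.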
2.5.3 Mathematical Definition of PAPR
The fluctuations in the OFDM signal can be expressed in terms of PAPR. PAPR is the ratio
between the maximum instantaneous power and the average power of the signal. PAPR is
related to another term called Crest Factor (CF). The CF is another way to see how extreme the
peaks are in a waveform. The CF of a waveform is the ratio of the waveform’s peak value to its
effective value. The relation between CF and PAPR can be expressed as:
CF = √PAPR (2.5)
or PAPR = CF². (2.6)
The PAPR of the OFDM signal can be expressed by the equation:
$$\mathrm{PAPR}\{x(t)\} = \frac{\max_{t \in \tau} |x(t)|^2}{E\{|x(t)|^2\}}, \qquad (2.7)$$
where x(t) is the original OFDM signal, τ is the time interval, $\max_{t \in \tau} |x(t)|^2$ is the
peak signal power, and $E\{|x(t)|^2\}$ is the average signal power.
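
The definition above translates directly into code. The sketch below computes the PAPR and the crest factor of one sampled OFDM symbol; the number of sub-carriers, the QPSK mapping, and the random seed are assumptions made for illustration.

```python
import numpy as np

def papr_db(x):
    """PAPR of a sampled signal in dB: peak power over average power."""
    power = np.abs(x) ** 2
    return 10.0 * np.log10(power.max() / power.mean())

rng = np.random.default_rng(0)       # fixed seed so the example is repeatable
N = 64                               # number of sub-carriers (hypothetical)

# Random QPSK symbols on every sub-carrier, unit magnitude
qpsk = (rng.choice([-1, 1], N) + 1j * rng.choice([-1, 1], N)) / np.sqrt(2)

x = np.fft.ifft(qpsk)                # time-domain OFDM symbol
papr = papr_db(x)
crest_factor = np.sqrt(10 ** (papr / 10))   # CF = sqrt(PAPR), with PAPR as a linear ratio

print(papr, crest_factor)
```

Note that the relation CF = √PAPR applies to the linear power ratio, so the dB value is converted back before taking the square root. Sampling at the sub-carrier rate, as done here, can miss peaks between samples; oversampling the IFFT gives a closer estimate of the continuous-time PAPR.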
2.5.4 Advantages and Disadvantages of OFDM Systems
There are several advantages and disadvantages of OFDM systems according to [8]. The
advantages are described below:
- OFDM makes efficient use of the bandwidth by using overlapping sub-carriers.
- It is computationally efficient, because the modulation and demodulation functions are
  implemented with IFFT and FFT operations.
- Interference does not affect all sub-channels, and hence not all the data is lost.
- OFDM systems have better resistance to frequency-selective fading due to the multiple
  narrow-band signals.
In spite of these useful advantages, OFDM systems also have drawbacks. These are summarized
below:
- OFDM systems have a high PAPR, i.e. their amplitudes have very high fluctuations. Hence,
  they require highly linear power amplifiers to accommodate the large amplitude variations.
- Accurate synchronization is required in OFDM; otherwise, the adjacent channels interfere
  with each other.
- OFDM systems are more sensitive to carrier-frequency offset and drift than single-carrier
  systems.
2.6 Non-Linearity of OFDM Signals
As discussed in the previous section, OFDM systems suffer from a PAPR problem, i.e. their
signals have very large amplitude variations, and a highly linear power amplifier is required
to accommodate them. These fluctuations arise in the IFFT processing at the transmitter, where
a large number of independently modulated sub-carriers are added together.
Because of its high PAPR, the OFDM signal drives the power amplifier into its saturation and
compression regions in an uncontrolled way. This makes OFDM systems perform worse with
non-linear PAs than traditional single-carrier systems. To overcome this problem, there are
techniques for compensating the non-linear distortions, which are discussed in the next
section.
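
The growth of the PAPR with the number of sub-carriers can be made explicit with a short worked bound. The notation is an assumption introduced here, not the thesis's own: N sub-carriers, unit-magnitude data symbols X_k, and time samples x[n] taken at the sub-carrier rate.

```latex
% Sampled OFDM symbol built from N unit-magnitude sub-carrier symbols
x[n] = \frac{1}{N} \sum_{k=0}^{N-1} X_k \, e^{j 2 \pi k n / N}, \qquad |X_k| = 1

% Peak power: by the triangle inequality, reached when all terms add in phase
\max_n |x[n]|^2 \le \left( \frac{1}{N} \sum_{k=0}^{N-1} |X_k| \right)^2 = 1

% Average power: by Parseval's relation for the DFT
\frac{1}{N} \sum_{n=0}^{N-1} |x[n]|^2 = \frac{1}{N^2} \sum_{k=0}^{N-1} |X_k|^2 = \frac{1}{N}

% Hence the worst-case PAPR grows linearly with the number of sub-carriers
\mathrm{PAPR} \le \frac{1}{1/N} = N
\quad \Longleftrightarrow \quad
\mathrm{PAPR}_{\mathrm{dB}} \le 10 \log_{10} N
```

For example, under these assumptions 64 sub-carriers give a worst-case PAPR of about 18 dB. Such a peak is rare for random data, but the bound shows why the problem becomes more severe as the number of sub-carriers increases.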
2.7 Compensation Techniques for Non-Linear Distortions
To decrease the BER and the out-of-band distortions, the non-linear distortion of the system
must be dealt with in a way that allows the PA to be operated close to its saturation region.
There are several approaches that provide distortion compensation in the transmitter. These are
grouped into three main classes:
- Power back-off method
- Amplifier linearization methods
- PAPR reduction methods
It is a good idea to understand all of the above methods before the detailed discussion of the
FPA technique is presented. In the next sub-sections, these three methods are described
briefly.
2.7.1 Power Back-off Method
One of the main problems in PAs is the Inter-Modulation (IM) terms. When a multi-tone signal
is applied to the input of a PA, it amplifies the desired signal but also generates unwanted
terms known as IM terms. This non-linearity increases as the power level approaches the PA’s
saturation point, i.e. the region where the output reaches its maximum level, although the
behavior varies from amplifier to amplifier and with operating conditions.
To tackle this non-linearity issue, the PA is operated at a power back-off from its saturation
point. This means that the maximum output power level of the PA is reduced to bring the signal
within the linear range of the PA’s transfer curve.
The reduction of the non-linear distortions depends on the back-off level, which is measured
using two quantities, the Input Back-Off (IBO) and the Output Back-Off (OBO). These are
defined in dB according to [9] by:
$$\mathrm{IBO} = 10 \log_{10} \frac{P_{\max,\mathrm{in}}}{\langle P_{\mathrm{in}} \rangle}, \qquad (2.8)$$
$$\mathrm{OBO} = 10 \log_{10} \frac{P_{\max,\mathrm{out}}}{\langle P_{\mathrm{out}} \rangle}, \qquad (2.9)$$
where Pmax,in and Pmax,out are the input and output saturation power levels, and ⟨Pin⟩ and
⟨Pout⟩ are the averages of the input and output powers. Either the IBO or the OBO can be used
to specify the operating point of the PA. The IBO and OBO are illustrated graphically in Figure
2_11.
The disadvantage of using the back-off mode is that it decreases the power efficiency of the PA.
Therefore, this technique is not considered a good option for dealing with non-linear
distortions.
Figure 2_11: IBO and OBO (Pout versus Pin, showing the operation point, the input and output back-off from P1dB, IP1dB, OP1dB, and the saturation region)
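
Equations (2.8) and (2.9) amount to one ratio expressed in dB. The sketch below evaluates them for a handful of power samples; the function name and all power values are hypothetical and use linear units (e.g. mW).

```python
import numpy as np

def back_off_db(p_saturation, p_samples):
    """Back-off in dB: saturation power over the average of the power samples."""
    return 10.0 * np.log10(p_saturation / np.mean(p_samples))

# Hypothetical instantaneous powers in mW at the PA input and output
p_in = np.array([0.5, 1.0, 1.5, 1.0])          # average 1.0 mW
p_out = np.array([40.0, 80.0, 110.0, 90.0])    # average 80.0 mW

ibo = back_off_db(10.0, p_in)     # assumed input saturation power of 10 mW
obo = back_off_db(200.0, p_out)   # assumed output saturation power of 200 mW

print(ibo, obo)
```

In this made-up example the IBO is larger than the OBO because the amplifier is already compressing: the output power approaches its saturation level faster than the input power does.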
2.7.2 Amplifier Linearization Methods
Amplifier linearization is a way to reduce the distortion of PAs. Many linearization
techniques have been presented to compensate for the non-linear distortions of PAs. Extra
circuitry or components are added to compensate for the non-linearity of the amplifier.
Though there are different kinds of methods for dealing with the non-linearity, the most
common approaches are feed-forward, feedback, and pre-distortion. Each of these techniques
uses a different principle to compensate for the non-linearity of the PA.
Digital techniques are preferred over analog techniques, as their baseband implementation
offers a cost-effective solution. In the following sub-sections, these amplifier linearization
techniques are briefly discussed [1].
2.7.2.1 Feed-forward Linearizer
The feed-forward linearizer can be seen in Figure 2_12 [1]. This system has two loops, i.e.
signal- and error-cancellation loops. The first loop produces the distortion signal of the main
amplifier by subtracting the input signal from the output signal. The purpose of the second loop
is to produce a distortion-free signal by subtracting the amplified distortion signal from the
delayed, distorted output of the main amplifier. Hence, the resulting signal at the output is
distortion free.
The disadvantages of this technique are its high power consumption and the complexity of
tracking changes in component behavior, e.g. due to temperature or to component properties
that change with time.
[Figure: main amplifier, auxiliary amplifier, two couplers, and two delay elements; input Vin, outputs Vout1 and Vout2]
Figure 2_12: Feed-forward linearizer [1]
2.7.2.2 Feedback Linearizer
The feedback linearizer’s basic principle is illustrated in Figure 2_13. In this technique, the
output of the amplifier is attenuated and subtracted from the input signal to compensate for the
distortion of the amplifier. It can be seen that the gain of the closed loop is:
GCL = Gamp / (1 + Gamp). (2.10)
Thus, the closed loop reduces the gain of the amplifier, but in return it attenuates the
distortion by a factor of 1/(1 + Gamp).
The best linearity is obtained by correcting both the phase and the amplitude. The envelope
feedback technique can correct only the amplitude information, whereas the Cartesian
feedback technique can correct both the amplitude and the phase. The disadvantage
of the Cartesian feedback technique is that it is not successful in high-frequency applications.
[Figure: PA with feedback attenuator α; signals Vin(t), Verror(t), Vd(t), and Vout(t)]
Figure 2_13: Feedback Linearizer
2.7.2.3 Pre-Distortion Linearizer
Pre-Distortion (PD) is a method of distorting the signal before it is sent to the PA. The PD
has the inverse transfer characteristic of the PA, as shown in Figure 2_14 [1], and hence
the cascade of the two gives a linearized characteristic at the output. Another way to look at it
is that the PD circuitry produces Inter-Modulation (IM) products that are equal in magnitude
but 180 degrees out of phase with the IM products produced by the PA. Depending on where the PD
linearizer is placed, it falls into one of two main categories: analog IF/RF pre-distortion
and digital baseband pre-distortion.
Analog PDs are small and inexpensive, but they usually focus only on reducing the third-order
Inter-Modulation (IM3) products. Digital PDs, on the other hand, are easier to make adaptive
and have become popular due to rapid advancements in FPGAs, ASICs etc.
[Figure: pre-distortion linearizer followed by the PA; panels show Pout,dpd versus Pin,dpd, Pout,A versus Pin,A, and the overall Pout versus Pin]
Figure 2_14: PD's response opposite to a PA's response in magnitude and phase [1]
2.7.3 PAPR Reduction Techniques
There are many techniques for reducing the PAPR of OFDM systems, but the main focus of this
thesis is on a technique called the Fourier Projection Algorithm (FPA), which is discussed in
detail in chapter 3. In PAPR reduction methods, there is a trade-off between the increase in
Bit-Error-Rate (BER), the increase in transmit signal power, the loss of data rate, and the
increase in computational complexity.
According to [2], PAPR reduction techniques are categorized into two main groups:
Signal scrambling techniques: These techniques scramble the codes to decrease the PAPR of the
system. The best-known techniques in this category are Tone Reservation (TR), Selective
Level Mapping (SLM), and Partial Transmit Sequences (PTS). Their disadvantage is that they
decrease the throughput of the system by introducing redundancy.
Signal distortion techniques: These techniques clip the signal to reduce the high peaks prior
to amplification. Clipping distorts the signal, which causes both in-band and out-of-band
interference and increases the complexity of the system. The most practical techniques in this
category are peak clipping and filtering, windowing, peak cancellation, peak power
suppression, weighted multicarrier transmission etc.
2.8 Coarse-Grained Reconfigurable Architectures (CGRAs)
2.8.1 Introduction
Field-Programmable Gate Arrays (FPGAs) have long been an affordable solution for implementing
logic circuits without access to an integrated circuit fabrication facility. One drawback
of FPGAs is that the logic functions are programmed at the bit level, which is unnecessary
for many applications. Another disadvantage of FPGAs is that a large amount of area is needed
for the large number of processing units and routing switches [10]. A newer trend is to
perform the standard logic operations at the word level instead of the bit level.
Coarse-Grained Reconfigurable Architectures (CGRAs) provide word-level optimizations and very
efficient routing switches. CGRAs are less flexible than traditional FPGAs but offer
reductions in energy consumption, area, configuration memory, and configuration
time. Much research is ongoing to integrate time-multiplexing, parallelism, and power
management in the CGRAs. Recent CGRAs provide improved performance and flexibility at
the cost of additional memory [11]. The performance-flexibility gap between Application
Specific Integrated Circuits (ASICs) and FPGAs has always been a problem, and it is
now filled by the CGRAs.
The Dynamically Reconfigurable Resource Array (DRRA) fabric, which is used for customization
of the CGRA, has been designed by researchers at the Royal Institute of Technology (KTH),
Stockholm. In the next section, a brief introduction to the DRRA is presented. The DRRA is still
in its early stages and under development. Therefore, the fabric still has certain limitations
and problems.
2.8.2 Dynamically Reconfigurable Resource Array (DRRA)
The DRRA is a fabric of coarse-grained cells. Each cell in this fabric contains a Register File
(RFile), a sequencer, four Address Generation Units (AGUs), and a Data Path Unit (DPU). One
DRRA cell can be seen in Figure 2_15. The current DRRA version has 2 rows and 4 columns of
cells.
[Figure: one DRRA cell containing an RFile, a DPU, and a sequencer]
Figure 2_15: DRRA single cell
2.8.2.1 Data-Path Unit (DPU)
The DPU has four inputs and two outputs, corresponding to two complex numbers, and a
16-bit data path. The inputs and outputs of the DPU are used for receiving data from and sending
data to the DRRA interconnection network. The DPU has different modes for different
operations, such as multiple add/subtract or multiply-and-add operations. It can perform
truncation, rounding, and saturation operations as well as signal processing operations such as
encoding, scrambling etc. DPUs can also be configured to operate in parallel.
2.8.2.2 Register File (RFile)
The Register File (RFile) provides local storage for implementing the Coarse Grain
Instructions (CGIs). The RFile has 2 read and 2 write ports and can store 32 words of 16 bits each.
Every port in the RFile has a dedicated Address Generation Unit (AGU), which is controlled
by the sequencer.
2.8.2.3 Sequencer
The DRRA is controlled by the sequencer. Each sequencer controls only a single DRRA cell in
the DRRA fabric, which makes it easier for customization, unlike the other architectures where
a single controller is used. Each sequencer controls the RFile, DPU, and interconnect-switches
in its DRRA cell.
2.9 VESYLA
A software tool called VESYLA is used for mapping an application onto the DRRA. VESYLA is a
compiler that accepts MATLAB code and generates VHDL to run on the DRRA fabric.
VESYLA currently has several limitations, as it is still being developed.
VESYLA does not currently support automatic resource allocation. Therefore, the MATLAB code is
mapped manually to the fabric. For example, register and processor allocation are
specified as below:
For Register Allocation: %! RFILE<>[R_index, C_index]
For Processor Allocation: %! CDPU[R_index, C_index]
Where R_index and C_index indicate the location of RFile and CDPU in the fabric.
Writing the MATLAB code for VESYLA can be explained by one simple example. Let us
assume we want to implement the operation,
Result = (a+b-c) * d. (2.11)
This task can be mapped onto the DRRA fabric as shown in Figure 2_16. DRRA cell [0, 0]
is used for calculating a+b, which is then sent to the next DRRA cell [0, 1] to calculate
a+b-c. Finally, this is fed to the last DRRA cell [0, 2], which gives the final result (a+b-c) * d.
The result is stored in RFile[0, 2].
The MATLAB code for this example can be seen in Table 3_1 below.
Figure 2_16: Mapping example for DRRA cell
Table 3_1: MATLAB source code example for VESYLA
a = [0]; %! RFILE<>[0, 0]
b = [0]; %! RFILE<>[0, 0]
c = [0]; %! RFILE<>[0, 1]
d = [0]; %! RFILE<>[0, 2]
abcd = [0]; %! RFILE<>[0, 2]
ab(1) = a(1) + b(1); %! CDPU<>[0, 0]
abc(1) = ab(1) - c(1); %! CDPU<>[0, 1]
abcd(1) = abc(1) * d(1); %! DPU<>[0, 2]
3 IMPLEMENTATION
The Fourier Projection Algorithm (FPA) for reducing the Peak-to-Average Power Ratio (PAPR) of
the Orthogonal Frequency Division Multiplexing (OFDM) system is discussed in detail in this
chapter. It is first implemented in the SIMULINK and MATLAB environments and then
synthesized on the coarse-grained reconfigurable fabric platform.
3.1 Fourier Projection Algorithm (FPA)
We have learnt in the previous chapter that OFDM systems have a very high PAPR. The Fourier
Projection Algorithm (FPA) is a technique that reduces the peaks to avoid distortion in
OFDM systems. An OFDM system rarely uses all of its bandwidth, i.e. some portion of
the available bandwidth carries little or no data. The tones carrying no data are
reserved for clipping control. Since the reserved tones in OFDM are orthogonal to the rest
of the tones, changing their energy does not affect the original data. The reserved
tones are used to store the energy clipped from the high peaks of the original data, which
avoids the non-linear distortions of the Power Amplifier (PA).
The purpose of this technique is to keep the magnitude of every element of a length-N signal
below the pre-defined clipping threshold C. The FPA is based on a popular algorithm called
Projection Onto Convex Sets (POCS) [12]. This algorithm converges by bouncing back and forth
between two sets. First, L tones in the frequency domain are reserved for clipping
control. These tones carry no data but are used to store the energy of the clipped used tones.
The length-N signal in the time domain is projected onto an N-dimensional clipping
'hypercube' centered on the origin with side length 2C, i.e. every sample whose value exceeds
the threshold C is clipped to C. The clipped signal is transformed into the frequency domain,
where its used tones are reset while the L unused tones are left as they are. These L tones
now contain the energy of the clipped used tones. The signal is then transformed back into
the time domain and projected onto the clipping hypercube again, and the whole
process is repeated until both projections generate no changes [12].
3.1.1 The POCS algorithm
The POCS algorithm can be seen graphically in Figure 3_1. It starts by taking IFFT of the data
to get x in the time domain. The algorithm can be described as below [12]:
1. Take the IFFT of the data to get x.
2. Clip x. If any element changes value, go to step 3; otherwise return x and
terminate.
3. Take the FFT of the clipped x to get the data in the frequency domain, set the (N-L) used
elements back to the original data, and keep the L unused tones as they are.
4. Take the IFFT to get the new x.
5. Go back to step 2.
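The five steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the thesis implementation: it uses a naive O(N^2) DFT to stay dependency-free, clips the real and imaginary parts separately (as the Clipping block of section 3.2 does), and adds an iteration cap `max_iter` and a tolerance `tol`, since the loop is not guaranteed to terminate exactly. All function names and the example data are hypothetical.

```python
import cmath


def dft(seq, inverse=False):
    """Naive O(N^2) DFT (or inverse DFT); a real design would use an FFT."""
    n = len(seq)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(v * cmath.exp(sign * 2j * cmath.pi * k * m / n)
                  for m, v in enumerate(seq))
        out.append(acc / n if inverse else acc)
    return out


def clip(x, c):
    """Clip real and imaginary parts separately to the range [-c, c]."""
    def lim(v):
        return max(-c, min(c, v))
    return [complex(lim(v.real), lim(v.imag)) for v in x]


def clip_error(x, c):
    """Energy removed by clipping; zero when no sample exceeds the threshold."""
    return sum(abs(a - b) ** 2 for a, b in zip(x, clip(x, c)))


def pocs(data, reserved, c, max_iter=100, tol=1e-12):
    """Frequency-domain POCS loop following steps 1 to 5.

    data:     list of complex frequency-domain symbols (length N)
    reserved: set of indices of the L unused tones
    Returns (x, iterations_used).
    """
    base = [0j if k in reserved else complex(d) for k, d in enumerate(data)]
    x = dft(base, inverse=True)                       # step 1
    for it in range(max_iter):
        if clip_error(x, c) <= tol:                   # step 2: nothing clipped
            return x, it
        spec = dft(clip(x, c))                        # step 3: FFT of clipped x
        spec = [s if k in reserved else base[k]       # reset the used tones
                for k, s in enumerate(spec)]
        x = dft(spec, inverse=True)                   # step 4
    return x, max_iter                                # cap reached (step 5 loop)


# Hypothetical 16-tone example with four reserved tones.
data = [1+1j, -1+1j, 1-1j, 1+1j, -1-1j, 1-1j, -1+1j, -1-1j,
        1+1j, 1+1j, -1-1j, 1-1j, -1+1j, 1+1j, 1-1j, -1-1j]
x, n_iter = pocs(data, reserved={0, 5, 10, 15}, c=0.25, max_iter=50)
print("iterations:", n_iter, "remaining clip energy:", clip_error(x, 0.25))
```

Because each pass is a projection onto a convex set, the energy that would still be clipped cannot grow from one iteration to the next, which is why `clip_error` is a convenient progress measure in this sketch.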
In the following, α denotes the clipping amplitude, which occurs at the clip position d. As this
algorithm uses FFT/IFFT processing, its complexity is reduced to O(N log N). First, clip x to
get xc, with the clip signal $c = \sum_{n=0}^{N-1} \alpha\,\delta_{n-d}$, then take the FFT of both
sides to get X and C:
$$x_c = x + c \tag{3.1}$$
$$\mathrm{FFT:}\quad X_c = X + C$$
Figure 3_1: Decomposition of POCS algorithm [12]
Projecting the clipped signal Xc onto the used sub-carriers returns X′, whereas projecting X
onto the used sub-carriers leaves X unchanged. Then projecting the clip impulse C onto B zeros
out the used tones, which carry the transmitted data, and gives us a new C′, which can be
written as:
$$C' = C \ast 1_{\delta B} \tag{3.2}$$
where $1_{\delta B}$ is the indicator function of B. Taking the IFFT gives us the signal back in
the time domain, x′:
$$x' = \mathrm{IFFT}(X')$$
$$x' = \mathrm{IFFT}(X + C')$$
$$x' = \mathrm{IFFT}(X) + \mathrm{IFFT}(C')$$
$$x' = \mathrm{IFFT}(X) + \mathrm{IFFT}(C \ast 1_{\delta B})$$
$$x' = x + c \otimes s \tag{3.3}$$
$$x' = x + \sum_{n=0}^{N-1} \alpha\, s_{(n-d) \bmod N} \tag{3.4}$$
where ⊗ denotes circular convolution and $s = \mathrm{IFFT}(1_{\delta B})$ is the shaping function [12].
Thus, the algorithm can be iterated completely in the time domain by applying scaled and
shifted versions of the shaping function to x:
1. IFFT of input data to get x.
2. Check if the amplitude changes at any position when x is clipped. If no changes occur
return x and terminate.
3. For each clip, add the shaping function scaled by the clipping amplitude to x. The shap-
ing function is also circularly shifted so that it is centered about the clipping position.
4. Go back to step 2.
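The equivalence between the time-domain update (3.4) and the frequency-domain projection can be checked numerically. The sketch below is an illustration under assumptions: it uses a naive DFT, clips real and imaginary parts separately, and treats every clipped sample as one clip with complex amplitude α. The helper names `shaping_function`, `time_domain_step`, and `frequency_domain_step` are hypothetical.

```python
import cmath


def dft(seq, inverse=False):
    """Naive O(N^2) DFT (or inverse DFT), used only to keep the sketch small."""
    n = len(seq)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(v * cmath.exp(sign * 2j * cmath.pi * k * m / n)
                  for m, v in enumerate(seq))
        out.append(acc / n if inverse else acc)
    return out


def clip(x, c):
    """Clip real and imaginary parts separately to the range [-c, c]."""
    def lim(v):
        return max(-c, min(c, v))
    return [complex(lim(v.real), lim(v.imag)) for v in x]


def shaping_function(n, reserved):
    """s: inverse DFT of the indicator of the reserved tones."""
    return dft([1.0 if k in reserved else 0.0 for k in range(n)], inverse=True)


def time_domain_step(x, s, c):
    """Steps 2 and 3: add a scaled, circularly shifted s for every clip."""
    n = len(x)
    out = list(x)
    for d, (orig, limited) in enumerate(zip(x, clip(x, c))):
        alpha = limited - orig               # clipping amplitude at position d
        if alpha != 0:
            for m in range(n):
                out[m] += alpha * s[(m - d) % n]
    return out


def frequency_domain_step(x, reserved, c):
    """Reference: clip, FFT, keep clipped values on reserved tones only, IFFT."""
    spec = dft(x)
    spec_c = dft(clip(x, c))
    mixed = [spec_c[k] if k in reserved else spec[k] for k in range(len(x))]
    return dft(mixed, inverse=True)


# Hypothetical 8-sample signal with reserved tones 1 and 5.
x = [0.9+0.2j, -0.1-0.7j, 0.3+0.3j, -0.8+0.1j,
     0.05-0.05j, 0.6-0.9j, -0.4+0.4j, 0.2+0.0j]
s = shaping_function(len(x), {1, 5})
diff = max(abs(a - b) for a, b in
           zip(time_domain_step(x, s, 0.5), frequency_domain_step(x, {1, 5}, 0.5)))
print("largest difference between the two updates:", diff)
```

The time-domain form costs one length-N addition per clipped sample, so it is attractive when only a few samples exceed the threshold.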
3.1.2 Flow Graph of FPA
The flow graph of the FPA algorithm can be seen in Figure 3_2. The first step of the FPA is
clipping the input data xs in the time domain to get xc: every input sample greater than C
is brought down to C. In the next step, a while-loop condition checks whether
any sample value of the input data xs is greater than C. If the condition is false, the loop
returns xs and terminates; otherwise, it enters the loop body. In the loop body, the clipped
signal xc is converted into the frequency domain, and the N-L used tones are reset while the
L unused tones are left as they are. The new xs is obtained by converting the signal back into
the time domain.
The new xs is compared against the threshold value C, and the whole process is repeated until
the condition is false.
Figure 3_2: Flow graph of FPA
3.2 Implementation of the FPA in SIMULINK
Figure 3_3 shows the FPA block in SIMULINK, which is used to deal with the PAPR problem
in OFDM systems. C is the clipping threshold, M denotes the used tones, which carry data, and
L denotes the unused tones, which carry no data. The threshold value C is usually selected to be
a little more than the average value of the OFDM signal.
[Figure: SIMULINK model with a Training stage block and an FPA block; signals xs,in, xs,out, L, C, and M]
Figure 3_3: Fourier Projection Algorithm from SIMULINK
As described above, there are always a certain number of unused tones in OFDM systems.
Therefore, the unused tones are first determined in the training stage. This is done by
converting the input data xs into the frequency domain, where it has N elements. It is
assumed that L of these tones are unused and can be used for storing the energy
of the clipped data. Thus, once the L unused elements are zeroed out, N-L used
elements are left, which carry the original data. The inside view of the training stage can be
seen in Figure 3_4. The vector M is multiplied with the input data Xs in the frequency domain,
which zeros out the L elements. Hence, the input data contains only the N-L used elements
after the training stage.
Figure 3_4: Inside view of the training stage
The second block called FPA is the most important block in this algorithm. This block is used
to clip the signal in the time domain and then reset the used tones in the frequency domain. We
are using a while loop in our model to control the peak of the OFDM signal. The condition of
the while loop is shown in Figure 3_5(a), whereas the loop body is shown in Figure 3_5(b).
Each of these blocks is explained separately in detail below.
Figure 3_5: Inside view of the FPA
The Condition block checks whether the input data xs is greater than the required threshold
value C. If so, the first input of the AND gate is 1; otherwise it is 0. The other
input of the AND gate comes from a Counter block, which simply counts the number of
iterations of the while loop. The internal view of the Counter block can be seen in Figure 3_6.
The maximum number of iterations can be set in this block; we set a limit because otherwise
the simulation can run forever in some cases. Hence, the while loop runs as long as the
iteration count is under the limit and the input data xs is greater than the threshold value C.
Figure 3_6: ‘Counter’ logic
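The loop condition described above (threshold check AND iteration limit) can be written as a short Python predicate. This is a sketch, not the SIMULINK model itself: the name `keep_looping` is hypothetical, and comparing the real and imaginary parts separately against C is an assumption taken from the way the Clipping block operates.

```python
def keep_looping(x, c, iteration, max_iter):
    """Return True while the while loop should run another iteration.

    over_threshold: some real or imaginary part of x exceeds c in magnitude
    under_limit:    the iteration counter is still below the configured limit
    """
    over_threshold = any(abs(v.real) > c or abs(v.imag) > c for v in x)
    under_limit = iteration < max_iter
    return over_threshold and under_limit


# Hypothetical samples: the second one exceeds C = 1.0.
print(keep_looping([0.2+0.1j, 1.5-0.3j], 1.0, iteration=0, max_iter=10))
```

Without the counter input, a signal that never fits under the threshold with the available unused tones would keep the loop running indefinitely.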
There is another block, called Clipping, which clips the data in the
time domain. Although the SIMULINK library provides a built-in block called
Saturation that performs the required clipping, we have implemented this block ourselves
to understand how it works. The inside view of the Clipping block can be seen in Figure 3_7.
Figure 3_7: Clipping block
This block works on the real and imaginary parts separately. First, each input data sample is
compared against 0 to find whether the sample value is positive or negative. In the next step,
the sample value is compared against the threshold value C or -C, depending on whether the
input data sample is positive or negative.
If the sample value is positive, it is compared against the threshold value C by the next
relational operator. The output is C if the input sample is greater than C; otherwise the
input sample value is passed through unchanged.
If the sample value is negative, it is compared against the threshold -C instead of
C. The rest of the process of bringing the values under the threshold limit is the same as for
the positive
value explained previously. Finally, we get the output data in the time domain, clipped
to the threshold value C. This whole process is done separately for the real and
imaginary parts.
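The clipping logic described above can be mirrored in a few lines of Python. This is a sketch of the described comparisons, not an export of the SIMULINK block; the function names `clip_component` and `clip_sample` are hypothetical.

```python
def clip_component(v, c):
    """Clip one real-valued component: test the sign first, then compare
    against +c (positive branch) or -c (negative branch)."""
    if v >= 0:
        return c if v > c else v
    return -c if v < -c else v


def clip_sample(z, c):
    """Clip the real and imaginary parts of a complex sample separately."""
    return complex(clip_component(z.real, c), clip_component(z.imag, c))


# Hypothetical samples clipped at C = 1.0.
for z in (2 + 0.5j, -3 - 0.2j, 0.3 - 4j):
    print(z, "->", clip_sample(z, 1.0))
```

Clipping each component independently limits the sample to a square in the complex plane, which matches the 'hypercube' projection used by the POCS description.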
In the loop body of the while loop, the input data in the time domain is first clipped by the
Clipping block if it has peaks greater than the threshold value C. The output of the Clipping
block is converted into the frequency domain by taking its FFT. Multiplying the result by the
vector L and then adding the original signal resets the used tones while the unused tones
are left as they are. The aim is to bring the time-domain data within the threshold
limit. The resulting data contains the original used tones, while the unused tones contain the
energy of the clipped data. This new data, containing both the L and M tones, is converted
back to the time domain using the IFFT. This signal is the new xs, which is compared against
the threshold value in the while condition. This process is repeated until the while condition
is false.
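The mask arithmetic of the training stage and of the loop body can be illustrated with element-wise list operations. The sketch below assumes that M is a 0/1 vector marking the used tones and L is its complement marking the unused tones; the function names and the example spectra are hypothetical.

```python
def tone_masks(n, unused):
    """Return (M, L): M is 1 on used tones, L is 1 on unused tones."""
    l_mask = [1 if k in unused else 0 for k in range(n)]
    m_mask = [1 - v for v in l_mask]
    return m_mask, l_mask


def training_stage(spec_in, m_mask):
    """Zero out the unused tones of the input spectrum (Xs = X * M)."""
    return [x * m for x, m in zip(spec_in, m_mask)]


def reset_used_tones(spec_clipped, spec_s, l_mask):
    """Loop-body update Xc * L + Xs: used tones return to the original
    data, unused tones keep the energy of the clipped signal."""
    return [xc * l + xs for xc, l, xs in zip(spec_clipped, l_mask, spec_s)]


# Hypothetical 4-tone example in which tone 1 is unused.
m_mask, l_mask = tone_masks(4, {1})
xs = training_stage([1+1j, 5+5j, -1+1j, 1-1j], m_mask)
new = reset_used_tones([9+9j, 0.5-0.25j, 7j, 3+0j], xs, l_mask)
print(new)
```

Because Xs is zero on the unused tones after the training stage, the sum never mixes original data with clipped energy on the same tone.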
3.3 Implementation of Radix-2 FFT and IFFT on the DRRA fabric
As the DRRA fabric is in its early stages, FFT/IFFT operations are not available in the
fabric. They were implemented first, as they are the key operations in this
thesis. We do not describe the implementation of the IFFT separately, because it is the same as
the FFT except that the real and imaginary parts are swapped at the input and at the output and
the result is divided by the number of samples. The implementation of the Radix-2 FFT on the
DRRA fabric is described in this section.
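The swap trick for obtaining the IFFT from the FFT can be demonstrated with a short Python sketch. It assumes the swap is applied both before and after the forward transform, followed by division by the number of samples; a naive DFT stands in for the FFT, and the names `swap` and `idft_via_swap` are hypothetical.

```python
import cmath


def dft(seq):
    """Naive forward DFT, standing in for the FFT on the fabric."""
    n = len(seq)
    return [sum(v * cmath.exp(-2j * cmath.pi * k * m / n)
                for m, v in enumerate(seq))
            for k in range(n)]


def swap(z):
    """Exchange the real and imaginary parts of a complex number."""
    return complex(z.imag, z.real)


def idft_via_swap(spec):
    """Inverse DFT computed with the forward DFT only:
    swap the inputs, transform, swap the outputs, divide by N."""
    n = len(spec)
    forward = dft([swap(v) for v in spec])
    return [swap(v) / n for v in forward]


# Hypothetical round trip: transforming and inverting should return x.
x = [1+2j, -0.5+0.25j, 3-1j, 0j]
y = idft_via_swap(dft(x))
print(max(abs(a - b) for a, b in zip(x, y)))
```

The attraction for a fabric implementation is that the same butterfly datapath serves both directions, with only the operand routing and a final scaling changed.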
The Fast Fourier Transform (FFT) is a method for computing the Discrete Fourier Transform (DFT)
of a series of input samples in the time domain. The Radix-2 FFT butterfly operations are
derived from the equation below [13]:
$$X[k] = \sum_{n=0}^{N-1} x(n)\,W_N^{kn}, \tag{3.5}$$
for k = 0, ..., N-1, where
$$W_N = e^{-j2\pi/N}. \tag{3.6}$$
Splitting the sequence x(n) of the above equation into two sequences of even and odd samples,
each of length N/2, we can write:
$$X[k] = \sum_{m=0}^{N/2-1} x(2m)\,W_N^{2km} + \sum_{m=0}^{N/2-1} x(2m+1)\,W_N^{k(2m+1)}.$$
With $g_1(m) = x(2m)$ and $g_2(m) = x(2m+1)$, this becomes:
$$X[k] = \sum_{m=0}^{N/2-1} g_1(m)\,W_N^{2km} + W_N^{k}\sum_{m=0}^{N/2-1} g_2(m)\,W_N^{2km}. \tag{3.7}$$
Here, $W_N^{2km} = e^{-j2km\frac{2\pi}{N}} = e^{-jkm\frac{2\pi}{N/2}} = W_{N/2}^{km}$, so we get:
$$X[k] = \sum_{m=0}^{N/2-1} g_1(m)\,W_{N/2}^{km} + W_N^{k}\sum_{m=0}^{N/2-1} g_2(m)\,W_{N/2}^{km}, \tag{3.8}$$
$$X[k] = G_1[k] + W_N^{k}\,G_2[k], \tag{3.9}$$
where $G_1[k] = \sum_{m=0}^{N/2-1} g_1(m)\,W_{N/2}^{km}$ and $G_2[k] = \sum_{m=0}^{N/2-1} g_2(m)\,W_{N/2}^{km}$ are
the N/2-point DFTs of $g_1(m)$ and $g_2(m)$. $G_1[k]$ and $G_2[k]$ are periodic with period N/2, so
$G_1[k] = G_1[k + \tfrac{N}{2}]$, $G_2[k] = G_2[k + \tfrac{N}{2}]$, and $W_N^{k+N/2} = -W_N^{k}$. Thus we can write:
$$X[k] = G_1[k] + W_N^{k}\,G_2[k], \tag{3.10}$$
$$X\!\left[k + \tfrac{N}{2}\right] = G_1[k] - W_N^{k}\,G_2[k], \tag{3.11}$$
where k = 0, ..., N/2-1.
This single butterfly operation can be seen in Figure 3_8.
[Figure: butterfly with inputs G1[k] and G2[k], twiddle factor (WN)^k, a -1 branch, and outputs X[k] and X[k+N/2]]
Figure 3_8: Single butterfly operation
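Equations (3.10) and (3.11) translate directly into a recursive radix-2 decimation-in-time FFT. The Python sketch below is a textbook illustration, not the DRRA mapping; the function names are hypothetical, and the direct DFT of (3.5) is included only as a reference for comparison.

```python
import cmath


def fft_radix2(x):
    """Recursive radix-2 FFT; the length must be a power of two."""
    n = len(x)
    if n < 1 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return list(x)
    g1 = fft_radix2(x[0::2])     # G1[k]: N/2-point DFT of the even samples
    g2 = fft_radix2(x[1::2])     # G2[k]: N/2-point DFT of the odd samples
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(-2j * cmath.pi * k / n)    # twiddle factor W_N^k
        out[k] = g1[k] + w * g2[k]               # equation (3.10)
        out[k + n // 2] = g1[k] - w * g2[k]      # equation (3.11)
    return out


def dft_direct(x):
    """Direct evaluation of the DFT sum (3.5), used as a reference."""
    n = len(x)
    return [sum(v * cmath.exp(-2j * cmath.pi * k * m / n)
                for m, v in enumerate(x))
            for k in range(n)]


# Hypothetical 8-point input.
x = [complex(k, -0.5 * k) for k in range(8)]
err = max(abs(a - b) for a, b in zip(fft_radix2(x), dft_direct(x)))
print("largest deviation from the direct DFT:", err)
```

Each level of the recursion performs N/2 butterflies, which gives the familiar O(N log N) operation count.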
In general, butterfly equations of Radix-2 FFT can be written as [3]:
A = a + W b (3.12)
B = a - W b (3.13)
In the DRRA, we work on the real and imaginary parts separately. Let Dr and Di be the real and
imaginary parts of the data, so that a = Dr (k) + i Di (k) and, similarly, b = Dr (kh) + i Di (kh),
where 'k' is the index for even samples and 'kh' is the index for odd samples.
Inserting 'a', 'b', and the twiddle factor W = Wr + iWi in the above equations gives:
A = (Dr (k) + iDi (k) ) + (Wr + iWi) (Dr (kh) + iDi (kh)),
A = Dr (k) + iDi (k) + Wr Dr (kh) - Wi Di(kh) + i(Wr Di (kh) + Wi Dr (kh)),
A = (Tr0(k) - Wi Di(kh)) + i(Ti0(k) + Wi Dr(kh)), (3.14)
where Tr0(k) = Dr (k) + Wr Dr (kh) and Ti0(k) = Di (k) + Wr Di (kh).
B = (Dr (k) + iDi (k)) - (Wr + iWi)(Dr (kh) + iDi (kh)),
B = Dr (k) + iDi (k) - Wr Dr (kh) + Wi Di (kh) - i(Wr Di (kh) + Wi Dr (kh)),
B = (Tr1 (k) + Wi Di (kh)) + i(Ti1 (k) - Wi Dr (kh)), (3.15)
where Tr1(k) = Dr (k) - Wr Dr (kh) and Ti1(k) = Di (k) - Wr Di (kh).
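Equations (3.14) and (3.15) can be checked against the complex butterfly A = a + Wb, B = a - Wb with a short Python sketch that uses only real multiplications and additions, as the DPUs would. The function name `butterfly_real` and the example operands are hypothetical.

```python
import cmath


def butterfly_real(dr_k, di_k, dr_kh, di_kh, wr, wi):
    """Radix-2 butterfly on separate real and imaginary parts.

    Returns (A_re, A_im, B_re, B_im) following (3.14) and (3.15).
    """
    tr0 = dr_k + wr * dr_kh          # Tr0(k)
    ti0 = di_k + wr * di_kh          # Ti0(k)
    tr1 = dr_k - wr * dr_kh          # Tr1(k)
    ti1 = di_k - wr * di_kh          # Ti1(k)
    a_re = tr0 - wi * di_kh          # real part of A, (3.14)
    a_im = ti0 + wi * dr_kh          # imaginary part of A, (3.14)
    b_re = tr1 + wi * di_kh          # real part of B, (3.15)
    b_im = ti1 - wi * dr_kh          # imaginary part of B, (3.15)
    return a_re, a_im, b_re, b_im


# Hypothetical operands and twiddle factor W = exp(-j*2*pi/8).
a, b = 0.5 - 1.25j, -2 + 0.75j
w = cmath.exp(-2j * cmath.pi / 8)
a_re, a_im, b_re, b_im = butterfly_real(a.real, a.imag, b.real, b.imag,
                                        w.real, w.imag)
print(abs(complex(a_re, a_im) - (a + w * b)),
      abs(complex(b_re, b_im) - (a - w * b)))
```

The four products wr*dr_kh, wr*di_kh, wi*dr_kh, and wi*di_kh are shared between A and B, so one butterfly needs four real multiplications.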
The single butterfly implementation on the DRRA fabric can be seen in Figure 3_9 [3]. We
use two RFiles, one for the real and the other for the imaginary part of equations (3.14) and
(3.15). The third RFile stores the twiddle factor, and the fourth RFile is used
as the delay line for Dr(kh) and Di(kh).
The current RFile in the DRRA fabric can store up to 32 words of 16 bits each. Therefore, for a
64-point FFT, we need 2 RFiles for the real and imaginary parts and 2 more RFiles for
the twiddle factor and the delay line.
[Figure: four DRRA cells, [0, 0], [0, 1], [1, 0], and [1, 1], each with a sequencer, an RFile, and a DPU; the RFiles hold Dreal, Dimag, the twiddle factor W, and the delay line; the DPUs compute Tr0(k), Tr1(k), Ti0(k), and Ti1(k) and the outputs ARE, BRE (real parts) and AIM, BIM (imaginary parts)]
Figure 3_9: Implementation of Radix-2 FFT single butterfly on DRRA [3]
3.4 Implementation of FPA on the DRRA fabric
The implementation of the FPA on the DRRA fabric can be seen in Figure 3_10. We have used 8
DRRA cells to implement the FPA algorithm. RFile[0, 0] and RFile[1, 0] store the real part
Xs,r and the imaginary part Xs,i, respectively, of the original OFDM signal in the frequency
domain, which we want to clip. As we work on the real and imaginary parts separately in the
DRRA fabric, RFile[0, 1] and RFile[1, 1] are used for the FFT and
IFFT operations. The clipping function is implemented in DRRA cell [1, 3]. The
clipped signal xc is sent to DRRA cell [0, 1] and DRRA cell [1, 1] for conversion into the
frequency domain. The FFT operation generates the real part Xc,r and the imaginary part Xc,i
separately. Xc,r and Xc,i are multiplied by the vector L and then added to Xs,r and Xs,i,
respectively, to reset the used tones while the unused tones are left as they are. The generated
output is then converted back into the time domain, giving the new xs. This xs is compared again
against the predefined threshold value C, which is stored in RFile[1, 3]. If it is greater than
C, the whole process is repeated; otherwise, the process terminates and we obtain the required
signal with reduced PAPR.
[Figure: eight DRRA cells, each with a sequencer, an RFile, and a DPU; cell [0, 0]: Xs,r; cell [1, 0]: Xs,i; cell [0, 1]: Dreal; cell [1, 1]: Dimag; cell [0, 2]; cell [1, 2]: twiddle factor W; cell [0, 3]: L; cell [1, 3]: Clip]
Figure 3_10: Implementation of FPA on DRRA fabric
4 RESULTS
This chapter describes the simulation results of this thesis. The efficiency of the Fourier
Projection Algorithm (FPA) can be measured by the Tone Rate Loss (TRL), the number of
iterations required to reduce the Peak-to-Average Power Ratio (PAPR), and the throughput of the
system.
The TRL is defined as the ratio of the number of unused tones to the total number of tones:
$$\text{Tone rate loss} = \frac{\text{Number of unused tones}}{\text{Total number of tones}}. \tag{4.1}$$
Throughput is a measure of the number of bits, or samples, per unit time. Our
system sends N tones, of which $N_B$ are unused, so the fraction of tones carrying data
is $\frac{N-N_B}{N}$. The time taken by the total samples is taken to be the number of iterations n.
Thus, $\frac{N-N_B}{N}/n$ is the required throughput of our system.
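The two figures of merit can be evaluated with a few lines of Python. The sketch below is an illustration only: the function names are hypothetical, and N = 64 is an assumption that is not stated explicitly in this chapter; it is chosen because it reproduces the throughput values reported in section 4.2 for 11 and 20 unused tones.

```python
def tone_rate_loss(n_tones, n_unused):
    """TRL (4.1): unused tones divided by the total number of tones."""
    return n_unused / n_tones


def throughput(n_tones, n_unused, n_iterations):
    """Fraction of data-carrying tones divided by the number of iterations."""
    return (n_tones - n_unused) / n_tones / n_iterations


N = 64  # assumed total number of tones (see the note above)
for n_unused, n_iter in ((11, 487), (20, 37)):
    print(f"unused={n_unused:2d}  TRL={tone_rate_loss(N, n_unused):.4f}  "
          f"throughput={throughput(N, n_unused, n_iter):.4f}")
```

With these inputs the throughput evaluates to about 0.0017 and 0.0186, which illustrates how strongly the iteration count dominates the metric.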
Another important factor is the complexity of the system, which is defined as
the number of iterations required to reduce the PAPR of the system.
Four different outputs of the simulations are considered: the reduction of the PAPR, the
number of iterations versus the number of unused tones, throughput versus the number of
unused tones, and throughput versus the number of iterations. In the following sections, the
simulation results are plotted and discussed in detail.
4.1 Number of iterations versus number of unused tones
In this analysis, a random number of unused tones is selected and the FPA algorithm is applied
to count the number of iterations required to reduce the PAPR of the system. Figure 4_1 shows
the simulated number of iterations that the FPA algorithm needs to reduce the PAPR of the signal
for a varying number of unused tones. Note that as the number of unused tones increases,
the FPA model requires fewer iterations to reduce the PAPR to the required threshold limit,
and vice versa. Thus, increasing the number of unused tones increases the TRL but reduces the
number of iterations needed to reduce the PAPR. It can be observed from Figure 4_1 that the
number of iterations for convergence remains fairly constant for higher numbers of unused
tones. However, the complexity of the FPA algorithm increases for lower numbers of unused
tones. It can also be noticed
that when the number of unused tones increases from 11 to 20, the number of iterations drops
from 487 to 37.
Figure 4_1: Number of iterations versus Number of unused tones
In Figure 4_2, the effect of different PAPR levels is shown. When the PAPR level is varied from
3 to 4 dB, the number of iterations increases from 487 to 637 for 11 unused
tones. However, for higher numbers of unused tones, the number of iterations
is nearly the same for both the 3 and 4 dB PAPR signals.
Figure 4_2: Number of iterations versus Number of unused tones for varying PAPR levels
4.2 Throughput versus number of unused tones
The next plot shows the throughput of the system against the number of unused tones. The FPA is
simulated for different numbers of unused tones, and the corresponding throughput of the system
is calculated as $\frac{N-N_B}{N}/n$. It can be seen in
Figure 4_3 that the throughput of the system increases with the TRL. Thus, there
is a trade-off between the TRL and the throughput of the system: the higher the TRL, the better
the throughput of the FPA algorithm, and vice versa. Increasing the number
of unused tones from 11 to 20 raises the throughput of the system from 0.0017 to
0.0186.
Figure 4_3: Throughput versus Number of unused tones
4.3 Throughput versus number of Iterations
Figure 4_4 shows the throughput of the FPA versus the number
of iterations. Note that as the number of unused tones increases, the number of iterations
needed to reduce the PAPR of the OFDM system decreases, and hence the throughput of the system
increases. The throughput of the system increases from 0.0017 to 0.0186 as
the number of iterations decreases from 487 to 37.
Figure 4_4: Throughput versus Number of iterations
4.4 Discussions
The reduction of the PAPR of OFDM systems using the FPA algorithm has been achieved
successfully. The results in this chapter show that the performance of the FPA algorithm depends
on the number of unused tones and the PAPR level of the system. The throughput and the
complexity of the FPA algorithm can be improved at the cost of the number of unused tones as
well as the PAPR of the system. For a higher PAPR level, more unused tones are needed to reduce
the PAPR of the OFDM system. However, increasing the number of unused tones reduces the data
rate of the system, which is one of the disadvantages of this algorithm.
The FPA method works well in reducing the PAPR, but its computation is complex. It is not
known in advance how many unused tones are needed for clipping control of the signal, so there
is a need to find a way to calculate the number of unused tones required for a specific PAPR
value. The number of unused tones can vary depending on the PAPR of the system. If fewer unused
tones are given than required, the FPA algorithm does
not reduce the PAPR to the required limit.
5 CONCLUSIONS AND FUTURE WORK
The purpose of this thesis, modeling the Fourier Projection Algorithm (FPA) in the SIMULINK
and MATLAB environments and synthesizing the design on the CREST fabric, has been achieved
successfully. Simulation results in chapter 4 show that the FPA is a very efficient way of
suppressing the Peak-to-Average Power Ratio (PAPR) of Orthogonal Frequency Division Multiplexing
(OFDM) systems, but there is a trade-off between the number of unused tones, the number of
iterations, and the throughput. The level of PAPR reduction and the time needed to suppress the
PAPR to its required level depend on the number of unused tones. The FPA algorithm is also
successfully implemented on CGRAs. The effectiveness of the FPA algorithm depends on the
original PAPR of the system and the number of unused tones available to reduce it. Further, the
FPA algorithm offers good PAPR reduction with a small amount of data loss.
This thesis is based only on a prototype implementation of the FPA. The main future work is
the full integration of the FPA algorithm into the WLAN transmitter. Another
extension of this work is on the receiver side: the FPA technique reduces the PAPR in
the transmitter, but at the receiver end another technique, called Clipping Estimation and
Correction (CEC), is needed to retrieve the data in its original form; it is not implemented
in this thesis. In addition, a better estimator for the unused tones can be implemented.
REFERENCES
[1] A. Katz, “Linearization: Reducing distortion in power amplifiers,” IEEE Microwave Magazine, vol. 2, no. 4, pp. 37–49, 2001.
[2] V. Vijayarangan and R. Sukanesh, “An overview of techniques for reducing peak to average power ratio and its selection criteria for orthogonal frequency division multiplexing radio systems,” Journal of Theoretical and Applied Information Technology, vol. 5, no. 1, pp. 25–36, 2009.
[3] M. A. Shami, “Dynamically reconfigurable resource array,” PhD dissertation, KTH Stockholm, 2012.
[4] S. Cioni, G. E. Corazza, M. Neri, and A. Vanelli-Coralli, “On the use of OFDM radio interface for satellite digital multimedia broadcasting systems,” International Journal of Satellite Communications and Networking, vol. 24, no. 2, pp. 153–167, 2006.
[5] K. Pietikäinen, “Orthogonal frequency division multiplexing,” Internet presentation, 2005.
[6] N. Chide, S. Deshmukh, and P. Borole, “Implementation of OFDM system using IFFT and FFT,” International Journal of Engineering Research and Applications (IJERA), vol. 3, no. 1, pp. 2009–2014, 2013.
[7] J. A. Bingham, “Multicarrier modulation for data transmission: An idea whose time has come,” IEEE Communications Magazine, vol. 28, no. 5, pp. 5–14, 1990.
[8] A. Chadha, N. Satam, and B. Ballal, “Orthogonal frequency division multiplexing and its applications,” arXiv preprint arXiv:1309.7334, 2013.
[9] E. Costa, M. Midrio, and S. Pupolin, “Impact of amplifier nonlinearities on OFDM transmission system performance,” IEEE Communications Letters, vol. 3, no. 2, pp. 37–39, 1999.
[10] R. Panda and S. Hauck, “Dynamic communication in a coarse grained reconfigurable array,” in Field-Programmable Custom Computing Machines (FCCM), 2011 IEEE 19th Annual International Symposium on. IEEE, 2011, pp. 25–28.
[11] S. M. Jafri, M. Daneshtalab, A. Hemani, N. Abbas, M. A. Awan, and J. Plosila, “TEA: Timing and energy aware compression architecture for efficient configuration in CGRAs,” Microprocessors and Microsystems, vol. 39, no. 8, pp. 973–986, 2015.
[12] A. Gatherer and M. Polley, “Controlling clipping probability in DMT transmission,” in Signals, Systems & Computers, 1997. Conference Record of the Thirty-First Asilomar Conference on, vol. 1. IEEE, 1997, pp. 578–584.
[13] W. T. Cochran, J. W. Cooley, D. L. Favin, H. D. Helms, R. A. Kaenel, W. W. Lang, G. Maling, D. E. Nelson, C. M. Rader, and P. D. Welch, “What is the fast Fourier transform?” Proceedings of the IEEE, vol. 55, no. 10, pp. 1664–1674, 1967.