VLSI BASED RECONFIGURABLE ARCHITECTURE FOR MOBILE ADHOC NETWORK
Thesis submitted in partial fulfilment for the award of the degree of
DOCTOR OF PHILOSOPHY
By
S. CHANDRIKA
Under the Guidance of
Dr. R. RANI HEMAMALINI
Professor & Head,
ST.PETER’S COLLEGE OF ENGINEERING AND TECHNOLOGY, CHENNAI
VINAYAKA MISSIONS UNIVERSITY SALEM, TAMIL NADU, INDIA
APRIL 2016
CERTIFICATE BY THE SUPERVISOR
I certify that the thesis entitled “VLSI BASED
RECONFIGURABLE ARCHITECTURE FOR MOBILE ADHOC
NETWORK” submitted for the Degree of Doctor of Philosophy by
Mrs. S. CHANDRIKA is the record of research work carried out by her
during the period from April 2010 to April 2016 under my guidance and
supervision and that this work has not formed the basis for the award of
any degree, diploma, associate-ship, fellowship or title in the University
or any other University or Institutions of higher learning.
Signature of the Supervisor with designation
Place:
Date:
DECLARATION
I declare that the thesis entitled “VLSI BASED
RECONFIGURABLE ARCHITECTURE FOR MOBILE ADHOC
NETWORK” submitted by me for the Degree of Doctor of Philosophy is
the record of research work carried out by me during the period from
April 2010 to April 2016 under the guidance of Dr. R. Rani Hemamalini,
Professor & Head, Electronics And Communication Department,
St.Peter’s College Of Engineering And Technology, Chennai and that
this work has not formed the basis for the award of any degree,
diploma, associate-ship, fellowship or title in the University or any
other University or Institutions of higher learning.
Signature of the Candidate
Place:
Date:
ACKNOWLEDGEMENT
I am thankful to the Chancellor Prof.Dr.A.Shanmugasundaram,
Vice Chancellor Prof.Dr.V.R.Rajendran and Prof.Dr.K.Rajendran,
Dean Research, Vinayaka Missions University, Salem who have
extended their co-operation for the several phases of my research
work.
I extend my sincere gratitude to my supervisor Dr.R.Rani
Hemamalini, Professor & Head, Electronics and Communication
Department, St.Peter’s College of Engineering and Technology,
Chennai for providing the vital, inspirational, formative and
constructive thought process which facilitated my research work. I
gratefully acknowledge her guiding insight towards the articulation
and execution of this scholarly enterprise.
I express my sincere thanks to Dr.G.Gunasekaran, Principal,
Meenakshi College of Engineering, Chennai for having extended
his fullest co-operation and constant encouragement throughout
the span of this work.
The timely support rendered by my family members is duly
acknowledged.
S. Chandrika
ABSTRACT
A Mobile Ad-hoc Network (MANET) is a self-configuring,
infrastructure-less, independent network for wireless mobile
telecommunication applications. Its minimal configuration and
quick deployment make a MANET suitable for emergency
situations such as natural disasters and military conflicts.
Dynamically reconfigurable hardware architecture enables ad-hoc
networks to be formed quickly. Orthogonal Frequency Division
Multiplexing (OFDM) is one of the key architectures used in
MANET to establish data communication between source and
destination. To achieve high-speed data communication with
OFDM, reduced-complexity OFDM architectures are required.
Hence a Very Large Scale Integration (VLSI) system design
environment is used to design high-speed data communication
architectures.
An OFDM system consists of frequency transformation (Fast
Fourier Transform (FFT)) and channel encoding/decoding blocks
for establishing data communication. To transfer the original
information signal over a long distance, frequency transformation
is an essential part of every data communication process. In this
work, an efficient VLSI-based architecture for frequency
transformation is proposed. The proposed frequency
transformation technique is the Radix-2 Single-path Delay
Feedback (R2SDF) FFT. The FFT is used to convert a time-domain
signal into a frequency-domain signal and vice versa. The
Processing Element and Complex Multiplier architectures of the
R2SDF FFT have been modified to improve area and speed
performance.
Additive White Gaussian Noise (AWGN) is one of the important
noise sources; it is added to the signal seen at the receiver while
data is transferred through the channel. To reduce the effect of
this noise, an efficient channel decoder architecture called the
“Hamming Decoder” is proposed. In practice, radiation may affect
more than one bit during data transmission. Hence, a Single Error
Correction and Triple Adjacent Error Detection (SEC-TAED)
Hamming code is designed in this research work. The proposed
SEC-TAED Hamming code improves the detection probability of
the error detection and correction process. The simulation results
of the proposed FFT and decoding blocks are validated using
ModelSim 6.3C. Synthesis performance is evaluated with the
Xilinx 10.1i design tool. The proposed reconfigurable
architecture, comprising the FFT and decoding blocks, helps to
increase the efficiency of MANET in terms of area, delay and
power.
LIST OF CONTENTS
CHAPTER NO. TITLE
Abstract
List of Tables
List of Figures
List of Symbols and Abbreviations
1 MOBILE ADHOC NETWORK
  1.1 Introduction to Mobile ADHOC Networks
  1.2 Orthogonal Frequency Division Multiplexing (OFDM) in Mobile Ad-hoc Network (MANET)
  1.3 Architectural Approach of OFDM
  1.4 OFDM Systems based on Encoding/Decoding
  1.5 Challenges in VLSI
  1.6 Stages in VLSI Design
  1.7 Performance Measurements of VLSI
  1.8 Need for this study
  1.9 Objectives
  1.10 Methodology adopted
2 REVIEW OF LITERATURES
  2.1 Overview of OFDM
  2.2 Architectural Oriented OFDM Model
  2.3 Architectural Oriented FFT Models
    2.3.1 Complex Multiplier Design of FFT Models
  2.4 Hamming Error Correction Codes
  Summary
3 HIGH SPEED PIPELINED BASED 64-POINT RADIX-2 SINGLE PATH DELAY FEEDBACK (R2SDF) FFT
  3.1 FFT in MANET
  3.2 FFT Algorithm
    3.2.1 Discrete and Fast Fourier Transformation Techniques
    3.2.2 Properties of Twiddle Factor
    3.2.3 FDM Systems based on FFT and Modulation
    3.2.4 Digital Signal Processors for FFT Computation
  3.3 Radix-2 Single path Delay Feedback (R2SDF) FFT
  3.4 Pipelined Processing Element (PE) Structures for R2SDF FFT
  3.5 Design of Reduced Complex Multiplier
    3.5.1 Bit Parallel Multiplier for 1/
    3.5.2 Design of Complex Multiplier
    3.5.3 Reduced Complex Multiplier design
  3.6 Design of Radix-2 Single path Delay Feedback (R2SDF) FFT for 64-point
  Summary
4 HAMMING SINGLE ERROR CORRECTION – TRIPLE ADJACENT ERROR DETECTION CODE ALGORITHM FOR DATA COMMUNICATION
  4.1 Introduction to Error Detection and Correction (EDC) Codes
  4.2 Hamming Error Detection and Correction Mechanism
    4.2.1 Single Error Correction (SEC) Hamming Codes
    4.2.2 Single Error Correction – Double Adjacent Error Detection (SEC-DAED) Hamming Code
    4.2.3 Single Error Correction – Triple Adjacent Error Detection (SEC-TAED)
  4.3 Proposed Extended (12, 8) Hamming Code for SEC-TAED
  Summary
5 RESULTS AND DISCUSSIONS
  5.1 Synthesis Result of Pipeline Based Processing Element Structures
  5.2 Synthesis Result of Reduced Complex Multiplier
  5.3 Synthesis Result of Pipelined PEs and Reduced Complex Multiplier Based R2SDF FFT
  5.4 Simulation Result of Proposed Hamming (12, 8) SEC-TAED Codes
6 CONCLUSION
  6.1 SUMMARY OF THE THESIS
  6.2 FUTURE WORK
References
List of Publication
LIST OF TABLES
TABLE NO. TITLE
3.1 Tabulation for number of complex multiplications and complex additions required for Radix-2 16 point FFT
3.2 Twiddle factor values for 64-point FFT
4.1 Calculation of parity bits for (7, 4) hamming code
4.2 Double Adjacent Error Detection for Hamming (12, 8)
4.3 Triple Adjacent Error Detection for Hamming (13, 8)
4.4 Triple Adjacent Error Detection for Proposed Hamming (12, 8)
5.1 Comparison of area and delay for proposed pipeline based Processing Elements
5.2 Comparison of area and delay between traditional complex multiplier and proposed reduced complex multiplier
5.3 Comparison of area and delay between traditional and proposed R2SDF FFT architectures
LIST OF FIGURES
FIGURE NO. TITLE
1.1 Single hop ad hoc network
1.2 Multi-hop ad hoc network
1.3 Physical layer model of MANET
1.4 Block diagram of OFDM System
3.1 Butterfly Structure for 2-point DIT FFT
3.2 Butterfly structure for 2-point DIF FFT
3.3 Radix-2 DIF-FFT structure for 8-point
3.4 The partial twiddle factor for N-point DFT
3.5 Block diagram for Multi-carrier Modulation Scheme
3.6 Spectrum of overlapping of sub-carriers in OFDM
3.7 OFDM System based on FFT
3.8 General Purpose Programmable DSP Processor
3.9 Programmable specific FFT processor
3.10 Butterfly structure for R2SDF FFT
3.11 Symbolic representation of R2SDF FFT
3.12 Structure of Radix-2 8-point Single-path Delay Feedback (R2SDF) FFT
3.13 Structure of Radix-2 Single-path Delay Feedback (R2SDF) FFT
3.14 Block diagram of PE3 structure for 64-point FFT
3.15 Block diagram of PE2 structure for 64-point FFT
3.16 Block diagram of PE1 structure for 64-point FFT
3.17 Block diagram of Pipelined PE3 structure for 64-point FFT
3.18 Block diagram of Pipelined PE2 structure for 64-point FFT
3.19 Block diagram of Pipelined PE1 structure for 64-point FFT
3.20 Circuit diagram of the bit-parallel multiplication by 1/
3.21 Circuit diagram of reduced bit parallel multiplier for 1/
3.22 Butterfly structure of FFT with the help of bit parallel multiplier
3.23 Structure of Complex Multiplier
3.24 Structure of Proposed Reduced Complex Multiplier
3.25 Architecture of 64-point R2SDF FFT
4.1 Classification of Error Detection and Correction (EDC) codes
4.2 Parity bit calculation
4.3 Flow chart of Bit Placement Strategy
5.1 Synthesis result of PE1 to determine the Slice and LUT utilization
5.2 Synthesis result of PE2 to determine the Slice and LUT utilization
5.3 Synthesis result of PE3 to determine the Slice and LUT utilization
5.4 Synthesis result of Pipelined PE1 to determine the Slice and LUT utilization
5.5 Synthesis result of Pipelined PE2 to determine the Slice and LUT utilization
5.6 Synthesis result of PE3 to determine the Slice and LUT utilization
5.7 Performances of proposed pipelined PE1, PE2 and PE3 processors
5.8 Synthesis result of proposed reduced complex multiplier to determine the Slice and LUT utilization
5.9 Synthesis result of proposed reduced complex multiplier to determine the delay consumption
5.10 Synthesis result of Proposed R2SDF FFT by using Pipelined PEs and Reduced Complex Multiplier to determine the Slice and LUT utilization
5.11 Synthesis result of Proposed R2SDF FFT by using Pipelined PEs and Reduced Complex Multiplier to determine the delay consumption
5.12 Performances of proposed and traditional R2SDF FFT
5.13 Simulation result of proposed hamming (12, 8) SEC-TAED code
5.14 Simulation result of hamming (12, 8) SEC-TAED error-less data transmission: Status displayed as “No error”
5.15 Simulation result of hamming (12, 8) SEC-TAED code with single bit flipping: Status displayed as “SEC”
LIST OF SYMBOLS AND ABBREVIATIONS
ADC - Analog to Digital Converter
ADSL - Asymmetric Digital Subscriber Line
ANN - Artificial Neural Network
ASIC - Application Specific Integrated Circuit
AVD - Adaptive Viterbi Decoder
BER - Bit Error Rate
BPSK - Binary Phase Shift Keying
CFFT - Complex Fast Fourier Transform
CLB - Configurable Logic Block
CMA - Cached-Memory Architecture
CP - Cyclic Prefix
CPI - Clocks Per Instruction
CPU - Central Processing Unit
CRC - Cyclic Redundancy Check
DAB - Digital Audio Broadcast
DAC - Digital to Analog Converter
DAED - Double Adjacent Error Detection
DBPSK - Differential Binary Phase Shift Keying
DFT - Discrete Fourier Transform
DIF-FFT - Decimation in Frequency Fast Fourier Transform
DIT-FFT - Decimation in Time Fast Fourier Transform
DQPSK - Differential Quadrature Phase Shift Keying
DSP - Digital Signal Processor
DVB - Digital Video Broadcasting
DWT - Discrete Wavelet Transform
ECC - Error Correcting Codes
EDC - Error Detection and Correction
FB - Feedback
FDMA - Frequency Division Multiple Access
FEC - Forward Error Correction
FF - Feedforward
FFT - Fast Fourier Transform
Fig. - Figure
FPGA - Field Programmable Gate Array
GSM - Global System for Mobile Communication
HDL - Hardware Description Language
IC - Integrated Circuit
ICI - Inter-Carrier Interference
IDWT - Inverse Discrete Wavelet Transform
IEEE - Institute of Electrical and Electronics Engineers
IFFT - Inverse Fast Fourier Transform
IOB - Input/Output Blocks
IoT - Internet of Things
IPC - Instructions Per Cycle
ISI - Inter-Symbol Interference
LDPC - Low Density Parity Check
LED - Light Emitting Diode
LTE - Long Term Evolution
LUT - Look-Up Table
MAC - Multiplication and Accumulation
MANET - Mobile Ad-hoc Network
MBU - Multiple Bit Upset
MCM - Multi-chip Module
MCM - Multiple Constant Multiplications
MFCC - Mel Frequency Cepstral Coefficient
MIMO - Multiple-Input Multiple-Output
MIPS - Million Instructions Per Second
MMSE - Minimum Mean Square Error
MOS - Metal Oxide Semiconductor
MVS - Multiple Virtual Storage
OFDM - Orthogonal Frequency Division Multiplexing
OOK - On-Off-Keying
PAPR - Peak-to-Average Power Ratio
PC - Program Counter
PCB - Printed Circuit Board
PE - Processing Element
PSK - Phase Shift Keying
PTS - Partial Transmit Sequence
QAM - Quadrature Amplitude Modulation
QPSK - Quadrature Phase Shift Keying
R22MDC - Radix-2^2 Feedforward Multipath Delay Commutator
R2MDC - Radix-2 Multipath Delay Commutator
R2SDF - Radix-2 Single-path Delay Feedback
R4MDC - Radix-4 Multipath Delay Commutator
R4SDF - Radix-4 Single-path Delay Feedback
RAM - Random Access Memory
RF - Radio Frequency
RFFT - Real-valued Fast Fourier Transform
ROM - Read Only Memory
SCA - Software Communication Architecture
SCs - Sub-Carriers
SDC - Silent Data Corruption
SDC - Single Delay Commutator
SDR - Software Defined Radio
SEC - Single Error Correction
SMU - Survivor Memory Unit
SNR - Signal to Noise Ratio
SoC - System on Chip
SRAM - Static Random Access Memory
TAED - Triple Adjacent Error Detection
TDMA - Time Division Multiple Access
TTM - Time to Market
TTV - Time to Volume
ULSI - Ultra Large Scale Integration
USRP - Universal Software Radio Peripheral
VLSI - Very Large Scale Integration
VMS - Virtual Memory System
Wi-Fi - Wireless Fidelity
WLAN - Wireless Local Area Network
WPT - Wave Pipelining Technique
WSS - Wide Spread Spectrum
ZF - Zero Forcing
CHAPTER 1
MOBILE ADHOC NETWORK
1.1 Introduction to Mobile ADHOC Networks
A Mobile Ad-hoc Network (MANET) is a self-configuring,
infrastructure-less, independent network for wireless mobile
telecommunication applications. Its minimal configuration and
quick deployment make a MANET suitable for emergency
situations such as natural disasters and military conflicts.
Dynamically reconfigurable hardware architecture enables ad-hoc
networks to be formed quickly. In this research work, MANET is
reconfigured through Very Large Scale Integration (VLSI) system
design implementation. Orthogonal Frequency Division
Multiplexing (OFDM) is one of the architectures used in MANET.
OFDM consists of a Fast Fourier Transform (FFT) block and a
channel encoding/decoding block for data communication. The
simulation results are validated using ModelSim 6.3C. Synthesis
performance is evaluated with the Xilinx 10.1i design tool. The
proposed reconfigurable architecture, comprising the FFT and
decoding blocks, helps to increase the efficiency of MANET in
terms of area, delay and power.
1.2 Orthogonal Frequency Division Multiplexing (OFDM) in
Mobile Ad-hoc Network (MANET)
OFDM is an emerging technique in wireless local area networks that is
well suited to ad hoc networks. OFDM can be exploited in MANET to
improve energy and speed performance. Mobile nodes in a MANET
communicate directly over wireless links when they are within radio
range of each other. If the destination mobile node is out of range,
other nodes between the source and the destination act as routers and
relay the information. Such networks are referred to as multi-hop ad hoc
networks. Single-hop and multi-hop ad hoc networks are illustrated in
Figure 1.1 and Figure 1.2.
Figure 1.1 Single hop ad hoc network (node labels: Source, Destination 1, Destination 2, Destination 3)
Figure 1.2 Multi-hop ad hoc network (node labels: Source, Node (Router), Destination)
Mobile ad hoc networks are self-configuring, infrastructure-less
networks consisting of mobile devices and routers that support mobility
and organize themselves arbitrarily. They require an extremely flexible
technology for establishing communication between source and
destination nodes.
Mobile environments pose several challenges: limitations of the
wireless network, variable-capacity links, data loss due to transmission
errors, limited communication bandwidth, frequent disconnections or
partitions, and the broadcast nature of the communications. Mobility
imposes further limitations, such as dynamically changing routers or
topologies and a lack of mobility awareness, as noted by Malik
Nasereldin Ahmed [26]. Limitations of the mobile devices themselves,
such as capacity and battery lifetime, create further problems for
transmission.
OFDM is both a multiplexing technique and a modulation technique. It
is a multi-carrier transmission technique in which a single high-rate
data stream is divided into a number of lower-rate streams that are
transmitted simultaneously over narrow sub-channels. OFDM mitigates
Inter-Symbol Interference (ISI), Inter-Carrier Interference (ICI) and
faulty transmissions between source and destination nodes. In addition,
OFDM is used to improve the performance of MANET in terms of energy,
power consumption, time consumption and throughput for the
transmission of information between source and destination. OFDM is
combined with MANET to improve efficiency in various respects, as
described by Malik Nasereldin Ahmed [26] and Abdeldime M.S.
Abdelgader [1]. The physical layer model of MANET is illustrated in
Figure 1.3, which shows how OFDM contributes to MANET. In OFDM, most
approaches to combating ISI and ICI rely on interference cancellation
and frequency synchronization. OFDM reduces equalization complexity by
using an Inverse Fast Fourier Transform (IFFT) at the transmitter and a
Fast Fourier Transform (FFT) at the receiver, which converts the
wide-band signal into N narrow-band flat-fading signals, as explained in
Moose, Paul S S [38], Yang, Hongwei [56] and Malik Nasereldin
Ahmed [26].
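The IFFT-at-transmitter and FFT-at-receiver arrangement described above can be illustrated with a minimal software sketch. It is not the hardware design of this thesis: the symbol values, the block size N = 8 and the helper name dft are assumptions chosen for illustration, the channel is assumed ideal, and a direct O(N^2) transform stands in for the FFT.

```python
import cmath

def dft(x, inverse=False):
    """Direct O(N^2) discrete Fourier transform (scaled by 1/N when inverse)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(x[m] * cmath.exp(sign * 2j * cmath.pi * k * m / n)
                  for m in range(n))
        out.append(acc / n if inverse else acc)
    return out

# Hypothetical QPSK symbols, one per sub-carrier (N = 8).
symbols = [1+1j, 1-1j, -1+1j, -1-1j, 1+1j, -1-1j, 1-1j, -1+1j]

tx_time = dft(symbols, inverse=True)  # transmitter: IFFT step
rx_symbols = dft(tx_time)             # receiver: FFT step over an ideal channel

# Round away floating-point error to recover the transmitted symbols.
recovered = [complex(round(s.real), round(s.imag)) for s in rx_symbols]
print(recovered == symbols)
```

Each entry of symbols rides on its own narrow-band sub-carrier, so the receiver can treat the N outputs of the FFT independently.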
Figure 1.3 Physical layer model of MANET
From the above description, it is clear that OFDM is a promising
technique for supporting cooperative transmission in MANET. To
achieve a large diversity gain for combating frequency-selective and
fast time-varying fading, and to tolerate imperfect synchronization
among different mobile nodes, an asynchronous cooperative transmission
scheme is developed in Quan Yu [41] using distributed unitary
space-frequency coded OFDM (USFC-OFDM), which provides better
reliability and robustness and has lower decoding complexity, with no
need for channel estimation.
1.3 Architectural Approach of OFDM
The structure of the OFDM transmitter and receiver is illustrated in
Figure 1.4. The encoder and decoder of the OFDM transmitter and
receiver act as channel encoder and channel decoder, which convert the
source signals into a set of binary information. The Fast Fourier
Transform is a key technique in OFDM; it is implemented on both the
sender side and the receiver side for efficient communication with
narrow bandwidth. The FFT is used to convert time-domain signals into
frequency-domain signals and vice versa.
Figure 1.4 Block diagram of OFDM System
(Blocks in Figure 1.4. OFDM Transmitter: Serial Data Source, Channel
Encoder, Serial to Parallel, IFFT, Parallel to Serial, Cyclic Prefix.
OFDM Receiver: Remove Cyclic Prefix, Serial to Parallel, FFT, Parallel
to Serial, Channel Decoder, Serial Data Source. The two sides are
connected through the Channel.)
A convolutional encoder is commonly used as the channel encoder in
wireless transmission techniques. The encoder performs a convolution
of the input stream with an impulse response. The serial-to-parallel
unit converts the serial output of the channel encoder into parallel
form so that all inputs reach the IFFT unit at the same time. The IFFT
converts the frequency-domain symbols into a time-domain signal. The
parallel output of the IFFT processor is then converted back into serial
form. A cyclic prefix, a guard interval formed by prefixing samples to
the signal, is used to eliminate ISI and ICI. On the receiver side of the
OFDM system, the reverse operations are performed to retrieve the
original information signal.
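The role of the cyclic prefix in this chain can be sketched in a few lines. The two-tap channel, the prefix length, the symbol values and all function names below are assumptions made for illustration only; noise is omitted and a direct DFT replaces the FFT hardware.

```python
import cmath

def dft(x, inverse=False):
    """Direct O(N^2) discrete Fourier transform (scaled by 1/N when inverse)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(x[m] * cmath.exp(sign * 2j * cmath.pi * k * m / n)
                  for m in range(n))
        out.append(acc / n if inverse else acc)
    return out

def add_cyclic_prefix(block, cp_len):
    """Copy the last cp_len samples of the block in front of it."""
    return block[-cp_len:] + block

def channel(signal, taps):
    """Linear convolution with the channel taps, truncated to the input length."""
    return [sum(taps[k] * signal[n - k] for k in range(len(taps)) if n - k >= 0)
            for n in range(len(signal))]

N, CP = 8, 2
taps = [1.0, 0.5]  # hypothetical two-path (multipath) channel
symbols = [1+1j, 1-1j, -1+1j, -1-1j, 1+1j, -1-1j, 1-1j, -1+1j]

tx = add_cyclic_prefix(dft(symbols, inverse=True), CP)  # IFFT, then prefix
rx = channel(tx, taps)[CP:]                             # channel, then remove prefix

# With the prefix in place the channel acts as a circular convolution,
# so one complex division per sub-carrier undoes it.
H = dft(taps + [0.0] * (N - len(taps)))
equalized = [y / h for y, h in zip(dft(rx), H)]
recovered = [complex(round(s.real), round(s.imag)) for s in equalized]
print(recovered == symbols)
```

The sketch works because the prefix (2 samples) is at least as long as the channel memory (1 sample); a shorter prefix would let the previous samples leak into the block.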
Its suitability for multipath fading channels and its capability for
parallel/pipelined signal processing make OFDM a promising technique
for next-generation wide-band communication systems. The modulation
and demodulation of an OFDM system can be implemented efficiently
with the IFFT and FFT. OFDM-based communication systems need high
performance in both power consumption and throughput. This
requirement can be met by an efficient IFFT/FFT implementation. This
thesis addresses the problem of designing an efficient
application-specific FFT processor for OFDM-based wide-band
communication systems. The functionality of the OFDM scheme is
represented in Figure 1.4, which indicates the digital implementation of
OFDM modulators/demodulators with respect to the Discrete Fourier
Transform (DFT). In this work, an FPGA implementation of the FFT is
regarded as a better solution than an Application-Specific Integrated
Circuit (ASIC) implementation with regard to area occupancy and power
consumption.
1.4 OFDM Systems based on Encoding/Decoding
An OFDM system consists of a channel encoder and a channel decoder
for encoding and decoding. The source encoder and source decoder
convert analog signals into digital signals and digital signals into
analog signals, respectively. The purpose of the channel encoder and
decoder in an OFDM system is to transmit multiple discrete signals over
a single channel. Two types of binary codes are available for the
channel encoder and channel decoder of OFDM systems: block codes and
convolutional codes.
Block codes include both linear and cyclic codes. Linear block codes
are error correction codes that encode data in blocks; any linear
combination of code words is itself a code word. Linear codes are used
in forward error correction and in transmitting symbols (i.e. bits) on a
communication channel. A cyclic code is a block code in which every
cyclic shift of a code word is another code word of the same code.
Convolutional codes, on the other hand, are regarded as one of the best
choices for the encoding part.
Various decoders are available to decode the digital inputs, such as
the Hamming decoder, Viterbi decoder, Adaptive Viterbi Decoder (AVD),
Low Density Parity Check (LDPC) decoder, Cyclic Redundancy Check
(CRC) decoder, Bose, Ray-Chaudhuri, Hocquenghem (BCH) decoder and
Reed-Solomon decoder. Most of these decode block codes, which may be
linear or cyclic error correction codes; the Viterbi decoders are used
for convolutional codes. Among these, Hamming encoders and decoders
are regarded in this work as the best Error Correcting Code (ECC)
technique for Very Large Scale Integration (VLSI) implementation.
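As a concrete instance of a linear block code, the textbook (7, 4) Hamming code can be sketched as follows. This is the standard single-error-correcting construction with parity bits at positions 1, 2 and 4, not the extended (12, 8) SEC-TAED code proposed later in this thesis; the function names are assumptions for illustration.

```python
def hamming74_encode(data):
    """Encode 4 data bits into a 7-bit code word [p1, p2, d1, p3, d2, d3, d4]."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4  # checks positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # checks positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # checks positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]

def hamming74_decode(code):
    """Correct at most one flipped bit; return (data bits, syndrome)."""
    c = list(code)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 + 2 * s2 + 4 * s3  # 1-based error position, 0 means no error
    if syndrome:
        c[syndrome - 1] ^= 1         # flip the erroneous bit back
    return [c[2], c[4], c[5], c[6]], syndrome

word = hamming74_encode([1, 0, 1, 1])
corrupted = list(word)
corrupted[4] ^= 1                    # single-bit error at position 5
print(hamming74_decode(corrupted))
```

The syndrome is computed with three XOR trees, which is one reason Hamming decoders map well onto small combinational hardware.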
1.5 Challenges in VLSI
Process variation: When lithographic techniques are used in IC
fabrication, it is difficult to maintain accurate doping concentrations.
The fabricated wires are prone to errors in their geometrical
dimensions and electrical characteristics.
Strict design rules: IC scaling creates problems in the lithography
and etching processes. As a result, the design rules for IC layout
become tedious. The situation is worse for custom integrated circuits.
Hence designers move to automated layout tools. The big disadvantage of
automated tools is that they do not produce the most efficient layout;
squeezing out the last bit of available area is possible only with
manual layout.
Timing: Clock frequencies keep scaling up with the advent of new
fabrication techniques. This leads to skew in the clock signal
distributed across the chip. To overcome this problem, multi-core and
multiprocessor chips are fabricated. The functionality of a single-core
processor at a high clock frequency can be achieved with a multi-core
processor at a lower frequency.
Success rate: Die size keeps shrinking as fabrication technology
improves, and wafer size keeps increasing to lower manufacturing cost.
Mask prices rise as technology scales down. Because the mask involves a
high non-recurring cost, first-pass silicon success is needed, without
several spin cycles to find errors in silicon. Several new design
philosophies have to be developed to meet this goal.
1.6 Stages in VLSI Design:
Schematic Entry: The circuit is realized in the form of a netlist in
this step. The netlist is built from gates, transistors and
interconnects. The behaviour of the designed circuit is checked by
simulation.
Physical Design: The netlist is converted into its geometrical
representation using predefined rules called lambda rules. The result
of this step is called a layout. This step is further divided into
sub-steps: circuit partitioning, floor planning and placement, routing,
layout compaction, and extraction and verification.
Logic Verification and Implementation: After physical design and
routing, the logic of the design must be verified using suitable
simulation tools, and the design is implemented on a suitable Field
Programmable Gate Array (FPGA) board.
Packaging: The chips are put together on a Multi Chip Module
(MCM) or a Printed Circuit Board (PCB) to obtain the final implemented
design.
1.7 Performance Measurements of VLSI
VLSI design is the process of finding an optimal point in a
multidimensional space. The obvious trade-offs in VLSI design involve
hardware utilization, timing constraints, operating frequency and
power consumption. Robustness and trade-offs are the two important
issues for VLSI designers. The trade-offs include complexity, Time to
Market (TTM), Time to Volume (TTV), Instructions per Clock cycle
(IPC), chip size, frequency and power performance. Time to Volume
(TTV) is the most important trade-off to consider.
There are numerous VLSI design metrics that affect the successful
design of a VLSI chip. The primary design metrics are as follows:
Area: This includes the size of the die and the number of Look-Up
Tables (LUTs), Flip-Flops (FFs) and slices used in the design. This
metric also relates to cost and profit. The performance of the circuit
depends on the wire delays. There are standard methods to estimate
area from the schematic or RTL design. For random/control logic,
estimation is done using a synthesis tool, cell area or preliminary
placement. For data-path structures, estimation combines
regular-structure and random-logic techniques.
Speed/Delay: This covers the switching time of the devices used
(transistors, FFs, etc.) and how fast the design can execute. The paths
through the design determine the time required for signal propagation.
Longer computational paths take more time to execute, and simpler paths
execute in a shorter time. The designer can improve system speed and
other performance measures by shortening the longest (critical)
computational path without changing system functionality.
Power: This measures the energy consumed to operate the circuit over
a certain number of clock cycles. There are two primary issues
associated with power: power delivery and power extraction. Power
delivery is the ability to supply the voltage and current needed to run
the chip; power extraction is the ability to remove the heat generated
by the chip.
The VLSI design metrics can be measured using a synthesis tool such
as Xilinx or Altera Quartus II Web Edition. Area utilization and delay
are measured directly by the synthesis tool. Measuring power
consumption additionally requires a simulation tool such as ModelSim.
These design metrics yield chip attributes such as the number of
instructions, frequency and Area-Delay Product (ADP).
1.8 Need for this study
Mobile ad hoc networks use OFDM for wireless data transmission. In
OFDM, several carriers are packed into a band. Data is transmitted
serially within each band at a single frequency and in parallel across
the several bands. Direct parallel transmission requires N oscillators
for N data transmission channels, which is a disadvantage. Taking the
IFFT of the input data stream implicitly treats the input as a
spectrum, i.e. a band of frequencies, and so avoids the need for N
oscillators to generate N bands. In conventional OFDM, the spectrum
produced may be subject to Inter-Symbol Interference. To avoid
Inter-Symbol Interference, the IFFT architecture has to be tuned
periodically. This is the need for a VLSI-based reconfigurable
architecture for mobile ad hoc networks, and it is the motivation
behind this study.
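The claim that an IFFT replaces N oscillators can be checked numerically. The sketch below, under illustrative assumptions (hypothetical symbol values, N = 8, function names invented here), builds one block of samples twice: once by summing N explicitly generated complex carriers, and once with a recursive radix-2 inverse FFT.

```python
import cmath

def oscillator_bank(symbols):
    """Sum of N explicitly generated complex carriers, one per data symbol."""
    n = len(symbols)
    samples = []
    for t in range(n):
        total = 0
        for k, amplitude in enumerate(symbols):
            carrier = cmath.exp(2j * cmath.pi * k * t / n)  # k-th oscillator at sample t
            total += amplitude * carrier
        samples.append(total / n)
    return samples

def ifft_unscaled(x):
    """Recursive radix-2 inverse FFT without the 1/N factor (len(x) a power of two)."""
    n = len(x)
    if n == 1:
        return list(x)
    even = ifft_unscaled(x[0::2])
    odd = ifft_unscaled(x[1::2])
    out = [0] * n
    for k in range(n // 2):
        w = cmath.exp(2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + w
        out[k + n // 2] = even[k] - w
    return out

symbols = [1+1j, 1-1j, -1+1j, -1-1j, 1+1j, -1-1j, 1-1j, -1+1j]  # hypothetical data
direct = oscillator_bank(symbols)
via_ifft = [v / len(symbols) for v in ifft_unscaled(symbols)]
print(max(abs(a - b) for a, b in zip(direct, via_ifft)))
```

The printed difference is at the level of floating-point rounding, which shows that the transform produces the same samples as the oscillator bank while needing only on the order of N log N operations.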
The Fast Fourier Transform (FFT) is a critical block that is widely
used in digital signal processing (DSP) applications. The FFT enables
efficient transformation between the time domain and the frequency
domain for a sampled signal. Various FFT processors can be used for
hardware implementation; they can be classified into pipeline and
memory-based architectures. The best existing Radix-2 Single-path
Delay Feedback (R2SDF) FFT is recognized as a single processing
element (PE) approach, so its power consumption and hardware cost are
both lower than those of the other architectures. An effective complex
multiplier structure is used in the R2SDF FFT to perform complex
multiplication. Hence the existing R2SDF FFT offers flexibility and
high speed. In addition, Hamming codes are available to detect and
correct a single error in data communication systems. To detect double
and triple bit errors, a Hamming code can be extended with additional
parity bits. The best existing SEC-TAED Hamming code can correct a
single error and detect a triple adjacent error with the help of an
extended Hamming code.
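For readers unfamiliar with radix-2 FFTs, the butterfly arithmetic can be sketched in software. The example below is a recursive decimation-in-frequency (DIF) formulation, assumed here because single-path delay feedback pipelines are commonly built on DIF butterflies; it does not model the delay feedback registers or pipelining of the R2SDF hardware, and the input data and function names are illustrative only.

```python
import cmath

def fft_dif(x):
    """Recursive radix-2 decimation-in-frequency FFT (len(x) must be a power of two)."""
    n = len(x)
    if n == 1:
        return list(x)
    half = n // 2
    # Butterfly: the sum feeds the even outputs; the difference, multiplied
    # by a twiddle factor, feeds the odd outputs.
    upper = [x[i] + x[i + half] for i in range(half)]
    lower = [(x[i] - x[i + half]) * cmath.exp(-2j * cmath.pi * i / n)
             for i in range(half)]
    out = [0] * n
    out[0::2] = fft_dif(upper)
    out[1::2] = fft_dif(lower)
    return out

def dft(x):
    """Direct O(N^2) DFT used only as a reference."""
    n = len(x)
    return [sum(x[m] * cmath.exp(-2j * cmath.pi * k * m / n) for m in range(n))
            for k in range(n)]

x = [complex(i % 5, -(i % 3)) for i in range(64)]  # hypothetical 64-point input
error = max(abs(a - b) for a, b in zip(fft_dif(x), dft(x)))
print(error < 1e-9)
```

A 64-point transform needs six such butterfly stages, which is why a pipelined design has one processing element per stage.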
The disadvantages of the existing R2SDF FFT architecture are large
area, delay and power, and limited throughput. The existing R2SDF FFT
architecture also cannot be parallelized. Its processing elements need
more delay to carry out the FFT computation because the inputs are not
synchronized. Its complex multiplier structure uses a large number of
adders and multipliers to perform complex multiplication. Hence the
existing R2SDF FFT requires more hardware complexity, area and delay
for FFT computation. The existing SEC-TAED Hamming code technique
requires more parity bits to handle single and triple adjacent errors,
and it has lower detection efficiency.
In order to overcome these disadvantages, a pipelined PE based
R2SDF FFT and new SEC-TAED Hamming codes are proposed in this
research work through VLSI implementation. The proposed R2SDF FFT
consists of pipeline based PE structures and a reduced complexity
multiplier. The pipelined PE structures provide synchronization
between the input signals of the R2SDF FFT. Further, the reduced
complex multiplier uses fewer adder and multiplier structures. In
addition, the bit replacement algorithm is realized to improve the
detection efficiency of the Hamming SEC-TAED codes.
1.9 Objectives
Wireless Mobile Ad hoc Networks (MANETs) have found a
significant place in the growth of technologies. OFDM in a MANET
provides communication between devices. OFDM transmits data using a
set of FFT algorithms. The algorithm is implemented using PEs, the
complex multiplier and the Radix-2 Single-path Delay Feedback (R2SDF)
FFT. The Hamming SEC-TAED code is used for error detection. The
objectives of this research work are as follows:
1) To increase the speed of Processing Element (PE) of FFT
architectures.
2) To reduce the area and delay of Complex Multiplier in FFT.
3) To increase the efficiency of Radix-2 Single-path Delay Feedback
(R2SDF) FFT.
4) To introduce Reconfigurable architecture for complex multiplier.
5) To increase the detection probability of Single Error Correction
and Triple Adjacent Error Detection (SEC-TAED).
1.10 Methodology Adopted
The processing elements in the FFT architecture are modified with
the pipelining technique to increase the processing speed of ad hoc
networks. The asynchronous arrival of inputs in the existing PE
structure leads to more delay in the FFT computation. To overcome this
problem, a register unit is added at the end of the PE architecture. In
the register unit, flip-flops are used to synchronize all incoming
signals.
To reduce the area and delay of the network, the complex multiplier
architecture is reconfigured. The existing complex multiplier consists
of a large number of adder and multiplier units, which leads to more
area and delay when performing the multiplication for the FFT. To
overcome this problem, the expressions of the existing complex
multiplier are simplified using the Common Subexpression technique.
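The effect of sharing a common subexpression in a complex product can be
sketched in Python. This is only an illustration of the general idea and
not the exact multiplier proposed in this work: the function names are
hypothetical, and the particular three-multiplier factorization shown is
one well-known choice.

```python
def complex_multiply_direct(a, b, c, d):
    """(a + jb)(c + jd) using 4 real multiplications and 2 additions."""
    return a * c - b * d, a * d + b * c


def complex_multiply_shared(a, b, c, d):
    """Same product using 3 real multiplications and 5 additions.

    The term k1 = c * (a + b) is computed once and reused for both the
    real and the imaginary output.
    """
    k1 = c * (a + b)   # shared subexpression
    k2 = a * (d - c)
    k3 = b * (c + d)
    return k1 - k3, k1 + k2   # (real part, imaginary part)
```

For example, `complex_multiply_shared(3, 4, 5, -2)` returns `(23, 14)`,
the same as the direct form, while trading one multiplier for three
extra adders.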
To increase the efficiency of the Radix-2 Single-path Delay
Feedback (R2SDF) FFT, the modified PE structures with the pipelining
technique and the reduced complex multiplier are used in the R2SDF FFT
[7]. The R2SDF FFT consists of a multiplier, PEs and delay elements. In
place of the existing complex multiplier and PE structure, the proposed
reduced complex multiplier and pipeline based PE structures are
incorporated into the R2SDF FFT.
To maximize the probability of triple adjacent error detection with
fewer parity bits, the Bit Replacement algorithm is used in the
SEC-TAED Hamming code. In the Bit Replacement algorithm, the bits of
the encoded codeword are re-ordered under a certain condition so as to
maximize the number of triple adjacent errors that are detected while
still correcting a single error. The proposed re-ordered combination is
designed with fewer parity bits.
1.11 Summary
In this research, the reduced complex multiplier, the pipelined PE
architectures, the pipelined PE structure based R2SDF FFT with the
reduced complex multiplier and the proposed SEC-TAED Hamming codes are
designed using Verilog Hardware Description Language (HDL), and the
simulation results are verified using the ModelSim 6.3c tool. Synthesis
of all proposed methods is done for the Xilinx Spartan-3 XC3S200 FPGA
(package: PQ208, speed grade: -5) using the Xilinx ISE 10.1i design
tool.
CHAPTER 2
REVIEW OF LITERATURES
The literature survey focuses on the design of Fast Fourier
Transformation (FFT) techniques, architecture oriented FFT models and
fixed benchmarks for the decoding strategy. Architecture oriented
Orthogonal Frequency Division Multiplexing (OFDM) designs are also
presented.
2.1 Overview of OFDM
The transmitter side of OFDM consists of a Source Encoder, which
samples the input analog signal and converts it into a discrete one; a
Channel Encoder, which encodes the discrete input signal using values
of the same kind; an Inverse Fast Fourier Transformation (IFFT) block,
which converts frequency samples into time samples; and a cyclic prefix
block, which adds extra samples on the prefix or suffix side so that
the original signal can be recovered after transmission.
The receiver side of OFDM consists of cyclic prefix removal, Fast
Fourier Transformation (FFT), a Channel Decoder and a Source Decoder.
Quadrature Amplitude Modulation (QAM) is frequently used for
better Bit Error Rate (BER) and Signal to Noise Ratio (SNR). Hamming
codes are widely used to improve the error detection probability. A
Single-path Delay Feedback (SDF) based IFFT/FFT structure is used
for converting the time domain signal into a frequency domain signal
and vice versa.
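The IFFT, cyclic prefix and FFT stages of this chain can be sketched in
Python. This is a minimal illustration under stated assumptions, not the
architecture developed in this work: source and channel coding are
omitted, the channel is ideal, BPSK symbols stand in for the modulated
data, the cyclic prefix length of 2 is arbitrary, and all function names
are hypothetical. A direct DFT is used in place of a fast algorithm for
brevity.

```python
import cmath


def dft(x, inverse=False):
    """Direct DFT (or inverse DFT with 1/N scaling) of a list of samples."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(x[m] * cmath.exp(sign * 2j * cmath.pi * m * k / n)
                  for m in range(n))
        out.append(acc / n if inverse else acc)
    return out


def ofdm_transmit(symbols, cp_len):
    """Map sub-carrier symbols to time samples and prepend a cyclic prefix."""
    time_samples = dft(symbols, inverse=True)
    prefix = time_samples[len(time_samples) - cp_len:]  # copy of the tail
    return prefix + time_samples


def ofdm_receive(samples, cp_len):
    """Drop the cyclic prefix and return to the frequency domain."""
    return dft(samples[cp_len:])


bpsk = [1, -1, 1, 1, -1, -1, 1, -1]           # assumed sub-carrier symbols
sent = ofdm_transmit(bpsk, cp_len=2)          # 8 samples + 2 prefix samples
recovered = ofdm_receive(sent, cp_len=2)      # matches bpsk up to rounding
```

Over an ideal channel the recovered symbols equal the transmitted ones
up to floating-point rounding, which is the property the transmitter and
receiver blocks rely on.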
To increase the performance of OFDM systems in terms of VLSI
concerns such as high speed, low power consumption and small area, two
supportive mechanisms are developed in our research work.
I. A pipeline technique based Radix-2 Single-path Delay
Feedback (R2SDF) FFT to increase the efficiency of the
frequency transformation technique of the OFDM system.
II. A Single Error Correction (SEC) and Triple Adjacent Error
Detection (TAED) extended Hamming code to increase the
error detection probability.
The previous works supporting these two proposals are briefly
discussed in this chapter. Architecture oriented OFDM system works are
also considered for our future work.
2.2 Architectural Oriented OFDM Models
The transmitter side of OFDM consists of source and channel
encoders and an IFFT, and the receiver side consists of source and
channel decoders and an FFT. Hence, the frequency transformation
technique (FFT/IFFT) and the encoding and decoding techniques are very
important tools in OFDM based communication system design. Besides the
FFT/IFFT and encoding/decoding techniques, the modulation technique
also plays an important role in an OFDM system.
In the design of Naga Tanuja, K [33], the OFDM architecture is
analyzed and the supporting blocks of OFDM systems are implemented. The
simulation results of the serial to parallel converter, Binary Phase
Shift Keying (BPSK) modulation and frequency transformation techniques
are validated in this study with the help of ModelSim, and the results
are synthesized in Xilinx Project Navigator, ISE 8.2i suite. VHDL is
used to implement and synthesize the scalable Radix-2 N-point FFT
processor. A clear architectural diagram for OFDM systems is presented
in this design.
In the review of Noman, H. M. F [35], a Software Defined Radio
(SDR) system is designed with the help of a reconfigurable mechanism
for OFDM transceivers. A reconfigurable architecture for the
transmitter and receiver sides of OFDM is illustrated. A Universal
Software Radio Peripheral (USRP) board is used for the design of the
SDR. The simplified OFDM structure uses less hardware than the
traditional one. The suggested methods overcome the detrimental effects
of the limited accuracy of the internal reference clock. Results are
analyzed in terms of Signal to Noise Ratio (SNR) values and Bit Error
Rates (BER).
Spread Spectrum (SS) and Multi-Carrier Modulation (MCM)
techniques are recognized as potential techniques for the design of
Cognitive Radio (CR) systems and OFDM systems. The literature of
Sundararajan, M [51] implements the MCM and CR techniques in
MATLAB, and the simulation results are validated using the MATLAB tool.
Various modified FFT algorithms are discussed for cognitive radio
applications. The pruning algorithms achieve a large reduction in
computational complexity, but from the viewpoint of hardware
implementation the transform decomposition technique is more efficient
and flexible.
In the review of Niladri Mandal [34], a novel efficient input zero
traced FFT pruning (IZTFFTP) algorithm based on the DIF radix-2
frequency transformation technique is implemented. The suggested FFT
algorithm is implemented as a high level computer program. It is
similar to the Cooley-Tukey radix-2 FFT algorithm and retains all key
features such as regularity and simplicity, with some programming
modifications and alterations. The procedure for incorporating the
developed input zero traced FFT into OFDM is briefly explained. Mapping
the frequency transformation technique onto the receiver side of the
OFDM system is more difficult than mapping it onto the transmitter
side. Results of the developed FFT structure are reported for various
lengths such as 8-bit, 16-bit, 32-bit, 64-bit, 128-bit, 256-bit,
512-bit and 1024-bit.
2.3 Architectural Oriented FFT Models
The Fast Fourier Transformation (FFT) technique is used to convert
discrete time samples into discrete frequency samples. The normal
algorithm for the Radix-2 FFT has a long computational path for
converting the time domain signal into a frequency domain signal, and
it requires considerable hardware complexity to implement. For timing
critical system design it is essential to reduce the computational path
of the FFT algorithm. This section discusses different architecture
oriented FFT models that improve performance in terms of hardware
complexity, delay and power consumption.
In the study of Zhou, B [59], optimized implementations of two
different pipeline FFT architectures are presented on Xilinx Spartan-3
and Virtex-4 FPGA devices. With the help of feedback and feed-forward
techniques, the computational path can be reduced. Pipelining registers
are used to provide synchronization between the multiple inputs and
outputs of the FFT model. The synthesis results of the normal and
pipeline based FFT techniques are compared for different point sizes
such as 16, 64, 256 and 1024, and for different devices such as Xilinx
Spartan-3, Virtex-4 and Virtex-E. This work achieves high speed in
performing the frequency transformation.
In the review of Paul, S. S [38], the performance of different types
of multipliers is compared for use in the real multiplications of FFT
architectures. The parallel mechanism is provided with the help of
flip-flop and buffer circuits. The flip-flop provides the delay needed
for proper information access. Hence, the hardware complexity and delay
of the FFT computation can be reduced significantly. Results for
different types of multipliers such as the Vedic multiplier, Array
multiplier and Baugh Wooley multiplier are compared. This review
concludes that the Vedic multiplier is the best solution for FFT
calculation. In the design of Sundari, R. M [52], different types of
Vedic multiplication such as Urdhva Tiryakbhyam, Nikhilam
Navatascharamam Dashatah and Anurupye Vedic multiplication are analyzed
for the complex multiplication of the FFT structure. These Vedic
multipliers use the principle of the Vedic Sutras. The computational
delays of the three methods are analyzed and compared in this design.
8x8 multiplication using the Urdhva Sutra algorithm provides a 17.27%
speed improvement over the Nikhilam Sutra algorithm. On the other hand,
8x8 multiplication using the Anurupye Sutra algorithm provides an
18.44% speed improvement over the Urdhva Sutra algorithm and a 32.52%
speed improvement over the Nikhilam Sutra algorithm. This review
concludes that the Vedic Anurupye Sutra algorithm is the best solution
for the complex multiplication of FFT models.
In the review of Salehi, S [45], a pipelined real-valued FFT and a
Hermitian-symmetric IFFT are designed for different input point sizes.
The real FFT structure is developed by transferring twiddle factors to
subsequent stages, such that each stage in the developed Signal Flow
Graph (SFG) contains one column of butterfly units and one column of
twiddle factor blocks, and each column of the flow graph contains only
N samples. This is the key requirement in designing the FFT model.
Hence, the hardware complexity and computational path of the FFT models
are reduced. The results for the Radix-2 FFT and Radix-2^2 FFT
are analyzed with the help of synthesis tools.
In the brief of Ayinala, M [7], a novel scalable architecture is
designed for Real-valued In-place Fast Fourier Transform (RIFFT)
computation. This brief removes the redundant operations from the SFG
of the FFT computation. A new processing element (PE) structure is
introduced in the RIFFT design; it uses two radix-2 butterfly
structures and can process four inputs in parallel. A conflict-free
memory addressing scheme is extended to support multiple parallel PE
structures. The work proposed in this brief reduces the computation
cycles by a factor of 2 for a 256-point RIFFT compared to the normal
radix-2 FFT algorithm while maintaining a lower hardware complexity.
2.3.1 Complex Multiplier Design of FFT Models
In the study of Reddy, K. V. S [44], a Radix-8 64-point FFT/IFFT
algorithm is implemented with the help of a fixed width modified Booth
multiplier. In the normal Radix-2 and Radix-8 FFT algorithms, Read Only
Memory (ROM) is used to store the twiddle factors. The number of Look
up Tables (LUTs) increases because of the ROM in the FFT algorithm. To
eliminate the usage of ROM, a reconfigurable complex multiplier is
used, and the resulting reconfigurable complex multiplier based FFT is
named "ROM-less FFT/IFFT". Single-path Delay Feedback (SDF)
architectures are used in this study to read the 4-point inputs at the
same time. The performance of the fixed width Booth multiplier is
compared for the Radix-2 Single-path Delay Feedback (R2SDF) FFT, the
Radix-2^2 Single-path Delay Feedback (R2^2SDF) FFT and the Radix-2^3
Single-path Delay Feedback (R2^3SDF) FFT.
In the study of Mehta, U. C [30], a Single-path Delay Feedback
(SDF) based 2048-point FFT/IFFT is designed for Wi-Max
applications. A modified ROM module is used in this study to reduce the
storage complexity of the FFT. This design supports variable lengths
from 128 to 2048 points for the FFT/IFFT structure. A shift register is
used to admit half of the input points at a time, and a generalized
complex multiplier is used to perform the complex multiplication of the
FFT computation.
In the studies of Berkeman, A [9] and Berkeman, A [10], a low logic
depth complex multiplier is developed for FFT processors. Generally, a
large number of slices and LUTs is used to perform the multiplication
in FFT processors. For instance, a 4-point FFT requires only real
valued multiplications, but an FFT of 8 or more points requires complex
valued multiplications. Therefore, it is essential to design a complex
multiplier for the FFT processor. Distributed Arithmetic (DA) based
multiplication is used for the complex multiplication. DA based
multiplication is one of the best multiplication methods, in which ROMs
are used for performing particular kinds of multiplications. This
multiplier provides better performance in terms of logic depth and
speed. The limitation of this study is its higher hardware complexity,
since the number of LUTs increases as the storage capacity of the ROM
increases.
In the review of Archana Fande [6], a signed complex multiplier
design is developed on a Field Programmable Gate Array (FPGA). The
real addition, real subtraction, imaginary addition and imaginary
subtraction of variable length FFT processors are compared with the
best existing design. The results of this review are as follows: the
developed complex multiplier of 16-bits, 64-bits, 256-bits and
1024-bits offers a reduction in real multipliers of 20%, 69.23%, 81.62%
and 86.96% respectively, and a reduction in real adders of 78.37%,
86.88%, 90.67% and 92.77% respectively. Hence, a properly designed
complex multiplier provides better performance for FFT processors.
In the brief of Sreekanth Yadav, K [49], a 64-point FFT design is
provided using the Radix-4 algorithm. The designs for the Decimation in
Time (DIT) FFT and the Address Generation Unit (AGU) are developed for
the first time. The results for the butterfly structures, AGU and
control units are simulated and validated using simulation tools.
In the design of Kandhi Srikanth [19], a Radix-4 64-point pipeline
FFT/IFFT processor is developed for 3G and 4G wireless
applications. The periodicity properties of the twiddle factors and a
reconfigurable complex multiplier are used to reduce the ROM size
needed for storing twiddle factor values. This modification of the FFT
architecture reduces the hardware complexity effectively. To further
reduce the slices and LUTs used by the FFT/IFFT computation, the
hardware can be reused. The structure of the complex multiplication for
the FFT architecture is demonstrated in this design.
In the study of Kumar, A [21], the butterfly structure is modified
to improve the performance of the FFT. The butterfly structure consists
of real addition, signed complex multiplication and real subtraction
processes. Radix-4 FFT processors have (3N/4) log_4 N complex
multiplications and 3N log_4 N complex additions. The numbers of
complex multipliers and complex adders, the memory size and the control
logic are compared for different types of FFT architectures such as the
R2SDF FFT, Radix-4 SDF (R4SDF) FFT, Radix-4 Single-path Delay
Commutator (R4SDC) FFT, Radix-2^2 SDF (R2^2SDF) FFT, Radix-2
Multi-path Delay Commutator (R2MDC) FFT and Radix-4 MDC
(R4MDC) FFT. This study concludes that Radix-4 structures provide
better performance than Radix-2 FFT structures in terms of the number
of complex additions and multiplications used.
In the study of Manimaran, A [28], an effective complex multiplier
circuit is developed for 64-point FFT computation. The architecture
proposed in this study completely removes the use of ROM with the
help of a reconfigurable complex multiplier and a bit parallel
multiplier. The bit parallel multiplier uses shifters and adders to
multiply a fractional value by the complex output coming from the
butterfly structures. The complex multiplier architecture uses three
multipliers and five adders to compute the complex multiplication of
the FFT. It offers more advantages than the bit parallel multiplier.
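How shifters and adders can replace a multiplier for a fixed fractional
constant can be sketched as follows. The example is an assumption made
for illustration and is not taken from [28]: it approximates 1/sqrt(2),
the magnitude of the real and imaginary parts of the twiddle factor
W_8^1, by the 8-bit fraction 181/256, and the function name is
hypothetical.

```python
def times_inv_sqrt2(x):
    """Approximate x / sqrt(2) as x * 181 / 256 with shifts and adds only.

    181 = 128 + 32 + 16 + 4 + 1, so x * 181 is a sum of five shifted
    copies of x; the final right shift by 8 divides by 256.
    """
    acc = (x << 7) + (x << 5) + (x << 4) + (x << 2) + x   # x * 181
    return acc >> 8                                       # / 256, truncated
```

With this word length, `times_inv_sqrt2(1000)` gives 707, against an
exact value of about 707.1. A wider constant would reduce the error at
the cost of more adders.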
2.4 Hamming Error Correction Codes
Hamming codes are among the best error correction codes (ECCs):
they detect and correct a single bit error and can detect up to a
double adjacent bit error. Hamming codes are widely used to protect
registers or memories from soft errors. As technology scales, the soft
errors created by radiation particles are more likely to affect more
than one bit when the particles strike an electronic circuit or memory
circuit. This effect is named Multiple Cell Upset (MCU). To prevent an
MCU from causing more than one bit error in a given word, interleaving
is commonly used in registers and memories. However, interleaving
increases the complexity of the memory device and is not suitable for
small memories or content-addressable memories. If interleaving is not
used, MCUs can cause multiple errors in a word that may not even be
detected by a Hamming code. The solution to this problem is therefore
to enhance the Hamming code to increase the detection probability.
In the brief of Sanchez-Macian [47], a bit placement algorithm is
used to increase the detection probability of double adjacent bit
errors. A lexicographic Hamming (12, 8) matrix is used for the decoder
design. The block size of the (12, 8) Hamming matrix is 12 and the
input length is 8; therefore, four parity bits are used to detect a
double adjacent error and correct a single bit error. This approach
addresses the MCU problem caused by radiation in space and terrestrial
communication. Because the block size of the designed Hamming code is
12, there are eleven possible double adjacent bit error patterns for
double adjacent error detection (DAED). The normal order detects only
one of these eleven patterns, whereas the bit re-ordered codeword
detects nine of the eleven. In this way the bit placement algorithm
helps to increase the efficiency of the Hamming decoder. This review
also covers triple adjacent error detection (TAED). For DAED and TAED
together, the normal order provides 9% detection efficiency whereas the
bit re-ordering of this review provides 82% detection efficiency.
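The detection counts quoted for the normal (lexicographic) order can be
checked with a short Python sketch. It assumes that the column of the
parity check matrix at bit position i is the binary value of i, for
i = 1 to 12, and that an adjacent burst counts as detected only when its
syndrome is nonzero and differs from every single-bit syndrome, so that
it is neither missed nor miscorrected. The function name is
hypothetical, and the re-ordered column sequence of [47] is not
reproduced here.

```python
def count_detected_bursts(columns, burst_len):
    """Count adjacent bursts of `burst_len` errors that are detected.

    `columns` holds the parity check matrix columns as integers, one per
    codeword bit position, in transmission order.
    Returns (detected, total_bursts).
    """
    single_bit_syndromes = set(columns)
    total = len(columns) - burst_len + 1
    detected = 0
    for start in range(total):
        syndrome = 0
        for col in columns[start:start + burst_len]:
            syndrome ^= col                  # syndromes of bit errors add (XOR)
        if syndrome != 0 and syndrome not in single_bit_syndromes:
            detected += 1
    return detected, total


lexicographic = list(range(1, 13))           # assumed column order 1..12
double_adjacent = count_detected_bursts(lexicographic, 2)
triple_adjacent = count_detected_bursts(lexicographic, 3)
```

Under these assumptions the sketch reports 1 of 11 double adjacent
bursts and 1 of 10 triple adjacent bursts as detected, that is 2 of 21
or about 9.5%, which is in line with the figures quoted for the normal
order. Passing a different column order to the same function shows how
bit placement changes the count.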
In the study of Sanchez-Macian [46], selective shortening and bit
placement techniques are used for Hamming SEC-DAED and extended
Hamming SEC-DED-TAED codes. The shortening algorithm is the best
algorithm for increasing the detection probability of Hamming codes. In
addition to SEC-DED and TAED, SEC-DED-TAED is developed in this brief.
The approach to handling the codeword is the same; only the bit size is
extended. New parity check matrices for SEC-DAED Hamming codes and
SEC-DED-TAED Hamming codes are developed in this brief. The generation
of the parity check matrix differs from that of the lexicographic
matrix. A lexicographic matrix follows the normal order over the length
of the block, whereas the parity check matrix is generated by combining
the identity matrix of the parity bits' length with the transpose of
the parity matrix generated from the identity matrix of the information
bits' length. Apart from detecting all consecutive errors, the SEC-DAED
Hamming codes can detect 30%-67% of double non-adjacent errors.
Similarly, the SEC-DED-TAED Hamming codes can detect 34%-42% of triple
non-adjacent errors.
In the review of Dutta, A [15], a Multiple Bit Upset (MBU) tolerant
memory is designed using Selective Cycle Avoidance based
SEC-DED-DAEC codes. Previous works can only detect the double
adjacent error, whereas here the double error is also corrected by
avoiding the selective cycles. A circuit for double adjacent bit error
detection and correction is designed for the first time. However, the
developed circuit does not correct all combinations of double adjacent
bit errors. The error detection probability is low when compared to the
previous works.
In the literature of Nutan Shep [37], the conventional Hamming
(7, 4) code is implemented in VLSI. The algorithm for Hamming codes is
implemented using Verilog Hardware Description Language (Verilog HDL)
and Very High Speed Integrated Circuit Hardware Description Language
(VHDL). The minimum Hamming distance of the SEC design is 3, and three
parity bits are used to detect and correct a single bit error.
Similarly, the minimum Hamming distance of the SEC-DAED design is 4,
and four parity bits are used to detect the double adjacent error;
however, these extended Hamming codes can miscorrect an error into an
invalid word. This miscorrection is referred to as Silent Data
Corruption (SDC). When Hamming error correction codes are designed in
Verilog HDL or VHDL, a large number of LUTs is needed to store the
Hamming matrix results. Therefore, it is not sufficient to analyze the
results of Hamming codes in terms of silicon area, delay and power
consumption; instead, the performance is compared in terms of
detection and correction probability.
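Single error correction and the miscorrection of a double adjacent error
can both be seen in a small Python model of the (7, 4) Hamming code.
This is a sketch under stated assumptions rather than the design of
[37]: parity bits are placed at positions 1, 2 and 4, data bits at
positions 3, 5, 6 and 7, and the function names are hypothetical.

```python
def hamming74_encode(data):
    """Encode 4 data bits [d1, d2, d3, d4] into a 7-bit codeword."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4    # checks positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4    # checks positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4    # checks positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def hamming74_decode(word):
    """Return (data bits, syndrome) after correcting at most one bit.

    The syndrome is the XOR of the 1-based positions of all set bits;
    a nonzero value is taken as the position of a single bit error.
    """
    syndrome = 0
    for position, bit in enumerate(word, start=1):
        if bit:
            syndrome ^= position
    corrected = list(word)
    if syndrome:
        corrected[syndrome - 1] ^= 1
    return [corrected[2], corrected[4], corrected[5], corrected[6]], syndrome


codeword = hamming74_encode([1, 0, 1, 1])

one_error = list(codeword)
one_error[4] ^= 1                      # flip position 5
fixed, syn1 = hamming74_decode(one_error)

two_errors = list(codeword)
two_errors[0] ^= 1                     # flip adjacent positions 1 and 2
two_errors[1] ^= 1
wrong, syn2 = hamming74_decode(two_errors)
```

The single error is located by its syndrome and corrected, so `fixed`
equals the original data. The double adjacent error produces the
syndrome of position 3, so the decoder flips a correct bit and returns
wrong data without any warning, which is the Silent Data Corruption
described above.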
In the study of Cha, S [11], a check bit pre-computation method is
used for SEC and DED. The H-matrix of the developed SEC-DED code
is the same as that of the odd-weight-column code during the write
operation and, for the read operation, is obtained by replacing 0's
with 1's in the last row. This design achieves reductions in the number
of gates, latency and power consumption of the ECC processing circuits
of up to 9.3%, 18.4% and 14.1% respectively for 64 information bits in
a word. This literature provides an alternative solution for the design
of SEC-DED codes.
In the review of Cui, Y [12], a Hamming (40, 32) SEC-DED code is
developed to increase the Error Detection and Correction (EDAC)
ability. The Hamming (40, 32) SEC-DED code has 8 parity bits for single
error correction and double error detection. An algorithm based on
mutual expressions is developed to minimize the EDAC circuit area and
delay. The results of the (40, 32) Hamming SEC-DED code are compared to
those of the (39, 32) Hsiao code. The critical path of the encoder and
decoder computation causes more delay and power, but Hamming codes have
a smoother path for the encoding and decoding process than Hsiao codes.
The developed Hamming (40, 32) SEC-DED code offers a 2.97% reduction in
encoding delay compared to the Hsiao (39, 32) error correction code.
In the brief of Noorbasha, F [36], an optimized encoding and
decoding process for Hamming codes is presented using
Complementary Metal Oxide Semiconductor (CMOS) technology.
The Hamming ECC codes are verified using 50nm, 70nm and 90nm
technology. Field Programmable Gate Array (FPGA) implementation
methods for the developed Hamming ECC codes are provided in this brief.
The authors simulated and tested the system and report excellent
performance at 50 GHz. The reliability of the developed Hamming SEC-
DEC codes is measured at voltages of 1V, 0.7V and 0.5V. The decoding
of the Hamming codes gives an accurate result even when the
transmission has a single bit error.
Usually, the reliability of data transmission is improved by
channel coding, which employs forward error correction (FEC)
techniques. An FEC technique can detect and correct a single bit error
with the help of check or redundant bits. These check bits are
determined from the data bits and appended to them to form the codeword
of the original data bits. If Additive White Gaussian Noise (AWGN)
affects only the data bits of the codeword and not the check bits, the
error can easily be detected and corrected. If the check bits are also
corrupted by AWGN, then FEC cannot detect and correct the error.
To overcome this problem, a multidirectional parity code is used in
Manchanda, G [27]. The report of Manchanda, G [27] provides the
MATLAB realization of the encoder and decoder for the multidirectional
parity code with Hamming code. The multidirectional parity code with
Hamming code improves the reliability of data transmission in a
computer network with an acceptable bit overhead of 26.22% and a code
rate of 79.22%. This scheme is very expensive, and the check bits can
still be corrupted by a noisy environment. The multidirectional parity
code with Hamming code can correct four bit errors: three error bits in
the data part and one error bit in the check bits. The receiver
therefore does not need the data to be re-transmitted.
Summary
The study of Manimaran A [28] gives the best overview of
complex multiplier design for FFT processors among the studies
reviewed. Similarly, the studies of Zhou, B [59] and Salehi, S [45]
provide the best overview of pipelined structure based FFT designs. We
adopt the complex multiplier design from the study of Manimaran A [28]
and the pipelining techniques from the studies of Zhou, B [59] and
Salehi, S [45] to design the FFT architectures. We adopt the SDF
architectures to eliminate the ROM of the FFT. In our research work,
these FFT structures are modified to improve the performance of the
frequency transformation technique.
The study of Sanchez-Macian [47] gives the best overview of the bit
placement algorithm for extended Hamming codes, and the study of
Sanchez-Macian [46] gives the best overview of the bit shortening
algorithm for extended Hamming codes. The literature of Nutan Shep [37]
provides a VLSI based design for (7, 4) Hamming codes. We adopt the
study of Sanchez-Macian [47] for the SEC and TAED design. We realize
the bit replacement algorithm for TAED and introduce alternative
replacement procedures to improve the detection probability of TAED.
The literature of Nutan Shep [37] was also considered in order to
understand the problems of implementing Hamming codes in Verilog HDL or
VHDL. In our research work, the bit replacement algorithm based
extended Hamming code is modified to improve the detection probability
of TAED.
From the review of the above literature, it is clear that the R2SDF
FFT is available for the frequency transformation technique and that
extended Hamming codes are available for correcting a single error and
detecting double and triple adjacent errors. It is possible to improve
the performance of the R2SDF FFT and the detection efficiency of
extended Hamming codes with the help of the pipelining mechanism and
the bit replacement algorithm respectively. These two improvements help
to increase the performance of the OFDM system.
CHAPTER 3
HIGH SPEED PIPELINED BASED 64-POINT RADIX-2 SINGLE
PATH DELAY FEEDBACK (R2SDF) FFT
3.1 FFT in MANET
Wireless communication technology has increased the demand
for signal processing operations such as convolution, correlation,
filtering and frequency transformation. Among these operations, the FFT
is a frequency transformation technique that is recognized as having
high potential for wireless communication technologies in terms of
hardware complexity. The FFT is widely used to convert a time domain
signal into a frequency domain signal, and the IFFT is widely used to
convert a frequency domain signal into a time domain signal. These
frequency transformation techniques are used to transmit and
reconstruct the original input signals in OFDM based communication. A
Mobile Ad-hoc Network (MANET) is a type of infrastructure-less wireless
network in which OFDM is used to transmit information signals to the
desired users. OFDM is a multi-carrier transmission scheme in which a
single higher rate data stream is transmitted over a number of lower
rate sub-carriers. FFTs and IFFTs are generally used to analyze and
transmit the frequency characteristics of these lower rate data
streams. FFT/IFFT blocks consume considerable silicon area and power.
The speed of the frequency transformation process is also poor because
of the complicated signal flow graph. In this research work, a
pipelining mechanism is introduced to increase the speed of the
FFT/IFFT processors. The complex multiplier is designed to reduce the
hardware cost and power consumption of the FFT/IFFT processors.
3.2 FFT Algorithm
The FFT is used to analyze the frequency characteristics of a
discrete time signal. Butterfly structures are used to determine the
frequency response of time domain signals in the FFT and the time
response of frequency domain signals in the IFFT. FFT processors can be
classified into two categories: Decimation in Time (DIT) FFT and
Decimation in Frequency (DIF) FFT. Generalized butterfly structures for
the 2-point DIT FFT and 2-point DIF FFT are illustrated in Figure 3.1
and Figure 3.2 respectively, as in Takala, J [54] and Sreekanth Yadav,
K [49]. Each consists of a complex multiplier, a complex adder and a
complex subtractor.
Figure 3.1 Butterfly structure for 2-point DIT FFT
[Figure: butterfly block diagram. Labels: Complex input 1, Complex input 2, Twiddle Factor Multiplication, Complex Adder (+/-), Complex output 1, Complex output 2]
Figure 3.2 Butterfly structure for 2-point DIF FFT
These butterfly structures are generally referred to as Radix-2
structures, because they process two inputs in every time period.
In the case of the 2-point DIT FFT, twiddle factor multiplication,
complex addition and complex subtraction are involved. The DIF FFT is
used to construct the frequency representation of discrete time domain
signals. From these butterflies we can construct 8-point, 16-point,
32-point and 64-point FFT processors.
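The two butterfly types can be written as short Python functions. This
is a behavioural sketch only, using Python complex numbers in place of
fixed-point hardware; the function names are hypothetical. The assumed
ordering is the usual one: the DIT butterfly multiplies by the twiddle
factor before the add and subtract, while the DIF butterfly multiplies
the difference afterwards.

```python
def dit_butterfly(a, b, w):
    """Radix-2 DIT butterfly: twiddle multiplication, then add/subtract."""
    t = w * b
    return a + t, a - t


def dif_butterfly(a, b, w):
    """Radix-2 DIF butterfly: add/subtract, then twiddle on the difference."""
    return a + b, (a - b) * w
```

With a twiddle factor of 1 both functions reduce to the 2-point DFT, for
example `dit_butterfly(1 + 2j, 3 - 1j, 1)` returns `(4 + 1j, -2 + 3j)`.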
3.2.1 Discrete and Fast Fourier Transformation Techniques
The Discrete Fourier Transformation technique is used to convert time
domain signals into frequency domain signals.
The N-point Discrete Fourier Transformation (DFT) of an input sequence
x(n) is defined as:

$$X[k] = \sum_{n=0}^{N-1} x(n)\, W_N^{nk}, \qquad k = 0, 1, 2, \ldots, N-1 \tag{3.1}$$

where $W_N^{nk} = e^{-j 2\pi nk / N}$, k = 0, 1, 2, ..., N-1, is referred to as the
twiddle factor or DFT coefficient. X[k] is the kth harmonic and x(n) is
the nth input sample. This DFT calculation has a computational
complexity of O(N^2) for transforming the time domain signals into
frequency domain signals.
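The definition in equation (3.1) translates directly into code. The
sketch below is a plain Python rendering for illustration, with
hypothetical function names; its two nested loops over n and k make the
O(N^2) cost visible.

```python
import cmath


def twiddle(n_points, exponent):
    """Twiddle factor W_N^exponent = exp(-j * 2 * pi * exponent / N)."""
    return cmath.exp(-2j * cmath.pi * exponent / n_points)


def dft_direct(x):
    """N-point DFT evaluated term by term as in equation (3.1)."""
    n_points = len(x)
    return [
        sum(x[n] * twiddle(n_points, n * k) for n in range(n_points))
        for k in range(n_points)
    ]
```

A unit impulse, `dft_direct([1, 0, 0, 0])`, gives a flat spectrum of
ones, and a constant input, `dft_direct([1, 1, 1, 1])`, gives 4 at k = 0
and values that are zero up to rounding elsewhere.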
In order to overcome this problem, the Fast Fourier Transformation
(FFT) technique was introduced by Cooley and Tukey & Lyon, Douglas A
[28]. By using the Cooley-Tukey FFT, the computational complexity
can be reduced to O(N log_r N). This algorithm is the most universal of
all FFT algorithms, because any factorization of N is possible. In most
Cooley-Tukey FFTs, the transform length is a power of a basis r, i.e.,
N = r^v for a positive integer v. Hence, these Cooley-Tukey algorithms
are also referred to as radix-r algorithms. The most commonly used are
those of basis r=2 and r=4.
The Cooley-Tukey algorithm follows the divide and conquers
approach to determine the frequency transformation of original input
signals. Fast Fourier Transformation algorithm can be classified into two
types such as Decimation in Time (DIT) FFT and Decimation in
Frequency (DIF) FFT.
The first type of classification is based on divide and conquers
approach in the time domain and hence, it is referred to as the
Decimation in Time (DIT) FFT. Similarly, the second type of
classification is also based on divide and conquers approach into the
40
frequency domain and hence, it is referred to as the Decimation in
Frequency (DIF) FFT. In general, first the input sequence can be
divided into two summations in DIF FFT computation. Further they are
simplified as follows:
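\[ X[k] = \sum_{n=0}^{N/2-1} x(n)\, W_N^{nk} + \sum_{n=N/2}^{N-1} x(n)\, W_N^{nk} \tag{3.2} \]
\[ = \sum_{n=0}^{N/2-1} x(n)\, W_N^{nk} + \sum_{n=0}^{N/2-1} x\!\left(n + \frac{N}{2}\right) W_N^{\left(n + \frac{N}{2}\right)k} \tag{3.3} \]
\[ = \sum_{n=0}^{N/2-1} x(n)\, W_N^{nk} + W_N^{\frac{N}{2}k} \sum_{n=0}^{N/2-1} x\!\left(n + \frac{N}{2}\right) W_N^{nk} \tag{3.4} \]
and, with
\[ W_N^{\frac{N}{2}k} = (-1)^k \]
\[ X[k] = \sum_{n=0}^{N/2-1} \left( x(n) + (-1)^k\, x\!\left(n + \frac{N}{2}\right) \right) W_N^{nk} \tag{3.5} \]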
X[k] is the frequency transformation of the input sequence. X[k] can
be decimated into even and odd indexed frequency samples:
\[ X[2k] = \sum_{n=0}^{N/2-1} \left( x(n) + x\!\left(n + \frac{N}{2}\right) \right) W_N^{2nk} = \sum_{n=0}^{N/2-1} \left( x(n) + x\!\left(n + \frac{N}{2}\right) \right) W_{N/2}^{nk} \tag{3.6} \]
\[ X[2k+1] = \sum_{n=0}^{N/2-1} \left( x(n) - x\!\left(n + \frac{N}{2}\right) \right) W_N^{n}\, W_N^{2nk} = \sum_{n=0}^{N/2-1} \left( x(n) - x\!\left(n + \frac{N}{2}\right) \right) W_N^{n}\, W_{N/2}^{nk} \tag{3.7} \]
Equation (3.6) gives the even frequency samples and equation (3.7)
gives the odd frequency samples. This procedure can be repeated by
decimating the N/2-point DFTs, i.e., X[2k] and X[2k+1]. The entire
Cooley-Tukey algorithm involves log2N stages, where each stage
involves N/2 butterfly units. Therefore, the computation of the
N-point DFT through the radix-2 FFT requires (N/2).log2N complex
multiplications and N.log2N complex additions. For instance, the
Radix-2 8-point FFT computation using the DIF FFT is shown in
Figure 3.3.
Figure 3.3 Radix-2 DIF-FFT structure for 8-point
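The decimation in equations (3.6) and (3.7) translates directly into a recursive routine: the sums feed an N/2-point transform that yields the even samples, and the differences, multiplied by W_N^n, feed a second N/2-point transform that yields the odd samples. The sketch below is a software illustration only; it assumes the length is a power of two, and the names `fft_dif` and `dft` are hypothetical.

```python
import cmath


def dft(x):
    """Direct DFT, used here only as a reference for comparison."""
    n = len(x)
    return [sum(x[m] * cmath.exp(-2j * cmath.pi * m * k / n) for m in range(n))
            for k in range(n)]


def fft_dif(x):
    """Recursive radix-2 DIF FFT; len(x) is assumed to be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    half = n // 2
    # Sums feed the even-indexed outputs, as in equation (3.6).
    sums = [x[i] + x[i + half] for i in range(half)]
    # Differences times W_N^i feed the odd-indexed outputs, as in equation (3.7).
    diffs = [(x[i] - x[i + half]) * cmath.exp(-2j * cmath.pi * i / n)
             for i in range(half)]
    out = [0j] * n
    out[0::2] = fft_dif(sums)
    out[1::2] = fft_dif(diffs)
    return out


if __name__ == "__main__":
    samples = [1, 2, 3, 4, 0, 0, 0, 0]
    worst = max(abs(a - b) for a, b in zip(fft_dif(samples), dft(samples)))
    print("largest deviation from the direct DFT:", worst)
```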
Similarly, a Radix-2 16-point FFT computation model can be designed
with the help of the DIT-FFT. The numbers of complex multiplications
and complex additions required for 16 points are analyzed for both the
DFT and the FFT and compared in Table 3.1 [Implementation of 16
point].
Table 3.1 Tabulation for number of complex multiplications and complex
additions required for Radix-2 16 point FFT
OPERATION                 DFT       FFT              16 POINT DFT   16 POINT FFT
Complex Multiplications   N^2       (N/2)(log2N-1)   256            24
Complex Additions         N(N-1)    N(log2N)         240            64
Real Multiplications      4N^2      2N(log2N-1)      1024           96
Real Additions            N(4N-2)   2N(log2N)        992            128
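The 16-point columns of Table 3.1 follow from the formulas in the DFT and FFT columns. A short sketch that evaluates those formulas is given below; the function name `operation_counts` and the dictionary keys are illustrative, and N is assumed to be a power of two.

```python
def operation_counts(n):
    """Evaluate the formulas of Table 3.1 for an N-point transform (N a power of two)."""
    stages = n.bit_length() - 1  # log2(N) for a power of two
    return {
        "dft": {
            "complex_mul": n * n,
            "complex_add": n * (n - 1),
            "real_mul": 4 * n * n,
            "real_add": n * (4 * n - 2),
        },
        "fft": {
            "complex_mul": (n // 2) * (stages - 1),
            "complex_add": n * stages,
            "real_mul": 2 * n * (stages - 1),
            "real_add": 2 * n * stages,
        },
    }


if __name__ == "__main__":
    for kind, counts in operation_counts(16).items():
        print(kind, counts)
```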
3.2.2 Properties of Twiddle Factor
In FFT computation, \( W_N^{nk} \) is referred to as the twiddle factor
or DFT coefficient. The twiddle factor is generally described as a
“rotating vector” which rotates in increments according to the number
of samples, N. The partial twiddle factor for the N-point DFT is
illustrated in Figure 3.4.
The properties of twiddle factor coefficients for DFT are as follows:
1. \( W_N^{0} = -W_N^{N/2} = 1 \) and \( W_N^{N/4} = -W_N^{3N/4} = -j \)
2. \( W_N^{nk + N/2} = -W_N^{nk} \)
3. \( W_N^{nk} = W_N^{n(k+N)} = W_N^{(n+N)k} \), i.e., periodicity in n and k.
Figure 3.4 The partial twiddle factor for N-point DFT
For certain values of the product n*k, the twiddle factor takes a
trivial value such as ±1 or ±j (property 1). Products of this type are
calculated as follows:
\[ A \times W_N^{nk} + B \times W_N^{nk + N/2} = \left( A + B \times W_N^{N/2} \right) \times W_N^{nk} = (A - B) \times W_N^{nk} \tag{3.8} \]
\[ A \times W_N^{nk} + B \times W_N^{nk + N/4} = \left( A + B \times W_N^{N/4} \right) \times W_N^{nk} = (A - jB) \times W_N^{nk} \tag{3.9} \]
However, reductions of this type still leave an amount of computation
that is proportional to N^2. Fortunately, the periodicity property of
the complex sequence (property 3) reduces the computation
significantly.
\[ A \times W_N^{nk} + B \times W_N^{n(k+N)} = (A + B) \times W_N^{nk}, \qquad n = 0, 1, 2, \ldots, N-1 \tag{3.10} \]
According to the symmetry property, the equation can be reduced
further as
\[ W_N^{nk + N/8} = -W_N^{nk + 5N/8} = \frac{\sqrt{2}}{2}(1 - j) \times W_N^{nk} \quad \text{and} \quad W_N^{nk + 3N/8} = -W_N^{nk + 7N/8} = -\frac{\sqrt{2}}{2}(1 + j) \times W_N^{nk} \tag{3.11} \]
According to equation (3.11), the multiplication of a complex number by
one of these twiddle factors involves two real multiplications and two
real additions. Thus, a phase rotation of ±45° requires only two real
multiplications.
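The twiddle factor properties used above can be checked numerically. The sketch below evaluates them for N = 16; the helper names `w` and `close` and the choice of N are illustrative assumptions.

```python
import cmath


def w(n_points, exponent):
    """W_N^k = exp(-j*2*pi*k/N)."""
    return cmath.exp(-2j * cmath.pi * exponent / n_points)


def close(a, b, tol=1e-12):
    return abs(a - b) < tol


N = 16
ROOT_HALF = 2 ** -0.5  # sqrt(2)/2

checks = {
    "W^0 = 1": close(w(N, 0), 1),
    "W^(N/2) = -1": close(w(N, N // 2), -1),
    "W^(N/4) = -j": close(w(N, N // 4), -1j),
    "W^(3N/4) = +j": close(w(N, 3 * N // 4), 1j),
    "W^(e+N/2) = -W^e": all(close(w(N, e + N // 2), -w(N, e)) for e in range(N)),
    "W^(e+N) = W^e": all(close(w(N, e + N), w(N, e)) for e in range(N)),
    "W^(N/8) = (1-j)*sqrt(2)/2": close(w(N, N // 8), ROOT_HALF * (1 - 1j)),
    "W^(3N/8) = -(1+j)*sqrt(2)/2": close(w(N, 3 * N // 8), -ROOT_HALF * (1 + 1j)),
}

if __name__ == "__main__":
    for name, ok in checks.items():
        print(name, ok)
```

The last two entries show why a ±45° rotation is cheap: multiplying a + jb by (1 - j)√2/2 needs only the sum a + b, the difference b - a, and two real multiplications by √2/2.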
3.2.3 OFDM Systems based on FFT and Modulation
OFDM is a Discrete Multi-tone (DMT) frequency division multiplexing
(FDM) scheme in which a high-rate data stream at M fsym bits/s is
divided into blocks of M bits at a rate of fsym. The divided blocks
are called symbols. Within a symbol, mk of the M bits are allocated to
modulate carrier k at fc,k, so that all M bits are used to modulate
the N carriers. This results in N sub-channels, each of which sends
symbols at a rate of fsym. The block diagram of the multi-carrier
modulation scheme for an OFDM system is illustrated in Figure 3.5.
Figure 3.5 Block diagram for Multi-carrier Modulation Scheme
[Figure blocks: Input (M fsym b/s), M bits (a Symbol), Serial to Parallel, Modulator 0 to Modulator n-1 with carriers from fc,0 upward, Channel Noise, Demodulator 0 to Demodulator n-1, Parallel to Serial, Output]
In the traditional MCM technique, the sub-channels are non-
overlapping. Each sub-channel has its own modulator and demodulator
for information transmission. This occupies more spectrum and requires
excess hardware. The OFDM system overcomes this drawback by
introducing orthogonality, which allows the sub-channels to overlap.
The orthogonality of OFDM can be exploited in the frequency domain.
The spectrum of overlapping sub-carriers in OFDM is illustrated in
Figure 3.6. The change in the observed frequency of a wave or other
periodic event caused by relative motion between source and receiver
is called the Doppler effect. The carrier offset due to the Doppler
effect, the peaks and nulls of the orthogonal signals, the sub-carrier
spacing and the orthogonal sub-carriers of OFDM are indicated in
Figure 3.6.
Figure 3.6 Spectrum of overlapping of sub-carriers in OFDM
The OFDM modulator can be implemented with an IFFT processor operating
on N sub-carriers, instead of the N separate modulators of traditional
MCM. Similarly, the OFDM demodulator can be implemented more
efficiently than in traditional MCM, with an FFT processor operating
on the N sub-carriers. The simplified OFDM system based on FFT is
illustrated in Figure 3.7.
Figure 3.7 OFDM System based on FFT
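The IFFT/FFT pairing can be illustrated with a noise-free round trip. The sketch below is an assumption-laden toy, not the thesis system: it uses a direct inverse and forward DFT in place of IFFT/FFT hardware, allocates two bits to every carrier with a hypothetical QPSK mapping, and omits the D/A, A/D and channel blocks.

```python
import cmath

# Hypothetical QPSK mapping: first bit selects the sign of the imaginary part,
# second bit selects the sign of the real part.
QPSK = {(0, 0): 1 + 1j, (0, 1): -1 + 1j, (1, 1): -1 - 1j, (1, 0): 1 - 1j}


def transform(x, inverse=False):
    """Direct DFT (or inverse DFT with 1/N scaling), standing in for FFT/IFFT."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [sum(x[m] * cmath.exp(sign * 2j * cmath.pi * m * k / n) for m in range(n))
           for k in range(n)]
    return [v / n for v in out] if inverse else out


def ofdm_modulate(bits, n_carriers):
    """Map 2 bits to each of N carriers, then form the time-domain symbol by IDFT."""
    symbols = [QPSK[(bits[2 * k], bits[2 * k + 1])] for k in range(n_carriers)]
    return transform(symbols, inverse=True)


def ofdm_demodulate(samples):
    """Recover carrier symbols by DFT and slice them back into bits."""
    bits = []
    for s in transform(samples):
        bits += [int(s.imag < 0), int(s.real < 0)]
    return bits


if __name__ == "__main__":
    tx_bits = [0, 1, 1, 0, 1, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 1]  # M = 16 bits, N = 8
    rx_bits = ofdm_demodulate(ofdm_modulate(tx_bits, 8))
    print(rx_bits == tx_bits)
```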
3.2.4 Digital Signal Processors for FFT Computation
After the publication of the Cooley-Tukey FFT algorithm, various
modifications and implementations have been proposed to improve the
performance of the FFT. In general, the various FFT models can be
implemented in software, on general-purpose processors, on algorithm
specific processors and on application specific processors. The
architecture of a general purpose programmable digital signal
processor (DSP) is shown in Figure 3.8.
The general purpose programmable DSP processor consists of an address
generator, data memory, program memory, an input/output interface, a
program controller, a Multiplication and Accumulation (MAC) unit and
an Arithmetic and Logic Unit (ALU).
[Figure 3.7 blocks: Input, IFFT, D/A, Channel noise, A/D, FFT, Output]
Figure 3.8 General Purpose Programmable DSP Processor
[Figure blocks: I/O Interface, Address Generator, Program Memory, Data Memory, Program Controller, MAC & ALU, Address Bus, Data Bus, Program, Data]
Program memory and data memory are used to store the program and the
data respectively. The MAC and ALU of the DSP processor carry out the
arithmetic of the frequency transformation. The program controller
controls the data flow of the processor. Various commercial
programmable DSP processors include a special instruction set for FFT
computation, but performance varies from one processor to another.
From an architectural point of view, most commercial DSP processors
follow the Harvard architecture. Processors with Harvard architecture
have independent buses for program and data.
For a typical FFT/IFFT implementation, a general purpose DSP processor
takes approximately 1 ms, which is far slower than more specialized
implementations. Hence, the general purpose DSP processor is not
suitable for high-speed, low-power applications, because it cannot
meet their throughput requirements. On the other hand, general purpose
processors are designed to execute multiple applications and perform
multiple tasks, so they may lack the high performance that a
particular task requires. Hence, application specific processors
emerged as a good solution for high performance, lower power
consumption and cost effective processing. These processors can be
classified into three major categories:
• Digital Signal Processor (DSP): Programmable
microprocessors/Programmable microcontrollers for extensive
real time and mathematical computations.
• Application Specific Instruction Set Processor (ASIP):
Programmable microprocessors/Programmable microcontroller in
which hardware and instruction set are designed together for
particular special application.
• Application Specific Integrated Circuit (ASIC): Specific algorithm
completely implemented in hardware.
Various programmable FFT processors have been developed for the
FFT/IFFT computations. These processors are 5 to 10 times faster than
the general-purpose DSP processors. The architecture of
programmable specific FFT processor is illustrated in Figure. 3.9.
Figure 3.9 Programmable specific FFT processor
[Figure blocks: Input, 3 Term Window Operator, Workspace RAM (two blocks), Radix-4 Data path, Coefficient ROM, Output Buffer, Output]
The general programmable specific FFT processor consists of butterfly
units and complex multipliers. An on-chip ROM is available in the
programmable FFT processor to store the sine and cosine coefficient
values. This type of programmable FFT specific processor is often
provided with windowing functions in either the time or the frequency
domain.
Non-programmable specific processors can also be used for FFT
computation. This architecture supports only a fixed FFT length.
Generally, algorithm specific processors can be classified into three
categories: Fully Parallel FFT Processors, Column FFT Processors and
Pipelined FFT Processors.
The mapping of the FFT signal-flow graph to hardware structures is
different for each of the three algorithm specific processors. In a
fully parallel FFT processor, the hardware structure is isomorphic to
the FFT signal-flow graph. For instance, the signal-flow graph for an
8-point DFT requires 24 complex adders and 5 complex multipliers for
the FFT computation. The hardware complexity of this implementation
is high. Hence, it is not power efficient for FFT implementation.
In order to overcome the disadvantage of the fully parallel FFT
processor, the column based FFT processor was introduced. A set of
processing elements in a column computes one stage in a single clock.
These results are fed back to the same set of processing elements to
compute the next stage. The hardware complexity of column based FFT
processors is thereby reduced. However, the routing for the processing
elements is complex and difficult for long transform lengths.
Pipelined FFT processors were introduced to overcome this disadvantage
of column based FFT processors.
In a pipelined FFT processor, each stage has its own set of
processing elements. Every stage is computed automatically as soon as
data are available. Pipelined FFT processors offer simplicity,
flexibility, modularity and high throughput. The most common
pipelined FFT processors are the Radix-2 Multipath Delay Commutator
(R2MDC) FFT, Radix-2 Single-path Delay Feedback (R2SDF) FFT, Radix-4
Single-path Delay Feedback (R4SDF) FFT, Radix-4 Multipath Delay
Commutator (R4MDC) FFT and Radix-2^2 Single-path Delay Commutator
(R2^2SDC) FFT. Several researchers, such as T. S. Ghouse Basha [53]
and Abhijit D. Palekar [2], have worked on architecture oriented FFT
models and made modifications to improve the performance of FFT
computations in terms of hardware complexity and power consumption.
In the generalized FFT architecture, a large number of computational
paths is involved in determining the spectrum characteristics of
discrete time signals. Because of this, a large number of logic
elements is needed to design the FFT processor, and the delay of the
FFT computation increases significantly. To overcome these
disadvantages of the traditional Radix-2 FFT, the Radix-2 Single-path
Delay Feedback (R2SDF) FFT is preferred in our research work.
3.3 RADIX-2 SINGLE PATH DELAY FEEDBACK (R2SDF) FFT
The Radix-2 Single-path Delay Feedback (R2SDF) FFT is a parallel
technique for estimating the frequency response of a discrete
time-domain signal. This structure is also referred to as
“stream-like” processing of a block based algorithm. One of the key
advantages of the R2SDF FFT is that it processes the data in a
parallel manner as soon as input points are available. The butterfly
structure for the R2SDF FFT is illustrated in Figure 3.10. It consists
of a single butterfly structure, which performs signed addition and
signed subtraction, and a single delay line unit, which allows the
second point of data to be processed after a single unit delay. In
Figure 3.10, two representations are illustrated to analyze the signal
flow of the R2SDF FFT.
Figure 3.10 Butterfly structure for R2SDF FFT
[Figure blocks: Data in, Butterfly, Delay Line, Data out]
(1) The first half of the input is shifted into the delay buffer while
the second half of the output is taken from the delay buffer.
(2) The second half of the input is shifted into the butterfly
together with the first half of the input from the delay buffer, and
the second half of the output is fed back into the delay buffer.
The symbolic representation of the signal flow in the butterfly
structure of the R2SDF FFT is illustrated in Figure 3.11. A single-path
feedback unit is used to access the next point of input in both
Figure 3.10 and Figure 3.11. Hence, this structure is named the
Radix-2 Single-path Delay Feedback (R2SDF) FFT. A single stage is used
in the Radix-2 2-point SDF FFT to estimate the frequency response of
the signal. Similarly, the Radix-2 8-point SDF FFT structure is
illustrated in Figure 3.12.
Figure 3.11 Symbolic representation of R2SDF FFT
Figure 3.12 Structure of Radix-2 8-point Single-path Delay Feedback (R2SDF) FFT
[Figure blocks: three Butterfly stages with feedback delay lines 4D, 2D and 1D; Delay Line; Butterfly; Single Path]
In the 8-point R2SDF FFT, the input sequence is broken into two
parallel data streams flowing forward, with the correct “distance”
between the data elements entering the butterfly scheduled by proper
delays. The R2SDF FFT architecture requires only a small number of
complex multipliers and butterfly structures.
One straightforward approach for the parallel implementation of the
R2SDF FFT algorithm is as follows:
1. The input data sequence is broken into two parallel data streams.
2. In each stage of the R2SDF FFT processor, half of the input data is
delayed via the feedback delay unit and processed with the second
half of the input data.
3. The delay elements used are 4, 2 and 1 respectively for the three
stages in the 8-point R2SDF FFT processor. Hence, the total number of
delay elements used is 4+2+1=7 in the case of the 8-point R2SDF FFT.
4. In the 8-point R2SDF FFT architecture, three types of complex
multiplier are used for performing complex multiplications.
Similarly, only 3 delay unit structures and 3 butterfly units are
used for performing the 8-point R2SDF FFT. Compared to Radix-2 FFT
computation, the R2SDF FFT reduces the hardware components by 70%.
Because the complexity of the computational path is reduced, the
speed of the R2SDF FFT processor is very high compared to the
traditional Radix-2 FFT structure.
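The steps above can be sketched as a cycle-by-cycle behavioural model. This is an illustrative software model rather than the thesis design: it assumes a DIF data flow in which each stage forwards the sum and feeds the difference back into its delay line, applies the twiddle factor as the stored difference leaves the delay line, and produces outputs in bit-reversed order. The names `sdf_stage` and `r2sdf_fft` are hypothetical.

```python
import cmath
from collections import deque


def dft(x):
    """Direct DFT, used only as a reference."""
    n = len(x)
    return [sum(x[m] * cmath.exp(-2j * cmath.pi * m * k / n) for m in range(n))
            for k in range(n)]


def sdf_stage(stream, delay):
    """One single-path delay feedback stage with a delay line of `delay` samples."""
    block = 2 * delay
    line = deque([0j] * delay)  # feedback delay line, initially empty (zeros)
    out = []
    # Feed `delay` extra zeros so the last stored differences are flushed out.
    for cycle, sample in enumerate(list(stream) + [0j] * delay):
        phase = cycle % block
        head = line.popleft()
        if phase < delay:
            # First half: store the input, emit the stored difference times W^phase.
            out.append(head * cmath.exp(-2j * cmath.pi * phase / block))
            line.append(sample)
        else:
            # Second half: butterfly; forward the sum, feed back the difference.
            out.append(head + sample)
            line.append(head - sample)
    return out[delay:]  # drop the start-up latency of the stage


def r2sdf_fft(x):
    """Chain stages with delays N/2, N/4, ..., 1; return (spectrum, delay elements)."""
    n = len(x)
    stream, delay, total_delay = list(x), n // 2, 0
    while delay >= 1:
        stream = sdf_stage(stream, delay)
        total_delay += delay
        delay //= 2
    bits = n.bit_length() - 1
    natural = [0j] * n
    for position, value in enumerate(stream):
        # The pipeline emits results in bit-reversed order.
        natural[int(format(position, "0{}b".format(bits))[::-1], 2)] = value
    return natural, total_delay


if __name__ == "__main__":
    spectrum, delays = r2sdf_fft([1, 2, 3, 4, 5, 6, 7, 8])
    print("delay elements:", delays)
    print("matches direct DFT:",
          max(abs(a - b) for a, b in zip(spectrum, dft([1, 2, 3, 4, 5, 6, 7, 8]))) < 1e-9)
```

For 8 points the model uses delay lines of 4, 2 and 1 samples, which gives the total of 7 delay elements stated in step 3.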
The butterfly structures of the R2SDF FFT are referred to as
Processing Elements (PEs). Three PE structures are therefore required
to perform the 8-point R2SDF FFT and six PE structures are required to
perform the 64-point R2SDF FFT. In this research work, a 64-point
R2SDF FFT processor is designed. To improve the performance of
frequency transformation processors, three steps are considered in
this research work.
Step 1: Pipelining Mechanism
A pipelining mechanism is introduced in every PE structure to improve
the speed of the R2SDF FFT processor.
Step 2: Reduced Complex Multiplier Design
The complex multiplier accounts for a significant part of the hardware
requirement of the R2SDF FFT. Hence, to reduce the silicon chip size
of the R2SDF FFT, a reduced complex multiplier is designed. It has
fewer adder units than the conventional complex multiplier and is
therefore referred to as the “Reduced Complex Multiplier”.
Step 3: 64-point Pipelined R2SDF FFT Design
In this step, the pipelined PE structures and the reduced complex
multipliers are incorporated into the 64-point R2SDF FFT.
3.4 Pipelined Processing Element (PE) Structures for R2SDF FFT
The block diagram of the 64-point R2SDF FFT is illustrated in
Figure 3.13. It consists of six processing elements to perform the FFT
computation. The input data sequences are divided into two parallel
data streams. Six delay units, of lengths 32, 16, 8, 4, 2 and 1, are
used to process the two halves of the input data sequence at every
stage.
Figure 3.13 Structure of Radix-2 Single-path Delay Feedback (R2SDF)
FFT
[Figure blocks: Input, PE1 (delay 32), PE2 (delay 16), PE3 (delay 8), PE1 (delay 4), PE2 (delay 2), PE3 (delay 1), Twiddle Factor Coefficient, Output]
It is composed of three different types of PEs, a complex constant
multiplier and delay-line (DL) buffers. The three processing elements
PE1, PE2 and PE3 have different architectures to perform the different
types of butterfly operations. Among them, the PE3 structure
implements a simple radix-2 butterfly and also serves as a sub-module
of the PE2 and PE1 structures. The block diagram of the PE3 structure
is illustrated in Figure 3.14.
The real parts of the input and output samples are represented as Iin
and Iout respectively. Similarly, the imaginary parts of the input and
output samples are represented as Qin and Qout respectively.
Figure 3.14 Block diagram of PE3 structure for 64-point FFT
DL_Iin and DL_Iout stand for the real parts of the input and output of
the DL buffers respectively. Similarly, DL_Qin and DL_Qout stand for
the imaginary parts of the input and output of the DL buffers
respectively. In addition to the PE3 operations, the PE2 stage needs
to perform multiplication by -1.
The working principle of PE3 is as follows:
When S0 = 0,
DL_Iin = Iin, Qout = DL_Qout
Iout = DL_Iout, DL_Qin = Qin
When S0 = 1,
DL_Iin = DL_Iout + (-Iin), Qout = Qin + (-DL_Qout)
Iout = Iin+ (-DL_Iout), DL_Qin = DL_Qout + (-Qin)
[Figure signals: Iin, Qin, Iout, Qout, DL_Iin, DL_Iout, DL_Qin, DL_Qout; multiplexers with inputs 0 and 1 selected by S0]
The block diagrams of the PE2 and PE1 stages are shown in Figure 3.15
and Figure 3.16 respectively.
Figure 3.15 Block diagram of PE2 structure for 64-point FFT
Figure 3.16 Block diagram of PE1 structure for 64 point FFT
[Figure signals: Iin, Qin, Iout, Qout, DL_Iin, DL_Iout, DL_Qin, DL_Qout, PE3 block, multiplication by -1, multiplexers selected by S1 and S2]
These processing elements provide a good solution for computing the
frequency transformation of the input time samples. However, the
asynchronous arrival of inputs and outputs is one of their main
disadvantages. For instance, a two-input device can operate only
after both of its inputs have arrived from their sources. A
difference between the arrival times of the two inputs therefore
creates an asynchronous condition that limits the speed of the
device. Every block in the PE structures is subject to the same
effect, which adds delay to the PE structures.
In our research work, the asynchronous effect of the FFT blocks is
addressed in order to reduce this delay. To remove the asynchronous
effect of the FFT blocks, pipelining registers are used in every
processing element structure. These pipelining registers align the
arrival times of the inputs from any source. Each register unit
contains Flip-Flops (FFs) to provide clock matching synchronism.
Hence, the processing speed of every processing element is higher in
the pipelined PE structures. Block diagrams of the pipelined PE3, PE2
and PE1 structures are illustrated in Figure 3.17, Figure 3.18 and
Figure 3.19 respectively.
Figure 3.17 Block diagram of Pipelined PE3 structure for 64 point FFT
[Figure signals: Iin, Qin, Iout, Qout, DL_Iin, DL_Iout, DL_Qin, DL_Qout; multiplexers with inputs 0 and 1 selected by S0; four pipelining registers (Reg)]
Figure 3.18 Block diagram of Pipelined PE2 structure for 64 point FFT
Figure 3.19 Block diagram of Pipelined PE1 structure for 64 point FFT
[Figure signals: Iin, Qin, Iout, Qout, DL_Iin, DL_Iout, DL_Qin, DL_Qout, PE3 block, multiplication by -1, twiddle input WN/KN, multiplexers selected by S1 and S2, pipelining registers (Reg)]
3.5 DESIGN OF REDUCED COMPLEX MULTIPLIER
In FFT computation, signed adders, signed subtractors and complex
multipliers are required to convert the time response of discrete
signals into their frequency response. Signed addition and signed
subtraction are fundamental logic functions which can easily be built
from half adder and full adder circuits. The complex multiplier,
however, is more difficult to implement with accurate results. FFT
computation requires a complex multiplier to perform the twiddle
factor multiplication. For instance, the 2-point FFT multiplies the
output of the signed subtraction by the twiddle factor \( W_2^0 \).
Similarly, 8-point FFT computation requires multiplication by the
twiddle factors \( W_8^0 \) and \( W_8^1 \). The value of the first
twiddle factor (\( W_8^0 \)) is 1, hence there is no need to perform a
multiplication. The second twiddle factor (\( W_8^1 \)), however, has
real and imaginary parts of magnitude 1/√2 = 0.707, hence it requires
a multiplier to perform the multiplication by 0.707. Bit parallel
multipliers have been suggested by several researchers, such as
Manimaran, A [28] and Kandhi Srikanth [19], for performing the
multiplication by fractional values.
3.5.1 Bit Parallel Multiplier for 1/√2
The bit parallel multiplier is based on shift and add operations, so
the rounded value of the multiplication can be estimated easily. The
structure of the bit-parallel multiplication is illustrated in
Figure 3.20, as in Manimaran, A [28].
Figure 3.20 Circuit diagram of the bit-parallel multiplication by 1/√2
[Figure blocks: In, four shifters (>>), Output]
Four different bit shifters and four adders are used to generate the
bit parallel multiplication by 1/√2. The hardware structure of the bit
parallel multiplication by 1/√2 can be reduced further by one adder
and one shifter, as in Yu, C [57]. The circuit diagram of the reduced
bit parallel multiplier for 1/√2 is illustrated in Figure 3.21. The
multiplication by 1/√2 using the bit parallel multiplier is derived as
follows:
\[ \text{Output} = \text{in} * \frac{\sqrt{2}}{2} \approx \text{in} * \left( 2^{-1} + 2^{-3} + 2^{-4} + 2^{-6} + 2^{-8} + 2^{-14} \right) \tag{3.12} \]
\[ \text{Output} = \text{in} * \frac{\sqrt{2}}{2} \approx \text{in} * \left[ 1 + \left( 1 + 2^{-2} \right) \left( 2^{-6} - 2^{-2} \right) \right] \tag{3.13} \]
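Both shift-and-add forms can be tried on integer samples. The sketch below is illustrative only: it assumes non-negative integer inputs and plain truncating right shifts, and the function names are hypothetical. The reduced form is written so that it uses three shifts (by 2, 2 and 4 bits) and three add/subtract operations.

```python
import math


def mul_inv_sqrt2_six_term(x):
    """Equation (3.12): x * (2^-1 + 2^-3 + 2^-4 + 2^-6 + 2^-8 + 2^-14)."""
    return (x >> 1) + (x >> 3) + (x >> 4) + (x >> 6) + (x >> 8) + (x >> 14)


def mul_inv_sqrt2_reduced(x):
    """Equation (3.13): x * [1 + (1 + 2^-2) * (2^-6 - 2^-2)]."""
    t = x + (x >> 2)         # (1 + 2^-2) * x
    u = t >> 2               # 2^-2 * t
    return x + (u >> 4) - u  # x + 2^-6 * t - 2^-2 * t


if __name__ == "__main__":
    sample = 1 << 20
    exact = sample / math.sqrt(2)
    for f in (mul_inv_sqrt2_six_term, mul_inv_sqrt2_reduced):
        approx = f(sample)
        print(f.__name__, approx, "relative error", abs(approx - exact) / exact)
```

The six-term form is slightly more accurate (about 0.70709 versus about 0.70703 for the reduced form), which is the trade-off for the smaller hardware.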
Figure 3.21 Circuit diagram of reduced bit parallel multiplier for 1/√2
[Figure blocks: In, shifters >>2, >>4 and >>2, Output]
Figure 3.22 shows the butterfly structure of the FFT built with the
bit parallel multiplier. In this structure, the bit parallel
multiplier is used in place of the twiddle factor multiplier.
Figure 3.22 Butterfly structure of FFT with the help of bit parallel
multiplier
[Figure signals: Iin, Qin, two 1/√2 multipliers, subtractor (-), Iout, Qout]
Twiddle factor multiplication by 0.707 is sufficient for both 4-point
and 8-point FFT computation. In the case of the 16-point FFT, twiddle
factor multiplication by 0.3826 and 0.9238 is also needed. Hence, a
complex multiplier has been designed to provide all the twiddle factor
multiplications in a single multiplier circuit. The different twiddle
factor values are selected by the circuit switch of the complex
multiplier. Circuits for such a complex multiplier have been designed
in Yu, C [57] and Manimaran, A [28].
3.5.2 Design of Complex Multiplier
The structure of the complex multiplier is illustrated in Figure 3.23.
It consists of three multipliers and five adders and handles the
different twiddle factor multiplications for the 64-point FFT. The
circuit switch of the complex multiplier is used to select among the
different types of twiddle factor multiplication.
Figure 3.23 Structure of Complex Multiplier
[Figure blocks: Circuit Switch; coefficient inputs i1 to i8 and q0 to q8]
The complex multiplier of the FFT is designed by using the generalized
expression
\[ I_{out} = I_{in} i_1 - Q_{in} q_1 \tag{3.14} \]
Equation (3.14) is expanded by adding and subtracting the term
\( I_{in} q_1 \), as given in equation (3.15), and rearranged to
equation (3.16).
\[ I_{out} = I_{in} i_1 - Q_{in} q_1 + I_{in} q_1 - I_{in} q_1 \tag{3.15} \]
\[ I_{out} = I_{in} i_1 + I_{in} q_1 - I_{in} q_1 - Q_{in} q_1 \tag{3.16} \]
Taking \( I_{in} \) and \( q_1 \) as common factors,
\[ I_{out} = I_{in} (i_1 + q_1) - q_1 (I_{in} + Q_{in}) \tag{3.17} \]
The final expression for the real term of the complex multiplier is
given in equation (3.17). Similarly, the imaginary term of the output
is
\[ Q_{out} = I_{in} q_1 + Q_{in} i_1 \tag{3.18} \]
Equation (3.18) is expanded by adding and subtracting the term
\( Q_{in} q_1 \), as given in equation (3.19), and rearranged to
equation (3.20).
\[ Q_{out} = I_{in} q_1 + Q_{in} i_1 + Q_{in} q_1 - Q_{in} q_1 \tag{3.19} \]
\[ Q_{out} = Q_{in} i_1 - Q_{in} q_1 + I_{in} q_1 + Q_{in} q_1 \tag{3.20} \]
Taking \( Q_{in} \) and \( q_1 \) as common factors,
\[ Q_{out} = Q_{in} (i_1 - q_1) + q_1 (I_{in} + Q_{in}) \tag{3.21} \]
The final expression for the imaginary term of the complex multiplier
is given in equation (3.21). From equation (3.17) and equation (3.21),
it is clear that the complex multiplier requires five adders and three
multipliers, because the product \( q_1 (I_{in} + Q_{in}) \) is shared
by both outputs.
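The three-multiplier, five-adder structure of equations (3.17) and (3.21) can be checked against an ordinary complex product. The sketch below is a minimal illustration; the function name `complex_multiply_3mul` is hypothetical and the comments count the operations one by one.

```python
def complex_multiply_3mul(i_in, q_in, i1, q1):
    """Multiply (i_in + j*q_in) by (i1 + j*q1) using 3 multiplications and 5 additions."""
    shared = q1 * (i_in + q_in)        # adder 1, multiplier 1 (shared by both outputs)
    i_out = i_in * (i1 + q1) - shared  # adder 2, multiplier 2, adder 3  -> equation (3.17)
    q_out = q_in * (i1 - q1) + shared  # adder 4, multiplier 3, adder 5  -> equation (3.21)
    return i_out, q_out


if __name__ == "__main__":
    print(complex_multiply_3mul(3, 4, 2, 5))
    print(complex(3, 4) * complex(2, 5))
```

Both lines print the same product, -14 + 23j, the first as a pair of real numbers.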
3.5.3 Reduced Complex Multiplier design
In the traditional complex multiplier design, three multipliers and
five adders are used for the twiddle factor multiplication of the
64-point FFT. In our proposed work, low density adders are identified
in order to reduce the density and hardware complexity of the complex
multiplication. Hence, this multiplier is named the “Reduced Complex
Multiplier”.
In the traditional complex multiplier, we identified that i1 + q1 and
i1 - q1 have lower density than the other adder structures. Hence, in
the reduced complex multiplier, i1 + q1 and i1 - q1 are provided by a
look-up table (LUT) and the other elements remain unchanged. The
structure of the proposed reduced complex multiplier is illustrated in
Figure 3.24.
Figure 3.24 Structure of Proposed Reduced Complex Multiplier
[Figure blocks: Iin, Qin, Circuit Switch, inputs (i+q), (i-q) and q(7:0), subtractor (-), Iout, Qout]
This multiplier architecture performs the twiddle factor
multiplication according to the configuration of the circuit switches.
Hence, the reduced complex multiplier is also termed a reconfigurable
complex multiplier. The proposed Reduced Complex Multiplier consists
of only 3 adders and 3 multipliers for performing the 64-point FFT
computation. Hence, it consumes less hardware and delay than the
traditional complex multiplier. The proposed complex multiplier
handles the different twiddle factor values shown in Table 3.2. In
every stage, bit parallel multipliers are used to multiply the input
values by the twiddle factor. Hence, this complex multiplier is also
referred to as the “Reduced Reconfigurable Complex Multiplier”. The
circuit switch of the reduced complex multiplier is used to choose the
suitable bit parallel multiplier in every stage. This reduced complex
multiplier is then integrated into the normal Radix-2 butterfly
structures to improve the performance of the frequency transformation.
Table 3.2 Twiddle factor values for 64-point FFT
Coefficient Value Coefficient Value
Real_1 0.7071 Imag_1 0.7071
Real_2 0.7730 Imag_2 0.6343
Real_3 0.8314 Imag_3 0.5555
Real_4 0.8819 Imag_4 0.4713
Real_5 0.9238 Imag_5 0.3826
Real_6 0.9569 Imag_6 0.2902
Real_7 0.9807 Imag_7 0.1950
Real_8 0.9951 Imag_8 0.0980
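The entries of Table 3.2 appear to be the cosine and sine of multiples of 2π/64. The sketch below regenerates them under that assumption, namely that Real_k = cos(2π(9-k)/64) and Imag_k = sin(2π(9-k)/64); this indexing is inferred from the tabulated values rather than stated in the text, and the names `TABLE_3_2` and `twiddle_table` are illustrative.

```python
import math

# Values copied from Table 3.2 as (Real_k, Imag_k) for k = 1..8.
TABLE_3_2 = [
    (0.7071, 0.7071), (0.7730, 0.6343), (0.8314, 0.5555), (0.8819, 0.4713),
    (0.9238, 0.3826), (0.9569, 0.2902), (0.9807, 0.1950), (0.9951, 0.0980),
]


def twiddle_table(n_points=64, entries=8):
    """Assumed indexing: entry k uses the angle 2*pi*(entries + 1 - k)/n_points."""
    rows = []
    for k in range(1, entries + 1):
        angle = 2 * math.pi * (entries + 1 - k) / n_points
        rows.append((math.cos(angle), math.sin(angle)))
    return rows


if __name__ == "__main__":
    for k, ((re, im), (t_re, t_im)) in enumerate(zip(twiddle_table(), TABLE_3_2), start=1):
        print("k=%d computed=(%.5f, %.5f) table=(%.4f, %.4f)" % (k, re, im, t_re, t_im))
```

Under this assumption the computed values agree with the table to within the four decimal places shown.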
3.6 Design of Radix-2 Single path Delay Feedback (R2SDF) FFT for
64-point
The Radix-2 Single path Delay Feedback (R2SDF) FFT performs “stream-
like” processing of a block-based algorithm. It reduces the processing
time of the FFT computation. A single delay path is used in every
stage to process the first half of the input points with the other
half of the points. In every stage, the reconfigurable complex
multiplier is used to perform the twiddle factor multiplication. The
architecture of the 64-point R2SDF FFT is illustrated in Figure 3.25.
Figure 3.25 Architecture of 64-point R2SDF FFT
[Figure blocks: six Butterfly stages with feedback delay lines 32D, 16D, 8D, 4D, 2D and 1D]
In every stage of the 64-point R2SDF FFT, both the pipelined PE
structures and the reduced complex multiplier are integrated to
improve the performance of the processor. In the pipelining technique,
pipelining registers are used to eliminate the asynchronous effects
between the inputs and outputs. Hence, the processing speed of the
pipelined PE structures is improved significantly. In addition, the
reduced complex multiplier provides the twiddle factor multiplication
results with less hardware. Hence, the processing speed of the complex
multiplier is also improved. Therefore, the performance of the R2SDF
FFT is improved when the pipelined PE structures and the reduced
complex multiplier are incorporated into the R2SDF FFT architecture.
SUMMARY
Orthogonal Frequency Division Multiplexing (OFDM) is a wireless
communication technique in which modulation and demodulation, FFT and
IFFT, and encoder and decoder play an important role in performing
data communication services. In our research work, the frequency
transformation is considered in order to improve the performance of
OFDM for MANET applications.
The FFT is widely used for converting time-domain signals into
frequency-domain signals. To improve the architecture of the FFT
model, the R2SDF structure is used in this research work. The R2SDF
FFT is a stream-like processor. However, it has two disadvantages:
the asynchronous effect of the intermediate inputs/outputs and the
complexity of the complex multiplier. To overcome these
disadvantages, the following steps are taken in this research work.
A pipelining technique is introduced into the PE structures of the
R2SDF FFT to increase the processing speed. Pipelining registers are
used to remove the asynchronous effect of the R2SDF FFT.
A reduced complex multiplier is designed to reduce the complexity of
the complex multiplier. Low density hardware components have been
identified and eliminated to reduce the hardware complexity of the
complex multiplier. The proposed reduced complex multiplier performs
the different twiddle factor multiplications with less hardware.
Hence, the processing speed of the multiplication is also improved.
Finally, both the pipelined PE structures and the Reduced Complex
Multiplier are integrated into the R2SDF FFT processor to improve the
performance of the frequency transformation.
CHAPTER 4
HAMMING SINGLE ERROR CORRECTION – TRIPLE
ADJACENT ERROR DETECTION CODE ALGORITHM FOR
DATA COMMUNICATION
4.1 Error Detection and Correction (EDC) Codes
Error detection and correction codes provide reliable delivery of
information signals. The small size of transistors and capacitors,
combined with radiation effects from cosmic rays, causes occasional
errors in large stores of information. Errors of this type arise in
RAM chips. They can be detected and corrected by employing
error-detecting and error-correcting codes in RAMs. The simplest
scheme for detecting errors is the parity bit. The parity of the
information word is checked after reading the data from either memory
or registers. With even parity, the information word is taken as
correct when it contains an even number of 1’s and as incorrect when
it contains an odd number of 1’s.
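The even-parity check described above can be sketched as follows. The function names are illustrative, and the sketch assumes a single parity bit appended to the data bits.

```python
def encode_even_parity(data_bits):
    """Append one parity bit so that the stored word has an even number of 1's."""
    return list(data_bits) + [sum(data_bits) % 2]


def check_even_parity(word):
    """Return True when the word read back still has an even number of 1's."""
    return sum(word) % 2 == 0


if __name__ == "__main__":
    word = encode_even_parity([1, 0, 1, 1])
    print(word, check_even_parity(word))

    single_error = list(word)
    single_error[2] ^= 1  # flip one bit: detected
    print(single_error, check_even_parity(single_error))

    double_error = list(single_error)
    double_error[0] ^= 1  # flip a second bit: parity is even again, so not detected
    print(double_error, check_even_parity(double_error))
```

A single parity bit detects any odd number of bit errors but cannot locate them and misses an even number of errors, which is why stronger codes are needed for correction.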
Different types of Error Detection and Correction (EDC) codes are
available to transmit information from source to destination without
error. All of these EDC codes fall into two types of binary codes,
named block codes and convolutional codes.
Block codes, which include both linear and cyclic codes, encode the
data in blocks. In linear block codes, the input bits are partitioned
into blocks. Linear block codes are used in Forward Error Correction
(FEC), in which symbols are transmitted on a communications channel so
that, if errors occur in the transmission, they can be detected or
corrected by the parity of the block codes. Cyclic codes are also
block codes, in which a circular shift of each code word produces
another code word.
Figure 4.1 Classification of Error Detection and Correction (EDC) codes [tree diagram: EDC codes divide into Convolutional Codes (Viterbi, Turbo Codes) and Block Codes (LDPC, Reed Solomon)]
On the other hand, convolutional codes form the other widely used class of encoders, in which a convolution function is applied sequentially to a sequence of bits. For example, a rate-1/2 convolutional encoder generates a 2N-bit code word from N data bits. Different types of block and convolutional encoders have been suggested in the past to achieve error-free transmission. The basic classification of block and convolutional EDC codes is illustrated in Figure 4.1.
Viterbi and Turbo codes are convolutional codes, because a convolution function is performed in the encoder part of the data transmission system. On the other hand, Low Density Parity Check (LDPC) codes and Reed Solomon codes are linear error detecting and correcting codes. Besides LDPC and Reed Solomon codes, the Hamming code is also one of the best linear EDCs. Any linear combination of code words of a linear code is itself a code word, and this structure can be exploited to improve the probability of error detection and error correction.
In this research work, the improvement in the detection probability of Triple Adjacent Error Detection (TAED) is illustrated with the help of the Hamming EDC and the Bit Replacement algorithm for a wireless Mobile Ad-hoc Network (MANET).
4.2 Hamming Error Detection and Correction Mechanism
Hamming codes are a family of linear block error detecting and correcting codes that generalize the linear code invented by Richard Hamming in [42]. Hamming codes are perfect codes in which a single bit error can be detected and corrected for the given block length and minimum distance. In Hamming EDC codes, parity bits are used to detect and correct a single error. The difference between the block length and the number of information bits is the number of parity bits (m).
Numerically, Hamming codes are characterized for m ≥ 3 by the following:
$n = 2^{m} - 1$ (4.1)
$k = n - m$ (4.2)
$d_{min} = 3$ (4.3)
where n is the block size, k is the number of information bits, m is the number of parity bits and $d_{min}$ is the minimum Hamming distance.
The minimum Hamming distance of these codes is three regardless of the number of parity bits, which is what makes Single Error Correction (SEC) possible. In the (7, 4) code, three parity bits are used to perform SEC.
For instance, consider the (7, 4) Hamming code, in which k = 4 (number of information bits), n = 7 (block length) and the number of parity bits is m = 3 (m = n - k). The minimum Hamming distance is three. Using Figure 4.2 and equations 4.4 to 4.6, we can determine the values of the parity bits.
Figure 4.2 Parity bit calculations
P1 = d1 + d2 + d3 (4.4)
P2 = d1 + d2 + d4 (4.5)
P3 = d2 + d3 + d4 (4.6)
Table 4.1 Calculation of parity bits for (7, 4) hamming code
Information bits    Parity bits
d1  d2  d3  d4      P1 (d1 + d2 + d3)   P2 (d1 + d2 + d4)   P3 (d2 + d3 + d4)
1   0   0   0       1                   1                   0
0   1   0   0       1                   1                   1
0   0   1   0       1                   0                   1
0   0   0   1       0                   1                   1
Before calculating the Hamming matrix, it is essential to find the generator matrix from the input word length and the parity bits. For the (7, 4) Hamming code, the input word length is four. Hence, the 4-bit identity matrix and its corresponding parity bits are combined to form the generator matrix [G]. The calculation of the parity bits for the 4-bit identity matrix is illustrated in Table 4.1.
$[G] = [P_m : I_k]$ (4.7)
The generator matrix for the (7, 4) Hamming code, with the parity bits of Table 4.1 on the left and the 4-bit identity matrix on the right, is as follows:
$[G] = \begin{bmatrix} 1&1&0&1&0&0&0 \\ 1&1&1&0&1&0&0 \\ 1&0&1&0&0&1&0 \\ 0&1&1&0&0&0&1 \end{bmatrix}$ (4.8)
From the generator matrix, the Hamming matrix is formed from the transpose of the parity bits and the 3-bit identity matrix:
$[H] = [P^T : I_m]$ (4.9)
$[H] = \begin{bmatrix} 1&1&1&0&1&0&0 \\ 1&1&0&1&0&1&0 \\ 0&1&1&1&0&0&1 \end{bmatrix}$ (4.10)
The Hamming encoding process uses the parity relations in Figure 4.2 and equations 4.4 to 4.6. The Hamming decoding process uses the syndrome vector, which determines the error location in the encoded signal. The syndrome vector is obtained by multiplying the code word by the transpose of the Hamming matrix:
$[\text{Syndrome}] = [\text{Code word}] \cdot [H]^T$ (4.11)
For instance, if the 4-bit information word is 0100, the parity bits are 111. The code word for the information bits 0100, written as [parity bits : information bits], is therefore
Code word = [1110100] (4.12)
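The parity relations of equations 4.4 to 4.6 and the [parity bits : information bits] layout can be checked with a short Python sketch. The function name `encode_7_4` is hypothetical, and the "+" of equations 4.4 to 4.6 is taken to be modulo-2 addition (XOR), as Table 4.1 implies.

```python
def encode_7_4(d1, d2, d3, d4):
    """Return the code word [P1 P2 P3 d1 d2 d3 d4] using equations 4.4 to 4.6."""
    p1 = d1 ^ d2 ^ d3   # equation 4.4, addition modulo 2
    p2 = d1 ^ d2 ^ d4   # equation 4.5
    p3 = d2 ^ d3 ^ d4   # equation 4.6
    return [p1, p2, p3, d1, d2, d3, d4]

print(encode_7_4(0, 1, 0, 0))   # [1, 1, 1, 0, 1, 0, 0]
```

Encoding the four unit vectors 1000, 0100, 0010 and 0001 in the same way reproduces the parity columns of Table 4.1.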
The Hamming decoding process is as follows:
Step 1: Obtain the syndrome vector by multiplying the code word by the transpose of the Hamming matrix.
Step 2: Find the error location with the help of the syndrome vector.
Step 3: If all the bits of the syndrome vector are 0’s, there is no error in the transmission. If any bit of the syndrome vector is 1, a single bit error has occurred in the transmission. With the help of the syndrome vector, we can locate the error and correct it.
For instance, suppose the code word [1110100] is sent directly to the decoding block. The syndrome vector is then [000], so there is no error in the transmission. If the second bit of the code word is flipped, the code word becomes 1010100. The syndrome vector is now [111], which points to the second bit position. Therefore, there is an error in the transmission. The Hamming code can locate the error and correct it, giving the corrected code word [1110100]. In this way, the Hamming code can detect and correct a single bit error.
As with SEC, Double Adjacent Error Detection (DAED) can also be performed with a Hamming code by extending the parity bits. The extended Hamming codes support the SEC mechanism, but cannot support double adjacent error correction, as shown in Sanchez-Macian [46].
However, as technology scales, it becomes more likely that a single event upsets more than one memory cell or register, causing multiple errors (Ibe, E [18]). This is known as a Multiple Cell Upset (MCU) (Lawrence, R. K [22]). The cells or registers affected by an MCU are physically close and in many cases adjacent, because the errors are created along the path that the radiation particle traverses. MCUs can therefore cause multiple adjacent errors in a given information word, causing a failure even when a SEC code is used. A SEC Hamming code can perform an erroneous correction when two adjacent bits are in error. To overcome this problem, an interleaving method is used in Zhao, J [58] and Baeg, S [8], which places the bits of a word physically apart so that an MCU can affect only one bit per word. However, interleaving makes the design more complex and can increase area and power consumption. Hence, Single Error Correction – Double Adjacent Error Detection (SEC-DAED) is a better solution for this problem. To detect a double adjacent bit error, one more parity bit (i.e. 3 + 1 = 4) is required.
From the above discussion, it is clear that SEC-DAED codes can perform an erroneous correction when three adjacent bits are in error. Hence, we prefer Single Error Correction – Triple Adjacent Error Detection (SEC-TAED) codes, which detect the triple adjacent error. In both SEC-DAED and SEC-TAED Hamming codes, an undetected multi-bit error leads to Silent Data Corruption (SDC), where the system is unaware that an error has occurred and continues its operation.
An algorithm to generate the Hamming code word of a set of information bits is as follows:
Step 1 Number the bit positions from 1 to n, where n is the last bit position.
Step 2 Write all the positions in binary form as 1, 10, 11, 100, 101, 110, 111, 1000, 1001, etc.
Step 3 Treat all bit positions that are powers of two (i.e. that have only one 1 bit in the binary form of their position) as parity bits.
The positions checked by each parity bit are determined as follows:
[1] Parity position 2^0: check 1 bit and skip 1 bit, giving positions 1, 3, 5, 7, 9, ...
[2] Parity position 2^1: check 2 bits and skip 2 bits, giving positions 2, 3, 6, 7, 10, 11, 14, 15, ...
[3] Parity position 2^2: check 4 bits and skip 4 bits, giving positions 4, 5, 6, 7, 12, 13, 14, 15, 20, 21, 22, 23, ...
[4] Parity position 2^3: check 8 bits and skip 8 bits, giving positions 8-15, 24-31, 40-47, ...
[5] Parity position 2^4: check 16 bits and skip 16 bits, giving positions 16-31, 48-63, ...
Finally, set a parity bit to 0 if the total number of ones in the positions it checks is even, and set it to 1 if the total number of ones is odd. In this way we obtain the code word of the Hamming code. Syndrome vectors are used in the Hamming code to detect and correct a single bit error. The product of the lexicographic matrix and the code word of the given information is called the syndrome vector. If the syndrome is a null vector, there is no error in the data transmission. If the value of the syndrome vector lies between 1 and 2^m - 1, it gives the position of the erroneous bit. Therefore, we can flip the erroneous bit. Hamming codes can thus detect and correct a single error effectively.
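The code word generation steps above can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the thesis' Verilog design: the function name `hamming_encode` is hypothetical, positions are numbered from 1, data bits fill the non-power-of-two positions in order, and even parity is used.

```python
def hamming_encode(data_bits):
    """Place data in non-power-of-two positions (1-based) and fill even-parity bits."""
    m = 0
    while (1 << m) < len(data_bits) + m + 1:   # smallest m with 2^m >= k + m + 1
        m += 1
    n = len(data_bits) + m
    code = [0] * (n + 1)                       # index 0 unused; positions 1..n
    data = iter(data_bits)
    for pos in range(1, n + 1):
        if pos & (pos - 1):                    # not a power of two: data position
            code[pos] = next(data)
    for i in range(m):
        p = 1 << i
        # parity bit p covers every position whose binary form has bit i set
        code[p] = sum(code[pos] for pos in range(1, n + 1) if pos & p) % 2
    return code[1:]

print(hamming_encode([1, 0, 1, 1]))               # [0, 1, 1, 0, 0, 1, 1]
print(hamming_encode([0, 1, 0, 1, 0, 1, 0, 0]))   # [0, 0, 0, 0, 1, 0, 1, 1, 0, 1, 0, 0]
```

The first call gives a (7, 4) code word and the second a (12, 8) code word, because the number of parity bits is derived from the data length.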
4.2.1 Single Error Correction (SEC) Hamming Codes
The SEC Hamming code uses the lexicographic Hamming matrix to determine the erroneous bit. The lexicographic matrix is constructed by writing, as its columns, the binary representations of the numbers from 1 up to the block size n (at most 2^m - 1). In general, a Hamming code can be represented as (n, k), where n is the block size and k is the number of information bits.
Consider the (7, 4) Hamming code, where the length of the information word is four and the block size is 7. Therefore, the number of parity bits is three. Hence, the lexicographic matrix of the (7, 4) Hamming code can be written as follows:
$H = \begin{bmatrix} 0&0&0&1&1&1&1 \\ 0&1&1&0&0&1&1 \\ 1&0&1&0&1&0&1 \end{bmatrix}$ (4.13)
Let the four-bit information input be 1011.
Length of code word = Length of information bits + Length of parity bits
Length of code word = 4 + 3
Length of code word = 7.
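The code word can be formulated as follows, with the parity bits at positions 2^0, 2^1 and 2^2:
Position   1        2        3      4        5      6      7
Content    Parity1  Parity2  1      Parity3  0      1      1
Parity 1 = even parity over positions 1, 3, 5, 7 = [Parity1, 1, 0, 1], so Parity1 = 1 XOR 0 XOR 1 = 0.
Parity 2 = even parity over positions 2, 3, 6, 7 = [Parity2, 1, 1, 1], so Parity2 = 1 XOR 1 XOR 1 = 1.
Parity 3 = even parity over positions 4, 5, 6, 7 = [Parity3, 0, 1, 1], so Parity3 = 0 XOR 1 XOR 1 = 0.
Hence, the code word is formed as
Parity1  Parity2  Data1  Parity3  Data2  Data3  Data4
0        1        1      0        0      1      1
The transpose of the lexicographic matrix is used to compute the syndrome vector:
$\text{Syndrome} = H^T = \begin{bmatrix} 0&0&1 \\ 0&1&0 \\ 0&1&1 \\ 1&0&0 \\ 1&0&1 \\ 1&1&0 \\ 1&1&1 \end{bmatrix}$ (4.14)
Multiplication of the Hamming code word by the transpose of the lexicographic matrix gives the syndrome vector position. It is denoted as ‘S’.
$S = [0\ 1\ 1\ 0\ 0\ 1\ 1] \cdot H^T = [0\ 0\ 0]$ (4.15)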
Hence, the syndrome vector is the null vector, and from the value of S it is clear that there is no error in the data transmission. If an error has occurred in the Hamming code word, the syndrome vector takes a value from 1 to 2^m - 1. For instance, if the code word for the information bits 1011 is received as 0111011, the syndrome vector is [100]. This value is the fourth row of the syndrome matrix (equation 4.14). Hence, we can flip the fourth bit of the code word to correct the single error. In this way, SEC Hamming codes can effectively detect and correct a single bit error. When errors occur in two adjacent positions, however, this fixed algorithm can miscorrect the code word. For instance, if the code word for the information bits 1011 is received as 0111111, the syndrome vector is [001]. This value is the first row of the syndrome matrix (equation 4.14). But the first bit was transmitted correctly; there is no error at that position. In this way, SEC Hamming codes can miscorrect the code word when double adjacent bit errors occur. This causes the SDC effect, which leads to incorrect system behaviour and further data corruption. To reduce this problem, SEC-DAED Hamming codes have been introduced in the past.
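The single-error correction and the double-adjacent miscorrection described above can be reproduced with a small Python sketch. It relies on the fact that, for a lexicographic matrix, the syndrome read as a binary number equals the XOR of the positions that hold a 1. The helper names (`to_bits`, `syndrome`, `sec_correct`) are hypothetical.

```python
def to_bits(text):
    """Convert a string such as '0110011' into a list of integer bits."""
    return [int(ch) for ch in text]

def syndrome(code):
    """XOR of the 1-based positions holding a 1 (code word times H^T, as a number)."""
    s = 0
    for pos, bit in enumerate(code, start=1):
        if bit:
            s ^= pos
    return s

def sec_correct(code):
    """Plain SEC behaviour: flip the bit addressed by a non-zero syndrome."""
    s = syndrome(code)
    fixed = list(code)
    if 1 <= s <= len(code):
        fixed[s - 1] ^= 1
    return fixed, s

print(sec_correct(to_bits("0110011")))   # syndrome 0: word left unchanged
print(sec_correct(to_bits("0111011")))   # syndrome 4: bit 4 flipped back
print(sec_correct(to_bits("0111111")))   # syndrome 1: bit 1 flipped, word is now wrong
```

In the last case the decoder returns 1111111, which has a zero syndrome and therefore looks valid although it differs from the transmitted word in three positions; this is the silent data corruption that the text refers to.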
4.2.2 Single Error Correction–Double Adjacent Error Detection
(SEC-DAED) Hamming Code
As discussed in the previous section, SEC Hamming codes can miscorrect the code word. To avoid this problem, the Hamming code can be extended by adding one more parity bit, i.e. four parity bits are used to detect the double adjacent error. This code is therefore named the “Extended Hamming Code”.
In order to increase the detection probability of SEC-DAED, two algorithms have been suggested in Sanchez-Macian, Alfonso [46]:
• Bit Shortening Algorithm
• Bit Replacement Algorithm
The bit shortening algorithm for improving the detection probability of SEC-DAED can be explained as follows. For a 16-bit data word (k = 16), the shortening technique is applied to a (31, 26) Hamming code, producing a (21, 16) SEC code. Hence, 10 columns are removed from the original matrix. The procedure of the shortening algorithm for normal Hamming codes is explained below:
• Fill the first k = 16 columns with odd-weight values to maximize double error detection. A double error affecting any two of these columns produces an even-weight syndrome, so it will not correspond to any of these columns.
• Sort those columns so as to minimize the number of different even-weight syndromes. Adjacent errors on these k = 16 columns can produce 15 syndromes. The goal is to maximize the coincidences between these syndrome values.
• The remaining n - k = 5 columns need to be filled with even-weight values. Even so, an adjacent error at the transition between the last odd-weight column and the first even-weight column would produce a miscorrection, because its syndrome corresponds to a different existing odd-weight column. So, a specific odd-weight column is selected and removed from the matrix to provide for the identified odd-weight syndrome.
• In total, 6 columns (5 columns plus the removed one) are filled with even-weight values, placing them in the appropriate order and excluding those which coincide with a previous double-adjacent error syndrome.
On the other hand, the Bit Replacement algorithm is an efficient algorithm with which a single error can be successfully detected and corrected. In addition, the maximum number of double and triple adjacent errors can be detected with the help of the bit replacement algorithm.
In the Hamming code model, it is essential to extend the number of parity bits to detect double and triple adjacent errors successfully. For instance, consider a (12, 8) Hamming code. The (12, 8) Hamming code has four parity bits, so the maximum syndrome value is 15, but there are only 12 positions in the shortened code. If the syndrome vector gives the value 13, 14 or 15, a double or triple adjacent error has been detected. However, it is impossible to correct the detected double or triple adjacent error, because the four-bit syndrome values 1101 (13), 1110 (14) and 1111 (15) do not correspond to any bit position. Hence, the positions of the errors cannot be determined, and any attempt to correct the double or triple adjacent error would miscorrect the code word. The syndrome vector is defined as the product of the transpose of the Hamming matrix and the code word. In the traditional method, the lexicographic matrix is used for finding the code word. The lexicographic matrix is as follows:
$H = \begin{bmatrix} 0&0&0&0&0&0&0&1&1&1&1&1 \\ 0&0&0&1&1&1&1&0&0&0&0&1 \\ 0&1&1&0&0&1&1&0&0&1&1&0 \\ 1&0&1&0&1&0&1&0&1&0&1&0 \end{bmatrix}$ (4.16)
If no bit error occurs in the code word, the syndrome vector that results from the product of the lexicographic matrix and the code word is the four-bit all-zero vector. For instance, for the (12, 8) Hamming code, the data bits (01010100) are coded as (000010110100). The syndrome vector is then (0000), so there is no error in the data transmission. When an error occurs and, for instance, the fifth bit is changed, the code word turns into (000000110100). The product of this vector and the lexicographic check matrix gives the syndrome vector (0101), the binary representation of five. Hence, the fifth bit can be flipped back, and the single error is detected and corrected successfully.
Alternatively, a Hamming code can be used to correct single errors as well as detect double and triple adjacent errors. Traditionally, if the minimum distance between two code words is three, it is not possible to distinguish between single and double errors. For example, returning to the previous example, if there is a double error in positions 3 and 4 of the original word, we get the vector (001110110100). The syndrome in this case is (0111), the binary representation of 7. The code word would therefore be corrected into (001110010100) instead of the right word. This miscorrection is termed Silent Data Corruption (SDC). To maximize the detection probability of double as well as triple adjacent errors, Hamming codes are extended in Sanchez-Macian, Alfonso [46].
This solution increases the minimum distance to four and allows single error correction and double error detection (SEC-DED) simultaneously. To maximize the detection probability of double adjacent errors, the Bit Replacement algorithm was preferred in the previous research work. In the bit replacement algorithm, the bits of the encoder output code word are re-ordered, with the aim of increasing the detection probability of double or triple adjacent errors. Figure 4.3 shows the flow chart of the selective bit placement strategy. The algorithm of Figure 4.3 selects the combinations of double or triple adjacent errors using the MATLAB simulation tool. If the obtained syndrome value is greater than the code length, a double or triple adjacent error is detected.
The bit placement algorithm for detecting double adjacent errors using Hamming (12, 8) is shown in Table 4.2.
Table 4.2 Double Adjacent Error Detection for Hamming (12, 8)
Bit Placement                     Detection
1 2 3 4 5 6 7 8 9 10 11 12        1/11    9%
1 12 2 3 6 8 7 9 4 10 5 11        9/11    82%
Fifteen combinations of double errors have been identified using MATLAB. The combinations are: 1-12, 2-12, 3-12, 4-9, 4-10, 4-11, 5-8, 5-10, 5-11, 6-8, 6-9, 6-11, 7-8, 7-9 and 7-10. In the normal order, only the 7-8 combination is a detectable double adjacent error. Hence, only 9% detection efficiency is obtained with the normal-order Hamming code. With the modified bit placement strategy, 9 combinations serve to detect double adjacent errors. These are 1-12, 2-12, 4-9, 4-10, 5-10, 5-11, 6-8, 7-8 and 7-9. Hence, 82% detection efficiency is achieved with the bit placement algorithm based Hamming code.
[Flow chart: Choose a code word -> Generate next double/triple bit error -> Multiply by Lexicographic Matrix -> Syndrome > code length? (YES: Select Combination) -> Any double/triple error remaining? (YES: generate the next error; NO: Print Selected Combinations) -> Rearrange the Code Bit Positions Manually]
Figure 4.3 Flow chart of Bit Placement Strategy
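The selection step of the flow chart can be sketched in Python rather than MATLAB. The sketch assumes the lexicographic matrix, so that a double error at positions a and b produces the syndrome a XOR b, and it treats a syndrome larger than the code length as detected. The function names are hypothetical.

```python
from itertools import combinations

CODE_LENGTH = 12  # block length of the (12, 8) code

def detectable_pairs(n=CODE_LENGTH):
    """All position pairs whose combined syndrome (XOR of positions) exceeds n."""
    return [(a, b) for a, b in combinations(range(1, n + 1), 2) if (a ^ b) > n]

def adjacent_double_detection(order, n=CODE_LENGTH):
    """Return (detected, total) over physically adjacent pairs of the placement."""
    detected = sum(1 for a, b in zip(order, order[1:]) if (a ^ b) > n)
    return detected, len(order) - 1

normal = list(range(1, 13))
reordered = [1, 12, 2, 3, 6, 8, 7, 9, 4, 10, 5, 11]   # second row of Table 4.2

print(len(detectable_pairs()))               # 15
print(adjacent_double_detection(normal))     # (1, 11)
print(adjacent_double_detection(reordered))  # (9, 11)
```

The two tuples correspond to the 1/11 and 9/11 entries of Table 4.2.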
4.2.3 Single Error Correction – Triple Adjacent Error Detection
(SEC-TAED)
The parity bits can be extended to correct a single error as well as detect triple adjacent errors. Such a Hamming code is referred to as an “Extended Hamming SEC-TAED code”. Similar to the double error combinations, the triple error combinations are determined using MATLAB. There are 49 triple error combinations in the normal bit-order Hamming code. The Hamming (12, 8) SEC-DAED code can miscorrect triple adjacent errors. It therefore requires one more parity bit ‘p’ to perform the SEC and TAED operations. The SEC-TAED process with the (13, 8) Hamming code is described in Sanchez-Macian, Alfonso [47]. The Hamming (13, 8) code for detecting triple adjacent errors is illustrated in Table 4.3.
Table 4.3 Triple Adjacent Error Detection for Hamming (13, 8)
Bit Placement                       Detection
1 2 3 4 5 6 7 8 9 10 11 12 p        1/11    9%
6 8 1 7 11 3 5 9 2 4 p 10 12        9/11    82%
In the normal bit order, only the 10-11-12 combination can detect the triple adjacent error successfully. Hence, only 9% detection efficiency is achieved by the Hamming TAED code. In the bit-reordered code word, 9 combinations of triple adjacent errors (all except 2-4-p and p-10-12) are detected successfully. Hence, 82% detection efficiency is achieved by the modified bit-reordered SEC-TAED process. However, the main disadvantage of the Hamming (13, 8) SEC-TAED code is reduced performance, in terms of higher time and power consumption, due to the additional parity bit (‘p’). To overcome this problem, an enhanced extended Hamming (12, 8) code is developed in the current research work for detecting and correcting a single error as well as detecting triple adjacent errors. It supports the channel decoder part of a MIMO-OFDM system, which can be further extended to MANET based temporary network operations.
4.3 Proposed Extended (12, 8) Hamming Code for SEC-TAED
In the proposed methodology, the bit replacement algorithm is used to change the order of the code word and to maximize the detection probability of triple adjacent errors. A further key consideration of our research is the use of the Hamming (12, 8) code. As discussed earlier, the traditional Hamming (13, 8) SEC-TAED code requires 5 parity bits to detect triple adjacent errors. To avoid this overhead, the Hamming (12, 8) code is used in the current research work to detect the triple adjacent error.
The bit re-ordered format of the (12, 8) Hamming code for detecting triple adjacent errors is shown in Table 4.4. It is the proposed bit-reordered format, which is used to maximize the detection probability of triple adjacent errors.
Table 4.4 Triple Adjacent Error Detection for Proposed Hamming (12, 8)
Bit Placement                     Detection
1 2 3 4 5 6 7 8 9 10 11 12        1/10    10%
7 11 2 6 10 1 4 8 3 5 9 12        9/10    90%
In the normal bit order, only 1 out of 10 combinations helps to detect the triple adjacent error. With the proposed bit re-ordered format, 9 out of 10 combinations help to detect triple adjacent errors. Hence, 90% detection efficiency is achieved by the proposed extended Hamming SEC-TAED code. All adjacent triples of the proposed bit re-ordered format except 5-9-12 are detectable triple error combinations. Hence, it helps to improve the detection probability wherever bit flips occur during data transmission.
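The figures in Table 4.4 can be checked with a short Python sketch. It assumes the lexicographic matrix, so that a triple error at positions a, b and c produces the syndrome a XOR b XOR c, and it counts a triple as detected when the syndrome exceeds the code length of 12. The function name is hypothetical.

```python
def triple_adjacent_report(order, n=12):
    """Split adjacent triples into detected (syndrome > n) and missed ones."""
    detected, missed = [], []
    for a, b, c in zip(order, order[1:], order[2:]):
        s = a ^ b ^ c
        (detected if s > n else missed).append(((a, b, c), s))
    return detected, missed

normal = list(range(1, 13))
proposed = [7, 11, 2, 6, 10, 1, 4, 8, 3, 5, 9, 12]   # second row of Table 4.4

for name, order in (("normal", normal), ("proposed", proposed)):
    detected, missed = triple_adjacent_report(order)
    print(name, len(detected), "of", len(detected) + len(missed))
# normal 1 of 10
# proposed 9 of 10
```

For the proposed ordering the only missed triple is 5-9-12, whose syndrome is zero under this model.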
SUMMARY
Channel Encoder and Channel Decoder are the most essential blocks of the OFDM transceiver architecture. In these blocks, Error Detection and Correction (EDC) codes are used to encode and decode the original data bits. In previous works, Hamming codes were used with the code word in normal order. Such a code detects and corrects a single error perfectly, but it can miscorrect double and triple adjacent errors.
To maximize the detection probability of double and triple adjacent errors, the bit replacement and bit shortening algorithms have been used in the past. In our research work, the Bit Replacement Algorithm is used to improve the detection probability of TAED. Traditionally, the (13, 8) Hamming code is used for the TAED process. Our proposed enhanced extended Hamming SEC-TAED process uses the (12, 8) Hamming code for detecting triple adjacent errors.
The Hamming (13, 8) SEC-TAED code has 82% triple adjacent error detection efficiency with 100% SEC efficiency, whereas the proposed Hamming (12, 8) SEC-TAED code has 90% triple adjacent error detection efficiency with 100% SEC efficiency. Hence, the proposed pipelined PE structures and Reduced Complex Multiplier based R2SDF FFT, together with the enhanced extended Hamming (12, 8) SEC-TAED code, are well suited for MANET temporary network applications.
CHAPTER 5
RESULTS AND DISCUSSIONS
The results obtained at every stage of the research work are discussed in this chapter. The designs of the proposed processing elements (PEs), the pipelined Radix-2 Single path Delay Feedback (R2SDF) FFT and the proposed enhanced extended Hamming codes are simulated and validated using the ModelSim 6.3C Mentor Graphics tool. Verilog HDL is used for the design of the processing elements, R2SDF FFT and extended Hamming codes. Further, the Xilinx 10.1i design tool (Family: Spartan-3, Device: Xc3s50, Package: PQ208, Speed: -5) is used to generate the synthesis report of the proposed design. Low power consumption, low slice and LUT utilization, and high speed and throughput are the main concerns of VLSI system design. The main target of the current research work is reducing the delay and area of the frequency transformation technique involved in the OFDM process. To convert the time domain signal into a frequency domain signal, a Radix-2 Single path Delay Feedback (R2SDF) based pipelined architecture is developed in the current research work.
5.1 Synthesis Result of Pipeline based Processing Element
Structures
To increase the processing speed of MANET, the existing PE structures are modified by adding a register unit at the end of the PE structure. The register unit synchronizes the input and output lines of the PE structure, and is therefore called the “Pipelining Register”. Synchronization among all outputs enables high speed operation in the frequency transformation computation. The synthesis results showing the slice and LUT utilization of the existing Processing Elements (PE1, PE2 and PE3) are shown in Figure 5.1, Figure 5.2 and Figure 5.3 respectively.
Figure 5.1 Synthesis result of PE1 to determine the Slice and LUT
utilization
Figure 5.2 Synthesis result of PE2 to determine the Slice and LUT
utilization
Figure 5.3 Synthesis result of PE3 to determine the Slice and LUT
utilization
Similarly, the synthesis results of Pipeline based Processing
Elements (PE1, PE2 and PE3) are shown in Figure 5.4, Figure 5.5 and
Figure 5.6 respectively.
Figure 5.4 Synthesis result of Pipelined PE1 to determine the Slice and
LUT utilization
Figure 5.5 Synthesis result of Pipelined PE2 to determine the Slice and
LUT utilization
Figure 5.6 Synthesis result of Pipelined PE3 to determine the Slice and LUT utilization
From the above results, it is clear that the pipeline based PE1 processor offers a 2.02% reduction in delay compared with the PE1 processor without pipelining, at the cost of a 2.29% increase in LUTs. The pipeline based PE2 processor offers a 45.45% reduction in hardware slices, a 45.23% reduction in LUTs and a 1.34% reduction in delay compared with the PE2 processor. The pipeline based PE3 processor offers a 2.5% reduction in hardware slices and a 1.62% reduction in delay compared with the PE3 processor. The comparison of area and delay for the proposed pipeline based Processing Elements is shown in Table 5.1. The performance of the proposed pipeline based PE1, PE2 and PE3 processors is graphically illustrated in Figure 5.7.
Table 5.1 Comparison of area and delay for proposed pipeline based Processing Elements
Methods                 Slices   LUT   Delay (ns)
PE1 without Pipeline    67       131   17.633
PE1 with Pipeline       67       134   17.277
PE2 without Pipeline    44       84    17.683
PE2 with Pipeline       24       46    17.446
PE3 without Pipeline    40       76    13.254
PE3 with Pipeline       39       76    13.039
Figure 5.7 Performances of proposed pipelined PE1, PE2 and PE3
processors
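The percentage reductions quoted for the processing elements follow from the entries of Table 5.1. As a worked example, with X denoting a slice count, LUT count or delay (the subscripts are notation introduced here for illustration only):

```latex
\[
\text{Reduction (\%)} = \frac{X_{\text{without}} - X_{\text{with}}}{X_{\text{without}}} \times 100
\]
\[
\text{PE2 slices: } \frac{44 - 24}{44} \times 100 \approx 45.45\%,
\qquad
\text{PE3 delay: } \frac{13.254 - 13.039}{13.254} \times 100 \approx 1.62\%
\]
```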
5.2 Synthesis Result of Reduced Complex Multiplier
The twiddle factor multiplication of the frequency transformation (FFT) process produces the frequency-domain results for the corresponding time-domain inputs. A parallel shifter based multiplier has been used to perform the twiddle factor multiplication. In the current research work, bit parallel multiplication is used to perform the twiddle factor multiplication. Further, the complexity of the bit parallel multiplication has been identified using an equation solving method. The synthesis results of the traditional and proposed reduced complex multipliers are shown in Table 5.2.
Table 5.2 Comparison of area and delay between traditional complex multiplier and proposed reduced complex multiplier
Parameters   Traditional Complex Multiplier   Proposed Reduced Complex Multiplier   Percentage Reduction
Slices       299                              217                                   27.42%
LUT          590                              426                                   27.79%
Delay (ns)   26.716                           25.822                                3.34%
The synthesis result of the proposed reduced complex multiplier structure, showing the slice and LUT utilization, is given in Figure 5.8. Similarly, the synthesis result of the proposed reduced complex multiplier showing the timing report is given in Figure 5.9.
Figure 5.8 Synthesis result of proposed reduced complex multiplier to
determine the Slice and LUT utilization
Figure 5.9 Synthesis result of proposed reduced complex multiplier to
determine the delay consumption
From Figure 5.8, Figure 5.9 and Table 5.2, the proposed reduced complex multiplier offers a 27.42% reduction in hardware slices, a 27.79% reduction in LUTs and a 3.34% reduction in delay compared with the traditional complex multiplier. Further, the proposed pipeline based processing elements and the proposed reduced complex multiplier are incorporated into the pipelined architecture called “Radix-2 Single path Delay Feedback (R2SDF)”.
5.3 Synthesis Result of Pipelined PEs and Reduced Complex
Multiplier based R2SDF FFT
Radix-2 Single path Delay Feedback (R2SDF) FFT is a feedback based frequency transformation process in which time-domain signals are converted into frequency-domain signals with the help of processing element architectures and twiddle factor multiplications. To improve the FFT architecture, pipelined PEs and a Reduced Complex Multiplier (RCM) are proposed in the current research work. The synthesis results of the proposed R2SDF FFT using pipelined PEs and the RCM, showing the slice and LUT utilization and the delay, are given in Figure 5.10 and Figure 5.11 respectively. Further, the performance evaluation of the proposed pipelined R2SDF FFT architecture is shown in Table 5.3. The performance of the proposed and traditional R2SDF FFTs is graphically illustrated in Figure 5.12.
Figure 5.10 Synthesis result of Proposed R2SDF FFT by using
Pipelined PEs and Reduced Complex Multiplier to determine the Slice
and LUT utilization
Figure 5.11 Synthesis result of Proposed R2SDF FFT by using
Pipelined PEs and Reduced Complex Multiplier to determine the delay
consumption
Table 5.3 Comparison of area and delay between traditional and proposed R2SDF FFT architectures
Parameters   Traditional R2SDF FFT   Proposed R2SDF FFT   Percentage Reduction
Slices       616                     576                  6.49%
LUTs         1195                    1122                 6.10%
Delay (ns)   53.341                  53.307               Slightly reduced
Figure 5.12 Performances of proposed and traditional R2SDF FFT
5.4 Simulation Result of Proposed Hamming (12, 8) SEC-TAED
Codes
The channel encoder and decoder parts of the MIMO-OFDM architecture perform the error correction processes. The OFDM channel is mostly affected by Additive White Gaussian Noise (AWGN), under which bit flips in the original information signal can lead to faulty transmission. To overcome this problem, Hamming error detection and correction codes are used in the current research work. A conventional Hamming code detects and corrects a single error perfectly. The proposed work performs the SEC function as well as triple adjacent error detection. The simulation result of the extended Hamming code is shown in Figure 5.13.
The status signal indicates the current state of the codec. While
encoding is in progress, the status is printed as “PROCESSING”. If the
decoder detects no error, the status is printed as “NO ERROR”. If the
decoder detects a single error, it corrects the error at the position
given by the syndrome vector and the status is printed as “SEC”. If the
syndrome vector is 1101, 1110 or 1111, values that exceed the 12-bit
codeword length and therefore cannot point to a single-bit error
position, the status is printed as “TED”. The constant 8-bit input is
taken as 01010100, and the encoded output is 000010110100. If the
encoded data is passed unchanged to the decoder input, the status
signal reads “NO ERROR”. Figure 5.14 shows Hamming (12, 8) error-free
data transmission. For the single-error case, the third bit of the
encoded output is flipped manually, so that the decoder input is
001010110100. Figure 5.15 shows Hamming (12, 8) single error
correction; the status is displayed as “SEC”.
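The status logic described above can be illustrated with a short
software sketch. It assumes the standard positional Hamming (12, 8)
layout, with parity bits at positions 1, 2, 4 and 8 counted from the
left; this assumption reproduces the encoded word 000010110100 quoted
above. The function names are illustrative, and the bit placement used
by the proposed SEC-TAED code is not modelled, so this sketch reports
“TED” only for error patterns whose syndrome happens to fall in the
range 1101 to 1111.

```python
PARITY_POSITIONS = (1, 2, 4, 8)  # assumed standard Hamming layout
CODE_LENGTH = 12


def encode(data_bits: str) -> str:
    """Encode 8 data bits into a 12-bit codeword (positions 1..12, left to right)."""
    if len(data_bits) != 8 or set(data_bits) - {"0", "1"}:
        raise ValueError("expected a string of 8 binary digits")
    code = [0] * (CODE_LENGTH + 1)  # index 0 is unused
    data = iter(int(b) for b in data_bits)
    for pos in range(1, CODE_LENGTH + 1):
        if pos not in PARITY_POSITIONS:
            code[pos] = next(data)
    for p in PARITY_POSITIONS:
        # Even parity over every position whose index has bit p set.
        code[p] = sum(code[pos] for pos in range(1, CODE_LENGTH + 1) if pos & p) % 2
    return "".join(str(b) for b in code[1:])


def syndrome(word: str) -> int:
    """XOR of the positions of all 1 bits; zero for a valid codeword."""
    s = 0
    for pos, bit in enumerate(word, start=1):
        if bit == "1":
            s ^= pos
    return s


def decode(word: str) -> tuple[str, str]:
    """Return (status, word after any correction)."""
    s = syndrome(word)
    if s == 0:
        return "NO ERROR", word
    if s <= CODE_LENGTH:
        bits = list(word)
        bits[s - 1] = "1" if bits[s - 1] == "0" else "0"  # flip the bit at position s
        return "SEC", "".join(bits)
    return "TED", word  # syndrome 13, 14 or 15: not a valid single-error position


if __name__ == "__main__":
    sent = encode("01010100")
    print(sent, decode(sent))
    print(decode("001010110100"))  # third bit flipped, as in Figure 5.15
```

Under the stated assumption, flipping the third bit gives a syndrome of
0011, so the decoder restores the original codeword and reports “SEC”.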
Figure 5.13 Simulation result of proposed hamming (12, 8) SEC-TAED
code
Figure 5.14 Simulation result of hamming (12, 8) SEC-TAED error-less
data transmission: Status displayed as “No error”
Figure 5.15 Simulation result of hamming (12, 8) SEC-TAED code with
single bit flipping: Status displayed as “SEC”
Similarly, for triple adjacent error detection, the status is
displayed as “TED”, as shown in Figure 5.13.
Thus, the proposed R2SDF FFT, built from pipelined processing elements
(PEs) and a reduced complex multiplier, together with the proposed
Hamming (12, 8) SEC-TAED code, is useful for implementing an efficient
MIMO-OFDM system, which can be further extended to MANET-based
temporary network architectures.
CHAPTER 6
CONCLUSION
6.1 Summary of the thesis
The architecture of the FFT was analyzed, and the problems of the
dataflow structures involved in FFT architectures were identified.
OFDM provides communication in Mobile Ad hoc Networks.
An efficient VLSI-based Radix-2 Single-Path Delay Feedback
(R2SDF) FFT is implemented. The Hamming SEC-TAED code is used to
detect and correct bit errors caused by noise.
The speed of the FFT Processing Element (PE) is increased by using a
pipelined PE structure. This offers a 45.45% reduction in hardware
slices, a 45.23% reduction in LUTs and a 1.34% reduction in delay.
The complex multiplication architecture of the FFT has been analyzed
and redesigned in the current research work. The redesigned complex
multiplier, which performs the twiddle factor multiplication, reduces
area and delay. It offers a 27.42% reduction in hardware slices, a
27.79% reduction in LUTs and a 3.34% reduction in delay compared with
the traditional complex multiplier.
The efficiency of the Radix-2 Single-Path Delay Feedback (R2SDF)
FFT is increased by implementing the reduced complex multiplier and
the pipelined PE structure in the R2SDF FFT. This offers a 6.49%
reduction in area (slices) compared with the existing R2SDF FFT
architecture.
A reconfigurable architecture for the complex multiplier is
introduced by using different twiddle multiplication values.
The probability of error detection in SEC-TAED is increased by
using the Bit Replacement algorithm. This provides 8% higher detection
efficiency than the existing SEC-TAED algorithm.
The presented research work can be applied to the design of FFT/IFFT
architectures with lower area and delay for OFDM systems, which suits
wireless communication in MANETs.
6.2 Future work
The same design can be used in Software Defined Radio (SDR) based
wireless data transmission systems.
Software defined radio relies heavily on reconfigurable IFFT/FFT
architectures. The wireless standards used in SDR are sets of Media
Access Control (MAC) and physical layer specifications. The
reconfigurable FFT/IFFT processor is the main tool for generating the
frequency bands these standards require, and hence can be used in SDR
applications in future.
The architecture can also be used for real-time reconfiguration of
4G networks, which are currently used for high-speed digital
transmission.
113
REFERENCES
1. Abdeldime M.S. Abdelgader and Wu Lenan, “The Physical Layer
of the IEEE 802.11p WAVE Communication Standard: The
Specifications and Challenges” Proceedings of the World
Congress on Engineering and Computer Science, Vol. 2, pp: 22-
24, 2014.
2. Abhijit D. Palekar and prashant V. Ingole, “OFDM System Using
FFT and LFFT” International Journal of Advanced Research in
Computer Science and Software Engineering (IJARCSSE), Vol.
3, Issue. 12, pp: 675-679, 2013.
3. Abhishek Mankar, “FPGA Implementation of Fast Fourier
Transform Core Using NEDA”, National Institute of Technology, A
Thesis submitted on 2013.
4. Ankur O. Bang and Prabhakar L. Ramteke, “MANET: History,
Challenges And Applications” International Journal of Application
or Innovation in Engineering & Management (IJAIEM), Vol. 2,
Issue. 9, pp: 249-251, 2013.
5. Anwar Bhasha Pattan and M. Madhavi Latha, “Fast Fourier
Transform Architectures: A Survey and State of the Art”
International Journal of Electronics and Communication
Technology, Vol. 5, No. 4, PP: 94-98, 2014.
114
6. Archana Fande and Anil Sahu, “Efficient Implementation &
Comparison of Signed Complex Multiplier on FPGA using FFT
Algorithm” International Journal of Scientific Research
Engineering & Technology (IJSRET), Vol. 3, Issue. 2, pp: 188-
191, 2014.
7. Ayinala, M., Lao, Y., & Parhi, K. K., “An In-Place FFT Architecture
for Real-Valued Signals”. IEEE Trans. on Circuits and Systems,
Vol. 60, Issue. 10, pp: 652-656, 2013.
8. Baeg, S., Wen, S., & Wong, R. “SRAM interleaving distance
selection with a soft error failure model” Nuclear Science, IEEE
Transactions on Vol. 56, Issue. 4, pp: 2111-2118. 2009,
9. Berkeman, A., Owall, V., & Torkelson, M, “A low logic depth
complex multiplier”. In Solid-State Circuits Conference,
ESSCIRC'98. Proceedings of the 24th European (pp. 204-207).
IEEE, 1998.
10. Berkeman, A., Owall, V., & Torkelson, M, “A low logic depth
complex multiplier using distributed arithmetic”. IEEE Journal of
Solid State Circuits, Vol. 35, Issue. 4, pp: 656-659, 2000.
11. Cha.S., & Yoon, H, “Efficient Implementation of Single Error
Correction and Double Error Detection Code with Check Bit Pre
computation for Memories”, Journal of Semiconductor
115
Technology and Science (JSTS), Vol. 12, Issue. 4, pp: 418-425,
2012.
12. Cui, Y., Lou, M., Xiao, J., Zhang, X., Shi, S., & Lu, P, “Research
and implementation of SEC-DED Hamming code algorithm”. In
TENCON 2013-2013 IEEE Region 10 Conference (31194), pp. 1-
5, 2013.
13. Del Mundo, C., Adhinarayanan, V., & Feng, W. C., “Accelerating
fast Fourier transform for wideband channelization”. In
Communications (ICC), IEEE International Conference on (pp.
4776-4780) 2013.
14. Dickson, B. W., & Conti, A. A., “Parallel Extensions to Single-Path
Delay-Feedback FFT Architectures” pp: 1-9, 2014.
15. Dutta, A., & Touba, N. A, “Multiple bit upset tolerant memory
using selective cycle avoidance based SEC-DED-DAEC code”. In
VLSI Test Symposium in IEEE, pp. 349-354, 2007.
16. Gavin Yeung, Mineo Takai, Rajive Bagrodia, Alireza Mehrnia,
Babak Daneshrad, “Detailed OFDM Modeling in Network
Simulation of Mobile Ad Hoc Networks” IEEE proceedings of the
18th workshop on parallel and distributed simulation (PADS’04),
2004.
116
17. Datuk Mohd , Mobile Ad Hoc Network Overview , Asia-Pacific
conference on Applied Electromagnetic Proceedings, December
4-6, 2007.
18. Ibe, E., Taniguchi, H., Yahagi, Y., Shimbo, K. I., & Toba, T.
“Impact of scaling on neutron-induced soft error in SRAMs from a
250 nm to a 22 nm design rule. Electron Devices”, IEEE
Transactions on Vol. 57, Issue. 7, pp: 1527-1538, 2010.
19. Kandhi Srikanth, “Design Radix-4 64-Point Pipeline FFT/IFFT
Processor for Wireless Application” International Journal of
Engineering Inventions (IJEI), Vol. 3, Issue. 2, pp: 67-70, 2013.
20. Kavita Taneja, R.B. Patel “An Overview of Mobile Ad hoc
Networks: Challenges and Future”, CiteSeerX Scientific Literature
Digital Library, The Pennsylvania State University.
21. Kumar, A., Tripathi, U. N., Verma, R. K., & Mishra, M, “64 Point
Radix-4 FFT Butterfly Realization using FPGA” International
Journal of Engineering and Innovative Technology (IJEIT), Vol. 4,
Issue. 4, pp: 57-60, 2014.
22. Lawrence, R. K., & Kelly, A. T. “Single event effect induced
multiple-cell upsets in a commercial 90 nm CMOS digital
technology” Nuclear Science, IEEE Transactions on Vol. 55,
Issue. 6, 3367-3374, 2008.
117
23. Lyon, Douglas A. "The Discrete Fourier Transform, Part 2: Radix
2 FFT." Journal of Object Technology, Vol. 8, No. 5, pp: 21-23,
2009.
24. Mahendra Babu D.S , Vinutha M.R and Uma C, “Design and
Implementation of MIMO-OFDM using Encoding and Decoding
techniques on FPGA” International Journal of Scientific &
Engineering Research (IJSER), Vol. 5, Issue. 6, pp: 939-944,
2014.
25. Makwana, V., & Parmar, N., “Analysis of Performance of Fast
Fourier Transformation of an Audio File” International Journal of
Application or Innovation in Engineering & Management (IJAIEM),
pp: 68-71, Vol. 2, Issue. 11, 2013.
26. Malik Nasereldin Ahmed, Abdul Hanan Abdullah and Satria
Mandala, “A Study On OFDM In Mobile Ad Hoc Network”
International Journal of Advanced Computer Science and
Applications (IJACSA), Vol. 3, No. 6, pp: 16-19, 2012.
27. Manchanda, G., & Chesta Verma, G. G., “Design of
Multidirectional Parity Code Using Hamming Code Technique for
Error Detection and Correction”, Indian Journal of Research (IJR),
Vol. 3, Issue. 5, pp: 79-81, 2014.
28. Manimaran, A., Sudheer, S. K., & Harshan, M. K, “A Novel
Approach in Pipeline Architecture for 64-Point FFT Processor
118
without ROM” International Journal of Advanced Research in
Electrical, Electronics and Instrumentation Engineering
(IJAREEIE), Vol. 3, Special Issue. 3, pp: 95-100, 2014.
29. Maslen, D. K., & Rockmore, D. N. “The Cooley-Tukey FFT and
group theory”. Notices of the AMS, Vol. 48, No. 10, PP: 1151-
1160, 2001.
30. Mehta, U. C., & Sharma, M. S., “VLSI Implementation of 2048
Point FFT/IFFT for Mobile Wi-MAX”. International Journal of
Computer Applications (IJCA), Vol. 65, Issue. 25, 2013.
31. Mohit Kumar and Rashmi Mishra, “An Overview of MANET:
History, Challenges and Applications” Indian Journal of Computer
Science and Engineering (IJCSE), Vol. 3, No. 1, pp: 121-125,
2012.
32. Moose, Paul H. "A technique for orthogonal frequency division
multiplexing frequency offset correction", IEEE Transactions on
communications, Vol. 42, pp: 2908-2914, 1994.
33. Naga Tanuja, K, “Implementation of OFDM Based
Communication System using Novel FFT Processor Architecture”
International Journal of Advanced Research in Computer and
Communication Engineering (IJARCCE), Vol. 3, Issue. 10, pp:
8346-8349, 2014.
119
34. Niladri Mandal and Souragni Ghosh, “A Modified Fast FFT
Algorithm for OFDM Based Future Wireless Communication
System” International Journal of Soft Computing and Engineering
(IJSCE), Vol. 1, Issue. 6, pp: 179-184, 2012.
35. Noman, H. M. F., Fuzail, M., & Arshad, J, “Software-Defined
Radio Architecture for Broadband OFDM Transceivers”,
International Journal of Computer Science and
Telecommunications (IJCST), Vol. 5, Issue. 4, pp: 20-24, 2014.
36. Noorbasha, F., Harikishore Kakarla, S. R. R., Maruthi, G. V.,
Manoj, S. P. U., & Varalakshmi, G, “VLSI Implementation of
Encryption and Decryption System Using Hamming Code
Algorithm” International Journal of Engineering Research and
Applications (IJERA), Vol. 4, Issue. 4, pp: 52-55, 2014.
37. Nutan Shep and P.H. Bhagat, “Implementation of Hamming code
using VLSI” International Journal of Engineering Trends and
Technology (IJETT), Vol. 4, Issue. 3, pp: 186-190, 2013.
38. Paul, S. S., & Baby, S. M., “An Efficient Design of Parallel
Pipelined FFT Architecture” International Journal Of Engineering
and Computer Science (IJECS), Vol. 3, Issue. 10, pp: 8926-8931,
2014.
120
39. Peng, S., & Wang, C. F., “Precorrected-FFT method on graphics
processing units. Antennas and Propagation”, IEEE Transactions
on Vol. 61, Issue. 4, pp: 2099-2107, 2013.
40. Pravin Ghosekar, Girish Katkar and Pradip Ghorpade, “Mobile Ad
Hoc Networking: Imperatives and Challenges” IJCA Special Issue
on “Mobile Ad-hoc Networks” 2010.
41. Quan Yu, Jun Zheng, Tielian Fu, Kejun Wu and Baoxian Zhang,
“Asynchronous Cooperative Transmission Using Distributed
Unitary Space-Frequency Coded OFDM in Mobile Ad Hoc
Networks” Published in IEEE future generation communication
and networking (FGCN), Vol. 2, pp: 291-296, 2007.
42. R. W. Hamming, “Error Detecting and Error Correcting Codes.”
Bell Syst.tech.J., vol. 29, no. 2, pp. 147–160, 1950.
43. Ramesh Bhakthavatchalu et al, “Modified FPGA based Design
and Implementation of Reconfigurable FFT Architecture” Institute
of Electrical and Electronics Engineers conference on PP: 818-
822, 2013.
44. Reddy, K. V. S., & Bala, K., “Implementation of 64-Point
FFT/IFFT By Using Radix-8 Algorithm” International Journal of
Electrical and Electronic Engineering Telecommunication
(IJEEET), Vol. 2, No. 4, pp: 57-61, 2013.
121
45. Salehi, S. A., Amirfattahi, R., & Parhi, K. K., “Pipelined
Architectures for Real-Valued FFT and Hermitian-Symmetric IFFT
With Real Datapaths” Circuits and Systems II: Express Briefs,
IEEE Transactions on, Vol. 60, Issue. 8, pp: 507-511, 2013.
46. Sanchez-Macian, Alfonso, Pedro Reviriego, and Jaun Antonio
Maestro. "Hamming SEC-DAED and extended hamming SEC-
DED-TAED codes through selective shortening and bit
placement." Device and Materials Reliability, IEEE Transactions
on Vol. 14, Issue. 1, pp: 574-576, 2014.
47. Sanchez-Macian, Alfonso, Pedro Reviriego, and Juan Antonio
Maestro. "Enhanced detection of double and triple adjacent errors
in Hamming codes through selective bit placement." Device and
Materials Reliability, IEEE Transactions on Vol. 12, Issue. 2, pp:
357-362, 2012.
48. Satoh, S., Tosaka, Y., & Wender, S. A., “Geometric effect of
multiple-bit soft errors induced by cosmic ray neutrons on
DRAM's”, Electron Device letters, IEEE, Vol. 21, Issue. 6, pp;
310-312, 2000.
49. Sreekanth Yadav, K, Charishma, V and Neelima koppala,
“Design and simulation of 64 point FFT using Radix 4 algorithm
for FPGA Implementation” International Journal of Engineering
122
Trends and Technology (IJETT), Vol. 4, Issue. 2, pp: 109-113,
2013.
50. Sun, Y., Karkooti, M., & Cavallaro, J. R., “High throughput,
parallel, scalable LDPC encoder/decoder architecture for OFDM
systems”, In Design, Applications, Integration and Software, IEEE
Dallas/CAS Workshop on (pp. 39-42), 2006.
51. Sundararajan, M., & Govindaswamy, U, “Multicarrier Spread
Spectrum Modulation Schemes and Efficient FFT Algorithms for
Cognitive Radio Systems”. Electronics, Vol. 3, Issue. 3, 419-443,
2014.
52. Sundari, R. M., Subathra, D., & Dhanalaxmi, M. S., “Enhancing
Multiplier Speed in Fast Fourier Transform Based on Vedic
mathematics”. International Journal of VLSI design &
communication Systems (VLSICS), Vol. 4, Issue. 3, 2013.
53. T. S. Ghouse Basha and L. Suneetha, “Implementation of High
Speed MDC FFT/IFFT Processor for MIMO-OFDM Systems”
International Journal of Advanced Research in Electrical,
Electronics and Instrumentation Engineering (IJAREEIE), Vol. 3,
Issue. 9, pp: 12201-12207, 2014.
54. Takala, J., & Punkka, K., “Butterfly unit supporting radix-4 and
radix-2 FFT”. In Proceedings of The 2005 International TICSP
123
Workshop on Spectral Methods and Multirate Signal Processing,
SMMSP 2005, Riga, Latvia, Vol. 30, pp. 47-54, 2005.
55. Wang, J., & Ronningen, L. A., “An Implementation of Pipelined
Radix-4 FFT Architecture on FPGAs”. Journal of Clean Energy
Technologies (JCET), Vol. 2, Issue. 1, pp: 101-103, 2014.
56. Yang, Hongwei. "A road to future broadband wireless access:
MIMO-OFDM-based air interface." Communications Magazine,
Vol. 43, No. 1, pp: 53-60, 2005.
57. Yu, C., Yen, M. H., Hsiung, P. A., & Chen, S. J., “A low-power 64-
point pipeline FFT/IFFT processor for OFDM applications”.
Consumer Electronics, IEEE Transactions on, Vol. 57, Issue. 1,
pp: 40-40, 2011.
58. Zhao, J., & Shi, Y. “A novel approach to improving burst errors
correction capability of Hamming code”, In Communications,
Circuits and Systems, 2007. ICCCAS 2007. International
Conference on (pp. 1193-1196), 2007.
59. Zhou, B., Peng, Y., & Hwang, D., “Pipeline FFT architectures
optimized for FPGAs”. International Journal of Reconfigurable
Computing, pp: 1-9, 2009.
124
60. Vikaram Patalbasi, Sonali Mote “An Overview of MANET:History,
Challenges and Applications” International Journal of Computer
Science and Engineering, Vol. 3, No .1 Feb-Mar 2012.
61. Convolutional encoding and Viterbi decoding tutorial is linked in
http://ems.eit.unikl.de/fileadmin/user_upload/Appendix_task7_8.p
df.
62. Cyclic prefix tutorial linked in
http://www2.siit.tu.ac.th/prapun/ecs455_2010_2/ECS455%20-
%205%20-%204%20-%20Cyclic%20Prefix.pdf.
63. Error detection and correction codes linked in
http://logos.cs.uic.edu/366/notes/ErrorCorrectionAndDetectionSu
pplement.pdf.
64. Implementation of 16 point radix 2 FFT, tutorial linked in
http://teal.gmu.edu/courses/ECE645/projects_S05/specs/FFT_as
hwin_vamsi.pdf.
65. Introduction to VLSI technology in
http://www.slideshare.net/yayavaram/introduction-to-vlsi-
technology.
66. Twiddle factor tutorial in
http://www.alwayslearn.com/dft%20and%20fft%20tutorial/DFTan
dFFT_FFT_TwiddleFactor.html.
125
67. Uses and advantages of VLSI technology in
http://www.techulator.com/resources/13398-What-is-VLSI-
Technology.aspx.
68. VLSI design metrics in
http://users.ece.utexas.edu/~mcdermot/vlsi-2/Lecture_1.pdf.
69. VLSI design technology in
http://www.engineersgarage.com/articles/vlsi-design-future?
page=2.
70. VLSI Models of Computation in
http://cs.brown.edu/~jes/book/pdfs/ModelsOfComputation_Chapt
er12.pdf.
126
LIST OF PUBLICATIONS
International Journals
1. Chandrika. S, and Rani Hemamalini. R, “A Novel Pipelined
Radix-2 64 Point FFT with Modified Complex Multiplier in OFDM
for Wireless Ad-hoc Networks” International Journal of Applied
Engineering Research (IJAER), ISSN 1087—1090, pp: 19869-
19879, 2014.
2. Chandrika. S, and Rani Hemamalini. R, “Efficient Implementation
of Hamming SEC-TAED Code Algorithm for Data
Communication” International Journal of Innovative Research &
Studies (IJIRS), ISSN 2319 – 9725, pp: 275-289, 2014.