CHAPTER 1
INTRODUCTION
Serial communication is essential to computers and allows them to
communicate with low-speed peripheral devices such as the keyboard, the mouse,
modems, etc. The Universal Asynchronous Receiver and Transmitter (UART) is the
most important component required in serial communication. A UART is an
integrated circuit used for conversion of serial data to parallel and vice
versa. In this project, we study, design and implement a UART and an APB
interface environment for that UART using VHDL.
The UART-APB core is a serial communication controller with a serial
data interface that is intended primarily for embedded systems and ASIC
design. The UART-APB core can be used to interface directly to industry-standard
UARTs. The UART-APB core is intentionally a subset of full UART capability
to make the function cost-effective in a programmable device. In several control
systems, the UART, a kind of serial communication circuit, is widely used. A
universal asynchronous receiver/transmitter (UART) is an integrated circuit which
plays the most important role in serial communication.
The APB interface allows access to the UART-APB internal registers,
FIFO, and internal memory. This interface is synchronous to the clock. The baud
generator creates a divided-down clock enable that correctly paces the transmit
and receive state machines. To transmit data, it is first loaded into the transmit
data buffer in normal mode, or into the transmit FIFO in FIFO mode. The
receive state machine monitors the activity of the RX signal. Once a START bit
is detected, the receive state machine begins to store the data in the receive buffer
in normal mode, or in the receive FIFO in FIFO mode.
Block Diagram:
FIGURE 1.1: BASIC STRUCTURE OF APB UART
1.1 APB:
APB (Advanced Peripheral Bus) is used to connect general-purpose, low-speed,
low-power peripheral devices. The bridge is the peripheral bus master, while
all bus devices (Timer, UART, PIA, etc.) are slaves. APB is a static bus that
provides simple addressing with latched addresses and control signals for easy
interfacing.
APB is optimized for minimal power consumption and reduced interface
complexity to support peripheral functions.
Advanced Peripheral Bus provides the basic peripheral macrocell
communications infrastructure as a secondary bus from the higher-bandwidth
pipelined main system bus. Such peripherals typically have interfaces which are
memory-mapped registers, have no high-bandwidth interfaces and are accessed
under programmed control.
1.2 BAUD RATE GENERATOR:
The Baud Rate Generator is a programmable transmit and receive bit
timing device. Given the programmed value, it generates a periodic pulse, which
determines the baud rate of the UART transmission. This pulse is used by the
receiver and transmitter circuit to generate a sampling pulse for sampling the
received serial data and to determine the bit width of the transmit data.
1.3 UART (Universal asynchronous receiver/transmitter):
A universal asynchronous receiver/transmitter (UART) is a type of
"asynchronous receiver/transmitter", a piece of computer hardware that translates
data between parallel and serial forms.
The Universal Asynchronous Receiver/Transmitter (UART) controller is
the key component of the serial communications subsystem of a computer. The
UART takes bytes of data and transmits the individual bits in a sequential
fashion. At the destination, a second UART re-assembles the bits into complete
bytes.
When transmitting, the UART takes 8 bits of parallel data and converts
the data to a serial bit stream that consists of a start bit (logic 0), 8 data bits (least
significant bit first), and one or more stop bits (logic 1).
CHAPTER 2
SYSTEM ON CHIP BUSSES
A system on a chip or system on chip (SoC or SOC) is an integrated
circuit (IC) that integrates all components of a computer or
other electronic system into a single chip. It may contain digital, analog, mixed-
signal, and often radio-frequency functions all on a single chip substrate.
This technology promises new levels of integration on a single chip,
called the System-on-a-Chip (SoC) design, but also presents significant
challenges to the chip designer. Currently, on-chip interconnection networks are
mostly implemented using buses. For SoC applications, design reuse becomes
easier if standard internal connection buses are used for interconnecting
components of the design.
A heterogeneous SoC might include one or more programmable
components such as general-purpose processor cores, digital signal processor
cores, or application-specific intellectual property (IP) cores, as well as an analog
front end, on-chip memory, I/O devices, and other application-specific circuits.
In other words, a SoC is an IC that implements most or all the functions of a
complete electronic system. On-chip bus organized communication architecture
(CA) is among the top challenges in CMOS SoC technology due to rapidly
increasing operation frequencies and growing chip size. In general, the
performance of the SoC design heavily depends upon the efficiency of its bus
structure. The balance of computation and communication in any application or
task is, of course, known as a fundamental determinant of delivered performance.
Usually, IP cores, as constituents of SoCs, are designed with many different
interfaces and communication protocols. Integrating such cores in a SoC often
requires insertion of suboptimal glue logic. Standards of on-chip bus structures
were developed to avoid this problem. Currently there are a few publicly
available bus architectures from leading manufacturers, such as Core Connect
from IBM, AMBA from ARM, Silicon Backplane from Sonics, and others.
These bus architectures are usually tied to processor architecture, such as the
PowerPC or the ARM processor. Manufacturers provide cores optimized to work
with these bus architectures, thus requiring minimal extra interface logic.
SOME STANDARD BUS ARCHITECTURES OF THE
SYSTEM ON CHIP:
AMBA 2.0, 3.0 (ARM)
Core Connect (IBM)
Sonics Smart Interconnect (Sonics)
ST Bus (STMicroelectronics)
Wishbone (Open cores)
Avalon (Altera)
PI Bus (OMI)
MARBLE (Univ. of Manchester)
Core Frame (Palm Chip)
2.1 AMBA Bus
AMBA (Advanced Microcontroller Bus Architecture) is a bus standard
devised by ARM with the aim of supporting efficient on-chip communications
among ARM processor cores. AMBA is hierarchically organized into two bus
segments, the system bus and the peripheral bus, mutually connected via a
bridge that buffers data and operations between them.
Three distinct buses are defined within the AMBA specification:
The Advanced High-performance Bus (AHB)
The Advanced System Bus (ASB)
The Advanced Peripheral Bus (APB).
2.1.1 Advanced High-performance Bus (AHB)
The AMBA AHB is for high-performance, high clock frequency system
modules. The AHB acts as the high-performance system backbone bus. AHB
supports the efficient connection of processors, on-chip memories and off-chip
external memory interfaces with low-power peripheral macrocell functions. AHB
is also specified to ensure ease of use in an efficient design flow using synthesis
and automated test techniques.
AHB (Advanced High-performance Bus), a later generation of the AMBA
bus, is intended for high-performance, high-clock-frequency synthesizable
designs. It provides a high-bandwidth communication channel between an embedded
processor (ARM, MIPS, AVR, DSP 320xx, 8051, etc.) and high-performance
peripherals or hardware accelerators (ASICs, MPEG, color LCD, etc.), on-chip
SRAM, the external memory interface, and the APB bridge. AHB supports multiple
bus master operation, burst transfers, wide data bus configurations and
non-tristate implementations. The constituents of AHB are the AHB master, slave
and decoder.
2.1.2 Advanced System Bus (ASB)
The AMBA ASB is for high-performance system modules. AMBA ASB
is an alternative system bus suitable for use where the high-performance features
of AHB are not required. ASB also supports the efficient connection of
processors, on-chip memories and off-chip external memory interfaces with
low-power peripheral macrocell functions. ASB (Advanced System Bus) is the first
generation of AMBA system bus, used for simple cost-effective designs that
support burst transfer, pipelined transfer operation, and multiple bus masters.
Characteristics of ASB
High Performance
Pipelined Operation
Burst Transfers
Multiple Bus Masters
2.1.3 Advanced Peripheral Bus (APB)
The AMBA APB is for low-power peripherals. AMBA APB is optimized
for minimal power consumption and reduced interface complexity to support
peripheral functions. APB can be used in conjunction with either version of the
system bus.
APB provides the basic peripheral macrocell communications
infrastructure as a secondary bus from the higher-bandwidth pipelined main
system bus. Such peripherals typically have interfaces which are memory-mapped
registers, have no high-bandwidth interfaces and are accessed under programmed
control. APB (Advanced Peripheral Bus) is used to connect general-purpose,
low-speed, low-power peripheral devices. The bridge is the peripheral bus master,
while all bus devices (Timer, UART, PIA, etc.) are slaves. APB is a static bus
that provides simple addressing with latched addresses and control signals for
easy interfacing.
Characteristics of APB
Low Power
Latched Address and Control
Simple Interface
Suitable for Many Peripherals
CHAPTER 3
AMBA Hierarchy
The processor, on-chip memory and external bus interface all reside on
the high performance system bus. This bus provides a high bandwidth interface
between the elements that are involved in the majority of transfers. Also located
on the high-performance ASB is a bridge to the lower-bandwidth APB, where
most of the peripherals in the system reside.
FIGURE 3.1: A TYPICAL AMBA SYSTEM
An AMBA-based microcontroller typically consists of a high-
performance system backbone bus (AMBA AHB or AMBA ASB), able to
sustain the external memory bandwidth, on which the CPU, on-chip memory and
other Direct Memory Access (DMA) devices reside. This bus provides a high-
bandwidth interface between the elements that are involved in the majority of
transfers. Also located on the high performance bus is a bridge to the lower
bandwidth APB, where most of the peripheral devices in the system are located.
AMBA APB provides the basic peripheral macrocell communications
infrastructure as a secondary bus from the higher-bandwidth pipelined main
system bus. Such peripherals typically:
• have interfaces which are memory-mapped registers
• have no high-bandwidth interfaces
• are accessed under programmed control.
The external memory interface is application-specific and may only have
a narrow data path, but may also support a test access mode which allows the
internal AMBA AHB, ASB and APB modules to be tested in isolation with
system-independent test sets.
The Advanced Peripheral Bus appears as a local secondary bus that is
encapsulated as a single ASB slave device. APB provides a low-power extension
to the system bus which builds on ASB signals directly. The APB bridge appears
as a slave module which handles the bus handshake and control signal retiming on
behalf of the local peripheral bus. By defining the APB interface from the starting
point of the system bus, the benefits of the system diagnostics and test
methodology can be exploited.
A full ASB interface is used for:
Bus masters
On-chip memory blocks
External memory interfaces
High-bandwidth peripherals with FIFO interfaces
DMA slave peripherals
A simple APB interface is recommended for:
Simple register-mapped slave devices
Very low power interfaces where clocks cannot be globally routed
Grouping narrow-bus peripherals to avoid loading the system bus
3.1 Objectives of the AMBA specification:
The AMBA specification has been derived to satisfy four key requirements:
• To facilitate the right-first-time development of embedded
microcontroller products with one or more CPUs or signal processors.
• To be technology-independent and ensure that highly reusable peripheral
and system macrocells can be migrated across a diverse range of IC
processes and be appropriate for full-custom, standard cell and gate array
technologies.
• To encourage modular system design to improve processor independence,
providing a development road-map for advanced cached CPU cores and
the development of peripheral libraries.
• To minimize the silicon infrastructure required to support efficient
on-chip and off-chip communication for both operation and manufacturing
test.
CHAPTER 4
AMBA Advanced Peripheral Bus (APB)
The Advanced Peripheral Bus (APB) is part of the Advanced
Microcontroller Bus Architecture (AMBA) hierarchy of buses and is optimized
for minimal power consumption and reduced interface complexity.
The AMBA APB should be used to interface to any peripherals which are low
bandwidth and do not require the high performance of a pipelined bus interface.
The latest revision of the APB ensures that all signal transitions are only related
to the rising edge of the clock. This improvement means the APB peripherals can
be integrated easily into any design flow, with the following advantages:
• Performance is improved at high-frequency operation.
• Performance is independent of the mark-space ratio of the clock.
• Static timing analysis is simplified by the use of a single clock edge.
• No special considerations are required for automatic test insertion.
• Many Application-Specific Integrated Circuit (ASIC) libraries have a
better selection of rising-edge registers.
• Integration with cycle-based simulators is easy.
These changes to the APB also make it simpler to interface it to the new
Advanced High-performance Bus (AHB).
4.1 APB specification:
The APB specification is described under the following headings:
• Write transfer
• Read transfer
WRITE TRANSFER:
The write transfer starts with the address, write data, write signal and
select signal all changing after the rising edge of the clock. The first clock cycle
of the transfer is called the SETUP cycle. After the following clock edge the
enable signal PENABLE is asserted and this indicates that the ENABLE cycle is
taking place. The address, data and control signals all remain valid throughout the
ENABLE cycle. The transfer completes at the end of this cycle. The enable
signal, PENABLE, will be deasserted at the end of the transfer. The select signal
will also go LOW, unless the transfer is to be immediately followed by another
transfer to the same peripheral. In order to reduce power consumption the address
signal and the write signal will not change after a transfer until the next access
occurs. The protocol only requires a clean transition on the enable signal.
FIGURE 4.1: WRITE TRANSFER DIAGRAM
READ TRANSFER:
The timing of the address, write, select and strobe signals is the same
as for the write transfer. In the case of a read, the slave must provide the data
during the ENABLE cycle. The data is sampled on the rising edge of the clock at
the end of the ENABLE cycle.
FIGURE 4.2: READ TRANSFER DIAGRAM
4.2 APB BRIDGE:
The APB Bridge is the only bus master on the AMBA APB. In addition,
the APB Bridge is also a slave on the higher-level system bus.
FIGURE 4.3: APB BRIDGE INTERFACE DIAGRAM
APB bridge description
The bridge unit converts system bus transfers into APB transfers and
performs the following functions:
Latches the address and holds it valid throughout the transfer.
Decodes the address and generates a peripheral select, PSELx.
Only one select signal can be active during a transfer.
Drives the data onto the APB for a write transfer.
Drives the APB data onto the system bus for a read transfer.
Generates a timing strobe, PENABLE, for the transfer.
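The bridge functions listed above can be sketched as a small state machine.
This is a minimal illustration under stated assumptions, not the bridge used in
this project: the entity name apb_master_sketch, the simplified system-side
ports (req, req_wr, req_addr, req_wdata, rdata, done) and the use of address
bit 12 to choose between two slaves are all hypothetical. It follows the
two-cycle SETUP/ENABLE timing described in section 4.1.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Illustrative APB master (bridge side). The req_* request ports are a
-- simplified stand-in for the system bus and are assumptions of this sketch.
entity apb_master_sketch is
  port (
    pclk      : in  std_logic;
    presetn   : in  std_logic;                      -- active-LOW reset
    req       : in  std_logic;                      -- start a transfer
    req_wr    : in  std_logic;                      -- '1' = write, '0' = read
    req_addr  : in  std_logic_vector(31 downto 0);
    req_wdata : in  std_logic_vector(31 downto 0);
    rdata     : out std_logic_vector(31 downto 0);  -- captured read data
    done      : out std_logic;                      -- one-cycle pulse
    psel      : out std_logic_vector(1 downto 0);   -- PSEL0 and PSEL1
    penable   : out std_logic;
    pwrite    : out std_logic;
    paddr     : out std_logic_vector(31 downto 0);
    pwdata    : out std_logic_vector(31 downto 0);
    prdata    : in  std_logic_vector(31 downto 0)
  );
end entity apb_master_sketch;

architecture rtl of apb_master_sketch is
  type state_t is (ST_IDLE, ST_SETUP, ST_ENABLE);
  signal state : state_t   := ST_IDLE;
  signal wr_q  : std_logic := '0';  -- latched transfer direction
begin
  pwrite <= wr_q;

  process (pclk)
  begin
    if rising_edge(pclk) then
      done <= '0';
      if presetn = '0' then
        state   <= ST_IDLE;
        psel    <= "00";
        penable <= '0';
        wr_q    <= '0';
        paddr   <= (others => '0');
        pwdata  <= (others => '0');
        rdata   <= (others => '0');
      else
        case state is
          when ST_IDLE =>
            if req = '1' then
              -- Latch address, direction and write data; they stay
              -- unchanged until the next access.
              paddr  <= req_addr;
              wr_q   <= req_wr;
              pwdata <= req_wdata;
              -- Decode exactly one select (bit 12 is an assumed map).
              if req_addr(12) = '0' then
                psel <= "01";
              else
                psel <= "10";
              end if;
              state <= ST_SETUP;
            end if;
          when ST_SETUP =>
            penable <= '1';        -- the next cycle is the ENABLE cycle
            state   <= ST_ENABLE;
          when ST_ENABLE =>
            if wr_q = '0' then
              rdata <= prdata;     -- sample read data at end of ENABLE
            end if;
            penable <= '0';
            psel    <= "00";
            done    <= '1';
            state   <= ST_IDLE;
        end case;
      end if;
    end if;
  end process;
end architecture rtl;
```

For simplicity the sketch always returns to idle after each transfer, so PSEL
goes LOW even when another transfer to the same peripheral follows immediately.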
4.3 APB SLAVE:
APB slave description
The APB slave interface is very flexible. For a write transfer the data can be
latched at the following points:
• on either rising edge of PCLK, when PSEL is HIGH
• on the rising edge of PENABLE, when PSEL is HIGH.
The select signal PSELx, the address PADDR and the write signal PWRITE can
be combined to determine which register should be updated by the write
operation. For read transfers the data can be driven onto the data bus when
PWRITE is LOW and both PSELx and PENABLE are HIGH, while PADDR is
used to determine which register should be read.
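As a hedged illustration of the slave behaviour just described, the sketch
below implements two 8-bit registers behind an APB slave interface. The entity
name apb_slave_sketch, the register names reg_ctrl and reg_div, their offsets
(0x0 and 0x4) and the 4-bit address and 8-bit data widths are assumptions made
for the example, not the register map of the UART in this project. Writes are
latched on the rising edge of PCLK during the ENABLE cycle, which is one of the
permitted latch points.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Illustrative APB slave with two hypothetical registers.
entity apb_slave_sketch is
  port (
    pclk    : in  std_logic;
    presetn : in  std_logic;                     -- active-LOW reset
    psel    : in  std_logic;                     -- PSELx for this slave
    penable : in  std_logic;
    pwrite  : in  std_logic;
    paddr   : in  std_logic_vector(3 downto 0);
    pwdata  : in  std_logic_vector(7 downto 0);
    prdata  : out std_logic_vector(7 downto 0)
  );
end entity apb_slave_sketch;

architecture rtl of apb_slave_sketch is
  signal reg_ctrl : std_logic_vector(7 downto 0) := (others => '0');
  signal reg_div  : std_logic_vector(7 downto 0) := (others => '0');
begin
  -- Write: PSEL, PADDR and PWRITE together pick the register to update.
  write_proc : process (pclk)
  begin
    if rising_edge(pclk) then
      if presetn = '0' then
        reg_ctrl <= (others => '0');
        reg_div  <= (others => '0');
      elsif psel = '1' and penable = '1' and pwrite = '1' then
        case paddr is
          when "0000" => reg_ctrl <= pwdata;
          when "0100" => reg_div  <= pwdata;
          when others => null;  -- unmapped offsets are ignored
        end case;
      end if;
    end if;
  end process write_proc;

  -- Read: drive PRDATA only when PWRITE is LOW and PSEL, PENABLE are HIGH.
  read_proc : process (psel, penable, pwrite, paddr, reg_ctrl, reg_div)
  begin
    prdata <= (others => '0');
    if psel = '1' and penable = '1' and pwrite = '0' then
      case paddr is
        when "0000" => prdata <= reg_ctrl;
        when "0100" => prdata <= reg_div;
        when others => null;
      end case;
    end if;
  end process read_proc;
end architecture rtl;
```

Driving zeros on PRDATA when the slave is not being read lets several slaves be
combined with a simple OR, which is one common non-tristate arrangement.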
AMBA APB signal list
All AMBA APB signals use the single letter P prefix. Some APB signals,
such as the clock, may be connected directly to the system bus equivalent signal.
The table shows the list of AMBA APB signal names, along with a description of
how each of the signals is used.
Name: Description
PCLK: Bus clock. The rising edge of PCLK is used to time all transfers on
the APB.
PRESETn: APB reset. The APB bus reset signal is active LOW and this signal
will normally be connected directly to the system bus reset signal.
PADDR[31:0]: APB address bus. This is the APB address bus, which may be up
to 32 bits wide and is driven by the peripheral bus bridge unit.
PSELx: APB select. A signal from the secondary decoder, within the peripheral
bus bridge unit, to each peripheral bus slave x. This signal indicates that the
slave device is selected and a data transfer is required. There is a PSELx
signal for each bus slave.
PENABLE: APB strobe. This strobe signal is used to time all accesses on the
peripheral bus. The enable signal is used to indicate the second cycle of an APB
transfer. The rising edge of PENABLE occurs in the middle of the APB transfer.
PWRITE: APB transfer direction. When HIGH this signal indicates an APB
write access and when LOW a read access.
PRDATA: APB read data bus. The read data bus is driven by the selected slave
during read cycles (when PWRITE is LOW). The read data bus can be up to
32 bits wide.
PWDATA: APB write data bus. The write data bus is driven by the peripheral bus
bridge unit during write cycles (when PWRITE is HIGH). The write data bus
can be up to 32 bits wide.
APB slaves have a simple, yet flexible, interface. The exact implementation of
the interface will be dependent on the design style employed and many different
options are possible.
FIGURE 4.4: APB SLAVE INTERFACE DIAGRAM
CHAPTER 5
UART (Universal asynchronous receiver/transmitter)
Block Diagram:
FIGURE 5.1: UART (Universal asynchronous receiver/transmitter)
A universal asynchronous receiver/transmitter (usually abbreviated
UART) is a type of "asynchronous receiver/transmitter", a piece
of computer hardware that translates data between parallel and serial forms.
The Universal Asynchronous Receiver/Transmitter (UART) controller is
the key component of the serial communications subsystem of a computer. The
UART takes bytes of data and transmits the individual bits in a sequential
fashion. At the destination, a second UART re-assembles the bits into complete
bytes.
When transmitting, the UART takes 8 bits of parallel data and converts the data
to a serial bit stream that consists of a start bit (logic 0), 8 data bits (least
significant bit first), and one or more stop bits (logic 1).
5.1 UART (Universal Asynchronous Receiver and Transmitter)
A UART is usually an individual integrated circuit (or part of one) used for
serial communications over a computer or peripheral device serial port. UARTs
are now commonly included in microcontrollers. A dual UART, or DUART,
combines two UARTs into a single chip. Many modern ICs now come with a
UART that can also communicate synchronously; these devices are called USARTs
(universal synchronous/asynchronous receiver/transmitters). The Universal
Asynchronous Receiver/Transmitter (UART) takes bytes of data and transmits the
individual bits in a sequential fashion. At the destination, a second UART
re-assembles the bits into complete bytes. Each UART contains a shift register,
which is the fundamental method of conversion between serial and parallel forms.
Serial transmission of digital information (bits) through a single wire or other
medium is much more cost effective than parallel transmission through
multiple wires.
FIGURE 5.2: SERIAL DATA TRANSMISSION
The figure above shows the standard format for serial transmission. Since there
is no clock (clk) line, data D is transmitted asynchronously, one byte at a time.
FIGURE 5.3: STRUCTURE OF UART
When no data is being transmitted, D remains high.
To mark the start bit, a low bit is transmitted (D goes low).
Next, 8 bits are transmitted, least significant bit first.
When text is being transmitted, ASCII code is usually used. In ASCII
code each character is represented by 7 bits, and the 8th bit is the parity bit.
After the 8 bits are transmitted, D must go high for at least one bit time,
indicating that the character is complete.
Then another character can be transmitted at any time.
5.2 SERIAL DATA FORMAT
When transmitting, the UART takes 8 bits of parallel data and converts
the data to a serial bit stream that consists of a start bit (logic 0), 8 data bits (least
significant bit first), and one or more stop bits (logic 1).
FIGURE 5.4: STANDARD SERIAL DATA FORMAT
Serial transmission is commonly used with modems and for non-networked
communication between computers, terminals and other devices. Asynchronous
transmission allows data to be transmitted without the sender having to send a
clock signal to the receiver. Instead, the sender and receiver must agree on
timing parameters in advance, and special bits are added to each word which are
used to synchronize the sending and receiving units.
In asynchronous transmission, teletype-style UARTs send a "start" bit,
five to eight data bits, least-significant-bit first, an optional "parity" bit, and then
one, one and a half, or two "stop" bits. The start bit is the opposite polarity of the
data-line's idle state. The stop bit is the data-line's idle state, and provides a delay
before the next character can start. (This is called asynchronous start-stop
transmission). In mechanical teletypes, the "stop" bit was often stretched to two
bit times to give the mechanism more time to finish printing a character. A
stretched "stop" bit also helps resynchronization.
5.3 Design of UART:
The structure of the UART, shown in figure 5.3, consists of a transmitter
part and a receiver part. More precisely, it consists of three units: the
transmitter circuit, the receiver circuit and the Control/Status Registers.
FIGURE 5.5: STRUCTURE OF UART BLOCK
5.3.1 Design of UART Transmitter
The block diagram of the UART transmitter is shown in figure 5.6. The
data is loaded from the Data Bus into the TBR (Transmit Buffer Register) and
from the TBR to the TSR (Transmit Shift Register), based on the control and
status signals produced by the Control unit. The size of the TSR is chosen so
that it accommodates the START and STOP bits along with the data bits which
are loaded from the Data Bus.
Transmission operation is simpler since it is under the control of the transmitting
system. As soon as data is deposited in the shift register after completion of the
previous character, the UART hardware generates a start bit, shifts the required
number of data bits out to the line, generates and appends the parity bit (if used),
and appends the stop bits. Since transmission of a single character may take a
long time relative to CPU speeds, the UART will maintain a flag showing busy
status so that the host system does not deposit a new character for transmission
until the previous one has been completed; this may also be done with an
interrupt. Since full-duplex operation requires characters to be sent and received
at the same time, practical UARTs use two different shift registers for transmitted
characters and received characters.
FIGURE 5.6: UART TRANSMITTER UNIT
The data loaded into the TSR has the START-DATA-STOP format shown in the
figure, and one bit is sent at a time with reference to the baud clock.
Correspondingly, the TSR is filled with 0's as it shifts, and it contains only
0's after the complete data packet has been transmitted.
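The load-and-shift behaviour described above can be sketched as follows. This
is a minimal illustration, not the transmitter listed later in this report: the
entity name uart_tx_sketch, the port names and the generic CLKS_PER_BIT (clock
cycles per bit time, standing in for the baud clock) are assumptions of the
example, and parity is omitted. The 10-bit TSR holds STOP, 8 data bits and
START, and zeros are shifted in behind the frame as the text describes.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Illustrative transmitter: 1 start bit, 8 data bits (LSB first), 1 stop bit.
entity uart_tx_sketch is
  generic (
    CLKS_PER_BIT : positive := 16  -- assumed clock cycles per bit time
  );
  port (
    clk      : in  std_logic;
    rst_n    : in  std_logic;                     -- active-low reset
    load     : in  std_logic;                     -- copy tbr_data into TSR
    tbr_data : in  std_logic_vector(7 downto 0);  -- Transmit Buffer Register
    txd      : out std_logic;                     -- serial output
    busy     : out std_logic                      -- '1' while a frame is sent
  );
end entity uart_tx_sketch;

architecture rtl of uart_tx_sketch is
  signal tsr       : std_logic_vector(9 downto 0) := (others => '0');
  signal bits_left : unsigned(3 downto 0) := (others => '0');
  signal clk_cnt   : natural range 0 to CLKS_PER_BIT - 1 := 0;
begin
  busy <= '1' when bits_left /= 0 else '0';
  -- The line idles HIGH; while busy it carries the lowest TSR bit.
  txd  <= tsr(0) when bits_left /= 0 else '1';

  process (clk)
  begin
    if rising_edge(clk) then
      if rst_n = '0' then
        tsr       <= (others => '0');
        bits_left <= (others => '0');
        clk_cnt   <= 0;
      elsif bits_left = 0 then
        if load = '1' then
          tsr       <= '1' & tbr_data & '0';  -- STOP, data, START
          bits_left <= to_unsigned(10, 4);
          clk_cnt   <= 0;
        end if;
      elsif clk_cnt = CLKS_PER_BIT - 1 then
        clk_cnt   <= 0;
        tsr       <= '0' & tsr(9 downto 1);   -- shift right, fill with 0
        bits_left <= bits_left - 1;
      else
        clk_cnt <= clk_cnt + 1;
      end if;
    end if;
  end process;
end architecture rtl;
```

A host would wait for busy to be '0' before pulsing load, which corresponds to
the busy-status flag mentioned in the text.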
5.3.2 Design of UART Receiver
The block diagram of the UART receiver is shown in figure 5.7. The
received data is captured using the receive baud clock and loaded into the RSR
(Receive Shift Register), then from the RSR into the RBR (Receive Buffer
Register), and then onto the Data Bus, based on the control and status signals
produced by the Control unit. All operations of the UART hardware are controlled
by a clock signal which runs at a multiple of the data rate. For example, each
data bit may be as long as 16 clock pulses. The receiver tests the state of the
incoming signal on each clock pulse, looking for the beginning of the start bit.
If the apparent start bit lasts at least one-half of the bit time, it is valid
and signals the start of a new character. If not, the spurious pulse is ignored.
After waiting a further bit time, the state of the line is again sampled and the
resulting level clocked into a shift register. After the required number of bit
periods for the character length (5 to 8 bits, typically) have elapsed, the
contents of the shift register are made available (in parallel fashion) to
the receiving system. The UART will set a flag indicating new data is available,
and may also generate a processor interrupt to request that the host processor
transfers the received data. The size of the RSR is chosen so that it
accommodates the START and STOP bits along with the data bits.
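The sampling scheme described above can be sketched with a 16x oversampling
state machine. This is an illustration under assumptions, not the receiver of
this project: the entity name uart_rx_sketch, the tick16 enable input (one
pulse per 1/16 of a bit time, from some baud rate generator) and the port names
are hypothetical. Unlike the RSR described in the text, this sketch keeps only
the 8 data bits in its shift register and tracks the START and STOP bits with
states. Parity and an input synchronizer for rxd are omitted for brevity.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Illustrative receiver: 1 start bit, 8 data bits (LSB first), 1 stop bit.
entity uart_rx_sketch is
  port (
    clk        : in  std_logic;
    rst_n      : in  std_logic;                     -- active-low reset
    tick16     : in  std_logic;                     -- 16x baud enable pulse
    rxd        : in  std_logic;                     -- serial input
    rbr_data   : out std_logic_vector(7 downto 0);  -- Receive Buffer Register
    data_ready : out std_logic;                     -- one-clock pulse
    frame_err  : out std_logic                      -- stop bit was not HIGH
  );
end entity uart_rx_sketch;

architecture rtl of uart_rx_sketch is
  type state_t is (ST_IDLE, ST_START, ST_DATA, ST_STOP);
  signal state   : state_t := ST_IDLE;
  signal sample  : unsigned(3 downto 0) := (others => '0');
  signal bit_idx : unsigned(2 downto 0) := (others => '0');
  signal rsr     : std_logic_vector(7 downto 0) := (others => '0');
begin
  process (clk)
  begin
    if rising_edge(clk) then
      data_ready <= '0';
      if rst_n = '0' then
        state     <= ST_IDLE;
        sample    <= (others => '0');
        bit_idx   <= (others => '0');
        rsr       <= (others => '0');
        rbr_data  <= (others => '0');
        frame_err <= '0';
      elsif tick16 = '1' then
        case state is
          when ST_IDLE =>
            sample <= (others => '0');
            if rxd = '0' then
              state <= ST_START;        -- possible start bit
            end if;
          when ST_START =>
            if rxd = '1' then
              state <= ST_IDLE;         -- spurious pulse, ignore it
            elsif sample = 7 then       -- still LOW at mid start bit
              sample  <= (others => '0');
              bit_idx <= (others => '0');
              state   <= ST_DATA;
            else
              sample <= sample + 1;
            end if;
          when ST_DATA =>
            if sample = 15 then         -- one bit time later: mid data bit
              sample <= (others => '0');
              rsr    <= rxd & rsr(7 downto 1);  -- LSB arrives first
              if bit_idx = 7 then
                state <= ST_STOP;
              else
                bit_idx <= bit_idx + 1;
              end if;
            else
              sample <= sample + 1;
            end if;
          when ST_STOP =>
            if sample = 15 then         -- middle of the stop bit
              rbr_data   <= rsr;
              data_ready <= '1';
              frame_err  <= not rxd;
              state      <= ST_IDLE;
            else
              sample <= sample + 1;
            end if;
        end case;
      end if;
    end if;
  end process;
end architecture rtl;
```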
FIGURE 5.7: UART RECEIVER
5.3.3 SERIAL DATA FORMAT
FIGURE 5.8: SERIAL DATA FORMAT
The start bit is always a 0 (logic low), which is also called a space. The
start bit signals the receiving DTE that a character code is coming. The next five
to eight bits, depending on the code set employed, represent the character. In the
ASCII code set the eighth data bit may be a parity bit. The next one or two bits
are always in the mark (logic high, i.e., '1') condition and called the stop bit(s).
They provide a "rest" interval for the receiving DTE so that it may prepare for the
next character which may be after the stop bit(s). The rest interval was required
by mechanical Teletypes which used a motor driven camshaft to decode each
character. At the end of each character the motor needed time to strike the
character bail (print the character) and reset the camshaft.
In some common types of UART, a small first-in, first-out (FIFO) buffer
memory is inserted between the receiver shift register and the host system
interface. This allows the host processor more time to handle an
interrupt from the UART and prevents loss of received data at high rates.
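A receive FIFO of the kind mentioned above could look like the following
sketch. It is only an illustration: the entity name rx_fifo_sketch, the port
names and the default depth of 16 entries are assumptions, and the FIFO in
this project may be organized differently. The read and write pointers carry
one extra bit so that the full and empty conditions can be told apart.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Illustrative synchronous FIFO for received characters.
entity rx_fifo_sketch is
  generic (
    DEPTH_LOG2 : positive := 4  -- 2**4 = 16 entries (assumed depth)
  );
  port (
    clk     : in  std_logic;
    rst_n   : in  std_logic;                     -- active-low reset
    wr_en   : in  std_logic;                     -- receiver has a character
    wr_data : in  std_logic_vector(7 downto 0);
    rd_en   : in  std_logic;                     -- host takes a character
    rd_data : out std_logic_vector(7 downto 0);  -- oldest stored character
    empty   : out std_logic;
    full    : out std_logic
  );
end entity rx_fifo_sketch;

architecture rtl of rx_fifo_sketch is
  type mem_t is array (0 to 2**DEPTH_LOG2 - 1)
    of std_logic_vector(7 downto 0);
  signal mem      : mem_t := (others => (others => '0'));
  signal wr_ptr   : unsigned(DEPTH_LOG2 downto 0) := (others => '0');
  signal rd_ptr   : unsigned(DEPTH_LOG2 downto 0) := (others => '0');
  signal is_empty : std_logic;
  signal is_full  : std_logic;
begin
  -- Equal pointers mean empty; equal index bits with different wrap bits
  -- mean the writer is exactly one full lap ahead of the reader.
  is_empty <= '1' when wr_ptr = rd_ptr else '0';
  is_full  <= '1' when wr_ptr(DEPTH_LOG2) /= rd_ptr(DEPTH_LOG2) and
                       wr_ptr(DEPTH_LOG2 - 1 downto 0) =
                       rd_ptr(DEPTH_LOG2 - 1 downto 0)
              else '0';
  empty   <= is_empty;
  full    <= is_full;
  rd_data <= mem(to_integer(rd_ptr(DEPTH_LOG2 - 1 downto 0)));

  process (clk)
  begin
    if rising_edge(clk) then
      if rst_n = '0' then
        wr_ptr <= (others => '0');
        rd_ptr <= (others => '0');
      else
        if wr_en = '1' and is_full = '0' then
          mem(to_integer(wr_ptr(DEPTH_LOG2 - 1 downto 0))) <= wr_data;
          wr_ptr <= wr_ptr + 1;
        end if;
        if rd_en = '1' and is_empty = '0' then
          rd_ptr <= rd_ptr + 1;
        end if;
      end if;
    end if;
  end process;
end architecture rtl;
```

In this sketch a write attempted while the FIFO is full is simply dropped.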
Transmitting and receiving UARTs must be set for the same bit speed,
character length, parity, and stop bits for proper operation. The receiving UART
may detect some mismatched settings and set a "framing error" flag bit for the
host system; in exceptional cases the receiving UART will produce an erratic
stream of mutilated characters and transfer them to the host system.
Typical serial ports used with personal computers connected to modems
use eight data bits, no parity, and one stop bit; for this configuration the number
of ASCII characters per second equals the bit rate divided by 10.
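The divide-by-10 figure follows from counting the bits in one frame: one start
bit, eight data bits, no parity bit and one stop bit. As a worked check, with
9600 bit/s chosen purely as an illustrative rate:

```latex
\[
\text{characters per second}
  = \frac{\text{bit rate}}{1 + 8 + 0 + 1}
  = \frac{\text{bit rate}}{10},
\qquad
\frac{9600\ \text{bit/s}}{10\ \text{bits per character}} = 960\ \text{characters/s}.
\]
```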
5.3.4 Special Receiver Conditions
Overrun Error:
An "overrun error" occurs when the UART receiver cannot process the
character that just came in before the next one arrives. Various UART devices
have differing amounts of buffer space to hold received characters. The CPU
must service the UART in order to remove characters from the input buffer. If the
CPU does not service the UART quickly enough and the buffer becomes full, an
Overrun Error will occur.
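For the simplest case of a single-character receive buffer, overrun detection
can be sketched as below. The entity name rx_overrun_sketch and all port names
are hypothetical, and the choice to overwrite the old character when an overrun
happens is an assumption of this example; real devices differ on whether the
old or the new character is lost.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Illustrative one-entry receive buffer with a sticky overrun flag.
entity rx_overrun_sketch is
  port (
    clk        : in  std_logic;
    rst_n      : in  std_logic;                     -- active-low reset
    char_done  : in  std_logic;                     -- pulse: character received
    char_data  : in  std_logic_vector(7 downto 0);
    host_read  : in  std_logic;                     -- pulse: host read buffer
    rbr        : out std_logic_vector(7 downto 0);  -- buffered character
    data_ready : out std_logic;                     -- unread character present
    overrun    : out std_logic                      -- sticky until reset
  );
end entity rx_overrun_sketch;

architecture rtl of rx_overrun_sketch is
  signal ready_i : std_logic := '0';
begin
  data_ready <= ready_i;

  process (clk)
  begin
    if rising_edge(clk) then
      if rst_n = '0' then
        ready_i <= '0';
        overrun <= '0';
        rbr     <= (others => '0');
      else
        if host_read = '1' then
          ready_i <= '0';              -- buffer has been serviced
        end if;
        if char_done = '1' then
          -- A new character arrived before the previous one was read.
          if ready_i = '1' and host_read = '0' then
            overrun <= '1';
          end if;
          rbr     <= char_data;        -- assumed policy: overwrite
          ready_i <= '1';
        end if;
      end if;
    end if;
  end process;
end architecture rtl;
```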
Underrun Error:
An "underrun error" occurs when the UART transmitter has completed
sending a character and the transmit buffer is empty. In asynchronous modes this
is treated as an indication that no data remains to be transmitted, rather than an
error, since additional stop bits can be appended. This error indication is
commonly found in USARTs, since an underrun is more serious in synchronous
systems.
Framing Error:
A "framing error" occurs when the designated "start" and "stop" bits are
not valid. As the "start" bit is used to identify the beginning of an incoming
character, it acts as a reference for the remaining bits. If the data line is not in the
expected idle state when the "stop" bit is expected, a Framing Error will occur.
Parity Error:
A "parity error" occurs when the number of "active" bits does not agree
with the specified parity configuration of the UART, producing a Parity Error.
Because the "parity" bit is optional, this error will not occur if parity has been
disabled. Parity error is set when the parity of an incoming data character does
not match the expected value.
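The parity check described above reduces to an XOR over the received bits. The
sketch below is a minimal combinational illustration; the entity name
parity_check_sketch and the port names (odd_parity, parity_en and so on) are
assumptions of the example rather than signals of this project's UART.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Illustrative parity checker for one received 8-bit character.
entity parity_check_sketch is
  port (
    data       : in  std_logic_vector(7 downto 0);  -- received data bits
    parity_bit : in  std_logic;                     -- received parity bit
    odd_parity : in  std_logic;                     -- '1' = odd, '0' = even
    parity_en  : in  std_logic;                     -- '0' disables the check
    parity_err : out std_logic
  );
end entity parity_check_sketch;

architecture rtl of parity_check_sketch is
begin
  process (data, parity_bit, odd_parity, parity_en)
    variable acc : std_logic;
  begin
    -- acc becomes '1' when the data holds an odd number of '1' bits.
    acc := '0';
    for i in data'range loop
      acc := acc xor data(i);
    end loop;
    -- Data plus parity bit must hold an even count of '1' bits for even
    -- parity (XOR = '0') and an odd count for odd parity (XOR = '1').
    if parity_en = '1' and (acc xor parity_bit) /= odd_parity then
      parity_err <= '1';
    else
      parity_err <= '0';
    end if;
  end process;
end architecture rtl;
```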
BAUD RATE GENERATOR
The Baud Rate Generator is a programmable transmit and receive bit
timing device. Given the programmed value, it generates a periodic pulse, which
determines the baud rate of the UART transmission. This pulse is used by the
receiver and transmitter circuit to generate a sampling pulse for sampling the
received serial data and to determine the bit width of the transmit data.
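A programmable pulse generator of this kind can be sketched as a reloadable
counter. This is an illustration only: the entity name baud_gen_sketch, the
16-bit divisor port and the convention that a pulse is produced every
(divisor + 1) clock cycles are assumptions of the example, not the register
definition used in this project.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Illustrative baud tick generator: one-clock-wide pulse every
-- (divisor + 1) cycles of clk.
entity baud_gen_sketch is
  port (
    clk     : in  std_logic;
    rst_n   : in  std_logic;                      -- active-low reset
    divisor : in  std_logic_vector(15 downto 0);  -- programmed value
    tick    : out std_logic                       -- periodic pulse
  );
end entity baud_gen_sketch;

architecture rtl of baud_gen_sketch is
  signal count : unsigned(15 downto 0) := (others => '0');
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if rst_n = '0' then
        count <= (others => '0');
        tick  <= '0';
      elsif count >= unsigned(divisor) then
        -- Using >= keeps the counter from running away if the divisor
        -- is reprogrammed to a smaller value while counting.
        count <= (others => '0');
        tick  <= '1';
      else
        count <= count + 1;
        tick  <= '0';
      end if;
    end if;
  end process;
end architecture rtl;
```

As a usage example under assumed numbers: if the pulse is to run at 16 times
the baud rate, a 50 MHz clock and 9600 baud give 50,000,000 / (16 x 9600), or
about 325.5, so a pulse every 326 cycles and a programmed divisor of 325.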
CHAPTER 6
UART-APB
The UART-APB core is a serial communication controller with a serial
data interface that is intended primarily for embedded systems and ASIC
design. The UART-APB core can be used to interface directly to
industry-standard UARTs. The UART-APB core is intentionally a
subset of full UART capability to make the function cost-effective in a
programmable device. In several control systems, the UART, a kind of serial
communication circuit, is widely used. A universal asynchronous
receiver/transmitter (UART) is an integrated circuit which plays the most
important role in serial communication.
The APB interface allows access to the UART through the APB. The UART
used in an SoC consists of a transmitter, a receiver, and a baud rate generator.
Because the APB is the AMBA peripheral bus used to connect different
peripherals, an APB interface must be designed for the UART.
The APB interface allows access to the UART-APB internal registers,
FIFO, and internal memory. This interface is synchronous to the clock. The baud
generator creates a divided-down clock enable that correctly paces the transmit
and receive state machines. To transmit data, it is first loaded into the transmit
data buffer in normal mode, and into the transmit FIFO in FIFO mode. The
receive state machine monitors the activity of the RX signal.
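Register access over the APB takes two clocked phases: a setup phase with PSEL high and PENABLE low, followed by an access phase with PENABLE high. The following sketch wraps one write transfer in a testbench procedure. The package name apb_bfm_pkg and the procedure name apb_write are hypothetical; the signal names and widths follow the APB interface entity in Chapter 7, and a bus without wait states is assumed.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

-- Hypothetical testbench helper package (illustration only).
package apb_bfm_pkg is
  procedure apb_write(
    constant addr    : in  std_logic_vector(9 downto 0);
    constant wdata   : in  std_logic_vector(31 downto 0);
    signal   PCLK    : in  std_logic;
    signal   PSEL    : out std_logic;
    signal   PENABLE : out std_logic;
    signal   PWRITE  : out std_logic;
    signal   PADDR   : out std_logic_vector(9 downto 0);
    signal   PWDATA  : out std_logic_vector(31 downto 0));
end package apb_bfm_pkg;

package body apb_bfm_pkg is
  procedure apb_write(
    constant addr    : in  std_logic_vector(9 downto 0);
    constant wdata   : in  std_logic_vector(31 downto 0);
    signal   PCLK    : in  std_logic;
    signal   PSEL    : out std_logic;
    signal   PENABLE : out std_logic;
    signal   PWRITE  : out std_logic;
    signal   PADDR   : out std_logic_vector(9 downto 0);
    signal   PWDATA  : out std_logic_vector(31 downto 0)) is
  begin
    -- Setup phase: select the slave and present address, direction and data
    wait until rising_edge(PCLK);
    PSEL    <= '1';
    PENABLE <= '0';
    PWRITE  <= '1';
    PADDR   <= addr;
    PWDATA  <= wdata;
    -- Access phase: assert PENABLE for one clock
    wait until rising_edge(PCLK);
    PENABLE <= '1';
    -- End of transfer: return the bus to idle
    wait until rising_edge(PCLK);
    PSEL    <= '0';
    PENABLE <= '0';
  end procedure apb_write;
end package body apb_bfm_pkg;
```

A testbench process could then call, for example, apb_write(x"00" & "10", x"00000001", PCLK, PSEL, PENABLE, PWRITE, PADDR, PWDATA); to set bit 0 of the control register at address "10", which drives uart_wr in the APB interface code of Chapter 7.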
CHAPTER 7
SOFTWARE CODE
UART TRANSMITTER CODE
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;
entity uart_transmitter is
port(clk, rst_n, wr: in std_logic;
data: in std_logic_vector(7 downto 0);
txrdy: inout std_logic;
tx: out std_logic);
end uart_transmitter;
architecture Behavioral of uart_transmitter is
signal count: integer;
signal tbr: std_logic_vector(7 downto 0);
signal tsr: std_logic_vector(10 downto 0);
signal baud_clk: std_logic;
signal tx_sts: std_logic;
begin
-- This module keeps track of the transmitter status (tx_sts).
-- tx_sts = 1 means the transmitter is busy;
-- tx_sts = 0 means the transmitter is free.
process(clk)
begin
if(clk'event and clk = '1') then
if(rst_n = '0') then
tx_sts <= '0'; --Transmitter is free
elsif(wr = '1' and txrdy = '1') then
tx_sts <= '1'; --Transmitter is busy
elsif(txrdy = '1') then
tx_sts <= '0'; --Transmitter is free
end if;
end if;
end process;
--This module is to load data from dataline to tbr
process(clk)
begin
if(clk'event and clk = '1') then
if(rst_n = '0') then
tbr <= "00000000";
elsif(txrdy = '1') then
-- If transmitter is ready, then we need to load data from dataline to data
-- buffer register
tbr <= data;
end if;
end if;
end process;
-- This module is to generate baud clock
process(clk)
begin
if(clk'event and clk = '1') then
if(rst_n = '0') then
count <= 0;
elsif(tx_sts = '1' and count = 9) then
count <= 0;
elsif(tx_sts = '1') then
count <= count+1;
else
count<=0;
end if;
end if;
end process;
--This module is used, to trigger the baud clock
process(clk)
begin
if(clk'event and clk = '1') then
if(rst_n = '0') then
baud_clk <= '0';
elsif(count = 1) then
baud_clk <= '1';
else
baud_clk <= '0';
end if;
end if;
end process;
-- This module is for shifting bit by bit
process(clk, baud_clk)
begin
if(clk'event and clk = '1') then
if(rst_n = '0') then
tsr <= "00000000000";
txrdy <= '1';
elsif((wr = '1') and (txrdy = '1')) then
-- and ((tbr(0) or tbr(1) or tbr(2) or tbr(3) or tbr(4) or tbr(5) or tbr(6)) = '1')
-- This piece of code loads data from TBR to TSR (with Start, Stop and
-- Parity)
tsr(10) <= '1';
tsr(9) <= (tbr(0) xor tbr(1) xor tbr(2) xor tbr(3) xor tbr(4) xor tbr(5) xor tbr(6)
xor tbr(7));
tsr(8 downto 1) <= tbr;
tsr(0) <= '0';
end if;
if((tsr(0) or tsr(1) or tsr(2) or tsr(3) or tsr(4) or tsr(5)or tsr(6)or tsr(7) or tsr(8)or
tsr(9) or tsr(10)) = '0') then
txrdy <= '1'; --txrdy is 1, when TSR has finished sending data.
else
txrdy <= '0'; --txrdy is 0, when TSR has data to be sent.
end if;
if(txrdy = '0' and baud_clk = '1') then
tx <= tsr(0);
tsr <= '0' & tsr(10 downto 1);
end if;
end if;
end process;
end Behavioral;
TESTBENCH
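-- Stimulus fragment: clk, rst_n, wr and data are testbench signals
-- (clk initialised to '0') connected to a uart_transmitter instance.
clk <= not clk after 10 ns;
rst_n <= '0', '1' after 20 ns;
tb: process
begin
wr <= '1';
data <= "10001001";
wait for 2100 ns;
wr <= '0';
wait; -- stop the stimulus process
end process;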
RECEIVER CODE
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;
entity uart_receiver is
port(clk, rst_n, rd, rx: in std_logic;
parityerr: out std_logic;
rxrdy: inout std_logic;
rsr: inout std_logic_vector(10 downto 0);
rhr: inout std_logic_vector(7 downto 0);
det_rx: inout std_logic;
rd_clk: inout std_logic;
flag: inout std_logic;
data: out std_logic_vector(7 downto 0));
end uart_receiver;
architecture Behavioral of uart_receiver is
signal count, countrx: integer;
signal temp_rhr: std_logic_vector(7 downto 0);
signal rbaud_clk: std_logic;
begin
-- This module is to detect the receiving bit i.e., startbit
process(clk)
begin
if(clk'event and clk = '1') then
if(rst_n = '0') then
det_rx <= '0';
-- On reset, we assume that, det_rx is not activated.
countrx <= 0;
elsif(rx = '0') then
det_rx <= '1';
-- If a start bit is received, the det_rx control signal is enabled.
countrx <= countrx+1;
elsif (flag = '1') then --and (countrx < 100))
det_rx <= '0';
-- If start bit occupies the first bit of RSR i.e., all the bits are received into
-- receiver
end if;
end if;
end process;
-- This module is to keep track of the count, which will be helpful in
-- generating baud clocks
process(clk)
begin
if(clk'event and clk = '1') then
if(rst_n = '0') then
count <= 0;
elsif(det_rx = '1' and count = 9) then
count <= 0;
elsif(det_rx = '1') then
count <= count+1;
else
count <= 0;
end if;
end if;
end process;
---This module is for generation of baud clk
process(clk)
begin
if(clk'event and clk = '1') then
if(rst_n = '0') then
rbaud_clk <= '0';
elsif(count = 1) then
rbaud_clk <= '1';
else
rbaud_clk <= '0';
end if;
end if;
end process;
-- This module receives data from the transmitter line into the receiver,
-- i.e., into RSR
process(clk)
begin
if(clk'event and clk = '1') then
if(rst_n = '0' or flag = '1') then
-- On reset, or once the start bit reaches the first position, RSR is reset
rsr <= "11111111111";
elsif(rbaud_clk = '1') then
-- Receiving bits bit by bit; rbaud_clk is used as a one-clock enable pulse
rsr(9 downto 0) <= rsr(10 downto 1);
rsr(10) <= rx;
end if;
end if;
end process;
--- This module is to assign value to the flag
process(clk, rst_n)
begin
if(rst_n = '0') then
flag <= '0';
elsif(clk'event and clk = '1') then
if(rsr(0) = '0') then
flag <= '1';
elsif(det_rx = '1') then
flag <= '0';
end if;
end if;
end process;
--- This module is to receive data from RSR to RHR
process(clk, rst_n)
begin
if(rst_n = '0') then
rhr <= "11111111";
elsif(clk'event and clk = '1') then
rhr <= rsr(8 downto 1);
end if;
end process;
process(clk, rst_n)
begin
if(rst_n = '0') then
rd_clk <= '0';
elsif(clk'event and clk = '1') then
if(flag = '1') then
rd_clk <= '1';
else
rd_clk <= '0';
end if;
end if;
end process;
-- This module shifts data from RHR to the data line; the transfer is
-- clocked by the rising edge of flag (the rd signal is not used here)
process(flag, rst_n)
begin
if(rst_n = '0') then
data <= "00000000";
elsif(flag'event and flag = '1') then
data <= rhr;
end if;
end process;
---This module is to monitor, whether Receiver is ready or not
process(clk)
begin
if(clk'event and clk = '1') then
if(rst_n = '0') then
rxrdy <= '0';
elsif(flag = '1') then
rxrdy <= '1';
elsif(rd = '1') then
rxrdy <= '0';
end if;
end if;
end process;
---This module is for parity error
process(clk, rsr)
begin
if(clk'event and clk = '1') then
if(rst_n = '0') then
parityerr <= '0';
elsif(rd = '1') then
parityerr <= '0';
elsif(rsr(0) = '0') then
-- Even parity: the XOR of the data bits and the parity bit must be '0'
if((rsr(8) xor rsr(7) xor rsr(6) xor rsr(5) xor rsr(4) xor rsr(3) xor rsr(2) xor rsr(1)
xor rsr(9)) = '0') then
parityerr <= '0';
else
parityerr <= '1';
end if;
end if;
end if;
end process;
end Behavioral;
TESTBENCH
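-- Stimulus fragment: clk, rst_n, rx and rd are testbench signals
-- (clk initialised to '0') connected to a uart_receiver instance.
clk <= not clk after 1 ns;
rst_n <= '0', '1' after 10 ns;
process
begin
rx <= '0';
wait for 35 ns;
rx <= '1';
wait for 35 ns;
rd <= '0'; rx <= '0';
wait for 35 ns;
rd <= '1'; rx <= '1';
wait; -- stop the stimulus process
end process;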
UART INTERFACE CODE
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;
entity uart is
Port(clk,rst,wr,rd,rx:in std_logic;
tx : out std_logic;
din : in STD_LOGIC_VECTOR (7 downto 0);
rdata_out : out STD_LOGIC_vector(7 downto 0);
txrdy, rxrdy : out std_logic);
end uart;
architecture Behavioral of uart is
signal parityerr:std_logic;
signal rxrdy1,txrdy1:std_logic;
component uart_transmitter is
port(clk, rst_n, wr: in std_logic;
data: in std_logic_vector(7 downto 0);
txrdy: inout std_logic;
tx: out std_logic);
end component;
component uart_receiver is
port(clk, rst_n, rd, rx: in std_logic;
parityerr: out std_logic;
rxrdy: inout std_logic;
data: out std_logic_vector(7 downto 0));
end component;
begin
rxrdy<=rxrdy1 when rst='1' else 'Z';
txrdy<=txrdy1 when rst='1' else 'Z';
u4:uart_transmitter port map (
clk => clk,
rst_n=>rst,
wr=>wr,
data=>din,
txrdy=>txrdy1,
tx=>tx);
u5:uart_receiver port map(
clk => clk,
rst_n=>rst,
rd=>rd,
rx=>rx,
parityerr=>parityerr,
rxrdy=>rxrdy1,
data=>rdata_out);
end Behavioral;
TESTBENCH
process
begin
clk <= '1';
wait for 1 ns;
clk <= '0';
wait for 1 ns;
end process;
process
begin
rst <= '0';
wr <= '0';
wait for 100 ns;
rst <= '1';
wait for 15 ns;
din <= "01011011";
rx<='0';
wr <= '1';
wait for 85 ns;
din <= "01001001";
wait for 100 ns;rx<='1';
din <= "01011010";
wait for 100 ns;
din <= "01011110";
wait for 100 ns;rx<='0';
din <= "01111100";
wait for 100 ns;
rx<='1'; rd<='1';
wait;
end process;
APB INTERFACE CODE
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;
entity apb_interface is
port( PCLK:in std_logic;
PRESETn:in std_logic;
PSEL:in std_logic;
PENABLE:in std_logic;
PWRITE:in std_logic;
PWDATA:in std_logic_vector(31 downto 0);
PADDR:in std_logic_vector(9 downto 0);
PRDATA:out std_logic_vector(31 downto 0);
---UART
uart_rx_data:in std_logic_vector(7 downto 0);
uart_tx_data:out std_logic_vector(7 downto 0);
rxrdy,txrdy:in std_logic;
uart_rd,uart_wr:out std_logic);
end apb_interface;
architecture Behavioral of apb_interface is
Constant REG0_ADDR:std_logic_vector(1 downto 0):="00";
Constant REG1_ADDR:std_logic_vector(1 downto 0):="01";
Constant REG2_ADDR:std_logic_vector(1 downto 0):="10";
Constant REG3_ADDR:std_logic_vector(1 downto 0):="11";
signal Reg0:std_logic_vector(31 downto 0):=(others=>'0');
signal Reg1:std_logic_vector(31 downto 0):=(others=>'0');
signal Reg2:std_logic_vector(31 downto 0):=(others=>'0');
signal Reg3:std_logic_vector(31 downto 0):=(others=>'0');
signal Next_Reg0:std_logic_vector(31 downto 0):=(others=>'0');
signal Next_Reg1:std_logic_vector(31 downto 0):=(others=>'0');
signal Next_Reg2:std_logic_vector(31 downto 0):=(others=>'0');
signal Next_Reg3:std_logic_vector(31 downto 0):=(others=>'0');
signal Next_PRDATA:std_logic_vector(31 downto 0):=(others=>'0');
signal iPRDATA:std_logic_vector(31 downto 0):=(others=>'0');
-- Read Fill Vector
signal ZeroFill:std_logic_vector(31 downto 0):=(others=>'0');
-- Gated version of PADDR
signal GatedPADDR:std_logic_vector(9 downto 0):=(others=>'0');
-- Internal read enable signal
signal Rden:std_logic;
-- Internal write enable signal
signal Wren:std_logic;
-- Internal PRDATA write enable signal
signal PRDATAEn:std_logic;
-- Internal Write Data Bus, to reduce power consumption
signal PWDATAIn:std_logic_vector(31 downto 0):=(others=>'0');
signal REG0rd:std_logic:='0';
signal REG1rd:std_logic:='0';
signal REG2rd:std_logic:='0';
signal REG3rd:std_logic:='0';
signal REG0wr:std_logic:='0';
signal REG1wr:std_logic:='0';
signal REG2wr:std_logic:='0';
signal REG3wr:std_logic:='0';
begin
PWDATAIn <= PWDATA when ((PSEL='1') and (PWRITE = '1')) else
(others=>'0');
GatedPADDR <= PADDR when (PSEL='1') else (others=>'0');
Wren <= PENABLE and PWRITE and PSEL;
Rden <= PSEL and (not(PWRITE)) and (not(PENABLE));
PRDATAEn <= PSEL and (not(PWRITE)) and (not(PENABLE));
PRDATA <= iPRDATA;
REG0wr <= '1' when ((Wren = '1') and (GatedPADDR(1 downto 0) =
REG0_ADDR)) else '0';
REG1wr <= '1' when ((Wren = '1') and (GatedPADDR(1 downto 0) =
REG1_ADDR)) else '0';
REG2wr <= '1' when ((Wren = '1') and (GatedPADDR(1 downto 0) =
REG2_ADDR)) else '0';
REG3wr <= '1' when ((Wren = '1') and (GatedPADDR(1 downto 0) =
REG3_ADDR)) else '0';
REG0rd <= '1' when ((Rden = '1') and (GatedPADDR(1 downto 0) =
REG0_ADDR)) else '0';
REG1rd <= '1' when ((Rden = '1') and (GatedPADDR(1 downto 0) =
REG1_ADDR)) else '0';
REG2rd <= '1' when ((Rden = '1') and (GatedPADDR(1 downto 0) =
REG2_ADDR)) else '0';
REG3rd <= '1' when ((Rden = '1') and (GatedPADDR(1 downto 0) =
REG3_ADDR)) else '0';
process(PCLK,PRESETn)
begin
if(PRESETn='0')then
iPRDATA <= (others=>'0');
elsif(PCLK'event and PCLK='1')then
iPRDATA <= Next_PRDATA;
end if;
end process;
process(REG0rd,REG1rd,REG2rd,REG3rd,REG0,REG1,REG2,REG3)
begin
if(REG0rd = '1') then
Next_PRDATA <= REG0;
elsif(REG1rd = '1') then
Next_PRDATA <= REG1;
elsif(REG2rd = '1') then
Next_PRDATA <= REG2;
elsif(REG3rd = '1') then
Next_PRDATA <= REG3;
else
Next_PRDATA <= ZeroFill;
end if;
end process;
---Implementation of REG0 register
---UART_Tx_DATA register
process(PWDATAIn,REG0wr,REG0)
begin
if(REG0wr = '1')then
Next_REG0 <= PWDATAIn;
else
Next_REG0 <= REG0;
end if;
end process;
process(PCLK,PRESETn)
begin
if(PRESETn='0')then
REG0 <= (others=>'1');
elsif(PCLK'event and PCLK='1')then
if(txrdy='1')then
uart_tx_data <= Next_REG0(7 downto 0);
else
REG0 <= Next_REG0;
end if;
end if;
end process;
-- Implementation of REG1 register
-- UART RX DATA request generating register
process(PWDATAIn,REG1wr,REG1)
begin
if(REG1wr='1')then
Next_REG1 <= PWDATAIn;
else
Next_REG1 <= REG1;
end if;
end process;
process(PCLK,PRESETn)
begin
if (PRESETn='0')then
REG1 <= (others=>'0');
elsif(PCLK'event and PCLK='1')then
if(rxrdy='1')then
REG1(7 downto 0) <=uart_rx_data;
else
REG1 <= Next_REG1;
end if;
end if;
end process;
---Implementation of REG2 register
---Control Register
process(PWDATAIn,REG2wr,REG2)
begin
if(REG2wr ='1') then
Next_REG2 <= PWDATAIn;
else
Next_REG2 <= REG2;
end if;
end process;
process(PCLK,PRESETn)
begin
if(PRESETn ='0') then
REG2 <= (others=>'0');
elsif(PCLK'event and PCLK='1') then
REG2 <= Next_REG2;
end if;
end process;
uart_wr <= REG2(0);
uart_rd <= REG2(7);
---Implementation of REG3 register
process(PWDATAIn,REG3wr,REG3)
begin
if(REG3wr ='1')then
Next_REG3 <= PWDATAIn;
else
Next_REG3 <= REG3;
end if;
end process;
process(PCLK,PRESETn)
begin
if (PRESETn ='0')then
REG3 <= x"00000200";
elsif(PCLK'event and PCLK='1') then
REG3 <= Next_REG3;
end if;
end process;
end Behavioral;
TESTBENCH
PROCESS
BEGIN
PRESETn<='0', '1' after 10 ns;
PSEL<='1';
PENABLE<='1';
PWRITE<='1';
PADDR<=x"00" & "00";
PWDATA<=x"000000aa";
wait for 40 ns;
txrdy<='1';
PADDR<=x"00" & "10";
PWDATA<=x"00000001";
wait for 100 ns;
PWRITE<='0';
PENABLE<='0';
uart_rx_data<=x"29";
PADDR<=x"00" & "01";
rxrdy<='1','0' after 5 ns;
wait for 100 ns;
wait; -- stop the stimulus process
END PROCESS;
APB UART INTERFACE CODE:
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;
entity apb_uart_top is
port(--APB signals
PCLK:in std_logic;
PRESETn:in std_logic;
PSEL:in std_logic;
PENABLE:in std_logic;
PWRITE:in std_logic;
PWDATA:in std_logic_vector(31 downto 0);
PADDR:in std_logic_vector(9 downto 0);
PRDATA:out std_logic_vector(31 downto 0);
--UART Signals
tx : out std_logic;
rx:in std_logic);
end apb_uart_top;
architecture Behavioral of apb_uart_top is
component apb_interface is
port( PCLK:in std_logic;
PRESETn:in std_logic;
PSEL:in std_logic;
PENABLE:in std_logic;
PWRITE:in std_logic;
PWDATA:in std_logic_vector(31 downto 0);
PADDR:in std_logic_vector(9 downto 0);
PRDATA:out std_logic_vector(31 downto 0);
--- UART
uart_rx_data:in std_logic_vector(7 downto 0);
uart_tx_data:out std_logic_vector(7 downto 0);
rxrdy,txrdy:in std_logic;
uart_rd,uart_wr:out std_logic);
end component;
component uart is
Port(clk,rst,wr,rd,rx:in std_logic;
tx : out std_logic;
din : in STD_LOGIC_VECTOR (7 downto 0);
rdata_out : out STD_LOGIC_vector(7 downto 0);
txrdy, rxrdy : out std_logic);
end component;
Signal uart_rx_data,uart_tx_data:std_logic_vector(7 downto 0):=(others=>'0');
Signal uart_rd,uart_wr:std_logic:='0';
Signal txrdy,rxrdy:std_logic:='0';
begin
APB_IF: apb_interface port map(PCLK => PCLK,
PRESETn => PRESETn,
PSEL => PSEL,
PENABLE => PENABLE,
PWRITE => PWRITE,
PWDATA => PWDATA,
PADDR => PADDR,
PRDATA => PRDATA,
uart_rx_data => uart_rx_data,
uart_tx_data => uart_tx_data,
rxrdy => rxrdy,
txrdy => txrdy,
uart_rd => uart_rd,
uart_wr => uart_wr);
UART_TOP: uart port map(clk => PCLK,
rst => PRESETn,
wr => uart_wr,
rd => uart_rd,
rx => rx,
tx => tx,
din => uart_tx_data,
rdata_out => uart_rx_data,
txrdy => txrdy,
rxrdy => rxrdy);
end Behavioral;
TESTBENCH
PROCESS
BEGIN
PRESETn<='0', '1' after 2 ns;
PSEL<='1';
PENABLE<='1';
PWRITE<='1';
PADDR<=x"00" & "00";
PWDATA<=x"00000007"; -- uart-tx data
rx<='0';
wait for 4 ns;
PADDR<=x"00" & "10"; --to enable uart write
PWDATA<=x"00000001";
rx<='0';
wait for 4 ns;
PADDR<=x"00" & "10";
PWDATA<=x"00000000"; --to disable uart write
rx<='0';
wait for 10 ns;
rx<='0';
wait for 60 ns;
rx<='1';
wait for 60 ns;
rx<='0';
wait for 60 ns;
rx<='1';
PADDR<=x"00" & "10";
PWDATA<=x"00000080"; --to enable uart read
wait for 10 ns;
PADDR<=x"00" & "10";
PWDATA<=x"00000000"; --to disable uart read
wait for 2 ns;
PADDR<=x"00" & "01"; --to read from Reg1 i.e uart rx data
PSEL<='1';
PENABLE<='0';
PWRITE<='0';
wait; -- stop the stimulus process
END PROCESS;
CHAPTER 8
RESULT
UART TRANSMITTER TEST BENCH
UART RECEIVER TEST BENCH
UART INTERFACE
APB INTERFACE
APB UART TOP MODULE WAVEFORM
CHAPTER 9
TOOLS AND HDL USED
Tools and HDL Used
We used Xilinx ISE 9 for simulation and synthesis. We implemented the
prescribed design in VHDL, a widely used industry and IEEE-standard HDL.
Brief History:
VHDL was developed in the early 1980s, funded by the U.S. Department of
Defense, for managing design problems that involved large circuits and multiple
teams of engineers. The first publicly available version was released in 1985. In
1986 the IEEE (Institute of Electrical and Electronics Engineers, Inc.) was
presented with a proposal to standardize VHDL. Standardization followed in
1987 as IEEE 1076-1987. An improved version of the language was released in
1994 as IEEE Standard 1076-1993.
VHDL
VHDL = VHSIC Hardware Description Language (VHSIC = Very High-Speed
Integrated Circuit)
• Design specification language
• Design entry language
• Design simulation language
• Design documentation language
• An alternative to schematics
HDL
• Interoperability
• Technology independence
• Design reuse
• Several levels of abstraction
• Readability
69
CHAPTER 10
VLSI DESIGN FLOW
10.1 INTRODUCTION
The word digital has made a dramatic impact on our society. More
significant is a continuous trend towards digital solutions in all areas, from
electronic instrumentation, control, data manipulation, signal processing, and
telecommunications to consumer electronics. Development of such solutions has
been possible due to good digital system design and modeling techniques.
10.2 CONVENTIONAL APPROACH TO DIGITAL DESIGN
Digital ICs of SSI and MSI types have become universally standardized
and have been accepted for use. Whenever a designer has to realize a digital
function, he uses a standard set of ICs along with a minimal set of additional
discrete circuitry.
Consider a simple example of realizing a function as
$Q_{n+1} = Q_n + (A \cdot B)$
Here $Q_n$, $A$, and $B$ are Boolean variables, with $Q_n$ being the value of Q at
the nth time step. Here $A \cdot B$ signifies the logical AND of A and B; the '+' symbol
signifies the logical OR of the logic variables on either side. A circuit to realize
the function is shown in Figure 10.1. The circuit can be realized in terms of two
ICs: an A-O-I gate and a flip-flop. It can be directly wired up, tested, and used.
FIGURE 10.1: SIMPLE DIGITAL CIRCUIT
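For comparison with the two-IC realization, the same function can be written as a short HDL description: a flip-flop whose next value is its current value ORed with the AND of A and B. This is a sketch only; the entity name sticky_and, the clock, and the synchronous active-low reset are assumptions added so that the register has a defined starting value.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

-- Hypothetical entity (illustration only): once A and B are both '1'
-- at a clock edge, Q goes to '1' and stays there until reset.
entity sticky_and is
  port(clk, rst_n, a, b : in  std_logic;
       q                : out std_logic);
end sticky_and;

architecture Behavioral of sticky_and is
  signal q_reg : std_logic;  -- holds Q at the current time step
begin
  process(clk)
  begin
    if rising_edge(clk) then
      if rst_n = '0' then
        q_reg <= '0';                     -- assumed reset value
      else
        q_reg <= q_reg or (a and b);      -- next Q = Q or (A and B)
      end if;
    end if;
  end process;
  q <= q_reg;
end Behavioral;
```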
With comparatively larger circuits, the task mostly reduces to one of
identifying the set of ICs necessary for the job and interconnecting; rarely does
one have to resort to a micro level design. The accepted approach to digital
design here is a mix of the top-down and bottom-up approaches as follows:
• Decide the requirements at the system level and translate them to circuit
requirements.
• Identify the major functional blocks required like timer, DMA unit, register file
say as in the design of a processor.
• Whenever a function can be realized using a standard IC, use the same, for
example programmable counter, mux, demux.
• Whenever the above is not possible, form the circuit to carry out the block
functions using standard SSI, for example gates, flip-flops.
• Use additional components like transistor, diode, resistor, capacitor, wherever
essential.
Once the above steps are gone through, a paper design is ready. Starting
with the paper design, one has to do a circuit layout. The physical location of all
the components is tentatively decided; they are interconnected and the 'circuit on
paper' is made ready. Once a paper design is done, a layout is carried out and a
net-list prepared. Based on this, the PCB is fabricated and populated and all the
populated cards tested and debugged. The procedure is shown as a process
flowchart in Figure 10.2.
FIGURE 10.2: SEQUENCE OF STEPS IN CONVENTIONAL ELECTRONIC
CIRCUIT DESIGN
At the debugging stage one may encounter three types of problems:
• Functional mismatch: The realized and expected functions are different. One
may have to go through the relevant functional block carefully and locate any
error logically. Finally the necessary correction has to be carried out in hardware.
• Timing mismatch: The problem can manifest in different forms. One possibility
is due to the signal going through different propagation delays in two paths and
arriving at a point with a timing mismatch. This can cause faulty operation.
Another possibility is a race condition in a circuit involving asynchronous
feedback. This kind of problem may call for elaborate debugging. The preferred
practice is to debug at the level of smaller modules and to ensure that feedback
through larger loops is avoided; it becomes essential to check for the existence of
long asynchronous loops.
• Overload: Some signals may be overloaded to such an extent that the signal
transition may be unduly delayed or even suppressed. The problem manifests as
reflections and erratic behavior in some cases (the signal has to be suitably
buffered here). In fact, overload on a signal can lead to timing mismatches.
The above debugging has to be carried out after the prototype PCB has been
manufactured; it involves cost, time, and a redesign process to develop a
bug-free design.
10.3 VLSI DESIGN
The complexity of the VLSI circuits being designed and used today makes the
manual approach to design impractical. Design automation is the order of the
day. With the rapid technological developments in the last two decades, the status
of VLSI technology is characterized by the following:
• A steady increase in the size and hence the functionality of the ICs.
• A steady reduction in feature size and hence increase in the speed of operation
as well as gate or transistor density.
• A steady improvement in the predictability of circuit behavior.
• A steady increase in the variety and size of software tools for VLSI design.
The above developments have resulted in a proliferation of approaches to VLSI
design. We briefly describe the procedure of the automated design flow; the aim is
mainly to bring out the role of a Hardware Description Language (HDL) in the
design process. An abstraction-based model is the basis of the automated design.
10.3.1 Abstraction Model
The model divides the whole design cycle into various domains. With such
an abstraction through a division process, the design is carried out in different
layers. The designer at one layer can function without bothering about the layers
above or below. The thick horizontal lines separating the layers in Figure 10.3
signify the compartmentalization. As an example, let us consider design at the
gate level. The circuit to be designed would be described in terms of truth tables
and state tables. With these as available inputs, the designer has to express them
as Boolean logic equations and realize them in terms of gates and flip-flops. In
turn, these form the inputs to the layer immediately below.
Compartmentalization of the approach to design in the manner described here is
the essence of abstraction; it is the basis for development and use of CAD tools in
VLSI design at various levels.
The design methods at different levels use the respective aids such as Boolean
equations, truth tables, state transition tables, etc. But the aids play only a small
role in the process. To complete a design, one may have to switch from one tool
to another, raising the issues of tool compatibility and learning new
environments.
10.4 ASIC DESIGN FLOW
As with any other technical activity, development of an ASIC starts with an idea
and takes tangible shape through the stages of development as shown in Figure
10.4 and shown in detail in Figure 10.5. The first step in the process is to expand
the idea in terms of behavior of the target circuit. Through stages of programming,
the same is fully developed into a design description, in terms of well-defined
standard constructs and conventions.
FIGURE 10.3: DESIGN DOMAIN LEVELS OF ABSTRACTION
FIGURE 10.4: MAJOR ACTIVITIES IN ASIC DESIGN
The design is tested through a simulation process; it is to check, verify,
and ensure that what is wanted is what is described. Simulation is carried out
through dedicated tools. With every simulation run, the simulation results are
studied to identify errors in the design description. The errors are corrected and
another simulation run carried out. Simulation and changes to design description
together form a cyclic iterative process, repeated until an error-free design is
evolved.
Design description is an activity independent of the target technology or
manufacturer. It results in a description of the digital circuit. To translate it into a
tangible circuit, one goes through the physical design process. The same
constitutes a set of activities closely linked to the manufacturer and the target
technology.
10.4.1 Design Description
The design is carried out in stages. The process of transforming the idea
into a detailed circuit description in terms of the elementary circuit components
constitutes design description. The final circuit of such an IC can have up to a
billion such components; it is arrived at in a step-by-step manner. The first step in
evolving the design description is to describe the circuit in terms of its behavior.
The description looks like a program in a high level language like C. Once the
behavioral level design description is ready, it is tested extensively with the help
of a simulation tool; it checks and confirms that all the expected functions are
carried out satisfactorily. If necessary, this behavioral level routine is edited,
modified, and rerun – all done manually. Finally, one has a design for the
expected system – described at the behavioral level. The behavioral design forms
the input to the synthesis tools, for circuit synthesis. The behavioral constructs
not supported by the synthesis tools are replaced by data flow and gate level
constructs. To summarize, the designer has to develop synthesizable code for
the design.
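The difference between description levels can be seen on a very small circuit. The sketch below describes a 2-to-1 multiplexer twice, once with a behavioral process and once as a data flow equation. The entity name mux2 and its ports are hypothetical. Both architectures happen to be synthesizable; the example only contrasts the two styles.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

-- Hypothetical 2-to-1 multiplexer (illustration only).
entity mux2 is
  port(a, b, sel : in  std_logic;
       y         : out std_logic);
end mux2;

-- Behavioral level: reads like a sequential program
architecture behavioral of mux2 is
begin
  process(a, b, sel)
  begin
    if sel = '1' then
      y <= b;
    else
      y <= a;
    end if;
  end process;
end behavioral;

-- Data flow level: the same function as a Boolean equation
architecture dataflow of mux2 is
begin
  y <= (a and not sel) or (b and sel);
end dataflow;
```

For sel equal to '0' or '1' the two architectures give the same output; they can differ for other std_logic values of sel, such as 'X', which is one reason each level is simulated again after a rewrite.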
FIGURE 10.5: ASIC DESIGN AND DEVELOPMENT FLOW
The design at the behavioral level is to be elaborated in terms of known
and acknowledged functional blocks. It forms the next detailed level of design
description. Once again the design is to be tested through simulation and
iteratively corrected for errors. The elaboration can be continued one or two steps
further. It leads to a detailed design description in terms of logic gates and
transistor switches.
10.4.2 Optimization
The circuit at the gate level – in terms of the gates and flip-flops – can be
redundant in nature. The same can be minimized with the help of minimization
tools. The step is not shown separately in the figure. The minimized logical
design is converted to a circuit in terms of the switch level cells from standard
libraries provided by the foundries. The cell based design generated by the tool is
the last step in the logical design process; it forms the input to the first level of
physical design.
10.4.3 Simulation
The design descriptions are tested for their functionality at every level –
behavioral, data flow, and gate. One has to check here whether all the functions
are carried out as expected and rectify any errors. All such activities are carried out by
the simulation tool. The tool also has an editor to carry out any corrections to the
source code. Simulation involves testing the design for all its functions,
functional sequences, timing constraints, and specifications. Normally testing and
simulation at all the levels – behavioral to switch level – are carried out by a
single tool; the same is identified as “scope of simulation tool”.
10.4.4 Synthesis
With the availability of design at the gate (switch) level, the logical design
is complete. The corresponding circuit hardware realization is carried out by a
synthesis tool. Two common approaches are as follows:
• The circuit is realized through an FPGA. The gate level design description is the
starting point for the synthesis here. The FPGA vendors provide an interface to
the synthesis tool. Through the interface the gate level design is realized as a final
circuit. With many synthesis tools, one can directly use the design description at
the data flow level itself to realize the final circuit through an FPGA. The FPGA
route is attractive for limited volume production or a fast development cycle.
• The circuit is realized as an ASIC. A typical ASIC vendor will have his own
library of basic components like elementary gates and flip-flops. Eventually the
circuit is to be realized by selecting such components and interconnecting them
conforming to the required design. This constitutes the physical design. Being an
elaborate and costly process, a physical design may call for an intermediate
functional verification through the FPGA route. The circuit realized through the
FPGA is tested as a prototype. It provides another opportunity for testing the
design closer to the final circuit.
10.4.5 Physical Design
A fully tested and error-free design at the switch level can be the starting
point for a physical design [Baker & Boyce, Wolf]. It is to be realized as the final
circuit using (typically) a million components in the foundry’s library. The step-
by-step activities in the process are described briefly as follows:
• System partitioning: The design is partitioned into convenient compartments
or functional blocks. Often it would have been done at an earlier stage itself and
the software design prepared in terms of such blocks. Interconnection of the
blocks is part of the partition process.
• Floor planning: The positions of the partitioned blocks are planned and the
blocks are arranged accordingly. The procedure is analogous to the planning and
arrangement of domestic furniture in a residence. Blocks with I/O pins are kept
close to the periphery; those which interact frequently or through a large number
of interconnections are kept close together, and so on. Partitioning and floor
planning may have to be carried out and refined iteratively to yield best results.
• Placement: The selected components from the ASIC library are placed in
position on the “silicon floor.” This is done for each of the blocks above.
• Routing: The components placed as described above are interconnected with
the rest of the block. This is done for each of the blocks by suitably routing the
interconnects. Once the routing is complete, the physical design can be taken as
complete. The final mask for the design can be made at this stage and the ASIC
manufactured in the foundry.
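The link between placement and routing can be illustrated with a small Python sketch. It estimates how much interconnect a given placement would require, using the half-perimeter wirelength (HPWL), a common estimate that is assumed here for illustration and is not prescribed by the flow described above. The cell names, coordinates and nets are hypothetical.

```python
# Hypothetical placement: cell name -> (x, y) position on the "silicon floor".
placement = {
    "and1": (0, 0),
    "ff1": (3, 4),
    "or1": (1, 2),
    "buf1": (5, 1),
}

# Hypothetical nets: each net lists the cells it has to interconnect.
nets = {
    "n1": ["and1", "ff1"],
    "n2": ["ff1", "or1", "buf1"],
}


def hpwl(net_cells, positions):
    """Half-perimeter of the bounding box enclosing all cells on one net."""
    xs = [positions[cell][0] for cell in net_cells]
    ys = [positions[cell][1] for cell in net_cells]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))


def total_wirelength(all_nets, positions):
    """Sum of the HPWL estimates over all nets of the block."""
    return sum(hpwl(cells, positions) for cells in all_nets.values())


if __name__ == "__main__":
    print("estimated wirelength:", total_wirelength(nets, placement))
```

A placement with a smaller total estimate generally leaves the router an easier job, which is one reason placement and routing may be repeated.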
10.4.6 Post Layout Simulation
Once the placement and routing are completed, performance figures such as
silicon area, power consumed, and path delays can be computed. An equivalent
circuit can be extracted at the component level and a performance analysis
carried out on it. This constitutes the final stage, called “verification.”
One may have to go through the placement and routing activity once again to
improve performance.
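As an illustration of the path-delay computation, the following Python sketch finds the latest signal arrival time at each node of a small extracted circuit. It is a minimal sketch under stated assumptions: the node names, the delay values (in picoseconds) and the connectivity are hypothetical, and real post-layout tools work from far more detailed extracted data.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical extracted delays in picoseconds (gate plus its output wire).
delay = {"in": 0, "g1": 1200, "g2": 800, "g3": 1500, "out": 300}

# Hypothetical connectivity: node -> set of nodes that drive it.
drivers = {
    "in": set(),
    "g1": {"in"},
    "g2": {"in"},
    "g3": {"g1", "g2"},
    "out": {"g3"},
}


def arrival_times(drivers, delay):
    """Latest arrival time at every node, visiting drivers before loads."""
    arrival = {}
    for node in TopologicalSorter(drivers).static_order():
        latest_input = max((arrival[d] for d in drivers[node]), default=0)
        arrival[node] = latest_input + delay[node]
    return arrival


if __name__ == "__main__":
    times = arrival_times(drivers, delay)
    print("worst-case path delay (ps):", max(times.values()))
```

If the worst-case delay exceeds the timing budget, the placement and routing steps are the ones to revisit.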
10.4.7 Critical Subsystems
The design may have critical subsystems. Their performance may be
crucial to the overall performance; in other words, to improve the system
performance substantially, one may have to design such subsystems afresh. Such
a redesign may involve redefining the basic feature size of the component, the
component design, the placement of components, or the routing, done separately
and specifically for the subsystem. The set of masks used in the foundry may
have to be prepared afresh for the purpose.
10.5 ROLE OF HDL
An HDL provides the framework for the complete logical design of the
ASIC. All the activities coming under the purview of an HDL are shown
enclosed in bold dotted lines in Figure 1.4. Verilog and VHDL are the two most
commonly used HDLs today. Both have constructs with which the design can be
fully described at all the levels. There are additional constructs available to
facilitate setting up of the test bench, spelling out test vectors for them and
“observing” the outputs from the designed unit.
IEEE has brought out Standards for the HDLs, and the software tools conform to
them. Verilog was developed at Gateway Design Automation, which was later
acquired by Cadence Design Systems; Cadence placed it into the public domain in
1990. It was established as a formal IEEE Standard in 1995, and a revised
version was brought out in 2001. However, most of the simulation tools available
today conform only to the 1995 version of the standard.
VHDL, used by a substantial number of VLSI designers today, is the HDL used in
this project for modeling the design.
CHAPTER 11
APPLICATIONS
• The UART-APB core is used in embedded processor applications.
• It is mainly used for full-duplex serial communication.
CHAPTER 12
FUTURE SCOPE
SOC FUTURE SCOPE
The SoCs of the future will:
• have hundreds of hardware blocks,
• have billions of transistors,
• have multiple processors,
• have large wire-to-gate delay ratios,
• handle large amounts of high-speed data, and
• need to support “plug-and-play” IP blocks.
AMBA FUTURE SCOPE
The Advanced Microcontroller Bus Architecture (AMBA) is used as the on-chip
bus in system-on-a-chip (SoC) designs. Since its inception, the scope of AMBA
has gone far beyond microcontroller devices, and AMBA is now widely used on a
range of ASIC and SoC parts, including the applications processors used in
modern portable mobile devices such as smartphones.
CHAPTER 13
CONCLUSION
Complex VLSI IC design has been revolutionized by the widespread
adoption of the SoC paradigm. The benefits of the SoC approach are numerous,
including improvements in system performance, cost, size, power dissipation, and
design turnaround time. Many SoC designs consist of one or more IPs, designed
for a single or narrow set of applications with highly characterizable
communication. As the level of chip integration continues to advance at a fast
pace, the need for efficient interconnects rapidly increases. Currently, on-chip
interconnection networks are mostly implemented using traditional interconnects
such as buses. The wide variety of buses used in SoC designs presents a major
problem for design reuse. A number of companies and standards committees
have attempted to standardize buses and interfaces, with mixed results. In this
work we have discussed some of the issues facing SoC designers in determining
which bus architecture to use in order to provide flexible, high-bandwidth
communication between IPs.
CHAPTER 14
REFERENCES
[1] M. Keating and P. Bricaud, Reuse Methodology Manual for System-on-a-Chip
Designs, 2nd ed. Boston: Kluwer Academic Publishers, 1999.
[2] W. Ho and T. Pinkston, “A design methodology for efficient
application-specific on-chip interconnects,” IEEE Trans. Parallel and
Distributed Systems, vol. 17, no. 2, pp. 174–190, Feb. 2006.
[3] N. Horspool and P. Gorman, The ASIC Handbook. Upper Saddle River, NJ:
Prentice Hall, 2001.
[4] L. Benini and G. De Micheli, “Networks on chips: A new paradigm for
component-based MPSoC design,” in Multiprocessor Systems-on-Chips, A. A.
Jerraya and W. Wolf, Eds. Amsterdam: Elsevier, 2005, pp. 49–80.
[5] CoreConnect bus architecture. IBM Microelectronics. [Online]. Available:
http://www.ibm.com/chips/products/coreconnect