data transformation unit

Upload: shailesh-tendulkar

Post on 04-Apr-2018


  • 7/29/2019 data transformation unit

    1/55

    BACHELOR OF ENGINEERING PROJECT ON

    INTEGRATED DATA TRANSFORMATION UNIT

    Submitted By

    ADISH GULECHHA

    NEHA RASKAR

    SHAILESH TENDULKAR

    IN PARTIAL FULFILLMENT OF THE REQUIREMENT FOR THE DEGREE OF

    BACHELOR OF ENGINEERING

    IN

    ELECTRONICS

    UNDER THE GUIDANCE OF

    PROF. GIRISH GIDAYE

    Department of Electronics Engineering

    Vidyalankar Institute of Technology

    Wadala (E) Mumbai 400 037.

    University of Mumbai

    2011-2012


    CERTIFICATE

    This is to certify that

    ADISH GULECHHA

    NEHA RASKAR

    SHAILESH TENDULKAR

    have successfully completed the project titled

    INTEGRATED DATA TRANSFORMATION UNIT

    IN PARTIAL FULFILLMENT OF THE REQUIREMENT FOR THE DEGREE OF

    BACHELOR OF ENGINEERING

    IN

    ELECTRONICS

    Leading to Bachelor's Degree in Engineering

    2011-2012

    UNDER THE GUIDANCE OF

    PROF. GIRISH GIDAYE

    Signature of Guide Head of Department

    Examiner 1 Examiner 2 Principal

    College Seal


    ACKNOWLEDGEMENT

    First and foremost, we would like to extend our deepest gratitude to our
    project guide, Professor Girish Gidaye, for giving us the opportunity to work
    on new areas of digital system design. Without his continued support and
    interest, this project would not have been the same as presented here.

    Our sincerest appreciation goes out to all those who have contributed directly
    and indirectly to the completion of this project. Of particular mention is
    Professor Shrikant Velankar for his guidance, advice and motivation. His
    constant encouragement, critiques and guidance were key to bringing this
    project to a fruitful completion.

    Our sincere appreciation also extends to all our colleagues and others who
    have provided assistance on various occasions. Their views and tips were
    useful indeed. At the same time, the constant encouragement and
    camaraderie shared among all our friends during our graduate studies has
    been an enriching experience.


    ABSTRACT

    This project covers the design of an Integrated Data Transformation Unit on
    FPGA to increase the effective bandwidth of a channel. The data would be
    received on multiple data channels from host processors (PCs). The data
    streams would first be multiplexed to form a single data stream. This data
    stream would then undergo data compression by Run Length Encoding. Finally,
    the compressed data stream would be encrypted using DES. The encrypted
    stream would be communicated to another FPGA, where the reverse process of
    decrypting, decompressing and de-multiplexing is carried out to retrieve the
    original data channels. For these operations, the data would be received from
    or sent to the host processor (PC) on 4, 8 or 16 channels, and the
    multiplexer / compression / encryption logic as well as the de-multiplexer /
    decompression / decryption logic would be implemented on 2 FPGAs. Hence,
    the concept of a secured high-bandwidth channel is implemented.


    CONTENTS

    1. List of Figures

       List of Tables

    2. Introduction

    3. Review of Literature

       3.1 Verilog

       3.2 FPGA

       3.3 Multiplexer and De-multiplexer Unit

       3.4 Compression and Decompression Unit

       3.5 Encryption and Decryption Unit

    4. Design Hierarchy

    5. Plan of Work

    6. Testing and Results

    7. Discussion of Results

    8. Conclusion

    9. Appendix

    10. References


    List of Figures

    FIGURE 1 STAGES IN VERILOG

    FIGURE 2 SYMBOL OF 4:1 MULTIPLEXER

    FIGURE 3 SYMBOL OF 1:4 DE-MULTIPLEXER

    FIGURE 4 DATA COMPRESSION MODEL

    FIGURE 5 DES ALGORITHM OVERVIEW

    FIGURE 6 KEY SCHEDULING

    FIGURE 7 CALCULATION OF F(R,K)

    FIGURE 8 TRANSMISSION SYSTEM HIERARCHICAL FLOW

    FIGURE 9 RECEIVER SYSTEM HIERARCHICAL FLOW

    FIGURE 10 RTL SCHEMATIC OF THE TRANSMITTER SYSTEM

    FIGURE 11 WAVEFORM FOR TRANSMITTER SYSTEM

    FIGURE 12 RTL SCHEMATIC OF THE RECEIVER SYSTEM

    FIGURE 13 WAVEFORM FOR RECEIVER SYSTEM

    FIGURE 14 HUFFMAN CODER BLOCK

    FIGURE 15 STRUCTURE OF LZSS ALGORITHM USED

    FIGURE 16 RTL SCHEMATIC OF DES ENCRYPTION

    FIGURE 17 RTL SCHEMATIC OF DES DECRYPTION

    FIGURE 18 RTL SCHEMATIC OF COMPLETE TRX SYSTEM

    FIGURE 19 RTL SCHEMATIC OF COMPLETE RX SYSTEM


    List of Tables

    TABLE 1 FUNCTION TABLE OF MUX

    TABLE 2 FUNCTION TABLE OF 1:4 DEMUX

    TABLE 3 PC-1 PERMUTED CHOICE 1

    TABLE 4 PC-2 PERMUTED CHOICE 2

    TABLE 5 IP INITIAL PERMUTATION MATRIX

    TABLE 6 INVERSE INITIAL PERMUTATION MATRIX

    TABLE 7 DEVICE UTILIZATION SUMMARY OF TRX SYSTEM

    TABLE 8 DEVICE UTILIZATION SUMMARY OF RX SYSTEM

    TABLE 9 TIMING REPORT OF TRANSMITTER CORE

    TABLE 10 TIMING REPORT OF SYSTEM CORE


    2. INTRODUCTION

    This project implements the register-transfer-level design of a proprietary
    high-speed data transformation processor core using the Verilog Hardware
    Description Language. In addition, the project offers enhancements aimed at
    improving the portability of the design to any hardware implementation
    technology. The main aim has been to develop a core that processes data in a
    fast and secure manner entirely in hardware. We have made use of the Data
    Encryption Standard (DES) and standard compression techniques such as
    LZ77/LZSS, which operate on the input data stream. All of this takes place
    completely in hardware, which also increases the security of the system. We
    present the design of a complete Transmitter and Receiver system which can
    be ported to Xilinx Spartan 3/3E family FPGA boards.

    The growing possibilities of modern communications call for special means of
    security, especially on computer networks. Network security is becoming more
    important as the amount of data exchanged over the Internet increases.
    Security requirements apply both at the end-user level and at the enterprise
    level, especially given the massive use of personal computers, networks, and
    the Internet with its global availability. Over time, computational security
    needs have focused on different features: secrecy or confidentiality,
    identification, verification, non-repudiation, integrity control and
    availability.

    This has resulted in explosive growth in the field of information hiding. In
    addition, the rapid growth of publishing and broadcasting technology also
    requires alternative solutions for hiding information.


    The rapid growth of networking is driving high-bandwidth data transfers all
    over the world. Today, financial transactions, video surveillance, and
    e-commerce are all performed online. Data transfers are carried over networks
    such as LANs, WANs, and ATM networks, which are interconnected with routers,
    switches, bridges, and other network equipment. The growth of virtual private
    networks (VPNs) and IP security solutions (IPSec) has heightened demand for
    secure, high-performance data transfers.


    3. REVIEW OF LITERATURE

    3.1 VERILOG

    The Verilog hardware description language is an IEEE standard (IEEE Std.
    1364-1995) language used for describing the behaviour and functionality of
    digital circuits.

    In the semiconductor and electronic design industry, Verilog is a hardware
    description language (HDL) used to model electronic systems. Verilog HDL,
    not to be confused with VHDL (a competing language), is most commonly
    used in the design, verification, and implementation of digital logic chips at
    the register-transfer level of abstraction. It is also used in the verification
    of analog and mixed-signal circuits.

    Hardware description languages such as Verilog differ from software
    programming languages because they include ways of describing the
    propagation of time and signal dependencies (sensitivity). At the time of its
    introduction (1984), Verilog represented a tremendous productivity
    improvement for circuit designers who were already using graphical
    schematic capture software and specially written software programs to
    document and simulate electronic circuits.

    Entry of large digital designs at the schematic level is very time consuming
    and can be exceedingly tedious for circuits with wide data paths, where the
    same logic must be repeated for each bit of the data path. HDLs provide a
    more compact textual description of a design. Verilog is a powerful language
    and offers several different levels of description. The lowest level is the
    gate level, in which statements are used to define individual gates.


    3.1.1 Structural v/s Behavioral Verilog

    Behavioral modeling describes what a design must do, but does not have an
    obvious mapping to hardware. Behavioral Verilog is used to describe designs
    at a high level of abstraction. The behavioral level of description is the
    most abstract, resembling C with function calls (called tasks), for and while
    loops, etc. To quantify the complexity and timing requirements of a design,
    however, the design must be described close to the gate level. Hence,
    structural Verilog is used for this design.

    At the structural level, assign statements and always blocks, which are more
    abstract than gate-level statements, are used. These constructs are more
    powerful and can describe a design with fewer lines of code, but still provide
    a clearly defined relationship to actual hardware.

    Verilog libraries containing modules that serve as the basic building blocks
    are used for design in structural Verilog. These library parts include, for
    example, simple logic gates, registers, and memory modules. While the library
    parts are designed behaviorally, they incorporate some timing information
    that is used in simulations. Using the same libraries throughout ensures a
    uniform timing standard across the design.

    Structural Verilog allows designers to describe a digital system as a
    hierarchical interconnection of modules.

    The Verilog code for the project consists only of module definitions and their
    instances, apart from some behavioral Verilog used for debugging purposes.


    3.2 FPGA

    A synthesis tool is used to translate the Verilog into actual hardware, such as

    logic gates on a custom Application Specific Integrated Circuit (ASIC) or

    configurable logic blocks (CLBs) on a Field Programmable Gate Array

    (FPGA).

    Various stages of ASIC/FPGA

    Figure 1 Stages in Verilog


    A field-programmable gate array (FPGA) is an integrated circuit designed to
    be configured by the customer or designer after manufacturing, hence
    "field-programmable". The FPGA configuration is generally specified using a
    hardware description language (HDL), similar to that used for an
    application-specific integrated circuit (ASIC) (circuit diagrams were
    previously used to specify the configuration, as they were for ASICs, but this
    is increasingly rare). FPGAs can be used to implement any logical function
    that an ASIC could perform. The ability to update the functionality after
    shipping, partial re-configuration of a portion of the design and the low
    non-recurring engineering costs relative to an ASIC design (notwithstanding
    the generally higher unit cost) offer advantages for many applications.

    FPGAs contain programmable logic components called "logic blocks", and a
    hierarchy of reconfigurable interconnects that allow the blocks to be "wired
    together", somewhat like many (changeable) logic gates that can be
    inter-wired in (many) different configurations. Logic blocks can be configured
    to perform complex combinational functions, or merely simple logic gates like
    AND and XOR. In most FPGAs, the logic blocks also include memory
    elements, which may be simple flip-flops or more complete blocks of memory.


    3.3 MULTIPLEXER AND DEMULTIPLEXER UNIT

    3.3.1 MULTIPLEXER

    A multiplexer is a combinational circuit that is given a certain number
    (usually a power of two) of data inputs, say 2^n, and n address inputs used
    as a binary number to select one of the data inputs. The multiplexer has a
    single output, which has the same value as the selected data input.

    Depending upon the digital code applied at the select inputs, one out of the
    2^n data inputs is selected and transmitted to the single output channel.

    At face value, a multiplexer is a logic circuit whose function is to select
    one data line from among many. For this reason, multiplexers are often
    referred to as data selectors.

    Figure 2 Symbol of 4:1 Multiplexer

    Table 1 Function Table of MUX

    Input        Output
    S1   S0      Y
    0    0       D0
    0    1       D1
    1    0       D2
    1    1       D3
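
    The function table above can be sketched in software as follows. This is
    only an illustrative Python model, not the project's Verilog; the function
    name mux and the MSB-first ordering of the select bits are assumptions made
    for the example.

```python
def mux(data_inputs, select_bits):
    """Return the data input addressed by the select bits (MSB first).

    Illustrative model only: data_inputs must hold 2**n values for n select bits.
    """
    n = len(select_bits)
    if len(data_inputs) != 2 ** n:
        raise ValueError("need 2**n data inputs for n select bits")
    address = 0
    for bit in select_bits:  # e.g. (S1, S0) = (1, 0) gives address 2
        address = (address << 1) | bit
    return data_inputs[address]


# Reproduce the 4:1 function table with symbolic inputs D0..D3.
inputs = ["D0", "D1", "D2", "D3"]
for s1 in (0, 1):
    for s0 in (0, 1):
        print(s1, s0, mux(inputs, (s1, s0)))
```

    The same function covers 8:1 or 16:1 selection by passing three or four
    select bits.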


    3.3.2 DEMULTIPLEXER

    The de-multiplexer is the inverse of the multiplexer, in that it takes a
    single data input and n address inputs. It has 2^n outputs. The address
    inputs determine which data output takes the value of the data input. The
    other data outputs have the value 0.

    Figure 3 Symbol of 1:4 De-multiplexer

    Table 2 Function Table of 1:4 DEMUX

    Input           Output
    E   S0  S1      D0  D1  D2  D3
    E   0   0       E   0   0   0
    E   1   0       0   E   0   0
    E   0   1       0   0   E   0
    E   1   1       0   0   0   E
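
    As a rough software counterpart to the 1:4 de-multiplexer, the sketch below
    routes one data value to the addressed output and drives the others to 0.
    It is a Python model for illustration only; the name demux and the
    MSB-first select ordering (S1, S0) are assumptions, not part of the
    project's Verilog.

```python
def demux(data_in, select_bits):
    """Route data_in to the output addressed by select_bits (MSB first).

    Illustrative model only: returns a list of 2**n outputs, with every
    non-selected output held at 0.
    """
    address = 0
    for bit in select_bits:
        address = (address << 1) | bit
    outputs = [0] * (2 ** len(select_bits))
    outputs[address] = data_in
    return outputs


# With E = 1, step through the four (S1, S0) combinations.
for s1 in (0, 1):
    for s0 in (0, 1):
        print(s1, s0, demux(1, (s1, s0)))
```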


    3.4 COMPRESSION AND DECOMPRESSION UNIT

    Data compression is a technique for reducing the redundancy in data
    representation in order to decrease data storage requirements and hence
    communication costs. Reducing the storage requirement is equivalent to
    increasing the capacity of the storage medium and hence the communication
    bandwidth. Thus the development of efficient compression techniques will
    continue to be a design challenge for future communication systems and
    advanced multimedia applications.

    Data is represented as a combination of information and redundancy.
    Information is the portion of data that must be preserved permanently in its
    original form in order to correctly interpret the meaning or purpose of the
    data. Redundancy is the portion of data that can be removed when it is not
    needed and reinserted to interpret the data when needed. Most often, the
    redundancy is reinserted in order to regenerate the data in its original
    form. A technique that reduces the redundancy of data is called data
    compression. The redundancy is reduced in such a way that it can
    subsequently be reinserted to recover the original data, which is called
    decompression.

    Figure 4 Data compression model

    When we speak of a compression technique or a compression algorithm, we
    actually refer to two algorithms: the first takes an input X and generates a
    representation XC that requires fewer bits; the second is a reconstruction
    algorithm that operates on the compressed representation XC to generate the
    reconstruction Y.


    3.4.1 Types of Data Compression Models

    There are two types of data compression models: lossy and lossless.

    Lossy data compression works on the assumption that the data does not have
    to be stored perfectly. Text files (especially files containing computer
    programs) are instead stored using lossless techniques, since losing a
    single character can, in the worst case, make the text dangerously
    misleading.

    Lossless compression ensures that the original information can be
    exactly reproduced from the compressed data.
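
    Run Length Encoding, the scheme named in the abstract, is a simple case of
    lossless compression. The Python sketch below is a minimal illustration
    under assumed conventions (runs stored as (count, value) pairs, with counts
    capped at 255 so that each would fit in one byte); it is not the project's
    hardware design.

```python
def rle_encode(data):
    """Encode bytes as a list of (count, value) runs; counts are capped at 255."""
    runs = []
    for byte in data:
        if runs and runs[-1][1] == byte and runs[-1][0] < 255:
            runs[-1] = (runs[-1][0] + 1, byte)  # extend the current run
        else:
            runs.append((1, byte))              # start a new run
    return runs


def rle_decode(runs):
    """Rebuild the original bytes from (count, value) runs."""
    return b"".join(bytes([value]) * count for count, value in runs)


sample = b"AAAABBBCCD"
encoded = rle_encode(sample)
print(encoded)
print(rle_decode(encoded) == sample)
```

    The decoder reproduces the input exactly, which is the defining property
    of a lossless scheme. Data with few repeated bytes gains nothing from this
    encoding and can even grow.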

    3.4.2 Advantages of Data Compression

    It reduces the data storage requirements.

    The audience can experience rich-quality signals for audio-visual data representation.

    Data security can also be greatly enhanced by encrypting the decoding parameters and transmitting them separately from the compressed database files to restrict access to proprietary information.

    The rate of input-output operations in a computing device can be greatly increased due to the shorter representation of data.

    Data compression reduces the cost of backup and recovery of data in computer systems by storing the backup of large database files in compressed form.

    The technique used in the design of the high-speed data compression and
    decompression processor cores is based on a combination of the LZSS
    compression algorithm and Huffman coding. The source data is first
    processed by the LZSS compression technique, since this algorithm is not
    restricted in the type of data it can process and requires no a priori
    knowledge of the source. An LZSS codeword is generated whenever a match
    between the source data and the dictionary elements is detected, and the
    encoded data is represented as position-length pair codewords.
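
    To make the position-length idea concrete, here is a small Python sketch of
    an LZSS-style coder. The window size, the minimum and maximum match
    lengths, and the token representation (a plain integer for a literal byte,
    a (position, length) tuple for a match) are assumptions chosen for
    readability; the Huffman stage is omitted and nothing here reflects the
    actual hardware parameters.

```python
WINDOW = 255     # assumed: how far back a match may start
MIN_MATCH = 3    # assumed: shorter matches are emitted as literals
MAX_MATCH = 18   # assumed: longest match a codeword can express


def lzss_encode(data):
    """Return a list of tokens: int literals or (position, length) pairs."""
    tokens, i = [], 0
    while i < len(data):
        best_len, best_pos = 0, 0
        for j in range(max(0, i - WINDOW), i):
            length = 0
            while (length < MAX_MATCH and i + length < len(data)
                   and data[j + length] == data[i + length]):
                length += 1
            if length > best_len:
                best_len, best_pos = length, i - j  # position = distance back
        if best_len >= MIN_MATCH:
            tokens.append((best_pos, best_len))
            i += best_len
        else:
            tokens.append(data[i])
            i += 1
    return tokens


def lzss_decode(tokens):
    """Rebuild the original bytes by copying from already decoded output."""
    out = bytearray()
    for tok in tokens:
        if isinstance(tok, tuple):
            pos, length = tok
            for _ in range(length):
                out.append(out[-pos])  # byte-wise copy allows overlapping matches
        else:
            out.append(tok)
    return bytes(out)


text = b"abcabcabcabcxyz"
tokens = lzss_encode(text)
print(tokens)
print(lzss_decode(tokens) == text)
```

    The brute-force match search is quadratic and only suits a demonstration;
    a hardware coder would search the dictionary in parallel instead.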


    3.5 ENCRYPTION AND DECRYPTION UNIT

    Fast computers and advances in telecommunications have made high-speed,
    global, widespread computer networks possible, in particular the Internet,
    which is an open network. This has increased access to databases, such as
    the open World Wide Web. To decrease communication costs and to be user
    friendly, private databases containing medical records, proprietary
    information, tax information, etc., are often accessible via the Internet
    using a low-security password scheme.

    The privacy of data is obviously vulnerable during communication, and data
    in transit can be modified, in particular in open networks. Because of the
    lack of secure computers, such concerns extend to stored data. Data
    communicated and/or accessible over such networks includes bank and other
    financial transactions, love letters, medical records, proprietary
    information, etc., whose privacy must be protected. The authenticity of (the
    data in) contracts, databases, electronic commerce, etc. must be protected
    against modification by outsiders or by one of the parties involved in the
    transaction.

    Modern cryptography provides the means to address these issues.
    Cryptography includes two basic components: the encryption algorithm and
    the keys. If sender and recipient use the same key, the scheme is known as
    symmetric or private key cryptography. It is suitable for long data streams.
    Such a system is difficult to use in practice because both the sender and the
    receiver must know the key, which requires sending the key over a secure
    channel from sender to recipient. This raises the question: if a secure
    channel already exists, why not transmit the data itself over that channel?

    On the other hand, if different keys are used by sender and recipient, the
    scheme is known as asymmetric or public key cryptography. The key used for
    encryption is called the public key and the key used for decryption is called
    the private key. This technique is used for short data streams and requires
    more time to encrypt the data.


    3.5.1 Techniques of Cryptography

    There are two techniques used for data encryption and decryption:

    A] Symmetric Cryptography

    If sender and recipient use the same key, the scheme is known as symmetric
    or private key cryptography. It is suitable for long data streams. Such a
    system is difficult to use in practice because both the sender and the
    receiver must know the key, which requires sending the key over a secure
    channel from sender to recipient.

    There are two methods used in symmetric key cryptography: block and
    stream.

    The block method divides a large data set into blocks (based on a
    predefined size or the key size), encrypts each block separately and
    finally combines the blocks to produce the encrypted data.

    The stream method encrypts the data as a stream of bits without
    separating the data into blocks. The bits of the data are encrypted
    sequentially, using some of the results from the previous bits, until all
    the bits in the data are encrypted.

    B] Asymmetric Cryptography

    If sender and recipient use different keys, the scheme is known as
    asymmetric or public key cryptography. The key used for encryption is called
    the public key and the key used for decryption is called the private key.
    This technique is used for short data streams and requires more time to
    encrypt the data.

    Asymmetric encryption techniques are almost 1000 times slower than
    symmetric techniques, because they require more computational processing
    power. To get the benefits of both methods, a hybrid technique is usually
    used. In this technique, asymmetric encryption is used to exchange the secret
    key; symmetric encryption is then used to transfer data between sender and
    receiver.


    3.5.2 DES ALGORITHM

    Data Encryption Standard (DES) is a cryptographic standard that was
    developed in the early 1970s and adopted as an American federal standard
    (FIPS PUB 46) by the National Bureau of Standards (NBS) in 1977. DES is a
    block cipher, which means that during encryption the plaintext is broken
    into fixed-length blocks and each block is encrypted as a unit. It takes a
    64-bit plaintext block and a 64-bit key (only 56 bits are used for
    encryption; the remaining 8 bits are used for parity checking) and produces
    a 64-bit ciphertext block, which can be decrypted with the same key to
    recover the message.
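
    The split of the 64-bit key into 56 key bits and 8 parity bits can be
    checked with a few lines of Python. In the DES standard the parity bit is
    the last bit of each key byte and is set so that the byte has odd parity.
    The helper names and the sample key 133457799BBCDFF1 (a key often used in
    DES tutorials) are assumptions made for this sketch.

```python
def has_odd_parity(key):
    """True if every byte of an 8-byte DES key has an odd number of 1 bits."""
    if len(key) != 8:
        raise ValueError("a DES key is 8 bytes (64 bits)")
    return all(bin(b).count("1") % 2 == 1 for b in key)


def effective_key_bits(key):
    """Drop the last (parity) bit of each byte, leaving the 56 key bits."""
    return "".join(format(b >> 1, "07b") for b in key)


key = bytes.fromhex("133457799BBCDFF1")  # assumed example key
print(has_odd_parity(key))
print(len(effective_key_bits(key)))
```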

    Additionally, we must highlight that there are four standardized modes of

    operation of DES:

    ECB (Electronic Codebook mode)

    CBC (Cipher Block Chaining mode)

    CFB (Cipher Feedback mode) and

    OFB (Output Feedback mode)
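
    The difference between the first two modes can be sketched in Python. To
    keep the example short, the block cipher is replaced by a toy stand-in (XOR
    with the key), which is NOT DES and offers no security; only the way the
    modes chain blocks is the point here. All function names are assumptions.

```python
BLOCK = 8  # DES block size in bytes


def toy_encrypt_block(block, key):
    """Stand-in for a real block cipher such as DES: XOR with the key. Not secure."""
    return bytes(b ^ k for b, k in zip(block, key))


toy_decrypt_block = toy_encrypt_block  # XOR is its own inverse


def split_blocks(data):
    if len(data) % BLOCK:
        raise ValueError("data length must be a multiple of the block size")
    return [data[i:i + BLOCK] for i in range(0, len(data), BLOCK)]


def ecb_encrypt(data, key):
    """ECB: every block is encrypted independently."""
    return b"".join(toy_encrypt_block(b, key) for b in split_blocks(data))


def cbc_encrypt(data, key, iv):
    """CBC: each plaintext block is XORed with the previous ciphertext block."""
    out, prev = [], iv
    for block in split_blocks(data):
        mixed = bytes(p ^ c for p, c in zip(block, prev))
        prev = toy_encrypt_block(mixed, key)
        out.append(prev)
    return b"".join(out)


def cbc_decrypt(data, key, iv):
    out, prev = [], iv
    for block in split_blocks(data):
        mixed = toy_decrypt_block(block, key)
        out.append(bytes(m ^ p for m, p in zip(mixed, prev)))
        prev = block
    return b"".join(out)


key = bytes(range(1, 9))
iv = bytes(range(16, 24))
plaintext = b"SAMEBLK!" * 2  # two identical blocks

ecb = ecb_encrypt(plaintext, key)
cbc = cbc_encrypt(plaintext, key, iv)
print(ecb[:8] == ecb[8:])   # identical plaintext blocks stay identical in ECB
print(cbc[:8] == cbc[8:])   # chaining makes them differ in CBC
```

    In ECB, repeated plaintext blocks produce repeated ciphertext blocks; CBC
    hides such repetition by chaining each block to the previous ciphertext.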

    The DES encryption algorithm consists of an initial permutation of the
    64-bit plaintext, followed by 16 rounds, each of which permutes and
    substitutes the text bits under control of the key bits, and finally an
    inverse initial permutation that yields the 64-bit ciphertext.


    Figure 5 DES Algorithm Overview


    3.5.3 Steps for Algorithm

    Step 1: Create 16 sub-keys, each of which is 48 bits long

    The 64-bit key is permuted according to the following table, PC-1. Since the
    first entry in the table is "57", the 57th bit of the original key K becomes
    the first bit of the permuted key K+. The 49th bit of the original key
    becomes the second bit of the permuted key, and the 4th bit of the original
    key becomes the last bit. Note that only 56 bits of the original key appear
    in the permuted key.

    Table 3 PC-1 Permuted choice 1

    Next, split this key into left and right halves, C0 and D0, where each half has

    28 bits.

    From the permuted key K+, we get

    C0 = 0011001111000011001100111100

    D0 = 0011001111000011001100110011
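
    The PC-1 step can be tried out with the Python sketch below. The table is
    the PC-1 permutation from the DES standard (first entry 57, second 49, last
    4, as described above). The key 133457799BBCDFF1 is an assumed example key
    that is widely used in DES tutorials; it is not necessarily the key behind
    the C0 and D0 values quoted in the text, so the printed halves differ from
    those values.

```python
# PC-1 from the DES standard: 56 entries, bit positions counted from 1.
PC1 = [57, 49, 41, 33, 25, 17,  9,
        1, 58, 50, 42, 34, 26, 18,
       10,  2, 59, 51, 43, 35, 27,
       19, 11,  3, 60, 52, 44, 36,
       63, 55, 47, 39, 31, 23, 15,
        7, 62, 54, 46, 38, 30, 22,
       14,  6, 61, 53, 45, 37, 29,
       21, 13,  5, 28, 20, 12,  4]


def permute(bits, table):
    """Build the output so that output bit i is input bit table[i] (1-based)."""
    return "".join(bits[pos - 1] for pos in table)


key_hex = "133457799BBCDFF1"                 # assumed example key
key_bits = format(int(key_hex, 16), "064b")  # 64-bit key as a bit string
k_plus = permute(key_bits, PC1)              # 56-bit permuted key K+
c0, d0 = k_plus[:28], k_plus[28:]            # left and right 28-bit halves

print("C0 =", c0)
print("D0 =", d0)
```

    None of the positions 8, 16, ..., 64 appears in PC-1, which is how the
    eight parity bits are dropped from the key.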


    With C0 and D0 defined, we now create sixteen blocks Cn and Dn,
    1 <= n <= 16.


    We now form the keys Kn, for 1 <= n <= 16.


    Step 2: Encode each 64-bit block of data

    There is an initial permutation IP of the 64 bits of the message data M. This

    rearranges the bits according to the following table, where the entries in the

    table show the new arrangement of the bits from their initial order.

    Table 5 IP Initial Permutation Matrix

    Here the 58th bit of M is "1", which becomes the first bit of IP. The 50th bit of

    M is "1", which becomes the second bit of IP. The 7th bit of M is "0", which

    becomes the last bit of IP.

    Next divide the permuted block IP into a left half L0 of 32 bits, and a right half

    R0 of 32 bits.
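
    The initial permutation and the split into L0 and R0 can be sketched in
    Python as follows. The table is the IP permutation from the DES standard
    (first entry 58, second 50, last 7, matching the description above). The
    message 0123456789ABCDEF is an assumed example value, and the inverse table
    is derived from IP in code rather than typed in.

```python
# Initial permutation IP from the DES standard, bit positions counted from 1.
IP = [58, 50, 42, 34, 26, 18, 10, 2,
      60, 52, 44, 36, 28, 20, 12, 4,
      62, 54, 46, 38, 30, 22, 14, 6,
      64, 56, 48, 40, 32, 24, 16, 8,
      57, 49, 41, 33, 25, 17,  9, 1,
      59, 51, 43, 35, 27, 19, 11, 3,
      61, 53, 45, 37, 29, 21, 13, 5,
      63, 55, 47, 39, 31, 23, 15, 7]

# The inverse permutation undoes IP: entry j says where input bit j+1 ended up.
IP_INV = [IP.index(pos) + 1 for pos in range(1, 65)]


def permute(bits, table):
    """Build the output so that output bit i is input bit table[i] (1-based)."""
    return "".join(bits[pos - 1] for pos in table)


message = format(0x0123456789ABCDEF, "064b")  # assumed example message M
ip_bits = permute(message, IP)
l0, r0 = ip_bits[:32], ip_bits[32:]

print("L0 =", format(int(l0, 2), "08X"))
print("R0 =", format(int(r0, 2), "08X"))
print(permute(ip_bits, IP_INV) == message)
```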

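    We now proceed through 16 iterations, for 1 <= n <= 16, using a function f
    which operates on a 32-bit data block and a 48-bit key Kn to produce a
    32-bit block. With + denoting XOR, each iteration computes:

    Ln = Rn-1

    Rn = Ln-1 + f(Rn-1, Kn)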


    This results in a final block, for n = 16, of L16 R16. That is, in each
    iteration, we take the right 32 bits of the previous result and make them
    the left 32 bits of the current step.

    For the right 32 bits in the current step, we XOR the left 32 bits of the
    previous step with the result of the calculation f. For n = 1, with +
    denoting XOR, this gives:

    R1 = L0 + f(R0, K1)

    To calculate f, we first expand each block Rn-1 from 32 bits to 48 bits. This
    is done by using a selection table that repeats some of the bits in Rn-1.
    We'll call the use of this selection table the function E. Thus E(Rn-1) has
    a 32-bit input block and a 48-bit output block. The first three bits of
    E(Rn-1) are the bits in positions 32, 1 and 2 of Rn-1, while the last two
    bits of E(Rn-1) are the bits in positions 32 and 1.

    (Note that each block of 4 original bits has been expanded to a block of 6
    output bits.)
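
    The expansion E has a regular structure: each 4-bit group of Rn-1 is
    widened to 6 bits by borrowing the neighbouring bit on either side, with
    wrap-around at the ends. The Python sketch below generates the selection
    table from that rule instead of listing it; the input value F0AAF0AA is an
    assumed example.

```python
# Selection table E, generated from its pattern: group b (0..7) selects the
# six positions 4b, 4b+1, ..., 4b+5, wrapping around within 1..32.
E = [((4 * group - 1 + offset) % 32) + 1
     for group in range(8) for offset in range(6)]


def expand(r_bits):
    """Expand a 32-bit string to 48 bits using the selection table E."""
    return "".join(r_bits[pos - 1] for pos in E)


r0 = format(0xF0AAF0AA, "032b")  # assumed example value for Rn-1
e_r0 = expand(r0)

print(E[:6], E[-6:])
print(" ".join(e_r0[i:i + 6] for i in range(0, 48, 6)))
```

    The first row of the generated table is 32 1 2 3 4 5 and the last row is
    28 29 30 31 32 1, in line with the positions quoted in the text.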

    Next in the f calculation, we XOR the output E(Rn-1) with the key Kn:

    Kn + E(Rn-1).

    Figure 7 Calculation of f(R,K)


    To this point we have expanded Rn-1 from 32 bits to 48 bits, using the
    selection table, and XORed the result with the key Kn. We now have 48 bits,
    or eight groups of six bits. Each group of six bits is used as an address
    into a table called an "S box", with each group addressing a different
    S box. Located at that address is a 4-bit number, which replaces the
    original 6 bits.

    The net result is that the eight groups of 6 bits are transformed into eight
    groups of 4 bits (the 4-bit outputs from the S boxes), for 32 bits in total.

    Write the previous result, which is 48 bits, in the form:

    Kn + E(Rn-1) = B1B2B3B4B5B6B7B8, where each Bi is a group of six bits.

    We now calculate S1(B1)S2(B2)S3(B3)S4(B4)S5(B5)S6(B6)S7(B7)S8(B8),
    where Si(Bi) refers to the output of the i-th S box.

    To repeat, each of the functions S1, S2, ..., S8 takes a 6-bit block as
    input and yields a 4-bit block as output.
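
    How six bits address an S box is not spelled out above. In the DES standard,
    the first and last bits of the group select one of four rows and the middle
    four bits select one of sixteen columns. The Python sketch below applies
    that rule to the first S box, S1, whose contents are taken from the
    standard; the helper name and the sample inputs are assumptions.

```python
# S box S1 from the DES standard: 4 rows of 16 four-bit values.
S1 = [
    [14, 4, 13, 1, 2, 15, 11, 8, 3, 10, 6, 12, 5, 9, 0, 7],
    [0, 15, 7, 4, 14, 2, 13, 1, 10, 6, 12, 11, 9, 5, 3, 8],
    [4, 1, 14, 8, 13, 6, 2, 11, 15, 12, 9, 7, 3, 10, 5, 0],
    [15, 12, 8, 2, 4, 9, 1, 7, 5, 11, 3, 14, 10, 0, 6, 13],
]


def sbox_lookup(box, six_bits):
    """Map a 6-bit string to a 4-bit string using one S box."""
    row = int(six_bits[0] + six_bits[5], 2)  # outer two bits pick the row
    col = int(six_bits[1:5], 2)              # middle four bits pick the column
    return format(box[row][col], "04b")


print(sbox_lookup(S1, "011011"))  # row 1, column 13
print(sbox_lookup(S1, "011000"))  # row 0, column 12
```

    Each row of an S box contains every value from 0 to 15 exactly once, so a
    row on its own is a permutation of the 4-bit values.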

    The final stage in the calculation of f is to apply a permutation P to the
    S-box output to obtain the final value of f:

    f = P(S1(B1)S2(B2)...S8(B8))

    P yields a 32-bit output from a 32-bit input by permuting the bits of the
    input block.

    We then calculate R2 = L1 + f(R1, K2), and so on for 16 rounds. At the end
    of the sixteenth round we have the blocks L16 and R16. We then reverse the
    order of the two blocks to form the 64-bit block R16 L16 and apply a final
    permutation IP-1 as defined by the following table:


    Table 6 Inverse Initial Permutation Matrix

    Decryption is simply the inverse of encryption, following the same steps as

    above, but reversing the order in which the sub-keys are applied.
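
    Why reversing the sub-key order is enough can be seen from the round
    structure alone. The Python sketch below keeps the 16-round left/right
    swapping and the final R16 L16 swap described above, but replaces the real
    f (expansion, S boxes and P) with an arbitrary toy function and leaves out
    IP and IP-1. The function names, the toy f and the sub-key values are all
    assumptions; this is not DES.

```python
MASK32 = 0xFFFFFFFF


def toy_f(right, subkey):
    """Stand-in round function; real DES uses E, the S boxes and P here."""
    return ((right * 0x9E3779B1) ^ subkey) & MASK32


def feistel(block64, subkeys):
    """Run the rounds Ln = Rn-1, Rn = Ln-1 XOR f(Rn-1, Kn), then swap halves."""
    left, right = block64 >> 32, block64 & MASK32
    for k in subkeys:
        left, right = right, left ^ toy_f(right, k)
    return (right << 32) | left  # output block is R16 L16


def encrypt(block64, subkeys):
    return feistel(block64, subkeys)


def decrypt(block64, subkeys):
    # Same structure, sub-keys applied in reverse order.
    return feistel(block64, list(reversed(subkeys)))


subkeys = list(range(1, 17))  # 16 assumed toy sub-keys
plain = 0x0123456789ABCDEF
cipher = encrypt(plain, subkeys)
print(format(cipher, "016X"))
print(decrypt(cipher, subkeys) == plain)
```

    The round function never has to be inverted: each decryption round XORs
    the same f value back out, which is why the same hardware data path can
    serve both directions.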


    4. DESIGN HIERARCHY

    4.1 Transmission System

    Figure 8 Transmission System Hierarchical Flow


    4.2 Receiver System

    Figure 9 Receiver System Hierarchical Flow


    5. PLAN OF WORK

    August - September

      Formation of final block diagram

      Study and selection of algorithms for compression core

    October - November

      Study of algorithms for encryption core

      Selection and study of hardware description language

    January - February

      Coding of MUX and DEMUX unit in Verilog

      Study of DES encryption algorithm

      Coding and implementation of DES encryption and decryption unit

    March - April

      Decision not to implement compression core because of increase in complexity

      Final system connections and structuring

      Implementation of final system


    6. TESTING AND RESULTS

    6.1 Transmission System

    6.1.1 Device Utilization Summary

    DEVICE UTILIZATION SUMMARY

    Logic Utilization             Used    Available   Utilization
    Number of slices              559     3584        15%
    Number of slice Flip Flops    487     7168        6%
    Number of 4 input LUTs        989     7168        13%
    Number of bonded IOBs         17      141         12%
    Number of BRAMs               4       16          25%
    Number of GCLKs               1       8           12%

    Table 7 Device Utilization Summary of Trx system


    Figure 10 RTL Schematic of the Transmitter System

    Figure 11 Waveform for Transmitter System


    6.2 Receiver System

    6.2.1 Device Utilization Summary

    DEVICE UTILIZATION SUMMARY

    Logic Utilization             Used    Available   Utilization
    Number of slices              772     3584        21%
    Number of slice Flip Flops    743     7168        10%
    Number of 4 input LUTs        1185    7168        16%
    Number of bonded IOBs         20      141         14%
    Number of BRAMs               4       16          25%
    Number of GCLKs               5       8           62%

    Table 8 Device Utilization summary of Rx System


    Figure 12 RTL Schematic of the Receiver System

    Figure 13 Waveform for Receiver System


    7. DISCUSSION OF RESULTS

    7.1 Timing Report of Transmitter Core

    Delay:          9.534ns (Levels of Logic = 4)
    Source:         Sel (PAD)
    Destination:    sample (PAD)

    Data Path: Sel to sample

    Cell:in->out    Fanout   Gate Delay   Net Delay
    IBUF:I->O       128      0.715        2.338
    LUT3:I0->O      1        0.479        0.000
    MUXF5:I1->O     4        0.314        0.779
    OBUF:I->O                4.909

    Total           9.534ns (6.417ns logic, 3.117ns route)
                    (67.3% logic, 32.7% route)

    Table 9 Timing Report of Transmitter Core
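
    The totals in the transmitter timing report can be cross-checked by summing
    the per-cell figures. The short Python calculation below does this; treating
    the blank net delay of the output buffer as 0 is an assumption.

```python
# (cell, gate delay in ns, net delay in ns), taken from the timing report.
# The OBUF row lists no net delay; it is assumed to be 0 here.
path = [
    ("IBUF", 0.715, 2.338),
    ("LUT3", 0.479, 0.000),
    ("MUXF5", 0.314, 0.779),
    ("OBUF", 4.909, 0.000),
]

logic = sum(gate for _, gate, _ in path)
route = sum(net for _, _, net in path)
total = logic + route

print(f"logic = {logic:.3f} ns, route = {route:.3f} ns, total = {total:.3f} ns")
print(f"logic share = {100 * logic / total:.1f}%")
```

    The output buffer alone contributes 4.909 ns, more than half of the total
    path delay, so the pad-to-pad figure is dominated by I/O rather than by
    the multiplexer logic.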


    7.2 Timing Report of System core

    Offset:         10.138ns (Levels of Logic = 4)
    Source:         T/bitcounter_5 (FF)
    Destination:    SERIAL_CIPHER_TEXT (PAD)
    Source Clock:   CLK rising

    Data Path: T/bitcounter_5 to SERIAL_CIPHER_TEXT

    Cell:in->out    Fanout   Gate Delay   Net Delay
    FDRE:C->Q       5        0.626        1.078
    LUT4:I0->O      1        0.479        0.704
    LUT4:I3->O      1        0.479        0.704
    LUT4:I3->O      1        0.479        0.681
    OBUF:I->O                4.909

    Total           10.138ns (6.972ns logic, 3.166ns route)
                    (68.8% logic, 31.2% route)

    Table 10 Timing Report of System Core


    8. CONCLUSION

    A proprietary high-speed encryption and decryption core design has been
    analyzed. The timing reports show that the System Core performs its
    computations considerably faster than the existing software prototypes.
    Because the data is processed and sent through an FPGA, it travels over a
    hardware channel, which keeps it secure. The hardware implementation of
    the system therefore provides secure and fast data transfer.

    The first limitation concerns the hardware implementation of the
    compression core: the algorithm is complex to implement in HDL.

    The second limitation is that the data is sent serially from a PC over a
    UART, which slows the system down.

    A complete system core and its associated test firmware have also been
    developed; together they form the hardware evaluation platform. Using this
    evaluation platform, the functionality of the design running on real
    hardware is proven.


    9. Appendix A

    CORE DESIGN

    Design of Compression Unit

    The main hardware module of the compression unit consists of three
    hierarchical blocks: the LZSS coder, the fixed Huffman coder and the data
    packer. All modules are synchronously clocked. The LZSS coder performs the
    LZSS encoding of the source data symbols, while the fixed Huffman coder
    re-encodes the length of each LZSS codeword to achieve a better compression
    ratio. Finally, the data packer packs the unary codes from the fixed
    Huffman coder into a fixed-length output packet and sends it to the
    interfacing block.

    Figure 14 Huffman Coder Block

    Huffman coding is employed to further encode the length portion of each
    LZSS codeword in order to achieve a higher compression saving. On the
    decompression side, the whole process is performed in reverse order.


    1. LZSS CODER

    The LZSS algorithm involves a computationally intensive matching process
    during the compression stage, because each input phrase has to be compared
    with every possible phrase in the dictionary. Furthermore, updating the
    dictionary involves variable-length shifting of the input source into the
    dictionary, since the length of the longest matched phrase changes with
    time. If this operation is done with a variable-length shifter, a
    considerable amount of hardware resources is consumed, which raises the
    implementation cost because a bigger (and correspondingly more expensive)
    programmable logic device or ASIC is needed. The design tackles these
    problems with a systolic array architecture for the LZSS compression
    dictionary: each input symbol is compared with every dictionary element
    simultaneously, while the input data is shifted in one symbol at a time
    through a fixed-length shifter.
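
    The parallel comparison and one-symbol shift described above can be
    sketched in Verilog as follows. This is a minimal illustration, not the
    project's dictionary module: the module name lzss_dict_sketch, its ports
    and the parameters DEPTH and SYM_W are assumptions made for the example.

    ```verilog
    // Hypothetical sketch: a shift-register dictionary whose cells are all
    // compared against the incoming symbol in the same clock cycle.
    module lzss_dict_sketch #(
        parameter DEPTH = 8,   // number of dictionary cells (assumed)
        parameter SYM_W = 8    // bits per symbol (assumed)
    )(
        input  wire             clk,
        input  wire             rst,        // synchronous reset
        input  wire             shift_en,   // advance the window by one symbol
        input  wire [SYM_W-1:0] symbol_in,  // current input symbol
        output wire [DEPTH-1:0] match,      // match[k] = 1 when cell k equals symbol_in
        output wire             any_match   // OR of all match bits
    );
        reg [SYM_W-1:0] dict [0:DEPTH-1];
        integer i;

        // Fixed-length shift: every enabled cycle moves each cell one
        // position along, regardless of how long the current match is.
        always @(posedge clk) begin
            if (rst) begin
                for (i = 0; i < DEPTH; i = i + 1)
                    dict[i] <= {SYM_W{1'b0}};
            end else if (shift_en) begin
                dict[0] <= symbol_in;
                for (i = 1; i < DEPTH; i = i + 1)
                    dict[i] <= dict[i-1];
            end
        end

        // One comparator per cell, so all comparisons happen in parallel.
        // Note: cells cleared by reset compare equal to a zero symbol; a
        // real design would also need a valid flag per cell.
        genvar k;
        generate
            for (k = 0; k < DEPTH; k = k + 1) begin : cmp
                assign match[k] = (dict[k] == symbol_in);
            end
        endgenerate

        assign any_match = |match;
    endmodule
    ```

    Because the shift amount is always one symbol, the update logic is a
    plain shift register; the cost of parallelism is one comparator per
    dictionary cell.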

    Figure 15 Structure of LZSS Algorithm Used

    To achieve a processing speed high enough for data-independent throughput,
    and to allow a fixed-length shifter that reduces hardware resource
    utilization, the LZSS coder design employs a systolic array architecture.
    The hardware architecture consists of four main components, namely the
    dictionary, reduction tree, delay tree and codeword generator sub-modules.

    2. HUFFMAN CODER

    The Huffman coding technique also presents certain design challenges.
    Conventional Huffman coding requires a priori knowledge of the source data
    distribution in order to construct an optimal encoding table.

    However, in many real-life applications it is difficult to determine the
    characteristics of the source data, because its probability distribution
    normally changes with time. Even when the source statistics are available,
    different sources have different distributions, so an encoding table must
    be generated for each type of source data. Furthermore, the generated
    table must be transmitted along with the encoded data so that
    decompression can be performed correctly. This both reduces the
    compression saving and increases the processing time of the hardware. The
    design tackles these problems by employing a predefined Huffman encoding
    table in both the compression and decompression cores.

    The reason for this is two-fold. First, it simplifies generation of the
    encoding table, since the table no longer has to be built adaptively for
    different source data. Second, it eliminates the need to transmit the
    encoding table to the decompression side, which avoids the inefficient
    resource utilization and the loss of compression saving that this
    transmission would cause.
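
    A predefined table of this kind can be sketched as a purely combinational
    lookup. The code assignments below are hypothetical (a truncated unary
    prefix code over an assumed 3-bit length index); the report does not list
    the actual table, and the module and port names are illustrative only.

    ```verilog
    // Hypothetical sketch: fixed (predefined) prefix-code table for the
    // length field of an LZSS codeword. Shorter codes go to smaller indices,
    // on the assumption that short matches are the most frequent.
    module fixed_huffman_len_sketch (
        input  wire [2:0] len_idx,   // assumed 3-bit index of the match length
        output reg  [6:0] code,      // code bits, right-aligned
        output reg  [2:0] code_len   // number of valid bits in code (1..7)
    );
        always @(*) begin
            case (len_idx)
                3'd0:    begin code = 7'b0000000; code_len = 3'd1; end // "0"
                3'd1:    begin code = 7'b0000010; code_len = 3'd2; end // "10"
                3'd2:    begin code = 7'b0000110; code_len = 3'd3; end // "110"
                3'd3:    begin code = 7'b0001110; code_len = 3'd4; end // "1110"
                3'd4:    begin code = 7'b0011110; code_len = 3'd5; end // "11110"
                3'd5:    begin code = 7'b0111110; code_len = 3'd6; end // "111110"
                3'd6:    begin code = 7'b1111110; code_len = 3'd7; end // "1111110"
                default: begin code = 7'b1111111; code_len = 3'd7; end // "1111111"
            endcase
        end
    endmodule
    ```

    Since the table is a constant, the decompression core can hold the
    inverse mapping locally and nothing has to be transmitted with the data.
    The price is that the code is only optimal for sources whose length
    statistics resemble the ones the table was designed for.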


    DES ENCRYPTION AND DECRYPTION SYSTEM CORE


    Figure 16 RTL Schematic of DES - ENCRYPTION

    Figure 17 RTL Schematic of DES - DECRYPTION


    COMPLETE SYSTEM CORE


    Figure 18 RTL Schematic of Complete Trx System

    Figure 19 RTL Schematic of Complete Rx System


    APPENDIX B

    TRANSMITTER SYSTEM CORE VERILOG CODE

    This appendix presents the Verilog source code of the transmitter system
    core and its sub-modules. The design hierarchy is presented in Design
    Hierarchy. The source code is listed starting from the top-level module;
    however, the complete code is not given in the report.

    Module Name: Transmitter_System_Top

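    module Transmitter_System_Top(
        CLK, RST, CHIP_SELECT_BAR, ADDRESS, SERIAL_CIPHER_TEXT, Sel, transmit,
        waddress, we, cs_ram_rec, cs_ram_tx, ENA, DIN, RD, DR
    );

    // Input Signals
    input ENA, DIN, RD;            // connected to the Receiver module
    input CLK;
    input RST;
    input cs_ram_rec, cs_ram_tx;   // chip selects of the receiver and transmitter RAMs
    input CHIP_SELECT_BAR;         // connected to the encryption module
    input ADDRESS;                 // connected to the encryption module
    input transmit, we;
    input [3:0] waddress;          // address shared by both RAMs
    input [1:0] Sel;               // MUX select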


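    // Output Signals
    output DR;
    output SERIAL_CIPHER_TEXT;

    // Internal Wires
    wire CLK;
    wire RST;
    wire CHIP_SELECT_BAR;
    wire ADDRESS;
    wire [64:1] CIPHER_TEXT_RAM;   // encryption output, written to the transmitter RAM
    wire [1:0]  Sel;
    wire [64:1] I3, I2, I1, I0;
    wire [64:1] O;                 // MUX output, plain text for the encryption module
    wire [64:1] inter_mux;         // receiver RAM output, feeds the MUX
    wire [64:1] to_tx;             // transmitter RAM output, feeds the Transmitter
    wire [64:1] to_ram;            // Receiver output, written to the receiver RAM

    // Receiver Module
    RX Receiver(
        .CLK(CLK),
        .RST(RST),
        .DIN(DIN),
        .ENA(ENA),
        .RD(RD),
        .DR(DR),
        .DOUT(to_ram)
    );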

    // Receiver RAM module
    ram1 RAM_REC(
        .CLK(CLK),
        .waddress(waddress),
        .data_in(to_ram),
        .we(we),
        .cs(cs_ram_rec),
        .data_out(inter_mux)
    );

    // MUX module (all four inputs are driven by the receiver RAM output)
    mux4to1 MUX(
        .I0(inter_mux),
        .I1(inter_mux),
        .I2(inter_mux),
        .I3(inter_mux),
        .Sel(Sel),
        .Y(O)
    );

    // Encryption Module
    Des_Top ENCRYPT(
        .CLK(CLK),
        .RST(RST),
        .CHIP_SELECT_BAR(CHIP_SELECT_BAR),
        .ADDRESS(ADDRESS),
        .PLAIN_TEXT(O),
        .CIPHER_TEXT(CIPHER_TEXT_RAM)
    );

    // Transmitter RAM Module
    ram1 RAM_TX(
        .CLK(CLK),
        .waddress(waddress),
        .data_in(CIPHER_TEXT_RAM),
        .we(we),
        .cs(cs_ram_tx),
        .data_out(to_tx)
    );


    // Transmitter Module
    Transmitter T(
        .CLK(CLK),
        .RST(RST),
        .transmit(transmit),
        .data(to_tx),
        .TxD(SERIAL_CIPHER_TEXT)
    );

    endmodule
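
    The top-level module instantiates mux4to1 without listing its body. A
    minimal sketch of a 64-bit 4-to-1 multiplexer with the same port names
    (I0 to I3, Sel, Y) might look as follows. The module name mux4to1_sketch
    and the case-based coding style are assumptions, not the project's actual
    source.

    ```verilog
    // Hypothetical sketch of a 64-bit wide 4-to-1 multiplexer using the port
    // names seen in the MUX instance above.
    module mux4to1_sketch (
        input  wire [64:1] I0, I1, I2, I3,  // four 64-bit data inputs
        input  wire [1:0]  Sel,             // selects which input drives Y
        output reg  [64:1] Y
    );
        // Purely combinational: Y follows the selected input.
        always @(*) begin
            case (Sel)
                2'b00:   Y = I0;
                2'b01:   Y = I1;
                2'b10:   Y = I2;
                default: Y = I3;
            endcase
        end
    endmodule
    ```

    The default branch covers Sel = 2'b11 and keeps the block free of
    inferred latches, since Y is assigned on every path.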


    APPENDIX C

    DATASHEETS


    10. REFERENCE

    [1] J. Gailly, GZIP the Data Compression Program, 1993.
        ftp://ftp.gnu.org/gnu/GZIP/GZIP-1.2.4.tar.gz.

    [2] T. A. Welch, A Technique for High-Performance Data Compression,
        IEEE Computer, vol. 17, pp. 8-19, 1984.

    [3] J. Ziv and A. Lempel, Compression of Individual Sequences Via
        Variable-Rate Coding, IEEE Transactions on Information Theory, 1978.

    [4] S. Leinen, Long-Term Traffic Statistics, 2001.
        http://www.cs.columbia.edu/hgs/internet/traffic.html.

    [5] L. Deutsch, DEFLATE Compressed Data Format Specification Version 1.3,
        1996. ftp://ftp.uu.net/pub/archiving/zip/doc/.

    [6] D. Huffman, A Method for the Construction of Minimum-Redundancy
        Codes, Proceedings of the Institute of Radio Engineers, vol. 40,
        pp. 1098-1101, September 1952.

    [7] J. Ziv and A. Lempel, A Universal Algorithm for Sequential Data
        Compression, IEEE Transactions on Information Theory, vol. 23, no. 3,
        pp. 337-343, 1977.

    [8] Z. Li and S. Hauck, Configuration Compression for Virtex FPGAs, Field
        Programmable Custom Computing Machines, pp. 147-159, 2001.

    [9] J. Storer and T. Szymanski, Data Compression via Textual Substitution,
        Journal of the ACM, vol. 29, no. 4, pp. 928-951, 1982.

    [10] N. Larsson, Extended Application of Suffix Trees to Data Compression,
        Proceedings of the Conference on Data Compression, p. 190, 1996.

    [11] T. C. Bell and D. Kulp, Longest-Match String Searching for Ziv-Lempel
        Compression, Software - Practice and Experience, vol. 23, no. 7,
        pp. 757-771, 1993.

    [12] S. Rigler, FPGA-Based Lossless Data Compression Using GNU Zip.