asynchronous aes-side attack ieee

Upload: phuc-hoang

Post on 03-Jun-2018


  • 8/12/2019 Asynchronous AES-Side Attack IEEE

    Asynchronous AES (Advanced Encryption

    Standard) Key Expander and Round Function

    Design for Improved Power Analysis Attack

    Resistance

    Siva Kotipalli 1 , Yong-Bin Kim 2 and Minsu Choi 3

    1 Samsung Electronics, Austin, TX, USA

    2 Department of Electrical and Computer Engineering,

    Northeastern University, Boston, MA, USA

    3 Department of Electrical and Computer Engineering,

    Missouri University of Science & Technology, Rolla, MO, USA

    Abstract

    This work presents the design, hardware implementation and performance analysis of a novel asynchronous AES (Advanced Encryption Standard) Key Expander and Round Function, which offer increased Side-Channel Attack (SCA) resistance. These designs are based on a Delay Insensitive (DI) logic paradigm known as Null Convention Logic (NCL), which provides several properties useful for resisting SCAs, such as dual-rail encoding, clock-free operation and monotonic transitions. Potential benefits include reduced and more uniform switching activity and a reduced Signal-to-Noise Ratio (SNR). As a result, the proposed designs leak less side-channel information than conventional approaches. To quantitatively verify these improvements, functional verification simulations and WASSO (Weighted Average Simultaneous Switching Output) analysis simulations were carried out on both the conventional synchronous approach and the proposed NCL-based approach using Mentor Graphics ModelSim and Xilinx simulation tools. Both designs were implemented in hardware on a specified Side-channel Attack Standard Evaluation FPGA Board (SASEBO-GII), and the corresponding power waveforms for both designs were collected. Along with the results of the software simulations, we analyzed these collected waveforms to validate the claimed benefits of the proposed crypto-hardware design approach.

    Index Terms

    Advanced Encryption Standard; Null Convention Logic; Key Expander; Round Function; Security;

    Side-Channel Attacks; WASSO Analysis; FPGA implementation; Power trace analysis.

    I. INTRODUCTION

    The Advanced Encryption Standard (AES) is the most widely used symmetric-key algorithm standard in different security protocols [1]. The algorithm was originally called Rijndael and gained popularity after it was selected, on its merits, as the AES. It is used by hundreds of millions of users worldwide to protect internet banking, wireless communications, and the data on their hard disks. AES was regarded as reliable in providing data security until researchers showed that side-channel attacks (SCA) could successfully compromise it. Since power analysis and electromagnetic (EM) analysis SCAs were shown to be effective, researchers have been exploring different approaches to designing countermeasures.

    Wave Dynamic Differential Logic (WDDL) [2] and Sense Amplifier Based Logic (SABL) [3] are some of the previously proposed countermeasures in the synchronous category. Both approaches, however, suffer from timing-related issues that could leak side-channel information. Jun Wu et al. [4] proposed an asynchronous S-box design that proved to be power efficient and resistant to side-channel attacks. Chunchun Sui et al. [5] proposed a design approach that combines the aforementioned S-box design with Random Dynamic Voltage Scaling (RDVS) to boost SCA resistance to a greater extent.

    This article proposes scalable dual-rail AES Key Expander and Round Function designs that incorporate the merits of Null Convention Logic (NCL). In this work, these two modules are then used to design an NCL-based subset of the AES cryptosystem. We call it a subset because an actual AES-128 uses the two modules for ten iterations, whereas the cryptosystem subset discussed in this work uses the two modules for only a single iteration.

    This work makes multiple contributions to the field of improving the SCA resistance of cryptosystems:

    1) The proposed approach contributes to uniform and reduced switching activity in the cryptosystem, thereby curtailing the leaked power information and improving resistance against power analysis SCA;
    2) The anticipated improved switching profile also translates to uniform and reduced EM radiation side-channel information emanating from the cryptosystem, and boosts the resistance of the cryptosystem against EMA SCA [6];
    3) The proposed Key Expander and Round Function designs can easily be scaled to implement the entire AES algorithm in any of the following variants: 128, 192 or 256 bit;
    4) They can also be easily scaled and implemented for different modes of AES, such as Electronic Codebook (ECB), Cipher Feedback (CFB) and Cipher Block Chaining (CBC);
    5) Both proposed designs incorporate a power-efficient NCL combinational substitution box design, which provides power benefits compared with the conventional approach;
    6) The proposed design can also be effectively coupled with active frequency/voltage scaling techniques such as RDVS (Random Dynamic Voltage Scaling) to enhance security to an even greater extent, as demonstrated previously for the AES S-box in [5].

    The rest of the article is arranged as follows. Section II gives the background on AES, NCL and the vulnerabilities of synchronous AES that is essential for understanding the proposed design techniques. Section III details the influence of switching activity on SCA, followed by Section IV, which deals with the variation of the SNR. Section V describes the proposed NCL AES Key Expander. Section VI describes the proposed NCL AES Round Function. Section VII discusses the results, which include the functional verification, WASSO analysis, hardware implementation and power trace analysis for both the conventional and proposed designs. This is finally followed by the conclusion and future work.

    II. PRELIMINARIES AND REVIEW

    A. Advanced Encryption Standard

    The AES algorithm is a symmetric block cipher that processes data blocks of 128 bits using cipher keys of three different lengths: 128, 192 or 256 bits. Its operations are performed on the State. The State is a two-dimensional array of bytes which contains the Plaintext, consisting of four rows and $N_b$ columns, where $N_b$ is the block length divided by 32. Similarly, the Key Schedule is a two-dimensional array of bytes which contains the Key.

    At the start of the cipher operation, the input Plaintext is copied to the State and the input Key is copied to the Key Schedule. After an initial Round Key addition, the State is transformed by a Round Function applied $N_r$ times. This number depends on the key length: $N_r = 10$ for 128 bits, $N_r = 12$ for 192 bits and $N_r = 14$ for a key length of 256 bits.
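    The parameter relationship above can be sketched in a few lines of Python. The function name `aes_parameters` is hypothetical, and the rule `Nr = Nk + 6` (with `Nk` the key length in 32-bit words) is simply a compact way of reproducing the three round counts listed in the text; it is an illustration, not part of the proposed hardware.

```python
def aes_parameters(key_bits):
    """Return (Nb, Nk, Nr) for an AES key length given in bits.

    Nb: number of State columns (block length / 32)
    Nk: number of 32-bit words in the key (assumed helper quantity)
    Nr: number of Round Function iterations
    """
    if key_bits not in (128, 192, 256):
        raise ValueError("AES supports only 128-, 192- or 256-bit keys")
    nb = 128 // 32        # AES block length is fixed at 128 bits
    nk = key_bits // 32   # key length in 32-bit words
    nr = nk + 6           # yields 10, 12 or 14 rounds
    return nb, nk, nr


if __name__ == "__main__":
    for bits in (128, 192, 256):
        print(bits, aes_parameters(bits))
```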

    Fig. 1. Block Diagram of AES Round Function with Key Expander. (Diagram labels: PlainText, Input_Key, Round_Key and Round Function Output; Round Function blocks SubBytes, ShiftRows, MixColumns and AddRoundKey, preceded by an initial AddRoundKey; Key Expander blocks RotateWord, SubWord, Round Constant and XOR.)

    Figure 1 shows the two main components of AES. Key Expander and Round Function have

    four basic byte-oriented transformations each, which are applied to the Key Schedule and the

    State respectively. These transformations are succinctly described in Table I.

    B. Vulnerability of Synchronous AES Design

    The main flaw of a synchronous AES implementation is its vulnerability to side-channel attacks (SCA). An SCA exploits any of the following leaked information to identify the secret key: power consumption, switching activity, timing information, electromagnetic leaks or acoustic information emanating from the cryptosystem. Power analysis SCA has proved to be an effective approach to compromising security. The attack is based on the fact that the leaked power consumption contains information about the module's behavior. Of the different types of power analysis attacks, Differential Power Analysis (DPA) [7] attacks are the most effective at revealing the hidden private key. DPA attacks use statistical analysis over multiple waveforms to reveal the secret key [8].

    TABLE I. TRANSFORMATIONS PRESENT IN KEY EXPANDER AND ROUND FUNCTION

    Transformations in Key Expander      Main Function
    RotateWord                           Single rotation on a column of Key Schedule
    SubWord                              Nonlinear byte substitution
    Round Constant                       Selection of round constant
    XOR                                  Makes round key round-constant dependent

    Transformations in Round Function    Main Function
    SubByte                              Nonlinear byte substitution
    ShiftRow                             Inter-column diffusion
    MixColumn                            Inter-byte diffusion within columns
    AddRoundKey                          Makes round function key dependent

    Just as the power consumption of CMOS devices is data-dependent, the electromagnetic radiation emanating from a cryptosystem is also data-dependent. This data-dependent radiation is again a source of side-channel information leakage. The leaked side-channel information is analyzed by means of Electromagnetic Analysis (EMA), which measures the electromagnetic fields near the cryptographic device [6] and uses this data to compromise its security. If we can curtail the leakage of side-channel information, and thereby make it difficult for the attacker to gather sufficient information to identify the segments in the power waveform and EM radiation, we can secure the cryptosystem more effectively against power analysis and EMA SCAs.

    C. Null Convention Logic

    NCL is a delay insensitive asynchronous paradigm. The delay insensitivity of NCL circuits

    is achieved by dual-rail and quad-rail logic [9]. A dual rail signal can effectively represent four

    states. Out of them, the three valid states are: DATA0, DATA1, NULL. The fourth state in which

    both the rails are asserted is considered as an illegal state. The valid data states DATA0, DATA1

    correspond to boolean logic 0, boolean logic 1, respectively. The control signal NULL is used for

    asynchronous handshaking. The clock free operation is implemented via the two delay-insensitive

    registers located on either side of the combinational circuit and the local handshaking signals.
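    The dual-rail encoding described above can be illustrated with a minimal Python sketch. The tuple ordering `(rail1, rail0)` and the helper names `encode_bit`, `encode_byte` and `decode` are assumptions made for illustration only; they model the signal states, not the timing or handshaking of a real NCL circuit.

```python
# Each dual-rail signal is modeled as a (rail1, rail0) tuple (assumed ordering).
NULL = (0, 0)     # spacer state: neither rail asserted
DATA0 = (0, 1)    # rail0 asserted -> Boolean logic 0
DATA1 = (1, 0)    # rail1 asserted -> Boolean logic 1
ILLEGAL = (1, 1)  # both rails asserted: not a valid NCL state


def encode_bit(bit):
    """Map a Boolean value to its dual-rail DATA state."""
    return DATA1 if bit else DATA0


def encode_byte(value):
    """Encode an 8-bit value, most significant bit first."""
    return [encode_bit((value >> i) & 1) for i in range(7, -1, -1)]


def decode(signal):
    """Return 0 or 1 for DATA, None for NULL; reject the illegal state."""
    if signal == ILLEGAL:
        raise ValueError("both rails asserted: illegal dual-rail state")
    if signal == NULL:
        return None
    return 1 if signal == DATA1 else 0


if __name__ == "__main__":
    print(encode_byte(0x23))
```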

    The main benefit of using dual-rail logic is that constant power consumption can be achieved, since each signal is implemented by two complementary wires. Furthermore, owing to their delay-insensitive nature, DI circuits adhere to monotonic transitions between DATA and NULL, so there is no glitching, unlike clocked Boolean circuits, which produce substantial glitch power. DI systems also distribute switching better over time and area, reducing the switching activity, peak power demand and system noise, whereas in clocked Boolean circuits much of the circuitry switches simultaneously at the clock edge. The downside is that the dual-rail method generally incurs area overhead.

    The independence of power consumption from input data and the overall reduction in power consumption, which are attributed to dual-rail logic and monotonic transitions respectively, boost the SCA resistance of the AES cryptosystem. Additionally, these merits reduce electromagnetic interference, making the AES more robust.

    III. INFLUENCE OF SWITCHING ACTIVITY ON SCA

    A. Role of Switching Activity on Power Analysis SCA

    The dynamic power consumption of CMOS gates is particularly relevant from a side-channel point of view, since it establishes a simple relationship between a device's internal data and its externally observable power consumption. It can be written as:

    $P_{dyn} = A \, C_L \, V_{dd}^{2} \, f$    (1)

    In equation (1), $P_{dyn}$ is the power consumed, $A$ is the switching activity factor, $C_L$ is the switched capacitance, $V_{dd}$ is the supply voltage, and $f$ is the clock frequency. This data-dependent power consumption is the origin of side-channel information leakage. If we are able to reduce the switching activity factor $A$ in equation (1), that translates directly into decreased dynamic power consumption. Thomas S. Messerges et al. [10] discussed the role of the SNR in determining the success probability of a DPA attack.

    $SNR = \frac{var(P_{expl})}{var(P_{noise})}$    (2)

    Equation (2) can be used to estimate the SNR [11]. In this equation, $var(P_{expl})$ is the variance of the exploitable component of the power consumption and $var(P_{noise})$ is the variance of the noise component. By reducing the exploitable power information $P_{expl}$ we can lower the SNR. The lower the SNR, the lower the leakage, and the harder it becomes to perform a DPA attack.
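    Equations (1) and (2) can be evaluated numerically with the following sketch. The function names and the sample values are hypothetical and carry no measured data; the population variance from Python's standard `statistics` module stands in for $var(\cdot)$.

```python
from statistics import pvariance


def dynamic_power(activity, c_load, v_dd, freq):
    """Equation (1): P_dyn = A * C_L * V_dd^2 * f."""
    return activity * c_load * v_dd ** 2 * freq


def snr(p_expl_samples, p_noise_samples):
    """Equation (2): ratio of the exploitable variance to the noise variance."""
    return pvariance(p_expl_samples) / pvariance(p_noise_samples)


if __name__ == "__main__":
    # Illustrative numbers only: halving the activity factor halves P_dyn.
    print(dynamic_power(0.5, 1e-12, 1.0, 1e8))
    print(dynamic_power(0.25, 1e-12, 1.0, 1e8))
    # A narrower spread of the exploitable component gives a lower SNR.
    print(snr([1.0, 3.0], [0.0, 2.0]))
    print(snr([1.0, 2.0], [0.0, 2.0]))
```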

    B. Role of Switching Activity on EMA SCA

    The switching activity also influences the EM radiation leaked from the cryptosystem. The voltage fluctuation caused by ground bounce can be expressed as [6]:

    $V = L_{eff} \, M \, \frac{dI}{dt}$    (3)

    In this equation, $L_{eff}$ is the effective parasitic inductance, $M$ is the number of simultaneous switching outputs, and $dI/dt$ is the rate of change of the current. It is therefore clear that if we are able to reduce the switching activity, and hence $M$, we can reduce the information leakage due to $V$, as $V \propto M$.

    IV. VARIATION OF SIGNAL-TO-NOISE RATIO

    As discussed in [11], each point of a power trace can be modeled as the sum of an operation-dependent component $P_{op}$, a data-dependent component $P_{data}$, electronic noise $P_{el.noise}$, and a constant component $P_{const}$. The relationship between the different components of the power consumption is given by (4), and in terms of variance it is given by (5).

    $P_{op} + P_{data} = P_{expl} + P_{sw.noise}$    (4)

    $var(P_{op}) + var(P_{data}) = var(P_{expl}) + var(P_{sw.noise})$    (5)

    In the context of a given attack scenario, the signal-to-noise ratio of a point of a power trace is given by equation (6). The SNR quantifies how much information is leaking from a point of a power trace. The higher the SNR, the higher the leakage.

    $SNR = \frac{var(P_{expl})}{var(P_{noise})} = \frac{var(P_{expl})}{var(P_{el.noise}) + var(P_{sw.noise})}$    (6)

    The plot presented in Figure 2 is taken from [11] and shows the distribution of power consumption when a microcontroller transfers different data from the internal memory to a register. The x-axis represents the voltage and the y-axis represents the probability of occurrence of the corresponding Hamming weight for 8-bit data. The figure reveals that for all data values with the same Hamming weight, the power consumption of the microcontroller is distributed in approximately the same way. This generally holds for any cryptographic circuit: the power consumption of the circuit depends on the Hamming weight of the data it processes. NCL can exploit this Hamming-weight-dependent power consumption to improve security: because NCL is a dual-rail logic, the Hamming weights of all inputs remain the same, as each input has an equal number of 1s and 0s.

    Applying this plot to the synchronous design and the proposed NCL-based design shows the benefits. In the synchronous case, first consider a series of three inputs applied to our cryptosystem, as listed in Figures 3 and 4. All three inputs have different Hamming weights, so they cause three distinct power traces, as the power consumption depends significantly on the Hamming weight. Now consider the asynchronous case and serially apply the same three inputs that were applied to the synchronous cryptosystem.

    From Figures 3 and 4 we can observe that, in the NCL case, the number of 1s is the same for all the inputs, which yields equal Hamming weights, contrary to the synchronous case. This shows that, due to NCL, $var(P_{data})$ decreases, as all the data have the same Hamming weight, and the resulting power distribution is narrower than that of the conventional approach, which is more spread out due to varying Hamming weights, as presented in Figure 2.

    Fig. 2. Distribution of the power consumption in relation to HW (Hamming Weight) of data that is processed [11].

    Data (hex)    Bits                 Hamming weight
    23            0 0 1 0 0 0 1 1      3
    5C            0 1 0 1 1 1 0 0      4
    7F            0 1 1 1 1 1 1 1      7

    Fig. 3. Hamming weights of data in Conventional Design

    Data (hex)    Dual-rail bits             Hamming weight
    23            01 01 10 01 01 01 10 10    8
    5C            01 10 01 10 10 10 01 01    8
    7F            01 10 10 10 10 10 10 10    8

    Fig. 4. Hamming weights of data in NCL based Design
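    The Hamming weight argument of Figures 3 and 4 can be checked with a short sketch. The helper names are hypothetical, and the dual-rail mapping (bit 1 as the rail pair "10", bit 0 as "01") is taken from the bit patterns shown for the NCL-based design.

```python
def hamming_weight(value):
    """Number of 1 bits in a single-rail (conventional) value."""
    return bin(value).count("1")


def dual_rail_bits(value, width=8):
    """Flatten a value into dual-rail bits: 1 -> (1, 0), 0 -> (0, 1)."""
    bits = []
    for i in range(width - 1, -1, -1):
        bit = (value >> i) & 1
        bits.extend((bit, 1 - bit))
    return bits


def dual_rail_hamming_weight(value, width=8):
    """Number of asserted rails; exactly one rail per bit is asserted."""
    return sum(dual_rail_bits(value, width))


if __name__ == "__main__":
    for v in (0x23, 0x5C, 0x7F):
        print(f"{v:02X}", hamming_weight(v), dual_rail_hamming_weight(v))
```

    Because exactly one rail of each pair is asserted for any DATA value, the dual-rail weight equals the word width regardless of the data, which is the property the text relies on.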

    Data (hex)    Bits                 Bit transitions from previous value
    23            0 0 1 0 0 0 1 1      -
    5C            0 1 0 1 1 1 0 0      7
    7F            0 1 1 1 1 1 1 1      3
    37            0 0 1 1 0 1 1 1      2
    89            1 0 0 0 1 0 0 1      6

    Fig. 5. Variations in number of bit transitions for different data for Conventional Design

    Data (hex)    Dual-rail bits             Bit transitions from previous value
    23            01 01 10 01 01 01 10 10    -
    NULL          00 00 00 00 00 00 00 00    8
    5C            01 10 01 10 10 10 01 01    8
    NULL          00 00 00 00 00 00 00 00    8
    7F            01 10 10 10 10 10 10 10    8
    NULL          00 00 00 00 00 00 00 00    8

    Fig. 6. Variations in number of bit transitions for different data for NCL based Design
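    The transition counts of Figures 5 and 6 can be reproduced with the sketch below. It counts toggling wires as the Hamming distance between consecutive words; the function names are hypothetical, and the model assumes that a NULL spacer follows every DATA word in the NCL case.

```python
def hamming_distance(a, b):
    """Number of bit positions in which two words differ."""
    return bin(a ^ b).count("1")


def conventional_transitions(values):
    """Bit toggles between consecutive single-rail words."""
    return [hamming_distance(a, b) for a, b in zip(values, values[1:])]


def dual_rail_word(value, width=8):
    """Pack a value into a dual-rail word: bit 1 -> 0b10, bit 0 -> 0b01."""
    word = 0
    for i in range(width - 1, -1, -1):
        bit = (value >> i) & 1
        word = (word << 2) | (0b10 if bit else 0b01)
    return word


def ncl_transitions(values, width=8):
    """Rail toggles when every DATA word is followed by a NULL spacer."""
    sequence = []
    for v in values:
        sequence.append(dual_rail_word(v, width))
        sequence.append(0)  # NULL: all rails de-asserted
    return [hamming_distance(a, b) for a, b in zip(sequence, sequence[1:])]


if __name__ == "__main__":
    print(conventional_transitions([0x23, 0x5C, 0x7F, 0x37, 0x89]))
    print(ncl_transitions([0x23, 0x5C, 0x7F]))
```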

    Additionally, due to NCL, the number of bit transitions is the same for any two input transitions, as the system alternates between DATA and NULL. From Figures 5 and 6 we can clearly see the random variations in the number of bit transitions occurring in the conventional synchronous design and the constant number of bit transitions occurring in the NCL-based design for the same set of data. This constant number of bit transitions makes the trace more uniform and thereby leads to a uniform SNR. Next, we analyze the effect of uniformly and randomly distributed data on SNR.

    In DPA, $var(P_{op}) = 0$ for uniformly distributed data, because the attacker performs the same operation again and again with different data. With uniformly distributed data, $var(P_{sw.noise}) = 0$. As discussed earlier, NCL decreases $var(P_{data})$, and this leads to a reduced $var(P_{expl})$.

    $SNR = \frac{\text{reduced } var(P_{expl})}{var(P_{el.noise})} \Rightarrow \text{reduced SNR}$    (7)

    For randomly distributed data, $var(P_{op}) = 0$ due to DPA, and $var(P_{data})$ decreases due to NCL, as explained earlier. Equation (5) therefore reduces to equation (8).

    $var(P_{data}) = var(P_{expl}) + var(P_{sw.noise})$    (8)

    Randomly distributed data causes $var(P_{sw.noise})$ to increase, as many of the bits are independently distributed and hence contribute to increased switching noise [11]. Consequently, these factors lead to a reduced $var(P_{expl})$ and finally a reduced SNR.

    V. NCL AES KEY EXPANDER DESIGN

    The AES algorithm uses a Key Expander to calculate the round keys used in the AddRoundKey stage of the Round Function. The AES specification refers to this process as KeyExpansion. The purpose of this unit is to generate multiple keys from an initial key: using a unique key for each round, instead of the same key for all rounds, greatly increases the diffusion of bits. For this research we chose AES with a key size of 128 bits.

    The controller for the NCL AES Key Expander and Round Function is shown in Figure 7. In this control unit, the input data, which is in ordinary binary format, is read and converted into dual-rail inputs by the Single Rail to Dual Rail Converter. Ko is the output acknowledgement signal coming out of the NCL Round Function and Key Expander. It acts like a clock signal for the other units in the controller. The converter and multiplexer (MUX) are controlled by Ko. When Ko is 1, the NCL Round Function and Key Expander are ready for a NULL wavefront, and the MUX sends all 0s to PlainText and Input Key to nullify the NCL Key Expander and Round Function. Otherwise, the MUX selects the dual-rail data output by the converter. The dual-rail Input Key is fed as input to the NCL Key Expander, which generates the Round Keys necessary for each encryption round of AES.
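    The behavior of the converter and MUX described above can be modeled as follows. This is only a behavioral sketch under stated assumptions: the names `to_dual_rail` and `control_unit_output` are hypothetical, signals are modeled as `(rail1, rail0)` tuples, and the Ko polarity (Ko = 1 selects the NULL wavefront) follows the description in the text rather than any particular NCL library convention.

```python
NULL = (0, 0)  # spacer value sent on every dual-rail signal


def to_dual_rail(value, width=128):
    """Single-rail to dual-rail conversion, most significant bit first."""
    rails = []
    for i in range(width - 1, -1, -1):
        bit = (value >> i) & 1
        rails.append((1, 0) if bit else (0, 1))
    return rails


def control_unit_output(ko, plaintext, input_key, width=128):
    """Model of the 2:1 MUXes: NULL wavefront when Ko is 1, DATA otherwise."""
    if ko == 1:
        null_wavefront = [NULL] * width
        return null_wavefront, list(null_wavefront)
    return to_dual_rail(plaintext, width), to_dual_rail(input_key, width)


if __name__ == "__main__":
    # Small width used purely to keep the printed output readable.
    print(control_unit_output(0, 0b01, 0b00, width=2))
    print(control_unit_output(1, 0b01, 0b00, width=2))
```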

    Fig. 7. Block Diagram of NCL AES Control Unit. (Diagram labels: PlainText and Input_Key inputs; Single Rail to Dual Rail Converter; two 2:1 MUXes selecting between the 256-rail dual-rail data and 0; Reset; Ko fed back to Ki; dual-rail PlainText, Input_Key and Round_Key; NCL AES Key Expander; output to the NCL AES Round Function.)

    Fig. 8. Block Diagram of AES Key Expander [12]. (Diagram labels: Key Schedule columns W0 to W11; RSX blocks; RotateWord; SubWord; XOR; Round Constant.)

    The block diagram of the Key Expander architecture [12] is presented in Figure 8. Here $w_0$, $w_1$, $w_2$, $w_3$ are the four columns of the Key Schedule. The columns of the Key Schedule whose index is a multiple of four undergo the RSX step along with the XOR operation; all the remaining columns undergo only XOR operations to generate the Round Key. As depicted in the figure, the Key Expander consists of the following modules:

    RotateWord: This operation accepts an array of 4 bytes and rotates them 1 position to the

    left. The RotWord function used by KeyExpansion is very similar to the ShiftRows routine used

    by the encryption algorithm except that it works on a single column of the key schedule, instead

    of the rows of the State array.
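    The rotation can be written in one line of Python. The function name `rot_word` is illustrative, and the sketch operates on ordinary single-rail bytes rather than on dual-rail NCL signals.

```python
def rot_word(word):
    """Rotate a 4-byte word one position to the left."""
    if len(word) != 4:
        raise ValueError("RotateWord expects exactly 4 bytes")
    return word[1:] + word[:1]


if __name__ == "__main__":
    print([f"{b:02x}" for b in rot_word([0x09, 0xCF, 0x4F, 0x3C])])
```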

    SubWord: The SubWord routine performs a byte-by-byte substitution on a given row of the key schedule table using the NCL S-box. The substitutions in KeyExpansion operate exactly like those in the SubBytes step of the Round Function. The byte to be substituted is fed as input to the NCL combinational S-box, where it undergoes Multiplicative Inversion in $GF(2^8)$ followed by an Affine Transformation during encryption. We employed the dual-rail combinational NCL S-box proposed in [4] for this step, as this design has already proved to be very power efficient and resistant to SCA. The architecture of the S-box and the block diagram of its internal Multiplicative Inversion module are presented in Figures 9 and 10.

    Fig. 9. Combinational S-box architecture.

    Fig. 10. Block diagram of Multiplicative Inversion over $GF(2^8)$, where MM is the modular multiplication unit.
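    The two S-box stages named above, multiplicative inversion in $GF(2^8)$ followed by the affine transformation, can be sketched in software as shown below. This is a plain single-rail reference model for checking values, not the NCL architecture of [4]; the function names are hypothetical, the inverse is computed by naive exponentiation for clarity, and the reduction polynomial 0x11B and affine constant 0x63 are the standard AES values.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    result = 0
    for _ in range(8):
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
    return result


def gf_inv(a):
    """Multiplicative inverse as a^254; 0 is mapped to 0 by convention."""
    if a == 0:
        return 0
    result = 1
    for _ in range(254):
        result = gf_mul(result, a)
    return result


def affine(x):
    """AES affine transformation with constant 0x63."""
    out = 0
    for i in range(8):
        bit = (x >> i) ^ (x >> ((i + 4) % 8)) ^ (x >> ((i + 5) % 8))
        bit ^= (x >> ((i + 6) % 8)) ^ (x >> ((i + 7) % 8)) ^ (0x63 >> i)
        out |= (bit & 1) << i
    return out


def sbox(byte):
    """Inversion followed by the affine transformation."""
    return affine(gf_inv(byte))


def sub_word(word):
    """Apply the S-box to each byte of a 4-byte word."""
    return [sbox(b) for b in word]


if __name__ == "__main__":
    print([f"{b:02x}" for b in sub_word([0xCF, 0x4F, 0x3C, 0x09])])
```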

    Round Constant module: This module uses an array Rcon, called the round constant table. In

    the synchronous implementation, these round constants are 4 bytes each to match with a column

    of the key schedule table. The AES KeyExpansion routine [1] requires 10 round constants, one

    for each round of the AES algorithm. In our implementation we implement this as an array of round constants represented in dual rail notation.
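    The round constant table can be generated rather than stored, as in the sketch below. The function name `round_constants` is hypothetical and the output is in ordinary single-rail bytes; a dual-rail table as used in the proposed design would encode each of these bits as a rail pair.

```python
def round_constants(count=10):
    """Return `count` 4-byte round constants [rc, 0, 0, 0].

    rc starts at 0x01 and is doubled in GF(2^8) for each round,
    reducing by the AES polynomial 0x11B on overflow.
    """
    constants = []
    rc = 0x01
    for _ in range(count):
        constants.append([rc, 0x00, 0x00, 0x00])
        rc <<= 1
        if rc & 0x100:
            rc ^= 0x11B
    return constants


if __name__ == "__main__":
    print([f"{word[0]:02x}" for word in round_constants()])
```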

    XOR module: In this module we perform the XOR operation between the columns of the Key

    Schedule with or without the round constant selected in previous step depending on the column

    which is being calculated. In order to realize this XOR function in NCL we have to make use

    of NCL XOR function designed using the NCL threshold gates:

    Fig. 11. K-Map for XOR (inputs A and B; output 0 when A = B, 1 otherwise)

    Fig. 12. NCL XOR Function using THxor gates (two THxor0 gates with inputs A1, A0, B1, B0 and outputs Z0, Z1)

    Fig. 13. NCL XOR Function using TH24comp gates (two TH24comp gates with inputs A1, A0, B1, B0 and outputs Z0, Z1)

    Unlike Boolean logic, NCL has 27 fundamental threshold gates [9] with which to realize arbitrary logic. To achieve input-completeness and observability, it is important to choose appropriate threshold gates. For the design of the NCL XOR function, according to the Karnaugh map in Figure 11, the sum-of-products (SOP) expressions are $Z^1 = A^1 B^0 + A^0 B^1$ and $Z^0 = A^0 B^0 + A^1 B^1$. They can be realized by mapping them to THxor0 gates, as shown in Figure 12. However, two transistors can be eliminated for each rail of Z (when using static gates) by realizing the same functionality using TH24comp gates. This is done by adding the two don't-care terms, which represent the cases when both rails of either A or B are simultaneously asserted.

    The new equations are $Z^1 = A^1 B^0 + A^0 B^1 + A^0 A^1 + B^0 B^1$ and $Z^0 = A^0 B^0 + A^1 B^1 + A^0 A^1 + B^0 B^1$. The NCL XOR function realized using these equations and TH24comp gates is presented in Figure 13 and is used in our proposed design. This TH24comp-based XOR offers a 10% reduction in the number of transistors required compared with the approach using THxor0 gates.
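    The equivalence of the two XOR realizations on valid inputs can be checked with a small sketch. It models only the set condition of each gate as a Boolean function and ignores the hysteresis (state-holding) behavior of real NCL threshold gates; the function names are hypothetical, and the factored form $(A + B)(C + D)$ used for the TH24comp gate is an assumption that expands to exactly the four product terms in the equations above.

```python
def xor_sop(a1, a0, b1, b0):
    """Original SOP form: Z1 = A1 B0 + A0 B1, Z0 = A0 B0 + A1 B1."""
    z1 = (a1 & b0) | (a0 & b1)
    z0 = (a0 & b0) | (a1 & b1)
    return z1, z0


def th24comp(a, b, c, d):
    """Set condition assumed for TH24comp: AC + BC + AD + BD."""
    return (a | b) & (c | d)


def xor_th24comp(a1, a0, b1, b0):
    """SOP form with the two don't-care terms A0 A1 and B0 B1 added."""
    z1 = th24comp(a1, b1, a0, b0)  # A1 B0 + A0 B1 + A0 A1 + B0 B1
    z0 = th24comp(a0, b1, a1, b0)  # A0 B0 + A1 B1 + A0 A1 + B0 B1
    return z1, z0


if __name__ == "__main__":
    for a in (0, 1):
        for b in (0, 1):
            rails = (a, 1 - a, b, 1 - b)  # (A1, A0, B1, B0)
            print(a, b, xor_sop(*rails), xor_th24comp(*rails))
```

    The two forms differ only when both rails of A or of B are asserted, which is the illegal state and never occurs in a correctly operating circuit.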

    VI. NCL AES ROUND FUNCTION

    The top-level architecture of the proposed NCL AES Round Function design is presented in

    Figure 14. The controller for this module was presented previously in Figure 7. This control

    unit takes care of converting the ordinary PlainText and Input Key into dual rail notation. The

    dual rail Input Key is fed as input to the NCL Key Expander and it generates the Round Key,

    which along with the dual rail PlainText from the controller is fed to the AES Round Function.

    Fig. 14. Block Diagram of NCL AES Round Function Top-level architecture. (Diagram labels: Control Unit; NCL AES Key Expander; NCL AES Round Function; 256-rail dual-rail PlainText, Input_Key, Round_Key and RoundFunc_op signals; Ko; Ki; Reset.)

    The NCL AES Round Function consists of the following four steps which are performed

    sequentially:

    1) NCL SubBytes: This transformation is presented in Figure 15, where each dual-rail byte of the State matrix is substituted independently by another one computed by the NCL S-box. The S-box is a key element in the AES architecture, as it significantly influences the security, power consumption and throughput of the AES hardware. We use the dual-rail combinational NCL S-box proposed in [4] for this step, as this design has already proved to be very power efficient and resistant to SCA.

    2) NCL ShiftRows: The NCL ShiftRows transformation presented in Figure 16 performs byte transposition of all dual-rail NCL signals using circular shifting, where each row of the dual-rail State is rotated cyclically to the left using 0, 1, 2 and 3-byte offsets for encryption.

    Fig. 15. NCL SubBytes

    Fig. 16. NCL ShiftRows

    3) NCL MixColumns: In this transformation each column of the dual-rail State matrix is multiplied by a circulant maximum distance separable matrix. The MixColumns function shown in Figure 17 takes four dual-rail bytes as input and outputs four dual-rail bytes, where each input byte affects all four output bytes. The multiplication of a state array element by 2 in the dual-rail domain is realized by a 1-bit left shift of the dual-rail signals followed by a conditional NCL XOR operation. The multiplication by 3 is implemented in a similar fashion but involves an additional NCL XOR operation.

    4) NCL AddRoundKey: The AddRoundKey transformation shown in Figure 18 performs a byte-level dual-rail XOR operation on the dual-rail output of MixColumns and the corresponding dual-rail round key.

    Fig. 17. NCL MixColumns

    Fig. 18. NCL AddRoundKey
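    The ShiftRows, MixColumns and AddRoundKey steps listed above can be modeled in software as follows. This is a single-rail Boolean reference sketch, not the dual-rail NCL hardware; the function names are hypothetical, and the State is assumed to be stored as four columns of four bytes each.

```python
def shift_rows(state):
    """Rotate row r of the State left by r bytes (state = 4 columns)."""
    return [[state[(c + r) % 4][r] for r in range(4)] for c in range(4)]


def xtime(byte):
    """Multiply by 2 in GF(2^8): left shift, then conditional XOR."""
    byte <<= 1
    if byte & 0x100:
        byte ^= 0x11B
    return byte


def mix_column(col):
    """Multiply one column by the circulant matrix with first row (2, 3, 1, 1)."""
    a0, a1, a2, a3 = col
    return [
        xtime(a0) ^ (xtime(a1) ^ a1) ^ a2 ^ a3,
        a0 ^ xtime(a1) ^ (xtime(a2) ^ a2) ^ a3,
        a0 ^ a1 ^ xtime(a2) ^ (xtime(a3) ^ a3),
        (xtime(a0) ^ a0) ^ a1 ^ a2 ^ xtime(a3),
    ]


def mix_columns(state):
    """Apply mix_column to each of the four State columns."""
    return [mix_column(col) for col in state]


def add_round_key(state, round_key):
    """Byte-wise XOR of the State with the round key (same layout)."""
    return [[s ^ k for s, k in zip(sc, kc)] for sc, kc in zip(state, round_key)]


if __name__ == "__main__":
    column = [0xD4, 0xBF, 0x5D, 0x30]
    print([f"{b:02x}" for b in mix_column(column)])
```

    Multiplication by 3 is expressed as `xtime(a) ^ a`, mirroring the description of a multiplication by 2 followed by one additional XOR.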

    VII. EXPERIMENTAL VERIFICATION OF THE PROPOSED DESIGN

    A. Functional Verification of Proposed Design

    The traditional synchronous implementation and the proposed NCL AES Key Expander and NCL AES Round Function have been implemented in VHDL. The functional verification simulations of these designs were performed with Mentor Graphics ModelSim. The proposed designs have been fully functionally verified using a large set of test vectors chosen from [1]. A sample test vector is presented in Figure 19 and the corresponding functional verification results are presented in Figures 20, 21 and 22.

    Fig. 19. Test Vectors

    Fig. 20. Functional Verification Result for Synchronous Design

    Fig. 21. Functional Verification Result for the proposed NCL based Key Expander Design

    Fig. 22. Functional Verification Result for the proposed NCL based Round Function Design

    B. Weighted Average Simultaneous Switching Output Analysis

    The WASSO tool is a utility of the Xilinx PlanAhead suite that validates the signal integrity of the device. This analysis gives a measure of the amount of simultaneous switching occurring in the design.

    So we used this analysis to determine the variation in switching activity across both the AES

    Round Function designs. The results obtained were plotted and presented in Figure 23. The

    implementation platform chosen for carrying out WASSO analysis is Xilinx Virtex-5 FPGA. As

    switching activity directly depends on the number of simultaneously switching outputs, switching

    activity can be reduced if SNR gets reduced.

    Fig. 23. WASSO utilization plots for individual banks and neighbors.

    From Figures 23 (a) and (b), it can be observed that the switching activity in the proposed design is considerably reduced and is also more uniform than in its synchronous counterpart. This reduction decreases the amount of unintentionally leaked information, and the uniformity makes it more difficult to exploit the remaining leaked information to carry out SCAs.

    C. Effects of Switching Activity on Signal-to-Noise ratio

    According to equation (2), it is clear that SNR is directly proportional to var (P expl ). The P expl

    is a combination of two quantities P oprn and P data . But var (P oprn ) is zero as we are considering

    a DPA attack, in which we perform the same operation again and again but with different input

  • 8/12/2019 Asynchronous AES-Side Attack IEEE

    18/24

    18

    data. So var(P_expl) becomes equal to var(P_data). P_data is data dependent and is a function

    of switching activity. So the reduction of switching activity observed in the WASSO simulations

    will translate into a reduction of P_data at all points on the power trace. This overall reduction

    of P_data will translate into a reduction of var(P_expl) and consequently a reduction of SNR.

    Additionally, as discussed previously, the power consumption of a cryptosystem depends heavily

    on the Hamming weight of the data it processes. Because of this, the equal Hamming weights of all inputs

    in our proposed design enable our NCL design to maintain a uniform power consumption

    and thereby a uniform SNR on the power trace. Thus the proposed design enables the cryptosystem

    to have a reduced and uniform SNR, which is a key element for enhancing security.
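
    The Hamming-weight argument can be illustrated with a minimal Python sketch. It assumes an

    idealized model in which the data-dependent power P_data is simply proportional to the Hamming

    weight of the processed value, and it models dual-rail encoding by mapping each bit to a rail

    pair (01 for logic 0, 10 for logic 1). The helper names are hypothetical and are not taken from

    the authors' implementation.

```python
import random
from statistics import pvariance


def hamming_weight(x: int) -> int:
    """Number of bits set to 1 in x."""
    return bin(x).count("1")


def dual_rail_encode(value: int, width: int = 8) -> int:
    """Map each bit b to a rail pair: 0 -> 0b01, 1 -> 0b10 (illustrative encoding)."""
    encoded = 0
    for i in range(width):
        bit = (value >> i) & 1
        pair = 0b10 if bit else 0b01
        encoded |= pair << (2 * i)
    return encoded


def data_power_variance(values) -> float:
    """var(P_data) under the assumed model P_data ~ Hamming weight."""
    return pvariance([hamming_weight(v) for v in values])


random.seed(0)  # fixed seed so the sketch is repeatable
inputs = [random.randrange(256) for _ in range(1000)]

single_rail_var = data_power_variance(inputs)
dual_rail_var = data_power_variance([dual_rail_encode(b) for b in inputs])

print("single-rail var(P_data):", single_rail_var)
print("dual-rail   var(P_data):", dual_rail_var)
```

    Under this simplified model every dual-rail DATA word asserts the same number of rails, so

    var(P_data) collapses to zero while the single-rail variance stays positive. Real hardware will

    deviate from this ideal, for example because of load imbalance between the two rails.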

    Using the switching activity results, we performed parametric simulations and plotted the SNR

    of the NCL design in comparison to the synchronous approach. These approximate results are

    presented in Figure 24(a). Using this SNR data, Figure 24(b) shows how variation in SNR

    influences the number of traces that an attacker must collect to perform a successful DPA attack.

    As SNR decreases, the performance of this NCL based approach keeps improving. This

    is the advantage of employing NCL for cryptosystem design.
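
    To give a feel for how SNR can drive the number of traces, the sketch below uses a rule of thumb

    that is commonly quoted in the power analysis literature (see, for example, [8]): the observable

    correlation shrinks as rho = rho_ideal / sqrt(1 + 1/SNR), and the trace count grows roughly as

    3 + 8 * (z / ln((1 + rho) / (1 - rho)))^2. This is only an illustration and is not necessarily

    the model behind Figure 24(b). The function names, rho_ideal = 1.0 and z = 3.719 (the 0.9999

    quantile of the standard normal distribution) are assumptions chosen for the example.

```python
import math


def correlation_from_snr(snr: float, rho_ideal: float = 1.0) -> float:
    """Correlation seen by the attacker once noise is taken into account."""
    if snr <= 0:
        raise ValueError("snr must be positive")
    return rho_ideal / math.sqrt(1.0 + 1.0 / snr)


def traces_needed(rho: float, z: float = 3.719) -> float:
    """Rule-of-thumb number of traces needed to detect correlation rho."""
    if not 0.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between 0 and 1")
    return 3.0 + 8.0 * (z / math.log((1.0 + rho) / (1.0 - rho))) ** 2


# Hypothetical SNR values, from a leakier design to a less leaky one.
snr_values = [1.0, 0.1, 0.01]
estimates = [traces_needed(correlation_from_snr(s)) for s in snr_values]

for snr, n in zip(snr_values, estimates):
    print(f"SNR = {snr:<5} -> roughly {math.ceil(n)} traces")
```

    With this approximation the estimated trace count rises steeply as SNR falls, which matches the

    qualitative trend described for Figure 24(b).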

    Fig. 24. Comparison of SNR and Difficulty of performing successful DPA for both designs

    D. Power Benefits

    In AES implementations, the SubBytes transformation, which depends entirely on the S-box,

    is the most crucial factor in deciding the energy performance of AES itself. More than 50% of

    the total power is consumed in this step [13] [14] [15]. Due to the use of the novel NCL S-box design

    we achieve a 22% reduction in power consumption [4] at this SubBytes step. This reduction

    will significantly improve the energy efficiency of the proposed NCL based design

    approach.
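
    A rough back-of-the-envelope estimate of what these two figures imply for the whole design is

    given below. It assumes, hypothetically, that the S-box accounts for a fraction f of total power,

    that the 22% saving from [4] applies to that fraction only, and that the power of the rest of

    the design is unchanged. The symbols f and r are introduced here for illustration only.

```latex
\[
\frac{\Delta P_{\text{total}}}{P_{\text{total}}} = f \cdot r ,
\qquad f \ge 0.5,\quad r = 0.22
\;\Longrightarrow\;
\frac{\Delta P_{\text{total}}}{P_{\text{total}}} \ge 0.5 \times 0.22 = 0.11 .
\]
```

    Under these assumptions the overall saving would be at least about 11%. The actual figure

    depends on how the remaining NCL logic compares with its synchronous equivalent.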

    E. Hardware Implementation and Power Trace Analysis

    In the previous section, the performance of our proposed design was evaluated using software

    simulations. However, to get a more accurate performance analysis, measurements on a hardware

    implementation are necessary. In this section we discuss in detail the procedure used for the hardware

    implementation experiment of the proposed design and the synchronous AES. Additionally we

    present the power trace data obtained from the power measurements on the hardware implementations

    and discuss the differences between the data obtained for the two designs. Figure 25

    shows the Side-channel Attack Standard Evaluation Board (SASEBO-GII board) [16] that is

    used as the basic platform in this experiment.

    Fig. 25. Side-channel Attack Standard Evaluation FPGA Board (SASEBO-GII)

    This FPGA board was chosen as the platform for hardware implementation because

    it has been specifically designed for security evaluation of cryptographic circuits and

    for side-channel attack experiments. There are two FPGAs on this board

    that can be utilized. The first is the cryptographic FPGA, which is a Xilinx Virtex-5 series

    FPGA. The second is the control FPGA, which is a Spartan-3A series FPGA. These FPGAs

    are connected through a general-purpose input/output common bus. The AES Round Function

    and Key Expander circuits are implemented in the cryptographic FPGA and the configuration

    circuit is programmed into the control FPGA. The purpose of separating these two circuits

    is to prevent the power trace of the configuration circuit from interfering with the power trace of

    the cryptographic circuit, so that the power traces, which determine the resistance

    of the design to power analysis attacks, can be measured fairly.

    For power trace measurement, shunt resistors are present on the FPGA board. They are placed

    on the core VDD and/or ground lines of the cryptographic FPGA to give an accurate measurement

    of the cryptographic FPGA power consumption. These measurements can be captured by an

    oscilloscope via a voltage probe.
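
    The conversion from probe readings to a power trace can be sketched as follows. The sketch

    assumes the oscilloscope samples the voltage drop across a shunt in series with the core supply,

    so that I = V_shunt / R_shunt and P is approximately V_core * I when the drop is small. The

    resistor value, core voltage and sample values below are hypothetical placeholders, not values

    reported for this experiment.

```python
def shunt_voltage_to_power(v_shunt_samples, r_shunt_ohm, v_core):
    """Convert shunt voltage samples (volts) into approximate power samples (watts)."""
    if r_shunt_ohm <= 0:
        raise ValueError("shunt resistance must be positive")
    # Ohm's law gives the core current; multiplying by the core voltage
    # approximates the power drawn by the cryptographic FPGA.
    return [v_core * (v / r_shunt_ohm) for v in v_shunt_samples]


R_SHUNT_OHM = 1.0  # hypothetical shunt resistance
V_CORE = 1.0       # hypothetical core supply voltage

samples = [0.010, 0.025, 0.015]  # hypothetical probe readings in volts
power_trace = shunt_voltage_to_power(samples, R_SHUNT_OHM, V_CORE)
print(power_trace)
```

    For side-channel analysis only the shape of the trace matters, so the raw shunt voltage is often

    used directly. The explicit conversion is shown here to make the measurement principle concrete.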

    Figure 26 presents the experimental setup used for power trace analysis. To make a qualitative

    comparison, in terms of security, between the power traces of the conventional

    design and the proposed NCL design, we supply a set of three inputs to both designs. Because

    the same inputs are applied to both designs, we can evaluate the response of

    the different circuits to the same input data.

    If we can show that the following two features of the power trace hold for the NCL

    based design, then we can conclude that the proposed approach enhances security: (1)

    the power trace is more uniform than that of the synchronous design for the same input, and (2)

    the power traces of the NCL based approach exhibit a higher degree of similarity across the

    three different input cases than those of the synchronous approach.
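
    The comparison in this work is qualitative, but the two features could also be expressed as

    simple numbers. The sketch below is one possible way to do so and is not the authors' procedure:

    it uses the coefficient of variation of a trace as a uniformity measure for feature (1) and the

    mean absolute point-wise difference between traces as a similarity measure for feature (2). The

    function names and the toy traces are hypothetical.

```python
from itertools import combinations
from statistics import mean, pstdev


def coefficient_of_variation(trace):
    """Feature (1): lower values mean a more uniform trace."""
    return pstdev(trace) / mean(trace)


def mean_pairwise_distance(traces):
    """Feature (2): lower values mean traces for different inputs look more alike.

    Traces are assumed to be aligned and of equal length.
    """
    distances = []
    for t1, t2 in combinations(traces, 2):
        distances.append(mean([abs(a - b) for a, b in zip(t1, t2)]))
    return mean(distances)


# Toy traces for three plaintexts (made-up numbers, not measured data).
sync_traces = [[1.0, 4.0, 2.0, 6.0], [1.0, 7.0, 2.0, 3.0], [1.0, 2.0, 5.0, 6.0]]
ncl_traces = [[3.0, 3.1, 3.0, 3.1], [3.0, 3.1, 3.1, 3.0], [3.1, 3.0, 3.0, 3.1]]

for name, traces in (("synchronous", sync_traces), ("NCL", ncl_traces)):
    worst_cv = max(coefficient_of_variation(t) for t in traces)
    print(name, "worst CV:", worst_cv, "pairwise distance:", mean_pairwise_distance(traces))
```

    On measured data, smaller values of both metrics for the NCL design would support features (1)

    and (2) respectively.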

    To perform this qualitative comparison, we applied a series of three plaintexts, which

    are shown in Figure 27, to both cryptosystem designs and encrypted them with the same key. Then

    we recorded the power traces for each of these cases for both designs and compared their quality

    in terms of security. The results are presented in Figures 28 to 33.

    Fig. 26. Experimental Setup for Power Trace Measurement.

    Fig. 27. Plaintexts and Key used for Power Trace Analysis.

    Fig. 28. Power Trace of Synchronous cryptosystem for Plaintext 1.

    Fig. 29. Power Trace of Asynchronous cryptosystem for Plaintext 1 (DATA).

    Fig. 30. Power Trace of Synchronous cryptosystem for Plaintext 2.

    Fig. 31. Power Trace of Asynchronous cryptosystem for Plaintext 2 (DATA).

    Fig. 32. Power Trace of Synchronous cryptosystem for Plaintext 3.

    Fig. 33. Power Trace of Asynchronous cryptosystem for Plaintext 3 (DATA).

    From Figures 29, 31 and 33 we can clearly see that the power waveforms look considerably

    similar for the proposed design in all three cases, even though the input plaintext is different.

    For the synchronous design, in contrast, Figures 28, 30 and 32 show that the power

    trace has clear variations between the three cases, as marked by ovals. These variations,

    as discussed previously, can be effectively exploited to compromise security. In the

    proposed design, however, we do not see any clear variations between the three traces. In addition to the

    lack of these variations, we can also see that the waveforms of the proposed design are far more

    uniform than their synchronous counterparts.

    With this increased uniformity and high degree of similarity between power traces for

    different plaintexts, we can conclude that security is improved to a considerable extent by the

    inherent benefits of NCL.

    Figure 34 shows the power trace corresponding to NULL - DATA wavefronts in the hardware

    implemented design. Figure 35 presents the propagation delay in the hardware implementation

    of the proposed design. After the input is applied, the output arrives after 40 ns.

    Fig. 34. Power Traces of NULL - DATA wavefronts in hardware implementation of proposed design.

    Fig. 35. Propagation Delay in NCL based design.

    VIII. CONCLUSION AND FUTURE WORK

    A novel design approach for the two main components of NCL AES, the Key

    Expander and the Round Function, is reported and validated in this work. This research is being

    used as the basis for a research project that aims to tape out a silicon chip of the NCL AES design,

    which can be used to carry out further performance evaluation experiments. In contrast to existing

    countermeasures, which do not eliminate the source of the SCA problem and try to find solutions in

    later stages, the proposed approach combines the merits of dual-rail encoding with an asynchronous

    design approach to eliminate the source of the SCA problem, which is side-channel information

    leakage. In addition to providing power analysis SCA resistance, our approach also enhances

    resistance to EMA SCAs.

    Qualitative comparisons between the proposed approach and the traditional synchronous design

    have been conducted to verify the merits of the proposed design. Both software simulation and hardware

    implementation results support the effectiveness and correctness of our approach.

    In this work we validated the benefit of using NCL for cryptosystem design by using WASSO

    and power trace analysis. In future work, the performance of this design can be evaluated by

    performing an actual side-channel attack such as DPA or Correlation Power Analysis (CPA),

    as performed for the NCL AES S-box in [17]. This will provide a more accurate view of the

    cryptosystem's resistance to side-channel attacks. As stated previously, further research may also

    aim at efficiently coupling this design with RDVS, as demonstrated for the S-box in [5], to improve

    security to a greater extent.

    REFERENCES

    [1] NIST, Advanced Encryption Standard (AES), FIPS PUB 197. National Institute of Standards and Technology, Nov. 2001.

    [2] I. Verbauwhede and K. Tiri, "A dynamic and differential CMOS logic style to resist power and timing attacks on security ICs," IACR Cryptology ePrint Archive, 2004.

    [3] K. Tiri and I. Verbauwhede, "A logic level design methodology for a secure DPA resistant ASIC or FPGA implementation," 2004.

    [4] J. Wu, Y.-B. Kim, and M. Choi, "Low-power side-channel attack-resistant asynchronous S-box design for AES cryptosystems," in Proceedings of the 20th symposium on Great lakes symposium on VLSI, ser. GLSVLSI '10. New York, NY, USA: ACM, 2010, pp. 459-464.

    [5] C. Sui, J. Wu, Y. Shi, Y.-B. Kim, and M. Choi, "Random dynamic voltage scaling design to enhance security of NCL S-box," in 2011 IEEE 54th International Midwest Symposium on Circuits and Systems (MWSCAS), Aug. 2011, pp. 1-4.

    [6] T. Sugawara, Y.-I. Hayashi, N. Homma, T. Mizuki, T. Aoki, H. Sone, and A. Satoh, Information Security Applications, H. Y. Youm and M. Yung, Eds. Berlin, Heidelberg: Springer-Verlag, 2009, ch. Mechanism behind Information Leakage in Electromagnetic Analysis of Cryptographic Modules, pp. 66-78.

    [7] P. Kocher, J. Jaffe, and B. Jun, "Introduction to differential power analysis and related attacks," Technical Report, Cryptography Research Inc., San Francisco, California, 1998.

    [8] S. Mangard, E. Oswald, and T. Popp, Power Analysis Attacks: Revealing the Secrets of Smart Cards. Springer-Verlag, 2007.

    [9] S. Smith and J. Di, "Designing asynchronous circuits using NULL convention logic (NCL)," Synthesis Lectures on Digital Circuits and Systems, vol. 4, no. 1, pp. 1-96, 2009.

    [10] T. S. Messerges, E. A. Dabbish, and R. H. Sloan, "Examining smart-card security under the threat of power analysis attacks," IEEE Trans. Comput., vol. 51, pp. 541-552, May 2002.

    [11] S. Mangard, E. Oswald, and T. Popp, Power Analysis Attacks: Revealing the Secrets of Smart Cards. Springer, March 12, 2007.

    [12] A. Kak, "Lecture notes on computer and network security by Avinash Kak," March 2012. [Online]. Available: https://engineering.purdue.edu/kak/compsec/NewLectures/Lecture8.pdf

    [13] "An optimized S-box circuit architecture for low power AES design," in Cryptographic Hardware and Embedded Systems - CHES 2002, 4th International Workshop, Redwood Shores, CA, USA, August 13-15, 2002, Revised Papers, 2002, pp. 172-186.

    [14] M. Kim, J. Kim, and Y. Choi, "Low power circuit architecture of AES crypto module for wireless sensor network."

    [15] F. K. Gurkaynak, GALS system design: side channel attack secure cryptographic accelerators. ETH, 2006.

    [16] Research Center for Information Security, "Side-channel attack standard evaluation board SASEBO-GII specification," September 2009. [Online]. Available: http://www.rcis.aist.go.jp/special/SASEBO/SASEBO-GII-en.html

    [17] J. Wu, Y. Shi, and M. Choi, "FPGA-based measurement and evaluation of power analysis attack resistant asynchronous S-box," in Instrumentation and Measurement Technology Conference (I2MTC), 2011 IEEE, May 2011, pp. 1-6.