
REAL-TIME VIDEO STREAMING OVER

WIMAX NETWORKS WITH H.264 CODING

by

Pariya Raoufi

B.Sc., Sharif University of Technology, 2008

a Thesis submitted in partial fulfillment

of the requirements for the degree of

Master of Science

in the

School of Computing Science

Faculty of Applied Sciences

© Pariya Raoufi 2013

SIMON FRASER UNIVERSITY

Spring 2013

All rights reserved.

However, in accordance with the Copyright Act of Canada, this work may be

reproduced without authorization under the conditions for “Fair Dealing.”

Therefore, limited reproduction of this work for the purposes of private study,

research, criticism, review and news reporting is likely to be in accordance

with the law, particularly if cited appropriately.

APPROVAL

Name: Pariya Raoufi

Degree: Master of Science

Title of Thesis: Real-time video streaming over WiMAX networks with H.264

coding

Examining Committee: Dr. Arrvindh Shriraman

Assistant Professor, Computing Science

Simon Fraser University

Chair

Dr. Joseph Peters, Professor, Computing Science


Senior Supervisor

Dr. Arthur Liestman, Professor, Computing Science


Senior Supervisor

Dr. Mohamed Hefeeda, Associate Professor, Computing Science


Examiner

Date Approved: January 15, 2013

Partial Copyright Licence

Abstract

Broadcasting multimedia content over a wireless channel from a base station to wireless

devices is a challenging compromise between the limited battery power of wireless devices

and the quality expected by users. A further complication is that the data rate of a wireless

channel changes over time. In this thesis, we develop a Power-Rate-Buffer-Distortion

optimization framework with the goal of minimizing the energy consumption of a wireless

receiver subject to quality and data rate constraints. Our framework provides a method for

selecting coding parameters, such as bit rate and intra refresh rate, in a way that minimizes

the energy consumption of the receiver. Our simulation results show that our framework

can reduce energy consumption by up to 50% compared to the conventional AVC video

streaming method, depending on the required video quality.


Acknowledgments

Foremost, I would like to express my sincere gratitude to my supervisor Dr. Joseph Peters

for the continuous support of my master's study and research, for his patience, motivation,

enthusiasm, and immense knowledge. His guidance helped me throughout the research

and writing of this thesis.

I would like to express my gratitude to Dr. Arthur Liestman, my co-supervisor, and

Dr. Mohamed Hefeeda, my thesis examiner, for being on my committee and reviewing this

thesis. I also would like to thank Dr. Arrvindh Shriraman for taking the time to chair my

thesis defense.

I would like to thank all my wonderful friends and Network Systems Lab members who

helped me during the difficult times, and offered me great support.

Last but certainly not least, I want to express my gratitude to my family for their

constant support and love.


Contents

Approval
Abstract
Acknowledgments
Contents
List of Tables
List of Figures

1 Introduction
1.1 Overview
1.2 Problem Statement and Thesis Contributions
1.3 Thesis Organization

2 Background and Related Work
2.1 Overview of AVC Video Coding
2.2 Overview of WiMAX and Modulation
2.3 Related Works

3 Energy Efficient Scheme for Variable Packet Size
3.1 Problem Definition
3.2 Proposed Solution
3.2.1 Overview of the Problem
3.2.2 Solution Method

4 Evaluation
4.1 Verification of the Decoder Energy Consumption Model
4.2 Simulation Setup
4.3 Results
4.3.1 Method Precision
4.3.2 Exhaustive Search Method
4.3.3 Experimental Evaluation
4.3.4 Quality-Aware Streaming Method

5 Energy Efficient Scheme for Fixed Packet Size
5.1 Problem Formulation
5.2 Solution
5.3 Results

6 Conclusions and Future Work
6.1 Conclusions
6.2 Future Work

Bibliography

List of Tables

3.1 List of Symbols
5.1 Average PSNR and total energy consumption over 10 seconds of the Foreman sequence, branch and cut and Lagrangian method, PSNR limit 35
5.2 Average PSNR and total energy consumption over 10 seconds of the Foreman sequence, branch and cut and Lagrangian method, PSNR limit 38

List of Figures

2.1 Encoding process.
2.2 Decoding process.
2.3 Relationships among scheduling windows, bursts and packet times.
4.1 Fitted planes for decoder energy consumption model.
4.2 Fitted plane for decoder energy consumption model.
4.3 Energy consumption versus α for Mother and daughter.
4.4 Energy consumption versus α for Foreman.
4.5 Energy consumption versus α for Football.
4.6 Energy consumption versus decoding bit rate for Mother and daughter.
4.7 Energy consumption versus decoding bit rate for Foreman.
4.8 Energy consumption versus decoding bit rate for Football.
4.9 Energy comparison between Exhaustive Search and Lagrangian method with the same parameters for Foreman with PSNR limit 35 (top) and 38 (bottom) dB.
4.10 Energy comparison between Exhaustive Search and Lagrangian method with the same parameters for Mother and Daughter with PSNR limit 35 (top) and 38 (bottom) dB.
4.11 Energy comparison between Exhaustive Search and Lagrangian method with the same parameters for Football with PSNR limit 35 (top) and 38 (bottom) dB.
4.12 Energy comparison for Exhaustive search and Lagrangian method for Foreman with PSNR limit 35 (top) and 38 (bottom) dB.
4.13 Energy comparison for Exhaustive search and Lagrangian method for Mother and Daughter with PSNR limit 35 (top) and 38 (bottom) dB.
4.14 Energy comparison for Exhaustive search and Lagrangian method for Football with PSNR limit 35 (top) and 38 (bottom) dB.
4.15 PSNR for Foreman video sequence with PSNR limit 35 (top) and 38 (bottom) dB.
4.16 PSNR for Mother and Daughter video sequence with PSNR limit 35 (top) and 38 (bottom) dB.
4.17 PSNR for Football video sequence with PSNR limit 35 (top) and 38 (bottom) dB.
4.18 Energy comparison between the energy estimated by Lagrangian method and experiment for Foreman with PSNR limit 35 (top) and 38 (bottom) dB.
4.19 Energy comparison between the energy estimated by Lagrangian method and experiment for Mother and Daughter with PSNR limit 35 (top) and 38 (bottom) dB.
4.20 Energy comparison between the energy estimated by Lagrangian method and experiment for Football with PSNR limit 35 (top) and 38 (bottom) dB.
5.1 Shifting process.
5.2 Average energy consumption, Foreman sequence, branch and cut (Convex) and Lagrangian method (non-Convex) problem, PSNR limit 35 dB.
5.3 Average energy consumption, Foreman sequence, branch and cut (Convex) and Lagrangian method (non-Convex) problem, PSNR limit 38 dB.
5.4 Average PSNR, Foreman sequence, branch and cut (Convex) and Lagrangian method (non-Convex) problem, PSNR limit 35 dB.
5.5 Average PSNR, Foreman sequence, branch and cut (Convex) and Lagrangian method (non-Convex) problem, PSNR limit 38 dB.

Chapter 1

Introduction

1.1 Overview

In this section, we describe challenges for sending multimedia over wireless networks with

limited power receivers. Then, we introduce Advanced Video Coding (AVC) and the

WiMAX protocol. Finally, we present the problem that we will solve in this thesis to

address the described challenges.

Multimedia over wireless networks has attracted a lot of attention during the past few

years due to increases in multimedia applications and usage, especially for cellphones. Tech-

nological progress has made it possible to watch sports online or have an online meeting via

portable wireless devices.

Transmitting multimedia over wireless networks has limitations. The most important

problems that should be addressed are the limited power and processing capabilities as

indicated by some studies [19, 15]. It is not desirable for users to receive media with high

quality if it results in fast energy usage in their wireless devices. There is always a trade-off

among rate (the amount of required data to encode a video), video quality or distortion,

and power consumption of a wireless device. If the rate is increased, the distortion

decreases, but more power is needed. It is possible that some increments in the

power and bit rate do not decrease the distortion significantly. In other words, it may

not be worthwhile to increase the receiving bit rate with a resulting increase in the power

consumption in order to increase the quality of video by a small amount.

One way to manage the power consumption while keeping the quality of a video at

an acceptable level is adjusting coding parameters. One of these coding parameters is the


coding bit rate. We can also adjust other parameters depending on the coding format

we use. Currently, H.264/MPEG-4 AVC is one of the most commonly used encodings.

Different parameters can be adjusted for this coding such as the intra refresh rate and the

quantization parameter (QP) [5].

There are also some techniques to decrease the energy consumption at the receiver side.

One of them is called time slicing [11]. In this technique, packets are sent in bursts at the

maximum rate of the channel which is much higher than the video coding rate. Then, the

receiver’s radio circuits are turned off for a period of time during which no energy is needed to

receive packets. Note that waking up the receiver’s radio circuits has an energy consumption overhead,

but this energy is much less than the energy needed for reception. We can save energy

compared to the situation of having the radio circuits on all the time during the transmission

by scheduling the burst periods properly.

The WiMAX (Worldwide Interoperability for Microwave Access) protocol is specified

by the IEEE 802.16 standard [7]. It is a standard for metropolitan area wireless networks.

This protocol supports different kinds of modulation such as QPSK, 16-QAM, and 8-PSK.

The supported packet size may be different for each kind of modulation. For example,

packet sizes can be in the range of 128 to 2048 bits in the QPSK modulation while the

8-PSK modulation only supports packets with 3072 bits [8]. WiMAX uses the time slicing

technique to send data in wireless environments and we can use its features to reduce the

receiver’s energy consumption. Since we want to benefit from the time-slicing technique

for reducing energy consumption, we need to manage the burst times. We divide the time

into equal-size intervals (scheduling windows) and the base station determines the schedule

for sending bursts of data to a receiver at the beginning of each interval. The scheduling

window is chosen to be small enough so that the channel bit rate can be assumed to be fixed

for that amount of time, but it may be different between different windows. Each scheduling

window includes one or more bursts. We propose an algorithm to determine the time and

duration of each burst in a scheduling window. The base station uses this algorithm at the

beginning of each scheduling window to schedule bursts in that window.

Sending packets in bursts is subject to several constraints. The first constraint is related

to the limited buffer capacity. We cannot exceed the buffer capacity at any time (buffer

overflow). A second buffer constraint is buffer underflow which happens when the buffer

becomes empty. This causes glitches or interruptions in the video playback at the receiver.

Users at the receiver side want to watch the video with a good level of quality. If the quality


goes under some threshold, it is not acceptable to the users. This adds a quality constraint

to our problem. Finally, the decoding bit rate should be less than the channel bit rate. In

other words, we cannot decode frames faster than receiving them. We will use AVC hybrid

video coding for our problem. Our goal is to choose the mode for each Macro Block (MB),

the intra refresh rate, and the coding bit rate parameters to achieve a balance among bit

rate, quality, and power consumption.

1.2 Problem Statement and Thesis Contributions

The goal of this thesis is to reduce energy consumption in wireless devices while keeping the

video quality at an acceptable level. In general, the problem is to find the best mode and

coding bit rate for each frame in order to minimize the energy consumption of the receiver

subject to quality, decoding rate, and buffer constraints.

We solve this problem for two cases:

• The base station can use modulation with variable packet sizes. We show that this

problem is a non-convex optimization problem and find an approximate solution for

it. The time needed to find this solution is small enough to be useful for real-time

media transmissions.

• The base station can use modulation with a fixed packet size. We show how to

formulate this problem and solve it in real time.

We propose algorithms to schedule bursts for each window in a way that satisfies the

buffer constraints. Our simulation results show that these algorithms save considerable

amounts of energy in wireless devices.

1.3 Thesis Organization

The rest of this thesis is organized as follows. Chapter 2 describes different approaches

to analyze and solve the problem of video streaming over wireless networks and provides

the background needed to understand the rest of the thesis. In Chapter 3, we explain and

formulate the problem of minimizing energy consumption at the receivers for modulations

with variable size packets. Then, we propose a solution for this problem. In Chapter 4,

we verify our energy model and describe the details and results of the simulations. In


Chapter 5, we formulate the problem for fixed packet size modulations, solve it, and describe

our simulation results. Chapter 6 presents conclusions and discusses future work.

Chapter 2

Background and Related Work

In this chapter, we describe Advanced Video Coding (AVC), video compression, and fea-

tures of the WiMAX protocol that we use in our problem formulation. Then, we describe

previous research on sending multimedia over wireless networks while considering either en-

ergy, quality, or channel bit rate. Finally, we discuss the novelty of our approach compared

to previous research.

2.1 Overview of AVC Video Coding

H.264/MPEG-4 Advanced Video Coding (AVC) is an industry standard for video coding

which performs lossy compression. In general, lossy compression means that the

decoded frame sequence is not identical to the original sequence, and we lose some frame

quality during the compression. In this coding, the compression is performed based on the

data in the current or one or more previous/future frames. There are two kinds of frame

compression, intra-coded frame compression and inter-coded frame compression. Intra-

coded frame compression (spatial compression) reduces spatial redundancy inside a frame

by using neighboring image samples in the same frame. On the other hand, inter-coded

frame compression (temporal compression) uses neighboring frames. Thus, there are two

coding modes for a frame, Intra-coded and Inter-coded. Frames that are coded in the intra-

coded mode are compressed spatially while the frames that are coded in the inter-coded

mode are first compressed temporally and then spatially. Intra-coded frames have better error resiliency, but they need a higher bit rate. Inter-coded frames have better compression, so they need a lower bit rate, but they can cause error propagation. We will refer to frames coded in inter-coded mode (P or B) as inter-coded frames and frames coded in intra-coded mode (I) as intra-coded frames.

Figure 2.1: Encoding process. (Block diagram with the labels: current frame, current MB, intra and inter prediction, motion estimation, previously encoded frames, prediction MB, residual MB, transform, quantize, entropy encoder, coded bitstream, reverse quantize, reverse transform, decoded residual MB.)

Figure 2.2: Decoding process. (Block diagram with the labels: coded bitstream, entropy decoder, reverse quantize, reverse transform, decoded residual MB, intra and inter prediction, motion estimation, previously decoded frames, prediction MB, decoded MB, decoded frame.)

In order to encode data, the H.264 video encoder performs prediction, transformation,

and encoding processes to produce a compressed bit stream. Each frame is divided into

blocks of pixels. A macro block is composed of two or more blocks. It corresponds

to 16 × 16 pixels in a frame and is the basic unit for motion compensation. The encoder

forms a prediction macro block for the current macro block. It uses either the current frame

with intra-coded prediction or other previously coded frames with inter-coded prediction. In

inter-coded prediction, motion estimation for an original macro block in the current frame

is determined by finding the best macro block match in a previously encoded frame called a

reference frame. The GOP format determines which frames are reference frames for which

other frames. A reference frame can be before or after the current frame in the frame

sequence. Then, the best matched macro block is subtracted from the original macro block

to form a residual macro block. The output of prediction is a residual macro block containing

the subtraction of the prediction from the original one and some model parameters showing

the inter-coded prediction (how the motion was compensated), including the offset between

the current block and the position of the candidate region (motion vector). Then, this

residual macro block is spatially compressed. For intra-coded frames, the encoder performs

only spatial compression.

The resulting macro block from the previous part is transformed using a Discrete Cosine

Transform (DCT). The output of this transform is a set of coefficients. Each coefficient is

quantized (divided by an integer value). Quantization is one of the steps that causes lossy

compression. A large quantization parameter causes high compression but poor quality,

whereas a small quantization parameter results in less compression but higher quality. After

quantization, the encoder uses entropy coding such as Arithmetic coding to further compress

the data.
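The quantization trade-off described above can be illustrated with a minimal sketch. It is not the H.264 quantizer: the coefficient values are made up, and a plain step size stands in for the quantization parameter. It only shows that a larger divisor leaves fewer nonzero levels (more compression) at the cost of a larger reconstruction error.

```python
def quantize(coeffs, step):
    """Divide each transform coefficient by the step size and round to an integer level."""
    return [round(c / step) for c in coeffs]


def rescale(levels, step):
    """Decoder-side inverse: multiply each level by the same step size."""
    return [q * step for q in levels]


def mean_squared_error(a, b):
    """Average squared difference between two equally long sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


# Made-up transform coefficients, for illustration only.
coeffs = [220.0, -41.0, 17.0, -6.0, 3.0, -1.0, 0.5, 0.0]

for step in (2, 8, 32):
    levels = quantize(coeffs, step)
    error = mean_squared_error(coeffs, rescale(levels, step))
    nonzero = sum(1 for q in levels if q != 0)
    print(f"step={step:2d}  nonzero levels={nonzero}  mse={error:.2f}")
```

Running the loop shows the error growing and the number of nonzero levels shrinking as the step size increases, which is the behaviour the paragraph attributes to a large quantization parameter.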

The decoder performs the inverse process to obtain the decoded sequence. First, the

decoder uses entropy decoding to decompress the data. Then, the quantized transformed

coefficients are rescaled. In other words, each coefficient is multiplied by the same integer value

to approximately restore its original magnitude. An inverse DCT is applied to obtain the residual macro block.

For inter-coded prediction, the decoder uses the motion vectors to locate the prediction macro block in previously decoded frames. Adding this prediction macro block to the decoded residual macro block yields the decoded macro block [25, 5]. Figures 2.1 and 2.2 summarize the encoding and decoding processes.

2.2 Overview of WiMAX and Modulation

The WiMAX protocol supports different kinds of modulation, so the base station and its

receivers that use the WiMAX protocol should support some of these kinds of modula-

tion. The base station and its receivers may use several different modulations during the

transmission of a video. The kind of modulation they use depends on the quality of the

transmission channel between the base station and its receiver. The base station monitors

the quality of the channel. A receiver often sends Channel Quality Indicator (CQI) feedback

messages and the base station decides the kind of modulation based on these messages. For

example, if the base station is far from the receiver, the CQI will show low channel quality,

so the base station switches to a more robust modulation scheme, e.g. QPSK, to reduce

transmission errors. The base station informs the receiving device about the change in the

modulation. Once the quality of the channel improves, the base station switches to another

modulation scheme. In the case of multi-cast, the base station decides on a common modu-

lation scheme for all receiving devices according to the worst channel quality between itself

and its receivers. Different kinds of modulation use different packet sizes. Some of them

support a wide range of packet sizes and some of them just a few [7, 14].

Another feature of the WiMAX protocol that we use is burst scheduling. The WiMAX

protocol defines a sleep mode in which a receiver turns off its wireless interface [22]. Energy

consumption can be reduced by choosing the sleep intervals properly. We define

fixed time intervals or scheduling windows in our problem and determine the burst times at

the beginning of each one.

Figure 2.3 shows the relationships among scheduling windows, bursts, and packet times

(i.e., the time needed to send a packet). As we can see in the figure, when fixed packet

sizes are used, the packet time is fixed for each window. But, since the bit rate is variable,

different fixed packet sizes may be used for different windows.


Figure 2.3: Relationships among scheduling windows, bursts and packet times. (Timeline with rows for windows, bursts, variable packet size, and fixed packet size; horizontal axis: time (s).)

2.3 Related Works

Lin et al. [17] develop a Power-Channel Error-Rate-Distortion (P-E-R-D) optimization framework for AVC video coding. They try to minimize distortion based on the energy consumption at the sender and channel rate constraints. He, Cai and Chen [9] develop a rate-distortion (R-D) model and estimate the distortion based on the coding bit rate and intra refresh rate. However, both works focus on distortion, whereas for battery-powered receivers energy is the more important factor to minimize.

Liang and Ahmad [16] try to find a parameter set for a parametric encoder that minimizes

both the power consumption and video distortion. They consider a fixed bit rate and search for a trade-off between the distortion and power consumption to ensure an applicable design. The fixed bit rate assumption does not hold for wireless transmission, where the channel bit rate changes over time.

Pu et al. [24] address two problems. In the first one, they try to find optimized encod-

ing parameters such as the quantization parameter in order to minimize the total power

consumption with a given distortion level. In the second one, they try to find optimized

encoding parameters in order to minimize the distortion with a given power consumption

level.

He et al. [10] develop a Power-Rate-Distortion (P-R-D) analysis framework. In this work,

they develop a video encoding structure and try to adjust video complexity parameters in

this structure in order to maximize the video quality and match it to the power supply.

The works most related to ours are [26] and [18]. Sharangi et al. [26] formulate and solve the problem of selecting sub streams in SVC video coding when there are multiple video streams to be broadcast in WiMAX networks. Their first aim is to maximize video quality, which is an NP-complete problem, and they propose a polynomial time approximation algorithm. Then, they consider the energy consumption as a second objective function.

Our work differs from [26] in that we consider an uncompressed video and adjust coding

parameters for H.264 coding for that video. Lu et al. [18] investigate minimizing the total

power consumption of a mobile transmitter subject to a fixed end-to-end source distortion in

H.263. They use intra refresh rate, source bit rate, channel code rate, and transmit energy

level as variables in order to minimize energy consumption. In contrast, we consider H.264

which is a more recent and commonly used coding and we exploit the burst property of the

WiMAX protocol to save more energy.

Chapter 3

Energy Efficient Scheme for

Variable Packet Size

In this chapter, we formulate our energy efficiency optimization problem for modulations which support variable packet sizes. Then, we simplify the problem and propose an approximate solution. We list the symbols used in this chapter in Table 3.1.

3.1 Problem Definition

In this section, we derive formulae for the energy consumption of the receiver of the wireless

device, and for the constraints of the optimization problem. We have an uncompressed video

to be coded at the base station. We split the time into scheduling windows of duration T .

T is chosen to be small enough so that the channel bit rate can be assumed to be fixed for

that amount of time.

We use $f$ to denote the video frame rate, which is fixed. The number of frames that should be sent during each window is $f \times T$. Since we will schedule windows independently of each other, we will consider the problem of sending a video with $f \times T$ frames during a time interval of length $T$. The base station will send this data in bursts and will determine the schedule for bursts at the beginning of each window. We use $s_i$, $t_i$, and $n$ to denote the start time and duration of the $i$th burst, and the number of bursts, respectively.

The video will be transmitted with some of its frames encoded as intra-coded frames and the others as inter-coded frames using AVC coding with coding bit rates equal to $R_I$ and $R_P$, respectively. The fractions of intra-coded and inter-coded frames in a window are $\alpha$ and $1 - \alpha$, respectively. We need to find the optimal values for $\alpha$, $R_I$, $R_P$, $n$, and $s_i$ and $t_i$ for $1 \leq i \leq n$ such that the energy consumption is minimized subject to some constraints that will be explained later in this section.

The average frame coding bit rate is α × RI + (1 − α) × RP . The base station sends

data when the transmitter is on with channel bit rate R. Therefore, the total amount of

data transmitted by the base station is equal to the channel bit rate R multiplied by the

total time that the transmitter is on: $R \times \sum_{i=1}^{n} t_i$. The amount of data that is decoded at the

receiver is equal to the decoding bit rate multiplied by the time that the decoder is working.

Since the decoding bit rate is the same as the coding bit rate and the decoder works all the

time, we have:

$$R_d = \alpha \times R_I + (1 - \alpha) \times R_P \tag{3.1}$$

and the amount of data is $R_d \times T = (\alpha \times R_I + (1 - \alpha) \times R_P) \times T$. We assume that the data

that a wireless device receives in a window is decoded in the same window to simplify the

problem. Therefore, we have:

$$R \times \sum_{i=1}^{n} t_i = R_d \times T = (\alpha \times R_I + (1 - \alpha) \times R_P) \times T. \tag{3.2}$$
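As a quick worked example of Eqs. (3.1) and (3.2), the sketch below computes the decoding bit rate and the total time the radio must be receiving in one window. The function names and all numeric values are illustrative assumptions, not values from the experiments in this thesis.

```python
def decoding_bit_rate(alpha, r_intra, r_inter):
    """Eq. (3.1): average decoding bit rate R_d for intra-coded fraction alpha."""
    return alpha * r_intra + (1 - alpha) * r_inter


def total_burst_time(alpha, r_intra, r_inter, channel_rate, window):
    """Eq. (3.2) solved for the sum of the burst durations t_i in one window."""
    return decoding_bit_rate(alpha, r_intra, r_inter) * window / channel_rate


# Illustrative numbers only: rates in bits per second, window length in seconds.
rd = decoding_bit_rate(alpha=0.25, r_intra=800_000, r_inter=400_000)
on_time = total_burst_time(0.25, 800_000, 400_000, channel_rate=5_000_000, window=1.0)
print(f"R_d = {rd:.0f} bit/s, total burst time = {on_time:.3f} s per window")
```

With these assumed numbers the receiver's radio needs to be on for only a tenth of the window, which is the slack that burst scheduling exploits.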

The two main components of energy consumption at the receiver are the energy used by the radio interface to wake up after a sleep interval and to receive data, and the energy used to decode the data. We denote the energy needed to wake up the receiver’s radio interface and to receive one bit by $E_w$ and $E_b$, respectively. The total number of received bits is equal to the rate of decoding multiplied by the total time that the decoder works, or $T \times R_d$. The total energy to receive the data that is transmitted in a window is $T \times R_d \times E_b$, and the total energy for waking up the radio interface is the number of bursts multiplied by the energy needed to wake up the radio interface, or $n \times E_w$. Therefore, the total energy consumed by the receiver’s radio interface is:

$$E_r = n \times E_w + T \times R_d \times E_b. \tag{3.3}$$
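The following sketch evaluates Eq. (3.3) for several burst counts. The constants `E_WAKE` and `E_BIT` are hypothetical device parameters chosen only for illustration. Because the reception term does not depend on $n$, each additional burst adds exactly one wake-up cost.

```python
def radio_energy(n_bursts, e_wake, window, rd, e_bit):
    """Eq. (3.3): E_r = n * E_w + T * R_d * E_b."""
    return n_bursts * e_wake + window * rd * e_bit


# Hypothetical device constants (assumed, not measured values).
E_WAKE = 0.002  # joules per wake-up of the radio interface
E_BIT = 1e-7    # joules per received bit

for n in (1, 4, 16):
    e_r = radio_energy(n, E_WAKE, window=1.0, rd=500_000, e_bit=E_BIT)
    print(f"n={n:2d}  E_r={e_r:.4f} J")
```

In this model fewer bursts always cost less radio energy, so the number of bursts is limited from below only by the constraints that follow, in particular the buffer capacity.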

Table 3.1: List of Symbols

Symbol   Description
R        Channel bit rate
R_I      Intra-coded frames decoding bit rate
R_P      Inter-coded frames decoding bit rate
R_d      Decoding bit rate
α        Portion of intra-coded frames in a window
E_b      Receiver energy consumption in receiving one bit
E_w      Energy consumption for waking up from sleep state
E_r      Receiving energy consumption
n        Number of bursts
T        Duration of a window
f        Frame rate of the video
p        Packet loss rate
F        Packet size
B        Buffer capacity
t_i      Duration of ith burst
s_i      Start time of ith burst
D        Distortion limit
D_c      Distortion caused by lossy coding
D_l      Distortion caused by lossy channel

The second component of energy consumption is the energy needed to decode the received data. Inter-coded frames are encoded using both temporal compression and spatial compression while intra-coded frames use only spatial compression. Therefore, decoding inter-coded frames consumes more energy than decoding intra-coded frames and the energy consumed to decode the data depends on the ratio of inter-coded frames to intra-coded frames. Note that the reverse of the quantization process is independent of the coding bit rate, but entropy decoding may depend on it. Therefore, we can conclude that the decoding energy also depends on the decoding rate. We propose the following model for decoding energy consumption:

$$E_d = c_1 \times \alpha \times (\alpha \times R_I + (1 - \alpha) \times R_P) + c_2 \times \alpha + c_3 \times (\alpha \times R_I + (1 - \alpha) \times R_P) + c_4. \tag{3.4}$$

In our proposed model the decoding energy consumption depends on the portion of

intra-coded frames and the coding bit rate. We will verify this model and determine its

fixed coefficients $c_1$, $c_2$, $c_3$ and $c_4$ in Chapter 4.

The total energy consumption ($E$) at the receiver is the sum of $E_r$ and $E_d$:

$$E = E_r + E_d \tag{3.5}$$
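The objective in Eq. (3.5) can be written as a single function of the decision variables. The sketch below combines Eqs. (3.1), (3.3) and (3.4). The coefficient tuple `COEFFS` and the device constants are placeholders made up for illustration; the fitted values of $c_1$ to $c_4$ come from the model verification later in the thesis.

```python
def decoding_energy(alpha, rd, coeffs):
    """Eq. (3.4): E_d = c1*alpha*R_d + c2*alpha + c3*R_d + c4."""
    c1, c2, c3, c4 = coeffs
    return c1 * alpha * rd + c2 * alpha + c3 * rd + c4


def total_energy(alpha, r_intra, r_inter, n_bursts, window, e_wake, e_bit, coeffs):
    """Eq. (3.5): E = E_r + E_d, with R_d from Eq. (3.1) and E_r from Eq. (3.3)."""
    rd = alpha * r_intra + (1 - alpha) * r_inter
    e_r = n_bursts * e_wake + window * rd * e_bit
    return e_r + decoding_energy(alpha, rd, coeffs)


# Placeholder coefficients and device constants (assumptions, not fitted values).
COEFFS = (-1e-7, -0.05, 4e-7, 0.1)
energy = total_energy(alpha=0.25, r_intra=800_000, r_inter=400_000, n_bursts=2,
                      window=1.0, e_wake=0.002, e_bit=1e-7, coeffs=COEFFS)
print(f"E = {energy:.4f} J per window")
```

Writing the objective this way makes it easy to evaluate candidate parameter sets, for example in an exhaustive search over a grid of $\alpha$, $R_I$ and $R_P$ values.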

The minimization of total energy consumption is subject to several constraints.

Rate Constraint: We cannot decode packets faster than they are received. Therefore, the average decoding bit rate cannot exceed the receiving bit rate, which is the channel bit rate $R$:

$$R_d = \alpha \times R_I + (1 - \alpha) \times R_P \leq R. \tag{3.6}$$

Buffer Constraints: There are two types of buffer constraints: buffer underflow and

buffer overflow. Buffer underflow occurs when the receiver’s buffer is empty before the next

burst arrives. Since the receiving bit rate, which is the channel bit rate, is larger than the

decoding bit rate according to Eq. (3.6), buffer underflow cannot occur during the reception

of packets. If the buffer becomes empty while the radio interface is in sleep mode, it will
still be empty at the beginning of the next burst because no packets are received during sleep
mode. Therefore, we need to check the buffer level at the beginning of each burst to make
sure that no buffer underflow has happened before that point in time. We have n buffer
underflow constraints, one for each burst:

\[ R \times \sum_{i=1}^{k-1} t_i - s_k \times (\alpha \times R_I + (1-\alpha) \times R_P) \ge 0 \quad \text{for } 1 \le k \le n. \tag{3.7} \]


Buffer overflow occurs when the difference between the total data received and the total

data consumed by the receiver exceeds the receiver’s buffer capacity which we denote B.

Buffer overflow cannot occur when the radio interface is in sleep mode. If overflow occurs

during reception of a burst, there will still be buffer overflow at the end of that burst by

Eq. (3.6), so it suffices to check the buffer overflow at the end of each burst. This gives n

buffer overflow constraints:

\[ R \times \sum_{i=1}^{k} t_i - (s_k + t_k) \times (\alpha \times R_I + (1-\alpha) \times R_P) \le B \quad \text{for } 1 \le k \le n. \tag{3.8} \]
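The two families of buffer constraints can be checked mechanically for any candidate burst schedule. The following is a minimal sketch, assuming the schedule is given as two lists of start times and durations; the function name, argument names, and the small tolerance `tol` (added only to absorb floating-point rounding) are illustrative and not part of the thesis.

```python
def buffer_constraints_hold(starts, durations, R, R_d, B, tol=1e-9):
    """Check constraints (3.7) and (3.8) for every burst of a schedule.

    starts[k], durations[k] are s_k and t_k; R is the channel bit rate,
    R_d the decoding bit rate, and B the buffer capacity (consistent units).
    """
    received = 0.0  # R * (sum of t_i over bursts before the current one)
    for s_k, t_k in zip(starts, durations):
        # (3.7): data received before burst k must cover what was decoded by s_k
        if received - s_k * R_d < -tol:
            return False
        received += R * t_k
        # (3.8): backlog at the end of burst k must fit in the buffer
        if received - (s_k + t_k) * R_d > B + tol:
            return False
    return True
```

For example, with R = 2000, R_d = 1000 and B = 512, two bursts of 0.512 starting at times 0 and 1.024 satisfy both constraints, while delaying the second burst to time 2.0 causes an underflow.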

Distortion Constraint: Distortion can be introduced by lossy compression in the

coding process or a lossy channel. We only consider distortion caused by the coding. Lin et

al. [17] proposed a model for coding distortion:

\[ \sigma^2 \times 2^{-2 \times \gamma \times r}, \tag{3.9} \]

where $r$, $\sigma^2$ and $\gamma$ denote the coding bit rate, coefficient variance of the frame after the
DCT transform, and a parameter used for model accuracy, respectively. The inter-coded
and intra-coded modes have different coding bit rates and coefficient variances. Using the
model in Eq. (3.9), intra-coded frames have distortion $\sigma_I^2 \times 2^{-2 \times \gamma_I \times R_I}$, where $\sigma_I^2$, $R_I$ and $\gamma_I$ are
the coefficient variance, coding bit rate and model accuracy parameter related to the intra-coded
frames, respectively. Similarly, the distortion for inter-coded frames has parameters
$\sigma_P^2$, $R_P$ and $\gamma_P$. Therefore, the average coding distortion is

\[ D_c = \alpha \times \sigma_I^2 \times 2^{-2 \times \gamma_I \times R_I} + (1-\alpha) \times \sigma_P^2 \times 2^{-2 \times \gamma_P \times R_P}. \tag{3.10} \]

Since we do not know the values of σI , γI , σP , γP before coding a window, we use the

values from previous windows as estimates for the current window. Finally,

Dc ≤ D, (3.11)

where D is the maximum distortion that is acceptable to the users of the multimedia trans-

missions.
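The distortion constraint in Eqs. (3.10) and (3.11) can be evaluated directly once estimates of the variances and accuracy parameters are available. The sketch below is illustrative only: the function and argument names are ours, and the numbers in the usage note are arbitrary values chosen for easy arithmetic, not measurements from the thesis.

```python
def coding_distortion(alpha, R_I, R_P, var_I, gamma_I, var_P, gamma_P):
    """Average coding distortion D_c of Eq. (3.10).

    var_I and var_P stand for sigma_I^2 and sigma_P^2.
    """
    d_intra = var_I * 2.0 ** (-2.0 * gamma_I * R_I)
    d_inter = var_P * 2.0 ** (-2.0 * gamma_P * R_P)
    return alpha * d_intra + (1.0 - alpha) * d_inter


def satisfies_distortion_limit(D_c, D):
    """Constraint (3.11): the coding distortion must not exceed the limit D."""
    return D_c <= D
```

With the arbitrary inputs alpha = 0.5, R_I = 3, R_P = 2, var_I = 64, var_P = 100 and both accuracy parameters equal to 0.5, the intra and inter terms are 8 and 25, so D_c = 16.5.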

In this section, we have derived formulae for total energy consumption, which is the

objective function of our minimization problem, and rate, distortion, and buffer constraints.

The variable parameters are n, α, RI , RP , and si and ti, 1 ≤ i ≤ n. Putting these together,


our optimization problem is the following:

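\[
\begin{aligned}
\text{Minimize} \quad & E = E_r + E_d \\
\text{subject to} \quad & R \times \sum_{i=1}^{n} t_i = R_d \times T \\
& R_d = \alpha \times R_I + (1-\alpha) \times R_P \le R \\
& 0 \le R \times \sum_{i=1}^{k-1} t_i - s_k \times R_d, \quad 1 \le k \le n \\
& R \times \sum_{i=1}^{k} t_i - (s_k + t_k) \times R_d \le B, \quad 1 \le k \le n \\
& D_c = \alpha \times \sigma_I^2 \times 2^{-2 \times \gamma_I \times R_I} + (1-\alpha) \times \sigma_P^2 \times 2^{-2 \times \gamma_P \times R_P} \le D \\
& 0 \le \alpha \le 1 \ \text{and} \ 0 \le R_I, R_P
\end{aligned}
\tag{3.12}
\]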

3.2 Proposed Solution

3.2.1 Overview of the Problem

In this section, we will simplify problem (3.12) by dividing it into smaller, easier to solve,

subproblems.

First, we use Eq. (3.2) to rewrite the two buffer constraints, Eq. (3.7) and Eq. (3.8), as

follows:

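\[ 0 \le R \times \left( \sum_{i=1}^{k-1} t_i - \frac{s_k}{T} \times \sum_{i=1}^{n} t_i \right) \quad \text{for } 1 \le k \le n \]

\[ R \times \left( \sum_{i=1}^{k} t_i - \frac{s_k + t_k}{T} \times \sum_{i=1}^{n} t_i \right) \le B \quad \text{for } 1 \le k \le n \]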

Now, we divide the whole problem into two sub-problems. The first sub-problem includes

n, and si and ti for 1 ≤ i ≤ n, as its variables, the parts of the objective function that contain


these variables, and the buffer constraints:

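\[
\begin{aligned}
\text{Minimize} \quad & n \times E_w \\
\text{subject to} \quad & 0 \le R \times \left( \sum_{i=1}^{k-1} t_i - \frac{s_k}{T} \times \sum_{i=1}^{n} t_i \right) \quad \text{for } 1 \le k \le n \\
& R \times \left( \sum_{i=1}^{k} t_i - \frac{s_k + t_k}{T} \times \sum_{i=1}^{n} t_i \right) \le B \quad \text{for } 1 \le k \le n
\end{aligned}
\tag{3.13}
\]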

The second sub-problem contains the remaining part of the problem (3.12) with α, RI

and RP as its variables, and rate and distortion constraints:

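\[
\begin{aligned}
\text{Minimize} \quad & (\alpha \times R_I + (1-\alpha) \times R_P) \times T \times E_b + E_d \\
\text{subject to} \quad & R_d = \alpha \times R_I + (1-\alpha) \times R_P \le R \\
& D_c = \alpha \times \sigma_I^2 \times 2^{-2 \times \gamma_I \times R_I} + (1-\alpha) \times \sigma_P^2 \times 2^{-2 \times \gamma_P \times R_P} \le D \\
& 0 \le \alpha \le 1 \ \text{and} \ 0 \le R_I, R_P
\end{aligned}
\tag{3.14}
\]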

These two subproblems are related to each other by the equality (3.2). The objective

function for subproblem (3.13) depends only on n. Since Ew > 0, we need to minimize n.

Assuming that the decoding rate is fixed, n only depends on the algorithm used to fill and

consume from the receiver’s buffer. We can show that a greedy algorithm minimizes n.

Lemma 3.2.1. Assuming that the decoding rate $(\alpha \times R_I + (1-\alpha) \times R_P)$ is fixed and
equal to A, then n is minimized by the greedy algorithm that fills and empties the receiver's
buffer completely for each burst, except possibly the last burst that might not fill the buffer
completely. Thus, $t_i = \frac{B}{R-A}$ for $1 \le i \le n-1$, and $t_n \le \frac{B}{R-A}$.

Proof. We prove this lemma by contradiction. Suppose we have a different algorithm that
gives $n' \le n-1$. The buffer filling rate is the same for both the greedy and the other
algorithm and equals $R-A$. The time needed to fill the buffer completely from the empty
state is the maximum time that the transmitter's interface can be on for a burst and is equal
to $\frac{B}{R-A}$. This is the maximum time that a burst can take. The total time the receiver's
interface should be on to receive all the data to decode is $\sum_{i=1}^{n} t_i = \frac{A \times T}{R}$ from Eq. (3.2). In
the greedy algorithm, we have $(n-1) \times \frac{B}{R-A} + C = \frac{A \times T}{R}$, where C is the receiving time of
the nth burst. Each burst can take at most $\frac{B}{R-A}$ time and $0 < C \le \frac{B}{R-A}$. Since $n' \le n-1$,
we have:

\[ n' \times \frac{B}{R-A} \le (n-1) \times \frac{B}{R-A} < (n-1) \times \frac{B}{R-A} + C = \frac{A \times T}{R}. \]

The total time that the receiver's interface is on to receive data for the second algorithm
is less than the time needed to receive all the data in the current window, and this is the
contradiction. Therefore, the greedy algorithm gives the minimum n and the time that the
receiver's interface should be on in each step is $t_i = \frac{B}{R-A}$ for $1 \le i \le n-1$ and $t_n \le \frac{B}{R-A}$.

Lemma 3.2.2. The greedy algorithm preserves the buffer constraints.

Proof. Since the greedy algorithm alternately fills and empties the buffer completely before

the last burst, the buffer constraints hold before the last burst (tn) happens. We do not

have buffer overflow for the last burst, otherwise we would fill the buffer completely with

the remaining bits going into the next burst and this burst could not be the last one. Since

buffer underflow does not happen during a burst, we need to check for it after the nth burst

in the last sleep interval.

The nth burst takes time $t_n = \sum_{i=1}^{n} t_i - \sum_{i=1}^{n-1} t_i = \frac{A \times T}{R} - (n-1) \times \frac{B}{R-A}$. The total
bits remaining in the buffer after the last burst to be consumed by the decoder during the
last sleep interval is:

\[ t_n \times (R-A) = \frac{A \times T \times (R-A)}{R} - (n-1) \times B. \]

Now, we compute the time remaining for the last sleep interval. Since the buffer is full
after each burst (B), except possibly for the last one, and the rate of consuming from the
buffer is A, the sleep interval between $t_i$ and $t_{i+1}$ (or the time between two consecutive
bursts) for $i < n$ is $\frac{B}{A}$. Therefore, the total time before the nth burst is equal to
$(n-1) \times (\frac{B}{A} + \frac{B}{R-A})$. The total time passed before the last sleep interval is the sum of the time passed
until the last burst and the last burst time ($t_n$):

\[ (n-1) \times \left( \frac{B}{R-A} + \frac{B}{A} \right) + \frac{A \times T}{R} - (n-1) \times \frac{B}{R-A} = \frac{A \times T}{R} + (n-1) \times \frac{B}{A}. \]

If we subtract the time passed before the last sleep interval from the total time T, we
obtain the duration of the last sleep interval:

\[ T - \left( \frac{A \times T}{R} + (n-1) \times \frac{B}{A} \right) = T \times \frac{R-A}{R} - (n-1) \times \frac{B}{A}. \]

The total bits that we will consume during this sleep interval at consumption rate A equals

\[ A \times \left( T \times \frac{R-A}{R} - (n-1) \times \frac{B}{A} \right) = \frac{A \times T \times (R-A)}{R} - (n-1) \times B, \]

and this is equal to the number
of bits in the buffer after the nth burst. Therefore, we do not have buffer underflow during
the last sleep interval. We can conclude that this greedy algorithm preserves the buffer
constraints.

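Lemma 3.2.3. The minimum value for n is

\[ \left\lceil \frac{T \times (\alpha \times R_I + (1-\alpha) \times R_P) \times (R - (\alpha \times R_I + (1-\alpha) \times R_P))}{B \times R} \right\rceil \tag{3.15} \]

Proof. From the previous two lemmas, we know that the greedy algorithm that fills and
empties the buffer minimizes n. Suppose that the decoding rate $\alpha \times R_I + (1-\alpha) \times R_P$ is
fixed and equal to A. In each step, we fill the buffer and the time it takes is $\frac{B}{R-A}$ except
for the last burst which can take less time. The total time that the receiver's interface is
on equals $\frac{A \times T}{R}$. So, we have:

\[ n = \left\lceil \frac{\frac{A \times T}{R}}{\frac{B}{R-A}} \right\rceil = \left\lceil \frac{A \times T \times (R-A)}{B \times R} \right\rceil = \left\lceil \frac{(\alpha \times R_I + (1-\alpha) \times R_P) \times T \times (R - (\alpha \times R_I + (1-\alpha) \times R_P))}{B \times R} \right\rceil. \]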
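The greedy schedule described by Lemmas 3.2.1 to 3.2.3 can be sketched in a few lines. This is an illustration under stated assumptions rather than the thesis implementation: the function name is ours, and start times are indexed from zero so that the first burst begins at time 0, which is what constraint (3.7) requires for k = 1.

```python
import math


def greedy_schedule(R, A, B, T):
    """Return (n, starts, durations) for the fill-and-drain greedy schedule.

    R: channel bit rate, A: fixed decoding rate (0 < A < R),
    B: buffer capacity, T: window duration (consistent units).
    """
    if not 0 < A < R:
        raise ValueError("the sketch assumes 0 < A < R")
    full = B / (R - A)      # time to fill an empty buffer during a burst
    period = full + B / A   # one full burst plus the sleep that drains it
    on_time = A * T / R     # total receiving time in the window, Eq. (3.2)
    n = math.ceil(A * T * (R - A) / (B * R))  # Lemma 3.2.3
    # n - 1 full bursts, then a last burst carrying the remaining data
    durations = [full] * (n - 1) + [on_time - (n - 1) * full]
    starts = [k * period for k in range(n)]   # assumption: first burst at t = 0
    return n, starts, durations
```

For R = 2000 Kbps, A = 1000 Kbps, B = 512 Kb and T = 10 s, this yields n = 10 bursts: nine of 0.512 s each and a final one of 0.392 s, for a total on-time of 5 s.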

If we replace n with the formula that we obtained in Lemma 3.2.3, and remove the two

buffer constraints, the problem formulation is as follows:

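\[
\begin{aligned}
\text{Minimize} \quad & E_w \times \left\lceil \frac{R_d \times T \times (R - R_d)}{B \times R} \right\rceil + R_d \times T \times E_b + E_d \\
\text{subject to} \quad & R_d = \alpha \times R_I + (1-\alpha) \times R_P \le R \\
& D_c = \alpha \times \sigma_I^2 \times 2^{-2 \times \gamma_I \times R_I} + (1-\alpha) \times \sigma_P^2 \times 2^{-2 \times \gamma_P \times R_P} \le D \\
& 0 \le \alpha \le 1 \ \text{and} \ 0 \le R_I, R_P
\end{aligned}
\tag{3.16}
\]

where $\alpha$, $R_I$ and $R_P$ are variables. After solving this problem and finding the optimal
values $\alpha^*$, $R_I^*$ and $R_P^*$, we can obtain $n^*$, $s_i^*$ and $t_i^*$ as follows, with
$A = \alpha^* \times R_I^* + (1-\alpha^*) \times R_P^*$:

\[
\begin{aligned}
n^* &= \left\lceil \frac{(\alpha^* \times R_I^* + (1-\alpha^*) \times R_P^*) \times T \times (R - (\alpha^* \times R_I^* + (1-\alpha^*) \times R_P^*))}{B \times R} \right\rceil \\
t_i^* &= \frac{B}{R-A} \\
s_i^* &= \left( \frac{B}{R-A} + \frac{B}{A} \right) \times i
\end{aligned}
\tag{3.17}
\]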


3.2.2 Solution Method

Problem (3.16) is a non-convex optimization problem. We can find a globally optimum

solution for this problem with methods for solving non-convex optimization problems, but

these methods are complicated and too time-consuming for a real-time application. Instead

we use methods for solving convex optimization problems. They may only find locally

optimum solutions, but they take considerably less time. Since we want to find a solution

for real-time multimedia transmission, time is a more important factor than the precise

solution. Therefore, we sacrifice a little precision to obtain a solution in a reasonable time.

This constrained minimization problem can be solved by a generalization of the La-

grangian multiplier method [2, 3]. In the following, we explain this method for the following

problem where x is an n× 1 vector of variables:

Minimize f(x)

subject to

gi(x) = 0 for 1 ≤ i ≤ m

hj(x) ≤ bj for 1 ≤ j ≤ l

(3.18)

The idea behind this method is to reduce the constrained optimization problem to an

unconstrained optimization problem. We add a weighted sum of the constraints to the

objective function to get a new objective function.

We introduce an l × 1 vector V of slack variables such that:

\[ h(x) + V^2 - b = 0 \tag{3.19} \]

that is

\[ h_j(x) + v_j^2 - b_j = 0 \quad \text{for } 1 \le j \le l \tag{3.20} \]

The Lagrangian function is as follows:

\[ L(x, \lambda_1, \lambda_2, V) = f(x) - \lambda_1^T \times (h(x) + V^2 - b) - \lambda_2^T \times g(x) \tag{3.21} \]

Here $\lambda_1$ and $\lambda_2$ are l × 1 and m × 1 vectors, respectively, containing Lagrange multipliers. In
addition, the complementary slackness condition $\lambda_1^T \times V^2 = 0$ should hold at the optimum.


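To find the minimum, we set the following partial derivatives to zero:

\[
\begin{aligned}
\frac{\partial L}{\partial x_i}(x^*, \lambda_1^*, \lambda_2^*) &= 0 \quad \text{for } 1 \le i \le n \\
\frac{\partial L}{\partial \lambda_{1j}}(x^*, \lambda_1^*, \lambda_2^*) &= 0 \quad \text{for } 1 \le j \le l \\
\frac{\partial L}{\partial \lambda_{2k}}(x^*, \lambda_1^*, \lambda_2^*) &= 0 \quad \text{for } 1 \le k \le m
\end{aligned}
\tag{3.22}
\]

where $x^*$ is the minimum solution and $\lambda_1^*$ and $\lambda_2^*$ are sets of associated Lagrange multipliers.
We form an equation system and solve it to find $x^*$.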
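As a small illustration of the slack-variable construction, consider a toy problem that is not taken from the thesis: minimize $f(x) = x^2$ subject to $-x \le -1$ (that is, $x \ge 1$). With $h(x) = -x$, $b = -1$, one multiplier $\lambda$ and one slack variable $v$, the same sign convention as in the Lagrangian above gives:

```latex
\begin{aligned}
L(x, \lambda, v) &= x^2 - \lambda \, (-x + v^2 + 1), \\
\frac{\partial L}{\partial x} &= 2x + \lambda = 0, \\
\frac{\partial L}{\partial \lambda} &= -(-x + v^2 + 1) = 0, \\
\frac{\partial L}{\partial v} &= -2 \lambda v = 0 .
\end{aligned}
```

The last equation requires $\lambda = 0$ or $v = 0$. If $\lambda = 0$, then $x = 0$ and the second condition would need $v^2 = -1$, which has no real solution. If $v = 0$, the constraint is active, so $x^* = 1$ and $\lambda^* = -2$. This case analysis is the same "either the constraint is tight or its multiplier vanishes" pattern that appears when the method is applied to a problem with several inequality constraints.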

We apply the Lagrangian multiplier method described above to our optimization prob-

lem. Introducing Lagrange multipliers λi and slack variables vi for 1 ≤ i ≤ 6, we solve the

following:

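\[ \nabla_{(R_I, R_P, \alpha, \lambda_1, \lambda_2, \lambda_3, \lambda_4, \lambda_5, \lambda_6)} L(R_I, R_P, \alpha, \lambda_1, \lambda_2, \lambda_3, \lambda_4, \lambda_5, \lambda_6, v_1, v_2, v_3, v_4, v_5, v_6) = 0 \tag{3.23} \]

with

\[
\begin{aligned}
L(R_I, R_P, \alpha, \lambda_1, \lambda_2, \lambda_3, \lambda_4, \lambda_5, \lambda_6) = {} & E - \lambda_1 \times (D - \alpha \times \sigma_I^2 \times 2^{-2 \times \gamma_I \times R_I} - (1-\alpha) \times \sigma_P^2 \times 2^{-2 \times \gamma_P \times R_P} - v_1^2) \\
& - \lambda_2 \times (R - (\alpha \times R_I + (1-\alpha) \times R_P) - v_2^2) \\
& - \lambda_3 \times (-R_I - v_3^2) - \lambda_4 \times (-R_P - v_4^2) \\
& - \lambda_5 \times (-\alpha - v_5^2) - \lambda_6 \times (1 - \alpha - v_6^2)
\end{aligned}
\tag{3.24}
\]

and

\[
\nabla_{(R_I, R_P, \alpha, \lambda_1, \lambda_2, \lambda_3, \lambda_4, \lambda_5, \lambda_6)} = \left( \frac{\partial L}{\partial R_I}, \frac{\partial L}{\partial R_P}, \frac{\partial L}{\partial \alpha}, \frac{\partial L}{\partial \lambda_1}, \frac{\partial L}{\partial \lambda_2}, \frac{\partial L}{\partial \lambda_3}, \frac{\partial L}{\partial \lambda_4}, \frac{\partial L}{\partial \lambda_5}, \frac{\partial L}{\partial \lambda_6}, \frac{\partial L}{\partial v_1}, \frac{\partial L}{\partial v_2}, \frac{\partial L}{\partial v_3}, \frac{\partial L}{\partial v_4}, \frac{\partial L}{\partial v_5}, \frac{\partial L}{\partial v_6} \right)
\tag{3.25}
\]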

In this method, we equate each element of the above vector to 0 and solve the resulting


equation system:

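\[
\begin{aligned}
& \frac{\alpha \times T \times E_w}{B} - \frac{2 \times E_w \times T \times \alpha}{B \times R} \times (\alpha \times R_I + (1-\alpha) \times R_P) + \alpha \times T \times E_b + c_1 \times \alpha^2 \\
& \quad + c_3 \times \alpha - 2 \times \lambda_1 \times \alpha \times \gamma_I \times \sigma_I^2 \times \log(2) \times 2^{-2 \times \gamma_I \times R_I} + \lambda_2 \times \alpha + \lambda_3 = 0 \\[4pt]
& \frac{(1-\alpha) \times E_w \times T}{B} - \frac{2 \times E_w \times T \times (1-\alpha)}{B \times R} \times (\alpha \times R_I + (1-\alpha) \times R_P) + (1-\alpha) \times T \times E_b \\
& \quad - 2 \times \lambda_1 \times (1-\alpha) \times \gamma_P \times \sigma_P^2 \times \log(2) \times 2^{-2 \times \gamma_P \times R_P} + c_1 \times \alpha \times (1-\alpha) \\
& \quad + c_3 \times (1-\alpha) + \lambda_2 \times (1-\alpha) + \lambda_4 = 0 \\[4pt]
& \frac{E_w \times T \times (R_I - R_P)}{B} - \frac{2 \times E_w \times T \times (R_I - R_P)}{B \times R} \times (\alpha \times R_I + (1-\alpha) \times R_P) + (R_I - R_P) \times T \times E_b \\
& \quad + c_1 \times (\alpha \times R_I + (1-\alpha) \times R_P) + c_1 \times \alpha \times (R_I - R_P) + c_2 + \lambda_5 + \lambda_6 \\
& \quad + c_3 \times (R_I - R_P) - \lambda_1 \times (\sigma_I^2 \times 2^{-2 \times \gamma_I \times R_I} - \sigma_P^2 \times 2^{-2 \times \gamma_P \times R_P}) - \lambda_2 \times (R_P - R_I) = 0 \\[4pt]
& D - \alpha \times \sigma_I^2 \times 2^{-2 \times \gamma_I \times R_I} - (1-\alpha) \times \sigma_P^2 \times 2^{-2 \times \gamma_P \times R_P} = 0 \ \text{or} \ \lambda_1 = 0 \\
& R - (\alpha \times R_I + (1-\alpha) \times R_P) = 0 \ \text{or} \ \lambda_2 = 0 \\
& R_I = 0 \ \text{or} \ \lambda_3 = 0 \\
& R_P = 0 \ \text{or} \ \lambda_4 = 0 \\
& \alpha = 0 \ \text{or} \ \lambda_5 = 0 \\
& \alpha = 1 \ \text{or} \ \lambda_6 = 0
\end{aligned}
\tag{3.26}
\]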

This system has many solutions; we choose the one with the minimum value of the objective
function E. Since this system of nonlinear equations cannot be solved in closed form, we use
the first quasi-Newton method (the Davidon-Fletcher-Powell method) to solve this problem
numerically [4]. It is an iterative method that starts from an initial point and produces a
sequence of points along which the gradient converges to 0. In each step, it finds the next point based on the
change in gradient between iterations. Since we want to find the solution quickly in order
to solve the general problem in real time, we terminate when the gradient is within an
accepted error interval from zero or after a fixed amount of time. In the simulation section,
we show that using this Lagrangian method to solve problem (3.16) gives a good output
in reasonable time.
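To make the iteration concrete, the following is a minimal pure-Python sketch of a Davidon-Fletcher-Powell update with a simple backtracking line search. It is shown on a generic smooth function, not on system (3.26); the function name, the tolerances, the iteration cap (standing in for the fixed time budget), and the choice of line search are all assumptions made for illustration.

```python
def dfp_minimize(f, grad, x0, tol=1e-8, max_iter=200):
    """Minimize f from x0 using the DFP inverse-Hessian update."""
    n = len(x0)
    x = list(x0)
    # H approximates the inverse Hessian; start from the identity
    H = [[float(i == j) for j in range(n)] for i in range(n)]
    g = grad(x)
    for _ in range(max_iter):
        if max(abs(gi) for gi in g) < tol:  # gradient close enough to zero
            break
        d = [-sum(H[i][j] * g[j] for j in range(n)) for i in range(n)]
        slope = sum(gi * di for gi, di in zip(g, d))
        # Backtracking (Armijo) line search along the direction d
        step, fx = 1.0, f(x)
        while (step > 1e-12 and
               f([xi + step * di for xi, di in zip(x, d)]) > fx + 1e-4 * step * slope):
            step *= 0.5
        x_new = [xi + step * di for xi, di in zip(x, d)]
        g_new = grad(x_new)
        s = [a - b for a, b in zip(x_new, x)]  # change in the point
        y = [a - b for a, b in zip(g_new, g)]  # change in the gradient
        sy = sum(si * yi for si, yi in zip(s, y))
        if sy > 1e-12:  # skip the update when curvature information is unreliable
            Hy = [sum(H[i][j] * y[j] for j in range(n)) for i in range(n)]
            yHy = sum(yi * hi for yi, hi in zip(y, Hy))
            H = [[H[i][j] + s[i] * s[j] / sy - Hy[i] * Hy[j] / yHy
                  for j in range(n)] for i in range(n)]
        x, g = x_new, g_new
    return x
```

On the convex test function f(x, y) = (x - 1)^2 + 2(y + 2)^2, starting from the origin, the iterates approach the minimizer (1, -2). For a non-convex problem such as (3.16), a method of this kind can only be expected to reach a local optimum that depends on the starting point.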

Chapter 4

Evaluation

In this chapter, we run an experiment to verify the energy model (3.4) and to determine

the coefficients. Then, we set up a simulation to evaluate the effectiveness of our solution

in saving energy and present the results.

4.1 Verification of the Decoder Energy Consumption Model

Figure 4.1: Fitted planes for decoder energy consumption model. [Plot omitted: Energy (J) versus α and Bit Rate (Kbps) for the Football, Foreman, and Mother and daughter sequences.]

We used Joulemeter software to verify the decoder energy consumption model (3.4) and

to determine the values of the coefficients. Joulemeter is software that estimates the power

consumption of the computer and applications running on it [12]. We measured the energy


consumption of an H.264 decoder (JM decoder) [27] running on a SONY VAIO with an

Intel(R) Core(TM)2 Duo T5800 processor. First, the video sequences are encoded with

the x264 encoder [30]. The x264 encoder is a free software library and application for

encoding video streams into the H.264/MPEG-4 AVC format, and is released under the

terms of the GNU GPL. We encoded the Foreman, Mother and daughter, and Football video

sequences [28] to consider different kinds of video sequences based on the contents. For

example, the Mother and daughter video sequence is a slow moving video sequence. On

the other hand, the Football video sequence is a fast moving video sequence. We encode

these sequences with several values for α and the decoding bit rate $R_d$. For α, we used
values of $\frac{1}{i}$ for 2 ≤ i ≤ 10 (i.e., α = 1/2, 1/3, 1/4, ..., 1/10). The values for $R_d$ were 1000, 1100,
1200, 1300, 1400, and 1500 Kbps. Then, we decoded each resulting sequence according
to the parameters used for its encoding while measuring the energy consumption $E_d$ using
Joulemeter [12] and the time $t_d$ taken to decode it. The energy consumption is $\frac{E_d}{t_d}$ per
unit of time. We obtained a set of three-dimensional points for each video sequence. These

dimensions are intra refresh rate (α), coding bit rate and corresponding decoding energy

consumption. We used a least squares algorithm [13] to fit these points to a curve with the

following model where β = 1− α:

c1 × α× (α×RI + β ×RP ) + c2 × α+ c3 × (α×RI + β ×RP ) + c4 (4.1)

Figure 4.2: Fitted plane for decoder energy consumption model. [Plot omitted: Energy (J) versus α and Bit Rate (Kbps).]

Figure 4.3: Energy consumption versus α for Mother and daughter. [Plot omitted: Energy (J) versus α.]

Figure 4.1 shows this fitting for the three video sequences. The Mother and daughter

video sequence is a slow motion video and it needs less energy for decoding compared to

the Foreman and Football video sequences. On the other hand, the Football video sequence

is a fast motion video and needs more energy for decoding compared to the other two video

sequences. The Foreman video sequence is a medium motion video. The surface fitted for

the Foreman sequence lies between the surfaces for the Mother and daughter and Football

video sequences. Therefore, we used this sequence to estimate decoding coefficients, c1,

c2, c3, and c4. Figure 4.2 shows this fitting for the Foreman video sequence. The model

coefficients obtained from the fitted curve are: c1 = −0.00146, c2 = −0.677, c3 = 0.00098,

and c4 = 0.714 and the error in the fitting was less than 1%.
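The fitted model can be evaluated directly with the coefficients reported above. The sketch below is illustrative: the function name is ours, and it assumes the bit rate is expressed in Kbps, as in the measurement grid, with the result in the energy units of the fit.

```python
# Coefficients reported for the Foreman sequence (c1, c2, c3, c4)
FOREMAN_COEFFS = (-0.00146, -0.677, 0.00098, 0.714)


def decoding_energy(alpha, R_d, coeffs=FOREMAN_COEFFS):
    """Evaluate model (4.1) with R_d = alpha * R_I + (1 - alpha) * R_P.

    Assumption: R_d is given in Kbps, matching the fitting experiment.
    """
    c1, c2, c3, c4 = coeffs
    return c1 * alpha * R_d + c2 * alpha + c3 * R_d + c4
```

For α = 0.1 and a decoding bit rate of 1000 Kbps, the model evaluates to about 1.48. Within the measured range (α between 0.1 and 0.5, bit rates between 1000 and 1500 Kbps), the value decreases as α grows and increases with the bit rate.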

In Figures 4.3, 4.4, and 4.5, we plot the decoding energy consumption versus α without
considering the decoding bit rate for the Mother and daughter, Foreman, and Football video
sequences, respectively. As can be seen from the figures, the energy consumption decreases
as α increases. Higher values of α mean more intra-coded frames, which need less
processing to decompress, so less energy is needed for decoding.
Figures 4.6, 4.7, and 4.8 show the decoding energy consumption versus the decoding bit rate
without considering α for the Mother and daughter, Foreman, and Football video sequences,
respectively. Since we need to decode more bits with higher coding bit rates, the energy
consumption for decoding gets higher. Therefore, both of these parameters affect the decoding
energy consumption and our model captures their effects. We use this model with the
coefficients obtained from the Foreman video sequence as the decoding energy consumption
model in the next section.

Figure 4.4: Energy consumption versus α for Foreman. [Plot omitted: Energy (J) versus α.]

Figure 4.5: Energy consumption versus α for Football. [Plot omitted: Energy (J) versus α.]

Figure 4.6: Energy consumption versus decoding bit rate for Mother and daughter. [Plot omitted: Energy (J) versus Bit Rate (Kbps).]

Figure 4.7: Energy consumption versus decoding bit rate for Foreman. [Plot omitted: Energy (J) versus Bit Rate (Kbps).]

Figure 4.8: Energy consumption versus decoding bit rate for Football. [Plot omitted: Energy (J) versus Bit Rate (Kbps).]

4.2 Simulation Setup

We evaluated our algorithm by using the Foreman, Mother and daughter, and Football video

sequences in the raw format of YUV from [28]. Each of these videos has 300 frames and

a frame rate of 30 frames per second. We set the window duration to 1 second (T = 1).

Therefore, we have 10 windows for each sequence. In the WiMAX network, receivers usually

get a download bandwidth between 1 and 5 Mbps depending on different parameters such

as the number of receivers in that network and the distance of the receiver from the base

station. We assumed that the channel bit rate changes randomly between 800 Kbps and

2.3 Mbps. We used the x264 encoder to code the raw sequences into MPEG-4 AVC format

at the base station and decode them with the JM decoder at the receiver. The buffer size

of the receivers is set to 512 Kb. According to [31], the power consumption of a WiMAX

wireless interface during sleep and receiving modes is 50 and 1000 mW, respectively. The

energy needed to switch between the two modes is 1000 µJ.

Figure 4.9: Energy comparison between Exhaustive Search and Lagrangian method with the same parameters for Foreman with PSNR limit 35 (top) and 38 (bottom) dB. [Plots omitted: Energy (mJ) versus Time (s) for the Lagrangian Method and Exhaustive Search.]

In the AVC coding specification, the GOP format is one intra-coded (I) frame followed
by zero or more inter-coded (P) frames [25] (B frames are not considered in our GOP).
Since each window consists of at least one GOP and each GOP has at least one intra-coded
frame, the intra refresh rate ($\frac{1}{\alpha}$) is an integer between 1 and f × T. In our problem
formulation, α can take any value between 0 and 1. For our experiments, we solved the
optimization problem (3.16) to get α and then rounded α to the closest $\frac{1}{i}$, where i
is an integer and 1 ≤ i ≤ f × T. We used the value $\frac{1}{i}$ as an approximation of α in our
experiments.
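The rounding step described above can be sketched as a direct search over the admissible values of i. The function name is ours, and the sketch assumes that f × T is a whole number of frames.

```python
def round_alpha(alpha, f, T):
    """Return (i, 1/i) with 1 <= i <= f*T such that 1/i is closest to alpha."""
    max_i = int(f * T)  # assumption: f * T is a whole number of frames
    best_i = min(range(1, max_i + 1), key=lambda i: abs(alpha - 1.0 / i))
    return best_i, 1.0 / best_i
```

With f = 30 and T = 1, a continuous solution α = 0.3 is rounded to 1/3 (i = 3), since 1/3 is closer to 0.3 than 1/4 is.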

4.3 Results

Figure 4.10: Energy comparison between Exhaustive Search and Lagrangian method with the same parameters for Mother and Daughter with PSNR limit 35 (top) and 38 (bottom) dB. [Plots omitted: Energy (mJ) versus Time (s) for Exhaustive Search and the Lagrangian Method.]

In this section, we consider different scenarios and evaluate our algorithm. We ran the
experiments with two different values for the distortion limit: D = 20.56 and D = 10.023,
which are equivalent to PSNR thresholds of 35 and 38 dB, respectively.
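The correspondence between the distortion limits and the PSNR thresholds can be reproduced with the standard PSNR definition. The sketch below assumes that D is a mean squared error on 8-bit samples (peak value 255); this section does not state that explicitly, so treat it as an assumption.

```python
import math


def psnr_from_mse(mse, peak=255.0):
    """PSNR in dB for a given mean squared error (assumes 8-bit samples)."""
    return 10.0 * math.log10(peak * peak / mse)


def mse_from_psnr(psnr_db, peak=255.0):
    """Inverse mapping: the mean squared error that yields the given PSNR."""
    return peak * peak / (10.0 ** (psnr_db / 10.0))
```

Under this assumption, D = 20.56 maps to about 35.0 dB and D = 10.023 maps to about 38.1 dB, consistent with the two thresholds quoted above.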

4.3.1 Method Precision

In this scenario, we show that the Lagrangian method gives an acceptable answer to
problem (3.16). We use a near-optimal exhaustive search and compare its results to the results
of the Lagrangian method. Since $R_I$ and $R_P$ are continuous, it is not possible to find
the exact solution by exhaustive search. We varied α from 0 to 1 with a step size of 0.01, and $R_I$ and $R_P$ from
100 Kbps to 5 Mbps with a step size of 1 Kbps to find the best parameter values in the
exhaustive search method.
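A grid search of this kind can be sketched as follows. This is an illustration of the idea rather than the thesis code: the function names, the argument layout, and any numeric inputs a caller supplies (grid, per-bit energy, variances, accuracy parameters) are hypothetical. The objective and constraints follow problem (3.16).

```python
import math
from itertools import product


def total_energy(alpha, R_I, R_P, R, T, B, E_w, E_b, c):
    """Objective of problem (3.16) for one window; c = (c1, c2, c3, c4)."""
    R_d = alpha * R_I + (1.0 - alpha) * R_P
    n_bursts = math.ceil(R_d * T * (R - R_d) / (B * R))
    E_d = c[0] * alpha * R_d + c[1] * alpha + c[2] * R_d + c[3]
    return E_w * n_bursts + R_d * T * E_b + E_d


def coding_distortion(alpha, R_I, R_P, var_I, gamma_I, var_P, gamma_P):
    """Average coding distortion D_c of Eq. (3.10)."""
    return (alpha * var_I * 2.0 ** (-2.0 * gamma_I * R_I)
            + (1.0 - alpha) * var_P * 2.0 ** (-2.0 * gamma_P * R_P))


def exhaustive_search(alphas, rates, R, T, B, E_w, E_b, c, D, dist):
    """Return (E, alpha, R_I, R_P) minimizing E over the feasible grid points.

    dist = (var_I, gamma_I, var_P, gamma_P); returns None if nothing is feasible.
    """
    best = None
    for alpha, R_I, R_P in product(alphas, rates, rates):
        R_d = alpha * R_I + (1.0 - alpha) * R_P
        if R_d > R:  # rate constraint
            continue
        if coding_distortion(alpha, R_I, R_P, *dist) > D:  # distortion constraint
            continue
        E = total_energy(alpha, R_I, R_P, R, T, B, E_w, E_b, c)
        if best is None or E < best[0]:
            best = (E, alpha, R_I, R_P)
    return best
```

The number of grid points grows with the square of the number of candidate bit rates, so the full grid described above (about a hundred values of α and roughly 4900 values for each bit rate) implies billions of evaluations per window, which is why such a search serves as a reference rather than a real-time method.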


Figure 4.11: Energy comparison between Exhaustive Search and Lagrangian method with the same parameters for Football with PSNR limit 35 (top) and 38 (bottom) dB. [Plots omitted: Energy (mJ) versus Time (s) for Exhaustive Search and the Lagrangian Method.]

Theorem 4.3.1. Let E∗ be the minimum energy value found by the exhaustive search and

let α∗ and R∗d be the corresponding values of α and decoding bit rate. The maximum possible

unsigned difference ∆E∗ between E∗ and the energy value at a neighboring point to (α∗, R∗d)

in the exhaustive search is obtained by setting Rd to Rmax (the maximum channel bit rate)

and α = 0.

Proof. Consider the optimization problem (3.12) with the distortion constraint removed.

This removes the dependence between the decoding bit rate and α, so we can just consider

the decoding bit rate $R_d$ instead of the bit rates for each kind of frame.

Figure 4.12: Energy comparison for Exhaustive search and Lagrangian method for Foreman with PSNR limit 35 (top) and 38 (bottom) dB. [Plots omitted: Energy (mJ) versus Time (s).]

This simplifies the
expression for total energy consumption (see (3.3), (3.4), (3.5), and (3.15)) to

\[ E' = \frac{T \times E_w}{B \times R} \times R_d \times (R - R_d) + T \times E_b \times R_d + c_1 \times \alpha \times R_d + c_2 \times \alpha + c_3 \times R_d + c_4. \tag{4.2} \]

Suppose that $E'^*$ is the minimum energy value found by the exhaustive search and let $\alpha^*$
and $R_d^*$ be the corresponding values of α and decoding bit rate. To find the maximum
distance between the point $(\alpha^*, R_d^*)$ and a neighboring point, we differentiate the expression
for $E'$ in Eq. (4.2) with respect to α and $R_d$ as follows:

\[ \frac{\partial E'}{\partial \alpha} = c_1 \times R_d + c_2, \tag{4.3} \]


\[ \frac{\partial E'}{\partial R_d} = \frac{T \times E_w}{B} - \frac{2 \times T \times E_w}{B \times R} \times R_d + T \times E_b + c_1 \times \alpha + c_3. \tag{4.4} \]

The maximum unsigned values of $\frac{\partial E'}{\partial \alpha}$ and $\frac{\partial E'}{\partial R_d}$ are obtained when α = 0 and $R_d = R_{max}$.
From this, we can calculate the maximum changes in the energy due to α and $R_d$, $\Delta E'_\alpha$ and
$\Delta E'_{R_d}$ respectively, using the exhaustive search step sizes and then the maximum change in
$E'$ is

\[ \Delta E' = \sqrt{\Delta E'^2_\alpha + \Delta E'^2_{R_d}} \tag{4.5} \]

Figure 4.13: Energy comparison for Exhaustive search and Lagrangian method for Mother and Daughter with PSNR limit 35 (top) and 38 (bottom) dB. [Plots omitted: Energy (mJ) versus Time (s).]

Using the values and step sizes for α, $R_I$, and $R_P$ specified at the beginning of this
sub-section gives $R_{max}$ = 2.5 Mbps and $\Delta E^*$ = 21 mJ. Figures 4.9, 4.10 and 4.11 show the
estimated energy using the Lagrangian method and exhaustive search to solve Problem (3.16)
for the Foreman, Mother and daughter and Football sequences. In each figure, the top and
bottom plots are obtained by setting the PSNR thresholds to 35 and 38 dB, respectively. In
each plot, the X axis corresponds to the window with T = 1 second and the Y axis shows the
estimated energy consumption for each window for the two methods using the same video
parameters ($\sigma_I$, $\gamma_I$, $\sigma_P$ and $\gamma_P$). The maximum possible error percentage for exhaustive
search is defined as $\frac{\Delta E^*}{E^*} \times 100$. These errors for the Foreman, Mother and daughter and
Football sequences are 10%, 4% and 2% for a PSNR threshold of 35 dB, and 7.5%, 2.5% and 2.2%
for a threshold of 38 dB. The maximum differences between the results from the Lagrangian method
and exhaustive search in 10 seconds of the Foreman, Mother and daughter and Football
sequences are 26, 8 and 31 mJ for 35 dB, and 21, 9 and 29 mJ for 38 dB.
These differences correspond to maximum errors from the optimal solution of 4.7%, 3.8%
and 2.9% for 35 dB, and 2.9%, 3.2% and 2.7% for 38 dB, for Foreman, Mother and daughter and Football using
the Lagrangian method. The average errors for 10 seconds
of the Foreman, Mother and daughter and Football sequences using the Lagrangian method
are 3.1%, 2.6% and 1.5% for 35 dB, and 2.0%, 1.8% and 1.9% for 38 dB. These
small error percentages show that the Lagrangian method gives a near-optimal solution and
is a good heuristic for solving the non-convex problem (3.16).

4.3.2 Exhaustive Search Method

In this scenario, we compare the resulting total energy consumption and average distortion

from the Exhaustive search method with our Lagrangian method for 10 second videos.

Note that the coding parameters are different for each window for these two methods in this

scenario. Figures 4.12, 4.13 and 4.14 show the total energy consumption for the Foreman,

Mother and daughter and Football video sequences. The X axis is the window and the Y

axis is the total energy consumed until the end of the specified window. The total amounts

of energy consumed over 10 seconds using the Exhaustive search and Lagrangian methods

are 5.42 and 5.27, and 8.04 and 7.88 J for the Foreman sequence, 2.14 and 2.04, and 2.75

and 2.67 J for the Mother and daughter sequence and 8.44 and 9.22, and 9.45 and 9.49 J for

the Football sequence for PSNR thresholds of 35 and 38 dB, respectively.

Figures 4.15, 4.16 and 4.17 show the average PSNR over 30 frames in each window for

the video sequences when the coding parameters obtained from the Exhaustive search and


Lagrangian methods are used to encode the sequences. The average PSNR over 10 seconds

for the three video sequences using the Lagrangian method and Exhaustive search are 35.27

and 34.7 dB, 36 and 35.6, and 34.7 and 35.2 dB for a PSNR threshold of 35 dB, and 37.4

and 37.3, 38.3 and 38.1, and 37.7 and 37.6 dB for PSNR threshold of 38 dB. The maximum

deviations from the PSNR threshold are small: 4%, 5% and 3%, and 5%, 1% and 3% for

the Foreman, Mother and daughter and Football sequences, respectively.

Note that while the Lagrangian method may not find the minimum solution in every

window, it sometimes finds a sequence of coding parameters that results in slightly less total

energy consumption than Exhaustive search with the penalty of slightly more distortion.

The reason is that the encoding parameters are based on the parameters from previously en-

coded frames and these parameters are not the same for each window for different methods.

Since the average distortion and energy consumption for the Lagrangian method is nearly

the same as Exhaustive search, and the Lagrangian method is much faster, we conclude that

the Lagrangian method is a good approximation for solving Problem (3.16). Note that we

used the distortion model for the Foreman sequence for the Mother and daughter sequence

and this resulted in distortions that are slightly higher than the threshold. A more finely

tuned distortion model would reduce these differences.

4.3.3 Experimental Evaluation

We will use the parameter values from our Lagrangian method in experiments to check

the accuracy of our method. We encoded the video sequences using the parameter values,

transmitted them in bursts, and measured the energy used by the receiver to receive and

decode the video sequences. Figures 4.18, 4.19 and 4.20 compare the total energy consump-

tions predicted by the Lagrangian method and the total measured energy consumption for

the Foreman, Mother and daughter and Football sequences. The Y axis is the total energy

consumed until the end of each window.

The energy consumptions estimated by the Lagrangian method are close to the measured

energy consumptions for the Foreman and Football video sequences, but the measured values

for Mother and daughter are smaller. The reason is that we used the coefficients for the

Foreman sequence to encode all sequences. However, the differences between the threshold

and the average distortion of the Mother and daughter sequence are small, so the decoding

energy consumption model produces the desired outcome.


4.3.4 Quality-Aware Streaming Method

Lu et al. [18] and Lin et al. [17] compare their methods to a scenario with fixed α and Rd and

they consider the energy consumption at the sender. Here, we consider a method for stream-

ing videos that encodes the video sequences with the specified quality (PSNR threshold)

and sends them over the channel without any sleep time. This model is more accurate in

terms of video quality. We call this method the Quality-Aware No-Burst method. In order to

encode the video with the given threshold in this method, we used the relation between the
quantization parameter (QP) and distortion which was proposed in [20]: PSNR = l × QP + b.
We used this relation for the Foreman sequence and obtained l = −0.608 and b = 55.78 to
encode the videos. Now we examine the Quality-Aware No-Burst method. We determined
QP as described above for PSNR thresholds of 35 and 38 dB, encoded the video sequences,

and transmitted them. The wireless interface of the receiver was on for the entire duration

of each video. We measured the energy consumption for the receiver to receive and decode

each video. These total energies were 7.35, 3.28 and 10.75, and 9.85, 4.73 and 12.31 J for the

Foreman, Mother and daughter and Football sequences with PSNR thresholds of 35 and 38

dB, respectively. Compared to these results, our Lagrangian method saved 51% and 51% for

the Mother and daughter sequence, 28% and 20% for the Foreman sequence, and 14% and

22% for the Football sequence with PSNR thresholds of 35 and 38 dB. We conclude that

our method is very successful in saving energy while satisfying the distortion constraint.
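The linear relation PSNR = l × QP + b used for the Quality-Aware No-Burst method can be inverted to obtain the quantization parameter for a target PSNR. The sketch below uses the coefficients reported above for the Foreman sequence; the function name is ours, and how the resulting value is rounded to an integer QP for the encoder is not stated in the text, so the function returns the unrounded value.

```python
def qp_for_psnr(psnr_db, slope=-0.608, intercept=55.78):
    """Invert PSNR = slope * QP + intercept for the quantization parameter.

    Defaults are the values l = -0.608 and b = 55.78 reported for Foreman.
    The result is not rounded; the rounding rule is left to the caller.
    """
    return (psnr_db - intercept) / slope
```

For the two thresholds used in the experiments, this gives a QP of about 34.2 for 35 dB and about 29.2 for 38 dB.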


Figure 4.14: Energy comparison for Exhaustive search and Lagrangian method for Football with PSNR limit 35 (top) and 38 (bottom) dB. (Plots not reproduced. Axes: Energy (mJ) versus Time (s), 1 to 10 s. Curves: Lagrangian Method, Exhaustive Search.)


Figure 4.15: PSNR for Foreman video sequence with PSNR limit 35 (top) and 38 (bottom) dB. (Plots not reproduced. Axes: PSNR (dB) versus Time (s), 1 to 10 s.)


Figure 4.16: PSNR for Mother and Daughter video sequence with PSNR limit 35 (top) and 38 (bottom) dB. (Plots not reproduced. Axes: PSNR (dB) versus Time (s), 1 to 10 s.)


Figure 4.17: PSNR for Football video sequence with PSNR limit 35 (top) and 38 (bottom) dB. (Plots not reproduced. Axes: PSNR (dB) versus Time (s), 1 to 10 s.)


Figure 4.18: Energy comparison between the energy estimated by Lagrangian method and experiment for Foreman with PSNR limit 35 (top) and 38 (bottom) dB. (Plots not reproduced. Axes: Energy (mJ) versus Time (s), 1 to 10 s. Curves: Lagrangian Method, Experiment.)


Figure 4.19: Energy comparison between the energy estimated by Lagrangian method and experiment for Mother and Daughter with PSNR limit 35 (top) and 38 (bottom) dB. (Plots not reproduced. Axes: Energy (mJ) versus Time (s), 1 to 10 s.)


Figure 4.20: Energy comparison between the energy estimated by Lagrangian method and experiment for Football with PSNR limit 35 (top) and 38 (bottom) dB. (Plots not reproduced. Axes: Energy (mJ) versus Time (s), 1 to 10 s.)

Chapter 5

Energy Efficient Scheme for Fixed Packet Size

In this chapter, we consider the case in which the packet size is fixed and equal to F. The problem can be formulated as before, because a fixed packet size is a special case of a variable packet size. However, by considering some physical aspects of using a fixed packet size, we derive a different formulation that allows us to solve the problem more efficiently.

5.1 Problem Formulation

When the packet size is fixed, a packet is queued when it is received in the buffer and is dequeued as a whole. Therefore, the contents of the buffer change by $F$ bits at a time and the number of bits in the buffer is a multiple of $F$ at any point in time. Thus, we consider packet rates instead of bit rates. The bit rates for removing data from and adding data to the buffer are $R_d$ and $R$, respectively. The corresponding packet rates are $\frac{R_d}{F}$ and $\frac{R}{F}$ frames per second. In practice, $\frac{R_d}{F} > 1$ and $\frac{R}{F} > 1$, so the ratios $\frac{F}{R_d}$ and $\frac{F}{R}$ satisfy $0 < \frac{F}{R_d} < 1$ and $0 < \frac{F}{R} < 1$ seconds per frame. We rescale the problem to eliminate fractional values as follows. We choose $m$ to be the number of digits of accuracy and then rescale the window to have length $T' = T \times 10^m$. Then, we set $f_1 = \lfloor \frac{F}{R} \times 10^m \rfloor$, $f_2 = \lfloor \frac{F}{R_d} \times 10^m \rfloor$, and $t' = \gcd(f_1, f_2)$, and we divide $T'$ into sub-intervals of duration $t'$.
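The rescaling step can be sketched as follows. The numeric values (packet size, rates, window length, and m) are hypothetical and chosen only so that the arithmetic is easy to follow; integer arithmetic is used for the floors to avoid floating-point rounding.

```python
from math import gcd


def rescale(F: int, R: int, R_d: int, T: int, m: int):
    """Return (f1, f2, t_prime, T_prime, N) for the rescaled window.

    F   : packet size in bits
    R   : receiving bit rate in bits per second
    R_d : decoding bit rate in bits per second
    T   : window length in seconds
    m   : number of digits of accuracy
    """
    scale = 10 ** m
    f1 = (F * scale) // R      # floor(F / R * 10^m)
    f2 = (F * scale) // R_d    # floor(F / R_d * 10^m)
    if f1 == 0 or f2 == 0:
        raise ValueError("m is too small to represent the packet times")
    t_prime = gcd(f1, f2)      # sub-interval duration in the new scale
    T_prime = T * scale        # window length in the new scale
    N = T_prime // t_prime     # number of sub-intervals in a window
    return f1, f2, t_prime, T_prime, N


if __name__ == "__main__":
    # Hypothetical values: 8000-bit packets, R = 2 Mbps, R_d = 500 kbps, T = 1 s, m = 4.
    print(rescale(F=8000, R=2_000_000, R_d=500_000, T=1, m=4))
```

With these assumed values, one packet takes 40 rescaled time units to receive and 160 to decode, so the sub-interval length is 40 and the window contains 250 sub-intervals.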

Now, we have $N = \frac{T'}{t'}$ sub-intervals in a window, and we receive an integer number of packets in each burst. Suppose that we receive $A$ packets in a burst. We have:

\[ A \times t' = A \times \frac{F}{R} \times 10^m. \tag{5.1} \]

Since the time needed to receive one packet in the new scale is $\frac{F}{R} \times 10^m$, each burst fits into an integer number of sub-intervals. We define the Boolean variable $x_i$ for the $i$th sub-interval, where $x_i = 1$ if we send data in that sub-interval and $x_i = 0$ otherwise. Therefore, we have:

\[ \frac{t'}{10^m} \times \sum_{i=1}^{N} x_i = \sum_{i=1}^{n} t_i. \tag{5.2} \]

A burst starts after the ith sub-interval if we have xi = 0 and xi+1 = 1. We define

x0 = 0 to allow for the possibility of a burst starting in the first sub-interval. Thus, we

can count a burst wherever we see xi = 0 and xi+1 = 1 for 0 ≤ i ≤ N − 1. We have to

enumerate occurrences of this pattern (0 followed by 1) in the sequence x0...xN to determine

the number of bursts. We can do so as follows.

\[ n = \sum_{i=0}^{N-1} \neg x_i \wedge x_{i+1} = \sum_{i=0}^{N-1} (1 - x_i) \times x_{i+1} = \sum_{i=1}^{N} x_i - \sum_{i=0}^{N-1} x_i \times x_{i+1} \tag{5.3} \]
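The identity in Eq. (5.3) can be checked on small schedules. The sketch below counts bursts in two ways: by scanning for the pattern (0 followed by 1) with $x_0 = 0$, and by evaluating the closed form. The function names and the example schedule are illustrative only.

```python
from itertools import product


def count_bursts_by_pattern(x):
    """Count indices i in 0..N-1 with x_i = 0 and x_{i+1} = 1, where x_0 = 0."""
    padded = [0] + list(x)  # padded[i] holds x_i, with x_0 = 0
    return sum(1 for i in range(len(x)) if padded[i] == 0 and padded[i + 1] == 1)


def count_bursts_by_formula(x):
    """Evaluate sum_{i=1..N} x_i - sum_{i=0..N-1} x_i * x_{i+1} from Eq. (5.3)."""
    padded = [0] + list(x)
    return sum(x) - sum(padded[i] * padded[i + 1] for i in range(len(x)))


if __name__ == "__main__":
    schedule = [0, 1, 1, 0, 0, 1, 0, 1, 1, 1]  # bursts start at sub-intervals 2, 6, 8
    print(count_bursts_by_pattern(schedule), count_bursts_by_formula(schedule))
    # Exhaustive check of the identity for every schedule of length 6.
    assert all(
        count_bursts_by_pattern(x) == count_bursts_by_formula(x)
        for x in product((0, 1), repeat=6)
    )
```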

We define another Boolean sequence zi for 1 ≤ i ≤ N − 1 and replace xi × xi+1 in the

last summation of Eq. (5.3) by zi to get

\[ n = \sum_{i=1}^{N} x_i - \sum_{i=0}^{N-1} z_i. \tag{5.4} \]

The introduction of the zis will be used below to make the objective function convex.

Since we want to consume all the data we have received in the same window, we have:

\[ (\alpha \times R_I + (1 - \alpha) \times R_P) \times T = \frac{t'}{10^m} \times R \times \sum_{i=0}^{N} x_i. \tag{5.5} \]


The resulting objective function which is to be minimized is:

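\[
\begin{aligned}
& n \times E_w + (\alpha \times R_I + (1 - \alpha) \times R_P) \times \frac{E_b \times T}{F} + c_1 \times \alpha \times (\alpha \times R_I + (1 - \alpha) \times R_P) \\
& \quad + c_2 \times \alpha + c_3 \times (\alpha \times R_I + (1 - \alpha) \times R_P) + c_4 \\
& = E_w \times \Big( \sum_{i=1}^{N} x_i - \sum_{i=0}^{N-1} z_i \Big) + \frac{R \times t' \times E_b}{F \times 10^m} \times \sum_{i=0}^{N} x_i \\
& \quad + c_1 \times \alpha \times (\alpha \times R_I + (1 - \alpha) \times R_P) + c_2 \times \alpha + c_3 \times (\alpha \times R_I + (1 - \alpha) \times R_P) + c_4.
\end{aligned}
\tag{5.6}
\]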

We add two constraints, zi ≤ xi and zi ≤ xi+1 so that Eq. (5.3) and Eq. (5.4) result in

the same optimal solution for (5.6). We show their equivalence by enumerating all of the

possible cases for xi and xi+1. If xi = 0 or xi+1 = 0, then zi = 0. If xi = 1 and xi+1 = 1,

then zi can be 0 or 1. In this case, zi = 1 will be chosen to minimize the objective function.
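The case analysis above can be verified by enumeration. In the sketch below, the largest feasible value of $z_i$ under the two constraints is compared with the product $x_i \times x_{i+1}$; the largest value is the one a minimizer would pick because $z_i$ enters the objective with coefficient $-E_w$, assuming $E_w > 0$. The function name is illustrative.

```python
from itertools import product


def best_z(x_i: int, x_next: int) -> int:
    """Largest Boolean z with z <= x_i and z <= x_{i+1}.

    A minimizer of the objective picks this value because z_i appears with
    the negative coefficient -E_w (assuming E_w > 0).
    """
    feasible = [z for z in (0, 1) if z <= x_i and z <= x_next]
    return max(feasible)


if __name__ == "__main__":
    for x_i, x_next in product((0, 1), repeat=2):
        print(x_i, x_next, best_z(x_i, x_next), x_i * x_next)
```

In all four cases the chosen $z_i$ equals $x_i \times x_{i+1}$, which is why the linear constraints can stand in for the product term.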

We explain the constraints for this discrete time problem. Since the distortion and rate

constraints do not depend on the rescaling, they are the same as in problem (3.12). We

investigate the buffer underflow and overflow constraints. Since the receiving rate is at least

as large as the consuming rate, buffer underflow cannot happen in a sub-interval with xi = 1.

If buffer underflow happens in the middle of a sub-interval with xi = 0, then since we do

not receive any packets during this sub-interval, the buffer underflow will still be present at

the end of this sub-interval. Therefore, all buffer underflows can be detected by checking at

the end of each sub-interval.

Since we do not receive any packets in a sub-interval with xi = 0, buffer overflow cannot

occur in this sub-interval. If buffer overflow happens in a sub-interval with xi = 1, then

since the receiving rate is at least as large as the consuming rate, the buffer overflow will

still be present at the end of the ith sub-interval. Therefore, all buffer overflows can be

detected by checking at the end of each sub-interval. In summary, if we check the buffer

state at the end of each sub-interval, we can detect all buffer underflows and overflows.

We formulate the buffer constraints as follows. For buffer overflow, what we have received minus what we have consumed until the end of the $k$th sub-interval should not exceed the buffer size ($B$). The total number of bits that we have received is the total receiving time multiplied by the receiving rate ($R$). The total receiving time of packets until the end of the $k$th sub-interval is the number of $x_i = 1$ for $1 \le i \le k$ multiplied by the sub-interval size: $\frac{t'}{10^m} \times \sum_{i=1}^{k} x_i$. The total number of bits that we have consumed until the end of the $k$th sub-interval is equal to the total time that we have consumed bits multiplied by the decoding rate. The total time that we have consumed bits from the buffer is $\frac{t'}{10^m} \times k$ (only the receiving stops during sleep mode), and the total number of consumed bits is $\frac{t'}{10^m} \times k \times (\alpha \times R_I + (1 - \alpha) \times R_P)$ until the end of the $k$th sub-interval.

For the buffer underflow constraint, what we have consumed should be at most equal to what we have received until the end of the $k$th sub-interval. We have the following buffer overflow and underflow constraints, respectively:

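\[
\begin{aligned}
& R \times \frac{t'}{10^m} \times \sum_{i=1}^{k} x_i - \frac{t'}{10^m} \times k \times (\alpha \times R_I + (1 - \alpha) \times R_P) \\
& \quad = R \times \frac{t'}{10^m} \times \sum_{i=1}^{k} x_i - \frac{F \times t' \times k}{10^m \times T} \sum_{i=1}^{N} x_i \le B \quad \text{for } 1 \le k \le N
\end{aligned}
\tag{5.7}
\]

\[ 0 \le R \times \frac{t'}{10^m} \times \sum_{i=1}^{k} x_i - \frac{F \times t' \times k}{10^m \times T} \sum_{i=1}^{N} x_i \quad \text{for } 1 \le k \le N \tag{5.8} \]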

We simplify these two constraints to get:

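\[
\begin{aligned}
\sum_{i=1}^{k} x_i &\le \frac{k \times F}{R \times T} \sum_{i=1}^{N} x_i + \frac{B \times 10^m}{R \times t'} \quad \text{for } 1 \le k \le N \\
\frac{k \times F}{R \times T} \sum_{i=1}^{N} x_i &\le \sum_{i=1}^{k} x_i \quad \text{for } 1 \le k \le N.
\end{aligned}
\tag{5.9}
\]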
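A minimal sketch of a feasibility check for the two constraints in Eq. (5.9) is given below. It evaluates both inequalities at the end of every sub-interval using exact rational arithmetic. The function name and the numeric values in the example are assumptions for illustration and are not taken from the experiments.

```python
from fractions import Fraction
from itertools import accumulate


def satisfies_buffer_constraints(x, F, R, T, B, t_prime, m):
    """Check the overflow and underflow constraints of Eq. (5.9).

    x is the Boolean schedule x_1..x_N; all other arguments are integers.
    """
    total = sum(x)
    slack = Fraction(B * 10 ** m, R * t_prime)      # B * 10^m / (R * t')
    for k, received in enumerate(accumulate(x), start=1):
        consumed = Fraction(k * F, R * T) * total   # k * F / (R * T) * sum_{i=1..N} x_i
        if received > consumed + slack:             # buffer overflow at sub-interval k
            return False
        if consumed > received:                     # buffer underflow at sub-interval k
            return False
    return True


if __name__ == "__main__":
    # Hypothetical setting: F = 1000 bits, R = 10 kbps, T = 1 s, m = 1, t' = 1, N = 10.
    params = dict(F=1000, R=10_000, T=1, t_prime=1, m=1)
    print(satisfies_buffer_constraints([1, 0] * 5, B=3000, **params))
    print(satisfies_buffer_constraints([0] * 5 + [1] * 5, B=3000, **params))
    print(satisfies_buffer_constraints([1] * 5 + [0] * 5, B=2000, **params))
```

In this assumed setting, the alternating schedule is feasible, the schedule that starts with five idle sub-intervals underflows immediately, and the single long burst overflows a 2000-bit buffer.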


Figure 5.1: Shifting process.

The problem formulation is as follows:

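\[
\begin{aligned}
\text{Minimize} \quad & E_w \times \Big( \sum_{i=1}^{N} x_i - \sum_{i=0}^{N-1} z_i \Big) + \frac{R \times t' \times E_b}{F \times 10^m} \times \sum_{i=1}^{N} x_i \\
& + c_1 \times \alpha \times (\alpha \times R_I + (1 - \alpha) \times R_P) + c_2 \times \alpha + c_3 \times (\alpha \times R_I + (1 - \alpha) \times R_P) + c_4 \\
\text{subject to} \quad & \alpha \times \sigma_I^2 \times 2^{-2 \times \gamma_I \times \frac{R_I}{\alpha}} + (1 - \alpha) \times \sigma_P^2 \times 2^{-2 \times \gamma_P \times \frac{R_P}{1 - \alpha}} \le D \\
& (\alpha \times R_I + (1 - \alpha) \times R_P) \times T = \frac{t'}{10^m} \times R \times \sum_{i=1}^{N} x_i \\
& \sum_{i=1}^{k} x_i \le \frac{k \times F}{R \times T} \times \sum_{i=1}^{N} x_i + \frac{B \times 10^m}{R \times t'} \quad \text{for } 1 \le k \le N \\
& \frac{k \times F}{R \times T} \times \sum_{i=1}^{N} x_i \le \sum_{i=1}^{k} x_i \quad \text{for } 1 \le k \le N \\
& \alpha \times R_I + (1 - \alpha) \times R_P \le R \\
& z_i \le x_i \quad \text{for } 1 \le i \le N \\
& z_i \le x_{i+1} \quad \text{for } 1 \le i \le N - 1 \\
& 0 \le \alpha \le 1 \\
& 0 \le R_I, R_P
\end{aligned}
\tag{5.10}
\]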

Any solution to problem (5.10) is also a solution to problem (3.12) when the packet size

is fixed. Otherwise, suppose that the best solution for problem (5.10) is Y, and there is a

solution X for problem (3.12) with less energy consumption. We can get a solution X ′ for

problem (5.10) by shifting the bursts in solution X backwards to the beginning of the closest


sub-interval. Figure 5.1 shows the shifting process to obtain new bursts for the new discrete

problem from the continuous one. We show that these shifts preserve the constraints and

that the new burst schedule X ′ has the same energy consumption as solution X. We decode

the received data with the α, RI , and RP obtained from solution X and these values preserve

the rate and distortion constraints. For the buffer constraints, note that the transport layer

takes out one packet every $\lfloor \frac{F}{R} \times 10^m \rfloor$ seconds in the new scale and that $t' \mid \lfloor \frac{F}{R} \times 10^m \rfloor$. Since

these accesses start at the beginning of each window, the transport layer has access to the

buffer only at the beginnings of sub-intervals. We can conclude that we have the same

amount of data in the buffer at the beginning of each burst as in solution X, and that shifts

preserve the buffer constraints. For the energy consumption, we have received and decoded

the same amount of data as solution X, the decoding parameters α, RI , and RP are the

same as in solution X, and we have the same number of bursts. According to Section 3.2.2, the interval between two consecutive bursts in solution X is $\frac{B}{R_d}$ and we have:

\[ t' = \gcd(t_1, t_2) \le \frac{F}{R_d} \le \frac{B}{R_d}. \tag{5.11} \]

Since the duration of a shift is less than $\frac{F}{R_d}$, a shift does not merge two consecutive bursts.

So, the energy consumption of X ′ would be the same as solution X. However, we have

assumed that the best solution for problem (5.10) is Y with more energy consumption,

and this is a contradiction. We have shown that the solution for problem (3.12) with fixed

packet size cannot be better than the solution for problem (5.10) and that we can transform a

solution for problem (3.12) with fixed packet size to a solution for problem (5.10) by shifting

the bursts. The constraints are preserved for problem (3.12) with the decoding parameters

and burst schedule from problem (5.10). Therefore, the solution for problem (5.10) is also a

solution for problem (3.12) when the packet size is fixed. Problem (5.10) is still non-convex,

but we will show how to make it convex in the following.

Note that there is a $t'$ that results in an optimal solution to problem (5.10). Since we do not know $R_d$, it is not possible to find $t'$ prior to solving problem (5.10). We propose an approximation for $t'$. We set $t' = \frac{F}{R} \times 10^m$. Since we receive an integer number of packets in each burst, shifted bursts fit into an integer number of sub-intervals. The transport layer has access to the buffer at most once in each sub-interval because $\frac{F}{R} < \frac{F}{R_d}$, but unlike problem (5.10), we do not know when this access happens in a sub-interval.

In the following, we explain the changes that we need to make to the buffer constraints to

make them a good approximation for the buffer constraints in problem (3.12). Assume that


there is a solution X for problem (3.12) and we have received Y1 packets and used Y2 packets

at the beginning of the ith burst in solution X. Since X is a solution for problem (3.12),

0 ≤ Y1 − Y2. We shift the bursts in this solution backward as we did before. It is possible

to meet one decoder access to the buffer while shifting backwards, therefore we would have

used either Y2 or Y2 − 1 packets so far. Since 0 ≤ Y1 − Y2 ⇒ 0 ≤ Y1 − (Y2 − 1), there is no

buffer underflow in this case. However, there may be a buffer overflow for the new burst schedule if we have $Y_1 - Y_2 = \frac{B}{F}$ and we meet a decoder's access to the buffer by shifting backward in a burst for solution X. In this case, what we have used is $Y_2 - 1$. Since $\frac{B}{F} < Y_1 - (Y_2 - 1)$, there is a buffer overflow. We will consider the buffer capacity to be equal to $B - F$ to resolve this inconsistency.

We show in the results section that the solution for this new problem is a good approximation for problem (3.12).

So, by setting $t' = \frac{F \times 10^m}{R}$ and using $B - F$ instead of $B$, we have the following problem:

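\[
\begin{aligned}
\text{Minimize} \quad & E_w \times \Big( \sum_{i=1}^{N} x_i - \sum_{i=1}^{N-1} z_i \Big) + E_b \times \sum_{i=1}^{N} x_i \\
& + c_1 \times \alpha \times (\alpha \times R_I + (1 - \alpha) \times R_P) + c_2 \times \alpha + c_3 \times (\alpha \times R_I + (1 - \alpha) \times R_P) + c_4 \\
\text{subject to} \quad & \alpha \times \sigma_I^2 \times 2^{-2 \times \gamma_I \times R_I} + (1 - \alpha) \times \sigma_P^2 \times 2^{-2 \times \gamma_P \times R_P} \le D \\
& (\alpha \times R_I + (1 - \alpha) \times R_P) \times T = F \times \sum_{i=1}^{N} x_i \\
& \sum_{i=1}^{k} x_i \le \frac{k \times F}{R \times T} \times \sum_{i=1}^{N} x_i + \frac{B - F}{F} \quad \text{for } 1 \le k \le N \\
& \frac{k \times F}{R \times T} \times \sum_{i=1}^{N} x_i \le \sum_{i=1}^{k} x_i \quad \text{for } 1 \le k \le N \\
& \alpha \times R_I + (1 - \alpha) \times R_P \le R \\
& z_i \le x_i \quad \text{for } 1 \le i \le N - 1 \\
& z_i \le x_{i+1} \quad \text{for } 1 \le i \le N - 1 \\
& 0 \le \alpha \le 1 \\
& 0 \le R_I, R_P
\end{aligned}
\tag{5.12}
\]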


5.2 Solution

Problem (5.12) is a non-convex mixed integer programming problem. In this problem, we have $(2 \times N - 1)$ Boolean variables. However, it suffices to assign values to the $x_i$s for $1 \le i \le N$ to determine the values of the $z_i$s for $1 \le i \le N - 1$. Therefore, there are $O(2^N)$ ways to give values to the $x_i$s. Since we want to solve this problem in real time, we need a polynomial-time method for it. In the following, we show how to apply heuristics to solve this problem in polynomial time.
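To make the exponential growth concrete, the following sketch enumerates all $2^N$ schedules for a tiny instance. It is not the method used in this chapter. It assumes that α, $R_I$, and $R_P$ are already fixed, so that the number of receiving sub-intervals is known and only the burst term $E_w \times n + E_b \times \sum x_i$ remains to be minimized under the buffer constraints of problem (5.12). All names and numeric values are hypothetical.

```python
from fractions import Fraction
from itertools import accumulate, product


def feasible(x, ratio, slack):
    """Buffer constraints in the form of problem (5.12).

    ratio = F / (R * T) and slack = (B - F) / F.
    """
    total = sum(x)
    for k, received in enumerate(accumulate(x), start=1):
        consumed = k * ratio * total
        if consumed > received or received > consumed + slack:
            return False
    return True


def bursts(x):
    """Number of 0 -> 1 transitions in x_0..x_N with x_0 = 0."""
    padded = (0,) + tuple(x)
    return sum(1 for a, b in zip(padded, padded[1:]) if a == 0 and b == 1)


def brute_force(N, active, ratio, slack, E_w, E_b):
    """Try all 2^N schedules with exactly `active` receiving sub-intervals."""
    best = None
    for x in product((0, 1), repeat=N):
        if sum(x) != active or not feasible(x, ratio, slack):
            continue
        cost = E_w * bursts(x) + E_b * active
        if best is None or cost < best[0]:
            best = (cost, x)
    return best


if __name__ == "__main__":
    # Hypothetical instance: N = 10, five receiving sub-intervals, B = 3F.
    print(brute_force(N=10, active=5, ratio=Fraction(1, 10), slack=2, E_w=10, E_b=1))
```

For this assumed instance a single burst would overflow the buffer, so the cheapest feasible schedules use two bursts. The loop visits 1024 schedules for N = 10 and the count doubles with each additional sub-interval, which is why enumeration is impractical for realistic window sizes.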

Our approach benefits from the discrete nature of problem (5.12). We use the branch and

cut technique [6]. Since it is not easy to use this technique for large scale problems, we use the

IBM ILOG CPLEX Optimizer [23] to solve this problem. The IBM ILOG CPLEX Optimizer

is a mathematical programming solver for mixed integer linear programming problems. This

solver uses a dynamic search algorithm consisting of the same steps as the branch and

cut method. These steps are: LP relaxation, branching, cuts, and heuristics. In general,

CPLEX solves a series of continuous sub-problems with a branch and cut algorithm and

uses a tree structure to manage these sub-problems. The root of this tree consists of the

continuous relaxation of the original problem. If the solution for a relaxation has some

fractional variables, CPLEX tries to find cuts to remove some regions from the feasible

region that contain fractional values. If we still have fractional values after cutting these

regions, CPLEX branches on them and builds two sub-problems for each fractional value

with each one having more restrictive bounds. Sub-problems may give the optimal solution

with all integer values or need more branches and cuts.

We need to make our problem convex in order to use CPLEX. Note that we have non-

convexity in the objective function and in the distortion constraint. We have been using

PSNR or the equivalent mean squared error (MSE) to measure video quality. The PSNR of

a decoded video signal at the receiver is compared to its original video signal in this quality

metric. We took the average over all frames to obtain the video quality. In problem (3.12),

we have specified that the average distortion of inter-coded and intra-coded frames should

be less than the threshold D. We replace this constraint with two constraints, one for each

kind of frame (inter-coded and intra-coded), and specify that the distortion for each one

should not exceed the threshold D. We have the following constraints for distortion:

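\[
\begin{aligned}
\sigma_I^2 \times 2^{-2 \times \gamma_I \times R_I} &\le D \\
\sigma_P^2 \times 2^{-2 \times \gamma_P \times R_P} &\le D
\end{aligned}
\tag{5.13}
\]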


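Since both sides are positive, we take the base-2 logarithms of both sides of these two constraints. These two new constraints are convex. Since $\sigma_I$, $\sigma_P$, $\gamma_I$ and $\gamma_P$ are greater than zero, the new constraints are as follows:

\[
\begin{aligned}
2 \times \log(\sigma_I) - 2 \times \gamma_I \times R_I \le \log(D) \;&\rightarrow\; \frac{-\log(D) + 2 \times \log(\sigma_I)}{2 \times \gamma_I} \le R_I \\
2 \times \log(\sigma_P) - 2 \times \gamma_P \times R_P \le \log(D) \;&\rightarrow\; \frac{-\log(D) + 2 \times \log(\sigma_P)}{2 \times \gamma_P} \le R_P
\end{aligned}
\tag{5.14}
\]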
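The lower bounds in Eq. (5.14) can be evaluated directly. The sketch below assumes base-2 logarithms, which is what makes the bound exact for the model $\sigma^2 \times 2^{-2 \gamma R}$, and clips the bound at zero because rates are non-negative. The values of σ and γ in the example are hypothetical and do not come from the measurements in this thesis.

```python
import math


def distortion(sigma: float, gamma: float, rate: float) -> float:
    """Distortion model sigma^2 * 2^(-2 * gamma * rate) used in Eq. (5.13)."""
    return sigma ** 2 * 2.0 ** (-2.0 * gamma * rate)


def min_rate(sigma: float, gamma: float, D: float) -> float:
    """Smallest rate satisfying Eq. (5.14), clipped at zero."""
    bound = (2.0 * math.log2(sigma) - math.log2(D)) / (2.0 * gamma)
    return max(0.0, bound)


if __name__ == "__main__":
    # Hypothetical model parameters; D = 20.56 is the threshold used for 35 dB.
    sigma_I, gamma_I, D = 20.0, 1.0, 20.56
    r = min_rate(sigma_I, gamma_I, D)
    print(r, distortion(sigma_I, gamma_I, r))
```

At the returned rate the modeled distortion meets the threshold with equality, and any larger rate gives a smaller distortion.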

All the constraints are convex now. We substitute $\frac{F}{T} \times \sum_{i=1}^{N} x_i$ for $(\alpha \times R_I + (1 - \alpha) \times R_P)$ in problem (5.12) and use the distortion constraints of Eq. (5.14) to get

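\[
\begin{aligned}
\text{Minimize} \quad & E_w \times \Big( \sum_{i=1}^{N} x_i - \sum_{i=1}^{N-1} z_i \Big) + E_b \times \sum_{i=1}^{N} x_i \\
& + c_1 \times \alpha \times \frac{F}{T} \times \sum_{i=1}^{N} x_i + c_2 \times \alpha + c_3 \times \frac{F}{T} \sum_{i=1}^{N} x_i + c_4 \\
\text{subject to} \quad & \frac{-\log(D) + 2 \times \log(\sigma_I)}{2 \times \gamma_I} \le R_I \\
& \frac{-\log(D) + 2 \times \log(\sigma_P)}{2 \times \gamma_P} \le R_P \\
& (\alpha \times R_I + (1 - \alpha) \times R_P) \times T = F \times \sum_{i=1}^{N} x_i \\
& \sum_{i=1}^{k} x_i \le \frac{k \times F}{R \times T} \times \sum_{i=1}^{N} x_i + \frac{B}{F} - 1 \quad \text{for } 1 \le k \le N \\
& \frac{k \times F}{R \times T} \times \sum_{i=1}^{N} x_i \le \sum_{i=1}^{k} x_i \quad \text{for } 1 \le k \le N \\
& \alpha \times R_I + (1 - \alpha) \times R_P \le R \\
& z_i \le x_i \quad \text{for } 1 \le i \le N - 1 \\
& z_i \le x_{i+1} \quad \text{for } 1 \le i \le N - 1 \\
& 0 \le \alpha \le 1 \\
& 0 \le R_P, R_I
\end{aligned}
\tag{5.15}
\]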

If we assume that α is fixed, then the objective function will be convex. We solved this

problem with CPLEX for different possible values of α in parallel and chose the value of


α that gives the minimum value for the objective function. Since we have limited time for

finding the solution, we branch and cut until we reach the optimal solution or until a fixed

amount of time has elapsed. In the next section, we show that this method gives better

results than the Lagrangian method. Furthermore, the space that it needs is reasonable

according to the solution parameter.

5.3 Results

Figure 5.2: Average energy consumption, Foreman sequence, branch and cut (Convex) and Lagrangian method (non-Convex) problem, PSNR limit 35 dB. (Plot not reproduced. Axes: Energy (mJ) versus Time (s), 1 to 10 s. Curves: Convex, non-Convex.)

In this section, we compare the results from two different algorithms. The first algorithm

is the Lagrangian method and the second method is the branch and cut algorithm. We

compare the energy and distortion that results when the Lagrangian method is used to

solve the non-convex optimization problem in Eq. (3.16) with the results when the CPLEX

optimizer is used to solve the convex optimization problem in Eq. (5.15). We used the

Foreman video sequence with the same parameter settings as Section 4.2.

We used D = 20.56 and D = 10.023 which correspond to PSNR thresholds of 35 and

38 dB. We used an equal channel bit rate for both algorithms. Figures 5.2 and 5.3 show

the results for the average energy consumption in each window using these methods for

PSNR thresholds of 35 and 38 dB, respectively. The resulting energy consumptions using

the CPLEX optimizer and the Lagrangian method for 10 seconds are 5.186 and 5.417 J for PSNR 35, and 7.22 and 7.883 J for PSNR 38.

Figure 5.3: Average energy consumption, Foreman sequence, branch and cut (Convex) and Lagrangian method (non-Convex) problem, PSNR limit 38 dB. (Plot not reproduced. Axes: Energy (mJ) versus Time (s), 1 to 10 s. Curves: Convex, non-Convex.)
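The distortion thresholds $D$ in this section are mean squared errors, while the quality targets are stated as PSNR values. A minimal sketch of the standard conversion between the two is shown below, assuming 8-bit video with a peak sample value of 255; the function names are illustrative.

```python
import math

PEAK = 255.0  # assumed peak sample value for 8-bit video


def mse_from_psnr(psnr_db: float) -> float:
    """MSE that corresponds to a PSNR value: PEAK^2 / 10^(PSNR / 10)."""
    return PEAK ** 2 / 10.0 ** (psnr_db / 10.0)


def psnr_from_mse(mse: float) -> float:
    """PSNR in dB for a given MSE: 10 * log10(PEAK^2 / MSE)."""
    return 10.0 * math.log10(PEAK ** 2 / mse)


if __name__ == "__main__":
    print(mse_from_psnr(35.0))
    print(psnr_from_mse(20.56))
```

Under this definition, a PSNR of 35 dB corresponds to an MSE of about 20.56, which matches the first threshold used above.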

Figures 5.4 and 5.5 show the average PSNR for these two methods for PSNR thresholds

of 35 and 38 dB, respectively. The resulting average PSNRs using the CPLEX optimizer

and Lagrangian method for 10 seconds are 35.44 and 35.2 dB for PSNR 35, and 37.79 and

37.3 for PSNR 38, respectively. We summarize the results in Tables 5.1 and 5.2 for PSNR

35 and 38 dB, respectively.

Table 5.1: Average PSNR and total energy consumption over 10 seconds of the Foreman sequence, branch and cut and Lagrangian method, PSNR limit 35 dB

Method                    PSNR (dB)    Energy (J)
Branch and cut method     35.44        5.186
Lagrangian method         35.2         5.417

Table 5.2: Average PSNR and total energy consumption over 10 seconds of the Foreman sequence, branch and cut and Lagrangian method, PSNR limit 38 dB

Method                    PSNR (dB)    Energy (J)
Branch and cut method     37.79        7.22
Lagrangian method         37.3         7.88
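The relative energy saving implied by the tabulated values can be computed with a one-line formula. The sketch below is only arithmetic on the numbers in Tables 5.1 and 5.2; the function name is illustrative.

```python
def relative_saving(baseline_j: float, improved_j: float) -> float:
    """Percentage of energy saved relative to the baseline."""
    return 100.0 * (baseline_j - improved_j) / baseline_j


if __name__ == "__main__":
    # (Lagrangian energy, branch and cut energy) in joules, from Tables 5.1 and 5.2.
    for label, lagrangian, branch_and_cut in (("35 dB", 5.417, 5.186), ("38 dB", 7.88, 7.22)):
        print(label, round(relative_saving(lagrangian, branch_and_cut), 2))
```

With the tabulated values, the branch and cut method uses roughly 4.3% less energy at 35 dB and roughly 8.4% less at 38 dB.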


Figure 5.4: Average PSNR, Foreman sequence, branch and cut (Convex) and Lagrangian method (non-Convex) problem, PSNR limit 35 dB. (Plot not reproduced. Axes: PSNR (dB) versus Time (s), 1 to 10 s. Curves: Convex, non-Convex.)

The average energy consumptions are less using the CPLEX optimizer and the resulting

PSNRs are higher compared to the Lagrangian method. We can conclude from the results

that changing the problem to a convex mixed integer problem and using a branch and cut

algorithm to solve it gives better results than using convex problem solving methods such as

the Lagrangian method to solve the non-convex problem for fixed packet size transmission.

Figure 5.5: Average PSNR, Foreman sequence, branch and cut (Convex) and Lagrangian method (non-Convex) problem, PSNR limit 38 dB. (Plot not reproduced. Axes: PSNR (dB) versus Time (s), 1 to 10 s. Curves: Convex, non-Convex.)

Chapter 6

Conclusions and Future Work

In this chapter, first we summarize our work and the performance of our framework. Then,

we introduce several future research directions to extend this work.

6.1 Conclusions

This thesis provides a framework for optimizing the total energy consumption of receivers

subject to a given distortion limit, limited channel bandwidth, and receiver buffer size.

This optimization problem is based on models for distortion and energy consumption at the

receivers. One of our contributions is a decoding energy consumption model and its verifi-

cation for the x264 coder. We formulated the problem of selecting the mode for each frame,

finding the coding bit rates for intra-coded and inter-coded frames, and burst scheduling in

WiMAX wireless networks.

There are three main reasons why these results are important. The first reason is that

real-time video requests over wireless devices are increasing and our method can transmit

data with a given distortion for real-time streams. Second, preserving the video

quality requested by the users helps to increase their satisfaction. Finally, saving energy on

wireless mobile devices is one of the main concerns of users.

We considered two scenarios based on different kinds of modulations: variable and fixed

packet sizes. In the first scenario, we formulated the energy optimization problem at the

receivers based on the rate, distortion, and buffer constraints. This optimization problem is

non-convex. We proposed a method which uses a greedy algorithm for scheduling bursts in

a window based on the formulated optimization problem. This method preserves the buffer

constraints and consumes optimal energy for burst scheduling. After applying the burst

scheduling algorithm, the resulting problem formulation is simpler but still non-convex.

Then, we applied a Lagrangian method to solve this non-convex problem.

In order to show the effectiveness of our algorithm in energy saving at the receivers, we

compared the energy consumption and distortion resulting from the use of our Lagrangian

method with those of two other methods. The results show the effect of choosing intra refresh

rate and coding bit rate on energy efficient multimedia reception. Our framework can save

up to 50% energy on receiving devices with only a small deviation from the requested quality.

In the second scenario, we formulated a convex mixed integer energy optimization prob-

lem for using a fixed packet size. Then, we used CPLEX to solve the proposed problem

and showed the improvements in energy saving compared to the Lagrangian method which

is used to solve the variable packet size problem. Our results show that it is more efficient

to use a branch and cut method for solving energy optimization problems for fixed packet

sizes rather than considering the general variable packet size energy optimization problem.

6.2 Future Work

This thesis can be extended in many directions. Some of these directions are summarized below:

• We considered reliable data transmission without any loss over the channel for our

problem formulation. However, existing channels are unreliable and lossy. We can

extend our framework to work over lossy channels with the given loss ratio. In this

case, the distortion model should be extended to contain the distortion which is caused

by packet loss over a lossy channel. An error concealment method is required to recover

the corrupted frames with a small increment in the distortion and its deviation from

a given threshold.

• Our current framework works efficiently to find the coding parameters and burst

scheduling for one video stream. However, it potentially can be extended to con-

sider the reception of multiple video streams. Since every stream decodes separately,

we should consider the buffer constraints for each stream. Also, the burst scheduling

method has to be generalized to consider multiple buffer constraints. Finally, there

is a trade-off between bandwidth usage and energy consumption, and this trade-off

becomes more important when there are multiple streams and limited bandwidth.

• The CPLEX optimizer is an applicable tool to solve convex mixed integer problems. A

different approach would be to use dynamic programming to solve the problem (5.15).

Bibliography

[1] S. Baek and B.D. Choi. Performance analysis of sleep mode operation in IEEE 802.16m with both uplink and downlink packet arrivals. In Proceedings of the IEEE International Workshop on Computer Aided Modeling and Design of Communication Links and Networks (CAMAD), pages 112–116, 2011.

[2] D.P. Bertsekas. Constrained optimization and Lagrange multiplier methods. Academic Press, 1982.

[3] D.P. Bertsekas. Nonlinear programming. 1999.

[4] J.F. Bonnans, J.C. Gilbert, C. Lemarechal, and C.A. Sagastizabal. Numerical optimization: theoretical and practical aspects. Springer, 2006.

[5] J. Burg. The science of digital media. Prentice Hall/Pearson Education, 2009.

[6] M. Dell’Amico, F. Maffioli, and S. Martello. Annotated bibliographies in combinatorial optimization. John Wiley & Sons Inc., 1997.

[7] M. Ergen. Mobile broadband: including WiMAX and LTE. Springer, 2009.

[8] WiMAX Forum. Mobile WiMAX - Part II: A comparative analysis, 2006. http://www.wimaxforum.org/technology/downloads/Mobile_WiMAX_Part2_Overview_and_Performance.pdf.

[9] Z. He, J. Cai, and C.W. Chen. Joint source channel rate-distortion analysis for adaptive mode selection and rate control in wireless video coding. IEEE Transactions on Circuits and Systems for Video Technology, 12(6):511–523, 2002.

[10] Z. He, Y. Liang, L. Chen, I. Ahmad, and D. Wu. Power-rate-distortion analysis for wireless video communication under energy constraints. IEEE Transactions on Circuits and Systems for Video Technology, 15(5):645–658, 2005.

[11] C.H. Hsu and M. Hefeeda. Time slicing in mobile TV broadcast networks with arbitrary channel bit rates. In Proceedings of the IEEE International Conference on Computer Communications (INFOCOM), pages 2231–2239, 2009.

[12] Joulemeter. http://research.microsoft.com/en-us/projects/joulemeter.

[13] T. Kariya and H. Kurata. Generalized least squares. Wiley, 2004.

[14] A. Kumar. Mobile broadcasting with WiMAX: principles, technology, and applications. Taylor & Francis US, 2008.

[15] T.H. Lan and A.H. Tewfik. Power optimized mode selection for H.263 video coding and wireless communications. In Proceedings of the IEEE International Conference on Image Processing (ICIP), volume 2, pages 113–117, 1998.

[16] Y. Liang and I. Ahmad. Power and distortion optimization for pervasive video coding. IEEE Transactions on Circuits and Systems for Video Technology, 19(10):1436–1447, 2009.

[17] Y. Lin, E. Gurses, A.N. Kim, and A. Perkis. Optimal joint power-rate adaptation for error resilient video coding. In Proceedings of SPIE, volume 6822 of Visual Communications and Image Processing, pages 20–28, 2008.

[18] X. Lu, E. Erkip, Y. Wang, and D. Goodman. Power efficient multimedia communication over wireless channels. IEEE Journal on Selected Areas in Communications, 21(10):1738–1751, 2003.

[19] C.E. Luna, Y. Eisenberg, R. Berry, T.N. Pappas, and A.K. Katsaggelos. Joint source coding and data rate adaptation for energy efficient wireless video streaming. IEEE Journal on Selected Areas in Communications, 21(10):1710–1720, 2003.

[20] S. Ma, W. Gao, and Y. Lu. Rate-distortion analysis for H.264/AVC video coding and its application to rate control. IEEE Transactions on Circuits and Systems for Video Technology, 15(12):1533–1544, 2005.

[21] S.K. Mishra. Topics in Nonconvex Optimization: Theory and Applications, volume 50. Springer, 2011.

[22] Institute of Electrical and Electronics Engineers. Local and metropolitan area networks - Part 16: Air interface for broadband wireless access systems broadband wireless metropolitan area network, 2009. http://standards.ieee.org/getieee802/download/802.16-2009.pdf.

[23] IBM ILOG CPLEX Optimizer. http://www-01.ibm.com/software/integration/optimization/cplex-optimizer/.

[24] W. Pu, Y. Lu, and F. Wu. Joint power-distortion optimization on devices with MPEG-4 AVC/H.264 codec. In Proceedings of the IEEE International Conference on Communications, volume 1, pages 441–446, 2006.

[25] I.E. Richardson. The H.264 advanced video compression standard. Wiley, 2011.

[26] S. Sharangi, R. Krishnamurti, and M. Hefeeda. Energy-efficient multicasting of scalable video streams over WiMAX networks. IEEE Transactions on Multimedia, 13(1):102–115, 2011.

[27] JM Software. http://iphome.hhi.de/suehring/tml.

[28] Arizona State University YUV video sequences. http://trace.eas.asu.edu/yuv.

[29] T. Wiegand, G.J. Sullivan, G. Bjontegaard, and A. Luthra. Overview of the H.264/AVC video coding standard. IEEE Transactions on Circuits and Systems for Video Technology, 13(7):560–576, 2003.

[30] x264. http://www.videolan.org/developers/x264.html.

[31] S. Zhu and T. Wang. Enhanced power efficient sleep mode operation for IEEE 802.16e based WiMAX. In Proceedings of the IEEE Conference on Mobile WiMAX Symposium, 2007. IEEE, pages 43–47, 2007.
