a context-driven iot middleware architectureseelab.ucsd.edu/papers/jvenkate.techcon15.pdf · a...

A Context-Driven IoT Middleware Architecture

Jagannathan Venkatesh, Christine Chan, Alper Sinan Akyurek and Tajana Simunic Rosing

University of California, San Diego

Abstract—The Internet of Things (IoT) refers to an environment of

ubiquitous sensing and actuation from devices connected to the web

backend. IoT applications leverage contextual information about

entities in the system for reasoning and actuation. These context-aware

applications are difficult to scale to the large amount of heterogeneous

data in the IoT, as the current state of the art consists of black-box, monolithic,

application-specific implementations. We propose a middleware

framework for context-aware applications that generates intermediate,

reusable context extracted from input by breaking down applications

into a set of functional units, or context engines. Leveraging existing

IoT ontologies, we can replace application-specific implementations

with a composition of context engines that use statistical learning to

generate output, improving context reuse and reducing computational

redundancy and complexity. We implement an IoT application using

our framework, extracting residential user activity from plug loads,

and demonstrate a reduction in computational complexity by 23% and

execution overhead by 69%.

I. INTRODUCTION

Sensor networks and ubiquitous sensing are evolving into the

Internet of Things (IoT) – the collection of sensing and actuation

backed by the existing and growing Internet infrastructure [1]. Pre-

IoT work in this area envisioned a level of compatibility and control

over the sensors in the systems [2] and applications that used a

manageable amount of raw sensor data. However, the number of

available sensing and actuation devices has grown rapidly in the last

few years [3], promising a truly pervasive sensing and actuating

environment. The main goal of the ubiquitous sensing in the

Internet of Things is to drive automated actuation through context-aware applications, which operate on dynamically changing raw sensor data to extract context. However, as the number and heterogeneity of sensing devices increase, the idea of applications operating on

their data in a scalable manner becomes infeasible. A middleware

infrastructure that classifies and maintains the context derived from

the data for these applications can alleviate scalability issues. This

layer serves to identify the relationship between objects in the

system, and leverage both raw data and high-level context about the

dependent objects to enable useful processing and automation.

Related work [4] [5] on middleware places the burden of

implementation on each individual application, but this is inefficient

when multiple applications need to process the same data in the

same way. Furthermore, reliance on application-specific code

reduces the potential for designing and reusing general-purpose

machine learning across context-aware applications. In this work,

we present a middleware infrastructure designed to overcome these

limitations: an ontology-driven framework that enables the use of

general statistical learning to drive output for the IoT.

Our framework is implemented as middleware through an

extensible set of context engines, translating heterogeneous raw

sensor data into high-level usable context. Each context engine also

generates intermediate context, which can be reused across

applications instead of performing redundant computation. The

middleware thus reduces an application’s input set and facilitates

more scalable application development. We also automate context-

aware computing by leveraging general implementations of

statistical machine learning that can easily be reused for other

applications.

We provide an overview of related work in IoT, sensor

ontologies, and characteristics and requirements of context-aware

applications and middleware in Section II. We outline the context

engine implementation in Section III, including the algorithm to

learn and train a model for data translation and output generation.

Finally, in Section IV, we design an example context-aware IoT

application and compare application characteristics between our

infrastructure and a single-step optimization example. We

demonstrate 69% speedup in execution time enabled by our

framework due to better scalability and parallelism while obtaining

comparable accuracy.

II. RELATED WORK

Pervasive sensors gather raw data from a diverse combination of

data sources, including raw sensors and user-supplied or high-level

context processed from mobile and computing devices. Analog

sensor data is digitized and/or preprocessed before software can use

it as meaningful input. In the IoT, most data will go through several

levels of abstraction, combination, or distillation to produce a

description of the environment (and its users) with discrete,

semantic states. This higher level context is used for visualization

[4] or actuation [5][6]. Discretized context trades raw data precision for intuitive reasoning, and can be reused across

applications. Current context-aware applications are individual

deployments that rarely share infrastructure, code, or data natively.

Practically, this end-to-end development approach results in a

disorganized data space, necessitating the use of ontologies –

formal data representations – to maintain a unified, regulated data

representation [9]. The Web Ontology Language (OWL), an early

Internet ontology, is one such standard that provides description of

web relationships with several implementations: smart corporate

spaces [6], homes [10] and semantic services (OWL-S) [9]. An

important aspect of ontologies for the IoT is identifying the object

to which an application is related. The Context Modeling Language

(CML) [11] accomplishes this by relying on the Object-Role

Model, attributing all context data to a physical or virtual entity (the

object) and providing a particular form of information associated

with it (role).

Finally, pervasive sensing and computing in the IoT requires

learning/reasoning to appropriately transform input data into output

context. K-means clustering is a prevalent way to relate low-level

data into high-level context [4]. Reinforcement learning (RL)

allows IoT users who are already involved in sensing and actuation

to reinforce and guide the system towards better accuracy and

intuitive actuation [5]. Rashidi et al. [12] remove reliance on

labeled data by performing unsupervised learning over low-level

sensor data to extract patterns that represent high-level activities.

The above examples are application specific, but we propose a

framework and algorithms that can perform a similar level of data

translation and actuation in a domain-independent manner.

The state of the art in IoT middleware defines it as an interface

between sensing and actuation [13]. Ontologies can be used to

organize applications, but they remain multi-input, multi-output

(MIMO) black boxes transforming input to application-specific

output [10]. However, we and others [1] observe that the overall

process is made up of similar, application-agnostic data processing.

III. SYSTEM DESIGN

Our main contribution in this work is an alternate view of IoT

middleware: a hierarchy of multiple-input-single-output (MISO)

functional units (context engines) that improve scalability and reduce data redundancy across applications while retaining the functionality of the previous monolithic multi-input multi-output (MIMO) units. We trade away compact, application-specific implementations in order to expose intermediate data that can benefit other applications, exploiting the unique opportunity in the IoT, where reasoning and data are often replicated among applications (see Figure 1, right).

Furthermore, the smaller, hierarchical functional units represent a

simpler data translation than before, and we can implement a

general machine learning algorithm to perform data transformation

and reduce application-specific code. We prove that this approach

decreases overall compute complexity and enables scalability, in

terms of reduced input processing for higher function orders (III.A).

Additionally, splitting single-step applications into small functional

units (each with fewer inputs, simpler logic) facilitates a

generalized data transformation through machine learning (III.B).

A. Context Engine System Architecture

We outline our middleware architecture leveraging ontologies.

We use them to specify the interfaces to each of the functional units

and also to drive data transformation. Each context variable is the

individual input or output data unit, and the variable’s data space

defines the domain/range of the variable, and subsequently, the

context engine. The MIMO, black box implementation of current

IoT applications is then broken down into several context engines,

each producing a context variable. These smaller MISO functional

units are composed to form the same outputs as the consolidated

application (Figure 1.right). While the monolithic applications can

themselves be modular and parallelized, the breakdown is hidden

from the rest of the system.
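The decomposition described above can be illustrated with a minimal sketch. It is not the authors' implementation: the `ContextEngine` class, the shared `store` dictionary, the variable names, and the fixed thresholds are all hypothetical stand-ins (in the framework, the transformation would be a trained model rather than a hand-written rule). The sketch only shows how MISO units can publish intermediate context that other engines or applications may reuse.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Sequence


@dataclass
class ContextEngine:
    """A multiple-input, single-output (MISO) functional unit (hypothetical)."""
    inputs: Sequence[str]            # names of the context variables consumed
    output: str                      # name of the single context variable produced
    transform: Callable[..., object]  # stand-in for a trained model

    def run(self, store: Dict[str, object]) -> None:
        # Read inputs from the shared store and publish the output back to it,
        # so that other engines and applications can reuse it.
        args = [store[name] for name in self.inputs]
        store[self.output] = self.transform(*args)


# Illustrative engines; thresholds and variable names are assumptions.
engines = [
    ContextEngine(["tv_power_w"], "tv_active", lambda p: p > 20.0),
    ContextEngine(["lamp_power_w"], "lamp_active", lambda p: p > 5.0),
    ContextEngine(["tv_active", "lamp_active"], "room_active",
                  lambda tv, lamp: tv or lamp),
]

store: Dict[str, object] = {"tv_power_w": 85.0, "lamp_power_w": 0.0}
for engine in engines:  # engines are listed in dependency order
    engine.run(store)

print(store["tv_active"], store["lamp_active"], store["room_active"])
```

After the loop, the intermediate variables `tv_active` and `lamp_active` remain in the store alongside `room_active`, which is the property that lets a second application consume them without recomputing them from raw power data.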

We opt to expose the internal composition, reducing the application into smaller, less complex functional units, which improves scalability and reduces compute complexity. This approach raises

questions about the overhead, latency, and accuracy of breaking

down a possibly compact application into a composition of steps.

We validate our approach by proving that the overall computational

complexity of the architecture is actually reduced with a possible

marginal impact on output accuracy.

We aim to prove that dividing an $N$-input single context engine into multiple context engines decreases the total computational complexity. We start with a general representation of a context engine: $N$ inputs and a computational complexity order of $\alpha$, for a maximum computational overhead of $N^{\alpha}$. We divide the single engine into two stages: the first stage consists of $\frac{N}{A}$ engines, each with an arbitrary number of inputs $A$. The second stage takes the outputs of the first stage and gives the final output. The total complexity overhead is now $\frac{N}{A}A^{\alpha} + \left(\frac{N}{A}\right)^{\alpha}$. We look for the conditions where the two-stage system has a lower complexity than the single engine, where $A$ must be an integer and the number of context engines $\frac{N}{A}$ must be an integer. This means that $A$ must be between 2 and $\frac{N}{2}$. Requiring $N^{\alpha} > \frac{N}{A}A^{\alpha} + \left(\frac{N}{A}\right)^{\alpha}$ and dividing by $\frac{N}{A}A^{\alpha}$ yields the condition $\left(\frac{N}{A}\right)^{\alpha-1}\left(1 - \frac{1}{A^{\alpha}}\right) > 1$, whose left-hand side is the product of two terms. The minimum value for the first term is achieved when $A = \frac{N}{2}$ and results in $2^{\alpha-1}$. The minimum value for the second term is achieved when $A = 2$ and results in $1 - 2^{-\alpha}$. This product provides a lower bound for the result: $2^{\alpha-1}(1 - 2^{-\alpha}) = 2^{\alpha-1} - \frac{1}{2}$, which satisfies our inequality and provides another insight:

$$\left(\frac{N}{A}\right)^{\alpha-1}\left(1 - \frac{1}{A^{\alpha}}\right) > 2^{\alpha-1} - \frac{1}{2} > 1 \;\rightarrow\; \alpha > \log_2 3 \sim 1.6 \qquad (1)$$

This result states that for any system of order greater than 1.6, i.e.

a nonlinear system, any arbitrary division of the single engine

results in a decrease in computational complexity, with the

minimum achieved for a system of 2-input engines. While in

general, context-aware applications do not necessarily fit perfectly

into a system of two-input engines, as we reduce inputs into each

CE and increase the path from initial input to output context, we

reduce the overall complexity of the system.
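As a numerical check of the comparison above, the following sketch evaluates the single-stage cost $N^{\alpha}$ against the two-stage cost $\frac{N}{A}A^{\alpha} + \left(\frac{N}{A}\right)^{\alpha}$ for a few group sizes. The function names and the example values ($N = 16$, $\alpha = 2$) are illustrative assumptions, not taken from the paper.

```python
import math


def single_stage_cost(n_inputs: int, order: float) -> float:
    """Cost N^alpha of one engine consuming all N inputs."""
    return n_inputs ** order


def two_stage_cost(n_inputs: int, group_size: int, order: float) -> float:
    """Cost (N/A) * A^alpha + (N/A)^alpha: N/A first-stage engines with A
    inputs each, plus one second-stage engine consuming their N/A outputs."""
    if n_inputs % group_size != 0:
        raise ValueError("group_size must divide n_inputs")
    n_engines = n_inputs // group_size
    return n_engines * group_size ** order + n_engines ** order


N, ALPHA = 16, 2.0  # illustrative values only
for a in (2, 4, 8):  # integer group sizes between 2 and N/2
    print(a, two_stage_cost(N, a, ALPHA), single_stage_cost(N, ALPHA))

# Order above which Eq. (1) guarantees a reduction for any valid A.
threshold = math.log2(3)
print(round(threshold, 3))
```

For a linear system ($\alpha = 1$) the same functions show the opposite effect: `two_stage_cost(16, 2, 1.0)` exceeds `single_stage_cost(16, 1.0)`, which is consistent with the bound applying only to orders above $\log_2 3$.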

Figure 1. (left) The current state of the art: monolithic end-to-end applications. (right) Our implementation: functional units (context engines) are multi-input, single-output, and each context engine performs general statistical learning and publishes intermediate context for reuse.

Figure 2. Breakdown of a single-step application into lower-complexity equivalent reductions, with minimum complexity occurring with maximum division (right).

We now investigate the accuracy change between sequential and consolidated applications. We consider a general functional representation of data transformation from statistical learning: generating a model for data transformation as a polynomial of varying complexity and functional order, which can be solved to best-fit through techniques such as regression [14]. By providing the means to vary the inputs (ontology) and relationship (functional order), complex relationships between the inputs and output can be

represented and trained. We now express the error of a 2-stage, 2-input sequential engine, compared against the corresponding 4-input, single-stage engine, as a factor $\delta$, which creates a truncation error of $\delta \cdot i_1 \cdot i_2$, where $i_1, i_2$ are two inputs. We can also quantify the impact of input signal noise on the sequential context engine compared to the single-stage approach. We model each input with zero-mean additive white Gaussian noise: $x_i + w_i$, a common expression of sensor noise [19]. The resulting noise coefficients propagate through the application, compounding the truncation error: $\delta \cdot i_2 \cdot i_4 + \delta \cdot w_2 \cdot i_4 + \delta \cdot w_4 \cdot i_2 + \delta \cdot w_2 w_4$. The first term is the truncation error we previously quantified; the remaining terms are the scaled Gaussian values due to noise. The significance of the error terms is entirely dependent on the relationship between the cross-product input terms $i_2$ and $i_4$. However, from a system design perspective, simply selecting highly correlated input terms for context engines (the intuitive choice) mitigates truncation error, minimizing the impact of the missing cross-coefficient.
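The four-term noise expression follows from substituting the noisy inputs into the missing cross term. The expansion below is a worked step under the paper's additive model, using the same symbols ($\delta$, $i_2$, $i_4$, $w_2$, $w_4$):

```latex
\begin{aligned}
\delta\,(i_2 + w_2)(i_4 + w_4)
  &= \delta\, i_2 i_4 \;+\; \delta\, w_2 i_4 \;+\; \delta\, w_4 i_2 \;+\; \delta\, w_2 w_4
\end{aligned}
```

If one additionally assumes that the noise terms are independent of the inputs and of each other (an assumption the text does not state explicitly), the last three terms are zero-mean, so the expected error reduces to the truncation term $\delta\, i_2 i_4$ and noise contributes only to its variance.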

B. Generalized Data Transformation

Our approach, a modular multi-stage context engine, results in

more functional units (FU) per application. An important

consequence is that each FU is a simpler translation of input data to

a single output. This enables us to use general data transformation

in place of application-specific code. Thus, a context-aware

application can be created by specifying the inputs and output of

each FU alone, and allowing the data transformation algorithm to

incur the overhead of generating and training a model based on

input and output observations. We can leverage ontologies that exist

in the current state-of-the-art middleware, as ontologies provide a

specification for each variable. From a data standpoint, they

regulate inputs and outputs of applications. We can exploit this

ontological information for machine learning algorithms that

generate results based on the input/output space of each FU.

Matrix-based stochastic learning models express potential data

dependencies as a system of equations. Over time, observed input

and output data is gathered until the coefficients can be trained and

a model generated. However, complex relationships can exist

among the input data for an IoT application, and a purely linear

model may not be sufficient [14]. Several works [15] [16]

implement learning by considering higher orders and time

correlation as well. We leverage TESLA, a learning model

originally designed for solar forecasting, as our data translation

algorithm. It provides efficient model generation: $O(n^{\alpha})$, where $n$ represents the number of inputs and $\alpha$ represents the function order. The generic Taylor expansion is as follows:

$$\sum_{i=0}^{n} C_i x_i \ \text{(1st order)}, \qquad \sum_{i=0}^{n}\sum_{j=0}^{i} C_{ij} x_i x_j \ \text{(2nd order)}, \ \text{etc.} \qquad (2)$$

where $C_{ij}$ represents individual coefficients learned once observations are determined, and $x_0 = 1$ (a constant). The result is the equation $A \cdot x = B$, where $A$ is the matrix whose rows are the input observations, $x$ is the column vector of coefficients, and $B$ is the column vector of output observations corresponding to the rows of $A$; the system is solved by least squares estimation. A limitation of this model is that at least $m$ independent observations are required for training, where $m = n^{\alpha}$ for functional order $\alpha$, which is space-inefficient as the order grows. To use the model, the equation is simply evaluated using the learned coefficients and input context to produce the output context.
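To make the train-then-evaluate cycle concrete, here is a minimal sketch of a second-order polynomial model fitted by least squares. It is not the TESLA implementation: the helper names, the use of the normal equations with Gaussian elimination, and the synthetic training data are all assumptions chosen to keep the example dependency-free. The feature expansion follows the 2nd-order form of Eq. (2) with $x_0 = 1$.

```python
import random


def second_order_features(inputs):
    """Expand inputs into the 2nd-order terms x_i * x_j (j <= i), with x_0 = 1."""
    x = [1.0] + list(inputs)
    return [x[i] * x[j] for i in range(len(x)) for j in range(i + 1)]


def solve_linear_system(matrix, rhs):
    """Solve matrix * c = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[k]] for k, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    coeffs = [0.0] * n
    for r in range(n - 1, -1, -1):  # back-substitution
        tail = sum(aug[r][k] * coeffs[k] for k in range(r + 1, n))
        coeffs[r] = (aug[r][n] - tail) / aug[r][r]
    return coeffs


def fit_least_squares(observations, outputs):
    """Least-squares coefficients via the normal equations (A^T A) c = A^T B."""
    rows = [second_order_features(obs) for obs in observations]
    m = len(rows[0])
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(m)] for i in range(m)]
    atb = [sum(r[i] * y for r, y in zip(rows, outputs)) for i in range(m)]
    return solve_linear_system(ata, atb)


def predict(coeffs, inputs):
    """Evaluate the learned polynomial on a new input observation."""
    return sum(c * f for c, f in zip(coeffs, second_order_features(inputs)))


# Synthetic, noise-free training data from a known 2nd-order relationship.
rng = random.Random(0)
train_x = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(50)]
train_y = [1.0 + 2.0 * a + 3.0 * a * b for a, b in train_x]

coeffs = fit_least_squares(train_x, train_y)
print(round(predict(coeffs, (0.5, -0.25)), 3))
```

The normal equations are used here only for brevity; a production implementation would more likely rely on a numerically robust least-squares solver from a linear algebra library.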

We select TESLA because of its general formulation, versatility,

and applicability to context processing, but other statistical learning

approaches can also be used: Bayesian Networks [7], Hidden

Markov Models, and Artificial Neural Networks can all leverage

input/output spaces to generate conditional probability models that

define output context [15].

IV. SYSTEM IMPLEMENTATION

For our case study, we investigate an application that is useful to

two IoT domains: the smart grid and smart environments. User

occupancy and activity is used in these domains for plug load

prediction and actuation, respectively. We focus on residential

spaces, which 1) have not seen the same automation as commercial

and industrial sectors, and 2) represent hundreds of millions of

consumers with different behavior and small individual energy

contribution but a significant grid share. Our middleware presents a

unique opportunity: to train and learn the behavior of different

residences using the same infrastructure. This information can be

used as the intermediate context for both domains. Our application

aims to identify active user presence in a room from plug loads

using the context engine middleware.

Data Translation: In order to go from plug loads to user activity, we break up the problem into two steps: 1) determining appliance usage and 2) determining user activity based on 1). We design a context engine for each, with the outputs, respectively: {appliance-name: string; appliance-active: boolean} and {room-name: string; user-active: boolean}.

Context Engine Setup: Figure 3 illustrates both the sequential

context engine configuration (a) and the single-stage application (b)

to determine output context. From a set of n appliances and m

rooms, the first set of n context engines train and learn appliance

usage from the raw data. Each first-stage context engine is a single-

input, single-output (O(1)) engine consuming the raw power data

from its corresponding appliance as input, and trained on human-

observed binary output on appliance activity. The second stage

transforms the output of all the first-stage context engines (active

appliance usage) to a binary representation of whether or not a room

is being actively used. These second-stage context engines, with n

inputs, have a complexity of $O(n^{\alpha})$ for generating the output context, where $\alpha$ is the functional order. The single-stage, consolidated application represents the current state of the art, taking in the same input to produce the same output. It will be used for comparison of scalability, complexity, and accuracy. In this case, the single-stage application has a complexity of $O(n^{\alpha})$.

Using 1 Hz raw appliance power data from the REDD dataset [17] as input, we train each context engine with input and output traces. We use up to 172,800 samples (2 days’ worth), and test against 86,400 (1 day) samples.

Figure 3. (a) Sequential context engine and (b) consolidated application for room-level activity detection.

Results: Complexity: the sequential context engine consists of n

second-order TESLA context engines for the first stage, each

consuming one input (appliance raw data) and producing binary

appliance usage context. The second stage has m second-order

context engines, one for each room, with n inputs, one from each

first stage. The result is $n \cdot O(1)$ complexity for the first stage and $m \cdot O(n^2)$ for the second stage, and an overall complexity of $O(m \cdot n^2)$. The consolidated context engine has exactly the same configuration as the second stage alone and consequently, the same complexity, at $O(m \cdot n^2)$. However, even with the same complexity,

the sequential approach is advantageous in execution overhead. All

context engines iterate when a new input observation is recorded.

However, the second stage of the sequential context engine reacts

only to changes in binary input data, which is much less frequent

than the correspondingly subtle changes to the raw data. Table 1

below highlights the number of computations performed between

the single-stage and sequential context engines for the refrigerator:

The sequential application performs more individual computations

because there are more context engines, but it exhibits only 31% of

the latency of the single-stage context engine despite the additional

throughput. This is because of the nature of the computations: the

sequential context engine offloads the processing of raw data to the

low-overhead ($O(1)$) first stage. The single-stage context engine, however, has no choice but to perform over 73,000 $O(n^2)$ computations. The sequential application requires only 23% as many of the intensive $O(n^2)$ computations as the single-stage implementation,

and consequently, can complete the work 69% faster.

Table 1. Execution overhead for the context engines associated with the refrigerator

Refrigerator    Number of computations                     Total Latency
                1st Stage $O(1)$    2nd Stage $O(n^2)$
Sequential      73428               17376                  0.42 sec
Single-Stage    --                  73428                  1.34 sec
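The change-driven behavior of the second stage can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: the fixed power threshold stands in for the trained first-stage model, the trace is synthetic (not REDD data), and `count_computations` is a hypothetical helper. It shows why the expensive stage runs far less often than the raw sample rate.

```python
from typing import Iterable, List, Optional, Tuple


def count_computations(power_trace: Iterable[float],
                       threshold_w: float) -> Tuple[int, int]:
    """Return (first_stage_runs, second_stage_runs) for one appliance trace.

    The first stage runs on every raw sample; the second stage runs only
    when the first stage's binary output changes.
    """
    first_stage_runs = 0
    second_stage_runs = 0
    previous_active: Optional[bool] = None
    for power in power_trace:
        active = power > threshold_w   # stand-in for the trained first-stage model
        first_stage_runs += 1
        if active != previous_active:  # binary context changed: run second stage
            second_stage_runs += 1
            previous_active = active
    return first_stage_runs, second_stage_runs


# Synthetic trace: an appliance cycling between idle and active power draw.
trace: List[float] = [5.0] * 4 + [120.0, 118.0, 121.0] + [6.0] * 3 + [119.0] * 2
runs = count_computations(trace, threshold_w=50.0)
print(runs)
```

In this toy trace the second stage is triggered on the initial sample and on each of the three on/off transitions, while small fluctuations in the raw power values (for example 120.0 to 118.0) never reach it.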

Accuracy: While there is a reduction in compute overhead with our

approach, the accuracy of the sequential application and the single-

stage application must also be considered. We first investigate the

output prediction error for kitchen appliances in Table 2. Error is

calculated based on the percentage of instances where the model-

derived output differs from the expected output. These appliances

were trained with 10,000 unique observations culled from 2 days’

worth of data, and demonstrate generally very low error.

Table 2. First-stage error of appliance models for 2nd-order context engine

Appliance            Output Prediction Error (%)
Kitchen Outlets 2    32.8%
Kitchen Outlets 3    1.4%
Kitchen Outlets 4    0.6%
Microwave            4.0%
Oven 01              1.4%
Oven 02              1.4%
Refrigerator         0.8%
Stove                0.0%
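The error metric described above (the percentage of instances where the model-derived output differs from the expected output) can be written as a short helper. The function name and the sample data are hypothetical and serve only to illustrate the calculation for binary context outputs.

```python
from typing import Sequence


def prediction_error_percent(predicted: Sequence[bool],
                             expected: Sequence[bool]) -> float:
    """Percentage of instances where the predicted output differs from the
    expected output."""
    if len(predicted) != len(expected) or not expected:
        raise ValueError("sequences must be non-empty and of equal length")
    mismatches = sum(p != e for p, e in zip(predicted, expected))
    return 100.0 * mismatches / len(expected)


# Made-up outputs for five observations of one appliance.
predicted = [True, True, False, False, True]
expected = [True, False, False, False, True]
error = prediction_error_percent(predicted, expected)
print(error)
```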

We now investigate the output accuracy of both implementations

of the same application. An important consideration is that the

sequential application contains more opportunities for training, as

each context engine can be trained individually. Conversely, the

consolidated application has only one context engine (see Figure 3(b)). If we train each context engine in both applications with the same number of observations, the sequential application will consume and train on 9x the data of the single-stage application. Alternatively, we can train the sequential application such that the total number of trained observations matches the consolidated application; that is, provide each of the 9 context engines with 1/9 of the observations given to the single-stage application. We investigated both cases: training each context

engine with the same number of observations (row 1) and training

the overall application with the same number of observations (row

2). Table 3 compares the results of the two engines for the kitchen:

Table 3. Accuracy comparison between sequential and single-stage context engines

Application Type                           Output Prediction Error
Sequential (same # obs/context engine)     2.4%
Sequential (same total # observations)     34.3%
Single-Stage                               17.4%

The preprocessing in the first stage and the discretizing of raw

appliance data resulted in a significantly reduced error in the second

stage, reducing output error by about 15 percentage points (from 17.4% to 2.4% in Table 3). However, when

relying upon the same total number of observations, the single-stage

application provided better accuracy, as the sequential application

undersampled and aggregated noise and error in the second stage.

V. CONCLUSION

In this work, we establish the motivation for and design a novel

middleware architecture for the Internet of Things. Current state-of-the-art implementations are ultimately monolithic and application-specific [1]. This reduces the overall scalability and increases

compute complexity of applications. In contrast, we design a

modular framework that exploits common computational processes

between applications, exposing shareable intermediate context. We

increase scalability while reducing computational complexity by

decomposing an application into a sequence of functional units

driven by general data transformation. We implement this

transformation using a statistical learning technique that exploits

ontological data. We test our approach with residential activity

detection, demonstrating 69% improvement in execution overhead

and comparable accuracy while also exposing the intermediate

context variables for reuse. Overall, the result is a middleware

architecture for the IoT that leverages machine-learning intelligence

and improves scalability and complexity of context-aware designs.

REFERENCES

[1] C. Perera et al., "Context Aware Computing for the Internet of Things: A Survey," IEEE Communications, Surveys, & Tutorials, pp. 414-454, 2013.
[2] M. Friedewald and O. Raabe, "Ubiquitous computing: an overview of technology impacts," Telematics and Informatics, vol. 28, pp. 55-65, 2011.
[3] J. Hammer and T. Yan, "Poster: A virtual Sensing Framework for Mobile Phones," in Proceedings of MobiSys, 2014.
[4] J.-H. Hong et al., "Conamsn: A context-aware messenger using dynamic bayesian networks with wearable sensors," ESA, vol. 37, no. 6, pp. 4680-4686, 2010.
[5] S. K. Madhu et al., "An Ontology-based Framework for Context-Aware Adaptive E-Learning System," in ICCI 2013.
[6] H. Chen, T. Finin and A. Joshi, "An Ontology for Context-Aware Pervasive Environments," The Knowledge Eng. Review, vol. 18, no. 3, pp. 197-207, 2004.
[7] S. Lee and K. C. Lee, "Context-prediction performance by a dynamic bayesian network: Emphasis on location prediction in ubiquitous decision support environment," Expert Systems with Applications, vol. 39, no. 5, pp. 4908-4914, 2012.
[8] M. Rudary et al., "Adaptive cognitive orthotics: combining reinforcement learning and constraint-based temporal reasoning," in ICML 2004.
[9] S. Staab and R. Studer, Handbook of Ontologies, Springer, 2010.
[10] T. Gu, X. Wang, H. Pung and D. Zhang, "An Ontology-based Context Model in Intelligent Environments," in CNDS 2004.
[11] M. Nebeling et al., "XCML: providing context-aware language ext. for specification of multi-device web applications," WWW, vol. 15.4, pp. 447-81, 2012.
[12] P. Rashidi et al., "Discovering activities to recognize and track in a smart environment," IEEE TKDE, vol. 23, no. 4, pp. 527-539, 2011.
[13] S. Bandyopadhyay et al., "A Survey of Middleware for Internet of Things," RTWMN '11, vol. 162, pp. 288-296, 2011.
[14] B. O. Akyurek, A. S. Akyurek, J. Kleissl and T. Rosing, "TESLA: Taylor expanded solar analog forecasting," in IEEE SmartGridComm 14, 2014.
[15] J. Gubbi et al., "Internet of Things (IoT): A vision, architectural elements, and future directions," Future Generation Computer Systems, vol. 29.7, pp. 1645-60, 2013.
[16] A. Katasonov et al., "Smart Semantic Middleware for the Internet of Things," in ICINCO-ICSO, 2008.
[17] J. Kolter and M. Johnson, "REDD: A public data set for energy disaggregation research," in SustKDD '11, 2011.
