Big Data and How BI Got Its Groove Back



FOR DISCLOSURE AND FOOTNOTE INFORMATION, REFER TO THE JMP FACTS AND DISCLOSURES SECTION


Industry Overview - Software | November 15, 2011


Greg McDowell

[email protected]

(415) 835-3934

Patrick Walravens

[email protected]

(415) 835-8943

Peter Lowry

[email protected]

(415) 869-4418


TABLE OF CONTENTS

Executive Summary

Part I: An Introduction to Big Data

A Big Data Primer

Big Data Market Opportunity

The Data Management Landscape

The Resurgence of Business Intelligence

Big Data Stock Performance

Part II: Initiation Summaries

Part III: Privately Held Companies in the Big Data Space (hard copy only)

JMP Securities Software Team

JMP Facts and Disclosures


EXECUTIVE SUMMARY

We believe the proliferation of data is one of the most disruptive forces in technology today. Despite

accounting for a small portion of industry revenues today, we believe "Big Data" is poised for rapid

growth. The purpose of this report is to help investors better understand "Big Data" and the market

opportunity ahead.

Part I: An Introduction to Big Data and the Resurgence of Business Intelligence

In Part I of this report, we define and size the market opportunities created by Big Data. We define Big

Data as data sets of extreme volume and extreme variety. In 2011, we estimate that "Big Data" is a

$9.1 billion market opportunity, representing only 2% of the $407 billion spent on software, storage, and

servers. We refer to this software, storage, and server spending as enterprise IT spending. Ten years

ago, spending on Big Data was minimal due to the fact that data sets (volume) were much smaller, data

had less variety, and the velocity of data flowing into organizations was much slower. Over the next ten

years, we expect Big Data-related computing to increase to $86.4 billion, representing 11% of all

enterprise IT spending and a 10-year CAGR of 25%. The key growth driver of Big Data is the

proliferation of data. This proliferation of data has caused enterprises to need new tools and processes

to collect data (both structured and unstructured) and also to store, manage, manipulate, analyze,

aggregate, combine, and integrate data.

In Part I we also discuss the resurgence of the Business Intelligence ("BI") market. We believe the

business intelligence landscape is about to go through a major sea change that will radically transform

the landscape and the way the industry thinks about analytics. In our view, the two primary reasons for

the sea change are Big Data and the consumerization of Enterprise BI driven by trends such as mobile

BI. With respect to Big Data, we believe it has become very easy to collect data, but difficult to make

sense of that data using traditional BI tools. In other words, as the useful life of information has

decreased, so has the utility of traditional BI tools which have historically been very backwards looking.

Part II: Key Publicly Traded Companies in the Big Data Space

In Part II of this report, we are initiating coverage of the infrastructure software group with a relatively

constructive viewpoint. In the current volatile environment for stocks, we believe long-term investors

should focus on the positive implications of emerging secular trends such as Big Data that could create

significant profit opportunities over the next few years. We are recommending software companies with

solid but flexible operating strategies that, in our opinion, will be primary beneficiaries of the Big Data

trend.

We are initiating coverage on six infrastructure software companies: MicroStrategy Inc. (MSTR),

Progress Software Corp. (PRGS), Qlik Technologies (QLIK), Quest Software (QSFT), Teradata

Corporation (TDC), and Tibco Software Inc. (TIBX).

We are initiating coverage as follows:

• MicroStrategy with a Market Outperform rating and $140 price target.

• Progress Software with a Market Perform rating.

• Qlik Technologies ("QlikTech") with a Market Outperform rating and a $35 price target.

• Quest Software with a Market Perform rating.

• Teradata with a Market Outperform rating and $63 price target.

• TIBCO Software with a Market Outperform rating and a $33 price target.

We also discuss the Big Data strategies for eight other publicly traded companies.

Part III: Privately Held Companies in the Big Data Space (Available in Hard Copy Only)

In Part III of this report, we provide profiles of 100 leading private software companies that are

leveraged to benefit from the Big Data trend. Many of these companies approach the Big Data market

from different angles, including the NoSQL movement, in-memory databases, columnar databases,

Hadoop-related technologies, data grid/data cache solutions, solutions related to open source R, data

visualization, predictive analytics, and real-time dashboards. Our favorite private companies include

Cloudera, Splunk, Tableau Software, and Talend.


WHAT IS BIG DATA?

Big Data is one of the biggest trends in technology as enterprises increasingly need to manage the

explosion of data caused by trends like cloud computing, the rise of mobility, globalization, and social

media. This proliferation of data has caused enterprises to need new tools and processes to collect

data (both structured and unstructured), store data, manage data, manipulate data, analyze data, and

aggregate, combine, and integrate data.

There is no set definition for Big Data although many third-party firms have provided their perspective.

We define Big Data as data sets of extreme volume and extreme variety. In Figure 1 below, we illustrate

how five different, influential firms define Big Data.

FIGURE 1: The Many Different Definitions of Big Data

IDC

"Big data technologies describe a new generation of technologies and architectures,

designed to economically extract value from very large volumes of a wide variety of data, by

enabling high-velocity capture, discovery, and/or analysis."

Forrester

"Big data: techniques and technologies that make handling data at extreme scale economical."

451 Group

"Big data is a term applied to data sets that are large, complex or dynamic (or a combination

thereof) and for which there is a requirement to capture, manage and process the data set in

its entirety, such that it is not possible to process the data using traditional software tools

and analytic techniques within tolerable time frames."

McKinsey Global Institute

"Big data" refers to datasets whose size is beyond the ability of typical database software

tools to capture, store, manage, and analyze. This definition is intentionally subjective and

incorporates a moving definition of how big a dataset needs to be in order to be considered

big data (i.e., we don't define big data in terms of being larger than a certain number of

terabytes (thousands of gigabytes)). We assume that, as technology advances over time,

the size of datasets that qualify as big data will also increase."

Gartner

"When business leaders or data management professionals talk about big data, they often

emphasize volume, with minimal consideration of velocity, variety and complexity – the other

aspects of quantification: Velocity involves streams of data, structured record creation, and

availability for access and delivery. Velocity means both how fast data is being produced

and how fast the data must be processed to meet demand. Variety includes tabular data

(databases), hierarchical data, documents, e-mail, metering data, video, image, audio, stock

ticker data, financial transactions and more. Complexity means that different standards,

domain rules and even storage formats can exist with each asset type. An information

management system for media cannot have only one video solution."

Source: IDC, Forrester, 451 Group, McKinsey and Gartner

We think the third-party analyst firms have done a commendable job in their attempts to define Big

Data. We do point out, however, that both vendors and industry analysts have latched onto the concept

of the three V's: Volume, Velocity and Variety. We are also seeing firms add additional V's such as

Variability and Value.


Many of these firms have also provided some useful illustrations on Big Data, as shown in Figure 2

below.

FIGURE 2: Forrester's Four V's of Extreme Scale

Source: http://blogs.forrester.com/brian_hopkins/11-08-29-big_data_brewer_and_a_couple_of_webinars

Gartner takes a similar approach with the "V's", but also adds Complexity, as shown in Figure 3 below.

FIGURE 3: Gartner's Big Data Graph

Source: Gartner


Just how popular is the term Big Data becoming? A quick look at Google Trends Search Volume Index

reveals the popularity of the term, as shown in Figure 4 below:

FIGURE 4: Google Trends of Term "Big Data"

Source: Google Trends

We compared the term "Big Data" to "Cloud Computing" and interestingly, the trajectory of "Cloud

Computing" in 2008 was very similar. Investors who bought publicly-traded companies leveraged to the

cloud-computing trend in 2008 have done well, as evidenced by the price performance of stocks such

as RightNow Technologies (RNOW, MP, $37 PT, Walravens), salesforce.com (CRM, MO, $170 PT),

and VMware (VMW, MO, $123 PT).

A look at the job trend graph on Indeed.com illustrates a similar trend as shown in Figure 5 below:

FIGURE 5: Job Trends from Indeed.com for Term "Big Data"

Source: Indeed.com


WHAT ARE THE CAUSES OF BIG DATA?

Why is data proliferating at such a rapid rate? There are numerous reasons, including cloud computing,

mobile phones, social media, machine data, web logs, RFID tags, and sensor networks, among others.

According to a recent report by McKinsey, 60% of the world's population, or more than four billion

people, have a mobile phone. There are 30 million networked sensor nodes in the transportation,

automotive, industrial, utilities, and retail sectors (increasing at a rate of more than 30% a year), and 30

billion pieces of content shared on Facebook every month. Figure 6 below outlines just a few

examples.

FIGURE 6: The Growth of Business Transaction and Web Application Data

Source: Informatica


One reason the concept of Big Data even exists is because the world's technological installed capacity

to store information has increased by a factor of 113 in a 20-year period (1986-2007), as shown in

Figure 7 below. In an excellent article by Martin Hilbert and Priscilla Lopez published in Science

magazine, the authors estimated that the total amount of information grew from 2.6 optimally-

compressed exabytes in 1986 to 295 optimally-compressed exabytes in 2007. The authors note that

"piling up the imagined 404 billion CD-ROM from 2007 would create a stack from the earth to the moon

and a quarter of this distance beyond (with 1.2 mm thickness per CD)." In a short span of 20 years, we have moved from an almost 100% analog-driven world (books, newsprint, x-rays, etc.) in 1986 to a primarily digital-driven world in 2007.

FIGURE 7: World’s Technological Installed Capacity to Store Information

Source: Published Online 10 February 2011, Science 1 April 2011: Vol. 332 no. 6025 pp. 60-65, DOI: 10.1126/science.1200970

Article Title: The World's Technological Capacity to Store, Communicate, and Compute Information

Article Authors: Martin Hilbert and Priscilla Lopez
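The headline figures above are easy to sanity-check. The short snippet below is our own arithmetic rather than anything from the article; it uses the numbers quoted above plus the commonly cited average Earth-Moon distance of roughly 384,400 km.

    # Sanity check of the Hilbert and Lopez figures quoted above.
    growth_factor = 295 / 2.6                  # optimally compressed exabytes, 2007 vs. 1986
    stack_km = 404e9 * 1.2 / 1e6               # 404 billion CDs at 1.2 mm each, converted to km
    earth_moon_km = 384400                     # average Earth-Moon distance (standard figure)

    print(round(growth_factor))                # ~113, the storage growth factor cited above
    print(round(stack_km))                     # ~484,800 km of stacked CDs
    print(round(stack_km / earth_moon_km, 2))  # ~1.26, i.e., "a quarter of this distance beyond"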


We like Figure 8 below because it shows, in MIPS (million instructions per second), the world's

technological installed capacity to compute information on general-purpose computers. As shown, we

have gone from a world in 1986, when pocket calculators accounted for 41% of installed general-purpose computing capacity, to 2007, when they accounted for less than 1%.

FIGURE 8: World’s Technological Installed Capacity to Compute Information on General-purpose Computers, in MIPS

Source: Published Online 10 February 2011, Science 1 April 2011: Vol. 332 no. 6025 pp. 60-65, DOI: 10.1126/science.1200970

Article Title: The World's Technological Capacity to Store, Communicate, and Compute Information

Article Authors: Martin Hilbert and Priscilla Lopez


Just how fast is this "digital universe" expected to grow looking ahead? According to IDC, as shown in

Figure 9 below, in 2009 there were nearly 800,000 petabytes (a petabyte is a million gigabytes) in the digital universe. In 2011, the amount of information created and replicated will surpass 1.8 zettabytes (1.8 trillion gigabytes), having grown by a factor of nine in just five years. By 2020, IDC expects the digital universe to reach 35 zettabytes, a 44-fold increase from 2009 and a CAGR of roughly 40%.

FIGURE 9: IDC Figure on the Growth of the Digital Universe

Source: IDC

The explosion of data is causing new firms and technologies to emerge. Our favorite private company

example is Splunk. Splunk is the engine for machine data. It is software which collects, indexes and

harnesses any machine data generated by an organization’s IT systems and infrastructure - physical,

virtual and in the cloud. According to Splunk, machine data is unstructured, massive in scale and

contains a categorical record of all transactions, systems, applications, user activities, security threats

and fraudulent activity. Splunk can be used in a variety of use cases, including application

management, security and compliance, infrastructure and IT Operations Management, and business

and web analytics. Splunk has over 2,900 licensed customers in 70 countries, including almost half of the Fortune 100. Interestingly, beginning with Version 4, Splunk uses MapReduce to retrieve and analyze massive datasets.


BIG DATA MARKET SIZE

In 2011, we estimate that "Big Data" represents a $9.1 billion market opportunity, representing

approximately 2% of the $407 billion spent on software, storage, and servers. We refer to this software,

storage, and server spending as enterprise IT spending. Ten years ago, spending on Big Data was

minimal, due to the fact that data sets were much smaller. Over the next ten years, we expect Big

Data-related computing to increase to $86.4 billion, representing 11% of all enterprise IT spending.

FIGURE 10: Big Data Estimates: 2011-2021

Source: JMP Estimates


Based on our estimates and IDC estimates, we project that the total Enterprise IT market will grow

at around 5% annually over the next 10 years, reaching $676 billion by 2021.

FIGURE 11: Total Enterprise IT Spending: 2011-2021

Source: JMP Estimates


Because Big Data is growing much faster than the overall enterprise IT market, it is becoming a larger share of enterprise IT spending. As shown in Figure 12 below, we expect Big Data to grow from $9.1

billion in 2011 to $86.4 billion in 2021, a compound annual growth rate of 25%.

FIGURE 12: Big Data Estimates: 2011-2021

Source: JMP Estimates

We arrive at these estimates by making certain assumptions of different components of the Big Data

market. In the next section, we break down the different components of the market.


We believe the Big Data market is composed primarily of three sub-segments: Business

Analytics, Storage, and Servers. In this section we define the total size of these markets and discuss

penetration rates of Big Data in each of these markets.

Figure 13 below highlights the total size of these markets, based on IDC and JMP estimates. As

shown, the total market is around $131.4 billion in 2011 growing to $238.4 billion in 2021.

FIGURE 13: Total Market Size of Business Analytics, Storage, and Servers ($ in billions)

                      2011   2012   2013   2014   2015   2016   2017   2018   2019   2020   2021
Business Analytics    32.0   35.3   39.0   43.0   47.2   52.2   57.8   64.4   72.2   81.3   92.2
  YoY Growth                  10%    10%    10%    10%    10%    11%    11%    12%    13%    13%
Storage               45.2   47.5   49.9   50.8   53.3   56.0   58.8   61.7   64.8   68.1   71.5
  YoY Growth                   5%     5%     2%     5%     5%     5%     5%     5%     5%     5%
Servers               54.1   54.6   55.0   55.0   55.7   58.5   61.4   64.5   67.7   71.1   74.7
  YoY Growth                   1%     1%     0%     1%     5%     5%     5%     5%     5%     5%
Total Market Size    131.4  137.4  143.9  148.8  156.3  166.7  178.1  190.7  204.7  220.5  238.4
  YoY Growth                   5%     5%     3%     5%     7%     7%     7%     7%     8%     8%

Source: IDC and JMP Securities

The segment that requires the most explanation, in our opinion, is the Business Analytics market. IDC

defines the Business Analytics Market as the "combination of the data warehouse (DW) platform

software with performance management and analytic applications and business intelligence (BI) and

analytic tools." Figure 13 below provides a taxonomy of the Business Analytics market. As shown,

there are three overall categories of the business analytics market: BI and Analytics Tools, Data

Warehousing Platform software, and Analytic Applications. IDC expects these three markets to grow at

a 2010-2015 CAGR of 9.2%, 9.8%, and 7.9%, respectively, with the total Business Analytics Market

representing a CAGR of 8.9%. The Business Analytics market is expected to grow from $30.7 billion in

2011 to $43.1 billion in 2015.

FIGURE 14: IDC's Business Analytics Taxonomy, 2011

Source: IDC


The key question we had to ask ourselves in trying to size the Big Data market was "What could Big

Data's penetration be within each of the three main sub-segments of the market: Business Analytics,

Storage, and Servers?" In other words, what percentage of the total market is comprised of projects

that can fall under the Big Data definition?

As a baseline, we have assumed that around 7% of the size of the Business Analytics, Storage and

Servers market in 2011 meets the definition of Big Data. We assume that by 2021, 36% of the

Business Analytics, Storage, and Servers market will meet the definition of Big Data. This leads to the

breakdown of the $9.1 billion estimate in 2011 and the $86.4 billion estimate in 2021, as shown in

Figure 15 below.

FIGURE 15: Estimated Big Data Spending Within Business Analytics, Storage, and Servers ($ in billions)

                      2011   2012   2013   2014   2015   2016   2017   2018   2019   2020   2021
Business Analytics    $2.4   $3.2   $4.3   $5.6   $7.4   $9.7  $12.8  $16.8  $22.1  $29.1  $38.3
  YoY Growth                  33%    33%    32%    31%    32%    32%    32%    32%    32%    32%
Storage               $3.1   $3.9   $5.1   $6.3   $8.0  $10.3  $13.2  $15.2  $17.6  $20.3  $23.5
  YoY Growth                  28%    28%    24%    28%    28%    28%    16%    16%    16%    16%
Servers               $3.6   $4.5   $5.6   $6.8   $8.4  $10.8  $13.8  $16.0  $18.5  $21.3  $24.6
  YoY Growth                  24%    24%    22%    24%    28%    28%    16%    16%    16%    16%
Big Data Total        $9.1  $11.6  $14.9  $18.7  $23.9  $30.8  $39.8  $48.0  $58.2  $70.7  $86.4
  YoY Growth                  28%    28%    26%    27%    29%    29%    21%    21%    22%    22%

Source: IDC and JMP Securities
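The arithmetic behind Figures 13 and 15 is straightforward: Big Data spending is the sub-segment market size multiplied by an assumed penetration rate. The sketch below is a simplified version of that model rather than our segment-by-segment forecast; the 7% and 36% penetration endpoints and the market totals come from the figures above, while the smooth market growth path and linear penetration ramp between those endpoints are simplifying assumptions, so the yearly output only approximates Figure 15.

    # Simplified Big Data sizing model: spend = total market x assumed penetration.
    # Endpoints are taken from the report; the interpolation between them is assumed.
    market_2011, market_2021 = 131.4, 238.4   # Business Analytics + Storage + Servers, $B
    pen_2011, pen_2021 = 0.07, 0.36           # assumed Big Data penetration of that market

    for i in range(11):
        year = 2011 + i
        t = i / 10.0
        market = market_2011 * (market_2021 / market_2011) ** t   # smooth market growth
        penetration = pen_2011 + (pen_2021 - pen_2011) * t        # linear penetration ramp
        print(year, round(market * penetration, 1))               # ~9.2 in 2011, ~85.8 in 2021

    cagr = (86.4 / 9.1) ** (1 / 10.0) - 1       # implied growth using the report's endpoints
    print("Implied 10-year CAGR: %.0f%%" % (cagr * 100))          # ~25%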


WHAT TECH LAYERS ARE IMPACTED BY BIG DATA?

The Big Data movement will impact every layer of the technology stack, in our opinion, from servers to

storage to software. In this report we have chosen to focus on two specific layers which we believe will

be most impacted by Big Data, in our opinion: the Data Management layer and the Business

Intelligence layer. In the next two sections we drill deeper into each level.

DATA MANAGEMENT LAYER

In this section we discuss the Data Management layer. We first take a brief look at the entire data

management landscape, and then drill down into two specific areas of the landscape: Hadoop and the

NoSQL movement.

The data management landscape is fragmenting to handle Big Data, in our opinion. Matt Aslett at The

451 Group produced the graph below, which illustrates how new technologies are emerging to handle the Big Data trend. Our key takeaways from this graph are twofold. First and foremost, it is clear, in our

opinion, that there will not be a single "data management" approach to handling the proliferation of data.

Second, the traditional players in the data management space will still be significant players, but will

increasingly have to compete against upstarts that approach the Big Data opportunity with new

technological approaches.

FIGURE 16: Data Management Landscape According to 451 Group

Source: http://blogs.the451group.com/information_management/2011/04/15/nosql-newsql-and-beyond/

One technology in the above graph that we would like to highlight is Hadoop. Hadoop and Big Data are

often used in the same breath. While we contend that Big Data is a lot more than just Hadoop, it is

useful to understand what Hadoop is, in order to have a better appreciation for the Big Data movement.


The Apache Hadoop website describes the project as follows: "The Apache Hadoop software library is

a framework that allows for the distributed processing of large data sets across clusters of computers

using a simple programming model. It is designed to scale up from single servers to thousands of

machines, each offering local computation and storage. Rather than rely on hardware to deliver high-

availability, the library itself is designed to detect and handle failures at the application layer, so

delivering a highly-available service on top of a cluster of computers, each of which may be prone to

failures."

Like the term "Big Data", Hadoop is increasingly popular, as shown in Figure 17 below.

FIGURE 17: Google Trends of Term "Hadoop"

Source: Google Trends

A look at the job trend graph on Indeed.com illustrates a similar trend as shown in Figure 18 below:

FIGURE 18: Job Trends from Indeed.com for Term "Hadoop"

Source: Indeed.com


Hadoop also includes a number of subprojects, as shown in Figure 19 below.

FIGURE 19: Hadoop Subprojects and Related Projects at Apache

The project includes these subprojects:

Hadoop Common: The common utilities that support the other Hadoop subprojects.

Hadoop Distributed File System (HDFS™): A distributed file system that provides high-

throughput access to application data.

Hadoop MapReduce: A software framework for distributed processing of large data sets on

compute clusters.

Other Hadoop-related projects at Apache include:

Avro™: A data serialization system.

Cassandra™: A scalable multi-master database with no single points of failure.

Chukwa™: A data collection system for managing large distributed systems.

HBase™: A scalable, distributed database that supports structured data storage for large tables.

Hive™: A data warehouse infrastructure that provides data summarization and ad hoc querying.

Mahout™: A Scalable machine learning and data mining library.

Pig™: A high-level data-flow language and execution framework for parallel computation.

ZooKeeper™: A high-performance coordination service for distributed applications.

Source: http://hadoop.apache.org/

There are a number of private companies producing commercial offerings around the Hadoop ecosystem. Private companies such as Appistry, Cloudera, DataStax, Hortonworks, and MapR Technologies are all worth watching and are highlighted in the private company section of this report.

In the next section, we provide real-world case studies of how enterprises are using Hadoop and Hadoop-related technologies.


HADOOP CASE STUDIES

Hadoop is being used as an analysis tool in a wide array of business situations and industries. To best

understand Hadoop, in the section below we provide 12 case studies of how Hadoop is being used in

the real world.

AOL (AOL, Not Covered)

AOL Advertising runs one of the largest online ad serving operations, serving billions of impressions

each month to hundreds of millions of people. The company wanted to improve how it targeted its ads

to each person by mining large amounts of information about users. With Hadoop, AOL is able to serve

up ads for services near where a user is located or ads targeted toward a user's interest by generating

hundreds of millions of user profiles and making optimized decisions about real-time ad placement.

Groupon

Data is one of Groupon’s most strategic assets. Groupon relies on information from both vendors and

customers to make daily deal transactions run smoothly. Groupon realized that it needed better ways to

organize and make sense of the data generated by their massive user base for long-term usage.

Groupon uses Hadoop as a staging area for all of their extreme data and implemented Hadoop in such

a way that it can feed data sets that it has sifted into relational database frameworks designed to

simplify access to key customer and business-focused data. This solution allows Groupon to take

advantage of the ease of scale of the system and ultimately to be prepared for future growth while

consistently gaining new insights into its customers and business.

IBM (IBM, Not Covered)

In February 2011, IBM's Watson computer competed against and defeated Jeopardy! Champions Brad

Rutter and Ken Jennings on the popular game show. Watson used a complex array of natural language

processing, semantic analysis, information retrieval, automated reasoning and machine learning to

evaluate, understand and answer the questions. It had to determine verbs, nouns, objects and

moreover, nuances in the English language, and then look for the best answer. This required tasks to

be executed in parallel in order to answer the questions quickly enough to be competitive with the

human contestants. Hadoop was used to create Watson’s “brain,” or the database of knowledge, and

facilitate preprocessing of enormously large volumes of data in milliseconds. Watson is able to process

90 trillion operations (teraflops) per second and depends on 200 million pages of content and 500

gigabytes of preprocessed information to answer Jeopardy questions. That huge catalog of documents

has to be indexed so that Watson can answer the questions within the 3-second time limit. Watson

accumulated $77,147 in winnings and won the challenge.

LinkedIn (LNKD, Not Covered)

Many of LinkedIn’s products are critically dependent on computationally intensive data mining

algorithms. Examples of these include modules like People You May Know, Viewers of This Profile

Also Viewed, InMaps and much of the Job matching functionality that they provide to people who post

jobs on the site. Using Hadoop, LinkedIn crunches 120 billion relationships per day and scores these

relationships using a statistical model to determine the probability that a user may know another

LinkedIn member, for example. By using Hadoop, combined with other technologies, LinkedIn is able to

blend large-scale data computation with a high volume, low latency site serving to provide extremely

accurate and relevant recommendations and information to LinkedIn users.

Orbitz (OWW, Not Covered)

Orbitz generates hundreds of gigabytes of log data from web traffic each day. Its challenge was that it

was expensive and difficult to use existing data and infrastructure for storing and processing this data.

Hadoop was selected to provide a solution to the problem of long-term storage and the processing of

these large quantities of unstructured or semi-structured data. With the ability to store and analyze

more data, Orbitz has used Hadoop to automate the classification process of the data to optimize their

hotel rankings, increase bookings, and measure and track website performance. Hadoop was much

more scalable than Orbitz's existing solution and showed a fourfold improvement in processing time.


Wal-Mart (WMT, Not Covered)

Hadoop is part of Wal-Mart's strategy to analyze large amounts of data to better compete against online

retailers including Amazon.com. With the increasing role that social networking sites such as Facebook

and Twitter are playing in online shopping, Wal-Mart is also looking to glean insights into what

consumers want. Wal-Mart uses Hadoop in its keyword campaigns to drive traffic from search engines

to Walmart.com. The software collects information about millions of keywords and then comes up with

optimal bids for each word. It also allows them to create language models so the site can return more

relevant product results when a user searches for a specific product or an item based on that user's tweets or Facebook posts.

Tennessee Valley Authority (TVA)

The Tennessee Valley Authority ("TVA") is a federally-owned corporation in the United States that

provides flood control, electricity generation, and economic development in the Tennessee Valley. The

TVA was selected to collect data from phasor measurement unit ("PMU") devices on behalf of the North

American Electric Reliability Corporation ("NERC") to help ensure the reliability of the bulk power

system in North America. PMU data includes voltage, current, frequency and location data, and is

considered part of the measurement data for the generation and transmission portion of the so-called

“smart grid”. It uses smart-grid field devices to collect data on its power-transmission lines and facilities

across the country. These sensors send in data at a rate of 30 times per second and the rate of

incoming PMU data was growing very quickly with more and more PMU devices coming online

regularly. The TVA was faced with the problem of how to reliably store this data and make it available

for use. Hadoop was selected because it solved their storage issues and provided a robust computing

platform to analyze the data. It also allowed them to employ commodity hardware and open source

software at a fraction of the price of proprietary systems to achieve a much more manageable

expenditure curve as its repository grows.

Rapleaf

Rapleaf helps businesses create more personalized experiences for their customers by providing them

with useful information about each customer, such as age, gender, location and interests via their

Personalization API. Businesses leverage this insight to better understand their customers in order to

personalize deals and offers, show them more relevant content and give them a better experience

online and off. Rapleaf has a vast amount of consumer data which includes over a billion email

addresses and terabytes of data. Hadoop has allowed Rapleaf to manage and work with this data at scale much more easily than its previous RDBMS systems. The company has implemented a batch-oriented

process that allows them to ingest and normalize raw data from numerous sources, analyze it and then

package the data into easily-served objects.

Crossbow

Crossbow is an open-source, Hadoop-enabled software pipeline for quickly, accurately, and cheaply

analyzing human genomes in the cloud. While human genomes are about 99.9% identical, discovering

differences between genomes is the key to understanding many diseases, including how to treat them.

While sequencing has undoubtedly become an important and ubiquitous tool, the rapid improvements in

sequencing technology have created a “firehose” problem of how to store and analyze the huge volume

of DNA sequence data being generated in a short period of time. Presently, the process of scanning

and mapping generates about 100GB of compressed data (read sequences and associated quality

scores) for one human genome. Crossbow combines one of the fastest sequence alignment

algorithms, Bowtie, with a very accurate genotyping algorithm, SoapSNP, within Hadoop to distribute

and accelerate the computation. The pipeline can accurately analyze an entire genome in one day on a

10-node local cluster or in about three hours for less than $100, using a 40-node, 320-core cluster

rented from Amazon’s (AMZN, NC) EC2 utility computing service. An evaluation against a “gold standard” of known differences within an individual shows Crossbow is better than 99% accurate at identifying differences between human genomes. Crossbow will enable this computational analysis

without requiring researchers to own or maintain their own computer infrastructure.

Bank of America (BAC, Market Perform, Covered By David Trone)

With Hadoop, Bank of America has been able to analyze billions of records to gain a better

understanding of the impact of new and existing financial products. The bank can now examine things

like credit and operational risk of products across different lines of business including home loans,

insurance, and online banking.


Disney (DIS, Not Covered)

Disney was faced with the challenge of what to do with the increasing amount of data collected from

business operations and customer transactions, along with unstructured data created by social media and its various web properties (e.g., ESPN and ABC). Disney's Technology Shared Service Group uses

Hadoop as a cost-effective way to analyze and correlate information from all of its different businesses

including theme-park attendance, reservations at resort hotels, purchases from Disney stores and

viewership of Disney's cable TV programming.

General Electric (GE, Not Covered)

GE is running several use cases on their Hadoop cluster, which gave them deeper analytic capabilities

and insights into their business. The marketing and communications teams can assess how the public

perceives the company through sentiment analysis. It uses Hadoop to mine text such as updates on

Facebook and Twitter along with news reports and other information on the Internet to understand, with

80-percent accuracy, how consumers feel about GE and its various divisions. They have also built a

recommendations engine for their intranet allowing them to display targeted press releases to each user

based on their job function, user profile and prior visits to the site. Finally, Hadoop enables them to

work with several types of remote monitoring and diagnostic data from their energy and wind business.


NoSQL MOVEMENT

Besides Hadoop, one of the most interesting areas of the Data Management landscape is the NoSQL

movement. The NoSQL movement (or sometimes called the "not only SQL" movement) refers to

database management systems that tend to be non-relational.

The NoSQL movement consists of four primary categories: key-value stores, Bigtable clones, document databases, and graph databases. Figure 20 below highlights the four NoSQL categories

from a "data size" and "data complexity" angle.

The first NoSQL category is "key-value stores", which is based on Amazon's Dynamo paper that was

published in 2007. The data model of key-value stores is a collection of K-V pairs. Examples include

Dynomite, Voldemort, Membrain, and Berkeley DB, among others.

The second NoSQL category is BigTable clones, which is based on Google's BigTable paper that was

published in 2006. The BigTable data model is a large table organized into column families. Examples of BigTable clones

include HBase, Hypertable, and Cassandra.

The third NoSQL category is document databases. People often think of Lotus Notes when they think

of document databases. Examples include CouchDB, MongoDB, and RavenDB.

The fourth NoSQL category is graph databases. A graph database "uses graph structures with nodes,

edges, and properties to represent and store information." Examples of graph databases include

AllegroGraph, Sones, Neo4J, InfiniteGraph and GraphDB.

FIGURE 20: The Four NoSQL Categories

Source: http://www.slideshare.net/emileifrem/nosql-overview-neo4j-intro-and-production-example-qcon-london-2010?src=related_normal&rel=8600029
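To make the four categories concrete, the sketch below models the same small customer record in each style using plain Python structures. It is our own illustration of the data models, not the API of any particular product named above.

    # Illustrative only: one customer record expressed in the four NoSQL styles.

    # 1) Key-value store: an opaque value looked up by a single key.
    kv_store = {"customer:1001": '{"name": "Alice", "city": "Austin"}'}

    # 2) Bigtable clone: row key -> column families -> columns.
    bigtable_row = {
        "1001": {
            "profile": {"name": "Alice", "city": "Austin"},
            "orders":  {"2011-11-01": "order-98412"},
        }
    }

    # 3) Document database: a self-describing, nested document.
    document = {
        "_id": 1001,
        "name": "Alice",
        "city": "Austin",
        "orders": [{"id": "order-98412", "total": 42.50}],
    }

    # 4) Graph database: nodes plus typed, directed edges, each carrying properties.
    nodes = {"alice": {"name": "Alice"}, "order-98412": {"total": 42.50}}
    edges = [("alice", "PLACED", "order-98412", {"date": "2011-11-01"})]

The progression roughly tracks the "data complexity" axis in Figure 20: a key-value store knows nothing about the structure of its values, while document and graph stores expose progressively more of that structure to the database.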

There is controversy around the use of NoSQL solutions. Proponents of NoSQL solutions cite the

flexibility, scalability, low cost, and appropriateness of NoSQL solutions for specific use cases. Critics of the NoSQL movement often cite the maturity level of NoSQL solutions, the lack of commercial support, and the inability of NoSQL databases to work with traditional BI tools.

Stepping back, we believe new data management technologies will continue to emerge to handle the

Big Data trend. While traditional RDBMSs will likely continue to play a significant role well into the future, the days of a single database standard or technology within an organization are quickly coming to an end, in our opinion.


BUSINESS INTELLIGENCE, ANALYTICS AND BIG DATA

The second layer of the technology stack most impacted by the Big Data trend is the Business Intelligence layer. In this section we briefly survey the Business Intelligence landscape, discuss the hottest areas of business intelligence and analytics, identify three reasons why the BI landscape is going through a major sea change, and examine five areas within business intelligence that we believe investors should understand: Business Analytics versus Business Intelligence, Agile BI versus Traditional BI, Business Intelligence in the Cloud, the R Open Source Programming Language, and Data Visualization.

Business intelligence (BI) refers to a set of related activities, such as online analytical processing, querying, and reporting, used to analyze an organization’s raw data with standardized software applications (see Figure 21 below).

FIGURE 21: Info Tech Research Group BI Architecture

Source: Info Tech Research Group


What are the hottest areas of business intelligence and analytics? TDWI Research analyst Philip Russom recently put together a thoughtful piece on Big Data analytics. TDWI surveyed 360 companies across a broad range of industries and found that Advanced Analytics, Advanced Data Visualization, and Predictive Analytics had the highest commitment levels and the most potential for growth, as shown in Figure 22 below. Gartner's research supports TDWI's survey: Gartner notes that lighter-footprint, visualization-based analytics and data discovery products are the fastest growing areas in the business intelligence space, growing at 3x the overall BI market. Gartner expects the data discovery market alone to grow from $591 million in 2011 to $1.6 billion in 2015.

FIGURE 22: Options for Big Data Analytics Plotted by Potential Growth and Commitment

Source: TDWI Research


We believe the business intelligence landscape is about to go through a major sea change that will radically transform the landscape and the way the industry thinks about analytics. We have identified three primary reasons for the sea change:

1) Big Data and the Explosion of Data
2) The Consumerization of Enterprise BI
3) Industry Consolidation

We discuss each below:

1) Big Data and the Explosion of Data

Earlier in this report we covered some of the reasons behind the explosion in data, including the

precipitous drop in memory prices over the last 10 years and the sheer number of "devices" now

collecting information for enterprises. The problem is simply that it has become very easy to collect

data, but difficult to make sense of that data using traditional BI tools. In other words, as the useful life

of information has decreased, so has the utility of traditional BI tools which have historically been very

backwards looking.

2) The Consumerization of Enterprise BI

We see users within enterprises increasingly demanding easier-to-use and more intuitive business

intelligence solutions. This is driven simply by the consumerization of all enterprise IT. The

consumerization of IT has been covered extensively by other sources, but in our view, simply means

that there has been a shift from enterprise behavior influencing an individual's behavior at home (e.g., circa 1995, a person saying, "I have email at work and I now want email at home") to behavior in the home influencing enterprise IT (e.g., circa 2011, a person asking, "What do you mean I can't get my corporate email on my iPhone?"). We believe the consumerization of BI is being driven by individuals

being able to have amazing analytics on their internet and mobile devices. These individuals

increasingly insist on having access to analytics in their day-to-day jobs.

3) Industry Consolidation

The final reason a major sea change is occurring in the business intelligence space is simply the change in the vendor landscape. As investors will recall, there has been massive consolidation as the heavyweights in the tech industry invested heavily in the space. The most prominent examples include IBM's acquisitions of Cognos and SPSS, SAP's acquisition of Business Objects, and Oracle's acquisition of Hyperion.

In the next section, we discuss five areas within business intelligence that we believe investors should understand:

1) Business Analytics Versus Business Intelligence
2) Agile BI Versus Traditional BI
3) Business Intelligence in the Cloud
4) The R Open Source Programming Language
5) Data Visualization


BUSINESS ANALYTICS VERSUS BUSINESS INTELLIGENCE

Business Analytics ("BA") is the structured, iterative approach to exploring an organization’s data with a heavy emphasis on statistical analysis. The combination of business knowledge, statistical tools, knowledge of business intelligence / analysis tools, rapid hypothesis generation and most importantly, a methodical approach to analyzing data make the field of business analytics very complicated, unique and the fastest growing field in the business realm. The fundamental differences between business intelligence and business analytics are summarized in Figure 23 below.

FIGURE 23: Business Intelligence and Business Analytics

1. Business Intelligence: Identify business trends.
   Business Analytics: Understand and act on business trends.

2. Business Intelligence: Understand the timeframe of change in a business trend.
   Business Analytics: Forecast the possibility of the trend occurring again.

3. Business Intelligence: Understand the different elements involved in the business trend.
   Business Analytics: Understand the implications of the business trend.

4. Business Intelligence: Quantify the change in the business trend.
   Business Analytics: Understand other possible explanations and scenarios associated with the change in the business trend.

5. Business Intelligence key aspects: (a) reporting (KPIs, metrics); (b) automated monitoring/alerting; (c) dashboards; (d) scorecards; (e) OLAP (cubes, slice and dice, drill-down); (f) ad hoc query generation; (g) deployment of the solution to multiple devices.
   Business Analytics key aspects: (a) statistical/quantitative analysis; (b) data mining; (c) predictive modeling; (d) multi-variate testing.

Source: JMP Securities

We make the distinction between Business Intelligence and Business Analytics to highlight that the BI

industry so far has done a good job with the "Business Intelligence" side of the chart but in many ways

we are still in the early innings of the "Business Analytics" side of the chart.


AGILE BUSINESS INTELLIGENCE

According to a study by Gartner, 70% to 80% of business intelligence projects fail. Much of the failure is due to a structured, linear BI development cycle that spans from four months to a year or more. In the current business environment, the underlying business rules that drive measurements change rapidly, and the linear development cycle falls apart because the rules have changed by the time the original set of requirements has been deployed to production. Agile business intelligence addresses the shortcomings of the linear model through an iterative development process designed to deliver solutions rapidly with the least amount of rework and risk.

Agile BI involves agile project management: the standardized documentation and approval process is reduced to a leaner, essential process of shorter requirement documents and faster approvals. Agile BI also requires an agile infrastructure driven by a dynamic data integration layer. Vendors such as Informatica (INFA, MO, $61 PT, Walravens) have created tools to integrate data in a virtual layer, with the external source data cached and refreshed as provided by the data owner or as required by the business users. Virtual data integration lends itself to data visualization at a very early stage of the BI development cycle, thereby empowering users to refine requirements early.

Agile BI brings a host of challenges in most organizations. A few of these are:

(a) Agile BI requires strong involvement from the business which would translate to closely

followed resource scheduling and task management practices in the business community. A typical business community is perceived to be relatively less structured in these aspects.

(b) A powerful and flexible IT infrastructure that can support faster delivery is a foundational requirement to the success of agile BI. Most IT infrastructures are not upgraded regularly and adopt slightly outdated versions of hardware and software, primarily due to budget constraints.

(c) Organizational cultural aspects play a key role in the success of agile BI. Daily team meetings are not viewed favorably in certain organizations and getting stakeholder buy-in for such requests might be difficult.

(d) Agile BI requires operations resources to be on their toes all the time. Scheduled server maintenance tasks and server upgrades have to be handled more diligently to ensure the least disruption for the development cycle.

(e) Agile development is typically known to fit a regular software / product development cycle due to the inherent need for automating most or all of the QA testing process. Agile BI does not lend itself well to automated testing largely due to complex business rules involved in the development of BI solutions. Testing requirements are also made complex due to the integration of disparate data sources from internal and external data feeds.

(f) The difficulty of maintaining a BI team with the least turnover makes agile BI difficult to implement. Most BI teams involve regular turnover of resources. Agile BI requires a standard set of resources to be tied to a BI program stream until the BI solution reaches a critical maturity stage.

(g) Defining smaller chunks of scope is fundamental to the definition of agile BI. However, certain BI implementations require looking at a broad range of data sources simultaneously to clearly define business rules. Such requirements would make agile BI difficult to implement.

Agile BI is emerging as the next big implementation approach for BI across the industry. Most BI vendors are designing innovative tools and methodologies to enable agile BI development. With the rapid growth in the amount of data and constant requirements for new and innovative analysis approaches tied to faster operational turnaround, we believe agile BI is going to be at the center of any marketed BI software in the next few years.


BUSINESS INTELLIGENCE IN THE CLOUD

Cloud-driven business intelligence refers to a virtualized, dynamically provisioned, service-oriented, utilization-based, multi-tenant architecture on the internet, through which customers can unleash the power of business intelligence. The power of cloud-driven BI lies in the dynamic provisioning model, where hardware and software components are virtually added and utilized as needed. For example, at 9 p.m. only one server could be required, while at 9 a.m. three servers could be required. The utilization of the architecture is dynamic in nature. Such a model lends itself to better workload management and dynamic billing based on CPU utilization per hour and memory utilization per hour, thereby helping to lower IT costs for small and medium businesses.
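As a rough illustration of the utilization-based billing just described, the sketch below computes a daily charge from CPU-hours and memory-hours. The rates, server sizes, and usage pattern are entirely hypothetical; the point is simply that a footprint of one server at 9 p.m. costs less than three servers at 9 a.m., unlike a flat, fully provisioned deployment.

    # Hypothetical utilization-based billing for a cloud BI deployment.
    # All rates and usage figures below are invented for illustration only.
    CPU_RATE = 0.12   # dollars per CPU-hour
    MEM_RATE = 0.02   # dollars per GB-hour

    def hourly_charge(servers, cpus_per_server=4, gb_per_server=16):
        cpu_hours = servers * cpus_per_server
        gb_hours = servers * gb_per_server
        return cpu_hours * CPU_RATE + gb_hours * MEM_RATE

    # A day split into a busy window (three servers) and a quiet window (one server).
    dynamic_daily = 12 * hourly_charge(servers=3) + 12 * hourly_charge(servers=1)
    flat_daily = 24 * hourly_charge(servers=3)

    print("Dynamically provisioned: $%.2f per day" % dynamic_daily)
    print("Flat three-server footprint: $%.2f per day" % flat_daily)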

Cloud BI brings a host of advantages that are listed below:

(a) Flexibility – Computing resources could be scaled with few barriers.

(b) Time to develop solutions – SaaS BI provides the ability to shorten BI implementation windows.

(c) Cost reduction – Cloud BI helps reduce upfront infrastructure costs for cash-strapped companies.

(d) Geographic scalability – Given cloud deployment, geographic expansion of solutions is a key advantage, especially in a B2B environment with BI as the core platform.

Despite the advantages of a cloud BI solution, there have been concerns among industry players on the wider adoption of the model. Top concerns are listed below:

(a) Security – Information security and SOX compliance aspects are not easy to address in a cloud BI solution.

(b) Control – Availability To Business ("ATB"), a standard metric used in B2B-based BI solutions, is not in the control of the organization once the BI solution is migrated to the cloud.

(c) Performance – Performance optimization of a BI environment occasionally requires a non-standard implementation model that would be customized to specific business needs. Such optimizations may not be easy to implement in a cloud BI solution.

(d) Support – Cloud BI also poses the challenge of technical support availability in crunch situations.

(e) Vendor lock-in – A cloud-based BI solution creates external dependencies for migration of new technologies.

(f) Speed to activate new services / expand capability and configurability – External dependencies occasionally hamper the ability to quickly activate new capabilities and modify the configuration of the implementation.


R OVERVIEW

R is an open-source programming language and software environment for statistical computing and graphics. R provides a wide variety of statistical and graphical techniques, including linear and non-linear modeling, classical statistical tests, time-series analysis, classification, clustering, and others. R is easily extensible through functions and extensions and the R community is noted for its active contributions in terms of packages. According to Rexer's Annual Data Miner Survey in 2010, R has become the most widely used data mining tool, used by 43% of data miners.

R is an implementation of the S programming language, which was developed at Bell Laboratories in 1976 to provide a more interactive alternative to the statistical analysis tools then available. R is a free, open-source dialect of S; S-PLUS is a commercial implementation of the same language. R was created by Ross Ihaka and Robert Gentleman of the University of Auckland, New Zealand, and is part of the GNU project. It was initially conceived because both men wanted better-suited technology for their statistics students, who needed to analyze data and produce graphical models of the information and found existing software difficult to use.

What is R and Why is it Used?

Recently, the business intelligence sector began taking notice of the many benefits of the R programming language, which is particularly well suited to predictive analytics. It can be used to identify patterns or trends in massive data sets, making it ideal for researching retail, financial, and medical trends.

Predictive analytics is an area of statistical analysis that deals with extracting information from data and using it to predict future trends and behavior patterns as well as identifying risks and opportunities. Models capture relationships among many factors to allow the assessment of risk or potential associated with a particular set of conditions, guiding decision making for candidate transactions. As data mining and predictive analytics continue to rapidly accelerate, R provides the tools to support these activities across all industries including, actuarial science, financial services, insurance, telecommunications, retail, travel, healthcare, pharmaceuticals and others. It supports drug companies doing pharmaceutical development, insurance companies performing underwriting and risk assessment, and students performing academic research.
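As a concrete, deliberately tiny example of the kind of predictive modeling described above, the sketch below fits a linear trend to historical quarterly sales and extrapolates one quarter ahead. The sales figures are invented, and the snippet is written in Python purely for illustration; real predictive analytics work, whether done in R or elsewhere, involves far richer models, more data, and validation.

    # A toy predictive model: ordinary least-squares trend fit and a one-step forecast.
    import numpy as np

    quarters = np.arange(1, 9)                  # eight historical quarters
    sales = np.array([10.2, 10.8, 11.9, 12.1,   # quarterly sales in $M (hypothetical)
                      13.0, 13.8, 14.9, 15.4])

    slope, intercept = np.polyfit(quarters, sales, deg=1)   # fit sales ~ slope*quarter + intercept
    next_quarter = 9
    forecast = slope * next_quarter + intercept

    print("Trend: +$%.2fM per quarter" % slope)
    print("Forecast for quarter %d: $%.1fM" % (next_quarter, forecast))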

R has become one of the most popular and primary programming languages used by statisticians, scientists, and data analysts both commercially and within academia. R's popularity seems to be a result of its usability, extensibility, and roots in open-source.

R is an integrated programming environment for data manipulation, calculation and graphical display of data sets. It helps people perform a wide variety of computing tasks by giving them access to various commands and pre-supplied packages. It also allows users to script their own functions (or modify existing ones) to do custom tasks. This provides much of the flexibility of languages such as C, but with the advantage of building upon R's robust numerical routines, data management functions, and graphing tools. Its ease of use has made it especially appealing to people without deep computer programming skills and is making it the de facto standard. According to an article from the New York Times, "It allows statisticians to do very intricate and complicated analyses without knowing the blood and guts of computing systems." Another strength of R is static graphics and the ease with which well-designed, publication-quality graphs and mathematical symbols can be produced. Speed is also one of the biggest draws for the R programming language, which can process up to 12 gigabytes of data in seconds.

Because R has stronger object-oriented programming facilities than most statistical computing languages, it can be more easily customized and extended through the use of user-submitted packages for specific functions or specific areas of study. Advanced users can write C code to manipulate R objects directly or link code written in C, C++ or Fortran to R at run-time. Over 1500 packages exist today. Some examples include:

• BiodiversityR - offers a graphical interface aimed at simplifying the calculation of environmental trends

• Emu - analyzing speech patterns

Finally, because R is open-source, users have the ability to freely modify existing or create entirely new functions and packages compared with commercial software packages that use proprietary functions to perform the analysis.


Who uses R?

R is used by both corporate users and universities, with an estimated two million users, particularly scientists, programmers and academics who routinely do research. While software from SAS Institute has been the preferred tool, R is gaining popularity, especially within academia. Its ability to perform high-end analytics combined with its open-source, free-distribution model seems to be key in this shift.

Corporate customers include Google (Not Covered), Pfizer (NC), Merck (NC), Bank of America (BAC, MP, Trone), Shell (NC) and the InterContinental Group (NC). Google uses R to help it understand trends in ad pricing and illuminate trends in the search data it collects. Pfizer has created customized packages that allow its scientists to manipulate their own data during non-clinical drug studies immediately versus sending that information off to a statistician. A number of financial institutions have also used it to create packages to perform derivatives analysis. Wal-Mart (NC) is also a high-profile user of R, using it to interpret the needs and spending habits of customers.

R Disadvantages:

While there appear to be many advantages to R, there are also currently some disadvantages. In the eyes of some, SAS Institute is better suited to handle "big data." R is limited by RAM because the data is in memory. R also appears to lack documentation and has limited commercial support.

Commercial Releases:

In October 2011, Oracle (ORCL, MO, $36 PT, Walravens) announced the Big Data Appliance, which integrates R, Apache Hadoop, Oracle Enterprise Linux, and a NoSQL database with Exadata hardware. It is an engineered system optimized for acquiring, organizing, and loading unstructured data into Oracle Database 11g.

In 2007, Revolution Analytics was founded to provide commercial support for Revolution R, its distribution of R that also includes components developed by the company, such as a web services framework and the ability to read and write data in the SAS file format.

DATA VISUALIZATION

The final area of business intelligence is the data visualization, or business discovery, market. This is the most exciting component of the business intelligence market, in our opinion, and includes vendors such as QlikTech, Tableau Software, and TIBCO Software's Spotfire.

Perhaps the best way to understand the data visualization market is simply to look at the types of graphs and charts these tools can produce. Some of our favorite examples of what enterprises now expect are shown in Figures 24 and 25 below.

FIGURE 24: Example of a Data Visualization Tool

Source: http://globaleconomicanalysis.blogspot.com/search?updated-max=2011-09-02T10%3A27%3A00-05%3A00&max-results=3

Figure 25 below highlights an interactive visualization of Average Draft Position (ADP) in the CBS Sports Fantasy Football league, powered by Tableau Software.

FIGURE 25: Another Data Visualization Example

Source: http://fantasynews.cbssports.com/fantasyfootball

BIG DATA INDEX STOCK PERFORMANCE

Big Data stocks have performed significantly better YTD than the NASDAQ. Through November 10, Big Data stocks are up 9.3%, versus a 0.4% decline for the NASDAQ Composite, as shown in Figure 26 below. The stocks in our "Big Data Index" are QLIK, BIRT, TDC, TIBX, INFA, MSTR, PRGS, ORCL, IBM, EMC, NTAP, and CVLT.

FIGURE 26: Big Data Stock Index Versus NASDAQ- YTD

Source: JMP Estimates & FactSet

As shown in Figure 27 below, the best performing stock YTD in our Big Data Index is TIBCO Software (TIBX), up 44%, followed by MicroStrategy (MSTR), up 38%, compared to a 0.4% decline in the NASDAQ. The worst performing stock has been Progress Software (PRGS), down 26% YTD.

FIGURE 27: YTD Individual Stock Performance of Big Data Index Companies

Source: FactSet

While Big Data stocks have outperformed the NASDAQ YTD, they have also pulled back harder than the NASDAQ since the market turned south on July 22. Figure 28 below shows the median performance of the "Big Data Index" compared to the NASDAQ from July 22 to November 10. As shown, the NASDAQ is down 8% while the "Big Data Index" is down 12%.

FIGURE 28: Big Data Stock Index Versus NASDAQ- Since July 22nd Pullback

Source: FactSet

The Big Data stocks that have pulled back the most from their 52-week highs include Pegasystems, Progress Software, and MicroStrategy, as shown in Figure 29 below. We believe the recent pullback may represent a compelling buying opportunity for investors to build or add to positions in selected Big Data stocks.

FIGURE 29: Percentage Change from 52-Week High

Source: FactSet

BIG DATA VALUATION DISCUSSION

Most Big Data stocks look expensive on traditional valuation metrics such as P/E. Our key observation in analyzing the "Big Data Index" is that these companies are simply growing much faster than other technology companies, which, in our opinion, makes P/E less meaningful on a standalone basis.

Figure 30 below illustrates the median consensus CY12 revenue growth rate for companies in the Big Data Index, compared to the median consensus growth rate for NASDAQ companies in the technology sector for which there are estimates. As shown, the "Big Data" companies are expected to grow the top line at 14%, while NASDAQ companies are expected to grow the top line only 9%.

FIGURE 30: Expected CY12 "Big Data Index" Revenue Growth Versus Tech Index

Source: FactSet

The fastest growing company in the Big Data Index, using consensus estimates, is Qlik Technologies, followed by NetApp, Pegasystems, and Informatica.

FIGURE 31: Fastest Growing Companies in the "Big Data Index" (CY12 Estimates)

Source: FactSet

Interestingly, on a PEG basis (2012 P/E divided by expected 2012-2013 EPS growth), the Big Data stocks trade at 1.0x, the same multiple as our NASDAQ technology index.
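
For reference, the PEG arithmetic here is simply the forward P/E divided by the expected EPS growth rate (in percent). A minimal sketch in Python, with purely hypothetical inputs rather than the actual index constituents' figures:

    # PEG ratio sketch: 2012 P/E divided by expected 2012-2013 EPS growth.
    # The inputs below are hypothetical, for illustration only.
    def peg_ratio(price, eps_2012, eps_2013):
        pe_2012 = price / eps_2012                            # forward P/E
        growth_pct = (eps_2013 - eps_2012) / eps_2012 * 100   # EPS growth, in %
        return pe_2012 / growth_pct

    print(round(peg_ratio(price=30.0, eps_2012=1.50, eps_2013=2.00), 2))  # ~0.6x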

FIGURE 32: PEG Ratio of Big Data Index Versus HW/SW Technology Index

Source: FactSet

One metric we like to focus on is free cash flow. On a TTM basis, our Big Data Index trades at 21x EV/TTM FCF, versus 18x for our NASDAQ technology index.

FIGURE 33: EV/TTM FCF

Source: FactSet

While this multiple seems steep, when we compare it to expected 2011 revenue growth, the stocks appear inexpensive relative to the NASDAQ technology index. As shown in Figure 34 below, our Big Data Index trades at an EV/TTM FCF multiple divided by CY11 expected growth rate of 1.5x, compared to 2.1x for the NASDAQ technology index.
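
The growth adjustment works the same way as the PEG calculation above: divide the EV/TTM FCF multiple by the expected revenue growth rate. A minimal sketch using the 21x multiple cited above and a mid-teens growth rate purely to show the mechanics (Figure 34 itself uses CY11 growth estimates, which we do not restate here):

    # Growth-adjusted FCF multiple: EV/TTM FCF divided by expected revenue growth (in %).
    def growth_adjusted_multiple(ev_to_ttm_fcf, expected_growth_pct):
        return ev_to_ttm_fcf / expected_growth_pct

    print(round(growth_adjusted_multiple(21, 14), 1))  # ~1.5x, roughly the Big Data Index figure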

FIGURE 34: EV/TTM FCF Divided by CY11 Expected Growth Rate

Source: FactSet

PUBLICLY TRADED COMPANIES IN THE BIG DATA SPACE

Almost every publicly traded technology company now has a Big Data strategy. In this section we provide brief overviews of the solutions that some public technology companies are offering in the Big Data space.

MICROSOFT AND BIG DATA

Microsoft (MSFT, Not Covered) is advancing into the Big Data space on multiple fronts. Microsoft's strategy is to provide an end-to-end solution that spans the entire process of data capture, loading, analysis, reporting, and visualization. Microsoft has a number of Big Data initiatives, four of which we detail below.

Dryad. Dryad was created by Microsoft Research and is a platform for building applications that process large amounts of unstructured data on Microsoft's Windows HPC Server. It provides a sophisticated distributed runtime and an associated programming model for analyzing and storing large volumes of data across large clusters of commodity servers. The technologies behind Dryad are commercial versions of the same technology the Bing search engine uses to process very large quantities of data.

Project Daytona. Daytona was developed by the Cloud Research Engagement program in Microsoft's Extreme Computing Group and provides the tools and services to use Microsoft's Azure cloud platform to analyze extremely large data sets. Microsoft is positioning Daytona as a Data Analytics-as-a-Service solution that provides parallel processing of extremely large data sets and allows users to quickly scale virtual machines up or down based on the processing power required. It uses a runtime implementation of Google's MapReduce programming model to process and analyze large data sets.
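
To make the MapReduce programming model concrete, below is a minimal, single-machine sketch of the map and reduce phases using the canonical word-count example; it is illustrative only and does not use Daytona, Azure, or any Microsoft API.

    # Minimal MapReduce-style word count, run on one machine for illustration.
    from collections import defaultdict

    def map_phase(documents):
        # Map: emit (key, value) pairs -- here, (word, 1) for every word seen.
        for doc in documents:
            for word in doc.split():
                yield word.lower(), 1

    def reduce_phase(pairs):
        # Shuffle/reduce: group pairs by key and combine the values.
        totals = defaultdict(int)
        for word, count in pairs:
            totals[word] += count
        return dict(totals)

    docs = ["big data is big", "data about data"]
    print(reduce_phase(map_phase(docs)))  # {'big': 2, 'data': 3, 'is': 1, 'about': 1}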

Excel DataScope. In June 2011, Microsoft Research released Excel DataScope, its newest big data analytics and visualization candidate. Excel DataScope lets users upload data to the cloud, extract patterns from data stored in the cloud, identify hidden associations, discover similarities between data sets, and perform time-series forecasting, all through a familiar spreadsheet user interface called the research ribbon.

DataMarket. Microsoft's vision of big data includes combining large volumes of corporate data with third-party data to enable greater insight and more value. DataMarket is a cloud-based Azure marketplace for accessing third-party data, and it makes it significantly easier for non-expert business users to explore and choose third-party data sources on their own.

Microsoft has also recently released a Hadoop connector for its MPP data warehouse offering, allowing developers to integrate data between SQL Server 2008 R2 Parallel Data Warehouse and Hadoop clusters.

ORACLE AND BIG DATA

Oracle's (ORCL, Market Outperform, Walravens) Exadata Database Machine is a database solution that focuses on online transaction processing (OLTP), data warehousing, and the consolidation of workloads. Exadata was originally conceived to provide large enterprises a more effective and optimized solution for both sequential and random I/O performance, which means it can handle scan-intensive data warehousing applications just as well as highly concurrent OLTP applications.

Oracle's vision is that applications, data, and storage, which have traditionally been separate, require tighter alignment for optimal performance, and Exadata addresses this. Exadata is a strategic platform that unifies the entire IT stack under a single technology vendor and delivers an easy-to-manage, fast, efficient, scalable, and cost-effective solution.

Exadata has been architected to deliver extreme performance by leveraging a massively parallel grid architecture, using Real Application Clusters and Exadata Smart Flash Cache to dramatically accelerate database processing and speed I/O operations. By pushing SQL processing down to the storage servers, all disks can operate in parallel, reducing database server CPU consumption while using less bandwidth to move data between storage and database servers.

Oracle also announced interfaces that will allow customers to connect Hadoop systems to Oracle systems so they can load Hadoop data into Oracle. Oracle believes this will further increase the use of its solutions.

Exadata is being positioned as the future of Oracle, offering a single hardware ecosystem with solutions for database management, storage, data crunching, and mission-critical workloads. Notably, on October 2, Oracle announced Oracle Exalytics, an in-memory machine for business intelligence and planning applications.

HP AND BIG DATA

HP (HPQ, Not Covered) has made two key acquisitions to build out its Big Data portfolio: Vertica in February 2011 and Autonomy in October 2011.

Vertica: Vertica was a leading provider of an analytics database and analytics platform, with over 300 customers including Groupon, Twitter, AOL, Zynga, and Comcast, among others. Vertica is known for its massively parallel columnar database technology, the Vertica Analytics Platform. After the acquisition, the new HP Vertica Analytics System combines the original Vertica Analytics Platform with HP Converged Infrastructure. This upgraded solution helps customers analyze large amounts of complex data in real time across physical, virtual, and cloud environments.

Autonomy: Autonomy is a leading provider of infrastructure software and, before it was bought by HP, was the second largest publicly traded software company in Europe. Autonomy's core platform, IDOL Server, primarily obtains information from connectors and archives it in its proprietary structure, and it also offers more than 500 functions such as hyper-linking and summarization. During the process of categorization and archiving, IDOL Server deploys Meaning-Based Computing to form a contextual and conceptual understanding of the information.

GOOGLE AND BIG DATA

Google (GOOG, Not Covered) has been a significant player in the Big Data space via its development of Bigtable. Bigtable is a compressed, high-performance proprietary database system built on the Google File System ("GFS"). This distributed storage system manages structured data and is designed to scale to a very large size. Bigtable development began in 2004, and the system is now used by a number of Google applications such as Google Reader, Google Maps, Google Earth, YouTube, and Gmail. Bigtable has been able to support these applications' demands on data size, serving requests for data ranging from URLs to web pages to satellite imagery. Due to the variety of data requested, these applications also place diverse latency requirements on Bigtable, from slow, back-end bulk processing to real-time data serving. Despite these varied demands, Bigtable has provided a flexible, high-performance solution for all of these Google products.

Following Google's philosophy, Bigtable was an in-house development designed to run on commodity hardware. Bigtable allows Google to add new services and expand computing power at a very small incremental cost.

Mechanics of Bigtable: Each table is a multi-dimensional sparse map. The table consists of rows and columns, and each cell is versioned by time; there can be multiple copies of each cell with different timestamps, so changes can be tracked over time. To manage the huge tables, they are split at row boundaries and saved as tablets. Tablets are each around 100-200MB, and each machine stores up to 100 of them. This setup allows fine-grained load balancing and fast rebuilding. There is a lot of redundant data in the system, so Bigtable makes heavy use of compression; the compression looks for similar values along rows, columns, and time versions.
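
A toy sketch of the data model described above, a sparse map keyed by row, column, and timestamp, is shown below; this is our own simplification for illustration, not Google's implementation.

    # Toy sketch of Bigtable's data model: a sparse, multi-dimensional map keyed by
    # (row, column, timestamp). Our own simplification, not Google's code.
    import time

    class ToyBigtable:
        def __init__(self):
            # {(row, column): [(timestamp, value), ...] with the newest version first}
            self.cells = {}

        def put(self, row, column, value, timestamp=None):
            ts = timestamp if timestamp is not None else time.time()
            versions = self.cells.setdefault((row, column), [])
            versions.append((ts, value))
            versions.sort(reverse=True)   # keep newest version first
            del versions[3:]              # retain only the three most recent versions

        def get(self, row, column):
            versions = self.cells.get((row, column), [])
            return versions[0][1] if versions else None  # newest value, if any

    t = ToyBigtable()
    t.put("com.example/index", "contents:html", "<html>v1</html>", timestamp=1)
    t.put("com.example/index", "contents:html", "<html>v2</html>", timestamp=2)
    print(t.get("com.example/index", "contents:html"))  # <html>v2</html>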

IBM AND BIG DATA

IBM (IBM, Not Covered) has made analytics a cornerstone of its strategy. It has both built solutions organically and been very acquisitive. We review some of IBM's solutions in the Big Data space below:

IBM InfoSphere BigInsights is IBM's solution for managing and analyzing Internet-scale volumes of structured and unstructured data. BigInsights is built on the open source Apache Hadoop software framework.

IBM InfoSphere Streams enables continuous analysis of massive volumes of streaming data with sub-millisecond response times.

IBM WebSphere Operational Decision Management V7.5 software combines business events and rules management in a single platform to unify rules-based contextual decisions with time-based situational awareness.

IBM Cognos is IBM's business intelligence and financial performance management solution. It includes solutions for query and reporting, analysis, dashboarding, and scorecarding, among others. IBM acquired Cognos for $4.9 billion in 2007.

IBM Netezza is IBM's data warehousing appliance. In June 2011, IBM rolled out its first appliance since acquiring Netezza, the IBM Netezza High Capacity Appliance, which is designed to allow companies to analyze up to 10 petabytes of data in a few minutes. The latest Netezza appliance is optimized for IBM's BladeCenter technology and is part of a broader portfolio focused on big data projects. IBM bought Netezza for $1.7 billion in 2010.

IBM's SPSS solutions are centered on predictive analytics. SPSS Statistics provides an advanced statistical analysis solution, SPSS Modeler is a pattern and trend solution, and SPSS Data Collection helps organizations get an accurate view of people's attitudes, preferences, and opinions. IBM bought SPSS for $1.2 billion in 2009.

SAP AND BIG DATA

SAP's (SAP, Market Underperform, Walravens) Big Data strategy is primarily centered on HANA. HANA, which stands for High Performance Analytic Appliance, is based on a superset of technologies that SAP has been building for some time, including the MaxDB database and the TREX in-memory engine. HANA is made available to partners in appliance form and places the data to be processed in RAM instead of reading it off disk, providing a performance boost. Data held in memory by HANA can be restored in the event of a power outage or other disruption, as the in-memory data is backed by a persistence layer that logs transactions and incorporates save points. HANA is compatible with any business intelligence (BI) application that supports common query languages such as SQL and MDX.

Usage: SAP has initially focused on HANA's ability to support real-time analytics, especially as part of a series of specialized applications that target discrete business problems. One of the first such products is Strategic Workforce Planning, which companies can use to figure out the economic and logistical ramifications of making big shifts in staffing.

HANA can also handle the transactional workloads of ERP (enterprise resource planning) applications such as SAP's Business One, Business ByDesign, and Business Suite products. All of those products are expected to gain HANA support, but the enterprise-grade Business Suite is expected to take the longest.

EMC AND BIG DATA

In July 2010, EMC (Not Covered) acquired Greenplum, a 10-year-old data warehouse software company. As part of the deal, Greenplum became a new computing division within EMC. In October 2010, EMC launched a Greenplum data warehousing appliance designed to take on the likes of IBM's Netezza and Oracle's Exadata; the launch came just 75 days after EMC acquired Greenplum. With the Greenplum appliance, EMC made its entry into the integrated appliance market.

INFORMATICA AND BIG DATA

Informatica (INFA, Market Outperform, Walravens) views Big Data as a confluence of three trends: Big Transaction Data, which includes both traditional OLTP databases and OLAP and data warehouse appliances; Big Interaction Data, which includes data from social media such as Facebook; and Big Data Processing, which includes new technologies such as Hadoop. Informatica's view is that its integration tools address all three facets of the Big Data trend.

Informatica's messaging around its most recent release, Informatica 9.1, is all about Big Data, including "Big Data Integration," "Authoritative and Trustworthy Data," "Self-Service," and "Adaptive Data Services." What exactly is new in Informatica 9.1? Informatica provided some hints on its last earnings call, when it noted that 9.1 "will feature adapters for the Hadoop file system, HDFS, to move data in for parallel processing by Map/Reduce and move results out for consumption." Based on our conversation with an Informatica representative, customers currently on maintenance will have free update rights to 9.1 but will need to pay extra for any adapters or connectors to new systems, as with all releases.

INITIATIONS SUMMARY

We are initiating coverage on six companies in the infrastructure software universe:

MicroStrategy (MSTR)

We are initiating coverage on MicroStrategy with a Market Outperform rating and a $140 price target. MicroStrategy is the largest publicly traded independent BI vendor. We like MicroStrategy because of its powerful value proposition of an end-to-end BI architecture and analytics platform, its large market presence with some of the leading companies in the world, its well-built developer ecosystem, and its four quarters in a row of double-digit license growth, which we expect to continue. While MicroStrategy has invested heavily in 2011 (with operating margins expected to be down 800 basis points) to better compete with emerging players like QlikTech and Tableau Software, we believe the investments will start to bear fruit toward the end of this year, leading to significant operating margin expansion next year and revenue and EPS coming in above consensus estimates. We look for 2011, 2012, and 2013 EPS of $1.81, $3.67, and $5.14, versus consensus of $1.79, $3.63, and $4.59, respectively. Our $140 price target implies a very reasonable 2013 EV/Revenue multiple of 1.9x, a discount to the peer group, and a 2013 P/E of 27x, slightly above MicroStrategy's average five-year forward P/E multiple and roughly in line with its TTM revenue growth rate of 28%.
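
As a quick arithmetic check on the multiple cited above, using only the price target and the 2013 EPS estimate already stated:

    # Implied 2013 P/E from the $140 price target and our $5.14 2013 EPS estimate.
    price_target = 140.0
    eps_2013 = 5.14
    print(round(price_target / eps_2013, 1))  # ~27.2x, consistent with the 27x cited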

Progress Software (PRGS)

We are initiating coverage on Progress Software with a Market Perform rating. Progress Software provides enterprise software products that enable organizations to be operationally responsive to a changing business environment. We like the steps Progress is taking to transition the business to the fast-growing category of Enterprise Business Solutions; however, we remain on the sidelines until we see a permanent CEO in place, more consistent sales execution, and possibly a divestiture of non-strategic assets. Progress Software trades at a 2013 P/E multiple of 10x versus the peer group median of 11x. We look for 2011 non-GAAP EPS of $1.43, versus consensus of $1.45; 2012 non-GAAP EPS of $1.66, versus consensus of $1.63; and 2013 non-GAAP EPS of $1.76, versus consensus of $1.72.

Qlik Technologies (QLIK)

We are initiating coverage on Qlik Technologies ("QlikTech") with a Market Outperform rating and a $35 price target. QlikTech is the fastest growing company in our Big Data/Business Intelligence coverage universe and one of the fastest growing publicly traded software companies, with expected 2011 revenue growth of 41%. We like QlikTech because we believe it has a wide-open market opportunity and a strong value proposition, and, based on our survey of 16 customers, we believe the company will be able to exceed growth expectations. We look for 2011, 2012, and 2013 non-GAAP EPS of $0.30, $0.47, and $0.68 (versus consensus of $0.29, $0.44, and $0.63) on revenue growth of 42%, 28%, and 25%, respectively. Our $35 price target implies an EV/2013 revenue multiple of 5.6x, a modest premium to the high-growth software peer group.

Quest Software (QSFT)

We are initiating coverage on Quest Software with a Market Perform rating. Quest Software is a provider of enterprise systems management software products that has grown primarily via acquisition. We like Quest's ability to generate cash (with a TTM FCF yield of 10%) and its deep product portfolio. However, Quest's recent performance has been inconsistent, with EPS misses in three of the last six quarters and revenue misses in two of the last six quarters. While we believe the downside risk on this name is limited due to its valuation, we remain on the sidelines until we see more consistent execution. Quest Software trades at a 2013 P/E of 10x, versus the comp group at 11x. We look for 2011 non-GAAP EPS of $1.33, in line with consensus; 2012 non-GAAP EPS of $1.64, versus consensus of $1.65; and 2013 non-GAAP EPS of $1.90, versus consensus of $1.93.

Teradata (TDC)

We are initiating coverage on Teradata with a Market Outperform rating and a $63 price target. We like Teradata because it is the leading data warehousing vendor; we believe it stands to benefit from the Big Data trend more than any other software vendor; the competitive environment for Teradata is more benign than the conventional wisdom suggests; and we believe the company is well positioned to beat consensus expectations for 2012 and 2013. We look for 2011, 2012, and 2013 non-GAAP EPS of $2.29, $2.70, and $3.15, versus consensus of $2.25, $2.58, and $2.95. Our $63 price target represents an FY13 P/E multiple of 20x, in line with Teradata's 10-year average.

TIBCO Software (TIBX)

We are initiating coverage on TIBCO Software with a Market Outperform rating and a $33 price target. We like TIBCO because we believe it is a well-managed company that is growing the top line 21% and is committed to growing the bottom line 15-20% per year; it is a cash flow machine with a 10-year FCF CAGR of 24%; it is tapping a large market opportunity that we believe is getting even bigger as a result of the Big Data trend; it is well diversified across verticals and product areas; it has strong partnerships; and we believe it represents a good acquisition target. We look for 2011 non-GAAP EPS of $0.94 (consensus $0.94), 2012 non-GAAP EPS of $1.12 (consensus $1.11), and 2013 non-GAAP EPS of $1.32 (consensus $1.27) on revenue growth of 21%, 13%, and 11%, respectively, well above its comp group. Our $33 price target implies a 2013 P/E of 25x, in line with TIBCO's expected 2011 license growth rate and a premium to the peer group median of 15x.

JMP SECURITIES SOFTWARE RESEARCH TEAM

Greg McDowell

Vice President - Infrastructure Software

[email protected]

415-835-3934

Greg McDowell joined JMP Securities in December 2007 and serves as a Vice President in Equity Research covering Software.

Prior to joining JMP, Greg spent nine years at Oracle in various Account Executive and Finance positions. While at Oracle, Greg sold both Oracle's application solutions, including e-Business Suite and PeopleSoft solutions, and Oracle's core technology solutions, including database software, business intelligence software, and middleware software.

Greg received an MBA from the Smith School of Business at the University of Maryland and a BA from the University of California at Davis.

Patrick D. Walravens

Director of Technology Research

Senior Analyst, Software

[email protected]

415-835-8943

Pat Walravens joined JMP Securities in November 2001 and serves as Director of Technology Research and as a Senior Research Analyst covering Software.

Prior to joining JMP, Pat spent five years at Lehman Brothers, where he served as a Senior Research Analyst in the Equities Group and a Vice President in the Technology Investment Banking group. Previously, Pat practiced corporate law with the firm of Cooley Godward Kronish (now "Cooley LLP"), where he represented emerging technology companies and venture capital investors.

In 2007, Pat ranked among the top three analysts in the software industry for stock picking, according to StarMine results published in Forbes. Pat is frequently quoted and interviewed by the media, including the Wall Street Journal, Business Week, Forbes, CNBC, and Bloomberg.

Pat received an MBA from the Anderson School of Management at UCLA and a JD from the UCLA School of Law, where he received the Order of the Coif. Pat holds a BA in Economics from Stanford University, where he served as an undergraduate teaching assistant in the computer science department.

Peter Lowry

Associate

[email protected]

415-869-4418

Peter Lowry joined JMP Securities in June 2011 and serves as an Associate in Equity Research covering Software.

Prior to joining JMP, Peter had 15 years of experience as an Investment Banker, Private Banker, and CPA at top-tier firms such as PWC, Schroder, Lehman Brothers, UBS, Bank of America, Deutsche Bank, and Ion Partners. Peter worked with both corporate and private clients on finance issues across public accounting, corporate finance, capital markets, and private wealth management.

Peter has an MBA from Columbia University, an MS in Public Accounting from the University of Hartford, and a BA from Hamilton College.

We would also like to acknowledge the following gentlemen for their help with the Big Data project:

Praveen Chandran

Rishi Sharma

Alec Short

Vincent Song

Naga Surendran

Vijay Tennety

Julian Terkaly

JMP FACTS AND DISCLOSURES

Analyst Certification: The research analyst(s) who prepared this report does/do hereby certify that the views presented in this report are in accordance with my/our personal views on the securities and issuers discussed in this report. As mandated by SEC Regulation AC no part of my/our compensation was, is or will be directly or indirectly related to the specific views or recommendations expressed herein. This certification is made under the obligations set forth in SEC Regulation AC. Any other person or entity may not use it for any other purpose. This certification is made based on my/our analysis on the date of this report’s publication. I/We assume no obligation to update this certification to reflect any facts, circumstances or events that may subsequently come to my/our attention. Signed Greg McDowell, Patrick Walravens, Peter Lowry

Publicly Traded Companies Covered by JMP and Mentioned in This Report (as of November 15, 2011):

Company Disclosures

Actuate Corporation (1)

Adobe Systems, Inc. (1)

Bank of America Corp. (1)

Cisco Systems, Inc. (1)

Citrix Systems, Inc.

CommVault Systems, Inc. (1)

Cornerstone OnDemand, Inc. (1,3)

Demand Media, Inc. (1,3,5)

DemandTec, Inc. (1)

EMC Corporation (1)

Hewlett-Packard Company (1)

Informatica Corporation (1)

JDA Software Group Inc. (1)

MicroStrategy, Inc. (1)

Oracle Corporation (1)

Progress Software Corporation (1)

Qlik Technologies Inc. (1)

Rackspace Hosting, Inc. (1)

RealPage, Inc. (1,3)

Responsys, Inc. (1,3)

RightNow Technologies, Inc. (1,3)

SAP AG (1)

Symantec Corporation (1)

TIBCO Software Inc. (1)

Teradata Corporation (1)

Ultimate Software Group (1)

VMware, Inc. (1)

salesforce.com (1)

JMP Securities Disclosure Definitions: (1) JMP Securities currently makes a market in this security. (2) JMP Securities has received compensation for banking or other services rendered to this company in the past 12 months. (3) JMP Securities was manager or co-manager of a public offering for this company in the past 12 months. (4) JMP Securities participated as an underwriting or selling group member of a public offering by this company in the past 12 months. (5) JMP Securities and/or its affiliates have obtained a position of at least 1% in the equity securities of this company during the ordinary course of its/their business/investments. (6) An officer of JMP Securities is a director or officer of this company. (7) The analyst covering this company (as defined in NASD Rule 2711) or a member of the analyst's household has a financial interest in this company. (8) The analyst covering this company or a member of the analyst’s household serves as an officer, director, or advisory board member of this company. (9) The analyst covering this company has had discussions of employment with the company.

JMP Securities Investment Opinion Definitions: Market Outperform (MO): JMP Securities expects the stock price to outperform relevant market indices over the next 12 months. Market Perform (MP): JMP Securities expects the stock price to perform in line with relevant market indices over the next 12 months. Market Underperform (MU): JMP Securities expects the stock price to underperform relevant market indices over the next 12 months.

JMP Securities Research Ratings and Investment Banking Services: (as of October 3, 2011)

JMP Rating            Regulatory Equivalent   # Co's Under Coverage   % of Total   # Co's Receiving IB Services (Past 12 Months)   % of Co's With This Rating
Market Outperform     Buy                     207                     66%          58                                              28%
Market Perform        Hold                    105                     33%          7                                               7%
Market Underperform   Sell                    3                       1%           0                                               0%
TOTAL                                         315                     100%         65                                              21%

Stock Price Chart of Rating and Target Price Changes: Note: First annotation denotes initiation of coverage or 3 years, whichever is shorter. If no target price is listed, then the target price is N/A. In accordance with NASD Rule 2711, the chart(s) below reflect(s) price range and any changes to the rating or price target as of the end of the most recent calendar quarter. The action reflected in this note is not annotated in the stock price chart. Source: Jovus and JMP Securities. In order to obtain these (6 or more) stock price charts or additional applicable disclosures and information concerning JMP's recommendations of companies under coverage mentioned in this report, please contact JMP Securities at (877) 263-1333 or visit www.jmpsecurities.com.

JMP Disclaimer: JMP Securities LLC (the “Firm”) compensates research analysts, like other Firm employees, based on the Firm’s profitability, which includes revenues from the Firm’s institutional sales, trading, and investment banking departments as well as on the quality of the services and activities performed that are intended to benefit the Firm’s institutional clients. These data have been prepared by JMP Securities LLC for informational purposes only and are based on information available to the public from sources that we believe to be reliable, but we do not guarantee their accuracy or completeness. Any opinions and projections expressed herein reflect our judgment at this date and are subject to change without notice. These data are neither intended nor should be considered as an offer to sell or a solicitation or a basis for any contract for the purchase of any security or other financial product. JMP Securities LLC, its affiliates, JMP Group LLC, Harvest Capital Strategies LLC, and their respective partners, directors, officers, and associates may have a long or short position in, may act as a market maker for, or may purchase or sell a position in the securities mentioned herein. JMP Securities LLC or its affiliates may be performing, have performed, or seek to perform investment banking, advisory, or other services and may have acted as manager or co-manager for a public offering of securities for any company mentioned herein. The reader should assume that JMP Securities LLC will solicit business from the company covered in this report. © Copyright 2011. All rights reserved by JMP Securities LLC. JMP Securities LLC is a member of FINRA, NYSE Arca, NASDAQ, and SIPC.

JMP SECURITIES LLC

600 Montgomery Street, Suite 1100, San Francisco, CA 94111-2713, www.jmpsecurities.com

Peter V. Coleman, Director of Equity Research, (415) 869-4455

Financial Services
Capital Markets: David Trone (212) 906-3525; Steven Fu, CFA (212) 906-3548; Chris Ross, CFA (212) 906-3532
Consumer & Specialty Finance, Commercial Banks: John Hecht (415) 835-3912; Kyle M. Joseph (415) 835-3940
Financial Processing & Outsourcing: David M. Scharf (415) 835-8942; Kevane A. Wong (415) 835-8976
Insurance: Matthew J. Carletti (312) 768-1784; Christine Worley (312) 768-1786
Market Structure: David M. Scharf (415) 835-8942; Kevane A. Wong (415) 835-8976
Residential & Commercial Real Estate Finance: Steven C. DeLaney (404) 848-7773; Trevor Cranston, CFA (415) 869-4431

Healthcare
Biotechnology: Charles C. Duncan, PhD (212) 906-3510; Roy Buchanan, PhD (212) 906-3514; Jason N. Butler, PhD (212) 906-3505; Gena H. Wang, PhD (212) 906-3528; Liisa A. Bayko (312) 768-1785; Heather Behanna, PhD (312) 768-1795
Healthcare Facilities & Services: Peter L. Martin, CFA (415) 835-8904; Aaron Hecht (415) 835-3963
Healthcare Services: Constantine Davides, CFA (617) 235-8502; Tim McDonough (617) 235-8504
Medical Devices: J. T. Haresco, III, PhD (415) 869-4477

Real Estate
Hotels & Resorts: William C. Marks (415) 835-8944
Housing & Housing Supply Chain: Michael G. Smith (415) 835-8965
Land Development: Michael G. Smith (415) 835-8965
Real Estate & Property Services: William C. Marks (415) 835-8944
Real Estate Technology: Michael G. Smith (415) 835-8965
REITs: Healthcare: Peter L. Martin, CFA (415) 835-8904; Aaron Hecht (415) 835-3963
REITs: Office & Industrial: Mitch Germain (212) 906-3546

Technology
Clean Technology: Alex Gauna (415) 835-8998
Communications Equipment: Erik Suppiger (415) 835-3918
Semiconductors: Alex Gauna (415) 835-8998
Software: Patrick Walravens (415) 835-8943; Greg McDowell (415) 835-3934; Peter Lowry (415) 869-4418

For Additional Information

Mark Lehmann President, JMP Securities (415) 835-3908

Erin Seidemann Vice President, Publishing (415) 835-3970