Moving to Hadoop, Smart

MEET SARA

Sara is the Director of Data Warehousing at Acme, Inc. Sara’s job is to deliver analytics to business users within the constraints of a limited IT budget. On one hand, she has thousands of end users in many departments doing all types of business intelligence. On the other hand, she has a budget that is decreasing each year while the data, queries, and processing are all increasing. Enter Big Data. What was a manageable problem just blew up. The business users can now access a hundred times more data, and want more and different BI on new tools and devices. Great for them. Until Sara has to say no.

Sara was on a treadmill at increasing speed. BI performance expectations from users did not let up. Data sizes kept increasing. To keep up, Sara was forced to use high-end, expensive technology: relational database software on-premise, often with proprietary hardware. Sara tried to keep all new data coming into her well-managed platform. But to no avail. Many databases sprang up that business users ended up relying on Sara to manage and support.

Sara’s story is the norm. Many databases surrounding the central data warehouse. High-cost technologies required. Data sizes increasing faster than anyone thought possible when setting the budgets. Annual budgets decreasing. Users expecting the same or better performance.

ENTER HADOOP

If the size of data is increasing exponentially, then Sara needs a technology that costs exponentially less for storage and processing of analytics. That is Hadoop. Hadoop’s open-source delivery model makes the software inexpensive. Its ability to run on commodity processors makes the hardware inexpensive. The beauty for Sara is that she can migrate data and analytic workloads from her existing expensive databases to Hadoop. But how?

Sara has heard that some workloads don’t lend themselves to Hadoop. That Hadoop is not for low-latency or real-time queries. Or that, because its ‘schema-on-read’ approach applies a data model only when the data is read, some of her highly tuned queries may take longer to run on Hadoop. Sara knows that there’s a lot of data in her data warehouses that is not being used, but she doesn’t know which data it is. She’d like to get that data over to Hadoop as soon as possible. Too many questions. Sara did not have hard data on who was using what, when, how often, how expensive it was, and where to start. Sara had no plan. And no way of making a plan.

ENTER APPFLUENT

Appfluent software is designed to give Sara a Hadoop migration plan. How? Appfluent answers all Sara’s questions about what is happening in her analytic environments. Appfluent connects to Sara’s existing high-end analytic databases (like Oracle, IBM, Netezza and Teradata) and lets her know:
• what data has not been used or is used infrequently
• whose queries are the most ‘Hadoop-able’
• which data sets are used in batch
• which data loads are exceeding the batch window
• which users are the most expensive

Now Sara has a tool that can answer her questions about what to move into Hadoop and when, saving the most money, and not impacting end users.

Appfluent’s engine is a set of distributed processes that continuously and nondisruptively watch, store, and analyze all queries generated by users and applications against one or more data warehouses. It includes a web application that provides out-of-the-box reports and analytics designed to identify what is happening with users, analytics, data, tables, columns, views, and costs. This enables IT to be smarter in areas like security, auditing, and information life cycle management.

MEET SARA…AGAIN

Sara is happy. She has left the Hadoop ‘sandbox’ and is using production Hadoop side-by-side with her higher-end warehouse. Sara has given Acme options: to pocket the capital saved by optimizing the data platform, or redeploy the capital on new ways to generate analytic insights.
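As an illustration of the usage questions listed under ENTER APPFLUENT (which data is unused, which users are most expensive), here is a minimal Python sketch. It assumes a captured query log is already available as simple records. The `QueryRecord` shape, the function names, and the 90-day idle threshold are hypothetical and chosen for illustration only; they do not describe Appfluent's actual software or data model.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class QueryRecord:
    """One captured query (hypothetical shape, not Appfluent's schema)."""
    user: str
    table: str
    run_date: date
    cpu_seconds: float


def unused_tables(all_tables, log, today, max_idle_days=90):
    """Return tables with no query in the last `max_idle_days` days."""
    cutoff = today - timedelta(days=max_idle_days)
    recently_used = {rec.table for rec in log if rec.run_date >= cutoff}
    return sorted(set(all_tables) - recently_used)


def cost_by_user(log):
    """Return (user, total CPU seconds) pairs, most expensive first."""
    totals = defaultdict(float)
    for rec in log:
        totals[rec.user] += rec.cpu_seconds
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Made-up sample data for demonstration only.
    today = date(2013, 8, 1)
    log = [
        QueryRecord("ann", "sales", date(2013, 7, 30), 120.0),
        QueryRecord("bob", "sales", date(2013, 7, 15), 30.0),
        QueryRecord("ann", "clickstream_2010", date(2012, 1, 5), 400.0),
    ]
    tables = ["sales", "clickstream_2010", "returns"]
    print("Migration candidates:", unused_tables(tables, log, today))
    print("Cost by user:", cost_by_user(log))
```

Tables that show up as unused or rarely used would be natural first candidates for moving to Hadoop, since moving them is least likely to affect end users.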
APPFLUENT
www.appfluent.com

Moving to Hadoop, Smart
THOUGHT LEADERSHIP SERIES | AUGUST 2013
Sponsored Content
Shawn Dolley, VP, Corporate Development & Strategy, Appfluent

Upload: appfluent-technology

Post on 22-Apr-2015


DESCRIPTION

Looking for a technology that costs significantly less to store and process analytics? Then you need Hadoop. Learn more about the benefits and savings of Hadoop. Authored by Shawn Dolley, VP, Corporate Development & Strategy, Appfluent
