Post on 28-Oct-2014

Six-Step Approach to Identify a Big Data Problem and Choose the Right Solution

Author : Manju Devadas

VP Solutions and Technology, Bodhtree [email protected] www.linkedin.com/in/manjudevadas

Many have heard the dire predictions about the state of information technology with 10x data

growth projections over the coming years. While there is truth to the exploding growth rate of

data and the accompanying complexity of analysis, we have faced similar exponential growth in

data over each of the most recent decades; and every time, technology has risen to the challenge

and delivered needed capacity for business, governments and individual users. For parallels we

need only look to distributed computing in the 1990s and websites in the 2000s.

In the 2010s, big data is a phenomenon nearly everyone comes into contact with, whether they

realize it or not. If you carry a smartphone for work or dump thousands of digital photos on your

home computer, you're already swimming in the Big Data ocean. You just may not have the

tools yet to capture, store and process that massive data flow for better decision-making.

The purpose of this paper is to demystify Big Data and provide a methodology to assess

whether the problems you encounter in your enterprise are Big Data problems. Working with

large companies and start-ups in Silicon Valley has allowed me to validate this methodology

in diverse business verticals and company sizes. If nothing else, this paper will help you act from

a position of knowledge surrounding Big Data, avoiding the hype and misinformation that

commonly accompany the latest technologies.

First, Our Ecosystem

Each time you press a key, rate a product or navigate a GPS map, you generate data in some

form. Of all this stored data, usually only a small portion is being analyzed to find new answers

to challenging questions.

If Samsung launches a new phone, the most unbiased and direct feedback today probably

comes from Facebook rather than traditional customer surveys and support lines. Book a flight

online today at Kayak.com or Expedia.com and switch to your inbox to find emails with Priceline

hotel recommendations in a matter of seconds. How the hell did Priceline know that I wanted a

3.5 star hotel in Monterey for the holiday weekend? With or without my knowledge, I allowed

them to capture my personal preferences and travel plans, analyze current web offerings using

Big Data, then email me recommendations.

The Australian government is using Big Data to analyze seismology patterns and predict

earthquakes precious minutes earlier. Big data analysis has found a role in vehicle

maintenance, predicting part failure; space exploration such as the Rover landings on Mars; and

fraud prevention, often identifying unauthorized purchases before a customer even realizes

their credit card is stolen.

The total computing power provided by an optimized Big Data system is capable of analyzing all

the data on every desktop in your neighborhood in less than 1 second. The digital image of a

hundred-year-old document could be retrieved from a city government’s archive databases in

less time than it takes to pull a book from the shelf. The evolving technologies in the Big Data

world are not only making this kind of analytical power possible, they are democratizing it

through approaches affordable even to the smallest businesses.

The analysis of large volumes of data at lightning speeds is great, but does it actually create real

value for people and businesses? Let’s start with your next promotion, which depends on the

successful launch of a new product line. Big Data becomes your cheat sheet for understanding

customers, allowing you to proactively analyze buying trends and marketing strategies over the

last ten years of similar launches. Or consider biometric monitoring that signals you to go to

the ER before you actually feel any symptoms. Or maybe your goal is enterprise efficiency in a

competitive industry, and you need to identify and eliminate bottlenecks in your supply chain.

Each of these challenges can be addressed with Big Data solutions.

Solving most business problems in large companies involves some form of data analysis. With

the data now being captured in all forms, including human-, environment- and

machine-generated, it is necessary to identify which problems are Big Data-related and which can be

solved using traditional data analysis techniques. The last thing management wants is to

purchase a new system only to realize existing tools were capable of achieving the same results.

(Remember the often-told, if apocryphal, story of NASA spending millions to invent a pen to write

in space when a pencil would have been an adequate solution.) Nothing is more wasteful in business than a great

solution in search of a nonexistent problem.

What exactly is Big Data?

Big Data is simply complex data sets in massive volumes (petabytes) and multiple formats (table

contents, text, audio, video). With the speed and amount of data being generated today, the

corresponding technology demand is driving new ways to analyze the information faster,

cheaper, and with better results.

Three types of data may be present in your enterprise:

a. Large volumes, e.g. data stored in database tables, Excel spreadsheets, Access databases

b. Unstructured data, e.g. video, audio, Facebook, Twitter, blogs, customer reviews, log files

c. ‘Gray’ data, e.g. web traffic whose exact usage is yet to be determined based on business needs that may arise

Enterprises with ever-increasing data volumes must take measures to better analyze these data

sets to accelerate progress toward company goals and objectives. Even if business seems fine

now, you may be ignoring this data at your peril since your competition could be using it to run

more efficiently, respond faster, and make better business decisions. As is so frequently the

case with technology, if you’re holding still, you’re falling behind.

Enterprises also need to have people who can think about data in new ways – not just

information stored in tables, rows and columns, but also data as blogs, videos, Facebook posts,

GPS coordinates, and traffic sensors. As of today, these ‘Data Scientists’ are difficult to train

internally; and the natural reaction is to look for outside hires. But hiring from the outside can

present its own set of challenges as these transplants may not bring the same understanding of

your business challenges and differentiators. I recommend that you begin with the employees

you already have and apply the methodology outlined below toward creating an effective Big

Data strategy and roadmap.

How do I know if my problem is a Big Data Problem?

Without delving into details about the nature of the business challenge and existing sources of

data, it is difficult for anyone to determine for sure if the problem is a Big Data problem. A

Fortune 100 high-tech company in San Jose, California, paid us to fix what they labeled a Big

Data problem. Following the initial analysis, we concluded the problem was best solved with

traditional data analysis techniques rather than a Big Data implementation. We educated the

customer about the unique characteristics of a Big Data problem, and saved the team

substantial money since their existing tools were adequate to solve the issues. Hence, even

though no general model can substitute for a thorough hands-on analysis, the simple

methodology we outline below has been highly effective at quickly determining whether a

challenge is Big Data-related.

Quite a few companies see Big Data as a concern only for web product companies like Facebook

and Google with petabytes of data to organize and process. However, a 2011 McKinsey Global

Institute study argues otherwise. The McKinsey report found that investment firms averaging

fewer than 1,000 employees have 3.8 petabytes of data stored, a data growth rate of 40 percent

per year and a mix of structured, semi-structured and unstructured data types. Overall,

McKinsey found that companies in 15 of 17 U.S. industry sectors have more data stored per company than the

U.S. Library of Congress (which currently has 235 terabytes of data) and companies from all

sectors have at least 100 terabytes stored, as shown in Figure 1:
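
To put the 40 percent annual growth rate in perspective, the short Python sketch below compounds the 3.8-petabyte figure forward. It assumes the rate stays constant from year to year, which the McKinsey figures do not promise, and the function names are illustrative only.

```python
import math


def years_to_grow(factor: float, annual_rate: float) -> float:
    """Years needed for a data volume to grow by `factor` at a constant compound rate."""
    if factor <= 0 or annual_rate <= 0:
        raise ValueError("factor and annual_rate must be positive")
    return math.log(factor) / math.log(1.0 + annual_rate)


def projected_volume(start_pb: float, annual_rate: float, years: int) -> float:
    """Volume in petabytes after `years` of compound growth (constant rate assumed)."""
    return start_pb * (1.0 + annual_rate) ** years


if __name__ == "__main__":
    # 3.8 PB and 40 percent per year are the figures quoted in the text above.
    for year in range(6):
        print(year, round(projected_volume(3.8, 0.40, year), 1), "PB")
    print("years to 10x:", round(years_to_grow(10, 0.40), 1))
```

At a steady 40 percent, stored data doubles roughly every two years and reaches ten times its starting size in a little under seven.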

Big Data Solution classification:

There are five data conditions called the “Vs” that assist in defining a Big Data problem:

1. Volume, e.g. multiple petabytes of data

2. Velocity, e.g. results need to be analyzed in seconds or less

3. Variety, e.g. structured and unstructured data like social media posts and video files

4. Variability, e.g. constantly changing like a stock market

5. Value, e.g. you’ve identified the clear business value you plan to derive from the data
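
The five conditions can be turned into a simple screening checklist. The Python sketch below is one hypothetical way to record the answers; it assumes a problem qualifies only when all five Vs are present, and the class and function names are invented for illustration.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class FiveVs:
    """Yes/no answers to the five screening conditions listed above."""
    volume: bool       # e.g. multiple petabytes of data
    velocity: bool     # results needed in seconds or less
    variety: bool      # structured and unstructured data mixed
    variability: bool  # data that changes constantly
    value: bool        # clear business value identified


def missing_vs(profile: FiveVs) -> list[str]:
    """Names of the conditions the problem does not meet."""
    return [f.name for f in fields(profile) if not getattr(profile, f.name)]


def is_big_data_problem(profile: FiveVs) -> bool:
    """Assumption: all five conditions must hold for a Big Data problem."""
    return not missing_vs(profile)


if __name__ == "__main__":
    # Hypothetical profile: a large warehouse holding only structured data.
    warehouse = FiveVs(volume=True, velocity=True, variety=False,
                       variability=True, value=True)
    print(is_big_data_problem(warehouse), missing_vs(warehouse))
```

Listing the missing conditions, rather than returning a bare yes or no, shows which gap points toward traditional data analysis tools.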

What does all this mean?

The relentless growth of data, new data formats to deal with, and the competitive advantages

achieved from managing large volumes of data all emphasize why Big Data should matter to

you. If you are an IT professional, you already recognize how difficult it can be to find a solution

capable of handling a task as monumental as big data management. Whether you are looking

for growth, profitability or productivity in your organization, you are invariably dealing with

data; and when that data shows the 5 V characteristics, you now need to start thinking of it as a

Big Data problem and approach it differently from problems suited to traditional solutions.

How do you get started?

Many enterprises fail to implement a Big Data solution because they have not identified

clear business cases for the tools. The common trigger to initiate Big Data development is a

data blast that existing systems can no longer manage. As these datasets continue to grow in

size, the enterprises face the problem of managing, storing and processing the data at the

speed required for timely business response.

Below is Bodhtree’s six-step process to take enterprises from Big Data Problem

Definition to Solution Implementation, a methodology which has been applied with excellent

results at a large Bay Area networking company and several other Bodhtree customer locations:

Bodhtree six-step process from Big Data problem definition to solution delivery:

Step 1: Understand the Use Case

Depending on where you reside in the organization, the chances are high that you will

first feel a sense of data overload before you can articulate a clear business case to

leverage that data. Often this prompts enterprises to reactively implement a Big Data

solution without deciding in advance what problems it will be used to resolve.

It is critical that you deep dive and understand the business case first before even

thinking along the lines of Big Data. Otherwise, there will be a lack of focus that feels a

little like staring through a microscope at unintelligible detail without ever stepping back

to see what specimen is sitting on the glass. In terms of IT, one Bodhtree client

managing a large warehouse of customer, product and geography information with hundreds

of terabytes of data said he had a Big Data use case, but everything he spoke about

involved only structured data, failing the 5 Vs test. Even before worrying about Big

Data, do a litmus test by asking the following questions:

• Business Case – Do I understand the value of solving the problem at hand? Can I

quantify the potential value of a Big Data solution or at least articulate the

qualitative benefits?

• Dependencies – Have I collected all the relevant information about the customer,

install base etc.?

• Complexity – Have I inventoried the data sources and characteristics to

determine the complexity?

• Lead Time – Have I created a reasonable plan with adequate time to acquire

relevant hardware and data?

• Initiative Alignment – Is the project aligned with corporate objectives and are

project sponsors committed to the end-to-end process?

Step 2: Understand the Current Landscape

• Carefully analyzing the use cases defined in step 1 enables you to identify all data entry

and storage points. This review often uncovers critical data entry points that were not

recognized initially.

• Map the end-to-end process and data flows for the business capabilities, e.g. How does

the data flow to you from the customer and among internal teams?

• Build a Reference Architecture to highlight the current systems and tools and their

readiness for Big Data. Validate that you have access to the data you plan to analyze.
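
A lightweight inventory can support this step. The Python sketch below records each data source, what it feeds, and whether access has been confirmed; the source names and fields are hypothetical placeholders, not a Bodhtree template.

```python
from dataclasses import dataclass, field


@dataclass
class DataSource:
    name: str
    structured: bool                                 # tables vs. logs, video, posts
    feeds: list[str] = field(default_factory=list)   # downstream systems it sends data to
    access_confirmed: bool = False                   # has the team verified it can read this?


def entry_points(sources: list[DataSource]) -> list[str]:
    """Sources that nothing else feeds, i.e. where data first enters the landscape."""
    fed = {target for s in sources for target in s.feeds}
    return [s.name for s in sources if s.name not in fed]


def unconfirmed_access(sources: list[DataSource]) -> list[str]:
    """Sources whose access still has to be validated before analysis."""
    return [s.name for s in sources if not s.access_confirmed]


# Hypothetical landscape, for illustration only.
INVENTORY = [
    DataSource("web_orders", structured=True, feeds=["crm"], access_confirmed=True),
    DataSource("support_logs", structured=False, feeds=["warehouse"]),
    DataSource("crm", structured=True, feeds=["warehouse"], access_confirmed=True),
    DataSource("warehouse", structured=True, access_confirmed=True),
]

if __name__ == "__main__":
    print("entry points:", entry_points(INVENTORY))
    print("access not yet validated:", unconfirmed_access(INVENTORY))
```

Keeping the flows as explicit links makes it easy to spot entry points that were overlooked and sources that nobody has confirmed access to.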

Step 3: Build a Blueprint

• Define your overarching architectural challenges in doing the Big Data analysis defined in

the use cases, e.g. What architecture will I need to store the customer install base

information along with product information?

• Identify the right high-level Big Data solutions, leveraging technology-agnostic vendors

and advisors.

• Document a clear delta between As-Is & To-Be with the introduction of the Big Data

solutions while addressing the pain points at each transition phase.

• Document the Risks & Dependencies that could impact business results, cost or

schedule. Remember, rolling out sophisticated tools does not guarantee success. Watch

out for hidden landmines, e.g. Data Quality.

Step 4: Identify the Big Data Technologies

• Deep dive into the Big Data technology dependencies and the impacts they have on the

system/tools and organizations. For example, you might consider how Hadoop adoption

overlaps with your Business Objects installation to analyze the customer and product

data.

• Determine which users will be consuming the information and analysis. What formats

do these reports need to be in? Do they require mobile interfaces? Current BI reports

and subscribers often provide relevant insight to these questions.

Step 5: Build a Big Data Roadmap

• Avoid the traps of either over-investing or under-investing – have the business cases

drive the solution.

• Plan the roadmap for your Big Data rollout based on such factors as:

• Business priority and management support. Remember, your execs may need to

be educated in order to understand the relative business value offered by each

phase.

• Timeframe of expected results and ROI.

• Big Data technology complexities, e.g. apply the right order to ensure a clean

data foundation before conducting analytics.
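
One way to combine these factors is to order phases so that every dependency, such as a clean data foundation, comes first, and business priority decides among the phases that are ready. The Python sketch below illustrates that idea with the standard-library graphlib module; the phase names, priority scale and dependencies are hypothetical.

```python
import heapq
from dataclasses import dataclass, field
from graphlib import TopologicalSorter  # standard library, Python 3.9+


@dataclass
class Phase:
    priority: int                                         # 1 = highest business priority (assumed scale)
    depends_on: list[str] = field(default_factory=list)   # phases that must finish first


def plan_rollout(phases: dict[str, Phase]) -> list[str]:
    """Order phases so dependencies come first; among ready phases, highest priority wins."""
    sorter = TopologicalSorter({name: p.depends_on for name, p in phases.items()})
    sorter.prepare()  # raises graphlib.CycleError if dependencies are circular
    ready: list[tuple[int, str]] = []
    order: list[str] = []

    def collect_ready() -> None:
        # Queue every phase whose dependencies have all been completed.
        for name in sorter.get_ready():
            heapq.heappush(ready, (phases[name].priority, name))

    collect_ready()
    while ready:
        _, name = heapq.heappop(ready)
        order.append(name)
        sorter.done(name)
        collect_ready()
    return order


# Hypothetical roadmap, for illustration only.
ROADMAP = {
    "customer_analytics": Phase(priority=1, depends_on=["data_cleansing"]),
    "data_cleansing": Phase(priority=2),
    "log_ingestion": Phase(priority=3),
    "mobile_dashboards": Phase(priority=4, depends_on=["customer_analytics"]),
}

if __name__ == "__main__":
    print(plan_rollout(ROADMAP))
```

Here the top-priority analytics phase still waits for data cleansing, which reflects the advice to secure the data foundation before conducting analytics.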

Step 6: Big Data Solution Rollout

• Formalize the right team, experienced in conducting multiple implementations.

• Divide scope items across multiple phases/releases to track progress and provide

important quality checkpoints.

• Document the business requirements, functional analysis and solution architecture.

• Begin user training before the implementation is complete so analysts can immediately

realize business value, building momentum for expanded uses.

Conclusion

Big Data, by its very nature, contains endless possibilities for business insight and improved

operations. But much like venturing into space without a defined mission, the Big Data world

demands that businesses clearly define what they intend to achieve in advance. Otherwise

enterprises can spend substantially on fancy tools that may never happen upon real business

benefit.

Once those business goals are defined, and you have captured a clear picture of the current

state of your data, apply the 5 Vs screening questions to determine if the problem truly

warrants a Big Data solution. An objective vendor that specializes in a broad cross section of BI

and Big Data solutions can assist in this process and advise solutions that maximize your ROI.

Upon identifying a Big Data problem, carefully proceed through the six steps of the Problem to

Solution Methodology. Realize that the real value of Big Data solutions does not come simply with

implementation but through applying creative and insightful approaches to harvesting business

value from the data. Ensure all dependencies are considered so that your data foundation is

clean, comprehensive and current. Finally, proceed with the implementation, highlighting

“quick wins” to convey business value to execs and analysts, building momentum for the full

implementation.

If the above methodology is applied correctly, you will save time and energy and

achieve better results by applying the right Big Data prescription for a REAL Big Data

problem.

Contributors : Ryan Madsen, Sushanth Reddy

References :

• McKinsey Global Institute study

• Bodhtree Customer Case Studies