big data analytic platform

What are you building? A Platform or an IDE Two approaches in big data analytics

Upload: jesse-wang

Post on 15-Apr-2017


Category:

Data & Analytics



TRANSCRIPT

Page 1: Big data analytic platform

What are you building?

A Platform or an IDE

Two approaches in big data analytics

Page 2: Big data analytic platform

Intro: Big Data Analytics is In

The biggest difference between traditional enterprises and internet-age enterprises lies in data: everything is being converted to data. Competition between companies is really competition in how well they use their data.

[Comment, Neel Master] How can we make this stronger? It's not just that an app uses xPatterns, but EVERY true big data analytics capability across the pipeline must use xPatterns, not some customer-made app.

Page 3: Big data analytic platform

Data as Capital

In an age when everything is being digitized, data is everywhere, and winning in the virtual world translates directly into competitive advantage in the real world

How to properly value and utilize data is the key to capitalizing on big data

Page 4: Big data analytic platform

Steps to Achievement

Data Collection
• Identify datasets
• Collect data
• Join data

Basic Analytics
• Understand data
• Clean, enrich, transform data
• Simple discovery

Experimentation
• Apply data science
• Learn more about data
• Build and test models

Model Application
• Apply various models
• Get feedback, refine models

How to win at Big Data Analytics

Page 5: Big data analytic platform

Types of Users

Decision makers

Line-of-business workers

Business analysts

Data scientists / engineers

Domain experts

Page 6: Big data analytic platform

What needs to be built?

Data → Information → Knowledge → Wisdom

Applications to convert raw data into wisdom

Page 7: Big data analytic platform

Two approaches

Applications

Analytic Platform

Integrated Development Environment

Page 8: Big data analytic platform

Do you have a big data analytics platform?

Many companies claim they have one

Do they really have a true platform, or are they building custom applications every time?

Page 9: Big data analytic platform

What is a Platform

In general, a platform is a piece of computer software designed to support applications: it provides fundamental functions, and the applications built on it obey its constraints and make use of its facilities

Different abstraction levels: hardware, OS, Runtime, Web, Cloud, Analytics …

A big data analytics platform allows people to build apps out of components that are hosted or supplied by the platform provider, with specific protocols linking them together

Page 10: Big data analytic platform

Features of a Computing Platform

Enables quick development of custom apps by providing prebuilt functionalities (not tools or usability enhancements)

Components can be independently applied and can communicate with each other, often with proprietary semantics and protocols

Usually results in lock-in due to data or protocol specifications, making it hard to move apps away
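The idea of prebuilt components linked by a shared protocol can be sketched in a few lines. This is only an illustration, not any real product's API: the `Platform` class, the component names, and the "list of dict records in, list of dict records out" protocol are all assumptions made for this example.

```python
from typing import Callable, Dict, List

# Hypothetical protocol (an assumption for this sketch): every component
# takes a list of dict records and returns a list of dict records.
Record = Dict[str, object]
Component = Callable[[List[Record]], List[Record]]


class Platform:
    """Toy platform: hosts named components and chains them into apps."""

    def __init__(self) -> None:
        self._components: Dict[str, Component] = {}

    def register(self, name: str, component: Component) -> None:
        self._components[name] = component

    def build_app(self, *names: str) -> Component:
        # Fail early if the app asks for a component the platform does not host.
        missing = [n for n in names if n not in self._components]
        if missing:
            raise KeyError(f"unknown components: {missing}")
        steps = [self._components[n] for n in names]

        def app(records: List[Record]) -> List[Record]:
            # Run each component in order; the shared record format links them.
            for step in steps:
                records = step(records)
            return records

        return app


def drop_empty(records: List[Record]) -> List[Record]:
    """Prebuilt component: remove records with no text."""
    return [r for r in records if r.get("text")]


def add_length(records: List[Record]) -> List[Record]:
    """Prebuilt component: enrich each record with its text length."""
    return [{**r, "length": len(str(r["text"]))} for r in records]


platform = Platform()
platform.register("drop_empty", drop_empty)
platform.register("add_length", add_length)

# A "custom app" is just a composition of hosted components.
app = platform.build_app("drop_empty", "add_length")
result = app([{"text": "hello"}, {"text": ""}])
```

The lock-in mentioned above shows up even in this toy: an app built this way only runs where components that speak the same record protocol are available.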

Page 11: Big data analytic platform

Examples of a Platform

Intel, Microsoft Windows (Wintel)

Adobe AIR, Apple iOS, …

Java Platform (J2EE etc.), .NET Framework

Facebook/Twitter …

WordPress

Page 12: Big data analytic platform

What is an IDE

An Integrated Development Environment (IDE) is a software application that provides comprehensive facilities to developers for software development. An IDE normally contains a code/script editor, build automation, a file/item browser, and a debugger (profiler/monitor).

An IDE normally offers features such as a GUI, MDI, and RAD, and supports code generation, automated execution (deployment), and revision control…

Modern features include intelligent code completion, visual browser, workflow manager and other productivity features

Page 13: Big data analytic platform

Examples of IDE

Microsoft Visual Studio, Delphi

Eclipse, IntelliJ IDEA, PyCharm

Xcode

WebStorm

Cloud9

Page 14: Big data analytic platform

Similarities Between a Platform and IDE

Both are software providing facilities to their users

Both can enable faster application development

Page 15: Big data analytic platform

Differences between Platform and IDE

Platform

Providing facilities in the form of functional components

Faster development speed via pre-packaged functionalities

Allowing users to build applications only with its functions*; users can work in multiple IDEs

Resulting in lock-in of applications

IDE

Providing facilities in the form of usability improvements

Faster development speed via streamlined operations

Allowing users to develop only within its environment; can support multiple platforms

Resulting in lock-in of project files

* Most platforms allow calling external components, but these still need to fit within the platform's constraints

Page 16: Big data analytic platform

Why Not IDE

An IDE can help one type of user, most likely data scientists or software engineers

These users are usually not the majority of users in the company

The ROI is mainly usability: low

Vs. a platform that produces applications, which can multiply productivity

Page 17: Big data analytic platform

Why Platform

High reusability → Decreased time and cost to market

Supporting more customers → Higher value for customers

Built-in flexibility → Faster application development time

Component Marketplace (AppStore) → Lower support cost, enables third-party contributions

Page 18: Big data analytic platform

Five-star Data

Accessible

Parse-able (structurizable)

With shared metadata

Identifiable

Connected with relations
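One way to make the five properties checkable is a small profile object per dataset. This is a hedged sketch: the slide does not say how stars are awarded, so counting one star per satisfied property is an assumption here, as are the `DatasetProfile` name and its field names.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass(frozen=True)
class DatasetProfile:
    """Hypothetical checklist mirroring the five properties on this slide."""

    accessible: bool        # the data can be reached at all
    parseable: bool         # it can be given structure
    shared_metadata: bool   # its metadata is shared with other datasets
    identifiable: bool      # records or entities can be identified
    connected: bool         # it is linked to other data through relations

    def stars(self) -> int:
        # Assumption: one star per satisfied property, in no particular order.
        return sum(bool(getattr(self, f.name)) for f in fields(self))

    def missing(self) -> List[str]:
        # Names of the properties this dataset does not yet satisfy.
        return [f.name for f in fields(self) if not getattr(self, f.name)]


# Example: a raw log feed that is reachable and parseable but not yet linked.
logs = DatasetProfile(
    accessible=True,
    parseable=True,
    shared_metadata=False,
    identifiable=True,
    connected=False,
)
print(logs.stars(), logs.missing())
```

A profile like this gives a data steward a concrete to-do list for raising a dataset's rating.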

[Comment, Neel Master] What is the interface for the programmer to use these data connectors? How do they implement it? What rules must they follow?

Page 19: Big data analytic platform

Four Pillars

Knowledge-base: domain expertise, rules, data, metadata, etc.

Semantic data management system: manage all software artifacts including data sources, datasets, projects, users…

Function modules: parsers, algorithms, visualization modules, transformers, models, … to build apps

Infrastructure support: connect to proper infrastructure to run all the things

Page 20: Big data analytic platform

Custom Application Development Workflow

Example Application: HR Insights

Start with requirements and goals:

Overview of the whole company's employee hours, time spent on which apps/sites, sentiments, average time to respond to email/requests, and models to predict performance or attrition

The goals contain specific details on standards, conditions, environment, resources, and even methodologies

Page 21: Big data analytic platform

HR Insights Workflow Step #1

Find or Create Goals
• Find similar goals
• If none exist, specify details such as working hours, email/request response time, mapping out natural workgroups via communication patterns …

Collect data
• Select from the list of known Sources and Datasets
• Or create new sources or datasets if necessary

Page 22: Big data analytic platform

HR Insights Workflow Step #2

Perform Ingestion, Pre-processing (Parsing), and Instant Analytics (basic stats and other quick insights)

To help understand the data better for further steps

Perform transformation to get more targeted datasets

SMEs can run a set of existing tools (apps, models, transformations) to get more insights

Including enriching, filtering, and linking to other datasets, repeating as needed
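As a rough illustration of this step, the sketch below ingests a small CSV, parses it, computes instant statistics, and applies one filtering transformation. The sample rows, the column names, and the helper functions are invented for the example; a real platform would supply these as prebuilt components.

```python
import csv
import io
import statistics

# Hypothetical sample data for the HR Insights example (not from the slides).
RAW = """employee,team,response_minutes
ana,sales,12
bo,sales,30
cy,support,
di,support,18
"""


def ingest(text):
    """Ingestion + parsing: turn CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def instant_stats(rows, column):
    """Instant analytics: basic stats for one numeric column, skipping blanks."""
    values = [float(r[column]) for r in rows if r[column].strip()]
    return {
        "rows": len(rows),
        "missing": len(rows) - len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
    }


def filter_rows(rows, column, value):
    """Transformation: keep only rows where `column` equals `value`."""
    return [r for r in rows if r[column] == value]


rows = ingest(RAW)
stats = instant_stats(rows, "response_minutes")
support = filter_rows(rows, "team", "support")
print(stats)
print([r["employee"] for r in support])
```

The missing-value count is the kind of quick insight that tells an SME whether the dataset needs cleaning or enrichment before the next pass.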

Page 23: Big data analytic platform

HR Insights Workflow Step #3

Create Analytic Pipeline Requests to solicit data science experts

Data scientists can start doing experimentation and build models

Models are reviewed and published as applications

End users can benefit from new models/capabilities
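The hand-offs in this step can be read as a simple lifecycle for a pipeline request. The sketch below is one possible reading, not the presenter's design: the stage names, the strictly linear order, and the `PipelineRequest` class are assumptions for illustration.

```python
from enum import Enum


class Stage(Enum):
    # Hypothetical stage names derived from the four bullets on this slide.
    REQUESTED = "requested"          # analytic pipeline request created
    EXPERIMENTING = "experimenting"  # data scientists experiment and build models
    IN_REVIEW = "in_review"          # models are reviewed
    PUBLISHED = "published"          # published as an application for end users


# Assumption: the lifecycle is strictly linear with no step skipped.
_NEXT = {
    Stage.REQUESTED: Stage.EXPERIMENTING,
    Stage.EXPERIMENTING: Stage.IN_REVIEW,
    Stage.IN_REVIEW: Stage.PUBLISHED,
}


class PipelineRequest:
    """Tracks one request as it moves through the stages."""

    def __init__(self, title):
        self.title = title
        self.stage = Stage.REQUESTED
        self.history = [self.stage]

    def advance(self):
        # A published request has no further stage to move to.
        if self.stage not in _NEXT:
            raise ValueError(f"{self.title!r} is already {self.stage.value}")
        self.stage = _NEXT[self.stage]
        self.history.append(self.stage)
        return self.stage


request = PipelineRequest("attrition model")
request.advance()  # a data scientist picks it up
request.advance()  # a model is submitted for review
request.advance()  # the reviewed model is published
print([s.value for s in request.history])
```

Keeping the history on the request makes it easy to show end users where a requested capability currently stands.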

Page 24: Big data analytic platform


Thank you