HOW TO SELL AN AZURE DATA LAKE PROJECT FOR YOUR ORGANIZATION’S BENEFIT Presented by: Victor Karamalis TTI Corp.

Post on 22-May-2020


Page 1: HOW TO SELL AN AZURE DATA LAKE PROJECT FOR YOUR … · NOSQL/MS-SQL 2. What is a Data Lake? Explanation of Azure Data Lake Storage GEN 2 3. USE CASE SCENARIO 4. AZURE Data Lake IaaS

HOW TO SELL AN AZURE DATA LAKE PROJECT FOR YOUR ORGANIZATION’S BENEFIT

Presented by:

Victor Karamalis

TTI Corp.


WHO I AM

20 years across a broad range of sectors in Information Technology Services

Education & Affiliations:

Master of Science in Management & Systems (NYU)

Project Management Professional (PMI.ORG)

Data Management International (DAMA.ORG)

Fellow, Royal Society of Arts, Manufactures & Commerce (thersa.org)

Large Scale Artificial Intelligence Projects with Multi-National Companies

System and Data Integrations in Enterprise ERP & IIoT

Innovative Proof of Concepts (PoC) with Formal Sponsor Support

Product Management with multiple global teams

Past Contributor in Leading Silicon Valley Tech Blogs


WHAT WE WILL COVER

1. DATA LAKE DESIGN

LEAN

DATA GOVERNANCE

MACHINE LEARNING

NOSQL/MS-SQL

2. What is a Data Lake? Explanation of Azure Data Lake Storage GEN 2

3. USE CASE SCENARIO

4. AZURE Data Lake IaaS VS. PaaS

1. IaaS

2. PaaS

5. EXAMPLE IaaS Architecture

6. DEMONSTRATION BASED ON BASIC ACCOUNT SUBSCRIPTION

7. LESSONS LEARNED


THE ROSETTA STONE @ THE BRITISH MUSEUM


YOUR ORGANIZATION

AS A ‘LEAN STARTUP’

“Somebody has a theory about what’s going to work and what the benefit will be. We don’t measure it. We don’t actually see if it did what we thought it was going to do. And we keep doing it. And then it doesn’t work, so we do something else. And then we layer on program after program that doesn’t actually meet its objectives. And if we actually brought in the mind-set that said, “No, actually we’re going to figure out if we actually accomplish what we set out to accomplish; and if we don’t, we’re going to change it,” that would be huge.”

-Eric Ries, Lean Startup


DATA MANAGEMENT LESSONS

Data Governance: must support business strategy and goals. An organization’s business strategy and goals inform both the enterprise data strategy and how data governance and data management activities need to be operationalized in the organization.

Must contribute to the organization by identifying and delivering on specific benefits

Formalized via Project Charter

Enterprise Data Architecture: Enterprise Data Model (EDM)

Data Flow Design

Maintain compliance throughout data lifecycle

HIPAA

GDPR

DPA UK

PIPEDA (Canada)


MACHINE LEARNING IN A NUTSHELL

Requires Data Scientists to teach the system how to learn

Good performance is difficult or infeasible using traditional programming techniques

The complete logic or formula to implement the solution is not known or does not currently exist

Significant data size to compute

Business Questions Answered

Which products are likely to be bought together? Collaborative Filtering

How much, or what will be the number of...? Regression

Who are my best customers? Clustering

What will be the price of a stock in a month? Gradient Boosted Tree

Is fraud occurring? Decision Tree

Is that image a known intruder? Support Vector Machine (a supervised learning method)
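To make one of these question-to-technique pairings concrete, here is a minimal sketch of the "How much?" case using ordinary least-squares regression in plain Python. The monthly sales figures and the names `fit_line` and `predict` are made up for illustration; a real project would more likely use a library such as scikit-learn or Azure Machine Learning.

```python
# Minimal least-squares sketch for the "How much?" style of question.
# The data below is invented illustration data, not from the presentation.

def fit_line(xs, ys):
    """Return (intercept, slope) of the ordinary least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

months = [1, 2, 3, 4]
units_sold = [10.0, 12.0, 14.0, 16.0]
intercept, slope = fit_line(months, units_sold)

def predict(month):
    """Estimate units sold for a given month from the fitted line."""
    return intercept + slope * month

print(predict(5))  # 18.0 for this toy data
```

The point for a business audience is that the "logic" (the slope and intercept) is derived from the data rather than written by hand.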


AI VS. ML VS. DL

EXAMPLE OF RECOGNIZING A PICTURE

Artificial Intelligence

Requires programmers to write all the code required for a computer to recognize a picture of an object (e.g. a cat).

Machine Learning

Requires data scientists to teach the system how to learn what a cat looks like by feeding images and correcting its analysis until the system becomes accurate.

DEEP LEARNING

Divides the task of recognizing an object into different layers:

1st layer of the algorithm learns to recognize one cat body part

2nd layer learns another cat body part

Final layer connects the previous layers
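The layered idea can be sketched as a toy example. This is only an illustration of how layers compose: the weights and thresholds are hand-set rather than learned, the 4-value input is a hypothetical stand-in for an image, and the "ears" and "whiskers" detectors are invented names.

```python
# Toy layered model. Real deep learning learns these detectors from data;
# here they are hard-coded purely to show how layers build on each other.

def step(value, threshold):
    """Fire (1) when the summed input reaches the threshold, else 0."""
    return 1 if value >= threshold else 0

def first_layer(pixels):
    # Hypothetical detector for one body part (ears) in the first two values.
    return step(pixels[0] + pixels[1], 2)

def second_layer(pixels):
    # Hypothetical detector for another body part (whiskers) in the last two.
    return step(pixels[2] + pixels[3], 2)

def final_layer(ears, whiskers):
    # Connects the previous layers: fires only when both parts were found.
    return step(ears + whiskers, 2)

def looks_like_cat(pixels):
    return final_layer(first_layer(pixels), second_layer(pixels)) == 1

print(looks_like_cat([1, 1, 1, 1]))
print(looks_like_cat([1, 1, 0, 0]))
```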


MACHINE LEARNING ALGORITHMS


NOSQL/MS-SQL MIGRATION OPTIONS

NO-SQL

SPARK

COUCHDB

HADOOP

COSMOS

RDBMS/SQL

AZURE SQL

MS-SQL SERVER

ORACLE

SAP


WHAT IS A DATA LAKE?

Unlike a data warehouse, a data lake is an organic store of data kept without regard for the perceived value or structure of the data:

Unstructured

Semi-structured

Structured

A Data Warehouse is a highly structured store of data.
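The contrast between the two stores can be sketched in a few lines of Python. This is a conceptual model only, assuming in-memory lists in place of real storage; the column names, file names, and the functions `load_into_warehouse` and `land_in_lake` are hypothetical.

```python
import json

# Hypothetical fixed schema for a warehouse table.
WAREHOUSE_COLUMNS = {"order_id": int, "amount": float}

warehouse_rows = []
lake_objects = []

def load_into_warehouse(row):
    """Highly structured: reject anything that does not match the schema."""
    if set(row) != set(WAREHOUSE_COLUMNS):
        raise ValueError("row does not match warehouse schema")
    for column, expected_type in WAREHOUSE_COLUMNS.items():
        if not isinstance(row[column], expected_type):
            raise ValueError(f"{column} must be {expected_type.__name__}")
    warehouse_rows.append(row)

def land_in_lake(name, raw_bytes):
    """Store the payload as-is; its structure is interpreted later, on read."""
    lake_objects.append((name, raw_bytes))

# The lake accepts structured, semi-structured, and unstructured data alike.
land_in_lake("orders.json", json.dumps({"order_id": 1, "amount": 9.5}).encode())
land_in_lake("sensor.log", b"2020-05-22T10:00:00 temp=21.4")
land_in_lake("photo.jpg", bytes([0xFF, 0xD8, 0xFF]))

# The warehouse accepts only rows that fit its model.
load_into_warehouse({"order_id": 1, "amount": 9.5})
try:
    load_into_warehouse({"note": "free text"})
    rejected = False
except ValueError:
    rejected = True
```

The trade-off is that the lake defers the cost of modeling until someone reads the data, while the warehouse pays it up front on every write.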

Data Lakes Market segment by Type:

Data Discovery (Insight)

Data Integration and Management

Data Lakes Analytics

Data Visualization


WHAT MAKES A DATA LAKE SO GREAT?

Massive Scale: Petabyte scale, data accessible everywhere, growth on demand

Granular, Multi-layered Security: Granular security & protection against accidental data loss

Optimized for Maximum Performance: Extremely fast job execution

Integration Friendly: Supports multiple methods of data ingress, processing, egress, and visualization

Cost Effectiveness: Cloud economic model with the ability to intelligently manage costs

RICH DATA MANAGEMENT & GOVERNANCE (Standards Compliant & Available Everywhere)


A “NO COMPROMISES” DATA LAKE

A Secure, performant, massively scalable Data Lake Storage that brings the cost & scale of object storage together with the performance and analytics feature set of data lake storage

Secure

Manageable

Fast

Scalable

Cost Effective

Integration Ready


AZURE DATA LAKE STORAGE GEN 2: ADLS GEN 2

SECURE

Support for fine-grained Access Control Lists, protecting data at file & folder level

Multi-layered protections via at-rest storage service encryption & Azure Active Directory integration

MANAGEABLE

Automated Lifecycle Policy Management

Object Level Tiering

FAST

Atomic file operations mean jobs complete faster

SCALABLE

No limits on data store size

Global footprint (54 regions), including government clouds

COST EFFECTIVE

Object store pricing levels

File system operations minimize transactions required for job completion

INTEGRATION READY

Optimized for Spark & Hadoop analytic engines

Tightly integrated with Azure end-to-end analytics solutions
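As an illustration of the lifecycle policy and tiering features listed on this slide, the sketch below builds a policy document in Python. The overall shape follows the Azure Storage lifecycle management rule format as I understand it and should be checked against the current Microsoft documentation; the rule name, path prefix, and day thresholds are hypothetical values chosen for the example.

```python
import json

# Hypothetical lifecycle rule: move aging raw files to cheaper tiers,
# then delete them. All names and thresholds are illustrative assumptions.
policy = {
    "rules": [
        {
            "name": "age-out-raw-zone",        # hypothetical rule name
            "enabled": True,
            "type": "Lifecycle",
            "definition": {
                "filters": {
                    "blobTypes": ["blockBlob"],
                    "prefixMatch": ["datalake/raw/"],  # hypothetical path
                },
                "actions": {
                    "baseBlob": {
                        "tierToCool": {"daysAfterModificationGreaterThan": 30},
                        "tierToArchive": {"daysAfterModificationGreaterThan": 90},
                        "delete": {"daysAfterModificationGreaterThan": 365},
                    }
                },
            },
        }
    ]
}

# Serialize to JSON, the form in which such a policy is typically submitted.
policy_json = json.dumps(policy, indent=2)
print(policy_json)
```

A policy like this is what lets storage cost be managed automatically instead of by periodic manual cleanup.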


GEN 1 STORAGE DIFFERENCES

Blob Storage

Large Partner Ecosystem

Global Scale: All 57 Regions

Durability Options

Tiered – Hot/Cool/Archive

Cost Efficient

Data Lake Store

Built for Hadoop

Hierarchical Namespace

ACL, AAD, & RBAC

Performance Tuned for Big Data

Very High Scale Capacity & Throughput
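Why a hierarchical namespace matters for big-data jobs can be shown with a small simulation. This is a conceptual model that only counts operations; it does not call any Azure API, and the function names and the 1,000-file job output are assumptions made for the example.

```python
# Conceptual model: compare renaming a job's output "directory" in a flat
# object store versus a store with a hierarchical namespace.

def rename_flat(store, old_prefix, new_prefix):
    """Flat store: every object under the prefix is copied, then deleted."""
    operations = 0
    for key in [k for k in store if k.startswith(old_prefix)]:
        store[new_prefix + key[len(old_prefix):]] = store[key]  # copy
        del store[key]                                           # delete
        operations += 2
    return operations

def rename_hierarchical(tree, old_name, new_name):
    """Hierarchical namespace: the directory entry is re-pointed in one step."""
    tree[new_name] = tree.pop(old_name)
    return 1

# A hypothetical job that wrote 1,000 part files into a temporary folder.
flat = {f"tmp/part-{i}.csv": b"" for i in range(1000)}
flat_ops = rename_flat(flat, "tmp/", "final/")

tree = {"tmp": {f"part-{i}.csv": b"" for i in range(1000)}}
tree_ops = rename_hierarchical(tree, "tmp", "final")

print(flat_ops, tree_ops)  # 2000 1
```

Hadoop and Spark jobs commonly finish by renaming a temporary output directory, which is why fewer operations at that step translate into faster and cheaper job completion.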


DATA LAKE DESIGN

Cloud/On-premises, Hybrid Cloud, Multi-Cloud (AZURE)

Storage (AZURE SQL DATA BLOB Storage)

Processing (AZURE DATA LAKE)

Data Management (AZURE DATA STORE)

Advanced Analytics, Enterprise Reporting Apps (Power BI)


USE CASE SCENARIO

BUSINESS CONSIDERATIONS

SPONSOR/MANAGEMENT SUPPORT

AUGMENT DEFINED BUSINESS INSIGHTS

TIME TO MARKET FOR KEY INSIGHTS (aka AGILITY)

BUDGET CONSIDERATIONS

TECHNICAL SKILLS CONSIDERATIONS

MINIMAL DEPENDENCE ON IT FOR DRASTIC CHANGES

RIGIDITY OF SINGLE DATA MODEL

ABILITY TO HANDLE STREAMING DATA

SCALABILITY


MINIMAL SIZE FOR A BUSINESS ADLS PROJECT TEAM

Project Manager

Solution Architect

Data Engineer/Lead

Data Scientist


IAAS ADLS VS. PAAS: ADLS (GEN 2)

INGESTING DATA FROM VARIOUS SOURCES

MIGRATE FROM EXISTING ON-PREMISES DATA WAREHOUSE

MOBILE DATA

ERP DATA WAREHOUSE

APP DATA

SENSOR DATA

MASTER DATA

PROGRAMMATIC

MACHINE LEARNING SERVICES WITH LITTLE OR NO CODE

Run & Monitor Experiments

Register Models

Build Docker Images

Deploy Models

Create Pipeline


DEMONSTRATION ON AZURE FOR ADLS GEN 2


LESSONS LEARNED

The Soft Skills

Get Buy-In from Technical Staff (IT)

Ensure security policies are understood, and use approved VMs

Ensure Business/Technical Stakeholders are informed regularly.

The Technical Matters

ADDRESS: DATA GOVERNANCE, INTEGRITY, SECURITY

ACCESS ONLY DATA YOU NEED: REGULATIONS may add cost (transport & store)

ADD ALERTS: MONITORING FOR ANY & ALL VMs + SERVICES

AFTER FINISHED: SHUT DOWN V-NET, RESOURCE GROUPS; UPDATED AUTHORIZED AD USERS, POLICIES


IMPORTANT URLS

Azure updates: https://azure.microsoft.com/en-us/updates/

Azure Blogs: https://azure.microsoft.com/en-us/blog/

Azure Data Lake Storage Gen2:

https://azure.microsoft.com/en-us/blog/under-the-hood-performance-scale-security-for-cloud-analytics-with-adls-gen2/

https://docs.microsoft.com/en-us/azure/stream-analytics/stream-analytics-define-outputs#blob-storage-and-azure-data-lake-gen2

SPARK to SQL SERVER: https://docs.microsoft.com/en-us/sql/big-data-cluster/spark-mssql-connector?view=sql-server-ver15

AZURE V-NET: https://docs.microsoft.com/en-us/azure/virtual-network/virtual-network-service-endpoints-overview

AZ-300 Exam Ref: https://www.microsoftpressstore.com/store/exam-ref-az-300-microsoft-azure-architect-technologies-9780135802540


THANK YOU FOR COMING!

Contact information:

E: [email protected]

P: 954-707-7545