
Page 1: Forget the Data Unleash the Users (assets.teradata.com/pdf/TUGS/Presentations/2015/...)

2/10/2015

1 Forget about the Data, Unleash the USERS!

Driving Self-Service BI Environment

Alison Torres

2015

2 Example from Everyday Life (with thanks to Bill Franks)


3 Your turn: What self-service have you used today?

• Checked in for a flight
• Used your phone
• Used an ATM
• Went to a buffet (breakfast this morning?)
• Bought something online
• Sent a text or e-mail
• Put gas in the rental car
• Drove to the conference
• Checked out at the store
• Checked into or out of the hotel
• Served your own beverage at the buffet
• Scanned your purchase at a local food or retail store
• Bought your B.A.R.T. ticket

4 Given all that …

• What do you think is meant by “Self-Service” in relation to your business?
• What are your obstacles in creating “Self-Service”?
• Who owns the resolution of those obstacles?
• Who are the “users”?
• IT’s job is to provide integrated, accessible, auditable data … and get out of the way.
• Business’ job is to run analytics … they need to have the right tools.


5 Who’s the Self and What’s the Service?

Type of user | Types of service | User responsibilities | Education needs
Mobile, Kiosk, “Executive Dashboard” | Simple operations, sub-second response | Little to no knowledge of process or data | Front-end functionality
Business User | Simple ad-hoc queries, reporting; data already exists | Data context | Tool and data
Business Analyst | Complex ad-hoc, add new data, what-if scenarios | Lower performance SLA, data quality | Data model, database operations (if writing SQL)
Data Scientist | Complete query and analytic freedom; data from many sources | Technically adept, can run the process themselves | Database and possible discovery platform knowledge

6 Type 1 User: Planned Output and Processes

• Little flexibility
• Well understood process
• Performance is paramount
• Not trying to understand business overall
• Trying to “run the business”
• USPS – they let you track your package, but you can’t do anything about it.


7 Traditional “Self-Service”

• Web access
• Kiosk
• Mobile app
• Phone tree
• …

8 Type 2 User: Data Already in Data Warehouse

• Users just need flexible access
• Want to “pick and choose” content
• Want to create own metrics in many ways
• Performance is still critical
• Trying to “see the business”

“The data is already in the data warehouse; give me the tools and flexibility to run my own queries.”


9 User Access Tools

• Cognos
• MicroStrategy
• Tableau
• Spotfire
• QlikView
• Business Objects
• …

10 From Flexibility to Performance

• Third Normal Form
• Views
• Priority scheduling
• Indexes
• Cross-Business summary or model
• Sole purpose summary or index
• Expansion or Extract


11 Key Learnings

• Flexibility is Key
  • Core Model must maintain relationships
• Physical Modeling for Greater Good
  • Don’t compromise for subset of users
  • Balance Distribution of Data with Access of Data
• Performance needs to be justified with Value
  • What are the actionable outcomes that are affected?
  • Priorities within and throughout the company

12 Type 3: User Wants to Supplement with Own Data

• Need to remove IT from the process
• Does not want to learn DBA skills
  • Needs tools to facilitate the loading and query processes
• Assumes responsibility for the data quality and content
• “One-off” to see if there is value or opportunity
• Trying to “understand the business”
• Shift from IT doing the work to Business doing the work, but the work still has to get done.
• The data warehouse has to be as easy to use as Excel.
• True analytics, exploratory processing.


13 Enablers or Barriers to Innovation?

• Strict Data Warehouse Governance
  • Can’t quickly load untested external data
  • Takes too long to access new data
  • Dependent on IT to load data
• Prototyping
  • Data needs to be reorganized for specific application or use
  • Need to combine, manipulate and explore data in different ways
  • Need to experiment with new analytic queries

[Diagram labels: Personal Marts; ODS; Integrated DW; mart; Excel files; SAS data; DW Governance Process]

14 Enable Innovative Analytics … Safely

• Discovery: Agility & limited constraints
• Deployment: Ensure quality & reliability


15 Agile Analytics: Business Need for Agile Analytics

• Analyze quickly
  • New theory
  • New data
• Test Fast
  • Was the theory correct? (Success or Failure)
  • Do the new data provide additional insight?
  • Does the new insight cause a change in thinking or direction?
• Productionize what works; discard what doesn’t
  • Add new application
  • Add new data
  • Or delete and move on

Flexibility vs. IT Process

• The rigor and governance of the IT process required to safeguard the data warehouse make it difficult for analysts to rapidly test new theories or explore untested data.

16 Agile Analytics

• What is Agile Analytics?
  • Data Lab (sandbox) within your Teradata System that allows users to rapidly test new data and theories
  • Non-production or experimental data is quickly loaded into your data lab for analysis with production data
• Benefits
  • Eliminates the need for costly data marts or personal marts
  • Self-provisioning, management and service at the business user/unit level
  • Minimal IT support after initial setup
• Use Case
  • Used for rapid prototyping, experimentation, and exploratory analysis
  • Build sharable analytic data for development

[Diagram labels: Teradata Data Warehouse (read only for Data Lab users); Data Lab (read, write); External Data (SAS data, CSV data); Active Workload Management; Use analytic tools to join and explore combined data]


17 Agile Analytics: Enabling Technology and Services

• Teradata Active System Management (TASM)
  • Manage multiple workloads to deliver predictable performance
  • Protect production workload
• Teradata Viewpoint Portlets
  • Self-provisioning, service and management enabler
• Teradata Data Lab Services
  • Design and implementation services based on best practices

Teradata Data Lab Methodology

Step 1: Validate Technology Requirements
  • Teradata system with storage and processing capacity
  • Teradata Active System Management (TASM)
  • Teradata Viewpoint
    • Self-service enabling
Step 2: Workload Management
  • Implement TASM assignment of system resources, priorities, and limits
Step 3: Establish Data Lab Rules
  • Granting database space
  • Analytic Workspace duration
  • Managing exceptions
  • Usage of data
  • Workload priority
Step 4: Database Administration & Security
  • Create data labs within system
  • Link portlets to Agile portion of system
  • Create/link userids to Teradata and/or external authentication (LDAP)
  • Set default table permissions
  • Define data lab/tables processes within standard operational practices (BAR, etc…)
  • Manage who has access to the data and ensure that Information Security policies are upheld
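As an illustration of the Data Lab rules listed above (granting database space, analytic workspace duration, managing exceptions), the following is a minimal Python sketch of checking a self-service lab request against limits set by IT. The names `LabRules`, `LabRequest`, and `check_request`, and the limit values, are hypothetical; in a real deployment these limits would be configured through the Teradata tooling rather than in Python.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LabRules:
    """Limits set by IT (hypothetical values, not Teradata defaults)."""
    max_space_gb: int = 50
    max_duration_days: int = 90


@dataclass(frozen=True)
class LabRequest:
    """A business user's request for an analytic workspace."""
    owner: str
    space_gb: int
    duration_days: int


def check_request(request: LabRequest, rules: LabRules) -> list[str]:
    """Return the rule violations; an empty list means the request can be self-provisioned."""
    problems = []
    if request.space_gb > rules.max_space_gb:
        problems.append(
            f"space {request.space_gb} GB exceeds limit of {rules.max_space_gb} GB"
        )
    if request.duration_days > rules.max_duration_days:
        problems.append(
            f"duration {request.duration_days} days exceeds limit of {rules.max_duration_days} days"
        )
    return problems


rules = LabRules()
print(check_request(LabRequest("analyst_a", 20, 30), rules))   # within limits
print(check_request(LabRequest("analyst_b", 200, 30), rules))  # would go to exception handling
```

A request that returns violations is not necessarily rejected; under the rules above it would be routed to whoever manages exceptions.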

18 Teradata Data Lab

• Analytics on-demand
  • Self-provisioning and management to accelerate analysis
  • Load using your favorite tool
  • Share and collaborate with colleagues
  • Faster analytic processing
• Flexibility to fit into your DW governance process
  • Data lab governance set by IT, enforced by the product
  • Leverages platform workload management and database security
  • Reduce data marts and data replication
• Promote exploration to drive innovation
  - Analyze new data with production data
  - Quickly prove success or fail fast
  - Extend your analytics to more users


19 Teradata Studio Express – Data Lab

• Self-Service smart loader
  • Automatically determines data types
  • Automatic table creation
  • Loads, appends or replaces data
  • Excel or CSV files

[Screenshot: Teradata Studio]
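To make the smart loader idea concrete, here is a minimal Python sketch of determining column types from a CSV file and generating a table definition from them. It is an illustration only and does not describe how Teradata Studio does it; the function names, the three candidate types, and the sample data are assumptions, and the sketch assumes every row has the same number of columns as the header.

```python
import csv
import io


def infer_type(values):
    """Pick the narrowest SQL type that fits every non-empty value in a column."""
    non_empty = [v for v in values if v != ""]
    if not non_empty:
        return "VARCHAR(1)"
    for sql_type, parse in (("INTEGER", int), ("FLOAT", float)):
        try:
            for v in non_empty:
                parse(v)
            return sql_type
        except ValueError:
            continue  # at least one value did not parse; try the next, wider type
    return f"VARCHAR({max(len(v) for v in non_empty)})"


def create_table_sql(table, csv_text):
    """Build a CREATE TABLE statement from a CSV header row and its data rows."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, data = rows[0], rows[1:]
    columns = [
        f"{name} {infer_type([row[i] for row in data])}"
        for i, name in enumerate(header)
    ]
    return f"CREATE TABLE {table} ({', '.join(columns)})"


# Hypothetical sample file contents.
sample = "store_id,revenue,region\n1,10.5,West\n2,7,East\n"
print(create_table_sql("lab_sales", sample))
```

A production loader would also need to handle dates, quoting of column names, and sampling of large files, none of which this sketch attempts.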

20 Key Learnings

• In-place processes enable “time-to-market” benefits.
  • Put the processes and security in place first.
• Failure = Learning
  • Do so with great effectiveness…
  • Fail fast, fail early.
• Most business units now maintain a permanent sandbox.
  • Complex analysis and decision making within a business day!


21 Type 4: User Wants Control of Own Environment

• Has database and programming skills
• Just needs system to run exploration as well as analysis
• Testing ideas for productizing or deletion
• Creating new analytic products to release to general public
• Often includes big data or new data types
• Trying to “predict the business”

“Get out of my way and give me an environment where I can do whatever I want.”

22 Unified Data Architecture

[Diagram labels: Sources: Audio/Video, Images, Text, Web & Social, Machine Logs, CRM, SCM, ERP. Platforms: Capture, Store, Refine; Discovery Platform; Active Data Warehouse; SQL-H. Users: Engineers, Business Analysts, Quants, Data Scientists. Tools: Java, C/C++, Pig, Python, R, SAS, SQL, Excel, BI, Visualization. Output: Analytics]


23 The Iterative Process, at Users’ Control

STEP 1: PREPARATION
• COLLECT & LOAD YOUR DATA
• PREPARE YOUR DATA
STEP 2: DISCOVERY AND ANALYSIS
• DISCOVERY: EXAMINE, EVALUATE & ITERATE!
STEP 3: PUT THE RESULTS TO WORK
• ESTABLISH MODELS
• FEED MODELS INTO APPLICATIONS

24 Advanced Analytics – All Done Inside Teradata

• Data Exploration: Explore all data directly in the database with Teradata Profiler
• Data Preparation: Transform and aggregate data in the database with Teradata ADS Generator
• Model Development: Sample your ADS data and build your model on an R client
• Model Deployment: Converts your R PMML model to SQL; automatically generates the production ADS

[Diagram labels: Teradata Profiler; Teradata ADS Generator; Build ADS; Modeling ADS; Sample Data; Teradata Warehouse Miner; Teradata R; PMML or UDF Models; SQL In-dbs Function; Production ADS; Automated process]


25 Tools to Enable Exploration

• Platform Family
  > Can have same environment and tools without impacting production systems
• Teradata Unity
  > Keeps data in sync between multiple systems
• Teradata Studio
  > Tools to aid in the administration and management of the individual systems
• Aster
  > SQL-MR to allow deeper analytics without programming
  > Prepare, analyze, and visualize in the toolset
• Teradata QueryGrid™ / Teradata connectivity
  > Cross-platform analytics without manual data movement
  > Easy-to-use tools to access Hadoop data
• Teradata Database
  > JSON – schema on read
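“Schema on read” means the structure is applied when the data is queried rather than when it is loaded, which is what lets new data types be added quickly. The sketch below illustrates the idea in plain Python; it is not Teradata’s JSON syntax, and the sample events, field names, and the `read_with_schema` helper are all hypothetical.

```python
import json

# Raw JSON documents stored as-is; the records do not share one fixed set of fields.
raw_events = [
    '{"user": "u1", "action": "login", "device": {"os": "iOS"}}',
    '{"user": "u2", "action": "purchase", "amount": 19.99}',
]


def read_with_schema(raw, fields):
    """Apply a schema at read time: extract the requested dotted paths, None if absent."""
    doc = json.loads(raw)
    row = {}
    for path in fields:
        value = doc
        for key in path.split("."):
            # Walk one level down; anything that is not an object yields None.
            value = value.get(key) if isinstance(value, dict) else None
        row[path] = value
    return row


# Two different "schemas" over the same stored data, chosen by the reader.
for raw in raw_events:
    print(read_with_schema(raw, ["user", "device.os"]))
    print(read_with_schema(raw, ["user", "amount"]))
```

Nothing about the stored documents changes between the two reads; only the reader’s list of fields does.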

26 Key Learnings from People who DO Self-Service

• Looking “For Things” not “At Things”
  • First need to understand problem
  • May require relational-unfriendly analytics
• Goal is insight to drive into Analytics
  • Very iterative
  • Need ability to add new data types quickly
• Users are IT and Business knowledgeable
  • Able to run own systems
  • Tolerant of data and operational issues


27 Self-Service Demands Self-Reliance

• Simple “streamline process”
  • Understanding of data
  • Understand limits of service
• Self-directed analytics
  • Understanding of how databases work
  • Not the removal of process, just the movement of process
  • Agreement of service level for performance
• Self-directed exploration
  • Knowledge of data, databases, and analytics
  • Assume responsibility for data quality
  • Service level and help desk agreements
  • Does not become production workload without IT involvement

28 Governance Rule 1 of 3

• Keep the production data clean.
  • The data life cycle methodology is there for a reason.
  • Do not “pollute” production data with data of unknown source and validation.
    • Equivalent to a viral injection…and you may not recover.
• Do not inject prototype data into “core” DW data:
  • Data ingest (ETL/ELT) does NOT have access to sandbox.
    • Not even to populate the sandbox!
  • Strictly and conceptually enforced on both Batch and User accounts.
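The separation described in this rule can be expressed as an explicit grant table: anything not granted is denied. The Python sketch below is an illustration under stated assumptions, not a description of Teradata’s access controls. The account types, zone names, and `is_allowed` helper are hypothetical; the read-only production access for lab users and the absence of sandbox access for data ingest come from the slides, while read/write production access for the batch account is an assumption.

```python
# Hypothetical account types and zones; the grants mirror the rule above.
GRANTS = {
    ("lab_user", "production"): {"read"},            # read only for Data Lab users
    ("lab_user", "sandbox"): {"read", "write"},
    ("etl_batch", "production"): {"read", "write"},  # assumption: ingest maintains core data
    # No ("etl_batch", "sandbox") entry: data ingest has no sandbox access at all.
}


def is_allowed(account_type, zone, operation):
    """Return True only if the account type holds an explicit grant for the operation."""
    return operation in GRANTS.get((account_type, zone), set())


print(is_allowed("lab_user", "production", "write"))  # False: prototypes cannot pollute core data
print(is_allowed("etl_batch", "sandbox", "write"))    # False: not even to populate the sandbox
print(is_allowed("lab_user", "sandbox", "write"))     # True
```

Denying by default keeps the rule enforceable on both batch and user accounts without listing every forbidden combination.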


29 Governance Rule 2 of 3

• Prototypes written by experienced personnel:
  • Assigned to NAMED personnel.
  • Previous Experience and Training Required.
• Prototype personnel are typically former DW developers who transitioned into a business unit.
  • Speed of implementation.
  • Knowledge of DW processes and methodologies.
  • Knowledge of data.

30 Governance Rule 3 of 3

• Sunset dates must be applied:
  • Hold a post mortem.
  • Retire it or promote it.
• The prototype must not become a “black market” production application.
  • Business cannot depend on them.
  • DW cannot give them appropriate support.
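The sunset rule above amounts to a simple decision once the date arrives: hold the post mortem, then retire the prototype or promote it. A minimal Python sketch follows; the function name, the `proved_value` flag, and the dates are hypothetical and only illustrate the decision, not any Teradata feature.

```python
from datetime import date


def sunset_action(sunset, today, proved_value):
    """Decide what happens to a prototype relative to its sunset date."""
    if today < sunset:
        return "keep running"
    # At or past the sunset date: after the post mortem, retire it or promote it.
    return "promote via IT" if proved_value else "retire"


# Hypothetical prototype with a sunset date of 30 June 2015.
print(sunset_action(date(2015, 6, 30), date(2015, 5, 1), proved_value=False))
print(sunset_action(date(2015, 6, 30), date(2015, 7, 1), proved_value=True))
print(sunset_action(date(2015, 6, 30), date(2015, 7, 1), proved_value=False))
```

There is deliberately no branch that lets a prototype keep running past its sunset date, which is what prevents it from becoming a “black market” production application.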


31 Governance, Not Regulation

• Too much governance up front will stifle innovation and hamper progress
• Giving no thought to governance up front can lead to trouble
• Have a governance framework in place from the start, but utilize it minimally until implementation

32 A Story About a Dog

