Source: assets.teradata.com/pdf/tugs/presentations/2015/...
2/10/2015
Forget about the Data, Unleash the USERS!
Driving Self-service BI Environment
Alison Torres
2015
Example from Everyday Life (with thanks to Bill Franks)
•Checked in for a flight
•Used your phone
•Used an ATM
•Went to a buffet for breakfast this morning
•Bought something online
•Sent a text or e-mail
•Put gas in the rental car
•Drove to the conference
•Checked out at the store
•Checked into/out of the hotel
•Served your own beverage at the buffet
•Scanned your purchase at a local food or retail store
•Bought your B.A.R.T. ticket
Your turn: What self-service have you used today?
•What do you think is meant by “Self-Service” in relation to your business?
•What are your obstacles in creating “Self-Service”?
•Who owns the resolution of those obstacles?
•Who are the “users”?
•IT’s job is to provide integrated, accessible, auditable data … and get out of the way.
•Business’ job is to run analytics … they need to have the right tools.
Given all that …
Type of user | Types of service | User responsibilities | Education needs
Mobile, Kiosk, “Executive Dashboard” | Simple operations, sub-second response | Little to no knowledge of process or data | Front end functionality
Business User | Simple ad-hoc queries, reporting, data already exists | Data context | Tool and data
Business Analyst | Complex ad-hoc, add new data, what-if scenarios | Lower performance SLA, data quality | Data model, database operations (if writing SQL)
Data Scientist | Complete query and analytic freedom; data from many sources | Technically adept, can run the process themselves | Database and possible discovery platform knowledge
Who’s the Self and What’s the Service?
•Little flexibility
•Well understood process
•Performance is paramount
•Not trying to understand business overall
•Trying to “run the business”
•USPS – they let you track your package, but you can’t do anything about it.
Type 1 User: Planned Output and Processes
Traditional “Self-Service”
Web access, Kiosk, Mobile app, Phone tree…
•Users just need flexible access
•Want to “pick and choose” content
•Want to create own metrics in many ways
•Performance is still critical
•Trying to “see the business”
Data is already in the data warehouse, give me tools and flexibility to run my own queries
Type 2 User: Data Already in Data Warehouse
User Access Tools
Cognos, MicroStrategy, Tableau, Spotfire, QlikView, Business Objects…
•Third Normal Form
•Views
•Priority scheduling
•Indexes
•Cross-Business summary or model
•Sole purpose summary or index
•Expansion or Extract
From Flexibility to Performance
•Flexibility is Key
• Core Model must maintain relationships
•Physical Modeling for Greater Good
• Don’t compromise for subset of users
• Balance Distribution of Data with Access of Data
•Performance needs to be justified with Value
• What are the actionable outcomes that are affected?
• Priorities within and throughout the company
Key Learnings
•Need to remove IT from the process
•Does not want to learn DBA skills
• Needs tools to facilitate the loading and query processes
•Assumes responsibility for the data quality and content
•“One-off” to see if there is value or opportunity
•Trying to “understand the business”
•Shift from IT doing the work to Business doing the work, but the work still has to get done.
•The data warehouse has to be as easy to use as Excel.
•True analytics, exploratory processing.
Type 3: User Wants to Supplement with Own Data
•Strict Data Warehouse Governance
• Can’t quickly load untested external data
• Takes too long to access new data
• Dependent on IT to load data
•Prototyping
• Data needs to be reorganized for specific application or use
• Need to combine, manipulate and explore data in different ways
• Need to experiment with new analytic queries
Enablers or Barriers to Innovation?
[Diagram labels: Personal Marts; ODS; Integrated DW; mart; Excel files; SAS data; DW Governance Process]
•Discovery: Agility & limited constraints
•Deployment: Ensure quality & reliability
Enable Innovative Analytics … Safely
•Analyze quickly
• New theory
• New data
•Test Fast
• Was the theory correct? (Success or Failure)
•Do the new data provide additional insight?
•Does the new insight cause a change in thinking or direction?
•Productionize what works; discard what doesn’t
• Add new application
• Add new data
• Or delete and move on
•The rigor and governance of the IT process required to safeguard the data warehouse make it difficult for analysts to rapidly test new theories or explore untested data.
Agile Analytics: Business Need for Agile Analytics
Flexibility vs. IT Process
•What is Agile Analytics?
• Data Lab (sandbox) within your Teradata System that allows users to rapidly test new data and theories
• Non-production or experimental data is quickly loaded into your data lab for analysis with production data
•Benefits
• Eliminates the need for costly data marts or personal marts
• Self-provisioning, management and service at the business user/unit level
• Minimal IT support after initial setup
•Use Case
• Used for rapid prototyping, experimentation, and exploratory analysis
• Build sharable analytic data for development
Agile Analytics
[Diagram labels: Teradata Data Warehouse (read only for Data Lab users); Data Lab (read, write); Active Workload Management; External Data (SAS data, CSV data); Use analytic tools to join and explore combined data]
•Teradata Active System Management (TASM)
• Manage multiple workloads to deliver predictable performance
• Protect production workload
•Teradata Viewpoint Portlets
• Self provisioning, service and management enabler
•Teradata Data Lab Services
• Design and implementation services based on best practices
Agile Analytics: Enabling Technology and Services
Step 1: Validate Technology Requirements
• Teradata system with storage and processing capacity
• Teradata Active System Management (TASM)
• Teradata Viewpoint
• Self-service enabling
Step 2: Workload Management
• Implement TASM assignment of system resources, priorities, and limits
Step 3: Establish Data Lab Rules
• Granting database space
• Analytic Workspace duration
• Managing exceptions
• Usage of data
• Workload priority
Step 4: Database Administration & Security
• Create data labs within system
• Link portlets to Agile portion of system
• Create/link userids to Teradata and/or external authentication (LDAP)
• Set default table permissions
• Define data lab/tables processes within standard operational practices (BAR, etc…)
• Manage who has access to the data and ensure that Information Security policies are upheld
Teradata Data Lab Methodology
•Analytics on-demand
• Self-provisioning and management to accelerate analysis
• Load using your favorite tool
• Share and collaborate with colleagues
• Faster analytic processing
•Flexibility to fit into your DW governance process
• Data lab governance set by IT, enforced by the product
• Leverages platform workload management and database security
• Reduce data marts and data replication
•Promote exploration to drive innovation
• Analyze new data with production data
• Quickly prove success or fail fast
• Extend your analytics to more users
Teradata Data Lab
•Self-Service smart loader
• Automatically determines data types
• Automatic table creation
• Loads, appends or replaces data
• Excel or CSV files
Teradata Studio Express – Data Lab
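To make "automatically determines data types" and "automatic table creation" concrete, here is a minimal Python sketch of how a loader might infer column types from a CSV file and derive a table definition. It is an illustration only, not Teradata Studio's actual logic: the function names, the type-precedence order (INTEGER, FLOAT, DATE, VARCHAR), the ISO date format, and the table name `lab_sales` are all assumptions.

```python
import csv
import io
from datetime import datetime


def infer_type(values):
    """Pick the narrowest assumed type that fits every non-empty value."""
    non_empty = [v for v in values if v != ""]
    if not non_empty:
        return "VARCHAR(1)"

    def all_parse(parser):
        # True only if the parser accepts every value in the column.
        try:
            for v in non_empty:
                parser(v)
            return True
        except ValueError:
            return False

    if all_parse(int):
        return "INTEGER"
    if all_parse(float):
        return "FLOAT"
    if all_parse(lambda v: datetime.strptime(v, "%Y-%m-%d")):
        return "DATE"
    # Fall back to text sized to the longest observed value.
    return f"VARCHAR({max(len(v) for v in non_empty)})"


def create_table_sql(table_name, csv_text):
    """Build a CREATE TABLE statement from a CSV with a header row."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, data = rows[0], rows[1:]
    columns = []
    for i, name in enumerate(header):
        col_type = infer_type([row[i] for row in data])
        columns.append(f"{name} {col_type}")
    return f"CREATE TABLE {table_name} ({', '.join(columns)})"


# Hypothetical sample file a business user might upload to a data lab.
sample = (
    "store_id,sale_date,amount,region\n"
    "1,2015-02-10,19.99,West\n"
    "2,2015-02-11,5,East\n"
)
print(create_table_sql("lab_sales", sample))
```

A real loader would also have to handle header names that are not valid identifiers, mixed date formats, and append versus replace semantics, all of which this sketch leaves out.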
•In-place processes enable “time-to-market” benefits.
• Put the processes and security in place first.
•Failure = Learning
• Do so with great effectiveness…
• Fail fast, fail early.
•Most business units now maintain a permanent sandbox.
• Complex analysis and decision making within a business day!
Key Learnings
•Has database and programming skills
•Just needs system to run exploration as well as analysis
•Testing ideas for productizing or deletion
•Creating new analytic products to release to general public
•Often includes big data or new data types
•Trying to “predict the business”
Get out of my way and give me an environment where I can do whatever I want.
Type 4: User Wants Control of Own Environment
Unified Data Architecture
[Diagram labels: data sources (Audio/Video, Images, Text, Web & Social, Machine Logs, CRM, SCM, ERP); Capture, Store, Refine; Discovery Platform; Active Data Warehouse; SQL-H; ANALYTICS; users (Engineers, Business Analysts, Quants, Data Scientists); tools (Java, C/C++, Pig, Python, R, SAS, SQL, Excel, BI, Visualization)]
The Iterative Process, at the User’s Control
STEP 1: PREPARATION
• COLLECT & LOAD YOUR DATA
• PREPARE YOUR DATA
STEP 2: DISCOVERY AND ANALYSIS
• DISCOVERY
• EXAMINE, EVALUATE & ITERATE!
STEP 3: PUT THE RESULTS TO WORK
• ESTABLISH MODELS
• FEED MODELS INTO APPLICATIONS
Data Exploration: Explore all data directly in the database with Teradata Profiler
Data Preparation: Transform and aggregate data in the database with Teradata ADS Generator
Model Development: Sample your ADS data and build your model on an R client
Model Deployment: Converts your R PMML model to SQL; automatically generates the production ADS
[Diagram labels: Teradata Profiler; Teradata ADS Generator; Sample Data; Build ADS; Modeling ADS; Teradata Warehouse Miner; Teradata R; PMML or UDF Models; SQL In-dbs Function; Production ADS; Automated process]
Advanced Analytics – All Done Inside Teradata
•Platform Family
> Can have same environment and tools without impacting production systems
•Teradata Unity
> Keeps data in sync between multiple systems
•Teradata Studio
> Tools to aid in the administration and management of the individual systems
•Aster
> SQL-MapReduce (SQL-MR) to allow deeper analytics without programming
> Prepare, analyze, and visualize in one toolset
•Teradata QueryGrid™ / Teradata connectivity
> Cross-platform analytics without manual data movement
> Easy-to-use tools to access Hadoop data
•Teradata Database
> JSON – schema on read
Tools to enable Exploration
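As a rough illustration of what "schema on read" means for exploration, the Python sketch below stores JSON documents as raw text and applies a structure only when a query asks for specific fields. It shows the general idea, not the Teradata JSON data type or its SQL syntax; the sample events, the dotted-path convention, and the function names are assumptions made for this example.

```python
import json

# Raw documents are stored as-is; no table schema is declared up front.
raw_events = [
    '{"user": "a1", "action": "login"}',
    '{"user": "b2", "action": "purchase", "amount": 42.5}',
    '{"user": "c3", "device": {"os": "ios"}}',
]


def read_field(raw_doc, path, default=None):
    """Parse the document at query time and walk a dotted path such as 'device.os'."""
    node = json.loads(raw_doc)
    for key in path.split("."):
        if not isinstance(node, dict) or key not in node:
            return default  # A missing field is tolerated, not a load failure.
        node = node[key]
    return node


def project(raw_docs, paths):
    """Apply a 'schema' (a list of paths) at read time and return row dicts."""
    return [{p: read_field(doc, p) for p in paths} for doc in raw_docs]


for row in project(raw_events, ["user", "amount", "device.os"]):
    print(row)
```

New attributes can appear in later documents without any reload or table change, which is what makes this style attractive when new data types must be added quickly.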
•Looking “For Things” not “At things”
• First need to understand problem
• May require relational-unfriendly analytics
•Goal is insight to drive into Analytics
• Very iterative
• Need ability to add new data types quickly
•Users are IT and Business knowledgeable
• Able to run own systems
• Tolerant of data and operational issues
Key Learnings from People who DO Self-Service
•Simple “streamline process”
• Understanding of data
• Understand limits of service
•Self-directed analytics
• Understanding of how databases work
• Not the removal of process, just the movement of process
• Agreement of service level for performance
•Self-directed exploration
• Knowledge of data, databases, and analytics
• Assume responsibility for data quality
• Service level and help desk agreements
• Does not become production workload without IT involvement
Self-Service Demands Self Reliance
•Keep the production data clean.
• The data life cycle methodology is there for a reason.
• Do not “pollute” production data with data of unknown source and validation.
• Equivalent to a viral injection…and you may not recover.
•Do not inject prototype data into “core” DW data:
• Data ingest (ETL/ELT) does NOT have access to sandbox.
• Not even to populate the sandbox!
• Strictly and conceptually enforced on both Batch and User accounts.
Governance Rule 1 of 3
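The separation in Rule 1 can be expressed as a small access matrix. The Python sketch below is a hedged illustration, not a description of Teradata's security model: the role and zone names are hypothetical, and the read/write grant for ETL on production is an assumption. The two constraints taken from the deck are that data ingest has no access to the sandbox and that Data Lab users read production data but write only to the lab.

```python
from enum import Enum


class Role(Enum):
    ETL_BATCH = "etl_batch"   # data ingest (ETL/ELT) accounts
    LAB_USER = "lab_user"     # named prototype personnel


class Zone(Enum):
    PRODUCTION = "production"  # core DW data
    SANDBOX = "sandbox"        # data lab


# Anything not listed is denied, so ETL has no sandbox entry at all.
ALLOWED = {
    (Role.ETL_BATCH, Zone.PRODUCTION): {"read", "write"},  # assumed grant
    (Role.LAB_USER, Zone.PRODUCTION): {"read"},
    (Role.LAB_USER, Zone.SANDBOX): {"read", "write"},
}


def is_allowed(role, zone, action):
    """Return True only if the (role, zone) pair explicitly grants the action."""
    return action in ALLOWED.get((role, zone), set())


print(is_allowed(Role.ETL_BATCH, Zone.SANDBOX, "write"))
print(is_allowed(Role.LAB_USER, Zone.PRODUCTION, "write"))
print(is_allowed(Role.LAB_USER, Zone.SANDBOX, "write"))
```

Using a default-deny lookup means a newly added role gets no access until someone grants it deliberately, which matches the intent of keeping prototype data out of core DW data.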
•Prototypes written by experienced personnel:
• Assigned to NAMED personnel.
• Previous Experience and Training Required.
•Prototype personnel are typically former DW developers who transitioned into a business unit.
• Speed of implementation.
• Knowledge of DW processes and methodologies.
• Knowledge of data.
Governance Rule 2 of 3
•Sunset dates must be applied:
• Hold a post mortem.
• Retire it or promote it.
•The prototype must not become a “black market” production application.
• Business cannot depend on them.
• DW cannot give them appropriate support.
Governance Rule 3 of 3
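One way to keep sunset dates from being forgotten is a periodic check that lists every prototype past its date with no retire-or-promote decision recorded. The Python sketch below assumes a simple in-memory registry; the `Prototype` fields, the example entries, and the review date are hypothetical and stand in for whatever tracking a site actually uses.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Prototype:
    name: str
    owner: str                      # the NAMED person responsible
    sunset: date                    # date by which a decision is due
    decision: Optional[str] = None  # "retire" or "promote", set at the post mortem


def overdue(prototypes, today):
    """Names of prototypes past their sunset date with no decision recorded."""
    return [p.name for p in prototypes if p.sunset < today and p.decision is None]


# Hypothetical registry entries for illustration.
registry = [
    Prototype("churn_scores", "analyst_a", date(2015, 1, 15)),
    Prototype("promo_lift", "analyst_b", date(2015, 1, 31), decision="promote"),
    Prototype("basket_mix", "analyst_c", date(2015, 6, 30)),
]

print(overdue(registry, date(2015, 2, 10)))
```

Anything the check returns is a candidate "black market" application: it is still running past its agreed date, and nobody has decided whether it should be retired or promoted.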
•Too much governance up front will stifle innovation and hamper progress
•Giving no thought to governance up front can lead to trouble
•Have a governance framework in place from the start, but utilize it minimally until implementation
Governance, Not Regulation
A Story About a Dog