session 10 data
using data strategically
adopted some materials from David Schuff
why data?
KeyBank saved $500,000 by using data mining and a data warehouse to improve direct mailing for its Home Equity Loan program.
The airline industry can forecast at the seat level for each flight to perform “yield” management.
Data mining can reveal insights such as “male customers over 30 buy a six-pack of beer and disposable diapers at the same time, around 2-4 am.”
Progressive Insurance can offer usage-based insurance plans using its database.
ESS – Executive Support Systems
DSS – Decision Support Systems
MIS – Management Information Systems
TPS – Transaction Processing Systems
Strategic Management
Tactical Management
Business Operations
data becomes the basis of these different levels of decision making
two different types of data usage:
transactional vs. informational
report (using query) vs. decision-making (mining)
What is a database?
• Structured collection of data items
• Types of Database Management Systems (DBMS)
  – Hierarchical
  – Network
  – Relational
    • The one most often seen
    • Access, MS SQL Server, Oracle, DB2
What is a Relational Database?
• A set of two or more tables related to each other through key fields
• Key field
  – A field on which a table can be sorted (indexed)
• Primary Key
  – Field which uniquely identifies a record
  – Why have a primary key?
    • There may be many people named John Smith, so how do you tell them apart?
    • Use something which is unique, like a social security number
    • Social security number is a common key field
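The idea of tables related through key fields can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not part of the original slides: the `customer` and `loan` tables, their columns, and the sample rows are all hypothetical, and a made-up integer `customer_id` stands in for a social security number as the primary key.

```python
import sqlite3

# In-memory database; the schema and data below are hypothetical examples.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)
conn.execute(
    """CREATE TABLE loan (
           loan_id     INTEGER PRIMARY KEY,
           customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
           amount      REAL NOT NULL
       )"""
)

# Two different people share the name John Smith; the key tells them apart.
conn.executemany(
    "INSERT INTO customer VALUES (?, ?)",
    [(1, "John Smith"), (2, "John Smith")],
)
conn.executemany(
    "INSERT INTO loan VALUES (?, ?, ?)",
    [(10, 1, 5000.0), (11, 2, 12000.0), (12, 2, 3000.0)],
)

# Join the two tables through the key field and total the loans per customer.
rows = conn.execute(
    """SELECT c.customer_id, c.name, SUM(l.amount)
       FROM customer AS c
       JOIN loan AS l ON l.customer_id = c.customer_id
       GROUP BY c.customer_id, c.name
       ORDER BY c.customer_id"""
).fetchall()

# A primary key must be unique, so reusing customer_id 1 is rejected.
try:
    conn.execute("INSERT INTO customer VALUES (1, 'Jane Doe')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(rows)
print(duplicate_rejected)
```

Grouping by name alone would merge the two John Smiths into one row; grouping by the key keeps them separate.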
Data-Driven DSS (a.k.a. Business Intelligence)
• Also known as Data Mining and OLAP (Online Analytical Processing)
• Finding non-obvious patterns in data
• Data mining generally implies using statistical techniques, such as correlation analysis and clustering, to find patterns and relationships in large databases
Operational and informational data stores
• Relational databases are optimized for efficiency in data storage
  – OLTP (Online transaction processing)
• Dimensional databases are optimized for efficiency in data retrieval
  – OLAP (Online analytical processing)
  – MOLAP (Multidimensional OLAP)
    • Stored in cubes that can be easily retrieved and aggregated
  – ROLAP (Relational OLAP)
    • “Fakes” MOLAP-style aggregation using a relational database
Data warehouse implementation: The data cube
A data cube stores its data in a single table. That table is organized along dimensions. This cube has three dimensions: store, product, and time.
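A small sketch may help make the cube idea concrete. It assumes a single fact table whose rows carry the three dimensions named above (store, product, time) plus one measure. The rows, the `units_sold` measure, and the `roll_up` helper are hypothetical and only illustrate aggregation along chosen dimensions; this is not how any particular OLAP product stores its cubes.

```python
from collections import defaultdict

# Hypothetical fact rows: (store, product, time, units_sold)
facts = [
    ("CA", "light bulb", "2000-Q1", 120),
    ("CA", "light bulb", "2000-Q2", 90),
    ("NY", "light bulb", "2000-Q1", 80),
    ("CA", "battery", "2000-Q1", 40),
    ("NY", "battery", "2000-Q1", 60),
]
DIMENSIONS = ("store", "product", "time")


def roll_up(rows, keep):
    """Sum units_sold over every dimension not listed in `keep`."""
    positions = [DIMENSIONS.index(name) for name in keep]
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[i] for i in positions)
        totals[key] += row[3]  # row[3] is the units_sold measure
    return dict(totals)


by_store = roll_up(facts, ["store"])
by_product_time = roll_up(facts, ["product", "time"])
grand_total = roll_up(facts, [])

print(by_store)
print(by_product_time)
print(grand_total)
```

Keeping fewer dimensions gives a coarser summary; keeping none collapses the cube to a single grand total.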
SQL (OLAP) query
• How many light bulbs did we sell in the 1st Qtr of 2000 in California vs. New York?
Data mining query
• How do the buyers of light bulbs in California and New York differ?
• What else do the buyers of light bulbs in California buy along with light bulbs?
• Which sales regions had anomalous sales in the 1st Qtr of 2000?
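The SQL (OLAP) question above has a fixed, known shape, so it can be answered with one aggregate query. The sketch below runs such a query through Python's built-in sqlite3 module. The `sales` table, its columns, and the sample rows are assumptions made for illustration; a real warehouse schema would differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sales table; dates are ISO strings so they compare correctly as text.
conn.execute(
    "CREATE TABLE sales (state TEXT, product TEXT, sale_date TEXT, units INTEGER)"
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [
        ("California", "light bulb", "2000-01-15", 30),
        ("California", "light bulb", "2000-03-02", 20),
        ("California", "light bulb", "2000-04-10", 99),  # 2nd quarter, excluded
        ("New York", "light bulb", "2000-02-20", 25),
        ("New York", "battery", "2000-02-21", 70),  # different product, excluded
        ("Texas", "light bulb", "2000-01-05", 40),  # different state, excluded
    ],
)

# Light bulbs sold in the 1st quarter of 2000, California vs. New York.
query = """
    SELECT state, SUM(units)
    FROM sales
    WHERE product = 'light bulb'
      AND sale_date >= '2000-01-01' AND sale_date < '2000-04-01'
      AND state IN ('California', 'New York')
    GROUP BY state
    ORDER BY state
"""
result = conn.execute(query).fetchall()
print(result)
```

The data mining questions, by contrast, do not reduce to a single predefined filter and sum; they ask which patterns exist rather than how many units match known criteria.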
Different Techniques
• Associations:
  – What other products should the store stock up on if the store has a sale on Home Electronics?
  – Attached mailing in direct marketing
• Sequence:
  – Analysis on click-stream
  – Medical research
• Time-series clustering
  – Find customers with similar pattern of telephone usages
  – Determine products with similar selling patterns
  – Find stocks with similar price movements
• Classification
  – Credit rating
  – Target marketing
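As one illustration of the association technique listed above, the sketch below counts how often pairs of items appear in the same basket and derives two common measures, support and confidence. The baskets are made-up data and the `confidence` helper is hypothetical; real association mining tools use more scalable algorithms than this brute-force pair count.

```python
from collections import Counter
from itertools import combinations

# Hypothetical shopping baskets, one set of items per transaction.
baskets = [
    {"beer", "diapers", "chips"},
    {"beer", "diapers"},
    {"diapers", "milk"},
    {"beer", "chips"},
    {"milk", "bread"},
]

item_counts = Counter()
pair_counts = Counter()
for basket in baskets:
    item_counts.update(basket)
    # Sort so each pair is always stored in the same order.
    pair_counts.update(combinations(sorted(basket), 2))


def confidence(antecedent, consequent):
    """Share of baskets containing `antecedent` that also contain `consequent`."""
    pair = tuple(sorted((antecedent, consequent)))
    return pair_counts[pair] / item_counts[antecedent]


# Support: share of all baskets that contain both beer and diapers.
support_beer_diapers = pair_counts[("beer", "diapers")] / len(baskets)

print(support_beer_diapers)
print(confidence("beer", "diapers"))
print(confidence("chips", "beer"))
```

A pair with high confidence suggests what else to stock up on when one of the two items goes on sale.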
A data architecture schema
Divisional DB
Corporate Data Warehouse
Cleaning
Collecting
ERP system
Data Mining
OLAP
Data Visualization
key issues:
• reliability
• scalability
• security
• speed
• data integrity
• availability
data may be boring, but it is the most critical element in IT architecture