session 10 data

using data strategically adopted some materials from David Schuff

Upload: youngjin-yoo

Post on 23-Jan-2015


Category: Education



TRANSCRIPT

Page 1: Session 10 data

using data strategically

adopted some materials from David Schuff

Page 2: Session 10 data

why data?

Page 3: Session 10 data

KeyBank saved $500,000 in its Home Equity Loan program by improving its direct mailing with data mining and a data warehouse.

Page 4: Session 10 data

The airline industry can forecast demand at the seat level for each flight to perform “yield” management.

Page 5: Session 10 data

learn insights such as “male customers aged 30+ buy a six-pack of beer and disposable diapers at the same time, around 2-4 am”

Page 6: Session 10 data

Progressive Insurance can offer usage-based insurance plans using its database

Page 7: Session 10 data

ESS – Executive Support Systems
DSS – Decision Support Systems
MIS – Management Information Systems
TPS – Transaction Processing Systems

Strategic Management
Tactical Management
Business Operations

Page 8: Session 10 data

data becomes the basis for decision making at each of these levels

Page 9: Session 10 data

two different types of data usage:

transactional vs. informational

Page 10: Session 10 data

reporting (using queries) vs. decision making (mining)

Page 11: Session 10 data

What is a database?

• Structured collection of data items
• Types of Database Management Systems (DBMS)
  – Hierarchical
  – Network
  – Relational
    • The one most often seen
    • Access, MS SQL Server, Oracle, DB2

Page 12: Session 10 data

What is a Relational Database?

• A set of two or more tables related to each other through key fields
• Key field
  – A field on which a table can be sorted (indexed)
• Primary Key
  – Field which uniquely identifies a record
  – Why have a primary key?
    • There may be many people named John Smith, so how do you tell them apart?
    • Use something which is unique, like a Social Security number
    • Social Security number is a common key field
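The primary-key idea above can be sketched with Python's built-in sqlite3 module. This is only an illustration under stated assumptions: the `customer` and `loan` tables, their columns, and the rows are made up, and a surrogate `customer_id` stands in for the Social Security number mentioned on the slide.

```python
import sqlite3

# In-memory database; table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,   -- uniquely identifies a record
        name        TEXT NOT NULL
    );
    CREATE TABLE loan (
        loan_id     INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
        amount      REAL NOT NULL
    );
""")

# Two different people share the same name; only the key tells them apart.
conn.executemany("INSERT INTO customer VALUES (?, ?)",
                 [(1, "John Smith"), (2, "John Smith")])
conn.executemany("INSERT INTO loan VALUES (?, ?, ?)",
                 [(10, 1, 5000.0), (11, 2, 12000.0)])

# The two tables are related through the key field customer_id.
rows = conn.execute("""
    SELECT c.customer_id, c.name, l.amount
    FROM customer AS c
    JOIN loan AS l ON l.customer_id = c.customer_id
    ORDER BY c.customer_id
""").fetchall()
print(rows)

# Reusing an existing primary key value is rejected by the database.
try:
    conn.execute("INSERT INTO customer VALUES (1, 'Jane Doe')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
print(duplicate_rejected)
```

Each John Smith keeps his own loan because the join matches on the key, not on the name.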

Page 13: Session 10 data

Data-Driven DSS (a.k.a. Business Intelligence)

• Also known as Data Mining and OLAP (Online Analytical Processing)
• Finding non-obvious patterns in data
• Data Mining generally implies using statistical techniques to find patterns and relationships in large databases
  – correlation analysis
  – clustering

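Correlation analysis, one of the statistical techniques named for data mining, can be sketched in a few lines of plain Python. The `pearson` helper and the two data series are illustrative assumptions (toy numbers, not results from any real campaign).

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Made-up toy data: mailings sent per branch and loans opened per branch.
mailings_sent = [100, 200, 300, 400, 500]
loans_opened = [2, 4, 6, 8, 10]

r = pearson(mailings_sent, loans_opened)
print(r)
```

A value near +1 or -1 suggests a strong linear relationship and a value near 0 suggests none; correlation alone does not show that one variable causes the other.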
Page 14: Session 10 data

Operational and

informational data stores

Page 15: Session 10 data

• Relational databases are optimized for efficiency in data storage
  – OLTP – Online transaction processing
• Dimensional databases are optimized for efficiency in data retrieval
  – OLAP – Online analytical processing
  – MOLAP – Multidimensional OLAP
    • Stored in cubes that can be easily retrieved and aggregated
• ROLAP – Relational OLAP
  – “Fakes” MOLAP-style aggregation using a relational database

Page 16: Session 10 data

Data warehouse implementation: The data cube

A data cube stores its data in a single table. That table is organized along dimensions. This cube has three dimensions: store, product, and time.
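As a rough sketch of the idea, a cube can be modeled as one table whose cells are addressed by a (store, product, time) key. The store names, products, quarters, unit counts, and the `roll_up` helper below are all hypothetical; real MOLAP engines use far more elaborate storage.

```python
from collections import defaultdict

# Order of the dimensions inside each cell key.
DIMENSIONS = ("store", "product", "time")

# One "table": every cell is keyed by (store, product, time). Toy data.
cube = {
    ("Store A", "light bulb", "Q1"): 120,
    ("Store A", "light bulb", "Q2"): 90,
    ("Store A", "battery", "Q1"): 40,
    ("Store B", "light bulb", "Q1"): 75,
    ("Store B", "battery", "Q2"): 60,
}

def roll_up(cube, keep):
    """Sum the cube's cells, keeping only the dimensions named in `keep`."""
    positions = [DIMENSIONS.index(name) for name in keep]
    totals = defaultdict(int)
    for key, units in cube.items():
        totals[tuple(key[i] for i in positions)] += units
    return dict(totals)

by_product = roll_up(cube, ["product"])
by_store_and_time = roll_up(cube, ["store", "time"])
print(by_product)
print(by_store_and_time)
```

Dropping a dimension from `keep` aggregates over it, which is the kind of retrieval a dimensional store is organized to make cheap.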

Page 17: Session 10 data

SQL (OLAP) query
• How many light bulbs did we sell in the 1st Qtr of 2000 in California vs. New York?

Data mining query
• How do the buyers of light bulbs in California and New York differ?
• What else do the buyers of light bulbs in California buy along with light bulbs?
• Which sales regions had anomalous sales in the 1st Qtr of 2000?
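The SQL (OLAP) question above could be answered with a single aggregate query. The sketch below runs it with Python's sqlite3 module against a hypothetical `sales` table; the schema and the rows are invented for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical fact table: one row per recorded sale batch.
conn.execute("""
    CREATE TABLE sales (
        state   TEXT,
        product TEXT,
        year    INTEGER,
        quarter INTEGER,
        units   INTEGER
    )
""")
conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?, ?)", [
    ("CA", "light bulb", 2000, 1, 300),
    ("CA", "light bulb", 2000, 1, 150),
    ("NY", "light bulb", 2000, 1, 200),
    ("NY", "light bulb", 2000, 2, 500),   # wrong quarter, filtered out
    ("CA", "battery",    2000, 1, 80),    # wrong product, filtered out
])

# Light bulbs sold in Q1 2000, California vs. New York.
query = """
    SELECT state, SUM(units) AS units_sold
    FROM sales
    WHERE product = 'light bulb'
      AND year = 2000
      AND quarter = 1
      AND state IN ('CA', 'NY')
    GROUP BY state
    ORDER BY state
"""
result = conn.execute(query).fetchall()
print(result)
```

The data mining questions have no such single query: they ask what differs or what is unusual, so the analyst does not know in advance which columns to filter and group on.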

Page 18: Session 10 data

Different Techniques

• Association:
  – What other products should the store stock up on if the store has a sale on Home Electronics?
  – Attached mailing in direct marketing
• Sequence:
  – Analysis on click-stream
  – Medical research
• Time-series clustering
  – Find customers with similar pattern of telephone usages
  – Determine products with similar selling patterns
  – Find stocks with similar price movements
• Classification
  – Credit rating
  – Target marketing
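Association analysis from the list above is often summarized with two measures, support and confidence. The sketch below computes both over a handful of made-up shopping baskets; the items, the baskets, and the helper names are assumptions for illustration, not a full rule-mining algorithm.

```python
# Made-up transactions: each basket is the set of items bought together.
baskets = [
    {"beer", "diapers", "chips"},
    {"beer", "diapers"},
    {"diapers", "milk"},
    {"beer", "chips"},
    {"milk", "bread"},
]

def support(itemset, baskets):
    """Fraction of baskets that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= basket for basket in baskets) / len(baskets)

def confidence(antecedent, consequent, baskets):
    """Of the baskets containing `antecedent`, the fraction that also contain `consequent`."""
    both = set(antecedent) | set(consequent)
    return support(both, baskets) / support(antecedent, baskets)

beer_and_diapers = support({"beer", "diapers"}, baskets)
beer_implies_diapers = confidence({"beer"}, {"diapers"}, baskets)
print(beer_and_diapers, beer_implies_diapers)
```

A rule such as "beer buyers also buy diapers" is usually considered interesting only when both measures clear thresholds chosen by the analyst.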

Page 19: Session 10 data

A data architecture schema

[Diagram labels: Divisional DB, ERP system, Collecting, Cleaning, Corporate Data Warehouse, Data Mining, OLAP, Data Visualization]

Page 20: Session 10 data

key issues:

• reliability

• scalability

• security

• speed

• data integrity

• availability

Page 21: Session 10 data

data may be boring, but it is the most critical element in IT architecture