MD240 - MIS
Oct. 4, 2005
Databases & the Data Asset
Harrah’s & Allstate Cases
Topics Covered
• Data & Information
– Data vs. information
• Architecture basics
– Data Warehouses & Data Marts
– Transactional vs. query systems
• Leveraging Data
– Harrah’s
– Allstate
Data, Information, & Knowledge
• Data - raw facts, figures, and details.
• Information - organized, meaningful, and useful interpretation of data. Has a context, answers a question.
• Knowledge - an awareness and understanding of a set of information and how that information can be put to best use.
• Many firms are data rich and info poor: victims of an old or poorly planned architecture
Examples of Data, Information, & Knowledge
Data: raw, no context
900,000   1,150,000   1,200,000   1,100,000

Information: meaningful, has context
            Quarter 1    Quarter 2
Post        900,000      1,150,000
Kellogg's   1,200,000    1,100,000

Knowledge: information above & other information creates an awareness of impact
Post lowered its prices after the first quarter. Price change has caused Post sales to rise at the expense of Kellogg’s
Clients, Servers, DBMS, and Databases
• Database
– a collection of related data. Usually organized according to topics: e.g. customer info, products, transactions
• Database Management System (DBMS)
– a program for creating & managing databases; ex. Oracle, MS-Access, MS SQL Server, IBM DB2, MySQL
• SQL - Structured Query Language
– Most popular relational database standard. Includes a language for creating & manipulating data.
DBMS - the program. Manages interaction with databases.
database - the collection of data. Created and defined to meet the needs of the organization.
Client - makes requests of the server
request
response
Server - responds to client requests
A Simple Database
• File/Table
– Customers
• Field/Column
– 5 shown: CUSTID, FIRST, LAST, CITY, STATE
• Record/Row
– 5 shown: one for each customer

CUSTID  FIRST    LAST        CITY           STATE  …
2001    John     Gallaugher  Newton         MA     …
2002    Abby     Johnson     Boston         MA     …
2003    Warren   Buffet      Omaha          NE     …
2004    Peter    Lynch       Marblehead     MA     …
2005    Charles  Schwab      San Francisco  CA     …
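As a rough sketch of how SQL creates and manipulates a table like this, the example below uses Python's built-in sqlite3 module (SQLite is not one of the DBMS products named on the slides; it is used here only because it needs no setup). The lowercase table and column names and the TEXT types are assumptions.

```python
import sqlite3

# In-memory SQLite database; stands in for any DBMS in this sketch.
conn = sqlite3.connect(":memory:")

# File/Table: customers. Fields/Columns: the five shown on the slide.
conn.execute(
    "CREATE TABLE customers ("
    "custid TEXT PRIMARY KEY, first TEXT, last TEXT, city TEXT, state TEXT)"
)

# Records/Rows: one per customer, copied from the slide.
rows = [
    ("2001", "John", "Gallaugher", "Newton", "MA"),
    ("2002", "Abby", "Johnson", "Boston", "MA"),
    ("2003", "Warren", "Buffet", "Omaha", "NE"),
    ("2004", "Peter", "Lynch", "Marblehead", "MA"),
    ("2005", "Charles", "Schwab", "San Francisco", "CA"),
]
conn.executemany("INSERT INTO customers VALUES (?, ?, ?, ?, ?)", rows)

# A simple query: which customers are in Massachusetts?
ma_customers = conn.execute(
    "SELECT custid, last FROM customers WHERE state = ? ORDER BY custid",
    ("MA",),
).fetchall()
print(ma_customers)
```

The CREATE TABLE statement defines the fields, each INSERT adds a record, and the SELECT turns stored data into an answer to a question.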
Now With More Data
Broker Table
BROKID  FIRST    LAST    …
B001    Ivan     Boesky  …
B002    Dennis   Levine  …
B003    Michael  Milken  …

Transaction Table
CUSTID  BROKID  BUY/SELL  STOCK  SHARES  PRICE    DATE        TIME
2001    B003    Buy       MSFT   1000    90 1/4   12/24/2003  12:01 PM
2001    B001    Buy       INTC   2400    80 1/8   7/3/2004    10:51 AM
2001    B003    Sell      IBM    3000    114 3/8  7/1/2004    9:03 AM
2002    B001    Sell      IBM    3000    110 1/8  6/30/2005   4:53 PM
2002    B003    Sell      INTC   2000    94 7/8   8/30/2005   3:15 PM
…       …       …         …      …       …        …           …

Customer Table (One) to Transaction Table (Many)
Broker Table (One) to Transaction Table (Many)
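The one-to-many link can be sketched with a SQL join, here run through Python's sqlite3 module. The table name trades, the lowercase column names, and the choice to load only a few columns are assumptions made for this illustration; the rows come from the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE brokers (brokid TEXT PRIMARY KEY, first TEXT, last TEXT)")
conn.execute(
    "CREATE TABLE trades (custid TEXT, brokid TEXT, side TEXT, stock TEXT, shares INTEGER)"
)

conn.executemany(
    "INSERT INTO brokers VALUES (?, ?, ?)",
    [("B001", "Ivan", "Boesky"), ("B002", "Dennis", "Levine"), ("B003", "Michael", "Milken")],
)
conn.executemany(
    "INSERT INTO trades VALUES (?, ?, ?, ?, ?)",
    [
        ("2001", "B003", "Buy", "MSFT", 1000),
        ("2001", "B001", "Buy", "INTC", 2400),
        ("2001", "B003", "Sell", "IBM", 3000),
        ("2002", "B001", "Sell", "IBM", 3000),
        ("2002", "B003", "Sell", "INTC", 2000),
    ],
)

# One broker row matches many trade rows via BROKID.
# LEFT JOIN keeps brokers that have no trades at all.
per_broker = conn.execute(
    "SELECT b.last, COUNT(t.brokid), COALESCE(SUM(t.shares), 0) "
    "FROM brokers b LEFT JOIN trades t ON t.brokid = b.brokid "
    "GROUP BY b.brokid ORDER BY b.brokid"
).fetchall()
print(per_broker)
```

Storing broker names once and referring to them by BROKID is what lets a single broker record serve any number of transactions.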
Meta-Data
• Data that describes the characteristics of stored data
• Enterprise Data Model
– consistent, cross-functional, shareable meta-data model
– standardization increases flexibility & use (data to info)
– facilitates the creation of data warehouses

Customer Table
Col. Name  Length  Type  …
CUSTID     4       Char  …
FIRST      10      Char  …
LAST       15      Char  …
CITY       15      Char  …
STATE      2       Char  …
…          …       …     …

Transaction Table
Col. Name  Length  Type   …
CUSTID     4       Char   …
BROKID     4       Char
BUY/SELL   1       Bool   …
STOCK      4       Char   …
SHARES     8       Num    …
PRICE      6.2     Money  …
…          …       …      …

Broker Table
Col. Name  Length  Type  …
BROKID     4       Char
FIRST      10      Char  …
LAST       15      Char  …
…          …       …     …

Customer Table (1) to Transaction Table (m)
Broker Table (1) to Transaction Table (m)
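Most DBMS products let you read meta-data back with a query. A minimal sketch using SQLite through Python's sqlite3 module is shown below; PRAGMA table_info is SQLite-specific, and other products expose meta-data through their own catalogs. The lowercase names are assumptions; the lengths and types follow the Customer Table listing.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Declare the Customer Table columns with the lengths listed in the meta-data.
conn.execute(
    "CREATE TABLE customers ("
    "custid CHAR(4), first CHAR(10), last CHAR(15), city CHAR(15), state CHAR(2))"
)

# PRAGMA table_info returns one row per column:
# (cid, name, declared type, notnull, default value, pk)
meta = [
    (name, col_type)
    for _cid, name, col_type, *_rest in conn.execute("PRAGMA table_info(customers)")
]
for name, col_type in meta:
    print(name, col_type)
```

Note that SQLite records the declared type but does not enforce the length; stricter products would reject a 5-character CUSTID.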
Warehouses & Marts
• Data Warehouse
– a database designed to support decision-making in an organization. It is structured for fast online queries and exploration. Data warehouses may aggregate enormous amounts of data from many different operational systems.
• Data Mart
– a database focused on addressing the concerns of a specific problem or business unit (e.g. Marketing, Engineering). Size doesn’t define data marts, but they tend to be smaller than data warehouses.
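A much-simplified sketch of the idea: operational rows are copied into a separate summary table that is built for querying, not for recording transactions. It uses Python's sqlite3 module, a single operational source, and one database for both roles, all of which are simplifying assumptions; the table names trades and trade_summary are hypothetical. The rows are the brokerage transactions from the earlier slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Operational (transactional) table: one row per trade.
conn.execute("CREATE TABLE trades (custid TEXT, side TEXT, stock TEXT, shares INTEGER)")
conn.executemany(
    "INSERT INTO trades VALUES (?, ?, ?, ?)",
    [
        ("2001", "Buy", "MSFT", 1000),
        ("2001", "Buy", "INTC", 2400),
        ("2001", "Sell", "IBM", 3000),
        ("2002", "Sell", "IBM", 3000),
        ("2002", "Sell", "INTC", 2000),
    ],
)

# Warehouse-style table: pre-aggregated so managerial queries stay fast
# and do not compete with the operational system.
conn.execute(
    "CREATE TABLE trade_summary AS "
    "SELECT stock, side, COUNT(*) AS trade_count, SUM(shares) AS total_shares "
    "FROM trades GROUP BY stock, side"
)

summary = conn.execute(
    "SELECT stock, side, trade_count, total_shares "
    "FROM trade_summary ORDER BY stock, side"
).fetchall()
print(summary)
```

A data mart would hold only the slice of such a summary that one business unit cares about.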
Data Warehouses & Data Marts
[Diagram: TPS & other operational systems; 3rd party data; Data Warehouse; Data Mart (Marketing); Data Mart (Engineering). Legend: query, mining, etc.; operational clients]
Differing System Demands
[Two charts, Managerial Systems and Operational Systems, each plotting network traffic & processor demands over time]
Query Tools & OLAP
• Query Tools
– user-led discovery. Can return individual records or summaries. Requests are formulated in advance (e.g. “show me all delinquent accounts in the northeast region during Q1”).
• OLAP - Online Analytical Processing
– user-led discovery. Data is explored via “drill down” into the data by selecting variables to summarize on. Results are usually reported in a cross-tab report or graph (e.g. “show me a tabular breakdown of sales by product, customer, and date”).
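The delinquent-accounts request could look roughly like the query below, run through Python's sqlite3 module. The accounts table, its columns, and every row in it are hypothetical and exist only to make the query runnable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table and data, invented for this illustration only.
conn.execute(
    "CREATE TABLE accounts (account_id TEXT, region TEXT, quarter TEXT, status TEXT)"
)
conn.executemany(
    "INSERT INTO accounts VALUES (?, ?, ?, ?)",
    [
        ("A100", "Northeast", "Q1", "delinquent"),
        ("A101", "Northeast", "Q1", "current"),
        ("A102", "Southwest", "Q1", "delinquent"),
        ("A103", "Northeast", "Q2", "delinquent"),
        ("A104", "Northeast", "Q1", "delinquent"),
    ],
)

# "Show me all delinquent accounts in the northeast region during Q1."
delinquent = [
    account_id
    for (account_id,) in conn.execute(
        "SELECT account_id FROM accounts "
        "WHERE status = 'delinquent' AND region = 'Northeast' AND quarter = 'Q1' "
        "ORDER BY account_id"
    )
]
print(delinquent)
```

The question is fixed before the query runs; an OLAP tool instead lets the user keep re-slicing the same data interactively.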
OLAP
• Online Analytical Processing. (example of cross-tab results presented below)
1. business unit
2. product type
3. year
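A cross-tab can be sketched in a few lines of plain Python. The example below summarizes on two variables only (brand and quarter) and reuses the cereal sales figures from the earlier Data, Information, & Knowledge slide; a real OLAP tool would add further dimensions such as business unit or year and let the user drill down.

```python
from collections import defaultdict

# Flat records: (brand, quarter, sales), taken from the cereal example.
records = [
    ("Post", "Quarter 1", 900_000),
    ("Post", "Quarter 2", 1_150_000),
    ("Kellogg's", "Quarter 1", 1_200_000),
    ("Kellogg's", "Quarter 2", 1_100_000),
]

# Build the cross-tab: rows are brands, columns are quarters.
crosstab = defaultdict(lambda: defaultdict(int))
for brand, quarter, sales in records:
    crosstab[brand][quarter] += sales

quarters = sorted({quarter for _, quarter, _ in records})
row_totals = {brand: sum(cells.values()) for brand, cells in crosstab.items()}
col_totals = {q: sum(cells[q] for cells in crosstab.values()) for q in quarters}

# Print the report with row and column totals.
print(f"{'':<10}" + "".join(f"{q:>12}" for q in quarters) + f"{'Total':>12}")
for brand, cells in crosstab.items():
    line = "".join(f"{cells[q]:>12,}" for q in quarters)
    print(f"{brand:<10}{line}{row_totals[brand]:>12,}")
print(f"{'Total':<10}" + "".join(f"{col_totals[q]:>12,}" for q in quarters))
```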
Executive Dashboard – aggregated report presentation of key business indicators
Data Mining
• automated information discovery process, uncovers important patterns in existing data
– can use neural networks, regression, decision trees, or other approaches.
– Requires ‘clean’, consistent data. Historical data must reflect the current environment.
• e.g. “What are the characteristics that identify if a customer is likely to default on a loan?”
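As a toy illustration of automated pattern discovery, the sketch below fits a decision stump (a one-level decision tree) to loan records. The records, the single debt-to-income feature, and the resulting threshold are all hypothetical; real data mining would use far more data, more variables, and a proper library.

```python
# Hypothetical historical loans: (debt-to-income ratio, defaulted?)
samples = [
    (0.10, False), (0.15, False), (0.22, False), (0.30, False), (0.41, False),
    (0.35, True), (0.38, True), (0.48, True), (0.55, True), (0.60, True),
]


def errors_at(threshold):
    """Count misclassified loans for the rule 'default if ratio >= threshold'."""
    return sum((ratio >= threshold) != defaulted for ratio, defaulted in samples)


# Try every observed ratio as a cut-off and keep the one with the fewest errors.
best_threshold = min((ratio for ratio, _ in samples), key=errors_at)
best_errors = errors_at(best_threshold)

print(f"Rule found: predict default when ratio >= {best_threshold}")
print(f"Misclassified {best_errors} of {len(samples)} historical loans")
```

The rule is discovered from the data, not supplied by the user, which is the difference from query tools and OLAP. It is only as good as the history it was fitted to.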
Insuring Success
Over 5 years:
• Revenues up 26% to $33.9 billion
• Profits up 180% to $3.1 billion
• Stock up 187%
– Insurance stocks up only 31%