kroenke umis9e inppt 09 - university of...

Business Intelligence Systems

Chapter 9

9-2

“Data Analysis, Where You Don’t Know the Second Question to Ask Until You See the Answer to the First One.”

Copyright © 2017 Pearson Education, Inc.

• Tracking race competitors from each event and having unbelievable success selling products to them.
• Want to match competitors to personal trainers in the same locale.
• Earn referral fee.
• How to track them? Mailing address? IP address?
• Got data and Excel to start.
• Serious data mining needs a data mart.

9-3

Study Questions

Q1: How do organizations use business intelligence (BI) systems?
Q2: What are the three primary activities in the BI process?
Q3: How do organizations use data warehouses and data marts to acquire data?
Q4: How do organizations use reporting applications?
Q5: How do organizations use data mining applications?
Q6: How do organizations use BigData applications?
Q7: What is the role of knowledge management systems?
Q8: What are the alternatives for publishing BI?
Q9: 2026?

9-4

Q1: How Do Organizations Use Business Intelligence (BI) Systems?

Components of Business Intelligence System

9-5

How Do Organizations Use BI?

9-6

What Are Typical Uses for BI?

• Identifying changes in purchasing patterns
– Important life events change what customers buy.
• Entertainment
– Netflix has data on watching, listening, and rental habits.
– Classify customers by viewing patterns.
• Predictive policing
– Analyze data on past crimes: location, date, time, day of week, type of crime, and related data.

9-7

Just-in-Time Medical Reporting

• Example of real-time data mining and reporting.
• Injection notification services
– Software analyzes the patient’s records and, if injections are needed, recommends them as the exam progresses.
• Blurry edge of medical ethics.

9-8

Q2: What Are the Three Primary Activities in the BI Process?

9-9

Using Business Intelligence to Find Candidate Parts at Falcon Security

• Identify parts that might qualify.
– Provided by vendors who make part design files available for sale.
– Purchased by larger customers.
– Frequently ordered parts.
– Ordered in small quantities.
• Used part weight and price as surrogates for simplicity.

9-10

Acquire Data: Extracted Order Data

• Query
Sales (CustomerName, Contact, Title, Bill Year, Number Orders, Units, Revenue, Source, PartNumber)
Part (PartNumber, Shipping Weight, Vendor)
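
Both extracted tables carry PartNumber, so they can be related before analysis. The following is a minimal Python sketch of that join, not the query used in the chapter: the column names come from the slide, while the sample rows and the `join_sales_to_parts` helper are hypothetical.

```python
# Hypothetical sample rows; only the column names come from the slide.
sales = [
    {"CustomerName": "Acme Co", "Units": 3, "Revenue": 450.0, "PartNumber": 100},
    {"CustomerName": "Acme Co", "Units": 1, "Revenue": 80.0, "PartNumber": 200},
    {"CustomerName": "Birch Ltd", "Units": 2, "Revenue": 300.0, "PartNumber": 100},
]
parts = [
    {"PartNumber": 100, "Shipping Weight": 1.5, "Vendor": "Vendor A"},
    {"PartNumber": 200, "Shipping Weight": 4.0, "Vendor": "Vendor B"},
]


def join_sales_to_parts(sales_rows, part_rows):
    """Inner-join Sales to Part on PartNumber (hypothetical helper)."""
    part_by_number = {p["PartNumber"]: p for p in part_rows}
    joined = []
    for sale in sales_rows:
        part = part_by_number.get(sale["PartNumber"])
        if part is not None:
            # Merge the matching part columns into the sales row.
            joined.append({**sale, **part})
    return joined


orders_and_parts = join_sales_to_parts(sales, parts)
for row in orders_and_parts:
    print(row["CustomerName"], row["PartNumber"], row["Vendor"], row["Units"])
```

Each joined row now holds the order columns next to the vendor and shipping weight, which is the shape the later summary and qualifying-parts queries would need.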

9-11

Sample Extracted Data: Part Data Table

9-12

Analyze Data

9-13

Sample Orders and Parts View Data

9-14

Creating Customer Summary Query

9-15

Customer Summary

9-16

Qualifying Parts Query Design

9-17

Publish Results: Qualifying Parts Query Results

9-18

Publish Results: Sales History for Selected Parts

9-19

Ethics Guide: Unseen Cyberazzi

• Data broker or data aggregator
– Acquires and purchases consumer and other data from public records, retailers, Internet cookie vendors, social media trackers, and other sources.
– Data for business intelligence to sell to companies and governments.

9-20

Ethics Guide: Unseen Cyberazzi (cont'd)

• Cheap cloud processing makes processing consumer data easier and less expensive.
• Processing happens in secret.
• Data brokers enable you to view data stored about you, but ...
– Difficult to learn how to request your data,
– Torturous process to file for it,
– Limited data usefulness.

9-21

Ethics Guide: Unseen Cyberazzi (cont'd)

• Do you know what data is gathered about you? What is done with it?
• Have you thought about conclusions data aggregators, or their clients, could make based on your use of frequent buyer cards?
• Concerned about what federal government might do with data it gets from data aggregators?
• Where does all of this end?
• What will life be like for your children or grandchildren?

9-22

Q3: How Do Organizations Use Data Warehouses and Data Marts to Acquire Data?

• Functions of a data warehouse
– Obtain data from operational, internal and external databases.
– Cleanse data.
– Organize and relate data.
– Catalog data using metadata.

9-23

Components of a Data Warehouse

9-24

Examples of Consumer Data That Can Be Purchased

9-25

Possible Problems with Source Data

Curse of dimensionality

9-26

Data Warehouses Versus Data Marts

9-27

Q4: How Do Organizations Use Reporting Applications?

• Create meaningful information from disparate data sources.
• Deliver information to user on time.
• Basic operations:
1. Sorting
2. Filtering
3. Grouping
4. Calculating
5. Formatting
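
The five basic operations can be chained on a small data set. This is an illustrative Python sketch only; the order rows, the field names, and the threshold of 50 are assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical order rows used only for illustration.
orders = [
    {"region": "West", "customer": "Ada", "amount": 120.0},
    {"region": "East", "customer": "Bo", "amount": 40.0},
    {"region": "West", "customer": "Cy", "amount": 75.5},
    {"region": "East", "customer": "Di", "amount": 210.0},
]

# 1. Sorting: largest orders first.
sorted_orders = sorted(orders, key=lambda o: o["amount"], reverse=True)

# 2. Filtering: keep orders of at least 50 (an assumed threshold).
large_orders = [o for o in sorted_orders if o["amount"] >= 50]

# 3. Grouping: collect the filtered orders by region.
by_region = defaultdict(list)
for order in large_orders:
    by_region[order["region"]].append(order)

# 4. Calculating: total amount per region.
totals = {region: sum(o["amount"] for o in rows) for region, rows in by_region.items()}

# 5. Formatting: one aligned line per region, in alphabetical order.
report_lines = [f"{region:<6}{total:>10.2f}" for region, total in sorted(totals.items())]
print("\n".join(report_lines))
```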

9-28

RFM Analysis: Example RFM Scores

• Recently
• Frequently
• Money

9-29

RFM Analysis: Classification Scheme

Copyright © 2012 Pearson Education, Inc. Publishing as Prentice Hall

• Recent orders
• Frequent orders
• Money: amount of money spent

[Figure: customers ranked into five groups and scored 1 to 5; 1 = top 20%, 3 = middle 20%, 5 = bottom 20%]
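
A rough Python sketch of the classification scheme follows, assuming that score 1 goes to the top 20% and score 5 to the bottom 20% on each of the three measures. The customer data and the `quintile_scores` helper are hypothetical, and ties are simply broken by input order.

```python
def quintile_scores(values, higher_is_better=True):
    """Score each value 1 (top 20%) to 5 (bottom 20%) by rank."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=higher_is_better)
    scores = [0] * len(values)
    for rank, index in enumerate(order):
        # Ranks 0..n-1 are cut into five equal-sized groups.
        scores[index] = rank * 5 // len(values) + 1
    return scores


# Hypothetical customers A to E.
customers = ["A", "B", "C", "D", "E"]
days_since_last_order = [3, 40, 12, 200, 75]  # lower means more recent
orders_per_year = [20, 2, 9, 1, 5]
total_spent = [900.0, 5000.0, 300.0, 50.0, 1200.0]

r = quintile_scores(days_since_last_order, higher_is_better=False)
f = quintile_scores(orders_per_year)
m = quintile_scores(total_spent)

for name, r_score, f_score, m_score in zip(customers, r, f, m):
    print(name, r_score, f_score, m_score)
```

In this made-up data, customer B has ordered neither recently nor often but has spent the most, which is the kind of contrast an RFM report is meant to surface.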

9-30

Example of Grocery Sales OLAP Report

OLAP Product Family by Store Type

http://www.tableausoftware.com
OLAP cube

9-31

Example of Expanded Grocery Sales OLAP Report

Drill down
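
An OLAP report sums a measure over chosen dimensions, and drilling down adds a further dimension to the same sum. The sketch below illustrates the idea in Python; the facts, the state dimension, and the `cube` helper are all hypothetical and are not taken from the grocery report in the figures.

```python
from collections import defaultdict

# Hypothetical sales facts: (product_family, store_type, state, store_sales_net).
facts = [
    ("Drink", "Supermarket", "CA", 100.0),
    ("Drink", "Supermarket", "WA", 60.0),
    ("Food", "Supermarket", "CA", 400.0),
    ("Food", "Small Grocery", "CA", 50.0),
    ("Food", "Supermarket", "WA", 250.0),
]

FAMILY, STORE_TYPE, STATE = 0, 1, 2


def cube(rows, dimensions):
    """Sum the measure (last field) over the chosen dimension positions."""
    totals = defaultdict(float)
    for row in rows:
        key = tuple(row[d] for d in dimensions)
        totals[key] += row[-1]
    return dict(totals)


# Product family by store type, as in the report title.
summary = cube(facts, [FAMILY, STORE_TYPE])

# Drill down: split each cell further by state (an assumed extra dimension).
drill_down = cube(facts, [FAMILY, STORE_TYPE, STATE])

print(summary[("Food", "Supermarket")])
print(drill_down[("Food", "Supermarket", "CA")])
```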

9-32

Example of Drilling Down into Expanded Grocery Sales OLAP Report

9-33

Q5: How Do Organizations Use Data Mining Applications?

Source disciplines

9-34

Unsupervised Data Mining

• No a priori hypothesis or model.
• Findings obtained solely by data analysis.
• Hypothesized model created to explain patterns found.
• Example: Cluster analysis.
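
Cluster analysis can be illustrated with k-means, one common clustering technique; the slides do not say which technique is intended, so this choice is an assumption. In the Python sketch below the ages, the starting centers, and the `k_means_1d` helper are hypothetical.

```python
def k_means_1d(values, centers, iterations=10):
    """Cluster numbers around the given starting centers (plain k-means)."""
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for v in values:
            # Assign each value to its nearest current center.
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Move each center to the mean of its cluster; keep it if the cluster is empty.
        centers = [sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)]
    return centers, clusters


# Hypothetical customer ages and starting centers.
ages = [22, 25, 27, 41, 45, 48, 63, 66]
centers, clusters = k_means_1d(ages, [20.0, 40.0, 60.0])
print(clusters)
```

No groups are specified in advance; they emerge from the data, and the analyst then proposes a model to explain them.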

9-35

Supervised Data Mining

• Uses a priori model.
• Prediction, such as regression analysis.
• Ex: CellPhoneWeekendMinutes = 12 + (17.5*CustomerAge) + (23.7*NumberMonthsOfAccount)
= 12 + 17.5*21 + 23.7*6 = 521.7 minutes
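
The example equation can be evaluated directly. A small Python sketch, using the coefficients from the slide; the function name `predict_weekend_minutes` is an assumption.

```python
def predict_weekend_minutes(customer_age, months_of_account):
    """Evaluate the slide's example regression equation."""
    return 12 + 17.5 * customer_age + 23.7 * months_of_account


# The slide's example: a 21-year-old customer with a 6-month-old account.
minutes = predict_weekend_minutes(21, 6)
print(round(minutes, 1))
```

In practice the coefficients would be estimated from historical data by regression analysis; here they are simply taken as given.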

9-36

Market-Basket Analysis

• Market-basket analysis
– Identify sales patterns in large volumes of data.
– Identify what products customers tend to buy together.
– Computes probabilities of purchases.
– Identify cross-selling opportunities.

Customers who bought fins also bought a mask.
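
The probabilities behind a statement like this are usually reported as support, confidence, and lift. The Python sketch below computes them for a handful of hypothetical baskets; it does not reproduce the dive shop data, and the function names are assumptions.

```python
# Hypothetical baskets; the dive shop figures from the slides are not reproduced here.
baskets = [
    {"fins", "mask"},
    {"fins", "mask", "tank"},
    {"mask"},
    {"fins", "mask"},
    {"tank"},
    {"fins"},
    {"mask", "tank"},
    {"fins", "mask", "weights"},
]


def support(items):
    """Fraction of baskets that contain every item in `items`."""
    items = set(items)
    return sum(items <= basket for basket in baskets) / len(baskets)


def confidence(given, then):
    """Estimated probability of buying `then` given that `given` was bought."""
    return support({given, then}) / support({given})


def lift(given, then):
    """Confidence divided by the base probability of buying `then`."""
    return confidence(given, then) / support({then})


print(support({"fins", "mask"}))
print(confidence("fins", "mask"))
print(lift("fins", "mask"))
```

A lift above 1 suggests that buying fins raises the chance of buying a mask, which is what marks a cross-selling opportunity.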

9-37

Market-Basket Example: Dive Shop
Transactions = 400

9-38

Decision Trees

• Unsupervised data mining technique.
• Hierarchical arrangement of criteria to predict a value or classification.
• Basic idea
– Select attributes most useful for classifying “pure groups.”
• Creates decision rules.

9-39

Credit Score Decision Tree

9-40

Decision Rules for Accepting or Rejecting Offer to Purchase Loans

• If percent past due is less than 50 percent, then accept loan.
– If percent past due is greater than 50 percent and
– If CreditScore is greater than 572.6 and
– If CurrentLTV is less than .94, then accept loan.
• Otherwise, reject loan.
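
These rules translate directly into code. A minimal Python sketch, assuming percent past due is given on a 0 to 100 scale; the function name `accept_loan` is hypothetical.

```python
def accept_loan(percent_past_due, credit_score, current_ltv):
    """Apply the slide's decision rules; percent_past_due is in percent (0-100)."""
    # Rule 1: less than 50 percent past due.
    if percent_past_due < 50:
        return True
    # Rule 2: more than 50 percent past due, but good score and low loan-to-value.
    if percent_past_due > 50 and credit_score > 572.6 and current_ltv < 0.94:
        return True
    # Otherwise, reject.
    return False


print(accept_loan(30, 500, 0.99))
print(accept_loan(60, 600, 0.90))
print(accept_loan(60, 550, 0.90))
```

As written, the rules do not cover a value of exactly 50 percent, so the sketch lets that case fall through to "reject".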

9-41

So What? BI for Securities Trading?

• Quantitative applications using BigData and BI.
– Analyze immense amounts of data over a broad spectrum of sources.
– Build and evaluate investment strategies.
• Two Sigma (www.twosigma.com)
– Analyzes financial statements, developing news, Twitter activity, weather reports, other sources.
– Develops and tests investment strategies.

9-42

Two Sigma’s Five-step Process

1. Acquire data
2. Create models
3. Evaluate models
4. Analyze risks
5. Place trades

Does it work? Two Sigma and other firms claim it does.

9-43

Q6: How Do Organizations Use BigData Applications?

• Huge volume – petabyte and larger.
• Rapid velocity – generated rapidly.
• Great variety
– Structured data, free-form text, log files, graphics, audio, and video.

9-44

MapReduce Processing Summary

Map Phase: Google search log broken into thousands of pieces

9-45

Google Trends on the Term Web 2.0

Reduce phase: results combined
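
The map and reduce phases can be imitated on one machine. The Python sketch below runs sequentially in a single process, whereas a real MapReduce job spreads the pieces across many computers; the search log and the function names are hypothetical.

```python
from collections import Counter
from functools import reduce

# Hypothetical search log; each entry is one search phrase.
search_log = ["web 2.0", "hadoop", "web 2.0", "big data", "hadoop", "web 2.0"]


def split_into_pieces(log, piece_size):
    """Break the log into pieces that could be processed independently."""
    return [log[i:i + piece_size] for i in range(0, len(log), piece_size)]


def map_phase(piece):
    """Count search phrases within a single piece."""
    return Counter(piece)


def reduce_phase(left, right):
    """Combine two partial counts into one."""
    return left + right


pieces = split_into_pieces(search_log, 2)
partial_counts = [map_phase(piece) for piece in pieces]
totals = reduce(reduce_phase, partial_counts, Counter())
print(totals.most_common())
```

Because each piece is counted without reference to the others, the map step is the part that can be handed to thousands of machines at once.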

9-46

Hadoop

• Open-source program supported by the Apache Foundation.
• Manages thousands of computers.
• Implements MapReduce
– Written in Java.
• Amazon.com supports Hadoop as part of its EC2 cloud.
• Query language named Pig (platform for large dataset analysis).
– Easy to master.
– Extensible.
– Automatically optimizes queries on map-reduce level.

9-47

Q7: What Is the Role of Knowledge Management Systems?

• Knowledge Management (KM)
– Creating value from intellectual capital and sharing knowledge with those who need that capital.
• Preserving organizational memory
– Capturing and storing lessons learned and best practices of key employees.
• Scope of KM is the same as that of social media (SM) in hyper-social organizations.

9-48

Benefits of Knowledge Management

• Improve process quality.
• Increase team strength.
• Goal:
– Enable employees to use organization’s collective knowledge.

9-49

What Are Expert Systems?

Expert systems

Rule-based IF/THEN

Encode human knowledge

Process IF side of rules

Report values of all variables

Knowledge gathered from human experts

Expert systems shells
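
A toy illustration of the idea in Python: a shell evaluates the IF side of each rule, applies the THEN side when it holds, and reports the values of all variables. The rules, variable names, and `run_rules` helper are hypothetical and far simpler than a real expert system.

```python
# Hypothetical rules for illustration; each rule is (IF condition, THEN updates).
rules = [
    (lambda v: v["disk_usage_percent"] >= 90, {"disk_nearly_full": True}),
    (
        lambda v: v.get("disk_nearly_full") and v["backups_enabled"],
        {"recommendation": "archive old backups"},
    ),
]


def run_rules(variables, rules):
    """Evaluate the IF side of each rule; apply THEN updates until nothing changes."""
    variables = dict(variables)
    changed = True
    while changed:
        changed = False
        for condition, updates in rules:
            if condition(variables):
                for name, value in updates.items():
                    if variables.get(name) != value:
                        variables[name] = value
                        changed = True
    return variables


# Report the values of all variables after the rules have fired.
result = run_rules({"disk_usage_percent": 95, "backups_enabled": True}, rules)
for name, value in result.items():
    print(name, "=", value)
```

Even in this tiny case the second rule depends on the first, which hints at why large rule sets become hard to maintain.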

9-50

Example of IF/THEN Rules

9-51

Drawbacks of Expert Systems

1. Difficult and expensive to develop.
– Labor intensive.
– Ties up domain experts.
2. Difficult to maintain.
– Changes cause unpredictable outcomes.
– Constantly need expensive changes.
3. Don’t live up to expectations.
– Can’t duplicate diagnostic abilities of humans.

9-52

What Are Content Management Systems (CMS)?

• Support management and delivery of documents, other expressions of employee knowledge.
• Challenges of Content Management
– Huge databases.
– Dynamic content.
– Documents refer to one another.
– Perishable contents.
– In many languages.

9-53

What are CMS Application Alternatives?

• In-house custom development
– Customer support develops in-house database applications to track customer problems.
• Off-the-shelf
– Horizontal market products (SharePoint).
– Vertical market applications.
• Public search engine
– Google, Bing.

9-54

How Do Hyper-Social Organizations Manage Knowledge?

• Hyper-social knowledge management
– Social media, and related applications, for management and delivery of organizational knowledge resources.
• Hyper-organization theory
– Framework for understanding KM.
– Focus shifts from knowledge and content to fostering authentic relationships among knowledge creators and users.

9-55

Hyper-Social KM Media

9-56

Resistance to Knowledge Sharing

• Employees reluctant to exhibit their ignorance.
• Employee competition.
• Remedy
– Strong management endorsement.
– Strong positive feedback.
– “Nothing wrong with praise or cash . . . especially cash.”

9-57

Q8: What Are the Alternatives for Publishing BI?

9-58

What Are the Two Functions of a BI Server?

Management and delivery

9-59

Q9: 2026?

• Exponentially more information about customers, better data mining techniques.
• Companies buy and sell your purchasing habits and psyche.
• Singularity
– Computer systems adapt and create their own software without human assistance.
– Machines will possess and create information for themselves.
– Will we know what the machines will know?

9-60

Guide: Semantic Security

1. Unauthorized access to protected data and information.
• Physical security.
• Passwords and permissions.
• Delivery system must be secure.
2. Unintended release of protected information through reports and documents.
3. What, if anything, can be done to prevent what Megan did?

9-61

Guide: Data Mining in the Real World

• Starting a data mining project, you never know how it will turn out.
• Problems:
– Dirty data
– Missing values
– Lack of knowledge at start of project
– Overfitting
– Probabilistic
– Seasonality
– High risk with unpredictable outcome

9-62

Active Review

Q1: How do organizations use business intelligence (BI) systems?
Q2: What are the three primary activities in the BI process?
Q3: How do organizations use data warehouses and data marts to acquire data?
Q4: How do organizations use reporting applications?
Q5: How do organizations use data mining applications?
Q6: How do organizations use BigData applications?
Q7: What is the role of knowledge management systems?
Q8: What are the alternatives for publishing BI?
Q9: 2026?

9-63

Case Study 9: Hadoop the Cookie Cutter

• Third-party cookie created by site other than one you visited.
• Most commonly occurs when a Web page includes content from multiple sources.
• DoubleClick
– IP address where content was delivered.
– DoubleClick instructs your browser to store a DoubleClick cookie.
– Records data in cookie log on DoubleClick’s server.

9-64

Case Study 9: Hadoop the Cookie Cutter (cont'd)

• Third-party cookie owner has history of what was shown, what ads you clicked, and intervals between interactions.

• Cookie log shows how you respond to ads and your pattern of visiting various Web sites where ads are placed.

• Firefox Lightbeam tracks and graphs cookies on your computer.

9-65

Firefox Lightbeam: Display on Start Up

No Cookies

9-66

After Visiting MSN.com

9-67

5 Sites Visited Yields 27 Third Parties

9-68

Sites Connected to DoubleClick
