data mining – techniques and applications

DATA MINING – TECHNIQUES AND APPLICATIONS, Charlie Chough, CS157B, Spring 2006


Page 1: DATA MINING – TECHNIQUES AND APPLICATIONS

DATA MINING – TECHNIQUES AND APPLICATIONS

Charlie Chough

CS157B

Spring 2006

Page 2: DATA MINING – TECHNIQUES AND APPLICATIONS


TOPICS

What is Data Mining?
How does Data Mining work?
What are the applications for Data Mining?
What are the issues surrounding Data Mining?

Page 3: DATA MINING – TECHNIQUES AND APPLICATIONS


What Is Data Mining?

Data Mining is the extraction of hidden predictive information from large databases.

Data Mining can predict future trends and behaviors, allowing businesses to make proactive, knowledge-driven decisions.

Page 4: DATA MINING – TECHNIQUES AND APPLICATIONS


What Is Data Mining?

The Evolution of Data Mining

Evolutionary Step: Data Collection (1960s)
Business Question: "What was my total revenue in the last five years?"
Enabling Technologies: Computers, tapes, disks
Characteristics: Retrospective, static data delivery

Evolutionary Step: Data Access (1980s)
Business Question: "What were unit sales in New England last March?"
Enabling Technologies: Relational databases (RDBMS), Structured Query Language (SQL), ODBC
Characteristics: Retrospective, dynamic data delivery at record level

Evolutionary Step: Data Warehousing & Decision Support (1990s)
Business Question: "What were unit sales in New England last March? Drill down to Boston."
Enabling Technologies: On-line analytic processing (OLAP), multidimensional databases, data warehouses
Characteristics: Retrospective, dynamic data delivery at multiple levels

Evolutionary Step: Data Mining (Emerging Today)
Business Question: "What’s likely to happen to Boston unit sales next month? Why?"
Enabling Technologies: Advanced algorithms, multiprocessor computers, massive databases
Characteristics: Prospective, proactive information delivery

Page 5: DATA MINING – TECHNIQUES AND APPLICATIONS


How Does Data Mining Work?

3-Phase Approach
1) Exploration
2) Model Building and Validation
3) Deployment

Page 6: DATA MINING – TECHNIQUES AND APPLICATIONS


How Does Data Mining Work?

Exploration

Data Preparation: Cleaning Data, Data Transformation, Feature Selection
Exploratory Data Analysis
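A minimal sketch of two of the preparation steps named above, cleaning and transformation. The slides do not specify any particular method, so the choices here (filling missing values with the mean, min-max rescaling) and the names `clean`, `min_max_scale` and `monthly_sales` are illustrative assumptions only.

```python
from statistics import mean

def clean(values):
    """Cleaning: replace missing readings (None) with the mean of the known values."""
    known = [v for v in values if v is not None]
    fill = mean(known)
    return [fill if v is None else v for v in values]

def min_max_scale(values):
    """Transformation: rescale values linearly onto the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:  # constant column, nothing to rescale
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical data with one missing entry.
monthly_sales = [120, None, 80, 200]
cleaned = clean(monthly_sales)
scaled = min_max_scale(cleaned)
print(cleaned)
print(scaled)
```

Mean imputation is only one of several possible cleaning choices; dropping incomplete records is another.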

Page 7: DATA MINING – TECHNIQUES AND APPLICATIONS


How Does Data Mining Work?

Model Building and Validation

Techniques: Decision Trees, Clustering, Association Rules

Page 8: DATA MINING – TECHNIQUES AND APPLICATIONS


How Does Data Mining Work?

Model Building and Validation

Decision Trees: Tree-shaped structures that represent sets of decisions.
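To make the idea of a tree-shaped set of decisions concrete, here is a small hand-built tree and a function that walks it. The tree, its field names (`visits_per_month`, `avg_basket`) and its thresholds are made up for illustration; in practice a tree would be learned from data rather than written by hand.

```python
# A hypothetical tree for deciding which promotion to offer a customer.
# Internal nodes ask a question; leaves carry a label.
tree = {
    "question": "visits_per_month",
    "threshold": 4,
    "below": {"label": "no offer"},
    "at_or_above": {
        "question": "avg_basket",
        "threshold": 50.0,
        "below": {"label": "small coupon"},
        "at_or_above": {"label": "loyalty offer"},
    },
}

def classify(node, record):
    """Walk from the root to a leaf, answering one question per level."""
    while "label" not in node:
        value = record[node["question"]]
        branch = "below" if value < node["threshold"] else "at_or_above"
        node = node[branch]
    return node["label"]

print(classify(tree, {"visits_per_month": 6, "avg_basket": 75.0}))
```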

Page 9: DATA MINING – TECHNIQUES AND APPLICATIONS


How Does Data Mining Work?

Model Building and Validation

Hierarchical Clustering: Clusters are discovered successively using previously established clusters.
Partitional Clustering: All clusters are discovered at once.

Page 10: DATA MINING – TECHNIQUES AND APPLICATIONS


How Does Data Mining Work?

Model Building and Validation

Hierarchical Clustering
Agglomerative Clustering (bottom-up): Each element starts as its own cluster, and clusters are merged into successively larger clusters.
Divisive Clustering (top-down): Begins with the entire data set and breaks it into successively smaller clusters.
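The agglomerative (bottom-up) direction can be sketched in a few lines. This is an illustrative version only: it assumes one-dimensional points and single-linkage distance (the gap between the closest members of two clusters), neither of which is specified in the slides, and the function name `agglomerate` is hypothetical.

```python
def agglomerate(points, target_clusters):
    """Single-linkage agglomerative clustering of 1-D points."""
    clusters = [[p] for p in points]  # every element starts as its own cluster
    while len(clusters) > target_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # single linkage: distance between the closest pair of members
                d = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]  # merge the two closest clusters
        del clusters[j]
    return [sorted(c) for c in clusters]

print(agglomerate([1.0, 1.5, 5.0, 5.2, 9.0], 3))
```

A divisive method would run the other way, starting from one cluster holding all five points and splitting it.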

Page 11: DATA MINING – TECHNIQUES AND APPLICATIONS


How Does Data Mining Work?

Model Building and Validation

Partitional Clustering: K-means Clustering, QT Clustering, Fuzzy C-means Clustering
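Of the partitional methods listed, K-means is the simplest to sketch. The version below is a plain-Python illustration under stated assumptions: 2-D points, squared Euclidean distance, and starting centroids supplied by the caller. The names `kmeans` and `sq_dist` and the sample points are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two 2-D points."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2

def kmeans(points, centroids, max_iterations=20):
    """Alternate between assigning points and moving centroids until stable."""
    centroids = list(centroids)
    groups = [[] for _ in centroids]
    for _ in range(max_iterations):
        # assignment step: each point joins its nearest centroid
        groups = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda k: sq_dist(p, centroids[k]))
            groups[nearest].append(p)
        # update step: move each centroid to the mean of its group
        updated = []
        for k, group in enumerate(groups):
            if group:
                updated.append((sum(x for x, _ in group) / len(group),
                                sum(y for _, y in group) / len(group)))
            else:
                updated.append(centroids[k])  # leave an empty cluster's centroid in place
        if updated == centroids:
            break
        centroids = updated
    return centroids, groups

points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, groups = kmeans(points, [(0, 0), (10, 10)])
print(centroids)
print(groups)
```

All clusters are produced together in each pass, which is what distinguishes this from the hierarchical approach.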

Page 12: DATA MINING – TECHNIQUES AND APPLICATIONS


How Does Data Mining Work?

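Model Building and Validation

Association Rules: Association Rules describe a correlation of events. A rule is measured by:
Support: how often the items in the rule appear together in the data.
Confidence: how often the rule holds when its antecedent is present.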
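Support and confidence can be computed directly from a list of transactions. The sketch below uses the usual definitions (support as the fraction of transactions containing an itemset, confidence as the fraction of antecedent-containing transactions that also contain the consequent); the transaction data and function names are made up for illustration.

```python
# Hypothetical transactions, each a set of purchased items.
transactions = [
    {"milk", "bread"},
    {"milk", "bread", "eggs"},
    {"bread", "eggs"},
    {"milk", "eggs"},
    {"milk", "bread", "butter"},
]

def count(itemset, transactions):
    """Number of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= t)

def support(itemset, transactions):
    """Fraction of all transactions that contain the itemset."""
    return count(itemset, transactions) / len(transactions)

def confidence(antecedent, consequent, transactions):
    """Of the transactions containing the antecedent, the fraction that also contain the consequent."""
    both = set(antecedent) | set(consequent)
    return count(both, transactions) / count(antecedent, transactions)

# Rule: milk -> bread
print(support({"milk", "bread"}, transactions))       # 3 of 5 transactions
print(confidence({"milk"}, {"bread"}, transactions))  # 3 of the 4 milk transactions
```

`confidence` raises `ZeroDivisionError` if the antecedent never occurs; a fuller version would handle that case.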

Page 13: DATA MINING – TECHNIQUES AND APPLICATIONS


How Does Data Mining Work?

Deployment

Select the best model from the previous phase and apply it to new data in order to generate predictions or estimates of the expected outcome.
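As a rough sketch of this step: pick the candidate with the lowest error on held-out validation data, then apply it to unseen inputs. The two candidate models, the validation pairs, and the choice of mean absolute error as the selection criterion are all assumptions for illustration, not something the slides prescribe.

```python
def mean_abs_error(model, examples):
    """Average absolute difference between predictions and known outcomes."""
    return sum(abs(model(x) - y) for x, y in examples) / len(examples)

# Hypothetical candidate models from the model-building phase.
candidates = {
    "flat": lambda x: 100.0,
    "linear": lambda x: 20.0 * x + 10.0,
}

# Hypothetical held-out (month, unit sales) pairs used for validation.
validation = [(1, 31.0), (2, 49.0), (3, 72.0), (4, 88.0)]

# Select the best model, then apply it to new data.
best_name = min(candidates, key=lambda name: mean_abs_error(candidates[name], validation))
best_model = candidates[best_name]
new_months = [5, 6]
predictions = [best_model(x) for x in new_months]
print(best_name, predictions)
```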

Page 14: DATA MINING – TECHNIQUES AND APPLICATIONS


Applications for Data Mining?

Retail Market Basket Analysis
Business Intelligence
Medicine
Law Enforcement

Page 15: DATA MINING – TECHNIQUES AND APPLICATIONS


Applications for Data Mining?

Retail Market Basket Analysis

Online retailers suggest other products based on what other customers have purchased.
Merchandising is based on which items customers purchase together, for example milk and bread, or diapers and beer.
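A toy version of "customers who bought this also bought" can be built by counting which items share a basket. The basket data and the function names `pair_counts` and `also_bought` are hypothetical; real retail systems work at a far larger scale and typically filter by support and confidence.

```python
from collections import Counter
from itertools import combinations

# Hypothetical shopping baskets.
baskets = [
    ["milk", "bread", "eggs"],
    ["diapers", "beer", "milk"],
    ["milk", "bread"],
    ["diapers", "beer"],
    ["bread", "butter"],
]

def pair_counts(baskets):
    """Count how many baskets contain each unordered pair of items."""
    counts = Counter()
    for basket in baskets:
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    return counts

def also_bought(item, baskets, top=3):
    """Items that most often share a basket with `item`, most frequent first."""
    counts = Counter()
    for basket in baskets:
        if item in basket:
            counts.update(set(basket) - {item})
    return [other for other, _ in counts.most_common(top)]

print(pair_counts(baskets)[("beer", "diapers")])
print(also_bought("diapers", baskets))
```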

Page 16: DATA MINING – TECHNIQUES AND APPLICATIONS


Applications for Data Mining?

Business Intelligence

Business Intelligence tools allow businesses to gather, store, access and analyze corporate data to aid in the decision-making process.

Customer Profiling
Inventory and Distribution Analysis
Market Research and Segmentation

Page 17: DATA MINING – TECHNIQUES AND APPLICATIONS


Applications for Data Mining?

Medicine

Data mining can be used to find combinations of prescription drugs that can have harmful interactions or side effects.

Page 18: DATA MINING – TECHNIQUES AND APPLICATIONS


Applications for Data Mining?

Law Enforcement

Law enforcement agencies are using data mining to help identify terrorists.

Page 19: DATA MINING – TECHNIQUES AND APPLICATIONS


Issues Surrounding Data Mining

Privacy Concerns
Data Dredging

Page 20: DATA MINING – TECHNIQUES AND APPLICATIONS


Issues Surrounding Data Mining

Privacy Concerns

Multi-state Anti-Terrorism Information Exchange (MATRIX): Massive collection of non-publicly available, personal data managed by a private Florida company.

Page 21: DATA MINING – TECHNIQUES AND APPLICATIONS


Issues Surrounding Data Mining

Privacy Concerns

Government agencies failed to properly implement privacy rules for data mining.
Lapses by the Dept. of Agriculture, FBI, IRS, Small Business Administration and State Department increased the risk of data exposure.

Page 22: DATA MINING – TECHNIQUES AND APPLICATIONS


Issues Surrounding Data Mining

Data Dredging

The practice of imposing patterns on data where none exist.
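A small simulation shows how dredging produces patterns from nothing. Every series below is pure random noise, so any correlation found is coincidence, yet searching enough candidates tends to turn up one that looks strong. The sample sizes, the seed, and the function names are arbitrary assumptions for illustration.

```python
import random
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def strongest_noise_correlation(n_samples=20, n_candidates=200, seed=0):
    """Search many random series for the one that best 'explains' a random target."""
    rng = random.Random(seed)
    target = [rng.gauss(0, 1) for _ in range(n_samples)]
    best = 0.0
    for _ in range(n_candidates):
        candidate = [rng.gauss(0, 1) for _ in range(n_samples)]
        best = max(best, abs(pearson(candidate, target)))
    return best

# The more candidates are searched, the stronger the best spurious match becomes.
print(strongest_noise_correlation(n_candidates=1))
print(strongest_noise_correlation(n_candidates=200))
```

This is why a pattern found by searching should be checked against data that was not used in the search.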

Page 23: DATA MINING – TECHNIQUES AND APPLICATIONS


Conclusions

Data Mining is a powerful tool with real-world applications.

But... Data Mining must be used carefully.

Page 24: DATA MINING – TECHNIQUES AND APPLICATIONS


References

Silberschatz, Korth, Sudarshan. 2006. Database System Concepts, 5th Ed. New York, NY: McGraw Hill.
Wikipedia.com. 2006. (http://en.wikipedia.org/wiki/Data_mining)
Thearling.com. 2006. (http://www.thearling.com)
Small Business Computing.com. 2006. (http://sbc.webopedia.com/TERM/B/Business_Intelligence.html)