data mining presentation

Data Mining Ahmet Fahri Kılıç b101316014

Post on 27-Apr-2017

Page 1: Data Mining Presentation

Data Mining

Ahmet Fahri Kılıç b101316014

Page 2: Data Mining Presentation

Concepts
• Evolution Step
• Why data mining?
• What is data mining?
• Data Mining Process
• A Simple Data Mining Example
• Conclusions

Page 3: Data Mining Presentation

Evolution Step
• 1960s, Data Collection: What was my total revenue in the last five years?
• 1980s, Data Access: What were unit sales in New England last March?
• 1990s, Data Warehousing: What were unit sales in New England last March? Drill down to Boston.
• Today, Data Mining: What’s likely to happen to Boston unit sales next month? Why?

Page 4: Data Mining Presentation

Why Data Mining?
Evolution of database technology:
• To collect a large amount of data: primitive file processing
• To store and query data efficiently: DBMS
• New challenge: a huge amount of data. How do we analyze and understand it? Data mining

Page 5: Data Mining Presentation

Why data mining?
• Large amounts of data (attributes and/or instances)
• Decisions need to be made from the data
• As the size of the data increases, the information that can easily be gained from it decreases
• The aim is not to automate the decision itself, but the process must be automatic or semi-automatic

Page 6: Data Mining Presentation

What Is Data Mining?
Data mining is an information activity that extracts
• Facts
• Useful information
• Patterns
from data in large databases.

It is the process of discovering “knowledge/pattern” in data.

Page 7: Data Mining Presentation

What is Data Mining?

“Data Mining can be described as the non-trivial process of extracting previously unknown, interesting, potentially useful and ultimately understandable knowledge from huge datasets.”

-- Usama Fayyad

Page 8: Data Mining Presentation

Knowledge Discovery
Knowledge Discovery in Databases (KDD): “The non-trivial process of identifying valid, novel, potentially useful, and ultimately understandable patterns in data.”

KDD Process (iterative and interactive)
• Identifying the problem
• Preparing the data
• Building the model (data mining)
• Using and monitoring the model

Page 9: Data Mining Presentation

The KDD Process

[Figure: Data → Selection → Target data → Preprocessing → Preprocessed data → Transformation → Transformed data → Data mining → Patterns → Interpretation/evaluation → Knowledge]

Page 10: Data Mining Presentation

Data Mining Process
Preprocessing
• Data Cleaning: handles noisy, erroneous, missing, and irrelevant data
• Data Integration: multiple, heterogeneous data sources are integrated into one
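The cleaning step can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the records, the field names, and the 0 to 120 age range are hypothetical and do not come from the presentation.

```python
# Hypothetical raw records; None marks a missing value.
records = [
    {"id": "c1", "age": 48, "state": "CA"},
    {"id": "c2", "age": None, "state": "TX"},  # missing age
    {"id": "c3", "age": 450, "state": "MA"},   # implausible age (noise/error)
    {"id": "c1", "age": 48, "state": "CA"},    # duplicate of the first record
]


def clean(rows, min_age=0, max_age=120):
    """Drop duplicate ids and rows whose age is missing or out of range."""
    seen = set()
    cleaned = []
    for row in rows:
        if row["id"] in seen:
            continue  # duplicate record
        age = row["age"]
        if age is None or not (min_age <= age <= max_age):
            continue  # missing or noisy value
        seen.add(row["id"])
        cleaned.append(row)
    return cleaned


cleaned = clean(records)
print(cleaned)
```

Dropping bad rows is only one possible policy; filling in missing values is another common choice, and which one is appropriate depends on the data.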

Page 11: Data Mining Presentation

Data Mining Process
1. Data Selection
• Identifying the data to be mined
• Choosing proper input attributes and output information
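As a rough sketch of data selection, the snippet below splits a record into chosen input attributes and one output attribute. The record and the choice of attributes are hypothetical, picked only to show the idea.

```python
# Hypothetical customer record.
record = {
    "id": "c1",
    "gender": "F",
    "city": "Boston",
    "state": "MA",
    "response": "Y",
}

# Assumed choice of attributes; a real project would justify this choice.
INPUT_ATTRIBUTES = ["gender", "state"]
OUTPUT_ATTRIBUTE = "response"


def select(row, inputs, output):
    """Return (input features, output value) for one record."""
    features = {name: row[name] for name in inputs}
    return features, row[output]


features, label = select(record, INPUT_ATTRIBUTES, OUTPUT_ATTRIBUTE)
print(features, label)
```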

Page 12: Data Mining Presentation

Data Mining Process
2. Data Transformation
• Putting the data into an appropriate format
• Organizing and converting data in desired ways
• Reducing the dimensionality of the data
• Normalizing data
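One common way to normalize a numeric attribute is min-max scaling, which maps each value v to (v - min) / (max - min). The sketch below assumes that technique; the slides do not say which normalization is meant, and the ages are hypothetical.

```python
def min_max_normalize(values):
    """Scale values linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are equal, so there is no spread to scale.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


ages = [20, 35, 50]  # hypothetical customer ages
normalized = min_max_normalize(ages)
print(normalized)  # [0.0, 0.5, 1.0]
```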

Page 13: Data Mining Presentation

Data Mining Process
3. Data Mining
• Implementing techniques to extract patterns of interest
• Data mining algorithms: Decision Trees, Neural Networks, Rule Induction, Nearest Neighbor, Genetic Algorithms
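To make one of the listed algorithms concrete, here is a minimal nearest-neighbor classifier (1-NN with Euclidean distance). The training points and labels are hypothetical, and `math.dist` needs Python 3.8 or later.

```python
import math


def nearest_neighbor(train, query):
    """Return the label of the training point closest to the query."""
    best_label, best_dist = None, math.inf
    for features, label in train:
        d = math.dist(features, query)  # Euclidean distance
        if d < best_dist:
            best_label, best_dist = label, d
    return best_label


# Hypothetical training set: (normalized features, response label).
train = [((0.1, 0.2), "N"), ((0.9, 0.8), "Y"), ((0.8, 0.9), "Y")]
print(nearest_neighbor(train, (0.7, 0.7)))
```

The classifier stores the training data as-is and does all its work at query time, which is why normalizing the attributes first matters: an unscaled attribute with a large range would dominate the distance.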

Page 14: Data Mining Presentation

Data Mining Process
4. Pattern Evaluation
• The mined patterns are tested and assessed to understand the synthesized knowledge and its range of validity
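A simple way to test a mined pattern is to compare its predictions with known outcomes on records that were held back from mining. The sketch below computes plain accuracy; the labels are hypothetical, and accuracy is only one of several possible measures.

```python
def accuracy(predicted, actual):
    """Fraction of predictions that match the known outcomes."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("cannot evaluate on an empty data set")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


# Hypothetical held-out responses and the model's predictions for them.
actual = ["Y", "N", "Y", "N"]
predicted = ["Y", "N", "N", "N"]
print(accuracy(predicted, actual))  # 0.75
```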

Page 15: Data Mining Presentation

Data Mining Process
5. Knowledge Presentation
• Presenting the results to the decision-maker

Page 16: Data Mining Presentation

A Simple Data Mining Example

Imagine a company that solicits business primarily by mailing advertisements:

• This company has compiled a store of data containing information about the customers receiving these ads, and the response history.

• This database could then be mined to discover trends, patterns, or systematic relationships that reveal some identifying characteristics of customers who responded favorably.

• With this knowledge, future advertisement mailings could be directed only to new customers with these characteristics.

Page 17: Data Mining Presentation

A Simple Data Mining Example

Customers with a known response:

customerID  gender  birthdate  city      state  response
045         F       01/08/69   Benicia   CA     Y
678         M       07/13/65   Dallas    TX     N
256         F       10/21/72   Boston    MA     Y

New customers (response not yet known):

customerID  gender  birthdate  city      state  response
024         M       10/23/48   Bethesda  MD
098         M       4/21/62    Lincoln   NE
781         F       12/21/76   Tucson    AZ

Data mining: identifying characteristics of responsive customers
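The idea in this example can be sketched as a one-attribute rule: for each value of an attribute, predict the most common response seen in the history. The rows below are copied from the two tables (birthdate and city omitted). Using gender as the attribute is an assumption made purely for illustration; three rows are far too few to support a real conclusion, and the slides do not say which characteristics were found.

```python
from collections import Counter, defaultdict

# Rows from the first table (known responses).
history = [
    {"customerID": "045", "gender": "F", "state": "CA", "response": "Y"},
    {"customerID": "678", "gender": "M", "state": "TX", "response": "N"},
    {"customerID": "256", "gender": "F", "state": "MA", "response": "Y"},
]

# Rows from the second table (responses unknown).
new_customers = [
    {"customerID": "024", "gender": "M", "state": "MD"},
    {"customerID": "098", "gender": "M", "state": "NE"},
    {"customerID": "781", "gender": "F", "state": "AZ"},
]


def learn_rule(rows, attribute):
    """Map each attribute value to its most common response."""
    counts = defaultdict(Counter)
    for row in rows:
        counts[row[attribute]][row["response"]] += 1
    return {value: c.most_common(1)[0][0] for value, c in counts.items()}


# Illustrative assumption: mine on gender only.
rule = learn_rule(history, "gender")

# Mail only to new customers the rule predicts will respond.
targets = [c["customerID"] for c in new_customers
           if rule.get(c["gender"]) == "Y"]
print(rule)
print(targets)
```

On these rows the rule maps F to Y and M to N, so only customer 781 would receive the mailing.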

Page 24: Data Mining Presentation

Conclusions
• In the real world, we need information to make decisions.
• We can gain information from data in databases through the data mining process.
• Data mining is also useful for forecasting the future based on data in databases.
• The key to successful data mining is having good-quality data.

Page 25: Data Mining Presentation

Questions?