![Page 1: Introduction to Data Mining & Warehousing · Data Mining Exploration & analysis, by automatic or semi-automatic means, of large quantities of data in order to discover meaningful](https://reader033.vdocument.in/reader033/viewer/2022050408/5f84d450f7f2b2079e5bd3ba/html5/thumbnails/1.jpg)
Introduction to Data Mining &
Warehousing
Objectives
After finishing this class,
students will:
Understand the basic terms
in Data Mining and
Warehousing
Understand their necessity
in business and IS
Objectives
Understand the basic
concepts of Data Mining
and Warehousing
Understand the
implementation processes of
those concepts
![Page 4: Introduction to Data Mining & Warehousing · Data Mining Exploration & analysis, by automatic or semi-automatic means, of large quantities of data in order to discover meaningful](https://reader033.vdocument.in/reader033/viewer/2022050408/5f84d450f7f2b2079e5bd3ba/html5/thumbnails/4.jpg)
Motivation
Motivation
Lots of data is being collected
and warehoused
Web data, e-commerce
Purchases at department/grocery stores
Bank/credit card transactions
Motivation
Computers have become cheaper and more powerful
Competitive Pressure is Strong
Need better, customized services for an edge (e.g. in Customer Relationship Management)
![Page 7: Introduction to Data Mining & Warehousing · Data Mining Exploration & analysis, by automatic or semi-automatic means, of large quantities of data in order to discover meaningful](https://reader033.vdocument.in/reader033/viewer/2022050408/5f84d450f7f2b2079e5bd3ba/html5/thumbnails/7.jpg)
Motivation
Data Warehousing
A data warehouse is a
repository of information
collected from multiple
sources, stored under a
unified schema, and
usually residing at a
single site
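
As a rough illustration of "multiple sources, unified schema", the sketch below maps records from two hypothetical source systems (a web shop and a store till) onto one shared set of field names. The record layouts, field names, and the helpers `from_web`, `from_store`, and `total_by_customer` are assumptions made for this example and do not come from the slides.

```python
# Hypothetical source records; each system uses its own field names and units.
web_orders = [
    {"order_id": "W1", "customer": "alice", "total_usd": 30.0},
    {"order_id": "W2", "customer": "bob", "total_usd": 12.5},
]
store_sales = [
    {"receipt": 501, "cust_name": "Alice", "amount_cents": 1999},
]


def from_web(rec):
    """Map a web-shop record onto the unified schema."""
    return {"source": "web", "sale_id": rec["order_id"],
            "customer": rec["customer"].lower(), "amount": rec["total_usd"]}


def from_store(rec):
    """Map a store-till record onto the unified schema (cents to dollars)."""
    return {"source": "store", "sale_id": str(rec["receipt"]),
            "customer": rec["cust_name"].lower(),
            "amount": rec["amount_cents"] / 100}


# The "warehouse": one collection, one schema, all sources.
warehouse = [from_web(r) for r in web_orders] + [from_store(r) for r in store_sales]


def total_by_customer(rows):
    """A query that only works because every row shares the same fields."""
    totals = {}
    for row in rows:
        totals[row["customer"]] = totals.get(row["customer"], 0.0) + row["amount"]
    return totals


print(total_by_customer(warehouse))
```

Once the rows share one schema, a question such as "how much did each customer spend across all channels" becomes a single pass over the data.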
Data Warehousing
A data warehouse is
only half the solution to
mining huge volumes of data
![Page 10: Introduction to Data Mining & Warehousing · Data Mining Exploration & analysis, by automatic or semi-automatic means, of large quantities of data in order to discover meaningful](https://reader033.vdocument.in/reader033/viewer/2022050408/5f84d450f7f2b2079e5bd3ba/html5/thumbnails/10.jpg)
Typical Data Warehousing Architecture
![Page 11: Introduction to Data Mining & Warehousing · Data Mining Exploration & analysis, by automatic or semi-automatic means, of large quantities of data in order to discover meaningful](https://reader033.vdocument.in/reader033/viewer/2022050408/5f84d450f7f2b2079e5bd3ba/html5/thumbnails/11.jpg)
Data Mining
Data Mining
Exploration & analysis, by automatic or semi-automatic means, of large quantities of data in order to discover meaningful patterns
Non-trivial extraction of implicit, previously unknown and potentially useful information from data
Data Mining
Data mining is the process of discovering
actionable information from large sets of data.
Data mining uses mathematical analysis to
derive patterns and trends that exist in data.
Typically, these patterns cannot be discovered by
traditional data exploration because the
relationships are too complex or because there is
too much data.
Data Mining
Is often treated as a synonym for Knowledge Discovery in Databases (KDD)
Discovering the knowledge
Data cleaning
Remove noise and irrelevant data
Data integration
Combine multiple data sources
Data selection
Retrieve the data relevant to the analysis task
Discovering the knowledge
Data transformation
Transform and consolidate data into a form that is appropriate for mining
Data Mining
Pattern evaluation
Identify the interesting patterns that represent knowledge
Discovering the knowledge
Knowledge Presentation
Visualize and present the mined knowledge to the user
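
The knowledge discovery steps listed on these slides can be sketched as a small pipeline, with one function per step. This is a toy illustration only: the sample records, the 0.3 threshold, and every function name are assumptions made for the example, and the "mining" step is a simple threshold rule rather than a real algorithm.

```python
# Hypothetical raw records from two sources; None marks a missing value.
source_a = [{"id": 1, "age": 25, "spend": 120.0},
            {"id": 2, "age": None, "spend": 80.0}]
source_b = [{"id": 3, "age": 40, "spend": 300.0},
            {"id": 4, "age": 35, "spend": 20.0}]


def clean(rows):
    """Data cleaning: drop records that contain missing values."""
    return [r for r in rows if all(v is not None for v in r.values())]


def integrate(*sources):
    """Data integration: combine several sources into one collection."""
    return [r for src in sources for r in src]


def select(rows, fields):
    """Data selection: keep only the fields relevant to the task."""
    return [{f: r[f] for f in fields} for r in rows]


def transform(rows):
    """Data transformation: scale spend relative to the largest value."""
    top = max(r["spend"] for r in rows)
    return [{**r, "spend_scaled": r["spend"] / top} for r in rows]


def mine(rows, threshold=0.3):
    """Data mining (toy rule): flag each customer as a high spender or not."""
    return [{"id": r["id"], "high_spender": r["spend_scaled"] >= threshold}
            for r in rows]


def evaluate(patterns):
    """Pattern evaluation: keep only the patterns of interest."""
    return [p for p in patterns if p["high_spender"]]


def present(patterns):
    """Knowledge presentation: format the result for a reader."""
    return "High spenders: " + ", ".join(str(p["id"]) for p in patterns)


rows = integrate(clean(source_a), clean(source_b))
rows = transform(select(rows, ["id", "spend"]))
report = present(evaluate(mine(rows)))
print(report)
```

Each function consumes the output of the previous one, which mirrors the order of the steps on the slides.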
![Page 18: Introduction to Data Mining & Warehousing · Data Mining Exploration & analysis, by automatic or semi-automatic means, of large quantities of data in order to discover meaningful](https://reader033.vdocument.in/reader033/viewer/2022050408/5f84d450f7f2b2079e5bd3ba/html5/thumbnails/18.jpg)
Typical Data mining architecture
![Page 19: Introduction to Data Mining & Warehousing · Data Mining Exploration & analysis, by automatic or semi-automatic means, of large quantities of data in order to discover meaningful](https://reader033.vdocument.in/reader033/viewer/2022050408/5f84d450f7f2b2079e5bd3ba/html5/thumbnails/19.jpg)
Data mining tasks
Prediction Methods
Use some variables to predict unknown or
future values of other variables.
Data mining tasks
Description Methods
Find human-interpretable patterns that
describe the data.
Data mining tasks
Classification [Predictive]
Clustering [Descriptive]
Association Rule Discovery [Descriptive]
Sequential Pattern Discovery [Descriptive]
Regression [Predictive]
Deviation Detection [Predictive]
Data mining Algorithms
Classification algorithms
predict one or more discrete variables,
based on the other attributes in the dataset.
Regression algorithms
predict one or more continuous variables,
such as profit or loss, based on other
attributes in the dataset.
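
To make the contrast concrete, the sketch below pairs a very simple classifier (one-nearest-neighbour, which returns a discrete label) with a very simple regressor (a least-squares line, which returns a continuous value). Both techniques and all data values are chosen here for illustration and are assumptions; the slides do not name specific algorithms.

```python
def nearest_neighbour_class(train, x):
    """Classification: return the discrete label of the closest training point."""
    closest = min(train, key=lambda pair: abs(pair[0] - x))
    return closest[1]


def fit_line(points):
    """Regression: least-squares fit of y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical data: (income, bought the product?) pairs with a discrete target.
labelled = [(1.0, "no"), (2.0, "no"), (6.0, "yes"), (7.0, "yes")]
# Hypothetical data: (advertising spend, profit) pairs with a continuous target.
numeric = [(1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]

label = nearest_neighbour_class(labelled, 5.5)
slope, intercept = fit_line(numeric)
predicted_profit = slope * 4.0 + intercept

print(label, predicted_profit)
```

The classifier can only answer with one of the labels it has seen, while the fitted line can produce any number, including values outside the training data.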
Data mining Algorithms
Segmentation algorithms
divide data into groups, or clusters, of
items that have similar properties
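
One common way to form such groups is k-means clustering, sketched below for one-dimensional data. The choice of k-means, the starting centers, and the spending figures are assumptions made for this example; the slides do not prescribe a particular segmentation algorithm.

```python
def kmeans_1d(values, centers, iterations=10):
    """Assign each value to its nearest center, then move each center
    to the mean of the values assigned to it, and repeat."""
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Keep the old center when a cluster ends up empty.
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers, clusters


# Hypothetical monthly spend per customer.
spend = [10, 12, 11, 95, 100, 105]
centers, clusters = kmeans_1d(spend, centers=[0.0, 50.0])

print(centers)
print(clusters)
```

No labels are supplied: the two groups emerge only from how close the values are to each other, which is what makes this a descriptive rather than a predictive task.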
Data mining Algorithms
Association algorithms
find correlations between different
attributes in a dataset. The most common
application of this kind of algorithm is for
creating association rules, which can be
used in a market basket analysis.
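
Association rules are usually scored with two measures, support and confidence. The sketch below computes both for a rule such as "bread implies milk" over a handful of made-up baskets; the basket contents and the function names are assumptions for illustration.

```python
# Hypothetical shopping baskets, one set of items per transaction.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Share of transactions with the antecedent that also hold the consequent."""
    both = support(antecedent | consequent, transactions)
    return both / support(antecedent, transactions)


s = support({"bread", "milk"}, baskets)
c = confidence({"bread"}, {"milk"}, baskets)

print(s, c)
```

Here bread and milk appear together in 2 of 4 baskets, and 2 of the 3 baskets containing bread also contain milk.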
Data mining Algorithms
Sequence analysis algorithms
summarize frequent sequences or
episodes in data, such as a Web path flow.
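
A minimal sketch of this idea for Web path flow is to count how many sessions contain each page-to-page step and keep the steps that occur often enough. The session data, the `frequent_steps` name, and the `min_count` parameter are assumptions for this example; real sequence mining algorithms also handle longer patterns and gaps.

```python
from collections import Counter

# Hypothetical click paths, one list of pages per visitor session.
sessions = [
    ["home", "search", "product", "cart"],
    ["home", "product", "cart"],
    ["home", "search", "product"],
]


def frequent_steps(paths, min_count):
    """Return page-to-page steps that appear in at least min_count sessions."""
    counts = Counter()
    for path in paths:
        # set() counts each step at most once per session.
        counts.update(set(zip(path, path[1:])))
    return {step: n for step, n in counts.items() if n >= min_count}


common = frequent_steps(sessions, min_count=2)
print(common)
```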
Data mining Models
The patterns and trends that are discovered can be
collected and defined as a data mining model.
Data mining Models
Forecasting
Estimating sales, predicting server loads or
server downtime
Data mining Models
Risk and probability
Choosing the best customers for targeted
mailings, determining the probable
break-even point for risk scenarios, assigning
probabilities to diagnoses or other
outcomes
Data mining Models
Recommendations
Determining which products are likely to be
sold together, generating
recommendations
Data mining Models
Finding sequences
Analyzing customer selections in a
shopping cart, predicting next likely events
Data mining Models
Grouping
Separating customers or events into
clusters of related items, analyzing and
predicting affinities
References
J. Han, M. Kamber, Data Mining:
Concepts and Techniques, 2001
Dr. Ir. Muhammad Ikhwan Jambak, MEng