7/28/2019 UNIT- 3 data mining pt.pptx
UNIT- 3
DATA MINING
-
WHAT IS DATA MINING?
Data mining is a collection of techniques for the efficient, automated discovery of previously unknown, valid, novel, useful, and understandable patterns in large databases.
Patterns must be actionable so that they may be used in an enterprise's decision-making process.
It is also known as knowledge discovery.
Data mining refers to the extraction of hidden predictive information and patterns from large databases.
-
DATA MINING
[Diagram: data mining turns raw information into hidden information and patterns]
-
NEED FOR DATA MINING
Data mining has found many applications in the last few years for a number of reasons:
Growth in OLTP data
Growth in data due to cards
Growth in data due to the web
Growth in data due to telephone, banking, and medical transactions
Growth in data storage capacity
Decline in the cost of processing
Availability of software and tools
-
DATA MINING PROCESS
Requirement analysis
Clearly define goals
Clearly define business problem
Data Selection and Collection
Cleaning and preparing data
Data mining exploration and validation
Implementing, evaluating, and monitoring
Results visualization
-
CRISP-DM (CROSS-INDUSTRY STANDARD PROCESS FOR DATA MINING) MODEL
-
KNOWLEDGE DISCOVERY PROCESS
Data mining is a process of knowledge discovery.
Data mining discovers knowledge or information that you never knew was present in your data.
Knowledge manifests itself as relationships and patterns.
-
RELATIONSHIPS
Data mining discovers relationships between two or more different objects, along with the time dimension.
Sometimes a relationship may occur between the same objects.
The discovery of relationships is a key result of data mining.
-
KNOWLEDGE DISCOVERY PHASES
1. Define Business Objectives
2. Prepare data
3. Perform data mining
4. Evaluate results
5. Present Discoveries
6. Incorporate Usage of Discoveries
-
KNOWLEDGE DISCOVERY PROCESS
Determination of business objectives
Selection and preparation of data
Application of suitable data mining techniques
Evaluation and application of results
-
DATA MINING VS DATA WAREHOUSE
OLAP
Shows what is happening in the enterprise
Summary data
Limited dimensions
Small number of attributes
User-driven, interactive analysis
Multidimensional, drill-down, and slice-and-dice
Mature and widely used
DATA MINING
Predicts the future based on why this is happening
Detailed transaction-level data
Large dimensions
Many dimension attributes
Data-driven, automatic knowledge discovery
Data preparation and mining tools
Still emerging
-
RELATIONSHIP OF DATA WAREHOUSE AND DATA MINING
Data mining algorithms need large amounts of detailed data, and a data warehouse contains data at the lowest level of detail.
Data mining needs integrated and cleansed data, and a data warehouse contains data that is suitable for data mining.
The infrastructure of a data warehouse is robust, with parallel processing technology and relational database systems, which supports the type of data that data mining needs.
-
DATA MINING TECHNIQUES
Association rule mining or market basket analysis
Supervised classification
Cluster Analysis
Web data mining
Search engines
-
TECHNIQUES
Association rule mining or market basket analysis
Transaction    Items bought
1              bread, milk, cheese
2              bread, cheese
3              jam, milk
4              milk, ghee
Here the combination of bread and cheese occurs most often: it appears in transactions 1 and 2.
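The observation about bread and cheese can be checked by counting how many transactions contain each pair of items. The sketch below is only an illustration of that counting step, not a full association rule miner. The function name pair_support and the variable names are assumptions introduced for this example.

```python
from collections import Counter
from itertools import combinations

# The four transactions from the table above.
transactions = [
    {"bread", "milk", "cheese"},
    {"bread", "cheese"},
    {"jam", "milk"},
    {"milk", "ghee"},
]

def pair_support(baskets):
    """Count how many baskets contain each unordered pair of items (hypothetical helper)."""
    counts = Counter()
    for basket in baskets:
        # Sort the items so that each pair is always stored in the same order.
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return counts

counts = pair_support(transactions)
best_pair, best_count = counts.most_common(1)[0]
support = best_count / len(transactions)  # fraction of transactions containing the pair
print(best_pair, best_count, support)     # ('bread', 'cheese') 2 0.5
```

Every other pair appears in only one transaction, so bread and cheese has the highest support (2 of 4 transactions).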
-
SUPERVISED CLASSIFICATION
This data mining technique originates from machine learning.
It helps in predicting whether an individual is likely to respond to a direct mailing.
It identifies good risks for granting loans or insurance.
Rule for insurance
If sex= female & 19
-
CLUSTER ANALYSIS
Cluster analysis groups data into disjoint sets whose members are similar in some respect. It also attempts to place dissimilar data in different clusters.
For example, in the context of supermarket data, clustering sale items to organize shelf space effectively is a typical application.
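As a rough illustration of grouping items into disjoint clusters, here is a minimal k-means sketch. The slides do not name a clustering algorithm, so k-means is an assumption made for this example. The sale items, their (price, units sold per week) values, and the starting centroids are also hypothetical.

```python
import math

def kmeans(points, centroids, iterations=10):
    """Plain k-means: assign each point to its nearest centroid, then re-average."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            # Each point goes to exactly one cluster, so the clusters are disjoint.
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Move each centroid to the mean of its cluster (keep it if the cluster is empty).
        centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster)) if cluster else centroid
            for cluster, centroid in zip(clusters, centroids)
        ]
    return clusters, centroids

# Hypothetical sale items described by (price, units sold per week).
items = [(1.0, 90), (1.2, 95), (0.9, 100), (8.0, 10), (7.5, 12), (9.0, 8)]
clusters, centroids = kmeans(items, centroids=[items[0], items[3]])
print(clusters)
```

With this data the cheap, fast-selling items end up in one cluster and the expensive, slow-selling items in the other, which is the kind of grouping that could inform a shelf layout.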
-
WEB DATA MINING
It has an impact on the way we search for and find information at home and at work.
Evaluation of learning sites
Example: student portal
Check login
Notes
Submit online test
Chat page for clarifying doubts
-
SEARCH ENGINES
A search engine is a huge database of web pages together with a software package for indexing and retrieving pages, which enables users to find information.
Ranking helps the user choose the best result.
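Indexing, retrieval, and ranking can be sketched with a small inverted index. This is a simplified illustration only. The page texts are hypothetical, and ranking by raw word counts is an assumption made for this example, not how any particular search engine works.

```python
from collections import Counter, defaultdict

# Hypothetical pages; a real engine would gather these from the web.
pages = {
    "page1": "data mining finds patterns in data",
    "page2": "a data warehouse stores integrated data",
    "page3": "search engines index web pages",
}

def build_index(docs):
    """Inverted index: word -> {page id: number of occurrences}."""
    index = defaultdict(Counter)
    for page_id, text in docs.items():
        for word in text.lower().split():
            index[word][page_id] += 1
    return index

def search(index, query):
    """Return matching pages, ranked by the total count of query words they contain."""
    scores = Counter()
    for word in query.lower().split():
        for page_id, count in index.get(word, {}).items():
            scores[page_id] += count
    return [page_id for page_id, _ in scores.most_common()]

index = build_index(pages)
results = search(index, "data mining")
print(results)  # ['page1', 'page2']
```

Here page1 ranks first because it contains both query words, while page3 contains neither and is not returned.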
-
DATA MINING APPLICATIONS
Customer Segmentation
Market basket analysis
Risk management
Fraud detection
Demand prediction
Delinquency Tracking