lecture-01 (data mining concepts & technologies)
TRANSCRIPT
-
8/8/2019 Lecture-01 (Data Mining Concepts & Technologies)
Lecture_01
-
Data mining has attracted a great deal of
attention in industry and in society as a
whole in recent years, due to the wide
availability of huge amounts of data and
the need to turn such data into useful
information and knowledge.
-
The fast-growing, tremendous amount of
data, collected and stored in large and
numerous data repositories, has far
exceeded our human ability for
comprehension. This has been described
as a "data rich but information poor"
situation. These large data repositories
become data tombs: data archives that
are seldom visited.
-
Consequently, important decisions are
often made based not on the
information-rich data stored in data
repositories, but rather on a decision
maker's intuition, simply because the
decision maker does not have the tools
to extract the valuable knowledge
embedded in the vast amounts of data.
-
Data mining refers to extracting or
mining knowledge from large amounts
of data. Data mining tools perform data
analysis and may uncover important data
patterns, contributing greatly to business
strategies, knowledge bases, and
scientific and medical research.
-
Data mining refers to the process of finding interesting information in large repositories of data.
The term data mining also refers to the step in the knowledge discovery process in which special algorithms are used in hopes of identifying interesting patterns in data. These interesting patterns are then analyzed, yielding knowledge.
The desired outcome of data mining activities is to discover knowledge that is not explicit in the data, and to put that knowledge to use.
-
1960s Data Collection and Database Creation
1970s Relational Data Model
1980s Relational Database Management Systems
1990s-2000s Data Warehousing and Data Mining
-
The term data mining is a misnomer: what is
mined is knowledge, not data.
Alternative Names
Knowledge discovery (mining) in
databases (KDD), knowledge extraction,
data/pattern analysis, data archeology,
data dredging, information harvesting,
business intelligence, etc.
-
[Figure: the knowledge discovery process. Databases -> Data Cleaning and Data Integration -> Data Warehouse -> Selection -> Task-relevant Data -> Data Mining -> Pattern Evaluation]
-
Data Cleaning: To remove noise and
inconsistent data.
Data Integration: Where multiple data
sources may be combined. Data cleaning
and data integration are performed as a
preprocessing step, where the resulting
data is stored in a data warehouse.
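
A minimal sketch of these two preprocessing steps, using only the Python standard library. The record layouts, the field names (customer_id, amount, region), and the rule that a negative amount counts as inconsistent are all assumptions made up for illustration; real cleaning rules depend on the data at hand.

```python
# Two hypothetical data sources; every value here is invented for illustration.
sales_db = [
    {"customer_id": 1, "amount": "120.50"},
    {"customer_id": 2, "amount": None},       # missing value (noise)
    {"customer_id": 3, "amount": "-15.00"},   # assumed inconsistent: negative sale
    {"customer_id": 4, "amount": "80.00"},
]
customer_db = [
    {"customer_id": 1, "region": "north"},
    {"customer_id": 4, "region": "south"},
    {"customer_id": 5, "region": "north"},
]


def clean(records):
    """Data cleaning: drop rows whose amount is missing or negative."""
    cleaned = []
    for row in records:
        if row["amount"] is None:
            continue
        amount = float(row["amount"])
        if amount < 0:
            continue
        cleaned.append({"customer_id": row["customer_id"], "amount": amount})
    return cleaned


def integrate(sales, customers):
    """Data integration: combine both sources on the shared customer_id key."""
    region_by_id = {c["customer_id"]: c["region"] for c in customers}
    return [
        {**s, "region": region_by_id[s["customer_id"]]}
        for s in sales
        if s["customer_id"] in region_by_id
    ]


# The integrated result plays the role of the data warehouse in this sketch.
warehouse = integrate(clean(sales_db), customer_db)
print(warehouse)
```

Only customers 1 and 4 survive both steps: customers 2 and 3 are dropped by cleaning, and customer 5 has no sales record to join with.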
-
Data Selection: Where data relevant to the
analysis task are retrieved from the
database.
Data Transformation: Where data are
transformed or consolidated into forms
appropriate for mining.
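
Selection and transformation can be sketched in the same way. The rows, the field names, and the choice of per-customer totals followed by min-max scaling are assumptions for illustration only; the slides do not prescribe a particular transformation.

```python
# Hypothetical warehouse rows; the values are invented for illustration.
warehouse = [
    {"customer_id": 1, "region": "north", "amount": 120.0, "note": "gift"},
    {"customer_id": 1, "region": "north", "amount": 80.0, "note": ""},
    {"customer_id": 4, "region": "south", "amount": 50.0, "note": ""},
    {"customer_id": 7, "region": "north", "amount": 300.0, "note": "bulk"},
]


def select(rows, region):
    """Data selection: keep only the rows and attributes relevant to the task."""
    return [
        {"customer_id": r["customer_id"], "amount": r["amount"]}
        for r in rows
        if r["region"] == region
    ]


def consolidate(rows):
    """Data transformation (consolidation): total amount per customer."""
    totals = {}
    for r in rows:
        totals[r["customer_id"]] = totals.get(r["customer_id"], 0.0) + r["amount"]
    return totals


def min_max_scale(totals):
    """Data transformation (normalization): rescale totals to the range [0, 1]."""
    lo, hi = min(totals.values()), max(totals.values())
    if hi == lo:
        return {k: 0.0 for k in totals}
    return {k: (v - lo) / (hi - lo) for k, v in totals.items()}


task_data = select(warehouse, "north")   # task-relevant data
totals = consolidate(task_data)
scaled = min_max_scale(totals)
print(totals, scaled)
```

Selection narrows the warehouse to the task-relevant rows and columns; the two transformation helpers then put the values into a form a mining method could consume.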
-
Data Mining: Where intelligent methods are applied in order to extract data patterns.
Pattern Evaluation: To identify the truly interesting patterns representing knowledge, based on some interestingness measures.
Knowledge Representation: Where visualization and knowledge representation techniques are used to present the mined knowledge to the user.
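
The last three steps can be illustrated with one simple technique among many: counting how often pairs of items occur together, and using support (the fraction of transactions containing the pair) as the interestingness measure. The transactions, the 0.4 support threshold, and the function names are hypothetical; the slides do not name a specific mining method or measure.

```python
from collections import Counter
from itertools import combinations

# Hypothetical market-basket transactions, invented for illustration.
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "milk"},
    {"bread", "butter"},
    {"bread", "eggs", "milk"},
]


def mine_pairs(transactions):
    """Data mining: count how often each pair of items occurs together."""
    counts = Counter()
    for items in transactions:
        for pair in combinations(sorted(items), 2):
            counts[pair] += 1
    return counts


def evaluate(counts, n_transactions, min_support):
    """Pattern evaluation: keep pairs whose support meets the threshold."""
    return {
        pair: count / n_transactions
        for pair, count in counts.items()
        if count / n_transactions >= min_support
    }


def present(patterns):
    """Knowledge representation: readable lines, strongest pattern first."""
    ordered = sorted(patterns.items(), key=lambda kv: (-kv[1], kv[0]))
    return [f"{a} + {b}: support {s:.2f}" for (a, b), s in ordered]


counts = mine_pairs(transactions)
patterns = evaluate(counts, len(transactions), min_support=0.4)
for line in present(patterns):
    print(line)
```

Raising or lowering min_support changes which patterns count as interesting, which is why the evaluation step is kept separate from the mining step in this sketch.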