Lecture-01 (Data Mining Concepts & Technologies)

Upload: zubeyir

Post on 10-Apr-2018


TRANSCRIPT

  • 8/8/2019 Lecture-01 (Data Mining Concepts & Technologies)


    Lecture_01


    Data mining has attracted a great deal of attention in industry and in
    society as a whole in recent years, due to the wide availability of huge
    amounts of data and the need for turning such data into useful
    information and knowledge.


    The fast-growing, tremendous amount of data, collected and stored in
    large and numerous data repositories, has far exceeded our human ability
    for comprehension. This situation has been described as a "data rich but
    information poor" situation. These large data repositories become data
    tombs: data archives that are seldom visited.


    Consequently, important decisions are often made based not on the
    information-rich data stored in data repositories, but rather on a
    decision maker's intuition, simply because the decision maker does not
    have the tools to extract the valuable knowledge embedded in the vast
    amount of data.


    Data mining refers to extracting or "mining" knowledge from large
    amounts of data. Data mining tools perform data analysis and may uncover
    important data patterns, contributing greatly to business strategies,
    knowledge bases, and scientific and medical research.


    Data mining refers to the process of finding interesting information in
    large repositories of data.

    The term data mining also refers to the step in the knowledge discovery
    process in which special algorithms are used in hopes of identifying
    interesting patterns in data. These interesting patterns are then
    analyzed, yielding knowledge. The desired outcome of data mining
    activities is to discover knowledge that is not explicit in the data,
    and to put that knowledge to use.


    1960s: Data collection and database creation
    1970s: Relational data model
    1980s: Relational database management systems
    1990s-2000s: Data warehousing and data mining


    The term data mining is a misnomer: what is mined is knowledge, not data.

    Alternative Names

    Knowledge discovery (mining) in databases (KDD), knowledge extraction,
    data/pattern analysis, data archeology, data dredging, information
    harvesting, business intelligence, etc.


    [Figure: steps of the knowledge discovery process. Labels: Databases,
    Data Cleaning, Data Integration, Data Warehouse, Selection,
    Task-relevant Data, Data Mining, Pattern Evaluation]


    Data Cleaning: To remove noise and inconsistent data.

    Data Integration: Where multiple data sources may be combined. Data
    cleaning and data integration are performed as a preprocessing step,
    where the resulting data is stored in a data warehouse.
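The two preprocessing steps above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the two sources (`sales_db`, `web_log`), their fields, the plausible-age bound, and the choice of median filling are hypothetical and are not taken from the lecture.

```python
from statistics import median

# Hypothetical records from two hypothetical sources (made-up values).
sales_db = [
    {"customer_id": 1, "age": 30, "city": "Ankara "},
    {"customer_id": 2, "age": None, "city": "izmir"},   # missing value
    {"customer_id": 3, "age": 290, "city": "Bursa"},    # noisy value
    {"customer_id": 4, "age": 40, "city": "ANKARA"},    # inconsistent spelling
    {"customer_id": 5, "age": 50, "city": "Izmir"},
]
web_log = [
    {"customer_id": 1, "visits": 12},
    {"customer_id": 3, "visits": 4},
    {"customer_id": 5, "visits": 8},
]


def clean(records, max_age=120):
    """Data cleaning: replace missing or implausible ages with the median
    of the plausible ones, and normalise city spellings."""
    plausible = [r["age"] for r in records
                 if r["age"] is not None and 0 <= r["age"] <= max_age]
    fill = median(plausible)
    cleaned = []
    for r in records:
        age = r["age"]
        if age is None or not (0 <= age <= max_age):
            age = fill
        cleaned.append({**r, "age": age, "city": r["city"].strip().title()})
    return cleaned


def integrate(left, right, key="customer_id"):
    """Data integration: combine two sources with a left join on `key`.
    Records with no match on the right keep only their own fields."""
    lookup = {r[key]: r for r in right}
    return [{**l, **lookup.get(l[key], {})} for l in left]


# The cleaned, combined records play the role of the data warehouse.
warehouse = integrate(clean(sales_db), web_log)
for row in warehouse:
    print(row)
```

Median filling is only one possible way to handle noise and missing values; dropping the affected records or flagging them for review would be equally valid choices, depending on the analysis task.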


    Data Selection: Where data relevant to the analysis task are retrieved
    from the database.

    Data Transformation: Where data is transformed or consolidated into
    forms appropriate for mining.
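As a rough sketch of selection and transformation, the example below keeps only the attributes and rows relevant to a hypothetical analysis task, then rescales one numeric attribute to the range [0, 1]. The records, the attribute names, and the use of min-max scaling are illustrative assumptions; the lecture does not prescribe a particular transformation.

```python
# Hypothetical warehouse records (made-up values).
warehouse = [
    {"customer_id": 1, "city": "Ankara", "visits": 12, "spend": 250.0},
    {"customer_id": 2, "city": "Izmir", "visits": 0, "spend": 0.0},
    {"customer_id": 3, "city": "Bursa", "visits": 4, "spend": 50.0},
    {"customer_id": 4, "city": "Ankara", "visits": 8, "spend": 150.0},
]


def select(records, attributes, predicate=lambda r: True):
    """Data selection: keep rows that satisfy `predicate` and only the
    attributes relevant to the analysis task."""
    return [{a: r[a] for a in attributes} for r in records if predicate(r)]


def min_max_scale(records, attribute):
    """Data transformation: rescale a numeric attribute to [0, 1].
    If all values are equal, every scaled value is 0.0."""
    values = [r[attribute] for r in records]
    lo, hi = min(values), max(values)
    span = hi - lo
    return [
        {**r, attribute: (r[attribute] - lo) / span if span else 0.0}
        for r in records
    ]


# Task-relevant data: active customers, described by visits and spend.
task_data = select(
    warehouse,
    attributes=["customer_id", "visits", "spend"],
    predicate=lambda r: r["visits"] > 0,
)
prepared = min_max_scale(task_data, "spend")
for row in prepared:
    print(row)
```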


    Data Mining: Where intelligent methods are applied in order to extract
    data patterns.

    Pattern Evaluation: To identify the truly interesting patterns
    representing knowledge, based on some interestingness measures.

    Knowledge Representation: Where visualization and knowledge
    representation techniques are used to present the mined knowledge to
    the user.
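The last three steps can be sketched with one simple technique. The example below is an assumption chosen for illustration, not the lecture's own method: it mines pairwise association rules from hypothetical shopping transactions, uses support and confidence as the interestingness measures, and presents the surviving rules as plain text. The transactions and thresholds are made up.

```python
from collections import Counter
from itertools import combinations

# Hypothetical transactions (made-up data).
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk"},
]


def mine_pair_rules(transactions):
    """Data mining: derive every rule lhs -> rhs between two items that
    occur together, with its support and confidence."""
    n = len(transactions)
    item_counts = Counter(item for t in transactions for item in t)
    pair_counts = Counter(
        pair for t in transactions for pair in combinations(sorted(t), 2)
    )
    rules = []
    for (a, b), count in pair_counts.items():
        for lhs, rhs in ((a, b), (b, a)):
            rules.append({
                "lhs": lhs,
                "rhs": rhs,
                "support": count / n,                   # share of all transactions
                "confidence": count / item_counts[lhs], # share of lhs transactions
            })
    return rules


def evaluate(rules, min_support=0.4, min_confidence=0.7):
    """Pattern evaluation: keep only rules that meet both thresholds."""
    return [r for r in rules
            if r["support"] >= min_support and r["confidence"] >= min_confidence]


def present(rules):
    """Knowledge representation: render rules as readable lines,
    most confident first."""
    ordered = sorted(rules, key=lambda r: (-r["confidence"], r["lhs"], r["rhs"]))
    return [
        f"{r['lhs']} -> {r['rhs']} "
        f"(support {r['support']:.2f}, confidence {r['confidence']:.2f})"
        for r in ordered
    ]


interesting = evaluate(mine_pair_rules(transactions))
for line in present(interesting):
    print(line)
```

Of the six candidate rules here, three pass the thresholds. Raising `min_confidence` or `min_support` narrows the result further, which is the sense in which the evaluation step filters mined patterns down to the interesting ones.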
