data mining primitives

Upload: jayashree-sathiyanarayanan

Post on 27-Feb-2018

  • 7/25/2019 Data Mining Primitives

    Data Mining Primitives

    Chapter 4

    Overview

    Data mining without user interaction is usually not helpful

    Users may request a few data mining primitives to be performed on data.

    specification of data to be mined

    set of data in which the user is interested

    kinds of knowledge to be mined

    background knowledge useful in guiding the discovery process

    specification of how knowledge should be visualized

    Pieces of a Data Mining Task

    What data to mine

    list of relevant attributes

    Kinds of knowledge to be mined

    characterization

    discrimination

    association

    classification

    clustering

    evolution analysis

    Background knowledge

    concept hierarchies

    Interestingness Measures separate patterns from knowledge

    Presentation and visualization of patterns
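
    The pieces listed above can be pictured as fields of a single task description. The following is a minimal sketch under assumptions: the class name MiningTask, its field names, and the example values are hypothetical and do not come from any particular data mining system.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# The kinds of knowledge named on the slide.
KNOWLEDGE_KINDS = {
    "characterization", "discrimination", "association",
    "classification", "clustering", "evolution analysis",
}


@dataclass
class MiningTask:
    """Hypothetical container for the pieces of one data mining request."""
    source: str                     # database or warehouse to mine
    relevant_attributes: List[str]  # what data to mine
    knowledge_kind: str             # one of KNOWLEDGE_KINDS
    # Background knowledge: dimension name -> levels, specific to general.
    concept_hierarchies: Dict[str, List[str]] = field(default_factory=dict)
    min_support: float = 0.0        # interestingness thresholds (fractions)
    min_confidence: float = 0.0
    presentation: str = "rules"     # how discovered patterns are shown

    def validate(self) -> None:
        """Raise ValueError if the task description is inconsistent."""
        if self.knowledge_kind not in KNOWLEDGE_KINDS:
            raise ValueError(f"unknown kind of knowledge: {self.knowledge_kind}")
        if not self.relevant_attributes:
            raise ValueError("at least one relevant attribute is required")
        for name in ("min_support", "min_confidence"):
            if not 0.0 <= getattr(self, name) <= 1.0:
                raise ValueError(f"{name} must lie between 0 and 1")


# Illustrative values only.
task = MiningTask(
    source="sales_warehouse",
    relevant_attributes=["age", "income", "item"],
    knowledge_kind="association",
    concept_hierarchies={"location": ["city", "province_or_state", "country"]},
    min_support=0.022,
    min_confidence=0.60,
)
task.validate()
```

    Keeping the thresholds and hierarchies next to the data selection makes it clear that all of them are user-supplied inputs to the same request.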

    Task Relevant Data

    Minable view of the data

    name of database or warehouse

    name of tables or cubes

    conditions for selecting useful data

    type = home entertainment

    type = fruit

    attributes or dimensions (e.g., name and price)
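
    As a rough illustration of how task-relevant data is specified, the sketch below filters rows by a selection condition and keeps only the chosen attributes. The table contents and the helper name task_relevant are made up for this example.

```python
# Toy rows standing in for a table; all names and values are made up.
ITEMS = [
    {"name": "VCR", "type": "home entertainment", "price": 199.0},
    {"name": "TV", "type": "home entertainment", "price": 499.0},
    {"name": "apple", "type": "fruit", "price": 0.5},
    {"name": "desk", "type": "furniture", "price": 120.0},
]


def task_relevant(rows, condition, attributes):
    """Keep rows that satisfy `condition`, projected onto `attributes`."""
    return [
        {attr: row[attr] for attr in attributes}
        for row in rows
        if condition(row)
    ]


# Condition: type = home entertainment; attributes: name and price.
view = task_relevant(
    ITEMS,
    condition=lambda row: row["type"] == "home entertainment",
    attributes=["name", "price"],
)
```

    In a relational setting the same selection would normally be written as a query against the named database and tables; the list comprehension only mirrors the select-and-project idea.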

    Kind of Knowledge to be Mined

    Templates or metapatterns may be used to specify the form of the results:

    P(X: customer, W) AND Q(X, Y) => buys(X, Z)

    age(X, 30..39) AND income(X, 40K..49K) => buys(X, VCR) [2.2%, 60%]

    Might specify to classify an input file of customers as likely to buy or not likely to buy

    [2.2%, 60%] indicates that 60% confidence is to be used and that such cases should represent 2.2% of all transactions (the support).
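
    The two bracketed numbers on a rule can be computed directly from data. The sketch below assumes the usual definitions: support is the fraction of all records that satisfy both sides of the rule, and confidence is the fraction of records satisfying the left side that also satisfy the right side. The customer records are invented, so the resulting support differs from the 2.2% quoted on the slide.

```python
def support_and_confidence(records, antecedent, consequent):
    """Return (support, confidence) of the rule antecedent => consequent."""
    matches_left = [r for r in records if antecedent(r)]
    matches_both = [r for r in matches_left if consequent(r)]
    support = len(matches_both) / len(records) if records else 0.0
    confidence = len(matches_both) / len(matches_left) if matches_left else 0.0
    return support, confidence


# Ten invented customers: (age, income, items bought).
CUSTOMERS = [
    {"age": 32, "income": 45000, "bought": {"VCR"}},
    {"age": 35, "income": 42000, "bought": {"VCR", "TV"}},
    {"age": 38, "income": 48000, "bought": {"TV"}},
    {"age": 31, "income": 41000, "bought": {"VCR"}},
    {"age": 36, "income": 47000, "bought": set()},
    {"age": 25, "income": 45000, "bought": {"VCR"}},
    {"age": 33, "income": 60000, "bought": {"VCR"}},
    {"age": 50, "income": 30000, "bought": {"TV"}},
    {"age": 45, "income": 44000, "bought": set()},
    {"age": 29, "income": 39000, "bought": {"VCR"}},
]


def in_target_segment(c):
    """Left side: age in 30..39 and income in 40K..49K."""
    return 30 <= c["age"] <= 39 and 40000 <= c["income"] <= 49999


def buys_vcr(c):
    """Right side: the customer bought a VCR."""
    return "VCR" in c["bought"]


support, confidence = support_and_confidence(CUSTOMERS, in_target_segment, buys_vcr)
```

    Five customers fall in the target segment and three of them bought a VCR, so on this toy data the rule has 30% support and 60% confidence.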

    Background Knowledge: Concept Hierarchies

    Concept Hierarchy

    defines a sequence of mappings from a set of low-level concepts to higher-level, more general concepts.

    location

    time

    product

    Types of hierarchies

    schema hierarchy

    set-grouping hierarchy

    operation-derived hierarchy

    rule-based hierarchy

    Concept Hierarchies

    Schema

    total or partial order among attributes, usually those of a warehouse dimension (time, location, etc.)

    Set-Group

    values for a given attribute are lumped into groups of constants or range values

    Operation-derived

    derived automatically by operations such as clustering, extraction, etc.

    Rule-based

    hierarchy may be well defined by a set of rules
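
    Two of these hierarchy types are easy to sketch in code. The example below is illustrative only: the place names, the age ranges, the group labels, and the helper names generalize and age_group are all assumptions chosen for the demonstration.

```python
# Schema-style hierarchy for a location dimension: each value maps to its
# parent one level up (city -> province or state -> country).
PARENT = {
    "Vancouver": "British Columbia",
    "Victoria": "British Columbia",
    "Chicago": "Illinois",
    "British Columbia": "Canada",
    "Illinois": "USA",
}


def generalize(value, parent, levels=1):
    """Climb up to `levels` steps in the hierarchy, stopping at the top."""
    for _ in range(levels):
        if value not in parent:
            break
        value = parent[value]
    return value


# Set-grouping hierarchy: ranges of age values lumped into named groups.
AGE_GROUPS = [
    (range(20, 40), "young"),
    (range(40, 60), "middle_aged"),
    (range(60, 120), "senior"),
]


def age_group(age, groups=AGE_GROUPS):
    """Map an integer age to the label of the range that contains it."""
    for ages, label in groups:
        if age in ages:
            return label
    return "unknown"


province = generalize("Vancouver", PARENT)            # one level up
country = generalize("Vancouver", PARENT, levels=2)   # two levels up
group = age_group(35)
```

    An operation-derived hierarchy would replace the hand-written tables with groups produced by an algorithm such as clustering, and a rule-based hierarchy would replace the range lookup with explicit rules.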