
Upload: liesel

Post on 09-Jan-2016


Page 1: Promising “Newer” Technologies to Cope with the

Christoph F. Eick: Introduction Knowledge Discovery and Data Mining (KDD) 1

Promising “Newer” Technologies to Cope with the Information Flood

• Knowledge Discovery and Data Mining (KDD)
• Agent-based Technologies
• Ontologies and Knowledge Brokering
• Non-traditional data analysis techniques

Model Generation as an Example to Explain / Discuss Technologies


Why Do We Need So Many Data Mining / Analysis Techniques?

• No generally good technique exists.
• Different methods make different assumptions with respect to the data set to be analyzed.
• Cross-fertilization between different methods is desirable and frequently helpful in obtaining a deeper understanding of the analyzed data set.


Data Mining and Business Intelligence

Layers, from top to bottom (the potential to support business decisions increases toward the top):

• Making Decisions
• Data Presentation: Visualization Techniques
• Data Mining: Information Discovery
• Data Exploration: Statistical Analysis, Querying and Reporting
• Data Warehouses / Data Marts: OLAP, MDA
• Data Sources: Paper, Files, Information Providers, Database Systems, OLTP

Roles, from top to bottom: End User, Business Analyst, Data Analyst, DBA


Example: Decision Tree Approach


Decision Tree Approach (2)


Decision Trees

Example:
• Conducted a survey to see which customers were interested in the new model car
• Want to select customers for an advertising campaign

Training set (table "sale"):

custId  car     age  city  newCar
c1      taurus  27   sf    yes
c2      van     35   la    yes
c3      van     40   sf    yes
c4      taurus  22   sf    yes
c5      merc    50   la    no
c6      taurus  25   la    no


One Possibility

Decision tree for the "sale" training set (Y = test holds, N = test fails):

age<30
  Y: city=sf
       Y: likely
       N: unlikely
  N: car=van
       Y: likely
       N: unlikely
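The tree on this slide can be written as a small nested structure and checked against the training set. The following is a minimal sketch, assuming the Y branch of each test leads to the first leaf or subtree in the drawing; the names `TRAINING_SET`, `TREE`, `classify`, and `predictions` are illustrative and do not come from the slides.

```python
# Training set from the "sale" table: (custId, car, age, city, newCar)
TRAINING_SET = [
    ("c1", "taurus", 27, "sf", "yes"),
    ("c2", "van", 35, "la", "yes"),
    ("c3", "van", 40, "sf", "yes"),
    ("c4", "taurus", 22, "sf", "yes"),
    ("c5", "merc", 50, "la", "no"),
    ("c6", "taurus", 25, "la", "no"),
]

# Inner node: (test, subtree_if_yes, subtree_if_no); leaf: a label string.
# Branch assignment is an assumption read off the flattened drawing.
TREE = (
    lambda r: r["age"] < 30,
    (lambda r: r["city"] == "sf", "likely", "unlikely"),
    (lambda r: r["car"] == "van", "likely", "unlikely"),
)


def classify(tree, record):
    """Walk from the root to a leaf, following the Y or N branch of each test."""
    while not isinstance(tree, str):
        test, if_yes, if_no = tree
        tree = if_yes if test(record) else if_no
    return tree


def predictions(tree, rows):
    """Return {custId: label} for every row of the training set."""
    result = {}
    for cust_id, car, age, city, _new_car in rows:
        result[cust_id] = classify(tree, {"car": car, "age": age, "city": city})
    return result


if __name__ == "__main__":
    for cust_id, label in predictions(TREE, TRAINING_SET).items():
        print(cust_id, label)
```

Under this reading, every customer who answered "yes" lands in a "likely" leaf and every customer who answered "no" lands in an "unlikely" leaf.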


Another Possibility

Decision tree for the "sale" training set (Y = test holds, N = test fails):

car=taurus
  Y: city=sf
       Y: likely
       N: unlikely
  N: age<45
       Y: likely
       N: unlikely
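Both trees fit the six training records, yet they are different models. The sketch below encodes each tree as a plain function, counts training errors, and applies both to one unseen customer. It assumes the Y branch leads to the first leaf or subtree in each drawing; the function names and the extra customer (a 28-year-old van driver in la) are hypothetical and not part of the slides.

```python
# Training set from the "sale" table: (custId, car, age, city, newCar)
TRAINING_SET = [
    ("c1", "taurus", 27, "sf", "yes"),
    ("c2", "van", 35, "la", "yes"),
    ("c3", "van", 40, "sf", "yes"),
    ("c4", "taurus", 22, "sf", "yes"),
    ("c5", "merc", 50, "la", "no"),
    ("c6", "taurus", 25, "la", "no"),
]


def tree_one(car, age, city):
    """Tree with root test age<30 (assumed branch layout)."""
    if age < 30:
        return "likely" if city == "sf" else "unlikely"
    return "likely" if car == "van" else "unlikely"


def tree_two(car, age, city):
    """Tree with root test car=taurus (assumed branch layout)."""
    if car == "taurus":
        return "likely" if city == "sf" else "unlikely"
    return "likely" if age < 45 else "unlikely"


def training_errors(tree):
    """Count training records whose predicted label contradicts newCar."""
    errors = 0
    for _cust_id, car, age, city, new_car in TRAINING_SET:
        predicted_yes = tree(car, age, city) == "likely"
        if predicted_yes != (new_car == "yes"):
            errors += 1
    return errors


if __name__ == "__main__":
    print("errors, tree one:", training_errors(tree_one))
    print("errors, tree two:", training_errors(tree_two))
    # Hypothetical unseen customer, not in the survey data.
    new_customer = ("van", 28, "la")
    print("tree one says:", tree_one(*new_customer))
    print("tree two says:", tree_two(*new_customer))
```

The two trees agree on all training records but disagree on the hypothetical customer, which is one reason the choice among candidate models matters.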


Summary KDD

• KDD: discovering interesting patterns from large amounts of data
• A natural evolution of database technology, in great demand, with wide applications
• A KDD process includes data cleaning, data integration, data selection, transformation, data mining, pattern evaluation, and knowledge presentation
• Mining can be performed in a variety of information repositories
• Data mining functionalities: characterization, discrimination, association, classification, clustering, outlier and trend analysis, etc.
• Multi-disciplinary activity
• Important issues: KDD methodologies and user interactions, scalability, tool use and tool integration, preprocessing, interpretation of results, finding good parameter settings when running data mining tools, …
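The KDD process steps listed above can be pictured as a chain of small functions. This is only a toy sketch under stated assumptions: the split of the car survey into two sources, the upper-case city codes, the incomplete record c7, and all function names are invented for illustration, and the "mining" step is a simple frequency count rather than any specific algorithm from the slides.

```python
from collections import Counter


def clean(rows):
    """Data cleaning: drop records that have missing values."""
    return [r for r in rows if all(v is not None for v in r.values())]


def integrate(*sources):
    """Data integration: merge records from several sources."""
    return [r for src in sources for r in src]


def select(rows, fields):
    """Data selection: keep only the task-relevant attributes."""
    return [{f: r[f] for f in fields} for r in rows]


def transform(rows):
    """Transformation: normalize city codes to lower case."""
    return [{**r, "city": r["city"].lower()} for r in rows]


def mine(rows):
    """Data mining (toy): count how often each (city, newCar) pair occurs."""
    return Counter((r["city"], r["newCar"]) for r in rows)


def evaluate(patterns, min_count):
    """Pattern evaluation: keep patterns seen at least min_count times."""
    return {p: n for p, n in patterns.items() if n >= min_count}


def present(patterns):
    """Knowledge presentation: render patterns as readable lines."""
    return [
        f"city={city} -> newCar={answer} ({n} customers)"
        for (city, answer), n in sorted(patterns.items())
    ]


# Hypothetical split of the survey data into two sources.
SOURCE_A = [
    {"custId": "c1", "car": "taurus", "age": 27, "city": "sf", "newCar": "yes"},
    {"custId": "c2", "car": "van", "age": 35, "city": "la", "newCar": "yes"},
    {"custId": "c3", "car": "van", "age": 40, "city": "sf", "newCar": "yes"},
]
SOURCE_B = [
    {"custId": "c4", "car": "taurus", "age": 22, "city": "SF", "newCar": "yes"},
    {"custId": "c5", "car": "merc", "age": 50, "city": "LA", "newCar": "no"},
    {"custId": "c6", "car": "taurus", "age": 25, "city": "LA", "newCar": "no"},
    # Invented incomplete record, removed by the cleaning step.
    {"custId": "c7", "car": "van", "age": None, "city": None, "newCar": "yes"},
]


def run_pipeline():
    """Run the steps in the order the summary lists them."""
    rows = integrate(clean(SOURCE_A), clean(SOURCE_B))
    rows = transform(select(rows, ["city", "newCar"]))
    return present(evaluate(mine(rows), min_count=2))


if __name__ == "__main__":
    for line in run_pipeline():
        print(line)
```

In practice each step is far more involved and the process is usually iterative, so the linear chain here is a simplification.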


Where to Find References?

• Data mining and KDD (SIGKDD member CD-ROM):
  – Conference proceedings: KDD, and others, such as PKDD, PAKDD, etc.
  – Journal: Data Mining and Knowledge Discovery
• Database field (SIGMOD member CD-ROM):
  – Conference proceedings: ACM-SIGMOD, ACM-PODS, VLDB, ICDE, EDBT, DASFAA
  – Journals: ACM-TODS, J. ACM, IEEE-TKDE, JIIS, etc.
• AI and Machine Learning:
  – Conference proceedings: Machine Learning, AAAI, IJCAI, etc.
  – Journals: Machine Learning, Artificial Intelligence, etc.
• Statistics:
  – Conference proceedings: Joint Stat. Meeting, etc.
  – Journals: Annals of Statistics, etc.
• Visualization:
  – Conference proceedings: CHI, etc.
  – Journals: IEEE Trans. Visualization and Computer Graphics, etc.