Christoph F. Eick: Introduction Knowledge Discovery and Data Mining (KDD) 1
Promising “Newer” Technologies to Cope with the Information Flood
• Knowledge Discovery and Data Mining (KDD)
• Agent-based Technologies
• Ontologies and Knowledge Brokering
• Non-traditional data analysis techniques
Model Generation as an Example to Explain / Discuss Technologies
Why Do We Need So Many Data Mining / Analysis Techniques?
• No generally good technique exists.
• Different methods make different assumptions with respect to the data set to be analyzed.
• Cross-fertilization between different methods is desirable and frequently helpful in obtaining a deeper understanding of the analyzed data set.
Data Mining and Business Intelligence
[Figure: layered diagram; the potential to support business decisions increases toward the top. Roles shown alongside the layers, from top to bottom: End User, Business Analyst, Data Analyst, DBA.]
Layers, from top to bottom:
• Making Decisions
• Data Presentation: Visualization Techniques
• Data Mining: Information Discovery
• Data Exploration: Statistical Analysis, Querying and Reporting
• Data Warehouses / Data Marts: OLAP, MDA
• Data Sources: Paper, Files, Information Providers, Database Systems, OLTP
Example: Decision Tree Approach
Decision Tree Approach (2)
Decision Trees
Example:
• Conducted survey to see what customers were interested in new model car
• Want to select customers for advertising campaign

Training set (table sale):
custId  car     age  city  newCar
c1      taurus  27   sf    yes
c2      van     35   la    yes
c3      van     40   sf    yes
c4      taurus  22   sf    yes
c5      merc    50   la    no
c6      taurus  25   la    no
One Possibility
Decision tree for the sale training set (Y = test holds, N = test fails):
age<30?
  Y: city=sf?
    Y: likely
    N: unlikely
  N: car=van?
    Y: likely
    N: unlikely
Another Possibility
Decision tree for the sale training set (Y = test holds, N = test fails):
car=taurus?
  Y: city=sf?
    Y: likely
    N: unlikely
  N: age<45?
    Y: likely
    N: unlikely
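The two candidate trees can be written as plain functions and checked against the sale training set. This is a minimal sketch, not part of the original slides: it assumes the Y branch of each test leads to the first leaf listed, which is the reading under which both trees agree with every training row. The function names and the extra customer at the end are hypothetical.

```python
# Training set from the slides: (custId, car, age, city, newCar)
SALE = [
    ("c1", "taurus", 27, "sf", "yes"),
    ("c2", "van",    35, "la", "yes"),
    ("c3", "van",    40, "sf", "yes"),
    ("c4", "taurus", 22, "sf", "yes"),
    ("c5", "merc",   50, "la", "no"),
    ("c6", "taurus", 25, "la", "no"),
]


def tree_one(car, age, city):
    """First possibility: root test age<30."""
    if age < 30:
        return "likely" if city == "sf" else "unlikely"
    return "likely" if car == "van" else "unlikely"


def tree_two(car, age, city):
    """Second possibility: root test car=taurus."""
    if car == "taurus":
        return "likely" if city == "sf" else "unlikely"
    return "likely" if age < 45 else "unlikely"


def training_errors(tree):
    """Return the custIds whose prediction disagrees with newCar."""
    expected = {"yes": "likely", "no": "unlikely"}
    return [
        cust_id
        for cust_id, car, age, city, new_car in SALE
        if tree(car, age, city) != expected[new_car]
    ]


print(training_errors(tree_one))  # [] : consistent with all six rows
print(training_errors(tree_two))  # [] : also consistent

# Hypothetical unseen customer (not in the survey): a 25-year-old van owner in la.
print(tree_one("van", 25, "la"), tree_two("van", 25, "la"))  # unlikely likely
```

Both trees fit the training set perfectly yet disagree on the unseen customer, which is why the choice between the two possibilities matters.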
Summary KDD
• KDD: discovering interesting patterns from large amounts of data
• A natural evolution of database technology, in great demand, with wide applications
• A KDD process includes data cleaning, data integration, data selection, transformation, data mining, pattern evaluation, and knowledge presentation
• Mining can be performed in a variety of information repositories
• Data mining functionalities: characterization, discrimination, association, classification, clustering, outlier and trend analysis, etc.
• Multi-disciplinary activity
• Important issues: KDD methodologies and user interactions, scalability, tool use and tool integration, preprocessing, interpretation of results, finding good parameter settings when running data mining tools, …
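The process steps named in the summary can be sketched as a chain of small functions over the sale table from the decision-tree example. This is only an illustration under stated assumptions: the function names, the thresholds, and the incomplete row c7 are hypothetical, the "mining" step is a simple attribute-value count rather than any particular algorithm from the course, and data integration is omitted because there is a single source.

```python
from collections import Counter

# Rows of the sale table from the slides, plus one hypothetical incomplete
# row (c7) added only so that the cleaning step has something to remove.
raw = [
    {"custId": "c1", "car": "taurus", "age": 27, "city": "sf", "newCar": "yes"},
    {"custId": "c2", "car": "van", "age": 35, "city": "la", "newCar": "yes"},
    {"custId": "c3", "car": "van", "age": 40, "city": "sf", "newCar": "yes"},
    {"custId": "c4", "car": "taurus", "age": 22, "city": "sf", "newCar": "yes"},
    {"custId": "c5", "car": "merc", "age": 50, "city": "la", "newCar": "no"},
    {"custId": "c6", "car": "taurus", "age": 25, "city": "la", "newCar": "no"},
    {"custId": "c7", "car": "van", "age": None, "city": "sf", "newCar": "yes"},
]


def clean(rows):
    """Data cleaning: drop rows that contain missing values."""
    return [r for r in rows if all(v is not None for v in r.values())]


def select(rows, attrs):
    """Data selection: keep only the attributes relevant to the task."""
    return [{a: r[a] for a in attrs} for r in rows]


def transform(rows):
    """Transformation: discretize age into two buckets."""
    return [{**r, "age": "<30" if r["age"] < 30 else ">=30"} for r in rows]


def mine(rows, target="newCar"):
    """Data mining: count target outcomes for each (attribute, value) pair."""
    counts = {}
    for r in rows:
        for attr, value in r.items():
            if attr != target:
                counts.setdefault((attr, value), Counter())[r[target]] += 1
    return counts


def evaluate(counts, min_support=2, min_confidence=1.0):
    """Pattern evaluation: keep rules that are frequent and reliable enough."""
    rules = []
    for (attr, value), outcomes in counts.items():
        outcome, hits = outcomes.most_common(1)[0]
        confidence = hits / sum(outcomes.values())
        if hits >= min_support and confidence >= min_confidence:
            rules.append((attr, value, outcome, confidence))
    return sorted(rules)


def present(rules):
    """Knowledge presentation: print the surviving rules in readable form."""
    for attr, value, outcome, confidence in rules:
        print(f"{attr}={value} -> newCar={outcome} (confidence {confidence:.2f})")


selected = select(clean(raw), ["car", "age", "city", "newCar"])
rules = evaluate(mine(transform(selected)))
present(rules)
```

With these thresholds only the rules for car=van and city=sf survive. Lowering `min_confidence` admits weaker patterns, which illustrates the summary's point about finding good parameter settings.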
Where to Find References?
• Data mining and KDD (SIGKDD member CD-ROM):
  – Conference proceedings: KDD, and others, such as PKDD, PAKDD, etc.
  – Journal: Data Mining and Knowledge Discovery
• Database field (SIGMOD member CD-ROM):
  – Conference proceedings: ACM-SIGMOD, ACM-PODS, VLDB, ICDE, EDBT, DASFAA
  – Journals: ACM-TODS, J. ACM, IEEE-TKDE, JIIS, etc.
• AI and Machine Learning:
  – Conference proceedings: Machine Learning, AAAI, IJCAI, etc.
  – Journals: Machine Learning, Artificial Intelligence, etc.
• Statistics:
  – Conference proceedings: Joint Stat. Meeting, etc.
  – Journals: Annals of Statistics, etc.
• Visualization:
  – Conference proceedings: CHI, etc.
  – Journals: IEEE Trans. Visualization and Computer Graphics, etc.